How not to make a contact tracing app
Delivered on 23 June 2020 by Justin Pyvis. About a 6 min read.
Attention governments that built coronavirus tracing apps using a centralised model: apologise to taxpayers, toss the code and start over with a decentralised framework (Apple and Google literally provided one for free) or go back to relying solely on manual contact tracing. The latest centralised tracing app catastrophe comes from Norway:
One of the first national coronavirus contacts-tracing apps to be launched in Europe is being suspended in Norway after the country’s data protection authority raised concerns that the software, called “Smittestopp,” poses a disproportionate threat to user privacy — including by continuously uploading people’s location.
Norway's tracing app was downloaded by approximately 10% of the population, well below the level needed for it to be of much use in contact tracing (for some reason people didn't want to install an app that centrally tracks them via GPS). But at least it worked, unlike Australia's COVIDSafe app, which doesn't function properly on iPhones and is still riddled with bugs:
Documents released by the Digital Transformation Agency (DTA) reveal the app's ability to communicate between two locked iPhones was rated as "poor" as of April 26, which means at launch, its own tests suggested it logged encounters at a rate of 25 per cent or below.
One recently patched flaw allowed long-term tracing of phones even if the app was uninstalled. Although a patch fixed the issue, Android users may not be getting the most up-to-date app after developers noticed it would not auto-update if it was already running – a requirement for effective operation.
The Australian National University professor Dr Alwen Tiu told Guardian Australia that he had discovered “a different bug, unrelated to [the previous vulnerability] that has the same effect of extracting a permanent, trackable identity from an Android device”.
He said that this issue has not yet been addressed despite him reporting it to the DTA on 2 June along with a suggested fix.
When you 'open source' your app but don't allow developers to publicly raise issues or submit pull requests directly in the repository, you destroy an important feedback loop and replace it with bureaucratic incompetence, i.e. layers of middle management that don't understand the code but are required to sign off on every single line of it:
Jim Mussared, one of the developers who has been reporting flaws to the Digital Transformation Agency, expressed his dismay at how the DTA had been “not at all communicative” with developers about the issues.
“It takes them a long time to confirm the issues, many remain unfixed. Many of these issues have been one-line fixes. Additionally there’s been a complete lack of transparency around all aspects of the development of the app,” he said.
Contact tracing apps need to be open source (ticks the trust box) and decentralised (ticks the privacy/security box) for enough people to want to use them. Centralised contact tracing apps Do. Not. Work because the technology can't function in Apple's walled garden, they are not interoperable with cross-border apps, and they create a giant honey pot that malicious states, individuals or even one's own government will spend an eternity trying to access.
At least the United Kingdom acknowledged that fact and abandoned its centralised contact tracing app before it was launched:
In a major U-turn, the UK is ditching the way its current coronavirus-tracing app works and shifting to a model based on technology provided by Apple and Google.
One advantage of the switch - if deployed - is that the NHS Covid-19 app would be able to overcome a limitation of iPhones and carry out Bluetooth "handshakes" when the software is running in the background.
Another is that it should be easier to make the app compatible with other countries' counterparts, which are based on the same system - including the Republic of Ireland and Germany.
Curiously, the UK and Norway were both advised to adopt a centralised framework by Oxford University's Big Data Institute. The consultancy pushed for a centralised model because, like most epidemiological modellers, it failed to consider that humans were involved in the process and many wouldn't want to install an app that potentially violates their privacy, even if it's the first-best from an epidemiological data mining point of view.
But when decisions are made for politics rather than profit, what works is a secondary consideration. You see, UK politicians wanted glory:
They believed it [a centralised app] could gather the information it had collected on contacts into a protected data store, with the potential to be de-anonymized so people could be alerted if they had come in contact with someone who presented coronavirus symptoms or had received a positive test result.
The centralized approach would allow much more data analysis than decentralized models, which give users exposure notifications but don’t allow officials nearly so much access to data.
Ministers were focused on rolling out a “world-beating” app, rather than just a successful one, so that they could claim victory on the world stage. The momentum toward a centralized system became unstoppable—and the challenges of building one were largely brushed aside.
It's not the first Big Data modelling stuff-up during this pandemic, either. The International Institute of Forecasters (IIF) has compiled a long summary of where it all went wrong, including poor data input, incorrect modelling assumptions, high sensitivity of estimates, failure to incorporate epidemiological features (e.g. humans), assuming benefits of available interventions without evidence, a lack of transparency, lack of modelling expertise, group-think and bandwagon effects, selective reporting and looking at only one or a few dimensions of the problem at hand.
Here's a chart provided by the IIF showing predictions for the number of US deaths during week 27 (~3 weeks downstream). The eight models displayed ranged from 2,419 to 11,190 deaths, a 4.6-fold difference, and the spectrum of 95% confidence intervals ranged from fewer than 100 deaths to over 16,000 deaths, almost a 200-fold difference.
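If you want to check the fold differences yourself, the arithmetic is a one-liner. Here's a minimal sketch in Python using only the figures quoted above; the function name is my own, and because the interval bounds are only given as "fewer than 100" and "over 16,000", the 100 and 16000 below are stand-ins that yield a floor rather than the exact ratio.

```python
def fold_difference(low: float, high: float) -> float:
    """Return how many times larger `high` is than `low`."""
    if low <= 0:
        raise ValueError("low must be positive")
    return high / low


# Point forecasts quoted above: lowest and highest of the eight models.
point_spread = fold_difference(2419, 11190)

# Assumption: the exact interval bounds are not given in the text, only
# "fewer than 100" and "over 16,000", so this result is a lower bound.
interval_spread_floor = fold_difference(100, 16000)

print(f"point forecasts: {point_spread:.2f}-fold")
print(f"95% intervals: at least {interval_spread_floor:.0f}-fold")
```

The second number is only a floor: the further the true bounds sit below 100 and above 16,000, the wider the gap gets.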
The lesson is clear: models are useful as a sanity check but should under no circumstances be relied upon to inform policy decisions, especially those as drastic as shutting down entire economies. They are wrong far more often than they're right.
Enjoy the rest of this week's issue. Cheers,
Other bits of interest
Trump's Section 230 legislation is a lame duck
The changes to Section 230 of the Communications Decency Act proposed by Hawley do not appear to place any new restrictions on how companies define their own moderation policies — only that they stick to, and evenly apply, whatever rules they ultimately decide upon... [However], to counteract this, all any company would need to do is state clearly in its terms of service that it does, in fact, reserve the right to selectively enforce its own rules. Most companies already do this, in one way or another.
- New Legislation Isn’t Going to Make Silicon Valley Sweat
- Two Different Proposals to Amend Section 230 Share A Similar Goal: Damage Online Users’ Speech
A subscription based search engine?
People say they hate ads, but will they pay to get rid of them? For some things, sure, but a search engine, where the ads are often what you're after anyway? I'm not so sure.
Neeva is a search engine that looks for information on the web as well as personal files like emails and other documents. It will not show any advertisements and it will not collect or profit from user data, he said. It plans to make money on subscriptions from users paying for the service.
The coronavirus was good for some
In absolute terms Amazon was the biggest winner of the pandemic work-from-home shift, adding nearly half a trillion dollars in market cap.
The European Union does something right 👏
The EU has finally agreed on a set of technical specifications that will allow information to be exchanged between national contact tracing apps, so long as they use a decentralised approach (it's technically very, very difficult to do so using a centralised framework. Sorry France).