In Defence of Models. And Modellers

This article originally appeared in The Telegraph

We would all like to be able to predict the future, but in order to understand the future, you have to understand the past.  Such is the quandary facing modellers, data analysts, and policy makers.  At the heart of the decisions facing politicians are an epidemic that is doubling, delays, and the very real risks of indecision.

In the hastily brought-forward press conference last Saturday, the Prime Minister, flanked by the Chief Scientific Adviser and the Chief Medical Officer, solemnly stated that the country needed to go back into lockdown, albeit a less stringent one.  How had he come to that conclusion, and what models and data had precipitated that press conference?

Looking forward into the future is SAGE, the Scientific Advisory Group for Emergencies.  But beneath SAGE are several groups, one of which is SPI-M-O: the Scientific Pandemic Influenza Group on Modelling, Operational sub-group. As well as estimating the reproduction number, R, they also project the course of the epidemic.

The Civil Contingencies Secretariat, part of the Cabinet Office, commissions SAGE and SPI-M-O to produce an estimate of the Reasonable Worst Case Scenario number of deaths, used for planning purposes – the idea being that you don’t want to prepare for the worst that could happen, but the worst that could reasonably happen. A reasonable worst case scenario is not what you think will happen, but what you want to stop happening.  Only policy can do this, not modelling.

Long Term Projections

Source: Number 10 Press Conference

The slides that accompanied the Number 10 press conference last Saturday included one showing projections from early October, charting scenarios if there were no changes in policy or behaviour and under a number of assumptions: R remains constant, contacts increase over winter, and no additional mitigations over and above those in place in early October when the projections were made.  It’s important to read the small print on these slides, as many clues lie therein.  Firstly, they show preliminary, long-term scenarios; secondly, each independent modelling group produces not just a single trajectory, but a range of possible outcomes, some worse, some better.  All of the modelling groups showed projections in which daily deaths from Covid exceeded the peak daily deaths of the first wave.  Taken together, the output from the models shows daily deaths peaking during December at a level greater than in the first wave.  The peak is useful for gauging the pressure on the NHS, but it’s the area under the graph that is the most sobering – it shows how many people could sadly die.

One group, PHE/Cambridge, had a projection that was much larger than the others, but importantly, SPI-M was not asked to prepare a consensus projection for daily deaths.  As projections go further into the future, they become less certain – think of the reliability of forecasting next month’s weather as opposed to tomorrow’s.  This is especially true when we are dealing with doubling – things can get (and have got) out of hand very quickly, and small changes can have large effects.
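That sensitivity to small changes can be seen with a few lines of arithmetic.  This is an illustrative sketch only, not a SPI-M model: the starting figure and the daily growth rates are made-up numbers, chosen simply to show how modest differences in a constant growth rate compound over a six-week horizon.

```python
# Illustrative sketch only (not a SPI-M model): how small differences in a
# constant daily growth rate compound over a projection horizon.
# The starting figure and growth rates below are made-up numbers.

def project(start, daily_growth, days):
    """Project a count forward under constant exponential growth."""
    return start * (1 + daily_growth) ** days

start_cases = 20_000  # hypothetical starting point
for rate in (0.03, 0.05, 0.07):  # 3%, 5%, 7% growth per day
    print(f"{rate:.0%}/day -> {project(start_cases, rate, 42):,.0f} cases after 42 days")
```

A two-percentage-point difference in daily growth – well within the uncertainty of any projection – more than doubles the outcome after six weeks, which is why longer-range scenarios fan out so widely.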

None of this is new, of course.  The Academy of Medical Sciences produced a report in mid-July, Preparing for a Challenging Winter, which set out a reasonable worst case scenario number of deaths (excluding those in care homes) of around 119,000, over double the number in the first wave.  But crucially, it also included priorities for prevention and mitigation, including expanding the test, trace, and isolate system in order that it can respond quickly and accurately; and ‘maintaining a comprehensive, population-wide, near-real-time, granular health surveillance system’.  This of course did not happen, with testing capacity exceeded by demand in late August, leading to deterioration in data quality – data that in turn informs the models.

Medium Term Projections

Source: Number 10 Press Conference

As well as long-term projections, which are useful for long-term Government planning, medium-term projections are made: firstly for daily hospital admissions, and secondly for daily deaths.  These also show a central projection – a line – and a range around it. These slides were revised after the Saturday press conference, and now show a smaller shaded area, but the central projection, that solid line, remains the same.

These projections were more recent than the long-term scenarios, as they are needed for immediate planning in the NHS, and therefore need to be updated more frequently.  They are not forecasts or predictions, but represent a scenario: where the epidemic is heading if it follows current trends.  They do not take into account future changes in policy or behaviour.  A model cannot know what will happen, but it can set out a scenario – what could happen.  Take, for example, West Yorkshire going into Tier 3 (Very High) restrictions.  This was due to come into effect last Monday, but it didn’t.  Should models have taken that into account, even if ministers were also uncertain?

But how do we know that the underlying modelling is good? We can look at earlier model projections and see how they fared against the actual – not modelled – data they were projecting.  Here is an earlier version of the hospitalisations graph, produced on 6 October.

Source: SAGE / SPI-M-O

The red dots show data from before the projection was produced.  From the projection date, we see, as with the hospitalisation projection presented at the press conference, a central projection – the line – and ranges for the projection – in this case two: a central 50% prediction interval, and a wider 90% prediction interval.  Importantly, we also see black dots – the real data, plotted on the graph after the projection was made.  We can see that the black dots track the blue range very closely. (This graph is plotted on a log scale – 100 to 1,000 to 10,000 instead of 200 to 400 to 600 – which some people find easier to interpret, particularly when we are dealing with exponential growth, i.e. doubling, or indeed exponential decay, i.e. halving.)
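The reason a log scale helps with doubling can be shown in a few lines.  A small sketch (the values are arbitrary, chosen only to double at each step): on a log axis, each doubling is the same sized step, so exponential growth appears as a straight line.

```python
import math

# Sketch: why exponential growth reads as a straight line on a log scale.
# The values are arbitrary; all that matters is that each one doubles.
values = [100, 200, 400, 800, 1600, 3200]

# The step between consecutive points in log-space:
steps = [math.log10(b) - math.log10(a) for a, b in zip(values, values[1:])]

# Every step equals log10(2) ~ 0.301 - equal spacing, hence a straight line.
print(steps)
```

On a linear axis the same series would curve upwards ever more steeply, making it hard to judge whether growth is speeding up or slowing down.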

Models are projections into the future – they have assumptions, and these assumptions should be made clear.  There is also another set of models that inform policy – economic models.  We haven’t seen those.  And although we have seen the minutes and the papers of SAGE and SPI-M, we haven’t seen the minutes of where decisions are actually made, where the advice is considered and policy made.  Those decisions are made in the Cabinet Office Briefing Rooms – COBR.  Decisions made within those walls are not so transparent: Advisers advise, and Ministers decide.  And deciding to do nothing, particularly against scientific advice, is a decision in itself.

Data Sources for COVID-19 Analysis in England

Since the early Number 10 Downing Street press conferences, the data available to analyse the progression of the COVID-19 epidemic in the UK has become somewhat fragmented. This is an overview of the major sources of data.

This shows data for testing, cases, healthcare, and deaths, updated daily.

More detail is shown on each data set, for example cases:

which is broken down by Nation, Region, Upper Tier Local Authority (‘UTLA’ / county) and Lower Tier Local Authority (‘LTLA’ / district). Note that some councils, such as Leicester, are Unitary Authorities, which count as both a UTLA and an LTLA, so their data is disclosed in both the UTLA and LTLA data sets. Currently, this only shows the latest data, and cases per 100,000 people (making it easier to compare locations of different sizes).
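The per-100,000 normalisation is simple but worth making explicit.  A minimal sketch, using hypothetical case counts and populations (the area names and figures are invented for illustration):

```python
# Hypothetical case counts and populations, purely to illustrate the
# per-100,000 normalisation that makes areas of different sizes comparable.

def rate_per_100k(cases, population):
    """Cases per 100,000 residents."""
    return cases / population * 100_000

areas = {
    "Area A": (350, 100_000),   # small area, fewer cases
    "Area B": (900, 600_000),   # large area, more cases
}

for name, (cases, pop) in areas.items():
    print(f"{name}: {cases} cases = {rate_per_100k(cases, pop):.0f} per 100,000")
```

Area B has more cases in absolute terms, but Area A has the higher rate – which is why raw counts alone can mislead when comparing a district with a county.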

This data is shown as a map at Middle Layer Super Output Area (‘MSOA’) level (although the colour bar makes interpretation a little difficult) and shows cases per week. The mean population of an MSOA is around 7,200 people, so the key is not very useful.

Data for UTLAs and LTLAs used to be reported by specimen date, but it is not clear where these data are now available.

Public Health England Surveillance Reports

Public Health England (‘PHE’) produce a COVID-19 surveillance report each week. This now shows a list of watchlist local authorities.

These data are shown on a map, together with a chart of the number of English cases per week, and the history of the highest-rated UTLAs.

There is also a report of the number of ‘incidents’ (what are colloquially described as outbreaks) and where they originate from.

Searchable SAGE (Scientific Advisory Group for Emergencies) Minutes on COVID-19

The early minutes of SAGE (the Scientific Advisory Group for Emergencies) were searchable:

However, later minutes are not.

So, if you are searching for, say, “asymptomatic transmission”, you won’t find anything. Which is odd, because the SAGE meeting where this was first mooted, on 28 January, is searchable (the documents listed do not link to SAGE minutes). Or maybe it does not search within the documents themselves.

So I have OCR’d the minutes and they are listed at the end of this page.
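Once the minutes are OCR’d to plain text, checking which meetings mention a phrase is a one-function job.  A minimal sketch of that search (the directory and file names here are hypothetical – substitute wherever the OCR’d text files are kept):

```python
from pathlib import Path

# Minimal sketch: scan a folder of OCR'd minutes (plain-text files) for a
# phrase, case-insensitively. Directory and file names are hypothetical.

def search_minutes(directory, phrase):
    """Return the file names of minutes that mention the phrase."""
    phrase = phrase.lower()
    return sorted(
        path.name
        for path in Path(directory).glob("*.txt")
        if phrase in path.read_text(errors="ignore").lower()
    )

# e.g. search_minutes("sage_minutes_ocr", "asymptomatic transmission")
```

Note that OCR is imperfect – hyphenation, line breaks, and misread characters can hide a phrase – so a miss from a search like this is weaker evidence than a hit.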

Spoiler: the phrase ‘asymptomatic transmission’ is mentioned in SAGE minutes.

The minutes are available here:

SAGE minutes to meeting 41 (11 June) are on the next page.

Top Local Authorities for COVID-19 Positive Tests

Public Health England have tonight (2 July 2020) released data for the total number of people who have had positive COVID-19 results at a local authority level. Critically, these include both Pillar 1 (NHS and PHE labs) and Pillar 2 (NHS Test & Trace commercial labs) data.

Here are the cumulative totals for upper tier local authorities, ranked from top to bottom by rate (i.e. controlling for population). Leicester is top, followed by Oldham, Barnsley, and Bradford. Note that these are cumulative totals, so a high ranking doesn’t necessarily mean that there is a current outbreak there.

And here is the data for lower tier local authorities. Leicester (a unitary authority) is still top, followed by Ashford, Barrow-in-Furness, and Preston.

Public Health England should be applauded for releasing this data, which will enable us to be far more informed about the epidemiology of the disease at a local level.

Spatial Transmission Models PowerPoint Slides presented at EURO2019 (30th European Conference on Operational Research), Dublin

These are the slides for my presentation at the 30th European Conference on Operational Research based on my Risk Analysis paper. The paper is available to download here.