Disclaimer

This text is for research and educational purposes only; it is not decision-making advice. The following calculations rely on deliberately simplified assumptions.

For any question, please refer to the French public health authority, Santé Publique France, where one can find all official instructions: https://www.gouvernement.fr/info-coronavirus

The World Health Organization (WHO) also provides a comprehensive website: https://www.who.int/fr/emergencies/diseases/novel-coronavirus-2019

Context

Since December 2019, the COVID-19 epidemic, caused by a previously unknown coronavirus (SARS-CoV-2), has been spreading. Initially present in the Wuhan region of China, it was officially declared a pandemic on the 11th of March 2020 by the World Health Organization (WHO).

In France, the first isolated cases were detected in January 2020 among travelers returning from China. Since the 25th of February 2020, the incidence (number of new cases per day) has been strictly positive and increasing, suggesting local transmission of the virus. As of March 11th, 2281 cases had been reported. On March 12th, the French government announced the closure of school facilities starting March 16th.

Basic reproduction number

Using statistical methods, we estimated the basic reproduction number (denoted \(R_0\)) of the epidemic in France, i.e. the average number of secondary infections generated by an infected person during his or her entire infectious period in a fully susceptible population.

This is a key number in public health because it determines the magnitude of the epidemic. If the reproduction number falls below the threshold of 1, either because of control measures or because a sufficient proportion of the population is immune, the epidemic declines.

This number of secondary infections varies over the course of an epidemic because fewer and fewer people are susceptible (after recovery, an immune memory is built up) and because public health policies (such as limiting contacts) are implemented. We can then measure an instantaneous (or effective) number, hereafter denoted \(R(t)\).
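As a rough illustration of the threshold argument, the sketch below uses the textbook homogeneous-mixing approximation, in which the effective number equals \(R_0\) multiplied by the fraction of the population that is still susceptible. This approximation, the function names and the numerical values are assumptions introduced for illustration; they are not part of the estimation procedure used in this report.

```python
def effective_r(r0: float, susceptible_fraction: float) -> float:
    """Effective reproduction number under homogeneous mixing (assumption)."""
    if not 0.0 <= susceptible_fraction <= 1.0:
        raise ValueError("susceptible_fraction must lie in [0, 1]")
    return r0 * susceptible_fraction


def immunity_threshold(r0: float) -> float:
    """Immune fraction above which the effective number drops below 1."""
    if r0 <= 1.0:
        return 0.0  # the epidemic cannot grow even without any immunity
    return 1.0 - 1.0 / r0


r0 = 2.5  # illustrative value, not an estimate
print(f"immune fraction needed: {immunity_threshold(r0):.2f}")
print(f"effective R with 30% susceptible: {effective_r(r0, 0.30):.2f}")
```

Control measures act differently: they lower the number of contacts, and therefore the reproduction number, without changing the susceptible fraction.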

Methods

This section is more technical and aims at clarifying the working hypotheses we made.

To estimate these values, we used the number of new cases reported each day in France by the WHO (data retrieved from the website https://ourworldindata.org/coronavirus-source-data), as well as raw data from 28 infector/infected pairs compiled in an article by Nishiura et al. (2020, Int J Infect Dis). Finally, we estimated the \(R_0\) from several epidemic starting dates, in order to determine to what extent it was impacted by the imported cases (which were the majority in the early stages of the epidemic).

Estimates were made using the R software and the functions est.R0.TD() and est.R0.ML() of the R0 package.
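To illustrate how raw infector/infected pairs can be turned into a serial-interval distribution, the sketch below builds an empirical discrete distribution from a list of intervals in days. The intervals shown are made-up placeholders, not the data of Nishiura et al., and the plain histogram is a simplification chosen for readability; it is not the fitting procedure of the R0 package.

```python
from collections import Counter


def empirical_serial_interval(intervals_days):
    """Return weights w[k] = share of pairs with a serial interval of k days.

    w[0] is the share of pairs whose symptoms started on the same day.
    """
    if not intervals_days:
        raise ValueError("at least one pair is required")
    if any(d < 0 for d in intervals_days):
        raise ValueError("negative intervals are not handled in this sketch")
    counts = Counter(intervals_days)
    longest = max(counts)
    total = len(intervals_days)
    return [counts.get(k, 0) / total for k in range(longest + 1)]


def mean_interval(weights):
    """Mean of the discrete distribution, in days."""
    return sum(k * w for k, w in enumerate(weights))


# Hypothetical intervals (days between symptom onsets), for illustration only.
example_pairs = [2, 3, 3, 4, 4, 4, 5, 6]
weights = empirical_serial_interval(example_pairs)
print(weights)
print(mean_interval(weights))
```

The resulting weights play the role of the generation-time distribution that links past incidence to new infections.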

Our underlying assumptions are that:

  • the time for a case to be reported is neglected,

  • the screening strategy in France is assumed to be constant,

  • the spatial structure is neglected,

  • imported cases are not distinguished from non-imported cases,

  • the incidence data used cover January 21 to March 16 (the date on which strong containment measures were implemented in France) for \(R_0\), and extend to April 1, 2020 for \(R(t)\).

Results

\(R_0\)

We estimated a classical \(R_0\) by arbitrarily setting a date for the beginning of the epidemic in France (the moment from which imported cases have little influence).

In this simplistic model, the \(R_0\) is assumed to be constant over time (thus neglecting the decrease in the number of susceptible hosts as well as the implementation of containment policies).

The result of the est.R0.ML() function is:

## [1] "R0 = 2.49  [2.39 ; 2.58]"

It appears that the epidemic spread very rapidly in France, since each infected person infected about 2.5 others on average.

This \(R_0\) estimate depends on the date at which the epidemic took off in France (i.e. when it stopped depending on imported cases) as well as on the available data. The plot below shows the value of \(R_0\) as a function of the date of origin.

We can see the date of origin has little effect on the value of \(R_0\).

On the other hand, the end of the interval has a large effect. Indeed, if instead of ending our dataset on March 16 we end it on March 9, while starting on February 27, then the result of the est.R0.ML() function is the following:

## [1] "R0 = 3.81  [3.5 ; 4.13]"

We can see that the values of \(R_0\) do not depend strongly on the date of origin chosen for the calculations (between January 29th and February 27th) but that they do depend on the last date of the interval. Indeed, the longer the interval, the smaller the \(R_0\). This makes sense because the longer the epidemic unfolds, the more measures are put in place to fight it and the more herd immunity builds up. It therefore makes sense to investigate temporal variations in the reproduction number.
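The dependence on the end of the interval can be reproduced with a toy example. The sketch below assumes a simple renewal model in which the number of cases on day t is Poisson distributed with mean R times the infectious pressure from earlier days; under that assumption the maximum-likelihood estimate of a constant R has a closed form. It is written in the spirit of a likelihood-based estimate and is not the implementation of est.R0.ML(); the function names and the synthetic counts are hypothetical.

```python
def renewal_pressure(incidence, weights, t):
    """Infectious pressure on day t: sum over k of weights[k] * incidence[t - k].

    weights[k] is the probability that the serial interval equals k days;
    weights[0] is ignored so that a case cannot infect on its own onset day.
    """
    return sum(
        weights[k] * incidence[t - k]
        for k in range(1, min(t, len(weights) - 1) + 1)
    )


def ml_reproduction_number(incidence, weights, begin=0, end=None):
    """Closed-form Poisson ML estimate of a constant R on incidence[begin:end].

    The first day of the window is treated as the seed and is not explained.
    """
    window = incidence[begin:end]
    observed = sum(window[1:])
    expected_per_unit_r = sum(
        renewal_pressure(window, weights, t) for t in range(1, len(window))
    )
    if expected_per_unit_r == 0:
        raise ValueError("no infectious pressure in the selected window")
    return observed / expected_per_unit_r


# Synthetic daily counts: doubling at first, then slowing down (not French data).
cases = [1, 2, 4, 8, 12, 18]
weights = [0.0, 1.0]  # hypothetical serial interval of exactly one day

early = ml_reproduction_number(cases, weights, end=4)
full = ml_reproduction_number(cases, weights)
print(early, full)
```

Because a single constant value has to explain the whole window, extending the window into the period where growth slows pulls the estimate down.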

Effective reproduction number

As indicated above, the reproduction number varies as a result of immunity being built up in the population and of public health measures. We therefore estimated the effective reproduction number (\(R(t)\)) to detect these temporal variations.

In the following plot, the shaded area corresponds to the confidence interval and the black curve to the median. Note that the figure does not show the number of cases but rather the speed of spread of the epidemic. If \(R(t)\) falls below 1 (indicated by the dashed line), the epidemic is declining.

Initial peaks of \(R(t)\) can be observed corresponding to imported cases that were detected and isolated. These peaks are mainly due to the limited number of cases and can be ignored: with weekly incidences instead of daily incidences, these curves would be much smoother.

By February 28, the epidemic had become established in France (the peak value of \(R(t)\) is an artifact associated with this onset). Since that date, the value of the reproduction number seems to have stabilized at a value greater than 2, consistent with our estimate of \(R_0\).
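For readers who want to see how a time-dependent reproduction number can be computed from incidence counts, here is a minimal sketch in the style of the Wallinga and Teunis approach: each case on day t is attributed to earlier cases in proportion to the serial-interval weights. It is not the implementation of est.R0.TD(), it omits confidence intervals, and the function name and input data are hypothetical.

```python
def case_reproduction_numbers(incidence, weights):
    """Time-dependent R for the cases with onset on each day.

    weights[k] is the probability of a serial interval of k days (weights[0] unused).
    Returns a list r where r[s] is the expected number of secondary cases per
    case with onset on day s, or None when no case was recorded on that day.
    """
    n_days = len(incidence)
    max_lag = len(weights) - 1

    # Total infectious pressure exerted on each day t by all earlier cases.
    pressure = [
        sum(weights[t - u] * incidence[u] for u in range(max(0, t - max_lag), t))
        for t in range(n_days)
    ]

    estimates = []
    for s in range(n_days):
        if incidence[s] == 0:
            estimates.append(None)
            continue
        secondary = 0.0
        for t in range(s + 1, min(n_days, s + max_lag + 1)):
            if pressure[t] > 0:
                # Share of the day-t cases attributed to one case of day s.
                secondary += incidence[t] * weights[t - s] / pressure[t]
        estimates.append(secondary)
    return estimates


cases = [1, 2, 4, 8]   # synthetic counts, doubling every day
weights = [0.0, 1.0]   # hypothetical serial interval of exactly one day
estimates = case_reproduction_numbers(cases, weights)
print(estimates)
```

The last entries are biased downwards because the secondary cases of the most recent infections have not been observed yet. With few cases, each attribution carries a lot of weight, which is one reason early values fluctuate strongly.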

Lock-down situation

On March 16, a national lock-down began. To visualise it, we can zoom in on the more recent values of the effective \(R(t)\). The red line indicates the beginning of the lock-down and the grey zone the 95% confidence interval.

We can see that the effect of the lock-down is slow, which is partly expected (it takes 5 days for symptoms to appear, which creates a delay). However, the time taken to cross the threshold of 1, and therefore to reach the epidemic peak, likely reflects a delay in decreasing contacts in the population.

Importantly, the estimate of the reproduction number has a potentially significant delay because it takes at least 4 days for symptoms to appear in an infected person, plus a delay for that person to be tested and, finally, for the results to be recorded. At best, the most recent value therefore likely reflects the state of the epidemic one week earlier.

The most recent value of the effective reproduction number is the following:

## [1] "R(t) = 1.28  with 95% of the values in 0.91 and 1.67."

If the upper bound of the \(R(t)\) confidence interval is strictly lower than 1, we are 97.5% certain that, based on our assumptions and the data used, the epidemic is decreasing.
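The decision rule can be written down explicitly. The short sketch below classifies the trend from the two bounds of the interval; the function name and the three labels are illustrative choices, and the rule is only as reliable as the assumptions behind the interval itself.

```python
def interpret_interval(lower: float, upper: float, threshold: float = 1.0) -> str:
    """Classify the epidemic trend from the bounds of an R(t) interval."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if upper < threshold:
        return "decreasing"   # the whole interval lies below the threshold
    if lower > threshold:
        return "growing"      # the whole interval lies above the threshold
    return "inconclusive"     # the interval straddles the threshold


# Bounds reported above for the most recent R(t) value.
print(interpret_interval(0.91, 1.67))
```

With the bounds reported above, the interval contains 1, so neither a decline nor continued growth can be asserted at this confidence level.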

Importantly, in addition to the delays, this estimate is very sensitive to variations in screening intensity (if fewer tests are performed, fewer infections are detected, which brings \(R(t)\) down automatically).
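The effect of screening can be shown on synthetic numbers. The sketch below uses a deliberately simplified instantaneous ratio (cases on a given day divided by the infectious pressure from earlier days), which is an assumption made for readability and not the estimator used for the figures in this report. All counts and detection rates are invented.

```python
def instantaneous_ratio(incidence, weights):
    """Simplified R(t): cases on day t divided by the pressure from earlier days."""
    ratios = []
    for t in range(len(incidence)):
        pressure = sum(
            weights[k] * incidence[t - k]
            for k in range(1, min(t, len(weights) - 1) + 1)
        )
        ratios.append(incidence[t] / pressure if pressure > 0 else None)
    return ratios


true_cases = [1, 2, 4, 8, 16, 32]  # synthetic epidemic doubling every day
weights = [0.0, 1.0]               # hypothetical serial interval of one day

# Detection halves from day 3 onwards: only half of the cases are reported.
detection = [1.0, 1.0, 1.0, 0.5, 0.5, 0.5]
reported = [n * p for n, p in zip(true_cases, detection)]

print(instantaneous_ratio(true_cases, weights))
print(instantaneous_ratio(reported, weights))
```

In this toy case the true ratio stays at 2 throughout, while the reported series shows a drop to 1 on the day detection falls. A constant detection rate cancels out of the ratio; it is the change in screening intensity that distorts the estimate, which is why a constant screening strategy is assumed.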

Discussion

By analyzing incidence data from France and linking it to existing data collected on transmission chains in China, we estimated that the initial basic reproduction number (\(R_0\)) of the COVID-19 epidemic in France was close to 2.5. In comparison, seasonal influenza’s number is generally close to 1.5. This high value is likely due to the absence of pre-existing immunity in the population.

In addition, an analysis of temporal variations in this reproduction number indicates that the effective reproduction number (\(R(t)\)) has stabilized around values near 2 since the beginning of March. Containment measures in France then lowered this effective \(R\) if we assume that the screening effort has remained constant (but less screening will also decrease \(R\)).

These values are qualitatively consistent with those obtained by Abbott et al. It should be noted that their estimates were obtained using different software (EpiEstim), although the mathematical and statistical methods are similar. In addition, Abbott et al. take into account a delay in reporting cases and account for imported cases, which we do not. We also used the raw data to estimate the serial interval (i.e. the time elapsed between the onset of symptoms in two individuals, one having infected the other) while they imposed a distribution with a given mean and variance. Finally, as reported by the website Our World in Data, there are minor inconsistencies between the time series of incidence and total number of cases circulating on official websites.

Sources and acknowledgements