Sunday, November 2, 2025

Evaluating transmissibility of Omicron lineages


Monitoring lineages of the Omicron variant of the SARS-CoV-2 virus continues to be an necessary well being consideration. The World Well being Group identifies BA.1, BA.1.1, and the latest BA.2 as the commonest sublineages. A latest examine from Japan, Yamasoba et al. (2022), compares, amongst different traits, the transmissibility of those three Omicron lineages with the most recent Delta variant. It identifies BA.2 to have the best transmissibility of the 4. Preprint of the examine is obtainable at bioarxiv.org. One fascinating side of the examine is the applying of Bayesian multilevel fashions for representing lineage development dynamics. On this publish, I show the right way to use Stata’s bayesmh and bayesstats abstract instructions to carry out related evaluation.

The datasets can be found at github.com. The information had been compiled from the GISAID database, https://www.gisaid.org/hcov19-variants/, on February 1, 2022.

Observations on the variety of instances for every lineage are collected for 11 nations in a interval of 117 days, from October 1, 2021, to January 25, 2022.

. use omicron_ba2

. summarize

    Variable |        Obs        Imply    Std. dev.       Min        Max
-------------+---------------------------------------------------------
         day |      1,185    58.18312     33.1129          1        117
        yba1 |      1,185    377.2878     1301.07          0      10251
       yba11 |      1,185    168.5561    685.6369          0       7378
        yba2 |      1,185    23.60759    128.7147          0       1481
      ydelta |      1,185     1224.78    2513.555          0      12549
-------------+---------------------------------------------------------
     nation |      1,185     6.03038    3.245177          1         11

The variables yba1, yba11, yba2, and ydelta correspond, respectively, to the variety of instances of BA.1, BA.1.1, and BA.2 lineages and the Delta variant. The variables day and nation establish the day and nation, respectively. The next desk reveals the nation codes and the variety of noticed days for every nation.

Id Nation Days
1 Austria 105
2 Denmark 116
3 Germany 116
4 India 117
5 Israel 117
6 Philippines 43
7 Singapore 115
8 South Africa 110
9 Swiden 112
10 United Kingdom 117
11 USA 117

For some nations, the info are incomplete. Particularly, for the Philippines, information can be found for less than 43 days, which, as we’ll see later, impacts the evaluation.

Bayesian multilevel multinomial mannequin

In every of the 11 nations, the viral lineages are recorded because the variety of instances on the noticed days. To formally specify the mannequin, allow us to denote variables yba1, yba11, yba2, and ydelta, respectively, as (y_{1ct}), (y_{2ct}), (y_{3ct}), and (y_{4ct}), the place (c) is the nation code and (t) is the commentary time. (y_{lct}) is assumed to comply with a multinomial distribution with whole variety of instances (n_{ct}). The chance of an incidence of a lineage, (theta_{lct}), for lineage (l), nation (c), and time (t) is given by the four-dimensional logistic operate (softmax) utilized to linear features (a_{lc} + b_{lc}occasions t), the place (a_{lc}) and, (b_{lc}) are random-effect parameters.

Extra formally, for lineages (l=1,dots,4) and nations (c=1,dots,11), we now have
$$
y_{lct} sim {rm Multinomial}(n_{ct}, theta_{lct})
theta_{lct} = frac{{rm exp}(mu_{lct})}{sum_{ok=1}^4 {rm exp}(mu_{kct})}hspace{.5cm}
mu_{lct} = a_{lc} + b_{lc} occasions t hspace{1.5cm}
$$

The chance of observing counts (n_1) by way of (n_4), the place (n_1+n_2+n_3+n_4=n_{ct}), is
$$
P(y_{1ct}=n_1, y_{2ct}=n_2, y_{3ct}=n_3, y_{4ct}=n_4) =
frac{n_{ct}!}{n_1!n_2!n_3!n_4!}
theta_{1ct}^{n_1}theta_{2ct}^{n_2}theta_{3ct}^{n_3}theta_{4ct}^{n_4}
$$
For identifiability, I set the random parameters for the BA.1 lineage, (a_{1c}) and (b_{1c}), to 0. On this case, we now have
$$
theta_{1ct} = frac{1}{1+sum_{ok=2}^4 {rm exp}(mu_{kct})}
theta_{2ct} = frac{{rm exp}(mu_{2ct})}{1+sum_{ok=2}^4 {rm exp}(mu_{kct})}
theta_{3ct} = frac{{rm exp}(mu_{3ct})}{1+sum_{ok=2}^4 {rm exp}(mu_{kct})}
theta_{4ct} = frac{{rm exp}(mu_{4ct})}{1+sum_{ok=2}^4 {rm exp}(mu_{kct})}
$$
and the (lct)-observation of the log chances are
$$
{rm const} + y_{2ct}timesmu_{2ct} + y_{3ct}timesmu_{3ct} + y_{4ct}timesmu_{4ct}
hspace{1cm}- n_{ct}occasions{1+{rm exp}(mu_{2ct})+ {rm exp}(mu_{3ct})+ {rm exp}(mu_{4ct})}
$$
For the aim of sampling the posterior mannequin, we are able to ignore the fixed phrases.

The authors suggest Scholar’s (t)-distributions with 6 levels of freedom for the random slope parameters (b_{lc}) to scale back the results of outliers,
$$
b_{lc} sim t_6(beta_l, sigma_l^2)
$$
and flat priors for the random intercepts (a_{lc}) and hyperparameters (beta_l) and (sigma_l^2). As a substitute, to have a correct prior mannequin, I apply weakly informative priors: (N(0, 100)) for random intercepts (a_{lc})s and random slopes (beta_{l})s and ({rm InvGamma}(0.1,0.1)) for the variance parameters (sigma_l^2)s.

To compute the log chance, I outline a brand new variable, ytotal, for the full quantity instances (n_{ct}).

. gen double ytotal = yba1+yba11+yba2+ydelta

As a result of we don’t have a built-in distribution in bayesmh for this chance mannequin, I exploit the llf() choice to specify it in response to the above components. The llf() possibility requires a univariate mannequin specification, so I exploit ytotal as a dependent variable representing the vector of counts ((y_{1ct},y_{2ct},y_{3ct},y_{4ct})).

The linear mixtures (mu_{2ct}), (mu_{3ct}), and (mu_{4ct}) are specified utilizing the outline() possibility. The latter embrace random intercepts A2[country], A3[country], and A4[country] and random slopes B2[country], B3[country], and B4[country]. The random slopes describe the transmissibility of the lineages as compared with BA.1.

To enhance the sampling, I block among the parameters and supply preliminary values inside the area of the posterior.

. bayesmh ytotal, noconstant chance(llf(                      
>         yba11*{mu2:}+yba2*{mu3:}+ydelta*{mu4:} -                
>         ytotal*ln(1+exp({mu2:})+exp({mu3:})+exp({mu4:}))))      
>         outline(mu2: A2[country] B2[country]#c.day, noconstant)  
>         outline(mu3: A3[country] B3[country]#c.day, noconstant)  
>         outline(mu4: A4[country] B4[country]#c.day, noconstant)  
>         prior({A2}, regular(0, 100))                             
>         prior({A3}, regular(0, 100))                             
>         prior({A4}, regular(0, 100))                             
>         prior({B2}, t({beta2}, {sigma2}, 6))                    
>         prior({B3}, t({beta3}, {sigma3}, 6))                    
>         prior({B4}, t({beta4}, {sigma4}, 6))                    
>         prior({sigma2 sigma3 sigma4}, igamma(0.1, 0.1) break up)   
>         prior({beta2 beta3 beta4}, regular(0, 100) break up)        
>         block({beta2 sigma2})                                   
>         block({beta3 sigma3}) block({beta4 sigma4})             
>         init({beta2 beta3 beta4} rnormal(1, 1)                  
>                 {sigma2 sigma3 sigma4} 1) rseed(17)


Burn-in 2500 aaaaaaaaa1000aaaaaaaaa2000aaaaa performed
Simulation 10000 .........1000.........2000.........3000.........4000.........
> 5000.........6000.........7000.........8000.........9000.........10000 performed

Mannequin abstract
------------------------------------------------------------------------------
Probability:
  ytotal ~ logdensity()

Hyperpriors:
  {A2[country] A3[country] A4[country]} ~ regular(0,100)
                          {B2[country]} ~ t({beta2},{sigma2},6)
                          {B3[country]} ~ t({beta3},{sigma3},6)
                          {B4[country]} ~ t({beta4},{sigma4},6)
                 {sigma2 sigma3 sigma4} ~ igamma(0.1,0.1)
                    {beta2 beta3 beta4} ~ regular(0,100)

Expression:
  expr4 : yba11*xb_mu2+yba2*xb_mu3+ydelta*xb_mu4-ytotal*ln(1+exp(xb_mu2)
          +exp(xb_mu3)+exp(xb_mu4))
------------------------------------------------------------------------------

Bayesian regression                              MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
                                                 MCMC pattern dimension =     10,000
                                                 Variety of obs    =      1,185
                                                 Acceptance fee  =      .2528
                                                 Effectivity:  min =     .02015
                                                              avg =     .06866
Log marginal-likelihood                                       max =     .09676

------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Imply   Std. dev.     MCSE     Median  [95% cred. interval]
-------------+----------------------------------------------------------------
       beta2 |  .0271004     .04504   .001448    .026472  -.0587521   .1217648
      sigma2 |  .0245979   .0132208   .000554   .0207922    .010104   .0613864
       beta3 |  .1527336   .0530795   .001722   .1533625   .0424005   .2561617
      sigma3 |  .0304257   .0174173   .000794   .0260487   .0112527   .0827476
       beta4 | -.1673996   .0474555   .001539  -.1678558  -.2638208  -.0749281
      sigma4 |  .0264834   .0164447   .001159   .0227708    .010781   .0628222
------------------------------------------------------------------------------

There aren’t any apparent convergence points with the mannequin, and the 7% common sampling effectivity for the hyperparameters is sweet.

The constructive estimate for beta2 and beta3 means that BA.1.1 and BA.2 lineages are each extra transmittable than BA.1. Alternatively, the posterior imply estimate for beta4 is damaging, (-0.17), implying much less transmittability of Delta as compared with the unique Omicron BA.1.

Relative efficient replica by nation

The authors of the examine calculate the relative efficient replica quantity in response to the components
$$
r_{lc} = {rm exp}(gammabeta_{lc})
$$
and the common relative efficient replica quantity throughout nations in response to
$$
r_{l} = {rm exp}(gammabeta_{l})
$$
the place (gamma) is the viral technology time, 2.1 days; see Estimating Technology Time of Omicron.

We estimate the replica numbers (r_l) utilizing the bayesstats abstract command. As a result of we assumed BA.1 lineage to be the baseline, the replica numbers are relative to BA.1.

. bayesstats abstract (rba11:exp(2.1*{beta2})) 
>         (rba2:exp(2.1*{beta3})) (rdelta:exp(2.1*{beta4}))

Posterior abstract statistics                      MCMC pattern dimension =    10,000

       rba11 : exp(2.1* { beta2 } )
        rba2 : exp(2.1* { beta3 } )
      rdelta : exp(2.1* { beta4 } )

------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Imply   Std. dev.     MCSE     Median  [95% cred. interval]
-------------+----------------------------------------------------------------
       rba11 |  1.063311   .1011051   .003258   1.057165   .8839283   1.291373
        rba2 |   1.38671   .1545093    .00502   1.379969   1.093126   1.712475
      rdelta |  .7070996   .0705203   .002292   .7029306    .574633   .8544061
------------------------------------------------------------------------------

Based mostly on posterior imply estimates, the replica of BA.1.1 is about 6% larger, BA.2 is about 39% larger, and Delta is about 30% decrease than that of BA.1. These outcomes agree with the findings within the paper that BA.2 has 1.4-fold larger efficient replica than that of BA.1.

Beneath, I present caterpillar plots for the relative efficient replica numbers by nation. Plotted are the posterior imply values together with 95% credible intervals.

The primary plot compares BA.1.1 with BA.1.

. gen id = mod(_n-1, 11)+1

. quietly bayesstats abstract (exp(2.1*{B2[country]}))

. mat bsum = r(abstract)

. mat bsum = (bsum[1..11,1], bsum[1..11,5], bsum[1..11,6])

. svmat bsum

. twoway scatter bsum1 id || rspike bsum3 bsum2 id, legend(off)   
>         xla(1 2 3 4 5 6 7 8 9 10 11, valuelabel) xsc(r(1 11))   
>         xtitle("") title("BA.1.1 vs BA.1")

. drop bsum*

All of the values are larger than 1 apart from nation 1, Austria. Additionally obvious is the broad credible interval for nation 6, Philippines, due, in all probability, to the a lot fewer commentary days accessible, solely 43, as compared with the opposite nations, 110 or extra.

The second plot compares BA.2 with BA.1.

. quietly bayesstats abstract (exp(2.1*{B3[country]}))

. mat bsum = r(abstract)

. mat bsum = (bsum[1..11,1], bsum[1..11,5], bsum[1..11,6])

. svmat bsum

. twoway scatter bsum1 id || rspike bsum3 bsum2 id, legend(off)   
>         xla(1 2 3 4 5 6 7 8 9 10 11, valuelabel) xsc(r(1 11))   
>         xtitle("") title("BA.2 vs BA.1")

. drop bsum*

graph1

Once more, the replica quantity for the Philippines has larger variability and tends to shift the common across-country replica upward.

The third plot compares Delta with BA.1.

. quietly bayesstats abstract (exp(2.1*{B4[country]}))

. mat bsum = r(abstract)

. mat bsum = (bsum[1..11,1], bsum[1..11,5], bsum[1..11,6])

. svmat bsum

. twoway scatter bsum1 id || rspike bsum3 bsum2 id, legend(off)   
>         xla(1 2 3 4 5 6 7 8 9 10 11, valuelabel) xsc(r(1 11))   
>         xtitle("") title("Delta vs BA.1")

. drop bsum*

graph1

As we’d anticipate, all replica numbers are lower than 1, and the replica quantity for the Philippines has larger variability than the others.

Sensitivity evaluation

The authors of the examine use Scholar’s (t)-prior with 6 levels of freedom for the random slopes (b_{lc}). It’s fascinating to examine whether or not altering the prior impacts the ends in any substantive method.

Beneath, I match fashions utilizing (t)-priors with 2 and 12 levels of freedom for (b_{lc}).

. quietly bayesmh ytotal, noconstant chance(llf(              
>         yba11*{mu2:}+yba2*{mu3:}+ydelta*{mu4:} -                
>         ytotal*ln(1+exp({mu2:})+exp({mu3:})+exp({mu4:}))))      
>         outline(mu2: A2[country] B2[country]#c.day, noconstant)  
>         outline(mu3: A3[country] B3[country]#c.day, noconstant)  
>         outline(mu4: A4[country] B4[country]#c.day, noconstant)  
>         prior({A2}, regular(0, 100))                             
>         prior({A3}, regular(0, 100))                             
>         prior({A4}, regular(0, 100))                             
>         prior({B2}, t({beta2}, {sigma2}, 2))                    
>         prior({B3}, t({beta3}, {sigma3}, 2))                    
>         prior({B4}, t({beta4}, {sigma4}, 2))                    
>         prior({sigma2}{sigma3}{sigma4}, igamma(0.1, 0.1) break up) 
>         prior({beta2}{beta3}{beta4}, regular(0, 100) break up)      
>         block({beta2}{sigma2})                                  
>         block({beta3}{sigma3}) block({beta4}{sigma4})           
>         init({beta2}{beta3}{beta4} rnormal(1, 1)                
>                 {sigma2}{sigma3}{sigma4} 1)                     
>         burnin(5000) rseed(17)

. bayesstats abstract (r2:exp(2.1*{beta2})) 
>         (r3:exp(2.1*{beta3})) (r4:exp(2.1*{beta4}))

Posterior abstract statistics                      MCMC pattern dimension =    10,000

          r2 : exp(2.1* { beta2 } )
          r3 : exp(2.1* { beta3 } )
          r4 : exp(2.1* { beta4 } )

------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Imply   Std. dev.     MCSE     Median  [95% cred. interval]
-------------+----------------------------------------------------------------
          r2 |  1.067439   .0938958   .003041   1.060892   .8874384   1.262244
          r3 |  1.394932   .1392574   .006867   1.381163   1.163633   1.702113
          r4 |   .700393   .0654377   .001962   .6977034   .5754148   .8349086
------------------------------------------------------------------------------

. quietly bayesmh ytotal, noconstant chance(llf(              
>         yba11*{mu2:}+yba2*{mu3:}+ydelta*{mu4:} -                
>         ytotal*ln(1+exp({mu2:})+exp({mu3:})+exp({mu4:}))))      
>         outline(mu2: A2[country] B2[country]#c.day, noconstant)  
>         outline(mu3: A3[country] B3[country]#c.day, noconstant)  
>         outline(mu4: A4[country] B4[country]#c.day, noconstant)  
>         prior({A2}, regular(0, 100))                             
>         prior({A3}, regular(0, 100))                             
>         prior({A4}, regular(0, 100))                             
>         prior({B2}, t({beta2}, {sigma2}, 12))                   
>         prior({B3}, t({beta3}, {sigma3}, 12))                   
>         prior({B4}, t({beta4}, {sigma4}, 12))                   
>         prior({sigma2}{sigma3}{sigma4}, igamma(0.1, 0.1) break up) 
>         prior({beta2}{beta3}{beta4}, regular(0, 100) break up)      
>         block({beta2}{sigma2})                                  
>         block({beta3}{sigma3}) block({beta4}{sigma4})           
>         init({beta2}{beta3}{beta4} rnormal(1, 1)                
>                 {sigma2}{sigma3}{sigma4} 1)                     
>         burnin(5000) rseed(17)

. bayesstats abstract (r2:exp(2.1*{beta2})) 
>         (r3:exp(2.1*{beta3})) (r4:exp(2.1*{beta4}))

Posterior abstract statistics                      MCMC pattern dimension =    10,000

          r2 : exp(2.1* { beta2 } )
          r3 : exp(2.1* { beta3 } )
          r4 : exp(2.1* { beta4 } )

------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Imply   Std. dev.     MCSE     Median  [95% cred. interval]
-------------+----------------------------------------------------------------
          r2 |  1.068262   .1032594    .00325   1.061218   .8759179   1.302069
          r3 |  1.414598     .15648   .005184    1.40231    1.14427   1.765012
          r4 |  .7024785   .0725748   .001915   .6985938    .574165   .8566458
------------------------------------------------------------------------------

I additionally match a mannequin with regular priors for the random slopes.

. quietly bayesmh ytotal, noconstant chance(llf(              
>         yba11*{mu2:}+yba2*{mu3:}+ydelta*{mu4:} -                
>         ytotal*ln(1+exp({mu2:})+exp({mu3:})+exp({mu4:}))))      
>         outline(mu2: A2[country] B2[country]#c.day, noconstant)  
>         outline(mu3: A3[country] B3[country]#c.day, noconstant)  
>         outline(mu4: A4[country] B4[country]#c.day, noconstant)  
>         prior({A2}, regular(0, 100))                             
>         prior({A3}, regular(0, 100))                             
>         prior({A4}, regular(0, 100))                             
>         prior({B2}, regular({beta2}, {sigma2}))                  
>         prior({B3}, regular({beta3}, {sigma3}))                  
>         prior({B4}, regular({beta4}, {sigma4}))                  
>         prior({sigma2}{sigma3}{sigma4}, igamma(0.1, 0.1) break up) 
>         prior({beta2}{beta3}{beta4}, regular(0, 100) break up)      
>         block({beta2}{sigma2}{beta3}{sigma3}{beta4}{sigma4},    
>                 break up gibbs)                                    
>         init({beta2}{beta3}{beta4} rnormal(1, 1)                
>                 {sigma2}{sigma3}{sigma4} 1) rseed(17)

. bayesstats abstract (r2:exp(2.1*{beta2})) 
>         (r3:exp(2.1*{beta3})) (r4:exp(2.1*{beta4}))

Posterior abstract statistics                      MCMC pattern dimension =    10,000

          r2 : exp(2.1* { beta2 } )
          r3 : exp(2.1* { beta3 } )
          r4 : exp(2.1* { beta4 } )

------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Imply   Std. dev.     MCSE     Median  [95% cred. interval]
-------------+----------------------------------------------------------------
          r2 |  1.066153   .1080605   .001081   1.061309   .8664183   1.292496
          r3 |  1.405522    .159427   .010062    1.39786   1.120683   1.746765
          r4 |  .7089474    .075455   .000755   .7048609   .5734278   .8714144
------------------------------------------------------------------------------

As evident from the summaries, there aren’t any substantive modifications within the replica quantity estimates, which reveals that our preliminary mannequin is strong to the prior specification of random slopes.

As we noticed above, the replica quantity estimates for the Philippines have excessive variability due to the smaller pattern. It’s fascinating to examine how these estimates change after excluding the Philippines from the estimation pattern.

. quietly bayesmh ytotal if nation != 6, noconstant chance(llf(      
>         yba11*{mu2:}+yba2*{mu3:}+ydelta*{mu4:} -                        
>         ytotal*ln(1+exp({mu2:})+exp({mu3:})+exp({mu4:}))))              
>         outline(mu2: A2[country] B2[country]#c.day, noconstant)          
>         outline(mu3: A3[country] B3[country]#c.day, noconstant)          
>         outline(mu4: A4[country] B4[country]#c.day, noconstant)          
>         prior({A2}, regular(0, 100))                                     
>         prior({A3}, regular(0, 100))                                     
>         prior({A4}, regular(0, 100))                                     
>         prior({B2}, t({beta2}, {sigma2}, 6))                            
>         prior({B3}, t({beta3}, {sigma3}, 6))                            
>         prior({B4}, t({beta4}, {sigma4}, 6))                            
>         prior({sigma2}{sigma3}{sigma4}, igamma(0.1, 0.1) break up)         
>         prior({beta2}{beta3}{beta4}, regular(0, 100) break up)              
>         block({beta2}{sigma2})                                          
>         block({beta3}{sigma3}) block({beta4}{sigma4})                   
>         init({sigma2}{sigma3}{sigma4} 1)                                
>         burnin(5000) rseed(17)

. bayesstats abstract (rba11:exp(2.1*{beta2})) 
>         (rba2:exp(2.1*{beta3})) (rdelta:exp(2.1*{beta4}))

Posterior abstract statistics                      MCMC pattern dimension =    10,000

       rba11 : exp(2.1* { beta2 } )
        rba2 : exp(2.1* { beta3 } )
      rdelta : exp(2.1* { beta4 } )

------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Imply   Std. dev.     MCSE     Median  [95% cred. interval]
-------------+----------------------------------------------------------------
       rba11 |  1.101936   .1422373   .006638   1.089014   .8338979   1.415828
        rba2 |  1.313135   .1978823   .006081   1.302977   .9514345   1.731837
      rdelta |  .7242147   .1055362   .010449   .7109377   .5368485   .9596856
------------------------------------------------------------------------------

Now the posterior estimates for the three relative replica numbers all get nearer to 1. Particularly, the replica of BA.2 as compared with BA.1.1, when it comes to posterior imply, drops from 1.4 to 1.3, thus suggesting extra average improve in transmissibility. As well as, the 95% credible intervals for the relative replica of each BA.1.1 and BA.2 now embrace 1.

Reference

Yamasoba, D., I. Kimura, H. Nasser, Y. Morioka, N. Nao, J. Ito, Ok. Uriu, et al. 2022. Virological traits of SARS-CoV-2 BA.2 variant. https://doi.org/10.1101/2022.02.14.480335.



Related Articles

Latest Articles