The emergence of SARS-CoV (2003), MERS-CoV (2013) and SARS-CoV-2 (2020) has shown how serious an epidemic risk coronaviruses pose. Several strains isolated from bats or pangolins are very close to SARS-CoV-2 (more than 96% genome sequence identity in the case of the bat virus RaTG13), which suggests that the COVID-19 pandemic has an animal origin. However, the mechanisms that allowed the virus to cross the species barrier from animals to humans are unknown.
American researchers at the Regional College of Veterinary Medicine, Blacksburg, searched the SARS-CoV-2 genome for molecular signatures of positive selection that could explain this jump. Using two statistical tools designed to detect such signatures (OmegaPlus and RAiSD), they analysed 182,792 viral genomes from the GISAID database. They thus identified regions in which a mutation has spread through the viral population while the surrounding sequence shows very little variation (a pattern known as a “selective sweep”).
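To make the idea of a selective sweep more concrete, the sketch below scans a set of aligned sequences window by window and reports how much they differ in each window; a window with unusually low diversity is one possible footprint of a sweep. This is only an illustration: it is not the statistic computed by OmegaPlus or RAiSD, and the function names and the four toy sequences are invented for the example, not taken from GISAID.

```python
from itertools import combinations


def pairwise_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs or not seqs[0]:
        return 0.0
    length = len(seqs[0])
    total_diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total_diffs / (len(pairs) * length)


def windowed_diversity(seqs, window, step):
    """Return (start, diversity) for each window along the alignment."""
    length = len(seqs[0])
    return [
        (start, pairwise_diversity([s[start:start + window] for s in seqs]))
        for start in range(0, length - window + 1, step)
    ]


# Hypothetical toy alignment (not real viral data): the middle window
# is identical in every sequence, mimicking a region swept to fixation.
toy_alignment = [
    "ACGTAAAAACGT",
    "ATGTAAAAACTT",
    "ACGAAAAAAGGT",
    "TCGTAAAAACGA",
]

scan = windowed_diversity(toy_alignment, window=4, step=4)
for start, diversity in scan:
    print(f"window starting at {start}: diversity = {diversity:.3f}")
```

In this toy scan the middle window has zero diversity while the flanking windows do not, which is the kind of local contrast that sweep-detection tools look for with more elaborate statistics.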
Of the 8 genomic regions detected, 4 lie in the ORF1ab gene and 4 in the gene for the spike, the protein that mediates entry into human cells and determines which cells the virus targets (its “tropism”). SARS-CoV-2 belongs to the sarbecoviruses (a subgenus of the β-coronaviruses). The researchers therefore compared it with the other members of this group to retrace its genetic history.
One important difference emerged. The researchers identified an amino acid in the spike protein that differs between these viruses: residue 372, located in the RBD (Receptor Binding Domain) of the spike, is a threonine (T) in the other sarbecoviruses but an alanine (A) in human SARS-CoV-2. This difference is the result of a mutation.
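A comparison of this kind amounts to looking at one column of a protein alignment and counting which amino acid each group of viruses carries there. The sketch below shows the idea on invented four-residue toy sequences, using column 3 as a stand-in for residue 372; the group names, sequence labels and sequences are hypothetical and are not real spike sequences.

```python
from collections import Counter


def residue_counts(aligned, position):
    """Count the residues found at a 1-based alignment column."""
    return Counter(seq[position - 1] for seq in aligned.values())


def compare_site(groups, position):
    """Tally the residues at one column separately for each group."""
    return {name: residue_counts(seqs, position) for name, seqs in groups.items()}


# Hypothetical toy alignment: column 3 plays the role of residue 372.
toy_groups = {
    "other_sarbecoviruses": {"toy_seq_1": "NATF", "toy_seq_2": "SATF"},
    "sars_cov_2": {"toy_seq_3": "NAAF", "toy_seq_4": "NAAF"},
}

site_summary = compare_site(toy_groups, position=3)
for group, counts in site_summary.items():
    print(group, dict(counts))
```

With real data, the same tally would be run on a full spike alignment at the column corresponding to residue 372.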