According to a team of scientists led by the University of Cambridge, the first outbreak of the coronavirus could have happened as early as September, and further south than the central Chinese city of Wuhan.
Investigating the virus' origin, researchers analysed a large number of strains from around the world and calculated that the initial outbreak occurred in a window between September 13 and December 7, reports the South China Morning Post.
On Thursday, University of Cambridge geneticist Peter Forster said: "The virus may have mutated into its final human-efficient form months ago, but stayed inside a bat or other animal or even human for several months without infecting other individuals."
"Then, it started infecting and spreading among humans between September 13 and December 7, generating the network we present in [the journal] Proceedings of the National Academy of Sciences (PNAS)."
The team analysed the strains using a phylogenetic network – a mathematical algorithm that can map the global movement of organisms through the mutation of their genes. They were still trying to pinpoint the location of patient zero, and were hoping for help from scientists in China, but some early signs were prompting them to look into areas to the south of Wuhan, where coronavirus infections were first reported in December.
"What we reconstruct in the network is the first significant spread among humans," Forster said.
The Cambridge team recently made international headlines with a paper about the virus' evolutionary history.
Published in PNAS this month, it found that most of the strains sampled in the United States and Australia were genetically closer to a bat virus than the strains prevalent in patients from across East Asia, and that the major European type of the virus was a descendant of the East Asian variant.
But that paper looked only at the first 160 strains collected after late December.
The small sample size limited the researchers' ability to determine when and where the first outbreak actually started.
In their new study, which has not been peer-reviewed, Forster and his colleagues from several institutes including the Institute of Forensic Genetics in Munster, Germany, expanded the database to include 1,001 high-quality full genome sequences released by scientists across the globe.
The more strains analysed, the more precisely they could trace the origin of the virus' global spread.
By counting the mutations, researchers could get closer to working out when the first person was infected by a strain that was closest to the bat virus. Sars-CoV-2, the virus that causes Covid-19, originated from bats; it has been found to share 96 per cent identical genes with a coronavirus isolated by Chinese scientists from bat droppings in the southwestern province of Yunnan in 2013.
However, there were hundreds of mutations between Sars-CoV-2 and the one in Yunnan, and a coronavirus usually acquires one mutation per month.
Some scientists have therefore suspected the virus may have been spreading quietly in host animals and humans for years to gradually evolve to a highly adaptive form that could infect humans.
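The arithmetic behind that suspicion can be sketched in a few lines of Python. This is a deliberately naive illustration, not the Cambridge team's method: it assumes a constant rate of one mutation per month, as quoted above, and treats every mutation as accumulating along a single lineage. The function name and the count of 300 are hypothetical, chosen only to stand in for "hundreds".

```python
def months_to_accumulate(mutation_count: int, mutations_per_month: float = 1.0) -> float:
    """Naive molecular-clock estimate: time needed to accumulate a given
    number of mutations at a constant rate (an assumption, not a measurement)."""
    if mutations_per_month <= 0:
        raise ValueError("mutations_per_month must be positive")
    return mutation_count / mutations_per_month


# Hypothetical stand-in for "hundreds of mutations"; not a figure from the study.
illustrative_count = 300

months = months_to_accumulate(illustrative_count)
years = months / 12
print(f"{illustrative_count} mutations at 1 per month: {months:.0f} months, about {years:.0f} years")
```

Under these simplified assumptions, a gap of a few hundred mutations corresponds to decades rather than months, which is why a long period of quiet evolution has been suggested.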
According to the Cambridge team, the first outbreak could be a recent event involving the last few mutations that completed the leap from harmless strain to deadly pathogen.
The origin of the virus has become a politically sensitive issue. US President Donald Trump has repeatedly called the coronavirus the "Chinese virus", while Beijing has voiced a conspiracy theory that the virus was made and introduced to China by the American army.
This week, Fox News and CNN reported that the virus could have originated from a biosafety laboratory in Wuhan, quoting unnamed sources in the US government. The lab-origin theory has long been dismissed by the world's top life scientists, because all existing scientific evidence pointed to a natural origin.
The ongoing Cambridge study could shed more light on the issue.
"If I am pressed for an answer, I would say the original spread started more likely in southern China than in Wuhan," Forster said.
"But proof can only come from analysing more bats, possibly other potential host animals, and preserved tissue samples in Chinese hospitals stored between September and December. This kind of research project would help us understand how the transmission happened, and help us prevent similar instances in the future."
Su Bing, a genetic researcher with the Kunming Institute of Zoology in Yunnan, said phylogenetic networks were reliable tools used by gene detectives for decades and had found applications in a wide range of areas including tracing the movement of prehistoric humans.
But the method had its limits, he said. The accuracy of a time estimate based on a phylogenetic network is affected by the sample size and by the assumed mutation speed.
During an unprecedented outbreak, the virus could undergo transformations in unpredictable patterns.
"So it cannot be very precise – there is always a margin for error," he said.
"This work may provide some important clues to future investigations, but the conclusions should be treated with caution."
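Su's caution about the assumed mutation speed can be illustrated with a short Python sketch. It is a simplified example rather than anything taken from the study: the function name, the mutation count and both rates are hypothetical values picked to show how far an estimate can move when the assumed rate changes.

```python
def elapsed_months_range(mutation_count: int, slow_rate: float, fast_rate: float) -> tuple[float, float]:
    """Return the (shortest, longest) elapsed time in months for a given
    mutation count, when the true rate lies between slow_rate and fast_rate
    (both in mutations per month, both assumed)."""
    if not 0 < slow_rate <= fast_rate:
        raise ValueError("rates must satisfy 0 < slow_rate <= fast_rate")
    # A faster clock means less time is needed, so it gives the lower bound.
    return mutation_count / fast_rate, mutation_count / slow_rate


# Hypothetical inputs: 3 observed mutations, rate somewhere between 0.5 and 2 per month.
shortest, longest = elapsed_months_range(3, slow_rate=0.5, fast_rate=2.0)
print(f"Elapsed time: between {shortest} and {longest} months")
```

With these made-up inputs the same three mutations are compatible with anything from six weeks to half a year, which is the kind of margin for error Su describes.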
The Cambridge study also raised some new questions. The first strain isolated and reported by Chinese scientists was actually younger than the original type that caused the outbreak.
The finding that the US had more strains genetically closer to a bat virus than Wuhan did has prompted heated debate in the research community. One explanation, according to Forster, was that the original strain may have first emerged in China but was more adaptive to the American population and environment.