Automated Author Profile
Worobey, Michael
University of Arizona
Current S-Index
Sum of Dataset Indices for all datasets
Average Dataset Index per Dataset
Average Dataset Index per dataset
Total Datasets
Total datasets for this author
Average FAIR Score
Average FAIR Score per dataset
Total Citations
Total citations to the author's datasets
Total Mentions
Total mentions of the author's datasets
S-Index Interpretation
The S-Index (Sharing Index) is a comprehensive metric that represents the cumulative impact of all your datasets. It is calculated as the sum of Dataset Index scores across all your claimed datasets.
What it means:
- A higher S-index indicates greater overall impact of your datasets relative to typical datasets in their fields of research
- The S-Index grows as you add more datasets or as existing datasets gain more citations and mentions
- It provides a single number to track your research data impact over time
Current S-Index: 7.0 (sum of the Dataset Index scores of 4 datasets)
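As a rough illustration of the calculation described above, the sketch below sums per-dataset Dataset Index scores and derives the per-dataset average. The individual scores are hypothetical placeholders chosen only so that they add up to the reported total of 7.0 across 4 datasets; the profile does not list the real per-dataset values, and the function names are assumptions, not part of any published API.

```python
from typing import Sequence


def s_index(dataset_indices: Sequence[float]) -> float:
    """Return the S-Index: the sum of all Dataset Index scores."""
    return float(sum(dataset_indices))


def average_dataset_index(dataset_indices: Sequence[float]) -> float:
    """Return the mean Dataset Index per dataset (0.0 if there are none)."""
    if not dataset_indices:
        return 0.0
    return s_index(dataset_indices) / len(dataset_indices)


# Hypothetical per-dataset scores: the profile reports only the total (7.0)
# and the dataset count (4), not these individual values.
scores = [2.5, 2.0, 1.5, 1.0]

total = s_index(scores)
average = average_dataset_index(scores)
print(f"S-Index: {total}, average Dataset Index: {average}")
```

Because the S-Index is a plain sum, adding a dataset or raising any single Dataset Index raises the total, which matches the growth behaviour listed above.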
S-Index Over Time
Cumulative Citations Over Time
Cumulative Mentions Over Time
Datasets
Understanding of pandemics depends on characterization of pathogen collections from well-defined and demographically diverse cohorts. Since its emergence in Congo almost a century ago, HIV-1 has geographically spread and genetically diversified into distinct viral subtypes. Phylogenetic analysis can be used to reconstruct the ancestry of the virus and so shed light on the origin and distribution of subtypes. We sequenced two 3.6 kb amplicons of HIV-1 genomes from 3,197 participants in a clinical trial with consistent and uniform sampling at sites across 35 countries and analyzed our data with another 2,632 genomes that comprehensively reflect HIV-1 genetic diversity. We used maximum likelihood phylogenetic analysis coupled with geographical information to infer the state of ancestors. The majority of our sequenced genomes (n=2,501) were either pure subtypes (A-D, F, G) or CRF01_AE. The diversity and distribution of subtypes differed across geographical regions; the United States showed the most homogeneous subtype population, whereas African samples were the most diverse. We delineated transmission of the four most prevalent subtypes in our dataset (A, B, C, and CRF01_AE), and our results suggest both continuous and frequent transmission of HIV-1 across country borders and single transmission events seeding endemic population expansions. Overall, we show that coupling genetic and geographical information on HIV-1 can be used to understand the origin and spread of pandemic pathogens.
Authors
- Bennedbæk, Marc
- Zhukova, Anna
- Tang, Man-Hung Eric
- Bennet, Jaclyn
- Munderi, Paula
- Ruxrungtham, Kiat
- Gisslen, Magnus
- Worobey, Michael
- Lundgren, Jens D
- Marvig, Rasmus L
The emergence of HIV-1 group M subtype B in North American men who have sex with men was a key turning point in the HIV/AIDS pandemic. Phylogenetic studies have suggested cryptic subtype B circulation in the United States (US) throughout the 1970s (refs 1, 2) and an even older presence in the Caribbean (ref. 2). However, these temporal and geographical inferences, based upon partial HIV-1 genomes that postdate the recognition of AIDS in 1981, remain contentious (refs 3, 4) and the earliest movements of the virus within the US are unknown. We serologically screened >2,000 1970s serum samples and developed a highly sensitive approach for recovering viral RNA from degraded archival samples. Here, we report eight coding-complete genomes from US serum samples from 1978–1979, eight of the nine oldest HIV-1 group M genomes to date. This early, full-genome ‘snapshot’ reveals that the US HIV-1 epidemic exhibited extensive genetic diversity in the 1970s but also provides strong evidence for its emergence from a pre-existing Caribbean epidemic. Bayesian phylogenetic analyses estimate the jump to the US at around 1970 and place the ancestral US virus in New York City with 0.99 posterior probability support, strongly suggesting this was the crucial hub of early US HIV/AIDS diversification. Logistic growth coalescent models reveal epidemic doubling times of 0.86 and 1.12 years for the US and Caribbean, respectively, suggesting rapid early expansion in each location (ref. 3). Comparisons with more recent data reveal many of these insights to be unattainable without archival, full-genome sequences. We also recovered the HIV-1 genome from the individual known as ‘Patient 0’ (ref. 5) and found neither biological nor historical evidence that he was the primary case in the US or for subtype B as a whole. We discuss the genesis and persistence of this belief in the light of these evolutionary insights.
Authors
- Worobey, Michael
- Watts, Thomas D.
- McKay, Richard A.
- Suchard, Marc A.
- Granade, Timothy
- Teuwen, Dirk E.
- Koblin, Beryl A.
- Heneine, Walid
- Lemey, Philippe
- Jaffe, Harold W.
Zoonotic infectious diseases such as influenza continue to pose a grave threat to human health. However, the factors that mediate the emergence of RNA viruses such as influenza A virus (IAV) are still incompletely understood. Phylogenetic inference is crucial to reconstructing the origins and tracing the flow of IAV within and between hosts. Here we show that explicitly allowing IAV host lineages to have independent rates of molecular evolution is necessary for reliable phylogenetic inference of IAV and that methods that do not do so, including ‘relaxed’ molecular clock models, can be positively misleading. A phylogenomic analysis using a host-specific local clock model recovers extremely consistent evolutionary histories across all genomic segments and demonstrates that the equine H7N7 lineage is a sister clade to strains from birds—as well as those from humans, swine and the equine H3N8 lineage—sharing an ancestor with them in the mid to late 1800s. Moreover, major western and eastern hemisphere avian influenza lineages inferred for each gene coalesce in the late 1800s. On the basis of these phylogenies and the synchrony of these key nodes, we infer that the internal genes of avian influenza virus (AIV) underwent a global selective sweep beginning in the late 1800s, a process that continued throughout the twentieth century and up to the present. The resulting western hemispheric AIV lineage subsequently contributed most of the genomic segments to the 1918 pandemic virus and, independently, the 1963 equine H3N8 panzootic lineage. This approach provides a clear resolution of evolutionary patterns and processes in IAV, including the flow of viral genes and genomes within and between host lineages.
Authors
- Worobey, Michael
- Han, Guan-Zhu
- Rambaut, Andrew
Investigations into the evolutionary history of the common chimpanzee, Pan troglodytes, have produced inconsistent results, due to differences in the types of molecular data considered, the model assumptions employed, and the quantity and geographical range of samples used. We amplified and sequenced 24 complete P. troglodytes mitochondrial genomes from fecal samples collected at multiple study sites throughout sub-Saharan Africa. Using a ‘relaxed molecular clock,’ fossil calibrations, and 12 additional complete primate mitochondrial genomes, we analyzed the pattern and timing of primate diversification in a Bayesian framework. Our results support the recognition of four chimpanzee subspecies. Within P. troglodytes, we report a mean (95% highest posterior density (HPD)) time since most recent common ancestor (tMRCA) of 1.026 (0.811-1.263) MYA for the four proposed subspecies, with two major lineages. One of these lineages (tMRCA = 0.510 [0.387-0.650] MYA) contains P. t. verus (tMRCA = 0.155 [0.101-0.213] MYA) and P. t. ellioti (formerly P. t. vellerosus; tMRCA = 0.157 [0.102-0.215] MYA), both of which are monophyletic. The other major lineage contains P. t. schweinfurthii (tMRCA = 0.111 [0.077-0.146] MYA), a monophyletic clade nested within the P. t. troglodytes lineage (tMRCA = 0.380 [0.296-0.476] MYA). We utilized two analysis techniques that may be of widespread interest. First, we implemented a Yule speciation prior across the entire primate tree with separate coalescent priors on each of the chimpanzee subspecies. The validity of this approach was confirmed by estimates based on more traditional techniques. We also suggest that accurate tMRCA estimates from large, computationally difficult sequence alignments may be obtained by implementing our novel method of bootstrapping smaller, randomly sub-sampled alignments.
Authors
- Bjork, Adam
- Liu, Weimin
- Wertheim, Joel O
- Hahn, Beatrice H
- Worobey, Michael