Automated Author Profile
Chen, Charles
ORCID: 0000-0002-2203-0433
- Current S-Index: sum of Dataset Indices for all datasets
- Average Dataset Index per Dataset: mean Dataset Index across the author's datasets
- Total Datasets: total datasets for this author
- Average FAIR Score: average FAIR Score per dataset
- Total Citations: total citations to the author's datasets
- Total Mentions: total mentions of the author's datasets
S-Index Interpretation
The S-Index (Sharing Index) is a comprehensive metric that represents the cumulative impact of all your datasets. It is calculated as the sum of Dataset Index scores across all your claimed datasets.
What it means:
- A higher S-Index indicates greater overall impact of your datasets relative to typical datasets in their fields of research
- The S-Index grows as you add more datasets or as existing datasets gain more citations and mentions
- It provides a single number to track your research data impact over time
Current S-Index: 2.7 (sum of the Dataset Index scores of 3 datasets)
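The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the function names `s_index` and `average_dataset_index` are hypothetical, and the three per-dataset scores are made-up values chosen only so that they sum to the 2.7 reported here (the page does not list the individual Dataset Index scores).

```python
import math


def s_index(dataset_indices):
    """Return the S-Index: the sum of the per-dataset Dataset Index scores."""
    # math.fsum avoids accumulating floating-point rounding error.
    return math.fsum(dataset_indices)


def average_dataset_index(dataset_indices):
    """Return the mean Dataset Index per dataset (0.0 if there are none)."""
    if not dataset_indices:
        return 0.0
    return s_index(dataset_indices) / len(dataset_indices)


# Hypothetical per-dataset scores (assumed for illustration only).
example_scores = [1.2, 0.9, 0.6]

print(round(s_index(example_scores), 2))
print(round(average_dataset_index(example_scores), 2))
```

Because the S-Index is a plain sum, adding a dataset with a positive Dataset Index raises it, while the average can move in either direction.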
[Charts: S-Index Over Time; Cumulative Citations Over Time; Cumulative Mentions Over Time]
Datasets
The language of genetic code embodies a complex grammar and rich syntax of interacting molecular elements. Recent advances in self-supervision and feature learning suggest that statistical learning techniques can identify high-quality quantitative representations from inherent semantic structure. We present a gene-based language model that generates whole-genome vector representations from a population of 16 disease-causing bacterial species by leveraging natural contrastive characteristics between individuals. To achieve this, we developed a set-based learning objective, AB learning, that compares the annotated gene content of two population subsets for use in optimization. Using this foundational objective, we trained a Transformer model to backpropagate information into dense genome vector representations. The resulting bacterial representations, or embeddings, captured important population structure characteristics, like delineations across serotypes and host specificity preferences. Their vector quantities encoded the relevant functional information necessary to achieve state-of-the-art genomic supervised prediction accuracy in 11 out of 12 antibiotic resistance phenotypes.
Authors
- Naidenov, Bryan
- Chen, Charles
No description available
Authors
- Thistlethwaite, Frances R.
- Ratcliffe, Blaise
- Klápště, Jaroslav
- Porth, Ilga
- Chen, Charles
- Stoehr, Michael U.
- El-Kassaby, Yousry A.