Automated Author Profile

Chen, Charles

0000-0002-2203-0433

Current S-Index

2.7

Sum of Dataset Indices for all datasets

Average Dataset Index per Dataset

0.9

Average Dataset Index per dataset

Total Datasets

3

Total datasets for this author

Average FAIR Score

74.4%

Average FAIR Score per dataset

Total Citations

1

Total citations to the author's datasets

Total Mentions

0

Total mentions of the author's datasets

[Charts not reproduced: S-Index Interpretation; S-Index Over Time; Cumulative Citations Over Time; Cumulative Mentions Over Time]

Datasets

Gene-language models are whole genome representation learners (Version: 6)

The language of genetic code embodies a complex grammar and rich syntax of interacting molecular elements. Recent advances in self-supervision and feature learning suggest that statistical learning techniques can identify high-quality quantitative representations from inherent semantic structure. We present a gene-based language model that generates whole-genome vector representations from a population of 16 disease-causing bacterial species by leveraging natural contrastive characteristics between individuals. To achieve this, we developed a set-based learning objective, AB learning, that compares the annotated gene content of two population subsets for use in optimization. Using this foundational objective, we trained a Transformer model to backpropagate information into dense genome vector representations. The resulting bacterial representations, or embeddings, captured important population structure characteristics, like delineations across serotypes and host specificity preferences. Their vector quantities encoded the relevant functional information necessary to achieve state-of-the-art genomic supervised prediction accuracy in 11 out of 12 antibiotic resistance phenotypes.

Authors

  • Naidenov, Bryan
  • Chen, Charles
1 Citation | 0 Mentions | 69% FAIR | 1.1 Dataset Index
DOI: 10.5061/dryad.vx0k6djzn | February 2024

Douglas-fir exomic SNP file

No description available

Authors

  • Thistlethwaite, Frances R.
  • Ratcliffe, Blaise
  • Klápště, Jaroslav
  • Porth, Ilga
  • Chen, Charles
  • Stoehr, Michael U.
  • El-Kassaby, Yousry A.
0 Citations | 0 Mentions | 77% FAIR | 0.8 Dataset Index
DOI: 10.5061/dryad.vk048/1 | January 2017

Douglas-fir phenotypes

No description available

Authors

  • Thistlethwaite, Frances R.
  • Ratcliffe, Blaise
  • Klápště, Jaroslav
  • Porth, Ilga
  • Chen, Charles
  • Stoehr, Michael U.
  • El-Kassaby, Yousry A.
0 Citations | 0 Mentions | 77% FAIR | 0.8 Dataset Index
DOI: 10.5061/dryad.vk048/2 | January 2017
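
The summary figures at the top of this profile can be cross-checked against the per-dataset values listed above. The profile describes the S-Index as the sum of Dataset Indices across all datasets; the minimal sketch below recomputes it on that basis. The `summarize` function and the dictionary field names are hypothetical, chosen for illustration only, and the FAIR percentages are the rounded values displayed on the page, so the recomputed average may differ slightly from the one shown.

```python
# Per-dataset values copied from the dataset entries in this profile.
# Field names are illustrative assumptions, not the site's schema.
datasets = [
    {"doi": "10.5061/dryad.vx0k6djzn", "dataset_index": 1.1,
     "fair_percent": 69, "citations": 1, "mentions": 0},
    {"doi": "10.5061/dryad.vk048/1", "dataset_index": 0.8,
     "fair_percent": 77, "citations": 0, "mentions": 0},
    {"doi": "10.5061/dryad.vk048/2", "dataset_index": 0.8,
     "fair_percent": 77, "citations": 0, "mentions": 0},
]


def summarize(records):
    """Recompute the profile's summary metrics from per-dataset values."""
    count = len(records)
    # S-Index: sum of Dataset Indices, as stated in the profile.
    s_index = sum(r["dataset_index"] for r in records)
    return {
        "s_index": round(s_index, 1),
        "avg_dataset_index": round(s_index / count, 1),
        "total_datasets": count,
        # Computed from rounded per-dataset percentages (an approximation).
        "avg_fair_percent": round(
            sum(r["fair_percent"] for r in records) / count, 1
        ),
        "total_citations": sum(r["citations"] for r in records),
        "total_mentions": sum(r["mentions"] for r in records),
    }


summary = summarize(datasets)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With these inputs the S-Index comes out to 2.7 and the average Dataset Index to 0.9, matching the profile. The average FAIR score comes out to about 74.3 rather than the displayed 74.4%, presumably because the page averages unrounded per-dataset scores.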