Automated Author Profile
Taboada Ramírez, Blanca Itzelt
Universidad Nacional Autónoma de México
Current S-Index
Sum of Dataset Indices for all datasets
Average Dataset Index per Dataset
Average Dataset Index per dataset
Total Datasets
Total datasets for this author
Average FAIR Score
Average FAIR Score per dataset
Total Citations
Total citations to the author's datasets
Total Mentions
Total mentions of the author's datasets
S-Index Interpretation
The S-Index (Sharing Index) is a comprehensive metric that represents the cumulative impact of all your datasets. It is calculated as the sum of Dataset Index scores across all your claimed datasets.
What it means:
- A higher S-index indicates greater overall impact of your datasets relative to typical datasets in their fields of research
- The S-Index grows as you add more datasets or as existing datasets gain more citations and mentions
- It provides a single number to track your research data impact over time
Current S-Index: 4.1 (sum of the Dataset Index scores of 9 datasets)
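The arithmetic behind these figures can be sketched in a few lines of Python. The per-dataset scores below are hypothetical placeholders, since the page does not list individual Dataset Index values; only the total of 4.1 across 9 datasets comes from this profile, and the function names are illustrative.

```python
def s_index(dataset_indices):
    """S-Index as described above: the sum of the Dataset Index scores."""
    return sum(dataset_indices)


def average_dataset_index(total_s_index, n_datasets):
    """Average Dataset Index per dataset; 0.0 when there are no datasets."""
    return total_s_index / n_datasets if n_datasets else 0.0


# Hypothetical per-dataset scores (not taken from the profile).
example_scores = [0.5, 0.25, 0.25]
example_total = s_index(example_scores)

# Figures reported on this profile: S-Index 4.1 across 9 datasets.
profile_average = average_dataset_index(4.1, 9)
print(f"{profile_average:.2f}")
```

With the reported figures, the average Dataset Index per dataset comes out at roughly 0.46.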
S-Index Over Time
Cumulative Citations Over Time
Cumulative Mentions Over Time
Datasets
K-FluDB: A Novel K-Mer Based Database for Enhanced Genomic Surveillance of Influenza A Viruses
K-FluDB is a compressed database composed of distinct sub-sequences specific to 50 influenza A subtypes. It includes subtype-specific sequences for all 18 hemagglutinin (HA) and 11 neuraminidase (NA) subtypes. The original influenza sequences were obtained from the NCBI database on May 8, 2022, comprising a total of 895,900 Influenza A sequences.
To generate this database, sequences were first subsampled based on genomic segment and variant group, resulting in 81,262 sequences. These sequences served as input for the PanGen-InfluenzaA tool (GitHub), which constructs pangenomes by identifying both subtype-specific sequences and sequences shared across multiple subtypes.
Repository Contents
This repository contains a ZIP archive with three folders, each corresponding to pangenome datasets designed for reads of 75, 150, and 300 nucleotides in length. Each folder includes the following files:
- Dispensable data files (1_dispensable.fasta to 8_dispensable.fasta): These files contain the dispensable genomic fragments for each of the eight segments of the Influenza A virus.
- Subtype-specific data files (1_specific.fasta to 8_specific.fasta): These files contain the subtype-specific genomic fragments for each segment. The recommended files for mapping against genomic reads are 4_specific.fasta and 6_specific.fasta, corresponding to segments 4 and 6, which are the targets commonly used for Influenza A subtyping.
Within each core directory (75, 150, and 300), three distinct subdirectories (specific, pangenome, and dispensable) contain the pangenome files stratified by subtype. Specifically, the files located in the specific subdirectories for segments 4 and 6 are the recommended datasets for use in subtype identification during subsequent genomic analysis.
Compression Efficiency and Classification Accuracy
K-FluDB achieves a relative compression index of 96.54% when using the complete pangenome and 99.64% when considering only subtype-specific sequences. The average precision for correctly classifying Hx and Nx subtypes using the subtype-specific sequences is 99.2% and 99.71%, respectively.
This database provides a highly efficient and accurate resource for influenza A subtype classification while significantly reducing the storage and computational requirements associated with full-genome analyses.
Acknowledgements
This work has been supported by the Universidad Nacional Autónoma de México grant number [PAPIIT-DGAPA-IN230523] granted to Blanca Taboada and Secretaría de Educación, Ciencia, Tecnología e Innovación de la Ciudad de México with grant number [SECTEI/138/2024] granted to Selene Zárate.
The first author gratefully acknowledges the scholarship provided by CONAHCYT. We also extend our sincere appreciation to the National Autonomous University of Mexico (UNAM) for granting access to the MIZTLI supercomputer, supported by the General Directorate of Computing and Information and Communication Technologies (DGTIC) through project LANCAD-UNAM-DGTIC-350. Lastly, we wish to thank Jerome Verleyen, Juan Manuel Hurtado, and Roberto Bahena from UNAM’s Instituto de Biotecnología for their indispensable assistance with computational support.
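The distinction this record draws between subtype-specific sequences and sequences shared across subtypes can be illustrated with a toy k-mer partition. This is only a minimal sketch of the general idea, not the PanGen-InfluenzaA algorithm; the function names, the toy sequences, and the choice of k = 4 are all assumptions made for illustration.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def split_kmers(seqs_by_subtype, k):
    """Partition k-mers into subtype-specific ones and ones seen in several subtypes."""
    owners = defaultdict(set)  # k-mer -> set of subtypes in which it occurs
    for subtype, seqs in seqs_by_subtype.items():
        for seq in seqs:
            for kmer in kmers(seq, k):
                owners[kmer].add(subtype)

    specific = defaultdict(set)
    shared = set()
    for kmer, subtypes in owners.items():
        if len(subtypes) == 1:
            specific[next(iter(subtypes))].add(kmer)
        else:
            shared.add(kmer)
    return dict(specific), shared


# Toy input: two made-up sequences labeled with hypothetical subtype names.
toy = {"H1": ["ACGTAC"], "H3": ["ACGTTT"]}
specific, shared = split_kmers(toy, 4)
print(sorted(shared))          # k-mers present in both toy subtypes
print(sorted(specific["H1"]))  # k-mers present only in the toy H1 sequence
```

In this toy case the k-mer ACGT occurs in both sequences and therefore lands in the shared set, while the remaining k-mers are assigned to a single subtype.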
Authors
- Uscanga Junco, Oscar Alejandro ;
- Taboada Ramírez, Blanca Itzelt ;
- Díaz González, Lorena
K-FluDB: A Novel K-Mer Based Database for Enhanced Genomic Surveillance of Influenza A Viruses
K-FluDB is a compressed database composed of distinct sub-sequences specific to 50 influenza A subtypes. It includes unique sequences for all 18 hemagglutinin (HA) and 11 neuraminidase (NA) subtypes. The original influenza sequences were obtained from the NCBI database on May 8, 2022, comprising a total of 895,900 Influenza A sequences.
To generate this database, sequences were first subsampled based on genomic segment and variant group, resulting in 81,262 sequences. These sequences served as input for the PanGen-InfluenzaA tool (GitHub), which constructs pangenomes by identifying both unique subtype-specific sequences and sequences shared across multiple subtypes.
Repository Contents
This repository contains a ZIP archive with three folders, each corresponding to pangenome datasets designed for reads of 75, 150, and 300 nucleotides in length. Each folder includes the following files:
- Dispensable data files (1_dispensable.fasta to 8_dispensable.fasta): These files contain the dispensable genomic fragments for each of the eight segments of the Influenza A virus.
- Unique data files (1_unique.fasta to 8_unique.fasta): These files contain the unique genomic fragments for each segment. The recommended files for mapping against genomic reads are 4_unique.fasta and 6_unique.fasta, corresponding to segments 4 and 6, which are the targets commonly used for Influenza A subtyping.
- pangenome.fasta: This file contains the combined set of unique and dispensable genomic fragments for all eight segments.
- unique.fasta: This file includes only the unique genomic fragments for all eight segments.
Within each core directory (75, 150, and 300), three distinct subdirectories (unique, pangenome, and dispensable) contain the pangenome files stratified by subtype. Specifically, the files located in the unique subdirectories for segments 4 and 6 are the recommended datasets for use in subtype identification during subsequent genomic analysis.
Compression Efficiency and Classification Accuracy
K-FluDB achieves a relative compression index of 96.54% when using the complete pangenome and 99.64% when considering only subtype-specific sequences. The average precision for correctly classifying Hx and Nx subtypes using the unique sequences is 99.2% and 99.71%, respectively.
This database provides a highly efficient and accurate resource for influenza A subtype classification while significantly reducing the storage and computational requirements associated with full-genome analyses.
Acknowledgements
This research was partially supported by grant PAPIIT-DGAPA-IN230523 awarded to BT. The first author gratefully acknowledges the scholarship provided by CONAHCYT. We also extend our sincere appreciation to the National Autonomous University of Mexico (UNAM) for granting access to the MIZTLI supercomputer, supported by the General Directorate of Computing and Information and Communication Technologies (DGTIC) through project LANCAD-UNAM-DGTIC-350.
Authors
- Uscanga Junco, Oscar Alejandro ;
- Taboada Ramírez, Blanca Itzelt ;
- Díaz González, Lorena
K-FluDB: A Novel K-Mer Based Database for Enhanced Genomic Surveillance of Influenza A Viruses
K-FluDB is a compressed database composed of distinct sub-sequences specific to 50 influenza A subtypes. It includes subtype-specific sequences for all 18 hemagglutinin (HA) and 11 neuraminidase (NA) subtypes. The original influenza sequences were obtained from the NCBI database on May 8, 2022, comprising a total of 895,900 Influenza A sequences.
To generate this database, sequences were first subsampled based on genomic segment and variant group, resulting in 81,262 sequences. These sequences served as input for the PanGen-InfluenzaA tool (GitHub), which constructs pangenomes by identifying both subtype-specific sequences and sequences shared across multiple subtypes.
Repository Contents
This repository contains a ZIP archive with three folders, each corresponding to pangenome datasets designed for reads of 75, 150, and 300 nucleotides in length. Each folder includes the following files:
- Dispensable data files (1_dispensable.fasta to 8_dispensable.fasta): These files contain the dispensable genomic fragments for each of the eight segments of the Influenza A virus.
- Subtype-specific data files (1_specific.fasta to 8_specific.fasta): These files contain the subtype-specific genomic fragments for each segment. The recommended files for mapping against genomic reads are 4_specific.fasta and 6_specific.fasta, corresponding to segments 4 and 6, which are the targets commonly used for Influenza A subtyping.
Within each core directory (75, 150, and 300), three distinct subdirectories (specific, pangenome, and dispensable) contain the pangenome files stratified by subtype. Specifically, the files located in the specific subdirectories for segments 4 and 6 are the recommended datasets for use in subtype identification during subsequent genomic analysis.
Compression Efficiency and Classification Accuracy
K-FluDB achieves a relative compression index of 96.54% when using the complete pangenome and 99.64% when considering only subtype-specific sequences. The average precision for correctly classifying Hx and Nx subtypes using the subtype-specific sequences is 99.2% and 99.71%, respectively.
This database provides a highly efficient and accurate resource for influenza A subtype classification while significantly reducing the storage and computational requirements associated with full-genome analyses.
Acknowledgements
This research was partially supported by grant PAPIIT-DGAPA-IN230523 awarded to BT. The first author gratefully acknowledges the scholarship provided by CONAHCYT. We also extend our sincere appreciation to the National Autonomous University of Mexico (UNAM) for granting access to the MIZTLI supercomputer, supported by the General Directorate of Computing and Information and Communication Technologies (DGTIC) through project LANCAD-UNAM-DGTIC-350.
Authors
- Uscanga Junco, Oscar Alejandro ;
- Taboada Ramírez, Blanca Itzelt ;
- Díaz González, Lorena
K-FluDB: A Novel K-Mer Based Database for Enhanced Genomic Surveillance of Influenza A Viruses
K-FluDB is a compressed database composed of distinct sub-sequences specific to 50 influenza A subtypes. It includes unique sequences for all 18 hemagglutinin (HA) and 11 neuraminidase (NA) subtypes. The original influenza sequences were obtained from the NCBI database on May 8, 2022, comprising a total of 895,900 Influenza A sequences.
To generate this database, sequences were first subsampled based on genomic segment and variant group, resulting in 81,262 sequences. These sequences served as input for the PanGen-InfluenzaA tool (GitHub), which constructs pangenomes by identifying both unique subtype-specific sequences and sequences shared across multiple subtypes.
Repository Contents
This repository contains a ZIP archive with three folders, each corresponding to pangenome datasets designed for reads of 74, 150, and 300 nucleotides in length. Each folder includes the following files:
- Dispensable data files (1_dispensable.fasta to 8_dispensable.fasta): These files contain the dispensable genomic fragments for each of the eight segments of the Influenza A virus.
- Unique data files (1_unique.fasta to 8_unique.fasta): These files contain the unique genomic fragments for each segment. The recommended files for mapping against genomic reads are 4_unique.fasta and 6_unique.fasta, corresponding to segments 4 and 6, which are the targets commonly used for Influenza A subtyping.
- pangenome.fasta: This file contains the combined set of unique and dispensable genomic fragments for all eight segments.
- unique.fasta: This file includes only the unique genomic fragments for all eight segments.
- 4_6_unique.fasta: This file includes unique pangenomic pieces only for segments 4 and 6 and contains all 18 H types and 11 N types. This is the recommended file to use for Influenza A subtyping purposes.
Compression Efficiency and Classification Accuracy
K-FluDB achieves a relative compression index of 96.54% when using the complete pangenome and 99.64% when considering only subtype-specific sequences. The average precision for correctly classifying Hx and Nx subtypes using the unique sequences is 99.2% and 99.71%, respectively.
This database provides a highly efficient and accurate resource for influenza A subtype classification while significantly reducing the storage and computational requirements associated with full-genome analyses.
Acknowledgements
This research was partially supported by grant PAPIIT-DGAPA-IN230523 awarded to BT. The first author gratefully acknowledges the scholarship provided by CONAHCYT. We also extend our sincere appreciation to the National Autonomous University of Mexico (UNAM) for granting access to the MIZTLI supercomputer, supported by the General Directorate of Computing and Information and Communication Technologies (DGTIC) through project LANCAD-UNAM-DGTIC-350.
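To give a feel for what matching reads against a fragment file such as 4_6_unique.fasta involves, the sketch below parses FASTA text and reports which fragments occur verbatim in a read. It is a deliberate simplification: real read mapping would normally use an aligner, and the header names and sequences here are made-up placeholders, since this record does not describe the header format of the actual files.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def fragments_in_read(read, fragments):
    """Return headers of fragments found as exact substrings of the read."""
    return [header for header, seq in fragments if seq in read]


# Placeholder FASTA content; in practice this would be an opened fragment file.
toy_fasta = io.StringIO(">H1_frag1\nACGTAC\n>H3_frag1\nTTTGGA\n")
fragments = list(read_fasta(toy_fasta))
hits = fragments_in_read("GGACGTACCC", fragments)
print(hits)
```

Exact substring matching ignores sequencing errors and reverse-complement reads, which is one reason a proper mapper is preferable for real data.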
Authors
- Uscanga Junco, Oscar Alejandro ;
- Taboada Ramírez, Blanca Itzelt ;
- Díaz González, Lorena
K-FluDB: A Novel K-Mer Based Database for Enhanced Genomic Surveillance of Influenza A Viruses
K-FluDB is a compressed database composed of distinct sub-sequences specific to 50 influenza A subtypes. It includes unique sequences for all 18 hemagglutinin (HA) and 11 neuraminidase (NA) subtypes. The original influenza sequences were obtained from the NCBI database on May 8, 2022, comprising a total of 895,900 Influenza A sequences.
To generate this database, sequences were first subsampled based on genomic segment and variant group, resulting in 81,262 sequences. These sequences served as input for the PanGen-InfluenzaA tool (GitHub), which constructs pangenomes by identifying both unique subtype-specific sequences and sequences shared across multiple subtypes.
Repository Contents
This repository contains a ZIP archive with three folders, each corresponding to pangenome datasets designed for reads of 74, 150, and 300 nucleotides in length. Each folder includes the following files:
- Dispensable data files (1_dispensable.fasta to 8_dispensable.fasta): These files contain the dispensable genomic fragments for each of the eight segments of the Influenza A virus.
- Unique data files (1_unique.fasta to 8_unique.fasta): These files contain the unique genomic fragments for each segment. The recommended files for mapping against genomic reads are 4_unique.fasta and 6_unique.fasta, corresponding to segments 4 and 6, which are the targets commonly used for Influenza A subtyping.
- pangenome.fasta: This file contains the combined set of unique and dispensable genomic fragments for all eight segments.
- unique.fasta: This file includes only the unique genomic fragments for all eight segments.
Compression Efficiency and Classification Accuracy
K-FluDB achieves a relative compression index of 96.54% when using the complete pangenome and 99.64% when considering only subtype-specific sequences. The average precision for correctly classifying Hx and Nx subtypes using the unique sequences is 99.2% and 99.71%, respectively.
This database provides a highly efficient and accurate resource for influenza A subtype classification while significantly reducing the storage and computational requirements associated with full-genome analyses.
Acknowledgements
This research was partially supported by grant PAPIIT-DGAPA-IN230523 awarded to BT. The first author gratefully acknowledges the scholarship provided by CONAHCYT. We also extend our sincere appreciation to the National Autonomous University of Mexico (UNAM) for granting access to the MIZTLI supercomputer, supported by the General Directorate of Computing and Information and Communication Technologies (DGTIC) through project LANCAD-UNAM-DGTIC-350.
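The record does not define the "relative compression index" it reports. One plausible reading, stated here purely as an assumption, is the percentage by which the database is smaller than the input it was built from. The sketch below computes that quantity for hypothetical sizes; neither the formula nor the numbers are taken from the dataset.

```python
def relative_compression_index(original_size, compressed_size):
    """Percentage size reduction relative to the original (assumed definition)."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return 100 * (original_size - compressed_size) / original_size


# Hypothetical sizes in nucleotides, chosen only to illustrate the arithmetic.
example_index = relative_compression_index(1000, 50)
print(f"{example_index:.2f}%")
```

Under this reading, a value such as 96.54% would mean the complete pangenome retains about 3.46% of the original sequence content.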
Authors
- Uscanga Junco, Oscar Alejandro ;
- Taboada Ramírez, Blanca Itzelt ;
- Díaz González, Lorena
K-FluDB: A Novel K-Mer Based Database for Enhanced Genomic Surveillance of Influenza A Viruses
K-FluDB is a compressed database composed of distinct sub-sequences specific to 50 influenza A subtypes. It includes unique sequences for all 18 hemagglutinin (HA) and 11 neuraminidase (NA) subtypes. The original influenza sequences were obtained from the NCBI database on May 8, 2022, comprising a total of 895,900 Influenza A sequences.
To generate this database, sequences were first subsampled based on genomic segment and variant group, resulting in 81,262 sequences. These sequences served as input for the PanGen-InfluenzaA tool (GitHub), which constructs pangenomes by identifying both unique subtype-specific sequences and sequences shared across multiple subtypes.
Repository Contents
This repository contains a ZIP archive with three folders, each corresponding to pangenome datasets designed for reads of 74, 150, and 300 nucleotides in length. Each folder includes the following files:
- Dispensable data files (1_dispensable.fasta to 8_dispensable.fasta): These files contain the dispensable genomic fragments for each of the eight segments of the Influenza A virus.
- Unique data files (1_unique.fasta to 8_unique.fasta): These files contain the unique genomic fragments for each segment. The recommended files for mapping against genomic reads are 4_unique.fasta and 6_unique.fasta, corresponding to segments 4 and 6, which are the targets commonly used for Influenza A subtyping.
- pangenome.fasta: This file contains the combined set of unique and dispensable genomic fragments for all eight segments.
- unique.fasta: This file includes only the unique genomic fragments for all eight segments.
- 4_6_unique.fasta: This file includes unique pangenomic pieces only for segments 4 and 6 and contains all 18 H types and 11 N types. This is the recommended file to use for Influenza A subtyping purposes.
Compression Efficiency and Classification Accuracy
K-FluDB achieves a relative compression index of 96.54% when using the complete pangenome and 99.64% when considering only subtype-specific sequences. The average precision for correctly classifying Hx and Nx subtypes using the unique sequences is 99.2% and 99.71%, respectively.
This database provides a highly efficient and accurate resource for influenza A subtype classification while significantly reducing the storage and computational requirements associated with full-genome analyses.
Acknowledgements
This research was partially supported by grant PAPIIT-DGAPA-IN230523 awarded to BT. The first author gratefully acknowledges the scholarship provided by CONAHCYT. We also extend our sincere appreciation to the National Autonomous University of Mexico (UNAM) for granting access to the MIZTLI supercomputer, supported by the General Directorate of Computing and Information and Communication Technologies (DGTIC) through project LANCAD-UNAM-DGTIC-350.
Authors
- Uscanga Junco, Oscar Alejandro ;
- Taboada Ramírez, Blanca Itzelt ;
- Díaz González, Lorena
These are the genomic samples used in the study “K-FluDB: A Novel K-Mer Based Database for Enhanced Genomic Surveillance of Influenza A Viruses” by Uscanga et al. 2024. All samples included in this dataset were collected as part of the Official Mexican Influenza Response, adhering to the guidelines of the Mexican Official Norm NOM-017-SSA2-2012. Under this regulation, informed consent is waived, and all data are anonymized to ensure confidentiality.
Authors
- Uscanga Junco, Oscar Alejandro ;
- Taboada Ramírez, Blanca Itzelt ;
- Díaz González, Lorena
These are the genomic samples used in the study “K-FluDB: A Novel K-Mer Based Database for Enhanced Genomic Surveillance of Influenza A Viruses” by Uscanga et al. 2024. All samples included in this dataset were collected as part of the Official Mexican Influenza Response, adhering to the guidelines of the Mexican Official Norm NOM-017-SSA2-2012. Under this regulation, informed consent is waived, and all data are anonymized to ensure confidentiality.
Authors
- Uscanga Junco, Oscar Alejandro ;
- Taboada Ramírez, Blanca Itzelt ;
- Díaz González, Lorena
K-FluDB: A Novel K-Mer Based Database for Enhanced Genomic Surveillance of Influenza A Viruses
K-FluDB is a compressed database composed of distinct sub-sequences specific to 50 influenza A subtypes. It includes unique sequences for all 18 hemagglutinin (HA) and 11 neuraminidase (NA) subtypes. The original influenza sequences were obtained from the NCBI database on May 8, 2022, comprising a total of 895,900 Influenza A sequences.
To generate this database, sequences were first subsampled based on genomic segment and variant group, resulting in 81,262 sequences. These sequences served as input for the PanGen-InfluenzaA tool (GitHub), which constructs pangenomes by identifying both unique subtype-specific sequences and sequences shared across multiple subtypes.
Repository Contents
This repository contains the following files:
- pangenome.fasta: The complete influenza A pangenome, including both unique (subtype-specific) and non-unique sequences.
- pangenome_unique_4.fasta: Unique sequences from segment 4, which can be used to classify Hx subtypes.
- pangenome_unique_6.fasta: Unique sequences from segment 6, suitable for identifying Nx subtypes.
- pangenome_unique_46.fasta: A combined dataset containing unique sequences from both segment 4 and segment 6.
Additionally, within the segment_4 and segment_6 directories, there are individual files corresponding to each influenza A subtype. These files contain subtype-specific unique sequences and are named according to their respective NCBI taxonomic identifiers.
Compression Efficiency and Classification Accuracy
K-FluDB achieves a relative compression index of 96.54% when using the complete pangenome and 99.64% when considering only subtype-specific sequences. The average precision for correctly classifying Hx and Nx subtypes using the unique sequences is 99.2% and 99.71%, respectively.
This database provides a highly efficient and accurate resource for influenza A subtype classification while significantly reducing the storage and computational requirements associated with full-genome analyses.
Acknowledgements
This research was partially supported by grant PAPIIT-DGAPA-IN230523 awarded to BT. The first author gratefully acknowledges the scholarship provided by CONAHCYT. We also extend our sincere appreciation to the National Autonomous University of Mexico (UNAM) for granting access to the MIZTLI supercomputer, supported by the General Directorate of Computing and Information and Communication Technologies (DGTIC) through project LANCAD-UNAM-DGTIC-350.
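Because this record provides one file of unique sequences per subtype, a read could in principle be assigned to a subtype by counting k-mer hits against each subtype's fragments. The following is a minimal sketch of that voting idea, not the authors' classification procedure; the labels TAXID_A and TAXID_B stand in for real NCBI taxonomic identifiers, and the sequences and k = 5 are invented for illustration.

```python
from collections import Counter


def build_index(fragments_by_label, k):
    """Map each k-mer to the label whose fragments contain it.

    Assumes fragments are label-specific, so no k-mer belongs to two labels.
    """
    index = {}
    for label, fragments in fragments_by_label.items():
        for fragment in fragments:
            for i in range(len(fragment) - k + 1):
                index[fragment[i:i + k]] = label
    return index


def classify(read, index, k):
    """Return the label with the most k-mer hits in the read, or None if no hits."""
    votes = Counter()
    for i in range(len(read) - k + 1):
        label = index.get(read[i:i + k])
        if label is not None:
            votes[label] += 1
    return votes.most_common(1)[0][0] if votes else None


# Placeholder labels and sequences; real files would be loaded per subtype.
toy_fragments = {"TAXID_A": ["ACGTACGA"], "TAXID_B": ["TTGGCCAA"]}
toy_index = build_index(toy_fragments, 5)
print(classify("CCACGTACGG", toy_index, 5))
print(classify("GGGGGGG", toy_index, 5))
```

Returning None for reads without any hit keeps unclassifiable reads separate from confident assignments; a real pipeline would likely also require a minimum number of votes.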
Authors
- Uscanga Junco, Oscar Alejandro ;
- Taboada Ramírez, Blanca Itzelt ;
- Díaz González, Lorena