Scholar Data

Reducing the structure bias of RNA-Seq reveals a large number of non-annotated non-coding RNA Data Archive

[This repository contains the source data for the workflow presented in the manuscript "Reducing the structure bias of RNA-Seq reveals a large number of non-annotated non-coding RNA". The workflow can be found here: http://gitlabscottgroup.med.usherbrooke.ca/gaspard/snakemake_blockbuster ]

The study of RNA expression is the fastest-growing area of genomic research. However, despite the dramatic increase in the number of sequenced transcriptomes, we still do not have accurate estimates of the number and expression levels of non-coding RNA genes. Non-coding transcripts are often overlooked due to incomplete genome annotation. In this study, we use annotation-independent detection of RNA reads generated using a reverse transcriptase with low structure bias to identify non-coding RNA. Transcripts between 20 and 500 nucleotides were filtered and cross-checked against non-coding RNA annotations, revealing 115 non-annotated non-coding RNAs expressed in different cell lines and tissues. Inspecting the sequence and structural features of these transcripts indicated that 60% of them correspond to new tRNA and snoRNA genes. The identified genes exhibited features of their respective families in terms of structure, expression, conservation and response to depletion of interacting proteins. Together, our data reveal a new group of RNAs that are difficult to detect using standard gene prediction and RNA sequencing techniques, suggesting that reliance on current gene annotation and sequencing techniques distorts the perceived architecture of the human transcriptome.

Authors

Reulet, Gaspard ;
Scott, Michelle

0 Citations | 0 Mentions | 77% FAIR | 1.9 Dataset Index

10.5281/zenodo.3256666 | June 2019

Reducing the structure bias of RNA-Seq reveals a large number of non-annotated non-coding RNA Data Archive

Authors

Reulet, Gaspard ;
Scott, Michelle

0 Citations | 0 Mentions | 77% FAIR | 1.9 Dataset Index

10.5281/zenodo.3256665 | June 2019

MOESM4 of Aligning coding sequences with frameshift extension penalties

Additional file 4: Pairwise alignments for the 3-CDS benchmark. Zip file containing the sequence file and the pairwise alignment files in FASTA format for the manually built 3-CDS benchmark considered in the “Results” section, for each of the five methods and each parameter configuration.

Authors

Jammali, Safa ;
Kuitche, Esaie ;
Rachati, Ayoub ;
Bélanger, François ;
Scott, Michelle ;
Ouangraoua, Aïda

1 Citation | 0 Mentions | 13% FAIR | 0.7 Dataset Index

10.6084/m9.figshare.c.3731008_d4 | January 2017

MOESM5 of Aligning coding sequences with frameshift extension penalties

Additional file 5: Pairwise alignments for the 21-CDS dataset. Zip file containing the sequence file and the pairwise alignment files in FASTA format for the 21-CDS benchmark considered in the “Results” section, for each of the five methods and each parameter configuration.

Authors

Jammali, Safa ;
Kuitche, Esaie ;
Rachati, Ayoub ;
Bélanger, François ;
Scott, Michelle ;
Ouangraoua, Aïda

1 Citation | 0 Mentions | 13% FAIR | 0.7 Dataset Index

10.6084/m9.figshare.c.3731008_d5 | January 2017

MOESM5 of Aligning coding sequences with frameshift extension penalties

Authors

Jammali, Safa ;
Kuitche, Esaie ;
Rachati, Ayoub ;
Bélanger, François ;
Scott, Michelle ;
Ouangraoua, Aïda

1 Citation | 0 Mentions | 81% FAIR | 1.2 Dataset Index

10.6084/m9.figshare.c.3731008_d5.v1 | January 2017

MOESM2 of Aligning coding sequences with frameshift extension penalties

Additional file 2: CDS of the ten gene families. Zip file containing the CDS files in FASTA format for each of the ten gene families considered in the “Results” section.

Authors

Jammali, Safa ;
Kuitche, Esaie ;
Rachati, Ayoub ;
Bélanger, François ;
Scott, Michelle ;
Ouangraoua, Aïda

1 Citation | 0 Mentions | 81% FAIR | 1.2 Dataset Index

10.6084/m9.figshare.c.3731008_d2.v1 | January 2017

MOESM4 of Aligning coding sequences with frameshift extension penalties

Authors

Jammali, Safa ;
Kuitche, Esaie ;
Rachati, Ayoub ;
Bélanger, François ;
Scott, Michelle ;
Ouangraoua, Aïda

1 Citation | 0 Mentions | 81% FAIR | 2.3 Dataset Index

10.6084/m9.figshare.c.3731008_d4.v1 | January 2017

MOESM2 of Aligning coding sequences with frameshift extension penalties

Additional file 2: CDS of the ten gene families. Zip file containing the CDS files in FASTA format for each of the ten gene families considered in the “Results” section.

Authors

Jammali, Safa ;
Kuitche, Esaie ;
Rachati, Ayoub ;
Bélanger, François ;
Scott, Michelle ;
Ouangraoua, Aïda

1 Citation | 0 Mentions | 13% FAIR | 0.7 Dataset Index

10.6084/m9.figshare.c.3731008_d2 | January 2017

Additional file 2: of Extent of pre-translational regulation for the control of nucleocytoplasmic protein localization

Classification of all NLS motifs considered in this study, including their sequence, class, regulation modes, number of encoding transcripts and number of encoding exons. (XLSX 23 kb)

Authors

Luce, Mikael-Jonathan ;
Akpawu, Anna ;
Tucunduva, Daniel ;
Mason, Spencer ;
Scott, Michelle

1 Citation | 0 Mentions | 85% FAIR | 1.3 Dataset Index

10.6084/m9.figshare.c.3602123_d2.v1 | January 2016

Additional file 3: of Extent of pre-translational regulation for the control of nucleocytoplasmic protein localization

Classification of all NES motifs considered in this study, including their sequence, regulation modes, number of encoding transcripts and number of encoding exons. (XLSX 15 kb)

Authors

Luce, Mikael-Jonathan ;
Akpawu, Anna ;
Tucunduva, Daniel ;
Mason, Spencer ;
Scott, Michelle

1 Citation | 0 Mentions | 85% FAIR | 1.3 Dataset Index

10.6084/m9.figshare.c.3602123_d1.v1 | January 2016

Automated Author Profile
Scott, Michelle
