Multi-scale footprinting

Hu, Yan; Ma, Sai; Kartha, Vinay; Duarte, Fabiana; Labade, Ajay; Shrestha, Rojesh; Kletzien, Heidi; Earl, Andrew; Meliki, Alia; Castillo, Andrew; Durand, Neva; Mattei, Eugenio; Shoresh, Noam; Wagers, Amy; Buenrostro, Jason

Description

Data associated with the multi-scale footprinting project.

(1) Tn5_NN_model.h5: Pre-trained CNN-based Tn5 bias model implemented with Keras. Takes local DNA sequence context as input and predicts Tn5 insertion bias. See the tutorial for how to use this model.

(2) Tn5ModelTutorial.ipynb, Tn5ModelTutorial.html: Tutorial showing how to use the pre-trained Tn5 bias model to score input sequences.

(3) hg38Tn5Bias.tar.gz, mm10Tn5Bias.tar.gz, panTro6Tn5Bias.tar.gz, sacCer3Tn5Bias.tar.gz, dm6Tn5Bias.tar.gz, danRer11Tn5Bias.tar.gz, ce11Tn5Bias.tar.gz: h5 files containing the genome-wide Tn5 bias pre-computed using our convolutional neural net model.

(4) dispModel.tar.gz: Zipped folder containing Tn5 cutting dispersion models for each footprint window radius. The footprint window size in our paper refers to the diameter of the footprint window, which is twice the number listed here. During footprinting, these models are loaded into the footprintingProject object and then used for footprinting.

(5) cisBP_mouse_pwms_2021.rds, cisBP_human_pwms_2021.rds: Motif PWMs used in our study.

(6) TFBS_model.h5, TFBS_model_cluster_I.h5: Pre-trained TF binding prediction models. The models take local multi-scale footprints as input and predict whether a genomic position is bound by a TF if the corresponding motif is present. TFBS_model.h5 is the "TF habitation model" used in our study. It was trained using data from TFs in all TF clusters. TFBS_model_cluster_I.h5 was instead trained only on cluster 1 TFs (the TFs that leave the strongest footprints) and is in general not applicable to other TFs.

(7) clusterLabels.txt, clusterLabelsAllTFs.txt: Cluster labels of TFs. clusterLabels.txt is the clustering result obtained directly from clustering multi-scale footprints of all TFs with ChIP data. clusterLabelsAllTFs.txt also includes TFs without ChIP data. The cluster membership of these TFs was assigned based on motif homology among TFs.

(8) BMMCTutorial.tar.gz: Data needed for our tutorial. The contents of this folder can be placed in the /data/BMMCTutorial folder.
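
As a rough illustration of two points in the description, the sketch below shows one common way to turn a local DNA sequence context into a numeric matrix of the kind a CNN typically consumes, and how a dispersion-model radius maps to the footprint window size (diameter) quoted in the paper. The function names one_hot_encode and window_diameter, the A/C/G/T column order, and the handling of N are assumptions made for this example only; the input format the Tn5 bias model actually expects is defined in Tn5ModelTutorial.ipynb.

```python
# Illustrative sketch only. The encoding below is a generic one-hot scheme,
# NOT necessarily the input format used by Tn5_NN_model.h5 (see the tutorial).

BASES = "ACGT"  # assumed column order for this example


def one_hot_encode(seq):
    """Return a len(seq) x 4 one-hot matrix as nested lists.

    Rows follow the sequence; columns follow A, C, G, T.
    Bases outside ACGT (for example N) become all-zero rows.
    """
    index = {base: i for i, base in enumerate(BASES)}
    matrix = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        if base in index:
            row[index[base]] = 1
        matrix.append(row)
    return matrix


def window_diameter(radius):
    """Footprint window size (diameter) for a model listed by its radius.

    Per the description, the window size in the paper is twice the
    radius listed in dispModel.tar.gz.
    """
    return 2 * radius


encoded = one_hot_encode("ACGTN")
diameter = window_diameter(10)
```

Here encoded holds one row per base, and diameter is the window size corresponding to a hypothetical radius of 10.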

Publication Details

DOI

Publisher

Zenodo
