Automated Organization Profile
New York University
Current S-Index
Sum of Dataset Indices for all datasets
Average Dataset Index per Dataset
Average Dataset Index per dataset
Total Datasets
Total datasets in this organization
Average FAIR Score
Average FAIR Score per dataset
Total Citations
Total citations to the organization's datasets
Total Mentions
Total mentions of the organization's datasets
S-Index Interpretation
The S-Index (Sharing Index) is a comprehensive metric that represents the cumulative impact of all your datasets. It is calculated as the sum of Dataset Index scores across all your claimed datasets.
What it means:
- A higher S-index indicates greater overall impact of your datasets relative to typical datasets in their fields of research
- The S-Index grows as you add more datasets or as existing datasets gain more citations and mentions
- It provides a single number to track your research data impact over time
Current S-Index: 3173.8 (sum of the Dataset Index scores of 4,362 datasets)
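As a quick check on the definition above, the following minimal sketch sums a few made-up Dataset Index scores and then derives the per-dataset average implied by the totals reported on this page. The function name `s_index` and the sample scores are hypothetical; only the figures 3173.8 and 4,362 come from this profile.

```python
def s_index(dataset_indices):
    """Return the S-Index: the plain sum of per-dataset Dataset Index scores."""
    return sum(dataset_indices)


# Hypothetical Dataset Index scores for three datasets (illustrative only).
sample_scores = [0.5, 1.25, 2.0]
sample_s_index = s_index(sample_scores)
print(sample_s_index)  # 3.75

# Average Dataset Index implied by the totals reported on this page.
reported_s_index = 3173.8
reported_dataset_count = 4362
average_dataset_index = reported_s_index / reported_dataset_count
print(round(average_dataset_index, 3))  # 0.728
```

Because the S-Index is a sum, it can only stay level or grow as datasets are added, whereas the average can move in either direction.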
S-Index Over Time
Cumulative Citations Over Time
Cumulative Mentions Over Time
Datasets
Data archive for Tier 1 of the ForceSMIP project, which is described in "Forced Component Estimation Statistical Method Intercomparison Project (ForceSMIP)" by Wills et al. Please cite that paper for any usage of this data (preprint citation information below; cite the final published version once available).
Wills, R.C.J., C. Deser, K.A. McKinnon, A. Phillips, S. Po-Chedley, S. Sippel, A.L. Merrifield, C. Bône, C. Bonfils, G. Camps-Valls, S. Cropper, C. Connolly, S. Duan, H. Durand, A. Feigin, M.A. Fernandez, G. Gastineau, A. Gavrilov, E. Gordon, M. Günther, M. Höver, S. Kravtsov, Y.-N. Kuo, J. Lien, G.D. Madakumbura, N. Mankovich, M. Newman, J. Rader, J.-R. Shi, S.-I. Shin, G. Varando: Forced Component Estimation Statistical Method Intercomparison Project (ForceSMIP), ESS Open Archive, https://doi.org/10.22541/essoar.175003371.14843115/v1.
Types of data included here are:
- Evaluation-Tier1: Raw data for 10 evaluation members, including reanalysis/observations (member "1I")
- ensmeans-Tier1: The "true forced response", from the corresponding large ensemble mean, for the 9 evaluation members that are from models (all except "1I")
- ForceSMIP-estimates-Tier1: ForceSMIP method estimates of the forced response in each evaluation member
Each of these types of data is provided at monthly temporal resolution over 1950-2022, for each of 8 variables: tos (sea-surface temperature), tas (surface air temperature), pr (precipitation), psl (sea-level pressure), monmaxtasmax (monthly maximum daily maximum temperature), monmintasmin (monthly minimum daily minimum temperature), monmaxpr (monthly maximum daily precipitation), and zmta (zonal-mean atmospheric temperature). Annual maximums of monmaxtasmax and monmaxpr give the annual maximums TXx and Rx1day following standard notational conventions in the study of extreme events (Zhang et al. 2011, https://doi.org/10.1002/wcc.147). Similarly, the annual minimum of monmintasmin gives TNn.
For further details about the dataset and how it was generated, see Wills et al., "Forced Component Estimation Statistical Method Intercomparison Project (ForceSMIP)".
Correspondence: Robert Jnglin Wills ([email protected])
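The relationship between the monthly variables and the annual extremes TXx and TNn can be sketched as follows. This is only an illustration on synthetic numbers, assuming a single grid point whose monthly series starts in January and covers whole calendar years; the helper `annual_extreme` and the generated values are hypothetical and are not part of the ForceSMIP archive or its tooling.

```python
def annual_extreme(monthly_values, reducer):
    """Reduce a monthly series to one value per year using `reducer` (max or min).

    Assumes the series starts in January and spans whole calendar years.
    """
    if len(monthly_values) % 12 != 0:
        raise ValueError("expected a whole number of years of monthly data")
    return [
        reducer(monthly_values[start:start + 12])
        for start in range(0, len(monthly_values), 12)
    ]


n_years = 2022 - 1950 + 1  # 73 years of monthly data
n_months = n_years * 12    # 876 monthly values

# Synthetic stand-ins for one grid point (not real ForceSMIP data).
monmaxtasmax = [20 + (m % 12) + 0.01 * (m // 12) for m in range(n_months)]
monmintasmin = [-5 + (m % 12) - 0.01 * (m // 12) for m in range(n_months)]

txx = annual_extreme(monmaxtasmax, max)  # annual maximum of monthly maxima
tnn = annual_extreme(monmintasmin, min)  # annual minimum of monthly minima
print(len(txx), txx[0], tnn[0])
```

The same reduction applied to monmaxpr with `max` would give Rx1day.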
Authors
- Wills, Robert C.J. ;
- Merrifield, Anna L. ;
- Phillips, Adam ;
- Deser, Clara ;
- McKinnon, Karen ;
- Po-Chedley, Stephen ;
- Sippel, Sebastian ;
- Bône, Constantin ;
- Bonfils, Celine ;
- Camps-Valls, Gustau ;
- Cropper, Stephen ;
- Connolly, Charlotte ;
- Duan, Shiheng ;
- Durand, Homer ;
- Feigin, Alexander ;
- Fernandez, Martin ;
- Gastineau, Guillaume ;
- Gavrilov, Andrei ;
- Gordon, Emily ;
- Günther, Moritz ;
- Höver, Maren ;
- Kravtsov, Sergey ;
- Kuo, Yan-Ning ;
- Lien, Justin ;
- Madakumbura, Gavin Dayanga ;
- Mankovich, Nathan ;
- Newman, Matthew ;
- Rader, Jamin ;
- Shi, Jia-Rui ;
- Shin, Sang-Ik ;
- Varando, Gherardo
This dataset contains the AlphaFold3 input JSON files and outputs for the manuscript "Diversification and conservation of DNA binding specificities of SPL family of transcription factors" (Li et al., BioRxiv 2024. https://doi.org/10.1101/2024.09.13.612952).
If you use the data or the code, please cite: Li M, Yao T, Galli M, Lin W, Zhou Y, Chen JG, Gallavotti A, Huang SC. Diversification and conservation of DNA binding specificities of SPL family of transcription factors. bioRxiv [Preprint]. 2024 Sep 16:2024.09.13.612952. doi: 10.1101/2024.09.13.612952. PMID: 39345475; PMCID: PMC11429892.
The following files are provided:
- a2025-07-18_analyze_af3_lis-all_results_from_Alphafold_info.tsv: Tab-delimited text file describing the protein and DNA inputs for the AlphaFold3 results in a2025-07-18_analyze_af3_lis-all_results_from_Alphafold.zip
- a2025-07-18_analyze_af3_lis-all_results_from_Alphafold.zip: ZIP file of AlphaFold3 outputs; see "AlphaFold Server Output Terms of Use" at https://alphafoldserver.com/output-terms
- a2025-07-18_analyze_af3_lis-af3_inputs.zip: input JSON files for AlphaFold3
Authors
- Li, Miaomiao ;
- Huang, Shao-shan Carol
MULTIVOX is a multimodal, spatial audio–visual dataset of a-cappella vocal performances recorded in controlled conditions with both choir and vocal chamber ensembles. The dataset comprises 154 performances (≈3 hours total), captured in two acoustically distinct spaces (auditorium and recording studio). Each performance includes synchronized 360° video, far-field audio (first-order Ambisonics and ORTF stereo), and per-singer near-field recordings captured on personal devices. Performances feature 6- and 16-singer configurations arranged in a circle around the 360° camera and far-field devices. The repertoire covers 18 short choral pieces, including vocal warm-ups, Latin American songs, and arranged popular music. Annotations are provided in the metadata.csv, including session information, song, duration, tonal center, ensemble composition, condition, per-singer roles, facing direction, gender, height, and near-field file availability. Refer to the README.md for full annotation details. The MULTIVOX_extended_dataset_description_and_supplement.pdf file includes a detailed technical description of the dataset. The dataset is designed to support research on spatial audio, sound event localization, source separation, ensemble synchronization, and multimodal modeling of group singing. The authors discourage the use of MULTIVOX for GenAI model training.
Citation: Please cite the dataset if you use it in your research. You can read the AES paper here.
@inproceedings{meza2025multivox,
  author = {Meza, G. and Sepúlveda, M. and Roman, A. S. and Sigal Sefchovich, J. R. and Roman, I. R.},
  title = {{MULTIVOX: A Spatial Audio-Visual Dataset of Singing Groups}},
  booktitle = {Proceedings of the AES International Conference on Artificial Intelligence and Machine Learning for Audio (AES AIMLA)},
  year = {2025},
  address = {London},
  doi = {10.5281/zenodo.17058101}
}
You can find files C3.zip and C4.zip here: https://doi.org/10.5281/zenodo.17065497
Authors
- Meza, Gerardo ;
- Sepúlveda, Mariana ;
- Roman, Adrian ;
- Sigal Sefchovich, Jorge Rodrigo ;
- Roman, Iran R.
These are the data used to generate the figures associated with the paper: Helical Growth of Twining Common Bean is Associated with Longitudinal, Not Skewed, Microtubule Patterning. All associated scripts can be found here: https://github.com/angelique-acevedo/Microtubule-Twining-Analysis
Authors
- Acevedo, Angelique
IS3+ is an extended version of IS3 with clean audio/image pairs to ensure cross-modality consistency. The dataset has 4 GB of data.
The dataset contains the following data:
- audio_wav: audio files (.wav)
- gt_segmentation: annotations of image bounding boxes and segmentation masks
- images: images (.jpg)
- IS3_annotation.json: file with image/audio/gt information for every dataset sample.
This work was done as part of the paper Learning from Silence and Noise for Visual Sound Source Localization Models.
Paper citation:
@misc{juanola2025learningsilencenoisevisual,
  title={Learning from Silence and Noise for Visual Sound Source Localization},
  author={Xavier Juanola and Giovana Morais and Magdalena Fuentes and Gloria Haro},
  year={2025},
  eprint={2508.21761},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2508.21761}
}
Authors
- Morais, Giovana ;
- Juanola Molet, Xavier
Understanding animal behavior requires the ability to monitor and quantify behaviors, including increasingly ethological ones, across many time points. In the past several years, machine vision researchers have made significant advances in building animal tracking software relying on the development of convolutional neural networks (CNNs) and deep learning. However, the lack of longitudinal, curated video datasets for machine learning and behavioral science remains a key bottleneck in advancing both fields. Here, we present 3 curated multi-animal, multi-day rodent behavior datasets containing the behavior of gerbil families (2 adults and 4 pups) during critical periods of development (P15 to P30). Each cohort contains videos acquired at 24 fps for 15 days continuously, yielding approximately 30 million frames per cohort. We provide ground truth annotations for all animals in 1000 to 6000 frames across the cohorts using a standardized Ultralytics pose data format. We additionally provide predictions from SLEAP (Pereira et al., 2022) machine vision tracking, as a dataset for behavioral scientists to analyze and for machine vision researchers to use as a benchmark.
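The "approximately 30 million frames per cohort" figure follows from the stated frame rate and recording length. A small arithmetic sketch, assuming recording runs for the full 24 hours of each of the 15 days with no gaps (the variable names are illustrative):

```python
FPS = 24                          # frames per second, as stated in the description
DAYS = 15                         # continuous recording length per cohort
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400 seconds

frames_per_cohort = FPS * SECONDS_PER_DAY * DAYS
print(f"{frames_per_cohort:,}")   # 31,104,000
```

The result, about 31.1 million frames, is consistent with the rounded figure quoted in the description.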
Authors
- Mitelut, Catalin ;
- Sanes, Dan