Berkeley Education Alliance for Research in Singapore

Datasets for 'Energy Underprediction from Symmetry in Machine-Learning Interatomic Potentials' (Version: 1)

This collection contains the datasets associated with the paper "Energy Underprediction from Symmetry in Machine-Learning Interatomic Potentials". Nine pretrained machine-learning interatomic potentials (MLIAPs), namely CHGNet, MACE, ORB-MPtrj, SevenNet, eqV2, eqV2-DeNS, MatterSim, eSEN, and eSEN-MPtrj, were applied to a test set of 153,235 structures queried from the Materials Project (database release v2023.11.1, license: CC BY 4.0, no GNoME structures involved). See the paper (preprint version) for details.

pandas_dataframe_pandas_id_strc_spg.json.gz: A pandas dataframe in JSON format that can be read with Python pandas.read_json. It has three columns: material_id, a JSON-serialized pymatgen.core.Structure object (which must be decoded with the monty decoder), and space group. The data were queried from the Materials Project using SummaryDoc.

pandas_dataframe_pandas_id_SpgSym_SpgNum_DoFs.json: A pandas dataframe in JSON format that can be read with Python pandas.read_json. Its columns give the material_id, the space group number, and various degrees of freedom (DOF). The DOFs were computed with the PyXtal API.

pandas_dataframe_monty_dump_ThermoDoc_GGA_GGA+U_153243_retrieved_ehull.json.gz: A gzip-compressed JSON file holding a pandas dataframe that was dumped with monty.serialization.dumpfn and must be read with monty.serialization.loadfn. The file contains most of the fields queried from the Materials Project using ThermoDoc. The first columns, 'material_id', 'chemsys', 'elements', 'nelements', 'nsites', 'composition', 'formula_pretty', 'entry_task_id', 'crystal_system', 'space_group_number', 'space_group_symbol', 'symprec_queried', 'energy_type', 'uncorrected_energy', 'correction', 'energy', 'uncorrected_energy_per_atom', 'correction_per_atom', 'energy_per_atom', 'formation_energy_per_atom', 'energy_above_hull', 'is_stable', are data recorded in the Materials Project. The columns 'ef_uncorrected_retrieved', 'ehull_uncorrected_retrieved', 'ef_corrected_retrieved', and 'ehull_corrected_retrieved' are the formation energy and energy above hull recomputed from the uncorrected and corrected DFT energies. The recomputation was implemented with the pymatgen phase diagram.

${mliap_name}mp_relax.tar.gz: Gzip-compressed tar files holding the input and output files (structures, run logs, post-processed data, etc.) of about 13.8 million calculations. Only CHGNet has fix-symmetry (symmetry-constrained) calculations. Each archive is laid out as ${mliap_name}/no-symmetry/batch{1...101}.tar.gz. Each of batch{1...100}.tar.gz has 1,532 directories, one per material_id. Under each material_id there are 2 or 3 directories for the cell choices (as-queried, primitive, conventional). Under each cell-choice directory there are three directories for the relaxation type (no-relax, pos-relax, vc-relax), which hold the run logs and the input and output structures. Each ${mliap_name}/no-symmetry/ also has a refs_hull containing the MLIAP-calculated formation energy and energy above the hull. The MLIAP-calculated energy columns follow this convention: {ef or ehull}{cell choice}{relax task}

csv_structure_matcher.tar.gz: A separate tar.gz file that provides the StructureMatcher output for pairs of DFT- and MLIAP-relaxed structures, including whether they match, the RMSD, and the maximum paired distance.

Reading all data with Python is recommended.
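The archive layout described above can be cross-checked with a few lines of Python. The following is a minimal sketch, not the authors' tooling: the helper names (batch_sizes, run_dirs, max_calculations) are hypothetical, the directory labels are copied from the description and their exact on-disk spelling is an assumption, and the calculation count assumes that every material has all three cell choices and that the CHGNet fix-symmetry runs cover the same cell and relaxation combinations.

```python
from itertools import product

N_STRUCTURES = 153_235   # test structures, from the description
PER_BATCH = 1_532        # material_id directories in each full batch
N_MLIAPS = 9             # pretrained potentials listed in the description

# Labels as written in the description; the on-disk spelling is an assumption.
CELL_CHOICES = ("as-queried", "primitive", "conventional")
RELAX_TYPES = ("no-relax", "pos-relax", "vc-relax")


def batch_sizes(n_structures=N_STRUCTURES, per_batch=PER_BATCH):
    """Split the structures into full batches plus one remainder batch."""
    full, rest = divmod(n_structures, per_batch)
    return [per_batch] * full + ([rest] if rest else [])


def run_dirs(material_id, cell_choices=CELL_CHOICES):
    """List the relative run directories expected for one material."""
    return [
        f"{material_id}/{cell}/{relax}"
        for cell, relax in product(cell_choices, RELAX_TYPES)
    ]


def max_calculations():
    """Upper bound on the number of runs if every material has 3 cell choices."""
    per_material = len(CELL_CHOICES) * len(RELAX_TYPES)
    no_symmetry = N_STRUCTURES * N_MLIAPS * per_material
    fix_symmetry = N_STRUCTURES * per_material  # CHGNet only (assumed same grid)
    return no_symmetry + fix_symmetry


if __name__ == "__main__":
    sizes = batch_sizes()
    print(len(sizes), sizes[-1])      # number of batches, size of the last one
    print(run_dirs("mp-149")[:3])     # "mp-149" is only an illustrative id
    print(max_calculations())
```

Under these assumptions the 153,235 structures fill 100 batches of 1,532 with 35 left for a final batch, which is consistent with batch{1...101}, and the upper bound on the run count is 13,791,150, in line with the quoted figure of about 13.8 million. Materials with only two cell choices would lower the true count.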

Authors

Nong, Wei ;
Zhu, Ruiming ;
Ren, Zekun ;
Hoffmann Petersen, Martin ;
Yamazaki, Shuya ;
Kazeev, Nikita ;
Ustyuzhanin, Andrey ;
Wu, Gang ;
Yang, Shuo-Wang ;
Hippalgaonkar, Kedar

0 Citations | 0 Mentions | 77% FAIR | 1.9 Dataset Index

10.5281/zenodo.16074113 (July 2025)

Datasets for 'Energy Underprediction from Symmetry in Machine-Learning Interatomic Potentials' (Version: 1)

Authors

Nong, Wei ;
Zhu, Ruiming ;
Ren, Zekun ;
Hoffmann Petersen, Martin ;
Yamazaki, Shuya ;
Kazeev, Nikita ;
Ustyuzhanin, Andrey ;
Wu, Gang ;
Yang, Shuo-Wang ;
Hippalgaonkar, Kedar

0 Citations | 0 Mentions | 77% FAIR | 1.9 Dataset Index

10.5281/zenodo.16074112 (July 2025)

ASHRAE global database of thermal comfort field measurements

Abstract: Recognizing the value of open-source research databases in advancing the art and science of HVAC, in 2014 the ASHRAE Global Thermal Comfort Database II project was launched under the leadership of University of California at Berkeley’s Center for the Built Environment and The University of Sydney’s Indoor Environmental Quality (IEQ) Laboratory. The ASHRAE Global Thermal Comfort Database II (as it is known) is intended to support diverse inquiries about thermal comfort in field settings. The exercise began with a systematic collection and harmonization of raw data from the last two decades of thermal comfort field studies around the world. The final database comprises field studies from around the world, with contributors releasing their raw data to the project for wider dissemination to the thermal comfort research community. After the quality-assurance process, there were 77,304 rows of data, each pairing subjective comfort votes with objective instrumental measurements of thermal comfort parameters. An additional 25,288 rows of data from the original ASHRAE RP-884 database are included. The most recent update (version 2.1) has 6,441 new rows of data, bringing the total number of entries to 109,033. The project was partially performed within the framework of the International Energy Agency Energy in Buildings and Communities Programme (IEA-EBC) Annex 69 "Strategy and Practice of Adaptive Thermal Comfort in Low Energy Buildings".

Authors

Parkinson, Thomas ;
Tartarini, Federico ;
Földváry Ličina, Veronika ;
Cheung, Toby ;
Zhang, Hui ;
De Dear, Richard ;
Li, Peixian ;
Arens, Edward ;
Chun, Chungyoon ;
Schiavon, Stefano ;
Luo, Maohui ;
Brager, Gail

0 Citations | 0 Mentions | 88% FAIR | 1.9 Dataset Index

10.5683/sp2/gnvem8 (January 2021)

buds-lab/building-data-genome-project-2: v1.0 (Version: v1.0)

The BDG2 open data set consists of 3,053 energy meters from 1,636 non-residential buildings with a range of two full years (2016 and 2017) at an hourly frequency (17,544 measurements per meter resulting in approximately 53.6 million measurements). These meters are collected from 19 sites across North America and Europe, and they measure electrical, heating and cooling water, steam, and solar energy as well as water and irrigation meters. Part of these data was used in the Great Energy Predictor III (GEPIII) competition hosted by the ASHRAE organization in October-December 2019. This subset includes data from 2,380 meters from 1,448 buildings that were used in the GEPIII, a machine learning competition for long-term prediction with an application to measurement and verification. This paper describes the process of data collection, cleaning, and convergence of time-series meter data, the meta-data about the buildings, and complementary weather data. This data set can be used for further prediction benchmarking and prototyping as well as anomaly detection, energy analysis, and building type classification.
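The per-meter and total measurement counts quoted above follow from the two-year hourly range, with 2016 being a leap year. A minimal sketch of that arithmetic, using only figures from the description (the helper name hours_in_year is hypothetical):

```python
import calendar

N_METERS = 3_053        # energy meters, from the description
YEARS = (2016, 2017)    # two full years of hourly data


def hours_in_year(year):
    """Number of hourly readings in a calendar year (leap years have 366 days)."""
    return (366 if calendar.isleap(year) else 365) * 24


hours_per_meter = sum(hours_in_year(y) for y in YEARS)
total_measurements = N_METERS * hours_per_meter

if __name__ == "__main__":
    print(hours_per_meter)      # readings per meter over both years
    print(total_measurements)   # readings across all meters
```

This gives 8,784 + 8,760 = 17,544 readings per meter and 53,561,832 readings in total, which rounds to the stated 53.6 million. It is a nominal count that assumes no gaps in any meter's series.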

Authors

Miller, Clayton ;
Anjukan Kathirgamanathan ;
Picchetti, Bianca ;
Pandarasamy Arjunan ;
Park, June Young ;
Zoltan Nagy ;
Raftery, Paul ;
Hobson, Brodie W. ;
Zixiao Shi ;
Meggers, Forrest

165 Citations | 0 Mentions | 77% FAIR | 81.8 Dataset Index

10.5281/zenodo.3887305 (June 2020)

buds-lab/building-data-genome-project-2: v1.0 (Version: v1.0)

Authors

Miller, Clayton ;
Anjukan Kathirgamanathan ;
Picchetti, Bianca ;
Pandarasamy Arjunan ;
Park, June Young ;
Zoltan Nagy ;
Raftery, Paul ;
Hobson, Brodie W. ;
Zixiao Shi ;
Meggers, Forrest

3 Citations | 0 Mentions | 77% FAIR | 1.9 Dataset Index

10.5281/zenodo.3887306 (June 2020)

Discomfort due to glare from a large source: Evaluating stimulus range effects when using the luminance adjustment procedure (Version: 3)

Using the luminance adjustment procedure, we evaluated four discomfort sensations on a Hopkinson-like glare scale: (a), (b), (c) and (d), respectively. Two methodological factors that were believed to influence the luminance adjustment procedure were investigated. One was the stimulus range bias effect. This describes how the available range of a variable stimulus (i.e., glare source luminances) influences the setting made to a subjective sensation (i.e., "just uncomfortable" glare). 42 test participants were recruited to the experiment and made glare settings to the four discomfort sensations under three stimulus ranges with different maximum luminances: low, middle, and high. Adjustments were made to the luminance of a large artificial window backlit by an array of warm and cool LEDs. This was repeated for two methods of control. Direct: the observer adjusted the glare source directly to the four discomfort sensations; and Indirect: the experimenter adjusted the glare source according to the vocal instructions of the participant. The dataset here contains the luminance settings made to each of the four sensations under the three ranges and two methods of control for all 42 test participants.

Authors

Kent, Michael ;
Cheung, Toby ;
Schiavon, Stefano

2 Citations | 0 Mentions | 69% FAIR | 2.2 Dataset Index

10.6078/d1cw92 (December 2018)

A Bayesian method of evaluating discomfort due to glare: The effect of order bias from a large glare source (Version: 2)

To be confirmed

Authors

Cheung, Toby ;
Kent, Michael ;
Schiavon, Stefano ;
Lipczyńska, Aleksandra

2 Citations | 0 Mentions | 69% FAIR | 2.4 Dataset Index

10.6078/d14q14 (October 2018)

ASHRAE global database of thermal comfort field measurements (Version: 9)

Recognizing the value of open-source research databases in advancing the art and science of HVAC, in 2014 the ASHRAE Global Thermal Comfort Database II project was launched under the leadership of University of California at Berkeley’s Center for the Built Environment and The University of Sydney’s Indoor Environmental Quality (IEQ) Laboratory. The ASHRAE Global Thermal Comfort Database II (as it is known) is intended to support diverse inquiries about thermal comfort in field settings. The exercise began with a systematic collection and harmonization of raw data from the last two decades of thermal comfort field studies around the world. The final database comprises field studies from around the world, with contributors releasing their raw data to the project for wider dissemination to the thermal comfort research community. After the quality-assurance process, there were 77,304 rows of data, each pairing subjective comfort votes with objective instrumental measurements of thermal comfort parameters. An additional 25,288 rows of data from the original ASHRAE RP-884 database are included. The most recent update (version 2.1) has 6,441 new rows of data, bringing the total number of entries to 109,033. The project was partially performed within the framework of the International Energy Agency Energy in Buildings and Communities Programme (IEA-EBC) Annex 69 "Strategy and Practice of Adaptive Thermal Comfort in Low Energy Buildings".

Authors

Parkinson, Thomas ;
Tartarini, Federico ;
Földváry Ličina, Veronika ;
Cheung, Toby ;
Zhang, Hui ;
de Dear, Richard ;
Li, Peixian ;
Arens, Edward ;
Chun, Chungyoon ;
Schiavon, Stefano ;
Luo, Maohui ;
Brager, Gail

13 Citations | 13 Mentions | 69% FAIR | 12.5 Dataset Index

10.6078/d1f671 (July 2018)

Automated Organization Profile
Berkeley Education Alliance for Research in Singapore

