Version v0.0.2

Binary Classification as a Phase Separation Process (data repository)

Monteiro, Rafael

Description

For version 0.0.2 (from 2021), see below:

This is a data repository for the paper "Binary classification as a phase separation process", by Rafael Monteiro.

Website with a description of this project: https://rafael-a-monteiro-math.github.io/Binary_classification_phase_separation/index.html
GitHub: https://github.com/rafael-a-monteiro-math/Binary_classification_phase_separation

This is the second version, which I wrote using TensorFlow. It is much smaller (5 GB when decompressed), a remarkable improvement over the more than 100 GB of the previous version.

The new files are:
PSBC_BCs.tar.gz
PSBC_classifier_PCA.tar.gz
PSBC_dataset.tar.gz
PSBC_libs_grids_statistics.tar.gz
PSBC_notebooks.tar.gz

Their content is explained in the file README_v2.pdf.

UPDATE: a Google Colab folder is also available. You can find all the data and libraries there, unpacked. For usage, see the GitHub page.

NOTE: I will keep the content of the previous version available on my GitHub as well. It is still a "nice exercise" to redo in NumPy, as was done there, everything that is done in this new version. (Or, I should say, the previous version should be studied as a cautionary tale of what to avoid.)

For version 0.0.1 (from 2020), see below:

This is a data repository for the paper "Binary classification as a phase separation process", by Rafael Monteiro.

Website with a description of this project: https://rafael-a-monteiro-math.github.io/Binary_classification_phase_separation/index.html
GitHub: https://github.com/rafael-a-monteiro-math/Binary_classification_phase_separation

Therein you will find:
Examples
1D toy model examples
Computational statistics
Several trained PSBC models on the MNIST dataset, with different parameter configurations
Extra simulations, investigating normalization properties, low-dimensional models that fail due to "too much" model compression, and a comparison among ANNs, KNNs, and the PSBC in 1D

If you want to know
how to read the data,
how to access computational statistics, raw data, and examples,
how to use the data stored in this data repository,
see the guide README.pdf on the GitHub page at Binary_Classification_Phase_Separation, where a script that downloads (and organizes) all this data is also available ("download_PSBC.sh").

I did not include a copy of the train-test set (the 0-1 subset of the MNIST database) in every folder with simulations. But you can find a copy of the normalized dataset in the tarball "PSBC_Examples.tar.gz", as data_test_normalized_MNIST.csv and data_train_normalized_MNIST.csv.
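As a rough illustration of how the normalized CSV files could be read straight out of a downloaded tarball, here is a minimal Python sketch using only the standard library. It is not taken from the repository: the helper name read_csv_from_tarball is hypothetical, and the sketch assumes the CSV files are plain comma-separated text stored somewhere inside the archive. The internal folder layout and the column meaning are not specified here, so check README.pdf for the actual structure.

```python
import csv
import io
import tarfile


def read_csv_from_tarball(tar_path, member_suffix):
    """Return the rows of the first archive member whose name ends with member_suffix.

    Hypothetical helper for illustration only. It matches on the end of the
    member name because the folder layout inside the archive is an unknown.
    """
    with tarfile.open(tar_path, mode="r:gz") as archive:
        for member in archive.getmembers():
            if member.isfile() and member.name.endswith(member_suffix):
                raw = archive.extractfile(member)
                # Decode the binary stream as text; UTF-8 is an assumption.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                # Rows come back as lists of strings; convert to float as needed.
                return list(csv.reader(text))
    raise FileNotFoundError(
        f"no member ending in {member_suffix!r} found in {tar_path}"
    )
```

With the archive in the working directory, a call such as read_csv_from_tarball("PSBC_Examples.tar.gz", "data_train_normalized_MNIST.csv") would return the rows as lists of strings. Whether the file has a header row or how labels are encoded is not stated in this description, so inspect the first few rows before converting them to numbers.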

Citations (0)

Mentions (1)

Metrics

Dataset Index

1.7

FAIR Score

48%

Publication Details

DOI

Publisher

Zenodo

Assigned Domain

Subfield

Water Science and Technology

Field

Environmental Science

Domain

Physical Sciences

Confidence Score

39%

Source

OpenAlex

Keywords

Phase-separation, Allen-Cahn model, binary classification, statistical machine learning, reaction-diffusion systems, maximum-principles, finite-differences methods, inverse problems, recurrent neural networks

Normalization Factors

FT

13.46

CTw

1.00

MTw

1.00