Binary Classification as a Phase Separation Process (data repository)
Description
For version 0.0.2 (from 2021), see below:

This is a data repository for the paper "Binary classification as a phase separation process", by Rafael Monteiro.

Website with description of this project: https://rafael-a-monteiro-math.github.io/Binary_classification_phase_separation/index.html
GitHub: https://github.com/rafael-a-monteiro-math/Binary_classification_phase_separation

This is the second version, which I wrote using TensorFlow. It is much smaller (5 GB when decompressed), a remarkable improvement over the more than 100 GB of the previous version. The new files are

- PSBC_BCs.tar.gz
- PSBC_classifier_PCA.tar.gz
- PSBC_dataset.tar.gz
- PSBC_libs_grids_statistics.tar.gz
- PSBC_notebooks.tar.gz

Their content is explained in the file README_v2.pdf.

UPDATE: a Google Colab folder is also available. You can also find all the data and libraries there, unpacked. For usage, see the GitHub page.

NOTE: I will keep the content of the previous version available on my GitHub as well. It is still a "nice exercise" to do in NumPy everything that is done in this new version, as was done there. (Or, I should say, it should be studied as a cautionary tale of what to avoid.)

For version 0.0.1 (from 2020), see below:

This is a data repository for the paper "Binary classification as a phase separation process", by Rafael Monteiro.
Website with description of this project: https://rafael-a-monteiro-math.github.io/Binary_classification_phase_separation/index.html
GitHub: https://github.com/rafael-a-monteiro-math/Binary_classification_phase_separation

Therein you will find

- Examples
- 1D toy model examples
- Computational statistics
- Several trained PSBC on the MNIST dataset, with different parameter configurations
- Extra simulations, investigating normalization properties, low-dimensional models that fail due to "too much" model compression, and a comparison among ANNs, KNNs, and the PSBC in 1D

If you want to know

- how to read the data
- how to access computational statistics, raw data, and examples
- how to use the data stored in this data repository

see the guide README.pdf on the GitHub page at Binary_Classification_Phase_Separation, where a script that downloads (and organizes) all this data is also available ("download_PSBC.sh").

I did not include a copy of the train-test set (the 0-1 subset of the MNIST database) in every folder with simulations. But you can find a copy of the normalized dataset in the tarball "PSBC_Examples.tar.gz", as data_test_normalized_MNIST.csv and data_train_normalized_MNIST.csv.
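As a rough illustration of how one might pull a single CSV file out of one of these tarballs without unpacking everything, here is a minimal Python sketch. It is not taken from the repository: the helper name `read_csv_from_tarball` is hypothetical, the location of the CSV inside the archive is not assumed (the file is located by its base name), and rows are returned as raw strings because the column layout and the presence of a header are not documented here. The README.pdf guide remains the authoritative reference.

```python
import csv
import io
import tarfile
from pathlib import PurePosixPath


def read_csv_from_tarball(tar_path, csv_name):
    """Return the rows of the first archive member whose file name is csv_name.

    Hypothetical helper: rows come back as lists of strings, with no
    assumption about headers or about what each column means.
    """
    with tarfile.open(tar_path, mode="r:gz") as archive:
        for member in archive.getmembers():
            # Match on the base name only, since the directory layout
            # inside the tarball is not assumed here.
            if member.isfile() and PurePosixPath(member.name).name == csv_name:
                raw = archive.extractfile(member)
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                return list(csv.reader(text))
    raise FileNotFoundError(f"{csv_name} not found in {tar_path}")
```

After downloading the archive, a call such as `read_csv_from_tarball("PSBC_Examples.tar.gz", "data_train_normalized_MNIST.csv")` would return the rows of the training file, provided the file is stored under that name in the tarball; converting the strings to numbers is left to the reader once the layout has been checked against the README.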
Publication Details
Subfield: Water Science and Technology
Field: Environmental Science
Domain: Physical Sciences
Confidence Score: 39%
Source: OpenAlex