Universidad de la República. Facultad de Ingeniería

Datasets

Dataset for: Water-quality data imputation with a high percentage of missing values: a machine learning approach

The monitoring of surface-water quality followed by water-quality modeling and analysis is essential for generating effective strategies in water resource management. However, water-quality studies are limited by the lack of complete and reliable data sets on surface-water-quality variables. These deficiencies are particularly noticeable in developing countries.

This work focuses on surface-water-quality data from the Santa Lucía Chico River (Uruguay), a mixed lotic and lentic river system. Data collected at six monitoring stations are publicly available at https://www.dinama.gub.uy/oan/datos-abiertos/calidad-agua/. The high temporal and spatial variability that characterizes water-quality variables and the high rate of missing values (between 50% and 70%) raise significant challenges.

To deal with missing values, we applied several statistical and machine-learning imputation methods. The competing algorithms included both univariate and multivariate imputation methods: inverse distance weighting (IDW), Random Forest Regressor (RFR), Ridge (R), Bayesian Ridge (BR), AdaBoost (AB), Huber Regressor (HR), Support Vector Regressor (SVR), and K-nearest neighbors Regressor (KNNR). IDW outperformed the others, achieving very good performance (NSE greater than 0.8) in most cases.

In this dataset, we include the original and imputed values for the following variables:

- Water temperature (Tw)
- Dissolved oxygen (DO)
- Electrical conductivity (EC)
- pH
- Turbidity (Turb)
- Nitrite (NO2-)
- Nitrate (NO3-)
- Total Nitrogen (TN)

Each variable is identified as [STATION] VARIABLE FULL NAME (VARIABLE SHORT NAME) [UNIT METRIC].

More details about the study area, the original datasets, and the methodology adopted can be found in our paper https://www.mdpi.com/2071-1050/13/11/6318.

If you use this dataset in your work, please cite our paper: Rodríguez, R.; Pastorini, M.; Etcheverry, L.; Chreties, C.; Fossati, M.; Castro, A.; Gorgoglione, A. Water-Quality Data Imputation with a High Percentage of Missing Values: A Machine Learning Approach. Sustainability 2021, 13, 6318. https://doi.org/10.3390/su13116318
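The description reports imputation quality as NSE (Nash-Sutcliffe efficiency). As a minimal sketch of how such a score could be computed when comparing original and imputed values, the function below uses the standard NSE definition; the function name and the sample numbers are illustrative assumptions and are not taken from the dataset or the paper's code.

```python
from typing import Sequence


def nash_sutcliffe_efficiency(observed: Sequence[float],
                              simulated: Sequence[float]) -> float:
    """Standard NSE: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    A value of 1.0 means a perfect match; 0.0 means the estimates are
    no better than always predicting the mean of the observations.
    """
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined when all observations are equal")
    return 1.0 - squared_error / spread


# Hypothetical values for illustration only (not from the dataset):
# known measurements that were hidden, and the values an imputer returned.
held_out = [10.0, 12.0, 14.0, 16.0]
imputed = [10.5, 11.5, 14.5, 15.5]
score = nash_sutcliffe_efficiency(held_out, imputed)
print(f"NSE = {score:.2f}")
```

In an evaluation of this kind, known values are typically masked, imputed, and then compared against the held-out originals; the exact validation protocol used by the authors is described in the paper.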

Authors

Rodríguez, Rafael
Pastorini, Marcos
Etcheverry, Lorena
Chreties, Christian
Fossati, Mónica
Castro, Alberto
Gorgoglione, Angela

0 Citations | 0 Mentions | 88% FAIR | 1.9 Dataset Index

10.60895/redata/tnrt8q2024

Android Simulator for Robotito Básico

Application that simulates the Robotito Básico in a work environment.

Authors

Bakala, Ewelina

0 Citations | 0 Mentions | 85% FAIR | 1.8 Dataset Index

10.60895/redata/rvyimx2024

Automated Organization Profile
Universidad de la República. Facultad de Ingeniería


Profile metrics: Current S-Index; Average Dataset Index per Dataset; Total Datasets; Average FAIR Score; Total Citations; Total Mentions

Profile panels: S-Index Interpretation; S-Index Over Time; Cumulative Citations Over Time; Cumulative Mentions Over Time

