Automated Author Profile

Unwin, Antony

Current S-Index

3.3

Sum of Dataset Indices for all datasets

Average Dataset Index per Dataset

0.5

Average Dataset Index per dataset

Total Datasets

6

Total datasets for this author

Average FAIR Score

84.6%

Average FAIR Score per dataset

Total Citations

3

Total citations to the author's datasets

Total Mentions

0

Total mentions of the author's datasets

Datasets

Penguins Go Parallel: A Grammar of Graphics Framework for Generalized Parallel Coordinate Plots

Parallel Coordinate Plots (PCP) are a valuable tool for exploratory data analysis of high-dimensional numerical data. The use of PCPs is limited when working with categorical variables or a mix of categorical and continuous variables. In this article, we propose Generalized Parallel Coordinate Plots (GPCP) to extend the ability of PCPs from just numeric variables to dealing seamlessly with a mix of categorical and numeric variables in a single plot. In this process we find that existing solutions for categorical values only, such as hammock plots or parsets become edge cases in the new framework. By focusing on individual observations rather than a marginal frequency we gain additional flexibility. The resulting approach is implemented in the R package ggpcp. Supplementary materials for this article are available online.

Authors

  • VanderPlas, Susan
  • Ge, Yawei
  • Unwin, Antony
  • Hofmann, Heike
1 Citation, 0 Mentions, 85% FAIR, 0.7 Dataset Index
10.6084/m9.figshare.22467369, 2023

Penguins Go Parallel: a grammar of graphics framework for generalized parallel coordinate plots

Parallel coordinate plots (PCP) are a valuable tool for exploratory data analysis of high-dimensional numerical data. The use of PCPs is limited when working with categorical variables or a mix of categorical and continuous variables. In this paper, we propose generalized parallel coordinate plots (GPCP) to extend the ability of PCPs from just numeric variables to dealing seamlessly with a mix of categorical and numeric variables in a single plot. In this process we find that existing solutions for categorical values only, such as hammock plots or parsets become edge cases in the new framework. By focusing on individual observations rather than a marginal frequency we gain additional flexibility. The resulting approach is implemented in the R package ggpcp.

Authors

  • VanderPlas, Susan
  • Ge, Yawei
  • Unwin, Antony
  • Hofmann, Heike
1 Citation, 0 Mentions, 85% FAIR, 0.6 Dataset Index
10.6084/m9.figshare.22467369.v1, 2023

Penguins Go Parallel: A Grammar of Graphics Framework for Generalized Parallel Coordinate Plots

Parallel Coordinate Plots (PCP) are a valuable tool for exploratory data analysis of high-dimensional numerical data. The use of PCPs is limited when working with categorical variables or a mix of categorical and continuous variables. In this article, we propose Generalized Parallel Coordinate Plots (GPCP) to extend the ability of PCPs from just numeric variables to dealing seamlessly with a mix of categorical and numeric variables in a single plot. In this process we find that existing solutions for categorical values only, such as hammock plots or parsets become edge cases in the new framework. By focusing on individual observations rather than a marginal frequency we gain additional flexibility. The resulting approach is implemented in the R package ggpcp. Supplementary materials for this article are available online.

Authors

  • VanderPlas, Susan
  • Ge, Yawei
  • Unwin, Antony
  • Hofmann, Heike
0 Citations, 0 Mentions, 85% FAIR, 0.9 Dataset Index
10.6084/m9.figshare.22467369.v2, 2023

Visualizing probability distributions across bivariate cyclic temporal granularities

Deconstructing a time index into time granularities can assist in exploration and automated analysis of large temporal data sets. This paper describes classes of time deconstructions using linear and cyclic time granularities. Linear granularities respect the linear progression of time such as hours, days, weeks and months. Cyclic granularities can be circular such as hour-of-the-day, quasi-circular such as day-of-the-month, and aperiodic such as public holidays. The hierarchical structure of granularities creates a nested ordering: hour-of-the-day and second-of-the-minute are single-order-up. Hour-of-the-week is multiple-order-up, because it passes over day-of-the-week. Methods are provided for creating all possible granularities for a time index. A recommendation algorithm provides an indication whether a pair of granularities can be meaningfully examined together (a “harmony”), or when they cannot (a “clash”). Time granularities can be used to create data visualizations to explore for periodicities, associations and anomalies. The granularities form categorical variables (ordered or unordered) which induce groupings of the observations. Assuming a numeric response variable, the resulting graphics are then displays of distributions compared across combinations of categorical variables. The methods implemented in the open source R package gravitas are consistent with a tidy workflow, with probability distributions examined using the range of graphics available in ggplot2.

Authors

  • Gupta, Sayani
  • Hyndman, Rob J
  • Cook, Dianne
  • Unwin, Antony
1 Citation, 0 Mentions, 85% FAIR, 0.6 Dataset Index
10.6084/m9.figshare.14749324.v1, 2021

Visualizing Probability Distributions Across Bivariate Cyclic Temporal Granularities

Deconstructing a time index into time granularities can assist in exploration and automated analysis of large temporal datasets. This article describes classes of time deconstructions using linear and cyclic time granularities. Linear granularities respect the linear progression of time such as hours, days, weeks and months. Cyclic granularities can be circular such as hour-of-the-day, quasi-circular such as day-of-the-month, and aperiodic such as public holidays. The hierarchical structure of granularities creates a nested ordering: hour-of-the-day and second-of-the-minute are single-order-up. Hour-of-the-week is multiple-order-up, because it passes over day-of-the-week. Methods are provided for creating all possible granularities for a time index. A recommendation algorithm provides an indication whether a pair of granularities can be meaningfully examined together (a “harmony”), or when they cannot (a “clash”). Time granularities can be used to create data visualizations to explore for periodicities, associations and anomalies. The granularities form categorical variables (ordered or unordered) which induce groupings of the observations. Assuming a numeric response variable, the resulting graphics are then displays of distributions compared across combinations of categorical variables. The methods implemented in the open source R package gravitas are consistent with a tidy workflow, with probability distributions examined using the range of graphics available in ggplot2. Supplementary files for this article are available online.

Authors

  • Gupta, Sayani
  • Hyndman, Rob J
  • Cook, Dianne
  • Unwin, Antony
0 Citations, 0 Mentions, 85% FAIR, 0.3 Dataset Index
10.6084/m9.figshare.14749324, 2021

Visualizing Probability Distributions Across Bivariate Cyclic Temporal Granularities

Deconstructing a time index into time granularities can assist in exploration and automated analysis of large temporal datasets. This article describes classes of time deconstructions using linear and cyclic time granularities. Linear granularities respect the linear progression of time such as hours, days, weeks and months. Cyclic granularities can be circular such as hour-of-the-day, quasi-circular such as day-of-the-month, and aperiodic such as public holidays. The hierarchical structure of granularities creates a nested ordering: hour-of-the-day and second-of-the-minute are single-order-up. Hour-of-the-week is multiple-order-up, because it passes over day-of-the-week. Methods are provided for creating all possible granularities for a time index. A recommendation algorithm provides an indication whether a pair of granularities can be meaningfully examined together (a “harmony”), or when they cannot (a “clash”). Time granularities can be used to create data visualizations to explore for periodicities, associations and anomalies. The granularities form categorical variables (ordered or unordered) which induce groupings of the observations. Assuming a numeric response variable, the resulting graphics are then displays of distributions compared across combinations of categorical variables. The methods implemented in the open source R package gravitas are consistent with a tidy workflow, with probability distributions examined using the range of graphics available in ggplot2. Supplementary files for this article are available online.

Authors

  • Gupta, Sayani
  • Hyndman, Rob J
  • Cook, Dianne
  • Unwin, Antony
0 Citations, 0 Mentions, 85% FAIR, 0.1 Dataset Index
10.6084/m9.figshare.14749324.v2, 2021