Skip to contents

sumvar 0.2.0

New features

  • Added explorer() to generate an HTML or PDF exploratory data analysis report for an entire data frame, including summaries of all continuous, date and categorical variables, missing data, and duplicates.
  • Added dist_date() for summarising and visualising date variables, optionally stratified by a grouping variable.
  • Redesigned tab() with a Stata-style formatted console output. Now supports row percentages, column percentages, or counts only (show argument), optional row/column totals, configurable decimal places, and chi-squared and Fisher’s exact hypothesis tests by default.
  • dist_sum() now returns n_outliers (IQR method), shapiro_p (Shapiro-Wilk p-value), and normal (logical) for both ungrouped and grouped summaries.
  • dist_sum() now automatically uses the t-distribution for confidence intervals when n < 30, and the Z-distribution for n >= 30.
  • tab() now runs both chi-squared and Fisher’s exact tests by default (test = "both").
  • tab() now prints a note when observations are excluded due to missing values.
  • tab() and tab1() now raise informative errors when the wrong number of variables is supplied.
  • explorer() now prompts interactively to exclude identifier columns (id, ids, pid, pids) from summaries.
  • explorer() now includes a correlation heatmap section when two or more numeric variables are present.

Bug fixes

  • dup(): fixed inverted label positions in the stacked bar chart.
  • dist_sum(): fixed incorrect test dispatch (ANOVA vs t-test) when the grouping variable contains missing values.
  • explorer(): fixed id_var scope in generated Rmd; fixed CLI progress bar token names.
  • explorer(): correctly recovers the data frame name when called via the %>% pipe.

sumvar 0.1.0