Added explorer() to generate an HTML or PDF exploratory data analysis report for an entire data frame, including summaries of all continuous, date and categorical variables, missing data, and duplicates.
Added dist_date() for summarising and visualising date variables, optionally stratified by a grouping variable.
Redesigned tab() with a Stata-style formatted console output. Now supports row percentages, column percentages, or counts only (show argument), optional row/column totals, configurable decimal places, and chi-squared and Fisher’s exact hypothesis tests by default.
dist_sum() now returns n_outliers (IQR method), shapiro_p (Shapiro-Wilk p-value), and normal (logical) for both ungrouped and grouped summaries.
dist_sum() now automatically uses the t-distribution for confidence intervals when n < 30, and the Z-distribution for n >= 30.
tab() now runs both chi-squared and Fisher’s exact tests by default (test = "both").
tab() now prints a note when observations are excluded due to missing values.
tab() and tab1() now raise informative errors when the wrong number of variables is supplied.
explorer() now prompts interactively to exclude identifier columns (id, ids, pid, pids) from summaries.
explorer() now includes a correlation heatmap section when two or more numeric variables are present.
Bug fixes
dup(): fixed inverted label positions in the stacked bar chart.
dist_sum(): fixed incorrect test dispatch (ANOVA vs t-test) when the grouping variable contains missing values.
explorer(): fixed id_var scope in generated Rmd; fixed CLI progress bar token names.
explorer(): correctly recovers the data frame name when called via the %>% pipe.