Generate Data Report
The Generate Data Report feature produces an automated data quality report for your current dataset. The report summarises each variable, highlighting potential issues such as unusual distributions, outliers, and missing values.
Access via Dataset > Generate data report.
Requirements
This feature requires the dataMaid and rmarkdown R packages and the pandoc document converter. If any of them is not available, the menu item will not appear.
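If the menu item is missing, the requirements above can be checked from an R console. The following is a minimal sketch, not the check the application itself performs; the variable names are illustrative.

```r
# requireNamespace() returns FALSE (instead of raising an error)
# when a package is not installed.
has_datamaid  <- requireNamespace("dataMaid", quietly = TRUE)
has_rmarkdown <- requireNamespace("rmarkdown", quietly = TRUE)

# pandoc is an external program rather than an R package;
# rmarkdown can report whether it is able to find it.
has_pandoc <- has_rmarkdown && rmarkdown::pandoc_available()

# Collect the names of anything that could not be found.
found   <- c(has_datamaid, has_rmarkdown, has_pandoc)
missing <- c("dataMaid", "rmarkdown", "pandoc")[!found]

if (length(missing) > 0) {
  message("Missing: ", paste(missing, collapse = ", "))
} else {
  message("All prerequisites found.")
}
```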
Report Formats
Choose from three output formats:
| Format | Description |
|---|---|
| PDF | Portable document, suitable for printing and sharing |
| Word Document | Editable Microsoft Word document (.docx) |
| HTML | Web page, opens in your default browser |
What's in the Report
The report is generated by the dataMaid R package and includes, for each variable:
- Variable class and type (numeric, factor, character, etc.)
- Summary statistics (mean, median, range for numeric; frequency table for categorical)
- Distribution visualisations (histograms, bar charts)
- Potential problems flagged automatically, such as:
  - Variables with very few unique values
  - Variables with many missing values
  - Possible outliers or unusual values
  - Unexpected coding patterns
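A report with the contents listed above can also be produced from an R console by calling dataMaid directly. The sketch below is an assumption about a roughly equivalent call, not a description of what the menu item runs internally. The built-in `airquality` dataset, the file name `data_report.Rmd`, and the chosen format are placeholders.

```r
# Placeholder data: a built-in dataset stands in for your own data frame.
my_data <- airquality

# makeDataReport() accepts "pdf", "word" or "html" for its `output` argument.
report_format <- "html"

# Only attempt the report when dataMaid, rmarkdown and pandoc are all available.
can_render <- requireNamespace("dataMaid", quietly = TRUE) &&
  requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

if (can_render) {
  dataMaid::makeDataReport(
    my_data,
    output     = report_format,
    file       = file.path(tempdir(), "data_report.Rmd"),  # R Markdown source file
    replace    = TRUE,    # overwrite a report left over from an earlier run
    openResult = FALSE    # set to TRUE to open the rendered report
  )
} else {
  message("dataMaid, rmarkdown or pandoc is not available; no report generated.")
}
```

Setting `onlyProblematic = TRUE` in the same call restricts the report to variables for which at least one problem was flagged, which can shorten reports for wide datasets.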
How to Use
- Load your dataset
- Go to Dataset > Generate data report
- Select the output format
- Click Generate
- The report will be created and opened automatically
The report is saved to a temporary location. To keep it, save or copy it from the location shown after generation.
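Copying the report out of the temporary location can also be scripted. The helper below is a hypothetical sketch: `keep_report()` is not part of the application or of dataMaid, and both paths in the usage comment are placeholders for the location shown after generation and a folder of your choice.

```r
# Copy a generated report to a permanent folder and return the new path.
keep_report <- function(report_path, dest_dir) {
  if (!file.exists(report_path)) {
    stop("Report not found: ", report_path)
  }
  # Create the destination folder if it does not exist yet.
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  dest <- file.path(dest_dir, basename(report_path))
  ok <- file.copy(report_path, dest, overwrite = TRUE)
  if (!ok) {
    stop("Could not copy report to ", dest)
  }
  invisible(dest)
}

# Usage (placeholder paths):
# keep_report("<location shown after generation>", "<folder to keep reports in>")
```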
Tips
- Use this as a first step when working with a new dataset to quickly identify data quality issues
- The flagged problems are suggestions; not every flag indicates a real issue
- For large datasets, report generation may take a few moments