Requirements:
- CKAN 2.5+
- ckanext-report (https://github.com/datagovuk/ckanext-report)
- Python 2.6 or 2.7
Prerequisite - Install ckanext-report:
Install the package with pip:
pip install -e git+https://github.com/datagovuk/ckanext-report#egg=ckanext-report
Add report plugin to ckan.plugins:
ckan.plugins = … report …
Initialize report database:
paster --plugin=ckanext-report report initdb --config=production.ini
Installation:
Activate CKAN virtualenv:
. /usr/lib/ckan/default/bin/activate
Install extension:
git clone https://github.com/smotornyuk/ckanext-statsresources.git
cd ckanext-statsresources
python setup.py develop
Install dependencies:
pip install -r dev-requirements.txt
Add plugin to ckan.plugins in production.ini:
ckan.plugins = … report statsresources …
Note: statsresources must come after report plugin
Configure report mappings (see Configuration below)
Restart CKAN:
sudo service apache2 reload
Configuration:
Strict Access Control:
Restrict /reports pages to sysadmin only.
Default: false (any logged-in user can view)
reports.strict_access = true
If true: Only sysadmins can access /reports
If false: All authenticated users can view reports
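The two access modes above can be summarised as a small decision function. This is only an illustrative sketch: the function name and its arguments are hypothetical and are not taken from the extension's code.

```python
def can_view_reports(is_authenticated, is_sysadmin, strict_access=False):
    """Return True if a user may open /reports under the rules above.

    Hypothetical helper: the extension's real check lives in its
    auth functions and may be implemented differently.
    """
    if strict_access:
        # strict mode: sysadmins only
        return bool(is_sysadmin)
    # default mode: any logged-in user
    return bool(is_authenticated)
```

With strict_access left at its default, a regular logged-in user passes the check; with strict_access=True the same user is refused.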
Report Resource Mappings:
Define which reports to generate as resources.
Format: REPORT_NAME:FORMAT:PACKAGE_ID:RESOURCE_TITLE
Multiple mappings supported (one per line)
statsresources.report_map =
    dataset_creation:json:1234-1234-1234-1234:Dataset creation dates JSON
    dataset_creation:csv:1234-1234-1234-1234:Dataset creation dates CSV
    organization_stats:json:5678-5678-5678-5678:Organization statistics
    resource_formats:csv:9abc-9abc-9abc-9abc:Resource format breakdown
Each line creates one resource:
- REPORT_NAME: Name of report from ckanext-report
- FORMAT: json or csv
- PACKAGE_ID: ID of package to add resource to
- RESOURCE_TITLE: Title for the resource
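To make the mapping format concrete, here is a minimal parsing sketch. It assumes only the REPORT_NAME:FORMAT:PACKAGE_ID:RESOURCE_TITLE layout described above; the names Mapping and parse_report_map are hypothetical, and the extension's own parser may behave differently (for example in how it treats colons inside a title).

```python
from collections import namedtuple

# Hypothetical container for one line of statsresources.report_map
Mapping = namedtuple("Mapping", "report fmt package_id title")

ALLOWED_FORMATS = ("json", "csv")


def parse_report_map(raw):
    """Parse the multi-line report_map value into Mapping tuples."""
    mappings = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue  # skip the blank first line after "="
        # split on the first three colons only, so the title may contain ":"
        parts = [part.strip() for part in line.split(":", 3)]
        if len(parts) != 4:
            raise ValueError("expected 4 colon-separated fields: %r" % line)
        report, fmt, package_id, title = parts
        if fmt not in ALLOWED_FORMATS:
            raise ValueError("unsupported format %r in: %r" % (fmt, line))
        mappings.append(Mapping(report, fmt, package_id, title))
    return mappings
```

Running a check like this before restarting CKAN catches a missing field or a mistyped format early.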
Report-Specific Options:
Configure options for individual reports.
Format: option:value (one per line)
statsresources.dataset_creation.options =
    include_private:false
    include_draft:true
statsresources.organization_stats.options =
    include_private:true
    include_draft:true
    min_datasets:5
Available options depend on the specific report.
Default: Reports run with default options (no additional options)
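The option:value lines can be read into a dictionary along these lines. This is a sketch under assumptions: parse_options is a hypothetical name, and the conversion of "true"/"false" and digits into Python values is a guess at convenient behaviour, not something the source documents.

```python
def parse_options(raw):
    """Turn 'option:value' lines into a dict (illustrative only)."""
    options = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        name, sep, value = line.partition(":")
        name, value = name.strip(), value.strip()
        if not sep or not name:
            raise ValueError("expected option:value, got %r" % line)
        # Assumed coercion: booleans and integers are converted,
        # everything else stays a string.
        if value.lower() in ("true", "false"):
            options[name] = value.lower() == "true"
        elif value.isdigit():
            options[name] = int(value)
        else:
            options[name] = value
    return options
```

An empty value yields an empty dict, which matches the documented default of running a report with no additional options.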
Configuration Example:
Complete configuration for multiple reports:
# Restrict reports to sysadmin only
reports.strict_access = true
# Define stat resources to generate
statsresources.report_map =
    dataset_creation:json:a1b2c3d4-e5f6-7890-abcd-ef1234567890:Dataset Creation Dates (JSON)
    dataset_creation:csv:a1b2c3d4-e5f6-7890-abcd-ef1234567890:Dataset Creation Dates (CSV)
    organization_stats:json:b2c3d4e5-f6a7-8901-bcde-f12345678901:Organization Statistics
    resource_formats:csv:c3d4e5f6-a7b8-9012-cdef-123456789012:Resource Formats
# Options for dataset_creation report
statsresources.dataset_creation.options =
    include_private:false
    include_draft:false
# Options for organization_stats report
statsresources.organization_stats.options =
    include_private:true
    min_datasets:1
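A configuration like this can be sanity-checked outside CKAN. The sketch below is written for Python 3 as a standalone script (it does not run inside the CKAN process), assumes the settings live in the usual [app:main] section, and uses a hypothetical function name. It reports option blocks whose report never appears in statsresources.report_map, which usually points to a typo in one of the two places.

```python
import configparser

SAMPLE_INI = """\
[app:main]
reports.strict_access = true
statsresources.report_map =
    dataset_creation:json:a1b2c3d4-e5f6-7890-abcd-ef1234567890:Dataset Creation Dates (JSON)
    organization_stats:json:b2c3d4e5-f6a7-8901-bcde-f12345678901:Organization Statistics
statsresources.dataset_creation.options =
    include_private:false
statsresources.tagless_datasets.options =
    include_private:true
"""


def unmapped_option_blocks(ini_text, section="app:main"):
    """Return report names that have options but no report_map entry."""
    # interpolation=None so "%(here)s"-style values are left untouched
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    settings = parser[section]

    mapped = set()
    for line in settings.get("statsresources.report_map", "").splitlines():
        if line.strip():
            mapped.add(line.strip().split(":", 1)[0])

    prefix, suffix = "statsresources.", ".options"
    orphans = []
    for key in settings:
        if key.startswith(prefix) and key.endswith(suffix):
            report = key[len(prefix):-len(suffix)]
            if report not in mapped:
                orphans.append(report)
    return sorted(orphans)
```

For the sample above the check flags tagless_datasets, because it has an options block but no mapping line. Note that the continuation lines must be indented for the parser to treat them as part of the multi-line value.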
Paster Commands:
List Stat Resources:
Show all configured stat resources that will be generated.
paster statsresources list -c /etc/ckan/default/production.ini
Output:
- Report name, format, package ID, resource title
- Current configuration status
- Validation messages
Generate/Update Stat Resources:
Create or update all configured stat resources.
paster statsresources generate -c /etc/ckan/default/production.ini
Process:
- Generates each configured report
- Creates the resource if it doesn’t exist yet
- Updates existing resource with new data
- Uploads to specified package
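The create-or-update step can be pictured as follows. This is a sketch, not the extension's code: call_action is a hypothetical callable standing in for whatever invokes CKAN actions (package_show, resource_create and resource_update are standard CKAN action names), and matching an existing resource by its title is an assumption. The real command also uploads the generated file, which is reduced here to a plain url field.

```python
def upsert_resource(call_action, package_id, title, fmt, url):
    """Create the resource if the package has none with this title,
    otherwise update the existing one. Returns 'created' or 'updated'."""
    package = call_action("package_show", {"id": package_id})
    for resource in package.get("resources", []):
        # assumption: an existing stat resource is recognised by its title
        if resource.get("name") == title:
            call_action("resource_update", {
                "id": resource["id"],
                "name": title,
                "format": fmt,
                "url": url,
            })
            return "updated"
    call_action("resource_create", {
        "package_id": package_id,
        "name": title,
        "format": fmt,
        "url": url,
    })
    return "created"
```

Passing the action caller in as an argument keeps the logic testable without a running CKAN instance.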
Scheduled Report Generation:
Set up cron job for automatic updates:
# Generate reports daily at 2 AM
0 2 * * * /usr/lib/ckan/default/bin/paster statsresources generate -c /etc/ckan/default/production.ini
# Generate reports every 6 hours
0 */6 * * * /usr/lib/ckan/default/bin/paster statsresources generate -c /etc/ckan/default/production.ini
# Weekly on Monday at 3 AM
0 3 * * 1 /usr/lib/ckan/default/bin/paster statsresources generate -c /etc/ckan/default/production.ini
Available Reports (from ckanext-report):
Common built-in reports:
dataset_creation:
- Dataset creation dates
- Tracks when datasets were created
- Options: include_private, include_draft
organization_stats:
- Statistics per organization
- Dataset counts, resource counts
- Options: include_private, min_datasets
resource_formats:
- Breakdown of resource formats (CSV, JSON, PDF, etc.)
- Usage statistics
- Options: include_private
tagless_datasets:
- Datasets without tags
- Data quality indicator
publisher_activity:
- Publishing activity by organization
- Temporal analysis
Custom Reports:
Create custom reports via ckanext-report interface.
Then reference in statsresources.report_map.
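As a rough illustration of what a custom report looks like, the sketch below defines a generate function and a report description dict. The dict keys (name, description, option_defaults, option_combinations, generate, template) and the 'table' key in the result follow the ckanext-report README as best recalled, so verify them against the installed version. The report name, the sample data and the template path are hypothetical. In a real plugin the dict would be returned from the plugin's IReport.register_reports method, and the generate function would query CKAN instead of a hard-coded list.

```python
from collections import OrderedDict

# Stand-in for a database query; a real report would read from CKAN's model.
SAMPLE_DATASETS = [
    {"name": "roads", "private": False,
     "resources": [{"name": "Roads CSV"}, {"name": ""}]},
    {"name": "budget", "private": True, "resources": [{"name": ""}]},
    {"name": "parks", "private": False, "resources": [{"name": "Parks JSON"}]},
]


def untitled_resources_report(include_private=False):
    """List datasets that contain resources without a title."""
    rows = []
    for dataset in SAMPLE_DATASETS:
        if dataset["private"] and not include_private:
            continue
        untitled = sum(1 for res in dataset["resources"] if not res["name"])
        if untitled:
            rows.append({"dataset": dataset["name"],
                         "untitled_resources": untitled})
    return {"table": rows, "total": len(rows)}


def untitled_resources_combinations():
    """Option combinations that should be pre-generated."""
    for include_private in (False, True):
        yield {"include_private": include_private}


untitled_resources_info = {
    "name": "untitled_resources",  # hypothetical report name
    "description": "Datasets that contain resources without a title.",
    "option_defaults": OrderedDict([("include_private", False)]),
    "option_combinations": untitled_resources_combinations,
    "generate": untitled_resources_report,
    "template": "report/untitled_resources.html",  # hypothetical path
}
```

Once such a report is registered, its name is what goes into the REPORT_NAME field of a statsresources.report_map line.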
Usage:
Automatic Resource Creation:
- Configure report_map in production.ini
- Run: paster statsresources generate
- Resources automatically created/updated in specified packages
- Reports available for download
Accessing Generated Resources:
- Navigate to package specified in report_map
- Find resource with configured title
- Download in specified format (JSON/CSV)
- Data contains report results
JSON Format Example:
{
  "report_name": "dataset_creation",
  "generated": "2025-10-10T14:30:00",
  "data": [
    {"date": "2025-01-15", "count": 5},
    {"date": "2025-02-20", "count": 8},
    {"date": "2025-03-10", "count": 3}
  ]
}
CSV Format Example:
date,count
2025-01-15,5
2025-02-20,8
2025-03-10,3
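The two formats carry the same rows. As a small consumer-side sketch (Python 3, standard library only), the function below converts a downloaded JSON report into CSV text. It assumes the layout shown in the examples above, with the rows under a "data" key; real reports may use different fields, and the function name is hypothetical.

```python
import csv
import io
import json


def report_json_to_csv(json_text):
    """Convert a JSON report (rows under 'data') into CSV text."""
    rows = json.loads(json_text)["data"]
    if not rows:
        return ""
    out = io.StringIO()
    # column order follows the key order of the first row
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Applied to the JSON example above, this produces the same header and rows as the CSV example.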
Development:
Clone repository:
git clone https://github.com/smotornyuk/ckanext-statsresources.git
cd ckanext-statsresources
Install for development:
python setup.py develop
pip install -r dev-requirements.txt
Create test.ini from template
Run tests:
nosetests --nologcapture --with-pylons=test.ini
Run with coverage:
pip install coverage
nosetests --nologcapture --with-pylons=test.ini --with-coverage --cover-package=ckanext.statsresources --cover-inclusive --cover-erase --cover-tests
Troubleshooting:
Reports not generating:
- Verify ckanext-report is installed and enabled
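- Check that the report database is initialized: paster --plugin=ckanext-report report initdb --config=production.ini
- Verify that the report names exist: paster --plugin=ckanext-report report list --config=production.ini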
- Check logs for errors
- Validate report_map syntax
Resources not created:
- Verify package IDs exist
- Check user permissions (sysadmin recommended)
- Ensure package is not deleted
- Review resource creation logs
Options not applied:
- Check option syntax: option:value
- Verify option names match report expectations
- Review report documentation for valid options
- Check indentation in config
Permission errors:
- Set reports.strict_access appropriately
- Verify user has sysadmin role if strict_access=true
- Check CKAN authorization settings
Cron job failures:
- Use full paths in cron command
- Activate virtualenv in cron script
- Redirect output to log: >> /var/log/ckan/statsresources.log 2>&1
- Verify cron user has permissions
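The first three points can be combined in a small wrapper script that cron calls instead of paster directly. This is a sketch under assumptions: the paths are the ones used elsewhere in this document, the function names are hypothetical, and the cron user must be able to write to the log file.

```python
import subprocess

# Full paths, as used in the cron examples in this document
PASTER = "/usr/lib/ckan/default/bin/paster"
CONFIG = "/etc/ckan/default/production.ini"
LOG = "/var/log/ckan/statsresources.log"


def build_command(paster=PASTER, config=CONFIG):
    """Build the generate command as an argument list (no shell needed)."""
    return [paster, "statsresources", "generate", "-c", config]


def run_and_log(command, log_path=LOG):
    """Run the command, appending stdout and stderr to the log file.

    Returns the command's exit code so cron can report failures.
    """
    with open(log_path, "a") as log:
        return subprocess.call(command, stdout=log, stderr=subprocess.STDOUT)
```

A script would end with a call such as run_and_log(build_command()) and exit with the returned code. Opening the log in append mode mirrors the >> redirection.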
Best Practices:
Report Organization:
- Create dedicated “Statistics” package
- Group related reports together
- Use clear, descriptive resource titles
- Include generation timestamp in description
Scheduling:
- Don’t generate too frequently (resource-intensive)
- Schedule during low-traffic periods
- Consider data freshness requirements
- Monitor generation time
Access Control:
- Use reports.strict_access=true for sensitive data
- Review report content for privacy concerns
- Consider separate public/private reports
Format Selection:
- JSON: For programmatic access, APIs
- CSV: For spreadsheet analysis, human-readable
- Generate both if needed
Development Status: Beta (4)
License: AGPL v3.0 or later
Keywords: CKAN, statistics, reports, analytics, resources, automated
Developer: Link Digital (Sergey Motornyuk)
Related Extensions:
- ckanext-report: Report generation (required)
- ckanext-ga-report: Google Analytics reports
- ckanext-dga-stats: data.gov.au statistics
- ckanext-dashboard: Analytics dashboards