Requirements:
- CKAN 2.0+
- Google Analytics account
- Google Analytics API credentials (OAuth JSON file)
- Python 2.7
Installation:
Activate CKAN virtualenv:
. /usr/lib/ckan/default/bin/activate
Install extension:
pip install -e git+https://github.com/wildcatzita/ckanext-ds-stats.git#egg=ckanext-ds-stats
Configure Google Analytics in production.ini:
Google Analytics Account ID (from GA admin)
ds_stats.ga.id = UA-1010101-1
Account name (top level item in GA)
ds_stats.ga.account = data.gov.uk
Path to OAuth credentials JSON file
ds_stats.ga.token.filepath = ~/pyenv/credentials.json
Reporting period (monthly, weekly, daily)
ds_stats.ga-report.period = monthly
URL path for bounce rate tracking (usually homepage)
ds_stats.ga-report.bounce_url = /
Initialize database tables (substitute the path to your own ini file, e.g. the production.ini configured above):
paster initdb-ga --config=../ckan/development.ini
paster initdb-ga-report --config=../ckan/development.ini
paster initdb-ds-stats-cache --config=../ckan/development.ini
Add plugin to ckan.plugins:
ckan.plugins = … ds_stats …
Restart CKAN:
sudo service apache2 restart
Google Analytics API Authorization:
Before importing analytics data, set up OAuth credentials:
Visit Google APIs Console:
https://code.google.com/apis/console
Sign in and create project (or use existing)
Create Service Account:
- Go to Service accounts pane:
https://console.developers.google.com/iam-admin/serviceaccounts
- Choose your project
- Create new account
- Check “Furnish a new private key” → JSON type
- Save the Service account ID (email format)
Download credentials JSON file:
- Save downloaded file as credentials.json
- Reference path in config: ds_stats.ga.token.filepath
Grant Analytics access:
- Go to Google Analytics console:
https://analytics.google.com/analytics/web/#management
- Click ADMIN tab
- Find “User management” button
- Add service account using Service account ID (email)
- Grant “Read” role
Update CKAN config with credentials path:
ds_stats.ga.token.filepath = ~/pyenv/credentials.json
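Before pointing ds_stats.ga.token.filepath at the downloaded file, it can help to confirm that it really is a service account key. The following is a minimal standalone sketch, not part of the extension; the helper name check_service_account_key is hypothetical, and the three fields it checks are the ones a Google service account JSON key normally carries.

```python
import json

# Fields a Google service account key file is expected to contain.
EXPECTED_FIELDS = ("type", "client_email", "private_key")


def check_service_account_key(raw_json):
    """Return a list of problems found in the key file contents (empty if none)."""
    try:
        data = json.loads(raw_json)
    except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
        return ["not valid JSON: %s" % exc]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]

    problems = []
    for field in EXPECTED_FIELDS:
        if not data.get(field):
            problems.append("missing field: %s" % field)
    # A key downloaded for another credential type will not work here.
    if data.get("type") and data["type"] != "service_account":
        problems.append("type is %r, expected 'service_account'" % data["type"])
    return problems
```

Usage: pass the file contents, e.g. check_service_account_key(open("credentials.json").read()). The client_email value in the file is the Service account ID to add in GA user management.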
Configuration Options:
Required Settings:
Google Analytics ID
ds_stats.ga.id = UA-1010101-1
Google Analytics account name
ds_stats.ga.account = your-account-name
OAuth credentials file path
ds_stats.ga.token.filepath = /path/to/credentials.json
Reporting period
ds_stats.ga-report.period = monthly
Bounce rate URL
ds_stats.ga-report.bounce_url = /
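A quick way to confirm that all five required settings are present is to parse the ini file and list the missing keys. This is a hedged sketch for Python 3, run outside CKAN; missing_settings is a hypothetical helper, and it assumes the settings live in the [app:main] section, as CKAN settings normally do.

```python
import configparser

REQUIRED_KEYS = [
    "ds_stats.ga.id",
    "ds_stats.ga.account",
    "ds_stats.ga.token.filepath",
    "ds_stats.ga-report.period",
    "ds_stats.ga-report.bounce_url",
]


def missing_settings(ini_text, section="app:main"):
    """Return the required ds_stats keys that are absent or empty."""
    # RawConfigParser does no %-interpolation, so values such as
    # %(here)s elsewhere in the file do not cause errors.
    # strict=False tolerates duplicated options.
    parser = configparser.RawConfigParser(strict=False)
    parser.read_string(ini_text)
    if not parser.has_section(section):
        return list(REQUIRED_KEYS)
    return [
        key for key in REQUIRED_KEYS
        if not parser.get(section, key, fallback="").strip()
    ]
```

Usage: missing_settings(open("/etc/ckan/default/production.ini").read()) returns an empty list when nothing is missing.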
Optional Settings (with defaults):
Resource download URL prefix (for tracking)
ds_stats.ga.resource_prefix = /downloads/
Domain for tracking (auto = automatic detection)
ds_stats.ga.domain = auto
Enable event tracking (CKAN 1.x only)
ds_stats.ga.track_events = false
Additional tracker fields (JSON object)
ds_stats.ga.fields = {}
Show download counts on package pages
ds_stats.ga.show_downloads = true
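To illustrate how the optional settings combine with their defaults, here is a small sketch that merges a config dictionary with the defaults listed above and converts the values to Python types. It is an illustration only: optional_settings and as_bool are hypothetical helpers, and the set of strings treated as true is an assumption, since the source does not say which spellings the extension accepts.

```python
import json

# Defaults as listed in the optional settings above.
DEFAULTS = {
    "ds_stats.ga.resource_prefix": "/downloads/",
    "ds_stats.ga.domain": "auto",
    "ds_stats.ga.track_events": "false",
    "ds_stats.ga.fields": "{}",
    "ds_stats.ga.show_downloads": "true",
}


def as_bool(value):
    """Assumed truthy spellings; the extension's own rules may differ."""
    return str(value).strip().lower() in ("true", "yes", "on", "1")


def optional_settings(config):
    """Merge config over the defaults and convert each value to a Python type."""
    merged = dict(DEFAULTS)
    merged.update({k: v for k, v in config.items() if k in DEFAULTS})

    fields = json.loads(merged["ds_stats.ga.fields"])
    if not isinstance(fields, dict):
        raise ValueError("ds_stats.ga.fields must be a JSON object")

    return {
        "resource_prefix": merged["ds_stats.ga.resource_prefix"],
        "domain": merged["ds_stats.ga.domain"],
        "track_events": as_bool(merged["ds_stats.ga.track_events"]),
        "fields": fields,
        "show_downloads": as_bool(merged["ds_stats.ga.show_downloads"]),
    }
```

For example, a config that sets only ds_stats.ga.fields = {"cookieName": "ckan_ga"} keeps every other option at its default.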
Domain Linking (Cross-Domain Tracking):
For tracking across multiple domains:
Comma-separated list of linked domains
ds_stats.ga.linked_domains = example.com,data.example.com,api.example.com
See Google’s documentation:
https://support.google.com/analytics/answer/1034342?hl=en
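As an illustration of the comma-separated format, the value can be split into individual domains as follows. This is a sketch only; parse_linked_domains is a hypothetical helper, and whether the extension itself tolerates spaces or empty entries is not stated, so the example value above (no spaces) is the safest form to use in the config.

```python
def parse_linked_domains(value):
    """Split a comma-separated domain list, dropping blanks and stray spaces."""
    return [domain.strip() for domain in value.split(",") if domain.strip()]
```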
Paster Commands:
Initialize databases:
paster initdb-ga --config=production.ini
paster initdb-ga-report --config=production.ini
paster initdb-ds-stats-cache --config=production.ini
Load Google Analytics data:
paster loadanalytics-ga credentials.json --config=production.ini
Imports GA data using credentials file.
Load GA reports:
paster loadanalytics-ga-report [period] --config=production.ini
Period options:
- all: Data for all time (since 2010)
- latest: Just the latest data (default)
- YYYY-MM-DD: Data from specific date to present
Examples:
paster loadanalytics-ga-report latest --config=production.ini
paster loadanalytics-ga-report 2020-01-01 --config=production.ini
paster loadanalytics-ga-report all --config=production.ini
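The three period forms can be told apart as in the sketch below, which may be useful when wrapping the command in a script and validating the argument first. It is not the extension's own parser; parse_period and the "since" label are hypothetical names chosen for illustration.

```python
from datetime import datetime


def parse_period(arg=None):
    """Classify a period argument as ('latest' | 'all' | 'since', start_date)."""
    # 'latest' is the documented default when no period is given.
    if arg is None or arg == "latest":
        return ("latest", None)
    if arg == "all":
        return ("all", None)
    try:
        start = datetime.strptime(arg, "%Y-%m-%d").date()
    except ValueError:
        raise ValueError(
            "period must be 'all', 'latest' or YYYY-MM-DD, got %r" % arg
        )
    return ("since", start)
```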
Fix time periods (maintenance):
paster fixtimeperiods --config=production.ini
Scheduled Data Import:
Set up cron jobs to regularly import analytics:
Import latest GA data daily at 3 AM
0 3 * * * /usr/lib/ckan/default/bin/paster loadanalytics-ga /path/to/credentials.json --config=/etc/ckan/default/production.ini
Import latest reports daily at 4 AM
0 4 * * * /usr/lib/ckan/default/bin/paster loadanalytics-ga-report latest --config=/etc/ckan/default/production.ini
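The two cron entries above run an hour apart. If you would rather run both imports back to back from a single cron entry, a small wrapper along these lines is one option. This is a sketch under assumptions: build_commands and run_all are hypothetical helpers, the paths are the example paths from the cron lines, and whether paster must be started from a particular working directory is not stated in this document.

```python
import subprocess


def build_commands(paster, ini, credentials):
    """Return the two import commands as argument lists, in run order."""
    config_flag = "--config=%s" % ini
    return [
        [paster, "loadanalytics-ga", credentials, config_flag],
        [paster, "loadanalytics-ga-report", "latest", config_flag],
    ]


def run_all(commands, runner=subprocess.call):
    """Run commands in order; stop at the first non-zero exit code and return it."""
    for command in commands:
        code = runner(command)
        if code != 0:
            return code
    return 0
```

Usage: run_all(build_commands("/usr/lib/ckan/default/bin/paster", "/etc/ckan/default/production.ini", "/path/to/credentials.json")). Stopping on the first failure means the report import is skipped when the GA import did not complete.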
Viewing Statistics:
Top datasets analytics:
http://your-ckan-site.com/stats/analytics/dataset/top
Shows most popular datasets based on GA data.
Resource download counts:
- Automatically displayed on package pages
- Next to each resource if ds_stats.ga.show_downloads = true
Admin dashboards:
- Access via CKAN admin interface
- Shows detailed analytics reports
- Group-level statistics available
Configuration Parameters Explained:
resource_prefix:
- Arbitrary identifier for download tracking in GA
- Should resemble URL path segment
- Example: /downloads/
- Makes filtering resources easier in GA interface
domain:
- Domain for user tracking
- Usually leave as 'auto' for automatic detection
- For multi-subdomain tracking: '.mydomain.com'
- See: http://code.google.com/apis/analytics/docs/gaJS/gaJSApiDomainDirectory.html#gat.GA_Tracker._setDomainName
track_events:
- CKAN 1.x only
- Enables GA event tracking for general pages
- Resource downloads always tracked regardless
fields:
- Additional options when creating tracker
- JSON object format
- See: https://developers.google.com/analytics/devguides/collection/analyticsjs/field-reference
show_downloads:
- Display download count on package pages
- Shows next to each resource
- Based on GA tracked downloads
Development Setup:
Clone repository:
git clone https://github.com/DataShades/ckanext-ds-stats.git
cd ckanext-ds-stats
Install for development:
python setup.py develop
pip install -r dev-requirements.txt
Run tests:
Create test.ini from template
nosetests --nologcapture --with-pylons=test.ini
Troubleshooting:
No analytics data appearing:
- Verify GA account ID is correct
- Check credentials.json is valid
- Ensure service account has Analytics read access
- Run loadanalytics commands manually to check for errors
Download counts not showing:
- Verify ds_stats.ga.show_downloads = true
- Check GA tracking is collecting download events
- Ensure resource_prefix matches tracked URLs
- Run loadanalytics-ga-report to import data
Database errors:
- Ensure all three initdb commands were run
- Check database permissions
- Verify CKAN database connection
Authorization errors:
- Regenerate credentials.json
- Verify service account email in GA user management
- Check token.filepath points to correct file
- Ensure file is readable by CKAN process
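The last two checks can be scripted. The sketch below expands the configured path and tests that the file exists and is readable; run it as the same user as the CKAN process (for example via sudo -u) so the result reflects that user's permissions. The helper names are hypothetical. Note that this document does not say whether the extension expands a leading ~ itself, so an absolute path in ds_stats.ga.token.filepath avoids the question.

```python
import os


def resolve_token_path(configured):
    """Expand a leading ~ and return an absolute path."""
    return os.path.abspath(os.path.expanduser(configured))


def path_problems(configured):
    """Return a list of problems with the credentials path (empty if none)."""
    path = resolve_token_path(configured)
    if not os.path.isfile(path):
        return ["%s does not exist or is not a regular file" % path]
    if not os.access(path, os.R_OK):
        return ["%s is not readable by the current user" % path]
    return []
```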
Development Status: Beta (4)
License: AGPL v3.0 or later
Keywords: CKAN, ga-report, dga-stats, googleanalytics, analytics, statistics
Components:
- dga-stats: data.gov.au statistics fork with private dataset filtering
- ga-report: Periodic reporting for site managers
- googleanalytics: GA tracking and event monitoring