Extension Automatic datastore refresh for URL resources


Extension Basics

Title
Automatic datastore refresh for URL resources
Name
ckanext-datastore-refresh
Type
Public extension
Description
Refresh/reupload datastore data for URL-uploaded resources via cron jobs
CKAN versions
2.9+
Download-Url (zip)
Download-Url commit date
2021-01-01
Url to repo
https://github.com/salsadigitalauorg/ckanext-datastore-refresh
Category
Data Management & Quality


Background Info

Description (long)
This extension automatically refreshes datastore data for resources that were added by URL. CKAN core has no mechanism to track changes to a remote file after the initial upload, so the extension reloads the data on a schedule: a sysadmin assigns a refresh frequency, and cron jobs trigger the reload.

Features:
- automatic datastore refresh for URL resources
- selectable refresh frequencies (10 minutes, 2 hours, 24 hours)
- configuration panel in the CKAN admin menu
- CLI command to run a refresh (datastore-refresh refresh_dataset_datastore)
- database migrations for tracking refresh schedules
- integration with ckanext-xloader for data loading
- cron job setup with configurable intervals

Works with CKAN 2.9+ and Python 3.7+. Requires ckanext-xloader. Licensed under the AGPL.

Version
0.1
Version release date
2021-01-01
Contact name
Salsa Digital AU
Contact email
(not set)
Contact Url
(not set)


Installation Guide

Configuration hints

Install from source with pip:

git clone https://github.com/salsadigitalauorg/ckanext-datastore-refresh.git

cd ckanext-datastore-refresh

pip install -e .

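Enable the plugin by adding datastore_refresh to the ckan.plugins setting in ckan.ini, alongside any plugins already listed there: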

ckan.plugins = datastore_refresh

Apply database migrations:

ckan db upgrade -p datastore_refresh

Requires ckanext-xloader to be installed and configured.
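
Before scheduling any cron jobs, it can help to confirm that both plugins are enabled. The following is a minimal sketch, not part of the extension: has_plugin and check_refresh_plugins are hypothetical helpers, and the sketch assumes that ckan.plugins is written on a single line in ckan.ini and that ckanext-xloader is enabled under the plugin name xloader.

```shell
#!/bin/sh
# Hypothetical helpers (not provided by the extension).

# Succeeds if plugin $2 appears on the ckan.plugins line of the ini file $1.
# Assumes the whole plugin list sits on one line.
has_plugin() {
    grep -E '^[[:space:]]*ckan\.plugins[[:space:]]*=' "$1" | grep -qw -- "$2"
}

# Prints each required plugin that is missing from the ini file $1
# and returns non-zero if any is missing.
check_refresh_plugins() {
    missing=0
    for plugin in xloader datastore_refresh; do
        if ! has_plugin "$1" "$plugin"; then
            echo "missing plugin: $plugin"
            missing=1
        fi
    done
    return "$missing"
}
```

A call such as check_refresh_plugins /path/to/ckan.ini prints nothing and returns 0 when both plugins are listed.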

Configuration:

Refresh frequencies are configured through the panel in the CKAN admin menu.

Available refresh frequencies:
- 10 minutes
- 2 hours
- 24 hours

TODO: Make frequencies configurable via ckan.ini file.

Cron job setup:

Add a cron job that runs the refresh command. The cron schedule should match the frequency passed to the command:

SCHEDULE ckan -c /path/to/ckan.ini datastore-refresh refresh_dataset_datastore FREQUENCY

Parameters:
- FREQUENCY: how often the datastore is refreshed, in minutes (for example 10)

Cron job examples:

Every 10 minutes:

*/10 * * * * ckan -c /path/to/ckan.ini datastore-refresh refresh_dataset_datastore 10

Every 2 hours:

0 */2 * * * ckan -c /path/to/ckan.ini datastore-refresh refresh_dataset_datastore 120

Daily:

@daily ckan -c /path/to/ckan.ini datastore-refresh refresh_dataset_datastore 1440
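
The three schedules above follow one pattern: each frequency in minutes is paired with a cron schedule of the same period. As an illustration only, the hypothetical helper below (cron_line, not part of the extension) prints the crontab entry for a given frequency. The CKAN_INI variable and its default path are assumptions.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the extension).
# CKAN_INI is an assumed variable; adjust the path to your installation.
CKAN_INI="${CKAN_INI:-/path/to/ckan.ini}"

# Print the crontab entry for a refresh frequency given in minutes.
cron_line() {
    freq="$1"
    case "$freq" in
        10)   schedule="*/10 * * * *" ;;   # every 10 minutes
        120)  schedule="0 */2 * * *" ;;    # every 2 hours, on the hour
        1440) schedule="@daily" ;;         # once a day
        *)    echo "unsupported frequency: $freq" >&2; return 1 ;;
    esac
    printf '%s ckan -c %s datastore-refresh refresh_dataset_datastore %s\n' \
        "$schedule" "$CKAN_INI" "$freq"
}
```

Calling cron_line 120 prints the "Every 2 hours" entry shown above, ready to paste into a crontab.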

CLI Commands:

ckan -c /path/to/ckan.ini datastore-refresh refresh_dataset_datastore

Use case:

Problem: CKAN doesn’t automatically detect when external URL resources change.

Solution: Schedule periodic datastore refreshes to keep data up to date.

Workflow:
1. Resource uploaded by URL
2. Admin configures refresh frequency via admin panel
3. Cron job calls CLI command at specified intervals
4. Extension triggers xloader to reload data
5. Datastore updated with latest data from URL
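
Step 3 of the workflow can be wrapped in a small script so that cron passes only the frequency. This is a sketch under assumptions: refresh_datastore is a hypothetical function name, the accepted values mirror the three frequencies offered in the admin panel, and CKAN_INI is an assumed environment variable. Only the ckan command line itself is taken from the cron examples.

```shell
#!/bin/sh
# Hypothetical cron wrapper (not shipped with the extension).
# CKAN_INI is an assumed variable; adjust the path to your installation.
CKAN_INI="${CKAN_INI:-/path/to/ckan.ini}"

refresh_datastore() {
    freq="$1"
    # Accept only the frequencies (in minutes) listed in this guide.
    case "$freq" in
        10|120|1440) ;;
        *) echo "usage: refresh_datastore 10|120|1440" >&2; return 2 ;;
    esac
    # Same command as in the cron examples.
    ckan -c "$CKAN_INI" datastore-refresh refresh_dataset_datastore "$freq"
}
```

Rejecting unknown values before calling ckan keeps a mistyped crontab entry from silently doing nothing.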

Requirements:
- CKAN >= 2.9
- Python >= 3.7
- ckanext-xloader (required dependency)

Plugins to configure (ckan.ini)
datastore_refresh
CKAN Settings (ckan.ini)
(not set)
DB migration to be executed
datastore_refresh