Extension Harvest


Extension Basics

Title
Harvest
Name
ckanext-harvest
Type
Public extension
Description
Remote harvesting framework for importing datasets from other CKAN instances and external data sources.
CKAN versions

~2.10.0, ~2.11.0

These CKAN versions are exactly matched:

Download-Url (zip)
Download-Url commit date
2025-01-14
Url to repo
Category
Data Management & Quality


Background Information

Description (long)
CKAN Harvest is a harvesting framework that automatically imports datasets from remote CKAN instances and other data sources. Harvesting runs in three stages (gather, fetch, import), with background job processing for scalability. The extension includes built-in harvesters for CKAN instances and CSW servers, and provides an extensible architecture for developing custom harvesters. Harvest jobs can be configured for filtering, transformation, and scheduling. A web interface is provided for managing harvest sources, monitoring job progress, and handling errors. Redis or RabbitMQ is required for job queuing, and the framework provides logging and error reporting.
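
The gather, fetch, and import stages described above can be illustrated with a minimal, self-contained sketch. It does not use the extension's own classes: `HarvestObject`, `ToyHarvester`, and `run_harvest` are hypothetical stand-ins, the remote source is a plain dictionary, and the stages run one after another in a single process rather than through a message queue.

```python
import json
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class HarvestObject:
    """Hypothetical stand-in for one remote record moving through the stages."""
    guid: str
    content: Optional[str] = None


class ToyHarvester:
    """Illustrative only: not the extension's real harvester base class."""

    def __init__(self, remote: Dict[str, dict]) -> None:
        self.remote = remote                 # stands in for the remote source
        self.imported: Dict[str, dict] = {}  # stands in for the local catalog

    def gather_stage(self) -> List[HarvestObject]:
        # Gather: list the identifiers of remote records, without content.
        return [HarvestObject(guid=guid) for guid in sorted(self.remote)]

    def fetch_stage(self, obj: HarvestObject) -> bool:
        # Fetch: retrieve the raw record and store it on the object.
        record = self.remote.get(obj.guid)
        if record is None:
            return False
        obj.content = json.dumps(record)
        return True

    def import_stage(self, obj: HarvestObject) -> bool:
        # Import: turn the stored content into a local dataset entry.
        if obj.content is None:
            return False
        data = json.loads(obj.content)
        if "name" not in data:
            return False
        self.imported[obj.guid] = {
            "name": data["name"],
            "title": data.get("title", data["name"]),
        }
        return True


def run_harvest(harvester: ToyHarvester) -> List[str]:
    """Run all three stages in-process and return the guids that failed."""
    failed = []
    for obj in harvester.gather_stage():
        if not (harvester.fetch_stage(obj) and harvester.import_stage(obj)):
            failed.append(obj.guid)
    return failed


remote = {
    "a": {"name": "dataset-a", "title": "Dataset A"},
    "b": {"title": "Record without a name"},
}
harvester = ToyHarvester(remote)
failed = run_harvest(harvester)
```

Keeping the stages separate means a record that fails during fetch or import is reported on its own and does not stop the rest of the job.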

Version
1.6.1
Version release date
2025-01-14
Contact name
CKAN Development Team
Contact email
Contact Url
(not set)


Installation Guide

Configuration hints

Configure the message queue (Redis or RabbitMQ), set up harvest source permissions, and configure logging levels and cleanup settings. Then run the database migrations and set up the background workers.

Plugins to configure (ckan.ini)
harvest ckan_harvester
CKAN Settings (ckan.ini)
# ckan.harvest.mq.type = redis
# ckan.harvest.mq.hostname = localhost
# ckan.harvest.mq.port = 6379
# ckan.harvest.mq.redis_db = 0
# ckan.harvest.log_scope = -1
# ckan.harvest.log_timeframe = 30
DB migration to be executed
ckan db upgrade -p harvest
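
As a rough sanity check of the plugin and settings lines above, the following sketch parses an ini fragment with Python's standard `configparser` and reads the harvest options. The fragment is an assumption for illustration: it places the settings, uncommented, under an `[app:main]` section, and the helper `read_harvest_settings` is hypothetical. The fallback values simply repeat the values shown in the commented settings.

```python
import configparser

# Illustrative fragment: the settings from this page, uncommented.
INI_TEXT = """
[app:main]
ckan.plugins = harvest ckan_harvester
ckan.harvest.mq.type = redis
ckan.harvest.mq.hostname = localhost
ckan.harvest.mq.port = 6379
ckan.harvest.mq.redis_db = 0
ckan.harvest.log_scope = -1
ckan.harvest.log_timeframe = 30
"""


def read_harvest_settings(text: str) -> dict:
    """Hypothetical helper: parse ini text and return the harvest settings."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["app:main"]
    return {
        "plugins": section.get("ckan.plugins", fallback="").split(),
        "mq_type": section.get("ckan.harvest.mq.type", fallback="redis"),
        "mq_hostname": section.get("ckan.harvest.mq.hostname", fallback="localhost"),
        "mq_port": section.getint("ckan.harvest.mq.port", fallback=6379),
        "redis_db": section.getint("ckan.harvest.mq.redis_db", fallback=0),
        "log_scope": section.getint("ckan.harvest.log_scope", fallback=-1),
        "log_timeframe": section.getint("ckan.harvest.log_timeframe", fallback=30),
    }


settings = read_harvest_settings(INI_TEXT)
# Both plugins named on this page should be enabled.
missing = {"harvest", "ckan_harvester"} - set(settings["plugins"])
```

An empty `missing` set indicates that both plugins are listed; the numeric options are converted to integers so that a malformed port or database index fails early.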