ckanext-importer
Overview
ckanext.importer provides utilities for importing metadata from an external data source into CKAN and for keeping the CKAN metadata up to date when the contents of the data source change.
To achieve this, each entity (package, resource, view) in CKAN is linked to its counterpart in the original data source via an external ID (EID), for example the entity’s ID in the data source.
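The EID linking described above can be pictured as a get-or-create lookup keyed by the external ID. The following is a minimal sketch using an in-memory dict in place of a CKAN site; the class `EidIndex` and its method `get_or_create` are hypothetical names chosen for illustration and are not part of ckanext.importer.

```python
import uuid


class EidIndex:
    """Toy in-memory stand-in for a CKAN site: maps EIDs to entity dicts.

    Hypothetical helper for illustration only, not the library's API.
    """

    def __init__(self):
        self._by_eid = {}

    def get_or_create(self, eid):
        """Return (entity, created) for the given external ID."""
        entity = self._by_eid.get(eid)
        if entity is not None:
            # An entity is already linked to this EID: reuse it.
            return entity, False
        # No entity is linked yet: create one and remember the link.
        entity = {"id": str(uuid.uuid4()), "eid": eid}
        self._by_eid[eid] = entity
        return entity, True


index = EidIndex()
first, created_first = index.get_or_create("source-record-17")
second, created_second = index.get_or_create("source-record-17")
```

The second lookup returns the entity created by the first one, so repeated imports of the same source record update a single CKAN entity instead of producing duplicates.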
Key Features
- Automatic Entity Management: Automatically creates or retrieves CKAN packages, resources, and views based on external IDs (EIDs)
- Synchronization: Keeps CKAN metadata synchronized with external data sources
- Error Handling: Configurable error handling with options to reraise, keep, or delete entities on errors
- Context Managers: Uses Python context managers for clean resource management
- Batch Operations: Support for importing multiple datasets and automatic cleanup of unsynced entities
- Package Extras: Convenient dict-like interface for managing CKAN package extras
- Resource Views: Support for synchronizing resource views with external data
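The cleanup of unsynced entities mentioned under Batch Operations can be thought of as a set difference: every EID that exists in CKAN but was not touched during the current import run is a candidate for removal. The sketch below only illustrates that idea with plain Python lists; the function `unsynced_eids` is a hypothetical name, and how the library actually tracks and deletes such entities is not shown here.

```python
def unsynced_eids(existing_eids, synced_eids):
    """Return EIDs present in CKAN but not touched during the current run."""
    return sorted(set(existing_eids) - set(synced_eids))


# EIDs of entities that already exist from earlier import runs.
existing = ["dataset-a", "dataset-b", "dataset-c"]

# EIDs seen in the data source during this run.
synced = []
for eid in ["dataset-a", "dataset-c"]:
    synced.append(eid)

stale = unsynced_eids(existing, synced)
print(stale)  # ['dataset-b']
```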
Usage Example
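from ckanext.importer import Importer

imp = Importer('my-importer-id')

# The package is created if no package with this EID exists yet,
# otherwise the existing package is retrieved.
with imp.sync_package('my-package-eid') as pkg:
    pkg['title'] = 'My Package Title'
    pkg.extras['my-extra-key'] = 'my-extra-value'

    # Resources are linked to the package in the same way.
    with pkg.sync_resource('my-resource-eid') as res:
        res['name'] = 'My Resource Name'
        res['url'] = 'https://some-resource-url'

        # Views are linked to the resource in the same way.
        with res.sync_view('my-view-eid') as view:
            view['view_type'] = 'text_view'
            view['title'] = 'My View Title'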
Error Handling Options
- OnError.reraise (default): Re-raise exceptions and keep previous state
- OnError.keep: Swallow exceptions and keep existing entity
- OnError.delete: Swallow exceptions and delete entity
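To make the difference between the three options concrete, here is a minimal, self-contained sketch of a context manager with the same three behaviors. It works on a plain dict instead of CKAN, and the names `OnErrorSketch`, `sync_entity`, and the `on_error` parameter are assumptions made for this illustration; it is not the library's implementation.

```python
from contextlib import contextmanager
from enum import Enum


class OnErrorSketch(Enum):
    """Hypothetical stand-in mirroring the three options listed above."""
    reraise = "reraise"
    keep = "keep"
    delete = "delete"


@contextmanager
def sync_entity(store, eid, on_error=OnErrorSketch.reraise):
    """Yield a working copy of store[eid] and commit it only on success."""
    draft = dict(store.get(eid, {}))
    try:
        yield draft
    except Exception:
        if on_error is OnErrorSketch.reraise:
            # Propagate the error; the stored entity is left unchanged.
            raise
        if on_error is OnErrorSketch.delete:
            # Swallow the error and drop the entity.
            store.pop(eid, None)
        # keep: swallow the error and leave the stored entity unchanged.
    else:
        store[eid] = draft


store = {"pkg-1": {"title": "Old"}, "pkg-2": {"title": "Old"}}

with sync_entity(store, "pkg-1", on_error=OnErrorSketch.keep) as pkg:
    pkg["title"] = "New"
    raise ValueError("source record is malformed")

with sync_entity(store, "pkg-2", on_error=OnErrorSketch.delete) as pkg:
    raise ValueError("source record is malformed")
```

After both blocks, `store` still holds the old version of `pkg-1`, while `pkg-2` has been removed. With the default option the `ValueError` would instead propagate to the caller.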
License
Copyright © 2018, Stadt Karlsruhe. Distributed under the GNU Affero General Public License (AGPL-3.0).