Installation: pip install ckanext-federated-index
Add to ckan.plugins:
federated_index
Profile-based configuration (each remote portal needs a profile):
Basic profile (demo is the profile name)
ckanext.federated_index.profile.demo.url = https://demo.ckan.org
ckanext.federated_index.profile.demo.api_key = YOUR_API_KEY_OPTIONAL
ckanext.federated_index.profile.demo.timeout = 5
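The key pattern above, ckanext.federated_index.profile.NAME.OPTION, can be illustrated with a short sketch that groups flat options by profile name. This is not the extension's own parser: collect_profiles is a hypothetical helper, and the sample dictionary stands in for a loaded CKAN config.

```python
PREFIX = "ckanext.federated_index.profile."


def collect_profiles(config):
    """Group flat config options by profile name (hypothetical helper)."""
    profiles = {}
    for key, value in config.items():
        if not key.startswith(PREFIX):
            continue
        # The remainder looks like "<name>.<option>", e.g. "demo.url".
        name, _, option = key[len(PREFIX):].partition(".")
        if not name or not option:
            continue
        profiles.setdefault(name, {})[option] = value
    return profiles


# Sample data standing in for the CKAN config; values stay as strings.
sample = {
    "ckanext.federated_index.profile.demo.url": "https://demo.ckan.org",
    "ckanext.federated_index.profile.demo.timeout": "5",
    "ckan.site_url": "http://localhost:5000",
}
profiles = collect_profiles(sample)
print(profiles)
```

Each additional remote portal would appear as another top-level key in the result, next to demo.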
Profile with advanced features (JSON format for extras)
ckanext.federated_index.profile.demo.extras = {"search_payload": {"rows": 100, "fq": "organization:my-org"}, "storage": {"type": "redis"}}
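Because the extras value must be valid JSON, hand-typed quotes are an easy source of errors. One way to avoid them, sketched here with only the standard library, is to build the value as a Python dictionary and let json.dumps produce the config line. The dictionary repeats the example above and the variable names are illustrative.

```python
import json

extras = {
    "search_payload": {"rows": 100, "fq": "organization:my-org"},
    "storage": {"type": "redis"},
}

# json.dumps emits straight double quotes, which JSON requires.
line = "ckanext.federated_index.profile.demo.extras = " + json.dumps(extras)
print(line)

# Round-trip check: the value after " = " parses back to the same dictionary.
value = line.split(" = ", 1)[1]
assert json.loads(value) == extras
```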
Global settings:
ckanext.federated_index.align_with_local_schema = false
ckanext.federated_index.redirect_missing_federated_datasets = true
ckanext.federated_index.dataset_read_endpoints = dataset.read
ckanext.federated_index.index_url_field = federated_index_remote_url
ckanext.federated_index.index_profile_field = federated_index_profile
Storage types (configured via profile extras):
- db (default): custom table via migration
- fs: filesystem JSON files in ckan.storage_path/federated_index/PROFILENAME
- redis: Redis storage
- sqlite: separate SQLite database per profile
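The storage choice described above can be sketched as follows. Both helpers are hypothetical and only mirror what the list states: the type is read from the profile extras, db is the default, and the fs type writes under ckan.storage_path/federated_index/PROFILENAME. The storage path in the example is made up.

```python
import os

STORAGE_TYPES = ("db", "fs", "redis", "sqlite")


def storage_type(extras):
    """Return the storage type named in a profile's extras, defaulting to db."""
    kind = extras.get("storage", {}).get("type", "db")
    if kind not in STORAGE_TYPES:
        raise ValueError(f"unknown storage type: {kind!r}")
    return kind


def fs_storage_dir(storage_path, profile_name):
    """Directory for the fs type, following the layout listed above."""
    return os.path.join(storage_path, "federated_index", profile_name)


print(storage_type({}))
print(storage_type({"storage": {"type": "redis"}}))
print(fs_storage_dir("/var/lib/ckan/default", "demo"))
```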
Refresh datasets via CLI:
ckanapi action federated_index_profile_refresh profile=demo index=true
Incremental updates (only newer datasets):
ckanapi action federated_index_profile_refresh profile=demo index=true since_last_refresh=true
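For scheduled refreshes, the two commands above can be assembled from a script. The sketch below only builds the argument list and prints it. refresh_command is a hypothetical helper, and the key=value arguments are copied from the commands shown here.

```python
import shlex


def refresh_command(profile, index=True, since_last_refresh=False):
    """Build the ckanapi argument list for a profile refresh (hypothetical helper)."""
    argv = [
        "ckanapi",
        "action",
        "federated_index_profile_refresh",
        f"profile={profile}",
    ]
    if index:
        argv.append("index=true")
    if since_last_refresh:
        argv.append("since_last_refresh=true")
    return argv


full = refresh_command("demo")
incremental = refresh_command("demo", since_last_refresh=True)
print(shlex.join(full))
print(shlex.join(incremental))
```

On a host where ckanapi is installed and pointed at the local CKAN instance, either list could be passed to subprocess.run(argv, check=True).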
Implement IFederatedIndex interface for custom hooks:
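import ckan.plugins as p
# Assumed import path; check the extension's source for the actual location.
from ckanext.federated_index.interfaces import IFederatedIndex

class MyPlugin(p.SingletonPlugin):
    p.implements(IFederatedIndex)

    def federated_index_before_index(self, pkg_dict, profile):
        # Custom logic before indexing
        return pkg_dict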
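As an illustration of what such a hook might do, the sketch below prefixes each dataset title with the profile name so that federated results are easy to spot in search. It runs without CKAN: TitlePrefixHook is a stand-in for a plugin class, and reading the profile name from an id attribute is an assumption, with a fallback to str(profile).

```python
class TitlePrefixHook:
    """Stand-in for a plugin class; only the hook method is shown."""

    def federated_index_before_index(self, pkg_dict, profile):
        # Assumption: the profile exposes its name as `.id`; otherwise use str().
        name = getattr(profile, "id", str(profile))
        # Modify the dictionary in place and return it, as the hook above does.
        pkg_dict["title"] = f"[{name}] {pkg_dict.get('title', '')}".strip()
        return pkg_dict


hook = TitlePrefixHook()
result = hook.federated_index_before_index(
    {"name": "air-quality", "title": "Air quality"},
    "demo",  # a plain string stands in for the profile object here
)
print(result["title"])
```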
Differences from ckanext-harvest:
- Works only with CKAN instances (not generic harvesters)
- Uses CKAN API, no background processes
- Adds to search index only, no local dataset copies
- Lighter weight and simpler architecture
Testing: pytest