Requirements:
- ckanext-harvest extension must be installed
- IMPORTANT: You need a sysadmin user called ‘harvest’ on your CKAN instance
Installation:
source /home/www-data/pyenv/bin/activate
pip install -e git+https://github.com/openresearchdata/ckanext-oaipmh.git#egg=ckanext-oaipmh --src /var/www
cd /var/www/ckanext-oaipmh
pip install -r requirements.txt
python setup.py develop
Add to the ckan.plugins setting in your CKAN configuration file:
oaipmh_harvester
Setup Harvester:
1. Navigate to /harvest/new
2. Enter base URL of OAI-PMH repository (e.g., http://boris.unibe.ch/cgi/oai2)
3. Select Source type: OAI-PMH Harvester
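Before registering a source, it can help to check that the base URL really is an OAI-PMH endpoint. The sketch below only builds the Identify request URL defined by the OAI-PMH protocol; the helper name identify_url is hypothetical and not part of this extension. Opening the resulting URL in a browser or with curl should return an XML Identify response if the repository is reachable.

```python
from urllib.parse import urlencode


def identify_url(base_url: str) -> str:
    """Return the OAI-PMH Identify request URL for a repository base URL.

    Hypothetical helper for illustration; not provided by ckanext-oaipmh.
    """
    # An OAI-PMH request is the base URL plus a `verb` query parameter.
    return base_url.rstrip("?") + "?" + urlencode({"verb": "Identify"})


# Example base URL taken from step 2 above.
print(identify_url("http://boris.unibe.ch/cgi/oai2"))
# prints: http://boris.unibe.ch/cgi/oai2?verb=Identify
```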
Configuration options (JSON in Configuration section):
Credentials (if required)
{"username": "foo", "password": "bar"}
Harvest specific set only
{"set": "baz"}
Specify metadata format (currently oai_dc and oai_ddi supported)
{"metadata_prefix": "oai_dc"}
Enforce HTTP GET if source doesn’t support POST (default: false)
{"force_http_get": true}
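The Configuration field takes a single JSON object, so several of the options above would normally be combined into one object. The following is a minimal sketch, assuming the harvester reads each key independently. The helper build_config and its checks are hypothetical and only mirror the options listed in this README; they are not the extension's own validation.

```python
import json

# Option keys documented in this README; anything else is treated as a likely typo.
KNOWN_KEYS = {"username", "password", "set", "metadata_prefix", "force_http_get"}
# Metadata formats this README lists as supported.
SUPPORTED_PREFIXES = {"oai_dc", "oai_ddi"}


def build_config(**options):
    """Return a JSON string suitable for pasting into the Configuration field.

    Hypothetical helper for illustration; not provided by ckanext-oaipmh.
    """
    unknown = set(options) - KNOWN_KEYS
    if unknown:
        raise ValueError("unknown option(s): " + ", ".join(sorted(unknown)))
    prefix = options.get("metadata_prefix")
    if prefix is not None and prefix not in SUPPORTED_PREFIXES:
        raise ValueError("unsupported metadata_prefix: " + prefix)
    # json.dumps emits straight double quotes and lowercase true/false,
    # which is what a JSON parser expects.
    return json.dumps(options, sort_keys=True)


print(build_config(set="baz", metadata_prefix="oai_dc", force_http_get=True))
# prints: {"force_http_get": true, "metadata_prefix": "oai_dc", "set": "baz"}
```

Generating the string this way avoids typographic quotes, which word processors and some web pages substitute for straight quotes and which make the JSON invalid.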
Run Harvester:
1. Activate the Python environment
2. cd to CKAN directory (e.g., /usr/lib/ckan/default/src/ckan)
3. Start consumers:
paster --plugin=ckanext-oaipmh harvester gather_consumer &
paster --plugin=ckanext-oaipmh harvester fetch_consumer &
4. Run job:
paster --plugin=ckanext-oaipmh harvester run
Development/Testing:
. ~/default/bin/activate
cd /var/www/ckanext-oaipmh
nosetests --logging-filter=ckanext.oaipmh.harvester --ckan --with-pylons=test.ini ckanext/oaipmh/tests
OAI-PMH repositories: http://www.openarchives.org/Register/BrowseSites