Extension Workflow (Dataset Lifecycle Management)


Extension Basics

Title
Workflow (Dataset Lifecycle Management)
Name
ckanext-workflow
Type
Public extension
Description
Extended workflow framework for managing dataset lifecycle states beyond CKAN's basic private/public states
CKAN versions

~2.9, ~2.10, ~2.11

Download-Url (zip)
Download-Url commit date
2025-10-10
Url to repo
Category
Specialized Tools


Background Infos

Description (long)

Provides an extensible workflow framework for managing complex dataset lifecycle states. Goes beyond CKAN’s simple private/public model to support custom workflow states such as draft, review, approved, published, archived, withdrawn, etc. Implements state transition logic, access control for different states, and visibility management (e.g., hide datasets with certain states from search results). Includes IWorkflow plugin interface for creating custom workflows, native_workflow plugin for enhanced private/public management, and workflow_state_change API action for programmatic state transitions. Each state can have custom permissions, validation rules, and transition constraints. Useful for organizations with editorial workflows, multi-stage approval processes, or complex content management requirements.

Version
2.0.0.post1
Version release date
2025-10-10
Contact name
DataShades / Sergey Motornyuk
Contact email
Contact Url
(not set)


Installation Guide

Configuration hints

Requirements:

    • CKAN 2.9+
    • Python 3.7+

Installation:

  1. Activate CKAN virtualenv: . /usr/lib/ckan/default/bin/activate

  2. Install extension: pip install ckanext-workflow

    Or from source:

    git clone https://github.com/DataShades/ckanext-workflow.git
    cd ckanext-workflow
    python setup.py develop

  3. Install dependencies: pip install -r dev-requirements.txt

  4. Add plugin to ckan.plugins in production.ini: ckan.plugins = … workflow native_workflow …

    Plugins:

    • workflow: Core workflow framework (required)
    • native_workflow: Enhanced private/public management (optional)
    • test_workflow: Test workflow for development (optional)
  5. Restart CKAN: sudo service apache2 reload

Configuration:

No global configuration required. Workflows are implemented via plugin interface.

Built-in Workflows:

  1. native_workflow: Enhanced management of CKAN’s native private/public states. Provides better control over state transitions.

    Enable: ckan.plugins = … workflow native_workflow …

Using Existing Workflows:

To use a custom workflow:

  1. Enable plugin implementing IWorkflow: ckan.plugins = … workflow my_custom_workflow …

  2. Change dataset state via API:

    POST /api/3/action/workflow_state_change

    {
      "id": "dataset-id",
      "transition": "submit_for_review",
      "message": "Ready for editorial review"
    }

    Required:

    • id: Package ID

    Optional (depends on workflow):

    • transition: Named transition to execute
    • message: Comment/reason for state change
    • Any other workflow-specific parameters
  3. Workflow handles:

    • State validation
    • Permission checks
    • State transition
    • Access control updates
    • Visibility management
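
The request from step 2 can be prepared from Python with only the standard library. This is a minimal sketch, not an official client: the helper name build_state_change_request, the site URL, and the token value are placeholders, and the transition and message fields only have an effect if the enabled workflow accepts them.

```python
import json
import urllib.request


def build_state_change_request(site_url, api_token, dataset_id, **params):
    """Build (but do not send) a POST request for workflow_state_change.

    Hypothetical helper: `params` carries workflow-specific fields such as
    `transition` or `message`.
    """
    payload = {"id": dataset_id}
    payload.update(params)
    return urllib.request.Request(
        url=site_url.rstrip("/") + "/api/3/action/workflow_state_change",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": api_token,
        },
        method="POST",
    )


request = build_state_change_request(
    "https://ckan.example.org",  # placeholder site URL
    "my-api-token",              # placeholder API token
    "dataset-id",
    transition="submit_for_review",
    message="Ready for editorial review",
)
# Sending is left to the caller, e.g. urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the payload easy to inspect and test without a running CKAN instance.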

Low-Level API Usage:

For programmatic control:

import ckan.plugins.toolkit as tk


# Wrapper function (the name is illustrative) so the early return is valid Python
def approve_package(pkg_dict, user_id):
    # Get current state of package
    state = tk.h.workflow_get_state(pkg_dict)

    # Check if workflow handles this package
    if state is None:
        # No workflow enabled or package not managed
        return None

    # Prepare transition data
    data_dict = {
        'transition': 'approve',
        'reviewer_id': user_id,
        'notes': 'Approved with minor suggestions',
    }

    # Execute state change
    new_state = state.change(data_dict)

    # Commit changes to database
    new_state.save()
    return new_state

Creating Custom Workflows:

  1. Implement IWorkflow Interface:

from ckan.plugins import SingletonPlugin, implements
from ckanext.workflow.interfaces import IWorkflow


class EditorialWorkflowPlugin(SingletonPlugin):
    implements(IWorkflow)

    def get_workflow_states(self):
        """Define available states"""
        return {
            'draft': DraftState(),
            'review': ReviewState(),
            'approved': ApprovedState(),
            'published': PublishedState(),
            'archived': ArchivedState()
        }

    def get_initial_state(self, package_dict):
        """Return initial state for new packages"""
        return 'draft'

    def can_handle_package(self, package_dict):
        """Determine if workflow applies to package"""
        # Example: Only for specific package type
        return package_dict.get('type') == 'editorial_content'
  2. Define State Classes:

from ckanext.workflow.state import WorkflowState


class DraftState(WorkflowState):
    name = 'draft'
    label = 'Draft'

    def get_allowed_transitions(self):
        """Define valid state transitions"""
        return ['submit_for_review', 'delete']

    def can_transition_to(self, new_state, user, package):
        """Check if transition is allowed"""
        if new_state == 'review':
            # Only package creator can submit
            return user['id'] == package['creator_user_id']
        return True

    def is_visible_in_search(self):
        """Should appear in search results?"""
        return False  # Drafts hidden from search

    def get_allowed_roles(self):
        """Who can view/edit in this state?"""
        return {
            'view': ['creator', 'editor', 'admin'],
            'edit': ['creator', 'admin'],
            'delete': ['creator', 'admin']
        }

class ReviewState(WorkflowState):
    name = 'review'
    label = 'Under Review'

    def get_allowed_transitions(self):
        return ['approve', 'reject', 'request_changes']

    def can_transition_to(self, new_state, user, package):
        # Only editors/admins can approve/reject
        return user.get('sysadmin') or 'editor' in user.get('roles', [])

    def is_visible_in_search(self):
        return False  # Not public yet

    def on_enter_state(self, package, data):
        """Called when entering this state"""
        # Send notification to editors
        self.notify_editors(package)

    def on_exit_state(self, package, new_state):
        """Called when leaving this state"""
        # Log state change
        self.log_transition(package, new_state)

class PublishedState(WorkflowState):
    name = 'published'
    label = 'Published'

    def get_allowed_transitions(self):
        return ['unpublish', 'archive']

    def is_visible_in_search(self):
        return True  # Public

    def get_allowed_roles(self):
        return {
            'view': ['public'],  # Anyone can view
            'edit': ['editor', 'admin'],
            'delete': ['admin']
        }
  3. State Transition Logic:

import ckan.plugins.toolkit as tk


class WorkflowState:
    def change(self, data_dict):
        """Execute state transition"""
        transition = data_dict.get('transition')
        new_state_name = self.resolve_transition(transition)

        # Validate transition (remaining arguments are elided in the source)
        if not self.can_transition_to(new_state_name, ...):
            raise tk.NotAuthorized("Transition not allowed")

        # Get new state instance
        new_state = self.workflow.get_state(new_state_name)

        # Exit current state
        self.on_exit_state(self.package, new_state)

        # Enter new state
        new_state.on_enter_state(self.package, data_dict)

        # Update package
        self.package['workflow_state'] = new_state_name

        return new_state

    def save(self):
        """Persist state changes to database"""
        tk.get_action('package_patch')(
            {'ignore_auth': True},
            self.package
        )
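
The change() method relies on resolve_transition to turn a transition name into a target state, but the text does not show how that lookup works. One possible shape is a table keyed by (current state, transition). The sketch below is hypothetical: the state and transition names are borrowed from the examples above, while the target of each transition and the UnknownTransition exception are assumptions.

```python
# Hypothetical transition table: (current state, transition) -> target state.
TRANSITIONS = {
    ("draft", "submit_for_review"): "review",
    ("review", "approve"): "approved",
    ("review", "reject"): "draft",
    ("review", "request_changes"): "draft",
    ("published", "unpublish"): "draft",
    ("published", "archive"): "archived",
}


class UnknownTransition(Exception):
    """Raised when a transition is not defined for the current state."""


def resolve_transition(current_state, transition):
    """Return the target state name, or raise UnknownTransition."""
    try:
        return TRANSITIONS[(current_state, transition)]
    except KeyError:
        raise UnknownTransition(
            f"{transition!r} is not defined for state {current_state!r}"
        ) from None


print(resolve_transition("draft", "submit_for_review"))  # review
```

Keeping the table as plain data makes the allowed paths easy to review and to cover exhaustively in tests.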

Advanced Features:

  1. State Validation:

import ckan.plugins.toolkit as tk


class ReviewState(WorkflowState):
    def validate_package(self, package):
        """Ensure package meets requirements for this state"""
        errors = {}

        if not package.get('title'):
            errors['title'] = ['Required for review']

        if len(package.get('resources', [])) < 1:
            errors['resources'] = ['At least one resource required']

        if errors:
            raise tk.ValidationError(errors)
  2. Notifications:

class WorkflowState:
    def notify_users(self, package, role, message):
        """Send notifications to users with role"""
        users = self.get_users_with_role(package, role)
        for user in users:
            self.send_email(user, message)
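
The notification step leaves the message construction open. As a rough illustration, the messages could be built with the standard library before being handed to whatever mailer the site uses. The function name build_notifications, the sender address, and the user dictionary layout (roles and email keys) are assumptions made for this sketch.

```python
from email.message import EmailMessage


def build_notifications(users, role, subject, body, sender="ckan@example.org"):
    """Return one EmailMessage per user who holds `role` and has an email."""
    messages = []
    for user in users:
        # Skip users without the role or without a deliverable address
        if role not in user.get("roles", []) or not user.get("email"):
            continue
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = user["email"]
        msg["Subject"] = subject
        msg.set_content(body)
        messages.append(msg)
    return messages


users = [
    {"id": "u1", "roles": ["editor"], "email": "ed@example.org"},
    {"id": "u2", "roles": ["member"], "email": "member@example.org"},
    {"id": "u3", "roles": ["editor"]},  # no email: skipped
]
messages = build_notifications(
    users, "editor", "Dataset awaiting review", "Please review dataset pkg-1."
)
```

Returning the messages instead of sending them keeps delivery (SMTP, queue, CKAN mailer) a separate concern.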

  3. Audit Trail:

import datetime


class WorkflowState:
    def log_transition(self, package, old_state, new_state, user, data):
        """Log state transitions for audit"""
        log_entry = {
            'package_id': package['id'],
            'old_state': old_state,
            'new_state': new_state,
            'user_id': user['id'],
            # Timezone-aware UTC timestamp (utcnow() is deprecated since Python 3.12)
            'timestamp': datetime.datetime.now(datetime.timezone.utc),
            'data': data
        }

        self.save_to_audit_log(log_entry)
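
The storage behind save_to_audit_log is not specified. For experiments and tests, an in-memory stand-in is often enough. The class below is a hypothetical sketch (its name and methods are not part of the extension) that records the same fields as the log entry above and can return the history of one package.

```python
import datetime


class InMemoryAuditLog:
    """Hypothetical stand-in for the storage behind save_to_audit_log()."""

    def __init__(self):
        self.entries = []

    def record(self, package_id, old_state, new_state, user_id, data=None):
        """Append one transition entry and return it."""
        entry = {
            "package_id": package_id,
            "old_state": old_state,
            "new_state": new_state,
            "user_id": user_id,
            "timestamp": datetime.datetime.now(datetime.timezone.utc),
            "data": data or {},
        }
        self.entries.append(entry)
        return entry

    def history(self, package_id):
        """Return the entries for one package, oldest first."""
        return [e for e in self.entries if e["package_id"] == package_id]


audit_log = InMemoryAuditLog()
audit_log.record("pkg-1", "draft", "review", "user-1", {"message": "Ready"})
audit_log.record("pkg-1", "review", "approved", "user-2")
audit_log.record("pkg-2", "draft", "review", "user-3")
```

A production setup would more likely write to a database table or to CKAN's activity stream, but the recorded fields would stay the same.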
  4. Conditional Visibility:

class WorkflowState:
    def filter_search_results(self, results, user):
        """Filter search results based on state and user"""
        filtered = []
        for pkg in results:
            state = self.get_state(pkg)
            if state.can_user_view(user, pkg):
                filtered.append(pkg)
        return filtered
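
The can_user_view check is not defined in the text. One way to derive it is from the per-state view roles returned by get_allowed_roles(). The sketch below works on plain dictionaries. The VIEW_ROLES table and the helper names are hypothetical, and the rule that every visitor holds the public role is an assumption.

```python
# Hypothetical view roles per state, mirroring the get_allowed_roles() examples.
VIEW_ROLES = {
    "draft": {"creator", "editor", "admin"},
    "published": {"public"},
}


def roles_for(user, package):
    """Roles the user holds for this package; anonymous users pass user=None."""
    roles = {"public"}
    if user is None:
        return roles
    roles.update(user.get("roles", []))
    if user.get("id") == package.get("creator_user_id"):
        roles.add("creator")
    return roles


def filter_visible(packages, user):
    """Keep the packages whose state grants 'view' to one of the user's roles."""
    visible = []
    for pkg in packages:
        # Unknown states map to an empty set, so they are hidden by default
        allowed = VIEW_ROLES.get(pkg.get("workflow_state"), set())
        if allowed & roles_for(user, pkg):
            visible.append(pkg)
    return visible


packages = [
    {"id": "a", "workflow_state": "draft", "creator_user_id": "u1"},
    {"id": "b", "workflow_state": "published", "creator_user_id": "u2"},
]
```

With this data, an anonymous visitor sees only the published package, while the creator of the draft sees both. Hiding packages in unknown states follows the least-privilege advice given under Best Practices.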

Development:

  1. Clone repository:

    git clone https://github.com/DataShades/ckanext-workflow.git
    cd ckanext-workflow

  2. Install for development:

    python setup.py develop
    pip install -r dev-requirements.txt

  3. Run tests: pytest --ckan-ini test.ini

  4. Run with coverage: pytest --ckan-ini test.ini --cov=ckanext.workflow

Testing Custom Workflows:

  1. Create test plugin with IWorkflow implementation
  2. Define test states and transitions
  3. Write tests for:
    • State initialization
    • Valid transitions
    • Invalid transitions (should fail)
    • Permission checks
    • Visibility rules
    • Audit logging
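
The checklist above can be turned into tests along the following lines. TinyWorkflow is a hypothetical in-memory stand-in, not the extension's API. It exists only so that the test functions can run without a CKAN instance. The functions follow the pytest naming convention but use plain try/except so the sketch has no third-party imports.

```python
class TinyWorkflow:
    """Hypothetical in-memory workflow used only to illustrate the tests."""

    transitions = {
        ("draft", "submit_for_review"): "review",
        ("review", "approve"): "approved",
    }

    def __init__(self):
        self.state = "draft"
        self.audit = []

    def change(self, transition, user):
        target = self.transitions.get((self.state, transition))
        if target is None:
            raise ValueError(f"{transition!r} not allowed from {self.state!r}")
        if transition == "approve" and "editor" not in user.get("roles", []):
            raise PermissionError("only editors may approve")
        self.audit.append((self.state, target, user["id"]))
        self.state = target
        return target


def test_initial_state():
    assert TinyWorkflow().state == "draft"


def test_valid_transition_is_logged():
    wf = TinyWorkflow()
    assert wf.change("submit_for_review", {"id": "u1"}) == "review"
    assert wf.audit == [("draft", "review", "u1")]


def test_invalid_transition_fails():
    wf = TinyWorkflow()
    try:
        wf.change("approve", {"id": "u1", "roles": ["editor"]})
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")


def test_permission_check():
    wf = TinyWorkflow()
    wf.change("submit_for_review", {"id": "u1"})
    try:
        wf.change("approve", {"id": "u1"})
    except PermissionError:
        pass
    else:
        raise AssertionError("expected PermissionError")
    assert wf.state == "review"  # a rejected transition leaves the state unchanged
```

Tests against a real workflow plugin would replace TinyWorkflow with the plugin's state objects and run under pytest with the --ckan-ini option shown in the Development section.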

Troubleshooting:

  1. State not changing:

    • Verify workflow plugin is enabled
    • Check can_transition_to() permissions
    • Review transition validation logic
    • Check logs for errors
    • Verify package is handled by workflow
  2. Packages visible when they shouldn’t be:

    • Check is_visible_in_search() implementation
    • Verify search filters applied
    • Review access control logic
    • Test with different user roles
  3. API errors:

    • Validate data_dict parameters
    • Check required fields provided
    • Verify workflow handles package type
    • Review error messages in logs

Best Practices:

  1. State Design:

    • Keep states simple and focused
    • Define clear transition rules
    • Document state meanings
    • Use meaningful state names
  2. Transitions:

    • Validate before transitioning
    • Log all transitions
    • Send appropriate notifications
    • Handle errors gracefully
  3. Permissions:

    • Implement least-privilege access
    • Test permission checks thoroughly
    • Document role requirements
    • Consider organization-level roles
  4. Testing:

    • Test all transition paths
    • Verify permission enforcement
    • Check visibility rules
    • Test with various user roles

Development Status: Beta (4)

License: AGPL v3.0 or later

Keywords: CKAN, workflow, lifecycle, state, approval, editorial

Developer: DataShades / Link Digital

Related Extensions:

    • ckanext-issues: Issue tracking workflow
    • ckanext-requestdata: Data request workflow
    • ckanext-datastore: Database workflow integration

Plugins to configure (ckan.ini)
# workflow=ckanext.workflow.plugin:WorkflowPlugin
# native_workflow=ckanext.workflow.plugin:NativeWorkflowPlugin
# test_workflow=ckanext.workflow.tests.plugin:TestWorkflowPlugin
CKAN Settings (ckan.ini)
# No global configuration - workflows implemented via plugin interface
DB migration to be executed
(not set)