Extension Git-based Dataset Storage

Extension Basics

Title	Git-based Dataset Storage
Name	ckanext-gitdatahub
Type	Public extension
Description	Git-based metadata storage system with version control, branching, and distributed dataset management.
CKAN versions	~2.8.0, ~2.9.0, ~2.10.0 (exactly matched versions: 2.8.0 to 2.8.12, 2.9.0 to 2.9.11, 2.10.0 to 2.10.8)
Download-Url (zip)	https://github.com/datopian/ckanext-gitdatahub.git#egg=ckanext-gitdatahub
Download-Url commit date	2020-06-02
Url to repo	https://github.com/datopian/ckanext-gitdatahub
Category	Cloud Infrastructure & Storage
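
The version constraints above appear to follow the usual tilde convention: same major and minor version, with a patch level at or above the one given. That reading is an assumption inferred from the matched versions listed, not something the listing states. A minimal Python sketch under that assumption, with hypothetical helper names:

```python
def parse_version(text):
    """Turn '2.8.10' into (2, 8, 10) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def matches_tilde(constraint, version):
    """Assumed semantics of '~X.Y.Z': same X.Y, patch level >= Z."""
    base = parse_version(constraint.lstrip("~"))
    candidate = parse_version(version)
    return candidate[:2] == base[:2] and candidate >= base


def is_supported(version, constraints=("~2.8.0", "~2.9.0", "~2.10.0")):
    """Check a CKAN version against the constraints listed for this extension."""
    return any(matches_tilde(c, version) for c in constraints)
```

Comparing parsed tuples rather than strings avoids the lexical ordering problem in which 2.8.10 would sort before 2.8.2.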

Background Infos

Description (long)	The Git-based Dataset Storage extension stores CKAN dataset metadata in Git repositories, which adds version control, collaborative editing, branching workflows, and distributed synchronization to CKAN. Because all dataset metadata lives in Git, each dataset has a full version history, and branch and merge operations support collaborative dataset development with conflict resolution and change tracking. Distributed CKAN instances can synchronize metadata changes through Git push/pull operations, enabling federated data management across multiple organizations or geographic locations. Further features include metadata branching for experimental changes, pull request workflows for collaborative editing, and automated synchronization between CKAN instances through Git hooks and API integration. Change tracking covers detailed commit histories, author attribution, and rollback. Administrative tools include repository management interfaces, branch visualization, conflict resolution workflows, and automated backups that rely on Git’s distributed nature. The extension integrates with external Git services such as GitHub, GitLab, and Bitbucket for collaboration and backup. Custom metadata schemas are supported, so schema evolution and migration can be handled through version control. Performance optimizations include efficient Git operations, metadata caching, and optimized synchronization protocols for large-scale deployments.
The extension is intended for organizations that need collaborative metadata management, distributed CKAN deployments across multiple sites, research consortiums with shared data governance, and installations where version control, change attribution, and distributed resilience are critical to data integrity and data stewardship across organizational boundaries.
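
To make the idea of storing metadata in Git more concrete, the following Python sketch writes a dataset's metadata to a JSON file inside a repository checkout and builds the Git commands that would record the change. It is an illustration only and not the extension's implementation: the one-directory-per-dataset layout, the `datapackage.json` file name, the `origin` remote, and all function names are assumptions. The commands are returned as argument lists rather than executed, so they can be inspected or passed to `subprocess.run`.

```python
import json
from pathlib import Path


def metadata_path(repo_dir, dataset_name):
    """Hypothetical layout: one directory per dataset holding one JSON file."""
    return Path(repo_dir) / dataset_name / "datapackage.json"


def write_metadata(repo_dir, dataset):
    """Serialize dataset metadata into the repository checkout."""
    path = metadata_path(repo_dir, dataset["name"])
    path.parent.mkdir(parents=True, exist_ok=True)
    # Sorted keys and fixed indentation keep Git diffs small and stable.
    text = json.dumps(dataset, indent=2, sort_keys=True) + "\n"
    path.write_text(text, encoding="utf-8")
    return path


def git_commands(repo_dir, path, author_name, author_email, message,
                 branch="main", push=False):
    """Build (but do not run) the git invocations for one metadata change."""
    base = ["git", "-C", str(repo_dir)]
    relative = Path(path).relative_to(repo_dir).as_posix()
    commands = [
        base + ["add", "--", relative],
        # -c sets the committer identity for this single invocation.
        base + ["-c", f"user.name={author_name}",
                "-c", f"user.email={author_email}",
                "commit", "-m", message],
    ]
    if push:
        # "origin" is an assumed remote name.
        commands.append(base + ["push", "origin", branch])
    return commands
```

Each saved change becomes one commit, which is what gives the per-dataset history, author attribution, and rollback described above.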
Version	Latest
Version release date	2020-06-02
Contact name	Datopian Team
Contact email	info@datopian.com
Contact Url	(not set)

Installation Guide

Configuration hints	Implements Git-based storage for dataset metadata with version control
Plugins to configure (ckan.ini)	gitdatahub
CKAN Settings (ckan.ini)
# ckanext.gitdatahub.git_repo_url = https://github.com/your-org/metadata-repo.git
# ckanext.gitdatahub.git_user_name = CKAN System
# ckanext.gitdatahub.git_user_email = ckan@example.com
# ckanext.gitdatahub.auto_push = true
# ckanext.gitdatahub.branch_name = main
# ckanext.gitdatahub.enable_webhooks = true
DB migration to be executed	gitdatahub initdb
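
As a way to check the settings listed above before starting CKAN, the following Python sketch reads the `ckanext.gitdatahub.*` keys from the text of a `ckan.ini` file. The setting names come from this listing. The `[app:main]` section name follows the usual CKAN configuration layout, and the fallback values and the function name are assumptions made for illustration, not documented defaults of the extension.

```python
import configparser

PREFIX = "ckanext.gitdatahub."

# Fallback values are assumptions for illustration only.
ASSUMED_DEFAULTS = {
    "git_repo_url": "",
    "git_user_name": "CKAN System",
    "git_user_email": "ckan@example.com",
    "auto_push": "false",
    "branch_name": "main",
    "enable_webhooks": "false",
}
BOOLEAN_KEYS = {"auto_push", "enable_webhooks"}
TRUE_VALUES = {"true", "yes", "on", "1"}


def load_gitdatahub_settings(ini_text, section="app:main"):
    """Return the ckanext.gitdatahub.* settings found in a ckan.ini string."""
    # interpolation=None because ckan.ini values may contain literal '%'.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    settings = dict(ASSUMED_DEFAULTS)
    if parser.has_section(section):
        for key, value in parser.items(section):
            if key.startswith(PREFIX):
                settings[key[len(PREFIX):]] = value.strip()
    # Convert the on/off settings to real booleans.
    for key in BOOLEAN_KEYS:
        settings[key] = str(settings[key]).lower() in TRUE_VALUES
    return settings
```

Lines that are still commented out with `#`, as in the listing, are ignored by the parser, so only settings that have been uncommented override the fallbacks.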
