Requirements:
- CKAN <= 2.8 (CKAN 2.9 compatibility not yet verified)
- python-magic library for file type sniffing
Installation:
1. Activate CKAN virtualenv
2. Install extension:
pip install -e git+https://github.com/qld-gov-au/ckanext-resource-type-validation.git#egg=ckanext-resource-type-validation
3. Install dependencies:
pip install -r ckanext-resource-type-validation/requirements.txt
4. Add the plugin to ckan.plugins in your CKAN configuration file:
resource_type_validation
Configuration:
Path to configuration file for file types (optional)
Default: ckanext/resource_type_validation/resources/resource_types.json
ckanext.resource_validation.types_file = /path/to/file.json
Support contact for error messages (optional)
ckanext.resource_validation.support_contact = webmaster@example.com
Whitelist of allowed MIME types (optional)
ckan.mimetypes_allowed = application/pdf,text/plain,text/xml
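As a rough illustration of how these three options sit together, the sketch below parses a sample configuration with Python's standard configparser module. The [app:main] section name, the sample values, the comma-separated reading of the whitelist, and the helper name read_validation_settings are all assumptions made for illustration. The extension reads these settings through CKAN itself, not through this code.

```python
import configparser

# Sample settings copied from the options above; the section name is an assumption.
SAMPLE_INI = """
[app:main]
ckan.plugins = resource_type_validation
ckanext.resource_validation.types_file = /path/to/file.json
ckanext.resource_validation.support_contact = webmaster@example.com
ckan.mimetypes_allowed = application/pdf,text/plain,text/xml
"""

DEFAULT_TYPES_FILE = "ckanext/resource_type_validation/resources/resource_types.json"


def read_validation_settings(ini_text, section="app:main"):
    """Hypothetical helper: collect the three optional settings from INI text."""
    # Interpolation is disabled so that literal '%' characters do not raise errors.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    options = parser[section]
    allowed = options.get("ckan.mimetypes_allowed", "")
    return {
        "types_file": options.get(
            "ckanext.resource_validation.types_file", DEFAULT_TYPES_FILE
        ),
        "support_contact": options.get(
            "ckanext.resource_validation.support_contact"
        ),
        # Assumes a comma-separated list, as in the example above.
        "mimetypes_allowed": [m.strip() for m in allowed.split(",") if m.strip()],
    }


settings = read_validation_settings(SAMPLE_INI)
print(settings["mimetypes_allowed"])
```

When an option is missing, the helper falls back to the documented default for the types file, None for the support contact, and an empty whitelist.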
Configuration file structure (all optional):
allowed_extensions: List of allowed file extensions (case-insensitive)
Example: ["pdf", "csv", "json", "xml"]
allowed_overrides: MIME type subtype mappings
Example: {"text/plain": ["application/xml", "text/"], "application/octet-stream": [""]}
- In this example, application/xml is treated as a subtype of text/plain
- Wildcards: "" matches any type, "prefix/" matches any type with that prefix
equal_types: Lists of interchangeable types
Example: [["text/xml", "application/xml"], ["text/csv", "application/csv"]]
archive_types: Types requiring special handling
Example: ["application/zip", "application/x-tar", "application/gzip"]
- An archive may declare any format, since the format refers to its contents
- The archive itself must be well-formed: its extension and contents must match
generic_types: Generic supertypes
Example: ["text/plain", "application/octet-stream"]
- A file whose contents are sniffed as text/plain can specify a CSV extension and format
- A file with a .txt extension cannot specify a CSV format
- This asymmetry guards against browser-based content-sniffing attacks
extra_mimetypes: Custom extension to MIME type mappings
Example: {".ttf": "text/plain", ".geojson": "application/geo+json"}
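The examples above can be assembled into a single structure, sketched here in Python. It assumes that all six keys sit at the top level of one JSON object, which this README does not confirm. The helpers override_allows and types_equal are hypothetical and only illustrate one reading of the allowed_overrides and equal_types rules; they are not the extension's own code.

```python
import json

# Values copied from the examples above; the top-level layout is an assumption.
RESOURCE_TYPES = {
    "allowed_extensions": ["pdf", "csv", "json", "xml"],
    "allowed_overrides": {
        "text/plain": ["application/xml", "text/"],
        "application/octet-stream": [""],
    },
    "equal_types": [
        ["text/xml", "application/xml"],
        ["text/csv", "application/csv"],
    ],
    "archive_types": ["application/zip", "application/x-tar", "application/gzip"],
    "generic_types": ["text/plain", "application/octet-stream"],
    "extra_mimetypes": {".ttf": "text/plain", ".geojson": "application/geo+json"},
}


def override_allows(sniffed, claimed, overrides):
    """Hypothetical check: may content sniffed as `sniffed` claim to be `claimed`?"""
    for pattern in overrides.get(sniffed, []):
        # "" and "prefix/" are read as wildcards; anything else must match exactly.
        is_wildcard = pattern == "" or pattern.endswith("/")
        if is_wildcard and claimed.startswith(pattern):
            return True
        if pattern == claimed:
            return True
    return False


def types_equal(a, b, equal_types):
    """Hypothetical check: are two MIME types identical or listed as interchangeable?"""
    return a == b or any(a in group and b in group for group in equal_types)


# Print the structure as it might appear in the JSON types file.
print(json.dumps(RESOURCE_TYPES, indent=2))
```

Under this reading, content sniffed as text/plain may claim text/csv or application/xml but not application/pdf, while content sniffed as application/octet-stream may claim any type.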
Benefits:
- Reduces staff workload fixing miscategorized files
- Better format restrictions via type sniffing
- Prevents invalid file uploads with fake extensions
- Protects against content-sniffing attacks
Testing:
python ckanext/resource_type_validation/test_mime_type_validation.py
OR
nosetests --ckan --with-pylons=test.ini ckanext/resource_type_validation