FileFormats
FileFormats is a library of Python classes that correspond to different file formats, for use in file-type detection and validation, MIME-type lookup, and file handling. The format classes also provide hooks for methods that read and manipulate the data contained in the files, which makes it easier to write duck-typed code. Unlike many other Python packages, FileFormats supports multi-file data formats, e.g. formats with separate header/data files or directories containing specific files, and handles them just like single-file types.
File-format types are typically identified by a combination of file extensions and "magic numbers", where applicable. In these cases new formats can be defined in just a few lines. However, for more exotic file formats like MRtrix Image Header, which requires inspection of headers to locate other members of the "file set", FileFormats provides a framework to add custom detection methods.
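To make the extension-plus-magic-number idea concrete, here is a minimal, self-contained sketch of the technique using only the standard library. It is not the FileFormats API: the `SketchFormat` base class, its `ext` and `magic_number` attributes, and the `detect` helper are hypothetical names chosen for illustration. The point is only to show why a format that is fully described by an extension and a byte signature can be declared in a few lines.

```python
from pathlib import Path


class SketchFormat:
    """Hypothetical base class for illustration; not part of FileFormats."""

    ext = ""            # expected file extension, e.g. ".png"
    magic_number = b""  # expected leading bytes; empty means "no signature check"

    @classmethod
    def matches(cls, path):
        """Return True if the path has the right extension and leading bytes."""
        path = Path(path)
        if path.suffix.lower() != cls.ext:
            return False
        with path.open("rb") as f:
            return f.read(len(cls.magic_number)) == cls.magic_number


# Once the base class exists, each new format is only a few lines long.
class Png(SketchFormat):
    ext = ".png"
    magic_number = b"\x89PNG\r\n\x1a\n"  # standard 8-byte PNG signature


class Gif(SketchFormat):
    ext = ".gif"
    magic_number = b"GIF8"  # common prefix of "GIF87a" and "GIF89a"


def detect(path, candidates=(Png, Gif)):
    """Return the first candidate class that matches the file, or None."""
    for fmt in candidates:
        if fmt.matches(path):
            return fmt
    return None
```

A format such as the MRtrix Image Header would not fit this pattern, because deciding whether a file matches involves reading the header to find the other files in the set; that is the case where a custom detection method is needed.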
Extensions and Extras
The main FileFormats package covers all file types with registered MIME types (see IANA MIME-types). Additional, domain-specific formats can be added via the FileFormats extension framework, such as fileformats-medimage for medical imaging data and fileformats-datascience for formats commonly found in data science. These extension packages are not yet comprehensive, but are expected to grow as new use cases are found and new formats are added (see Extensions).
The main FileFormats package and its extension packages don't have any external dependencies. Extra functionality that requires external dependencies, such as libraries to read and write the file data, is implemented in separate extras packages (see Extras), e.g. fileformats-extras and fileformats-medimage-extras, to keep the base packages for format detection and file handling extremely lightweight.
Installation
FileFormats can be installed for Python >=3.8 using pip:
$ python3 -m pip install fileformats
Extension packages can be installed similarly:
$ python3 -m pip install fileformats-medimage fileformats-datascience
These installations have no dependencies and provide basic format detection and
file handling functionality. However, for metadata inspection and format conversion methods
that require external dependencies, you will need to install the fileformats-extras
package:
$ python3 -m pip install fileformats-extras
The extension packages have corresponding extras packages, which are installed the same way:
$ python3 -m pip install fileformats-medimage-extras fileformats-datascience-extras
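Because the base, extension, and extras packages are installed separately, it can be handy to confirm which of them are present in the current environment. The following is a small sketch using only the standard library's `importlib.metadata` (available from Python 3.8); the `installed_versions` helper and the `PACKAGES` list are illustrative names, with the package names taken from the install commands above.

```python
from importlib import metadata

# Distribution names as used in the pip commands above.
PACKAGES = [
    "fileformats",
    "fileformats-extras",
    "fileformats-medimage",
    "fileformats-medimage-extras",
    "fileformats-datascience",
    "fileformats-datascience-extras",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```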
Note
See Extensions and Extras for instructions on how to implement your own extensions and extras, respectively.
License
This work is licensed under a Creative Commons Attribution 4.0 International License.