Alternative storage systems

Alternative storage systems can be implemented by writing a new subclass of DataStore. The developers are interested in adding support for new systems, so if you would like help using Arcana with a different storage system, please create an issue for it in the GitHub Issue Tracker.

Required methods

When subclassing DataStore, the following abstract methods must be overridden to implement the appropriate functionality of the data store. For a reference implementation please see arcana.dirtree.data.SimpleStore.

class arcana.core.data.store.DataStore

Abstract base class for all data store adapters. A data store can be an external data management system, e.g. XNAT, OpenNeuro, Datalad or just a defined structure of how to lay out data within a file-system, e.g. BIDS.

For a data management system/data structure to be compatible with Arcana, it must meet a number of criteria. In Arcana, a store is assumed to

  • contain multiple projects/datasets addressable by unique IDs.

  • organise data within each project/dataset in trees

  • store arbitrary numbers of data “items” (e.g. “file-sets” and fields) within each tree node (including non-leaf nodes) addressable by unique “paths” relative to the node.

  • allow derivative data to be stored in separate namespaces for different analyses on the same data
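
The four assumptions above can be pictured with a small in-memory model. This is only an illustrative sketch: the Node class, its attribute names and the example IDs are hypothetical and are not part of Arcana's API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Node:
    """Hypothetical tree node (not an Arcana class)."""

    # Items stored at this node, keyed by a path relative to the node
    items: Dict[str, Any] = field(default_factory=dict)
    # Derivative items, keyed first by namespace (one per analysis), then by path
    derivatives: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    # Child nodes, keyed by ID
    children: Dict[str, "Node"] = field(default_factory=dict)


# The store maps unique dataset IDs to the root node of each dataset's tree
store: Dict[str, Node] = {}
root = store.setdefault("my-project", Node())
subject = root.children.setdefault("subject01", Node())
session = subject.children.setdefault("session01", Node())

# Items can live on leaf nodes and on non-leaf nodes alike
session.items["T1w"] = "t1w.nii.gz"  # a file-set on a leaf node
subject.items["age"] = 42            # a field on a non-leaf node

# Two analyses keep a same-named derivative apart via separate namespaces
session.derivatives.setdefault("analysis_a", {})["mask"] = "mask_a.nii.gz"
session.derivatives.setdefault("analysis_b", {})["mask"] = "mask_b.nii.gz"
```

A real store would map these concepts onto its own structures (e.g. directories on a file system, or projects, subjects and sessions on a remote server).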

abstract load_dataset_definition(dataset_id: str, name: str) → Dict[str, Any]

Load definition of a dataset saved within the store

Parameters:
  • dataset_id (str) – The ID (e.g. file-system path, XNAT project ID) of the project

  • name (str) – Name for the dataset definition to distinguish it from other definitions for the same directory/project

Returns:

definition – The dictionary representation of the Dataset object that was saved in the data store

Return type:

ty.Dict[str, Any]

abstract save_dataset_definition(dataset_id: str, definition: Dict[str, Any], name: str)

Save definition of dataset within the store

Parameters:
  • dataset_id (str) – The ID/path of the dataset within the store

  • definition (ty.Dict[str, Any]) – A dictionary representation of the Dataset to be saved. The dictionary is in a format ready to be dumped to file as JSON or YAML.

  • name (str) – Name for the dataset definition to distinguish it from other definitions for the same directory/project
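
As a rough illustration of the two required methods, the sketch below saves and loads definitions as JSON files under the dataset directory. It is deliberately standalone: JsonDefinitionStore does not subclass DataStore, and the ".definitions" directory name, the file naming scheme and the behaviour when a definition is missing (here, a FileNotFoundError) are all assumptions made for this example rather than Arcana conventions.

```python
import json
from pathlib import Path
from typing import Any, Dict


class JsonDefinitionStore:
    """Hypothetical stand-in for a DataStore subclass (does not import arcana)."""

    # Assumed sub-directory for saved definitions; not taken from Arcana
    DEFINITIONS_DIR = ".definitions"

    def _definition_path(self, dataset_id: str, name: str) -> Path:
        # dataset_id is treated here as a file-system path to the dataset root
        return Path(dataset_id) / self.DEFINITIONS_DIR / f"{name}.json"

    def save_dataset_definition(
        self, dataset_id: str, definition: Dict[str, Any], name: str
    ) -> None:
        path = self._definition_path(dataset_id, name)
        path.parent.mkdir(parents=True, exist_ok=True)
        # The definition is already a plain dictionary, so it can be dumped directly
        with open(path, "w") as f:
            json.dump(definition, f, indent=2)

    def load_dataset_definition(self, dataset_id: str, name: str) -> Dict[str, Any]:
        # Raises FileNotFoundError if no definition was saved under this name
        with open(self._definition_path(dataset_id, name)) as f:
            return json.load(f)
```

Because each definition is stored under its own name, several definitions can coexist for the same directory/project. In a real adapter the same two signatures would be overridden on a DataStore subclass; arcana.dirtree.data.SimpleStore is the reference to follow.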

Optional methods

Overriding the following methods is not strictly necessary, but doing so can improve performance significantly: DataStore.get_checksums() avoids unnecessary downloads, and DataStore.connect() and DataStore.disconnect() avoid unnecessary remote connections by caching the connection between multiple calls.

class arcana.core.data.store.DataStore

abstract connect() → Any

If the store requires a connection session, open it here

Returns:

session – a session object that will be stored in the connection manager and accessible at DataStore.connection

Return type:

Any

abstract disconnect(session: Any)

If the store requires a connection session, close it here

Parameters:

session (Any) – the session object returned by connect(), to be closed gracefully
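
The sketch below illustrates how a connect/disconnect pair lets one session be cached and reused across several calls. Everything in it is hypothetical: FakeSession stands in for a real client session, RemoteStoreSketch does not subclass DataStore, and the context-manager methods only imitate what a connection manager might do with these two hooks; they are not Arcana's implementation.

```python
from typing import Any


class FakeSession:
    """Hypothetical session object, e.g. a logged-in connection to a server."""

    def __init__(self, server: str):
        self.server = server
        self.is_open = True

    def close(self) -> None:
        self.is_open = False


class RemoteStoreSketch:
    """Hypothetical store mimicking the connect/disconnect pair."""

    def __init__(self, server: str):
        self.server = server
        self.connection = None  # assumed to mirror DataStore.connection
        self.connect_calls = 0  # counter used only to show the caching effect

    def connect(self) -> Any:
        # Open a new session; the caller caches the returned object
        self.connect_calls += 1
        return FakeSession(self.server)

    def disconnect(self, session: Any) -> None:
        # Gracefully close the session that connect() returned
        session.close()

    # Assumed connection-manager behaviour (nesting is not handled here)
    def __enter__(self) -> "RemoteStoreSketch":
        if self.connection is None:
            self.connection = self.connect()
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        if self.connection is not None:
            self.disconnect(self.connection)
            self.connection = None
        return False  # do not suppress exceptions


store = RemoteStoreSketch("https://example.org")
with store:
    first = store.connection
    second = store.connection  # the same cached session, no second connect()
```

Within the with block both accesses return the same session object, so only one connection is opened; on exit the session is closed and the cached reference is cleared.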