BulkManager (session.bulk)

class vsc_irods.manager.bulk_manager.BulkManager(session)

A class for easier ‘bulk’ operations with the iRODS file system

remove(iterator, recurse=False, force=False, interactive=False, verbose=False, **options)

Remove iRODS data objects and/or collections, in a manner that resembles the UNIX ‘rm’ command.

Examples:

>>> session.bulk.remove('tmpdir*', recurse=True)
>>> session.bulk.remove('~/molecule_database/*.xyz')

Arguments:

iterator: iterator or str

Defines which items are subject to the bulk operation. Can be an iterator (e.g. using search_manager.find()) or a string (which will be used to construct a search_manager.iglob() iterator). Matching data objects (and, if used recursively, collections) will be removed.

recurse: bool (default: False)

Whether to use recursion, meaning that matching collections will also be removed, together with their data objects and subcollections.

force: bool (default: False)

Whether to immediately remove the collections or data objects, without putting them in the trash.

interactive: bool (default: False)

Whether to prompt for permission before every removal.

verbose: bool (default: False)

Whether to print more output.

options: (any remaining keyword arguments)

Additional options to be passed on to PRC’s collections.remove() and data_objects.unlink() methods.
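
Because iterator may be any iterator, the selection can be narrowed in plain Python before it is handed to remove(). The following is a minimal sketch, assuming the iterator yields iRODS paths as strings; keep_suffix is a hypothetical helper and not part of vsc_irods, and the paths are made-up values.

```python
def keep_suffix(paths, suffix):
    """Yield only those paths that end with the given suffix.

    Hypothetical helper: assumes `paths` yields iRODS paths as strings.
    """
    for path in paths:
        if path.endswith(suffix):
            yield path


# Stand-in for the paths a search iterator might yield (made-up values).
candidates = ['/zone/home/user/a.xyz', '/zone/home/user/b.log']
selected = list(keep_suffix(candidates, '.xyz'))
```

A generator built this way could then be passed as the first argument of remove() in place of a glob string.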

move(iterator, irods_path, clobber=True, interactive=False, verbose=False)

Move or rename iRODS data objects and/or collections, in a manner that resembles the UNIX ‘mv’ command.

Raises a CollectionDoesNotExist exception if the iterator corresponds to more than one item and the irods_path destination does not correspond to an existing collection.

Examples:

>>> session.bulk.move('tmpfiles*', '~/tmpdir/', verbose=True)
>>> session.bulk.move('./parent/dirname', './parent/dirname_new')

Arguments:

iterator: iterator or str

Defines which items are subject to the bulk operation. Can be an iterator (e.g. using search_manager.find()) or a string (which will be used to construct a search_manager.iglob() iterator). Matching data objects and collections will be moved to the new path.

irods_path: str

The (absolute or relative) path on the iRODS file system to which the data objects and collections will be moved.

clobber: bool (default: True)

Whether to overwrite existing data objects.

interactive: bool (default: False)

Whether to prompt for permission before overwriting existing data objects. If True, the value of the ‘clobber’ argument is ignored.

verbose: bool (default: False)

Whether to print more output.
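
The destination rule stated for move() (several items require an existing collection as destination) can be sketched as a small check. This is an illustration only, not the library's implementation: the exception class below is a local stand-in with the same name, and check_destination is a hypothetical function.

```python
class CollectionDoesNotExist(Exception):
    """Local stand-in for the exception of the same name (not imported from PRC)."""


def check_destination(n_items, destination_is_collection):
    """Raise if several items are moved to something that is not an existing collection."""
    if n_items > 1 and not destination_is_collection:
        raise CollectionDoesNotExist(
            'moving %d items requires an existing destination collection' % n_items
        )
    return True
```

A single item may be moved to a path that does not exist yet, which is what makes renaming possible.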

get(iterator, local_path='.', recurse=False, clobber=True, interactive=False, return_data_objects=False, verbose=False, **options)

Copy iRODS data objects and/or collections to the local machine.

Examples:

>>> session.bulk.get('tmpdir*', recurse=True)
>>> session.bulk.get('~/irods_db/*.xyz', local_path='./local_db')

Arguments:

iterator: iterator or str

Defines which items are subject to the bulk operation. Can be an iterator (e.g. using search_manager.find()) or a string (which will be used to construct a search_manager.iglob() iterator). Matching data objects (and, if used recursively, collections) will be copied to the local machine.

local_path: str (default: ‘.’)

The (absolute or relative) path on the local file system where the data objects and collections will be copied to.

recurse: bool (default: False)

Whether to use recursion, meaning that matching collections will also be copied, together with their data objects and subcollections.

clobber: bool (default: True)

Whether to overwrite existing local files.

interactive: bool (default: False)

Whether to prompt for permission before overwriting existing local files. If True, the value of the ‘clobber’ argument is ignored.

return_data_objects: bool (default: False)

Whether to return a list of iRODSDataObject instances for the matching data objects, instead of downloading them to the local file system. If True, the ‘clobber’ and ‘interactive’ arguments are ignored.

verbose: bool (default: False)

Whether to print more output.

options: (any remaining keyword arguments)

Additional options to be passed on to PRC’s data_objects.get() method.
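
The interplay between ‘clobber’ and ‘interactive’ described above can be sketched as a single decision function. This is a hedged illustration, not the code used by get(): should_overwrite is a hypothetical name, and the prompt text and the accepted answers are assumptions.

```python
def should_overwrite(exists, clobber=True, interactive=False, ask=input):
    """Decide whether a target may be written.

    Sketch of the documented rule: when `interactive` is True the user
    decides and `clobber` is ignored; otherwise `clobber` decides.
    """
    if not exists:
        return True  # nothing to overwrite
    if interactive:
        # Prompt wording and accepted answers are assumptions.
        answer = ask('Overwrite existing file? [y/N] ')
        return answer.strip().lower() in ('y', 'yes')
    return clobber
```

Passing a custom ask callable, as in should_overwrite(True, interactive=True, ask=lambda prompt: 'y'), makes the rule easy to exercise without a terminal.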

put(iterator, irods_path='.', recurse=False, clobber=True, interactive=False, verbose=False, create_options={}, **options)

Copy local files and/or folders to the iRODS server, in a manner that resembles the UNIX ‘cp’ command.

Examples:

>>> session.bulk.put('tmpdir*', recurse=True)
>>> session.bulk.put('~/local_db/*.xyz', irods_path='./irods_db/')

Arguments:

iterator: iterator or str

Defines which items are subject to the bulk operation. Can be an iterator (e.g. using search_manager.find()) or a string (which will be used to construct a search_manager.iglob() iterator). Matching files on the local machine (and, if used recursively, directories) will be copied to the iRODS server.

irods_path: str (default: ‘.’)

The (absolute or relative) path on the iRODS file system where the local files and folders will be copied to.

recurse: bool (default: False)

Whether to use recursion, meaning that matching folders will also be copied to the iRODS server, together with their files and subfolders.

clobber: bool (default: True)

Whether to overwrite existing data objects.

interactive: bool (default: False)

Whether to prompt for permission before overwriting existing data objects. If True, the value of the ‘clobber’ argument is ignored.

verbose: bool (default: False)

Whether to print more output.

create_options: dict (default: {})

Additional options to be passed on to PRC’s collections.create() method.

options: (any remaining keyword arguments)

Additional options to be passed on to PRC’s data_objects.put() method.
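
The two option channels of put() go to different PRC methods: create_options to collections.create() and the remaining keyword arguments to data_objects.put(). The sketch below only records which options would accompany which call; route_options is a hypothetical function, the option names are placeholders rather than real PRC options, and nothing here contacts an iRODS server.

```python
def route_options(entries, create_options=None, **options):
    """Return (target method, path, keyword arguments) triples.

    `entries` is a list of (kind, path) pairs, with kind 'dir' or 'file'.
    Illustration only: it shows the routing, not the actual transfer.
    """
    create_options = dict(create_options or {})
    calls = []
    for kind, path in entries:
        if kind == 'dir':
            calls.append(('collections.create', path, create_options))
        else:
            calls.append(('data_objects.put', path, dict(options)))
    return calls


# Placeholder option names, chosen only to make the routing visible.
calls = route_options(
    [('dir', 'local_db'), ('file', 'local_db/a.xyz')],
    create_options={'example_create_option': 1},
    example_put_option=2,
)
```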

metadata(iterator, action='add', recurse=False, collection_avu=[], object_avu=[], verbose=False)

Add metadata to, or remove metadata from, iRODS data objects and/or collections.

Examples:

>>> session.bulk.metadata('tmpdir*', action='add', recurse=True,
...                       object_avu=('is_temporary_file',),
...                       collection_avu=('is_temporary_dir',))

Arguments:

iterator: iterator or str

Defines which items are subject to the bulk operation. Can be an iterator (e.g. using search_manager.find()) or a string (which will be used to construct a search_manager.iglob() iterator). Metadata will be modified for matching data objects and, if used recursively, collections.

action: str (default: ‘add’)

The action to perform. Choose either ‘add’ or ‘remove’.

recurse: bool (default: False)

Whether to use recursion, meaning that metadata will be modified for matching collections and their data objects and subcollections.

collection_avu: tuple or list of tuples (default: [])

One or several attribute-value[-unit] tuples to be added to or removed from collections.

object_avu: tuple or list of tuples (default: [])

One or several attribute-value[-unit] tuples to be added to or removed from data objects.

verbose: bool (default: False)

Whether to print more output.
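
Since collection_avu and object_avu accept either a single tuple or a list of tuples, a caller that builds these arguments programmatically may want one uniform shape. A minimal sketch follows; normalize_avus is a hypothetical helper, not a vsc_irods function, and how the library itself normalizes the argument is not stated here.

```python
def normalize_avus(avu):
    """Return a list of AVU tuples from a single tuple or a list of tuples."""
    if isinstance(avu, tuple):
        return [avu]  # a lone tuple becomes a one-element list
    return [tuple(item) for item in avu]
```

For example, normalize_avus(('kind', 'temporary')) gives a one-element list, while a list of tuples is returned as an equivalent list.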

add_job_metadata(iterator, recurse=False, verbose=False)

Add job-related metadata to selected data objects and collections.

Examples:

>>> session.bulk.add_job_metadata('~/data/out*.txt')

Arguments:

iterator: iterator or str

Defines which items are subject to the bulk operation. Can be an iterator (e.g. using search_manager.find()) or a string (which will be used to construct a search_manager.iglob() iterator). Job metadata will be added for matching data objects and, if used recursively, collections.

recurse: bool (default: False)

Whether to use recursion, meaning that job metadata will be added to matching collections and their data objects and subcollections.

verbose: bool (default: False)

Whether to print more output.

size(iterator, recurse=False, verbose=False)

Yields (path, size-in-bytes) tuples for the selected data objects and collections.

Examples:

>>> session.bulk.size('~/data/out*.txt')
>>> session.bulk.size('./data', recurse=True)

Arguments:

iterator: iterator or str

Defines which items are subject to the bulk operation. Can be an iterator (e.g. using search_manager.find()) or a string (which will be used to construct a search_manager.iglob() iterator). Data sizes will be returned for matching data objects and, if used recursively, collections.

recurse: bool (default: False)

Whether to use recursion, meaning that the data size of matching collections will be calculated as the sum of the sizes of their data objects and subcollections.

verbose: bool (default: False)

Whether to print more output.
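
Because size() yields (path, size-in-bytes) tuples, its output can be consumed lazily. The sketch below counts and sums such tuples; summarize_sizes is a hypothetical helper and the sample values are made up.

```python
def summarize_sizes(pairs):
    """Return (number of items, total size in bytes) for (path, size) tuples."""
    count = 0
    total = 0
    for _path, size in pairs:
        count += 1
        total += size
    return count, total


# Made-up values standing in for what size() might yield.
sample = [('/zone/home/user/out1.txt', 1024), ('/zone/home/user/out2.txt', 2048)]
count, total = summarize_sizes(sample)
```

Whether such a sum is meaningful with recurse=True depends on whether nested items are yielded in addition to their parent collections, which the description above does not specify.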