BulkManager (session.bulk)
-
class vsc_irods.manager.bulk_manager.BulkManager(session)
A class for easier ‘bulk’ operations with the iRODS file system.
-
remove(iterator, recurse=False, force=False, interactive=False, verbose=False, **options)
Remove iRODS data objects and/or collections, in a manner that resembles the UNIX ‘rm’ command.
Examples:
>>> session.bulk.remove('tmpdir*', recurse=True)
>>> session.bulk.remove('~/molecule_database/*.xyz')
Arguments:
- iterator: iterator or str
Defines which items are subject to the bulk operation. Can be an iterator (e.g. using search_manager.find()) or a string (which will be used to construct a search_manager.iglob() iterator). Matching data objects (and, if used recursively, collections) will be removed.
- recurse: bool (default: False)
Whether to use recursion, meaning that matching collections and their data objects and subcollections will also be removed.
- force: bool (default: False)
Whether to immediately remove the collections or data objects, without putting them in the trash.
- interactive: bool (default: False)
Whether to prompt for permission before every removal.
- verbose: bool (default: False)
Whether to print more output.
- options: (any remaining keyword arguments)
Additional options to be passed on to PRC’s collections.remove() and data_objects.unlink() methods.
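The iterator argument accepts either a ready-made iterator or a glob-style string. The following is a minimal sketch of how such a dispatch could look. It is not the BulkManager implementation: as_iterator and fake_iglob are hypothetical names, and a local stub stands in for search_manager.iglob() so that the example runs without an iRODS connection.

```python
import fnmatch
import glob
from collections.abc import Iterable


def as_iterator(items, globber=glob.iglob):
    """Return an iterator over paths for a pattern string or an iterable.

    Hypothetical helper, not part of vsc_irods. `globber` stands in for
    search_manager.iglob(); it defaults to the standard library's glob.iglob.
    """
    if isinstance(items, str):
        # A string is treated as a glob pattern and expanded by the globber.
        return iter(globber(items))
    if isinstance(items, Iterable):
        # Any other iterable (list, generator, ...) is passed through.
        return iter(items)
    raise TypeError("expected a pattern string or an iterable of paths")


def fake_iglob(pattern):
    """Stub globber matching the pattern against a fixed list of names."""
    names = ["tmpdir1", "tmpdir2", "data"]
    return iter(fnmatch.filter(names, pattern))


matches = list(as_iterator("tmpdir*", globber=fake_iglob))
passed_through = list(as_iterator(["a.xyz", "b.xyz"]))
```

Checking for str before the general iterable case matters, because a string is itself iterable and would otherwise be consumed character by character.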
-
move(iterator, irods_path, clobber=True, interactive=False, verbose=False)
Move or rename iRODS data objects and/or collections, similar to the UNIX ‘mv’ command.
Raises a CollectionDoesNotExist exception if the iterator matches more than one item and the irods_path destination is not an existing collection.
Examples:
>>> session.bulk.move('tmpfiles*', '~/tmpdir/', verbose=True)
>>> session.bulk.move('./parent/dirname', './parent/dirname_new')
Arguments:
- iterator: iterator or str
Defines which items are subject to the bulk operation. Can be an iterator (e.g. using search_manager.find()) or a string (which will be used to construct a search_manager.iglob() iterator). Matching data objects and collections will be moved to the new path.
- irods_path: str
The (absolute or relative) path on the iRODS file system where the data objects and collections will be moved to.
- clobber: bool (default: True)
Whether to overwrite existing data objects.
- interactive: bool (default: False)
Whether to prompt for permission before overwriting existing data objects. If True, the value of the ‘clobber’ argument is ignored.
- verbose: bool (default: False)
Whether to print more output.
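The interplay between clobber and interactive described above (interactive takes precedence, and clobber is then ignored) can be summarized in a small decision function. This is only a sketch under those documented semantics: should_overwrite and its ask parameter are hypothetical and do not appear in vsc_irods.

```python
def should_overwrite(exists, clobber=True, interactive=False, ask=input):
    """Decide whether an existing destination may be overwritten.

    Hypothetical helper mirroring the documented behaviour: when
    `interactive` is True the user decides and `clobber` is ignored.
    `ask` is injectable so the prompt can be replaced in scripts or tests.
    """
    if not exists:
        # Nothing is in the way, so there is nothing to overwrite.
        return True
    if interactive:
        answer = ask("Overwrite existing item? [y/N] ")
        return answer.strip().lower() in ("y", "yes")
    return clobber


# Non-interactive: the clobber flag decides.
kept = should_overwrite(True, clobber=False)
# Interactive: the (simulated) answer decides, whatever clobber says.
confirmed = should_overwrite(True, clobber=False, interactive=True,
                             ask=lambda prompt: "y")
declined = should_overwrite(True, clobber=True, interactive=True,
                            ask=lambda prompt: "n")
```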
-
get(iterator, local_path='.', recurse=False, clobber=True, interactive=False, return_data_objects=False, verbose=False, **options)
Copy iRODS data objects and/or collections to the local machine.
Examples:
>>> session.bulk.get('tmpdir*', recurse=True)
>>> session.bulk.get('~/irods_db/*.xyz', local_path='./local_db')
Arguments:
- iterator: iterator or str
Defines which items are subject to the bulk operation. Can be an iterator (e.g. using search_manager.find()) or a string (which will be used to construct a search_manager.iglob() iterator). Matching data objects (and, if used recursively, collections) will be copied to the local machine.
- local_path: str (default: ‘.’)
The (absolute or relative) path on the local file system where the data objects and collections will be copied to.
- recurse: bool (default: False)
Whether to use recursion, meaning that matching collections and their data objects and subcollections will also be copied.
- clobber: bool (default: True)
Whether to overwrite existing local files.
- interactive: bool (default: False)
Whether to prompt for permission before overwriting existing local files. If True, the value of the ‘clobber’ argument is ignored.
- return_data_objects: bool (default: False)
Whether to return a list of iRODSDataObject instances of the matching data objects, instead of downloading them to the local file system. If True, the ‘clobber’ and ‘interactive’ arguments are ignored.
- verbose: bool (default: False)
Whether to print more output.
- options: (any remaining keyword arguments)
Additional options to be passed on to PRC’s data_objects.get() method.
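With return_data_objects=True, get() can be used to inspect matches without downloading anything. The sketch below wraps that call in a hypothetical helper, list_matches, and runs it against a stub session so that no iRODS server is needed. It assumes the returned objects expose the path and size attributes of PRC's iRODSDataObject; the paths and sizes shown are made up.

```python
from types import SimpleNamespace


def list_matches(session, pattern, recurse=False):
    """Return (path, size) pairs for matching data objects, without download.

    Hypothetical helper; `session` is expected to offer session.bulk.get()
    with the signature documented above.
    """
    objects = session.bulk.get(pattern, recurse=recurse,
                               return_data_objects=True)
    return [(obj.path, obj.size) for obj in objects]


def _stub_get(iterator, local_path=".", recurse=False,
              return_data_objects=False, **options):
    """Stand-in for BulkManager.get() returning two made-up data objects."""
    return [
        SimpleNamespace(path="/zone/home/user/irods_db/a.xyz", size=120),
        SimpleNamespace(path="/zone/home/user/irods_db/b.xyz", size=80),
    ]


# Stub session for illustration; replace with a real session in practice.
stub_session = SimpleNamespace(bulk=SimpleNamespace(get=_stub_get))
listing = list_matches(stub_session, "~/irods_db/*.xyz")
```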
-
put(iterator, irods_path='.', recurse=False, clobber=True, interactive=False, verbose=False, create_options={}, **options)
Copy local files and/or folders to the iRODS server, in a manner that resembles the UNIX ‘cp’ command.
Examples:
>>> session.bulk.put('tmpdir*', recurse=True)
>>> session.bulk.put('~/local_db/*.xyz', irods_path='./irods_db/')
Arguments:
- iterator: iterator or str
Defines which items are subject to the bulk operation. Can be an iterator (e.g. using search_manager.find()) or a string (which will be used to construct a search_manager.iglob() iterator). Matching files on the local machine (and, if used recursively, directories) will be copied to the iRODS server.
- irods_path: str (default: ‘.’)
The (absolute or relative) path on the iRODS file system where the local files and folders will be copied to.
- recurse: bool (default: False)
Whether to use recursion, meaning that matching folders and their files and subfolders will also be copied to the iRODS server.
- clobber: bool (default: True)
Whether to overwrite existing data objects.
- interactive: bool (default: False)
Whether to prompt for permission before overwriting existing data objects. If True, the value of the ‘clobber’ argument is ignored.
- verbose: bool (default: False)
Whether to print more output.
- create_options: dict (default: {})
Additional options to be passed on to PRC’s collections.create() method.
- options: (any remaining keyword arguments)
Additional options to be passed on to PRC’s data_objects.put() method.
-
metadata(iterator, action='add', recurse=False, collection_avu=[], object_avu=[], verbose=False)
Add metadata to, or remove metadata from, iRODS data objects and/or collections.
Examples:
>>> session.bulk.metadata('tmpdir*', action='add', recurse=True, object_avu=('is_temporary_file',), collection_avu=('is_temporary_dir',))
Arguments:
- iterator: iterator or str
Defines which items are subject to the bulk operation. Can be an iterator (e.g. using search_manager.find()) or a string (which will be used to construct a search_manager.iglob() iterator). Metadata will be modified for matching data objects and, if used recursively, collections.
- action: str
The action to perform. Choose either ‘add’ or ‘remove’.
- recurse: bool (default: False)
Whether to use recursion, meaning that metadata will be modified for matching collections and their data objects and subcollections.
- collection_avu: tuple or list of tuples (default: [])
One or several attribute-value[-unit] tuples to be modified for collections.
- object_avu: tuple or list of tuples (default: [])
One or several attribute-value[-unit] tuples to be modified for data objects.
- verbose: bool (default: False)
Whether to print more output.
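Because collection_avu and object_avu accept either a single tuple or a list of tuples, it can help to normalize the input before looping over it. The following sketch shows one way to do so. The helper normalize_avus is hypothetical and the attribute names are made up; the sketch does not claim to reproduce how BulkManager handles these arguments internally.

```python
def normalize_avus(avu):
    """Return a list of AVU tuples from a single tuple or a list of tuples.

    Hypothetical helper, not part of vsc_irods.
    """
    if isinstance(avu, tuple):
        # A bare tuple is one attribute-value[-unit] entry.
        return [avu]
    # Otherwise assume an iterable of entries and make each one a tuple.
    return [tuple(item) for item in avu]


single = normalize_avus(("kind", "temporary"))
several = normalize_avus([("kind", "temporary"), ("length", "12", "cm")])
empty = normalize_avus([])
```

After normalization, the same loop can handle both calling styles, including the empty default.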
-
add_job_metadata(iterator, recurse=False, verbose=False)
Add job-related metadata to selected data objects and collections.
Examples:
>>> session.bulk.add_job_metadata('~/data/out*.txt')
Arguments:
- iterator: iterator or str
Defines which items are subject to the bulk operation. Can be an iterator (e.g. using search_manager.find()) or a string (which will be used to construct a search_manager.iglob() iterator). Job metadata will be added for matching data objects and, if used recursively, collections.
- recurse: bool (default: False)
Whether to use recursion, meaning that job metadata will be added to matching collections and their data objects and subcollections.
- verbose: bool (default: False)
Whether to print more output.
-
size(iterator, recurse=False, verbose=False)
Yields (path, size-in-bytes) tuples for the selected data objects and collections.
Examples:
>>> session.bulk.size('~/data/out*.txt')
>>> session.bulk.size('./data', recurse=True)
Arguments:
- iterator: iterator or str
Defines which items are subject to the bulk operation. Can be an iterator (e.g. using search_manager.find()) or a string (which will be used to construct a search_manager.iglob() iterator). Data sizes will be returned for matching data objects and, if used recursively, collections.
- recurse: bool (default: False)
Whether to use recursion, meaning that the data size of matching collections will be calculated as the sum of their data objects and subcollection sizes.
- verbose: bool (default: False)
Whether to print more output.
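Since size() yields (path, size-in-bytes) tuples, the result has to be consumed to be useful. The sketch below totals such tuples and formats the result. The helpers total_size and human_readable are hypothetical, and the sample tuples are made up so that the example runs without an iRODS connection.

```python
def total_size(sizes):
    """Sum the byte counts of an iterable of (path, size-in-bytes) tuples."""
    return sum(size for _path, size in sizes)


def human_readable(num_bytes):
    """Format a byte count using binary units (1 KiB = 1024 B)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit that fits, or at the largest unit.
        if value < 1024 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024


# Made-up tuples standing in for the output of session.bulk.size(...).
sample = iter([("/zone/home/user/data/out1.txt", 1024),
               ("/zone/home/user/data/out2.txt", 2048)])
total = total_size(sample)
label = human_readable(total)
```

When recurse=True is used, a collection's size already includes its contents, so summing is only meaningful if the yielded items do not overlap.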
-