BIOBLEND(1) | BioBlend | BIOBLEND(1) |
bioblend - BioBlend Documentation
BioBlend is a Python library for interacting with the Galaxy API.
BioBlend is supported and tested on:
BioBlend's goal is to make it easier to script and automate the running of Galaxy analyses and administering of a Galaxy server. In practice, it makes it possible to do things like this:
from bioblend.galaxy import GalaxyInstance
gi = GalaxyInstance('<Galaxy IP>', key='your API key')
libs = gi.libraries.get_libraries()
gi.workflows.show_workflow('workflow ID')
wf_invocation = gi.workflows.invoke_workflow('workflow ID', inputs)

from bioblend.galaxy.objects import GalaxyInstance
gi = GalaxyInstance("URL", "API_KEY")
wf = gi.workflows.list()[0]
hist = gi.histories.list()[0]
inputs = hist.get_datasets()[:2]
input_map = dict(zip(wf.input_labels, inputs))
params = {"Paste1": {"delimiter": "U"}}
wf_invocation = wf.invoke(input_map, params=params)
The library was originally called just Blend, but we renamed it to reflect more of its domain and to make it a bit more unique so that it is easier to find. The name was intended to be short and easily pronounceable. In its original implementation, the goal was to provide a lot more support for CloudMan and other integration capabilities, allowing them to be blended together via code. BioBlend fitted the bill.
Stable releases of BioBlend are best installed via pip from PyPI:
$ python3 -m pip install bioblend
Alternatively, the most current source code from our Git repository can be installed with:
$ python3 -m pip install git+https://github.com/galaxyproject/bioblend
After installing the library, you will be able to simply import it into your Python environment with import bioblend. For details on the available functionality, see the API documentation.
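To confirm that the installation described above succeeded before writing any scripts, the installed version can be queried through the Python standard library. This is a minimal sketch; the helper name installed_version is illustrative and not part of BioBlend.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "bioblend") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(f"bioblend {version}" if version else "bioblend is not installed")
```

A None result suggests that pip installed the package into a different Python environment than the one running the script.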
BioBlend requires a number of Python libraries. These libraries are installed automatically when BioBlend itself is installed, regardless of whether it is installed via PyPI or by running the python3 setup.py install command. The current list of required libraries is always available from setup.py in the source code repository.
If you also want to run tests locally, some extra libraries are required. To install them, run:
$ python3 setup.py test
To get started using BioBlend, install the library as described above. Once the library becomes available on the given system, it can be developed against. The developed scripts do not need to reside in any particular location on the system.
It is probably best to take a look at the example scripts in the docs/examples source directory and browse the API documentation. Beyond that, it's up to your creativity :).
Anyone interested in contributing or tweaking the library is more than welcome to do so. To start, simply fork the Git repository on GitHub and start playing with it. Then, issue pull requests.
BioBlend's API focuses around and matches the services it wraps. Thus, there are two top-level sets of APIs, each corresponding to a separate service and a corresponding step in the automation process. Note that each of the service APIs can be used completely independently of one another.
Effort has been made to keep the structure and naming of those APIs consistent across the library, but because they bridge different services, some discrepancies may exist. Feel free to point those out and/or provide fixes.
For Galaxy, an alternative object-oriented API is also available. This API provides an explicit modeling of server-side Galaxy instances and their relationships, providing higher-level methods to perform operations such as retrieving all datasets for a given history, etc. Note that, at the moment, the oo API is still incomplete, providing access to a more restricted set of Galaxy modules with respect to the standard one.
API used to manipulate genomic analyses within Galaxy, including data management and workflow execution.
----
Contains possible interaction dealing with Galaxy configuration.
All clients must define the following field, which will be used as part of the URL composition (e.g., http://<galaxy_instance>/api/libraries): self.module = 'workflows' | 'libraries' | 'histories' | ...
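The URL composition mentioned above can be pictured as joining the server address, the fixed api path segment and the client's module name. The sketch below only illustrates that idea; module_url is a hypothetical helper, the host name is a placeholder, and this is not BioBlend's internal code.

```python
def module_url(base_url: str, module: str) -> str:
    """Join a Galaxy base URL and a client module name into an API URL."""
    # Strip a trailing slash so the result never contains '//api'.
    return f"{base_url.rstrip('/')}/api/{module}"


print(module_url("http://galaxy.example.org/", "libraries"))
# prints: http://galaxy.example.org/api/libraries
```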
{'allow_library_path_paste': False, 'allow_user_creation': True, 'allow_user_dataset_purge': True, 'allow_user_deletion': False, 'enable_unique_workflow_defaults': False, 'ftp_upload_dir': '/SOMEWHERE/galaxy/ftp_dir', 'ftp_upload_site': 'galaxy.com', 'library_import_dir': 'None', 'logo_url': None, 'support_url': 'https://galaxyproject.org/support', 'terms_url': None, 'user_library_import_dir': None, 'wiki_url': 'https://galaxyproject.org/'}
{'extra': {}, 'version_major': '17.01'}
{'active': True, 'deleted': False, 'email': 'user@example.org', 'id': '4aaaaa85aacc9caa', 'last_password_change': '2021-07-29T05:34:54.632345', 'model_class': 'User', 'username': 'julia'}
----
Contains possible interactions with the Galaxy Datasets
Since the number of datasets may be very large, limit and offset parameters are required to specify the desired range.
If the user is an admin, this will return datasets for all the users, otherwise only for the current user.
----
----
Contains possible interactions with the Galaxy Datatype
['snpmatrix', 'snptest', 'tabular', 'taxonomy', 'twobit', 'txt', 'vcf', 'wig', 'xgmml', 'xml']
['galaxy.datatypes.tabular:Vcf', 'galaxy.datatypes.binary:TwoBit', 'galaxy.datatypes.binary:Bam', 'galaxy.datatypes.binary:Sff', 'galaxy.datatypes.xml:Phyloxml', 'galaxy.datatypes.xml:GenericXml', 'galaxy.datatypes.sequence:Maf', 'galaxy.datatypes.sequence:Lav', 'galaxy.datatypes.sequence:csFasta']
----
Contains possible interactions with the Galaxy library folders
----
Contains possible interactions with the Galaxy Forms
[{'id': 'f2db41e1fa331b3e', 'model_class': 'FormDefinition', 'name': 'First form', 'url': '/api/forms/f2db41e1fa331b3e'}, {'id': 'ebfb8f50c6abde6d', 'model_class': 'FormDefinition', 'name': 'second form', 'url': '/api/forms/ebfb8f50c6abde6d'}]
{'desc': 'here it is ', 'fields': [], 'form_definition_current_id': 'f2db41e1fa331b3e', 'id': 'f2db41e1fa331b3e', 'layout': [], 'model_class': 'FormDefinition', 'name': 'First form', 'url': '/api/forms/f2db41e1fa331b3e'}
----
Contains possible interactions with the Galaxy FTP Files
----
Contains possible interactions with the Galaxy Histories
Contains possible interactions with the Galaxy Groups
[{'id': '7c9636938c3e83bf', 'model_class': 'Group', 'name': 'My Group Name', 'url': '/api/groups/7c9636938c3e83bf'}]
[{'id': '33abac023ff186c2', 'model_class': 'Group', 'name': 'Listeria', 'url': '/api/groups/33abac023ff186c2'}, {'id': '73187219cd372cf8', 'model_class': 'Group', 'name': 'LPN', 'url': '/api/groups/73187219cd372cf8'}]
{'id': '33abac023ff186c2', 'model_class': 'Group', 'name': 'Listeria', 'roles_url': '/api/groups/33abac023ff186c2/roles', 'url': '/api/groups/33abac023ff186c2', 'users_url': '/api/groups/33abac023ff186c2/users'}
----
Contains possible interactions with the Galaxy Histories
A description of the dataset collection. For example:
{'collection_type': 'list', 'element_identifiers': [{'id': 'f792763bee8d277a', 'name': 'element 1', 'src': 'hda'}, {'id': 'f792763bee8d277a', 'name': 'element 2', 'src': 'hda'}], 'name': 'My collection list'}
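A description with the shape shown above can be assembled from pairs of element names and dataset IDs. The helper below is a hypothetical convenience, not a BioBlend function, and it assumes every element is a history dataset ('src' set to 'hda').

```python
from typing import Dict, List, Tuple


def list_collection_description(name: str, elements: List[Tuple[str, str]]) -> Dict:
    """Build a 'list' collection description from (element name, dataset ID) pairs."""
    return {
        "collection_type": "list",
        "element_identifiers": [
            {"id": dataset_id, "name": element_name, "src": "hda"}
            for element_name, dataset_id in elements
        ],
        "name": name,
    }


description = list_collection_description(
    "My collection list",
    [("element 1", "f792763bee8d277a"), ("element 2", "f792763bee8d277a")],
)
print(description)
```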
{'id': 'f792763bee8d277a', 'model_class': 'HistoryTagAssociation', 'user_tname': 'NGS_PE_RUN', 'user_value': None}
Changed in version 0.17.0: Using the deprecated history_id parameter now raises a ValueError exception.
{'id': '6fbd9b2274c62ebe', 'job_id': '5471ba76f274f929', 'parameters': {'chromInfo': '"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/mm9.len"', 'dbkey': '"mm9"', 'experiment_name': '"H3K4me3_TAC_MACS2"', 'input_chipseq_file1': {'id': '6f0a311a444290f2', 'uuid': 'null'}, 'input_control_file1': {'id': 'c21816a91f5dc24e', 'uuid': '16f8ee5e-228f-41e2-921e-a07866edce06'}, 'major_command': '{"gsize": "2716965481.0", "bdg": "False", "__current_case__": 0, "advanced_options": {"advanced_options_selector": "off", "__current_case__": 1}, "input_chipseq_file1": 104715, "xls_to_interval": "False", "major_command_selector": "callpeak", "input_control_file1": 104721, "pq_options": {"pq_options_selector": "qvalue", "qvalue": "0.05", "__current_case__": 1}, "bw": "300", "nomodel_type": {"nomodel_type_selector": "create_model", "__current_case__": 1}}'}, 'stderr': '', 'stdout': '', 'tool_id': 'toolshed.g2.bx.psu.edu/repos/ziru-zhou/macs2/modencode_peakcalling_macs2/2.0.10.2', 'uuid': '5c0c43f5-8d93-44bd-939d-305e82f213c6'}
Changed in version 0.8.0: Changed the return value from the status code (type int) to a dict.
Changed in version 0.8.0: Changed the return value from the status code (type int) to a dict.
Changed in version 0.8.0: Changed the return value from the status code (type int) to a dict.
----
Contains possible interactions with the Galaxy workflow invocations
{'markdown': '\n# Workflow Execution Summary of Example workflow\n\n ## Workflow Inputs\n\n\n## Workflow Outputs\n\n\n ## Workflow\n```galaxy\n workflow_display(workflow_id=f2db41e1fa331b3e)\n```\n', 'render_format': 'markdown', 'workflows': {'f2db41e1fa331b3e': {'name': 'Example workflow'}}}
[{'id': 'e85a3be143d5905b', 'model': 'Job', 'populated_state': 'ok', 'states': {'ok': 1}}, {'id': 'c9468fdb6dc5c5f1', 'model': 'Job', 'populated_state': 'ok', 'states': {'running': 1}}, {'id': '2a56795cad3c7db3', 'model': 'Job', 'populated_state': 'ok', 'states': {'new': 1}}]
{'states': {'paused': 4, 'error': 2, 'ok': 2}, 'model': 'WorkflowInvocation', 'id': 'a799d38679e985db', 'populated_state': 'ok'}
[{'history_id': '2f94e8ae9edff68a', 'id': 'df7a1f0c02a5b08e', 'model_class': 'WorkflowInvocation', 'state': 'new', 'update_time': '2015-10-31T22:00:22', 'uuid': 'c8aa2b1c-801a-11e5-a9e5-8ca98228593c', 'workflow_id': '03501d7626bd192f'}]
{'history_id': '2f94e8ae9edff68a', 'id': 'df7a1f0c02a5b08e', 'inputs': {'0': {'id': 'a7db2fac67043c7e', 'src': 'hda', 'uuid': '7932ffe0-2340-4952-8857-dbaa50f1f46a'}}, 'model_class': 'WorkflowInvocation', 'state': 'ready', 'steps': [{'action': None, 'id': 'd413a19dec13d11e', 'job_id': None, 'model_class': 'WorkflowInvocationStep', 'order_index': 0, 'state': None, 'update_time': '2015-10-31T22:00:26', 'workflow_step_id': 'cbbbf59e8f08c98c', 'workflow_step_label': None, 'workflow_step_uuid': 'b81250fd-3278-4e6a-b269-56a1f01ef485'}, {'action': None, 'id': '2f94e8ae9edff68a', 'job_id': 'e89067bb68bee7a0', 'model_class': 'WorkflowInvocationStep', 'order_index': 1, 'state': 'new', 'update_time': '2015-10-31T22:00:26', 'workflow_step_id': '964b37715ec9bd22', 'workflow_step_label': None, 'workflow_step_uuid': 'e62440b8-e911-408b-b124-e05435d3125e'}], 'update_time': '2015-10-31T22:00:26', 'uuid': 'c8aa2b1c-801a-11e5-a9e5-8ca98228593c', 'workflow_id': '03501d7626bd192f'}
{'action': None, 'id': '63cd3858d057a6d1', 'job_id': None, 'model_class': 'WorkflowInvocationStep', 'order_index': 2, 'state': None, 'update_time': '2015-10-31T22:11:14', 'workflow_step_id': '52e496b945151ee8', 'workflow_step_label': None, 'workflow_step_uuid': '4060554c-1dd5-4287-9040-8b4f281cf9dc'}
----
Contains possible interactions with the Galaxy Jobs
If the user is an admin, this will return jobs for all the users, otherwise only for the current user.
[{'create_time': '2014-03-01T16:16:48.640550', 'exit_code': 0, 'id': 'ebfb8f50c6abde6d', 'model_class': 'Job', 'state': 'ok', 'tool_id': 'fasta2tab', 'update_time': '2014-03-01T16:16:50.657399'}, {'create_time': '2014-03-01T16:05:34.851246', 'exit_code': 0, 'id': '1cd8e2f6b131e891', 'model_class': 'Job', 'state': 'ok', 'tool_id': 'upload1', 'update_time': '2014-03-01T16:05:39.558458'}]
New in version 0.5.3.
This method is designed to scan the list of previously run jobs and find records of jobs with identical input parameters and datasets. This can be used to minimize the amount of repeated work by simply recycling the old results.
Changed in version 0.16.0: Replaced the job_info parameter with separate tool_id, inputs and state.
{'create_time': '2014-03-01T16:17:29.828624', 'exit_code': 0, 'id': 'a799d38679e985db', 'inputs': {'input': {'id': 'ebfb8f50c6abde6d', 'src': 'hda'}}, 'model_class': 'Job', 'outputs': {'output': {'id': 'a799d38679e985db', 'src': 'hda'}}, 'params': {'chromInfo': '"/opt/galaxy-central/tool-data/shared/ucsc/chrom/?.len"', 'dbkey': '"?"', 'seq_col': '"2"', 'title_col': '["1"]'}, 'state': 'ok', 'tool_id': 'tab2fasta', 'update_time': '2014-03-01T16:17:31.930728'}
----
Contains possible interactions with the Galaxy Data Libraries
{'id': 'f740ab636b360a70', 'name': 'Library from bioblend', 'url': '/api/libraries/f740ab636b360a70'}
{'deleted': True, 'id': '60e680a037f41974'}
Changed in version 1.1.1: Using the deprecated folder_id parameter now raises a ValueError exception.
Changed in version 1.1.1: Using the deprecated library_id parameter now raises a ValueError exception.
Indicate whether to generate dataset tags from filenames.
Changed in version 0.14.0: Changed the default from True to False.
Indicate whether to generate dataset tags from filenames.
Changed in version 0.14.0: Changed the default from True to False.
----
Contains possible interactions with the Galaxy Quota
{'url': '/galaxy/api/quotas/386f14984287a0f7', 'model_class': 'Quota', 'message': "Quota 'Testing' has been created with 1 associated users and 0 associated groups.", 'id': '386f14984287a0f7', 'name': 'Testing'}
Before a quota can be deleted, the quota must not be a default quota.
"Deleted 1 quotas: Testing-B"
[{'id': '0604c8a56abe9a50', 'model_class': 'Quota', 'name': 'test ', 'url': '/api/quotas/0604c8a56abe9a50'}, {'id': '1ee267091d0190af', 'model_class': 'Quota', 'name': 'workshop', 'url': '/api/quotas/1ee267091d0190af'}]
{'bytes': 107374182400, 'default': [], 'description': 'just testing', 'display_amount': '100.0 GB', 'groups': [], 'id': '0604c8a56abe9a50', 'model_class': 'Quota', 'name': 'test ', 'operation': '=', 'users': []}
"Undeleted 1 quotas: Testing-B"
"Quota 'Testing-A' has been renamed to 'Testing-B'; Quota 'Testing-e' is now '-100.0 GB'; Quota 'Testing-B' is now the default for unregistered users"
----
Contains possible interactions with the Galaxy Roles
{'description': 'desc', 'url': '/api/roles/ebfb8f50c6abde6d', 'model_class': 'Role', 'type': 'admin', 'id': 'ebfb8f50c6abde6d', 'name': 'Foo'}
Changed in version 0.15.0: Changed the return value from a 1-element list to a dict.
[{"id": "f2db41e1fa331b3e", "model_class": "Role", "name": "Foo", "url": "/api/roles/f2db41e1fa331b3e"}, {"id": "f597429621d6eb2b", "model_class": "Role", "name": "Bar", "url": "/api/roles/f597429621d6eb2b"}]
{"description": "Private Role for Foo", "id": "f2db41e1fa331b3e", "model_class": "Role", "name": "Foo", "type": "private", "url": "/api/roles/f2db41e1fa331b3e"}
----
Contains possible interactions with the Galaxy Tool data tables
[{"model_class": "TabularToolDataTable", "name": "fasta_indexes"}, {"model_class": "TabularToolDataTable", "name": "bwa_indexes"}]
{'columns': ['value', 'dbkey', 'name', 'path'], 'fields': [['test id', 'test', 'test name', '/opt/galaxy-dist/tool-data/test/seq/test id.fa']], 'model_class': 'TabularToolDataTable', 'name': 'all_fasta'}
{'columns': ['value', 'dbkey', 'name', 'path'], 'fields': [['test id', 'test', 'test name', '/opt/galaxy-dist/tool-data/test/seq/test id.fa']], 'model_class': 'TabularToolDataTable', 'name': 'all_fasta'}
----
Contains interactions dealing with Galaxy dependency resolvers.
[{'requirements': [{'name': 'galaxy_sequence_utils', 'specs': [], 'type': 'package', 'version': '1.1.4'}, {'name': 'bx-python', 'specs': [], 'type': 'package', 'version': '0.8.6'}], 'status': [{'cacheable': False, 'dependency_type': None, 'exact': True, 'model_class': 'NullDependency', 'name': 'galaxy_sequence_utils', 'version': '1.1.4'}, {'cacheable': False, 'dependency_type': None, 'exact': True, 'model_class': 'NullDependency', 'name': 'bx-python', 'version': '0.8.6'}], 'tool_ids': ['vcf_to_maf_customtrack1']}]
----
Contains possible interactions with the Galaxy Workflows
For more advanced filtering use InvocationClient.get_invocations().
[{'history_id': '2f94e8ae9edff68a', 'id': 'df7a1f0c02a5b08e', 'model_class': 'WorkflowInvocation', 'state': 'new', 'update_time': '2015-10-31T22:00:22', 'uuid': 'c8aa2b1c-801a-11e5-a9e5-8ca98228593c', 'workflow_id': '03501d7626bd192f'}]
[{'id': '92c56938c2f9b315', 'name': 'Simple', 'url': '/api/workflows/92c56938c2f9b315'}]
Changed in version 1.1.1: Using the deprecated workflow_id parameter now raises a ValueError exception.
{'id': 'ee0e2b4b696d9092', 'model_class': 'StoredWorkflow', 'name': 'Super workflow that solves everything!', 'published': False, 'tags': [], 'url': '/api/workflows/ee0e2b4b696d9092'}
{'name': 'Training: 16S rRNA sequencing with mothur: main tutorial', 'tags': [], 'deleted': False, 'latest_workflow_uuid': '368c6165-ccbe-4945-8a3c-d27982206d66', 'url': '/api/workflows/94bac0a90086bdcf', 'number_of_steps': 44, 'published': False, 'owner': 'jane-doe', 'model_class': 'StoredWorkflow', 'id': '94bac0a90086bdcf'}
{'name': 'Training: 16S rRNA sequencing with mothur: main tutorial', 'tags': [], 'deleted': False, 'latest_workflow_uuid': '368c6165-ccbe-4945-8a3c-d27982206d66', 'url': '/api/workflows/94bac0a90086bdcf', 'number_of_steps': 44, 'published': False, 'owner': 'jane-doe', 'model_class': 'StoredWorkflow', 'id': '94bac0a90086bdcf'}
A mapping of workflow inputs to datasets and dataset collections. The datasets source can be a LibraryDatasetDatasetAssociation (ldda), LibraryDataset (ld), HistoryDatasetAssociation (hda), or HistoryDatasetCollectionAssociation (hdca).
The map must be in the following format: {'<input_index>': {'id': <encoded dataset ID>, 'src': '[ldda, ld, hda, hdca]'}} (e.g. {'2': {'id': '29beef4fadeed09f', 'src': 'hda'}})
This map may also be indexed by the UUIDs of the workflow steps, as indicated by the uuid property of steps returned from the Galaxy API. Alternatively workflow steps may be addressed by the label that can be set in the workflow editor. If using uuid or label you need to also set the inputs_by parameter to step_uuid or name.
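The index-keyed form of the inputs map described above can be built from a plain mapping of input index to encoded dataset ID. The helper name build_inputs is hypothetical; the set of accepted src values is taken from the list given above.

```python
from typing import Dict

VALID_SOURCES = {"ldda", "ld", "hda", "hdca"}


def build_inputs(datasets_by_index: Dict[int, str], src: str = "hda") -> Dict[str, Dict[str, str]]:
    """Map each workflow input index to an {'id': ..., 'src': ...} entry."""
    if src not in VALID_SOURCES:
        raise ValueError(f"src must be one of {sorted(VALID_SOURCES)}, got {src!r}")
    # Keys are converted to strings, matching the '<input_index>' format.
    return {
        str(index): {"id": dataset_id, "src": src}
        for index, dataset_id in datasets_by_index.items()
    }


inputs = build_inputs({2: "29beef4fadeed09f"})
print(inputs)
# prints: {'2': {'id': '29beef4fadeed09f', 'src': 'hda'}}
```

When the map is keyed by step UUID or label instead, the inputs_by parameter must be set accordingly, as noted above.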
{'history_id': '2f94e8ae9edff68a', 'id': 'df7a1f0c02a5b08e', 'inputs': {'0': {'id': 'a7db2fac67043c7e', 'src': 'hda', 'uuid': '7932ffe0-2340-4952-8857-dbaa50f1f46a'}}, 'model_class': 'WorkflowInvocation', 'state': 'ready', 'steps': [{'action': None, 'id': 'd413a19dec13d11e', 'job_id': None, 'model_class': 'WorkflowInvocationStep', 'order_index': 0, 'state': None, 'update_time': '2015-10-31T22:00:26', 'workflow_step_id': 'cbbbf59e8f08c98c', 'workflow_step_label': None, 'workflow_step_uuid': 'b81250fd-3278-4e6a-b269-56a1f01ef485'}, {'action': None, 'id': '2f94e8ae9edff68a', 'job_id': 'e89067bb68bee7a0', 'model_class': 'WorkflowInvocationStep', 'order_index': 1, 'state': 'new', 'update_time': '2015-10-31T22:00:26', 'workflow_step_id': '964b37715ec9bd22', 'workflow_step_label': None, 'workflow_step_uuid': 'e62440b8-e911-408b-b124-e05435d3125e'}], 'update_time': '2015-10-31T22:00:26', 'uuid': 'c8aa2b1c-801a-11e5-a9e5-8ca98228593c', 'workflow_id': '03501d7626bd192f'}
The params dict should be specified as follows:
{STEP_ID: PARAM_DICT, ...}
where PARAM_DICT is:
{PARAM_NAME: VALUE, ...}
For backwards compatibility, the following (deprecated) format is also supported for params:
{TOOL_ID: PARAM_DICT, ...}
in which case PARAM_DICT affects all steps with the given tool id. If both by-tool-id and by-step-id specifications are used, the latter takes precedence.
Finally (again, for backwards compatibility), PARAM_DICT can also be specified as:
{'param': PARAM_NAME, 'value': VALUE}
Note that this format allows only one parameter to be set per step.
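To make the relationship between the PARAM_DICT shapes above concrete, the sketch below rewrites the single-parameter legacy form into the plain name-to-value form and places it under a step ID. The function name, the step ID "3" and the parameter values are illustrative assumptions; this is not a description of how BioBlend or Galaxy processes the dict internally.

```python
from typing import Any, Dict


def normalize_param_dict(param_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Rewrite {'param': NAME, 'value': VALUE} as {NAME: VALUE}; copy other dicts."""
    if set(param_dict) == {"param", "value"}:
        return {param_dict["param"]: param_dict["value"]}
    return dict(param_dict)


legacy = {"param": "delimiter", "value": "U"}

# Preferred layout: {STEP_ID: PARAM_DICT}, with a hypothetical step ID of "3".
params = {"3": normalize_param_dict(legacy)}
print(params)
# prints: {'3': {'delimiter': 'U'}}
```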
For a repeat parameter, the names of the contained parameters need to be specified as <repeat name>_<repeat index>|<param name>, with the repeat index starting at 0. For example, if the tool XML contains:
<repeat name="cutoff" title="Parameters used to filter cells" min="1">
    <param name="name" type="text" value="n_genes" label="Name of param...">
        <option value="n_genes">n_genes</option>
        <option value="n_counts">n_counts</option>
    </param>
    <param name="min" type="float" min="0" value="0" label="Min value"/>
</repeat>
then the PARAM_DICT should be something like:
{...
 "cutoff_0|name": "n_genes",
 "cutoff_0|min": "2",
 "cutoff_1|name": "n_counts",
 "cutoff_1|min": "4",
 ...}
At the time of this writing, it is not possible to change the number of times the contained parameters are repeated. Therefore, the parameter indexes can go from 0 to n-1, where n is the number of times the repeated element was added when the workflow was saved in the Galaxy UI.
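The <repeat name>_<repeat index>|<param name> naming rule described above is mechanical, so the flat keys can be generated from a list of per-instance dicts. The helper name repeat_params is hypothetical.

```python
from typing import Any, Dict, List


def repeat_params(repeat_name: str, instances: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Flatten repeat instances into '<repeat name>_<index>|<param name>' keys."""
    flat: Dict[str, Any] = {}
    for index, instance in enumerate(instances):  # repeat index starts at 0
        for param_name, value in instance.items():
            flat[f"{repeat_name}_{index}|{param_name}"] = value
    return flat


cutoffs = repeat_params(
    "cutoff",
    [{"name": "n_genes", "min": "2"}, {"name": "n_counts", "min": "4"}],
)
print(cutoffs)
```

The number of instances passed in should match the number of repeats saved in the workflow, given the limitation stated above.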
The replacement_params dict should map parameter names in post-job actions (PJAs) to their runtime values. For instance, if the final step has a PJA like the following:
{'RenameDatasetActionout_file1': {'action_arguments': {'newname': '${output}'}, 'action_type': 'RenameDatasetAction', 'output_name': 'out_file1'}}
then the following renames the output dataset to 'foo':
replacement_params = {'output': 'foo'}
see also this email thread.
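The effect of replacement_params on the post-job action above can be previewed locally. Python's string.Template uses the same ${name} placeholder syntax, so it serves here as a stand-in; the real substitution happens on the Galaxy server and may differ in detail, and resolve_newname is a hypothetical helper.

```python
from string import Template
from typing import Dict

pja = {
    "RenameDatasetActionout_file1": {
        "action_arguments": {"newname": "${output}"},
        "action_type": "RenameDatasetAction",
        "output_name": "out_file1",
    }
}
replacement_params = {"output": "foo"}


def resolve_newname(action: Dict, replacements: Dict[str, str]) -> str:
    """Fill ${...} placeholders in a rename action's 'newname' argument."""
    template = Template(action["action_arguments"]["newname"])
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return template.safe_substitute(replacements)


print(resolve_newname(pja["RenameDatasetActionout_file1"], replacement_params))
# prints: foo
```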
An example value for the actions argument might be:
actions = [
    {"action_type": "add_input", "type": "data", "label": "foo"},
    {"action_type": "update_step_label", "label": "bar", "step": {"label": "foo"}},
]
{'history_id': '2f94e8ae9edff68a', 'id': 'df7a1f0c02a5b08e', 'inputs': {'0': {'id': 'a7db2fac67043c7e', 'src': 'hda', 'uuid': '7932ffe0-2340-4952-8857-dbaa50f1f46a'}}, 'model_class': 'WorkflowInvocation', 'state': 'ready', 'steps': [{'action': None, 'id': 'd413a19dec13d11e', 'job_id': None, 'model_class': 'WorkflowInvocationStep', 'order_index': 0, 'state': None, 'update_time': '2015-10-31T22:00:26', 'workflow_step_id': 'cbbbf59e8f08c98c', 'workflow_step_label': None, 'workflow_step_uuid': 'b81250fd-3278-4e6a-b269-56a1f01ef485'}, {'action': None, 'id': '2f94e8ae9edff68a', 'job_id': 'e89067bb68bee7a0', 'model_class': 'WorkflowInvocationStep', 'order_index': 1, 'state': 'new', 'update_time': '2015-10-31T22:00:26', 'workflow_step_id': '964b37715ec9bd22', 'workflow_step_label': None, 'workflow_step_uuid': 'e62440b8-e911-408b-b124-e05435d3125e'}], 'update_time': '2015-10-31T22:00:26', 'uuid': 'c8aa2b1c-801a-11e5-a9e5-8ca98228593c', 'workflow_id': '03501d7626bd192f'}
{'action': None, 'id': '63cd3858d057a6d1', 'job_id': None, 'model_class': 'WorkflowInvocationStep', 'order_index': 2, 'state': None, 'update_time': '2015-10-31T22:11:14', 'workflow_step_id': '52e496b945151ee8', 'workflow_step_label': None, 'workflow_step_uuid': '4060554c-1dd5-4287-9040-8b4f281cf9dc'}
{'id': '92c56938c2f9b315', 'inputs': {'23': {'label': 'Input Dataset', 'value': ''}}, 'name': 'Simple', 'url': '/api/workflows/92c56938c2f9b315'}
This page describes some sample use cases for the Galaxy API and provides examples for these API calls. In addition to this page, there are functional examples of complete scripts in the docs/examples directory of the BioBlend source code repository.
To connect to a running Galaxy server, you will need an account on that Galaxy instance and an API key for the account. Instructions on getting an API key can be found at https://galaxyproject.org/develop/api/.
To open a connection call:
from bioblend.galaxy import GalaxyInstance
gi = GalaxyInstance(url='http://example.galaxy.url', key='your-API-key')
We now have a GalaxyInstance object which allows us to interact with the Galaxy server under our account and to access our data. If the account is a Galaxy admin account, we will also be able to use this connection to carry out admin actions.
Methods for accessing histories and datasets are grouped under GalaxyInstance.histories.* and GalaxyInstance.datasets.* respectively.
To get information on the Histories currently in your account, call:
>>> gi.histories.get_histories()
[{'id': 'f3c2b0f3ecac9f02', 'name': 'RNAseq_DGE_BASIC_Prep', 'url': '/api/histories/f3c2b0f3ecac9f02'},
 {'id': '8a91dcf1866a80c2', 'name': 'June demo', 'url': '/api/histories/8a91dcf1866a80c2'}]
This returns a list of dictionaries containing basic metadata, including the id and name of each History. In this case, we have two existing Histories in our account, 'RNAseq_DGE_BASIC_Prep' and 'June demo'. To get more detailed information about a History we can pass its id to the show_history method:
>>> gi.histories.show_history('f3c2b0f3ecac9f02', contents=False)
{'annotation': '',
 'contents_url': '/api/histories/f3c2b0f3ecac9f02/contents',
 'id': 'f3c2b0f3ecac9f02',
 'name': 'RNAseq_DGE_BASIC_Prep',
 'nice_size': '93.5 MB',
 'state': 'ok',
 'state_details': {'discarded': 0, 'empty': 0, 'error': 0, 'failed_metadata': 0, 'new': 0, 'ok': 7, 'paused': 0, 'queued': 0, 'running': 0, 'setting_metadata': 0, 'upload': 0},
 'state_ids': {'discarded': [], 'empty': [], 'error': [], 'failed_metadata': [], 'new': [], 'ok': ['d6842fb08a76e351', '10a4b652da44e82a', '81c601a2549966a0', 'a154f05e3bcee26b', '1352fe19ddce0400', '06d549c52d753e53', '9ec54455d6279cc7'], 'paused': [], 'queued': [], 'running': [], 'setting_metadata': [], 'upload': []}}
This gives us a dictionary containing the History's metadata. With contents=False (the default), we only get a list of ids of the datasets contained within the History; with contents=True we would get metadata on each dataset. We can also directly access more detailed information on a particular dataset by passing its id to the show_dataset method:
>>> gi.datasets.show_dataset('10a4b652da44e82a')
{'data_type': 'fastqsanger', 'deleted': False, 'file_size': 16527060, 'genome_build': 'dm3', 'id': 17499, 'metadata_data_lines': None, 'metadata_dbkey': 'dm3', 'metadata_sequences': None, 'misc_blurb': '15.8 MB', 'misc_info': 'Noneuploaded fastqsanger file', 'model_class': 'HistoryDatasetAssociation', 'name': 'C1_R2_1.chr4.fq', 'purged': False, 'state': 'ok', 'visible': True}
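Since get_histories and show_history return plain lists and dictionaries, as in the output above, they can be navigated with ordinary Python. The sketch below works on abbreviated copies of that output so it runs without a server; both helper names are hypothetical.

```python
from typing import Dict, List, Optional

# Abbreviated copies of the results shown above.
histories = [
    {"id": "f3c2b0f3ecac9f02", "name": "RNAseq_DGE_BASIC_Prep"},
    {"id": "8a91dcf1866a80c2", "name": "June demo"},
]
history = {
    "id": "f3c2b0f3ecac9f02",
    "state_ids": {"error": [], "ok": ["d6842fb08a76e351", "10a4b652da44e82a"]},
}


def find_history_id(items: List[Dict], name: str) -> Optional[str]:
    """Return the id of the first history with the given name, or None."""
    for item in items:
        if item["name"] == name:
            return item["id"]
    return None


def dataset_ids_in_state(details: Dict, state: str = "ok") -> List[str]:
    """Return the dataset ids a history reports for one state."""
    return list(details.get("state_ids", {}).get(state, []))


print(find_history_id(histories, "June demo"))
print(dataset_ids_in_state(history))
```

Each id returned by dataset_ids_in_state could then be passed to show_dataset for the full metadata.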
To upload a local file to a Galaxy server, you can run the upload_file method, supplying the path to a local file:
>>> gi.tools.upload_file('test.txt', 'f3c2b0f3ecac9f02')
{'implicit_collections': [], 'jobs': [{'create_time': '2015-07-28T17:52:39.756488', 'exit_code': None, 'id': '9752b387803d3e1e', 'model_class': 'Job', 'state': 'new', 'tool_id': 'upload1', 'update_time': '2015-07-28T17:52:39.987509'}], 'output_collections': [], 'outputs': [{'create_time': '2015-07-28T17:52:39.331176', 'data_type': 'galaxy.datatypes.data.Text', 'deleted': False, 'file_ext': 'auto', 'file_size': 0, 'genome_build': '?', 'hda_ldda': 'hda', 'hid': 16, 'history_content_type': 'dataset', 'history_id': 'f3c2b0f3ecac9f02', 'id': '59c76a119581e190', 'metadata_data_lines': None, 'metadata_dbkey': '?', 'misc_blurb': None, 'misc_info': None, 'model_class': 'HistoryDatasetAssociation', 'name': 'test.txt', 'output_name': 'output0', 'peek': '<table cellspacing="0" cellpadding="3"></table>', 'purged': False, 'state': 'queued', 'tags': [], 'update_time': '2015-07-28T17:52:39.611887', 'uuid': 'ff0ee99b-7542-4125-802d-7a193f388e7e', 'visible': True}]}
If files are greater than 2GB in size, they will need to be uploaded via FTP. Importing files from the user's FTP folder can be done via running the upload tool again:
>>> gi.tools.upload_from_ftp('test.txt', 'f3c2b0f3ecac9f02')
{'implicit_collections': [], 'jobs': [{'create_time': '2015-07-28T17:57:43.704394', 'exit_code': None, 'id': '82b264d8c3d11790', 'model_class': 'Job', 'state': 'new', 'tool_id': 'upload1', 'update_time': '2015-07-28T17:57:43.910958'}], 'output_collections': [], 'outputs': [{'create_time': '2015-07-28T17:57:43.209041', 'data_type': 'galaxy.datatypes.data.Text', 'deleted': False, 'file_ext': 'auto', 'file_size': 0, 'genome_build': '?', 'hda_ldda': 'hda', 'hid': 17, 'history_content_type': 'dataset', 'history_id': 'f3c2b0f3ecac9f02', 'id': 'a676e8f07209a3be', 'metadata_data_lines': None, 'metadata_dbkey': '?', 'misc_blurb': None, 'misc_info': None, 'model_class': 'HistoryDatasetAssociation', 'name': 'test.txt', 'output_name': 'output0', 'peek': '<table cellspacing="0" cellpadding="3"></table>', 'purged': False, 'state': 'queued', 'tags': [], 'update_time': '2015-07-28T17:57:43.544407', 'uuid': '2cbe8f0a-4019-47c4-87e2-005ce35b8449', 'visible': True}]}
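A script that handles files of varying size can pick between the two upload methods shown above by checking the file size first. The sketch below only returns the name of the method to call; upload_method_for is a hypothetical helper, and reading "2GB" as 2 * 1024**3 bytes is an assumption.

```python
import os
import tempfile

TWO_GB = 2 * 1024 ** 3  # assumption: "2GB" interpreted as 2 GiB


def upload_method_for(path: str, threshold: int = TWO_GB) -> str:
    """Return the name of the upload method suited to the size of the file."""
    if os.path.getsize(path) > threshold:
        return "upload_from_ftp"
    return "upload_file"


# Demonstrate with a small temporary file.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"0123456789")
    small_file = handle.name
try:
    print(upload_method_for(small_file))
finally:
    os.remove(small_file)
```

Note that upload_from_ftp imports a file that is already in the user's FTP folder, so a large local file still has to be transferred there by an FTP client first.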
Methods for accessing Data Libraries are grouped under GalaxyInstance.libraries.*. Most Data Library methods are available to all users, but as only administrators can create new Data Libraries within Galaxy, the create_folder and create_library methods can only be called using an API key belonging to an admin account.
We can view the Data Libraries available to our account using:
>>> gi.libraries.get_libraries()
[{'id': '8e6f930d00d123ea',
  'name': 'RNA-seq workshop data',
  'url': '/api/libraries/8e6f930d00d123ea'},
 {'id': 'f740ab636b360a70',
  'name': '1000 genomes',
  'url': '/api/libraries/f740ab636b360a70'}]
This gives a list of metadata dictionaries with basic information on each library. We can get more information on a particular Data Library by passing its id to the show_library method:
>>> gi.libraries.show_library('8e6f930d00d123ea')
{'contents_url': '/api/libraries/8e6f930d00d123ea/contents',
 'description': 'RNA-Seq workshop data',
 'name': 'RNA-Seq',
 'synopsis': 'Data for the RNA-Seq tutorial'}
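Scripts usually know a Data Library by name rather than by id. As an illustration, the lookup can be done on the list returned by get_libraries. The helper find_library_id below is a hypothetical name, not a BioBlend function, and the sample list is copied from the output shown above so that the sketch runs without a Galaxy server.

```python
def find_library_id(libraries, name):
    """Return the id of the first library whose name matches exactly, else None."""
    for lib in libraries:
        if lib["name"] == name:
            return lib["id"]
    return None


# Sample data copied from the get_libraries() output shown above.
libraries = [
    {"id": "8e6f930d00d123ea", "name": "RNA-seq workshop data",
     "url": "/api/libraries/8e6f930d00d123ea"},
    {"id": "f740ab636b360a70", "name": "1000 genomes",
     "url": "/api/libraries/f740ab636b360a70"},
]

library_id = find_library_id(libraries, "1000 genomes")
print(library_id)
# With a live connection the id can then be passed on, for example:
#   gi.libraries.show_library(library_id)
```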
We can get files into Data Libraries in several ways: by uploading from our local machine, by retrieving from a URL, by passing the new file content directly into the method, or by importing a file from the filesystem on the Galaxy server.
For instance, to upload a file from our machine we might call:
>>> gi.libraries.upload_file_from_local_path('8e6f930d00d123ea', '/local/path/to/mydata.fastq', file_type='fastqsanger')
Note that we have provided the id of the destination Data Library, and in this case we have specified the type that Galaxy should assign to the new dataset. The default value for file_type is 'auto', in which case Galaxy will attempt to guess the dataset type.
Methods for accessing workflows are grouped under GalaxyInstance.workflows.*.
To get information on the Workflows currently in your account, use:
>>> gi.workflows.get_workflows()
[{'id': 'e8b85ad72aefca86',
  'name': 'TopHat + cufflinks part 1',
  'url': '/api/workflows/e8b85ad72aefca86'},
 {'id': 'b0631c44aa74526d',
  'name': 'CuffDiff',
  'url': '/api/workflows/b0631c44aa74526d'}]
This returns a list of metadata dictionaries. We can get the details of a particular Workflow, including its steps, by passing its id to the show_workflow method:
>>> gi.workflows.show_workflow('e8b85ad72aefca86')
{'id': 'e8b85ad72aefca86',
 'inputs': {'252': {'label': 'Input RNA-seq fastq', 'value': ''}},
 'name': 'TopHat + cufflinks part 1',
 'steps': {'250': {'id': 250,
                   'input_steps': {'input1': {'source_step': 252,
                                              'step_output': 'output'}},
                   'tool_id': 'tophat',
                   'type': 'tool'},
           '251': {'id': 251,
                   'input_steps': {'input': {'source_step': 250,
                                             'step_output': 'accepted_hits'}},
                   'tool_id': 'cufflinks',
                   'type': 'tool'},
           '252': {'id': 252,
                   'input_steps': {},
                   'tool_id': None,
                   'type': 'data_input'}},
 'url': '/api/workflows/e8b85ad72aefca86'}
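The 'steps' entry of this dictionary encodes the dependencies between steps through 'input_steps' and 'source_step'. As a sketch of how it can be read, the hypothetical helper step_order below (not part of BioBlend) lists the step ids so that every step comes after the steps it takes input from. It assumes the structure shown above and an acyclic workflow; the sample dictionary is a trimmed copy of the output above.

```python
def step_order(workflow):
    """Return step ids ordered so that each step follows its source steps."""
    steps = workflow["steps"]
    ordered, seen = [], set()

    def visit(step_id):
        if step_id in seen:
            return
        seen.add(step_id)
        # Visit every upstream step first (depth-first traversal).
        for link in steps[step_id]["input_steps"].values():
            visit(str(link["source_step"]))
        ordered.append(step_id)

    for step_id in sorted(steps):
        visit(step_id)
    return ordered


# Trimmed copy of the show_workflow() output shown above.
workflow = {
    "steps": {
        "250": {"input_steps": {"input1": {"source_step": 252}}, "tool_id": "tophat"},
        "251": {"input_steps": {"input": {"source_step": 250}}, "tool_id": "cufflinks"},
        "252": {"input_steps": {}, "tool_id": None},
    }
}

order = step_order(workflow)
print(order)
```

For this workflow the data input step 252 comes first, followed by the tophat step 250 and the cufflinks step 251.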
Workflows can be exported from or imported into Galaxy. This makes it possible to archive workflows, or to move them between Galaxy instances.
To export a workflow, we can call:
>>> workflow_dict = gi.workflows.export_workflow_dict('e8b85ad72aefca86')
This gives us a complex dictionary representing the workflow. We can import this dictionary as a new workflow with:
>>> gi.workflows.import_workflow_dict(workflow_dict)
{'id': 'c0bacafdfe211f9a',
 'name': 'TopHat + cufflinks part 1 (imported from API)',
 'url': '/api/workflows/c0bacafdfe211f9a'}
This call returns a dictionary containing basic metadata on the new workflow. Since in this case we have imported the dictionary into the original Galaxy instance, we now have a duplicate of the original workflow in our account:
>>> gi.workflows.get_workflows()
[{'id': 'c0bacafdfe211f9a',
  'name': 'TopHat + cufflinks part 1 (imported from API)',
  'url': '/api/workflows/c0bacafdfe211f9a'},
 {'id': 'e8b85ad72aefca86',
  'name': 'TopHat + cufflinks part 1',
  'url': '/api/workflows/e8b85ad72aefca86'},
 {'id': 'b0631c44aa74526d',
  'name': 'CuffDiff',
  'url': '/api/workflows/b0631c44aa74526d'}]
Instead of using dictionaries directly, workflows can be exported to or imported from files on the local disk using the export_workflow_to_local_path and import_workflow_from_local_path methods. See the API reference for details.
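If you prefer to handle the exported dictionary yourself, for example to keep workflows under version control, it can also be written out with the standard json module. This is a minimal sketch under stated assumptions: save_workflow_dict and load_workflow_dict are hypothetical helpers, and the small dictionary below is only a stand-in for a real export, whose structure is considerably more complex.

```python
import json
import os
import tempfile


def save_workflow_dict(workflow_dict, path):
    """Write a workflow dictionary to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(workflow_dict, handle, indent=2, sort_keys=True)


def load_workflow_dict(path):
    """Read a workflow dictionary back from a JSON file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Stand-in for the result of gi.workflows.export_workflow_dict(...).
workflow_dict = {
    "name": "TopHat + cufflinks part 1",
    "steps": {"0": {"type": "data_input"}},
}

with tempfile.TemporaryDirectory() as tmpdir:
    archive = os.path.join(tmpdir, "workflow.json")
    save_workflow_dict(workflow_dict, archive)
    restored = load_workflow_dict(archive)

print(restored == workflow_dict)
# The restored dictionary could then be passed to
# gi.workflows.import_workflow_dict(restored) on any Galaxy instance.
```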
NOTE:
To invoke a workflow, we need to tell Galaxy which datasets to use for which workflow inputs. We can use datasets from histories or data libraries.
Examine the workflow above. We can see that it takes only one input file. That is:
>>> wf = gi.workflows.show_workflow('e8b85ad72aefca86')
>>> wf['inputs']
{'252': {'label': 'Input RNA-seq fastq', 'value': ''}}
There is one input, labelled 'Input RNA-seq fastq'. This input is passed to the TopHat tool and should be a fastq file. We will use the dataset we examined above, under View Histories and Datasets, which had the name 'C1_R2_1.chr4.fq' and the id '10a4b652da44e82a'.
To specify the inputs, we build a data map and pass this to the invoke_workflow method. This data map is a nested dictionary object which maps inputs to datasets. We call:
>>> datamap = {'252': {'src': 'hda', 'id': '10a4b652da44e82a'}}
>>> gi.workflows.invoke_workflow('e8b85ad72aefca86', inputs=datamap, history_name='New output history')
{'history': '0a7b7992a7cabaec',
 'outputs': ['33be8ad9917d9207',
             'fbee1c2dc793c114',
             '85866441984f9e28',
             '1c51aa78d3742386',
             'a68e8770e52d03b4',
             'c54baf809e3036ac',
             'ba0db8ce6cd1fe8f',
             'c019e4cf08b2ac94']}
In this case the only input id is '252' and the corresponding dataset id is '10a4b652da44e82a'. We have specified the dataset source to be 'hda' (HistoryDatasetAssociation) since the dataset is stored in a History. See the API reference for allowed dataset specifications. We have also requested that a new History be created and used to store the results of the run, by setting history_name='New output history'.
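Writing input ids such as '252' by hand is error-prone, since they differ between workflows. One possible approach is to derive the data map from the input labels reported by show_workflow. The helper build_datamap below is a hypothetical name, not a BioBlend function, and it assumes that input labels are unique within the workflow.

```python
def build_datamap(workflow, datasets_by_label, src="hda"):
    """Map workflow input ids to datasets, looking inputs up by their label."""
    label_to_input = {
        spec["label"]: input_id for input_id, spec in workflow["inputs"].items()
    }
    datamap = {}
    for label, dataset_id in datasets_by_label.items():
        if label not in label_to_input:
            raise KeyError(f"workflow has no input labelled {label!r}")
        datamap[label_to_input[label]] = {"src": src, "id": dataset_id}
    return datamap


# 'inputs' entry copied from the show_workflow() output shown above.
wf = {"inputs": {"252": {"label": "Input RNA-seq fastq", "value": ""}}}

datamap = build_datamap(wf, {"Input RNA-seq fastq": "10a4b652da44e82a"})
print(datamap)
```

The result is the same dictionary as the hand-written datamap above and can be passed to invoke_workflow through the inputs argument.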
The invoke_workflow call submits all the jobs which need to be run to the Galaxy workflow engine, with the appropriate dependencies so that they will run in order. The call returns immediately, so we can continue to submit new jobs while waiting for this workflow to execute. invoke_workflow returns a dictionary describing the workflow invocation.
If we view the output History immediately after calling invoke_workflow, we will see something like:
>>> gi.histories.show_history('0a7b7992a7cabaec')
{'annotation': '',
 'contents_url': '/api/histories/0a7b7992a7cabaec/contents',
 'id': '0a7b7992a7cabaec',
 'name': 'New output history',
 'nice_size': '0 bytes',
 'state': 'queued',
 'state_details': {'discarded': 0,
                   'empty': 0,
                   'error': 0,
                   'failed_metadata': 0,
                   'new': 0,
                   'ok': 0,
                   'paused': 0,
                   'queued': 8,
                   'running': 0,
                   'setting_metadata': 0,
                   'upload': 0},
 'state_ids': {'discarded': [],
               'empty': [],
               'error': [],
               'failed_metadata': [],
               'new': [],
               'ok': [],
               'paused': [],
               'queued': ['33be8ad9917d9207',
                          'fbee1c2dc793c114',
                          '85866441984f9e28',
                          '1c51aa78d3742386',
                          'a68e8770e52d03b4',
                          'c54baf809e3036ac',
                          'ba0db8ce6cd1fe8f',
                          'c019e4cf08b2ac94'],
               'running': [],
               'setting_metadata': [],
               'upload': []}}
In this case, because the submitted jobs have not had time to run, the output History contains 8 datasets in the 'queued' state and has a total size of 0 bytes. If we make this call again later we should instead see completed output files.
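Rather than repeating the call by hand, a script can poll the History until it settles. The following is a sketch only: wait_for_history is a hypothetical helper, and treating 'ok' and 'error' as the only final states is an assumption that may need adjusting for your server. The responses at the end are simulated so that the example runs without Galaxy.

```python
import time

# Assumption: a history is finished once its state is 'ok' or 'error'.
TERMINAL_STATES = {"ok", "error"}


def wait_for_history(fetch_history, interval=10.0, max_polls=360, sleep=time.sleep):
    """Poll fetch_history() until the history reaches a terminal state.

    fetch_history is any zero-argument callable returning a dictionary with
    a 'state' key, e.g. lambda: gi.histories.show_history(history_id).
    """
    for _ in range(max_polls):
        history = fetch_history()
        if history["state"] in TERMINAL_STATES:
            return history
        sleep(interval)
    raise TimeoutError(f"history still not finished after {max_polls} polls")


# Simulated responses standing in for successive show_history() calls.
responses = iter([{"state": "queued"}, {"state": "running"}, {"state": "ok"}])
final = wait_for_history(lambda: next(responses), interval=0)
print(final["state"])
```

Passing the fetch operation as a callable keeps the polling loop independent of BioBlend, which also makes it easy to test.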
Methods for managing users are grouped under GalaxyInstance.users.*. User management is only available to Galaxy administrators, that is, the API key used to connect to Galaxy must be that of an admin account.
To get a list of users, call:
>>> gi.users.get_users()
[{'email': 'userA@example.org',
  'id': '975a9ce09b49502a',
  'quota_percent': None,
  'url': '/api/users/975a9ce09b49502a'},
 {'email': 'userB@example.org',
  'id': '0193a95acf427d2c',
  'quota_percent': None,
  'url': '/api/users/0193a95acf427d2c'}]
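Since other calls identify users by id, it can be convenient to index this list by email address. A minimal sketch, with user_ids_by_email as a hypothetical helper name and the sample list copied from the output above:

```python
def user_ids_by_email(users):
    """Build an email -> id lookup from the list returned by get_users()."""
    return {user["email"]: user["id"] for user in users}


# Sample data copied from the get_users() output shown above.
users = [
    {"email": "userA@example.org", "id": "975a9ce09b49502a",
     "quota_percent": None, "url": "/api/users/975a9ce09b49502a"},
    {"email": "userB@example.org", "id": "0193a95acf427d2c",
     "quota_percent": None, "url": "/api/users/0193a95acf427d2c"},
]

ids = user_ids_by_email(users)
print(ids["userB@example.org"])
```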
BioBlend can be used to make HTTP requests to the Galaxy API in a more convenient way than using e.g. the requests Python library. There are 5 available methods corresponding to the most common HTTP methods: make_get_request, make_post_request, make_put_request, make_delete_request and make_patch_request. One advantage of using these methods is that the API key stored in the GalaxyInstance object is automatically added to the request.
To make a GET request to the Galaxy API with BioBlend, call:
>>> gi.make_get_request(gi.base_url + "/api/version").json()
{'version_major': '19.05', 'extra': {}}
To make a POST request to the Galaxy API with BioBlend, call:
>>> gi.make_post_request(gi.base_url + "/api/histories", payload={"name": "test history"})
{'importable': False,
 'create_time': '2019-07-05T20:10:04.823716',
 'contents_url': '/api/histories/a77b3f95070d689a/contents',
 'id': 'a77b3f95070d689a',
 'size': 0,
 'user_id': '5b732999121d4593',
 'username_and_slug': None,
 'annotation': None,
 'state_details': {'discarded': 0,
                   'ok': 0,
                   'failed_metadata': 0,
                   'upload': 0,
                   'paused': 0,
                   'running': 0,
                   'setting_metadata': 0,
                   'error': 0,
                   'new': 0,
                   'queued': 0,
                   'empty': 0},
 'state': 'new',
 'empty': True,
 'update_time': '2019-07-05T20:10:04.823742',
 'tags': [],
 'deleted': False,
 'genome_build': None,
 'slug': None,
 'name': 'test history',
 'url': '/api/histories/a77b3f95070d689a',
 'state_ids': {'discarded': [],
               'ok': [],
               'failed_metadata': [],
               'upload': [],
               'paused': [],
               'running': [],
               'setting_metadata': [],
               'error': [],
               'new': [],
               'queued': [],
               'empty': []},
 'published': False,
 'model_class': 'History',
 'purged': False}
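Both examples build the endpoint URL by string concatenation. When many raw requests are made, a small helper can keep the slashes consistent. The function api_url below is a hypothetical convenience, not part of BioBlend; it only assumes that API endpoints live under "/api" below the base URL, as in the calls above.

```python
def api_url(base_url, *parts):
    """Join a Galaxy base URL and path components into an API endpoint URL."""
    cleaned = [str(part).strip("/") for part in parts]
    return "/".join([base_url.rstrip("/"), "api", *cleaned])


print(api_url("http://127.0.0.1:8080/", "histories", "a77b3f95070d689a"))
# With a live connection, for example:
#   gi.make_get_request(api_url(gi.base_url, "version")).json()
```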
API used to interact with the Galaxy Toolshed, including repository management.
BioBlend allows library-wide configuration to be set in external files. These configuration files can be used to specify access keys, for example.
This should make it easier to debug unexpected HTTP behaviour, such as a proxy server interfering with the request. See the body attribute for the content of the HTTP response.
This version is intended to be implemented by subclasses and so raises a NotImplementedError.
If you would like to do more than just a mock test, you need to point BioBlend to an instance of Galaxy. Do so by exporting the following two variables:
$ export BIOBLEND_GALAXY_URL=http://127.0.0.1:8080
$ export BIOBLEND_GALAXY_API_KEY=<API key>
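Your own scripts can pick up the same two variables, so that connection details stay out of the code. This is a sketch under assumptions: galaxy_settings is a hypothetical helper, and returning None when either variable is missing is just one possible way to signal that no live instance is configured.

```python
import os


def galaxy_settings(environ=os.environ):
    """Return (url, api_key) from the environment, or None if either is unset."""
    url = environ.get("BIOBLEND_GALAXY_URL")
    key = environ.get("BIOBLEND_GALAXY_API_KEY")
    if not url or not key:
        return None
    return url, key


settings = galaxy_settings()
if settings is None:
    print("No Galaxy instance configured; skipping live calls")
else:
    url, key = settings
    print(f"Would connect to {url}")
    # With BioBlend installed:
    #   from bioblend.galaxy import GalaxyInstance
    #   gi = GalaxyInstance(url, key=key)
```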
The unit tests, stored in the tests folder, can be run using pytest. From the project root:
$ pytest
If you have run into issues, found a bug, or can't find an answer to your question about the use and functionality of BioBlend, please use the GitHub Issues page to ask your question.
Links to other documentation and libraries relevant to this library:
Galaxy Project
2012-2024, Galaxy Project
January 6, 2024 | 1.2.0 |