#
Data loading: Importing without validation and deleting studies
For data curators and developers, the cbioportalImporter.py script is available. It imports data regardless of validation results, so if the data format is incorrect, the importer may stop with an error, crash, or leave the database in an inconsistent state.
This script can also be used to delete studies.
- Requirements
- Importing a study without validation
- Deleting a study
- Deleting patients
- Deleting samples
#
Requirements
Make sure that you have followed the instructions in the Docker Compose setup guide.
#
Importing a study without validation
First, copy the study from your host machine into the study/ directory of the cbioportal-docker-compose repo:
cp -r /path/to/your_study <cbioportal-docker-compose repo>/study/your_study
To import a study without validation, run this from the root of the cbioportal-docker-compose repo:
docker compose exec cbioportal cbioportalImporter.py -s /study/your_study
For example:
docker compose exec cbioportal cbioportalImporter.py -s /study/lgg_ucsf_2014
⚠️ After every import, you must rebuild the derived tables to update the ClickHouse structures that power the study view. Without this step, newly imported data will not appear correctly in the UI:
docker compose exec cbioportal metaImport.py derive-tables
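Because the derived tables must be rebuilt after every import, the two commands can be chained so the second step is not forgotten. The following is a minimal sketch, assuming it is run from the root of the cbioportal-docker-compose repo; the function names cbio_exec and import_and_derive are hypothetical helpers, not part of cBioPortal.

```shell
# Thin wrapper around the command prefix used throughout this page.
cbio_exec() {
    docker compose exec cbioportal "$@"
}

# Hypothetical helper: import a study without validation, then rebuild the
# derived tables only if the importer exited successfully.
# $1 is the study directory as seen inside the container.
import_and_derive() {
    if [ -z "$1" ]; then
        echo "usage: import_and_derive <study directory inside the container>" >&2
        return 2
    fi
    if ! cbio_exec cbioportalImporter.py -s "$1"; then
        echo "Import of $1 failed; derived tables were not rebuilt." >&2
        return 1
    fi
    cbio_exec metaImport.py derive-tables
}

# Example call (not executed here):
#   import_and_derive /study/lgg_ucsf_2014
```

Skipping the rebuild after a failed import is a design choice in this sketch: a failed import may leave the database in an inconsistent state, so it seems safer to inspect the error first.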
#
Importing part of the data
To import only some new or updated data entries, specify the -d option instead of -s:
docker compose exec cbioportal cbioportalImporter.py -d <path to data directory>
Although the -d option accepts a directory that follows the same structure as the study directory, not all data types are supported for incremental upload. For more details on incremental data loading, see this page.
#
Deleting a study
To remove a study, run:
docker compose exec cbioportal cbioportalImporter.py -c remove-study -meta <path to study directory>/meta_study.txt
The meta_study.txt file should specify, in its cancer_study_identifier: field, the ID of the study you would like to remove.
For example:
docker compose exec cbioportal cbioportalImporter.py -c remove-study -meta /data/brca_small/meta_study.txt
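Since the study to delete is determined by the identifier inside the file rather than by the directory name, it can be worth printing that identifier before running the removal. The following is a minimal sketch, assuming meta_study.txt consists of `key: value` lines and is readable from the shell where the function runs; study_id_from_meta is a hypothetical helper name.

```shell
# Hypothetical helper: print the value of the cancer_study_identifier field
# from a meta_study.txt file. Surrounding whitespace (including a trailing
# carriage return) is stripped; only the first matching line is used.
study_id_from_meta() {
    sed -n 's/^cancer_study_identifier:[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p' "$1" \
        | head -n 1
}

# Example call (not executed here):
#   study_id_from_meta /path/to/your_study/meta_study.txt
```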
If you know the Cancer Study ID of the study or studies you want to remove, you can also use:
docker compose exec cbioportal cbioportalImporter.py -c remove-study -id study1_id
Where study1_id is the Cancer Study ID of the study you would like to remove.
You can also remove multiple studies at once by passing the Cancer Study IDs separated by commas:
docker compose exec cbioportal cbioportalImporter.py -c remove-study -id study1_id,study2_id,study3_id
Where study1_id, study2_id and study3_id are the Cancer Study IDs of the studies you would like to remove.
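When the study IDs come from a script rather than being typed by hand, the comma-separated value for -id can be assembled from separate arguments. A minimal sketch follows; join_ids is a hypothetical helper, and it assumes the IDs themselves contain no commas.

```shell
# Hypothetical helper: join all arguments with commas, producing the
# comma-separated form that the -id option expects.
join_ids() {
    # The subshell keeps the IFS change from leaking into the caller.
    ( IFS=,; printf '%s\n' "$*" )
}

# Example call (not executed here):
#   docker compose exec cbioportal cbioportalImporter.py -c remove-study \
#       -id "$(join_ids study1_id study2_id study3_id)"
```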
#
Deleting patients
To remove patients (and their associated samples and data) from one or more studies, run:
docker compose exec cbioportal cbioportalImporter.py remove-patients --study_ids <study_ids> --patient_ids <patient_ids>
Where study_ids is a comma-separated list of Cancer Study IDs to search and patient_ids is a comma-separated list of patient identifiers to delete.
For example:
docker compose exec cbioportal cbioportalImporter.py remove-patients --study_ids study1_id --patient_ids patientA,patientB
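For longer lists it may be easier to keep the patient identifiers in a text file and build the comma-separated argument from it. The sketch below assumes a plain-text file with one identifier per line; both the file name patients.txt and the helper ids_from_file are hypothetical.

```shell
# Hypothetical helper: read one identifier per line from the file given as $1
# and print them as a single comma-separated list. Carriage returns and blank
# lines are dropped.
ids_from_file() {
    tr -d '\r' < "$1" | grep -v '^[[:space:]]*$' | paste -s -d , -
}

# Example call (not executed here):
#   docker compose exec cbioportal cbioportalImporter.py remove-patients \
#       --study_ids study1_id --patient_ids "$(ids_from_file patients.txt)"
```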
#
Deleting samples
To remove specific samples from one or more studies, run:
docker compose exec cbioportal cbioportalImporter.py remove-samples --study_ids <study_ids> --sample_ids <sample_ids>
Where study_ids is a comma-separated list of Cancer Study IDs to search and sample_ids is a comma-separated list of sample identifiers to delete.
For example:
docker compose exec cbioportal cbioportalImporter.py remove-samples --study_ids study1_id,study2_id --sample_ids sampleX,sampleY
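Because deletion cannot easily be undone, a quick sanity check of the ID lists before calling the importer may catch simple typing mistakes. The sketch below is a hypothetical pre-check, not something the importer provides, and it is deliberately stricter than the importer may require: it rejects empty lists, empty entries, and any whitespace.

```shell
# Hypothetical check: succeed only if $1 is a non-empty comma-separated list
# with no empty entries and no whitespace.
valid_id_list() {
    case "$1" in
        ""|,*|*,|*,,*|*[[:space:]]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Example use (not executed here):
#   valid_id_list "sampleX,sampleY" && \
#       docker compose exec cbioportal cbioportalImporter.py remove-samples \
#           --study_ids study1_id,study2_id --sample_ids sampleX,sampleY
```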