Example commands
Importing gene panel
Use this command to import a gene panel. Pass the path of the gene panel file, as seen from inside the container, to the --data option.
The simplest way to make the file visible there is to place it in cbioportal-docker-compose/study/reference_data,
which is mounted inside the container at /study/reference_data/.
mkdir -p <cbioportal-docker-compose repo>/study/reference_data/
cp /path/to/your_gene_panel.txt <cbioportal-docker-compose repo>/study/reference_data/
docker compose exec cbioportal importGenePanel.pl --data /study/reference_data/your_gene_panel.txt
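When several gene panels sit in the reference_data folder, the import command above can be repeated for each file. The following is a minimal sketch, not part of the official tooling: the function name import_gene_panels, the DOCKER override variable (set it to echo for a dry run) and the assumption that panel files end in .txt are all hypothetical.

```shell
# Hypothetical helper: import every *.txt gene panel found in the host folder
# that is mounted at /study/reference_data/ inside the container.
# DOCKER is an assumed override hook; DOCKER=echo prints the commands instead.
import_gene_panels() {
    host_dir="${1:-study/reference_data}"
    docker_bin="${DOCKER:-docker}"
    for panel in "$host_dir"/*.txt; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$panel" ] || continue
        "$docker_bin" compose exec cbioportal importGenePanel.pl \
            --data "/study/reference_data/$(basename "$panel")" || return 1
    done
}

# Example (run from the cbioportal-docker-compose repo):
#   import_gene_panels study/reference_data
```

The loop stops at the first failing import, so a malformed panel file does not go unnoticed.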
Importing data
Use this command to validate and import a dataset. Add the study to the cbioportal-docker-compose/study folder. The
command will connect to the web API of the container cbioportal-container to validate the study, and then import
it into the associated database.
# From the cbioportal-docker-compose repo
mkdir -p study/reports
docker compose exec cbioportal metaImport.py -s /study/<name_of_study> -o -html /study/reports/report.html
⚠️ After importing a study, remember to restart the
cbioportal container to see the study on the home page. Run docker compose restart cbioportal.
⚠️ Warning: When importing large studies, you may run into a Java out-of-memory error on machines with limited RAM. You can try adjusting the Java heap size used by the importer in order to work around this, for example:
docker compose exec cbioportal metaImport.py -s /study/your_study -o -jvo "-Xms16g -Xmx96g"
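The import, the optional heap-size setting and the restart can be chained in one small wrapper. This is only a sketch under stated assumptions: the function name import_study, the DOCKER override variable (set it to echo for a dry run) and the per-study report file name are hypothetical, and study/reports is assumed to exist already.

```shell
# Hypothetical helper: validate and import one study, then restart the portal.
# DOCKER is an assumed override hook; DOCKER=echo prints the commands instead.
import_study() {
    study="$1"      # folder name under study/
    java_opts="$2"  # optional JVM options, passed through -jvo
    docker_bin="${DOCKER:-docker}"

    if [ -z "$study" ]; then
        echo "usage: import_study <name_of_study> [java_opts]" >&2
        return 2
    fi

    if [ -n "$java_opts" ]; then
        "$docker_bin" compose exec cbioportal metaImport.py \
            -s "/study/$study" -o -html "/study/reports/$study.html" \
            -jvo "$java_opts" || return 1
    else
        "$docker_bin" compose exec cbioportal metaImport.py \
            -s "/study/$study" -o -html "/study/reports/$study.html" || return 1
    fi

    # Restart only after a successful import so the study appears on the home page.
    "$docker_bin" compose restart cbioportal
}

# Example (heap values are illustrative, size them to your machine):
#   import_study your_study "-Xms2g -Xmx8g"
```

Because the restart runs only when the import succeeds, a failed import leaves the running portal untouched.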
Incremental Import
To add or update data in an existing study without re-importing the entire study, you can use the new incremental import functionality. Point the importer to a folder containing only the "delta" of study data you would like to add or update. To load data incrementally, specify that folder with the -d option instead of the -s option.
docker compose exec \
cbioportal \
metaImport.py -d /study/<study_delta> -o
For more details on incremental data loading, see this page.
Removing Studies, Samples, and Patients
# Remove a study entirely
docker compose exec \
cbioportal \
cbioportalImporter.py -c remove-study -id study_id
# Remove specific samples from a study
docker compose exec cbioportal cbioportalImporter.py -c remove-samples --study_ids <study_id> --sample_ids SAMPLE1,SAMPLE2
# Remove specific patients and all their data
docker compose exec cbioportal cbioportalImporter.py -c remove-patients --study_ids <study_id> --patient_ids PATIENT1,PATIENT2
After any removal operation, rebuild derived tables:
docker compose exec cbioportal metaImport.py derive-tables
Using cached portal side-data
In some setups the data validation step may not have direct access to the web API, for instance when the web API is only accessible to authenticated browser sessions. You can use this command to generate a cached folder of files that the validation script can use instead.
# From the cbioportal-docker-compose repo
mkdir -p study/portalinfo
docker compose exec cbioportal dumpPortalInfo.pl /study/portalinfo
Then, tell the script to use the cached folder instead of the API:
# From the cbioportal-docker-compose repo
mkdir -p study/reports
docker compose exec cbioportal metaImport.py -p /study/portalinfo -s /study/<name_of_study> -o -html /study/reports/report.html
Inspecting or adjusting the database
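# Set the appropriate variables first
CLICKHOUSE_USER=<your_clickhouse_user>
CLICKHOUSE_PASSWORD=<your_clickhouse_password>
CLICKHOUSE_DB=<your_clickhouse_db_name>
# Pass them into the container with -e: the single-quoted command is expanded
# inside the container, where variables set only in the host shell are not visible
docker compose exec \
-e CLICKHOUSE_USER="$CLICKHOUSE_USER" \
-e CLICKHOUSE_PASSWORD="$CLICKHOUSE_PASSWORD" \
-e CLICKHOUSE_DB="$CLICKHOUSE_DB" \
cbioportal-database \
sh -c 'clickhouse client -u"$CLICKHOUSE_USER" --password="$CLICKHOUSE_PASSWORD" --database="$CLICKHOUSE_DB"'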
Deleting a study
To remove a study, run:
docker compose exec \
cbioportal \
cbioportalImporter.py -c remove-study -id study_id
Where study_id is the cancer_study_identifier of the study you would like to remove.
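If you are unsure of the identifier, it can usually be read from the study's meta_study.txt file, which carries a cancer_study_identifier field. The sketch below assumes the study folder is available on the host; the function name study_id_of is hypothetical.

```shell
# Hypothetical helper: print the cancer_study_identifier declared in
# <study_folder>/meta_study.txt.
study_id_of() {
    meta="$1/meta_study.txt"
    if [ ! -f "$meta" ]; then
        echo "no meta_study.txt in $1" >&2
        return 1
    fi
    # Keep the value after "cancer_study_identifier:", trim trailing
    # whitespace (including a Windows carriage return), first match only.
    sed -n 's/^cancer_study_identifier:[[:space:]]*//p' "$meta" \
        | sed 's/[[:space:]]*$//' \
        | head -n 1
}

# Example (run from the cbioportal-docker-compose repo):
#   study_id_of study/<name_of_study>
```

The printed value is what the -id option of remove-study expects.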