#
Deploy with Docker
#
Prerequisites
Docker provides a way to run applications securely isolated in a container, packaged with all their dependencies and libraries. To learn more about Docker, see the Docker overview.
Make sure that you have the latest version of Docker installed on your machine. Get latest version
WSL: Make sure to use Docker Desktop for Windows. Installing Docker in Ubuntu will not work.
#
Usage instructions
In this example we use Docker Compose to spin up all the different required containers/services for cBioPortal.
#
Quick Start
Important when using Windows:
Please use the WSL input window (WSL2 Terminal) directly and not "Windows PowerShell" or "Git Bash".
git clone https://github.com/cBioPortal/cbioportal-docker-compose.git
cd cbioportal-docker-compose
./init.sh
docker compose up
You should now be able to see the cBioPortal website at http://localhost:8080
Import studies with:
(Hint: keep the current terminal open, open a second terminal, and change into the cbioportal-docker-compose directory.)
docker compose run cbioportal metaImport.py -u http://cbioportal:8080 -s study/lgg_ucsf_2014/ -o
Clear persistent data volumes with:
docker compose down -v
#
Comprehensive Start
#
Step 1 - Run Docker Compose
Download the git repo that has the Docker compose file and go to the root of that folder:
git clone https://github.com/cBioPortal/cbioportal-docker-compose.git
cd cbioportal-docker-compose
Then download all necessary files (seed data, example config and example study from datahub) with the init script:
./init.sh
Then run:
docker compose up
This will start all four containers (services) defined here. That is:
- the mysql database, which holds most of the cBioPortal data
- the cBioPortal Java web app, which serves the React frontend as well as the REST API
- the session service Java web app. This service has a REST API and stores session information (e.g. what genes are being queried) and user specific data (e.g. saved cohorts) in a separate mongo database
- the mongo database that persists the data for the session service
It will take a few minutes the first time to import the seed database and perform migrations if necessary. Each container outputs logs to the terminal. For each log line you'll see the name of the container that produced it (e.g. cbioportal_container or cbioportal_session_database_container). If all is well you won't see any significant errors (some warnings are fine to ignore), and you should be able to visit the cBioPortal homepage at http://localhost:8080. You'll notice that cBioPortal already comes preloaded with a small Low-Grade Gliomas (UCSF, Science 2014) study and the recommended gene panels.
Go to the next step to see how to import other studies.
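Because the first start can take a few minutes, it may help to poll the homepage instead of reloading the browser by hand. The following is a minimal sketch, not part of the cbioportal-docker-compose repo: the function name wait_for_portal and its default attempt count and delay are assumptions chosen for illustration.

```shell
# Poll a URL until it answers successfully or the attempts run out.
# Usage: wait_for_portal URL [ATTEMPTS] [DELAY_SECONDS]
# The ( ) body runs in a subshell, so the variables stay local.
wait_for_portal() (
  url=$1
  attempts=${2:-60}   # hypothetical default: 60 tries
  delay=${3:-5}       # hypothetical default: 5 seconds between tries
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; -s and -o keep it quiet.
    if curl -fs -o /dev/null "$url"; then
      echo "cBioPortal is answering at $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "Gave up waiting for $url" >&2
  return 1
)

# Example (commented out so that sourcing this file has no side effects):
# wait_for_portal http://localhost:8080 60 5
```

The function only checks that the web app responds; it does not verify that the seed database import has finished without errors, so the container logs remain the authoritative source.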
#
Notes on detached mode
If you prefer to run the services in detached mode (i.e. not logging everything to your terminal), you can run
docker compose up -d
In this mode, you'll have to check the logs of each container manually using e.g.:
docker logs -f cbioportal_container
You can list all containers running on your system with
docker ps -a
To shut down the containers:
docker compose down
Tip: If you are using Docker Desktop, detached mode is the preferred way to run as it provides a UI for listing the containers and interfacing with them.
#
Step 2 - Import Studies
To import studies you can run:
docker compose run cbioportal metaImport.py -u http://cbioportal:8080 -s study/msk_impact_2017/ -o
This will import the msk_impact_2017 study into your local database. It will take a few minutes to import.
If you are running cBioPortal in Clickhouse mode (see below), sync the Clickhouse database after the import:
docker compose exec cbioportal-clickhouse-importer bash /workdir/sync-databases.sh
After importing and syncing, restart the cbioportal web container:
docker compose restart cbioportal
Alternatively, clear the cache through the API as described in "Notes on restarting" below.
All public studies can be downloaded from https://www.cbioportal.org/datasets or https://github.com/cBioPortal/datahub/. You can add any of them to the ./study folder and import them. There's also a script (./study/init.sh) to download multiple studies. You can set DATAHUB_STUDIES to any public study id (e.g. lgg_ucsf_2014) and run ./init.sh.
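When several study folders are present in ./study, the import command can be wrapped in a small loop. This is a sketch under stated assumptions: the helper names import_study and import_studies and the DRY_RUN switch are hypothetical and not part of cBioPortal; only the docker compose command itself is taken from this guide.

```shell
# Print (DRY_RUN=1) or run the import command for one study folder name.
# The ( ) body runs in a subshell, so the variables stay local.
import_study() (
  study=$1
  # Build the command from this guide as an argument list.
  set -- docker compose run cbioportal metaImport.py \
    -u http://cbioportal:8080 -s "study/${study}/" -o
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
)

# Import every study passed as an argument, stopping at the first failure.
import_studies() (
  for study in "$@"; do
    import_study "$study" || return 1
  done
)

# Example (commented out so that sourcing this file has no side effects):
# DRY_RUN=1 import_studies lgg_ucsf_2014 msk_impact_2017
```

Running with DRY_RUN=1 first shows the exact commands before anything touches the database. Run the functions from the root of the cbioportal-docker-compose folder, since the study path is relative.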
#
Notes on restarting
To avoid having to restart, you can alternatively call the /api/cache endpoint with a DELETE HTTP request (see here for more information):
curl -X DELETE -H "X-API-KEY: my-secret-api-key-value" http://localhost:8080/api/cache
The value of the API key is configured in the application.properties file. You can visit http://localhost:8080 again and you should be able to see the new study.
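In a script it can be useful to know whether the cache request actually succeeded. The sketch below wraps the DELETE request and inspects the HTTP status code. The function name clear_portal_cache is hypothetical, and treating any 2xx status as success is an assumption, since this guide does not state which status the endpoint returns.

```shell
# Clear the cBioPortal cache through the API and report whether it worked.
# Usage: clear_portal_cache BASE_URL API_KEY
# The ( ) body runs in a subshell, so the variables stay local.
clear_portal_cache() (
  base_url=$1
  api_key=$2
  # -X DELETE sets the HTTP method; -w '%{http_code}' prints only the status.
  status=$(curl -s -o /dev/null -w '%{http_code}' \
    -X DELETE -H "X-API-KEY: ${api_key}" "${base_url}/api/cache")
  case "$status" in
    # Assumption: any 2xx response means the cache was cleared.
    2??) echo "Cache cleared (HTTP $status)" ;;
    *)   echo "Cache clear failed (HTTP $status)" >&2; return 1 ;;
  esac
)

# Example (commented out so that sourcing this file has no side effects):
# clear_portal_cache http://localhost:8080 my-secret-api-key-value
```

A failure status such as 401 or 403 would most likely point to an API key that does not match the one configured in application.properties.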
#
Step 3 - Customize your application.properties file
The properties file can be found in ./config/application.properties, which was set up when running init.sh.
This properties file allows you to customize your instance of cBioPortal with e.g. custom logos, or point the cBioPortal container to e.g. use an external mysql database. See the properties documentation for a comprehensive overview.
If you would like to enable OncoKB see OncoKB data access for how to obtain a data access token. After obtaining a valid token use:
#
Step 4 - Customize cBioPortal setup
To read more about the various ways to use authentication and parameters for running the cBioPortal web app see the relevant backend deployment documentation.
On server systems that can easily spare 4 GiB or more of memory, set the -Xms and -Xmx options to the same number. This should increase performance of certain memory-intensive web services such as computing the data for the co-expression tab. If you are using macOS or Windows, make sure to take a look at these notes to allocate more memory for the virtual machine in which all Docker processes are running.
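To keep the two heap options in step, the flags can be generated from a single value. This is a small illustrative sketch: the helper name heap_flags is hypothetical, and requiring a unit suffix (k, m or g) is a choice made here for readability, not a JVM requirement.

```shell
# Print matching -Xms/-Xmx flags for a heap size such as 4g or 4096m.
# The ( ) body runs in a subshell, so the variable stays local.
heap_flags() (
  size=$1
  # Accept a positive integer followed by one unit letter.
  if ! printf '%s\n' "$size" | grep -Eq '^[1-9][0-9]*[kKmMgG]$'; then
    echo "heap size must look like 4g or 4096m, got: $size" >&2
    return 1
  fi
  echo "-Xms${size} -Xmx${size}"
)

# Example (commented out so that sourcing this file has no side effects):
# heap_flags 4g
```

The printed flags would go wherever your setup defines the java command line for the cBioPortal web app; where exactly that is depends on your compose file.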
#
Clickhouse Mode
For cBioPortal instances with large cohorts (>100K samples), we developed a "Clickhouse mode" of the Study View. This mode uses Clickhouse as an additional database next to MySQL for 10x faster querying (see video). The mode is experimental and is currently used only by the public-facing GENIE instance. We plan to roll it out to other portals later this year (see roadmap ticket). Follow the steps below to run cBioPortal Docker Compose in clickhouse mode.
- Modify .env to use release >= 6.0.27 of cBioPortal.
... DOCKER_IMAGE_CBIOPORTAL=cbioportal/cbioportal:6.0.27 ...
- Run init script
./init.sh
- Start cBioPortal with clickhouse
docker compose -f docker-compose.yml -f addon/clickhouse/docker-compose.clickhouse.yml up
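Before starting in clickhouse mode, the release requirement from the first step can be checked with a short script. This is a sketch under assumptions: the function names are hypothetical, the .env file is assumed to contain a single DOCKER_IMAGE_CBIOPORTAL line with a numeric tag as in the example above, and `sort -V` (version sort, as provided by GNU coreutils) must be available.

```shell
# Succeed when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Extract the tag from a line such as
# DOCKER_IMAGE_CBIOPORTAL=cbioportal/cbioportal:6.0.27 in the given env file.
image_tag() {
  sed -n 's/^DOCKER_IMAGE_CBIOPORTAL=.*:\(.*\)$/\1/p' "$1" | head -n 1
}

# Report whether the image tag in the env file is at least 6.0.27.
# Returns 2 when the tag is missing or not purely numeric (e.g. a branch name).
check_clickhouse_version() (
  tag=$(image_tag "$1")
  case "$tag" in
    ''|*[!0-9.]*)
      echo "cannot compare tag '$tag'; check it manually" >&2
      return 2 ;;
  esac
  if version_ge "$tag" 6.0.27; then
    echo "tag $tag is recent enough for clickhouse mode"
  else
    echo "tag $tag is older than 6.0.27" >&2
    return 1
  fi
)

# Example (commented out so that sourcing this file has no side effects):
# check_clickhouse_version .env
```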
#
Clickhouse Cloud
The Clickhouse setup mentioned above is fully compatible with a remote Clickhouse database. For production environments, you can set up a Clickhouse database using Clickhouse Cloud and update the clickhouse database credentials in the .env to match your database credentials. For the clickhouse sync step to work properly, your credentials should have both read and write permissions.
#
More commands
For documentation on how to import a study, see this tutorial. For more uses of the cBioPortal image, see this file.
To Dockerize a Keycloak authentication service alongside cBioPortal, see this file.
#
Building cBioPortal
If you need to build the cBioPortal backend to a docker image, you can use the following command from the cbioportal repo.
docker build -t cbioportal/cbioportal:my-dev-cbioportal-image -f docker/web-and-data/Dockerfile .
Change the env file in the cbioportal-docker-compose repo to reference your image.
DOCKER_IMAGE_CBIOPORTAL=cbioportal/cbioportal:my-dev-cbioportal-image
Note that the above command will deploy the repo as loose files. If you would like to build to a single executable app.jar, you can use the web configuration:
docker build -t cbioportal/cbioportal:my-dev-cbioportal-image -f docker/web/Dockerfile .
In this case, the default docker-compose.yml may not work. If you get an error about a missing PortalApplication, the compose file must be configured to launch the app.jar instead.
java -Xms2g -Xmx4g -jar /cbioportal-webapp/app.jar -spring...
There may be a docker-compose.web.yml you can use instead of modifying the above.
docker compose -f docker-compose.web.yml up -d
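Pointing the env file at a locally built image, as described earlier in this section, can also be scripted. The helper below is a minimal sketch: the name set_portal_image is hypothetical, and it assumes the env file uses plain KEY=value lines. It appends the new assignment at the end of the file rather than editing it in place, which avoids the differences between `sed -i` implementations.

```shell
# Point DOCKER_IMAGE_CBIOPORTAL in an env file at a different image.
# Usage: set_portal_image ENV_FILE IMAGE
# The ( ) body runs in a subshell, so the variables stay local.
set_portal_image() (
  env_file=$1
  image=$2
  if [ ! -f "$env_file" ]; then
    echo "env file not found: $env_file" >&2
    return 1
  fi
  tmp=$(mktemp) || return 1
  # Drop any existing assignment, then append the new one.
  grep -v '^DOCKER_IMAGE_CBIOPORTAL=' "$env_file" > "$tmp"
  echo "DOCKER_IMAGE_CBIOPORTAL=${image}" >> "$tmp"
  # Writing through cat keeps the permissions of the original file.
  cat "$tmp" > "$env_file" && rm -f "$tmp"
)

# Example (commented out so that sourcing this file has no side effects):
# set_portal_image .env cbioportal/cbioportal:my-dev-cbioportal-image
```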
#
Uninstalling cBioPortal
docker compose down -v --rmi all