# Authenticating Users via Tokens

The cBioPortal provides access to data within its database through the cBioPortal web service. While most users interact with this web service indirectly while visiting cBioPortal web pages, it is possible to access this web service directly. Direct access to the web service is appealing for those users with an interest in mining or analyzing cBioPortal data beyond the confines of a web browser, such as from ‘R’ or Python clients.

If your instance of the cBioPortal will not require user authorization (login), direct access to the web service is available without any additional server configuration or client side prerequisites. However, if your instance of the cBioPortal will require login, additional data access token server configuration is necessary before direct access to the web service is available to your users. Additionally, prior to web service access, clients will need to obtain a data access token and present this token with each web service request.

This document contains information for both cBioPortal administrators and users. For administrators, this document describes how to configure an instance of the cBioPortal with data access token support. For users, this document provides details about how to obtain a data access token and present it with a web service request (see Using Data Access Tokens).

# Configuring Data Access Tokens within cBioPortal

# Implementations

The cBioPortal codebase comes with three different data access token implementations, each of which come with strengths and weaknesses.

The first implementation based on JSON Web Token (JWT) technology, is appealing because the data access token contains all the information required to validate the token during a web service request. When a data access token is generated by the cBioPortal, its expiration date is inserted into the token. The data access token is then signed and encrypted before presenting it to the user. When the data access token is presented in a subsequent web service request, the cBioPortal is able to inspect the token and determine its validity before satisfying the request. The downside to this implementation is that since the data access token is self-validating, no information about the token is stored in the database. This means there is no way to revoke a data access token in the event that it has been compromised or limit the number of data access tokens per user.

The second implementation based on Universally unique identifier (UUID) technology, allows for greater control over the data access token at the cost of additional database support. Unlike JWT based data access tokens which are self-validating, a UUID based token is a string representation of a unique number. When a data access token is generated by the cBioPortal, a record is inserted into the data_access_tokens database table:

mysql> describe data_access_tokens;
+------------+--------------+------+-----+---------+-------+
| Field      | Type         | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+-------+
| TOKEN      | varchar(50)  | NO   | PRI | NULL    |       |
| USERNAME   | varchar(255) | NO   | MUL | NULL    |       |
| EXPIRATION | datetime     | NO   |     | NULL    |       |
| CREATION   | datetime     | YES  |     | NULL    |       |
+------------+--------------+------+-----+---------+-------+
4 rows in set (0.00 sec)

In addition to storing the token (the unique number), its creation and expiration dates along with the associated username is also stored prior to presenting it to the user. When the data access token is presented in a subsequent web service request, the cBioPortal compares the token presented in the request against this table for validity. If the token exists in the table and has not yet expired, the request will be satisfied. The creation date is used by the cBioPortal when a user requests a new data access token. Based on the server configuration, if the number of outstanding data access tokens together with this new one exceeds the number of allowable data access tokens per user, the cBioPortal may refuse to present a new one or it may invalidate the oldest one before presenting the new one to the user.

A third implementation makes use of an external OAuth2-authentication provider and can integrate with single-sign on (SSO) solutions. The OAuth2 provider provides users with an offline token that does not allow for direct access to protected cBioPortal resources. Instead, the offline token is passed to the OAuth2 authorization server that returns an access token, a short-lived permission to access cBioPortal resources. The access token contains up-to-date user permissions (the studies that the user is permitted to view). This implementation ensures that any changes of user permissions registered at the authentication provider are effective immediately and do not rely on the expiration date of the token of the user. At any moment OAuth2 offline tokens can be revoked by the Authentication provider on a per-user basis.

A step-by-step guide to configure KeyCloak to provide OAuth2 client functionality can be found in Authenticating and Authorizing Users via keycloak.

# Modifying Configuration

The following properties must be present in application.properties in order to allow direct access to the cBioPortal web service when login is required:

Property: dat.method (required)

  • Description: The desired data access token implementation.
  • Permissible Values: jwt, uuid, oauth2, none
  • Default Value: none

Property: dat.filter_user_role (optional)

  • Description: This property determines users access in token generation. If present, this role will be checked in user roles before generating a token.
  • Permissible Values: A string value.
  • Default Value: none

Property: dat.unauth_users (optional, not used for dat.method = oauth2)

  • Description: A list of users that should not be allowed to download a data access token.
  • Permissible Values: A comma-delimited list of valid user names as found in the users table.
  • Default Value: anonymousUser

Property: dat.ttl_seconds (optional, not used for dat.method = oauth2)

  • Description: The time in seconds between token creation and token expiration.
  • Permissible Values: An integer value greater than zero.
  • Default Value: 2592000 (30 days)

Property: dat.jwt.secret_key (required only when dat.method = jwt)

  • Description: The signature used to sign JWT based data access tokens during creation. This signature is used in subsequent web service requests to validate the request. This property is used only when dat.method = jwt. Keys used with the HS256 algorithm MUST have a size >= 256 bits. See https://tools.ietf.org/html/rfc7518#section-3.2 for more information.
  • Permissible Values: A string value.
  • Default Value: none

Property: dat.uuid.max_number_per_user (required only when dat.method = uuid)

  • Description: This property determines the number of data access tokens that can be allocated to a single user at any given time. If a user requests a data access token that puts them over this limit, the cBioPortal will present a new token to the user and it may invalidate the oldest one before presenting the new one to the user. This property is used only when dat.method = uuid.
  • Permissible Values: An integer value greater than zero.
  • Default Value: 1

Property: dat.oauth2.clientId (required only when dat.method = oauth2)

  • Description: Identifier of the OAuth2 client of the authentication provider.
  • Permissible Values: string
  • Default Value: This value is specified by the authentication provider.

Property: dat.oauth2.clientSecret (required only when dat.method = oauth2)

  • Description: Secret of the OAuth2 client of the authentication provider used for back channel communication between cBioPortal and the provider. This value is specified by the authentication provider.
  • Permissible Values: string
  • Default Value: This value is specified by the authentication provider.

Property: dat.oauth2.issuer (required only when dat.method = oauth2)

  • Description: Name of the issuer of the tokens. It is used to assert the correct origin of the token. This value is specified by the authentication provider.
  • Permissible Values: string
  • Default Value: This value is specified by the authentication provider.

Property: dat.oauth2.userAuthorizationUri (required only when dat.method = oauth2)

  • Description: Url of the user login page of the OAuth2 authorization provider. According to OAuth2 specification this endpoint must end with /auth.
  • Permissible Values: url
  • Default Value: This value is specified by the authentication provider.

Property: dat.oauth2.accessTokenUri (required only when dat.method = oauth2)

  • Description: Url for retrieval of tokens from the OAuth2 authorization provider. According to OAuth2 specification this endpoint must end with /token.
  • Permissible Values: url
  • Default Value: This value is specified by the authentication provider.

Property: dat.oauth2.jwkUrl (required only when dat.method = oauth2)

  • Description: Url for retrieval of the public key used by the OAuth2 authorization provider to sign tokens. According to OAuth2 specification this endpoint must end with /certs.
  • Permissible Values: url
  • Default Value: This value is specified by the authentication provider.

Property: dat.oauth2.jwtRolesPath (required only when dat.method = oauth2)

  • Description: ::-separated path to array with user roles in JWT token returned by Keycloak.
  • Permissible Values: ::-separated strings
  • Default Value: resource_access::cbioportal::roles

# Using Data Access Tokens

To directly access the web service of a cBioPortal instance which requires user login, clients will need to obtain a data access token and present this token with each web service request. To obtain a data access token, login to the cBioPortal and then choose “Data Access Token” from the “Logged in as…” dropdown menu located on the top-right section of the webpage (note this menu option will not be present if the instance of the cBioPortal has not be configured to support data access tokens):

This will bring you to a section of the Web API page where you can download your token using the "Download Token" button.

The downloaded data access token file will contain the token, and, for uuid and jwt token methods, a creation and expiration date:

token: 63efa81c-2490-4e15-9d1c-fb6e8e50e35d
creation_date: 2018-11-28T20:23:55.241Z
expiration_date: 2019-05-27T20:23:55.241Z

To directly access the web service, include the token as an Authorization header to the web service request:

Authorization: Bearer 63efa81c-2490-4e15-9d1c-fb6e8e50e35d

Here is an example curl command showing a request to GET /api/studies with the proper Authorization header set:

curl -X GET "https://www.cbioportal.org/api/studies" -H "accept: application/json" -H "Authorization: Bearer 63efa81c-2490-4e15-9d1c-fb6e8e50e35d"

If a web service request is made without an Authorization header or if the token within the header is not valid, the web service will not satisfy the request and will return with an HTTP Status 401. Additionally, there are a handful of properties which determine the behavior of cBioPortal data access tokens (see Configuring Data Access Tokens within cBioPortal). For example, a data access token may have an expiration date of 30 days after its creation date or 300 days. Similarly, there may be a limit to the number of data access tokens a user may have at any one time. If this limit is reached a new data access token may or may not be downloadable. Please check with the local cBioPortal system administrator for more information about how data access token behavior has been configured.