cBioOnDemand Instance
Every cBioOnDemand instance has two main parts:
- cBioPortal application
- cBioPortal database
cBioPortal application
YAML files of the related Kubernetes (K8s) objects are available at https://gitlab.ics.muni.cz/europdx/k8s/cbio-on-demand/api/tree/master/cbioondemandK8S/src/main/resources
Kubernetes stack:
- ReplicaSet to keep the Pod up.
- Service to route traffic to the Pod.
- CronJob to fire a ‘delete job’ via the API’s DELETE endpoint with the instance’s parameters (see the sketch below).
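To make the CronJob part concrete, here is a minimal sketch of what such an object could look like; the name, schedule, image and DELETE endpoint URL are illustrative assumptions, not taken from the linked repository:

```yaml
# Sketch of a CronJob that calls the DELETE endpoint for one instance.
# All names, the schedule and the endpoint URL are placeholders.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cbio-ondemand-delete            # hypothetical name
spec:
  schedule: "0 * * * *"                 # hypothetical schedule
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: delete-instance
              image: curlimages/curl    # any image with curl works
              args:
                - "-X"
                - "DELETE"
                - "https://api.example.org/instances/$(INSTANCE_ID)"  # hypothetical endpoint
              env:
                - name: INSTANCE_ID
                  value: "inst-a1b2c3"  # the instance's parameters are passed here
```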
Data import
The application image is based on the cbioportal image and extended with a data-loading script. The full application image source is available at https://gitlab.ics.muni.cz/europdx/k8s/cbio-on-demand/data-loading. A ‘postStart’ hook runs the loading script on every start of the Pod (creation, restart, reallocation, …). Both the script and the instance are configurable through the Pod’s environment variables.
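As a rough illustration of the postStart idea, the relevant fragment of the Pod spec could look like the sketch below; the container name, image tag, script path and env value are assumptions, not the actual contents of the data-loading image:

```yaml
# Fragment of a Pod spec: the postStart hook runs the loading script on every
# start of the Pod; the script reads its settings from the Pod's env.
containers:
  - name: cbioportal
    image: cbioportal/cbioportal:latest       # extended application image in practice
    lifecycle:
      postStart:
        exec:
          command: ["/bin/sh", "-c", "/scripts/load-data.sh"]   # hypothetical script path
    env:
      - name: DATAHUB
        value: "example-datahub"              # hypothetical value; configures the script
```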
Start-up synchronization
cBioPortal application requires the database to be up and serving data before the application starts. We ensure this with an initContainer that tests connectivity to the database.
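A minimal sketch of such an initContainer is shown below; the image, port and Service name are assumptions:

```yaml
# Fragment of a Pod spec: the initContainer blocks until the database Service
# accepts connections, so the main container starts only after the DB is up.
initContainers:
  - name: wait-for-db
    image: busybox
    command:
      - "sh"
      - "-c"
      - "until nc -z $DBHOST 3306; do echo waiting for database; sleep 2; done"
    env:
      - name: DBHOST
        value: "cbio-db-service"   # hypothetical Service name of the database Pod
```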
Labels
Each cBioOnDemand instance has the following labels (illustrated in the sketch after this list):
- app -> cbio
- type -> ondemand
- user -> user id from dataportal
- instance -> unique ID, which is the name of the identifier (a Custom object in Kubernetes created for every instance).
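A sketch of how these labels appear on an instance's objects (the concrete values are placeholders):

```yaml
# Example label set; values are illustrative only.
metadata:
  labels:
    app: cbio
    type: ondemand
    user: "42"                # user id from dataportal
    instance: "inst-a1b2c3"   # name of the identifier custom object
```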
Special configuration
- podAntiAffinity -> This ensures that the Pods of instances are spread evenly across nodes (preferred, not required).
- nodeAffinity -> This ensures that the Pods of instances are placed on nodes that have the label ‘edirex=cbio’.
- tolerations -> This ensures that the Pods of instances can be placed on nodes tainted with ‘edirex=cbio’ and effect ‘NoSchedule’ (see the sketch after this list).
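A sketch of these scheduling constraints at the Pod-spec level; the anti-affinity weight and topology key are assumptions, only the ‘edirex=cbio’ label/taint comes from the description above:

```yaml
spec:
  affinity:
    # Spread the instances' Pods across nodes (preferred, not required).
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            topologyKey: kubernetes.io/hostname
            labelSelector:
              matchLabels:
                app: cbio
    # Only schedule on nodes labelled edirex=cbio.
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: edirex
                operator: In
                values: ["cbio"]
  # Allow scheduling on nodes tainted edirex=cbio:NoSchedule.
  tolerations:
    - key: edirex
      operator: Equal
      value: cbio
      effect: NoSchedule
```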
Environment variables
- DBHOST -> name of the Service for the database Pod
- IMPORT -> not in use
- ID -> tmplist ID
- MOVE -> must match the path on which the instance is made public (Domain/PATH)
- URL -> must match MOVE
- DATAHUB -> indicates which DATAHUB the loading script should use (this refers to a configMap)
- Resource requests are set to ensure that no more than 4 instances are placed on one node (overload problem); see the sketch below.
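The env and resource requests of the application container could look roughly like this; every value is a placeholder, and the request sizes are assumptions chosen only to illustrate the ‘at most 4 instances per node’ idea (the real limit depends on node size):

```yaml
env:
  - name: DBHOST
    value: "cbio-db-service"          # Service name of the database Pod
  - name: ID
    value: "tmplist-123"
  - name: MOVE
    value: "/instance-a1b2c3"         # public path of the instance
  - name: URL
    value: "https://cbio.example.org/instance-a1b2c3"   # must match MOVE
  - name: DATAHUB
    valueFrom:
      configMapKeyRef:
        name: datahub-config          # hypothetical configMap name
        key: datahub
resources:
  requests:
    cpu: "1"
    memory: 4Gi
```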
cBioPortal database
Kubernetes stack:
- ReplicaSet to keep the Pod up.
- Service to route traffic to this Pod.
No persistent storage is used; if the Pod goes down, the data are gone (but these are only cBioPortal cache data). See the sketch below.
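The ‘no persistent storage’ point corresponds to an ephemeral volume; a sketch (image, paths and the emptyDir choice are assumptions):

```yaml
# Fragment of the database Pod spec: data live in an emptyDir volume,
# so they disappear together with the Pod.
containers:
  - name: cbio-db
    image: mysql:5.7                  # hypothetical image/tag
    volumeMounts:
      - name: db-data
        mountPath: /var/lib/mysql
volumes:
  - name: db-data
    emptyDir: {}                      # ephemeral; gone when the Pod goes away
```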
Labels
Each instance has the following labels:
- app -> cbioDB
- type -> ondemand
- user -> user id from dataportal
- instance -> unique ID, which is the name of the identifier (a Custom object in Kubernetes created for every instance).
Special configuration
- podAntiAffinity -> This ensures that the Pods of instances are spread evenly across nodes (preferred, not required).
- nodeAffinity -> This ensures that the Pods of instances are placed on nodes that have the label ‘edirex=cbio’.
- tolerations -> This ensures that the Pods of instances can be placed on nodes tainted with ‘edirex=cbio’ and effect ‘NoSchedule’.
- ENV refers to a configMap that describes our configuration for MySQL (including the mount configuration).
- Resource requests are set to ensure node stability (overload problem). A sketch of these last two points follows this list.
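A sketch of the configMap-driven MySQL configuration and the resource requests; all names and sizes are assumptions:

```yaml
containers:
  - name: cbio-db
    image: mysql:5.7
    # Environment taken from the MySQL configMap...
    envFrom:
      - configMapRef:
          name: cbio-db-config        # hypothetical configMap name
    # ...and the same configMap mounted as a MySQL config file.
    volumeMounts:
      - name: mysql-conf
        mountPath: /etc/mysql/conf.d
    resources:
      requests:
        cpu: "500m"
        memory: 2Gi                   # placeholder sizes
volumes:
  - name: mysql-conf
    configMap:
      name: cbio-db-config
```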