cBioOnDemand Instance
Every cBioOnDemand instance has two main parts:
- cBioPortal application
- cBioPortal database
cBioPortal application
YAML files of the related Kubernetes (K8s) objects are available at https://gitlab.ics.muni.cz/europdx/k8s/cbio-on-demand/api/tree/master/cbioondemandK8S/src/main/resources
Kubernetes stack:
- ReplicaSet to keep the Pod up.
- Service to route traffic to the Pod.
- CronJob to fire a ‘delete job’ via the API’s DELETE endpoint with the instance’s parameters (see the sketch below).
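To make the CronJob part concrete, here is a minimal sketch of what such an object could look like; the name, schedule, image and DELETE endpoint URL are illustrative assumptions, not taken from the linked repository:

```yaml
# Sketch of a CronJob that calls the DELETE endpoint for one instance.
# All names, the schedule and the endpoint URL are placeholders.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cbio-ondemand-delete            # hypothetical name
spec:
  schedule: "0 * * * *"                 # hypothetical schedule
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: delete-instance
              image: curlimages/curl    # any image with curl works
              args:
                - "-X"
                - "DELETE"
                - "https://api.example.org/instances/$(INSTANCE_ID)"  # hypothetical endpoint
              env:
                - name: INSTANCE_ID
                  value: "inst-a1b2c3"  # the instance's parameters are passed here
```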
Data import
The application image is based on the cbioportal image and extended with a data-loading script. The full application image source is available at https://gitlab.ics.muni.cz/europdx/k8s/cbio-on-demand/data-loading. A ‘postStart’ hook runs the loading script on every start of the Pod (creation, restart, reallocation, …). Both the script and the instance are configurable through the Pod’s environment variables.
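As a rough illustration of the postStart idea, the relevant fragment of the Pod spec could look like the sketch below; the container name, image tag, script path and env value are assumptions, not the actual contents of the data-loading image:

```yaml
# Fragment of a Pod spec: the postStart hook runs the loading script on every
# start of the Pod; the script reads its settings from the Pod's env.
containers:
  - name: cbioportal
    image: cbioportal/cbioportal:latest       # extended application image in practice
    lifecycle:
      postStart:
        exec:
          command: ["/bin/sh", "-c", "/scripts/load-data.sh"]   # hypothetical script path
    env:
      - name: DATAHUB
        value: "example-datahub"              # hypothetical value; configures the script
```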
Start-up synchronization
cBioPortal application requires the database to be up and serving data before the application starts. We ensure this with an initContainer that tests connectivity to the database.
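A minimal sketch of such an initContainer is shown below; the image, port and Service name are assumptions:

```yaml
# Fragment of a Pod spec: the initContainer blocks until the database Service
# accepts connections, so the main container starts only after the DB is up.
initContainers:
  - name: wait-for-db
    image: busybox
    command:
      - "sh"
      - "-c"
      - "until nc -z $DBHOST 3306; do echo waiting for database; sleep 2; done"
    env:
      - name: DBHOST
        value: "cbio-db-service"   # hypothetical Service name of the database Pod
```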
Labels
Each cBioOnDemand instance has the following labels (illustrated in the sketch after this list):
- app -> cbio
- type -> ondemand
- user -> user id from dataportal
- instance -> unique ID, which is the name of the identifier (a Custom object in Kubernetes created for every instance).
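A sketch of how these labels appear on an instance's objects (the concrete values are placeholders):

```yaml
# Example label set; values are illustrative only.
metadata:
  labels:
    app: cbio
    type: ondemand
    user: "42"                # user id from dataportal
    instance: "inst-a1b2c3"   # name of the identifier custom object
```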
Special configuration
- podAntiAffinity -> This ensures that the Pods of instances are spread evenly across nodes (preferred, not required).
- nodeAffinity -> This ensures that the Pods of instances are placed on nodes that have the label ‘edirex=cbio’.
- tolerations -> This ensures that the Pods of instances can be placed on nodes tainted with ‘edirex=cbio’ and effect ‘NoSchedule’ (see the sketch after this list).
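A sketch of these scheduling constraints at the Pod-spec level; the anti-affinity weight and topology key are assumptions, only the ‘edirex=cbio’ label/taint comes from the description above:

```yaml
spec:
  affinity:
    # Spread the instances' Pods across nodes (preferred, not required).
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            topologyKey: kubernetes.io/hostname
            labelSelector:
              matchLabels:
                app: cbio
    # Only schedule on nodes labelled edirex=cbio.
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: edirex
                operator: In
                values: ["cbio"]
  # Allow scheduling on nodes tainted edirex=cbio:NoSchedule.
  tolerations:
    - key: edirex
      operator: Equal
      value: cbio
      effect: NoSchedule
```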
Environment variables
- DBHOST -> name of the Service for the database Pod
- IMPORT -> not in use
- ID -> tmplist ID
- MOVE -> must match the path on which the instance is made public (Domain/PATH)
- URL -> must match MOVE
- DATAHUB -> indicates which DATAHUB the loading script should use (this refers to a configMap)
- Resource requests are set to ensure that no more than 4 instances are placed on one node (overload problem); see the sketch below.
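The env and resource requests of the application container could look roughly like this; every value is a placeholder, and the request sizes are assumptions chosen only to illustrate the ‘at most 4 instances per node’ idea (the real limit depends on node size):

```yaml
env:
  - name: DBHOST
    value: "cbio-db-service"          # Service name of the database Pod
  - name: ID
    value: "tmplist-123"
  - name: MOVE
    value: "/instance-a1b2c3"         # public path of the instance
  - name: URL
    value: "https://cbio.example.org/instance-a1b2c3"   # must match MOVE
  - name: DATAHUB
    valueFrom:
      configMapKeyRef:
        name: datahub-config          # hypothetical configMap name
        key: datahub
resources:
  requests:
    cpu: "1"
    memory: 4Gi
```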
cBioPortal database
Kubernetes stack:
- ReplicaSet to keep the Pod up.
- Service to route traffic to this Pod.
No persistent storage is used; if the Pod goes down, the data are gone (but these are only cBioPortal cache data). See the sketch below.
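The ‘no persistent storage’ point corresponds to an ephemeral volume; a sketch (image, paths and the emptyDir choice are assumptions):

```yaml
# Fragment of the database Pod spec: data live in an emptyDir volume,
# so they disappear together with the Pod.
containers:
  - name: cbio-db
    image: mysql:5.7                  # hypothetical image/tag
    volumeMounts:
      - name: db-data
        mountPath: /var/lib/mysql
volumes:
  - name: db-data
    emptyDir: {}                      # ephemeral; gone when the Pod goes away
```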
Labels
Each instance has the following labels:
- app -> cbioDB
- type -> ondemand
- user -> user id from dataportal
- instance -> unique ID, which is the name of the identifier (a Custom object in Kubernetes created for every instance).
Special configuration
- podAntiAffinity -> This ensures that the Pods of instances are spread evenly across nodes (preferred, not required).
- nodeAffinity -> This ensures that the Pods of instances are placed on nodes that have the label ‘edirex=cbio’.
- tolerations -> This ensures that the Pods of instances can be placed on nodes tainted with ‘edirex=cbio’ and effect ‘NoSchedule’.
- ENV refers to a configMap that describes our configuration for MySQL (including the mount configuration).
- Resource requests are set to ensure node stability (overload problem). A sketch of these last two points follows this list.
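A sketch of the configMap-driven MySQL configuration and the resource requests; all names and sizes are assumptions:

```yaml
containers:
  - name: cbio-db
    image: mysql:5.7
    # Environment taken from the MySQL configMap...
    envFrom:
      - configMapRef:
          name: cbio-db-config        # hypothetical configMap name
    # ...and the same configMap mounted as a MySQL config file.
    volumeMounts:
      - name: mysql-conf
        mountPath: /etc/mysql/conf.d
    resources:
      requests:
        cpu: "500m"
        memory: 2Gi                   # placeholder sizes
volumes:
  - name: mysql-conf
    configMap:
      name: cbio-db-config
```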