
How To

Delete a specific instance

Instance objects are identified by the label instance. You can get the instance label for pod cbio-on-demand48zbz-89k7k as follows:

kubectl get po cbio-on-demand48zbz-89k7k -n cbio-on-demand -o yaml|grep instance
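
If you only need the label value, jsonpath prints it directly (a sketch; the label key instance matches the grep above):

kubectl get po cbio-on-demand48zbz-89k7k -n cbio-on-demand -o jsonpath='{.metadata.labels.instance}'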

To remove a whole instance, two ReplicaSets, two Services and one CronJob have to be deleted. First, check what exists for the label instance=codcpcn6:

kubectl get all -n cbio-on-demand -l instance=codcpcn6

Now we can delete it (but be careful!):

kubectl delete rs -n cbio-on-demand -l instance=codcpcn6

kubectl delete cronjob -n cbio-on-demand -l instance=codcpcn6

kubectl delete service -n cbio-on-demand -l instance=codcpcn6
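
The three deletions can also be combined into one command by listing the resource types together (same namespace and label selector as above):

kubectl delete rs,cronjob,service -n cbio-on-demand -l instance=codcpcn6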


(Note: the utilities delpods.sh and showinfo.sh can be used for this; see gitlab.ics.muni.cz/europdx/k8s-utils.)

Delete Completed pods

Show completed pods:

kubectl get pod --field-selector=status.phase==Succeeded -n cbio-on-demand-beta

Delete completed pods:

kubectl delete pod --field-selector=status.phase==Succeeded -n cbio-on-demand-beta

Problem Solving

Pod network is down

It is important to distinguish between Pod network and Host network problems/outages.

  1. Try to exec into the pod and check whether you can ping e.g. is.muni.cz, then try to curl it. If ping succeeds but curl does not, it is probably an MTU problem. It can be related to the Pod network or to the Host network, so repeat the same check on the host the pod is located on (a quick MTU check is sketched after this list). See Fix for pod network.

  2. If kubectl is not working and the first point is not the problem: all kubectl traffic is proxied through Rancher, so check that rancher.edirex.ics.muni.cz is available and that nginx is running on the server (a quick check is sketched after this list).

  3. You can use tcpdump/wireshark on the Host to help identify the problem.
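
A quick way to confirm the MTU suspicion from step 1 is a do-not-fragment ping (a sketch using Linux iputils ping; 1472 bytes of payload plus 28 bytes of headers add up to the usual 1500-byte Ethernet MTU). Run it inside the pod via kubectl exec and again on the host the pod is located on:

ping -c 3 -M do -s 1472 is.muni.cz

If this fails while a plain ping succeeds, lower -s until it passes; the passing size plus 28 is the effective path MTU.

For step 2, the availability of Rancher and its nginx can be checked for example like this (the second command is meant to be run on the Rancher server itself):

curl -kI https://rancher.edirex.ics.muni.cz

sudo systemctl status nginx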

! WARNING ! DON'T PING A SERVICE !

cBioOnDemand subsystem is down

  1. Check that none of the previous issues applies.
  2. Check if OpenStack is running.
  3. Check the availability of the API and the proxy. With kubectl -n cbio-on-demand get po you should see something like this:

    cbio-api-8569bf78bd-s59bg 1/1 Running 0 25d

    cbio-proxy-6944f7587b-zbcw8 1/1 Running 0 25d

If either of these two pods is not available, the system components of cBioOnDemand are not deployed or have failed.

Instance of cBio is not available

  1. Each cBio instance consists of an application pod and a database pod. Check that both pods are running with:

kubectl -n cbio-on-demand get all -l 'instance=codpriklad'

Here codpriklad is the instance ID, which can be found in the user's dashboard.

  2. If both pods are up and running, check inside the proxy container whether there is a routing rule in the Apache configuration:
    1. kubectl -n cbio-on-demand exec -ti cbio-proxy-6944f7587b-zbcw8 bash
    2. cd /etc/apache2/sites-enabled/routes/
    3. Find the one which routes to your service name
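
The three steps above can also be collapsed into a single grep run from outside the container (a sketch; codpriklad stands for the instance/service name you are looking for, and the proxy pod name is the one from the previous section):

kubectl -n cbio-on-demand exec cbio-proxy-6944f7587b-zbcw8 -- grep -ril codpriklad /etc/apache2/sites-enabled/routes/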

Kubectl problem with x509

This is most likely caused by a bad certificate; the workaround is to delete the “certificate-authority-data” entry from the kube config.
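
One way to apply that workaround without editing ~/.kube/config by hand (a sketch; <cluster-name> is a placeholder for the cluster entry listed by kubectl config get-clusters):

kubectl config unset clusters.<cluster-name>.certificate-authority-data

If kubectl then complains about an unknown certificate authority, TLS verification can additionally be skipped for that cluster:

kubectl config set-cluster <cluster-name> --insecure-skip-tls-verify=true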

Expired certificate in Rancher

According to https://github.com/rancher/rancher/issues/26984, the following steps should help.

sudo timedatectl set-ntp off
sudo date --set="2020-07-11 09:03:00.000"
sudo docker exec -it rancher sh -c "rm /var/lib/rancher/management-state/tls/token-node.crt; rm /var/lib/rancher/management-state/tls/localhost.crt"
sudo timedatectl set-ntp on
sudo docker restart rancher
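
(The date in the second command is taken from the linked issue as an example; presumably it should be set to a time at which the expired certificates were still valid, so that removing them and restarting Rancher lets it regenerate working certificates.)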