Delete a certain instance
Instance objects are identified by the instance label. You can get the instance label for pod cbio-on-demand48zbz-89k7k with the following:
kubectl get po cbio-on-demand48zbz-89k7k -n cbio-on-demand -o yaml | grep instance
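Alternatively, a jsonpath query prints just the label value without the surrounding YAML (assuming the label key is exactly "instance"):

```shell
# print only the value of the "instance" label for this pod
kubectl get po cbio-on-demand48zbz-89k7k -n cbio-on-demand \
  -o jsonpath='{.metadata.labels.instance}'
```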
To remove a whole instance, two ReplicaSets, two Services, and one CronJob have to be deleted. First, check what exists for the label instance=codcpcn6:
kubectl get all -n cbio-on-demand -l instance=codcpcn6
Now we can delete it (but be careful!):
kubectl delete rs -n cbio-on-demand -l instance=codcpcn6
kubectl delete cronjob -n cbio-on-demand -l instance=codcpcn6
kubectl delete service -n cbio-on-demand -l instance=codcpcn6
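The three delete commands above can also be issued as a single command, since kubectl accepts a comma-separated list of resource types:

```shell
# delete the ReplicaSets, CronJob, and Services of the instance in one go
kubectl delete rs,cronjob,service -n cbio-on-demand -l instance=codcpcn6
```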
Delete Completed pods
Show completed pods:
kubectl get pod --field-selector=status.phase==Succeeded -n cbio-on-demand-beta
Delete completed pods:
kubectl delete pod --field-selector=status.phase==Succeeded -n cbio-on-demand-beta
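To preview what the delete would remove without actually touching anything, kubectl supports a client-side dry run:

```shell
# print the pods that would be deleted, but do not delete them
kubectl delete pod --field-selector=status.phase==Succeeded \
  -n cbio-on-demand-beta --dry-run=client
```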
Pod network is down
It is important to distinguish between Pod network and Host network problems/outages.
Try to exec into the pod and see whether you can ping e.g. is.muni.cz. Then try to curl it. If ping passes and curl does not, it is probably an MTU problem. The cause can be in the Pod network or on the Host, so repeat the same test on the host the pod is located on. Fix for pod network
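A quick way to confirm an MTU issue is to send non-fragmentable pings of decreasing size: a 1472-byte payload plus 28 bytes of headers fills a standard 1500-byte MTU, so if it fails while a smaller payload succeeds, the path MTU is lower than expected. A sketch using Linux ping, run from inside the pod and again on the host:

```shell
# 1472 bytes of data + 28 bytes of headers = 1500; -M do forbids fragmentation
ping -c 3 -M do -s 1472 is.muni.cz
# if the above fails, retry with a smaller payload to find the working size
ping -c 3 -M do -s 1400 is.muni.cz
```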
If kubectl is not working and the first point is not the problem: all kubectl traffic is proxied through Rancher. Check that rancher.edirex.ics.muni.cz is available and that nginx is running on the server.
You can use tcpdump/wireshark on Host to help you identify the problem.
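For example, capturing traffic for a pod's IP on the host shows whether packets leave the node at all (the pod IP below is a hypothetical placeholder; get the real one with kubectl get po -o wide):

```shell
# capture on all interfaces, numeric output, filtered to the pod's IP
tcpdump -i any -nn host 10.42.1.23
```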
! WARNING ! DON'T PING A SERVICE ! Service ClusterIPs are virtual and typically do not answer ICMP, so a failed ping proves nothing.
cBioOnDemand subsystem is down
- Check that none of the previous points is the problem.
- Check that OpenStack is running.
Check availability of the API and proxy. With
kubectl -n cbio-on-demand get po
you should see something like this:
cbio-api-8569bf78bd-s59bg 1/1 Running 0 25d
cbio-proxy-6944f7587b-zbcw8 1/1 Running 0 25d
If any of these two pods is not available, then the system components of cBioOnDemand are not deployed or have failed.
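If one of them is missing or failing, the deployment's events and recent logs usually reveal the cause. A sketch, assuming the deployments are named after the pods above (cbio-api, cbio-proxy):

```shell
# inspect events and rollout state of the system deployments
kubectl -n cbio-on-demand describe deploy cbio-api cbio-proxy
# tail the API's recent log output
kubectl -n cbio-on-demand logs deploy/cbio-api --tail=50
```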
Instance of cBio is not available
- Each cBio instance consists of an application pod and a database pod. Check that both pods are running with:
kubectl -n cbio-on-demand get all -l 'instance=codpriklad'.
codpriklad is the instance ID, which can be found in the user’s dashboard.
- If both pods are up and running, check inside the proxy container whether there is a routing rule in the Apache configuration:
kubectl -n cbio-on-demand exec -ti cbio-proxy-6944f7587b-zbcw8 bash
- Find the rule that routes to your service name.
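Instead of browsing the configuration by hand, the routing rule can be located with a grep run inside the container. The Apache config path below is an assumption and may differ in this image:

```shell
# search the Apache config for the instance's service name (path assumed)
kubectl -n cbio-on-demand exec cbio-proxy-6944f7587b-zbcw8 -- \
  grep -r codpriklad /etc/apache2/
```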
Kubectl problem with x509
This is probably caused by a bad certificate; the workaround is to delete the “certificate-authority-data” entry from the kubeconfig.
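The entry can be removed with kubectl itself rather than by editing the file by hand. The cluster name below is a placeholder; list the real names first. Note that this weakens TLS validation for that cluster, so restore a correct CA when possible:

```shell
# list configured cluster names, then unset the stale embedded CA data
kubectl config get-clusters
kubectl config unset clusters.my-cluster.certificate-authority-data
```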