Exercise 5: Pod scheduling issue
You recently heard in the news that a young kid managed to hit a theoretical Tetris high score that triggers a bug. This reminds you of the good old days, and you have created all the resources required to play Tetris on your cluster (only during lunch breaks). But something is still not working, and your pod doesn't deploy.
Go, fix it, and slam that high score. Good luck!
What's going on?
You head straight to the exercise05-userXX namespace, pick the "Developer perspective", and check the topology view. You see this light blue deployment, and as an experienced OpenShift user, you know it should be dark blue!
Poke around, click, explore, and fix it!
Note: If you try to use a "lazy" shortcut solution, you may hit an error from a custom webhook admission controller.
Hints
Hint 1
How to quickly see the pod status?
You probably already found that the Pod is unschedulable.
The message indicates that it has something to do with the pod's node affinity/selector.
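For readers who prefer to see where that status lives, here is a minimal sketch that reads it from a pod object loaded as a Python dict. It relies on the standard `PodScheduled` condition with reason `Unschedulable`; the sample pod and the exact wording of its message are assumptions made for illustration, not output captured from this exercise.

```python
from typing import Optional


def unschedulable_message(pod: dict) -> Optional[str]:
    """Return the scheduler's message if the pod is unschedulable, else None."""
    for cond in pod.get("status", {}).get("conditions", []):
        if (
            cond.get("type") == "PodScheduled"
            and cond.get("status") == "False"
            and cond.get("reason") == "Unschedulable"
        ):
            return cond.get("message")
    return None


# Hypothetical pod status, shaped like what the console shows in the YAML tab.
sample_pod = {
    "status": {
        "phase": "Pending",
        "conditions": [
            {
                "type": "PodScheduled",
                "status": "False",
                "reason": "Unschedulable",
                "message": "0/3 nodes are available: 3 node(s) didn't match "
                           "Pod's node affinity/selector.",
            }
        ],
    }
}

print(unschedulable_message(sample_pod))
```

The same fields appear under `status.conditions` in the pod's YAML, so the check can also be done by eye.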
Hint 2
How to check the pod events?
You probably had a quick look at the pod events, to see if you get more information.
Nothing new. Keep digging!
Hint 3
How to check the pod status/message?
The source of truth is often in the YAML itself, so you probably decided to check the current YAML of the deployment.
Nothing new. Keep digging!
Hint 4
How to check pod's node affinity?
Let's first check whether any affinity is set up on this pod, and whether that is what prevents the scheduling. There is no convenient way to check that (that I know of), so we should check the YAML file, e.g. with the search feature.
No affinity issue, moving on to the pod's node selector, maybe?
Hint 5
How to check pod's node selector?
The pod's node selector can be viewed in the Details view of the Pod, the ReplicaSet, or the Deployment.
In the image below, we check from the Pod details view.
So yes, a nodeSelector is set. But which node(s) does it select, exactly?
There is this nice link to trigger a search on the node selector label.
Mmmm, no nodes found with this label. No wonder it's not working. It's probably a good time to have a look at the nodeSelector documentation here: https://docs.openshift.com/container-platform/4.15/nodes/scheduling/nodes-scheduler-node-selectors.html
You may be tempted to simply delete the nodeSelector section of the pod (in fact, it should be done in the template section of the Deployment, to make sure future pods are also affected by your change). However, I added a validating webhook that prevents any pod without a nodeSelector from being created in this namespace. Just to make sure you find a proper solution...
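The empty search result follows from how a nodeSelector is evaluated: a node is eligible only if its labels contain every key/value pair of the selector. The sketch below illustrates that rule; the node names, their labels, and the `example.com/game` selector are all hypothetical and are not the values used in this exercise.

```python
def node_matches(node_labels: dict, node_selector: dict) -> bool:
    """A node matches when every selector key/value pair is among its labels."""
    return all(node_labels.get(key) == value for key, value in node_selector.items())


def matching_nodes(nodes: dict, node_selector: dict) -> list:
    """Return the sorted names of the nodes that satisfy the selector."""
    return sorted(
        name for name, labels in nodes.items() if node_matches(labels, node_selector)
    )


# Hypothetical cluster: node name -> labels.
nodes = {
    "worker-0": {"kubernetes.io/os": "linux", "kubernetes.io/arch": "amd64"},
    "worker-1": {"kubernetes.io/os": "linux", "kubernetes.io/arch": "arm64"},
}

# A selector whose label exists on no node: the pod stays Pending.
print(matching_nodes(nodes, {"example.com/game": "tetris"}))  # []

# A selector using a label that the nodes actually carry.
print(matching_nodes(nodes, {"kubernetes.io/os": "linux"}))  # ['worker-0', 'worker-1']
```

Adding a second pair to the selector narrows the result further, since all pairs must match on the same node.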
Hint 6
How to check labels on node?
If your role has the appropriate permissions to get & list nodes (which is the case today), you should be able to see the nodes from the Administrator perspective.
And then you should be able to see the Details view of the node, which includes its labels.
You might find it more convenient to look at the node labels in the node's YAML section.
Enough hints for today. You should be able to fix it; otherwise, have a look at the solution!
Solution
You probably understood that you need to specify a nodeSelector using a label from (one of) the actual OpenShift nodes. From Hint 6, you probably picked a label of your choice (it doesn't matter which one for the purpose of this exercise). Now, all you have to do is edit the nodeSelector of your deployment.
Warning: If you edit the nodeSelector of the pod itself, be aware that if the pod is deleted, or if it crashes, the replacement pod generated by the Tetris Deployment won't have your nodeSelector change. You really need to edit the deployment instead, in the pod template section:
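As a rough sketch of what that edit amounts to, the snippet below builds a JSON merge patch targeting `spec.template.spec.nodeSelector` of a Deployment. The label keys and values are hypothetical placeholders, and `node_selector_patch` is a helper invented for this illustration. In a merge patch, setting a key to `null` removes it, which is how the unmatched label is dropped while the new one is added.

```python
import json


def node_selector_patch(old_key: str, new_key: str, new_value: str) -> dict:
    """Build a merge patch that swaps one nodeSelector label for another."""
    selector = {new_key: new_value}
    if old_key != new_key:
        # None is serialized as JSON null, which deletes the key in a merge patch.
        selector[old_key] = None
    # The selector lives in the pod template, so future pods inherit it.
    return {"spec": {"template": {"spec": {"nodeSelector": selector}}}}


# Hypothetical labels: replace an unmatched label with one the nodes carry.
patch = node_selector_patch("example.com/game", "kubernetes.io/os", "linux")
print(json.dumps(patch))
```

If you prefer the command line to the console, the printed JSON could be passed to `oc patch deployment <deployment-name> -n exercise05-userXX --type=merge -p '<json>'`, where the deployment name is left as a placeholder.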
That's it, you should be able to enjoy a good old Tetris game. Go back to the topology view, click on the route, and have fun.
Did you know?
Shortcut when editing YAML files from the console
OpenShift provides some editing capabilities, including shortcuts. You can easily access the main keyboard shortcuts by clicking on the link...
... and you can access a more advanced list of commands via the F1 key. (Make sure your mouse cursor is inside the YAML file, so that the OpenShift console intercepts the shortcut, and not your browser or your operating system.)
Tooltip when working with YAML files from the console
OpenShift may display useful tooltips when you hover your mouse over some YAML content. Make sure to enable tooltips when needed. And when the popups get in the way of your editing, remember that you can simply disable them.
Webhook admission plugins
For the purpose of this exercise, a custom admission webhook has been built. It ensures that any pod in your exercise05-userXX namespace defines a nodeSelector. You may find this kind of approach useful in other contexts, for example if your organisation mandates that every resource carries a specific label such as "owner" or "team". But really, you can create your own validation rules to fit your needs, and reject resource creations or mutations that break those rules. If you want to know more, have a look at the documentation.
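To give an idea of what such a rule can look like, here is a minimal sketch of the decision logic of a validating webhook, written against the `admission.k8s.io/v1` AdmissionReview format. It is not the webhook used in this exercise: the function name, the rejection message, and the sample requests are assumptions, and the HTTPS server and webhook registration are left out.

```python
def review_pod(admission_review: dict) -> dict:
    """Allow the pod only if its spec defines a non-empty nodeSelector."""
    request = admission_review["request"]
    pod_spec = (request.get("object") or {}).get("spec", {})
    allowed = bool(pod_spec.get("nodeSelector"))

    # The response must echo the uid of the request it answers.
    response = {"uid": request["uid"], "allowed": allowed}
    if not allowed:
        response["status"] = {
            "code": 403,
            "message": "pods in this namespace must define a nodeSelector",
        }
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }


# Hypothetical requests: one pod with a nodeSelector, one without.
good = {"request": {"uid": "123", "object": {"spec": {"nodeSelector": {"kubernetes.io/os": "linux"}}}}}
bad = {"request": {"uid": "456", "object": {"spec": {}}}}

print(review_pod(good)["response"]["allowed"])  # True
print(review_pod(bad)["response"]["allowed"])   # False
```

The same shape works for a rule such as "every resource must carry an owner label": only the condition computing `allowed` changes.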
Take away
Node affinity, anti-affinity, and nodeSelector are ways to place pods on specific nodes. This is typically used when a pod needs to run on nodes with a specific architecture. It's also a way to provide workload isolation where needed, for example with a project that only uses a given set of nodes.
- Use the search node option by label to see which node has a specific label
- If the nodeSelector has no match, the pod won't be scheduled. If you need high availability, make sure you have more than one node with the label you use in your nodeSelector.
- When using deployments, edit your nodeSelector in your Deployment (template section), not your pod.