Exercise 5

Pod scheduling issue

Authors
  • Name
    Twitter

You recently heard in the news that a young kid managed to hit a theoretical Tetris high score that triggers a bug. This reminds you of the good old days, and you have created all the resources required to play Tetris on your cluster (only during lunch breaks). But something is still not working, and your pod doesn't deploy.

Go, fix it, and slam that high score. Good luck!

What's going on?

You head straight to the exercise05-userXX namespace, pick the "Developer perspective", and check the topology view. You see a light blue deployment, and as an experienced OpenShift user, you know it should be dark blue!

topology

Poke around, click, explore, and fix it!

Note: If you try to use a "lazy" shortcut solution, you may hit an error from a custom webhook admission controller.

Hints

Hint 1

How to quickly see the pod status?

You probably already found that the Pod is unschedulable.

hint1

The message indicates that it has something to do with the pod's node affinity/selector.
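If you prefer to read the same information outside the console, the scheduler's verdict lives in the pod's status conditions. The snippet below is a minimal sketch, assuming you exported the pod as JSON (for example with `oc get pod <pod-name> -o json`) and loaded it into a dict. The pod name and the message text are made-up sample data, not taken from the exercise cluster.

```python
from typing import Optional


def unschedulable_message(pod: dict) -> Optional[str]:
    """Return the scheduler's message if the pod is marked Unschedulable, else None."""
    for cond in pod.get("status", {}).get("conditions", []):
        if (
            cond.get("type") == "PodScheduled"
            and cond.get("status") == "False"
            and cond.get("reason") == "Unschedulable"
        ):
            return cond.get("message")
    return None


# Hypothetical pod, trimmed to the fields this sketch reads.
pod = {
    "metadata": {"name": "tetris-example"},
    "status": {
        "phase": "Pending",
        "conditions": [
            {
                "type": "PodScheduled",
                "status": "False",
                "reason": "Unschedulable",
                "message": "0/3 nodes are available: 3 node(s) didn't match "
                           "Pod's node affinity/selector.",
            }
        ],
    },
}

print(unschedulable_message(pod))
```

A pod that is scheduled normally has no such condition, so the function returns None for it.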

Hint 2

How to check the pod events?

You probably had a quick look at the pod events, to see if you get more information.

hint2

Nothing new. Keep digging!

Hint 3

How to check the pod status/message?

The source of truth is often in the YAML itself, so you probably decided to check the current YAML of the deployment.

hint3

Nothing new. Keep digging!

Hint 4

How to check pod's node affinity?

Let's first check whether any affinity is set up on this pod, and whether that is what prevents scheduling. There is no convenient way to check that (that I know of), so we should check the YAML, for example with the search feature.

hint4

No affinity issue, moving on to the pod's node selector, maybe?
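As an alternative to searching the YAML by hand, here is a small sketch that pulls the two scheduling-related fields out of a pod manifest. It assumes the pod was exported and loaded into a dict. The pod spec and the label `example.com/arcade` are hypothetical and only serve as an illustration.

```python
def scheduling_constraints(pod: dict) -> dict:
    """Extract the fields that restrict which nodes a pod may land on."""
    spec = pod.get("spec", {})
    return {
        "affinity": spec.get("affinity"),          # None when no affinity is set
        "nodeSelector": spec.get("nodeSelector"),  # None when no selector is set
    }


# Hypothetical pod spec, trimmed to what matters here.
pod = {
    "spec": {
        "containers": [{"name": "tetris", "image": "example/tetris:latest"}],
        "nodeSelector": {"example.com/arcade": "true"},
    }
}

print(scheduling_constraints(pod))
```

With this sample, `affinity` comes back as None while `nodeSelector` is populated, which mirrors what the search in the YAML shows.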

Hint 5

How to check pod's node selector?

The pod's node selector can be viewed in the Details view of the Pod, the ReplicaSet, or the Deployment.

In the image below, we check from the Pod details view.

hint5

So yes, a nodeSelector is set. But which node(s) does it select, exactly?

There is this nice link to trigger a search on the node selector label.

hint5

Mmmm, no nodes with this label were found. No wonder it's not working. It's probably a good time to have a look at the nodeSelector documentation: https://docs.openshift.com/container-platform/4.15/nodes/scheduling/nodes-scheduler-node-selectors.html
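To make the matching rule concrete: a node is eligible only if it carries every key/value pair listed in the nodeSelector. The sketch below simulates that rule in plain Python. The node names and the `example.com/arcade` label are hypothetical; the `kubernetes.io/*` labels are the well-known ones that nodes usually carry.

```python
def matches(node_labels: dict, node_selector: dict) -> bool:
    """A node matches when it has every key/value pair of the selector."""
    return all(node_labels.get(key) == value for key, value in node_selector.items())


def matching_nodes(nodes: dict, node_selector: dict) -> list:
    """Return the sorted names of the nodes that satisfy the selector."""
    return sorted(name for name, labels in nodes.items() if matches(labels, node_selector))


# Hypothetical cluster: node name -> labels.
nodes = {
    "worker-0": {"kubernetes.io/os": "linux", "kubernetes.io/arch": "amd64"},
    "worker-1": {"kubernetes.io/os": "linux", "kubernetes.io/arch": "arm64"},
}

broken_selector = {"example.com/arcade": "true"}  # no node carries this label
working_selector = {"kubernetes.io/os": "linux"}  # every node carries this one

print(matching_nodes(nodes, broken_selector))
print(matching_nodes(nodes, working_selector))
```

An empty result for a selector is exactly the situation in this exercise: the scheduler has nowhere to place the pod.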

You may be tempted to simply delete the nodeSelector section of the pod (in fact, this should be done in the template section of the Deployment, to make sure future pods are also affected by your change). However, I added a validating webhook that prevents any pod without a nodeSelector from being created in this namespace. Just to make sure you find a proper solution...

Hint 6

How to check labels on node?

If your role has the appropriate permissions to get and list nodes (which is the case today), you should be able to see the nodes from the Administrator perspective.

hint6

And then you should be able to see the Details view of the node, which includes its labels.

hint7

You might find it more convenient to look at the node labels in the YAML tab of the node.

Enough hints for today. You should be able to fix it; otherwise, have a look at the solution!

Solution

You probably understood that you need to specify a nodeSelector using a label from (at least one of) the actual OpenShift nodes. From Hint 6, you probably picked a label of your choice (it doesn't matter which one for the purpose of this exercise). Now, all you have to do is edit the nodeSelector of your deployment.

Warning: If you edit the nodeSelector of the pod itself, be aware that if the pod is deleted, or if it crashes, the replacement pod created by the Tetris Deployment won't have your updated nodeSelector. You really need to edit the Deployment instead, in the pod template section:

solution
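If you would rather script the change than click through the console, the same edit can be expressed as a merge patch on the Deployment. The sketch below only builds the patch body. It assumes a hypothetical old label `example.com/arcade` and picks `kubernetes.io/os: linux` as the replacement; use whichever label you found on your nodes. Keep in mind that a merge patch merges maps key by key, so the stale key has to be set to null explicitly, otherwise it would stay in the nodeSelector.

```python
import json


def node_selector_patch(old_selector: dict, new_selector: dict) -> dict:
    """Build a merge patch replacing the nodeSelector in a Deployment's pod template."""
    # Keys that should disappear are set to None (JSON null), which tells
    # a merge patch to delete them instead of leaving them untouched.
    selector = {key: None for key in old_selector if key not in new_selector}
    selector.update(new_selector)
    return {"spec": {"template": {"spec": {"nodeSelector": selector}}}}


patch = node_selector_patch(
    {"example.com/arcade": "true"},   # hypothetical selector that matches no node
    {"kubernetes.io/os": "linux"},    # label assumed to exist on the nodes
)
print(json.dumps(patch))
```

The printed JSON could then be passed to something like `oc patch deployment <deployment-name> --type=merge -p '<json>'`. Because the patch targets the pod template, every pod created afterwards picks up the new selector.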

That's it, you should be able to enjoy a good old Tetris game. Go back to the topology view, click on the route, and have fun.

solution

Did you know?

Shortcut when editing YAML files from the console

OpenShift provides some editing capabilities, including shortcuts. You can easily view some keyboard shortcuts by clicking on the link...

duk1

... and you can access a more advanced list of commands by pressing F1. (Make sure your mouse cursor is inside the YAML editor, so that the OpenShift console intercepts the shortcut rather than your browser or your operating system.)

duk1

Tooltip when working with YAML files from the console

OpenShift may display useful tooltips when you hover your mouse over some YAML content. Make sure to enable tooltips when needed. And when the tooltip popups get in the way of your editing, remember that you can simply disable them.

duk2

Webhook admission plugins

For the purpose of this exercise, a custom admission webhook has been built. It ensures that any pod in your exercise05-userXX namespace defines a nodeSelector. You may find this kind of approach useful in other contexts, for example if your organisation mandates that every resource has a specific label such as "owner" or "team". But really, you can create your own validation rules to fit your needs, and reject any resource creation or mutation that breaks those rules. If you want to know more, have a look at the documentation.

https://docs.openshift.com/container-platform/4.14/architecture/admission-plug-ins.html#admission-webhooks-about_admission-plug-ins
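To give an idea of what such a rule looks like, here is a minimal sketch of the decision logic of a validating webhook. It is not the webhook used in this exercise. The function name `review` and the rejection message are assumptions, and a real webhook would also need an HTTPS server and a ValidatingWebhookConfiguration pointing at it. The request and response follow the `admission.k8s.io/v1` AdmissionReview shape.

```python
def review(admission_review: dict) -> dict:
    """Allow a pod only if its spec defines a non-empty nodeSelector."""
    request = admission_review["request"]
    # "object" can be null for some operations (e.g. DELETE), hence the fallback.
    pod_spec = (request.get("object") or {}).get("spec", {})
    allowed = bool(pod_spec.get("nodeSelector"))

    response = {"uid": request["uid"], "allowed": allowed}
    if not allowed:
        response["status"] = {
            "code": 403,
            "message": "pods in this namespace must define a nodeSelector",
        }
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }


# Hypothetical request for a pod whose nodeSelector was removed.
incoming = {
    "apiVersion": "admission.k8s.io/v1",
    "kind": "AdmissionReview",
    "request": {
        "uid": "hypothetical-uid-1234",
        "object": {"kind": "Pod", "spec": {"containers": [{"name": "tetris"}]}},
    },
}

print(review(incoming)["response"])
```

The response must echo the `uid` of the request. When `allowed` is false, the API server rejects the pod and shows the message to the user, which is the error you hit with the "lazy" shortcut.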

Take away

Node affinity, anti-affinity, and nodeSelector are ways to place pods on specific nodes. This is typically used when a pod needs to run on nodes with a specific architecture. It's also a way to provide workload isolation where needed, for example with a project that only uses a given set of nodes.

  • Use the option to search nodes by label to see which nodes have a specific label.
  • If the nodeSelector has no match, the pod won't be scheduled. If you need high availability, make sure you have more than one node with the label you use in your nodeSelector.
  • When using deployments, edit your nodeSelector in your Deployment (template section), not your pod.