TKG Autoscaler
Overview
From the official TKG documentation page:
Cluster Autoscaler is a Kubernetes program that automatically scales Kubernetes clusters depending on the demands on the workload clusters. Use Cluster Autoscaler only for workload clusters deployed by a standalone management cluster.
OK, let's try this out then.
Enable Cluster Autoscaler
One of the prerequisites is a TKG standalone management cluster, which I already have deployed and running. For a workload cluster to be able to use Cluster Autoscaler, I need to enable it by adding some parameters to the cluster deployment manifest. The following are the autoscaler-relevant variables; some are required, some are optional, and they are only valid in a workload cluster deployment manifest. According to the official documentation, the only supported way to enable the autoscaler is when provisioning a new workload cluster.
- ENABLE_AUTOSCALER: "true" #Required if you want to enable the autoscaler
- AUTOSCALER_MAX_NODES_TOTAL: "0" #Optional
- AUTOSCALER_SCALE_DOWN_DELAY_AFTER_ADD: "10m" #Optional
- AUTOSCALER_SCALE_DOWN_DELAY_AFTER_DELETE: "10s" #Optional
- AUTOSCALER_SCALE_DOWN_DELAY_AFTER_FAILURE: "3m" #Optional
- AUTOSCALER_SCALE_DOWN_UNNEEDED_TIME: "10m" #Optional
- AUTOSCALER_MAX_NODE_PROVISION_TIME: "15m" #Optional
- AUTOSCALER_MIN_SIZE_0: "1" #Required (if Autoscaler is enabled as above)
- AUTOSCALER_MAX_SIZE_0: "2" #Required (if Autoscaler is enabled as above)
- AUTOSCALER_MIN_SIZE_1: "1" #Required (if Autoscaler is enabled as above, and using prod template and TKG in multi-AZ)
- AUTOSCALER_MAX_SIZE_1: "3" #Required (if Autoscaler is enabled as above, and using prod template and TKG in multi-AZ)
- AUTOSCALER_MIN_SIZE_2: "1" #Required (if Autoscaler is enabled as above, and using prod template and TKG in multi-AZ)
- AUTOSCALER_MAX_SIZE_2: "4" #Required (if Autoscaler is enabled as above, and using prod template and TKG in multi-AZ)
Enable Autoscaler upon provisioning of a new workload cluster
Start by preparing a class-based YAML for the workload cluster. This involves adding the AUTOSCALER variables (above) to the TKG bootstrap YAML (the one used to deploy the TKG management cluster) and then generating a class-based cluster manifest for the new workload cluster. I will make a copy of my existing TKG bootstrap YAML file and name it something relevant to autoscaling. Then, in this file, I will add these variables:
#! ---------------
#! Workload Cluster Specific
#! -------------
ENABLE_AUTOSCALER: "true"
AUTOSCALER_MAX_NODES_TOTAL: "0"
AUTOSCALER_SCALE_DOWN_DELAY_AFTER_ADD: "10m"
AUTOSCALER_SCALE_DOWN_DELAY_AFTER_DELETE: "10s"
AUTOSCALER_SCALE_DOWN_DELAY_AFTER_FAILURE: "3m"
AUTOSCALER_SCALE_DOWN_UNNEEDED_TIME: "10m"
AUTOSCALER_MAX_NODE_PROVISION_TIME: "15m"
AUTOSCALER_MIN_SIZE_0: "1" #This will be used if not using availability zones. If using AZs this will count as zone 1 - required
AUTOSCALER_MAX_SIZE_0: "2" #This will be used if not using availability zones. If using AZs this will count as zone 1 - required
AUTOSCALER_MIN_SIZE_1: "1" #This will be used for availability zone 2
AUTOSCALER_MAX_SIZE_1: "3" #This will be used for availability zone 2
AUTOSCALER_MIN_SIZE_2: "1" #This will be used for availability zone 3
AUTOSCALER_MAX_SIZE_2: "4" #This will be used for availability zone 3
If not using TKG in a multi availability zone deployment, there is no need to add the lines AUTOSCALER_MIN_SIZE_1, AUTOSCALER_MAX_SIZE_1, AUTOSCALER_MIN_SIZE_2, and AUTOSCALER_MAX_SIZE_2, as these are only used for the additional zones you have configured. For a "no AZ" deployment, AUTOSCALER_MIN_SIZE_0/AUTOSCALER_MAX_SIZE_0 is sufficient.
After the above has been added I will do a "--dry-run" to create my workload cluster class-based yaml file:
andreasm@tkg-bootstrap:~$ tanzu cluster create tkg-cluster-3-auto --namespace tkg-ns-3 --file tkg-mgmt-bootstrap-tkg-2.3-autoscaler.yaml --dry-run > tkg-cluster-3-auto.yaml
The above command gives the workload cluster the name tkg-cluster-3-auto in the namespace tkg-ns-3, using the modified TKG bootstrap file containing the autoscaler variables. The output is the class-based YAML I will use to create the cluster (assuming no errors during the dry-run). In my mgmt bootstrap I have defined different autoscaler min/max settings just to show that the settings can be differentiated per availability zone. According to the manual this should only be used on AWS, but in 2.3 multi-AZ is fully supported, so the docs have probably just not been updated yet. If I take a look at the class-based YAML:
    workers:
      machineDeployments:
      - class: tkg-worker
        failureDomain: wdc-zone-2
        metadata:
          annotations:
            cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size: "2"
            cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "1"
            run.tanzu.vmware.com/resolve-os-image: image-type=ova,os-name=ubuntu
        name: md-0
        strategy:
          type: RollingUpdate
      - class: tkg-worker
        failureDomain: wdc-zone-3
        metadata:
          annotations:
            cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size: "3"
            cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "1"
            run.tanzu.vmware.com/resolve-os-image: image-type=ova,os-name=ubuntu
        name: md-1
        strategy:
          type: RollingUpdate
      - class: tkg-worker
        failureDomain: wdc-zone-3
        metadata:
          annotations:
            cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size: "4"
            cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "1"
            run.tanzu.vmware.com/resolve-os-image: image-type=ova,os-name=ubuntu
        name: md-2
        strategy:
          type: RollingUpdate
---
I notice that it does take into consideration my different availability zones. Perfect.
Before I deploy my workload cluster, I will edit the manifest to only deploy worker nodes in my AZ zone 2, due to resource constraints in my lab and to make the demo a bit clearer (scaling up from one worker and back again). Then I will deploy the workload cluster.
andreasm@tkg-bootstrap:~$ tanzu cluster create --file tkg-cluster-3-auto.yaml
Validating configuration...
cluster class based input file detected, getting tkr version from input yaml
input TKR Version: v1.26.5+vmware.2-tkg.1
TKR Version v1.26.5+vmware.2-tkg.1, Kubernetes Version v1.26.5+vmware.2-tkg.1 configured
Now it is all about waiting... After the waiting period is done, it is time for some testing...
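While waiting, a couple of commands can be used to follow the progress and grab the kubeconfig once the cluster is ready. This is just a rough sketch using the cluster and namespace names from above; the admin context name assumes the usual TKG naming convention:
# check provisioning status from the management cluster context
tanzu cluster get tkg-cluster-3-auto --namespace tkg-ns-3
# when the cluster reports running, fetch the admin kubeconfig and switch context
tanzu cluster kubeconfig get tkg-cluster-3-auto --namespace tkg-ns-3 --admin
kubectl config use-context tkg-cluster-3-auto-admin@tkg-cluster-3-auto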
Enable Autoscaler on existing/running workload cluster
I already have a TKG workload cluster up and running, and I want to "post-enable" the autoscaler in this cluster. The cluster was deployed with ENABLE_AUTOSCALER: "false", and below is the class-based YAML manifest (no autoscaler variables):
    workers:
      machineDeployments:
      - class: tkg-worker
        failureDomain: wdc-zone-2
        metadata:
          annotations:
            run.tanzu.vmware.com/resolve-os-image: image-type=ova,os-name=ubuntu
        name: md-0
        replicas: 1
        strategy:
          type: RollingUpdate
The above class-based YAML was generated from my mgmt bootstrap YAML with the AUTOSCALER settings like this:
#! ---------------
#! Workload Cluster Specific
#! -------------
ENABLE_AUTOSCALER: "false"
AUTOSCALER_MAX_NODES_TOTAL: "0"
AUTOSCALER_SCALE_DOWN_DELAY_AFTER_ADD: "10m"
AUTOSCALER_SCALE_DOWN_DELAY_AFTER_DELETE: "10s"
AUTOSCALER_SCALE_DOWN_DELAY_AFTER_FAILURE: "3m"
AUTOSCALER_SCALE_DOWN_UNNEEDED_TIME: "10m"
AUTOSCALER_MAX_NODE_PROVISION_TIME: "15m"
AUTOSCALER_MIN_SIZE_0: "1"
AUTOSCALER_MAX_SIZE_0: "4"
AUTOSCALER_MIN_SIZE_1: "1"
AUTOSCALER_MAX_SIZE_1: "4"
AUTOSCALER_MIN_SIZE_2: "1"
AUTOSCALER_MAX_SIZE_2: "4"
If I check the autoscaler status:
andreasm@linuxvm01:~$ k describe cm -n kube-system cluster-autoscaler-status
Error from server (NotFound): configmaps "cluster-autoscaler-status" not found
Now, this cluster is in "serious" need of having the autoscaler enabled. So how do I do that? This step is most likely not officially supported. I will now go back to the TKG mgmt bootstrap YAML, enable the autoscaler, do a dry-run of the config and apply the new class-based YAML manifest. This is all done in the TKG mgmt cluster context.
andreasm@linuxvm01:~$ tanzu cluster create tkg-cluster-3-auto --namespace tkg-ns-3 --file tkg-mgmt-bootstrap-tkg-2.3-autoscaler-wld-1-zone.yaml --dry-run > tkg-cluster-3-auto-az.yaml
Before applying the new class-based manifest, I will edit out the unnecessary objects and just keep the updated settings relevant to the autoscaler; it could probably be reduced even further. See my YAML below:
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  annotations:
    osInfo: ubuntu,20.04,amd64
    tkg/plan: dev
  labels:
    tkg.tanzu.vmware.com/cluster-name: tkg-cluster-3-auto
  name: tkg-cluster-3-auto
  namespace: tkg-ns-3
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 100.96.0.0/11
    services:
      cidrBlocks:
      - 100.64.0.0/13
  topology:
    class: tkg-vsphere-default-v1.1.0
    controlPlane:
      metadata:
        annotations:
          run.tanzu.vmware.com/resolve-os-image: image-type=ova,os-name=ubuntu
      replicas: 1
    variables:
    - name: cni
      value: antrea
    - name: controlPlaneCertificateRotation
      value:
        activate: true
        daysBefore: 90
    - name: auditLogging
      value:
        enabled: false
    - name: podSecurityStandard
      value:
        audit: restricted
        deactivated: false
        warn: restricted
    - name: apiServerEndpoint
      value: ""
    - name: aviAPIServerHAProvider
      value: true
    - name: vcenter
      value:
        cloneMode: fullClone
        datacenter: /cPod-NSXAM-WDC
        datastore: /cPod-NSXAM-WDC/datastore/vsanDatastore-wdc-01
        folder: /cPod-NSXAM-WDC/vm/TKGm
        network: /cPod-NSXAM-WDC/network/ls-tkg-mgmt
        resourcePool: /cPod-NSXAM-WDC/host/Cluster-1/Resources
        server: vcsa.FQDN
        storagePolicyID: ""
        tlsThumbprint: F8:----:7D
    - name: user
      value:
        sshAuthorizedKeys:
        - ssh-rsa BBAAB3NzaC1yc2EAAAADAQABA------QgPcxDoOhL6kdBHQY3ZRPE5LIh7RWM33SvsoIgic1OxK8LPaiGEPaOfUvP2ki7TNHLxP78bPxAfbkK7llDSmOIWrm7ukwG4DLHnyriBQahLqv1Wpx4kIRj5LM2UEBx235bVDSve==
    - name: controlPlane
      value:
        machine:
          diskGiB: 20
          memoryMiB: 4096
          numCPUs: 2
    - name: worker
      value:
        machine:
          diskGiB: 20
          memoryMiB: 4096
          numCPUs: 2
    - name: controlPlaneZoneMatchingLabels
      value:
        region: k8s-region
        tkg-cp: allowed
    - name: security
      value:
        fileIntegrityMonitoring:
          enabled: false
        imagePolicy:
          pullAlways: false
          webhook:
            enabled: false
            spec:
              allowTTL: 50
              defaultAllow: true
              denyTTL: 60
              retryBackoff: 500
        kubeletOptions:
          eventQPS: 50
          streamConnectionIdleTimeout: 4h0m0s
        systemCryptoPolicy: default
    version: v1.26.5+vmware.2-tkg.1
    workers:
      machineDeployments:
      - class: tkg-worker
        failureDomain: wdc-zone-2
        metadata:
          annotations:
            cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size: "4"
            cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "1"
            run.tanzu.vmware.com/resolve-os-image: image-type=ova,os-name=ubuntu
        name: md-0
        strategy:
          type: RollingUpdate
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: tkg-cluster-3-auto-cluster-autoscaler
  name: tkg-cluster-3-auto-cluster-autoscaler
  namespace: tkg-ns-3
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tkg-cluster-3-auto-cluster-autoscaler
  template:
    metadata:
      labels:
        app: tkg-cluster-3-auto-cluster-autoscaler
    spec:
      containers:
      - args:
        - --cloud-provider=clusterapi
        - --v=4
        - --clusterapi-cloud-config-authoritative
        - --kubeconfig=/mnt/tkg-cluster-3-auto-kubeconfig/value
        - --node-group-auto-discovery=clusterapi:clusterName=tkg-cluster-3-auto,namespace=tkg-ns-3
        - --scale-down-delay-after-add=10m
        - --scale-down-delay-after-delete=10s
        - --scale-down-delay-after-failure=3m
        - --scale-down-unneeded-time=10m
        - --max-node-provision-time=15m
        - --max-nodes-total=0
        command:
        - /cluster-autoscaler
        image: projects.registry.vmware.com/tkg/cluster-autoscaler:v1.26.2_vmware.1
        name: tkg-cluster-3-auto-cluster-autoscaler
        volumeMounts:
        - mountPath: /mnt/tkg-cluster-3-auto-kubeconfig
          name: tkg-cluster-3-auto-cluster-autoscaler-volume
          readOnly: true
      serviceAccountName: tkg-cluster-3-auto-autoscaler
      terminationGracePeriodSeconds: 10
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
      - effect: NoSchedule
        key: node-role.kubernetes.io/control-plane
      volumes:
      - name: tkg-cluster-3-auto-cluster-autoscaler-volume
        secret:
          secretName: tkg-cluster-3-auto-kubeconfig
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  creationTimestamp: null
  name: tkg-cluster-3-auto-autoscaler-workload
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler-workload
subjects:
- kind: ServiceAccount
  name: tkg-cluster-3-auto-autoscaler
  namespace: tkg-ns-3
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  creationTimestamp: null
  name: tkg-cluster-3-auto-autoscaler-management
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler-management
subjects:
- kind: ServiceAccount
  name: tkg-cluster-3-auto-autoscaler
  namespace: tkg-ns-3
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tkg-cluster-3-auto-autoscaler
  namespace: tkg-ns-3
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler-workload
rules:
- apiGroups:
  - ""
  resources:
  - persistentvolumeclaims
  - persistentvolumes
  - pods
  - replicationcontrollers
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - update
  - watch
- apiGroups:
  - ""
  resources:
  - pods/eviction
  verbs:
  - create
- apiGroups:
  - policy
  resources:
  - poddisruptionbudgets
  verbs:
  - list
  - watch
- apiGroups:
  - storage.k8s.io
  resources:
  - csinodes
  - storageclasses
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - batch
  resources:
  - jobs
  verbs:
  - list
  - watch
- apiGroups:
  - apps
  resources:
  - daemonsets
  - replicasets
  - statefulsets
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - patch
- apiGroups:
  - ""
  resources:
  - configmaps
  verbs:
  - create
  - delete
  - get
  - update
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs:
  - create
  - get
  - update
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler-management
rules:
- apiGroups:
  - cluster.x-k8s.io
  resources:
  - machinedeployments
  - machines
  - machinesets
  verbs:
  - get
  - list
  - update
  - watch
  - patch
- apiGroups:
  - cluster.x-k8s.io
  resources:
  - machinedeployments/scale
  - machinesets/scale
  verbs:
  - get
  - update
- apiGroups:
  - infrastructure.cluster.x-k8s.io
  resources:
  - '*'
  verbs:
  - get
  - list
And now I will apply the above yaml on my running TKG workload cluster using kubectl (done from the mgmt context):
andreasm@linuxvm01:~$ kubectl apply -f tkg-cluster-3-enable-only-auto-az.yaml
cluster.cluster.x-k8s.io/tkg-cluster-3-auto configured
Warning: would violate PodSecurity "restricted:v1.24": allowPrivilegeEscalation != false (container "tkg-cluster-3-auto-cluster-autoscaler" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "tkg-cluster-3-auto-cluster-autoscaler" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "tkg-cluster-3-auto-cluster-autoscaler" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "tkg-cluster-3-auto-cluster-autoscaler" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
deployment.apps/tkg-cluster-3-auto-cluster-autoscaler created
clusterrolebinding.rbac.authorization.k8s.io/tkg-cluster-3-auto-autoscaler-workload created
clusterrolebinding.rbac.authorization.k8s.io/tkg-cluster-3-auto-autoscaler-management created
serviceaccount/tkg-cluster-3-auto-autoscaler created
clusterrole.rbac.authorization.k8s.io/cluster-autoscaler-workload unchanged
clusterrole.rbac.authorization.k8s.io/cluster-autoscaler-management unchanged
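Since the autoscaler for this workload cluster runs as a deployment in the management cluster (in the tkg-ns-3 namespace, as defined in the manifest above), a quick sanity check from the management cluster context could look something like this (a sketch, using the label and deployment name from my manifest):
# verify the autoscaler deployment and its pod are up in the mgmt cluster
kubectl get deployment,pods -n tkg-ns-3 -l app=tkg-cluster-3-auto-cluster-autoscaler
# tail the autoscaler logs to see it discovering the node group
kubectl logs -n tkg-ns-3 deployment/tkg-cluster-3-auto-cluster-autoscaler --tail=20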
Checking the autoscaler status now shows this:
andreasm@linuxvm01:~$ k describe cm -n kube-system cluster-autoscaler-status
Name:        cluster-autoscaler-status
Namespace:   kube-system
Labels:      <none>
Annotations: cluster-autoscaler.kubernetes.io/last-updated: 2023-09-11 10:40:02.369535271 +0000 UTC

Data
====
status:
----
Cluster-autoscaler status at 2023-09-11 10:40:02.369535271 +0000 UTC:
Cluster-wide:
  Health: Healthy (ready=2 unready=0 (resourceUnready=0) notStarted=0 longNotStarted=0 registered=2 longUnregistered=0)
    LastProbeTime:      2023-09-11 10:40:01.146686706 +0000 UTC m=+26.613355068
    LastTransitionTime: 2023-09-11 10:40:01.146686706 +0000 UTC m=+26.613355068
  ScaleUp: NoActivity (ready=2 registered=2)
    LastProbeTime:      2023-09-11 10:40:01.146686706 +0000 UTC m=+26.613355068
    LastTransitionTime: 2023-09-11 10:40:01.146686706 +0000 UTC m=+26.613355068
  ScaleDown: NoCandidates (candidates=0)
    LastProbeTime:      2023-09-11 10:40:01.146686706 +0000 UTC m=+26.613355068
    LastTransitionTime: 2023-09-11 10:40:01.146686706 +0000 UTC m=+26.613355068

NodeGroups:
  Name: MachineDeployment/tkg-ns-3/tkg-cluster-3-auto-md-0-s7d7t
  Health: Healthy (ready=1 unready=0 (resourceUnready=0) notStarted=0 longNotStarted=0 registered=1 longUnregistered=0 cloudProviderTarget=1 (minSize=1, maxSize=4))
    LastProbeTime:      2023-09-11 10:40:01.146686706 +0000 UTC m=+26.613355068
    LastTransitionTime: 2023-09-11 10:40:01.146686706 +0000 UTC m=+26.613355068
  ScaleUp: NoActivity (ready=1 cloudProviderTarget=1)
    LastProbeTime:      2023-09-11 10:40:01.146686706 +0000 UTC m=+26.613355068
    LastTransitionTime: 2023-09-11 10:40:01.146686706 +0000 UTC m=+26.613355068
  ScaleDown: NoCandidates (candidates=0)
    LastProbeTime:      2023-09-11 10:40:01.146686706 +0000 UTC m=+26.613355068
    LastTransitionTime: 2023-09-11 10:40:01.146686706 +0000 UTC m=+26.613355068


BinaryData
====

Events: <none>
That's great.
Another way to do it is to edit the cluster directly, following this KB article. The same KB article can also be used to change or modify existing autoscaler settings.
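As a rough sketch of that approach (not taken from the KB article itself, just an illustration of the general idea), the min/max annotations on a machineDeployment entry in the Cluster topology can be patched directly from the management cluster context. The example below assumes the annotation already exists on the first machineDeployment entry (index 0) and simply raises the max size to an arbitrary value of 6:
# bump the max node-group size on the first machineDeployment of the cluster topology
kubectl patch clusters.cluster.x-k8s.io tkg-cluster-3-auto -n tkg-ns-3 --type json -p '[
  {"op": "replace",
   "path": "/spec/topology/workers/machineDeployments/0/metadata/annotations/cluster.x-k8s.io~1cluster-api-autoscaler-node-group-max-size",
   "value": "6"}
]'
Note the JSON patch escaping: the "/" inside the annotation key is written as "~1" so it is not interpreted as a path separator.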
Test the autoscaler
In the following sections I will test scaling the worker nodes up and down based on load in the cluster. My initial cluster is up and running:
NAME STATUS ROLES AGE VERSION
tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q Ready <none> 4m17s v1.26.5+vmware.2
tkg-cluster-3-auto-ns4jx-szp69 Ready control-plane 8m31s v1.26.5+vmware.2
One control-plane node and one worker node. Now I want to check the status of the cluster-autoscaler:
andreasm@linuxvm01:~$ k describe cm -n kube-system cluster-autoscaler-status
Name:        cluster-autoscaler-status
Namespace:   kube-system
Labels:      <none>
Annotations: cluster-autoscaler.kubernetes.io/last-updated: 2023-09-08 13:30:12.611110965 +0000 UTC

Data
====
status:
----
Cluster-autoscaler status at 2023-09-08 13:30:12.611110965 +0000 UTC:
Cluster-wide:
  Health: Healthy (ready=2 unready=0 (resourceUnready=0) notStarted=0 longNotStarted=0 registered=2 longUnregistered=0)
    LastProbeTime:      2023-09-08 13:30:11.394021754 +0000 UTC m=+1356.335230920
    LastTransitionTime: 2023-09-08 13:07:46.176049718 +0000 UTC m=+11.117258901
  ScaleUp: NoActivity (ready=2 registered=2)
    LastProbeTime:      2023-09-08 13:30:11.394021754 +0000 UTC m=+1356.335230920
    LastTransitionTime: 2023-09-08 13:07:46.176049718 +0000 UTC m=+11.117258901
  ScaleDown: NoCandidates (candidates=0)
    LastProbeTime:      2023-09-08 13:30:11.394021754 +0000 UTC m=+1356.335230920
    LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC

NodeGroups:
  Name: MachineDeployment/tkg-ns-3/tkg-cluster-3-auto-md-0-fhrws
  Health: Healthy (ready=1 unready=0 (resourceUnready=0) notStarted=0 longNotStarted=0 registered=1 longUnregistered=0 cloudProviderTarget=1 (minSize=1, maxSize=4))
    LastProbeTime:      2023-09-08 13:30:11.394021754 +0000 UTC m=+1356.335230920
    LastTransitionTime: 2023-09-08 13:12:44.585589045 +0000 UTC m=+309.526798282
  ScaleUp: NoActivity (ready=1 cloudProviderTarget=1)
    LastProbeTime:      2023-09-08 13:30:11.394021754 +0000 UTC m=+1356.335230920
    LastTransitionTime: 2023-09-08 13:12:44.585589045 +0000 UTC m=+309.526798282
  ScaleDown: NoCandidates (candidates=0)
    LastProbeTime:      2023-09-08 13:30:11.394021754 +0000 UTC m=+1356.335230920
    LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC


BinaryData
====

Events: <none>
Scale-up - number of worker nodes (horizontally)
Now I need to generate some load and see if it will do some magic scaling in the background.
I have deployed my Yelb app again; the only missing pod is the UI pod:
NAME READY STATUS RESTARTS AGE
redis-server-56d97cc8c-4h54n 1/1 Running 0 6m56s
yelb-appserver-65855b7ffd-j2bjt 1/1 Running 0 6m55s
yelb-db-6f78dc6f8f-rg68q 1/1 Running 0 6m56s
I still have my one control plane node and one worker node. I will now deploy the UI pod and then scale the Yelb UI deployment to an insane number of replicas.
yelb-ui-5c5b8d8887-9598s 1/1 Running 0 2m35s
andreasm@linuxvm01:~$ k scale deployment -n yelb yelb-ui --replicas 200
deployment.apps/yelb-ui scaled
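While waiting for the autoscaler to react, a couple of plain kubectl commands are handy to follow along with (nothing TKG-specific, just a sketch):
# count how many UI pods are stuck in Pending
kubectl get pods -n yelb --field-selector=status.phase=Pending --no-headers | wc -l
# watch for new worker nodes joining the cluster
kubectl get nodes -w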
Let's check the status after this... A bunch of pods are in Pending state, waiting for a node to be scheduled on.
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
redis-server-56d97cc8c-4h54n 1/1 Running 0 21m 100.96.1.9 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
yelb-appserver-65855b7ffd-j2bjt 1/1 Running 0 21m 100.96.1.11 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
yelb-db-6f78dc6f8f-rg68q 1/1 Running 0 21m 100.96.1.10 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
yelb-ui-5c5b8d8887-22v8p 1/1 Running 0 6m18s 100.96.1.53 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
yelb-ui-5c5b8d8887-2587j 0/1 Pending 0 3m49s <none> <none> <none> <none>
yelb-ui-5c5b8d8887-2bzcg 0/1 Pending 0 3m51s <none> <none> <none> <none>
yelb-ui-5c5b8d8887-2gncl 0/1 Pending 0 3m51s <none> <none> <none> <none>
yelb-ui-5c5b8d8887-2gwp8 1/1 Running 0 3m53s 100.96.1.86 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
yelb-ui-5c5b8d8887-2gz7r 0/1 Pending 0 3m50s <none> <none> <none> <none>
yelb-ui-5c5b8d8887-2jlvv 0/1 Pending 0 3m49s <none> <none> <none> <none>
yelb-ui-5c5b8d8887-2pfgp 1/1 Running 0 6m18s 100.96.1.36 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
yelb-ui-5c5b8d8887-2prwf 0/1 Pending 0 3m50s <none> <none> <none> <none>
yelb-ui-5c5b8d8887-2vr4f 0/1 Pending 0 3m53s <none> <none> <none> <none>
yelb-ui-5c5b8d8887-2w2t8 0/1 Pending 0 3m49s <none> <none> <none> <none>
yelb-ui-5c5b8d8887-2x6b7 1/1 Running 0 6m18s 100.96.1.34 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
yelb-ui-5c5b8d8887-2x726 1/1 Running 0 9m40s 100.96.1.23 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
yelb-ui-5c5b8d8887-452bx 0/1 Pending 0 3m49s <none> <none> <none> <none>
yelb-ui-5c5b8d8887-452dd 1/1 Running 0 6m17s 100.96.1.69 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
yelb-ui-5c5b8d8887-45nmz 0/1 Pending 0 3m48s <none> <none> <none> <none>
yelb-ui-5c5b8d8887-4kj69 1/1 Running 0 3m53s 100.96.1.109 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
yelb-ui-5c5b8d8887-4svbf 0/1 Pending 0 3m50s <none> <none> <none> <none>
yelb-ui-5c5b8d8887-4t6dm 0/1 Pending 0 3m50s <none> <none> <none> <none>
yelb-ui-5c5b8d8887-4zlhw 0/1 Pending 0 3m51s <none> <none> <none> <none>
yelb-ui-5c5b8d8887-55qzm 1/1 Running 0 9m40s 100.96.1.15 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
yelb-ui-5c5b8d8887-5fts4 1/1 Running 0 6m18s 100.96.1.55 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
The autoscaler status:
andreasm@linuxvm01:~$ k describe cm -n kube-system cluster-autoscaler-status
Name:        cluster-autoscaler-status
Namespace:   kube-system
Labels:      <none>
Annotations: cluster-autoscaler.kubernetes.io/last-updated: 2023-09-08 14:01:43.794315378 +0000 UTC

Data
====
status:
----
Cluster-autoscaler status at 2023-09-08 14:01:43.794315378 +0000 UTC:
Cluster-wide:
  Health: Healthy (ready=2 unready=0 (resourceUnready=0) notStarted=0 longNotStarted=0 registered=2 longUnregistered=0)
    LastProbeTime:      2023-09-08 14:01:41.380962042 +0000 UTC m=+3246.322171235
    LastTransitionTime: 2023-09-08 13:07:46.176049718 +0000 UTC m=+11.117258901
  ScaleUp: InProgress (ready=2 registered=2)
    LastProbeTime:      2023-09-08 14:01:41.380962042 +0000 UTC m=+3246.322171235
    LastTransitionTime: 2023-09-08 14:01:41.380962042 +0000 UTC m=+3246.322171235
  ScaleDown: NoCandidates (candidates=0)
    LastProbeTime:      2023-09-08 14:01:30.091765978 +0000 UTC m=+3235.032975159
    LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC

NodeGroups:
  Name: MachineDeployment/tkg-ns-3/tkg-cluster-3-auto-md-0-fhrws
  Health: Healthy (ready=1 unready=0 (resourceUnready=0) notStarted=0 longNotStarted=0 registered=1 longUnregistered=0 cloudProviderTarget=2 (minSize=1, maxSize=4))
    LastProbeTime:      2023-09-08 14:01:41.380962042 +0000 UTC m=+3246.322171235
    LastTransitionTime: 2023-09-08 13:12:44.585589045 +0000 UTC m=+309.526798282
  ScaleUp: InProgress (ready=1 cloudProviderTarget=2)
    LastProbeTime:      2023-09-08 14:01:41.380962042 +0000 UTC m=+3246.322171235
    LastTransitionTime: 2023-09-08 14:01:41.380962042 +0000 UTC m=+3246.322171235
  ScaleDown: NoCandidates (candidates=0)
    LastProbeTime:      2023-09-08 14:01:30.091765978 +0000 UTC m=+3235.032975159
    LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC


BinaryData
====

Events:
  Type    Reason         Age  From                Message
  ----    ------         ---  ----                -------
  Normal  ScaledUpGroup  12s  cluster-autoscaler  Scale-up: setting group MachineDeployment/tkg-ns-3/tkg-cluster-3-auto-md-0-fhrws size to 2 instead of 1 (max: 4)
  Normal  ScaledUpGroup  11s  cluster-autoscaler  Scale-up: group MachineDeployment/tkg-ns-3/tkg-cluster-3-auto-md-0-fhrws size set to 2 instead of 1 (max: 4)
Oh yes, it has triggered a scale-up. And in vCenter a new worker node is in the process of being deployed:
NAME STATUS ROLES AGE VERSION
tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q Ready <none> 55m v1.26.5+vmware.2
tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-q6fqc NotReady <none> 10s v1.26.5+vmware.2
tkg-cluster-3-auto-ns4jx-szp69 Ready control-plane 59m v1.26.5+vmware.2
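The same scale-up can also be observed from the management cluster context, where Cluster API creates a new Machine for the node group (a quick sketch, run against the tkg-ns-3 namespace in the management cluster):
# the MachineDeployment replica count and the new Machine show up here
kubectl get machinedeployments,machines -n tkg-ns-3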
Let's check the pod status when the new node has been provisioned and is ready.
The node is now ready:
NAME STATUS ROLES AGE VERSION
tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q Ready <none> 56m v1.26.5+vmware.2
tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-q6fqc Ready <none> 101s v1.26.5+vmware.2
tkg-cluster-3-auto-ns4jx-szp69 Ready control-plane 60m v1.26.5+vmware.2
All my 200 UI pods are now scheduled and running across two worker nodes:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
redis-server-56d97cc8c-4h54n 1/1 Running 0 30m 100.96.1.9 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
yelb-appserver-65855b7ffd-j2bjt 1/1 Running 0 30m 100.96.1.11 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
yelb-db-6f78dc6f8f-rg68q 1/1 Running 0 30m 100.96.1.10 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
yelb-ui-5c5b8d8887-22v8p 1/1 Running 0 15m 100.96.1.53 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
yelb-ui-5c5b8d8887-2587j 1/1 Running 0 12m 100.96.2.82 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-q6fqc <none> <none>
yelb-ui-5c5b8d8887-2bzcg 1/1 Running 0 12m 100.96.2.9 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-q6fqc <none> <none>
yelb-ui-5c5b8d8887-2gncl 1/1 Running 0 12m 100.96.2.28 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-q6fqc <none> <none>
yelb-ui-5c5b8d8887-2gwp8 1/1 Running 0 12m 100.96.1.86 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
yelb-ui-5c5b8d8887-2gz7r 1/1 Running 0 12m 100.96.2.38 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-q6fqc <none> <none>
yelb-ui-5c5b8d8887-2jlvv 1/1 Running 0 12m 100.96.2.58 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-q6fqc <none> <none>
yelb-ui-5c5b8d8887-2pfgp 1/1 Running 0 15m 100.96.1.36 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
yelb-ui-5c5b8d8887-2prwf 1/1 Running 0 12m 100.96.2.48 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-q6fqc <none> <none>
yelb-ui-5c5b8d8887-2vr4f 1/1 Running 0 12m 100.96.2.77 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-q6fqc <none> <none>
yelb-ui-5c5b8d8887-2w2t8 1/1 Running 0 12m 100.96.2.63 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-q6fqc <none> <none>
yelb-ui-5c5b8d8887-2x6b7 1/1 Running 0 15m 100.96.1.34 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
yelb-ui-5c5b8d8887-2x726 1/1 Running 0 18m 100.96.1.23 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
yelb-ui-5c5b8d8887-452bx 1/1 Running 0 12m 100.96.2.67 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-q6fqc <none> <none>
yelb-ui-5c5b8d8887-452dd 1/1 Running 0 15m 100.96.1.69 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q <none> <none>
yelb-ui-5c5b8d8887-45nmz 1/1 Running 0 12m 100.96.2.100 tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-q6fqc <none> <none>
Scale-down - remove unneeded worker nodes
Now that I have seen that the autoscaler is indeed scaling the number of worker nodes up automatically, I would like to test whether it is also capable of scaling down, removing unnecessary worker nodes when the load is no longer there. To test this I will simply scale down the number of UI pods in the Yelb application:
andreasm@linuxvm01:~$ k scale deployment -n yelb yelb-ui --replicas 2
deployment.apps/yelb-ui scaled
andreasm@linuxvm01:~$ k get pods -n yelb
NAME READY STATUS RESTARTS AGE
redis-server-56d97cc8c-4h54n 1/1 Running 0 32m
yelb-appserver-65855b7ffd-j2bjt 1/1 Running 0 32m
yelb-db-6f78dc6f8f-rg68q 1/1 Running 0 32m
yelb-ui-5c5b8d8887-22v8p 1/1 Terminating 0 17m
yelb-ui-5c5b8d8887-2587j 1/1 Terminating 0 14m
yelb-ui-5c5b8d8887-2bzcg 1/1 Terminating 0 14m
yelb-ui-5c5b8d8887-2gncl 1/1 Terminating 0 14m
yelb-ui-5c5b8d8887-2gwp8 1/1 Terminating 0 14m
yelb-ui-5c5b8d8887-2gz7r 1/1 Terminating 0 14m
yelb-ui-5c5b8d8887-2jlvv 1/1 Terminating 0 14m
yelb-ui-5c5b8d8887-2pfgp 1/1 Terminating 0 17m
yelb-ui-5c5b8d8887-2prwf 1/1 Terminating 0 14m
yelb-ui-5c5b8d8887-2vr4f 1/1 Terminating 0 14m
yelb-ui-5c5b8d8887-2w2t8 1/1 Terminating 0 14m
When all the unnecessary pods are gone, I need to monitor the removal of the worker nodes. It may take some minutes.
The Yelb application is back to "normal":
NAME READY STATUS RESTARTS AGE
redis-server-56d97cc8c-4h54n 1/1 Running 0 33m
yelb-appserver-65855b7ffd-j2bjt 1/1 Running 0 33m
yelb-db-6f78dc6f8f-rg68q 1/1 Running 0 33m
yelb-ui-5c5b8d8887-dxlth 1/1 Running 0 21m
yelb-ui-5c5b8d8887-gv829 1/1 Running 0 21m
Checking the autoscaler status now, it has identified a candidate to scale down. But as I have set AUTOSCALER_SCALE_DOWN_DELAY_AFTER_ADD: "10m", I will need to wait 10 minutes after the LastTransitionTime...
Name:        cluster-autoscaler-status
Namespace:   kube-system
Labels:      <none>
Annotations: cluster-autoscaler.kubernetes.io/last-updated: 2023-09-08 14:19:46.985695728 +0000 UTC

Data
====
status:
----
Cluster-autoscaler status at 2023-09-08 14:19:46.985695728 +0000 UTC:
Cluster-wide:
  Health: Healthy (ready=3 unready=0 (resourceUnready=0) notStarted=0 longNotStarted=0 registered=3 longUnregistered=0)
    LastProbeTime:      2023-09-08 14:19:45.772876369 +0000 UTC m=+4330.714085660
    LastTransitionTime: 2023-09-08 13:07:46.176049718 +0000 UTC m=+11.117258901
  ScaleUp: NoActivity (ready=3 registered=3)
    LastProbeTime:      2023-09-08 14:19:45.772876369 +0000 UTC m=+4330.714085660
    LastTransitionTime: 2023-09-08 14:08:21.539629262 +0000 UTC m=+3646.480838810
  ScaleDown: CandidatesPresent (candidates=1)
    LastProbeTime:      2023-09-08 14:19:45.772876369 +0000 UTC m=+4330.714085660
    LastTransitionTime: 2023-09-08 14:18:26.989571984 +0000 UTC m=+4251.930781291

NodeGroups:
  Name: MachineDeployment/tkg-ns-3/tkg-cluster-3-auto-md-0-fhrws
  Health: Healthy (ready=2 unready=0 (resourceUnready=0) notStarted=0 longNotStarted=0 registered=2 longUnregistered=0 cloudProviderTarget=2 (minSize=1, maxSize=4))
    LastProbeTime:      2023-09-08 14:19:45.772876369 +0000 UTC m=+4330.714085660
    LastTransitionTime: 2023-09-08 13:12:44.585589045 +0000 UTC m=+309.526798282
  ScaleUp: NoActivity (ready=2 cloudProviderTarget=2)
    LastProbeTime:      2023-09-08 14:19:45.772876369 +0000 UTC m=+4330.714085660
    LastTransitionTime: 2023-09-08 14:08:21.539629262 +0000 UTC m=+3646.480838810
  ScaleDown: CandidatesPresent (candidates=1)
    LastProbeTime:      2023-09-08 14:19:45.772876369 +0000 UTC m=+4330.714085660
    LastTransitionTime: 2023-09-08 14:18:26.989571984 +0000 UTC m=+4251.930781291


BinaryData
====

Events:
  Type    Reason         Age  From                Message
  ----    ------         ---  ----                -------
  Normal  ScaledUpGroup  18m  cluster-autoscaler  Scale-up: setting group MachineDeployment/tkg-ns-3/tkg-cluster-3-auto-md-0-fhrws size to 2 instead of 1 (max: 4)
  Normal  ScaledUpGroup  18m  cluster-autoscaler  Scale-up: group MachineDeployment/tkg-ns-3/tkg-cluster-3-auto-md-0-fhrws size set to 2 instead of 1 (max: 4)
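While waiting out the scale-down delay, the autoscaler events can also be followed live instead of repeatedly describing the ConfigMap. A rough sketch (these are the same events that show up in the describe output above, recorded in the kube-system namespace):
# watch for scale-down events from the cluster-autoscaler
kubectl get events -n kube-system --watch | grep -i scaledown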
After the 10 minutes:
NAME STATUS ROLES AGE VERSION
tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-dcp2q Ready <none> 77m v1.26.5+vmware.2
tkg-cluster-3-auto-ns4jx-szp69 Ready control-plane 81m v1.26.5+vmware.2
Back to two nodes again, and the VM has been deleted from vCenter.
The autoscaler status:
Name:        cluster-autoscaler-status
Namespace:   kube-system
Labels:      <none>
Annotations: cluster-autoscaler.kubernetes.io/last-updated: 2023-09-08 14:29:32.692769073 +0000 UTC

Data
====
status:
----
Cluster-autoscaler status at 2023-09-08 14:29:32.692769073 +0000 UTC:
Cluster-wide:
  Health: Healthy (ready=2 unready=0 (resourceUnready=0) notStarted=0 longNotStarted=0 registered=2 longUnregistered=0)
    LastProbeTime:      2023-09-08 14:29:31.482497258 +0000 UTC m=+4916.423706440
    LastTransitionTime: 2023-09-08 13:07:46.176049718 +0000 UTC m=+11.117258901
  ScaleUp: NoActivity (ready=2 registered=2)
    LastProbeTime:      2023-09-08 14:29:31.482497258 +0000 UTC m=+4916.423706440
    LastTransitionTime: 2023-09-08 14:08:21.539629262 +0000 UTC m=+3646.480838810
  ScaleDown: NoCandidates (candidates=0)
    LastProbeTime:      2023-09-08 14:29:31.482497258 +0000 UTC m=+4916.423706440
    LastTransitionTime: 2023-09-08 14:28:46.471388976 +0000 UTC m=+4871.412598145

NodeGroups:
  Name: MachineDeployment/tkg-ns-3/tkg-cluster-3-auto-md-0-fhrws
  Health: Healthy (ready=1 unready=0 (resourceUnready=0) notStarted=0 longNotStarted=0 registered=1 longUnregistered=0 cloudProviderTarget=1 (minSize=1, maxSize=4))
    LastProbeTime:      2023-09-08 14:29:31.482497258 +0000 UTC m=+4916.423706440
    LastTransitionTime: 2023-09-08 13:12:44.585589045 +0000 UTC m=+309.526798282
  ScaleUp: NoActivity (ready=1 cloudProviderTarget=1)
    LastProbeTime:      2023-09-08 14:29:31.482497258 +0000 UTC m=+4916.423706440
    LastTransitionTime: 2023-09-08 14:08:21.539629262 +0000 UTC m=+3646.480838810
  ScaleDown: NoCandidates (candidates=0)
    LastProbeTime:      2023-09-08 14:29:31.482497258 +0000 UTC m=+4916.423706440
    LastTransitionTime: 2023-09-08 14:28:46.471388976 +0000 UTC m=+4871.412598145


BinaryData
====

Events:
  Type    Reason          Age  From                Message
  ----    ------          ---  ----                -------
  Normal  ScaledUpGroup   27m  cluster-autoscaler  Scale-up: setting group MachineDeployment/tkg-ns-3/tkg-cluster-3-auto-md-0-fhrws size to 2 instead of 1 (max: 4)
  Normal  ScaledUpGroup   27m  cluster-autoscaler  Scale-up: group MachineDeployment/tkg-ns-3/tkg-cluster-3-auto-md-0-fhrws size set to 2 instead of 1 (max: 4)
  Normal  ScaleDownEmpty  61s  cluster-autoscaler  Scale-down: removing empty node "tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-q6fqc"
  Normal  ScaleDownEmpty  55s  cluster-autoscaler  Scale-down: empty node tkg-cluster-3-auto-md-0-fhrws-757648f59cxq4hlz-q6fqc removed
This works really well. It is quite straightforward to enable and a really nice feature to have. And this concludes this post.