Taking Rancher by SUSE for a Spin
Overview
Rancher by SUSE - a short introduction
From the official Rancher docs:
Rancher is a container management platform built for organizations that deploy containers in production. Rancher makes it easy to run Kubernetes everywhere, meet IT requirements, and empower DevOps teams.
Run Kubernetes Everywhere
Kubernetes has become the container orchestration standard. Most cloud and virtualization vendors now offer it as standard infrastructure. Rancher users have the choice of creating Kubernetes clusters with Rancher Kubernetes Engine (RKE) or cloud Kubernetes services, such as GKE, AKS, and EKS. Rancher users can also import and manage their existing Kubernetes clusters created using any Kubernetes distribution or installer.
Meet IT Requirements
Rancher supports centralized authentication, access control, and monitoring for all Kubernetes clusters under its control. For example, you can:
- Use your Active Directory credentials to access Kubernetes clusters hosted by cloud vendors, such as GKE.
- Setup and enforce access control and security policies across all users, groups, projects, clusters, and clouds.
- View the health and capacity of your Kubernetes clusters from a single-pane-of-glass.
Rancher can manage already existing Kubernetes clusters by importing them into Rancher, but Rancher can also do fully automated deployments of complete Kubernetes clusters.
The initial parts of this post will focus on getting Rancher itself deployed, then I will do some automated Kubernetes deployments from Rancher. The goal is to showcase how quick and easy it is to deploy Kubernetes using Rancher. If it is not too obvious when reading these cluster creation chapters, trust me, deploying and managing Kubernetes clusters with Rancher is both fun and easy. After the initial chapters I will dive into some more technical topics.
This post is not meant to be a comprehensive article on all the features and configurations possible with Rancher; look at it more as an unboxing-Rancher post.
My environments used in this post
In this post I will be using two platforms where I run and deploy my Kubernetes clusters. One platform is VMware vSphere, the other is Proxmox. Proxmox is running at home in my lab; vSphere is running in another remote location and I am accessing it over an IPsec VPN. In this post I will deploy Rancher in a Tanzu Kubernetes Cluster deployed in my vSphere lab. I already have Rancher running in one of my Kubernetes clusters in my Proxmox lab. The Rancher deployment in vSphere is just used to go through the installation of Rancher (I had already installed Rancher before writing this post and didn't want to tear it down). The Rancher instance in my lab will be used both for importing existing Kubernetes clusters and for automated deployments of Kubernetes clusters on the vSphere and Proxmox environments.
In a very high level diagram it should look like this:
Where Rancher sits
As I proceed through this post by adding and deploying Kubernetes clusters and accessing them through Rancher, it makes sense to illustrate how this looks. It may be beneficial to the overall understanding to have as much context as possible of what is going on and how things work when reading through the post.
Running Rancher on Kubernetes makes Rancher highly available: it will be distributed across multiple Kubernetes worker nodes and also benefit from Kubernetes lifecycle management, self-healing and so on. Rancher will become a critical endpoint, so managing and ensuring availability of this endpoint is critical. Exposing Rancher through an HA-capable loadbalancer is something to consider, as losing a singleton loadbalancer instance means losing access to Rancher. To expose Rancher one can use the VMware Avi Loadbalancer with its distributed architecture (several Service Engines in Active/Active), or Traefik, HAProxy or Nginx, just to name a few. The underlying physical compute should of course also consist of more than one host.
One should strive to make the Rancher API endpoint as robust as possible, as this will be THE endpoint used to access and manage the Rancher-managed Kubernetes clusters.
As one can imagine, when managing multiple clusters through Rancher there will be a fair amount of requests to this endpoint, so performance and resilience are key.
As soon as I have authenticated through Rancher I can access my Kubernetes clusters. Rancher then works as a proxy and forwards my requests to the respective Kubernetes clusters.
1{"kind":"Event","authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"RBAC: allowed by ClusterRoleBinding \"globaladmin-user-64t5q\" of ClusterRole \"cluster-admin\" to User \"user-64t5q\""}}
Installing Rancher on an existing Kubernetes Cluster using Helm
Before I can start using Rancher I need to install it. Rancher can be deployed using Docker for PoC/testing purposes and on (more or less) any Kubernetes platform for production use. If one started out testing Rancher on Docker, one can actually migrate from Docker to Kubernetes at a later stage. In this post I will deploy Rancher on Kubernetes. I have already deployed the Kubernetes cluster I intend to install Rancher on. The cluster is provisioned by vSphere with Tanzu, running in my vSphere cluster using the method I describe in this post. vSphere with Tanzu is of course its own Kubernetes management platform. I have deployed 3 control plane nodes and 3 worker nodes. All persistent storage will be handled by the vSphere cluster's vSAN storage, and Ingress will be taken care of by the VMware Avi loadbalancer.
Here is my TKC cluster I will install Rancher on:
The cluster is ready with all the necessary backend services like the Avi loadbalancer providing loadbalancer services and Ingress rules. The first thing I need is to add the Helm repo for Rancher.
1# latest - recommended for testing the newest features
2andreasm@linuxmgmt01:~$ helm repo add rancher-latest https://releases.rancher.com/server-charts/latest
3"rancher-latest" has been added to your repositories
Create a namespace for Rancher:
1andreasm@linuxmgmt01:~$ kubectl create namespace cattle-system
2
3namespace/cattle-system created
I will bring my own certificate, so I am skipping cert-manager. This is my Rancher Helm values yaml:
1# Additional Trusted CAs.
2# Enable this flag and add your CA certs as a secret named tls-ca-additional in the namespace.
3# See README.md for details.
4additionalTrustedCAs: false
5
6antiAffinity: preferred
7topologyKey: kubernetes.io/hostname
8
9# Audit Logs https://rancher.com/docs/rancher/v2.x/en/installation/api-auditing/
10# The audit log is piped to the console of the rancher-audit-log container in the rancher pod.
11# https://rancher.com/docs/rancher/v2.x/en/installation/api-auditing/
12# destination stream to sidecar container console or hostPath volume
13# level: Verbosity of logs, 0 to 3. 0 is off 3 is a lot.
14auditLog:
15 destination: sidecar
16 hostPath: /var/log/rancher/audit/
17 level: 0
18 maxAge: 1
19 maxBackup: 1
20 maxSize: 100
21
22 # Image for collecting rancher audit logs.
23 # Important: update pkg/image/export/resolve.go when this default image is changed, so that it's reflected accordingly in rancher-images.txt generated for air-gapped setups.
24 image:
25 repository: "rancher/mirrored-bci-micro"
26 tag: 15.4.14.3
27 # Override imagePullPolicy image
28 # options: Always, Never, IfNotPresent
29 pullPolicy: "IfNotPresent"
30
31# As of Rancher v2.5.0 this flag is deprecated and must be set to 'true' in order for Rancher to start
32addLocal: "true"
33
34# Add debug flag to Rancher server
35debug: false
36
37# When starting Rancher for the first time, bootstrap the admin as restricted-admin
38restrictedAdmin: false
39
40# Extra environment variables passed to the rancher pods.
41# extraEnv:
42# - name: CATTLE_TLS_MIN_VERSION
43# value: "1.0"
44
45# Fully qualified name to reach your Rancher server
46hostname: rancher-01.my.domain.net
47
48## Optional array of imagePullSecrets containing private registry credentials
49## Ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
50imagePullSecrets: []
51# - name: secretName
52
53### ingress ###
54# Readme for details and instruction on adding tls secrets.
55ingress:
56 # If set to false, ingress will not be created
57 # Defaults to true
58 # options: true, false
59 enabled: true
60 includeDefaultExtraAnnotations: true
61 extraAnnotations: {}
62 ingressClassName: "avi-lb"
63 # Certain ingress controllers will will require the pathType or path to be set to a different value.
64 pathType: ImplementationSpecific
65 path: "/"
66 # backend port number
67 servicePort: 80
68
69 # configurationSnippet - Add additional Nginx configuration. This example statically sets a header on the ingress.
70 # configurationSnippet: |
71 # more_set_input_headers "X-Forwarded-Host: {{ .Values.hostname }}";
72
73 tls:
74 # options: rancher, letsEncrypt, secret
75 source: secret
76 secretName: tls-rancher-ingress
77
78### service ###
79# Override to use NodePort or LoadBalancer service type - default is ClusterIP
80service:
81 type: ""
82 annotations: {}
83
84### LetsEncrypt config ###
85# ProTip: The production environment only allows you to register a name 5 times a week.
86# Use staging until you have your config right.
87letsEncrypt:
88 # email: none@example.com
89 environment: production
90 ingress:
91 # options: traefik, nginx
92 class: ""
93# If you are using certs signed by a private CA set to 'true' and set the 'tls-ca'
94# in the 'rancher-system' namespace. See the README.md for details
95privateCA: false
96
97# http[s] proxy server passed into rancher server.
98# proxy: http://<username>@<password>:<url>:<port>
99
100# comma separated list of domains or ip addresses that will not use the proxy
101noProxy: 127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,.svc,.cluster.local
102
103# Override rancher image location for Air Gap installs
104rancherImage: rancher/rancher
105# rancher/rancher image tag. https://hub.docker.com/r/rancher/rancher/tags/
106# Defaults to .Chart.appVersion
107# rancherImageTag: v2.0.7
108
109# Override imagePullPolicy for rancher server images
110# options: Always, Never, IfNotPresent
111# Defaults to IfNotPresent
112# rancherImagePullPolicy: <pullPolicy>
113
114# Number of Rancher server replicas. Setting to negative number will dynamically between 0 and the abs(replicas) based on available nodes.
115# of available nodes in the cluster
116replicas: 3
117
118# Set priorityClassName to avoid eviction
119priorityClassName: rancher-critical
120
121# Set pod resource requests/limits for Rancher.
122resources: {}
123
124#
125# tls
126# Where to offload the TLS/SSL encryption
127# - ingress (default)
128# - external
129tls: ingress
130
131systemDefaultRegistry: ""
132
133# Set to use the packaged system charts
134useBundledSystemChart: false
135
136# Certmanager version compatibility
137certmanager:
138 version: ""
139
140# Rancher custom logos persistence
141customLogos:
142 enabled: false
143 volumeSubpaths:
144 emberUi: "ember"
145 vueUi: "vue"
146 ## Volume kind to use for persistence: persistentVolumeClaim, configMap
147 volumeKind: persistentVolumeClaim
148 ## Use an existing volume. Custom logos should be copied to the volume by the user
149 # volumeName: custom-logos
150 ## Just for volumeKind: persistentVolumeClaim
151 ## To disables dynamic provisioning, set storageClass: "" or storageClass: "-"
152 # storageClass: "-"
153 accessMode: ReadWriteOnce
154 size: 1Gi
155
156# Rancher post-delete hook
157postDelete:
158 enabled: true
159 image:
160 repository: rancher/shell
161 tag: v0.1.23
162 namespaceList:
163 - cattle-fleet-system
164 - cattle-system
165 - rancher-operator-system
166 # Number of seconds to wait for an app to be uninstalled
167 timeout: 120
168 # by default, the job will fail if it fail to uninstall any of the apps
169 ignoreTimeoutError: false
170
171# Set a bootstrap password. If leave empty, a random password will be generated.
172bootstrapPassword: "MyPassword"
173
174livenessProbe:
175 initialDelaySeconds: 60
176 periodSeconds: 30
177readinessProbe:
178 initialDelaySeconds: 5
179 periodSeconds: 30
180
181global:
182 cattle:
183 psp:
184 # will default to true on 1.24 and below, and false for 1.25 and above
185 # can be changed manually to true or false to bypass version checks and force that option
186 enabled: ""
187
188# helm values to use when installing the rancher-webhook chart.
189# helm values set here will override all other global values used when installing the webhook such as priorityClassName and systemRegistry settings.
190webhook: ""
191
192# helm values to use when installing the fleet chart.
193# helm values set here will override all other global values used when installing the fleet chart.
194fleet: ""
I chose secret as the TLS source, so before I install Rancher I can create the secret with my own certificate.
1andreasm@linuxmgmt01:~$ k create secret -n cattle-system tls tls-rancher-ingress --cert=tls.crt --key=tls.key
2secret/tls-rancher-ingress created
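Before running the install, it is worth verifying that the secret landed in the right namespace with the expected keys; a quick sanity check:

```bash
# The secret should be of type kubernetes.io/tls and contain tls.crt and tls.key
kubectl -n cattle-system get secret tls-rancher-ingress
kubectl -n cattle-system describe secret tls-rancher-ingress
```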
Now I can deploy Rancher with my values.yaml above.
1helm install -n cattle-system rancher rancher-latest/rancher -f rancher.values.yaml
2NAME: rancher
3LAST DEPLOYED: Mon May 6 08:24:36 2024
4NAMESPACE: cattle-system
5STATUS: deployed
6REVISION: 1
7TEST SUITE: None
8NOTES:
9Rancher Server has been installed.
10
11NOTE: Rancher may take several minutes to fully initialize. Please standby while Certificates are being issued, Containers are started and the Ingress rule comes up.
12
13Check out our docs at https://rancher.com/docs/
14
15If you provided your own bootstrap password during installation, browse to https://rancher-01.my-domain.net to get started.
16
17If this is the first time you installed Rancher, get started by running this command and clicking the URL it generates:
18
19```
20echo https://rancher-01.my-domain.net/dashboard/?setup=$(kubectl get secret --namespace cattle-system bootstrap-secret -o go-template='{{.data.bootstrapPassword|base64decode}}')
21```
22
23To get just the bootstrap password on its own, run:
24
25```
26kubectl get secret --namespace cattle-system bootstrap-secret -o go-template='{{.data.bootstrapPassword|base64decode}}{{ "\n" }}'
27```
28
29
30Happy Containering!
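While waiting for Rancher to initialize, the rollout can also be followed from the CLI; a small sketch assuming the same namespace and release name as above:

```bash
# Wait until all three Rancher replicas are available
kubectl -n cattle-system rollout status deploy/rancher
# Keep an eye on the pods while the deployment settles
kubectl -n cattle-system get pods -w
```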
Now it's time to log in to the URL mentioned above using the provided bootstrap password (if entered):
Entering the bootstrap password above will log you in. If you log out, the next login will look like this:
Before going to the next chapter: my Kubernetes cluster will now have some additional namespaces, services and deployments created. Below are some of them:
1# Namespaces
2NAME STATUS AGE
3cattle-fleet-clusters-system Active 3h9m
4cattle-fleet-local-system Active 3h8m
5cattle-fleet-system Active 3h10m
6cattle-global-data Active 3h10m
7cattle-global-nt Active 3h10m
8cattle-impersonation-system Active 3h9m
9cattle-provisioning-capi-system Active 3h8m
10cattle-system Active 3h24m
11cluster-fleet-local-local-1a3d67d0a899 Active 3h8m
12fleet-default Active 3h10m
13fleet-local Active 3h10m
14local Active 3h10m
15p-4s8sk Active 3h10m
16p-n8cmn Active 3h10m
17# Deployments and services
18NAME READY STATUS RESTARTS AGE
19pod/helm-operation-9bzv8 0/2 Completed 0 3m22s
20pod/helm-operation-bvjrq 0/2 Completed 0 8m10s
21pod/helm-operation-njvlg 0/2 Completed 0 8m22s
22pod/rancher-5498b85476-bpfcn 1/1 Running 0 9m25s
23pod/rancher-5498b85476-j6ggn 1/1 Running 0 9m25s
24pod/rancher-5498b85476-xg247 1/1 Running 0 9m25s
25pod/rancher-webhook-7d876fccc8-6m8tk 1/1 Running 0 8m7s
26
27NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
28service/rancher ClusterIP 10.10.227.75 <none> 80/TCP,443/TCP 9m26s
29service/rancher-webhook ClusterIP 10.10.81.6 <none> 443/TCP 8m8s
30
31NAME READY UP-TO-DATE AVAILABLE AGE
32deployment.apps/rancher 3/3 3 3 9m27s
33deployment.apps/rancher-webhook 1/1 1 1 8m9s
34
35NAME DESIRED CURRENT READY AGE
36replicaset.apps/rancher-5498b85476 3 3 3 9m26s
37replicaset.apps/rancher-webhook-7d876fccc8 1 1 1 8m8s
38
39NAME CLASS HOSTS ADDRESS PORTS AGE
40rancher avi-lb rancher-01.my-domain.net 192.168.121.10 80, 443 9m47s
Rancher is ready to be used.
Create and manage Kubernetes clusters
As soon as I have logged in, the first thing I would like to do is to create a Kubernetes cluster. Rancher supports many methods of creating a Kubernetes cluster. Rancher can create clusters in a hosted Kubernetes provider such as Amazon EKS, Azure AKS, Google GKE and more. Rancher can create clusters and provision the nodes on Amazon EC2, Azure, DigitalOcean, Harvester, Linode, VMware vSphere and more. If you don't use any of them, it can create clusters on existing nodes, like VMs deployed with a Linux (or even Windows..) OS waiting to be configured to do something (I will go through that later), or using Elemental (more on that later also).
Under drivers, see the list of possible Cluster Drivers and Node Drivers:
I happen to have vSphere as my virtualization platform, which is natively supported by Rancher, and will start by provisioning a Kubernetes cluster using the vSphere cloud. I also have another hypervisor platform running in my lab (Proxmox) which does not have a native driver in Rancher; there I will instead make Rancher deploy Kubernetes on existing VM nodes. I will go through how I deploy Kubernetes clusters on vSphere, and on Proxmox using the Custom "cloud" on existing nodes.
Rancher Kubernetes distributions
Rancher is fully capable of automated provisioning of Kubernetes clusters, as I will go through in this post, just as it can manage already existing Kubernetes clusters by importing them.
When Rancher does automated provisioning of Kubernetes on supported clouds or onto existing nodes, it uses its own Kubernetes distributions. In Rancher one can choose between three distributions: RKE, RKE2 and K3s. In this post I will only use RKE2, where RKE stands for Rancher Kubernetes Engine. RKE2 is Rancher's next-generation Kubernetes distribution. I may touch upon the K3s distribution also, as it is very much focused on edge use cases due to its lightweight footprint.
For more information on RKE2 head over here
For more information on K3s head over here
At the time of writing this post I am using Rancher 2.8.3, and the latest RKE2/Kubernetes release is 1.28.8, which is not far behind Kubernetes upstream, currently at v1.30.
Create RKE2 clusters using vSphere
Using vSphere as the "cloud" in Rancher does have some benefits. From the docs:
For Rancher to spin up and create RKE2 clusters on vSphere I need to create a VM template using the OS of choice in my vSphere cluster. According to the official RKE2 documentation, RKE2 should work on any Linux distribution that uses systemd and iptables. There is also an RKE2 support matrix for all OS versions that have been validated with RKE2 here (linked to v1.28 as this post uses 1.28, but v1.29 is already out at the time of writing).
I went with Ubuntu 22.04 as my Linux distribution
There are a couple of ways Rancher can utilize vSphere to manage the template. Below is a screenshot of the possible options at the time of writing this post:
In this section I will go through the two methods Deploy from template: Content Library and Deploy from template: Data Center.
Create Cloud Credentials
The first thing I will do before getting to the actual cluster creation is to create credentials for the user and connection information to my vCenter server. I will go to the Cluster Management and Cloud Credentials section in Rancher:
Click create in the top right corner and select VMware vSphere:
Fill in the relevant information about your vCenter server including the credentials with the right permissions (see permissions here using some very old screenshots from vCenter..):
RKE2 clusters using vSphere Content Library template
I decided to start with deploying RKE2 clusters using the Content Library method, as I found it to be the easiest and fastest (it is just about uploading the image to the content library and that's it). The concept is to create a Content Library in your vCenter hosting a cloud image template for the OS of choice. The template should be as minimal as possible, with little to no configuration. All the needed configuration is done by Rancher when adding the nodes to your cluster later.
As mentioned above, I went with Ubuntu Cloud image. Ubuntu Cloud images can be found here.
I will go with this OVA image (22.04 LTS Jammy Jellyfish):
In my vCenter I create a Content Library:
Local content library:
Select my datastore to host the content:
Finish:
Now I need to enter the content library by clicking on the name to add my desired Ubuntu Cloud Image version:
I will select Source File URL and paste the below URL:
https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.ova
Click on Import and it will download the OVA from the URL
It may complain about the SSL certificate being untrusted. Click Actions and Continue:
The template should be downloading now:
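As a side note, the content library creation and OVA import can also be scripted with the govc CLI instead of clicking through the vCenter UI. This is only a sketch; it assumes govc is already configured against the vCenter (GOVC_URL, GOVC_USERNAME, GOVC_PASSWORD), and the library and datastore names are examples that must be adjusted to your environment:

```bash
# Create a local content library backed by a datastore (names are assumptions)
govc library.create -ds=vsanDatastore rancher-images
# Import the Ubuntu cloud image OVA directly from the URL used above
govc library.import rancher-images https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.ova
```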
That's it from the vCenter side. The cloud image is ready as a template for Rancher to use. Now I will head over to Rancher and create my RKE2 cluster on vSphere based on this template.
I will go to Cluster Management using the top left menu, or directly from the homepage by clicking on Manage.
As I have done the preparations in vSphere I am now ready to create my RKE2 cluster on vSphere. Head to Clusters in the top left corner and click create in the top right corner then select VMware vSphere:
The cluster I want to create is one control-plane node and one worker node.
Give the cluster a name and a description if you want. Then select the roles; at a minimum one has to deploy one node with all the roles etcd, Control Plane and Worker. Give the pool a name, I will go with the default values here. For Machine Count I will also go with the default of 1, I can scale it later 😄. Then make sure to select the correct datacenter in your vCenter and select the Resource Pool if you have resource pools; if not, just select the correct cluster (sfo-m01-cl01)/Resources, which is the cluster's root resource pool. Then select the correct Datastore. I will leave the other fields at their defaults except the Creation method and Network. Under creation method I have selected Deploy from template: Content Library, the correct Content Library and the jammy-server-cloudimg template I uploaded to the Content Library. Click Add Network and select the correct vCenter network portgroup you want the nodes to be placed in.
That's it for pool-1. I will scroll a bit down, click on the + sign and add a second pool for nodes that will have only the Worker role.
I will repeat the steps above for pool-2. The pools will be identical except for the pool name and only the Worker role selected.
For the sake of keeping it simple, I will leave all the other fields default and just click create. I will now sit back and enjoy Rancher creating my RKE2 cluster on vSphere.
In vCenter:
And cluster is ready:
There were of course a bunch of options I elegantly skipped during the creation of this cluster. But the point, again, was to show how quickly and easily I could bring up an RKE2 cluster on vSphere. And so it was.
Supplemental information
To further customize the Ubuntu Cloud Image one can use #cloud-config (cloud-init) and the vSphere vApp functionality.
vApp:
When it comes to the Ubuntu Cloud Image, it does not contain any default username and password, so how can I access the nodes over SSH? Well, from the Rancher UI one can open an SSH session directly or download the SSH keys. One can even copy/paste into the SSH shell from there. SSH through the UI is a very neat feature.
SSH Shell:
For IP allocation I have configured DHCP in my NSX-T environment for the segment my RKE2 nodes will be placed in and all nodes will use dynamically allocated IP addresses in that subnet.
RKE2 clusters using vSphere VM template
The other method of preparing a template for Rancher is to install Ubuntu Server as a regular VM, do the necessary configuration, power it down and convert it to a template. I thought this would mean very minimal configuration, but not quite.
My first attempt did not go so well. I installed Ubuntu, powered it down and converted it to a template. I then tried to create an RKE2 cluster with it from Rancher; it cloned the number of control plane and worker nodes I had defined, but there it stopped. So I figured I must have missed something along the way.
Then I found this post here, which elegantly described the things I needed in my template before "handing" it over to Rancher. In short, below is what I did in my Ubuntu template by following that post.
On my template:
1sudo apt-get install -y curl wget git net-tools unzip ca-certificates cloud-init cloud-guest-utils cloud-image-utils cloud-initramfs-growroot open-iscsi openssh-server open-vm-tools net-tools apparmor
and:
1sudo dpkg-reconfigure cloud-init
Deselected everything except "NoCloud"
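The dpkg-reconfigure step simply rewrites cloud-init's datasource list. If you want to do the same thing non-interactively, for example in a scripted template build, writing the file that dpkg-reconfigure manages should give the same result; a sketch under that assumption:

```bash
# Restrict cloud-init to the NoCloud datasource (the non-interactive equivalent of
# deselecting everything except "NoCloud" in dpkg-reconfigure cloud-init)
echo 'datasource_list: [ NoCloud ]' | sudo tee /etc/cloud/cloud.cfg.d/90_dpkg.cfg
```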
Then I ran the script:
1#!/bin/bash
2# Cleaning logs.
3if [ -f /var/log/audit/audit.log ]; then
4 cat /dev/null > /var/log/audit/audit.log
5fi
6if [ -f /var/log/wtmp ]; then
7 cat /dev/null > /var/log/wtmp
8fi
9if [ -f /var/log/lastlog ]; then
10 cat /dev/null > /var/log/lastlog
11fi
12
13# Cleaning udev rules.
14if [ -f /etc/udev/rules.d/70-persistent-net.rules ]; then
15 rm /etc/udev/rules.d/70-persistent-net.rules
16fi
17
18# Cleaning the /tmp directories
19rm -rf /tmp/*
20rm -rf /var/tmp/*
21
22# Cleaning the SSH host keys
23rm -f /etc/ssh/ssh_host_*
24
25# Cleaning the machine-id
26truncate -s 0 /etc/machine-id
27rm /var/lib/dbus/machine-id
28ln -s /etc/machine-id /var/lib/dbus/machine-id
29
30# Cleaning the shell history
31unset HISTFILE
32history -cw
33echo > ~/.bash_history
34rm -fr /root/.bash_history
35
36# Truncating hostname, hosts, resolv.conf and setting hostname to localhost
37truncate -s 0 /etc/{hostname,hosts,resolv.conf}
38hostnamectl set-hostname localhost
39
40# Clean cloud-init
41cloud-init clean -s -l
Powered off the VM, converted it to a template in vSphere.
Still, after doing the steps described in the post linked above, my deployment failed. It turned out to be insufficient disk capacity, so I had to update my template in vSphere to use a bigger disk. I was too conservative (or cheap) when I created it; I extended it to 60 GB to be on the safe side. Could that have been a non-issue if I had read the official documentation and requirements? Yes.
I saw this when I connected to the control plane and checked the cattle-cluster-agent (it never started):
1andreasm@linuxmgmt01:~/$ k --kubeconfig test-cluster.yaml describe pod -n cattle-system cattle-cluster-agent-5cc77b7988-dwrtj
2Name: cattle-cluster-agent-5cc77b7988-dwrtj
3Namespace: cattle-system
4Events:
5 Type Reason Age From Message
6 ---- ------ ---- ---- -------
7 Warning Failed 6m24s kubelet Failed to pull image "rancher/rancher-agent:v2.8.3": failed to pull and unpack image "docker.io/rancher/rancher-agent:v2.8.3": failed to extract layer sha256:3a9df665b61dd3e00c0753ca43ca9a0828cb5592ec051048b4bbc1a3f4488e05: write /var/lib/rancher/rke2/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/108/fs/var/lib/rancher-data/local-catalogs/v2/rancher-partner-charts/8f17acdce9bffd6e05a58a3798840e408c4ea71783381ecd2e9af30baad65974/.git/objects/pack/pack-3b07a8b347f63714781c345b34c0793ec81f2b86.pack: no space left on device: unknown
8 Warning Failed 6m24s kubelet Error: ErrImagePull
9 Warning FailedMount 4m15s (x5 over 4m22s) kubelet MountVolume.SetUp failed for volume "kube-api-access-cvwz2" : failed to fetch token: Post "https://127.0.0.1:6443/api/v1/namespaces/cattle-system/serviceaccounts/cattle/token": dial tcp 127.0.0.1:6443: connect: connection refused
10 Normal Pulling 4m7s kubelet Pulling image "rancher/rancher-agent:v2.8.3"
11 Warning Evicted 3m59s kubelet The node was low on resource: ephemeral-storage. Threshold quantity: 574385774, available: 108048Ki.
After doing the additional changes in my template I gave the RKE2 cluster creation another attempt.
From the Cluster Management, click Create
Select VMware vSphere:
I will start by defining "pool-1" as my 3 control plane nodes. Select the vCenter Datacenter, Datastore and resource pool (if no resource pool is defined in vCenter, just select Resources, which is the root/cluster level).
Then define the "pool-1" instance options like CPU, memory, disk, network and the created VM template:
Add a second pool by clicking on the + sign here:
Then configure the Machine Count, the roles (Worker only) and the vSphere placement as above (unless one has several clusters and wants to distribute the workers.. but that is another discussion).
Configure the pool-2 instance CPU, memory, network, VM template etc. Here one can define the workers to be a bit beefier (resource wise) than the control plane nodes.
Then it is the cluster config itself. There are a lot of details I will not cover now; I will get into some of them a bit later in the post. Right now the sole purpose is to get a cluster up and running as easily and quickly as possible. I will leave everything default except the pod CIDR and service CIDR.
Pod CIDR and Service CIDR
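For reference, the pod CIDR and service CIDR set here end up as cluster-cidr and service-cidr under machineGlobalConfig in the cluster YAML (the same fields show up again in the Antrea section later). A fragment with the values I used, just as an example:

```yaml
# Fragment of the cluster spec - the CIDRs are the example values from my lab
rkeConfig:
  machineGlobalConfig:
    cluster-cidr: 10.10.0.0/16   # pod CIDR
    service-cidr: 10.20.0.0/16   # service CIDR
```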
When done, click create bottom right corner.
From vCenter:
Here is my cluster starting to be built.
Rancher cluster creation status
That's it. The cluster is up and running. There were a couple of additional tasks that needed to be done, but it worked. If one does not have the option to use a cloud image, this approach will also work, but the cloud image experience was by far the easiest and most flexible.
Creating an RKE2 cluster on existing nodes - non-native cloud provider
If I don't have a vSphere environment or another supported cloud provider, but a different platform for my virtualization needs, one alternative is to prepare some VMs with my preferred operating system and tell Rancher to deploy RKE2 onto them. The provisioning and management of these VMs will be handled by OpenTofu. Let's see how that works.
I already have a post covering how I deploy VMs and Kubernetes using OpenTofu and Kubespray on my Proxmox cluster. For this section I will use the OpenTofu part just to quickly deploy the VMs on Proxmox that I need to bring up my Kubernetes (RKE2) cluster.
My OpenTofu project is already configured; it should deploy 6 VMs, 3 control plane nodes and 3 workers. I will kick that task off like this:
1andreasm@linuxmgmt01:~/terraform/proxmox/rancher-cluster-1$ tofu apply plan
2proxmox_virtual_environment_file.ubuntu_cloud_init: Creating...
3proxmox_virtual_environment_file.ubuntu_cloud_init: Creation complete after 1s [id=local:snippets/ubuntu.cloud-config.yaml]
4proxmox_virtual_environment_vm.rke2-worker-vms-cl01[0]: Creating...
5proxmox_virtual_environment_vm.rke2-cp-vms-cl01[2]: Creating...
6proxmox_virtual_environment_vm.rke2-cp-vms-cl01[1]: Creating...
7proxmox_virtual_environment_vm.rke2-worker-vms-cl01[2]: Creating...
8proxmox_virtual_environment_vm.rke2-worker-vms-cl01[1]: Creating...
9proxmox_virtual_environment_vm.rke2-cp-vms-cl01[0]: Creating...
10...
11proxmox_virtual_environment_vm.rke2-cp-vms-cl01[2]: Creation complete after 1m37s [id=1023]
12
13Apply complete! Resources: 7 added, 0 changed, 0 destroyed.
And my VMs should start popping up in my Proxmox UI:
Now that I have some freshly installed VMs running, it is time to hand them over to Rancher to do some magic on them. Let's head over to Rancher and create a Custom cluster.
Creating RKE2 cluster on existing nodes - "custom cloud"
In Rancher, go to Cluster Management, or from the homepage click Create, and select Custom.
Give the cluster a name, and change what you want according to your needs. I will only change the Container Network and the pod/services CIDR.
Click create:
Now it will tell you to paste a registration command on the nodes you want to be control plane, etcd and worker nodes.
Click on the CLI command to copy it to the clipboard. I will start by preparing the first node with all three roles: control plane, etcd and worker. Then I will continue with the next two. After my control plane is ready and consists of three nodes, I will change the parameters to only include the worker role and do the last three nodes.
Then I go ahead and SSH into my intended control plane node and paste the command:
1ubuntu@ubuntu:~$ curl -fL https://rancher-dev.my-domain.net/system-agent-install.sh | sudo sh -s - --server https://rancher-dev.my-domain.net --label 'cattle.io/os=linux' --token p5fp6sv4qbskmhl7mmpksbxcgs9gk9dtsqkqg9q8q8ln5f56l5jlpb --etcd --controlplane --worker
2 % Total % Received % Xferd Average Speed Time Time Time Current
3 Dload Upload Total Spent Left Speed
4100 32228 0 32228 0 0 368k 0 --:--:-- --:--:-- --:--:-- 370k
5[INFO] Label: cattle.io/os=linux
6[INFO] Role requested: etcd
7[INFO] Role requested: controlplane
8[INFO] Role requested: worker
9[INFO] Using default agent configuration directory /etc/rancher/agent
10[INFO] Using default agent var directory /var/lib/rancher/agent
11[INFO] Successfully tested Rancher connection
12[INFO] Downloading rancher-system-agent binary from https://rancher-dev.my-domain.net/assets/rancher-system-agent-amd64
13[INFO] Successfully downloaded the rancher-system-agent binary.
14[INFO] Downloading rancher-system-agent-uninstall.sh script from https://rancher-dev.my-doamin.net/assets/system-agent-uninstall.sh
15[INFO] Successfully downloaded the rancher-system-agent-uninstall.sh script.
16[INFO] Generating Cattle ID
17[INFO] Successfully downloaded Rancher connection information
18[INFO] systemd: Creating service file
19[INFO] Creating environment file /etc/systemd/system/rancher-system-agent.env
20[INFO] Enabling rancher-system-agent.service
21Created symlink /etc/systemd/system/multi-user.target.wants/rancher-system-agent.service → /etc/systemd/system/rancher-system-agent.service.
22[INFO] Starting/restarting rancher-system-agent.service
23ubuntu@ubuntu:~$
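Before heading back to the UI, the agent can also be checked on the node itself; the service name comes from the installer output above:

```bash
# The rancher-system-agent pulls the RKE2 installation plan from Rancher and applies it
sudo systemctl status rancher-system-agent.service
# Follow its log while the node is being provisioned
sudo journalctl -u rancher-system-agent.service -f
```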
Now, let's just wait in the Rancher UI and monitor the progress.
Repeat the same operation for the last two control plane nodes.
For the worker nodes I will adjust the command to only deploy the worker role:
1ubuntu@ubuntu:~$ curl -fL https://rancher-dev.my-domain.net/system-agent-install.sh | sudo sh -s - --server https://rancher-dev.my-domain.net --label 'cattle.io/os=linux' --token jfjd5vwfdqxkszgg25m6w49hhm4l99kjk2xhrm2hp75dvwhrjnrt2d --worker
2 % Total % Received % Xferd Average Speed Time Time Time Current
3 Dload Upload Total Spent Left Speed
4100 32228 0 32228 0 0 370k 0 --:--:-- --:--:-- --:--:-- 374k
5[INFO] Label: cattle.io/os=linux
6[INFO] Role requested: worker
7[INFO] Using default agent configuration directory /etc/rancher/agent
8[INFO] Using default agent var directory /var/lib/rancher/agent
9[INFO] Successfully tested Rancher connection
10[INFO] Downloading rancher-system-agent binary from https://rancher-dev.my-domain.net/assets/rancher-system-agent-amd64
11[INFO] Successfully downloaded the rancher-system-agent binary.
12[INFO] Downloading rancher-system-agent-uninstall.sh script from https://rancher-dev.my-doamin.net/assets/system-agent-uninstall.sh
13[INFO] Successfully downloaded the rancher-system-agent-uninstall.sh script.
14[INFO] Generating Cattle ID
15[INFO] Successfully downloaded Rancher connection information
16[INFO] systemd: Creating service file
17[INFO] Creating environment file /etc/systemd/system/rancher-system-agent.env
18[INFO] Enabling rancher-system-agent.service
19Created symlink /etc/systemd/system/multi-user.target.wants/rancher-system-agent.service → /etc/systemd/system/rancher-system-agent.service.
20[INFO] Starting/restarting rancher-system-agent.service
Repeat for all the worker nodes needed.
After some minutes, the cluster should be complete with all control plane nodes and worker nodes:
The whole process, two tasks really, from deploying my six VMs to a fully working Kubernetes cluster, took just shy of 10 minutes. All I had to do was provision the VMs, create an RKE2 cluster in Rancher and execute the registration command on each node.
Deploying Kubernetes clusters couldn't be more fun, actually.
Import existing Kubernetes Clusters
As mentioned earlier, Rancher can also import existing Kubernetes clusters for central management. If one happens to have a bunch of Kubernetes clusters deployed with no central management of them, add them to Rancher.
Here is how to import existing Kubernetes clusters.
Head over to Cluster Management in the Rancher UI, go to Clusters and find the Import Existing button:
I will select Generic as I don't have any clusters running in Amazon, Azure or Google.
Give the cluster a name (the name as it will show in Rancher, and ignore my typo tanzuz) and a description. I am using the default admin account in Rancher.
Click create in the bottom right corner.
It will now show me how I can register my already existing cluster into Rancher:
Now I need to get into the context of the Kubernetes cluster I want to import and execute the following command:
1kubectl apply -f https://rancher-dev.my-domain.net/v3/import/knkh9n4rvhgpkxxxk7c4rhblhkqxfjxcq82nm7zcgfdcgqsl7bqdg5_c-m-qxpbc6tp.yaml
In the correct context:
1andreasm@linuxmgmt01:~$ kubectl apply -f https://rancher-dev.my-domain.net/v3/import/knkh9n4rvhgpkxxxk7c4rhblhkqxfjxcq82nm7zcgfdcgqsl7bqdg5_c-m-qxpbc6tp.yaml
2clusterrole.rbac.authorization.k8s.io/proxy-clusterrole-kubeapiserver created
3clusterrolebinding.rbac.authorization.k8s.io/proxy-role-binding-kubernetes-master created
4Warning: resource namespaces/cattle-system is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
5namespace/cattle-system configured
6serviceaccount/cattle created
7clusterrolebinding.rbac.authorization.k8s.io/cattle-admin-binding created
8secret/cattle-credentials-603e85e created
9clusterrole.rbac.authorization.k8s.io/cattle-admin created
10Warning: spec.template.spec.affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms[0].matchExpressions[0].key: beta.kubernetes.io/os is deprecated since v1.14; use "kubernetes.io/os" instead
11deployment.apps/cattle-cluster-agent created
12service/cattle-cluster-agent created
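The import essentially deploys a cattle-cluster-agent into the cluster, which then dials home to Rancher. A quick way to confirm the agent is up before checking the UI:

```bash
# Wait for the cluster agent deployment created by the import manifest
kubectl -n cattle-system rollout status deploy/cattle-cluster-agent
kubectl -n cattle-system get pods
```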
In the Rancher UI:
It took a couple of seconds, and the cluster was imported.
Now just having a look around.
Nodes:
Doing kubectl from the UI:
Navigating the Rancher UI
So far I have only created and imported clusters into Rancher. The current inventory in Rancher is a couple of clusters both imported and created by Rancher:
It's time to explore Rancher.
What information can Rancher give me about my new RKE2 cluster? Let's click around.
Click on Explore
Rancher Cluster Dashboard
The first thing I get to when clicking on Explore is the Cluster Dashboard. It contains a brief status of the cluster, like CPU and memory usage, total resources, pods running (55 of 660?) and the latest events. A very nice set of information.
But in the top right corner, what do we have there? Install Monitoring and Add Cluster Badge. Let's try the Cluster Badge.
I can customize the cluster badge.. Let's see how this looks:
It adds a color and initials on the cluster list on the left side menu, making it easy to distinguish and sort the different Kubernetes clusters.
Now, what about Install Monitoring?
A set of available monitoring tools, ready to be installed. This is really great, and qualifies for dedicated sections later in this post.
On left side menu, there is additional information:
Pods:
What about executing directly into the shell of a pod?
Nodes:
ConfigMaps:
There is an action menu on every object available in the left side menus; I can edit ConfigMaps and Secrets, drain nodes etc. right there from the Rancher UI.
If I know the object name, I can just search for it also:
So much information readily available directly after deploying the cluster. This was just a quick overview, there are several more fields to explore.
Accessing my new RKE2 cluster deployed on vSphere
Now that my cluster is up and running, how can I access it? SSH into one of the control plane nodes and grab the kubeconfig? Well, that is possible, but is there a better way?
Yes there is...
Click on the three dots at the end of the cluster:
Let's try the Kubectl Shell:
Or just copy the KubeConfig to the clipboard, paste it into a file on your workstation and access your cluster like this:
1andreasm@linuxmgmt01:~/$ k get nodes -owide
2NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
3rke2-cluster-1-pool1-47454a8f-6npvs Ready control-plane,etcd,master,worker 127m v1.28.8+rke2r1 192.168.150.98 <none> Ubuntu 22.04.4 LTS 5.15.0-105-generic containerd://1.7.11-k3s2
4rke2-cluster-1-pool1-47454a8f-9tv65 Ready control-plane,etcd,master,worker 123m v1.28.8+rke2r1 192.168.150.54 <none> Ubuntu 22.04.4 LTS 5.15.0-105-generic containerd://1.7.11-k3s2
5rke2-cluster-1-pool1-47454a8f-xpjqf Ready control-plane,etcd,master,worker 121m v1.28.8+rke2r1 192.168.150.32 <none> Ubuntu 22.04.4 LTS 5.15.0-105-generic containerd://1.7.11-k3s2
6rke2-cluster-1-pool2-b5e2f49f-2mr5r Ready worker 119m v1.28.8+rke2r1 192.168.150.97 <none> Ubuntu 22.04.4 LTS 5.15.0-105-generic containerd://1.7.11-k3s2
7rke2-cluster-1-pool2-b5e2f49f-9qnl2 Ready worker 119m v1.28.8+rke2r1 192.168.150.99 <none> Ubuntu 22.04.4 LTS 5.15.0-105-generic containerd://1.7.11-k3s2
8rke2-cluster-1-pool2-b5e2f49f-tfw2x Ready worker 119m v1.28.8+rke2r1 192.168.150.100 <none> Ubuntu 22.04.4 LTS 5.15.0-105-generic containerd://1.7.11-k3s2
9
10andreasm@linuxmgmt01:~/$ k get pods -A
11NAMESPACE NAME READY STATUS RESTARTS AGE
12cattle-fleet-system fleet-agent-595ff97fd6-vl9t4 1/1 Running 0 124m
13cattle-system cattle-cluster-agent-6bb5b4f9c6-br6lk 1/1 Running 0 123m
14cattle-system cattle-cluster-agent-6bb5b4f9c6-t5822 1/1 Running 0 123m
15cattle-system rancher-webhook-bcc8984b6-9ztqj 1/1 Running 0 122m
16kube-system cloud-controller-manager-rke2-cluster-1-pool1-47454a8f-6npvs 1/1 Running 0 127m
17kube-system cloud-controller-manager-rke2-cluster-1-pool1-47454a8f-9tv65 1/1 Running 0 123m
18kube-system cloud-controller-manager-rke2-cluster-1-pool1-47454a8f-xpjqf 1/1 Running 0 120m
19kube-system etcd-rke2-cluster-1-pool1-47454a8f-6npvs 1/1 Running 0 127m
20kube-system etcd-rke2-cluster-1-pool1-47454a8f-9tv65 1/1 Running 0 123m
21kube-system etcd-rke2-cluster-1-pool1-47454a8f-xpjqf 1/1 Running 0 120m
22kube-system helm-install-rke2-canal-4cxc6 0/1 Completed 0 127m
23kube-system helm-install-rke2-coredns-vxvpj 0/1 Completed 0 127m
24kube-system helm-install-rke2-ingress-nginx-f5g88 0/1 Completed 0 127m
25kube-system helm-install-rke2-metrics-server-66rff 0/1 Completed 0 127m
26kube-system helm-install-rke2-snapshot-controller-crd-9ncn6 0/1 Completed 0 127m
27kube-system helm-install-rke2-snapshot-controller-rrhbw 0/1 Completed 1 127m
28kube-system helm-install-rke2-snapshot-validation-webhook-jxgr5 0/1 Completed 0 127m
29kube-system kube-apiserver-rke2-cluster-1-pool1-47454a8f-6npvs 1/1 Running 0 127m
30kube-system kube-apiserver-rke2-cluster-1-pool1-47454a8f-9tv65 1/1 Running 0 123m
31kube-system kube-apiserver-rke2-cluster-1-pool1-47454a8f-xpjqf 1/1 Running 0 120m
32kube-system kube-controller-manager-rke2-cluster-1-pool1-47454a8f-6npvs 1/1 Running 0 127m
33kube-system kube-controller-manager-rke2-cluster-1-pool1-47454a8f-9tv65 1/1 Running 0 123m
34kube-system kube-controller-manager-rke2-cluster-1-pool1-47454a8f-xpjqf 1/1 Running 0 120m
35kube-system kube-proxy-rke2-cluster-1-pool1-47454a8f-6npvs 1/1 Running 0 127m
36kube-system kube-proxy-rke2-cluster-1-pool1-47454a8f-9tv65 1/1 Running 0 123m
37kube-system kube-proxy-rke2-cluster-1-pool1-47454a8f-xpjqf 1/1 Running 0 120m
38kube-system kube-proxy-rke2-cluster-1-pool2-b5e2f49f-2mr5r 1/1 Running 0 119m
39kube-system kube-proxy-rke2-cluster-1-pool2-b5e2f49f-9qnl2 1/1 Running 0 119m
40kube-system kube-proxy-rke2-cluster-1-pool2-b5e2f49f-tfw2x 1/1 Running 0 119m
41kube-system kube-scheduler-rke2-cluster-1-pool1-47454a8f-6npvs 1/1 Running 0 127m
42kube-system kube-scheduler-rke2-cluster-1-pool1-47454a8f-9tv65 1/1 Running 0 123m
43kube-system kube-scheduler-rke2-cluster-1-pool1-47454a8f-xpjqf 1/1 Running 0 120m
44kube-system rke2-canal-7lmxr 2/2 Running 0 119m
45kube-system rke2-canal-cmrm2 2/2 Running 0 123m
46kube-system rke2-canal-f9ztm 2/2 Running 0 119m
47kube-system rke2-canal-gptjt 2/2 Running 0 119m
48kube-system rke2-canal-p9dj4 2/2 Running 0 121m
49kube-system rke2-canal-qn998 2/2 Running 0 126m
50kube-system rke2-coredns-rke2-coredns-84b9cb946c-q7cjn 1/1 Running 0 123m
51kube-system rke2-coredns-rke2-coredns-84b9cb946c-v2lq7 1/1 Running 0 126m
52kube-system rke2-coredns-rke2-coredns-autoscaler-b49765765-lj6mp 1/1 Running 0 126m
53kube-system rke2-ingress-nginx-controller-24g4k 1/1 Running 0 113m
54kube-system rke2-ingress-nginx-controller-9gncw 1/1 Running 0 123m
55kube-system rke2-ingress-nginx-controller-h9slv 1/1 Running 0 119m
56kube-system rke2-ingress-nginx-controller-p98mx 1/1 Running 0 120m
57kube-system rke2-ingress-nginx-controller-rfvnm 1/1 Running 0 125m
58kube-system rke2-ingress-nginx-controller-xlhd8 1/1 Running 0 113m
59kube-system rke2-metrics-server-544c8c66fc-n8r9b 1/1 Running 0 126m
60kube-system rke2-snapshot-controller-59cc9cd8f4-5twmc 1/1 Running 0 126m
61kube-system rke2-snapshot-validation-webhook-54c5989b65-8fzx7 1/1 Running 0 126m
This goes for both the clusters Rancher has deployed and the imported clusters. Same API endpoint to access them all.
If I were to access my Tanzu Kubernetes cluster directly, I would have to use the kubectl vsphere login command and from there enter the correct vSphere Namespace context to get to my TKC cluster. By just adding the Tanzu cluster to Rancher I can use the same endpoint to access this cluster too. Have a quick look at the kubeconfig:
1apiVersion: v1
2kind: Config
3clusters:
4- name: "tanzuz-cluster-1"
5 cluster:
6 server: "https://rancher-dev.my-domain.net/k8s/clusters/c-m-qxpbc6tp"
7
8users:
9- name: "tanzuz-cluster-1"
10 user:
11 token: "kubeconfig-user-64t5qq4d67:l8mdcs55ggf8kz4zlrcpth8tk48j85kp9qjslk5zrcmjczw6nqr7fw"
12
13
14contexts:
15- name: "tanzuz-cluster-1"
16 context:
17 user: "tanzuz-cluster-1"
18 cluster: "tanzuz-cluster-1"
19
20current-context: "tanzuz-cluster-1"
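Saved to a file, this kubeconfig talks to the Rancher proxy URL instead of the TKC API server directly; a usage sketch with an assumed filename:

```bash
# The server field points at Rancher, which proxies the request down to the TKC cluster
kubectl --kubeconfig tanzuz-cluster-1.yaml get nodes
```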
I am accessing my TKC cluster through the Rancher API endpoint.
Managing Kubernetes clusters from Rancher
Though making Kubernetes cluster deployments an easy task is important, handling day-two operations is even more important. Let's explore some typical administrative tasks that need to be done on a Kubernetes cluster during its lifetime, using Rancher.
Scaling nodes horizontally - up and down
There will come a time when there is a need to scale the number of worker nodes in a cluster. Let's see how we can do that in Rancher.
If I want to manually scale the number of worker nodes up, I go to Cluster Management and Clusters, then click on the name of the cluster to be scaled:
Clicking on the cluster name takes me directly to the Machine Pools; on the right there are some obvious -/+ buttons. Clicking the + sign once will scale that pool with one additional worker node. Let's test.
I will now scale my rke2-cluster-2-vsphere from one worker node to two worker nodes.
In my vCenter:
Cluster is now scaled with one additional worker node:
Manual scaling is not always sufficient as the cluster load can be dynamic, so Rancher does have support for node autoscaling:
Autoscaling: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/rancher/README.md
Scaling nodes vertically - add cpu, ram and disk
What if I need to change the CPU, memory or disk resources on the nodes? An unexpected change in the workload may require the nodes themselves to change, and scaling horizontally does not solve it. Then I need an easy way to change the resources of the existing nodes.
From the Cluster Management and Clusters view, click on the name of the cluster that needs to have its node resources changed, and from there click Config:
Select the pool you want to change the resources on; I will select pool2 as that's where my worker nodes reside.
Then, in the top right corner, click on the three dots and select Edit Config to allow the required fields to be edited.
From there I will adjust the memory and CPU on pool2 from 2 CPUs to 4 and from 4096 MB to 6144 MB of memory. Then click Save.
This triggers a rollout of new worker nodes until all nodes have been replaced with new nodes using the new CPU and memory config:
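The node replacement can also be followed from the Rancher management (local) cluster, where the provisioning is driven by Cluster API objects; a sketch, assuming kubectl access to the local cluster and that the cluster lives in the default fleet-default namespace:

```bash
# Machines for Rancher-provisioned clusters live in fleet-default on the local cluster
kubectl -n fleet-default get machines.cluster.x-k8s.io
```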
That's it.
Upgrading Kubernetes clusters
This is probably one of the most important tasks in maintaining a Kubernetes cluster: keeping pace with Kubernetes updates and performing Kubernetes upgrades. A new Kubernetes release comes approximately every 3 months.
So how can I upgrade my Kubernetes cluster using Rancher?
It is almost the same approach as above: instead of changing any of the CPU and memory resources, I scroll down and select the new Kubernetes version I want to upgrade to:
I have a cluster currently running version 1.27.12 and want to upgrade to 1.28.8. I will now select the latest version available and then save:
Continue.
This will not trigger a rolling replacement of the nodes; instead it does an in-place upgrade by cordoning the nodes one by one and then upgrading each node.
My control plane nodes have already been upgraded to 1.28.8.
All nodes have been upgraded:
This took just a couple of seconds (I actually suspected that something was wrong), but it really did only take a couple of seconds. Everything worked as it should afterwards. I must say, a relatively smooth process.
Installing another Ingress controller than the default Nginx - Avi Kubernetes Operator
In my vSphere environment I already have the Avi Loadbalancer running. Avi can also be extended into Kubernetes to automate Kubernetes service creation as well as Ingress and Gateway API. So in this section I will create a cluster with AKO as my Ingress controller.
By default Rancher installs Nginx as the Ingress controller. To install my own Ingress controller, like AKO, I will deselect Nginx in the cluster creation wizard:
Deploy the cluster and wait for it to finish.
Now that I have a cluster deployed without any IngressClasses or Ingress controller, I can go on and deploy the Ingress controller of my choice. As far as I understand there is currently no support for OCI Helm chart repositories in Rancher, so I need to deploy AKO using the CLI approach. Otherwise I could just have added the AKO repo through the GUI and deployed it as an App.
From my Linux machine I have grabbed the context of my newly created Kubernetes cluster, without any Ingress controller installed. Then I just follow the regular AKO installation instructions:
1andreasm@linuxmgmt01:~/rancher$ k get ingressclasses.networking.k8s.io
2No resources found
3andreasm@linuxmgmt01:~/rancher$ k create ns avi-system
4namespace/avi-system created
5andreasm@linuxmgmt01:~/rancher$ helm show values oci://projects.registry.vmware.com/ako/helm-charts/ako --version 1.11.3 > values.yaml
6Pulled: projects.registry.vmware.com/ako/helm-charts/ako:1.11.3
7Digest: sha256:e4d0fcd80ae5b5377edf510d3d085211181a3f6b458a17c8b4a19d328e4cdfe6
8# Will now edit the values.yaml file to accomodate my Avi config settings
9# After edit I will deploy AKO
10andreasm@linuxmgmt01:~/rancher$ helm install --generate-name oci://projects.registry.vmware.com/ako/helm-charts/ako --version 1.11.3 -f values.yaml -n avi-system
11Pulled: projects.registry.vmware.com/ako/helm-charts/ako:1.11.3
12Digest: sha256:e4d0fcd80ae5b5377edf510d3d085211181a3f6b458a17c8b4a19d328e4cdfe6
13NAME: ako-1715330716
14LAST DEPLOYED: Fri May 10 10:45:19 2024
15NAMESPACE: avi-system
16STATUS: deployed
17REVISION: 1
18TEST SUITE: None
19
20# Check for pods and Ingress Classes
21andreasm@linuxmgmt01:~/rancher$ k get pods -n avi-system
22NAME READY STATUS RESTARTS AGE
23ako-0 1/1 Running 0 39s
24andreasm@linuxmgmt01:~/rancher$ k get ingressclasses.networking.k8s.io
25NAME CONTROLLER PARAMETERS AGE
26avi-lb ako.vmware.com/avi-lb <none> 47s
Now I have an Ingress controller installed in my cluster.
How about going back to the Rancher UI and creating an Ingress using my newly deployed Ingress controller?
I will go into my cluster, Service Discovery -> Ingresses
Now let's try to create an Ingress:
Target Service and Port are selected from dropdown lists, as long as the services are available within the namespace I create my Ingress in. That's really nice.
Selecting my IngressClass from the dropdown.
Then click Create:
1andreasm@linuxmgmt01:~/rancher$ k get ingress -n fruit
2NAME CLASS HOSTS ADDRESS PORTS AGE
3fruit-ingress avi-lb fruit-1.my-domain.net 192.168.121.13 80 35s
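With the Ingress created and an address assigned, a quick test from outside the cluster (assuming DNS for fruit-1.my-domain.net resolves to the address shown; otherwise the Host header can be set manually against the VIP):

```bash
# Test the new Ingress - expect a response from the fruit service behind it
curl -i http://fruit-1.my-domain.net/
# Or hit the VIP directly and set the Host header
curl -i -H "Host: fruit-1.my-domain.net" http://192.168.121.13/
```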
In my Avi controller I should now have a new Virtual Service using the fqdn name:
Installing Custom CNI - bootstrapping new clusters using Antrea CNI
What if I want to use a CNI that is not in the list of CNIs in Rancher?
Rancher provides a list of tested CNIs to choose from when deploying a new RKE2 cluster, but one is not limited to choosing only one of these CNIs. It is possible to create a cluster using your own preferred CNI. So in this chapter I will go through how I can deploy a new RKE2 cluster using Antrea as the CNI.
Before I can bootstrap a cluster with Antrea (or any custom CNI) I need to tell Rancher not to deploy any CNI from the list of provided CNIs. This involves editing the cluster deployment using the YAML editor in Rancher. There are two ways I can install Antrea as the CNI during cluster creation/bootstrap: one is using the Antrea deployment manifest, the other is using Helm charts. The Helm chart approach is the recommended way if you ask me, but one may not always have the option to use Helm charts, which is why I have included both approaches. Below I will go through both of them, starting with the Antrea manifest. I need to complete one of these steps before clicking on the Create button.
Add-on Config using manifest
After I have gone through the typical cluster creation steps (node pool config, cluster config etc.) I need to add my Antrea manifest in the Add-On Config section.
So I will be going into the Cluster Configuration section and Add-On Config
Here I can add the deployment manifest for Antrea by just copy-pasting the content directly from the Antrea repo here. That link will always take you to the latest version of Antrea. Do the necessary configuration in the deployment yaml, like enabling or disabling the Antrea feature gates, if needed. They can be changed post cluster provisioning too.
Disregard the already populated Calico Configuration section:
This will be removed at a later stage completely. I need to paste my Antrea deployment manifest under Additional Manifest
When done, proceed to the step that involves disabling the default CNI that Rancher wants to deploy. It is not possible to set the CNI to none from the UI, so I need to edit the YAML directly to tell it not to install any CNI. That is done by following the steps below.
Click edit as yaml
Within my cluster yaml there are certain blocks I need to edit or remove. Below is a "default" yaml using Calico as the CNI, followed by the yaml I have edited to allow me to install my custom CNI. See the comments in the yamls by expanding the code fields. To keep it shorter, I have only pasted the parts of the yaml that are involved.
The "untouched" yaml:
# Before editing - expand the field to display all content
apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  name: rke2-cluster-4-antrea
  annotations:
    field.cattle.io/description: RKE Cluster with Antrea
    # key: string
  labels:
    {}
    # key: string
  namespace: fleet-default
spec:
  cloudCredentialSecretName: cattle-global-data:cc-qsq7b
  clusterAgentDeploymentCustomization:
  kubernetesVersion: v1.28.8+rke2r1
  localClusterAuthEndpoint:
    caCerts: ''
    enabled: false
    fqdn: ''
  rkeConfig:
    chartValues: ## Needs to be set to {} when the line below is removed
      rke2-calico: {} ### This line needs to be removed
    etcd:
      disableSnapshots: false
      s3:
      # bucket: string
      # cloudCredentialName: string
      # endpoint: string
      # endpointCA: string
      # folder: string
      # region: string
      # skipSSLVerify: boolean
      snapshotRetention: 5
      snapshotScheduleCron: 0 */5 * * *
    machineGlobalConfig:
      cluster-cidr: 10.10.0.0/16
      cni: calico ### Needs to be set to none
      disable-kube-proxy: false
      etcd-expose-metrics: false
      service-cidr: 10.20.0.0/16
    profile: null
The edited yaml:
# After editing - expand the field to display all content
apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  name: rke2-cluster-4-antrea
  annotations:
    field.cattle.io/description: RKE Cluster with Antrea
    # key: string
  labels:
    {}
    # key: string
  namespace: fleet-default
spec:
  cloudCredentialSecretName: cattle-global-data:cc-qsq7b
  clusterAgentDeploymentCustomization:
  kubernetesVersion: v1.28.8+rke2r1
  localClusterAuthEndpoint:
    caCerts: ''
    enabled: false
    fqdn: ''
  rkeConfig:
    chartValues: {}
    etcd:
      disableSnapshots: false
      s3:
      # bucket: string
      # cloudCredentialName: string
      # endpoint: string
      # endpointCA: string
      # folder: string
      # region: string
      # skipSSLVerify: boolean
      snapshotRetention: 5
      snapshotScheduleCron: 0 */5 * * *
    machineGlobalConfig:
      cluster-cidr: 10.10.0.0/16
      cni: none
      disable-kube-proxy: false ## See comment below
      etcd-expose-metrics: false
      service-cidr: 10.20.0.0/16
    profile: null
After editing the yaml as indicated above it is time to click Create.
After a couple of minutes the cluster should turn green and be ready to consume, with Antrea as the CNI.
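As a quick sanity check once the cluster reports Ready, I can confirm that the controller and the per-node agents actually came from my Add-On Config. The app=antrea label below is how the Antrea components are labeled in the versions I have used, so treat it as an assumption if you run something much older or newer.
# One antrea-agent per node plus the antrea-controller should be running in kube-system
kubectl -n kube-system get pods -l app=antrea
# The configuration the agents/controller run with ends up in the antrea-config ConfigMap
kubectl -n kube-system get configmap antrea-config -o yaml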
Next up is the Helm Chart approach.
Add-on Config using Helm Charts
I will follow almost the same approach as above, but instead of providing the Antrea manifest, I will provide the Antrea Helm Chart repo.
So, to save some digital ink: head over to Add-On Config and populate the Additional Manifest field with the following content.
Below is the yaml:
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: antrea
  namespace: kube-system
spec:
  bootstrap: true # this is important
  chart: antrea
  targetNamespace: kube-system
  repo: https://charts.antrea.io
  version: v1.15.1 # if I want a specific version to be installed
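Chart values, such as Antrea feature gates, can also be passed inline through valuesContent. Below is a small sketch of that; the featureGates key is based on the Antrea chart's values as I remember them, so verify against the chart's values.yaml for the version you deploy.
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: antrea
  namespace: kube-system
spec:
  bootstrap: true
  chart: antrea
  repo: https://charts.antrea.io
  targetNamespace: kube-system
  version: v1.15.1
  valuesContent: |-
    # assumed values key - check the Antrea chart for your version
    featureGates:
      Multicast: true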
Below is a template showing which keys/values can be used:
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: #string
  namespace: default
  # annotations: key: string
  # labels: key: string
spec:
  # authPassCredentials: boolean
  # authSecret:
  #   name: string
  # backOffLimit: int
  # bootstrap: boolean
  # chart: string
  # chartContent: string
  # createNamespace: boolean
  # dockerRegistrySecret:
  #   name: string
  # failurePolicy: string
  # helmVersion: string
  # jobImage: string
  # podSecurityContext:
  #   fsGroup: int
  #   fsGroupChangePolicy: string
  #   runAsGroup: int
  #   runAsNonRoot: boolean
  #   runAsUser: int
  #   seLinuxOptions:
  #     type: string
  #     level: string
  #     role: string
  #     user: string
  #   seccompProfile:
  #     type: string
  #     localhostProfile: string
  #   supplementalGroups:
  #     - int
  #   sysctls:
  #     - name: string
  #       value: string
  #   windowsOptions:
  #     gmsaCredentialSpec: string
  #     gmsaCredentialSpecName: string
  #     hostProcess: boolean
  #     runAsUserName: string
  # repo: string
  # repoCA: string
  # repoCAConfigMap:
  #   name: string
  # securityContext:
  #   allowPrivilegeEscalation: boolean
  #   capabilities:
  #     add:
  #       - string
  #     drop:
  #       - string
  #   privileged: boolean
  #   procMount: string
  #   readOnlyRootFilesystem: boolean
  #   runAsGroup: int
  #   runAsNonRoot: boolean
  #   runAsUser: int
  #   seLinuxOptions:
  #     type: string
  #     level: string
  #     role: string
  #     user: string
  #   seccompProfile:
  #     type: string
  #     localhostProfile: string
  #   windowsOptions:
  #     gmsaCredentialSpec: string
  #     gmsaCredentialSpecName: string
  #     hostProcess: boolean
  #     runAsUserName: string
  # set: map[]
  # targetNamespace: string
  # timeout: string
  # valuesContent: string
  # version: string
Then edit the cluster yaml to disable the Rancher-provided CNI, as described above.
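In practice that means the same two rkeConfig changes as in the manifest approach:
rkeConfig:
  chartValues: {}          # no rke2-calico (or other bundled CNI) chart values
  machineGlobalConfig:
    cni: none              # tell RKE2 not to install a bundled CNI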
Click Create.
Below is a complete yaml of the cluster being provisioned with Antrea as the CNI using the Helm Chart option.
apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  name: rke2-cluster-4-antrea
  annotations:
    field.cattle.io/description: RKE2 using Antrea
  labels:
    {}
  namespace: fleet-default
spec:
  cloudCredentialSecretName: cattle-global-data:cc-qsq7b
  clusterAgentDeploymentCustomization:
    appendTolerations:
    overrideResourceRequirements:
  defaultPodSecurityAdmissionConfigurationTemplateName: ''
  defaultPodSecurityPolicyTemplateName: ''
  fleetAgentDeploymentCustomization:
    appendTolerations:
    overrideAffinity:
    overrideResourceRequirements:
  kubernetesVersion: v1.28.9+rke2r1
  localClusterAuthEndpoint:
    caCerts: ''
    enabled: false
    fqdn: ''
  rkeConfig:
    additionalManifest: |-
      apiVersion: helm.cattle.io/v1
      kind: HelmChart
      metadata:
        name: antrea
        namespace: kube-system
      spec:
        bootstrap: true # This is important
        chart: antrea
        targetNamespace: kube-system
        repo: https://charts.antrea.io
    chartValues: {}
    etcd:
      disableSnapshots: false
      s3:
      snapshotRetention: 5
      snapshotScheduleCron: 0 */5 * * *
    machineGlobalConfig:
      cluster-cidr: 10.10.0.0/16
      cni: none
      disable-kube-proxy: false
      etcd-expose-metrics: false
      service-cidr: 10.20.0.0/16
    profile: null
    machinePools:
      - name: pool1
        etcdRole: true
        controlPlaneRole: true
        workerRole: true
        hostnamePrefix: ''
        quantity: 1
        unhealthyNodeTimeout: 0m
        machineConfigRef:
          kind: VmwarevsphereConfig
          name: nc-rke2-cluster-4-antrea-pool1-gh2nj
        drainBeforeDelete: true
        machineOS: linux
        labels: {}
      - name: pool2
        etcdRole: false
        controlPlaneRole: false
        workerRole: true
        hostnamePrefix: ''
        quantity: 1
        unhealthyNodeTimeout: 0m
        machineConfigRef:
          kind: VmwarevsphereConfig
          name: nc-rke2-cluster-4-antrea-pool2-s5sv4
        drainBeforeDelete: true
        machineOS: linux
        labels: {}
    machineSelectorConfig:
      - config:
          protect-kernel-defaults: false
    registries:
      configs:
        {}
      mirrors:
        {}
    upgradeStrategy:
      controlPlaneConcurrency: '1'
      controlPlaneDrainOptions:
        deleteEmptyDirData: true
        disableEviction: false
        enabled: false
        force: false
        gracePeriod: -1
        ignoreDaemonSets: true
        skipWaitForDeleteTimeoutSeconds: 0
        timeout: 120
      workerConcurrency: '1'
      workerDrainOptions:
        deleteEmptyDirData: true
        disableEviction: false
        enabled: false
        force: false
        gracePeriod: -1
        ignoreDaemonSets: true
        skipWaitForDeleteTimeoutSeconds: 0
        timeout: 120
    machineSelectorConfig:
      - config: {}
  __clone: true
By doing the steps above, I should shortly have a freshly installed Kubernetes cluster deployed using Antrea as my CNI.
# Run kubectl commands inside here
# e.g. kubectl get all
> k get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
cattle-fleet-system fleet-agent-7cf5c7b6cb-cbr4c 1/1 Running 0 35m
cattle-system cattle-cluster-agent-686798b687-jq9kp 1/1 Running 0 36m
cattle-system cattle-cluster-agent-686798b687-q97mx 1/1 Running 0 34m
cattle-system dashboard-shell-6tldq 2/2 Running 0 7s
cattle-system helm-operation-95wr6 0/2 Completed 0 35m
cattle-system helm-operation-jtgzs 0/2 Completed 0 29m
cattle-system helm-operation-qz9bc 0/2 Completed 0 34m
cattle-system rancher-webhook-7d57cd6cb8-4thw7 1/1 Running 0 29m
cattle-system system-upgrade-controller-6f86d6d4df-x46gp 1/1 Running 0 35m
kube-system antrea-agent-9qm2d 2/2 Running 0 37m
kube-system antrea-agent-dr2hz 2/2 Running 1 (33m ago) 34m
kube-system antrea-controller-768b4dbcb5-7srnw 1/1 Running 0 37m
kube-system cloud-controller-manager-rke2-cluster-5-antrea-pool1-fea898fd-t7fhv 1/1 Running 0 38m
kube-system etcd-rke2-cluster-5-antrea-pool1-fea898fd-t7fhv 1/1 Running 0 38m
kube-system helm-install-antrea-vtskp 0/1 Completed 0 34m
kube-system helm-install-rke2-coredns-8lrrq 0/1 Completed 0 37m
kube-system helm-install-rke2-ingress-nginx-4bbr4 0/1 Completed 0 37m
kube-system helm-install-rke2-metrics-server-8xrbh 0/1 Completed 0 37m
kube-system helm-install-rke2-snapshot-controller-crd-rt5hb 0/1 Completed 0 37m
kube-system helm-install-rke2-snapshot-controller-pncxt 0/1 Completed 0 37m
kube-system helm-install-rke2-snapshot-validation-webhook-zv9kl 0/1 Completed 0 37m
kube-system kube-apiserver-rke2-cluster-5-antrea-pool1-fea898fd-t7fhv 1/1 Running 0 38m
kube-system kube-controller-manager-rke2-cluster-5-antrea-pool1-fea898fd-t7fhv 1/1 Running 0 38m
kube-system kube-proxy-rke2-cluster-5-antrea-pool1-fea898fd-t7fhv 1/1 Running 0 38m
kube-system kube-proxy-rke2-cluster-5-antrea-pool2-209a1e89-9sjt7 1/1 Running 0 34m
kube-system kube-scheduler-rke2-cluster-5-antrea-pool1-fea898fd-t7fhv 1/1 Running 0 38m
kube-system rke2-coredns-rke2-coredns-84b9cb946c-6hrfv 1/1 Running 0 37m
kube-system rke2-coredns-rke2-coredns-84b9cb946c-z2bb5 1/1 Running 0 34m
kube-system rke2-coredns-rke2-coredns-autoscaler-b49765765-spj7d 1/1 Running 0 37m
kube-system rke2-ingress-nginx-controller-8p7rz 1/1 Running 0 34m
kube-system rke2-ingress-nginx-controller-zdbvm 0/1 Pending 0 37m
kube-system rke2-metrics-server-655477f655-5n7sp 1/1 Running 0 37m
kube-system rke2-snapshot-controller-59cc9cd8f4-6mtbs 1/1 Running 0 37m
kube-system rke2-snapshot-validation-webhook-54c5989b65-jprqr 1/1 Running 0 37m
Now, with Antrea installed as the CNI, I can go ahead and use the rich set of feature gates Antrea brings to the table.
To enable or disable the Antrea feature gates, either edit the ConfigMap using kubectl edit configmap -n kube-system antrea-config or use the Rancher UI:
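As a rough sketch (the exact gate names depend on the Antrea version), the feature gates live under the featureGates key inside antrea-agent.conf and antrea-controller.conf in that ConfigMap, and the Antrea pods need a restart to pick up changes:
# Edit the ConfigMap; look for the featureGates section, e.g.
#   featureGates:
#     Multicast: true
#     TrafficControl: true
kubectl -n kube-system edit configmap antrea-config
# Restart the Antrea components so the new configuration is loaded
kubectl -n kube-system rollout restart deployment/antrea-controller
kubectl -n kube-system rollout restart daemonset/antrea-agent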
Upgrading or changing version of the Antrea CNI using Rancher
When I deploy Antrea using the Helm Chart Add-On Config, upgrading or changing the version of Antrea is a walk in the park.
In the cluster where I want to upgrade/downgrade Antrea, I first need to add the Antrea Helm repo under Apps -> Repositories:
Then head back to Installed Apps under Apps and click on antrea
Click top right corner:
Follow the wizard and select the version you want, then click next.
Change values (or don't), and click Upgrade:
Sit back and enjoy:
Now I just downgraded from version 2.0.0 to 1.15.1; the same approach of course works for upgrading to a newer version. But why upgrade when one can downgrade? One needs to have some fun in life...
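Since this is just a Helm release under the hood, roughly the same operation can be done with plain Helm against the cluster if the Rancher UI is not at hand. A sketch, assuming kubeconfig access to the cluster and that the release is named antrea in kube-system, as in my deployment:
helm repo add antrea https://charts.antrea.io
helm repo update
# Upgrade (or downgrade) the existing release to the wanted chart version
helm upgrade antrea antrea/antrea -n kube-system --version 1.15.1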
In the Installed Apps view, Rancher will notify me if there is a new release available. This is a nice indicator whenever a new release is out:
NeuVector
NeuVector is new to me, but a quick look at what it does sparked my interest. From the official docs page:
NeuVector provides a powerful end-to-end container security platform. This includes end-to-end vulnerability scanning and complete run-time protection for containers, pods and hosts, including:
- CI/CD Vulnerability Management & Admission Control. Scan images with a Jenkins plug-in, scan registries, and enforce admission control rules for deployments into production.
- Violation Protection. Discovers behavior and creates a whitelist based policy to detect violations of normal behavior.
- Threat Detection. Detects common application attacks such as DDoS and DNS attacks on containers.
- DLP and WAF Sensors. Inspect network traffic for Data Loss Prevention of sensitive data, and detect common OWASP Top10 WAF attacks.
- Run-time Vulnerability Scanning. Scans registries, images and running containers orchestration platforms and hosts for common (CVE) as well as application specific vulnerabilities.
- Compliance & Auditing. Runs Docker Bench tests and Kubernetes CIS Benchmarks automatically.
- Endpoint/Host Security. Detects privilege escalations, monitors processes and file activity on hosts and within containers, and monitors container file systems for suspicious activity.
- Multi-cluster Management. Monitor and manage multiple Kubernetes clusters from a single console.
Other features of NeuVector include the ability to quarantine containers and to export logs through SYSLOG and webhooks, initiate packet capture for investigation, and integration with OpenShift RBACs, LDAP, Microsoft AD, and SSO with SAML. Note: Quarantine means that all network traffic is blocked. The container will remain and continue to run - just without any network connections. Kubernetes will not start up a container to replace a quarantined container, as the api-server is still able to reach the container.
If I head over to any of my clusters in Rancher under Apps -> Charts I will find a set of ready-to-install applications. There I also find NeuVector:
Let's try an installation of NeuVector. I will keep everything at default for now. As one can imagine, there are some configurations that should be done if deploying it in production, like Ingress, certificate, PVC etc. By deploying with all default values it will be exposed using a NodePort, with a self-signed certificate and no PVC.
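For reference, the chart Rancher ships is also published in NeuVector's own Helm repository, so the default values can be inspected with plain Helm as well. The repo URL and chart name below are my assumptions based on the NeuVector project, not something taken from the Rancher UI:
helm repo add neuvector https://neuvector.github.io/neuvector-helm/
helm repo update
helm search repo neuvector
helm show values neuvector/core    # assumed chart name - check the search output
Back in Rancher, after clicking Install with everything at default, the NeuVector pods show up after a short while: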
> k get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
cattle-neuvector-system neuvector-controller-pod-5b8b6f7df-cctd7 1/1 Running 0 53s
cattle-neuvector-system neuvector-controller-pod-5b8b6f7df-ql5dz 1/1 Running 0 53s
cattle-neuvector-system neuvector-controller-pod-5b8b6f7df-tmb77 1/1 Running 0 53s
cattle-neuvector-system neuvector-enforcer-pod-lnsfl 1/1 Running 0 53s
cattle-neuvector-system neuvector-enforcer-pod-m4gjp 1/1 Running 0 53s
cattle-neuvector-system neuvector-enforcer-pod-wr5j4 1/1 Running 0 53s
cattle-neuvector-system neuvector-manager-pod-779f9976f8-cmlh5 1/1 Running 0 53s
cattle-neuvector-system neuvector-scanner-pod-797cc57f7d-8lztx 1/1 Running 0 53s
cattle-neuvector-system neuvector-scanner-pod-797cc57f7d-8qhr7 1/1 Running 0 53s
cattle-neuvector-system neuvector-scanner-pod-797cc57f7d-gcbr9 1/1 Running 0 53s
NeuVector is up and running.
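With the default values the manager UI is published as a NodePort service, so listing the services reveals where to point the browser (service names may differ slightly between chart versions):
kubectl -n cattle-neuvector-system get svc
# Then browse to https://<any-node-ip>:<nodeport> of the NeuVector manager service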
Let's have a quick look inside NeuVector.
The default password is admin.
First glance:
Dashboard
Network Activity
Network Rules
Settings
For more information head over to the official NeuVector documentation page here
NeuVector Rancher UI Extension
NeuVector can also be installed as an extension in Rancher:
If I now head into the cluster where I have deployed NeuVector, I get a great overview from NeuVector directly in the Rancher UI:
That is very neat. And again, it was just one click to enable this extension.
Monitoring
Keeping track of what is going on across the whole Kubernetes estate is crucial, everything from audit logs to performance and troubleshooting.
I will try out the Monitoring app that is available under Monitoring tools, or Cluster tools here:
Again, using default values:
SUCCESS: helm upgrade --install=true --namespace=cattle-monitoring-system --timeout=10m0s --values=/home/shell/helm/values-rancher-monitoring-103.1.0-up45.31.1.yaml --version=103.1.0+up45.31.1 --wait=true rancher-monitoring /home/shell/helm/rancher-monitoring-103.1.0-up45.31.1.tgz
2024-05-11T08:19:06.403230781Z ---------------------------------------------------------------------
Now I can suddenly access it all from the Monitoring section within my cluster:
Notice the Active Alerts.
From the Cluster Dashboard:
Let's see what's going on in, and with, my cluster:
AlertManager
Grafana
A bunch of provided dashboards:
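If I want to reach Grafana without going through the Rancher UI proxy, a plain port-forward works too. The service name below is what the rancher-monitoring chart created in my cluster, so treat it as an assumption and check with kubectl get svc first:
kubectl -n cattle-monitoring-system get svc
kubectl -n cattle-monitoring-system port-forward svc/rancher-monitoring-grafana 3000:80
# Grafana is then available on http://localhost:3000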
The Rancher API
One can interact with Rancher not only through the great Rancher UI, but also using the CLI, kubectl and external applications.
Let's start with the CLI. The binary can be downloaded from the Rancher UI by going to About, here:
I download the tar file for Mac, extract it using tar -zxvf rancher-darwin-amd64-v2.8.3.tar.gz and then copy/move it to the /usr/local/bin/ folder.
Before I can use the CLI to connect to my Rancher endpoint I need to create an API token here:
No scopes, as I want access to them all.
Now I need to copy these keys so I have them. They will expire in 90 days. If I lose them I have to create new ones.
andreasm@linuxmgmt01:~/rancher$ rancher login https://rancher-dev.my-domain.net --token token-m8k88:cr89kvdq4xfsl4pcnn4rh5xkdcw2rhvz46cxlrpw29qblqxbsnz
NUMBER CLUSTER NAME PROJECT ID PROJECT NAME PROJECT DESCRIPTION
1 rke2-cluster-3-ako c-m-6h64tt4t:p-dt4qw Default Default project created for the cluster
2 rke2-cluster-3-ako c-m-6h64tt4t:p-rpqhc System System project created for the cluster
3 rke2-cluster-1 c-m-d85dvj2l:p-gpczf System System project created for the cluster
4 rke2-cluster-1 c-m-d85dvj2l:p-l6cln Default Default project created for the cluster
5 rke2-cluster-2-vsphere c-m-ddljkdhx:p-57cpk Default Default project created for the cluster
6 rke2-cluster-2-vsphere c-m-ddljkdhx:p-fzdtp System System project created for the cluster
7 k8s-prod-cluster-2 c-m-pjl4pxfl:p-8gc9x System System project created for the cluster
8 k8s-prod-cluster-2 c-m-pjl4pxfl:p-f2kkb Default Default project created for the cluster
9 tanzuz-cluster-1 c-m-qxpbc6tp:p-mk49f System System project created for the cluster
10 tanzuz-cluster-1 c-m-qxpbc6tp:p-s428w Default Default project created for the cluster
11 rke2-cluster-proxmox-1 c-m-tllp6sd6:p-nrxgl Default Default project created for the cluster
12 rke2-cluster-proxmox-1 c-m-tllp6sd6:p-xsmbd System System project created for the cluster
13 local local:p-tqk2t Default Default project created for the cluster
14 local local:p-znt9n System System project created for the cluster
Select a Project:
Selecting project rke2-cluster-2-vsphere ID 5:
Select a Project:5
INFO[0123] Saving config to /Users/andreasm/.rancher/cli2.json
andreasm@linuxmgmt01:~/rancher$ rancher kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
calico-system calico-kube-controllers-7fdf877475-svqt8 1/1 Running 0 22h
# Switch between contexts
andreasm@linuxmgmt01:~/rancher$ rancher context switch
NUMBER CLUSTER NAME PROJECT ID PROJECT NAME PROJECT DESCRIPTION
1 rke2-cluster-3-ako c-m-6h64tt4t:p-dt4qw Default Default project created for the cluster
2 rke2-cluster-3-ako c-m-6h64tt4t:p-rpqhc System System project created for the cluster
3 rke2-cluster-1 c-m-d85dvj2l:p-gpczf System System project created for the cluster
4 rke2-cluster-1 c-m-d85dvj2l:p-l6cln Default Default project created for the cluster
5 rke2-cluster-2-vsphere c-m-ddljkdhx:p-57cpk Default Default project created for the cluster
6 rke2-cluster-2-vsphere c-m-ddljkdhx:p-fzdtp System System project created for the cluster
7 k8s-prod-cluster-2 c-m-pjl4pxfl:p-8gc9x System System project created for the cluster
8 k8s-prod-cluster-2 c-m-pjl4pxfl:p-f2kkb Default Default project created for the cluster
9 tanzuz-cluster-1 c-m-qxpbc6tp:p-mk49f System System project created for the cluster
10 tanzuz-cluster-1 c-m-qxpbc6tp:p-s428w Default Default project created for the cluster
11 rke2-cluster-proxmox-1 c-m-tllp6sd6:p-nrxgl Default Default project created for the cluster
12 rke2-cluster-proxmox-1 c-m-tllp6sd6:p-xsmbd System System project created for the cluster
13 local local:p-tqk2t Default Default project created for the cluster
14 local local:p-znt9n System System project created for the cluster
Select a Project:
More information can be found on the CLI reference page here.
More information on the API can be found here.
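The same bearer token created above also works directly against the Rancher v3 REST API, which is how external applications would typically talk to Rancher. A minimal sketch, using the host from my environment and the token in placeholder form:
# List the clusters Rancher manages
curl -sk -H "Authorization: Bearer token-xxxxx:<secret>" \
  https://rancher-dev.my-domain.net/v3/clusters | jq '.data[].name'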
kubectl
Using kubectl without the Rancher CLI is of course possible. Just grab the kubeconfig contexts for the clusters you have access to from the Rancher UI, copy or download them into .kube/config (or point to them as below), and interact with the Rancher-managed Kubernetes clusters as if they were any other Kubernetes cluster:
andreasm@linuxmgmt01:~/rancher$ k --kubeconfig rke-ako.yaml get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
avi-system ako-0 1/1 Running 0 24h
calico-system calico-kube-controllers-67f857444-qwtd7 1/1 Running 0 25h
calico-system calico-node-4k6kl 1/1 Running 0 25h
calico-system calico-node-wlzpt 1/1 Running 0 25h
calico-system calico-typha-75d7fbfc6c-bhgxj 1/1 Running 0 25h
cattle-fleet-system fleet-agent-7cf5c7b6cb-kqgqm 1/1 Running 0 25h
cattle-system cattle-cluster-agent-8548cbfbb7-b4t5f 1/1 Running 0 25h
cattle-system cattle-cluster-agent-8548cbfbb7-hvdcx 1/1 Running 0 25h
cattle-system rancher-webhook-7d57cd6cb8-gwvr4 1/1 Running 0 25h
cattle-system system-upgrade-controller-6f86d6d4df-zg7lv 1/1 Running 0 25h
fruit apple-app 1/1 Running 0 24h
fruit banana-app 1/1 Running 0 24h
kube-system cloud-controller-manager-rke2-cluster-3-ako-pool1-b6cddca0-tf84w 1/1 Running 0 25h
kube-system etcd-rke2-cluster-3-ako-pool1-b6cddca0-tf84w 1/1 Running 0 25h
kube-system helm-install-rke2-calico-crd-w2skj 0/1 Completed 0 25h
kube-system helm-install-rke2-calico-lgcdp 0/1 Completed 1 25h
kube-system helm-install-rke2-coredns-xlrtx 0/1 Completed 0 25h
kube-system helm-install-rke2-metrics-server-rb7qj 0/1 Completed 0 25h
kube-system helm-install-rke2-snapshot-controller-crd-bhrq5 0/1 Completed 0 25h
kube-system helm-install-rke2-snapshot-controller-wlftf 0/1 Completed 0 25h
kube-system helm-install-rke2-snapshot-validation-webhook-jmnm2 0/1 Completed 0 25h
kube-system kube-apiserver-rke2-cluster-3-ako-pool1-b6cddca0-tf84w 1/1 Running 0 25h
kube-system kube-controller-manager-rke2-cluster-3-ako-pool1-b6cddca0-tf84w 1/1 Running 0 25h
kube-system kube-proxy-rke2-cluster-3-ako-pool1-b6cddca0-tf84w 1/1 Running 0 25h
kube-system kube-proxy-rke2-cluster-3-ako-pool2-50bb46e0-tj2kp 1/1 Running 0 25h
kube-system kube-scheduler-rke2-cluster-3-ako-pool1-b6cddca0-tf84w 1/1 Running 0 25h
kube-system rke2-coredns-rke2-coredns-84b9cb946c-jj2gk 1/1 Running 0 25h
kube-system rke2-coredns-rke2-coredns-84b9cb946c-wq4hs 1/1 Running 0 25h
kube-system rke2-coredns-rke2-coredns-autoscaler-b49765765-c86ps 1/1 Running 0 25h
kube-system rke2-metrics-server-544c8c66fc-hkdh9 1/1 Running 0 25h
kube-system rke2-snapshot-controller-59cc9cd8f4-np4lt 1/1 Running 0 25h
kube-system rke2-snapshot-validation-webhook-54c5989b65-9zk7w 1/1 Running 0 25h
tigera-operator tigera-operator-779488d655-7x25x 1/1 Running 0 25h
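When juggling several downloaded kubeconfigs, they can also be used side by side without merging them into .kube/config by putting them on the KUBECONFIG path. The file and context names below are just examples from my setup:
export KUBECONFIG=~/rancher/rke-ako.yaml:~/rancher/rke2-cluster-2-vsphere.yaml
kubectl config get-contexts
kubectl config use-context rke2-cluster-3-ako   # context name as defined in the downloaded kubeconfig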
Authentication and RBAC with Rancher
Rancher provides robust authentication and RBAC out of the box. For more information on authentication and RBAC in Rancher, head over here.
Rancher can integrate with the following external authentication services, in addition to local user authentication:
- Microsoft Active Directory
- Microsoft Azure AD
- Microsoft AD FS
- GitHub
- FreeIPA
- OpenLDAP
- PingIdentity
- KeyCloak (OIDC)
- KeyCloak (SAML)
- Okta
- Google OAuth
- Shibboleth
RBAC is an important part in managing a Kubernetes estate. I will cover RBAC in Rancher in more depth in another blog post.
Outro
My experience with Rancher has been very positive, well beyond the expectations I had before I started writing this post. Rancher just feels like a well-thought-through product, solid and complete. It brings a lot of built-in features and tested applications, yet it does not stop me from doing things differently or adding my own "custom" things to it. Delivering a product with so many features out of the box, without setting too many limitations on what I can do "outside the box", is not an easy achievement, but Rancher gives me the feeling they have achieved it. Whatever feature I wanted, Rancher had it, and at the same time it had many features I had not even thought of.
A very powerful Kubernetes management platform, open source and freely available.
Stay tuned for some follow-up posts where I deep-dive into specific topics of Rancher. This will be fun.