Tanzu Kubernetes Grid 2.1
Overview
Tanzu Kubernetes Grid
This post will go through how to deploy TKG 2.1: the management cluster, a workload cluster (or two), and the necessary preparations on the underlying infrastructure to support TKG 2.1. In this post I will use vSphere 8 with vSAN, Avi LoadBalancer, and NSX. So what we want to end up with is something like this:
Preparations before deployment
This post will assume the following:
- vSphere is already installed and configured. See more info here and here
- NSX has already been configured (see this post for how to configure NSX). The segments used for the management cluster and the workload clusters should have a DHCP server available; strictly speaking, only the management cluster needs DHCP, the workload clusters do not. NSX can provide the DHCP server functionality for this. *
- NSX Advanced LoadBalancer has been deployed (and configured with an NSX cloud). See this post for how to configure this. **
- Import the VM template for TKG, see here
- A dedicated Linux machine/VM we can use as the bootstrap host, with the Tanzu CLI installed. See more info here
(*) TKG 2.1 is not tied to NSX in the same way as TKGs, so we can choose to use NSX for security only or the full stack with networking and security. The built-in NSX load balancer will not be used; I will use the NSX Advanced Load Balancer (Avi).
(**) I want to use the NSX cloud in Avi as it gives several benefits, such as integration with the NSX Manager, where Avi automatically creates security groups, tags and services that can easily be used in security policy creation, and automatic "route plumbing" for the VIPs.
TKG Management cluster - deployment
The first step, after all the pre-requirements have been met, is to prepare a bootstrap yaml for the management cluster. I will post an example file here and go through what the different fields mean, why I have configured them, and why I have uncommented some of them. Start by logging into the bootstrap machine, or, if you decide to create the bootstrap yaml somewhere else, go ahead; we just need to copy it over to the bootstrap machine when we are ready to create the management cluster.
To get started with a bootstrap yaml file we can either grab an example from here, or use the default config that ships with the Tanzu CLI; on your bootstrap machine there is a folder which contains a default config you can start out with:
1andreasm@tkg-bootstrap:~/.config/tanzu/tkg/providers$ ll
2total 120
3drwxrwxr-x 18 andreasm andreasm 4096 Mar 24 09:10 ./
4drwx------ 9 andreasm andreasm 4096 Mar 16 11:32 ../
5drwxrwxr-x 2 andreasm andreasm 4096 Mar 16 06:52 ako/
6drwxrwxr-x 3 andreasm andreasm 4096 Mar 16 06:52 bootstrap-kubeadm/
7drwxrwxr-x 4 andreasm andreasm 4096 Mar 16 06:52 cert-manager/
8drwxrwxr-x 3 andreasm andreasm 4096 Mar 16 06:52 cluster-api/
9-rw------- 1 andreasm andreasm 1293 Mar 16 06:52 config.yaml
10-rw------- 1 andreasm andreasm 32007 Mar 16 06:52 config_default.yaml
11drwxrwxr-x 3 andreasm andreasm 4096 Mar 16 06:52 control-plane-kubeadm/
12drwxrwxr-x 5 andreasm andreasm 4096 Mar 16 06:52 infrastructure-aws/
13drwxrwxr-x 5 andreasm andreasm 4096 Mar 16 06:52 infrastructure-azure/
14drwxrwxr-x 6 andreasm andreasm 4096 Mar 16 06:52 infrastructure-docker/
15drwxrwxr-x 3 andreasm andreasm 4096 Mar 16 06:52 infrastructure-ipam-in-cluster/
16drwxrwxr-x 5 andreasm andreasm 4096 Mar 16 06:52 infrastructure-oci/
17drwxrwxr-x 4 andreasm andreasm 4096 Mar 16 06:52 infrastructure-tkg-service-vsphere/
18drwxrwxr-x 5 andreasm andreasm 4096 Mar 16 06:52 infrastructure-vsphere/
19drwxrwxr-x 2 andreasm andreasm 4096 Mar 16 06:52 kapp-controller-values/
20-rwxrwxr-x 1 andreasm andreasm 64 Mar 16 06:52 providers.sha256sum*
21-rw------- 1 andreasm andreasm 0 Mar 16 06:52 v0.28.0
22-rw------- 1 andreasm andreasm 747 Mar 16 06:52 vendir.lock.yml
23-rw------- 1 andreasm andreasm 903 Mar 16 06:52 vendir.yml
24drwxrwxr-x 8 andreasm andreasm 4096 Mar 16 06:52 ytt/
25drwxrwxr-x 2 andreasm andreasm 4096 Mar 16 06:52 yttcb/
26drwxrwxr-x 7 andreasm andreasm 4096 Mar 16 06:52 yttcc/
27andreasm@tkg-bootstrap:~/.config/tanzu/tkg/providers$
The file you should be looking for is called config_default.yaml. It is a smart choice to use this as a starting point, as it includes the latest config parameters matching the TKG version (Tanzu CLI) you have downloaded.
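A minimal way to grab a working copy of it could look like this (the target folder and file name are just examples):
mkdir -p ~/tkg-configs
cp ~/.config/tanzu/tkg/providers/config_default.yaml ~/tkg-configs/tkg-mgmt-bootstrap.yaml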
Having copied the file to a folder of your preference, start editing it. Below is a copy of an example I am using:
1#! ---------------
2#! Basic config
3#! -------------
4CLUSTER_NAME: tkg-stc-mgmt-cluster #Name of the TKG mgmt cluster
5CLUSTER_PLAN: dev #dev or prod, defines the number of control plane nodes in the mgmt cluster
6INFRASTRUCTURE_PROVIDER: vsphere #We are deploying on vSphere, could be AWS, Azure
7ENABLE_CEIP_PARTICIPATION: "false" #Customer Experience Improvement Program - set to true if you will participate
8ENABLE_AUDIT_LOGGING: "false" #Audit logging should be true in production environments
9CLUSTER_CIDR: 100.96.0.0/11 #Kubernetes Cluster CIDR
10SERVICE_CIDR: 100.64.0.0/13 #Kubernetes Services CIDR
11TKG_IP_FAMILY: ipv4 #ipv4 or ipv6
12DEPLOY_TKG_ON_VSPHERE7: "true" #Yes to deploy standalone tkg mgmt cluster on vSphere
13
14#! ---------------
15#! vSphere config
16#! -------------
17VSPHERE_DATACENTER: /cPod-NSXAM-STC #Name of vSphere Datacenter
18VSPHERE_DATASTORE: /cPod-NSXAM-STC/datastore/vsanDatastore #Name and path of vSphere datastore to be used
19VSPHERE_FOLDER: /cPod-NSXAM-STC/vm/TKGm #Name and path to VM folder
20VSPHERE_INSECURE: "false" #True if you don't want to verify the vCenter thumbprint below
21VSPHERE_NETWORK: /cPod-NSXAM-STC/network/ls-tkg-mgmt #A network portgroup (VDS or NSX Segment) for VM placement
22VSPHERE_CONTROL_PLANE_ENDPOINT: "" #Required if using Kube-Vip, I am using Avi Loadbalancer for this
23VSPHERE_PASSWORD: "password" #vCenter account password for account defined below
24VSPHERE_RESOURCE_POOL: /cPod-NSXAM-STC/host/Cluster/Resources #If you want to use a specific vSphere Resource Pool for the mgmt cluster. Leave it as is if not.
25VSPHERE_SERVER: vcsa.cpod-nsxam-stc.az-stc.cloud-garage.net #DNS record to vCenter Server
26VSPHERE_SSH_AUTHORIZED_KEY: ssh-rsa sdfgasdgadfgsdg sdfsdf@sdfsdf.net # your bootstrap machine's SSH public key
27VSPHERE_TLS_THUMBPRINT: 22:FD # Your vCenter SHA1 Thumbprint
28VSPHERE_USERNAME: user@vspheresso/or/ad/user/domain #A user with the correct permissions
29
30#! ---------------
31#! Node config
32#! -------------
33OS_ARCH: amd64
34OS_NAME: ubuntu
35OS_VERSION: "20.04"
36VSPHERE_CONTROL_PLANE_DISK_GIB: "20"
37VSPHERE_CONTROL_PLANE_MEM_MIB: "4096"
38VSPHERE_CONTROL_PLANE_NUM_CPUS: "2"
39VSPHERE_WORKER_DISK_GIB: "20"
40VSPHERE_WORKER_MEM_MIB: "4096"
41VSPHERE_WORKER_NUM_CPUS: "2"
42CONTROL_PLANE_MACHINE_COUNT: 1
43WORKER_MACHINE_COUNT: 2
44
45#! ---------------
46#! Avi config
47#! -------------
48AVI_CA_DATA_B64: #Base64 of the Avi Certificate
49AVI_CLOUD_NAME: stc-nsx-cloud #Name of the cloud defined in Avi
50AVI_CONTROL_PLANE_HA_PROVIDER: "true" #True as we want to use Avi as K8s API endpoint
51AVI_CONTROLLER: 172.24.3.50 #IP or Hostname Avi controller or controller cluster
52# Network used to place workload clusters' endpoint VIPs - If you want to use a separate vip for Workload clusters Kubernetes API endpoint
53AVI_CONTROL_PLANE_NETWORK: vip-tkg-wld-l4 #Corresponds with network defined in Avi
54AVI_CONTROL_PLANE_NETWORK_CIDR: 10.13.102.0/24 #Corresponds with network defined in Avi
55# Network used to place workload clusters' services external IPs (load balancer & ingress services)
56AVI_DATA_NETWORK: vip-tkg-wld-l7 #Corresponds with network defined in Avi
57AVI_DATA_NETWORK_CIDR: 10.13.103.0/24 #Corresponds with network defined in Avi
58# Network used to place management clusters' services external IPs (load balancer & ingress services)
59AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_CIDR: 10.13.101.0/24 #Corresponds with network defined in Avi
60AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_NAME: vip-tkg-mgmt-l7 #Corresponds with network defined in Avi
61# Network used to place management clusters' endpoint VIPs
62AVI_MANAGEMENT_CLUSTER_CONTROL_PLANE_VIP_NETWORK_NAME: vip-tkg-mgmt-l4 #Corresponds with network defined in Avi
63AVI_MANAGEMENT_CLUSTER_CONTROL_PLANE_VIP_NETWORK_CIDR: 10.13.100.0/24 #Corresponds with network defined in Avi
64AVI_NSXT_T1LR: /infra/tier-1s/Tier-1 #Path to the NSX T1 you have configured, click on three dots in NSX on the T1 to get the full path.
65AVI_CONTROLLER_VERSION: 22.1.2 #Latest supported version of Avi for TKG 2.1
66AVI_ENABLE: "true" # Enables Avi as Loadbalancer for workloads
67AVI_LABELS: "" #When used, Avi is enabled only on workload clusters with the corresponding label
68AVI_PASSWORD: "password" #Password for the account used in Avi, username defined below
69AVI_SERVICE_ENGINE_GROUP: stc-nsx #Service Engine group for Workload clusters if you want to have separate groups for Workload clusters and Management cluster
70AVI_MANAGEMENT_CLUSTER_SERVICE_ENGINE_GROUP: tkgm-se-group #Dedicated Service Engine group for management cluster
71AVI_USERNAME: admin
72AVI_DISABLE_STATIC_ROUTE_SYNC: true #Set to true when the Service Engines do not need routes to the pod network (e.g. when using NodePortLocal)
73AVI_INGRESS_DEFAULT_INGRESS_CONTROLLER: true #If you want to use AKO as default ingress controller, false if you plan to use other ingress controllers also.
74AVI_INGRESS_SHARD_VS_SIZE: SMALL #Decides the number of shared virtual services used for ingress sharding
75AVI_INGRESS_SERVICE_TYPE: NodePortLocal #NodePortLocal only when using Antrea, otherwise NodePort or ClusterIP
76AVI_CNI_PLUGIN: antrea
77
78#! ---------------
79#! Proxy config
80#! -------------
81TKG_HTTP_PROXY_ENABLED: "false"
82
83#! ---------------------------------------------------------------------
84#! Antrea CNI configuration
85#! ---------------------------------------------------------------------
86# ANTREA_NO_SNAT: false
87# ANTREA_TRAFFIC_ENCAP_MODE: "encap"
88# ANTREA_PROXY: false
89# ANTREA_POLICY: true
90# ANTREA_TRACEFLOW: false
91ANTREA_NODEPORTLOCAL: true
92ANTREA_PROXY: true
93ANTREA_ENDPOINTSLICE: true
94ANTREA_POLICY: true
95ANTREA_TRACEFLOW: true
96ANTREA_NETWORKPOLICY_STATS: false
97ANTREA_EGRESS: true
98ANTREA_IPAM: false
99ANTREA_FLOWEXPORTER: false
100ANTREA_SERVICE_EXTERNALIP: false
101ANTREA_MULTICAST: false
102
103#! ---------------------------------------------------------------------
104#! Machine Health Check configuration
105#! ---------------------------------------------------------------------
106ENABLE_MHC: "true"
107ENABLE_MHC_CONTROL_PLANE: true
108ENABLE_MHC_WORKER_NODE: true
109MHC_UNKNOWN_STATUS_TIMEOUT: 5m
110MHC_FALSE_STATUS_TIMEOUT: 12m
111
112#! ---------------------------------------------------------------------
113#! Identity management configuration
114#! ---------------------------------------------------------------------
115
116IDENTITY_MANAGEMENT_TYPE: none #I have disabled this, use kubeconfig instead
117#LDAP_BIND_DN: CN=Andreas M,OU=Users,OU=GUZWARE,DC=guzware,DC=local
118#LDAP_BIND_PASSWORD: <encoded:UHNAc=>
119#LDAP_GROUP_SEARCH_BASE_DN: DC=guzware,DC=local
120#LDAP_GROUP_SEARCH_FILTER: (objectClass=group)
121#LDAP_GROUP_SEARCH_GROUP_ATTRIBUTE: member
122#LDAP_GROUP_SEARCH_NAME_ATTRIBUTE: cn
123#LDAP_GROUP_SEARCH_USER_ATTRIBUTE: distinguishedName
124#LDAP_HOST: guzad07.guzware.local:636
125#LDAP_ROOT_CA_DATA_B64: LS0tLS1CRUd
126#LDAP_USER_SEARCH_BASE_DN: DC=guzware,DC=local
127#LDAP_USER_SEARCH_FILTER: (objectClass=person)
128#LDAP_USER_SEARCH_NAME_ATTRIBUTE: uid
129#LDAP_USER_SEARCH_USERNAME: uid
130#OIDC_IDENTITY_PROVIDER_CLIENT_ID: ""
131#OIDC_IDENTITY_PROVIDER_CLIENT_SECRET: ""
132#OIDC_IDENTITY_PROVIDER_GROUPS_CLAIM: ""
133#OIDC_IDENTITY_PROVIDER_ISSUER_URL: ""
134#OIDC_IDENTITY_PROVIDER_NAME: ""
135#OIDC_IDENTITY_PROVIDER_SCOPES: ""
136#OIDC_IDENTITY_PROVIDER_USERNAME_CLAIM: ""
For additional explanations of the different values, see here.
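One value that is easy to get wrong is AVI_CA_DATA_B64, which expects the Avi controller certificate base64-encoded as a single line. Assuming you have saved the controller certificate as avi-ca.crt (a hypothetical file name), it could be generated on the bootstrap machine like this:
# produce a single-line base64 string and paste it into AVI_CA_DATA_B64
base64 -w 0 avi-ca.crt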
When you feel you are ready with the bootstrap yaml file, it's time to deploy the management cluster. From your bootstrap machine, where the Tanzu CLI has been installed, enter the following command:
1tanzu mc create --file path/to/cluster-config-file.yaml
For more information about this process, have a look here.
The first thing that happens is a set of validation checks; if those pass, it will continue to build a local bootstrap cluster on your bootstrap machine before building the TKG management cluster in your vSphere cluster.
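The local bootstrap cluster is a temporary kind cluster running in Docker on the bootstrap machine. If you are curious, you can peek at it with standard Docker commands while the deployment runs; the container name will vary, but it typically starts with tkg-kind:
# the kind-based bootstrap cluster shows up as a regular container
docker ps --filter "name=tkg-kind"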
Note! If the computer you are accessing the bootstrap machine from uses an IP range within 172.16.0.0/12, you should edit the default Docker network, otherwise you will lose the connection to your bootstrap machine. This is done like this:
Add, or edit if it already exists, the /etc/docker/daemon.json file with the following content:
1{
2 "default-address-pools":
3 [
4 {"base":"192.168.0.0/16","size":24}
5 ]
6}
Restart the Docker service or reboot the machine.
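On a typical systemd-based bootstrap host, restarting Docker and confirming that new networks are allocated from the new pool could look like this (the test network name is arbitrary):
sudo systemctl restart docker
# any newly created network should now get a subnet from 192.168.0.0/16
docker network create testnet
docker network inspect testnet | grep Subnet
docker network rm testnet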
Now, back to the tanzu mc create process: you can monitor the progress from the terminal of your bootstrap machine, and after a while you should see machines being cloned from your template and powered on. In the Avi controller you should also see a new virtual service being created:
The IP address depicted above is the sole control plane node, as I am deploying a TKG management cluster using plan dev. When the progress in your bootstrap machine indicates that it is done, you can check the status with the following command:
1tanzu mc get
This will give you output like this:
1 NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES PLAN TKR
2 tkg-stc-mgmt-cluster tkg-system running 1/1 2/2 v1.24.9+vmware.1 management dev v1.24.9---vmware.1-tkg.1
3
4
5Details:
6
7NAME READY SEVERITY REASON SINCE MESSAGE
8/tkg-stc-mgmt-cluster True 8d
9├─ClusterInfrastructure - VSphereCluster/tkg-stc-mgmt-cluster-xw6xs True 8d
10├─ControlPlane - KubeadmControlPlane/tkg-stc-mgmt-cluster-wrxtl True 8d
11│ └─Machine/tkg-stc-mgmt-cluster-wrxtl-gkv5m True 8d
12└─Workers
13 └─MachineDeployment/tkg-stc-mgmt-cluster-md-0-vs9dc True 3d3h
14 ├─Machine/tkg-stc-mgmt-cluster-md-0-vs9dc-55c649d9fc-gnpz4 True 8d
15 └─Machine/tkg-stc-mgmt-cluster-md-0-vs9dc-55c649d9fc-gwfvt True 8d
16
17
18Providers:
19
20 NAMESPACE NAME TYPE PROVIDERNAME VERSION WATCHNAMESPACE
21 caip-in-cluster-system infrastructure-ipam-in-cluster InfrastructureProvider ipam-in-cluster v0.1.0
22 capi-kubeadm-bootstrap-system bootstrap-kubeadm BootstrapProvider kubeadm v1.2.8
23 capi-kubeadm-control-plane-system control-plane-kubeadm ControlPlaneProvider kubeadm v1.2.8
24 capi-system cluster-api CoreProvider cluster-api v1.2.8
25 capv-system infrastructure-vsphere InfrastructureProvider vsphere v1.5.1
When the cluster has been deployed, and before we can access it with the kubectl CLI tool, we must set the context to it.
1kubectl config use-context my-mgmnt-cluster-admin@my-mgmnt-cluster
But you probably have a dedicated workstation you want to access the cluster from; in that case you can export the kubeconfig like this:
1tanzu mc kubeconfig get --admin --export-file MC-ADMIN-KUBECONFIG
Now copy the file to your workstation and access the cluster from there.
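As an example, copying it to a workstation and pointing kubectl at it could look like this (host name and paths are placeholders):
scp tkg-bootstrap:~/MC-ADMIN-KUBECONFIG ~/.kube/tkg-mgmt-kubeconfig
export KUBECONFIG=~/.kube/tkg-mgmt-kubeconfig
kubectl config use-context tkg-stc-mgmt-cluster-admin@tkg-stc-mgmt-cluster
kubectl get nodes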
Tip! Test out this tool to easily manage your Kubernetes configs: https://github.com/sunny0826/kubecm
The above is a really great tool:
1amarqvardsen@amarqvards1MD6T:~$ kubecm switch --ui-size 10
2Use the arrow keys to navigate: ↓ ↑ → ← and / toggles search
3Select Kube Context
4 😼 tkc-cluster-1(*)
5 tkgs-cluster-1-admin@tkgs-cluster-1
6 wdc-2-tkc-cluster-1
7 10.13.200.2
8 andreasmk8slab-admin@andreasmk8slab-pinniped
9 ns-wdc-3
10 tkc-cluster-1-routed
11 tkg-mgmt-cluster-admin@tkg-mgmt-cluster
12 stc-tkgm-mgmt-cluster
13↓ tkg-wld-1-cluster-admin@tkg-wld-1-cluster
14
15--------- Info ----------
16Name: tkc-cluster-1
17Cluster: 10.13.202.1
18User: wcp:10.13.202.1:andreasm@cpod-nsxam-stc.az-stc.cloud-garage.net
Now your TKG management cluster is ready and we can deploy a workload cluster.
If you noticed some warnings about reconciliation during deployment, you can check whether any packages actually failed by issuing the following command once your kubeconfig context points to the management cluster:
1andreasm@tkg-bootstrap:~$ kubectl get pkgi -A
2NAMESPACE NAME PACKAGE NAME PACKAGE VERSION DESCRIPTION AGE
3stc-tkgm-ns-1 stc-tkgm-wld-cluster-1-kapp-controller kapp-controller.tanzu.vmware.com 0.41.5+vmware.1-tkg.1 Reconcile succeeded 7d22h
4stc-tkgm-ns-2 stc-tkgm-wld-cluster-2-kapp-controller kapp-controller.tanzu.vmware.com 0.41.5+vmware.1-tkg.1 Reconcile succeeded 7d16h
5tkg-system ako-operator ako-operator-v2.tanzu.vmware.com 0.28.0+vmware.1-tkg.1-zshippable Reconcile succeeded 8d
6tkg-system tanzu-addons-manager addons-manager.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
7tkg-system tanzu-auth tanzu-auth.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
8tkg-system tanzu-cliplugins cliplugins.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
9tkg-system tanzu-core-management-plugins core-management-plugins.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
10tkg-system tanzu-featuregates featuregates.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
11tkg-system tanzu-framework framework.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
12tkg-system tkg-clusterclass tkg-clusterclass.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
13tkg-system tkg-clusterclass-vsphere tkg-clusterclass-vsphere.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
14tkg-system tkg-pkg tkg.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
15tkg-system tkg-stc-mgmt-cluster-antrea antrea.tanzu.vmware.com 1.7.2+vmware.1-tkg.1-advanced Reconcile succeeded 8d
16tkg-system tkg-stc-mgmt-cluster-capabilities capabilities.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
17tkg-system tkg-stc-mgmt-cluster-load-balancer-and-ingress-service load-balancer-and-ingress-service.tanzu.vmware.com 1.8.2+vmware.1-tkg.1 Reconcile succeeded 8d
18tkg-system tkg-stc-mgmt-cluster-metrics-server metrics-server.tanzu.vmware.com 0.6.2+vmware.1-tkg.1 Reconcile succeeded 8d
19tkg-system tkg-stc-mgmt-cluster-pinniped pinniped.tanzu.vmware.com 0.12.1+vmware.2-tkg.3 Reconcile succeeded 8d
20tkg-system tkg-stc-mgmt-cluster-secretgen-controller secretgen-controller.tanzu.vmware.com 0.11.2+vmware.1-tkg.1 Reconcile succeeded 8d
21tkg-system tkg-stc-mgmt-cluster-tkg-storageclass tkg-storageclass.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
22tkg-system tkg-stc-mgmt-cluster-vsphere-cpi vsphere-cpi.tanzu.vmware.com 1.24.3+vmware.1-tkg.1 Reconcile succeeded 8d
23tkg-system tkg-stc-mgmt-cluster-vsphere-csi vsphere-csi.tanzu.vmware.com 2.6.2+vmware.2-tkg.1 Reconcile succeeded 8d
24tkg-system tkr-service tkr-service.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
25tkg-system tkr-source-controller tkr-source-controller.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
26tkg-system tkr-vsphere-resolver tkr-vsphere-resolver.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
TKG Workload cluster deployment
Now that we have done all the initial configuration to support our TKG environment on vSphere, NSX and Avi, deploying a workload cluster is as simple as loading a game on the Commodore 64 📼 From your bootstrap machine, make sure you are in the context of your TKG management cluster:
1andreasm@tkg-bootstrap:~/.config/tanzu/tkg/providers$ kubectl config current-context
2tkg-stc-mgmt-cluster-admin@tkg-stc-mgmt-cluster
If you prefer to deploy your workload clusters in their own Kubernetes namespaces, go ahead and create a namespace for your workload cluster like this:
1kubectl create ns "name-of-namespace"
Now, to create a workload cluster we also need a yaml definition file. The easiest way to get such a file is to re-use the bootstrap yaml we created for our TKG management cluster. For more information on deploying a workload cluster in TKG, read here. By using the Tanzu CLI we can convert this bootstrap file into a workload cluster yaml definition file, like this:
1tanzu cluster create stc-tkgm-wld-cluster-1 --namespace=stc-tkgm-ns-1 --file tkg-mgmt-bootstrap-tkg-2.1.yaml --dry-run > stc-tkg-wld-cluster-1.yaml
The command above reads the bootstrap yaml file we used to deploy the TKG management cluster and converts it into a yaml file we can use to deploy a workload cluster. It also removes fields that are not needed for our workload cluster. I am using the --namespace flag so the correct namespace is automatically put into the yaml file, pointing to the TKG management bootstrap yaml file with --file, and finally using --dry-run to pipe the output to a file called stc-tkg-wld-cluster-1.yaml. The result should look something like this:
1apiVersion: cpi.tanzu.vmware.com/v1alpha1
2kind: VSphereCPIConfig
3metadata:
4 name: stc-tkgm-wld-cluster-1
5 namespace: stc-tkgm-ns-1
6spec:
7 vsphereCPI:
8 ipFamily: ipv4
9 mode: vsphereCPI
10 tlsCipherSuites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
11---
12apiVersion: csi.tanzu.vmware.com/v1alpha1
13kind: VSphereCSIConfig
14metadata:
15 name: stc-tkgm-wld-cluster-1
16 namespace: stc-tkgm-ns-1
17spec:
18 vsphereCSI:
19 config:
20 datacenter: /cPod-NSXAM-STC
21 httpProxy: ""
22 httpsProxy: ""
23 noProxy: ""
24 region: null
25 tlsThumbprint: 22:FD
26 useTopologyCategories: false
27 zone: null
28 mode: vsphereCSI
29---
30apiVersion: run.tanzu.vmware.com/v1alpha3
31kind: ClusterBootstrap
32metadata:
33 annotations:
34 tkg.tanzu.vmware.com/add-missing-fields-from-tkr: v1.24.9---vmware.1-tkg.1
35 name: stc-tkgm-wld-cluster-1
36 namespace: stc-tkgm-ns-1
37spec:
38 additionalPackages:
39 - refName: metrics-server*
40 - refName: secretgen-controller*
41 - refName: pinniped*
42 cpi:
43 refName: vsphere-cpi*
44 valuesFrom:
45 providerRef:
46 apiGroup: cpi.tanzu.vmware.com
47 kind: VSphereCPIConfig
48 name: stc-tkgm-wld-cluster-1
49 csi:
50 refName: vsphere-csi*
51 valuesFrom:
52 providerRef:
53 apiGroup: csi.tanzu.vmware.com
54 kind: VSphereCSIConfig
55 name: stc-tkgm-wld-cluster-1
56 kapp:
57 refName: kapp-controller*
58---
59apiVersion: v1
60kind: Secret
61metadata:
62 name: stc-tkgm-wld-cluster-1
63 namespace: stc-tkgm-ns-1
64stringData:
65 password: Password
66 username: andreasm@cpod-nsxam-stc.az-stc.cloud-garage.net
67---
68apiVersion: cluster.x-k8s.io/v1beta1
69kind: Cluster
70metadata:
71 annotations:
72 osInfo: ubuntu,20.04,amd64
73 tkg/plan: dev
74 labels:
75 tkg.tanzu.vmware.com/cluster-name: stc-tkgm-wld-cluster-1
76 name: stc-tkgm-wld-cluster-1
77 namespace: stc-tkgm-ns-1
78spec:
79 clusterNetwork:
80 pods:
81 cidrBlocks:
82 - 100.96.0.0/11
83 services:
84 cidrBlocks:
85 - 100.64.0.0/13
86 topology:
87 class: tkg-vsphere-default-v1.0.0
88 controlPlane:
89 metadata:
90 annotations:
91 run.tanzu.vmware.com/resolve-os-image: image-type=ova,os-name=ubuntu
92 replicas: 1
93 variables:
94 - name: controlPlaneCertificateRotation
95 value:
96 activate: true
97 daysBefore: 90
98 - name: auditLogging
99 value:
100 enabled: false
101 - name: podSecurityStandard
102 value:
103 audit: baseline
104 deactivated: false
105 warn: baseline
106 - name: apiServerEndpoint
107 value: ""
108 - name: aviAPIServerHAProvider
109 value: true
110 - name: vcenter
111 value:
112 cloneMode: fullClone
113 datacenter: /cPod-NSXAM-STC
114 datastore: /cPod-NSXAM-STC/datastore/vsanDatastore
115 folder: /cPod-NSXAM-STC/vm/TKGm
116 network: /cPod-NSXAM-STC/network/ls-tkg-mgmt #Notice this - if you want to place your workload clusters in a different network change this to your desired portgroup.
117 resourcePool: /cPod-NSXAM-STC/host/Cluster/Resources
118 server: vcsa.cpod-nsxam-stc.az-stc.cloud-garage.net
119 storagePolicyID: ""
120 template: /cPod-NSXAM-STC/vm/ubuntu-2004-efi-kube-v1.24.9+vmware.1
121 tlsThumbprint: 22:FD
122 - name: user
123 value:
124 sshAuthorizedKeys:
125 - ssh-rsa 88qv2fowMT65qwpBHUIybHz5Ra2L53zwsv/5yvUej48QLmyAalSNNeH+FIKTkFiuX/WjsHiCIMFisn5dqpc/6x8=
126 - name: controlPlane
127 value:
128 machine:
129 diskGiB: 20
130 memoryMiB: 4096
131 numCPUs: 2
132 - name: worker
133 value:
134 count: 2
135 machine:
136 diskGiB: 20
137 memoryMiB: 4096
138 numCPUs: 2
139 version: v1.24.9+vmware.1
140 workers:
141 machineDeployments:
142 - class: tkg-worker
143 metadata:
144 annotations:
145 run.tanzu.vmware.com/resolve-os-image: image-type=ova,os-name=ubuntu
146 name: md-0
147 replicas: 2
Read through the result and edit it if you find something you would like to change. If you want to deploy your workload cluster on a different network than your management cluster, edit this field to reflect the correct portgroup in vCenter:
1 network: /cPod-NSXAM-STC/network/ls-tkg-mgmt
Now that the yaml definition is ready, we can create the first workload cluster like this:
1tanzu cluster create --file stc-tkg-wld-cluster-1.yaml
You can monitor the progress from the terminal of your bootstrap machine. When done, check your cluster status with the Tanzu CLI (remember to either use -n "nameofnamespace" or just -A):
1andreasm@tkg-bootstrap:~$ tanzu cluster list -A
2 NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES PLAN TKR
3 stc-tkgm-wld-cluster-1 stc-tkgm-ns-1 running 1/1 2/2 v1.24.9+vmware.1 <none> dev v1.24.9---vmware.1-tkg.1
4 stc-tkgm-wld-cluster-2 stc-tkgm-ns-2 running 1/1 2/2 v1.24.9+vmware.1 <none> dev v1.24.9---vmware.1-tkg.1
Further verification can be done with this command:
1andreasm@tkg-bootstrap:~$ tanzu cluster get stc-tkgm-wld-cluster-1 -n stc-tkgm-ns-1
2 NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES TKR
3 stc-tkgm-wld-cluster-1 stc-tkgm-ns-1 running 1/1 2/2 v1.24.9+vmware.1 <none> v1.24.9---vmware.1-tkg.1
4
5
6Details:
7
8NAME READY SEVERITY REASON SINCE MESSAGE
9/stc-tkgm-wld-cluster-1 True 7d22h
10├─ClusterInfrastructure - VSphereCluster/stc-tkgm-wld-cluster-1-lzjxq True 7d22h
11├─ControlPlane - KubeadmControlPlane/stc-tkgm-wld-cluster-1-22z8x True 7d22h
12│ └─Machine/stc-tkgm-wld-cluster-1-22z8x-jjb66 True 7d22h
13└─Workers
14 └─MachineDeployment/stc-tkgm-wld-cluster-1-md-0-2qmkw True 3d3h
15 ├─Machine/stc-tkgm-wld-cluster-1-md-0-2qmkw-6c4789d7b5-lj5wl True 7d22h
16 └─Machine/stc-tkgm-wld-cluster-1-md-0-2qmkw-6c4789d7b5-wb7k9 True 7d22h
If everything is green, it's time to get the kubeconfig for the cluster so we can start consuming it. This is done like this:
1tanzu cluster kubeconfig get stc-tkgm-wld-cluster-1 --namespace stc-tkgm-ns-1 --admin --export-file stc-tkgm-wld-cluster-1-k8s-config.yaml
Now you can copy this to your preferred workstation and start consuming the cluster.
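A quick sanity check with the exported kubeconfig could look like this:
kubectl --kubeconfig stc-tkgm-wld-cluster-1-k8s-config.yaml get nodes -o wide
kubectl --kubeconfig stc-tkgm-wld-cluster-1-k8s-config.yaml get pods -A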
Note! The kubeconfigs I have used here all have admin privileges, which is not something you will use in production, where you want granular user access. I will create a post around user management in both TKGm and TKGs later.
The next sections will cover how to upgrade TKG, as well as some configuration on the workload clusters themselves around AKO and Antrea.
Antrea configs
If there is a feature you would like to enable in Antrea in one of your workload clusters, you can create an AntreaConfig by using the AntreaConfig CRD (this is one way of doing it) and apply it in the namespace where your workload cluster resides. This is the same approach as in vSphere 8 with Tanzu; see here.
1apiVersion: cni.tanzu.vmware.com/v1alpha1
2kind: AntreaConfig
3metadata:
4 name: stc-tkgm-wld-cluster-1-antrea-package # notice the naming-convention cluster name-antrea-package
5  namespace: stc-tkgm-ns-1 # the namespace your workload cluster resides in
6spec:
7 antrea:
8 config:
9 featureGates:
10 AntreaProxy: true
11 EndpointSlice: false
12 AntreaPolicy: true
13 FlowExporter: true
14 Egress: true
15 NodePortLocal: true
16 AntreaTraceflow: true
17 NetworkPolicyStats: true
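Assuming the definition above is saved as antrea-config-wld-cluster-1.yaml (a hypothetical file name), applying and verifying it from the management cluster context could look like this:
kubectl apply -f antrea-config-wld-cluster-1.yaml
kubectl get antreaconfigs.cni.tanzu.vmware.com -n stc-tkgm-ns-1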
Avi/AKO configs
In TKGm we can override the default AKO settings by using the AKODeploymentConfig CRD. We apply this configuration from the TKG management cluster, and it is mapped onto the respective workload cluster by using labels. An example of such a config yaml:
1apiVersion: networking.tkg.tanzu.vmware.com/v1alpha1
2kind: AKODeploymentConfig
3metadata:
4 name: ako-stc-tkgm-wld-cluster-1
5spec:
6 adminCredentialRef:
7 name: avi-controller-credentials
8 namespace: tkg-system-networking
9 certificateAuthorityRef:
10 name: avi-controller-ca
11 namespace: tkg-system-networking
12 cloudName: stc-nsx-cloud
13 clusterSelector:
14 matchLabels:
15 ako-stc-wld-1: "ako-l7"
16 controller: 172.24.3.50
17 dataNetwork:
18 cidr: 10.13.103.0/24
19 name: vip-tkg-wld-l7
20 controlPlaneNetwork:
21 cidr: 10.13.102.0/24
22 name: vip-tkg-wld-l4
23 extraConfigs:
24 cniPlugin: antrea
25 disableStaticRouteSync: false # required
26 ingress:
27 defaultIngressController: true
28 disableIngressClass: false # required
29 nodeNetworkList: # required
30 - cidrs:
31 - 10.13.21.0/24
32 networkName: ls-tkg-wld-1
33 serviceType: NodePortLocal # required
34 shardVSSize: SMALL # required
35 l4Config:
36 autoFQDN: default
37 networksConfig:
38 nsxtT1LR: /infra/tier-1s/Tier-1
39 serviceEngineGroup: tkgm-se-group
Notice the following:
1 clusterSelector:
2 matchLabels:
3 ako-stc-wld-1: "ako-l7"
We need to apply this label to our workload cluster. From the TKG management cluster, list all your clusters:
1amarqvardsen@amarqvards1MD6T:~/Kubernetes-library/tkgm/stc-tkgm/stc-tkgm-wld-cluster-1$ k get cluster -A
2NAMESPACE NAME PHASE AGE VERSION
3stc-tkgm-ns-1 stc-tkgm-wld-cluster-1 Provisioned 7d23h v1.24.9+vmware.1
4stc-tkgm-ns-2 stc-tkgm-wld-cluster-2 Provisioned 7d17h v1.24.9+vmware.1
5tkg-system tkg-stc-mgmt-cluster Provisioned 8d v1.24.9+vmware.1
Apply the above label:
1kubectl label cluster -n stc-tkgm-ns-1 stc-tkgm-wld-cluster-1 ako-stc-wld-1=ako-l7
Now run the get cluster command again, but with the flag --show-labels, to see whether it has been applied:
1amarqvardsen@amarqvards1MD6T:~/Kubernetes-library/tkgm/stc-tkgm/stc-tkgm-wld-cluster-1$ k get cluster -A --show-labels
2NAMESPACE NAME PHASE AGE VERSION LABELS
3stc-tkgm-ns-1 stc-tkgm-wld-cluster-1 Provisioned 7d23h v1.24.9+vmware.1 ako-stc-wld-1=ako-l7,cluster.x-k8s.io/cluster-name=stc-tkgm-wld-cluster-1,networking.tkg.tanzu.vmware.com/avi=ako-stc-tkgm-wld-cluster-1,run.tanzu.vmware.com/tkr=v1.24.9---vmware.1-tkg.1,tkg.tanzu.vmware.com/cluster-name=stc-tkgm-wld-cluster-1,topology.cluster.x-k8s.io/owned=
Looks good. Then we can apply the AKODeploymentConfig above.
1k apply -f ako-wld-cluster-1.yaml
Verify if the AKODeploymentConfig has been applied:
1amarqvardsen@amarqvards1MD6T:~/Kubernetes-library/tkgm/stc-tkgm/stc-tkgm-wld-cluster-1$ k get akodeploymentconfigs.networking.tkg.tanzu.vmware.com
2NAME AGE
3ako-stc-tkgm-wld-cluster-1 7d21h
4ako-stc-tkgm-wld-cluster-2 7d6h
5install-ako-for-all 8d
6install-ako-for-management-cluster 8d
Now head back to your workload cluster and check whether the AKO pod has been restarted; if you don't want to wait, you can always delete the pod to speed up the changes. To verify the changes, have a look at the AKO configmap like this:
1amarqvardsen@amarqvards1MD6T:~/Kubernetes-library/tkgm/stc-tkgm/stc-tkgm-wld-cluster-1$ k get configmaps -n avi-system avi-k8s-config -oyaml
2apiVersion: v1
3data:
4 apiServerPort: "8080"
5 autoFQDN: default
6 cloudName: stc-nsx-cloud
7 clusterName: stc-tkgm-ns-1-stc-tkgm-wld-cluster-1
8 cniPlugin: antrea
9 controllerIP: 172.24.3.50
10 controllerVersion: 22.1.2
11 defaultIngController: "true"
12 deleteConfig: "false"
13 disableStaticRouteSync: "false"
14 fullSyncFrequency: "1800"
15 logLevel: INFO
16 nodeNetworkList: '[{"networkName":"ls-tkg-wld-1","cidrs":["10.13.21.0/24"]}]'
17 nsxtT1LR: /infra/tier-1s/Tier-1
18 serviceEngineGroupName: tkgm-se-group
19 serviceType: NodePortLocal
20 shardVSSize: SMALL
21 vipNetworkList: '[{"networkName":"vip-tkg-wld-l7","cidr":"10.13.103.0/24"}]'
22kind: ConfigMap
23metadata:
24 annotations:
25 kapp.k14s.io/identity: v1;avi-system//ConfigMap/avi-k8s-config;v1
26 kapp.k14s.io/original: '{"apiVersion":"v1","data":{"apiServerPort":"8080","autoFQDN":"default","cloudName":"stc-nsx-cloud","clusterName":"stc-tkgm-ns-1-stc-tkgm-wld-cluster-1","cniPlugin":"antrea","controllerIP":"172.24.3.50","controllerVersion":"22.1.2","defaultIngController":"true","deleteConfig":"false","disableStaticRouteSync":"false","fullSyncFrequency":"1800","logLevel":"INFO","nodeNetworkList":"[{\"networkName\":\"ls-tkg-wld-1\",\"cidrs\":[\"10.13.21.0/24\"]}]","nsxtT1LR":"/infra/tier-1s/Tier-1","serviceEngineGroupName":"tkgm-se-group","serviceType":"NodePortLocal","shardVSSize":"SMALL","vipNetworkList":"[{\"networkName\":\"vip-tkg-wld-l7\",\"cidr\":\"10.13.103.0/24\"}]"},"kind":"ConfigMap","metadata":{"labels":{"kapp.k14s.io/app":"1678977773033139694","kapp.k14s.io/association":"v1.ae838cced3b6caccc5a03bfb3ae65cd7"},"name":"avi-k8s-config","namespace":"avi-system"}}'
27 kapp.k14s.io/original-diff-md5: c6e94dc94aed3401b5d0f26ed6c0bff3
28 creationTimestamp: "2023-03-16T14:43:11Z"
29 labels:
30 kapp.k14s.io/app: "1678977773033139694"
31 kapp.k14s.io/association: v1.ae838cced3b6caccc5a03bfb3ae65cd7
32 name: avi-k8s-config
33 namespace: avi-system
34 resourceVersion: "19561"
35 uid: 1baa90b2-e5d7-4177-ae34-6c558b5cfe29
It should reflect the changes we applied...
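If the AKO pod does not seem to have picked up the changes, deleting it is a quick way to force a restart; in TKG the AKO pod normally runs as ako-0 in the avi-system namespace:
kubectl -n avi-system get pods
# the statefulset recreates the pod with the updated configuration
kubectl -n avi-system delete pod ako-0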
Antrea RBAC
Antrea comes with a set of tiers where we can place our Antrea-native policies. These can also be used to restrict who is allowed to apply policies and who is not. See this page for more information for now; I will update this section later with my own details, including the integration with NSX.
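The default tiers can be listed with kubectl get tiers.crd.antrea.io, and ordinary Kubernetes RBAC can then be used to limit who is allowed to manage Antrea-native policies at all. A minimal sketch of such a ClusterRole (the name and verb list are only examples) could look like this:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: antrea-policy-editor
rules:
- apiGroups: ["crd.antrea.io"]
  resources: ["clusternetworkpolicies", "networkpolicies"]
  verbs: ["get", "list", "watch", "create", "update", "delete"]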
Upgrade TKG (from 2.1 to 2.1.1)
When a new TKG release is available, we can upgrade to this new release. The steps I have followed are explained in detail here. I recommend always following the updated information there.
To upgrade TKG these are the typical steps:
- Download the latest Tanzu CLI - from my.vmware.com
- Download the latest Tanzu kubectl - from my.vmware.com
- Download the latest Photon or Ubuntu OVA VM template - from my.vmware.com
- Upgrade the TKG Management cluster
- Upgrade the TKG Workload clusters
So let's get into it.
Upgrade CLI tools and dependencies
I have already downloaded the Ubuntu VM image for version 2.1.1 into my vCenter and converted it to a template. I have also downloaded the Tanzu CLI tools and Tanzu kubectl for version 2.1.1. Now I need to install the Tanzu CLI and Tanzu kubectl, so I will get back into my bootstrap machine used previously, where I already have the Tanzu CLI for 2.1 installed.
The first thing I need to do is delete the following file:
1~/.config/tanzu/tkg/compatibility/tkg-compatibility.yaml
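Deleting it is a plain file removal:
rm ~/.config/tanzu/tkg/compatibility/tkg-compatibility.yaml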
Extract the downloaded Tanzu CLI 2.1.1 bundle (this will create a cli folder in your current directory, so if you want to use another folder, create it first and extract the file there):
1tar -xvf tanzu-cli-bundle-linux-amd64.tar.gz
1andreasm@tkg-bootstrap:~/tanzu$ tar -xvf tanzu-cli-bundle-linux-amd64.2.1.1.tar.gz
2cli/
3cli/core/
4cli/core/v0.28.1/
5cli/core/v0.28.1/tanzu-core-linux_amd64
6cli/tanzu-framework-plugins-standalone-linux-amd64.tar.gz
7cli/tanzu-framework-plugins-context-linux-amd64.tar.gz
8cli/ytt-linux-amd64-v0.43.1+vmware.1.gz
9cli/kapp-linux-amd64-v0.53.2+vmware.1.gz
10cli/imgpkg-linux-amd64-v0.31.1+vmware.1.gz
11cli/kbld-linux-amd64-v0.35.1+vmware.1.gz
12cli/vendir-linux-amd64-v0.30.1+vmware.1.gz
Navigate to the cli folder and install the different packages.
Install Tanzu CLI:
1andreasm@tkg-bootstrap:~/tanzu/cli$ sudo install core/v0.28.1/tanzu-core-linux_amd64 /usr/local/bin/tanzu
Initialize the Tanzu CLI:
1andreasm@tkg-bootstrap:~/tanzu/cli$ tanzu init
2ℹ Checking for required plugins...
3ℹ Installing plugin 'secret:v0.28.1' with target 'kubernetes'
4ℹ Installing plugin 'isolated-cluster:v0.28.1'
5ℹ Installing plugin 'login:v0.28.1'
6ℹ Installing plugin 'management-cluster:v0.28.1' with target 'kubernetes'
7ℹ Installing plugin 'package:v0.28.1' with target 'kubernetes'
8ℹ Installing plugin 'pinniped-auth:v0.28.1'
9ℹ Installing plugin 'telemetry:v0.28.1' with target 'kubernetes'
10ℹ Successfully installed all required plugins
11✔ successfully initialized CLI
Verify version:
1andreasm@tkg-bootstrap:~/tanzu/cli$ tanzu version
2version: v0.28.1
3buildDate: 2023-03-07
4sha: 0e6704777-dirty
Now the Tanzu plugins:
1andreasm@tkg-bootstrap:~/tanzu/cli$ tanzu plugin clean
2✔ successfully cleaned up all plugins
1andreasm@tkg-bootstrap:~/tanzu/cli$ tanzu plugin sync
2ℹ Checking for required plugins...
3ℹ Installing plugin 'management-cluster:v0.28.1' with target 'kubernetes'
4ℹ Installing plugin 'secret:v0.28.1' with target 'kubernetes'
5ℹ Installing plugin 'telemetry:v0.28.1' with target 'kubernetes'
6ℹ Installing plugin 'cluster:v0.28.0' with target 'kubernetes'
7ℹ Installing plugin 'kubernetes-release:v0.28.0' with target 'kubernetes'
8ℹ Installing plugin 'login:v0.28.1'
9ℹ Installing plugin 'package:v0.28.1' with target 'kubernetes'
10ℹ Installing plugin 'pinniped-auth:v0.28.1'
11ℹ Installing plugin 'feature:v0.28.0' with target 'kubernetes'
12ℹ Installing plugin 'isolated-cluster:v0.28.1'
13✖ [unable to fetch the plugin metadata for plugin "login": could not find the artifact for version:v0.28.1, os:linux, arch:amd64, unable to fetch the plugin metadata for plugin "package": could not find the artifact for version:v0.28.1, os:linux, arch:amd64, unable to fetch the plugin metadata for plugin "pinniped-auth": could not find the artifact for version:v0.28.1, os:linux, arch:amd64, unable to fetch the plugin metadata for plugin "isolated-cluster": could not find the artifact for version:v0.28.1, os:linux, arch:amd64]
14andreasm@tkg-bootstrap:~/tanzu/cli$ tanzu plugin sync
15ℹ Checking for required plugins...
16ℹ Installing plugin 'pinniped-auth:v0.28.1'
17ℹ Installing plugin 'isolated-cluster:v0.28.1'
18ℹ Installing plugin 'login:v0.28.1'
19ℹ Installing plugin 'package:v0.28.1' with target 'kubernetes'
20ℹ Successfully installed all required plugins
21✔ Done
Note! I had to run the command twice as I encountered an issue on the first try. Now list the plugins:
1andreasm@tkg-bootstrap:~/tanzu/cli$ tanzu plugin list
2Standalone Plugins
3 NAME DESCRIPTION TARGET DISCOVERY VERSION STATUS
4 isolated-cluster isolated-cluster operations default v0.28.1 installed
5 login Login to the platform default v0.28.1 installed
6 pinniped-auth Pinniped authentication operations (usually not directly invoked) default v0.28.1 installed
7 management-cluster Kubernetes management-cluster operations kubernetes default v0.28.1 installed
8 package Tanzu package management kubernetes default v0.28.1 installed
9 secret Tanzu secret management kubernetes default v0.28.1 installed
10 telemetry Configure cluster-wide telemetry settings kubernetes default v0.28.1 installed
11
12Plugins from Context: tkg-stc-mgmt-cluster
13 NAME DESCRIPTION TARGET VERSION STATUS
14 cluster Kubernetes cluster operations kubernetes v0.28.0 installed
15 feature Operate on features and featuregates kubernetes v0.28.0 installed
16 kubernetes-release Kubernetes release operations kubernetes v0.28.0 installed
Install the Tanzu kubectl:
1andreasm@tkg-bootstrap:~/tanzu$ gunzip kubectl-linux-v1.24.10+vmware.1.gz
2andreasm@tkg-bootstrap:~/tanzu$ chmod ugo+x kubectl-linux-v1.24.10+vmware.1
3andreasm@tkg-bootstrap:~/tanzu$ sudo install kubectl-linux-v1.24.10+vmware.1 /usr/local/bin/kubectl
Check version:
1andreasm@tkg-bootstrap:~/tanzu$ kubectl version
2WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
3Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.10+vmware.1", GitCommit:"b980a736cbd2ac0c5f7ca793122fd4231f705889", GitTreeState:"clean", BuildDate:"2023-01-24T15:36:34Z", GoVersion:"go1.19.5", Compiler:"gc", Platform:"linux/amd64"}
4Kustomize Version: v4.5.4
5Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.9+vmware.1", GitCommit:"d1d7c19c9b6265a8dcd1b2ab2620ec0fc7cee784", GitTreeState:"clean", BuildDate:"2022-12-14T06:23:39Z", GoVersion:"go1.18.9", Compiler:"gc", Platform:"linux/amd64"}
Install the Carvel tools from the cli folder; first out is ytt. Install ytt:
1andreasm@tkg-bootstrap:~/tanzu/cli$ gunzip ytt-linux-amd64-v0.43.1+vmware.1.gz
2andreasm@tkg-bootstrap:~/tanzu/cli$ chmod ugo+x ytt-linux-amd64-v0.43.1+vmware.1
3andreasm@tkg-bootstrap:~/tanzu/cli$ sudo mv ./ytt-linux-amd64-v0.43.1+vmware.1 /usr/local/bin/ytt
4andreasm@tkg-bootstrap:~/tanzu/cli$ ytt --version
5ytt version 0.43.1
Install kapp:
1andreasm@tkg-bootstrap:~/tanzu/cli$ gunzip kapp-linux-amd64-v0.53.2+vmware.1.gz
2andreasm@tkg-bootstrap:~/tanzu/cli$ chmod ugo+x kapp-linux-amd64-v0.53.2+vmware.1
3andreasm@tkg-bootstrap:~/tanzu/cli$ sudo mv ./kapp-linux-amd64-v0.53.2+vmware.1 /usr/local/bin/kapp
4andreasm@tkg-bootstrap:~/tanzu/cli$ kapp --version
5kapp version 0.53.2
6
7Succeeded
Install kbld:
1andreasm@tkg-bootstrap:~/tanzu/cli$ gunzip kbld-linux-amd64-v0.35.1+vmware.1.gz
2andreasm@tkg-bootstrap:~/tanzu/cli$ chmod ugo+x kbld-linux-amd64-v0.35.1+vmware.1
3andreasm@tkg-bootstrap:~/tanzu/cli$ sudo mv ./kbld-linux-amd64-v0.35.1+vmware.1 /usr/local/bin/kbld
4andreasm@tkg-bootstrap:~/tanzu/cli$ kbld --version
5kbld version 0.35.1
6
7Succeeded
Install imgpkg:
1andreasm@tkg-bootstrap:~/tanzu/cli$ gunzip imgpkg-linux-amd64-v0.31.1+vmware.1.gz
2andreasm@tkg-bootstrap:~/tanzu/cli$ chmod ugo+x imgpkg-linux-amd64-v0.31.1+vmware.1
3andreasm@tkg-bootstrap:~/tanzu/cli$ sudo mv ./imgpkg-linux-amd64-v0.31.1+vmware.1 /usr/local/bin/imgpkg
4andreasm@tkg-bootstrap:~/tanzu/cli$ imgpkg --version
5imgpkg version 0.31.1
6
7Succeeded
We have now verified the different versions; we should have Tanzu CLI version v0.28.1.
Upgrade the TKG Management cluster
Now we can proceed with the upgrade process. One important document to check is this: Known Issues. Check whether you have any relevant environment variables set; if you do, we need to unset them.
1andreasm@tkg-bootstrap:~/tanzu/cli$ printenv
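If any TKG- or proxy-related variables show up and you want them cleared before the upgrade, unset them in the current shell; the variable names below are only examples:
unset http_proxy https_proxy no_proxy
unset TKG_HTTP_PROXY TKG_HTTPS_PROXY TKG_NO_PROXY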
I am clear here and will now start upgrading my standalone TKG management cluster. Make sure you are in the context of the TKG management cluster and that you have converted the new Ubuntu VM image to a template.
1andreasm@tkg-bootstrap:~$ kubectl config current-context
2tkg-stc-mgmt-cluster-admin@tkg-stc-mgmt-cluster
If not, use the following command:
1andreasm@tkg-bootstrap:~$ tanzu login
2? Select a server [Use arrows to move, type to filter]
3> tkg-stc-mgmt-cluster()
4 + new server
1andreasm@tkg-bootstrap:~$ tanzu login
2? Select a server tkg-stc-mgmt-cluster()
3✔ successfully logged in to management cluster using the kubeconfig tkg-stc-mgmt-cluster
4ℹ Checking for required plugins...
5ℹ All required plugins are already installed and up-to-date
Here goes (to start the upgrade of the management cluster):
1andreasm@tkg-bootstrap:~$ tanzu mc upgrade
2Upgrading management cluster 'tkg-stc-mgmt-cluster' to TKG version 'v2.1.1' with Kubernetes version 'v1.24.10+vmware.1'. Are you sure? [y/N]:
Eh.... yes...
Progress:
1andreasm@tkg-bootstrap:~$ tanzu mc upgrade
2Upgrading management cluster 'tkg-stc-mgmt-cluster' to TKG version 'v2.1.1' with Kubernetes version 'v1.24.10+vmware.1'. Are you sure? [y/N]: y
3Validating the compatibility before management cluster upgrade
4Validating for the required environment variables to be set
5Validating for the user configuration secret to be existed in the cluster
6Warning: unable to find component 'kube_rbac_proxy' under BoM
7Upgrading management cluster providers...
8 infrastructure-ipam-in-cluster provider's version is missing in BOM file, so it would not be upgraded
9Checking cert-manager version...
10Cert-manager is already up to date
11Performing upgrade...
12Scaling down Provider="cluster-api" Version="" Namespace="capi-system"
13Scaling down Provider="bootstrap-kubeadm" Version="" Namespace="capi-kubeadm-bootstrap-system"
14Scaling down Provider="control-plane-kubeadm" Version="" Namespace="capi-kubeadm-control-plane-system"
15Scaling down Provider="infrastructure-vsphere" Version="" Namespace="capv-system"
16Deleting Provider="cluster-api" Version="" Namespace="capi-system"
17Installing Provider="cluster-api" Version="v1.2.8" TargetNamespace="capi-system"
18Deleting Provider="bootstrap-kubeadm" Version="" Namespace="capi-kubeadm-bootstrap-system"
19Installing Provider="bootstrap-kubeadm" Version="v1.2.8" TargetNamespace="capi-kubeadm-bootstrap-system"
20Deleting Provider="control-plane-kubeadm" Version="" Namespace="capi-kubeadm-control-plane-system"
21Installing Provider="control-plane-kubeadm" Version="v1.2.8" TargetNamespace="capi-kubeadm-control-plane-system"
22Deleting Provider="infrastructure-vsphere" Version="" Namespace="capv-system"
23Installing Provider="infrastructure-vsphere" Version="v1.5.3" TargetNamespace="capv-system"
24Management cluster providers upgraded successfully...
25Preparing addons manager for upgrade
26Upgrading kapp-controller...
27Adding last-applied annotation on kapp-controller...
28Removing old management components...
29Upgrading management components...
30ℹ Updating package repository 'tanzu-management'
31ℹ Getting package repository 'tanzu-management'
32ℹ Validating provided settings for the package repository
33ℹ Updating package repository resource
34ℹ Waiting for 'PackageRepository' reconciliation for 'tanzu-management'
35ℹ 'PackageRepository' resource install status: Reconciling
36ℹ 'PackageRepository' resource install status: ReconcileSucceeded
37ℹ Updated package repository 'tanzu-management' in namespace 'tkg-system'
38ℹ Installing package 'tkg.tanzu.vmware.com'
39ℹ Updating package 'tkg-pkg'
40ℹ Getting package install for 'tkg-pkg'
41ℹ Getting package metadata for 'tkg.tanzu.vmware.com'
42ℹ Updating secret 'tkg-pkg-tkg-system-values'
43ℹ Updating package install for 'tkg-pkg'
44ℹ Waiting for 'PackageInstall' reconciliation for 'tkg-pkg'
45ℹ 'PackageInstall' resource install status: ReconcileSucceeded
46ℹ Updated installed package 'tkg-pkg'
47Cleanup core packages repository...
48Core package repository not found, no need to cleanup
49Upgrading management cluster kubernetes version...
50Upgrading kubernetes cluster to `v1.24.10+vmware.1` version, tkr version: `v1.24.10+vmware.1-tkg.2`
51Waiting for kubernetes version to be updated for control plane nodes...
52Waiting for kubernetes version to be updated for worker nodes...
In vCenter we should start to see some action as well:
Two control plane nodes:
No longer:
1management cluster is opted out of telemetry - skipping telemetry image upgrade
2Creating tkg-bom versioned ConfigMaps...
3Management cluster 'tkg-stc-mgmt-cluster' successfully upgraded to TKG version 'v2.1.1' with kubernetes version 'v1.24.10+vmware.1'
4ℹ Checking for required plugins...
5ℹ Installing plugin 'kubernetes-release:v0.28.1' with target 'kubernetes'
6ℹ Installing plugin 'cluster:v0.28.1' with target 'kubernetes'
7ℹ Installing plugin 'feature:v0.28.1' with target 'kubernetes'
8ℹ Successfully installed all required plugins
Well, it finished successfully.
Let's verify with the Tanzu CLI:
1andreasm@tkg-bootstrap:~$ tanzu cluster list --include-management-cluster -A
2 NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES PLAN TKR
3 stc-tkgm-wld-cluster-1 stc-tkgm-ns-1 running 1/1 2/2 v1.24.9+vmware.1 <none> dev v1.24.9---vmware.1-tkg.1
4 stc-tkgm-wld-cluster-2 stc-tkgm-ns-2 running 1/1 2/2 v1.24.9+vmware.1 <none> dev v1.24.9---vmware.1-tkg.1
5 tkg-stc-mgmt-cluster tkg-system running 1/1 2/2 v1.24.10+vmware.1 management dev v1.24.10---vmware.1-tkg.2
Looks good; notice the different versions. The management cluster is upgraded to the latest version, while the workload clusters are still on their older version. They are up next.
Let's do a last check before we head to the workload cluster upgrade.
1andreasm@tkg-bootstrap:~$ tanzu mc get
2 NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES PLAN TKR
3 tkg-stc-mgmt-cluster tkg-system running 1/1 2/2 v1.24.10+vmware.1 management dev v1.24.10---vmware.1-tkg.2
4
5
6Details:
7
8NAME READY SEVERITY REASON SINCE MESSAGE
9/tkg-stc-mgmt-cluster True 17m
10├─ClusterInfrastructure - VSphereCluster/tkg-stc-mgmt-cluster-xw6xs True 8d
11├─ControlPlane - KubeadmControlPlane/tkg-stc-mgmt-cluster-wrxtl True 17m
12│ └─Machine/tkg-stc-mgmt-cluster-wrxtl-csrnt True 24m
13└─Workers
14 └─MachineDeployment/tkg-stc-mgmt-cluster-md-0-vs9dc True 10m
15 ├─Machine/tkg-stc-mgmt-cluster-md-0-vs9dc-54554f9575-7hdfc True 14m
16 └─Machine/tkg-stc-mgmt-cluster-md-0-vs9dc-54554f9575-ng9lx True 7m4s
17
18
19Providers:
20
21 NAMESPACE NAME TYPE PROVIDERNAME VERSION WATCHNAMESPACE
22 caip-in-cluster-system infrastructure-ipam-in-cluster InfrastructureProvider ipam-in-cluster v0.1.0
23 capi-kubeadm-bootstrap-system bootstrap-kubeadm BootstrapProvider kubeadm v1.2.8
24 capi-kubeadm-control-plane-system control-plane-kubeadm ControlPlaneProvider kubeadm v1.2.8
25 capi-system cluster-api CoreProvider cluster-api v1.2.8
26 capv-system infrastructure-vsphere InfrastructureProvider vsphere v1.5.3
Congrats, head over to the next level 😄
Upgrade workload cluster
This procedure is much simpler, almost as simple as starting a game in MS-DOS 6.2 requiring a bit over 600 kB of conventional memory. Make sure you are still in the TKG management cluster context.
As done above, list the clusters you have and notice the versions they are currently on:
1andreasm@tkg-bootstrap:~$ tanzu cluster list --include-management-cluster -A
2 NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES PLAN TKR
3 stc-tkgm-wld-cluster-1 stc-tkgm-ns-1 running 1/1 2/2 v1.24.9+vmware.1 <none> dev v1.24.9---vmware.1-tkg.1
4 stc-tkgm-wld-cluster-2 stc-tkgm-ns-2 running 1/1 2/2 v1.24.9+vmware.1 <none> dev v1.24.9---vmware.1-tkg.1
5 tkg-stc-mgmt-cluster tkg-system running 1/1 2/2 v1.24.10+vmware.1 management dev v1.24.10---vmware.1-tkg.2
Check if there are any new releases available from the management cluster:
1andreasm@tkg-bootstrap:~$ tanzu kubernetes-release get
2 NAME VERSION COMPATIBLE ACTIVE UPDATES AVAILABLE
3 v1.22.17---vmware.1-tkg.2 v1.22.17+vmware.1-tkg.2 True True
4 v1.23.16---vmware.1-tkg.2 v1.23.16+vmware.1-tkg.2 True True
5 v1.24.10---vmware.1-tkg.2 v1.24.10+vmware.1-tkg.2 True True
There is one there: v1.24.10, and it's compatible.
Let's check whether there are any updates ready for our workload cluster:
1andreasm@tkg-bootstrap:~$ tanzu cluster available-upgrades get -n stc-tkgm-ns-1 stc-tkgm-wld-cluster-1
2 NAME VERSION COMPATIBLE
3 v1.24.10---vmware.1-tkg.2 v1.24.10+vmware.1-tkg.2 True
It is...
Let's upgrade it:
1andreasm@tkg-bootstrap:~$ tanzu cluster upgrade -n stc-tkgm-ns-1 stc-tkgm-wld-cluster-1
2Upgrading workload cluster 'stc-tkgm-wld-cluster-1' to kubernetes version 'v1.24.10+vmware.1', tkr version 'v1.24.10+vmware.1-tkg.2'. Are you sure? [y/N]: y
3Upgrading kubernetes cluster to `v1.24.10+vmware.1` version, tkr version: `v1.24.10+vmware.1-tkg.2`
4Waiting for kubernetes version to be updated for control plane nodes...
y for YES
Sit back and wait for the upgrade process to do its thing. You can monitor the output from the current terminal, and watch what is happening in vCenter: clone operations, power on, power off and delete.
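One way to follow the node roll from the management cluster context is to watch the Cluster API machines in the workload cluster's namespace:
kubectl get machines -n stc-tkgm-ns-1 -w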
And the result is in:
1Waiting for kubernetes version to be updated for worker nodes...
2Cluster 'stc-tkgm-wld-cluster-1' successfully upgraded to kubernetes version 'v1.24.10+vmware.1'
We have a winner.
Let's quickly check with the Tanzu CLI:
1andreasm@tkg-bootstrap:~$ tanzu cluster get stc-tkgm-wld-cluster-1 -n stc-tkgm-ns-1
2 NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES TKR
3 stc-tkgm-wld-cluster-1 stc-tkgm-ns-1 running 1/1 2/2 v1.24.10+vmware.1 <none> v1.24.10---vmware.1-tkg.2
4
5
6Details:
7
8NAME READY SEVERITY REASON SINCE MESSAGE
9/stc-tkgm-wld-cluster-1 True 11m
10├─ClusterInfrastructure - VSphereCluster/stc-tkgm-wld-cluster-1-lzjxq True 8d
11├─ControlPlane - KubeadmControlPlane/stc-tkgm-wld-cluster-1-22z8x True 11m
12│ └─Machine/stc-tkgm-wld-cluster-1-22z8x-mtpgs True 15m
13└─Workers
14 └─MachineDeployment/stc-tkgm-wld-cluster-1-md-0-2qmkw True 39m
15 ├─Machine/stc-tkgm-wld-cluster-1-md-0-2qmkw-58c5764865-7xvfn True 8m31s
16 └─Machine/stc-tkgm-wld-cluster-1-md-0-2qmkw-58c5764865-c7rqj True 3m29s
Couldn't be better. That's it then. It's Friday, so have a great weekend and thanks for reading.