Tanzu Kubernetes Grid #
This post will go through how to deploy TKG 2.1: the management cluster, a workload cluster (or two), and the necessary preparations on the underlying infrastructure to support TKG 2.1. In this post I will use vSphere 8 with vSAN, the Avi LoadBalancer, and NSX. So what we want to end up with is something like this:

Preparations before deployment #
This post will assume the following:
-
vSphere is already installed and configured. See more info here and here
-
NSX has already been configured (see this post for how to configure NSX). The segments used for both the management cluster and the workload clusters should have a DHCP server available. Strictly speaking we don't need DHCP for the workload clusters, but the management cluster does need it, and NSX can provide DHCP server functionality for this use *
-
NSX Advanced LoadBalancer has been deployed (and configured with an NSX cloud). See this post for how to configure this. **
-
Import the VM template for TKG, see here
-
A dedicated Linux machine/VM we can use as the bootstrap host, with the Tanzu CLI installed. See more info here
(*) TKG 2.1 is not tied to NSX the same way as TKGs, so we can choose to use NSX for security only or the full stack with networking and security. The built-in NSX load balancer will not be used; I will use the NSX Advanced Load Balancer (Avi).
(**) I want to use the NSX cloud in Avi as it gives several benefits, such as integration with the NSX Manager, where Avi automatically creates security groups, tags, and services that can easily be used in security policy creation, as well as automatic “route plumbing” for the VIPs.
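Before moving on, a quick sanity check on the bootstrap machine can save time later. Assuming the Tanzu CLI, kubectl, and Docker are already installed and on the PATH (adjust to your environment), the obvious checks are:
tanzu version
kubectl version --client
docker ps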
TKG Management cluster - deployment #
The first step after all the prerequisites are in place is to prepare a bootstrap yaml for the management cluster. I will post an example file here and go through what the different fields mean, why I have configured them, and why I have uncommented some of them. Start by logging into the bootstrap machine; if you decide to create the bootstrap yaml somewhere else, go ahead, but we need to copy it over to the bootstrap machine when we are ready to create the management cluster.
To get started with a bootstrap yaml file we can either grab an example from here, or use the folder on the bootstrap machine that contains a default config to start out with:
andreasm@tkg-bootstrap:~/.config/tanzu/tkg/providers$ ll
total 120
drwxrwxr-x 18 andreasm andreasm 4096 Mar 24 09:10 ./
drwx------ 9 andreasm andreasm 4096 Mar 16 11:32 ../
drwxrwxr-x 2 andreasm andreasm 4096 Mar 16 06:52 ako/
drwxrwxr-x 3 andreasm andreasm 4096 Mar 16 06:52 bootstrap-kubeadm/
drwxrwxr-x 4 andreasm andreasm 4096 Mar 16 06:52 cert-manager/
drwxrwxr-x 3 andreasm andreasm 4096 Mar 16 06:52 cluster-api/
-rw------- 1 andreasm andreasm 1293 Mar 16 06:52 config.yaml
-rw------- 1 andreasm andreasm 32007 Mar 16 06:52 config_default.yaml
drwxrwxr-x 3 andreasm andreasm 4096 Mar 16 06:52 control-plane-kubeadm/
drwxrwxr-x 5 andreasm andreasm 4096 Mar 16 06:52 infrastructure-aws/
drwxrwxr-x 5 andreasm andreasm 4096 Mar 16 06:52 infrastructure-azure/
drwxrwxr-x 6 andreasm andreasm 4096 Mar 16 06:52 infrastructure-docker/
drwxrwxr-x 3 andreasm andreasm 4096 Mar 16 06:52 infrastructure-ipam-in-cluster/
drwxrwxr-x 5 andreasm andreasm 4096 Mar 16 06:52 infrastructure-oci/
drwxrwxr-x 4 andreasm andreasm 4096 Mar 16 06:52 infrastructure-tkg-service-vsphere/
drwxrwxr-x 5 andreasm andreasm 4096 Mar 16 06:52 infrastructure-vsphere/
drwxrwxr-x 2 andreasm andreasm 4096 Mar 16 06:52 kapp-controller-values/
-rwxrwxr-x 1 andreasm andreasm 64 Mar 16 06:52 providers.sha256sum*
-rw------- 1 andreasm andreasm 0 Mar 16 06:52 v0.28.0
-rw------- 1 andreasm andreasm 747 Mar 16 06:52 vendir.lock.yml
-rw------- 1 andreasm andreasm 903 Mar 16 06:52 vendir.yml
drwxrwxr-x 8 andreasm andreasm 4096 Mar 16 06:52 ytt/
drwxrwxr-x 2 andreasm andreasm 4096 Mar 16 06:52 yttcb/
drwxrwxr-x 7 andreasm andreasm 4096 Mar 16 06:52 yttcc/
andreasm@tkg-bootstrap:~/.config/tanzu/tkg/providers$
The file you should be looking for is called config_default.yaml. It can be a smart choice to start from this file, as it includes the latest config parameters for the TKG version (Tanzu CLI) you have downloaded.
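Now copy this file to a folder of your preference and start editing it. A minimal sketch of copying it out, assuming a hypothetical working folder ~/tkg-configs (the target filename matches the one I refer to later in this post):
mkdir -p ~/tkg-configs
cp ~/.config/tanzu/tkg/providers/config_default.yaml ~/tkg-configs/tkg-mgmt-bootstrap-tkg-2.1.yaml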
Below is a copy of an example I am using:
#! ---------------
#! Basic config
#! -------------
CLUSTER_NAME: tkg-stc-mgmt-cluster #Name of the TKG mgmt cluster
CLUSTER_PLAN: dev #Dev or Prod, defines the amount of control plane nodes of the mgmt cluster
INFRASTRUCTURE_PROVIDER: vsphere #We are deploying on vSphere, could be AWS, Azure
ENABLE_CEIP_PARTICIPATION: "false" #Customer Experience Improvement Program - set to true if you will participate
ENABLE_AUDIT_LOGGING: "false" #Audit logging should be true in production environments
CLUSTER_CIDR: 100.96.0.0/11 #Kubernetes Cluster CIDR
SERVICE_CIDR: 100.64.0.0/13 #Kubernetes Services CIDR
TKG_IP_FAMILY: ipv4 #ipv4 or ipv6
DEPLOY_TKG_ON_VSPHERE7: "true" #Yes to deploy standalone tkg mgmt cluster on vSphere
#! ---------------
#! vSphere config
#! -------------
VSPHERE_DATACENTER: /cPod-NSXAM-STC #Name of vSphere Datacenter
VSPHERE_DATASTORE: /cPod-NSXAM-STC/datastore/vsanDatastore #Name and path of vSphere datastore to be used
VSPHERE_FOLDER: /cPod-NSXAM-STC/vm/TKGm #Name and path to VM folder
VSPHERE_INSECURE: "false" #True if you don't want to verify the vCenter thumbprint below
VSPHERE_NETWORK: /cPod-NSXAM-STC/network/ls-tkg-mgmt #A network portgroup (VDS or NSX Segment) for VM placement
VSPHERE_CONTROL_PLANE_ENDPOINT: "" #Required if using Kube-Vip, I am using Avi Loadbalancer for this
VSPHERE_PASSWORD: "password" #vCenter account password for account defined below
VSPHERE_RESOURCE_POOL: /cPod-NSXAM-STC/host/Cluster/Resources #If you want to use a specific vSphere Resource Pool for the mgmt cluster. Leave it as is if not.
VSPHERE_SERVER: vcsa.cpod-nsxam-stc.az-stc.cloud-garage.net #DNS record to vCenter Server
VSPHERE_SSH_AUTHORIZED_KEY: ssh-rsa sdfgasdgadfgsdg sdfsdf@sdfsdf.net # your bootstrap machine's SSH public key
VSPHERE_TLS_THUMBPRINT: 22:FD # Your vCenter SHA1 Thumbprint
VSPHERE_USERNAME: user@vspheresso/or/ad/user/domain #A user with the correct permissions
#! ---------------
#! Node config
#! -------------
OS_ARCH: amd64
OS_NAME: ubuntu
OS_VERSION: "20.04"
VSPHERE_CONTROL_PLANE_DISK_GIB: "20"
VSPHERE_CONTROL_PLANE_MEM_MIB: "4096"
VSPHERE_CONTROL_PLANE_NUM_CPUS: "2"
VSPHERE_WORKER_DISK_GIB: "20"
VSPHERE_WORKER_MEM_MIB: "4096"
VSPHERE_WORKER_NUM_CPUS: "2"
CONTROL_PLANE_MACHINE_COUNT: 1
WORKER_MACHINE_COUNT: 2
#! ---------------
#! Avi config
#! -------------
AVI_CA_DATA_B64: #Base64 of the Avi Certificate
AVI_CLOUD_NAME: stc-nsx-cloud #Name of the cloud defined in Avi
AVI_CONTROL_PLANE_HA_PROVIDER: "true" #True as we want to use Avi as K8s API endpoint
AVI_CONTROLLER: 172.24.3.50 #IP or Hostname Avi controller or controller cluster
# Network used to place workload clusters' endpoint VIPs - If you want to use a separate vip for Workload clusters Kubernetes API endpoint
AVI_CONTROL_PLANE_NETWORK: vip-tkg-wld-l4 #Corresponds with network defined in Avi
AVI_CONTROL_PLANE_NETWORK_CIDR: 10.13.102.0/24 #Corresponds with network defined in Avi
# Network used to place workload clusters' services external IPs (load balancer & ingress services)
AVI_DATA_NETWORK: vip-tkg-wld-l7 #Corresponds with network defined in Avi
AVI_DATA_NETWORK_CIDR: 10.13.103.0/24 #Corresponds with network defined in Avi
# Network used to place management clusters' services external IPs (load balancer & ingress services)
AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_CIDR: 10.13.101.0/24 #Corresponds with network defined in Avi
AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_NAME: vip-tkg-mgmt-l7 #Corresponds with network defined in Avi
# Network used to place management clusters' endpoint VIPs
AVI_MANAGEMENT_CLUSTER_CONTROL_PLANE_VIP_NETWORK_NAME: vip-tkg-mgmt-l4 #Corresponds with network defined in Avi
AVI_MANAGEMENT_CLUSTER_CONTROL_PLANE_VIP_NETWORK_CIDR: 10.13.100.0/24 #Corresponds with network defined in Avi
AVI_NSXT_T1LR: /infra/tier-1s/Tier-1 #Path to the NSX T1 you have configured, click on three dots in NSX on the T1 to get the full path.
AVI_CONTROLLER_VERSION: 22.1.2 #Latest supported version of Avi for TKG 2.1
AVI_ENABLE: "true" # Enables Avi as Loadbalancer for workloads
AVI_LABELS: "" #When set, Avi is enabled only on workload clusters with the corresponding label
AVI_PASSWORD: "password" #Password for the account used in Avi, username defined below
AVI_SERVICE_ENGINE_GROUP: stc-nsx #Service Engine group for Workload clusters if you want to have separate groups for Workload clusters and Management cluster
AVI_MANAGEMENT_CLUSTER_SERVICE_ENGINE_GROUP: tkgm-se-group #Dedicated Service Engine group for management cluster
AVI_USERNAME: admin
AVI_DISABLE_STATIC_ROUTE_SYNC: true #Pod network reachable or not from the Avi Service Engines
AVI_INGRESS_DEFAULT_INGRESS_CONTROLLER: true #If you want to use AKO as default ingress controller, false if you plan to use other ingress controllers also.
AVI_INGRESS_SHARD_VS_SIZE: SMALL #Decides the number of shared virtual services (and thereby VIPs) used for ingress
AVI_INGRESS_SERVICE_TYPE: NodePortLocal #NodePortLocal only when using Antrea, otherwise NodePort or ClusterIP
AVI_CNI_PLUGIN: antrea
#! ---------------
#! Proxy config
#! -------------
TKG_HTTP_PROXY_ENABLED: "false"
#! ---------------------------------------------------------------------
#! Antrea CNI configuration
#! ---------------------------------------------------------------------
# ANTREA_NO_SNAT: false
# ANTREA_TRAFFIC_ENCAP_MODE: "encap"
# ANTREA_PROXY: false
# ANTREA_POLICY: true
# ANTREA_TRACEFLOW: false
ANTREA_NODEPORTLOCAL: true
ANTREA_PROXY: true
ANTREA_ENDPOINTSLICE: true
ANTREA_POLICY: true
ANTREA_TRACEFLOW: true
ANTREA_NETWORKPOLICY_STATS: false
ANTREA_EGRESS: true
ANTREA_IPAM: false
ANTREA_FLOWEXPORTER: false
ANTREA_SERVICE_EXTERNALIP: false
ANTREA_MULTICAST: false
#! ---------------------------------------------------------------------
#! Machine Health Check configuration
#! ---------------------------------------------------------------------
ENABLE_MHC: "true"
ENABLE_MHC_CONTROL_PLANE: true
ENABLE_MHC_WORKER_NODE: true
MHC_UNKNOWN_STATUS_TIMEOUT: 5m
MHC_FALSE_STATUS_TIMEOUT: 12m
#! ---------------------------------------------------------------------
#! Identity management configuration
#! ---------------------------------------------------------------------
IDENTITY_MANAGEMENT_TYPE: none #I have disabled this, use kubeconfig instead
#LDAP_BIND_DN: CN=Andreas M,OU=Users,OU=GUZWARE,DC=guzware,DC=local
#LDAP_BIND_PASSWORD: <encoded:UHNAc=>
#LDAP_GROUP_SEARCH_BASE_DN: DC=guzware,DC=local
#LDAP_GROUP_SEARCH_FILTER: (objectClass=group)
#LDAP_GROUP_SEARCH_GROUP_ATTRIBUTE: member
#LDAP_GROUP_SEARCH_NAME_ATTRIBUTE: cn
#LDAP_GROUP_SEARCH_USER_ATTRIBUTE: distinguishedName
#LDAP_HOST: guzad07.guzware.local:636
#LDAP_ROOT_CA_DATA_B64: LS0tLS1CRUd
#LDAP_USER_SEARCH_BASE_DN: DC=guzware,DC=local
#LDAP_USER_SEARCH_FILTER: (objectClass=person)
#LDAP_USER_SEARCH_NAME_ATTRIBUTE: uid
#LDAP_USER_SEARCH_USERNAME: uid
#OIDC_IDENTITY_PROVIDER_CLIENT_ID: ""
#OIDC_IDENTITY_PROVIDER_CLIENT_SECRET: ""
#OIDC_IDENTITY_PROVIDER_GROUPS_CLAIM: ""
#OIDC_IDENTITY_PROVIDER_ISSUER_URL: ""
#OIDC_IDENTITY_PROVIDER_NAME: ""
#OIDC_IDENTITY_PROVIDER_SCOPES: ""
#OIDC_IDENTITY_PROVIDER_USERNAME_CLAIM: ""
For additional explanations of the different values see here
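Two of the values above are easy to trip on: AVI_CA_DATA_B64 and VSPHERE_TLS_THUMBPRINT. A minimal sketch of how they can be generated, assuming the Avi controller certificate has been saved locally as avi-controller-ca.crt (hypothetical filename):
base64 -w 0 avi-controller-ca.crt
openssl s_client -connect vcsa.cpod-nsxam-stc.az-stc.cloud-garage.net:443 </dev/null 2>/dev/null | openssl x509 -noout -fingerprint -sha1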
When you feel you are ready with the bootstrap yaml file, it's time to deploy the management cluster. From your bootstrap machine, where the Tanzu CLI has been installed, enter the following command:
tanzu mc create --file path/to/cluster-config-file.yaml
For more information around this process have a look here
The first thing that happens is a set of validation checks; if those pass, it will continue to build a local bootstrap cluster on your bootstrap machine before building the TKG management cluster in your vSphere cluster.
Note! If the computer you are accessing the bootstrap machine from uses an IP range within 172.16.0.0/12, you should edit the default Docker address pool, otherwise you will lose the connection to your bootstrap machine. This is done like this:
Add the /etc/docker/daemon.json file, or edit it if it already exists, with the following content:
{
  "default-address-pools": [
    { "base": "192.168.0.0/16", "size": 24 }
  ]
}
Restart the Docker service or reboot the machine.
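On a systemd-based distribution such as Ubuntu, restarting and checking the service would typically look like this:
sudo systemctl restart docker
systemctl status docker --no-pager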
Now, back to the tanzu create process. You can monitor the progress from the terminal of your bootstrap machine, and after a while you should see machines being cloned from your template and powered on. In the Avi controller you should also see a new virtual service being created:


The IP address depicted above belongs to the sole control plane node, as I am deploying a TKG management cluster using the dev plan. If the progress in your bootstrap machine indicates that it is done, you can check the status with the following command:
tanzu mc get
This will give you this output:
NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES PLAN TKR
tkg-stc-mgmt-cluster tkg-system running 1/1 2/2 v1.24.9+vmware.1 management dev v1.24.9---vmware.1-tkg.1
Details:
NAME READY SEVERITY REASON SINCE MESSAGE
/tkg-stc-mgmt-cluster True 8d
├─ClusterInfrastructure - VSphereCluster/tkg-stc-mgmt-cluster-xw6xs True 8d
├─ControlPlane - KubeadmControlPlane/tkg-stc-mgmt-cluster-wrxtl True 8d
│ └─Machine/tkg-stc-mgmt-cluster-wrxtl-gkv5m True 8d
└─Workers
└─MachineDeployment/tkg-stc-mgmt-cluster-md-0-vs9dc True 3d3h
├─Machine/tkg-stc-mgmt-cluster-md-0-vs9dc-55c649d9fc-gnpz4 True 8d
└─Machine/tkg-stc-mgmt-cluster-md-0-vs9dc-55c649d9fc-gwfvt True 8d
Providers:
NAMESPACE NAME TYPE PROVIDERNAME VERSION WATCHNAMESPACE
caip-in-cluster-system infrastructure-ipam-in-cluster InfrastructureProvider ipam-in-cluster v0.1.0
capi-kubeadm-bootstrap-system bootstrap-kubeadm BootstrapProvider kubeadm v1.2.8
capi-kubeadm-control-plane-system control-plane-kubeadm ControlPlaneProvider kubeadm v1.2.8
capi-system cluster-api CoreProvider cluster-api v1.2.8
capv-system infrastructure-vsphere InfrastructureProvider vsphere v1.5.1
When the cluster is deployed and ready, we must set the kubectl context to it before we can access it with the kubectl CLI tool:
kubectl config use-context my-mgmnt-cluster-admin@my-mgmnt-cluster
But you probably have a dedicated workstation you want to access the cluster from; in that case you can export the kubeconfig like this:
tanzu mc kubeconfig get --admin --export-file MC-ADMIN-KUBECONFIG
Now copy the file to your workstation and access the cluster from there.
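A minimal sketch of that, assuming a hypothetical workstation hostname and target path:
scp MC-ADMIN-KUBECONFIG user@workstation:~/.kube/tkg-stc-mgmt-cluster.yaml
kubectl --kubeconfig ~/.kube/tkg-stc-mgmt-cluster.yaml get nodes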
Tip! Test out this tool to easily manage your Kubernetes configs: https://github.com/sunny0826/kubecm
The above is a really great tool:
amarqvardsen@amarqvards1MD6T:~$ kubecm switch --ui-size 10
Use the arrow keys to navigate: ↓ ↑ → ← and / toggles search
Select Kube Context
😼 tkc-cluster-1(*)
tkgs-cluster-1-admin@tkgs-cluster-1
wdc-2-tkc-cluster-1
10.13.200.2
andreasmk8slab-admin@andreasmk8slab-pinniped
ns-wdc-3
tkc-cluster-1-routed
tkg-mgmt-cluster-admin@tkg-mgmt-cluster
stc-tkgm-mgmt-cluster
↓ tkg-wld-1-cluster-admin@tkg-wld-1-cluster
--------- Info ----------
Name: tkc-cluster-1
Cluster: 10.13.202.1
User: wcp:10.13.202.1:andreasm@cpod-nsxam-stc.az-stc.cloud-garage.net
Now your TKG management cluster is ready and we can deploy a workload cluster.
If you noticed some warnings around reconciliation during deployment, you can check whether anything actually failed by issuing the following command once you have the kubeconfig context for the management cluster in place:
andreasm@tkg-bootstrap:~$ kubectl get pkgi -A
NAMESPACE NAME PACKAGE NAME PACKAGE VERSION DESCRIPTION AGE
stc-tkgm-ns-1 stc-tkgm-wld-cluster-1-kapp-controller kapp-controller.tanzu.vmware.com 0.41.5+vmware.1-tkg.1 Reconcile succeeded 7d22h
stc-tkgm-ns-2 stc-tkgm-wld-cluster-2-kapp-controller kapp-controller.tanzu.vmware.com 0.41.5+vmware.1-tkg.1 Reconcile succeeded 7d16h
tkg-system ako-operator ako-operator-v2.tanzu.vmware.com 0.28.0+vmware.1-tkg.1-zshippable Reconcile succeeded 8d
tkg-system tanzu-addons-manager addons-manager.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
tkg-system tanzu-auth tanzu-auth.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
tkg-system tanzu-cliplugins cliplugins.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
tkg-system tanzu-core-management-plugins core-management-plugins.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
tkg-system tanzu-featuregates featuregates.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
tkg-system tanzu-framework framework.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
tkg-system tkg-clusterclass tkg-clusterclass.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
tkg-system tkg-clusterclass-vsphere tkg-clusterclass-vsphere.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
tkg-system tkg-pkg tkg.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
tkg-system tkg-stc-mgmt-cluster-antrea antrea.tanzu.vmware.com 1.7.2+vmware.1-tkg.1-advanced Reconcile succeeded 8d
tkg-system tkg-stc-mgmt-cluster-capabilities capabilities.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
tkg-system tkg-stc-mgmt-cluster-load-balancer-and-ingress-service load-balancer-and-ingress-service.tanzu.vmware.com 1.8.2+vmware.1-tkg.1 Reconcile succeeded 8d
tkg-system tkg-stc-mgmt-cluster-metrics-server metrics-server.tanzu.vmware.com 0.6.2+vmware.1-tkg.1 Reconcile succeeded 8d
tkg-system tkg-stc-mgmt-cluster-pinniped pinniped.tanzu.vmware.com 0.12.1+vmware.2-tkg.3 Reconcile succeeded 8d
tkg-system tkg-stc-mgmt-cluster-secretgen-controller secretgen-controller.tanzu.vmware.com 0.11.2+vmware.1-tkg.1 Reconcile succeeded 8d
tkg-system tkg-stc-mgmt-cluster-tkg-storageclass tkg-storageclass.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
tkg-system tkg-stc-mgmt-cluster-vsphere-cpi vsphere-cpi.tanzu.vmware.com 1.24.3+vmware.1-tkg.1 Reconcile succeeded 8d
tkg-system tkg-stc-mgmt-cluster-vsphere-csi vsphere-csi.tanzu.vmware.com 2.6.2+vmware.2-tkg.1 Reconcile succeeded 8d
tkg-system tkr-service tkr-service.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
tkg-system tkr-source-controller tkr-source-controller.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
tkg-system tkr-vsphere-resolver tkr-vsphere-resolver.tanzu.vmware.com 0.28.0+vmware.1 Reconcile succeeded 8d
TKG Workload cluster deployment #
Now that we have done all the initial configs to support our TKG environment on vSphere, NSX, and Avi, deploying a workload cluster is as simple as loading a game on the Commodore 64 📼 From your bootstrap machine, make sure you are in the context of your TKG management cluster:
andreasm@tkg-bootstrap:~/.config/tanzu/tkg/providers$ kubectl config current-context
tkg-stc-mgmt-cluster-admin@tkg-stc-mgmt-cluster
If you prefer to deploy your workload clusters in their own Kubernetes namespaces, go ahead and create a namespace for your workload cluster like this:
kubectl create ns "name-of-namespace"
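In my case that is the namespace referenced throughout the rest of this post:
kubectl create ns stc-tkgm-ns-1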
Now, to create a workload cluster we also need a yaml definition file. The easiest way to get such a file is to re-use the bootstrap yaml we created for our TKG management cluster. For more information on deploying a workload cluster in TKG, read here. By using the Tanzu CLI we can convert this bootstrap file to a workload cluster yaml definition file, like this:
tanzu cluster create stc-tkgm-wld-cluster-1 --namespace=stc-tkgm-ns-1 --file tkg-mgmt-bootstrap-tkg-2.1.yaml --dry-run > stc-tkg-wld-cluster-1.yaml
The command above reads the bootstrap yaml file we used to deploy the TKG management cluster and converts it into a yaml file we can use to deploy a workload cluster. It also removes fields not needed for our workload cluster. I am using the --namespace flag to point the config at the correct namespace and automatically put that into the yaml file, then I am pointing to the TKG management bootstrap yaml file, and finally the --dry-run flag lets me pipe the output to a file called stc-tkg-wld-cluster-1.yaml. The result should look something like this:
apiVersion: cpi.tanzu.vmware.com/v1alpha1
kind: VSphereCPIConfig
metadata:
name: stc-tkgm-wld-cluster-1
namespace: stc-tkgm-ns-1
spec:
vsphereCPI:
ipFamily: ipv4
mode: vsphereCPI
tlsCipherSuites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
---
apiVersion: csi.tanzu.vmware.com/v1alpha1
kind: VSphereCSIConfig
metadata:
name: stc-tkgm-wld-cluster-1
namespace: stc-tkgm-ns-1
spec:
vsphereCSI:
config:
datacenter: /cPod-NSXAM-STC
httpProxy: ""
httpsProxy: ""
noProxy: ""
region: null
tlsThumbprint: 22:FD
useTopologyCategories: false
zone: null
mode: vsphereCSI
---
apiVersion: run.tanzu.vmware.com/v1alpha3
kind: ClusterBootstrap
metadata:
annotations:
tkg.tanzu.vmware.com/add-missing-fields-from-tkr: v1.24.9---vmware.1-tkg.1
name: stc-tkgm-wld-cluster-1
namespace: stc-tkgm-ns-1
spec:
additionalPackages:
- refName: metrics-server*
- refName: secretgen-controller*
- refName: pinniped*
cpi:
refName: vsphere-cpi*
valuesFrom:
providerRef:
apiGroup: cpi.tanzu.vmware.com
kind: VSphereCPIConfig
name: stc-tkgm-wld-cluster-1
csi:
refName: vsphere-csi*
valuesFrom:
providerRef:
apiGroup: csi.tanzu.vmware.com
kind: VSphereCSIConfig
name: stc-tkgm-wld-cluster-1
kapp:
refName: kapp-controller*
---
apiVersion: v1
kind: Secret
metadata:
name: stc-tkgm-wld-cluster-1
namespace: stc-tkgm-ns-1
stringData:
password: Password
username: andreasm@cpod-nsxam-stc.az-stc.cloud-garage.net
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
annotations:
osInfo: ubuntu,20.04,amd64
tkg/plan: dev
labels:
tkg.tanzu.vmware.com/cluster-name: stc-tkgm-wld-cluster-1
name: stc-tkgm-wld-cluster-1
namespace: stc-tkgm-ns-1
spec:
clusterNetwork:
pods:
cidrBlocks:
- 100.96.0.0/11
services:
cidrBlocks:
- 100.64.0.0/13
topology:
class: tkg-vsphere-default-v1.0.0
controlPlane:
metadata:
annotations:
run.tanzu.vmware.com/resolve-os-image: image-type=ova,os-name=ubuntu
replicas: 1
variables:
- name: controlPlaneCertificateRotation
value:
activate: true
daysBefore: 90
- name: auditLogging
value:
enabled: false
- name: podSecurityStandard
value:
audit: baseline
deactivated: false
warn: baseline
- name: apiServerEndpoint
value: ""
- name: aviAPIServerHAProvider
value: true
- name: vcenter
value:
cloneMode: fullClone
datacenter: /cPod-NSXAM-STC
datastore: /cPod-NSXAM-STC/datastore/vsanDatastore
folder: /cPod-NSXAM-STC/vm/TKGm
network: /cPod-NSXAM-STC/network/ls-tkg-mgmt #Notice this - if you want to place your workload clusters in a different network change this to your desired portgroup.
resourcePool: /cPod-NSXAM-STC/host/Cluster/Resources
server: vcsa.cpod-nsxam-stc.az-stc.cloud-garage.net
storagePolicyID: ""
template: /cPod-NSXAM-STC/vm/ubuntu-2004-efi-kube-v1.24.9+vmware.1
tlsThumbprint: 22:FD
- name: user
value:
sshAuthorizedKeys:
- ssh-rsa 88qv2fowMT65qwpBHUIybHz5Ra2L53zwsv/5yvUej48QLmyAalSNNeH+FIKTkFiuX/WjsHiCIMFisn5dqpc/6x8=
- name: controlPlane
value:
machine:
diskGiB: 20
memoryMiB: 4096
numCPUs: 2
- name: worker
value:
count: 2
machine:
diskGiB: 20
memoryMiB: 4096
numCPUs: 2
version: v1.24.9+vmware.1
workers:
machineDeployments:
- class: tkg-worker
metadata:
annotations:
run.tanzu.vmware.com/resolve-os-image: image-type=ova,os-name=ubuntu
name: md-0
replicas: 2
Read through the result and edit it if you find something you would like to change. If you want to deploy your workload cluster on a different network than your management cluster, edit this field to reflect the correct portgroup in vCenter:
network: /cPod-NSXAM-STC/network/ls-tkg-mgmt
Now that the yaml definition is ready, we can create the first workload cluster like this:
tanzu cluster create --file stc-tkg-wld-cluster-1.yaml
You can monitor the progress from the terminal of your bootstrap machine. When done, check your cluster status with the Tanzu CLI (remember to either use -n “nameofnamespace” or just -A):
andreasm@tkg-bootstrap:~$ tanzu cluster list -A
NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES PLAN TKR
stc-tkgm-wld-cluster-1 stc-tkgm-ns-1 running 1/1 2/2 v1.24.9+vmware.1 <none> dev v1.24.9---vmware.1-tkg.1
stc-tkgm-wld-cluster-2 stc-tkgm-ns-2 running 1/1 2/2 v1.24.9+vmware.1 <none> dev v1.24.9---vmware.1-tkg.1
Further verifications can be done with this command:
andreasm@tkg-bootstrap:~$ tanzu cluster get stc-tkgm-wld-cluster-1 -n stc-tkgm-ns-1
NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES TKR
stc-tkgm-wld-cluster-1 stc-tkgm-ns-1 running 1/1 2/2 v1.24.9+vmware.1 <none> v1.24.9---vmware.1-tkg.1
Details:
NAME READY SEVERITY REASON SINCE MESSAGE
/stc-tkgm-wld-cluster-1 True 7d22h
├─ClusterInfrastructure - VSphereCluster/stc-tkgm-wld-cluster-1-lzjxq True 7d22h
├─ControlPlane - KubeadmControlPlane/stc-tkgm-wld-cluster-1-22z8x True 7d22h
│ └─Machine/stc-tkgm-wld-cluster-1-22z8x-jjb66 True 7d22h
└─Workers
└─MachineDeployment/stc-tkgm-wld-cluster-1-md-0-2qmkw True 3d3h
├─Machine/stc-tkgm-wld-cluster-1-md-0-2qmkw-6c4789d7b5-lj5wl True 7d22h
└─Machine/stc-tkgm-wld-cluster-1-md-0-2qmkw-6c4789d7b5-wb7k9 True 7d22h
If everything is green, it's time to get the kubeconfig for the cluster so we can start consuming it. This is done like this:
tanzu cluster kubeconfig get stc-tkgm-wld-cluster-1 --namespace stc-tkgm-ns-1 --admin --export-file stc-tkgm-wld-cluster-1-k8s-config.yaml
Now you can copy this to your preferred workstation and start consuming the cluster.
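To quickly verify access with the exported file before merging it into your main kubeconfig, something like this should list the new nodes:
kubectl --kubeconfig stc-tkgm-wld-cluster-1-k8s-config.yaml get nodes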
Note! The kubeconfigs I have used here all carry admin privileges and are not something you would use in production, where you want granular user access. I will create a post around user management in both TKGm and TKGs later.
The next sections will cover how to upgrade TKG, as well as some configs on the workload clusters themselves around AKO and Antrea.
Antrea configs #
If there is a feature you would like to enable in Antrea in one of your workload clusters, we need to create an AntreaConfig by using the AntreaConfig CRD (this is one way of doing it) and apply it in the namespace where your workload cluster resides. This is the same approach as in vSphere 8 with Tanzu - see here
apiVersion: cni.tanzu.vmware.com/v1alpha1
kind: AntreaConfig
metadata:
name: stc-tkgm-wld-cluster-1-antrea-package # notice the naming-convention cluster name-antrea-package
namespace: stc-tkgm-ns-1 # the namespace your workload cluster resides in.
spec:
antrea:
config:
featureGates:
AntreaProxy: true
EndpointSlice: false
AntreaPolicy: true
FlowExporter: true
Egress: true
NodePortLocal: true
AntreaTraceflow: true
NetworkPolicyStats: true
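Assuming the manifest above is saved as antrea-config-wld-cluster-1.yaml (hypothetical filename), it can be applied and verified from the management cluster context like this:
kubectl apply -f antrea-config-wld-cluster-1.yaml
kubectl get antreaconfigs -n stc-tkgm-ns-1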
Avi/AKO configs #
In TKGm we can override the default AKO settings by using the AKODeploymentConfig CRD. We apply this configuration from the TKG management cluster to the respective workload cluster by using labels. An example of such a config yaml:
apiVersion: networking.tkg.tanzu.vmware.com/v1alpha1
kind: AKODeploymentConfig
metadata:
name: ako-stc-tkgm-wld-cluster-1
spec:
adminCredentialRef:
name: avi-controller-credentials
namespace: tkg-system-networking
certificateAuthorityRef:
name: avi-controller-ca
namespace: tkg-system-networking
cloudName: stc-nsx-cloud
clusterSelector:
matchLabels:
ako-stc-wld-1: "ako-l7"
controller: 172.24.3.50
dataNetwork:
cidr: 10.13.103.0/24
name: vip-tkg-wld-l7
controlPlaneNetwork:
cidr: 10.13.102.0/24
name: vip-tkg-wld-l4
extraConfigs:
cniPlugin: antrea
disableStaticRouteSync: false # required
ingress:
defaultIngressController: true
disableIngressClass: false # required
nodeNetworkList: # required
- cidrs:
- 10.13.21.0/24
networkName: ls-tkg-wld-1
serviceType: NodePortLocal # required
shardVSSize: SMALL # required
l4Config:
autoFQDN: default
networksConfig:
nsxtT1LR: /infra/tier-1s/Tier-1
serviceEngineGroup: tkgm-se-group
Notice the:
clusterSelector:
matchLabels:
ako-stc-wld-1: "ako-l7"
We need to apply this label to our workload cluster. From the TKG management cluster list all your clusters:
amarqvardsen@amarqvards1MD6T:~/Kubernetes-library/tkgm/stc-tkgm/stc-tkgm-wld-cluster-1$ k get cluster -A
NAMESPACE NAME PHASE AGE VERSION
stc-tkgm-ns-1 stc-tkgm-wld-cluster-1 Provisioned 7d23h v1.24.9+vmware.1
stc-tkgm-ns-2 stc-tkgm-wld-cluster-2 Provisioned 7d17h v1.24.9+vmware.1
tkg-system tkg-stc-mgmt-cluster Provisioned 8d v1.24.9+vmware.1
Apply the above label:
kubectl label cluster -n stc-tkgm-ns-1 stc-tkgm-wld-cluster-1 ako-stc-wld-1=ako-l7
Now run the get cluster command again, but with the --show-labels flag, to see if it has been applied:
amarqvardsen@amarqvards1MD6T:~/Kubernetes-library/tkgm/stc-tkgm/stc-tkgm-wld-cluster-1$ k get cluster -A --show-labels
NAMESPACE NAME PHASE AGE VERSION LABELS
stc-tkgm-ns-1 stc-tkgm-wld-cluster-1 Provisioned 7d23h v1.24.9+vmware.1 ako-stc-wld-1=ako-l7,cluster.x-k8s.io/cluster-name=stc-tkgm-wld-cluster-1,networking.tkg.tanzu.vmware.com/avi=ako-stc-tkgm-wld-cluster-1,run.tanzu.vmware.com/tkr=v1.24.9---vmware.1-tkg.1,tkg.tanzu.vmware.com/cluster-name=stc-tkgm-wld-cluster-1,topology.cluster.x-k8s.io/owned=
Looks good. Then we can apply the AKODeploymentConfig above.
k apply -f ako-wld-cluster-1.yaml
Verify if the AKODeploymentConfig has been applied:
amarqvardsen@amarqvards1MD6T:~/Kubernetes-library/tkgm/stc-tkgm/stc-tkgm-wld-cluster-1$ k get akodeploymentconfigs.networking.tkg.tanzu.vmware.com
NAME AGE
ako-stc-tkgm-wld-cluster-1 7d21h
ako-stc-tkgm-wld-cluster-2 7d6h
install-ako-for-all 8d
install-ako-for-management-cluster 8d
Now head back to your workload cluster and check whether the AKO pod has been restarted; if you don't want to wait, you can always delete the pod to speed up the change (see the example after the configmap output below). To verify the changes, have a look at the AKO configmap like this:
amarqvardsen@amarqvards1MD6T:~/Kubernetes-library/tkgm/stc-tkgm/stc-tkgm-wld-cluster-1$ k get configmaps -n avi-system avi-k8s-config -oyaml
apiVersion: v1
data:
apiServerPort: "8080"
autoFQDN: default
cloudName: stc-nsx-cloud
clusterName: stc-tkgm-ns-1-stc-tkgm-wld-cluster-1
cniPlugin: antrea
controllerIP: 172.24.3.50
controllerVersion: 22.1.2
defaultIngController: "true"
deleteConfig: "false"
disableStaticRouteSync: "false"
fullSyncFrequency: "1800"
logLevel: INFO
nodeNetworkList: '[{"networkName":"ls-tkg-wld-1","cidrs":["10.13.21.0/24"]}]'
nsxtT1LR: /infra/tier-1s/Tier-1
serviceEngineGroupName: tkgm-se-group
serviceType: NodePortLocal
shardVSSize: SMALL
vipNetworkList: '[{"networkName":"vip-tkg-wld-l7","cidr":"10.13.103.0/24"}]'
kind: ConfigMap
metadata:
annotations:
kapp.k14s.io/identity: v1;avi-system//ConfigMap/avi-k8s-config;v1
kapp.k14s.io/original: '{"apiVersion":"v1","data":{"apiServerPort":"8080","autoFQDN":"default","cloudName":"stc-nsx-cloud","clusterName":"stc-tkgm-ns-1-stc-tkgm-wld-cluster-1","cniPlugin":"antrea","controllerIP":"172.24.3.50","controllerVersion":"22.1.2","defaultIngController":"true","deleteConfig":"false","disableStaticRouteSync":"false","fullSyncFrequency":"1800","logLevel":"INFO","nodeNetworkList":"[{\"networkName\":\"ls-tkg-wld-1\",\"cidrs\":[\"10.13.21.0/24\"]}]","nsxtT1LR":"/infra/tier-1s/Tier-1","serviceEngineGroupName":"tkgm-se-group","serviceType":"NodePortLocal","shardVSSize":"SMALL","vipNetworkList":"[{\"networkName\":\"vip-tkg-wld-l7\",\"cidr\":\"10.13.103.0/24\"}]"},"kind":"ConfigMap","metadata":{"labels":{"kapp.k14s.io/app":"1678977773033139694","kapp.k14s.io/association":"v1.ae838cced3b6caccc5a03bfb3ae65cd7"},"name":"avi-k8s-config","namespace":"avi-system"}}'
kapp.k14s.io/original-diff-md5: c6e94dc94aed3401b5d0f26ed6c0bff3
creationTimestamp: "2023-03-16T14:43:11Z"
labels:
kapp.k14s.io/app: "1678977773033139694"
kapp.k14s.io/association: v1.ae838cced3b6caccc5a03bfb3ae65cd7
name: avi-k8s-config
namespace: avi-system
resourceVersion: "19561"
uid: 1baa90b2-e5d7-4177-ae34-6c558b5cfe29
It should reflect the changes we applied…
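If AKO has not restarted on its own yet, a minimal way to check and force it, assuming the default AKO pod name ako-0:
kubectl -n avi-system get pods
kubectl -n avi-system delete pod ako-0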
Antrea RBAC #
Antrea comes with a set of tiers where we can place our Antrea-native policies. These can also be used to restrict who is allowed to apply policies. See this page for more information for now. I will update this section later with my own details, including the integration with NSX.
Upgrade TKG (from 2.1 to 2.1.1) #
When a new TKG release is available we can upgrade to it. The steps I have followed are explained in detail here. I recommend always following the updated information there.
To upgrade TKG these are the typical steps:
- Download the latest Tanzu CLI - from my.vmware.com
- Download the latest Tanzu kubectl - from my.vmware.com
- Download the latest Photon or Ubuntu OVA VM template - from my.vmware.com
- Upgrade the TKG Management cluster
- Upgrade the TKG Workload clusters
So let's get into it.
Upgrade CLI tools and dependencies #
I have already downloaded the Ubuntu VM image for version 2.1.1 into my vCenter and converted it to a template. I have also downloaded the Tanzu CLI tools and Tanzu kubectl for version 2.1.1. Now I need to install the Tanzu CLI and Tanzu kubectl, so I will get back into the bootstrap machine used previously, where I already have Tanzu CLI 2.1 installed.
The first thing I need to do is delete the following file:
~/.config/tanzu/tkg/compatibility/tkg-compatibility.yaml
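Deleting it is a one-liner:
rm ~/.config/tanzu/tkg/compatibility/tkg-compatibility.yaml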
Extract the downloaded Tanzu CLI 2.1.1 packages (this will create a cli folder in your current directory, so if you want to use another folder, create it first and extract the file in there):
tar -xvf tanzu-cli-bundle-linux-amd64.tar.gz
andreasm@tkg-bootstrap:~/tanzu$ tar -xvf tanzu-cli-bundle-linux-amd64.2.1.1.tar.gz
cli/
cli/core/
cli/core/v0.28.1/
cli/core/v0.28.1/tanzu-core-linux_amd64
cli/tanzu-framework-plugins-standalone-linux-amd64.tar.gz
cli/tanzu-framework-plugins-context-linux-amd64.tar.gz
cli/ytt-linux-amd64-v0.43.1+vmware.1.gz
cli/kapp-linux-amd64-v0.53.2+vmware.1.gz
cli/imgpkg-linux-amd64-v0.31.1+vmware.1.gz
cli/kbld-linux-amd64-v0.35.1+vmware.1.gz
cli/vendir-linux-amd64-v0.30.1+vmware.1.gz
Navigate to the cli folder and install the different packages.
Install Tanzu CLI:
andreasm@tkg-bootstrap:~/tanzu/cli$ sudo install core/v0.28.1/tanzu-core-linux_amd64 /usr/local/bin/tanzu
Initialize the Tanzu CLI:
andreasm@tkg-bootstrap:~/tanzu/cli$ tanzu init
ℹ Checking for required plugins...
ℹ Installing plugin 'secret:v0.28.1' with target 'kubernetes'
ℹ Installing plugin 'isolated-cluster:v0.28.1'
ℹ Installing plugin 'login:v0.28.1'
ℹ Installing plugin 'management-cluster:v0.28.1' with target 'kubernetes'
ℹ Installing plugin 'package:v0.28.1' with target 'kubernetes'
ℹ Installing plugin 'pinniped-auth:v0.28.1'
ℹ Installing plugin 'telemetry:v0.28.1' with target 'kubernetes'
ℹ Successfully installed all required plugins
✔ successfully initialized CLI
Verify version:
andreasm@tkg-bootstrap:~/tanzu/cli$ tanzu version
version: v0.28.1
buildDate: 2023-03-07
sha: 0e6704777-dirty
Now the Tanzu plugins:
andreasm@tkg-bootstrap:~/tanzu/cli$ tanzu plugin clean
✔ successfully cleaned up all plugins
andreasm@tkg-bootstrap:~/tanzu/cli$ tanzu plugin sync
ℹ Checking for required plugins...
ℹ Installing plugin 'management-cluster:v0.28.1' with target 'kubernetes'
ℹ Installing plugin 'secret:v0.28.1' with target 'kubernetes'
ℹ Installing plugin 'telemetry:v0.28.1' with target 'kubernetes'
ℹ Installing plugin 'cluster:v0.28.0' with target 'kubernetes'
ℹ Installing plugin 'kubernetes-release:v0.28.0' with target 'kubernetes'
ℹ Installing plugin 'login:v0.28.1'
ℹ Installing plugin 'package:v0.28.1' with target 'kubernetes'
ℹ Installing plugin 'pinniped-auth:v0.28.1'
ℹ Installing plugin 'feature:v0.28.0' with target 'kubernetes'
ℹ Installing plugin 'isolated-cluster:v0.28.1'
✖ [unable to fetch the plugin metadata for plugin "login": could not find the artifact for version:v0.28.1, os:linux, arch:amd64, unable to fetch the plugin metadata for plugin "package": could not find the artifact for version:v0.28.1, os:linux, arch:amd64, unable to fetch the plugin metadata for plugin "pinniped-auth": could not find the artifact for version:v0.28.1, os:linux, arch:amd64, unable to fetch the plugin metadata for plugin "isolated-cluster": could not find the artifact for version:v0.28.1, os:linux, arch:amd64]
andreasm@tkg-bootstrap:~/tanzu/cli$ tanzu plugin sync
ℹ Checking for required plugins...
ℹ Installing plugin 'pinniped-auth:v0.28.1'
ℹ Installing plugin 'isolated-cluster:v0.28.1'
ℹ Installing plugin 'login:v0.28.1'
ℹ Installing plugin 'package:v0.28.1' with target 'kubernetes'
ℹ Successfully installed all required plugins
✔ Done
Note! I had to run the command twice as I encountered an issue on the first try. Now list the plugins:
andreasm@tkg-bootstrap:~/tanzu/cli$ tanzu plugin list
Standalone Plugins
NAME DESCRIPTION TARGET DISCOVERY VERSION STATUS
isolated-cluster isolated-cluster operations default v0.28.1 installed
login Login to the platform default v0.28.1 installed
pinniped-auth Pinniped authentication operations (usually not directly invoked) default v0.28.1 installed
management-cluster Kubernetes management-cluster operations kubernetes default v0.28.1 installed
package Tanzu package management kubernetes default v0.28.1 installed
secret Tanzu secret management kubernetes default v0.28.1 installed
telemetry Configure cluster-wide telemetry settings kubernetes default v0.28.1 installed
Plugins from Context: tkg-stc-mgmt-cluster
NAME DESCRIPTION TARGET VERSION STATUS
cluster Kubernetes cluster operations kubernetes v0.28.0 installed
feature Operate on features and featuregates kubernetes v0.28.0 installed
kubernetes-release Kubernetes release operations kubernetes v0.28.0 installed
Install the Tanzu kubectl:
andreasm@tkg-bootstrap:~/tanzu$ gunzip kubectl-linux-v1.24.10+vmware.1.gz
andreasm@tkg-bootstrap:~/tanzu$ chmod ugo+x kubectl-linux-v1.24.10+vmware.1
andreasm@tkg-bootstrap:~/tanzu$ sudo install kubectl-linux-v1.24.10+vmware.1 /usr/local/bin/kubectl
Check version:
andreasm@tkg-bootstrap:~/tanzu$ kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.10+vmware.1", GitCommit:"b980a736cbd2ac0c5f7ca793122fd4231f705889", GitTreeState:"clean", BuildDate:"2023-01-24T15:36:34Z", GoVersion:"go1.19.5", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.9+vmware.1", GitCommit:"d1d7c19c9b6265a8dcd1b2ab2620ec0fc7cee784", GitTreeState:"clean", BuildDate:"2022-12-14T06:23:39Z", GoVersion:"go1.18.9", Compiler:"gc", Platform:"linux/amd64"}
Install the Carvel tools from the cli folder. First out is ytt. Install ytt:
andreasm@tkg-bootstrap:~/tanzu/cli$ gunzip ytt-linux-amd64-v0.43.1+vmware.1.gz
andreasm@tkg-bootstrap:~/tanzu/cli$ chmod ugo+x ytt-linux-amd64-v0.43.1+vmware.1
andreasm@tkg-bootstrap:~/tanzu/cli$ sudo mv ./ytt-linux-amd64-v0.43.1+vmware.1 /usr/local/bin/ytt
andreasm@tkg-bootstrap:~/tanzu/cli$ ytt --version
ytt version 0.43.1
Install kapp:
andreasm@tkg-bootstrap:~/tanzu/cli$ gunzip kapp-linux-amd64-v0.53.2+vmware.1.gz
andreasm@tkg-bootstrap:~/tanzu/cli$ chmod ugo+x kapp-linux-amd64-v0.53.2+vmware.1
andreasm@tkg-bootstrap:~/tanzu/cli$ sudo mv ./kapp-linux-amd64-v0.53.2+vmware.1 /usr/local/bin/kapp
andreasm@tkg-bootstrap:~/tanzu/cli$ kapp --version
kapp version 0.53.2
Succeeded
Install kbld:
andreasm@tkg-bootstrap:~/tanzu/cli$ gunzip kbld-linux-amd64-v0.35.1+vmware.1.gz
andreasm@tkg-bootstrap:~/tanzu/cli$ chmod ugo+x kbld-linux-amd64-v0.35.1+vmware.1
andreasm@tkg-bootstrap:~/tanzu/cli$ sudo mv ./kbld-linux-amd64-v0.35.1+vmware.1 /usr/local/bin/kbld
andreasm@tkg-bootstrap:~/tanzu/cli$ kbld --version
kbld version 0.35.1
Succeeded
Install imgpkg:
andreasm@tkg-bootstrap:~/tanzu/cli$ gunzip imgpkg-linux-amd64-v0.31.1+vmware.1.gz
andreasm@tkg-bootstrap:~/tanzu/cli$ chmod ugo+x imgpkg-linux-amd64-v0.31.1+vmware.1
andreasm@tkg-bootstrap:~/tanzu/cli$ sudo mv ./imgpkg-linux-amd64-v0.31.1+vmware.1 /usr/local/bin/imgpkg
andreasm@tkg-bootstrap:~/tanzu/cli$ imgpkg --version
imgpkg version 0.31.1
Succeeded
We have now verified the different versions; we should be on Tanzu CLI version v0.28.1.
Upgrade the TKG Management cluster #
Now we can proceed with the upgrade process. One important document to check is this! Known Issues… Check whether you have any environment variables set that are mentioned there; if you do, we need to unset them.
andreasm@tkg-bootstrap:~/tanzu/cli$ printenv
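If any relevant variables show up in the printenv output, they can be cleared before continuing. A sketch, where both the filter pattern and the variable name are just placeholders:
printenv | grep -i -E 'tkg|vsphere|avi|proxy'
unset SOME_TKG_RELATED_VARIABLE   # placeholder name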
I am clear here and will now start upgrading my standalone TKG management cluster. Make sure you are in the context of the TKG management cluster and that you have converted the new Ubuntu VM image to a template.
andreasm@tkg-bootstrap:~$ kubectl config current-context
tkg-stc-mgmt-cluster-admin@tkg-stc-mgmt-cluster
If not, use the following command:
andreasm@tkg-bootstrap:~$ tanzu login
? Select a server [Use arrows to move, type to filter]
> tkg-stc-mgmt-cluster()
+ new server
andreasm@tkg-bootstrap:~$ tanzu login
? Select a server tkg-stc-mgmt-cluster()
✔ successfully logged in to management cluster using the kubeconfig tkg-stc-mgmt-cluster
ℹ Checking for required plugins...
ℹ All required plugins are already installed and up-to-date
Here goes (to start the upgrade of the management cluster):
andreasm@tkg-bootstrap:~$ tanzu mc upgrade
Upgrading management cluster 'tkg-stc-mgmt-cluster' to TKG version 'v2.1.1' with Kubernetes version 'v1.24.10+vmware.1'. Are you sure? [y/N]:
Eh…. yes…
Progress:
andreasm@tkg-bootstrap:~$ tanzu mc upgrade
Upgrading management cluster 'tkg-stc-mgmt-cluster' to TKG version 'v2.1.1' with Kubernetes version 'v1.24.10+vmware.1'. Are you sure? [y/N]: y
Validating the compatibility before management cluster upgrade
Validating for the required environment variables to be set
Validating for the user configuration secret to be existed in the cluster
Warning: unable to find component 'kube_rbac_proxy' under BoM
Upgrading management cluster providers...
infrastructure-ipam-in-cluster provider's version is missing in BOM file, so it would not be upgraded
Checking cert-manager version...
Cert-manager is already up to date
Performing upgrade...
Scaling down Provider="cluster-api" Version="" Namespace="capi-system"
Scaling down Provider="bootstrap-kubeadm" Version="" Namespace="capi-kubeadm-bootstrap-system"
Scaling down Provider="control-plane-kubeadm" Version="" Namespace="capi-kubeadm-control-plane-system"
Scaling down Provider="infrastructure-vsphere" Version="" Namespace="capv-system"
Deleting Provider="cluster-api" Version="" Namespace="capi-system"
Installing Provider="cluster-api" Version="v1.2.8" TargetNamespace="capi-system"
Deleting Provider="bootstrap-kubeadm" Version="" Namespace="capi-kubeadm-bootstrap-system"
Installing Provider="bootstrap-kubeadm" Version="v1.2.8" TargetNamespace="capi-kubeadm-bootstrap-system"
Deleting Provider="control-plane-kubeadm" Version="" Namespace="capi-kubeadm-control-plane-system"
Installing Provider="control-plane-kubeadm" Version="v1.2.8" TargetNamespace="capi-kubeadm-control-plane-system"
Deleting Provider="infrastructure-vsphere" Version="" Namespace="capv-system"
Installing Provider="infrastructure-vsphere" Version="v1.5.3" TargetNamespace="capv-system"
Management cluster providers upgraded successfully...
Preparing addons manager for upgrade
Upgrading kapp-controller...
Adding last-applied annotation on kapp-controller...
Removing old management components...
Upgrading management components...
ℹ Updating package repository 'tanzu-management'
ℹ Getting package repository 'tanzu-management'
ℹ Validating provided settings for the package repository
ℹ Updating package repository resource
ℹ Waiting for 'PackageRepository' reconciliation for 'tanzu-management'
ℹ 'PackageRepository' resource install status: Reconciling
ℹ 'PackageRepository' resource install status: ReconcileSucceeded
ℹ Updated package repository 'tanzu-management' in namespace 'tkg-system'
ℹ Installing package 'tkg.tanzu.vmware.com'
ℹ Updating package 'tkg-pkg'
ℹ Getting package install for 'tkg-pkg'
ℹ Getting package metadata for 'tkg.tanzu.vmware.com'
ℹ Updating secret 'tkg-pkg-tkg-system-values'
ℹ Updating package install for 'tkg-pkg'
ℹ Waiting for 'PackageInstall' reconciliation for 'tkg-pkg'
ℹ 'PackageInstall' resource install status: ReconcileSucceeded
ℹ Updated installed package 'tkg-pkg'
Cleanup core packages repository...
Core package repository not found, no need to cleanup
Upgrading management cluster kubernetes version...
Upgrading kubernetes cluster to `v1.24.10+vmware.1` version, tkr version: `v1.24.10+vmware.1-tkg.2`
Waiting for kubernetes version to be updated for control plane nodes...
Waiting for kubernetes version to be updated for worker nodes...
In vCenter we should start seeing some action as well:

Two control plane nodes:

No longer:

management cluster is opted out of telemetry - skipping telemetry image upgrade
Creating tkg-bom versioned ConfigMaps...
Management cluster 'tkg-stc-mgmt-cluster' successfully upgraded to TKG version 'v2.1.1' with kubernetes version 'v1.24.10+vmware.1'
ℹ Checking for required plugins...
ℹ Installing plugin 'kubernetes-release:v0.28.1' with target 'kubernetes'
ℹ Installing plugin 'cluster:v0.28.1' with target 'kubernetes'
ℹ Installing plugin 'feature:v0.28.1' with target 'kubernetes'
ℹ Successfully installed all required plugins
Well, it finished successfully.
Let's verify with the Tanzu CLI:
andreasm@tkg-bootstrap:~$ tanzu cluster list --include-management-cluster -A
NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES PLAN TKR
stc-tkgm-wld-cluster-1 stc-tkgm-ns-1 running 1/1 2/2 v1.24.9+vmware.1 <none> dev v1.24.9---vmware.1-tkg.1
stc-tkgm-wld-cluster-2 stc-tkgm-ns-2 running 1/1 2/2 v1.24.9+vmware.1 <none> dev v1.24.9---vmware.1-tkg.1
tkg-stc-mgmt-cluster tkg-system running 1/1 2/2 v1.24.10+vmware.1 management dev v1.24.10---vmware.1-tkg.2
Looks good, notice the different versions. The management cluster is upgraded to the latest version, while the workload clusters are still on their older version. They are up next.
Let's do a last check before we head to the workload cluster upgrade.
andreasm@tkg-bootstrap:~$ tanzu mc get
NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES PLAN TKR
tkg-stc-mgmt-cluster tkg-system running 1/1 2/2 v1.24.10+vmware.1 management dev v1.24.10---vmware.1-tkg.2
Details:
NAME READY SEVERITY REASON SINCE MESSAGE
/tkg-stc-mgmt-cluster True 17m
├─ClusterInfrastructure - VSphereCluster/tkg-stc-mgmt-cluster-xw6xs True 8d
├─ControlPlane - KubeadmControlPlane/tkg-stc-mgmt-cluster-wrxtl True 17m
│ └─Machine/tkg-stc-mgmt-cluster-wrxtl-csrnt True 24m
└─Workers
└─MachineDeployment/tkg-stc-mgmt-cluster-md-0-vs9dc True 10m
├─Machine/tkg-stc-mgmt-cluster-md-0-vs9dc-54554f9575-7hdfc True 14m
└─Machine/tkg-stc-mgmt-cluster-md-0-vs9dc-54554f9575-ng9lx True 7m4s
Providers:
NAMESPACE NAME TYPE PROVIDERNAME VERSION WATCHNAMESPACE
caip-in-cluster-system infrastructure-ipam-in-cluster InfrastructureProvider ipam-in-cluster v0.1.0
capi-kubeadm-bootstrap-system bootstrap-kubeadm BootstrapProvider kubeadm v1.2.8
capi-kubeadm-control-plane-system control-plane-kubeadm ControlPlaneProvider kubeadm v1.2.8
capi-system cluster-api CoreProvider cluster-api v1.2.8
capv-system infrastructure-vsphere InfrastructureProvider vsphere v1.5.3
Congrats, head over to the next level 😄
Upgrade workload cluster #
This procedure is much simpler, almost as simple as starting a game in MS-DOS 6.2 requiring a bit over 600 kB of conventional memory. Make sure you are still in the TKG management cluster context.
As done above, list out the clusters you have and notice the versions they are currently on:
andreasm@tkg-bootstrap:~$ tanzu cluster list --include-management-cluster -A
NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES PLAN TKR
stc-tkgm-wld-cluster-1 stc-tkgm-ns-1 running 1/1 2/2 v1.24.9+vmware.1 <none> dev v1.24.9---vmware.1-tkg.1
stc-tkgm-wld-cluster-2 stc-tkgm-ns-2 running 1/1 2/2 v1.24.9+vmware.1 <none> dev v1.24.9---vmware.1-tkg.1
tkg-stc-mgmt-cluster tkg-system running 1/1 2/2 v1.24.10+vmware.1 management dev v1.24.10---vmware.1-tkg.2
Check if there are any new releases available from the management cluster:
andreasm@tkg-bootstrap:~$ tanzu kubernetes-release get
NAME VERSION COMPATIBLE ACTIVE UPDATES AVAILABLE
v1.22.17---vmware.1-tkg.2 v1.22.17+vmware.1-tkg.2 True True
v1.23.16---vmware.1-tkg.2 v1.23.16+vmware.1-tkg.2 True True
v1.24.10---vmware.1-tkg.2 v1.24.10+vmware.1-tkg.2 True True
There is one there: v1.24.10, and it's compatible.
Let's check whether there are any updates ready for our workload cluster:
andreasm@tkg-bootstrap:~$ tanzu cluster available-upgrades get -n stc-tkgm-ns-1 stc-tkgm-wld-cluster-1
NAME VERSION COMPATIBLE
v1.24.10---vmware.1-tkg.2 v1.24.10+vmware.1-tkg.2 True
It is…
Let's upgrade it:
andreasm@tkg-bootstrap:~$ tanzu cluster upgrade -n stc-tkgm-ns-1 stc-tkgm-wld-cluster-1
Upgrading workload cluster 'stc-tkgm-wld-cluster-1' to kubernetes version 'v1.24.10+vmware.1', tkr version 'v1.24.10+vmware.1-tkg.2'. Are you sure? [y/N]: y
Upgrading kubernetes cluster to `v1.24.10+vmware.1` version, tkr version: `v1.24.10+vmware.1-tkg.2`
Waiting for kubernetes version to be updated for control plane nodes...
y for YES
Sit back and wait for the upgrade process to do its thing. You can monitor the output from the current terminal, as well as what is happening in vCenter: clone operations, power on, power off, and delete.

And the result is in:
Waiting for kubernetes version to be updated for worker nodes...
Cluster 'stc-tkgm-wld-cluster-1' successfully upgraded to kubernetes version 'v1.24.10+vmware.1'
We have a winner.
Let's quickly check with the Tanzu CLI:
andreasm@tkg-bootstrap:~$ tanzu cluster get stc-tkgm-wld-cluster-1 -n stc-tkgm-ns-1
NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES TKR
stc-tkgm-wld-cluster-1 stc-tkgm-ns-1 running 1/1 2/2 v1.24.10+vmware.1 <none> v1.24.10---vmware.1-tkg.2
Details:
NAME READY SEVERITY REASON SINCE MESSAGE
/stc-tkgm-wld-cluster-1 True 11m
├─ClusterInfrastructure - VSphereCluster/stc-tkgm-wld-cluster-1-lzjxq True 8d
├─ControlPlane - KubeadmControlPlane/stc-tkgm-wld-cluster-1-22z8x True 11m
│ └─Machine/stc-tkgm-wld-cluster-1-22z8x-mtpgs True 15m
└─Workers
└─MachineDeployment/stc-tkgm-wld-cluster-1-md-0-2qmkw True 39m
├─Machine/stc-tkgm-wld-cluster-1-md-0-2qmkw-58c5764865-7xvfn True 8m31s
└─Machine/stc-tkgm-wld-cluster-1-md-0-2qmkw-58c5764865-c7rqj True 3m29s
Couldn't be better. That's it then. It's Friday, so have a great weekend and thanks for reading.