Tanzu Kubernetes Grid 2.1

Overview

Tanzu Kubernetes Grid

This post will go through how to deploy TKG 2.1: the management cluster, a workload cluster (or two), and the necessary preparations to be done on the underlying infrastructure to support TKG 2.1. In this post I will use vSphere 8 with vSAN, Avi LoadBalancer, and NSX. So what we want to end up with is something like this:

Preparations before deployment

This post will assume the following:

  • vSphere is already installed and configured. See more info here and here

  • NSX has already been configured (see this post for how to configure NSX). Segments used for both the management cluster and the workload clusters should have a DHCP server available. We don't need DHCP for the workload clusters, but the management cluster needs DHCP. NSX can provide DHCP server functionality for this use *

  • NSX Advanced LoadBalancer has been deployed (and configured with an NSX cloud). See this post for how to configure this. **

  • Import the VM template for TKG, see here

  • A dedicated Linux machine/VM we can use as the bootstrap host, with the Tanzu CLI installed. See more info here

(*) TKG 2.1 is not tied to NSX the same way as TKGs, so we can choose to use NSX for security only, or the full stack with networking and security. The built-in NSX load balancer will not be used; I will use the NSX Advanced Loadbalancer (Avi)

(**) I want to use the NSX cloud in Avi as it gives several benefits, such as integration into the NSX Manager where Avi automatically creates security groups, tags and services that can easily be used in security policy creation, and automatic "route plumbing" for the VIPs.

TKG Management cluster - deployment

The first step after all the prerequisites have been taken care of is to prepare a bootstrap yaml for the management cluster. I will post an example file here and go through what the different fields mean, why I have configured them, and why I have uncommented some of them. Start by logging into the bootstrap machine, or if you decide to create the bootstrap yaml somewhere else, go ahead; we just need to copy it over to the bootstrap machine when we are ready to create the management cluster.

To get started with a bootstrap yaml file we can either grab an example from here, or use the folder on the bootstrap machine which contains a default config you can start out with:

 1andreasm@tkg-bootstrap:~/.config/tanzu/tkg/providers$ ll
 2total 120
 3drwxrwxr-x 18 andreasm andreasm  4096 Mar 24 09:10 ./
 4drwx------  9 andreasm andreasm  4096 Mar 16 11:32 ../
 5drwxrwxr-x  2 andreasm andreasm  4096 Mar 16 06:52 ako/
 6drwxrwxr-x  3 andreasm andreasm  4096 Mar 16 06:52 bootstrap-kubeadm/
 7drwxrwxr-x  4 andreasm andreasm  4096 Mar 16 06:52 cert-manager/
 8drwxrwxr-x  3 andreasm andreasm  4096 Mar 16 06:52 cluster-api/
 9-rw-------  1 andreasm andreasm  1293 Mar 16 06:52 config.yaml
10-rw-------  1 andreasm andreasm 32007 Mar 16 06:52 config_default.yaml
11drwxrwxr-x  3 andreasm andreasm  4096 Mar 16 06:52 control-plane-kubeadm/
12drwxrwxr-x  5 andreasm andreasm  4096 Mar 16 06:52 infrastructure-aws/
13drwxrwxr-x  5 andreasm andreasm  4096 Mar 16 06:52 infrastructure-azure/
14drwxrwxr-x  6 andreasm andreasm  4096 Mar 16 06:52 infrastructure-docker/
15drwxrwxr-x  3 andreasm andreasm  4096 Mar 16 06:52 infrastructure-ipam-in-cluster/
16drwxrwxr-x  5 andreasm andreasm  4096 Mar 16 06:52 infrastructure-oci/
17drwxrwxr-x  4 andreasm andreasm  4096 Mar 16 06:52 infrastructure-tkg-service-vsphere/
18drwxrwxr-x  5 andreasm andreasm  4096 Mar 16 06:52 infrastructure-vsphere/
19drwxrwxr-x  2 andreasm andreasm  4096 Mar 16 06:52 kapp-controller-values/
20-rwxrwxr-x  1 andreasm andreasm    64 Mar 16 06:52 providers.sha256sum*
21-rw-------  1 andreasm andreasm     0 Mar 16 06:52 v0.28.0
22-rw-------  1 andreasm andreasm   747 Mar 16 06:52 vendir.lock.yml
23-rw-------  1 andreasm andreasm   903 Mar 16 06:52 vendir.yml
24drwxrwxr-x  8 andreasm andreasm  4096 Mar 16 06:52 ytt/
25drwxrwxr-x  2 andreasm andreasm  4096 Mar 16 06:52 yttcb/
26drwxrwxr-x  7 andreasm andreasm  4096 Mar 16 06:52 yttcc/
27andreasm@tkg-bootstrap:~/.config/tanzu/tkg/providers$

The file you should be looking for is called config_default.yaml. It can be a smart choice to start from this file, as it includes the latest config parameters for the TKG version (Tanzu CLI) you have downloaded.
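Copy this file to a folder of your preference and start editing it there. A minimal sketch of that step (the folder and file names below are just from my setup, adjust to your own):

# copy the default config into a working folder and edit the copy
mkdir -p ~/tkg-configs
cp ~/.config/tanzu/tkg/providers/config_default.yaml ~/tkg-configs/tkg-mgmt-bootstrap-tkg-2.1.yaml
vim ~/tkg-configs/tkg-mgmt-bootstrap-tkg-2.1.yaml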

Below is a copy of an example I am using:

  1#! ---------------
  2#! Basic config
  3#! -------------
  4CLUSTER_NAME: tkg-stc-mgmt-cluster #Name of the TKG mgmt cluster
  5CLUSTER_PLAN: dev #Dev or Prod, defines the amount of control plane nodes of the mgmt cluster
  6INFRASTRUCTURE_PROVIDER: vsphere #We are deploying on vSphere, could be AWS, Azure 
  7ENABLE_CEIP_PARTICIPATION: "false" #Customer Experience Improvement Program - set to true if you will participate
  8ENABLE_AUDIT_LOGGING: "false" #Audit logging should be true in production environments
  9CLUSTER_CIDR: 100.96.0.0/11 #Kubernetes Cluster CIDR
 10SERVICE_CIDR: 100.64.0.0/13 #Kubernetes Services CIDR
 11TKG_IP_FAMILY: ipv4 #ipv4 or ipv6
 12DEPLOY_TKG_ON_VSPHERE7: "true" #Yes to deploy standalone tkg mgmt cluster on vSphere
 13
 14#! ---------------
 15#! vSphere config
 16#! -------------
 17VSPHERE_DATACENTER: /cPod-NSXAM-STC #Name of vSphere Datacenter
 18VSPHERE_DATASTORE: /cPod-NSXAM-STC/datastore/vsanDatastore #Name and path of vSphere datastore to be used
 19VSPHERE_FOLDER: /cPod-NSXAM-STC/vm/TKGm #Name and path to VM folder
 20VSPHERE_INSECURE: "false" #True if you dont want to verify vCenter thumprint below
 21VSPHERE_NETWORK: /cPod-NSXAM-STC/network/ls-tkg-mgmt #A network portgroup (VDS or NSX Segment) for VM placement
 22VSPHERE_CONTROL_PLANE_ENDPOINT: "" #Required if using Kube-Vip, I am using Avi Loadbalancer for this
 23VSPHERE_PASSWORD: "password" #vCenter account password for account defined below
 24VSPHERE_RESOURCE_POOL: /cPod-NSXAM-STC/host/Cluster/Resources #If you want to use a specific vSphere Resource Pool for the mgmt cluster. Leave it as is if not.
 25VSPHERE_SERVER: vcsa.cpod-nsxam-stc.az-stc.cloud-garage.net #DNS record to vCenter Server
 26VSPHERE_SSH_AUTHORIZED_KEY: ssh-rsa sdfgasdgadfgsdg sdfsdf@sdfsdf.net # your bootstrap machineSSH public key
 27VSPHERE_TLS_THUMBPRINT: 22:FD # Your vCenter SHA1 Thumbprint
 28VSPHERE_USERNAME: user@vspheresso/or/ad/user/domain #A user with the correct permissions
 29
 30#! ---------------
 31#! Node config
 32#! -------------
 33OS_ARCH: amd64
 34OS_NAME: ubuntu
 35OS_VERSION: "20.04"
 36VSPHERE_CONTROL_PLANE_DISK_GIB: "20"
 37VSPHERE_CONTROL_PLANE_MEM_MIB: "4096"
 38VSPHERE_CONTROL_PLANE_NUM_CPUS: "2"
 39VSPHERE_WORKER_DISK_GIB: "20"
 40VSPHERE_WORKER_MEM_MIB: "4096"
 41VSPHERE_WORKER_NUM_CPUS: "2"
 42CONTROL_PLANE_MACHINE_COUNT: 1
 43WORKER_MACHINE_COUNT: 2
 44
 45#! ---------------
 46#! Avi config
 47#! -------------
 48AVI_CA_DATA_B64: #Base64 of the Avi Certificate  
 49AVI_CLOUD_NAME: stc-nsx-cloud #Name of the cloud defined in Avi
 50AVI_CONTROL_PLANE_HA_PROVIDER: "true" #True as we want to use Avi as K8s API endpoint 
 51AVI_CONTROLLER: 172.24.3.50 #IP or Hostname Avi controller or controller cluster
 52# Network used to place workload clusters' endpoint VIPs - If you want to use a separate vip for Workload clusters Kubernetes API endpoint
 53AVI_CONTROL_PLANE_NETWORK: vip-tkg-wld-l4 #Corresponds with network defined in Avi
 54AVI_CONTROL_PLANE_NETWORK_CIDR: 10.13.102.0/24 #Corresponds with network defined in Avi
 55# Network used to place workload clusters' services external IPs (load balancer & ingress services)
 56AVI_DATA_NETWORK: vip-tkg-wld-l7 #Corresponds with network defined in Avi
 57AVI_DATA_NETWORK_CIDR: 10.13.103.0/24 #Corresponds with network defined in Avi
 58# Network used to place management clusters' services external IPs (load balancer & ingress services)
 59AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_CIDR: 10.13.101.0/24 #Corresponds with network defined in Avi
 60AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_NAME: vip-tkg-mgmt-l7 #Corresponds with network defined in Avi
 61# Network used to place management clusters' endpoint VIPs
 62AVI_MANAGEMENT_CLUSTER_CONTROL_PLANE_VIP_NETWORK_NAME: vip-tkg-mgmt-l4 #Corresponds with network defined in Avi
 63AVI_MANAGEMENT_CLUSTER_CONTROL_PLANE_VIP_NETWORK_CIDR: 10.13.100.0/24 #Corresponds with network defined in Avi
 64AVI_NSXT_T1LR: /infra/tier-1s/Tier-1 #Path to the NSX T1 you have configured, click on three dots in NSX on the T1 to get the full path.
 65AVI_CONTROLLER_VERSION: 22.1.2 #Latest supported version of Avi for TKG 2.1
 66AVI_ENABLE: "true" # Enables Avi as Loadbalancer for workloads
 67AVI_LABELS: "" #When used Avi is enabled only workload cluster with corresponding label
 68AVI_PASSWORD: "password" #Password for the account used in Avi, username defined below
 69AVI_SERVICE_ENGINE_GROUP: stc-nsx #Service Engine group for Workload clusters if you want to have separate groups for Workload clusters and Management cluster
 70AVI_MANAGEMENT_CLUSTER_SERVICE_ENGINE_GROUP: tkgm-se-group #Dedicated Service Engine group for management cluster
 71AVI_USERNAME: admin
 72AVI_DISABLE_STATIC_ROUTE_SYNC: true #Pod network reachable or not from the Avi Service Engines
 73AVI_INGRESS_DEFAULT_INGRESS_CONTROLLER: true #If you want to use AKO as default ingress controller, false if you plan to use other ingress controllers also.
 74AVI_INGRESS_SHARD_VS_SIZE: SMALL #Decides the amount of shared vs pr ip.
 75AVI_INGRESS_SERVICE_TYPE: NodePortLocal #NodePortLocal only when using Antrea, otherwise NodePort or ClusterIP
 76AVI_CNI_PLUGIN: antrea
 77
 78#! ---------------
 79#! Proxy config
 80#! -------------
 81TKG_HTTP_PROXY_ENABLED: "false"
 82
 83#! ---------------------------------------------------------------------
 84#! Antrea CNI configuration
 85#! ---------------------------------------------------------------------
 86# ANTREA_NO_SNAT: false
 87# ANTREA_TRAFFIC_ENCAP_MODE: "encap"
 88# ANTREA_PROXY: false
 89# ANTREA_POLICY: true
 90# ANTREA_TRACEFLOW: false
 91ANTREA_NODEPORTLOCAL: true
 92ANTREA_PROXY: true
 93ANTREA_ENDPOINTSLICE: true
 94ANTREA_POLICY: true
 95ANTREA_TRACEFLOW: true
 96ANTREA_NETWORKPOLICY_STATS: false
 97ANTREA_EGRESS: true
 98ANTREA_IPAM: false
 99ANTREA_FLOWEXPORTER: false
100ANTREA_SERVICE_EXTERNALIP: false
101ANTREA_MULTICAST: false
102
103#! ---------------------------------------------------------------------
104#! Machine Health Check configuration
105#! ---------------------------------------------------------------------
106ENABLE_MHC: "true"
107ENABLE_MHC_CONTROL_PLANE: true
108ENABLE_MHC_WORKER_NODE: true
109MHC_UNKNOWN_STATUS_TIMEOUT: 5m
110MHC_FALSE_STATUS_TIMEOUT: 12m
111
112#! ---------------------------------------------------------------------
113#! Identity management configuration
114#! ---------------------------------------------------------------------
115
116IDENTITY_MANAGEMENT_TYPE: none #I have disabled this, use kubeconfig instead
117#LDAP_BIND_DN: CN=Andreas M,OU=Users,OU=GUZWARE,DC=guzware,DC=local
118#LDAP_BIND_PASSWORD: <encoded:UHNAc=>
119#LDAP_GROUP_SEARCH_BASE_DN: DC=guzware,DC=local
120#LDAP_GROUP_SEARCH_FILTER: (objectClass=group)
121#LDAP_GROUP_SEARCH_GROUP_ATTRIBUTE: member
122#LDAP_GROUP_SEARCH_NAME_ATTRIBUTE: cn
123#LDAP_GROUP_SEARCH_USER_ATTRIBUTE: distinguishedName
124#LDAP_HOST: guzad07.guzware.local:636
125#LDAP_ROOT_CA_DATA_B64: LS0tLS1CRUd
126#LDAP_USER_SEARCH_BASE_DN: DC=guzware,DC=local
127#LDAP_USER_SEARCH_FILTER: (objectClass=person)
128#LDAP_USER_SEARCH_NAME_ATTRIBUTE: uid
129#LDAP_USER_SEARCH_USERNAME: uid
130#OIDC_IDENTITY_PROVIDER_CLIENT_ID: ""
131#OIDC_IDENTITY_PROVIDER_CLIENT_SECRET: ""
132#OIDC_IDENTITY_PROVIDER_GROUPS_CLAIM: ""
133#OIDC_IDENTITY_PROVIDER_ISSUER_URL: ""
134#OIDC_IDENTITY_PROVIDER_NAME: ""
135#OIDC_IDENTITY_PROVIDER_SCOPES: ""
136#OIDC_IDENTITY_PROVIDER_USERNAME_CLAIM: ""

For additional explanations of the different values see here
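One value worth a note is AVI_CA_DATA_B64, which expects the Avi controller certificate base64-encoded. Assuming you have saved the controller certificate to a file called avi-controller-ca.crt (a file name I have made up here), one way to produce the value could be:

# base64-encode the Avi controller certificate without line wrapping
base64 -w 0 avi-controller-ca.crt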

When you feel you are ready with the bootstrap yaml file, it's time to deploy the management cluster. From your bootstrap machine, where the Tanzu CLI has been installed, enter the following command:

1tanzu mc create --file path/to/cluster-config-file.yaml

For more information around this process have a look here

The first thing that happens is a set of validation checks; if those pass it will continue to build a local bootstrap cluster on your bootstrap machine before building the TKG management cluster in your vSphere cluster.
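Since the local bootstrap cluster runs in Docker (as a kind cluster) on the bootstrap machine, it can be worth doing a quick sanity check that Docker is up before you kick off the deployment. A simple check could be:

# verify that the Docker daemon is running and reachable for your user
docker info --format '{{.ServerVersion}}'
systemctl is-active docker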

Note! If you happen to use an IP range within 172.16.0.0/12 on the computer you are accessing the bootstrap machine from, you should edit the default Docker network. Otherwise you will lose connection to your bootstrap machine. This is done like this:

Add or edit, if it exists, the /etc/docker/daemon.json file with the following content:

1{
2 "default-address-pools":
3 [
4 {"base":"192.168.0.0/16","size":24}
5 ]
6}

Restart the Docker service or reboot the machine.
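On a systemd-based distribution that could simply be:

# apply the new default address pool
sudo systemctl restart docker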

Now back to the tanzu create process. You can monitor the progress from the terminal of your bootstrap machine, and after a while you should see machines being cloned from your template and powered on. In the Avi controller you should also see a new virtual service being created:

The IP address depicted above is the sole control plane node, as I am deploying a TKG management cluster using plan dev. If the progress in your bootstrap machine indicates that it is done, you can check the status with the following command:

1tanzu mc get

This will give you this output:

 1  NAME                  NAMESPACE   STATUS   CONTROLPLANE  WORKERS  KUBERNETES        ROLES       PLAN  TKR
 2  tkg-stc-mgmt-cluster  tkg-system  running  1/1           2/2      v1.24.9+vmware.1  management  dev   v1.24.9---vmware.1-tkg.1
 3
 4
 5Details:
 6
 7NAME                                                                 READY  SEVERITY  REASON  SINCE  MESSAGE
 8/tkg-stc-mgmt-cluster                                                True                     8d
 9├─ClusterInfrastructure - VSphereCluster/tkg-stc-mgmt-cluster-xw6xs  True                     8d
10├─ControlPlane - KubeadmControlPlane/tkg-stc-mgmt-cluster-wrxtl      True                     8d
11│ └─Machine/tkg-stc-mgmt-cluster-wrxtl-gkv5m                         True                     8d
12└─Workers
13  └─MachineDeployment/tkg-stc-mgmt-cluster-md-0-vs9dc                True                     3d3h
14    ├─Machine/tkg-stc-mgmt-cluster-md-0-vs9dc-55c649d9fc-gnpz4       True                     8d
15    └─Machine/tkg-stc-mgmt-cluster-md-0-vs9dc-55c649d9fc-gwfvt       True                     8d
16
17
18Providers:
19
20  NAMESPACE                          NAME                            TYPE                    PROVIDERNAME     VERSION  WATCHNAMESPACE
21  caip-in-cluster-system             infrastructure-ipam-in-cluster  InfrastructureProvider  ipam-in-cluster  v0.1.0
22  capi-kubeadm-bootstrap-system      bootstrap-kubeadm               BootstrapProvider       kubeadm          v1.2.8
23  capi-kubeadm-control-plane-system  control-plane-kubeadm           ControlPlaneProvider    kubeadm          v1.2.8
24  capi-system                        cluster-api                     CoreProvider            cluster-api      v1.2.8
25  capv-system                        infrastructure-vsphere          InfrastructureProvider  vsphere          v1.5.1

When the cluster is deployed and ready, and before we can access it with our kubectl CLI tool, we must set the context to it.

1kubectl config use-context my-mgmnt-cluster-admin@my-mgmnt-cluster

But you probably have a dedicated workstation you want to access the cluster from; then you can export the kubeconfig like this:

1tanzu mc kubeconfig get --admin --export-file MC-ADMIN-KUBECONFIG

Now copy the file to your workstation and access the cluster from there.
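A sketch of what that could look like, assuming the bootstrap machine is reachable as tkg-bootstrap over SSH (host name and paths are just examples):

# copy the exported admin kubeconfig to the workstation and point kubectl at it
scp tkg-bootstrap:~/MC-ADMIN-KUBECONFIG ~/.kube/tkg-stc-mgmt-cluster.yaml
export KUBECONFIG=~/.kube/tkg-stc-mgmt-cluster.yaml
kubectl get nodes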

Tip! Test out this tool to easily manage your Kubernetes configs: https://github.com/sunny0826/kubecm

The above is a really great tool:

 1amarqvardsen@amarqvards1MD6T:~$ kubecm switch --ui-size 10
 2Use the arrow keys to navigate: ↓ ↑ → ←  and / toggles search
 3Select Kube Context
 4  😼 tkc-cluster-1(*)
 5    tkgs-cluster-1-admin@tkgs-cluster-1
 6    wdc-2-tkc-cluster-1
 7    10.13.200.2
 8    andreasmk8slab-admin@andreasmk8slab-pinniped
 9    ns-wdc-3
10    tkc-cluster-1-routed
11    tkg-mgmt-cluster-admin@tkg-mgmt-cluster
12    stc-tkgm-mgmt-cluster
13↓   tkg-wld-1-cluster-admin@tkg-wld-1-cluster
14
15--------- Info ----------
16Name:           tkc-cluster-1
17Cluster:        10.13.202.1
18User:           wcp:10.13.202.1:andreasm@cpod-nsxam-stc.az-stc.cloud-garage.net

Now your TKG management cluster is ready and we can deploy a workload cluster.

If you noticed some warnings around reconciliation during deployment, you can check whether anything actually failed by issuing the command below, once you have the kubeconfig context to the management cluster in place:

 1andreasm@tkg-bootstrap:~$ kubectl get pkgi -A
 2NAMESPACE       NAME                                                     PACKAGE NAME                                         PACKAGE VERSION                    DESCRIPTION           AGE
 3stc-tkgm-ns-1   stc-tkgm-wld-cluster-1-kapp-controller                   kapp-controller.tanzu.vmware.com                     0.41.5+vmware.1-tkg.1              Reconcile succeeded   7d22h
 4stc-tkgm-ns-2   stc-tkgm-wld-cluster-2-kapp-controller                   kapp-controller.tanzu.vmware.com                     0.41.5+vmware.1-tkg.1              Reconcile succeeded   7d16h
 5tkg-system      ako-operator                                             ako-operator-v2.tanzu.vmware.com                     0.28.0+vmware.1-tkg.1-zshippable   Reconcile succeeded   8d
 6tkg-system      tanzu-addons-manager                                     addons-manager.tanzu.vmware.com                      0.28.0+vmware.1                    Reconcile succeeded   8d
 7tkg-system      tanzu-auth                                               tanzu-auth.tanzu.vmware.com                          0.28.0+vmware.1                    Reconcile succeeded   8d
 8tkg-system      tanzu-cliplugins                                         cliplugins.tanzu.vmware.com                          0.28.0+vmware.1                    Reconcile succeeded   8d
 9tkg-system      tanzu-core-management-plugins                            core-management-plugins.tanzu.vmware.com             0.28.0+vmware.1                    Reconcile succeeded   8d
10tkg-system      tanzu-featuregates                                       featuregates.tanzu.vmware.com                        0.28.0+vmware.1                    Reconcile succeeded   8d
11tkg-system      tanzu-framework                                          framework.tanzu.vmware.com                           0.28.0+vmware.1                    Reconcile succeeded   8d
12tkg-system      tkg-clusterclass                                         tkg-clusterclass.tanzu.vmware.com                    0.28.0+vmware.1                    Reconcile succeeded   8d
13tkg-system      tkg-clusterclass-vsphere                                 tkg-clusterclass-vsphere.tanzu.vmware.com            0.28.0+vmware.1                    Reconcile succeeded   8d
14tkg-system      tkg-pkg                                                  tkg.tanzu.vmware.com                                 0.28.0+vmware.1                    Reconcile succeeded   8d
15tkg-system      tkg-stc-mgmt-cluster-antrea                              antrea.tanzu.vmware.com                              1.7.2+vmware.1-tkg.1-advanced      Reconcile succeeded   8d
16tkg-system      tkg-stc-mgmt-cluster-capabilities                        capabilities.tanzu.vmware.com                        0.28.0+vmware.1                    Reconcile succeeded   8d
17tkg-system      tkg-stc-mgmt-cluster-load-balancer-and-ingress-service   load-balancer-and-ingress-service.tanzu.vmware.com   1.8.2+vmware.1-tkg.1               Reconcile succeeded   8d
18tkg-system      tkg-stc-mgmt-cluster-metrics-server                      metrics-server.tanzu.vmware.com                      0.6.2+vmware.1-tkg.1               Reconcile succeeded   8d
19tkg-system      tkg-stc-mgmt-cluster-pinniped                            pinniped.tanzu.vmware.com                            0.12.1+vmware.2-tkg.3              Reconcile succeeded   8d
20tkg-system      tkg-stc-mgmt-cluster-secretgen-controller                secretgen-controller.tanzu.vmware.com                0.11.2+vmware.1-tkg.1              Reconcile succeeded   8d
21tkg-system      tkg-stc-mgmt-cluster-tkg-storageclass                    tkg-storageclass.tanzu.vmware.com                    0.28.0+vmware.1                    Reconcile succeeded   8d
22tkg-system      tkg-stc-mgmt-cluster-vsphere-cpi                         vsphere-cpi.tanzu.vmware.com                         1.24.3+vmware.1-tkg.1              Reconcile succeeded   8d
23tkg-system      tkg-stc-mgmt-cluster-vsphere-csi                         vsphere-csi.tanzu.vmware.com                         2.6.2+vmware.2-tkg.1               Reconcile succeeded   8d
24tkg-system      tkr-service                                              tkr-service.tanzu.vmware.com                         0.28.0+vmware.1                    Reconcile succeeded   8d
25tkg-system      tkr-source-controller                                    tkr-source-controller.tanzu.vmware.com               0.28.0+vmware.1                    Reconcile succeeded   8d
26tkg-system      tkr-vsphere-resolver                                     tkr-vsphere-resolver.tanzu.vmware.com                0.28.0+vmware.1                    Reconcile succeeded   8d

TKG Workload cluster deployment

Now that we have done all the initial configs to support our TKG environment on vSphere, NSX and Avi, deploying a workload cluster is as simple as loading a game on the Commodore 64 📼 From your bootstrap machine, make sure you are in the context of your TKG management cluster:

1andreasm@tkg-bootstrap:~/.config/tanzu/tkg/providers$ kubectl config current-context
2tkg-stc-mgmt-cluster-admin@tkg-stc-mgmt-cluster

If you prefer to deploy your workload clusters in their own Kubernetes namespace, go ahead and create a namespace for your workload cluster like this:

1kubectl create ns "name-of-namespace"

Now to create a workload cluster: this also needs a yaml definition file. The easiest way to get such a file is to re-use the bootstrap yaml we created for our TKG management cluster. For more information on deploying a workload cluster in TKG, read here. By using the Tanzu CLI we can convert this bootstrap file to a workload cluster yaml definition file, like this:

1tanzu cluster create stc-tkgm-wld-cluster-1 --namespace=stc-tkgm-ns-1 --file tkg-mgmt-bootstrap-tkg-2.1.yaml --dry-run > stc-tkg-wld-cluster-1.yaml

The command above reads the bootstrap yaml file we used to deploy the TKG management cluster and converts it into a yaml file we can use to deploy a workload cluster. It also removes unnecessary fields not needed for our workload cluster. I am using the --namespace flag to point the config to the correct namespace and automatically put that into the yaml file. Then I am pointing to the TKG management bootstrap yaml file, and finally --dry-run so the output can be piped to a file called stc-tkg-wld-cluster-1.yaml. The result should look something like this:

  1apiVersion: cpi.tanzu.vmware.com/v1alpha1
  2kind: VSphereCPIConfig
  3metadata:
  4  name: stc-tkgm-wld-cluster-1
  5  namespace: stc-tkgm-ns-1
  6spec:
  7  vsphereCPI:
  8    ipFamily: ipv4
  9    mode: vsphereCPI
 10    tlsCipherSuites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
 11---
 12apiVersion: csi.tanzu.vmware.com/v1alpha1
 13kind: VSphereCSIConfig
 14metadata:
 15  name: stc-tkgm-wld-cluster-1
 16  namespace: stc-tkgm-ns-1
 17spec:
 18  vsphereCSI:
 19    config:
 20      datacenter: /cPod-NSXAM-STC
 21      httpProxy: ""
 22      httpsProxy: ""
 23      noProxy: ""
 24      region: null
 25      tlsThumbprint: 22:FD
 26      useTopologyCategories: false
 27      zone: null
 28    mode: vsphereCSI
 29---
 30apiVersion: run.tanzu.vmware.com/v1alpha3
 31kind: ClusterBootstrap
 32metadata:
 33  annotations:
 34    tkg.tanzu.vmware.com/add-missing-fields-from-tkr: v1.24.9---vmware.1-tkg.1
 35  name: stc-tkgm-wld-cluster-1
 36  namespace: stc-tkgm-ns-1
 37spec:
 38  additionalPackages:
 39  - refName: metrics-server*
 40  - refName: secretgen-controller*
 41  - refName: pinniped*
 42  cpi:
 43    refName: vsphere-cpi*
 44    valuesFrom:
 45      providerRef:
 46        apiGroup: cpi.tanzu.vmware.com
 47        kind: VSphereCPIConfig
 48        name: stc-tkgm-wld-cluster-1
 49  csi:
 50    refName: vsphere-csi*
 51    valuesFrom:
 52      providerRef:
 53        apiGroup: csi.tanzu.vmware.com
 54        kind: VSphereCSIConfig
 55        name: stc-tkgm-wld-cluster-1
 56  kapp:
 57    refName: kapp-controller*
 58---
 59apiVersion: v1
 60kind: Secret
 61metadata:
 62  name: stc-tkgm-wld-cluster-1
 63  namespace: stc-tkgm-ns-1
 64stringData:
 65  password: Password
 66  username: andreasm@cpod-nsxam-stc.az-stc.cloud-garage.net
 67---
 68apiVersion: cluster.x-k8s.io/v1beta1
 69kind: Cluster
 70metadata:
 71  annotations:
 72    osInfo: ubuntu,20.04,amd64
 73    tkg/plan: dev
 74  labels:
 75    tkg.tanzu.vmware.com/cluster-name: stc-tkgm-wld-cluster-1
 76  name: stc-tkgm-wld-cluster-1
 77  namespace: stc-tkgm-ns-1
 78spec:
 79  clusterNetwork:
 80    pods:
 81      cidrBlocks:
 82      - 100.96.0.0/11
 83    services:
 84      cidrBlocks:
 85      - 100.64.0.0/13
 86  topology:
 87    class: tkg-vsphere-default-v1.0.0
 88    controlPlane:
 89      metadata:
 90        annotations:
 91          run.tanzu.vmware.com/resolve-os-image: image-type=ova,os-name=ubuntu
 92      replicas: 1
 93    variables:
 94    - name: controlPlaneCertificateRotation
 95      value:
 96        activate: true
 97        daysBefore: 90
 98    - name: auditLogging
 99      value:
100        enabled: false
101    - name: podSecurityStandard
102      value:
103        audit: baseline
104        deactivated: false
105        warn: baseline
106    - name: apiServerEndpoint
107      value: ""
108    - name: aviAPIServerHAProvider
109      value: true
110    - name: vcenter
111      value:
112        cloneMode: fullClone
113        datacenter: /cPod-NSXAM-STC
114        datastore: /cPod-NSXAM-STC/datastore/vsanDatastore
115        folder: /cPod-NSXAM-STC/vm/TKGm
116        network: /cPod-NSXAM-STC/network/ls-tkg-mgmt #Notice this - if you want to place your workload clusters in a different network change this to your desired portgroup.
117        resourcePool: /cPod-NSXAM-STC/host/Cluster/Resources
118        server: vcsa.cpod-nsxam-stc.az-stc.cloud-garage.net
119        storagePolicyID: ""
120        template: /cPod-NSXAM-STC/vm/ubuntu-2004-efi-kube-v1.24.9+vmware.1
121        tlsThumbprint: 22:FD
122    - name: user
123      value:
124        sshAuthorizedKeys:
125        - ssh-rsa 88qv2fowMT65qwpBHUIybHz5Ra2L53zwsv/5yvUej48QLmyAalSNNeH+FIKTkFiuX/WjsHiCIMFisn5dqpc/6x8=
126    - name: controlPlane
127      value:
128        machine:
129          diskGiB: 20
130          memoryMiB: 4096
131          numCPUs: 2
132    - name: worker
133      value:
134        count: 2
135        machine:
136          diskGiB: 20
137          memoryMiB: 4096
138          numCPUs: 2
139    version: v1.24.9+vmware.1
140    workers:
141      machineDeployments:
142      - class: tkg-worker
143        metadata:
144          annotations:
145            run.tanzu.vmware.com/resolve-os-image: image-type=ova,os-name=ubuntu
146        name: md-0
147        replicas: 2

Read through the result and edit it if you find something you would like to change. If you want to deploy your workload cluster on a different network than your management cluster, edit this field to reflect the correct portgroup in vCenter:

1 network: /cPod-NSXAM-STC/network/ls-tkg-mgmt

Now that the yaml definition is ready we can create the first workload cluster like this:

1tanzu cluster create --file stc-tkg-wld-cluster-1.yaml

You can monitor the progress from the terminal of your bootstrap machine. When done, check your cluster status with the Tanzu CLI (remember to either use -n "nameofnamespace" or just -A):

1andreasm@tkg-bootstrap:~$ tanzu cluster list -A
2  NAME                    NAMESPACE      STATUS   CONTROLPLANE  WORKERS  KUBERNETES        ROLES   PLAN  TKR
3  stc-tkgm-wld-cluster-1  stc-tkgm-ns-1  running  1/1           2/2      v1.24.9+vmware.1  <none>  dev   v1.24.9---vmware.1-tkg.1
4  stc-tkgm-wld-cluster-2  stc-tkgm-ns-2  running  1/1           2/2      v1.24.9+vmware.1  <none>  dev   v1.24.9---vmware.1-tkg.1

Further verification can be done with this command:

 1andreasm@tkg-bootstrap:~$ tanzu cluster get stc-tkgm-wld-cluster-1 -n stc-tkgm-ns-1
 2  NAME                    NAMESPACE      STATUS   CONTROLPLANE  WORKERS  KUBERNETES        ROLES   TKR
 3  stc-tkgm-wld-cluster-1  stc-tkgm-ns-1  running  1/1           2/2      v1.24.9+vmware.1  <none>  v1.24.9---vmware.1-tkg.1
 4
 5
 6Details:
 7
 8NAME                                                                   READY  SEVERITY  REASON  SINCE  MESSAGE
 9/stc-tkgm-wld-cluster-1                                                True                     7d22h
10├─ClusterInfrastructure - VSphereCluster/stc-tkgm-wld-cluster-1-lzjxq  True                     7d22h
11├─ControlPlane - KubeadmControlPlane/stc-tkgm-wld-cluster-1-22z8x      True                     7d22h
12│ └─Machine/stc-tkgm-wld-cluster-1-22z8x-jjb66                         True                     7d22h
13└─Workers
14  └─MachineDeployment/stc-tkgm-wld-cluster-1-md-0-2qmkw                True                     3d3h
15    ├─Machine/stc-tkgm-wld-cluster-1-md-0-2qmkw-6c4789d7b5-lj5wl       True                     7d22h
16    └─Machine/stc-tkgm-wld-cluster-1-md-0-2qmkw-6c4789d7b5-wb7k9       True                     7d22h

If everything is green, it's time to get the kubeconfig for the cluster so we can start consuming it. This is done like this:

1tanzu cluster kubeconfig get stc-tkgm-wld-cluster-1 --namespace stc-tkgm-ns-1 --admin --export-file stc-tkgm-wld-cluster-1-k8s-config.yaml

Now you can copy this to your preferred workstation and start consuming.
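For example, to start using the exported file right away on the workstation (just a sketch, assuming you copied it to your home folder):

# use the exported workload cluster kubeconfig
export KUBECONFIG=~/stc-tkgm-wld-cluster-1-k8s-config.yaml
kubectl get nodes -o wide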

Note! The kubeconfigs I have used here all have admin privileges, which is not something you would use in production where you want granular user access. I will create a post around user management in both TKGm and TKGs later.

The next sections will cover how to upgrade TKG, and some configs on the workload clusters themselves around AKO and Antrea.

Antrea configs

If there is a feature you would like to enable in Antrea in one of your workload clusters, we need to create an AntreaConfig by using the AntreaConfig CRD (this is one way of doing it) and apply it in the namespace where your workload cluster resides. This is the same approach as we use in vSphere 8 with Tanzu - see here

 1apiVersion: cni.tanzu.vmware.com/v1alpha1
 2kind: AntreaConfig
 3metadata:
 4  name: stc-tkgm-wld-cluster-1-antrea-package  # notice the naming-convention cluster name-antrea-package
 5  namespace: stc-tkgm-ns-1 # your vSphere Namespace the TKC cluster is in.
 6spec:
 7  antrea:
 8    config:
 9      featureGates:
10        AntreaProxy: true
11        EndpointSlice: false
12        AntreaPolicy: true
13        FlowExporter: true
14        Egress: true
15        NodePortLocal: true
16        AntreaTraceflow: true
17        NetworkPolicyStats: true
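A sketch of applying and checking the config from the management cluster context, assuming the yaml above is saved as antrea-config-wld-cluster-1.yaml (a file name I have made up):

# apply the AntreaConfig in the namespace where the workload cluster lives
kubectl apply -f antrea-config-wld-cluster-1.yaml
# list the AntreaConfig objects in that namespace
kubectl get antreaconfigs -n stc-tkgm-ns-1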

Avi/AKO configs

In TKGm we can override the default AKO settings by using the AKODeploymentConfig CRD. We apply this configuration from the TKG management cluster to the respective workload cluster by using labels. An example of such a config yaml:

 1apiVersion: networking.tkg.tanzu.vmware.com/v1alpha1
 2kind: AKODeploymentConfig
 3metadata:
 4  name: ako-stc-tkgm-wld-cluster-1
 5spec:
 6  adminCredentialRef:
 7    name: avi-controller-credentials
 8    namespace: tkg-system-networking
 9  certificateAuthorityRef:
10    name: avi-controller-ca
11    namespace: tkg-system-networking
12  cloudName: stc-nsx-cloud
13  clusterSelector:
14    matchLabels:
15      ako-stc-wld-1: "ako-l7"
16  controller: 172.24.3.50
17  dataNetwork:
18    cidr: 10.13.103.0/24
19    name: vip-tkg-wld-l7
20  controlPlaneNetwork:
21    cidr: 10.13.102.0/24
22    name: vip-tkg-wld-l4
23  extraConfigs:
24    cniPlugin: antrea
25    disableStaticRouteSync: false                               # required
26    ingress:
27      defaultIngressController: true
28      disableIngressClass: false                                # required
29      nodeNetworkList:                                          # required
30        - cidrs:
31            - 10.13.21.0/24
32          networkName: ls-tkg-wld-1
33      serviceType: NodePortLocal                                # required
34      shardVSSize: SMALL                                        # required
35    l4Config:
36      autoFQDN: default
37    networksConfig:
38      nsxtT1LR: /infra/tier-1s/Tier-1
39  serviceEngineGroup: tkgm-se-group

Notice the:

1  clusterSelector:
2    matchLabels:
3      ako-stc-wld-1: "ako-l7"

We need to apply this label to our workload cluster. From the TKG management cluster list all your clusters:

1amarqvardsen@amarqvards1MD6T:~/Kubernetes-library/tkgm/stc-tkgm/stc-tkgm-wld-cluster-1$ k get cluster -A
2NAMESPACE       NAME                     PHASE         AGE     VERSION
3stc-tkgm-ns-1   stc-tkgm-wld-cluster-1   Provisioned   7d23h   v1.24.9+vmware.1
4stc-tkgm-ns-2   stc-tkgm-wld-cluster-2   Provisioned   7d17h   v1.24.9+vmware.1
5tkg-system      tkg-stc-mgmt-cluster     Provisioned   8d      v1.24.9+vmware.1

Apply the above label:

1kubectl label cluster -n stc-tkgm-ns-1 stc-tkgm-wld-cluster-1 ako-stc-wld-1=ako-l7

Now run the get cluster command again, but with the flag --show-labels, to see if it has been applied:

1amarqvardsen@amarqvards1MD6T:~/Kubernetes-library/tkgm/stc-tkgm/stc-tkgm-wld-cluster-1$ k get cluster -A --show-labels
2NAMESPACE       NAME                     PHASE         AGE     VERSION            LABELS
3stc-tkgm-ns-1   stc-tkgm-wld-cluster-1   Provisioned   7d23h   v1.24.9+vmware.1   ako-stc-wld-1=ako-l7,cluster.x-k8s.io/cluster-name=stc-tkgm-wld-cluster-1,networking.tkg.tanzu.vmware.com/avi=ako-stc-tkgm-wld-cluster-1,run.tanzu.vmware.com/tkr=v1.24.9---vmware.1-tkg.1,tkg.tanzu.vmware.com/cluster-name=stc-tkgm-wld-cluster-1,topology.cluster.x-k8s.io/owned=

Looks good. Then we can apply the AKODeploymentConfig above.

1k apply -f ako-wld-cluster-1.yaml

Verify that the AKODeploymentConfig has been applied:

1amarqvardsen@amarqvards1MD6T:~/Kubernetes-library/tkgm/stc-tkgm/stc-tkgm-wld-cluster-1$ k get akodeploymentconfigs.networking.tkg.tanzu.vmware.com
2NAME                                 AGE
3ako-stc-tkgm-wld-cluster-1           7d21h
4ako-stc-tkgm-wld-cluster-2           7d6h
5install-ako-for-all                  8d
6install-ako-for-management-cluster   8d

Now head back to your workload cluster and check whether the AKO pod has been restarted; if you don't want to wait, you can always delete the pod to speed up the changes. To verify the changes, have a look at the AKO configmap like this:

 1amarqvardsen@amarqvards1MD6T:~/Kubernetes-library/tkgm/stc-tkgm/stc-tkgm-wld-cluster-1$ k get configmaps -n avi-system avi-k8s-config -oyaml
 2apiVersion: v1
 3data:
 4  apiServerPort: "8080"
 5  autoFQDN: default
 6  cloudName: stc-nsx-cloud
 7  clusterName: stc-tkgm-ns-1-stc-tkgm-wld-cluster-1
 8  cniPlugin: antrea
 9  controllerIP: 172.24.3.50
10  controllerVersion: 22.1.2
11  defaultIngController: "true"
12  deleteConfig: "false"
13  disableStaticRouteSync: "false"
14  fullSyncFrequency: "1800"
15  logLevel: INFO
16  nodeNetworkList: '[{"networkName":"ls-tkg-wld-1","cidrs":["10.13.21.0/24"]}]'
17  nsxtT1LR: /infra/tier-1s/Tier-1
18  serviceEngineGroupName: tkgm-se-group
19  serviceType: NodePortLocal
20  shardVSSize: SMALL
21  vipNetworkList: '[{"networkName":"vip-tkg-wld-l7","cidr":"10.13.103.0/24"}]'
22kind: ConfigMap
23metadata:
24  annotations:
25    kapp.k14s.io/identity: v1;avi-system//ConfigMap/avi-k8s-config;v1
26    kapp.k14s.io/original: '{"apiVersion":"v1","data":{"apiServerPort":"8080","autoFQDN":"default","cloudName":"stc-nsx-cloud","clusterName":"stc-tkgm-ns-1-stc-tkgm-wld-cluster-1","cniPlugin":"antrea","controllerIP":"172.24.3.50","controllerVersion":"22.1.2","defaultIngController":"true","deleteConfig":"false","disableStaticRouteSync":"false","fullSyncFrequency":"1800","logLevel":"INFO","nodeNetworkList":"[{\"networkName\":\"ls-tkg-wld-1\",\"cidrs\":[\"10.13.21.0/24\"]}]","nsxtT1LR":"/infra/tier-1s/Tier-1","serviceEngineGroupName":"tkgm-se-group","serviceType":"NodePortLocal","shardVSSize":"SMALL","vipNetworkList":"[{\"networkName\":\"vip-tkg-wld-l7\",\"cidr\":\"10.13.103.0/24\"}]"},"kind":"ConfigMap","metadata":{"labels":{"kapp.k14s.io/app":"1678977773033139694","kapp.k14s.io/association":"v1.ae838cced3b6caccc5a03bfb3ae65cd7"},"name":"avi-k8s-config","namespace":"avi-system"}}'
27    kapp.k14s.io/original-diff-md5: c6e94dc94aed3401b5d0f26ed6c0bff3
28  creationTimestamp: "2023-03-16T14:43:11Z"
29  labels:
30    kapp.k14s.io/app: "1678977773033139694"
31    kapp.k14s.io/association: v1.ae838cced3b6caccc5a03bfb3ae65cd7
32  name: avi-k8s-config
33  namespace: avi-system
34  resourceVersion: "19561"
35  uid: 1baa90b2-e5d7-4177-ae34-6c558b5cfe29

It should reflect the changes we applied...
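If the configmap has not picked up the new settings yet, you can check the AKO pod and, if needed, nudge it. A sketch (assuming the default avi-system namespace; the pod is typically named ako-0, adjust if yours differs):

# check the AKO pod in the workload cluster
kubectl get pods -n avi-system
# delete it to force a restart so it picks up the new config
kubectl delete pod -n avi-system ako-0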

Antrea RBAC

Antrea comes with a list of tiers where we can place our Antrea native policies. These can also be used to restrict who is allowed to apply policies and who is not. See this page for more information for now. I will update this section later with my own details, including the integration with NSX.
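For reference, the tiers are exposed as a cluster-scoped Antrea CRD, so a quick way to see what is available in a workload cluster could be (a sketch):

# list the Antrea tiers in the cluster
kubectl get tiers.crd.antrea.io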

Upgrade TKG (from 2.1 to 2.1.1)

When a new TKG release is available we can upgrade to use this new release. The steps I have followed are explained in detail here. I recommend always following the updated information there.

To upgrade TKG these are the typical steps:

  1. Download the latest Tanzu CLI - from my.vmware.com
  2. Download the latest Tanzu kubectl - from my.vmware.com
  3. Download the latest Photon or Ubuntu OVA VM template - from my.vmware.com
  4. Upgrade the TKG Management cluster
  5. Upgrade the TKG Workload clusters

So let's get into it.

Upgrade CLI tools and dependencies

I have already downloaded the Ubuntu VM image for version 2.1.1 into my vCenter and converted it to a template. I have also downloaded the Tanzu CLI tools and Tanzu kubectl for version 2.1.1. Now I need to install the Tanzu CLI and Tanzu kubectl, so I am getting back into the bootstrap machine used previously, where I already have Tanzu CLI 2.1 installed.

The first thing I need to do is to delete the following file:

1~/.config/tanzu/tkg/compatibility/tkg-compatibility.yaml
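For example:

rm ~/.config/tanzu/tkg/compatibility/tkg-compatibility.yaml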

Extract the downloaded Tanzu CLI 2.1.1 packages (this will create a cli folder in your current directory, so if you want to use another folder, create it first and extract the file in there):

1tar -xvf tanzu-cli-bundle-linux-amd64.tar.gz
 1andreasm@tkg-bootstrap:~/tanzu$ tar -xvf tanzu-cli-bundle-linux-amd64.2.1.1.tar.gz
 2cli/
 3cli/core/
 4cli/core/v0.28.1/
 5cli/core/v0.28.1/tanzu-core-linux_amd64
 6cli/tanzu-framework-plugins-standalone-linux-amd64.tar.gz
 7cli/tanzu-framework-plugins-context-linux-amd64.tar.gz
 8cli/ytt-linux-amd64-v0.43.1+vmware.1.gz
 9cli/kapp-linux-amd64-v0.53.2+vmware.1.gz
10cli/imgpkg-linux-amd64-v0.31.1+vmware.1.gz
11cli/kbld-linux-amd64-v0.35.1+vmware.1.gz
12cli/vendir-linux-amd64-v0.30.1+vmware.1.gz

Navigate to the cli folder and install the different packages.

Install Tanzu CLI:

1andreasm@tkg-bootstrap:~/tanzu/cli$ sudo install core/v0.28.1/tanzu-core-linux_amd64 /usr/local/bin/tanzu

Initialize the Tanzu CLI:

 1andreasm@tkg-bootstrap:~/tanzu/cli$ tanzu init
 2ℹ  Checking for required plugins...
 3ℹ  Installing plugin 'secret:v0.28.1' with target 'kubernetes'
 4ℹ  Installing plugin 'isolated-cluster:v0.28.1'
 5ℹ  Installing plugin 'login:v0.28.1'
 6ℹ  Installing plugin 'management-cluster:v0.28.1' with target 'kubernetes'
 7ℹ  Installing plugin 'package:v0.28.1' with target 'kubernetes'
 8ℹ  Installing plugin 'pinniped-auth:v0.28.1'
 9ℹ  Installing plugin 'telemetry:v0.28.1' with target 'kubernetes'
10ℹ  Successfully installed all required plugins
11✔  successfully initialized CLI

Verify version:

1andreasm@tkg-bootstrap:~/tanzu/cli$ tanzu version
2version: v0.28.1
3buildDate: 2023-03-07
4sha: 0e6704777-dirty

Now the Tanzu plugins:

1andreasm@tkg-bootstrap:~/tanzu/cli$ tanzu plugin clean
2✔  successfully cleaned up all plugins
 1andreasm@tkg-bootstrap:~/tanzu/cli$ tanzu plugin sync
 2ℹ  Checking for required plugins...
 3ℹ  Installing plugin 'management-cluster:v0.28.1' with target 'kubernetes'
 4ℹ  Installing plugin 'secret:v0.28.1' with target 'kubernetes'
 5ℹ  Installing plugin 'telemetry:v0.28.1' with target 'kubernetes'
 6ℹ  Installing plugin 'cluster:v0.28.0' with target 'kubernetes'
 7ℹ  Installing plugin 'kubernetes-release:v0.28.0' with target 'kubernetes'
 8ℹ  Installing plugin 'login:v0.28.1'
 9ℹ  Installing plugin 'package:v0.28.1' with target 'kubernetes'
10ℹ  Installing plugin 'pinniped-auth:v0.28.1'
11ℹ  Installing plugin 'feature:v0.28.0' with target 'kubernetes'
12ℹ  Installing plugin 'isolated-cluster:v0.28.1'
13[unable to fetch the plugin metadata for plugin "login": could not find the artifact for version:v0.28.1, os:linux, arch:amd64, unable to fetch the plugin metadata for plugin "package": could not find the artifact for version:v0.28.1, os:linux, arch:amd64, unable to fetch the plugin metadata for plugin "pinniped-auth": could not find the artifact for version:v0.28.1, os:linux, arch:amd64, unable to fetch the plugin metadata for plugin "isolated-cluster": could not find the artifact for version:v0.28.1, os:linux, arch:amd64]
14andreasm@tkg-bootstrap:~/tanzu/cli$ tanzu plugin sync
15ℹ  Checking for required plugins...
16ℹ  Installing plugin 'pinniped-auth:v0.28.1'
17ℹ  Installing plugin 'isolated-cluster:v0.28.1'
18ℹ  Installing plugin 'login:v0.28.1'
19ℹ  Installing plugin 'package:v0.28.1' with target 'kubernetes'
20ℹ  Successfully installed all required plugins
21✔  Done

Note! I had to run the command twice as I encountered an issue on the first try. Now list the plugins:

 1andreasm@tkg-bootstrap:~/tanzu/cli$ tanzu plugin list
 2Standalone Plugins
 3  NAME                DESCRIPTION                                                        TARGET      DISCOVERY  VERSION  STATUS
 4  isolated-cluster    isolated-cluster operations                                                    default    v0.28.1  installed
 5  login               Login to the platform                                                          default    v0.28.1  installed
 6  pinniped-auth       Pinniped authentication operations (usually not directly invoked)              default    v0.28.1  installed
 7  management-cluster  Kubernetes management-cluster operations                           kubernetes  default    v0.28.1  installed
 8  package             Tanzu package management                                           kubernetes  default    v0.28.1  installed
 9  secret              Tanzu secret management                                            kubernetes  default    v0.28.1  installed
10  telemetry           Configure cluster-wide telemetry settings                          kubernetes  default    v0.28.1  installed
11
12Plugins from Context:  tkg-stc-mgmt-cluster
13  NAME                DESCRIPTION                           TARGET      VERSION  STATUS
14  cluster             Kubernetes cluster operations         kubernetes  v0.28.0  installed
15  feature             Operate on features and featuregates  kubernetes  v0.28.0  installed
16  kubernetes-release  Kubernetes release operations         kubernetes  v0.28.0  installed

Install the Tanzu kubectl:

1andreasm@tkg-bootstrap:~/tanzu$ gunzip kubectl-linux-v1.24.10+vmware.1.gz
2andreasm@tkg-bootstrap:~/tanzu$ chmod ugo+x kubectl-linux-v1.24.10+vmware.1
3andreasm@tkg-bootstrap:~/tanzu$ sudo install kubectl-linux-v1.24.10+vmware.1 /usr/local/bin/kubectl

Check version:

1andreasm@tkg-bootstrap:~/tanzu$ kubectl version
2WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
3Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.10+vmware.1", GitCommit:"b980a736cbd2ac0c5f7ca793122fd4231f705889", GitTreeState:"clean", BuildDate:"2023-01-24T15:36:34Z", GoVersion:"go1.19.5", Compiler:"gc", Platform:"linux/amd64"}
4Kustomize Version: v4.5.4
5Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.9+vmware.1", GitCommit:"d1d7c19c9b6265a8dcd1b2ab2620ec0fc7cee784", GitTreeState:"clean", BuildDate:"2022-12-14T06:23:39Z", GoVersion:"go1.18.9", Compiler:"gc", Platform:"linux/amd64"}

Install the Carvel tools. From the cli folder, first out is ytt. Install ytt:

1andreasm@tkg-bootstrap:~/tanzu/cli$ gunzip ytt-linux-amd64-v0.43.1+vmware.1.gz
2andreasm@tkg-bootstrap:~/tanzu/cli$ chmod ugo+x ytt-linux-amd64-v0.43.1+vmware.1
3andreasm@tkg-bootstrap:~/tanzu/cli$ sudo mv ./ytt-linux-amd64-v0.43.1+vmware.1 /usr/local/bin/ytt
4andreasm@tkg-bootstrap:~/tanzu/cli$ ytt --version
5ytt version 0.43.1

Install kapp:

1andreasm@tkg-bootstrap:~/tanzu/cli$ gunzip kapp-linux-amd64-v0.53.2+vmware.1.gz
2andreasm@tkg-bootstrap:~/tanzu/cli$ chmod ugo+x kapp-linux-amd64-v0.53.2+vmware.1
3andreasm@tkg-bootstrap:~/tanzu/cli$ sudo mv ./kapp-linux-amd64-v0.53.2+vmware.1 /usr/local/bin/kapp
4andreasm@tkg-bootstrap:~/tanzu/cli$ kapp --version
5kapp version 0.53.2
6
7Succeeded

Install kbld:

1andreasm@tkg-bootstrap:~/tanzu/cli$ gunzip kbld-linux-amd64-v0.35.1+vmware.1.gz
2andreasm@tkg-bootstrap:~/tanzu/cli$ chmod ugo+x kbld-linux-amd64-v0.35.1+vmware.1
3andreasm@tkg-bootstrap:~/tanzu/cli$ sudo mv ./kbld-linux-amd64-v0.35.1+vmware.1 /usr/local/bin/kbld
4andreasm@tkg-bootstrap:~/tanzu/cli$ kbld --version
5kbld version 0.35.1
6
7Succeeded

Install imgpkg:

1andreasm@tkg-bootstrap:~/tanzu/cli$ gunzip imgpkg-linux-amd64-v0.31.1+vmware.1.gz
2andreasm@tkg-bootstrap:~/tanzu/cli$ chmod ugo+x imgpkg-linux-amd64-v0.31.1+vmware.1
3andreasm@tkg-bootstrap:~/tanzu/cli$ sudo mv ./imgpkg-linux-amd64-v0.31.1+vmware.1 /usr/local/bin/imgpkg
4andreasm@tkg-bootstrap:~/tanzu/cli$ imgpkg --version
5imgpkg version 0.31.1
6
7Succeeded

We have now verified the different versions; the Tanzu CLI version should be v0.28.1.

Upgrade the TKG Management cluster

Now we can proceed with the upgrade process. One important document to check is this! Known Issues... Check whether you are using any environment variables; if you happen to use them we need to unset them.

1andreasm@tkg-bootstrap:~/tanzu/cli$ printenv
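If any of the variables called out in the known-issues document show up, unset them before upgrading, for example (the variable name below is purely illustrative):

unset SOME_TKG_RELATED_VARIABLE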

I am clear here and will now start the upgrade of my standalone TKG management cluster. Make sure you are in the context of the TKG management cluster and that you have converted the new Ubuntu VM image to a template.

1andreasm@tkg-bootstrap:~$ kubectl config current-context
2tkg-stc-mgmt-cluster-admin@tkg-stc-mgmt-cluster

If not, use the following command:

1andreasm@tkg-bootstrap:~$ tanzu login
2? Select a server  [Use arrows to move, type to filter]
3> tkg-stc-mgmt-cluster()
4  + new server
1andreasm@tkg-bootstrap:~$ tanzu login
2? Select a server tkg-stc-mgmt-cluster()
3✔  successfully logged in to management cluster using the kubeconfig tkg-stc-mgmt-cluster
4ℹ  Checking for required plugins...
5ℹ  All required plugins are already installed and up-to-date

Here goes (this starts the upgrade of the management cluster):

1andreasm@tkg-bootstrap:~$ tanzu mc upgrade
2Upgrading management cluster 'tkg-stc-mgmt-cluster' to TKG version 'v2.1.1' with Kubernetes version 'v1.24.10+vmware.1'. Are you sure? [y/N]:

Eh.... yes...

Progress:

 1andreasm@tkg-bootstrap:~$ tanzu mc upgrade
 2Upgrading management cluster 'tkg-stc-mgmt-cluster' to TKG version 'v2.1.1' with Kubernetes version 'v1.24.10+vmware.1'. Are you sure? [y/N]: y
 3Validating the compatibility before management cluster upgrade
 4Validating for the required environment variables to be set
 5Validating for the user configuration secret to be existed in the cluster
 6Warning: unable to find component 'kube_rbac_proxy' under BoM
 7Upgrading management cluster providers...
 8 infrastructure-ipam-in-cluster provider's version is missing in BOM file, so it would not be upgraded
 9Checking cert-manager version...
10Cert-manager is already up to date
11Performing upgrade...
12Scaling down Provider="cluster-api" Version="" Namespace="capi-system"
13Scaling down Provider="bootstrap-kubeadm" Version="" Namespace="capi-kubeadm-bootstrap-system"
14Scaling down Provider="control-plane-kubeadm" Version="" Namespace="capi-kubeadm-control-plane-system"
15Scaling down Provider="infrastructure-vsphere" Version="" Namespace="capv-system"
16Deleting Provider="cluster-api" Version="" Namespace="capi-system"
17Installing Provider="cluster-api" Version="v1.2.8" TargetNamespace="capi-system"
18Deleting Provider="bootstrap-kubeadm" Version="" Namespace="capi-kubeadm-bootstrap-system"
19Installing Provider="bootstrap-kubeadm" Version="v1.2.8" TargetNamespace="capi-kubeadm-bootstrap-system"
20Deleting Provider="control-plane-kubeadm" Version="" Namespace="capi-kubeadm-control-plane-system"
21Installing Provider="control-plane-kubeadm" Version="v1.2.8" TargetNamespace="capi-kubeadm-control-plane-system"
22Deleting Provider="infrastructure-vsphere" Version="" Namespace="capv-system"
23Installing Provider="infrastructure-vsphere" Version="v1.5.3" TargetNamespace="capv-system"
24Management cluster providers upgraded successfully...
25Preparing addons manager for upgrade
26Upgrading kapp-controller...
27Adding last-applied annotation on kapp-controller...
28Removing old management components...
29Upgrading management components...
30ℹ   Updating package repository 'tanzu-management'
31ℹ   Getting package repository 'tanzu-management'
32ℹ   Validating provided settings for the package repository
33ℹ   Updating package repository resource
34ℹ   Waiting for 'PackageRepository' reconciliation for 'tanzu-management'
35ℹ   'PackageRepository' resource install status: Reconciling
36ℹ   'PackageRepository' resource install status: ReconcileSucceeded
37ℹ  Updated package repository 'tanzu-management' in namespace 'tkg-system'
38ℹ   Installing package 'tkg.tanzu.vmware.com'
39ℹ   Updating package 'tkg-pkg'
40ℹ   Getting package install for 'tkg-pkg'
41ℹ   Getting package metadata for 'tkg.tanzu.vmware.com'
42ℹ   Updating secret 'tkg-pkg-tkg-system-values'
43ℹ   Updating package install for 'tkg-pkg'
44ℹ   Waiting for 'PackageInstall' reconciliation for 'tkg-pkg'
45ℹ   'PackageInstall' resource install status: ReconcileSucceeded
46ℹ  Updated installed package 'tkg-pkg'
47Cleanup core packages repository...
48Core package repository not found, no need to cleanup
49Upgrading management cluster kubernetes version...
50Upgrading kubernetes cluster to `v1.24.10+vmware.1` version, tkr version: `v1.24.10+vmware.1-tkg.2`
51Waiting for kubernetes version to be updated for control plane nodes...
52Waiting for kubernetes version to be updated for worker nodes...

In vCenter we should start to see some action as well:

Two control plane nodes:

No longer:

1management cluster is opted out of telemetry - skipping telemetry image upgrade
2Creating tkg-bom versioned ConfigMaps...
3Management cluster 'tkg-stc-mgmt-cluster' successfully upgraded to TKG version 'v2.1.1' with kubernetes version 'v1.24.10+vmware.1'
4ℹ  Checking for required plugins...
5ℹ  Installing plugin 'kubernetes-release:v0.28.1' with target 'kubernetes'
6ℹ  Installing plugin 'cluster:v0.28.1' with target 'kubernetes'
7ℹ  Installing plugin 'feature:v0.28.1' with target 'kubernetes'
8ℹ  Successfully installed all required plugins

Well, it finished successfully.

Let's verify with the Tanzu CLI:

1andreasm@tkg-bootstrap:~$ tanzu cluster list --include-management-cluster -A
2  NAME                    NAMESPACE      STATUS   CONTROLPLANE  WORKERS  KUBERNETES         ROLES       PLAN  TKR
3  stc-tkgm-wld-cluster-1  stc-tkgm-ns-1  running  1/1           2/2      v1.24.9+vmware.1   <none>      dev   v1.24.9---vmware.1-tkg.1
4  stc-tkgm-wld-cluster-2  stc-tkgm-ns-2  running  1/1           2/2      v1.24.9+vmware.1   <none>      dev   v1.24.9---vmware.1-tkg.1
5  tkg-stc-mgmt-cluster    tkg-system     running  1/1           2/2      v1.24.10+vmware.1  management  dev   v1.24.10---vmware.1-tkg.2

Looks good, notice the different versions. The management cluster is upgraded to the latest version, while the workload clusters are still on their older version. They are up next.

Let's do a last check before we head over to the workload cluster upgrade.

 1andreasm@tkg-bootstrap:~$ tanzu mc get
 2  NAME                  NAMESPACE   STATUS   CONTROLPLANE  WORKERS  KUBERNETES         ROLES       PLAN  TKR
 3  tkg-stc-mgmt-cluster  tkg-system  running  1/1           2/2      v1.24.10+vmware.1  management  dev   v1.24.10---vmware.1-tkg.2
 4
 5
 6Details:
 7
 8NAME                                                                 READY  SEVERITY  REASON  SINCE  MESSAGE
 9/tkg-stc-mgmt-cluster                                                True                     17m
10├─ClusterInfrastructure - VSphereCluster/tkg-stc-mgmt-cluster-xw6xs  True                     8d
11├─ControlPlane - KubeadmControlPlane/tkg-stc-mgmt-cluster-wrxtl      True                     17m
12│ └─Machine/tkg-stc-mgmt-cluster-wrxtl-csrnt                         True                     24m
13└─Workers
14  └─MachineDeployment/tkg-stc-mgmt-cluster-md-0-vs9dc                True                     10m
15    ├─Machine/tkg-stc-mgmt-cluster-md-0-vs9dc-54554f9575-7hdfc       True                     14m
16    └─Machine/tkg-stc-mgmt-cluster-md-0-vs9dc-54554f9575-ng9lx       True                     7m4s
17
18
19Providers:
20
21  NAMESPACE                          NAME                            TYPE                    PROVIDERNAME     VERSION  WATCHNAMESPACE
22  caip-in-cluster-system             infrastructure-ipam-in-cluster  InfrastructureProvider  ipam-in-cluster  v0.1.0
23  capi-kubeadm-bootstrap-system      bootstrap-kubeadm               BootstrapProvider       kubeadm          v1.2.8
24  capi-kubeadm-control-plane-system  control-plane-kubeadm           ControlPlaneProvider    kubeadm          v1.2.8
25  capi-system                        cluster-api                     CoreProvider            cluster-api      v1.2.8
26  capv-system                        infrastructure-vsphere          InfrastructureProvider  vsphere          v1.5.3

Congrats, head over to the next level 😄

Upgrade workload cluster

This procedure is much simpler, almost as simple as starting a game in MS-DOS 6.2 requiring a bit over 600kb of conventional memory. Make sure you are still in the TKG management cluster context.

As done above, list out the clusters you have and notice the versions they are on now:

1andreasm@tkg-bootstrap:~$ tanzu cluster list --include-management-cluster -A
2  NAME                    NAMESPACE      STATUS   CONTROLPLANE  WORKERS  KUBERNETES         ROLES       PLAN  TKR
3  stc-tkgm-wld-cluster-1  stc-tkgm-ns-1  running  1/1           2/2      v1.24.9+vmware.1   <none>      dev   v1.24.9---vmware.1-tkg.1
4  stc-tkgm-wld-cluster-2  stc-tkgm-ns-2  running  1/1           2/2      v1.24.9+vmware.1   <none>      dev   v1.24.9---vmware.1-tkg.1
5  tkg-stc-mgmt-cluster    tkg-system     running  1/1           2/2      v1.24.10+vmware.1  management  dev   v1.24.10---vmware.1-tkg.2

Check if there are any new releases available from the management cluster:

1andreasm@tkg-bootstrap:~$ tanzu kubernetes-release get
2  NAME                       VERSION                  COMPATIBLE  ACTIVE  UPDATES AVAILABLE
3  v1.22.17---vmware.1-tkg.2  v1.22.17+vmware.1-tkg.2  True        True
4  v1.23.16---vmware.1-tkg.2  v1.23.16+vmware.1-tkg.2  True        True
5  v1.24.10---vmware.1-tkg.2  v1.24.10+vmware.1-tkg.2  True        True

There is one there: v1.24.10, and it's compatible.

Let's check whether there are any updates ready for our workload cluster:

1andreasm@tkg-bootstrap:~$ tanzu cluster available-upgrades get -n stc-tkgm-ns-1 stc-tkgm-wld-cluster-1
2  NAME                       VERSION                  COMPATIBLE
3  v1.24.10---vmware.1-tkg.2  v1.24.10+vmware.1-tkg.2  True

It is...

Let's upgrade it:

1andreasm@tkg-bootstrap:~$ tanzu cluster upgrade -n stc-tkgm-ns-1 stc-tkgm-wld-cluster-1
2Upgrading workload cluster 'stc-tkgm-wld-cluster-1' to kubernetes version 'v1.24.10+vmware.1', tkr version 'v1.24.10+vmware.1-tkg.2'. Are you sure? [y/N]: y
3Upgrading kubernetes cluster to `v1.24.10+vmware.1` version, tkr version: `v1.24.10+vmware.1-tkg.2`
4Waiting for kubernetes version to be updated for control plane nodes...

y for YES

Sit back and wait for the upgrade process to do its thing. You can monitor the output from the current terminal, and watch what is happening in vCenter: clone operations, power on, power off and delete.
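If you want a bit more detail than the vCenter view gives you, you can also follow the machine rollout from the management cluster context, for example (a sketch):

# watch the Cluster API machines being rolled for the workload cluster namespace
kubectl get machines -n stc-tkgm-ns-1 -w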

And the result is in:

1Waiting for kubernetes version to be updated for worker nodes...
2Cluster 'stc-tkgm-wld-cluster-1' successfully upgraded to kubernetes version 'v1.24.10+vmware.1'

We have a winner.

Let's quickly check with the Tanzu CLI:

 1andreasm@tkg-bootstrap:~$ tanzu cluster get stc-tkgm-wld-cluster-1 -n stc-tkgm-ns-1
 2  NAME                    NAMESPACE      STATUS   CONTROLPLANE  WORKERS  KUBERNETES         ROLES   TKR
 3  stc-tkgm-wld-cluster-1  stc-tkgm-ns-1  running  1/1           2/2      v1.24.10+vmware.1  <none>  v1.24.10---vmware.1-tkg.2
 4
 5
 6Details:
 7
 8NAME                                                                   READY  SEVERITY  REASON  SINCE  MESSAGE
 9/stc-tkgm-wld-cluster-1                                                True                     11m
10├─ClusterInfrastructure - VSphereCluster/stc-tkgm-wld-cluster-1-lzjxq  True                     8d
11├─ControlPlane - KubeadmControlPlane/stc-tkgm-wld-cluster-1-22z8x      True                     11m
12│ └─Machine/stc-tkgm-wld-cluster-1-22z8x-mtpgs                         True                     15m
13└─Workers
14  └─MachineDeployment/stc-tkgm-wld-cluster-1-md-0-2qmkw                True                     39m
15    ├─Machine/stc-tkgm-wld-cluster-1-md-0-2qmkw-58c5764865-7xvfn       True                     8m31s
16    └─Machine/stc-tkgm-wld-cluster-1-md-0-2qmkw-58c5764865-c7rqj       True                     3m29s

Couldn't be better. That's it then. It's Friday, so have a great weekend and thanks for reading.