Antrea Multi-cluster - in TKG and vSphere with Tanzu

Overview

Antrea Multi-cluster

From the official Antrea.io documentation:

Antrea Multi-cluster implements Multi-cluster Service API, which allows users to create multi-cluster Services that can be accessed cross clusters in a ClusterSet. Antrea Multi-cluster also supports Antrea ClusterNetworkPolicy replication. Multi-cluster admins can define ClusterNetworkPolicies to be replicated across the entire ClusterSet, and enforced in all member clusters.

An Antrea Multi-cluster ClusterSet includes a leader cluster and multiple member clusters. Antrea Multi-cluster Controller needs to be deployed in the leader and all member clusters. A cluster can serve as the leader, and meanwhile also be a member cluster of the ClusterSet.

The diagram below depicts a basic Antrea Multi-cluster topology with one leader cluster and two member clusters.

antrea-multicluster

In this post I will go through how to configure Antrea Multi-cluster in TKG 2.3 and vSphere with Tanzu. As of the time of writing (end of September/beginning of October 2023), vSphere with Tanzu does not expose all the feature gates needed to configure Antrea Multi-cluster, so that part will be added later. After the initial configuration and installation of Antrea Multi-cluster I will go through the different features of Antrea Multi-cluster, with configuration and examples in their own sections. The first sections, covering how to enable the Antrea Multi-cluster feature gate, are specific to the Kubernetes "distribution" it is enabled on (TKG, vSphere with Tanzu, upstream Kubernetes, etc.). After this initial config the rest is generic and can be re-used on all types of Kubernetes platforms. I will go through everything step by step to show what the different "moving" parts are doing and how they work. At the end there is a bonus chapter where I have created a menu-driven script that automates and simplifies the whole process.

Antrea Feature Gates

Antrea has a set of Feature Gates that can be enabled or disabled on the Antrea Controller and/or the Antrea Agent, depending on the feature. These are configured using the corresponding antrea-config configMap. For a list of available features, head over to the antrea.io documentation page here. Depending on the Kubernetes platform, and on when the settings are applied, these features may be enabled in different ways. This post will specifically cover how to enable the Antrea Multi-cluster Feature Gate in Tanzu Kubernetes Grid and vSphere with Tanzu (not available yet).

Configuring Antrea Multi-cluster in TKG 2.3 with Antrea v1.11.1

Info

The following procedure may not be officially supported at the time of writing this post - I will get back and confirm this

Using Tanzu Kubernetes Grid, the Antrea Feature Gates can be configured during provisioning of the workload clusters or post cluster provision. I will be enabling the Antrea Multi-cluster feature gate during cluster provisioning. If one needs to enable these feature gates post cluster provision, one must edit the antreaconfigs CRD at the Management cluster level for the corresponding TKG cluster. See below.

1 k get antreaconfigs.cni.tanzu.vmware.com -n tkg-ns-1
2NAME            TRAFFICENCAPMODE   DEFAULTMTU   ANTREAPROXY   ANTREAPOLICY   SECRETREF
3tkg-cluster-1   encap                           true          true           tkg-cluster-1-antrea-data-values
4tkg-cluster-2   encap                           true          true           tkg-cluster-2-antrea-data-values
5tkg-cluster-3   encap                           true          true           tkg-cluster-3-antrea-data-values
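For an already-provisioned cluster, a merge patch against the corresponding AntreaConfig resource should work. Below is a hypothetical sketch (resource name and namespace are from this lab environment); the payload mirrors the spec.antrea.config fields of the AntreaConfig CRD:

```shell
# Hypothetical post-provision patch, run against the TKG management cluster.
# The payload mirrors the spec.antrea.config fields of the AntreaConfig CRD.
PATCH='{"spec":{"antrea":{"config":{"featureGates":{"Multicluster":true},"multicluster":{"enable":true,"enablePodToPodConnectivity":true,"enableStretchedNetworkPolicy":true}}}}}'
# Validate the payload locally before applying it:
echo "$PATCH" | python3 -m json.tool > /dev/null && echo "patch is valid JSON"
# Then, against the management cluster context:
#   kubectl patch antreaconfigs.cni.tanzu.vmware.com tkg-cluster-1 \
#     -n tkg-ns-1 --type merge -p "$PATCH"
```

A merge patch only touches the listed fields, so the rest of the AntreaConfig spec stays as-is.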

If I take a look at the yaml values for any of these antreaconfigs:

 1apiVersion: cni.tanzu.vmware.com/v1alpha1
 2kind: AntreaConfig
 3metadata:
 4  annotations:
 5    kubectl.kubernetes.io/last-applied-configuration: |
 6      {"apiVersion":"cni.tanzu.vmware.com/v1alpha1","kind":"AntreaConfig","metadata":{"annotations":{},"name":"tkg-cluster-1","namespace":"tkg-ns-1"},"spec":{"antrea":{"config":{"antreaProxy":{"nodePortAddresses":[],"proxyAll":false,"proxyLoadBalancerIPs":true,"skipServices":[]},"cloudProvider":{"name":""},"disableTXChecksumOffload":false,"disableUdpTunnelOffload":false,"dnsServerOverride":"","egress":{"exceptCIDRs":[],"maxEgressIPsPerNode":255},"enableBridgingMode":null,"enableUsageReporting":false,"featureGates":{"AntreaIPAM":false,"AntreaPolicy":true,"AntreaProxy":true,"AntreaTraceflow":true,"Egress":true,"EndpointSlice":true,"FlowExporter":false,"L7NetworkPolicy":false,"Multicast":false,"Multicluster":true,"NetworkPolicyStats":false,"NodePortLocal":true,"SecondaryNetwork":false,"ServiceExternalIP":false,"SupportBundleCollection":false,"TopologyAwareHints":false,"TrafficControl":false},"flowExporter":{"activeFlowTimeout":"60s","idleFlowTimeout":"15s","pollInterval":"5s"},"kubeAPIServerOverride":null,"multicast":{"igmpQueryInterval":"125s"},"multicastInterfaces":[],"multicluster":{"enable":true,"enablePodToPodConnectivity":true,"enableStretchedNetworkPolicy":true,"namespace":"antrea-multicluster"},"noSNAT":false,"nodePortLocal":{"enabled":true,"portRange":"61000-62000"},"serviceCIDR":"10.132.0.0/16","trafficEncapMode":"encap","trafficEncryptionMode":"none","transportInterface":null,"transportInterfaceCIDRs":[],"tunnelCsum":false,"tunnelPort":0,"tunnelType":"geneve","wireGuard":{"port":51820}}}}}      
 7  creationTimestamp: "2023-09-28T19:49:11Z"
 8  generation: 1
 9  labels:
10    tkg.tanzu.vmware.com/cluster-name: tkg-cluster-1
11    tkg.tanzu.vmware.com/package-name: antrea.tanzu.vmware.com.1.11.1---vmware.4-tkg.1-advanced
12  name: tkg-cluster-1
13  namespace: tkg-ns-1
14  ownerReferences:
15  - apiVersion: cluster.x-k8s.io/v1beta1
16    kind: Cluster
17    name: tkg-cluster-1
18    uid: f635b355-e094-471f-bfeb-63e1c10443cf
19  - apiVersion: run.tanzu.vmware.com/v1alpha3
20    blockOwnerDeletion: true
21    controller: true
22    kind: ClusterBootstrap
23    name: tkg-cluster-1
24    uid: 83b5bdd6-27c3-4c65-a9bc-f665e99c0670
25  resourceVersion: "14988446"
26  uid: 7335854b-49ca-44d9-bded-2d4a09aaf5de
27spec:
28  antrea:
29    config:
30      antreaProxy:
31        nodePortAddresses: []
32        proxyAll: false
33        proxyLoadBalancerIPs: true
34        skipServices: []
35      cloudProvider:
36        name: ""
37      defaultMTU: ""
38      disableTXChecksumOffload: false
39      disableUdpTunnelOffload: false
40      dnsServerOverride: ""
41      egress:
42        exceptCIDRs: []
43        maxEgressIPsPerNode: 255
44      enableBridgingMode: false
45      enableUsageReporting: false
46      featureGates:
47        AntreaIPAM: false
48        AntreaPolicy: true
49        AntreaProxy: true
50        AntreaTraceflow: true
51        Egress: true
52        EndpointSlice: true
53        FlowExporter: false
54        L7NetworkPolicy: false
55        Multicast: false
56        Multicluster: true
57        NetworkPolicyStats: false
58        NodePortLocal: true
59        SecondaryNetwork: false
60        ServiceExternalIP: false
61        SupportBundleCollection: false
62        TopologyAwareHints: false
63        TrafficControl: false
64      flowExporter:
65        activeFlowTimeout: 60s
66        idleFlowTimeout: 15s
67        pollInterval: 5s
68      multicast:
69        igmpQueryInterval: 125s
70      multicastInterfaces: []
71      multicluster:
72        enable: true
73        enablePodToPodConnectivity: true
74        enableStretchedNetworkPolicy: true
75        namespace: antrea-multicluster
76      noSNAT: false
77      nodePortLocal:
78        enabled: true
79        portRange: 61000-62000
80      serviceCIDR: 10.132.0.0/16
81      tlsCipherSuites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384
82      trafficEncapMode: encap
83      trafficEncryptionMode: none
84      transportInterfaceCIDRs: []
85      tunnelCsum: false
86      tunnelPort: 0
87      tunnelType: geneve
88      wireGuard:
89        port: 51820
90status:
91  secretRef: tkg-cluster-1-antrea-data-values

I have all the Antrea Feature Gates available for Antrea to use in the current version of TKG. What I don't have is the antrea-agent.conf and antrea-controller.conf sections. But enabling the Feature Gates here will enable the corresponding setting under the correct section in the native Antrea configMap.

The required settings, or Antrea Feature Gates, that need to be enabled are the following:

 1kind: ConfigMap
 2apiVersion: v1
 3metadata:
 4  name: antrea-config
 5  namespace: kube-system
 6data:
 7  antrea-agent.conf: |
 8    featureGates:
 9      Multicluster: true
10    multicluster:
11      enableGateway: true
12      namespace: ""    

Now I know which Feature Gates must be enabled, but in TKG I don't have antrea-agent.conf or antrea-controller.conf.

Using a class-based yaml for my workload clusters in TKG, the Antrea-specific section looks like this. To enable Antrea Multi-cluster (including two optional features) I need to enable these settings (redacted):

 1apiVersion: cni.tanzu.vmware.com/v1alpha1
 2kind: AntreaConfig
 3metadata:
 4  name: tkg-cluster-1
 5  namespace: tkg-ns-1
 6spec:
 7  antrea:
 8    config:
 9      antreaProxy:
10      cloudProvider:
11      egress:
12      featureGates:
13        Multicluster: true # set to true
14      multicluster:
15        enable: true # set to true
16        enablePodToPodConnectivity: true # set to true
17        enableStretchedNetworkPolicy: true # set to true
18        namespace: "antrea-multicluster" #optional

Below is the full workload cluster manifest I use, beginning with the Antrea-specific settings (see my inline comments again). This will enable the Multicluster feature gate, and the two additional Multi-cluster features PodToPodConnectivity and StretchedNetworkPolicy, upon cluster creation.

  1apiVersion: cni.tanzu.vmware.com/v1alpha1
  2kind: AntreaConfig
  3metadata:
  4  name: tkg-cluster-1
  5  namespace: tkg-ns-1
  6spec:
  7  antrea:
  8    config:
  9      antreaProxy:
 10        nodePortAddresses: []
 11        proxyAll: false
 12        proxyLoadBalancerIPs: true
 13        skipServices: []
 14      cloudProvider:
 15        name: ""
 16      disableTXChecksumOffload: false
 17      disableUdpTunnelOffload: false
 18      dnsServerOverride: ""
 19      egress:
 20        exceptCIDRs: []
 21        maxEgressIPsPerNode: 255
 22      enableBridgingMode: null
 23      enableUsageReporting: false
 24      featureGates:
 25        AntreaIPAM: false
 26        AntreaPolicy: true
 27        AntreaProxy: true
 28        AntreaTraceflow: true
 29        Egress: true
 30        EndpointSlice: true
 31        FlowExporter: false
 32        L7NetworkPolicy: false
 33        Multicast: false
 34        Multicluster: true # set to true
 35        NetworkPolicyStats: false
 36        NodePortLocal: true
 37        SecondaryNetwork: false
 38        ServiceExternalIP: false
 39        SupportBundleCollection: false
 40        TopologyAwareHints: false
 41        TrafficControl: false
 42      flowExporter:
 43        activeFlowTimeout: 60s
 44        idleFlowTimeout: 15s
 45        pollInterval: 5s
 46      kubeAPIServerOverride: null
 47      multicast:
 48        igmpQueryInterval: 125s
 49      multicastInterfaces: []
 50      multicluster:
 51        enable: true # set to true
 52        enablePodToPodConnectivity: true # set to true
 53        enableStretchedNetworkPolicy: true # set to true
 54        namespace: "antrea-multicluster"
 55      noSNAT: false
 56      nodePortLocal:
 57        enabled: true
 58        portRange: 61000-62000
 59      serviceCIDR: 10.132.0.0/16 # if you forget to update this CIDR it will be updated according to the services cidr below
 60      trafficEncapMode: encap
 61      trafficEncryptionMode: none
 62      transportInterface: null
 63      transportInterfaceCIDRs: []
 64      tunnelCsum: false
 65      tunnelPort: 0
 66      tunnelType: geneve
 67      wireGuard:
 68        port: 51820
 69---
 70apiVersion: cpi.tanzu.vmware.com/v1alpha1
 71kind: VSphereCPIConfig
 72metadata:
 73  name: tkg-cluster-1
 74  namespace: tkg-ns-1
 75spec:
 76  vsphereCPI:
 77    ipFamily: ipv4
 78    mode: vsphereCPI
 79    region: k8s-region
 80    tlsCipherSuites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
 81    zone: k8s-zone
 82---
 83apiVersion: csi.tanzu.vmware.com/v1alpha1
 84kind: VSphereCSIConfig
 85metadata:
 86  name: tkg-cluster-1
 87  namespace: tkg-ns-1
 88spec:
 89  vsphereCSI:
 90    config:
 91      datacenter: /cPod-NSXAM-WDC
 92      httpProxy: ""
 93      httpsProxy: ""
 94      insecureFlag: false
 95      noProxy: ""
 96      region: k8s-region
 97      tlsThumbprint: <SHA>
 98      useTopologyCategories: true
 99      zone: k8s-zone
100    mode: vsphereCSI
101---
102apiVersion: run.tanzu.vmware.com/v1alpha3
103kind: ClusterBootstrap
104metadata:
105  annotations:
106    tkg.tanzu.vmware.com/add-missing-fields-from-tkr: v1.26.5---vmware.2-tkg.1
107  name: tkg-cluster-1
108  namespace: tkg-ns-1
109spec:
110  additionalPackages:
111  - refName: metrics-server*
112  - refName: secretgen-controller*
113  - refName: pinniped*
114  cni:
115    refName: antrea*
116    valuesFrom:
117      providerRef:
118        apiGroup: cni.tanzu.vmware.com
119        kind: AntreaConfig
120        name: tkg-cluster-1
121  cpi:
122    refName: vsphere-cpi*
123    valuesFrom:
124      providerRef:
125        apiGroup: cpi.tanzu.vmware.com
126        kind: VSphereCPIConfig
127        name: tkg-cluster-1
128  csi:
129    refName: vsphere-csi*
130    valuesFrom:
131      providerRef:
132        apiGroup: csi.tanzu.vmware.com
133        kind: VSphereCSIConfig
134        name: tkg-cluster-1
135  kapp:
136    refName: kapp-controller*
137---
138apiVersion: v1
139kind: Secret
140metadata:
141  name: tkg-cluster-1
142  namespace: tkg-ns-1
143stringData:
144  password: password
145  username: andreasm@cpod-nsxam-wdc.domain.net
146---
147apiVersion: cluster.x-k8s.io/v1beta1
148kind: Cluster
149metadata:
150  annotations:
151    osInfo: ubuntu,20.04,amd64
152    tkg/plan: dev
153  labels:
154    tkg.tanzu.vmware.com/cluster-name: tkg-cluster-1
155  name: tkg-cluster-1
156  namespace: tkg-ns-1
157spec:
158  clusterNetwork:
159    pods:
160      cidrBlocks:
161      - 10.131.0.0/16
162    services:
163      cidrBlocks:
164      - 10.132.0.0/16
165  topology:
166    class: tkg-vsphere-default-v1.1.0
167    controlPlane:
168      metadata:
169        annotations:
170          run.tanzu.vmware.com/resolve-os-image: image-type=ova,os-name=ubuntu
171      replicas: 1
172    variables:
173    - name: cni
174      value: antrea
175    - name: controlPlaneCertificateRotation
176      value:
177        activate: true
178        daysBefore: 90
179    - name: auditLogging
180      value:
181        enabled: false
182    - name: podSecurityStandard
183      value:
184        audit: restricted
185        deactivated: false
186        warn: restricted
187    - name: apiServerEndpoint
188      value: ""
189    - name: aviAPIServerHAProvider
190      value: true
191    - name: vcenter
192      value:
193        cloneMode: fullClone
194        datacenter: /cPod-NSXAM-WDC
195        datastore: /cPod-NSXAM-WDC/datastore/vsanDatastore-wdc-01
196        folder: /cPod-NSXAM-WDC/vm/TKGm
197        network: /cPod-NSXAM-WDC/network/ls-tkg-mgmt
198        resourcePool: /cPod-NSXAM-WDC/host/Cluster-1/Resources
199        server: vcsa.cpod-nsxam-wdc.az-wdc.domain.net
200        storagePolicyID: ""
201        tlsThumbprint: <SHA>
202    - name: user
203      value:
204        sshAuthorizedKeys:
205        - ssh-rsa 2UEBx235bVRSxQ==
206    - name: controlPlane
207      value:
208        machine:
209          diskGiB: 20
210          memoryMiB: 4096
211          numCPUs: 2
212    - name: worker
213      value:
214        machine:
215          diskGiB: 20
216          memoryMiB: 4096
217          numCPUs: 2
218    - name: controlPlaneZoneMatchingLabels
219      value:
220        region: k8s-region
221        tkg-cp: allowed
222    - name: security
223      value:
224        fileIntegrityMonitoring:
225          enabled: false
226        imagePolicy:
227          pullAlways: false
228          webhook:
229            enabled: false
230            spec:
231              allowTTL: 50
232              defaultAllow: true
233              denyTTL: 60
234              retryBackoff: 500
235        kubeletOptions:
236          eventQPS: 50
237          streamConnectionIdleTimeout: 4h0m0s
238        systemCryptoPolicy: default
239    version: v1.26.5+vmware.2-tkg.1
240    workers:
241      machineDeployments:
242      - class: tkg-worker
243        failureDomain: wdc-zone-2
244        metadata:
245          annotations:
246            run.tanzu.vmware.com/resolve-os-image: image-type=ova,os-name=ubuntu
247        name: md-0
248        replicas: 1
249        strategy:
250          type: RollingUpdate
251      - class: tkg-worker
252        failureDomain: wdc-zone-3
253        metadata:
254          annotations:
255            run.tanzu.vmware.com/resolve-os-image: image-type=ova,os-name=ubuntu
256        name: md-1
257        replicas: 1
258        strategy:
259          type: RollingUpdate

I am also setting these three additional options:

1        enablePodToPodConnectivity: true # set to true
2        enableStretchedNetworkPolicy: true # set to true
3        namespace: "antrea-multicluster" # the namespace will be created later, and I am not sure it's even required to define anything here. Leave it blank ""

This is because I will test out Multi-cluster NetworkPolicies and routing pod traffic through Multi-cluster Gateways - more on that later. In the workload cluster manifest, make sure the Service and Pod CIDRs do not overlap between the clusters that will be joined to the same Multi-cluster ClusterSet.

The namespace section is not mandatory.

Info

The Service CIDRs cannot overlap between the member clusters

The Pod CIDRs cannot overlap if PodToPodConnectivity is enabled
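For example, a second cluster joining the same ClusterSet could be provisioned with shifted ranges. The values below are illustrative (tkg-cluster-1 above already uses 10.131.0.0/16 and 10.132.0.0/16):

```yaml
# Excerpt of a second workload cluster manifest with
# non-overlapping Pod and Service CIDRs (illustrative values)
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 10.133.0.0/16
    services:
      cidrBlocks:
      - 10.134.0.0/16
```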

After my TKG workload cluster has been provisioned with the above yaml, it will give me the following antrea-config ConfigMap in my workload cluster:

  1andreasm@tkg-bootstrap:~$ k get configmaps -n kube-system antrea-config -oyaml
  2apiVersion: v1
  3data:
  4  antrea-agent.conf: |
  5    featureGates:
  6      AntreaProxy: true
  7      EndpointSlice: true
  8      TopologyAwareHints: false
  9      Traceflow: true
 10      NodePortLocal: true
 11      AntreaPolicy: true
 12      FlowExporter: false
 13      NetworkPolicyStats: false
 14      Egress: true
 15      AntreaIPAM: false
 16      Multicast: false
 17      Multicluster: true #enabled
 18      SecondaryNetwork: false
 19      ServiceExternalIP: false
 20      TrafficControl: false
 21      SupportBundleCollection: false
 22      L7NetworkPolicy: false
 23    trafficEncapMode: encap
 24    noSNAT: false
 25    tunnelType: geneve
 26    tunnelPort: 0
 27    tunnelCsum: false
 28    trafficEncryptionMode: none
 29    enableBridgingMode: false
 30    disableTXChecksumOffload: false
 31    wireGuard:
 32      port: 51820
 33    egress:
 34      exceptCIDRs: []
 35      maxEgressIPsPerNode: 255
 36    serviceCIDR: 10.132.0.0/16
 37    nodePortLocal:
 38      enable: true
 39      portRange: 61000-62000
 40    tlsCipherSuites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384
 41    multicast: {}
 42    antreaProxy:
 43      proxyAll: false
 44      nodePortAddresses: []
 45      skipServices: []
 46      proxyLoadBalancerIPs: true
 47    multicluster:
 48      enableGateway: true #enabled
 49      namespace: antrea-multicluster
 50      enableStretchedNetworkPolicy: true #enabled
 51      enablePodToPodConnectivity: true #enabled    
 52  antrea-cni.conflist: |
 53    {
 54        "cniVersion":"0.3.0",
 55        "name": "antrea",
 56        "plugins": [
 57            {
 58                "type": "antrea",
 59                "ipam": {
 60                    "type": "host-local"
 61                }
 62            }
 63            ,
 64            {
 65                "type": "portmap",
 66                "capabilities": {"portMappings": true}
 67            }
 68            ,
 69            {
 70                "type": "bandwidth",
 71                "capabilities": {"bandwidth": true}
 72            }
 73        ]
 74    }    
 75  antrea-controller.conf: |
 76    featureGates:
 77      Traceflow: true
 78      AntreaPolicy: true
 79      NetworkPolicyStats: false
 80      Multicast: false
 81      Egress: true
 82      AntreaIPAM: false
 83      ServiceExternalIP: false
 84      SupportBundleCollection: false
 85      L7NetworkPolicy: false
 86      Multicluster: true
 87    tlsCipherSuites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384
 88    nodeIPAM: null
 89    multicluster:
 90      enableStretchedNetworkPolicy: true
 91    cloudProvider:
 92      name: ""    
 93kind: ConfigMap
 94metadata:
 95  annotations:
 96    kapp.k14s.io/identity: v1;kube-system//ConfigMap/antrea-config;v1
 97    kapp.k14s.io/original: '{"apiVersion":"v1","data":{"antrea-agent.conf":"featureGates:\n  AntreaProxy:
 98      true\n  EndpointSlice: true\n  TopologyAwareHints: false\n  Traceflow: true\n  NodePortLocal:
 99      true\n  AntreaPolicy: true\n  FlowExporter: false\n  NetworkPolicyStats: false\n  Egress:
100      true\n  AntreaIPAM: false\n  Multicast: false\n  Multicluster: true\n  SecondaryNetwork:
101      false\n  ServiceExternalIP: false\n  TrafficControl: false\n  SupportBundleCollection:
102      false\n  L7NetworkPolicy: false\ntrafficEncapMode: encap\nnoSNAT: false\ntunnelType:
103      geneve\ntunnelPort: 0\ntunnelCsum: false\ntrafficEncryptionMode: none\nenableBridgingMode:
104      false\ndisableTXChecksumOffload: false\nwireGuard:\n  port: 51820\negress:\n  exceptCIDRs:
105      []\n  maxEgressIPsPerNode: 255\nserviceCIDR: 100.20.0.0/16\nnodePortLocal:\n  enable:
106      true\n  portRange: 61000-62000\ntlsCipherSuites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384\nmulticast:
107      {}\nantreaProxy:\n  proxyAll: false\n  nodePortAddresses: []\n  skipServices:
108      []\n  proxyLoadBalancerIPs: true\nmulticluster:\n  enableGateway: true\n  enableStretchedNetworkPolicy:
109      true\n  enablePodToPodConnectivity: true\n","antrea-cni.conflist":"{\n    \"cniVersion\":\"0.3.0\",\n    \"name\":
110      \"antrea\",\n    \"plugins\": [\n        {\n            \"type\": \"antrea\",\n            \"ipam\":
111      {\n                \"type\": \"host-local\"\n            }\n        }\n        ,\n        {\n            \"type\":
112      \"portmap\",\n            \"capabilities\": {\"portMappings\": true}\n        }\n        ,\n        {\n            \"type\":
113      \"bandwidth\",\n            \"capabilities\": {\"bandwidth\": true}\n        }\n    ]\n}\n","antrea-controller.conf":"featureGates:\n  Traceflow:
114      true\n  AntreaPolicy: true\n  NetworkPolicyStats: false\n  Multicast: false\n  Egress:
115      true\n  AntreaIPAM: false\n  ServiceExternalIP: false\n  SupportBundleCollection:
116      false\n  L7NetworkPolicy: false\n  Multicluster: true\ntlsCipherSuites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384\nnodeIPAM:
117      null\nmulticluster:\n  enableStretchedNetworkPolicy: true\ncloudProvider:\n  name:
118      \"\"\n"},"kind":"ConfigMap","metadata":{"labels":{"app":"antrea","kapp.k14s.io/app":"1695711779258641591","kapp.k14s.io/association":"v1.c39c4aca919097e50452c3432329dd40"},"name":"antrea-config","namespace":"kube-system"}}'
119    kapp.k14s.io/original-diff-md5: c6e94dc94aed3401b5d0f26ed6c0bff3
120  creationTimestamp: "2023-09-26T07:03:04Z"
121  labels:
122    app: antrea
123    kapp.k14s.io/app: "1695711779258641591"
124    kapp.k14s.io/association: v1.c39c4aca919097e50452c3432329dd40
125  name: antrea-config
126  namespace: kube-system
127  resourceVersion: "902"
128  uid: 49aea43b-4f8e-4f60-9d54-b299a73bdba8

Notice that the features have been enabled under the corresponding antrea-agent.conf and antrea-controller.conf sections.

 1apiVersion: v1
 2data:
 3  antrea-agent.conf: |
 4    featureGates:
 5      Multicluster: true #enabled
 6    multicluster:
 7      enableGateway: true #enabled
 8      namespace: antrea-multicluster
 9      enableStretchedNetworkPolicy: true #enabled
10      enablePodToPodConnectivity: true #enabled    
11  antrea-controller.conf: |
12    featureGates:
13      Multicluster: true #enabled
14    multicluster:
15      enableStretchedNetworkPolicy: true #enabled    

That's it for enabling Antrea Multi-cluster in TKG workload clusters.

vSphere with Tanzu and Antrea Multi-cluster

Coming later - stay tuned

Install and Configure Antrea Multi-cluster

All the instructions for this exercise have been taken from the official antrea.io and Antrea GitHub pages. My TKG Cluster-1 (Leader cluster) is up and running with the required Antrea Multi-cluster settings (see above) enabled. Now I will follow the user guide, which involves installing the Antrea Multi-cluster controller in the leader cluster, creating the ClusterSet, and configuring the Multi-cluster Gateway. I will start with all the steps that can be done on the Leader cluster, then take the member clusters next. I will be following the yaml approach. There is also another approach using antctl; I may update the post at a later stage to include those steps.

Info

It is important to follow the documentation matching the version of Antrea being used, due to updates in the API and general configuration settings. For example, if I am on Antrea v1.11.1 I would use the following GitHub or antrea.io URLs: https://github.com/antrea-io/antrea/blob/release-1.11/docs/multicluster/user-guide.md https://antrea.io/docs/v1.11.1/docs/multicluster/user-guide/

If using Antrea v1.13.1 I would use the following URLs:

https://github.com/antrea-io/antrea/blob/release-1.13/docs/multicluster/user-guide.md https://antrea.io/docs/v1.13.1/docs/multicluster/user-guide/
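The mapping from version to docs branch is mechanical: the major.minor part of the version selects the release branch. A small string-manipulation sketch illustrates it:

```shell
# Derive the docs branch from a full Antrea version string,
# e.g. v1.11.1 -> release-1.11
VERSION=v1.11.1
BRANCH="release-${VERSION#v}"   # strip the leading v -> release-1.11.1
BRANCH="${BRANCH%.*}"           # drop the patch number -> release-1.11
echo "https://github.com/antrea-io/antrea/blob/$BRANCH/docs/multicluster/user-guide.md"
```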

Info

Antrea Multi-cluster can be configured in different topologies

Read more here: https://github.com/antrea-io/antrea/blob/main/docs/multicluster/user-guide.md#deploy-antrea-multi-cluster-controller

Install the Antrea Multi-cluster controller in the dedicated leader cluster

I will start by creating an environment variable, exporting the Antrea version I am using, for the following yaml manifest commands to use. Just to be certain, I will first check the actual Antrea version by issuing the antctl version command inside the Antrea controller pod:

1k exec -it -n kube-system antrea-controller-75b85cf45b-hvhq4 bash
2kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
3root@tkg-cluster-1-btnn5-vzq5n:/# antctl version
4antctlVersion: v1.11.1-4776f66
5controllerVersion: v1.11.1-4776f66
6root@tkg-cluster-1-btnn5-vzq5n:/#
1andreasm@tkg-bootstrap:~$ export TAG=v1.11.1

Install the Multi-cluster controller in the namespace antrea-multicluster by applying the following two yamls:

1kubectl apply -f https://github.com/antrea-io/antrea/releases/download/$TAG/antrea-multicluster-leader-global.yml
2kubectl apply -f https://github.com/antrea-io/antrea/releases/download/$TAG/antrea-multicluster-leader-namespaced.yml

Creating the namespace antrea-multicluster:

1andreasm@tkg-bootstrap:~$ kubectl create ns antrea-multicluster
2namespace/antrea-multicluster created

Applying the antrea-multicluster-leader-global.yaml:

1## Apply the antrea-multicluster-leader-global.yaml
2andreasm@tkg-bootstrap:~$ kubectl apply -f https://github.com/antrea-io/antrea/releases/download/$TAG/antrea-multicluster-leader-global.yml
3## applied
4customresourcedefinition.apiextensions.k8s.io/clusterclaims.multicluster.crd.antrea.io created
5customresourcedefinition.apiextensions.k8s.io/clustersets.multicluster.crd.antrea.io created
6customresourcedefinition.apiextensions.k8s.io/memberclusterannounces.multicluster.crd.antrea.io created
7customresourcedefinition.apiextensions.k8s.io/resourceexports.multicluster.crd.antrea.io created
8customresourcedefinition.apiextensions.k8s.io/resourceimports.multicluster.crd.antrea.io created

Applying the antrea-multicluster-leader-namespaced.yaml:

 1## Apply the antrea-multicluster-leader-namespaced.yaml
 2andreasm@tkg-bootstrap:~$ kubectl apply -f https://github.com/antrea-io/antrea/releases/download/$TAG/antrea-multicluster-leader-namespaced.yml
 3## applied
 4
 5serviceaccount/antrea-mc-controller created
 6serviceaccount/antrea-mc-member-access-sa created
 7role.rbac.authorization.k8s.io/antrea-mc-controller-role created
 8role.rbac.authorization.k8s.io/antrea-mc-member-cluster-role created
 9clusterrole.rbac.authorization.k8s.io/antrea-multicluster-antrea-mc-controller-webhook-role created
10rolebinding.rbac.authorization.k8s.io/antrea-mc-controller-rolebinding created
11rolebinding.rbac.authorization.k8s.io/antrea-mc-member-cluster-rolebinding created
12clusterrolebinding.rbac.authorization.k8s.io/antrea-multicluster-antrea-mc-controller-webhook-rolebinding created
13configmap/antrea-mc-controller-config created
14service/antrea-mc-webhook-service created
15Warning: would violate PodSecurity "restricted:v1.24": host namespaces (hostNetwork=true), hostPort (container "antrea-mc-controller" uses hostPort 9443), unrestricted capabilities (container "antrea-mc-controller" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "antrea-mc-controller" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "antrea-mc-controller" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
16deployment.apps/antrea-mc-controller created
17mutatingwebhookconfiguration.admissionregistration.k8s.io/antrea-multicluster-antrea-mc-mutating-webhook-configuration created
18validatingwebhookconfiguration.admissionregistration.k8s.io/antrea-multicluster-antrea-mc-validating-webhook-configuration created
1andreasm@tkg-bootstrap:~$ k get pods -n antrea-multicluster
2NAME                                    READY   STATUS    RESTARTS   AGE
3antrea-mc-controller-7697c8b776-dqtzp   1/1     Running   0          58s

And the Antrea CRDs, now including the additional CRDs from the Multi-cluster installation:

 1andreasm@tkg-bootstrap:~$ k get crd -A
 2NAME                                                     CREATED AT
 3antreaagentinfos.crd.antrea.io                           2023-09-26T07:03:02Z
 4antreacontrollerinfos.crd.antrea.io                      2023-09-26T07:03:02Z
 5clusterclaims.multicluster.crd.antrea.io                 2023-09-27T06:59:07Z
 6clustergroups.crd.antrea.io                              2023-09-26T07:03:04Z
 7clusternetworkpolicies.crd.antrea.io                     2023-09-26T07:03:02Z
 8clustersets.multicluster.crd.antrea.io                   2023-09-27T06:59:07Z
 9egresses.crd.antrea.io                                   2023-09-26T07:03:02Z
10externalentities.crd.antrea.io                           2023-09-26T07:03:02Z
11externalippools.crd.antrea.io                            2023-09-26T07:03:02Z
12externalnodes.crd.antrea.io                              2023-09-26T07:03:02Z
13groups.crd.antrea.io                                     2023-09-26T07:03:04Z
14ippools.crd.antrea.io                                    2023-09-26T07:03:02Z
15memberclusterannounces.multicluster.crd.antrea.io        2023-09-27T06:59:08Z
16multiclusteringresses.ako.vmware.com                     2023-09-26T07:03:51Z
17networkpolicies.crd.antrea.io                            2023-09-26T07:03:02Z
18resourceexports.multicluster.crd.antrea.io               2023-09-27T06:59:12Z
19resourceimports.multicluster.crd.antrea.io               2023-09-27T06:59:15Z
20supportbundlecollections.crd.antrea.io                   2023-09-26T07:03:03Z
21tierentitlementbindings.crd.antrea.tanzu.vmware.com      2023-09-26T07:03:04Z
22tierentitlements.crd.antrea.tanzu.vmware.com             2023-09-26T07:03:02Z
23tiers.crd.antrea.io                                      2023-09-26T07:03:03Z
24traceflows.crd.antrea.io                                 2023-09-26T07:03:03Z
25trafficcontrols.crd.antrea.io                            2023-09-26T07:03:03Z

Install the Antrea Multi-cluster controller in the member clusters

Next up is to deploy the Antrea Multi-cluster controller in both of my member clusters, tkg-cluster-2 and tkg-cluster-3. They have also been configured with the Multicluster feature gate enabled. This operation involves applying the antrea-multicluster-member.yml.

andreasm@tkg-bootstrap:~$ k config current-context
tkg-cluster-2-admin@tkg-cluster-2
# Show running pods
NAMESPACE              NAME                                                     READY   STATUS      RESTARTS      AGE
avi-system             ako-0                                                    1/1     Running     0             19h
kube-system            antrea-agent-grq68                                       2/2     Running     0             18h
kube-system            antrea-agent-kcbc8                                       2/2     Running     0             18h
kube-system            antrea-agent-zvgcc                                       2/2     Running     0             18h
kube-system            antrea-controller-bc584bbcd-7hw7n                        1/1     Running     0             18h
kube-system            coredns-75f565d4dd-48sgb                                 1/1     Running     0             19h
kube-system            coredns-75f565d4dd-wr8pl                                 1/1     Running     0             19h
kube-system            etcd-tkg-cluster-2-jzmpk-w2thj                           1/1     Running     0             19h
kube-system            kube-apiserver-tkg-cluster-2-jzmpk-w2thj                 1/1     Running     0             19h
kube-system            kube-controller-manager-tkg-cluster-2-jzmpk-w2thj        1/1     Running     0             19h
kube-system            kube-proxy-dmcb5                                         1/1     Running     0             19h
kube-system            kube-proxy-kr5hl                                         1/1     Running     0             19h
kube-system            kube-proxy-t7qqp                                         1/1     Running     0             19h
kube-system            kube-scheduler-tkg-cluster-2-jzmpk-w2thj                 1/1     Running     0             19h
kube-system            metrics-server-5666ffccb9-p7qqw                          1/1     Running     0             19h
kube-system            vsphere-cloud-controller-manager-6b8v2                   1/1     Running     0             19h
secretgen-controller   secretgen-controller-69cbc65949-2x8nv                    1/1     Running     0             19h
tkg-system             kapp-controller-5776b48998-7rkzf                         2/2     Running     0             19h
tkg-system             tanzu-capabilities-controller-manager-77d8ffcd57-9tgzn   1/1     Running     0             19h
vmware-system-antrea   register-placeholder-m5bj8                               0/1     Completed   0             8m41s
vmware-system-csi      vsphere-csi-controller-67799db966-qnfv2                  7/7     Running     1 (19h ago)   19h
vmware-system-csi      vsphere-csi-node-dkcf7                                   3/3     Running     2 (19h ago)   19h
vmware-system-csi      vsphere-csi-node-w8dqw                                   3/3     Running     3 (31m ago)   19h
vmware-system-csi      vsphere-csi-node-zxpmz                                   3/3     Running     4 (19h ago)   19h

Apply the antrea-multicluster-member.yml:

andreasm@tkg-bootstrap:~$ kubectl apply -f https://github.com/antrea-io/antrea/releases/download/$TAG/antrea-multicluster-member.yml
## applied
customresourcedefinition.apiextensions.k8s.io/clusterclaims.multicluster.crd.antrea.io created
customresourcedefinition.apiextensions.k8s.io/clusterinfoimports.multicluster.crd.antrea.io created
customresourcedefinition.apiextensions.k8s.io/clustersets.multicluster.crd.antrea.io created
customresourcedefinition.apiextensions.k8s.io/gateways.multicluster.crd.antrea.io created
customresourcedefinition.apiextensions.k8s.io/labelidentities.multicluster.crd.antrea.io created
customresourcedefinition.apiextensions.k8s.io/serviceexports.multicluster.x-k8s.io created
customresourcedefinition.apiextensions.k8s.io/serviceimports.multicluster.x-k8s.io created
serviceaccount/antrea-mc-controller created
clusterrole.rbac.authorization.k8s.io/antrea-mc-controller-role created
clusterrolebinding.rbac.authorization.k8s.io/antrea-mc-controller-rolebinding created
configmap/antrea-mc-controller-config created
service/antrea-mc-webhook-service created
deployment.apps/antrea-mc-controller created
mutatingwebhookconfiguration.admissionregistration.k8s.io/antrea-mc-mutating-webhook-configuration created
validatingwebhookconfiguration.admissionregistration.k8s.io/antrea-mc-validating-webhook-configuration created
NAMESPACE              NAME                                                     READY   STATUS      RESTARTS      AGE
kube-system            antrea-mc-controller-5bb945f87f-xk287                    1/1     Running     0             64s

The controller is up and running, this time in the kube-system namespace (on the leader it runs in the antrea-multicluster namespace).

Now I will repeat the same operation for my tkg-cluster-3...

Create ClusterSet

Creating a ClusterSet involves a couple of steps on both the leader cluster and the member clusters.

On the Leader Cluster - Create ServiceAccounts

First I will configure a ServiceAccount for each member to access the Leader cluster's API.

In my Leader cluster I will apply the following yaml:

Member 1, tkg-cluster-2, will be called member-blue

## This is the yaml for creating the ServiceAccount for member-blue
apiVersion: v1
kind: ServiceAccount
metadata:
  name: member-blue
  namespace: antrea-multicluster
---
apiVersion: v1
kind: Secret
metadata:
  name: member-blue-token
  namespace: antrea-multicluster
  annotations:
    kubernetes.io/service-account.name: member-blue
type: kubernetes.io/service-account-token
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: member-blue
  namespace: antrea-multicluster
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: antrea-mc-member-cluster-role
subjects:
  - kind: ServiceAccount
    name: member-blue
    namespace: antrea-multicluster

Apply it:

andreasm@tkg-bootstrap:~$ k apply -f sa-member-blue.yaml
## applied
serviceaccount/member-blue created
secret/member-blue-token created
rolebinding.rbac.authorization.k8s.io/member-blue created

Now create a token yaml file for member blue:

kubectl get secret member-blue-token -n antrea-multicluster -o yaml | grep -w -e '^apiVersion' -e '^data' -e '^metadata' -e '^ *name:'  -e   '^kind' -e '  ca.crt' -e '  token:' -e '^type' -e '  namespace' | sed -e 's/kubernetes.io\/service-account-token/Opaque/g' -e 's/antrea-multicluster/kube-system/g' >  member-blue-token.yml
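What this pipeline does: the grep keeps only the fields a member cluster needs from the Secret, while the sed rewrites the plaintext parts, the Secret type becomes Opaque and metadata.namespace becomes kube-system, so the token can be applied in the member cluster. The base64-encoded data fields are not touched by the sed. A small offline sketch of the same transformation, using a minimal sample Secret with dummy data (not a real token):

```shell
# Run the same sed substitutions on a minimal sample Secret to show
# what changes (plaintext metadata) and what does not (encoded data).
cat > /tmp/sample-secret.yml <<'EOF'
apiVersion: v1
data:
  namespace: YW50cmVhLW11bHRpY2x1c3Rlcg==
kind: Secret
metadata:
  name: member-blue-token
  namespace: antrea-multicluster
type: kubernetes.io/service-account-token
EOF

# The plaintext metadata.namespace and the Secret type are rewritten:
sed -e 's/kubernetes.io\/service-account-token/Opaque/g' \
    -e 's/antrea-multicluster/kube-system/g' /tmp/sample-secret.yml

# The base64-encoded data field is left alone and still decodes to the
# original namespace:
echo 'YW50cmVhLW11bHRpY2x1c3Rlcg==' | base64 -d    # antrea-multicluster
```

This is why the resulting file can safely land in kube-system on the member cluster while the token itself still references the leader's antrea-multicluster namespace.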

This should create the file member-blue-token.yml with the following content:

# cat member-blue-token.yml
## output
apiVersion: v1
data:
  ca.crt: LS0tLS1CRUdJ0tCk1JSUM2akNDQWRLZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJek1Ea3lOakEyTlRReU5Wb1hEVE16TURreU16QTJOVGt5TlZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTDh2Ck1iWHc0MFM2NERZc2dnMG9qSEVwUHlOOC93MVBkdFE2cGxSSThvbUVuWW1ramc5TThIN3NrTDhqdHR1WXRxQVkKYnorZEFsVmJBcWRrWCtkL3Q5MTZGOWRBYmRveW9Qc3pwREIraVVQdDZ5Nm5YbDhPK0xEV2JWZzdEWTVXQ3lYYQpIeEJBM1I4NUUxRkhvYUxBREZ6OFRsZ2lKR3RZYktROFJYTWlIMk1xczRaNU9Mblp3Qy9rSTNVNEEzMVFlcXl3Cm1VMjd2SDdzZjlwK0tiTE5wZldtMDJoV3hZNzlMS1hCNE1LOStLaXAyVkt4VUlQbHl6bGpXTjQrcngyclN4bVEKbnpmR3NpT0JQTWpOanpQOE44cWJub01hL2Jd1haOHJpazhnenJGN05sQQp6R1BvdC84S2Q5UXZ2Q2doVVlNQ0F3RUFBYU5GTUVNd0RnWURWUjBQQVFIL0JBUURBZ0trTUJJR0ExVWRFd0VCCi93UUlNQVlCQWY4Q0FRQXdIUVlEVlIwT0JCWUVGQk9PWHcrb1o1VGhLQ3I5RnBMWm9ySkZZWW1wTUEwR0NTcUcKU0liM0RRRUJDd1VBQTRJQkFRQ2Rpa2lWYi9pblk1RVNIdVVIcTY2YnBLK1RTQXI5eTFERnN0Qjd2eUt3UGduVAp2bGdEZnZnN1o3UTVOaFhzcFBnV1Y4NEZMMU80UUQ1WmtTYzhLcDVlM1V1ZFFvckRQS3VEOWkzNHVXVVc0TVo5ClR2UUJLNS9sRUlsclBONG5XYmEyazYrOE9tZitmWWREd3JsZTVaa3JUOHh6UnhEbEtXdE5vNVRHMWgrMElUOVgKcVMwL1hzNFpISlU2NGd5dlRsQXlwR2pPdFdxMUc0MEZ5U3dydFJLSE52a3JjTStkeDdvTEM1d003ZTZSTHg1cApnb0Y5dGZZV3ZyUzJWWjl2RUR5QllPN1RheFhEMGlaV2V1VEh0ZFJxTWVLZVAvei9lYnZJMUkvRkWS9NNGdxYjdXbGVqM0JNS051TVc2Q1AwUmFYcXJnUmpnVTAKLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQu=
  namespace: YW50cmVhLW11bHRpY2x1c3Rlcg==
  token: WXlKaGJHY2lPaUpTVXpJMCUVRGaWNIUlJTR0Z1VnpkTU1GVjVOalF3VWpKcVNFVXdYMFkzZDFsNmJqWXpUa3hzVUc4aWZRLmV5SnBjM01pT2lKcmRXSmxjbTVsZEdWekwzTmxjblpwWTJWaFkyTnZkVzUwSWl3aWEzVmlaWEp1WlhSbGN5NXBieTl6WlhKMmFXTmxZV05qYjNWdWRDOXVZVzFsYzNCaFkyVWlPaUpoYm5SeVpXRXRiWFZzZEdsamJIVnpkR1Z5SWl3aWEzVmlaWEp1WlhSbGN5NXBieTl6WlhKMmFXTmxZV05qYjNWdWRDOXpaV055WlhRdWJtRnRaU0k2SW0xbGJXSmxjaTFpYkhWbExYUnZhMlZ1SWl3aWEzVmlaWEp1WlhSbGN5NXBieTl6WlhKMmFXTmxZV05qY5RdWJtRnRaU0k2SW0xbGJXSmxjaTFpYkhWbElpd2lhM1ZpWlhKdVpYUmxjeTVwYnk5elpYSjJhV05sWVdOamIzVnVkQzl6WlhKMmFXTmxMV0ZqWTI5MWJuUXVkV2xrSWpvaU1XSTVaVGd4TUdVdE5tTTNaQzAwTjJZekxXRTVaR1F0TldWbU4yTXpPVEE0T0dNNElpd2ljM1ZpSWpvaWMzbHpkR1Z0T25ObGNuWnBZMlZoWTJOdmRXNTBPbUZ1ZEhKbFlTMXRkV3gwYVdOc2RYTjBaWEk2YldWdFltVnlMV0pzZFdVaWZRLmNyU29sN0JzS2JyR2lhUVYyWmJ2OG8tUVl1eVBfWGlnZ1hfcDg2UGs0ZEpVOTBwWk1XYUZTUkFUR1ptQnBCUUJnajI5eVJHbkdvajVHRWN3YVNxWlZaV3FySk9jVTM1QXFlWHhpWm1fUl9LWDB4VUR1Y0wxQTVxNdWhOYTBGUkxmU2FKWUNOME1NTHNKVTBDU3pHUVg1dHlzTXBUN0YwVG0weS1mZFpVOE9IQmJoY0ZDWXkyYk5WdC0weU9pQUlYOHR2TVNrb2NzaHpWUm5ha1A5dmtMaXNVUGh2Vm9xMVROZ2RVRmtjc0lPRjl4ZFo5Ul9PX3NlT1ZLay1hNkhJbjB3THQzZ3FEZHRHU09Ub3BfMUh0djgxeEZQdF9zNlVRNmpldjZpejh3aFAzX1BkSGhwTlNCWFBEc3hZbEhyMlVaUK==
kind: Secret
metadata:
  name: member-blue-token
  namespace: kube-system
type: Opaque

Repeating the steps for my member cluster 2 (tkg-cluster-3)

Member 2, tkg-cluster-3, will be called member-red (yes you guessed it right)

## This is the yaml for creating the ServiceAccount for member-red
apiVersion: v1
kind: ServiceAccount
metadata:
  name: member-red
  namespace: antrea-multicluster
---
apiVersion: v1
kind: Secret
metadata:
  name: member-red-token
  namespace: antrea-multicluster
  annotations:
    kubernetes.io/service-account.name: member-red
type: kubernetes.io/service-account-token
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: member-red
  namespace: antrea-multicluster
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: antrea-mc-member-cluster-role
subjects:
  - kind: ServiceAccount
    name: member-red
    namespace: antrea-multicluster

Apply it:

andreasm@tkg-bootstrap:~$ k apply -f sa-member-red.yaml
## applied
serviceaccount/member-red created
secret/member-red-token created
rolebinding.rbac.authorization.k8s.io/member-red created

Now create a token yaml file for member red:

kubectl get secret member-red-token -n antrea-multicluster -o yaml | grep -w -e '^apiVersion' -e '^data' -e '^metadata' -e '^ *name:'  -e   '^kind' -e '  ca.crt' -e '  token:' -e '^type' -e '  namespace' | sed -e 's/kubernetes.io\/service-account-token/Opaque/g' -e 's/antrea-multicluster/kube-system/g' >  member-red-token.yml

This should create the file member-red-token.yml with the following content:

# cat member-red-token.yml
## output
apiVersion: v1
data:
  ca.crt: LS0tLS1CRUdJ0tCk1JSUM2akNDQWRLZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJek1Ea3lOakEyTlRReU5Wb1hEVE16TURreU16QTJOVGt5TlZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTDh2Ck1iWHc0MFM2NERZc2dnMG9qSEVwUHlOOC93MVBkdFE2cGxSSThvbUVuWW1ramc5TThIN3NrTDhqdHR1WXRxQVkKYnorZEFsVmJBcWRrWCtkL3Q5MTZGOWRBYmRveW9Qc3pwREIraVVQdDZ5Nm5YbDhPK0xEV2JWZzdEWTVXQ3lYYQpIeEJBM1I4NUUxRkhvYUxBREZ6OFRsZ2lKR3RZYktROFJYTWlIMk1xczRaNU9Mblp3Qy9rSTNVNEEzMVFlcXl3Cm1VMjd2SDdzZjlwK0tiTE5wZldtMDJoV3hZNzlMS1hCNE1LOStLaXAyVkt4VUlQbHl6bGpXTjQrcngyclN4bVEKbnpmR3NpT0JQTWpOanpQOE44cWJub01hL2Jd1haOHJpazhnenJGN05sQQp6R1BvdC84S2Q5UXZ2Q2doVVlNQ0F3RUFBYU5GTUVNd0RnWURWUjBQQVFIL0JBUURBZ0trTUJJR0ExVWRFd0VCCi93UUlNQVlCQWY4Q0FRQXdIUVlEVlIwT0JCWUVGQk9PWHcrb1o1VGhLQ3I5RnBMWm9ySkZZWW1wTUEwR0NTcUcKU0liM0RRRUJDd1VBQTRJQkFRQ2Rpa2lWYi9pblk1RVNIdVVIcTY2YnBLK1RTQXI5eTFERnN0Qjd2eUt3UGduVAp2bGdEZnZnN1o3UTVOaFhzcFBnV1Y4NEZMMU80UUQ1WmtTYzhLcDVlM1V1ZFFvckRQS3VEOWkzNHVXVVc0TVo5ClR2UUJLNS9sRUlsclBONG5XYmEyazYrOE9tZitmWWREd3JsZTVaa3JUOHh6UnhEbEtXdE5vNVRHMWgrMElUOVgKcVMwL1hzNFpISlU2NGd5dlRsQXlwR2pPdFdxMUc0MEZ5U3dydFJLSE52a3JjTStkeDdvTEM1d003ZTZSTHg1cApnb0Y5dGZZV3ZyUzJWWjl2RUR5QllPN1RheFhEMGlaV2V1VEh0ZFJxTWVLZVAvei9lYnZJMUkvRkWS9NNGdxYjdXbGVqM0JNS051TVc2Q1AwUmFYcXJnUmpnVTAKLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQu=
  namespace: YW50cmVhLW11bHRpY2x1c3Rlcg==
  token: WXlKaGJHY2lPaUpTVXpJMCUVRGaWNIUlJTR0Z1VnpkTU1GVjVOalF3VWpKcVNFVXdYMFkzZDFsNmJqWXpUa3hzVUc4aWZRLmV5SnBjM01pT2lKcmRXSmxjbTVsZEdWekwzTmxjblpwWTJWaFkyTnZkVzUwSWl3aWEzVmlaWEp1WlhSbGN5NXBieTl6WlhKMmFXTmxZV05qYjNWdWRDOXVZVzFsYzNCaFkyVWlPaUpoYm5SeVpXRXRiWFZzZEdsamJIVnpkR1Z5SWl3aWEzVmlaWEp1WlhSbGN5NXBieTl6WlhKMmFXTmxZV05qYjNWdWRDOXpaV055WlhRdWJtRnRaU0k2SW0xbGJXSmxjaTFpYkhWbExYUnZhMlZ1SWl3aWEzVmlaWEp1WlhSbGN5NXBieTl6WlhKMmFXTmxZV05qY5RdWJtRnRaU0k2SW0xbGJXSmxjaTFpYkhWbElpd2lhM1ZpWlhKdVpYUmxjeTVwYnk5elpYSjJhV05sWVdOamIzVnVkQzl6WlhKMmFXTmxMV0ZqWTI5MWJuUXVkV2xrSWpvaU1XSTVaVGd4TUdVdE5tTTNaQzAwTjJZekxXRTVaR1F0TldWbU4yTXpPVEE0T0dNNElpd2ljM1ZpSWpvaWMzbHpkR1Z0T25ObGNuWnBZMlZoWTJOdmRXNTBPbUZ1ZEhKbFlTMXRkV3gwYVdOc2RYTjBaWEk2YldWdFltVnlMV0pzZFdVaWZRLmNyU29sN0JzS2JyR2lhUVYyWmJ2OG8tUVl1eVBfWGlnZ1hfcDg2UGs0ZEpVOTBwWk1XYUZTUkFUR1ptQnBCUUJnajI5eVJHbkdvajVHRWN3YVNxWlZaV3FySk9jVTM1QXFlWHhpWm1fUl9LWDB4VUR1Y0wxQTVxNdWhOYTBGUkxmU2FKWUNOME1NTHNKVTBDU3pHUVg1dHlzTXBUN0YwVG0weS1mZFpVOE9IQmJoY0ZDWXkyYk5WdC0weU9pQUlYOHR2TVNrb2NzaHpWUm5ha1A5dmtMaXNVUGh2Vm9xMVROZ2RVRmtjc0lPRjl4ZFo5Ul9PX3NlT1ZLay1hNkhJbjB3THQzZ3FEZHRHU09Ub3BfMUh0djgxeEZQdF9zNlVRNmpldjZpejh3aFAzX1BkSGhwTlNCWFBEc3hZbEhyMlVaUK==
kind: Secret
metadata:
  name: member-red-token
  namespace: kube-system
type: Opaque

On the Member Clusters - Apply the tokens

Now apply the corresponding token yaml on both members: member-1 applies the member-blue-token.yml and member-2 applies the member-red-token.yml.

## tkg-cluster-2
andreasm@tkg-bootstrap:~$ k apply -f member-blue-token.yml
secret/member-blue-token created
## tkg-cluster-3
andreasm@tkg-bootstrap:~$ k apply -f member-red-token.yml
secret/member-red-token created

On the Leader Cluster - ClusterSet Initialization

I am using Antrea version 1.11.1 (the one that comes with TKG 2.3), so in this step I need to use the v1alpha1 API for the ClusterSet resource (and v1alpha2 for the ClusterClaims).

First I need to create a ClusterSet in the Leader cluster by applying the below yaml:

apiVersion: multicluster.crd.antrea.io/v1alpha2
kind: ClusterClaim
metadata:
  name: id.k8s.io
  namespace: antrea-multicluster
value: tkg-cluster-leader
---
apiVersion: multicluster.crd.antrea.io/v1alpha2
kind: ClusterClaim
metadata:
  name: clusterset.k8s.io
  namespace: antrea-multicluster
value: andreasm-clusterset
---
apiVersion: multicluster.crd.antrea.io/v1alpha1
kind: ClusterSet
metadata:
  name: andreasm-clusterset
  namespace: antrea-multicluster
spec:
  leaders:
    - clusterID: tkg-cluster-leader
## apply it
andreasm@tkg-bootstrap:~$ k apply -f clusterset-leader.yaml
### applied
clusterclaim.multicluster.crd.antrea.io/id.k8s.io created
clusterclaim.multicluster.crd.antrea.io/clusterset.k8s.io created
clusterset.multicluster.crd.antrea.io/andreasm-clusterset created
andreasm@tkg-bootstrap:~$ k get clustersets.multicluster.crd.antrea.io -A
NAMESPACE             NAME                  LEADER CLUSTER NAMESPACE   TOTAL CLUSTERS   READY CLUSTERS   AGE
antrea-multicluster   andreasm-clusterset                                                                84s
andreasm@tkg-bootstrap:~$ k get clusterclaims.multicluster.crd.antrea.io -A
NAMESPACE             NAME                VALUE                 AGE
antrea-multicluster   clusterset.k8s.io   andreasm-clusterset   118s
antrea-multicluster   id.k8s.io           tkg-cluster-leader    119s

On the Member Clusters - ClusterSet Initialization

Next step is to deploy the corresponding yaml on both my member clusters (tkg-cluster-2=member-cluster-blue and tkg-cluster-3=member-cluster-red):

apiVersion: multicluster.crd.antrea.io/v1alpha2
kind: ClusterClaim
metadata:
  name: id.k8s.io
  namespace: kube-system
value: member-cluster-blue
---
apiVersion: multicluster.crd.antrea.io/v1alpha2
kind: ClusterClaim
metadata:
  name: clusterset.k8s.io
  namespace: kube-system
value: andreasm-clusterset
---
apiVersion: multicluster.crd.antrea.io/v1alpha1
kind: ClusterSet
metadata:
  name: andreasm-clusterset
  namespace: kube-system
spec:
  leaders:
    - clusterID: tkg-cluster-leader
      secret: "member-blue-token" # secret/token created earlier
      server: "https://10.101.114.100:6443" # reflect the correct endpoint IP for leader cluster
  namespace: antrea-multicluster
apiVersion: multicluster.crd.antrea.io/v1alpha2
kind: ClusterClaim
metadata:
  name: id.k8s.io
  namespace: kube-system
value: member-cluster-red
---
apiVersion: multicluster.crd.antrea.io/v1alpha2
kind: ClusterClaim
metadata:
  name: clusterset.k8s.io
  namespace: kube-system
value: andreasm-clusterset
---
apiVersion: multicluster.crd.antrea.io/v1alpha1
kind: ClusterSet
metadata:
  name: andreasm-clusterset
  namespace: kube-system
spec:
  leaders:
    - clusterID: tkg-cluster-leader
      secret: "member-red-token" # secret/token created earlier
      server: "https://10.101.114.100:6443" # reflect the correct endpoint IP for leader cluster
  namespace: antrea-multicluster
## tkg-cluster-2
andreasm@tkg-bootstrap:~$ k apply -f clusterclaim-member-blue.yaml
## applied
clusterclaim.multicluster.crd.antrea.io/id.k8s.io created
clusterclaim.multicluster.crd.antrea.io/clusterset.k8s.io created
clusterset.multicluster.crd.antrea.io/andreasm-clusterset created
## tkg-cluster-3
andreasm@tkg-bootstrap:~$ k apply -f clusterclaim-member-red.yaml
## applied
clusterclaim.multicluster.crd.antrea.io/id.k8s.io created
clusterclaim.multicluster.crd.antrea.io/clusterset.k8s.io created
clusterset.multicluster.crd.antrea.io/andreasm-clusterset created

On the Member Clusters - Multi-cluster Gateway configuration

From the docs:

Multi-cluster Gateways are required to support multi-cluster Service access across member clusters. Each member cluster should have one Node served as its Multi-cluster Gateway. Multi-cluster Service traffic is routed among clusters through the tunnels between Gateways.

After a member cluster joins a ClusterSet, and the Multicluster feature is enabled on antrea-agent, you can select a Node of the cluster to serve as the Multi-cluster Gateway by adding an annotation: multicluster.antrea.io/gateway=true to the K8s Node.

You can annotate multiple Nodes in a member cluster as the candidates for Multi-cluster Gateway, but only one Node will be selected as the active Gateway. Before Antrea v1.9.0, the Gateway Node is just randomly selected and will never change unless the Node or its gateway annotation is deleted. Starting with Antrea v1.9.0, Antrea Multi-cluster Controller will guarantee a "ready" Node is selected as the Gateway, and when the current Gateway Node's status changes to not "ready", Antrea will try selecting another "ready" Node from the candidate Nodes to be the Gateway.

I will annotate both my worker nodes in both my member clusters, blue and red.

andreasm@tkg-bootstrap:~/Kubernetes-library/tkgm/antrea-multicluster$ k annotate node tkg-cluster-2-md-0-qhqgb-58df8c59f4xxb5q8-s7w2z multicluster.antrea.io/gateway=true
node/tkg-cluster-2-md-0-qhqgb-58df8c59f4xxb5q8-s7w2z annotated

## Will repeat this for both nodes in both clusters

Then check which node has been selected as the active Gateway in each cluster:

## tkg cluster 2 - member blue
andreasm@tkg-bootstrap:~$ k get gateway -n kube-system
NAME                                              GATEWAY IP     INTERNAL IP    AGE
tkg-cluster-2-md-0-qhqgb-58df8c59f4xxb5q8-s7w2z   10.101.12.24   10.101.12.24   4m18s
## tkg cluster 3 - member red
andreasm@tkg-bootstrap:~$ k get gateway -n kube-system
NAME                                              GATEWAY IP     INTERNAL IP    AGE
tkg-cluster-3-md-0-82gh2-6dcf989cbcxnc8lc-z5c8g   10.101.12.26   10.101.12.26   116s

And the logs from the Antrea MC controller after adding the gateway annotation:

I0928 09:46:01.599697       1 clusterset_controller.go:111] "Received ClusterSet add/update" clusterset="kube-system/andreasm-clusterset"
I0928 09:46:01.599772       1 controller_utils.go:40] "Validating ClusterClaim" namespace="kube-system"
I0928 09:46:01.701456       1 controller_utils.go:52] "Processing ClusterClaim" name="id.k8s.io" value="member-cluster-blue"
I0928 09:46:01.701484       1 controller_utils.go:52] "Processing ClusterClaim" name="clusterset.k8s.io" value="andreasm-clusterset"
I0928 09:46:01.701907       1 clusterset_controller.go:204] "Creating RemoteCommonArea" cluster=tkg-cluster-leader
I0928 09:46:02.236829       1 remote_common_area.go:111] "Create a RemoteCommonArea" cluster=tkg-cluster-leader
I0928 09:46:02.237004       1 clusterset_controller.go:251] "Created RemoteCommonArea" cluster=tkg-cluster-leader
I0928 09:46:02.237064       1 remote_common_area.go:293] "Starting MemberAnnounce to RemoteCommonArea" cluster=tkg-cluster-leader
I0928 09:46:02.388473       1 remote_common_area.go:236] "Updating RemoteCommonArea status" cluster=tkg-cluster-leader connected=true
I0928 09:49:17.663979       1 clusterset_controller.go:111] "Received ClusterSet add/update" clusterset="kube-system/andreasm-clusterset"
I0928 09:49:17.664066       1 controller_utils.go:40] "Validating ClusterClaim" namespace="kube-system"
I0928 09:49:17.764734       1 controller_utils.go:52] "Processing ClusterClaim" name="id.k8s.io" value="member-cluster-red"
I0928 09:49:17.764778       1 controller_utils.go:52] "Processing ClusterClaim" name="clusterset.k8s.io" value="andreasm-clusterset"
I0928 09:49:17.764855       1 clusterset_controller.go:204] "Creating RemoteCommonArea" cluster=tkg-cluster-leader
I0928 09:49:18.251398       1 remote_common_area.go:111] "Create a RemoteCommonArea" cluster=tkg-cluster-leader
I0928 09:49:18.251475       1 clusterset_controller.go:251] "Created RemoteCommonArea" cluster=tkg-cluster-leader
I0928 09:49:18.251588       1 remote_common_area.go:293] "Starting MemberAnnounce to RemoteCommonArea" cluster=tkg-cluster-leader
I0928 09:49:18.363564       1 remote_common_area.go:236] "Updating RemoteCommonArea status" cluster=tkg-cluster-leader connected=true

Both member clusters should now have received, and be aware of, the other member cluster's network information. This can be verified with the following command on both member clusters:

## member-red - tkg-cluster-3
andreasm@tkg-bootstrap:~$ k get clusterinfoimports.multicluster.crd.antrea.io -n kube-system
NAME                              CLUSTER ID            SERVICE CIDR    AGE
member-cluster-blue-clusterinfo   member-cluster-blue   10.134.0.0/16   69s
## member-blue - tkg-cluster-2
andreasm@tkg-bootstrap:~$ k get clusterinfoimports.multicluster.crd.antrea.io -n kube-system
NAME                             CLUSTER ID           SERVICE CIDR    AGE
member-cluster-red-clusterinfo   member-cluster-red   10.136.0.0/16   3m10s

Now that Antrea Multi-cluster has been configured, it's time to head over to the sections covering what we can use it for.

This is how my environment looks now:

Member cluster blue is exchanging information with member cluster red via the leader cluster, using the gateway currently active on worker-node-md-0 in each member cluster.

Using antctl

For now I will just link to Antrea's GitHub docs on how to use the antctl approach.

For how to use the antctl approach, click here

To use the antctl cli tool download the corresponding Antrea version of antctl here

Multi-cluster Service

Imagine you have an application that consists of a frontend and a backend. The frontend must run on a dedicated cluster, or multiple clusters, for scalability, easier exposure, and security posture reasons. The backend services should run in a more controlled "inner" cluster. With Antrea Multi-cluster, that is possible. Let's go through how to configure this.

First I will deploy an application called Yelb (source). It consists of a frontend service "yelb-ui" and three backend components: the application server "yelb-appserver", "redis-server" and the PostgreSQL database "yelb-db".

This is what the architecture of Yelb looks like:

yelb

I want the yelb-ui to only run in the member cluster blue, and I want all the backends to run in the member cluster red. Like this:

I will start with deploying the backend part of the Yelb application in my member cluster red:

## Backend pods running in member-red
andreasm@tkg-bootstrap:~$ k get pods -n yelb -o wide
NAME                              READY   STATUS    RESTARTS   AGE     IP             NODE                                              NOMINATED NODE   READINESS GATES
redis-server-56d97cc8c-dzbg7      1/1     Running   0          5m38s   100.10.2.157   tkg-cluster-3-md-0-82gh2-6dcf989cbcxnc8lc-z5c8g   <none>           <none>
yelb-appserver-65855b7ffd-nggg9   1/1     Running   0          5m36s   100.10.2.159   tkg-cluster-3-md-0-82gh2-6dcf989cbcxnc8lc-z5c8g   <none>           <none>
yelb-db-6f78dc6f8f-5dgpv          1/1     Running   0          5m37s   100.10.2.158   tkg-cluster-3-md-0-82gh2-6dcf989cbcxnc8lc-z5c8g   <none>           <none>
## Backend services running in member-red
andreasm@tkg-bootstrap:~$ k get svc -n yelb
NAME             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
redis-server     ClusterIP   100.20.176.130   <none>        6379/TCP   2m29s
yelb-appserver   ClusterIP   100.20.111.133   <none>        4567/TCP   2m27s
yelb-db          ClusterIP   100.20.7.160     <none>        5432/TCP   2m28s

Now I will deploy the frontend service in the member cluster blue

andreasm@tkg-bootstrap:~$ k get pods -n yelb
NAME                       READY   STATUS             RESTARTS      AGE
yelb-ui-5c5b8d8887-wkchd   0/1     CrashLoopBackOff   3 (20s ago)   2m49s

It's deployed, but not running. It can't reach the backend service yelb-appserver, as that is running in a completely different cluster.

andreasm@tkg-bootstrap:~$ k logs -n yelb yelb-ui-7bc645756b-qmtbp
2023/09/29 06:16:17 [emerg] 10#10: host not found in upstream "antrea-mc-yelb-appserver" in /etc/nginx/conf.d/default.conf:5
nginx: [emerg] host not found in upstream "antrea-mc-yelb-appserver" in /etc/nginx/conf.d/default.conf:5

It's also shown as red (down) in my NSX-ALB environment.

Now I will export the yelb-appserver service using Antrea Multi-cluster Service, so any pods running in member-blue can also use this service running on member-red. For the Yelb UI to work it needs to talk to the appserver service (yelb-appserver).

From the source cluster, where the yelb-appserver service is defined and running, I need to create a ServiceExport, which simply references the name of the original service and the namespace it resides in:

apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: yelb-appserver ## name of service you want to export
  namespace: yelb ## namespace of the service

Apply it

andreasm@tkg-bootstrap:~$ k apply -f yelb-app-service-export.yaml
serviceexport.multicluster.x-k8s.io/yelb-appserver created

From the leader-cluster I can check all the resourceexports and resourceimports:

andreasm@tkg-bootstrap:~$ k get resourceexports.multicluster.crd.antrea.io -n antrea-multicluster
NAME                                               CLUSTER ID            KIND          NAMESPACE     NAME                  AGE
member-cluster-blue-clusterinfo                    member-cluster-blue   ClusterInfo   kube-system   member-cluster-blue   53m
member-cluster-red-clusterinfo                     member-cluster-red    ClusterInfo   kube-system   member-cluster-red    51m
member-cluster-red-yelb-yelb-appserver-endpoints   member-cluster-red    Endpoints     yelb          yelb-appserver        2m59s
member-cluster-red-yelb-yelb-appserver-service     member-cluster-red    Service       yelb          yelb-appserver        2m59s
andreasm@tkg-bootstrap:~$ k get resourceimports.multicluster.crd.antrea.io -n antrea-multicluster
NAME                              KIND            NAMESPACE             NAME                              AGE
member-cluster-blue-clusterinfo   ClusterInfo     antrea-multicluster   member-cluster-blue-clusterinfo   55m
member-cluster-red-clusterinfo    ClusterInfo     antrea-multicluster   member-cluster-red-clusterinfo    53m
yelb-yelb-appserver-endpoints     Endpoints       yelb                  yelb-appserver                    4m41s
yelb-yelb-appserver-service       ServiceImport   yelb                  yelb-appserver                    4m41s

So my yelb-appserver service has been exported.

What happens now in my member cluster blue? For the service to be imported into member cluster blue, the namespace yelb needs to exist there, otherwise it will not be imported. I already have the yelb namespace, as the yelb-ui is deployed in it. In member cluster blue I can now see that the yelb-appserver service has been imported. It's there.

andreasm@tkg-bootstrap:~$ k get serviceimports.multicluster.x-k8s.io -A
NAMESPACE   NAME             TYPE           IP                   AGE
yelb        yelb-appserver   ClusterSetIP   ["100.40.224.145"]   15m
andreasm@tkg-bootstrap:~$ k get serviceimports.multicluster.x-k8s.io -n yelb
NAME             TYPE           IP                   AGE
yelb-appserver   ClusterSetIP   ["100.40.224.145"]   17m
andreasm@tkg-bootstrap:~$ k get svc -n yelb
NAME                       TYPE           CLUSTER-IP       EXTERNAL-IP      PORT(S)        AGE
antrea-mc-yelb-appserver   ClusterIP      100.40.224.145   <none>           4567/TCP       19m
yelb-ui                    LoadBalancer   100.40.136.62    10.101.115.100   80:30425/TCP   26m

Will my yelb-ui pod figure this out then?

The yelb-ui pod is running:

amarqvardsen@amarqvards1MD6T:~/Kubernetes-library/tkgm/antrea-multicluster/yelb$ k get pods -n yelb
NAME                       READY   STATUS    RESTARTS       AGE
yelb-ui-5c5b8d8887-wkchd   1/1     Running   5 (5m5s ago)   8m17s

The VS is green:

And inside the application itself I can also see the app server it is currently using:

And a quick sanity check: where are my pods running? The yelb-ui pod is running in member cluster blue:

NAME                       READY   STATUS    RESTARTS       AGE    IP            NODE                                              NOMINATED NODE   READINESS GATES
yelb-ui-7bc645756b-qmtbp   1/1     Running   5 (121m ago)   124m   10.133.1.56   tkg-cluster-2-md-1-wmxhd-5998fcf669x64b9h-z7mnw   <none>           <none>

And the appserver pod is running:

NAME                             READY   STATUS    RESTARTS   AGE    IP            NODE                                              NOMINATED NODE   READINESS GATES
redis-server-56d97cc8c-wvc4q     1/1     Running   0          126m   10.135.1.53   tkg-cluster-3-md-1-824bx-5bdb559f7bxqgbb8-bqwnr   <none>           <none>
yelb-appserver-d9ddcdd97-z8gt9   1/1     Running   0          126m   10.135.1.55   tkg-cluster-3-md-1-824bx-5bdb559f7bxqgbb8-bqwnr   <none>           <none>
yelb-db-749c784cb8-f6wx8         1/1     Running   0          126m   10.135.1.54   tkg-cluster-3-md-1-824bx-5bdb559f7bxqgbb8-bqwnr   <none>           <none>

Notice that the name shown in the app UI above matches the name of the pod listed. It's the same pod.

Recap of Multi-cluster Services

  • In member cluster red I deployed the three backends needed for the Yelb application. These consist of three pods and their corresponding services. Initially they were only accessible locally, inside that cluster.
  • In member cluster blue I deployed the Yelb UI service, which sat in CrashLoopBackOff as it couldn't reach the required appserver service.
  • Then I exported the yelb-appserver service using Antrea Multi-cluster Services, making it reachable from member cluster blue.

And the Yelb application lived happily ever after 🆒

Quick troubleshooting

Here is an easy way to verify that exported services work. I deployed a simple nginx pod in member cluster red and exported its nginx ClusterIP service. In member cluster blue I deployed an Ubuntu pod, and from it I will curl the exported nginx service to check that it is reachable and works.

## nginx pod running in member cluster red
NAME                        READY   STATUS    RESTARTS   AGE    IP            NODE                                              NOMINATED NODE   READINESS GATES
nginx-ui-59c956b95b-qhz8n   1/1     Running   0          151m   10.135.1.56   tkg-cluster-3-md-1-824bx-5bdb559f7bxqgbb8-bqwnr   <none>           <none>
## nginx local clusterip service
NAME              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
nginx             ClusterIP   10.136.40.246    <none>        80/TCP    152m
## the exported nginx service in my member cluster blue
NAME              TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
antrea-mc-nginx   ClusterIP   10.134.30.220   <none>        80/TCP    147m
## from my ubuntu pod in member cluster blue
root@ubuntu-20-04-cbb58d77-tcrjb:/# nslookup 10.134.30.220
220.30.134.10.in-addr.arpa	name = antrea-mc-nginx.nginx.svc.cluster.local.

## curl the dns name
root@ubuntu-20-04-cbb58d77-tcrjb:/# curl http://antrea-mc-nginx.nginx.svc.cluster.local
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
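As the nslookup output shows, an imported service is exposed locally as a mirror Service with an antrea-mc- prefix, so its in-cluster DNS name can be derived from the exported service name and namespace. A tiny sketch of that naming convention (the helper function is mine, not part of Antrea):

```python
def imported_service_dns(name: str, namespace: str) -> str:
    """DNS name of the local mirror Service that Antrea MC creates for an import.

    The mirror Service is named "antrea-mc-<exported name>" and lives in the
    same namespace as the exported Service, so regular cluster DNS applies.
    """
    return f"antrea-mc-{name}.{namespace}.svc.cluster.local"

print(imported_service_dns("nginx", "nginx"))
# antrea-mc-nginx.nginx.svc.cluster.local
```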

Failure scenario

What happens if a node which currently holds the active gateway goes down?

Let's test that. These are currently my active gateways in member-cluster blue and red respectively:

## get the current gateway in member-cluster blue
k get gateway -A
NAMESPACE     NAME                                             GATEWAY IP     INTERNAL IP    AGE
kube-system   tkg-cluster-2-md-0-vrt25-7f44f4798xqbk9h-tc979   10.101.12.14   10.101.12.14   37h
## list all nodes in my member-cluster blue
k get nodes -owide
NAME                                              STATUS   ROLES           AGE   VERSION            INTERNAL-IP    EXTERNAL-IP    OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
tkg-cluster-2-md-0-vrt25-7f44f4798xqbk9h-tc979    Ready    <none>          46h   v1.26.5+vmware.2   10.101.12.14   10.101.12.14   Ubuntu 20.04.6 LTS   5.4.0-152-generic   containerd://1.6.18-1-gdbc99e5b1
tkg-cluster-2-md-1-wmxhd-5998fcf669x64b9h-z7mnw   Ready    <none>          46h   v1.26.5+vmware.2   10.101.12.13   10.101.12.13   Ubuntu 20.04.6 LTS   5.4.0-152-generic   containerd://1.6.18-1-gdbc99e5b1
tkg-cluster-2-x2fqj-fxswx                         Ready    control-plane   46h   v1.26.5+vmware.2   10.101.12.33   10.101.12.33   Ubuntu 20.04.6 LTS   5.4.0-152-generic   containerd://1.6.18-1-gdbc99e5b1
## get the current gateway in member-cluster red
k get gateway -A
NAMESPACE     NAME                                             GATEWAY IP     INTERNAL IP    AGE
kube-system   tkg-cluster-3-md-0-h5ppw-9db445579xq45bn-nsq98   10.101.12.38   10.101.12.38   37h
## list all nodes in my member-cluster red
k get nodes -o wide
NAME                                              STATUS   ROLES           AGE   VERSION            INTERNAL-IP    EXTERNAL-IP    OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
tkg-cluster-3-krrwq-p588j                         Ready    control-plane   46h   v1.26.5+vmware.2   10.101.12.28   10.101.12.28   Ubuntu 20.04.6 LTS   5.4.0-152-generic   containerd://1.6.18-1-gdbc99e5b1
tkg-cluster-3-md-0-h5ppw-9db445579xq45bn-nsq98    Ready    <none>          46h   v1.26.5+vmware.2   10.101.12.38   10.101.12.38   Ubuntu 20.04.6 LTS   5.4.0-152-generic   containerd://1.6.18-1-gdbc99e5b1
tkg-cluster-3-md-1-824bx-5bdb559f7bxqgbb8-bqwnr   Ready    <none>          46h   v1.26.5+vmware.2   10.101.12.17   10.101.12.17   Ubuntu 20.04.6 LTS   5.4.0-152-generic   containerd://1.6.18-1-gdbc99e5b1

Now I will shut down the node that is currently the active gateway in my member-cluster blue. From vCenter I will just do a "Power off" operation.

power-off-gateway-node

And I will even delete it.

Remember I annotated my two nodes as potential Gateway candidates?

Well, look what happened after my active gateway disappeared: the next available candidate was selected automatically. The gateway is up again and all services are up.

k get gateway -A
NAMESPACE     NAME                                              GATEWAY IP     INTERNAL IP    AGE
kube-system   tkg-cluster-2-md-1-wmxhd-5998fcf669x64b9h-z7mnw   10.101.12.13   10.101.12.13   37s

The deleted node will be taken care of by TKG and recreated.

Routing pod traffic through Multi-cluster Gateways

The next neat feature of Antrea Multi-cluster is the ability to let pods in different member clusters reach each other on their respective pod IP addresses. Say I have POD-A running in member-cluster blue that depends on connecting to POD-B running in member-cluster red. To achieve this, a couple of configuration steps are needed.

  • The agent.conf feature gate enablePodToPodConnectivity: true must be set (this is already done, as explained in the initial chapter)
  • The antrea-mc-controller configMap needs to be configured

There are two ways to edit the antrea-mc-controller configMap: edit the antrea-multicluster-member.yml manifest that was used to install the antrea-mc-controller in the member clusters, or edit the configMap antrea-mc-controller-config directly in the kube-system namespace. I will edit it directly, and the only thing that needs to be added is the cluster's configured pod CIDR.

Info

If editing directly, one has to restart the antrea-mc-controller pod for it to read the new configMap.

The pod CIDRs of the clusters must NOT overlap.
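Since overlapping pod CIDRs would silently break cross-cluster routing, it can be worth checking the ranges before joining clusters to a ClusterSet. A quick sketch using Python's ipaddress module (the function name is mine; the CIDRs are the ones used in this post):

```python
import ipaddress

def cidrs_overlap(cidrs):
    """Return True if any pair of CIDRs in the list overlaps."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return any(a.overlaps(b) for i, a in enumerate(nets) for b in nets[i + 1:])

# blue = 10.133.0.0/16, red = 10.135.0.0/16 -> disjoint, safe to use
print(cidrs_overlap(["10.133.0.0/16", "10.135.0.0/16"]))  # False
# a /24 carved out of blue's range would clash
print(cidrs_overlap(["10.133.0.0/16", "10.133.1.0/24"]))  # True
```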

Editing the configMap

## member-cluster-blue (tkg-cluster-2)
## Get the pod or cluster cidr of the cluster you are editing the configMap
k cluster-info dump | grep -m 1 cluster-cidr
"--cluster-cidr=10.133.0.0/16",
## edit the configmap
k edit configmaps -n kube-system antrea-mc-controller-config
## the configmap
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  controller_manager_config.yaml: |
    apiVersion: multicluster.crd.antrea.io/v1alpha1
    kind: MultiClusterConfig
    health:
      healthProbeBindAddress: :8080
    metrics:
      bindAddress: "0"
    webhook:
      port: 9443
    leaderElection:
      leaderElect: false
    serviceCIDR: ""
    podCIDRs:
      - "10.133.0.0/16"  ## Add the cidr here from the range above
    gatewayIPPrecedence: "private"
    endpointIPType: "ClusterIP"
    enableStretchedNetworkPolicy: false    
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"controller_manager_config.yaml":"apiVersion: multicluster.crd.antrea.io/v1alpha1\nkind: MultiClusterConfig\nhealth:\n  healthProbeBindAddress: :8080\nmetrics:\n  bindAddress: \"0\"\nwebhook:\n  port: 9443\nleaderElection:\n  leaderElect: false\nserviceCIDR: \"\"\npodCIDRs:\n  - \"\"\ngatewayIPPrecedence: \"private\"\nendpointIPType: \"ClusterIP\"\nenableStretchedNetworkPolicy: false\n"},"kind":"ConfigMap","metadata":{"annotations":{},"labels":{"app":"antrea"},"name":"antrea-mc-controller-config","namespace":"kube-system"}}      
  creationTimestamp: "2023-09-29T05:57:12Z"
  labels:
    app: antrea
  name: antrea-mc-controller-config
  namespace: kube-system
  resourceVersion: "98622"
  uid: fd760052-a18e-4623-b217-d9b96ae36cac

## :wq
configmap/antrea-mc-controller-config edited

Restart the antrea-mc-controller pod:

k delete pod -n kube-system antrea-mc-controller-5bb945f87f-nfd97
pod "antrea-mc-controller-5bb945f87f-nfd97" deleted

Repeat on the other clusters

Now testing time

I have a pod running in member-cluster red (tkg-cluster-3) with the following IP:

NAME                          READY   STATUS    RESTARTS   AGE     IP            NODE                                              NOMINATED NODE   READINESS GATES
ubuntu-20-04-cbb58d77-l25zv   1/1     Running   0          5h18m   10.135.1.58   tkg-cluster-3-md-1-824bx-5bdb559f7bxqgbb8-bqwnr   <none>           <none>

I also have another POD running in my member-cluster blue with this information:

NAME                          READY   STATUS    RESTARTS   AGE     IP            NODE                                              NOMINATED NODE   READINESS GATES
ubuntu-20-04-cbb58d77-tcrjb   1/1     Running   0          5h24m   10.133.1.57   tkg-cluster-2-md-1-wmxhd-5998fcf669x64b9h-z7mnw   <none>           <none>

Now I will exec into the latter pod and ping the pod in member-cluster red using its pod IP.

root@ubuntu-20-04-cbb58d77-tcrjb:/# ping 10.135.1.58
PING 10.135.1.58 (10.135.1.58) 56(84) bytes of data.

On the destination pod I have started tcpdump to listen on icmp, and on the screenshot below I have the ping from the source on the left side to the destination on the right. Notice the IP addresses being reported by tcpdump in the destination pod.

src-dst-ping

root@ubuntu-20-04-cbb58d77-l25zv:/# tcpdump -i eth0 icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:41:44.577683 IP 10.101.12.14 > ubuntu-20-04-cbb58d77-l25zv: ICMP echo request, id 4455, seq 1, length 64
11:41:44.577897 IP ubuntu-20-04-cbb58d77-l25zv > 10.101.12.14: ICMP echo reply, id 4455, seq 1, length 64
11:41:45.577012 IP 10.101.12.14 > ubuntu-20-04-cbb58d77-l25zv: ICMP echo request, id 4455, seq 2, length 64
11:41:45.577072 IP ubuntu-20-04-cbb58d77-l25zv > 10.101.12.14: ICMP echo reply, id 4455, seq 2, length 64
11:41:46.577099 IP 10.101.12.14 > ubuntu-20-04-cbb58d77-l25zv: ICMP echo request, id 4455, seq 3, length 64
11:41:46.577144 IP ubuntu-20-04-cbb58d77-l25zv > 10.101.12.14: ICMP echo reply, id 4455, seq 3, length 64
11:41:47.579227 IP 10.101.12.14 > ubuntu-20-04-cbb58d77-l25zv: ICMP echo request, id 4455, seq 4, length 64
11:41:47.579256 IP ubuntu-20-04-cbb58d77-l25zv > 10.101.12.14: ICMP echo reply, id 4455, seq 4, length 64
11:41:48.580607 IP 10.101.12.14 > ubuntu-20-04-cbb58d77-l25zv: ICMP echo request, id 4455, seq 5, length 64
11:41:48.580632 IP ubuntu-20-04-cbb58d77-l25zv > 10.101.12.14: ICMP echo reply, id 4455, seq 5, length 64
11:41:49.581900 IP 10.101.12.14 > ubuntu-20-04-cbb58d77-l25zv: ICMP echo request, id 4455, seq 6, length 64

The source IP address here is the gateway IP from the source member-cluster:

andreasm@tkg-bootstrap:~$ k get gateway -A
NAMESPACE     NAME                                             GATEWAY IP     INTERNAL IP    AGE
kube-system   tkg-cluster-2-md-0-vrt25-7f44f4798xqbk9h-tc979   10.101.12.14   10.101.12.14   5h37m

And if I do the same operation, just changing the direction:

src-right-dst-left

The source IP is the gateway IP from the source cluster:

NAMESPACE     NAME                                             GATEWAY IP     INTERNAL IP    AGE
kube-system   tkg-cluster-3-md-0-h5ppw-9db445579xq45bn-nsq98   10.101.12.38   10.101.12.38   5h41m

Using antctl traceflow:

andreasm@tkg-bootstrap:~$ antctl traceflow -S prod/ubuntu-20-04-cbb58d77-tcrjb -D 10.135.1.58
name: prod-ubuntu-20-04-cbb58d77-tcrjb-to-10.135.1.58-5b8l7ms8
phase: Running
source: prod/ubuntu-20-04-cbb58d77-tcrjb
destination: 10.135.1.58
results:
- node: tkg-cluster-2-md-1-wmxhd-5998fcf669x64b9h-z7mnw
  timestamp: 1696064979
  observations:
  - component: SpoofGuard
    action: Forwarded
  - component: Forwarding
    componentInfo: Output
    action: Forwarded
    tunnelDstIP: 10.101.12.14
Error: timeout waiting for Traceflow done

I know it is forwarded and delivered as I can see it on the destination pod using tcpdump while doing the traceflow:

root@ubuntu-20-04-cbb58d77-l25zv:/# tcpdump -i eth0 icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
09:09:39.161140 IP 10.101.12.14 > ubuntu-20-04-cbb58d77-l25zv: ICMP echo request, id 0, seq 0, length 8
09:09:39.161411 IP ubuntu-20-04-cbb58d77-l25zv > 10.101.12.14: ICMP echo reply, id 0, seq 0, length 8

Next up is how Antrea Policies can be used with Antrea Multi-cluster

Multi-cluster NetworkPolicy (ANP and ACNP)

This feature is about creating policies that use ClusterSet objects as selectors for ingress or egress rules, such as exported Multi-cluster services or namespaces across the clusters in a ClusterSet. For this to work, the enableStretchedNetworkPolicy feature gate must be set to true in both controller.conf and agent.conf. I already enabled this at cluster provisioning; to check, take a look at the antrea-config configMap:

andreasm@tkg-bootstrap:~$ k get configmaps -n kube-system antrea-config -oyaml
apiVersion: v1
data:
  antrea-agent.conf: |
    featureGates:
      Multicluster: true
    multicluster:
      enableGateway: true
      enableStretchedNetworkPolicy: true
      enablePodToPodConnectivity: true    
  ...
  antrea-controller.conf: |
    featureGates: 
      Multicluster: true
    multicluster:
      enableStretchedNetworkPolicy: true    

Following the GitHub docs, I will use the examples there to create some policies and demonstrate them.

Starting with the first example, an egress rule to a Multi-cluster service.

Egress rule to Multi-cluster service

Below is an example taken from the official docs, creating a policy that will drop traffic to a specific Multi-cluster service from specific pods selected by label (adjusted to fit my environment):

apiVersion: crd.antrea.io/v1alpha1
kind: ClusterNetworkPolicy
metadata:
  name: acnp-drop-ubuntu-pod-to-nginx-mc-service
spec:
  priority: 1
  tier: securityops
  appliedTo:
    - podSelector:
        matchLabels:
          role: no-nginx
  egress:
    - action: Drop
      toServices:
        - name: nginx   # an exported Multi-cluster Service
          namespace: nginx
          scope: ClusterSet

I will now apply it on member-cluster blue, where my Ubuntu test pod is running. I could also apply it on the cluster the service originates from, and it would have the same effect:

k apply -f egress-nginx-mc.yaml
clusternetworkpolicy.crd.antrea.io/acnp-drop-ubuntu-pod-to-nginx-mc-service created
k get acnp
NAME                                       TIER          PRIORITY   DESIRED NODES   CURRENT NODES   AGE
acnp-drop-ubuntu-pod-to-nginx-mc-service   securityops   1          0               0               32s

It's applied, but not in effect, as I haven't applied the correct labels on my test pod yet.

So from my Ubuntu test pod, I can still reach the service nginx. But I will now label the pod according to the yaml above.

k label pod -n prod ubuntu-20-04-cbb58d77-tcrjb role=no-nginx
pod/ubuntu-20-04-cbb58d77-tcrjb labeled
k get acnp
NAME                                       TIER          PRIORITY   DESIRED NODES   CURRENT NODES   AGE
acnp-drop-ubuntu-pod-to-nginx-mc-service   securityops   1          1               1               39s

Something is in effect... Now from my test Ubuntu pod I will try to curl the nginx service.

root@ubuntu-20-04-cbb58d77-tcrjb:/# nslookup 10.134.30.220
220.30.134.10.in-addr.arpa	name = antrea-mc-nginx.nginx.svc.cluster.local.
root@ubuntu-20-04-cbb58d77-tcrjb:/# curl http://antrea-mc-nginx.nginx.svc.cluster.local
curl: (28) Failed to connect to antrea-mc-nginx.nginx.svc.cluster.local port 80: Connection timed out
root@ubuntu-20-04-cbb58d77-tcrjb:/#

Then I can do an Antrea traceflow:

andreasm@tkg-bootstrap:~$ antctl traceflow -S prod/ubuntu-20-04-cbb58d77-tcrjb -D nginx/antrea-mc-nginx -f tcp,tcp_dst=80
name: prod-ubuntu-20-04-cbb58d77-tcrjb-to-nginx-antrea-mc-nginx-l55dnhvf
phase: Succeeded
source: prod/ubuntu-20-04-cbb58d77-tcrjb
destination: nginx/antrea-mc-nginx
results:
- node: tkg-cluster-2-md-1-wmxhd-5998fcf669x64b9h-z7mnw
  timestamp: 1696065339
  observations:
  - component: SpoofGuard
    action: Forwarded
  - component: LB
    action: Forwarded
    translatedDstIP: 10.136.40.246
  - component: NetworkPolicy
    componentInfo: EgressMetric
    action: Dropped
    networkPolicy: AntreaClusterNetworkPolicy:acnp-drop-ubuntu-pod-to-nginx-mc-service

No need to troubleshoot connectivity issues, it is being dropped by the above policy. 🚷

This policy drops all outgoing (egress) traffic from any pod with the label role=no-nginx to the service exported/imported through Multi-cluster. Pods are selected regardless of namespace, as the selection is done using pod labels and I am using an Antrea ClusterNetworkPolicy.

Such a policy is applied on the "source" cluster where you want to enforce strict egress policies. Another way is to define ingress policies on the destination cluster (the cluster the exported service originates from).
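As a sketch of that alternative, the same intent could be expressed as an ingress rule applied in member cluster red, where the nginx pods live. This manifest is mine, not from the Antrea docs: the policy name is hypothetical, and it assumes the nginx pods carry an app: nginx label — adjust the selectors to your environment:

```yaml
apiVersion: crd.antrea.io/v1alpha1
kind: ClusterNetworkPolicy
metadata:
  name: acnp-drop-no-nginx-pods-ingress   # hypothetical name
spec:
  priority: 1
  tier: securityops
  appliedTo:
    - podSelector:
        matchLabels:
          app: nginx        # assumption: the nginx pods carry this label
  ingress:
    - action: Drop
      from:
        # Select pods labeled role=no-nginx in any cluster of the ClusterSet
        - scope: ClusterSet
          podSelector:
            matchLabels:
              role: no-nginx
```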

Ingress rule

Info

Before getting started on this chapter we need to enable the **enableStretchedNetworkPolicy** feature in the configMap of ALL antrea-mc-controllers in the ClusterSet (including the leader cluster). This is shown below.

I will have to edit the configMap in both member clusters and the leader cluster, setting enableStretchedNetworkPolicy to true:

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  controller_manager_config.yaml: |
    apiVersion: multicluster.crd.antrea.io/v1alpha1
    kind: MultiClusterConfig
    health:
      healthProbeBindAddress: :8080
    metrics:
      bindAddress: "0"
    webhook:
      port: 9443
    leaderElection:
      leaderElect: false
    serviceCIDR: ""
    podCIDRs:
      - "10.133.0.0/16"
    gatewayIPPrecedence: "private"
    endpointIPType: "ClusterIP"
    enableStretchedNetworkPolicy: true #set to true    
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"controller_manager_config.yaml":"apiVersion: multicluster.crd.antrea.io/v1alpha1\nkind: MultiClusterConfig\nhealth:\n  healthProbeBindAddress: :8080\nmetrics:\n  bindAddress: \"0\"\nwebhook:\n  port: 9443\nleaderElection:\n  leaderElect: false\nserviceCIDR: \"\"\npodCIDRs:\n  - \"\"\ngatewayIPPrecedence: \"private\"\nendpointIPType: \"ClusterIP\"\nenableStretchedNetworkPolicy: false\n"},"kind":"ConfigMap","metadata":{"annotations":{},"labels":{"app":"antrea"},"name":"antrea-mc-controller-config","namespace":"kube-system"}}      
  creationTimestamp: "2023-09-29T05:57:12Z"
  labels:
    app: antrea
  name: antrea-mc-controller-config
  namespace: kube-system
  resourceVersion: "415934"
  uid: fd760052-a18e-4623-b217-d9b96ae36cac

Restart the antrea-mc-controller after editing the above configMap.

Again I will take the two examples from the Antrea GitHub doc pages and adjust them to suit my environment. The first policy example applies to namespaces with the label environment=protected, dropping ingress traffic from pods in any namespace with the label environment=untrust, with the scope set to ClusterSet. This means it should filter any traffic coming from namespaces labeled environment=untrust in any member cluster in the ClusterSet. Let's see how this works.

apiVersion: crd.antrea.io/v1alpha1
kind: ClusterNetworkPolicy
metadata:
  name: drop-untrust-access-to-protected-namespace
spec:
  appliedTo:
  - namespaceSelector:
      matchLabels:
        environment: protected
  priority: 1
  tier: securityops
  ingress:
  - action: Drop
    from:
    # Select all Pods in environment=untrust Namespaces in the ClusterSet
    - scope: ClusterSet
      namespaceSelector:
        matchLabels:
          environment: untrust

This policy will be applied on the member clusters where I have such "protected" namespaces and want to ensure no incoming (ingress) traffic from any "untrust" environments.

Let's start by applying the policy and checking that it has been applied:

andreasm@tkg-bootstrap:~$ k apply -f drop-untrust-to-protected.yaml
clusternetworkpolicy.crd.antrea.io/drop-untrust-access-to-protected-namespace created
## check it
andreasm@tkg-bootstrap:~$ k get acnp
NAME                                         TIER          PRIORITY   DESIRED NODES   CURRENT NODES   AGE
drop-untrust-access-to-protected-namespace   securityops   1          0               0               54s

Nothing enforced yet.

Back to my Ubuntu pod again, which resides in the namespace prod in member-cluster blue (tkg-cluster-2). I will label the namespace with environment=protected.

andreasm@tkg-bootstrap:~$ k label namespaces prod environment=protected
namespace/prod labeled
## checking the policy now
andreasm@tkg-bootstrap:~$ k get acnp
NAME                                         TIER          PRIORITY   DESIRED NODES   CURRENT NODES   AGE
drop-untrust-access-to-protected-namespace   securityops   1          1               1               3m10s

In my other member cluster, red, I will create a new namespace, spin up another Ubuntu pod there, and label the namespace environment=untrust.

andreasm@tkg-bootstrap:~$ k get pods -n untrust-ns -owide
NAME                          READY   STATUS    RESTARTS   AGE   IP             NODE                                              NOMINATED NODE   READINESS GATES
ubuntu-20-04-cbb58d77-bl5w5   1/1     Running   0          43s   10.135.1.228   tkg-cluster-3-md-1-824bx-5bdb559f7bxqgbb8-bqwnr   <none>           <none>

It is now running in the untrust-ns namespace.

From this pod I will try to ping the "protected" pod in member-cluster blue before I label the namespace as untrust.

root@ubuntu-20-04-cbb58d77-bl5w5:/# ping 10.133.1.57
PING 10.133.1.57 (10.133.1.57) 56(84) bytes of data.
64 bytes from 10.133.1.57: icmp_seq=1 ttl=60 time=7.93 ms
64 bytes from 10.133.1.57: icmp_seq=2 ttl=60 time=4.17 ms
64 bytes from 10.133.1.57: icmp_seq=3 ttl=60 time=2.99 ms

This "protected" pod is also running an nginx instance, so let's see if I can curl it from my untrust pod:

root@ubuntu-20-04-cbb58d77-bl5w5:/# curl 10.133.1.57
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

This works also. Now I will label the namespace accordingly.

andreasm@tkg-bootstrap:~$ k get ns untrust-ns --show-labels
NAME         STATUS   AGE    LABELS
untrust-ns   Active   9m3s   environment=untrust,kubernetes.io/metadata.name=untrust-ns

Can I now both curl and ping the protected pod from the untrust pod?

root@ubuntu-20-04-cbb58d77-bl5w5:/# ping 10.133.1.57
PING 10.133.1.57 (10.133.1.57) 56(84) bytes of data.
^C
--- 10.133.1.57 ping statistics ---
96 packets transmitted, 0 received, 100% packet loss, time 97275ms
## curl
root@ubuntu-20-04-cbb58d77-bl5w5:/# curl 10.133.1.57
curl: (28) Failed to connect to 10.133.1.57 port 80: Connection timed out

That's a no.

How about that. Now I would like to see the resourceImports/resourceExports in the leader cluster:

andreasm@tkg-bootstrap:~$ k get resourceimports.multicluster.crd.antrea.io -A
NAMESPACE             NAME                              KIND            NAMESPACE             NAME                              AGE
antrea-multicluster   07114e55523175d2                  LabelIdentity                                                           7m4s
antrea-multicluster   085dd73c98e1d875                  LabelIdentity                                                           7m4s
antrea-multicluster   0c56bac2726cdc89                  LabelIdentity                                                           7m4s
antrea-multicluster   0f8eaa8d1ed0d024                  LabelIdentity                                                           7m4s
antrea-multicluster   1742a902fef9ecf2                  LabelIdentity                                                           7m4s
antrea-multicluster   1a7d18d61d0c0ee1                  LabelIdentity                                                           7m4s
antrea-multicluster   1f395d26ddf2e628                  LabelIdentity                                                           7m4s
antrea-multicluster   23f00caa60df7444                  LabelIdentity                                                           7m4s
antrea-multicluster   2ae09744db3c2971                  LabelIdentity                                                           7m4s
antrea-multicluster   2de39d651a0361e9                  LabelIdentity                                                           7m4s
antrea-multicluster   2ef5bcdb8443a24c                  LabelIdentity                                                           7m4s
antrea-multicluster   339dbb049e2e9a92                  LabelIdentity                                                           7m4s
antrea-multicluster   430e80a9621c621a                  LabelIdentity                                                           7m4s
antrea-multicluster   4c9c7b4329d0e128                  LabelIdentity                                                           3m55s
antrea-multicluster   5629f9c0856c3bab                  LabelIdentity                                                           7m4s
antrea-multicluster   593cb26f6e1ae9e3                  LabelIdentity                                                           7m4s
antrea-multicluster   66b072b8efc1faa7                  LabelIdentity                                                           7m4s
antrea-multicluster   67410707ad7a9908                  LabelIdentity                                                           7m4s
antrea-multicluster   7468af4ac6f5dfa7                  LabelIdentity                                                           7m4s
antrea-multicluster   7c59020a5dcbb1b9                  LabelIdentity                                                           7m4s
antrea-multicluster   7dac813f5932e57e                  LabelIdentity                                                           7m4s
antrea-multicluster   7f43c50b4566cd91                  LabelIdentity                                                           7m4s
antrea-multicluster   8327de14325c06f9                  LabelIdentity                                                           7m4s
antrea-multicluster   9227dd1f8d5eef10                  LabelIdentity                                                           7m4s
antrea-multicluster   9a2e5dbff4effe99                  LabelIdentity                                                           7m4s
antrea-multicluster   9a4b2085e53f890c                  LabelIdentity                                                           7m4s
antrea-multicluster   9b5c3a1ff3c1724f                  LabelIdentity                                                           7m4s
antrea-multicluster   9ba8fb64d35434a6                  LabelIdentity                                                           7m4s
antrea-multicluster   a59e6e24ceaabe76                  LabelIdentity                                                           7m4s
antrea-multicluster   a642d62a95b68860                  LabelIdentity                                                           7m4s
antrea-multicluster   afe73316119e5beb                  LabelIdentity                                                           7m4s
antrea-multicluster   b07efcf6d7df9ecc                  LabelIdentity                                                           7m4s
antrea-multicluster   b0ef5ea4e6654296                  LabelIdentity                                                           7m4s
antrea-multicluster   b4ab02dcfded7a88                  LabelIdentity                                                           7m4s
antrea-multicluster   b9f26e2c922bdfce                  LabelIdentity                                                           7m4s
antrea-multicluster   be152630c03e5d6b                  LabelIdentity                                                           7m4s
antrea-multicluster   c316283f47088c45                  LabelIdentity                                                           7m4s
antrea-multicluster   c7703628c133a9ae                  LabelIdentity                                                           7m4s
antrea-multicluster   db564a4a19f62e39                  LabelIdentity                                                           7m4s
antrea-multicluster   db672f99c9b13343                  LabelIdentity                                                           7m4s
antrea-multicluster   db674d682cb5db88                  LabelIdentity                                                           7m4s
antrea-multicluster   ef097265c27216d2                  LabelIdentity                                                           7m4s
antrea-multicluster   f8e5f6fba3fb9a5c                  LabelIdentity                                                           7m4s
antrea-multicluster   fc0e44265f8dce47                  LabelIdentity                                                           7m4s
antrea-multicluster   fc156481c7b8ebf2                  LabelIdentity                                                           7m4s
antrea-multicluster   member-cluster-blue-clusterinfo   ClusterInfo     antrea-multicluster   member-cluster-blue-clusterinfo   31h
antrea-multicluster   member-cluster-red-clusterinfo    ClusterInfo     antrea-multicluster   member-cluster-red-clusterinfo    31h
antrea-multicluster   nginx-nginx-endpoints             Endpoints       nginx                 nginx                             31h
antrea-multicluster   nginx-nginx-service               ServiceImport   nginx                 nginx                             31h
antrea-multicluster   yelb-yelb-appserver-endpoints     Endpoints       yelb                  yelb-appserver                    31h
antrea-multicluster   yelb-yelb-appserver-service       ServiceImport   yelb                  yelb-appserver                    31h
andreasm@tkg-bootstrap:~$ k get resourceexports.multicluster.crd.antrea.io -A
NAMESPACE             NAME                                               CLUSTER ID            KIND            NAMESPACE     NAME                  AGE
antrea-multicluster   member-cluster-blue-085dd73c98e1d875               member-cluster-blue   LabelIdentity                                       11m
antrea-multicluster   member-cluster-blue-0f8eaa8d1ed0d024               member-cluster-blue   LabelIdentity                                       11m
 58antrea-multicluster   member-cluster-blue-23f00caa60df7444               member-cluster-blue   LabelIdentity                                       11m
 59antrea-multicluster   member-cluster-blue-2ae09744db3c2971               member-cluster-blue   LabelIdentity                                       11m
 60antrea-multicluster   member-cluster-blue-2de39d651a0361e9               member-cluster-blue   LabelIdentity                                       11m
 61antrea-multicluster   member-cluster-blue-2ef5bcdb8443a24c               member-cluster-blue   LabelIdentity                                       11m
 62antrea-multicluster   member-cluster-blue-5629f9c0856c3bab               member-cluster-blue   LabelIdentity                                       11m
 63antrea-multicluster   member-cluster-blue-593cb26f6e1ae9e3               member-cluster-blue   LabelIdentity                                       11m
 64antrea-multicluster   member-cluster-blue-7c59020a5dcbb1b9               member-cluster-blue   LabelIdentity                                       11m
 65antrea-multicluster   member-cluster-blue-9a4b2085e53f890c               member-cluster-blue   LabelIdentity                                       11m
 66antrea-multicluster   member-cluster-blue-9b5c3a1ff3c1724f               member-cluster-blue   LabelIdentity                                       11m
 67antrea-multicluster   member-cluster-blue-a642d62a95b68860               member-cluster-blue   LabelIdentity                                       11m
 68antrea-multicluster   member-cluster-blue-afe73316119e5beb               member-cluster-blue   LabelIdentity                                       11m
 69antrea-multicluster   member-cluster-blue-b07efcf6d7df9ecc               member-cluster-blue   LabelIdentity                                       11m
 70antrea-multicluster   member-cluster-blue-b0ef5ea4e6654296               member-cluster-blue   LabelIdentity                                       11m
 71antrea-multicluster   member-cluster-blue-b4ab02dcfded7a88               member-cluster-blue   LabelIdentity                                       11m
 72antrea-multicluster   member-cluster-blue-b9f26e2c922bdfce               member-cluster-blue   LabelIdentity                                       11m
 73antrea-multicluster   member-cluster-blue-c7703628c133a9ae               member-cluster-blue   LabelIdentity                                       11m
 74antrea-multicluster   member-cluster-blue-clusterinfo                    member-cluster-blue   ClusterInfo     kube-system   member-cluster-blue   31h
 75antrea-multicluster   member-cluster-blue-db672f99c9b13343               member-cluster-blue   LabelIdentity                                       11m
 76antrea-multicluster   member-cluster-blue-fc156481c7b8ebf2               member-cluster-blue   LabelIdentity                                       11m
 77antrea-multicluster   member-cluster-red-07114e55523175d2                member-cluster-red    LabelIdentity                                       11m
 78antrea-multicluster   member-cluster-red-1742a902fef9ecf2                member-cluster-red    LabelIdentity                                       11m
 79antrea-multicluster   member-cluster-red-1a7d18d61d0c0ee1                member-cluster-red    LabelIdentity                                       11m
 80antrea-multicluster   member-cluster-red-1f395d26ddf2e628                member-cluster-red    LabelIdentity                                       11m
 81antrea-multicluster   member-cluster-red-339dbb049e2e9a92                member-cluster-red    LabelIdentity                                       11m
 82antrea-multicluster   member-cluster-red-430e80a9621c621a                member-cluster-red    LabelIdentity                                       11m
 83antrea-multicluster   member-cluster-red-4c9c7b4329d0e128                member-cluster-red    LabelIdentity                                       4m23s
 84antrea-multicluster   member-cluster-red-66b072b8efc1faa7                member-cluster-red    LabelIdentity                                       11m
 85antrea-multicluster   member-cluster-red-67410707ad7a9908                member-cluster-red    LabelIdentity                                       11m
 86antrea-multicluster   member-cluster-red-7468af4ac6f5dfa7                member-cluster-red    LabelIdentity                                       11m
 87antrea-multicluster   member-cluster-red-7dac813f5932e57e                member-cluster-red    LabelIdentity                                       11m
 88antrea-multicluster   member-cluster-red-7f43c50b4566cd91                member-cluster-red    LabelIdentity                                       11m
 89antrea-multicluster   member-cluster-red-8327de14325c06f9                member-cluster-red    LabelIdentity                                       11m
 90antrea-multicluster   member-cluster-red-9227dd1f8d5eef10                member-cluster-red    LabelIdentity                                       11m
 91antrea-multicluster   member-cluster-red-9a2e5dbff4effe99                member-cluster-red    LabelIdentity                                       11m
 92antrea-multicluster   member-cluster-red-9ba8fb64d35434a6                member-cluster-red    LabelIdentity                                       11m
 93antrea-multicluster   member-cluster-red-a59e6e24ceaabe76                member-cluster-red    LabelIdentity                                       11m
 94antrea-multicluster   member-cluster-red-be152630c03e5d6b                member-cluster-red    LabelIdentity                                       11m
 95antrea-multicluster   member-cluster-red-c316283f47088c45                member-cluster-red    LabelIdentity                                       11m
 96antrea-multicluster   member-cluster-red-clusterinfo                     member-cluster-red    ClusterInfo     kube-system   member-cluster-red    31h
 97antrea-multicluster   member-cluster-red-db564a4a19f62e39                member-cluster-red    LabelIdentity                                       11m
 98antrea-multicluster   member-cluster-red-db674d682cb5db88                member-cluster-red    LabelIdentity                                       11m
 99antrea-multicluster   member-cluster-red-ef097265c27216d2                member-cluster-red    LabelIdentity                                       11m
100antrea-multicluster   member-cluster-red-f8e5f6fba3fb9a5c                member-cluster-red    LabelIdentity                                       11m
101antrea-multicluster   member-cluster-red-fc0e44265f8dce47                member-cluster-red    LabelIdentity                                       11m
102antrea-multicluster   member-cluster-red-nginx-nginx-endpoints           member-cluster-red    Endpoints       nginx         nginx                 31h
103antrea-multicluster   member-cluster-red-nginx-nginx-service             member-cluster-red    Service         nginx         nginx                 31h
104antrea-multicluster   member-cluster-red-yelb-yelb-appserver-endpoints   member-cluster-red    Endpoints       yelb          yelb-appserver        31h
105antrea-multicluster   member-cluster-red-yelb-yelb-appserver-service     member-cluster-red    Service         yelb          yelb-appserver        31h

And if I describe one of them:

 1k describe resourceimports.multicluster.crd.antrea.io -n antrea-multicluster 07114e55523175d2
 2Name:         07114e55523175d2
 3Namespace:    antrea-multicluster
 4Labels:       <none>
 5Annotations:  <none>
 6API Version:  multicluster.crd.antrea.io/v1alpha1
 7Kind:         ResourceImport
 8Metadata:
 9  Creation Timestamp:  2023-09-30T13:33:02Z
10  Generation:          1
11  Resource Version:    442986
12  UID:                 4157d671-6c4d-4653-b97f-554bdfa705d9
13Spec:
14  Kind:  LabelIdentity
15  Label Identity:
16    Id:     35
17    Label:  ns:kubernetes.io/metadata.name=nginx&pod:app=nginx-ui,pod-template-hash=59c956b95b,tier=frontend
18Events:     <none>

The leader cluster is now aware of all labels, namespaces and pods from the other member clusters, which allows us to create this ingress rule with namespace or pod selection from the other clusters. The appliedTo field is only relevant for the cluster the policy is applied in.
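The Label field in the ResourceImport above packs the Namespace labels and the Pod labels into a single string, separated by `&`. Purely as an illustration of that encoding (this is my own snippet, not part of Antrea), it can be split like this:

```shell
# LabelIdentity string as shown in the ResourceImport above (shortened)
label='ns:kubernetes.io/metadata.name=nginx&pod:app=nginx-ui,tier=frontend'

ns_part="${label%%&*}"   # namespace labels -> what a namespaceSelector would match
pod_part="${label#*&}"   # pod labels       -> what a podSelector would match

echo "$ns_part"   # ns:kubernetes.io/metadata.name=nginx
echo "$pod_part"  # pod:app=nginx-ui,tier=frontend
```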

The second ingress policy example below uses a "namespaced" policy (Antrea NetworkPolicy). It is applied to a specific namespace in the "destination" cluster, using a podSelector with the label app=db to select specific pods. The policy is placed in the application tier; it allows sources coming from any pods in the ClusterSet matching the label app=client to reach the pods in the specified namespace with the label app=db, and drops everything else.

 1apiVersion: crd.antrea.io/v1alpha1
 2kind: NetworkPolicy
 3metadata:
 4  name: db-svc-allow-ingress-from-client-only
 5  namespace: prod-us-west
 6spec:
 7  appliedTo:
 8  - podSelector:
 9      matchLabels:
10        app: db
11  priority: 1
12  tier: application
13  ingress:
14  - action: Allow
15    from:
16    # Select all Pods in Namespace "prod-us-west" from all clusters in the ClusterSet (if the
17    # Namespace exists in that cluster) whose labels match app=client
18    - scope: ClusterSet
19      podSelector:
20        matchLabels:
21          app: client
22  - action: Drop

I will not test this, as it works similarly to the first example, the only difference being that it is applied to a namespace instead of cluster-wide (using an Antrea ClusterNetworkPolicy).

Next up is creating a ClusterNetworkPolicy that is replicated across all clusters.

Multi-cluster ClusterNetworkPolicy replication (ACNP)

In this last chapter I will test out the Multi-cluster possibility to replicate a ClusterNetworkPolicy across all members in my ClusterSet, member-cluster-blue and member-cluster-red.

I will just take the example from the official Antrea GitHub docs page and use it as it is. Before I apply it I will just quickly check whether I have any policies applied in any of my member clusters:

 1## tkg-cluster-2 - member-blue
 2k config current-context
 3tkg-cluster-2-admin@tkg-cluster-2
 4## Any policies?
 5k get acnp
 6No resources found
 7k get anp -A
 8No resources found
 9## tkg-cluster-3 - member-red
10k config current-context
11tkg-cluster-3-admin@tkg-cluster-3
12## Any policies?
13k get acnp
14No resources found
15k get anp -A
16No resources found

No policies.

To replicate a policy to all members I will switch context to the leader cluster, create a ResourceExport that wraps the ClusterNetworkPolicy itself, and apply it on the leader cluster.

Info

The ResourceExport needs to be created in the same namespace as the one the antrea-mc-controller is running in!

Below is the example I will be using, taken from the Antrea GitHub docs page:

 1apiVersion: multicluster.crd.antrea.io/v1alpha1
 2kind: ResourceExport
 3metadata:
 4  name: strict-namespace-isolation-for-test-clusterset
 5  namespace: antrea-multicluster # Namespace that Multi-cluster Controller is deployed
 6spec:
 7  kind: AntreaClusterNetworkPolicy
 8  name: strict-namespace-isolation # In each importing cluster, an ACNP of name antrea-mc-strict-namespace-isolation will be created with the spec below
 9  clusterNetworkPolicy:
10    priority: 1
11    tier: securityops
12    appliedTo:
13      - namespaceSelector: {} # Selects all Namespaces in the member cluster
14    ingress:
15      - action: Pass
16        from:
17          - namespaces:
18              match: Self # Skip drop rule for traffic from Pods in the same Namespace
19          - podSelector:
20              matchLabels:
21                k8s-app: kube-dns # Skip drop rule for traffic from the core-dns components
22      - action: Drop
23        from:
24          - namespaceSelector: {} # Drop from Pods from all other Namespaces

Now apply it in my leader-cluster:

1## tkg-cluster-1 - leader-cluster
2k config current-context
3tkg-cluster-1-admin@tkg-cluster-1
4## apply
5k apply -f acnp-replicated.yaml
6resourceexport.multicluster.crd.antrea.io/strict-namespace-isolation-for-test-clusterset created

Now I will switch over to the contexts of my member clusters and check what has happened there:

1## tkg-cluster-2 - member-blue
2k get acnp
3NAME                                   TIER          PRIORITY   DESIRED NODES   CURRENT NODES   AGE
4antrea-mc-strict-namespace-isolation   securityops   1          3               3               47s
1## tkg-cluster-3 - member-red
2k get acnp
3NAME                                   TIER          PRIORITY   DESIRED NODES   CURRENT NODES   AGE
4antrea-mc-strict-namespace-isolation   securityops   1          3               3               108s

This is really great!! Both my member clusters have gotten the policy applied.

On the leader-cluster?

1## tkg-cluster-1
2k get acnp
3No resources found

Nothing.

The only bad thing now is that this policy broke my Yelb application, as my yelb-ui can no longer reach the backends in the other cluster 😄 so I will have to add some additional policies to support this application as well. That is perfectly normal, and I have covered a bunch of Antrea policies in this post and this post that can be re-used for this purpose.
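To sketch what such an additional policy could look like (assuming Yelb's usual labels app=yelb-ui and app=yelb-appserver — I have not applied this), a ClusterNetworkPolicy in the same securityops tier with a lower priority number would be evaluated before the replicated drop rule:

```yaml
apiVersion: crd.antrea.io/v1alpha1
kind: ClusterNetworkPolicy
metadata:
  name: allow-yelb-ui-to-appserver
spec:
  priority: 0.5          # lower number = evaluated before the replicated policy (priority 1)
  tier: securityops
  appliedTo:
    - podSelector:
        matchLabels:
          app: yelb-appserver
  ingress:
    - action: Allow
      from:
        # yelb-ui pods from any cluster in the ClusterSet
        - scope: ClusterSet
          podSelector:
            matchLabels:
              app: yelb-ui
```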

As a last thing, I will check the status of my replicated policy from the leader cluster to see whether there are any alerts on any of the member clusters, e.g. why they have not applied the policy.

1k get resourceexports.multicluster.crd.antrea.io -A
2NAMESPACE             NAME                                               CLUSTER ID            KIND                         NAMESPACE     NAME                         AGE
3
4antrea-multicluster   strict-namespace-isolation-for-test-clusterset                           AntreaClusterNetworkPolicy                 strict-namespace-isolation   8m39s
 1k describe resourceexports.multicluster.crd.antrea.io -n antrea-multicluster strict-namespace-isolation-for-test-clusterset
 2Name:         strict-namespace-isolation-for-test-clusterset
 3Namespace:    antrea-multicluster
 4Labels:       sourceKind=AntreaClusterNetworkPolicy
 5              sourceName=strict-namespace-isolation
 6              sourceNamespace=
 7Annotations:  <none>
 8API Version:  multicluster.crd.antrea.io/v1alpha1
 9Kind:         ResourceExport
10Metadata:
11  Creation Timestamp:  2023-09-30T14:02:42Z
12  Finalizers:
13    resourceexport.finalizers.antrea.io
14  Generation:        1
15  Resource Version:  448359
16  UID:               3d7111dd-eecc-46bb-8307-c95d747474b0
17Spec:
18  Cluster Network Policy:
19    Applied To:
20      Namespace Selector:
21    Ingress:
22      Action:          Pass
23      Enable Logging:  false
24      From:
25        Namespaces:
26          Match:  Self
27        Pod Selector:
28          Match Labels:
29            k8s-app:   kube-dns
30      Action:          Drop
31      Enable Logging:  false
32      From:
33        Namespace Selector:
34    Priority:  1
35    Tier:      securityops
36  Kind:        AntreaClusterNetworkPolicy
37  Name:        strict-namespace-isolation
38Status:
39  Conditions:
40    Last Transition Time:  2023-09-30T14:02:42Z
41    Status:                True
42    Type:                  Succeeded
43Events:                    <none>

Looks good.

Bonus content

With help from my friend ChatGPT I created a menu-driven "automated" way of deploying all of the above steps.

The prerequisites for this script are that all your Kubernetes clusters are already deployed and the Antrea Multi-cluster feature gates are enabled. The script should be executed from a machine that has all the Kubernetes cluster contexts added. It will prompt for the different contexts in some of the menus and switch to these contexts to execute specific commands in the selected clusters.

I will just quickly go through the script/menus. When the script is executed it will bring up this menu:

 1Main Menu:
 21. Select Antrea version
 32. Install Antrea Multi-cluster on leader cluster
 43. Install Antrea Multi-cluster on member cluster
 54. Create member-cluster secrets
 65. Apply member tokens
 76. Create ClusterSet on the leader cluster
 87. Create ClusterClaim on member cluster
 98. Create Multi-cluster Gateway
109. Create a Multi-cluster service
1110. Exit
12Enter your choice:

Explanation of the different menu selections and what it does:

  1. This will prompt you for the specific Antrea version you are using and use that as a tag for downloading and applying the correct YAML files.

  2. This will let you select your "Leader Cluster", create the namespace antrea-multicluster, apply the leader YAML manifests, and deploy the antrea-mc-controller in the cluster.

  3. This will let you select a member cluster to install the member YAML and the antrea-mc-controller. This needs to be done for each of the member clusters you want to install it on.

  4. This will create the member-cluster secrets, asking for the leader-cluster context for them to be created in, and then export the token YAMLs to be applied in the next step.

    1Enter your choice: 4
    2Creating member-cluster secrets...
    31) 10.13.90.1
    42) cluster-1
    53) ns-stc-1
    64) Back to Main Menu
    7Select a context as the leader cluster: 2
    8Switched to context "cluster-1".
    9Enter the name for Member Cluster (e.g., member-blue): member-red
    
  5. This will ask you for the context in which the respective token YAMLs should be applied. It will list all the YAML files in the current folder so you can choose which token to apply.

     1Enter your choice: 5
     2Applying member tokens...
     31) 10.13.90.1
     42) cluster-1
     53) ns-stc-1
     64) Back to Main Menu
     7Select a context to switch to: 2
     8Switched to context "cluster-1".
     91) member-blue-token.yaml
    102) Back to Main Menu
    11Select a YAML file to apply:
    
  6. This will create the ClusterSet, prompting for the leader cluster context and asking for the ClusterID and ClusterSet name.

     1Enter your choice: 6
     2Creating ClusterSet on the leader cluster...
     31) 10.13.90.1
     42) cluster-1
     53) ns-stc-1
     64) Back to Main Menu
     7Select the leader cluster context: 2
     8Switched to context "cluster-1".
     9Enter ClusterID (e.g., tkg-cluster-leader): leader-cluster
    10Enter ClusterSet name (e.g., andreasm-clusterset): super-clusterset
    
  7. This will create the ClusterClaim on the member cluster to join the leader cluster/ClusterSet.

     1Enter your choice: 7
     2Creating ClusterClaim on member cluster...
     31) 10.13.90.1
     42) cluster-1
     53) ns-stc-1
     64) Back to Main Menu
     7Select a context to switch to: 2
     8Switched to context "cluster-1".
     9Enter member-cluster-name (e.g., member-cluster-red): member-cluster-blue
    10Enter ClusterSet name (e.g., andreasm-clusterset): super-clusterset
    11Enter Leader ClusterID (e.g., tkg-cluster-leader): leader-cluster
    12Enter Member Token to use: member-blue-token
    13Enter Leader cluster API endpoint (e.g., https://10.101.114.100:6443): https://10.101.115.120:6443
    
  8. This will create the Multi-cluster Gateway by letting you select which node in which cluster to annotate.

     1Enter your choice: 8
     2Creating Multi-cluster Gateway...
     31) 10.13.90.1
     42) cluster-1
     53) ns-stc-1
     64) Back to Main Menu
     7Select a context to switch to: 2
     8Switched to context "cluster-1".
     91) cluster-1-f82lv-fdvw8			  3) cluster-1-node-pool-01-tb4tw-555756bd56-klgcs
    102) cluster-1-node-pool-01-tb4tw-555756bd56-76qv6  4) Back to Context Menu
    11Select a node to annotate as Multi-cluster Gateway: 2
    12node/cluster-1-node-pool-01-tb4tw-555756bd56-76qv6 annotated
    13Annotated cluster-1-node-pool-01-tb4tw-555756bd56-76qv6 as Multi-cluster Gateway.
    14Select a node to annotate as Multi-cluster Gateway: 4
    15Do you want to annotate another node? (yes/no): # Selecting yes brings up the node list again. Selecting no takes you back to the main menu. This needs to be done on every member cluster where you need to define a gateway node
    
  9. This will let you select a context, list all services defined in that cluster, let you select one from a menu and then export it as a Multi-cluster service.

     1Enter your choice: 9
     2Creating a Multi-cluster service...
     31) 10.13.90.1
     42) cluster-1
     53) ns-stc-1
     64) Back to Main Menu
     7Select a context to switch to: 2
     8Switched to context "cluster-1".
     91) antrea-multicluster		   5) kube-public		     9) vmware-system-antrea	      13) vmware-system-tkg
    102) default			   6) kube-system		    10) vmware-system-auth	      14) yelb
    113) fruit			   7) secretgen-controller	    11) vmware-system-cloud-provider  15) Back to Context Menu
    124) kube-node-lease		   8) tkg-system		    12) vmware-system-csi
    13Select a namespace to list services from: 14
    141) redis-server
    152) yelb-appserver
    163) yelb-db
    174) yelb-ui
    185) Back to Namespace Menu
    19Select a service to export as Multi-cluster service: 2
    20ServiceExport created for yelb-appserver in namespace yelb.
    21serviceexport.multicluster.x-k8s.io/yelb-appserver unchanged
    22Multi-cluster service applied.
    23Select a service to export as Multi-cluster service: # hit enter to bring up menu
    241) antrea-multicluster		   5) kube-public		     9) vmware-system-antrea	      13) vmware-system-tkg
    252) default			   6) kube-system		    10) vmware-system-auth	      14) yelb
    263) fruit			   7) secretgen-controller	    11) vmware-system-cloud-provider  15) Back to Context Menu
    274) kube-node-lease		   8) tkg-system		    12) vmware-system-csi
    28Select a service to export as Multi-cluster service: 15
    
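For reference, the manifest that menu option 9 applies behind the scenes is a standard Multi-cluster Service API ServiceExport. For the yelb-appserver selection above it would look like this (a sketch; names taken from the menu output):

```yaml
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: yelb-appserver   # the Service to export
  namespace: yelb        # must match the Service's namespace
```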

Here is the script:

  1#!/bin/bash
  2
  3# Function to create member-cluster secrets
  4create_member_cluster_secrets() {
  5    echo "Creating member-cluster secrets..."
  6
  7    # List available contexts and create a menu
  8    contexts=($(kubectl config get-contexts -o=name))
  9
 10    # Display the menu for selecting a context as the leader cluster
 11    PS3="Select a context as the leader cluster: "
 12    select LEADER_CONTEXT in "${contexts[@]}" "Back to Main Menu"; do
 13        if [[ -n "$LEADER_CONTEXT" ]]; then
 14            if [ "$LEADER_CONTEXT" == "Back to Main Menu" ]; then
 15                break
 16            fi
 17
 18            # Set the selected context as the leader cluster context
 19            kubectl config use-context "$LEADER_CONTEXT"
 20
 21            read -p "Enter the name for Member Cluster (e.g., member-blue): " MEMBER_CLUSTER_NAME
 22
 23            # Create YAML content for the member cluster
 24            cat <<EOF > member-cluster.yml
 25apiVersion: v1
 26kind: ServiceAccount
 27metadata:
 28  name: $MEMBER_CLUSTER_NAME
 29  namespace: antrea-multicluster
 30---
 31apiVersion: v1
 32kind: Secret
 33metadata:
 34  name: ${MEMBER_CLUSTER_NAME}-token
 35  namespace: antrea-multicluster
 36  annotations:
 37    kubernetes.io/service-account.name: $MEMBER_CLUSTER_NAME
 38type: kubernetes.io/service-account-token
 39---
 40apiVersion: rbac.authorization.k8s.io/v1
 41kind: RoleBinding
 42metadata:
 43  name: $MEMBER_CLUSTER_NAME
 44  namespace: antrea-multicluster
 45roleRef:
 46  apiGroup: rbac.authorization.k8s.io
 47  kind: Role
 48  name: antrea-mc-member-cluster-role
 49subjects:
 50  - kind: ServiceAccount
 51    name: $MEMBER_CLUSTER_NAME
 52    namespace: antrea-multicluster
 53EOF
 54
 55            # Apply the YAML content for the member cluster
 56            kubectl apply -f member-cluster.yml
 57
 58            # Create the member cluster secret file
 59            kubectl get secret ${MEMBER_CLUSTER_NAME}-token -n antrea-multicluster -o yaml | grep -w -e '^apiVersion' -e '^data' -e '^metadata' -e '^ *name:'  -e   '^kind' -e '  ca.crt' -e '  token:' -e '^type' -e '  namespace' | sed -e 's/kubernetes.io\/service-account-token/Opaque/g' -e "s/antrea-multicluster/kube-system/g" > "${MEMBER_CLUSTER_NAME}-token.yaml"
 60
 61            echo "Member cluster secrets created and YAML file generated: ${MEMBER_CLUSTER_NAME}-token.yaml."
 62            sleep 2
 63            break
 64        else
 65            echo "Invalid selection. Please choose a context or 'Back to Main Menu'."
 66        fi
 67    done
 68}
 69
 70# Function to apply member tokens
 71apply_member_tokens() {
 72    echo "Applying member tokens..."
 73
 74    # List available contexts and create a menu
 75    contexts=($(kubectl config get-contexts -o=name))
 76
 77    # Display the menu for selecting a context to switch to
 78    PS3="Select a context to switch to: "
 79    select SWITCH_CONTEXT in "${contexts[@]}" "Back to Main Menu"; do
 80        if [[ -n "$SWITCH_CONTEXT" ]]; then
 81            if [ "$SWITCH_CONTEXT" == "Back to Main Menu" ]; then
 82                break
 83            fi
 84
 85            kubectl config use-context "$SWITCH_CONTEXT"
 86
 87            # List YAML files in the current folder and create a menu
 88            yaml_files=($(ls *.yaml))
 89
 90            # Display the menu for selecting a YAML file to apply
 91            PS3="Select a YAML file to apply: "
 92            select SELECTED_YAML in "${yaml_files[@]}" "Back to Main Menu"; do
 93                if [[ -n "$SELECTED_YAML" ]]; then
 94                    if [ "$SELECTED_YAML" == "Back to Main Menu" ]; then
 95                        break
 96                    fi
 97
 98                    kubectl apply -f "$SELECTED_YAML"
 99
100                    echo "Applied $SELECTED_YAML in context $SWITCH_CONTEXT."
101                    sleep 2
102                    break
103                else
104                    echo "Invalid selection. Please choose a YAML file or 'Back to Main Menu'."
105                fi
106            done
107            break
108        else
109            echo "Invalid selection. Please choose a context or 'Back to Main Menu'."
110        fi
111    done
112}
113
114# Function to create ClusterSet on the leader cluster
115create_clusterset_on_leader() {
116    echo "Creating ClusterSet on the leader cluster..."
117
118    # List available contexts and create a menu
119    contexts=($(kubectl config get-contexts -o=name))
120
121    # Display the menu for selecting the leader cluster context
122    PS3="Select the leader cluster context: "
123    select LEADER_CONTEXT in "${contexts[@]}" "Back to Main Menu"; do
124        if [[ -n "$LEADER_CONTEXT" ]]; then
125            if [ "$LEADER_CONTEXT" == "Back to Main Menu" ]; then
126                break
127            fi
128
129            kubectl config use-context "$LEADER_CONTEXT"
130
131            # Prompt for ClusterID and ClusterSet name
132            read -p "Enter ClusterID (e.g., tkg-cluster-leader): " CLUSTER_ID
133            read -p "Enter ClusterSet name (e.g., andreasm-clusterset): " CLUSTERSET_NAME
134
135            # Create YAML content for ClusterSet
136            cat <<EOF > clusterset.yaml
137apiVersion: multicluster.crd.antrea.io/v1alpha2
138kind: ClusterClaim
139metadata:
140  name: id.k8s.io
141  namespace: antrea-multicluster
142value: $CLUSTER_ID
143---
144apiVersion: multicluster.crd.antrea.io/v1alpha2
145kind: ClusterClaim
146metadata:
147  name: clusterset.k8s.io
148  namespace: antrea-multicluster
149value: $CLUSTERSET_NAME
150---
151apiVersion: multicluster.crd.antrea.io/v1alpha1
152kind: ClusterSet
153metadata:
154  name: $CLUSTERSET_NAME
155  namespace: antrea-multicluster
156spec:
157  leaders:
158    - clusterID: $CLUSTER_ID
159EOF
160
161            # Apply the ClusterSet YAML
162            kubectl apply -f clusterset.yaml
163
164            echo "ClusterSet created on the leader cluster."
165            sleep 2
166            break
167        else
168            echo "Invalid selection. Please choose a context or 'Back to Main Menu'."
169        fi
170    done
171}

# Function to create ClusterClaim on member cluster
create_clusterclaim_on_member() {
    echo "Creating ClusterClaim on member cluster..."

    # List available contexts and create a menu
    contexts=($(kubectl config get-contexts -o=name))

    # Display the menu for selecting a context to switch to
    PS3="Select a context to switch to: "
    select MEMBER_CONTEXT in "${contexts[@]}" "Back to Main Menu"; do
        if [[ -n "$MEMBER_CONTEXT" ]]; then
            if [ "$MEMBER_CONTEXT" == "Back to Main Menu" ]; then
                break
            fi

            kubectl config use-context "$MEMBER_CONTEXT"

            # Prompt for ClusterClaim values
            read -p "Enter member-cluster-name (e.g., member-cluster-red): " MEMBER_CLUSTER_NAME
            read -p "Enter ClusterSet name (e.g., andreasm-clusterset): " CLUSTERSET_NAME
            read -p "Enter Leader ClusterID (e.g., tkg-cluster-leader): " LEADER_CLUSTER_ID
            read -p "Enter the name of the member token Secret to use: " MEMBER_TOKEN
            read -p "Enter Leader cluster API endpoint (e.g., https://10.101.114.100:6443): " LEADER_ENDPOINT

            # Create YAML content for the ClusterClaims and ClusterSet
            cat <<EOF > "${MEMBER_CLUSTER_NAME}-clusterclaim.yaml"
apiVersion: multicluster.crd.antrea.io/v1alpha2
kind: ClusterClaim
metadata:
  name: id.k8s.io
  namespace: kube-system
value: $MEMBER_CLUSTER_NAME
---
apiVersion: multicluster.crd.antrea.io/v1alpha2
kind: ClusterClaim
metadata:
  name: clusterset.k8s.io
  namespace: kube-system
value: $CLUSTERSET_NAME
---
apiVersion: multicluster.crd.antrea.io/v1alpha1
kind: ClusterSet
metadata:
  name: $CLUSTERSET_NAME
  namespace: kube-system
spec:
  leaders:
    - clusterID: $LEADER_CLUSTER_ID
      secret: "$MEMBER_TOKEN"
      server: "$LEADER_ENDPOINT"
  namespace: antrea-multicluster
EOF

            # Apply the ClusterClaim YAML
            kubectl apply -f "${MEMBER_CLUSTER_NAME}-clusterclaim.yaml"

            echo "ClusterClaim created on member cluster."
            sleep 2
            break
        else
            echo "Invalid selection. Please choose a context or 'Back to Main Menu'."
        fi
    done
}

# Function to create Multi-cluster Gateway
create_multi_cluster_gateway() {
    echo "Creating Multi-cluster Gateway..."

    # List available contexts and create a menu
    contexts=($(kubectl config get-contexts -o=name))

    # Display the menu for selecting a context to switch to
    PS3="Select a context to switch to: "
    select GATEWAY_CONTEXT in "${contexts[@]}" "Back to Main Menu"; do
        if [[ -n "$GATEWAY_CONTEXT" ]]; then
            if [ "$GATEWAY_CONTEXT" == "Back to Main Menu" ]; then
                break
            fi

            kubectl config use-context "$GATEWAY_CONTEXT"

            while true; do
                # List nodes and create a menu
                nodes=($(kubectl get nodes -o custom-columns=NAME:.metadata.name --no-headers))

                # Display the menu for selecting a node to annotate
                PS3="Select a node to annotate as Multi-cluster Gateway: "
                select SELECTED_NODE in "${nodes[@]}" "Back to Context Menu"; do
                    if [[ -n "$SELECTED_NODE" ]]; then
                        if [ "$SELECTED_NODE" == "Back to Context Menu" ]; then
                            break
                        fi

                        # Annotate the selected node
                        kubectl annotate node "$SELECTED_NODE" multicluster.antrea.io/gateway=true

                        echo "Annotated $SELECTED_NODE as Multi-cluster Gateway."
                        sleep 2
                        # Return to the annotate-another prompt after each annotation
                        break
                    else
                        echo "Invalid selection. Please choose a node or 'Back to Context Menu'."
                    fi
                done

                read -p "Do you want to annotate another node? (yes/no): " ANNOTATE_ANOTHER
                if [ "$ANNOTATE_ANOTHER" != "yes" ]; then
                    break
                fi
            done
            break
        else
            echo "Invalid selection. Please choose a context or 'Back to Main Menu'."
        fi
    done
}

# Function to create a Multi-cluster service
create_multi_cluster_service() {
    echo "Creating a Multi-cluster service..."

    # List available contexts and create a menu
    contexts=($(kubectl config get-contexts -o=name))

    # Display the menu for selecting a context to switch to
    PS3="Select a context to switch to: "
    select SELECT_CONTEXT in "${contexts[@]}" "Back to Main Menu"; do
        if [[ -n "$SELECT_CONTEXT" ]]; then
            if [ "$SELECT_CONTEXT" == "Back to Main Menu" ]; then
                break
            fi

            kubectl config use-context "$SELECT_CONTEXT"

            # List namespaces and create a menu
            namespaces=($(kubectl get namespaces -o custom-columns=NAME:.metadata.name --no-headers))

            # Display the menu for selecting a namespace
            PS3="Select a namespace to list services from: "
            select SELECTED_NAMESPACE in "${namespaces[@]}" "Back to Context Menu"; do
                if [[ -n "$SELECTED_NAMESPACE" ]]; then
                    if [ "$SELECTED_NAMESPACE" == "Back to Context Menu" ]; then
                        break
                    fi

                    # List services in the selected namespace and create a menu
                    services=($(kubectl get services -n "$SELECTED_NAMESPACE" -o custom-columns=NAME:.metadata.name --no-headers))

                    # Display the menu for selecting a service
                    PS3="Select a service to export as Multi-cluster service: "
                    select SELECTED_SERVICE in "${services[@]}" "Back to Namespace Menu"; do
                        if [[ -n "$SELECTED_SERVICE" ]]; then
                            if [ "$SELECTED_SERVICE" == "Back to Namespace Menu" ]; then
                                break
                            fi

                            # Create YAML content for ServiceExport
                            cat <<EOF > "${SELECTED_SERVICE}-multi-cluster-service.yaml"
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: $SELECTED_SERVICE
  namespace: $SELECTED_NAMESPACE
EOF

                            echo "ServiceExport manifest created for $SELECTED_SERVICE in namespace $SELECTED_NAMESPACE."

                            # Apply the ServiceExport manifest
                            kubectl apply -f "${SELECTED_SERVICE}-multi-cluster-service.yaml"
                            echo "Multi-cluster service applied."

                            sleep 2
                            break
                        else
                            echo "Invalid selection. Please choose a service or 'Back to Namespace Menu'."
                        fi
                    done
                else
                    echo "Invalid selection. Please choose a namespace or 'Back to Context Menu'."
                fi
            done
            break
        else
            echo "Invalid selection. Please choose a context or 'Back to Main Menu'."
        fi
    done
}


# Main menu
while true; do
    clear
    echo "Main Menu:"
    echo "1. Select Antrea version"
    echo "2. Install Antrea Multi-cluster on leader cluster"
    echo "3. Install Antrea Multi-cluster on member cluster"
    echo "4. Create member-cluster secrets"
    echo "5. Apply member tokens"
    echo "6. Create ClusterSet on the leader cluster"
    echo "7. Create ClusterClaim on member cluster"
    echo "8. Create Multi-cluster Gateway"
    echo "9. Create a Multi-cluster service"
    echo "10. Exit"

    read -p "Enter your choice: " choice

    case $choice in
        1)
            read -p "Enter Antrea version (e.g., v1.11.1): " TAG
            ;;
        2)
            # Guard against downloading manifests with an empty version tag
            if [ -z "$TAG" ]; then
                echo "No Antrea version selected. Please run option 1 first."
                sleep 2
                continue
            fi

            echo "Installing Antrea Multi-cluster on leader cluster..."

            # List available contexts and create a menu
            contexts=($(kubectl config get-contexts -o=name))

            # Display the menu
            PS3="Select a context to switch to: "
            select SWITCH_CONTEXT in "${contexts[@]}" "Back to Main Menu"; do
                if [[ -n "$SWITCH_CONTEXT" ]]; then
                    if [ "$SWITCH_CONTEXT" == "Back to Main Menu" ]; then
                        break
                    fi

                    kubectl config use-context "$SWITCH_CONTEXT"

                    # Create namespace if it does not exist
                    kubectl create namespace antrea-multicluster --dry-run=client -o yaml | kubectl apply -f -

                    # Apply leader cluster YAMLs
                    kubectl apply -f "https://github.com/antrea-io/antrea/releases/download/$TAG/antrea-multicluster-leader-global.yml"
                    kubectl apply -f "https://github.com/antrea-io/antrea/releases/download/$TAG/antrea-multicluster-leader-namespaced.yml"

                    echo "Antrea Multi-cluster installed on leader cluster."
                    sleep 2
                    break
                else
                    echo "Invalid selection. Please choose a context or 'Back to Main Menu'."
                fi
            done
            ;;
        3)
            # Guard against downloading manifests with an empty version tag
            if [ -z "$TAG" ]; then
                echo "No Antrea version selected. Please run option 1 first."
                sleep 2
                continue
            fi

            echo "Installing Antrea Multi-cluster on member cluster..."

            # List available contexts and create a menu
            contexts=($(kubectl config get-contexts -o=name))

            # Display the menu
            PS3="Select a context to switch to: "
            select SWITCH_CONTEXT in "${contexts[@]}" "Back to Main Menu"; do
                if [[ -n "$SWITCH_CONTEXT" ]]; then
                    if [ "$SWITCH_CONTEXT" == "Back to Main Menu" ]; then
                        break
                    fi

                    kubectl config use-context "$SWITCH_CONTEXT"

                    # Apply member cluster YAML
                    kubectl apply -f "https://github.com/antrea-io/antrea/releases/download/$TAG/antrea-multicluster-member.yml"

                    echo "Antrea Multi-cluster installed on member cluster."
                    sleep 2
                    break
                else
                    echo "Invalid selection. Please choose a context or 'Back to Main Menu'."
                fi
            done
            ;;
        4)
            create_member_cluster_secrets
            ;;
        5)
            apply_member_tokens
            ;;
        6)
            create_clusterset_on_leader
            ;;
        7)
            create_clusterclaim_on_member
            ;;
        8)
            create_multi_cluster_gateway
            ;;
        9)
            create_multi_cluster_service
            ;;
        10)
            echo "Exiting..."
            exit 0
            ;;
        *)
            echo "Invalid choice. Please choose a valid option."
            sleep 2
            ;;
    esac
done
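If you want to sanity-check the manifest generation outside the interactive menu flow, the same heredoc pattern can be run standalone without touching a live cluster. The sketch below uses hypothetical placeholder values (`yelb-ui` and `yelb` are made-up service and namespace names, not from the script) and only renders the ServiceExport manifest to a file; the `kubectl apply` step is deliberately left out.

```shell
#!/usr/bin/env bash
# Placeholder values standing in for the interactive menu selections
SELECTED_SERVICE="yelb-ui"     # hypothetical service name
SELECTED_NAMESPACE="yelb"      # hypothetical namespace

# Render the ServiceExport manifest exactly as the script's option 9 does
cat <<EOF > "${SELECTED_SERVICE}-multi-cluster-service.yaml"
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: $SELECTED_SERVICE
  namespace: $SELECTED_NAMESPACE
EOF

# Inspect the rendered manifest before applying it to a member cluster
cat "${SELECTED_SERVICE}-multi-cluster-service.yaml"
```

Running this and eyeballing the output is a quick way to verify the variable expansion inside the heredoc before letting the menu script apply anything for real.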

That's it for this post. Thanks for reading.