Managing Antrea in vSphere with Tanzu
Overview
Antrea in vSphere with Tanzu
Antrea is the default CNI in TKG 2.0 clusters, the workload clusters you deploy with the Supervisor in vSphere 8. Antrea comes in two flavours: the open source edition of Antrea, which can be found here, and the Antrea Advanced ("downstream") edition, which is the one used in vSphere with Tanzu. The Advanced edition is also required when we want to integrate Antrea with NSX-T for policy management, and it can be found in the VMware Customer Connect portal here. Both editions of Antrea support a very broad range of Kubernetes platforms: Windows worker nodes, Photon, Ubuntu, ARM, x86, VMware TKG, OpenShift, Rancher, AKS, EKS. The list is long, see more info here. This post will focus on the Antrea Advanced edition and its features (read more here), such as:
- Central management of Antrea Security Policies with NSX
- Central troubleshooting with TraceFlow with NSX
- FQDN/L7 Security policies
- RBAC
- Tiered policies
- Flow Exporter
- Egress (source NAT IP selection for pods egressing the cluster)
Managing Antrea settings and Feature Gates in TKG 2 clusters
When you deploy a TKG 2 cluster on vSphere with Tanzu and you don't specify a CNI, Antrea will be the default CNI. Depending on the TKG version you are on, a set of default Antrea features is enabled or disabled. You can easily check which features are enabled after a cluster has been provisioned by issuing the command below. If you already know before you deploy a cluster that a specific feature should be enabled or disabled, this can also be handled during bring-up of the cluster so it comes with the settings you want. More on that later.
1linux-vm:~/from_ubuntu_vm/tkgs/tkgs-stc-cpod$ k get configmaps -n kube-system antrea-config -oyaml
2apiVersion: v1
3data:
4 antrea-agent.conf: |
5 featureGates:
6 AntreaProxy: true
7 EndpointSlice: true
8 Traceflow: true
9 NodePortLocal: true
10 AntreaPolicy: true
11 FlowExporter: false
12 NetworkPolicyStats: false
13 Egress: true
14 AntreaIPAM: false
15 Multicast: false
16 Multicluster: false
17 SecondaryNetwork: false
18 ServiceExternalIP: false
19 TrafficControl: false
20 trafficEncapMode: encap
21 noSNAT: false
22 tunnelType: geneve
23 trafficEncryptionMode: none
24 enableBridgingMode: false
25 disableTXChecksumOffload: false
26 wireGuard:
27 port: 51820
28 egress:
29 exceptCIDRs: []
30 serviceCIDR: 20.10.0.0/16
31 nodePortLocal:
32 enable: true
33 portRange: 61000-62000
34 tlsCipherSuites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384
35 multicast: {}
36 antreaProxy:
37 proxyAll: false
38 nodePortAddresses: []
39 skipServices: []
40 proxyLoadBalancerIPs: false
41 multicluster: {}
42 antrea-cni.conflist: |
43 {
44 "cniVersion":"0.3.0",
45 "name": "antrea",
46 "plugins": [
47 {
48 "type": "antrea",
49 "ipam": {
50 "type": "host-local"
51 }
52 }
53 ,
54 {
55 "type": "portmap",
56 "capabilities": {"portMappings": true}
57 }
58 ,
59 {
60 "type": "bandwidth",
61 "capabilities": {"bandwidth": true}
62 }
63 ]
64 }
65 antrea-controller.conf: |
66 featureGates:
67 Traceflow: true
68 AntreaPolicy: true
69 NetworkPolicyStats: false
70 Multicast: false
71 Egress: true
72 AntreaIPAM: false
73 ServiceExternalIP: false
74 tlsCipherSuites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384
75 nodeIPAM: null
76kind: ConfigMap
77metadata:
78 annotations:
79 kapp.k14s.io/identity: v1;kube-system//ConfigMap/antrea-config;v1
80 kapp.k14s.io/original: '{"apiVersion":"v1","data":{"antrea-agent.conf":"featureGates:\n AntreaProxy:
81 true\n EndpointSlice: true\n Traceflow: true\n NodePortLocal: true\n AntreaPolicy:
82 true\n FlowExporter: false\n NetworkPolicyStats: false\n Egress: true\n AntreaIPAM:
83 false\n Multicast: false\n Multicluster: false\n SecondaryNetwork: false\n ServiceExternalIP:
84 false\n TrafficControl: false\ntrafficEncapMode: encap\nnoSNAT: false\ntunnelType:
85 geneve\ntrafficEncryptionMode: none\nenableBridgingMode: false\ndisableTXChecksumOffload:
86 false\nwireGuard:\n port: 51820\negress:\n exceptCIDRs: []\nserviceCIDR: 20.10.0.0/16\nnodePortLocal:\n enable:
87 true\n portRange: 61000-62000\ntlsCipherSuites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384\nmulticast:
88 {}\nantreaProxy:\n proxyAll: false\n nodePortAddresses: []\n skipServices:
89 []\n proxyLoadBalancerIPs: false\nmulticluster: {}\n","antrea-cni.conflist":"{\n \"cniVersion\":\"0.3.0\",\n \"name\":
90 \"antrea\",\n \"plugins\": [\n {\n \"type\": \"antrea\",\n \"ipam\":
91 {\n \"type\": \"host-local\"\n }\n }\n ,\n {\n \"type\":
92 \"portmap\",\n \"capabilities\": {\"portMappings\": true}\n }\n ,\n {\n \"type\":
93 \"bandwidth\",\n \"capabilities\": {\"bandwidth\": true}\n }\n ]\n}\n","antrea-controller.conf":"featureGates:\n Traceflow:
94 true\n AntreaPolicy: true\n NetworkPolicyStats: false\n Multicast: false\n Egress:
95 true\n AntreaIPAM: false\n ServiceExternalIP: false\ntlsCipherSuites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384\nnodeIPAM:
96 null\n"},"kind":"ConfigMap","metadata":{"labels":{"app":"antrea","kapp.k14s.io/app":"1685607245932804320","kapp.k14s.io/association":"v1.c39c4aca919097e50452c3432329dd40"},"name":"antrea-config","namespace":"kube-system"}}'
97 kapp.k14s.io/original-diff-md5: c6e94dc94aed3401b5d0f26ed6c0bff3
98 creationTimestamp: "2023-06-01T08:14:14Z"
99 labels:
100 app: antrea
101 kapp.k14s.io/app: "1685607245932804320"
102 kapp.k14s.io/association: v1.c39c4aca919097e50452c3432329dd40
103 name: antrea-config
104 namespace: kube-system
105 resourceVersion: "948"
106 uid: fd18fd20-a82b-4df5-bb1a-686463b86f27
If you want to enable or disable any of these features, it is a matter of applying an AntreaConfig using the AntreaConfig CRD included in TKG 2.0.
One can apply this AntreaConfig to an already provisioned TKG 2.0 cluster, or apply it before the cluster is provisioned so the cluster gets the features enabled or disabled at creation. Below is an example of an AntreaConfig:
1apiVersion: cni.tanzu.vmware.com/v1alpha1
2kind: AntreaConfig
3metadata:
4 name: three-zone-cluster-2-antrea-package
5 namespace: ns-three-zone-1
6spec:
7 antrea:
8 config:
9 featureGates:
10 AntreaProxy: true
11 EndpointSlice: false
12 AntreaPolicy: true
13 FlowExporter: true
14 Egress: true #This needs to be enabled (an example)
15 NodePortLocal: true
16 AntreaTraceflow: true
17 NetworkPolicyStats: true
This example is applied either before or after provisioning of the TKG 2.0 cluster. Just make sure the config is applied to the correct namespace, the same namespace the cluster is deployed in, and that the name of the config follows the pattern CLUSTER-NAME-antrea-package. In other words, the name needs to start with the cluster name of the TKG 2.0 cluster and end with -antrea-package.
If this is done after the cluster has been provisioned, we need to make sure the already running Antrea pods (agents and controller) are restarted so they can read the new configmap.
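Restarting them is just a matter of rolling the Antrea controller deployment and the agent daemonset in kube-system; a minimal sketch with plain kubectl (resource names as they appear in the kube-system namespace later in this post):

kubectl -n kube-system rollout restart deployment/antrea-controller
kubectl -n kube-system rollout restart daemonset/antrea-agent
# wait for the agents to come back up before testing the new settings
kubectl -n kube-system rollout status daemonset/antrea-agent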
If you need to check which version of Antrea is included in your TKR version (and other components, for that matter), just run the following command:
1linuxvm01:~/three-zones$ k get tkr v1.24.9---vmware.1-tkg.4 -o yaml
2apiVersion: run.tanzu.vmware.com/v1alpha3
3kind: TanzuKubernetesRelease
4metadata:
5 creationTimestamp: "2023-06-01T07:35:28Z"
6 finalizers:
7 - tanzukubernetesrelease.run.tanzu.vmware.com
8 generation: 2
9 labels:
10 os-arch: amd64
11 os-name: photon
12 os-type: linux
13 os-version: "3.0"
14 v1: ""
15 v1.24: ""
16 v1.24.9: ""
17 v1.24.9---vmware: ""
18 v1.24.9---vmware.1: ""
19 v1.24.9---vmware.1-tkg: ""
20 v1.24.9---vmware.1-tkg.4: ""
21 name: v1.24.9---vmware.1-tkg.4
22 ownerReferences:
23 - apiVersion: vmoperator.vmware.com/v1alpha1
24 kind: VirtualMachineImage
25 name: ob-21552850-ubuntu-2004-amd64-vmi-k8s-v1.24.9---vmware.1-tkg.4
26 uid: 92d3d6af-53f8-4f9a-b262-f70dd33ad19b
27 - apiVersion: vmoperator.vmware.com/v1alpha1
28 kind: VirtualMachineImage
29 name: ob-21554409-photon-3-amd64-vmi-k8s-v1.24.9---vmware.1-tkg.4
30 uid: 6a0aa87a-63e3-475d-a52d-e63589f454e9
31 resourceVersion: "12111"
32 uid: 54db049e-fdf0-45a2-b4d1-46fa90a22b44
33spec:
34 bootstrapPackages:
35 - name: antrea.tanzu.vmware.com.1.7.2+vmware.1-tkg.1-advanced
36 - name: vsphere-pv-csi.tanzu.vmware.com.2.6.1+vmware.1-tkg.1
37 - name: vsphere-cpi.tanzu.vmware.com.1.24.3+vmware.1-tkg.1
38 - name: kapp-controller.tanzu.vmware.com.0.41.5+vmware.1-tkg.1
39 - name: guest-cluster-auth-service.tanzu.vmware.com.1.1.0+tkg.1
40 - name: metrics-server.tanzu.vmware.com.0.6.2+vmware.1-tkg.1
41 - name: secretgen-controller.tanzu.vmware.com.0.11.2+vmware.1-tkg.1
42 - name: pinniped.tanzu.vmware.com.0.12.1+vmware.3-tkg.3
43 - name: capabilities.tanzu.vmware.com.0.28.0+vmware.2
44 - name: calico.tanzu.vmware.com.3.24.1+vmware.1-tkg.1
45 kubernetes:
46 coredns:
47 imageTag: v1.8.6_vmware.15
48 etcd:
49 imageTag: v3.5.6_vmware.3
50 imageRepository: localhost:5000/vmware.io
51 pause:
52 imageTag: "3.7"
53 version: v1.24.9+vmware.1
54 osImages:
55 - name: ob-21552850-ubuntu-2004-amd64-vmi-k8s-v1.24.9---vmware.1-tkg.4
56 - name: ob-21554409-photon-3-amd64-vmi-k8s-v1.24.9---vmware.1-tkg.4
57 version: v1.24.9+vmware.1-tkg.4
58status:
59 conditions:
60 - lastTransitionTime: "2023-06-01T07:35:28Z"
61 status: "True"
62 type: Ready
63 - lastTransitionTime: "2023-06-01T07:35:28Z"
64 status: "True"
65 type: Compatible
So enabling and disabling Antrea Feature Gates is quite simple. To summarize, the feature gates that can be adjusted are these (as of TKR 1.24.9):
1spec:
2 antrea:
3 config:
4 defaultMTU: ""
5 disableUdpTunnelOffload: false
6 featureGates:
7 AntreaPolicy: true
8 AntreaProxy: true
9 AntreaTraceflow: true
10 Egress: true
11 EndpointSlice: true
12 FlowExporter: false
13 NetworkPolicyStats: false
14 NodePortLocal: true
15 noSNAT: false
16 tlsCipherSuites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384
17 trafficEncapMode: encap
Getting the Antrea config "templates" for a specific TKR version
Usually a new TKR version ships a new version of Antrea, and a new version of Antrea most likely contains new and exciting features. So if you want to see which feature gates are available in your latest and greatest TKR, run these commands from the Supervisor context:
1# to get all the Antrea configs
2andreasm@ubuntu02:~/avi_nsxt_wcp$ k get antreaconfigs.cni.tanzu.vmware.com -A
3NAMESPACE NAME TRAFFICENCAPMODE DEFAULTMTU ANTREAPROXY ANTREAPOLICY SECRETREF
4ns-stc-1 cluster-1-antrea-package encap true true cluster-1-antrea-data-values
5vmware-system-tkg v1.23.15---vmware.1-tkg.4 encap true true
6vmware-system-tkg v1.23.15---vmware.1-tkg.4-routable noEncap true true
7vmware-system-tkg v1.23.8---vmware.2-tkg.2-zshippable encap true true
8vmware-system-tkg v1.23.8---vmware.2-tkg.2-zshippable-routable noEncap true true
9vmware-system-tkg v1.24.9---vmware.1-tkg.4 encap true true
10vmware-system-tkg v1.24.9---vmware.1-tkg.4-routable noEncap true true
11vmware-system-tkg v1.25.7---vmware.3-fips.1-tkg.1 encap true true
12vmware-system-tkg v1.25.7---vmware.3-fips.1-tkg.1-routable noEncap true true
13vmware-system-tkg v1.26.5---vmware.2-fips.1-tkg.1 encap true true
14vmware-system-tkg v1.26.5---vmware.2-fips.1-tkg.1-routable noEncap true true
15
16# Get the content of a specific Antrea config
17andreasm@ubuntu02:~/avi_nsxt_wcp$ k get antreaconfigs.cni.tanzu.vmware.com -n vmware-system-tkg v1.26.5---vmware.2-fips.1-tkg.1 -oyaml
18apiVersion: cni.tanzu.vmware.com/v1alpha1
19kind: AntreaConfig
20metadata:
21 annotations:
22 tkg.tanzu.vmware.com/template-config: "true"
23 creationTimestamp: "2023-09-24T17:49:37Z"
24 generation: 1
25 name: v1.26.5---vmware.2-fips.1-tkg.1
26 namespace: vmware-system-tkg
27 resourceVersion: "19483"
28 uid: 8cdaa6ec-4059-4d35-a0d4-63711831edc8
29spec:
30 antrea:
31 config:
32 antreaProxy:
33 proxyLoadBalancerIPs: true
34 defaultMTU: ""
35 disableTXChecksumOffload: false
36 disableUdpTunnelOffload: false
37 dnsServerOverride: ""
38 enableBridgingMode: false
39 enableUsageReporting: false
40 featureGates:
41 AntreaIPAM: false
42 AntreaPolicy: true
43 AntreaProxy: true
44 AntreaTraceflow: true
45 Egress: true
46 EndpointSlice: true
47 FlowExporter: false
48 Multicast: false
49 Multicluster: false
50 NetworkPolicyStats: true
51 NodePortLocal: true
52 SecondaryNetwork: false
53 ServiceExternalIP: false
54 TopologyAwareHints: false
55 TrafficControl: false
56 flowExporter:
57 activeFlowTimeout: 60s
58 collectorAddress: flow-aggregator/flow-aggregator:4739:tls
59 noSNAT: false
60 tlsCipherSuites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384
61 trafficEncapMode: encap
62 tunnelCsum: false
63 tunnelPort: 0
With the above you can always get the latest config shipped with a specific TKR release and use it as a template for your TKC cluster.
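A quick way to use such a template is to dump it to a file, rename it for your cluster and adjust the feature gates before provisioning; a minimal sketch, where my-cluster and my-namespace are hypothetical placeholders:

# dump the template for the TKR you plan to use
kubectl -n vmware-system-tkg get antreaconfigs.cni.tanzu.vmware.com v1.26.5---vmware.2-fips.1-tkg.1 -oyaml > my-cluster-antrea-package.yaml
# set metadata.name to my-cluster-antrea-package, metadata.namespace to my-namespace,
# strip resourceVersion/uid/creationTimestamp and the template-config annotation,
# flip the feature gates you need, then apply it in the same namespace as the cluster
kubectl apply -f my-cluster-antrea-package.yaml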
Integrating Antrea with NSX-T
To enable the NSX-T Antrea integration there are a couple of steps that need to be prepared. All the steps can be followed here. I have decided to create a script that automates all these steps. So if you don't want to go through all these steps manually by following the link above, you can use this script instead: just enter the necessary information as prompted and have the prerequisites in place before executing. Copy and paste the below script into a .sh file on your Linux jump host and make it executable with chmod +x.
1#!/bin/bash
2
3# Echo information
4echo "This script has some dependencies... make sure they are met before continuing. Otherwise click ctrl+c now
51. This script is adjusted for vSphere with Tanzu TKG clusters using Tanzu CLI
62. Have downloaded the antrea-interworking*.zip
73. This script is located in the root of where you have downloaded the zip file above
84. curl is installed
95. Need connectivity to the NSX manager
106. kubectl is installed
117. vsphere with tanzu cli is installed
128. That you are in the correct context of the cluster you want to integrate to NSX
139. If not in the correct context the script will put you in the correct context anyway
1410. A big smile and good mood"
15
16# Prompt the user to press a key to continue
17echo "Press any key to continue..."
18read -n 1 -s
19
20# Continue with the script
21echo "Continuing..."
22
23# Prompt for name
24read -p "Enter the name of the tkg cluster - will be used for certificates and name in NSX: " name
25
26# Prompt for NSX_MGR
27read -p "Enter NSX Manager ip or FQDN: " nsx_mgr
28
29# Prompt for NSX_ADMIN
30read -p "Enter NSX admin username: " nsx_admin
31
32# Prompt for NSX_PASS
33read -p "Enter NSX Password: " nsx_pass
34
35# Prompt for Supervisor Endpoint IP or FQDN
36read -p "Enter Supervisor API IP or FQDN: " svc_api_ip
37
38# Prompt for vSphere Username
39read -p "Enter vSphere Username: " vsphere_username
40
41# Prompt for Tanzu Kubernetes Cluster Namespace
42read -p "Enter Tanzu Kubernetes Cluster Namespace: " tanzu_cluster_namespace
43
44# Prompt for Tanzu Kubernetes Cluster Name
45read -p "Enter Tanzu Kubernetes Cluster Name: " tanzu_cluster_name
46
47# Login to vSphere using kubectl
48kubectl vsphere login --server="$svc_api_ip" --insecure-skip-tls-verify --vsphere-username="$vsphere_username" --tanzu-kubernetes-cluster-namespace="$tanzu_cluster_namespace" --tanzu-kubernetes-cluster-name="$tanzu_cluster_name"
49
50key_name="${name}-private.key"
51csr_output="${name}.csr"
52crt_output="${name}.crt"
53
54openssl genrsa -out "$key_name" 2048
55openssl req -new -key "$key_name" -out "$csr_output" -subj "/C=US/ST=CA/L=Palo Alto/O=VMware/OU=Antrea Cluster/CN=$name"
56openssl x509 -req -days 3650 -sha256 -in "$csr_output" -signkey "$key_name" -out "$crt_output"
57
58# Convert the certificate file to a one-liner with line breaks
59crt_contents=$(awk '{printf "%s\\n", $0}' "$crt_output")
60
61# Replace the certificate and name in the curl body
62curl_body='{
63 "name": "'"$name"'",
64 "node_id": "'"$name"'",
65 "roles_for_paths": [
66 {
67 "path": "/",
68 "roles": [
69 {
70 "role": "enterprise_admin"
71 }
72 ]
73 }
74 ],
75 "role": "enterprise_admin",
76 "is_protected": "true",
77 "certificate_pem" : "'"$crt_contents"'"
78}'
79
80# Make the curl request with the updated body
81# curl -X POST -H "Content-Type: application/json" -d "$curl_body" https://example.com/api/endpoint
82curl -ku "$nsx_admin":"$nsx_pass" -X POST https://"$nsx_mgr"/api/v1/trust-management/principal-identities/with-certificate -H "Content-Type: application/json" -d "$curl_body"
83
84# Check if a subfolder starting with "antrea-interworking" exists
85if ls -d antrea-interworking* &>/dev/null; then
86 echo "Subfolder starting with 'antrea-interworking' exists. Skipping extraction."
87else
88 # Extract the zip file starting with "antrea-interworking"
89 unzip "antrea-interworking"*.zip
90fi
91
92# Create a new folder with the name antrea-interworking-"from-name"
93new_folder="antrea-interworking-$name"
94mkdir "$new_folder"
95
96# Copy all YAML files from the antrea-interworking subfolder to the new folder
97cp antrea-interworking*/{*.yaml,*.yml} "$new_folder/"
98
99# Replace the field after "image: vmware.io/antrea/interworking" with "image: projects.registry.vmware.com/antreainterworking/interworking-debian" in interworking.yaml
100sed -i 's|image: vmware.io/antrea/interworking|image: projects.registry.vmware.com/antreainterworking/interworking-debian|' "$new_folder/interworking.yaml"
101
102# Replace the field after "image: vmware.io/antrea/interworking" with "image: projects.registry.vmware.com/antreainterworking/interworking-debian" in deregisterjob.yaml
103sed -i 's|image: vmware.io/antrea/interworking|image: projects.registry.vmware.com/antreainterworking/interworking-debian|' "$new_folder/deregisterjob.yaml"
104
105# Edit the bootstrap.yaml file in the new folder
106sed -i 's|clusterName:.*|clusterName: '"$name"'|' "$new_folder/bootstrap-config.yaml"
107sed -i 's|NSXManagers:.*|NSXManagers: ["'"$nsx_mgr"'"]|' "$new_folder/bootstrap-config.yaml"
108tls_crt_base64=$(base64 -w 0 "$crt_output")
109sed -i 's|tls.crt:.*|tls.crt: '"$tls_crt_base64"'|' "$new_folder/bootstrap-config.yaml"
110tls_key_base64=$(base64 -w 0 "$key_name")
111sed -i 's|tls.key:.*|tls.key: '"$tls_key_base64"'|' "$new_folder/bootstrap-config.yaml"
112
113# Interactive prompt to select Kubernetes context
114kubectl config get-contexts
115read -p "Enter the name of the Kubernetes context: " kubectl_context
116kubectl config use-context "$kubectl_context"
117
118# Apply the bootstrap-config.yaml and interworking.yaml files from the new folder
119kubectl apply -f "$new_folder/bootstrap-config.yaml" -f "$new_folder/interworking.yaml"
120
121# Run the last command to verify that something is happening
122kubectl get pods -o wide -n vmware-system-antrea
123
124echo "As it was written each time we ssh'ed into a Suse Linux back in the good old days - Have a lot of fun"
As soon as the script has run through, it should not take long before your TKG cluster shows up in the NSX manager:
That's it for the NSX-T integration. As soon as that is done, it's time to look into what we can do with this integration in the following chapters.
Antrea Security Policies
Antrea has two sets of security policies: Antrea Network Policies (ANP) and Antrea Cluster Network Policies (ACNP). The difference between the two is that an ANP is applied within a Kubernetes Namespace while an ACNP is cluster-wide. Both belong to the Antrea Native Policies, and both ANP and ACNP can work together with Kubernetes Network Policies.
There are many benefits of using Antrea Native Policies, whether or not they are combined with Kubernetes Network Policies (a minimal namespaced example follows right after the list below).
Some of the benefits of using Antrea Native Policies:
- Can be tiered
- Select both ingress and egress
- Support the following actions: allow, drop, reject and pass
- Support FQDN filtering in egress (to) with actions allow, drop and reject
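All the examples later in this post are cluster-wide ACNPs, so here is a minimal sketch of a namespaced Antrea Network Policy for contrast. It borrows the Yelb labels and the appserver port used later in this post; the priority value is just an example:

apiVersion: crd.antrea.io/v1alpha1
kind: NetworkPolicy
metadata:
  name: anp-allow-ui-to-appserver
  namespace: yelb                 # ANPs are namespaced, unlike ACNPs
spec:
  priority: 5
  tier: application
  appliedTo:
  - podSelector:
      matchLabels:
        app: yelb-appserver
  ingress:
  - action: Allow                 # only the ui pods may reach the appserver on 4567
    from:
    - podSelector:
        matchLabels:
          app: yelb-ui
    ports:
    - protocol: TCP
      port: 4567
  - action: Drop                  # drop everything else destined for the appserver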
Tiered policies
Tiered policies are very useful when, for example, different parts of the organization are responsible for security at different levels/scopes in the platform. Antrea can place policies in different tiers, and the tiers are evaluated in a given order. If we want some rules to be evaluated very early and enforced as soon as possible, we can place them in a tier that is considered first. Within the same tier, the policies are enforced in the order of a given priority (a number): the rule with the lowest number (highest priority) is evaluated first, and when all rules in a tier have been processed, evaluation moves on to the next tier. Antrea comes with a set of static tiers already defined. These tiers can be shown by running the command:
1linuxvm01:~$ k get tiers
2NAME PRIORITY AGE
3application 250 3h11m
4baseline 253 3h11m
5emergency 50 3h11m
6networkops 150 3h11m
7platform 200 3h11m
8securityops 100 3h11m
Below is a diagram of how they look; notice also where the Kubernetes network policies are placed:
There is also the option to add custom tiers using the following CRD (taken from the official Antrea docs here):
1apiVersion: crd.antrea.io/v1alpha1
2kind: Tier
3metadata:
4 name: mytier
5spec:
6 priority: 10
7 description: "my custom tier"
When doing the Antrea NSX integration some additional tiers are added automatically (they start with nsx*):
1linuxvm01:~$ k get tiers
2NAME PRIORITY AGE
3application 250 3h11m
4baseline 253 3h11m
5emergency 50 3h11m
6networkops 150 3h11m
7nsx-category-application 4 87m
8nsx-category-emergency 1 87m
9nsx-category-environment 3 87m
10nsx-category-ethernet 0 87m
11nsx-category-infrastructure 2 87m
12platform 200 3h11m
13securityops 100 3h11m
I can quickly show two examples. First I create one rule as a "security admin", where this security admin has to follow the company's compliance policy and block access to a certain FQDN. This must be enforced everywhere, so I need to create this policy in the securityops tier. I could have defined it in the emergency tier as well, but in that tier it makes more sense to keep rules that are disabled/not-enforced/idle, so that in case of an emergency we have a way to quickly enable them and override rules further down the hierarchy. So securityops it is.
Let's apply this one:
1apiVersion: crd.antrea.io/v1alpha1
2kind: ClusterNetworkPolicy
3metadata:
4 name: acnp-drop-yelb
5spec:
6 priority: 1
7 tier: securityops
8 appliedTo:
9 - podSelector:
10 matchLabels:
11 app: ubuntu-20-04
12 egress:
13 - action: Drop
14 to:
15 - fqdn: "yelb-ui.yelb.carefor.some-dns.net"
16 ports:
17 - protocol: TCP
18 port: 80
19 - action: Allow #Allow the rest
To check if it is applied and in use (notice under desired nodes and current nodes):
1linuxvm01:~/antrea/policies$ k get acnp
2NAME TIER PRIORITY DESIRED NODES CURRENT NODES AGE
3acnp-drop-yelb securityops 1 1 1 5m33s
Now from a test pod I will try to curl the blocked fqdn and another one not in any block rule:
1root@ubuntu-20-04-548545fc87-kkzbh:/# curl yelb-ui.yelb.cloudburst.somecooldomain.net
2curl: (6) Could not resolve host: yelb-ui.yelb.cloudburst.somecooldomain.net
3
4# Curling a FQDN that is allowed:
5root@ubuntu-20-04-548545fc87-kkzbh:/# curl allowed-yelb.yelb-2.carefor.some-dns.net
6<!doctype html>
7<html>
8<head>
9 <meta charset="utf-8">
10 <title>Yelb</title>
11 <base href="/">
12 <meta name="viewport" content="width=device-width, initial-scale=1">
13 <link rel="icon" type="image/x-icon" href="favicon.ico?v=2">
14</head>
15<body>
16<yelb>Loading...</yelb>
17<script type="text/javascript" src="inline.bundle.js"></script><script type="text/javascript" src="styles.bundle.js"></script><script type="text/javascript" src="scripts.bundle.js"></script><script type="text/javascript" src="vendor.bundle.js"></script><script type="text/javascript" src="main.bundle.js"></script></body>
18</html>
That works as expected. Now what happens if another user with access to the Kubernetes cluster decides to create a rule further down the hierarchy, let's go with the application tier, to create an allow rule for this FQDN that is currently being dropped? Let's see what happens:
1apiVersion: crd.antrea.io/v1alpha1
2kind: ClusterNetworkPolicy
3metadata:
4 name: acnp-allow-yelb
5spec:
6 priority: 1
7 tier: application
8 appliedTo:
9 - podSelector:
10 matchLabels:
11 app: ubuntu-20-04
12 egress:
13 - action: Allow
14 to:
15 - fqdn: "yelb-ui.yelb.carefor.some-dns.net"
16 ports:
17 - protocol: TCP
18 port: 80
19 - action: Allow #Allow the rest
I will apply the rule above and then try to curl the same FQDN, which is supposed to be dropped.
1linuxvm01:~/antrea/policies$ k get acnp
2NAME TIER PRIORITY DESIRED NODES CURRENT NODES AGE
3acnp-allow-yelb application 1 1 1 4s
4acnp-drop-yelb securityops 1 1 1 5h1m
From my test pod again:
1kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
2root@ubuntu-20-04-548545fc87-kkzbh:/# curl yelb-ui.yelb.cloudburst.somecooldomain.net
3curl: (6) Could not resolve host: yelb-ui.yelb.cloudburst.somecooldomain.net
4root@ubuntu-20-04-548545fc87-kkzbh:/# curl allowed-yelb.yelb-2.carefor.some-dns.net
5<!doctype html>
6<html>
7<head>
8 <meta charset="utf-8">
9 <title>Yelb</title>
10 <base href="/">
11 <meta name="viewport" content="width=device-width, initial-scale=1">
12 <link rel="icon" type="image/x-icon" href="favicon.ico?v=2">
13</head>
14<body>
15<yelb>Loading...</yelb>
16<script type="text/javascript" src="inline.bundle.js"></script><script type="text/javascript" src="styles.bundle.js"></script><script type="text/javascript" src="scripts.bundle.js"></script><script type="text/javascript" src="vendor.bundle.js"></script><script type="text/javascript" src="main.bundle.js"></script></body>
17</html>
That was expected. It is still being dropped by the first rule placed in the securityops tier. So far so good. But what if this user also has access to the tier where the first rule is applied? Well, then they can override it. That is why we can now go to the next chapter.
Antrea RBAC
Antrea comes with a couple of CRDs that allow us to configure granular user permissions on the different objects, like the policy Tiers. So to restrict "normal" users from applying and/or deleting security policies created in the higher-priority Tiers, we need to apply some rolebindings, or to be exact ClusterRoleBindings. Let us see how we can achieve that.
In my lab environment I have defined two users: my own admin user (andreasm), which is bound to the ClusterRole cluster-admin, and a second user (User1), which is bound to the ClusterRole view.
The ClusterRole view has only read access, not to all objects in the cluster but to many. To see exactly which, run the following command:
1linuxvm01:~/antrea/policies$ k get clusterrole view -oyaml
2aggregationRule:
3 clusterRoleSelectors:
4 - matchLabels:
5 rbac.authorization.k8s.io/aggregate-to-view: "true"
6apiVersion: rbac.authorization.k8s.io/v1
7kind: ClusterRole
8metadata:
9 annotations:
10 rbac.authorization.kubernetes.io/autoupdate: "true"
11 creationTimestamp: "2023-06-04T09:37:44Z"
12 labels:
13 kubernetes.io/bootstrapping: rbac-defaults
14 rbac.authorization.k8s.io/aggregate-to-edit: "true"
15 name: view
16 resourceVersion: "1052"
17 uid: c4784a81-4451-42af-9134-e141ccf8bc50
18rules:
19- apiGroups:
20 - crd.antrea.io
21 resources:
22 - clustergroups
23 verbs:
24 - get
25 - list
26 - watch
27- apiGroups:
28 - crd.antrea.io
29 resources:
30 - clusternetworkpolicies
31 - networkpolicies
32 verbs:
33 - get
34 - list
35 - watch
36- apiGroups:
37 - crd.antrea.io
38 resources:
39 - traceflows
40 verbs:
41 - get
42 - list
43 - watch
44- apiGroups:
45 - ""
46 resources:
47 - configmaps
48 - endpoints
49 - persistentvolumeclaims
50 - persistentvolumeclaims/status
51 - pods
52 - replicationcontrollers
53 - replicationcontrollers/scale
54 - serviceaccounts
55 - services
56 - services/status
57 verbs:
58 - get
59 - list
60 - watch
61- apiGroups:
62 - ""
63 resources:
64 - bindings
65 - events
66 - limitranges
67 - namespaces/status
68 - pods/log
69 - pods/status
70 - replicationcontrollers/status
71 - resourcequotas
72 - resourcequotas/status
73 verbs:
74 - get
75 - list
76 - watch
77- apiGroups:
78 - ""
79 resources:
80 - namespaces
81 verbs:
82 - get
83 - list
84 - watch
85- apiGroups:
86 - discovery.k8s.io
87 resources:
88 - endpointslices
89 verbs:
90 - get
91 - list
92 - watch
93- apiGroups:
94 - apps
95 resources:
96 - controllerrevisions
97 - daemonsets
98 - daemonsets/status
99 - deployments
100 - deployments/scale
101 - deployments/status
102 - replicasets
103 - replicasets/scale
104 - replicasets/status
105 - statefulsets
106 - statefulsets/scale
107 - statefulsets/status
108 verbs:
109 - get
110 - list
111 - watch
112- apiGroups:
113 - autoscaling
114 resources:
115 - horizontalpodautoscalers
116 - horizontalpodautoscalers/status
117 verbs:
118 - get
119 - list
120 - watch
121- apiGroups:
122 - batch
123 resources:
124 - cronjobs
125 - cronjobs/status
126 - jobs
127 - jobs/status
128 verbs:
129 - get
130 - list
131 - watch
132- apiGroups:
133 - extensions
134 resources:
135 - daemonsets
136 - daemonsets/status
137 - deployments
138 - deployments/scale
139 - deployments/status
140 - ingresses
141 - ingresses/status
142 - networkpolicies
143 - replicasets
144 - replicasets/scale
145 - replicasets/status
146 - replicationcontrollers/scale
147 verbs:
148 - get
149 - list
150 - watch
151- apiGroups:
152 - policy
153 resources:
154 - poddisruptionbudgets
155 - poddisruptionbudgets/status
156 verbs:
157 - get
158 - list
159 - watch
160- apiGroups:
161 - networking.k8s.io
162 resources:
163 - ingresses
164 - ingresses/status
165 - networkpolicies
166 verbs:
167 - get
168 - list
169 - watch
170- apiGroups:
171 - metrics.k8s.io
172 resources:
173 - pods
174 - nodes
175 verbs:
176 - get
177 - list
178 - watch
179- apiGroups:
180 - policy
181 resourceNames:
182 - vmware-system-privileged
183 resources:
184 - podsecuritypolicies
185 verbs:
186 - use
On the other hand, my own admin user has access to everything: get, list, create, patch, update, delete, the whole shebang. What I would like to demonstrate now is that User1 is a regular user and should only be allowed to create security policies in the application Tier, while all other Tiers are restricted to the admins that have the responsibility to create policies there. User1 should also not be allowed to create any custom Tiers.
So the first thing I need to create is an Antrea TierEntitlement and TierEntitlementBinding like this:
1apiVersion: crd.antrea.tanzu.vmware.com/v1alpha1
2kind: TierEntitlement
3metadata:
4 name: secops-edit
5spec:
6 tiers: # Accept list of Tier names. Tier may or may not exist yet.
7 - emergency
8 - securityops
9 - networkops
10 - platform
11 - baseline
12 permission: edit
13---
14apiVersion: crd.antrea.tanzu.vmware.com/v1alpha1
15kind: TierEntitlementBinding
16metadata:
17 name: secops-bind
18spec:
19 subjects: # List of users to grant this entitlement to
20 - kind: User
21 name: sso:andreasm@cpod-nsxam-stc.az-stc.cloud-garage.net
22 apiGroup: rbac.authorization.k8s.io
23# - kind: Group
24# name: security-admins
25# apiGroup: rbac.authorization.k8s.io
26# - kind: ServiceAccount
27# name: network-admins
28# namespace: kube-system
29 tierEntitlement: secops-edit # Reference to the TierEntitlement
Now, notice that I am listing the Tiers that should only be available to the users, groups, or ServiceAccounts in the TierEntitlementBinding (I am only using kind: User in this example). This means that all unlisted Tiers are open for other users to place security policies in.
Now apply it:
1linuxvm01:~/antrea/policies$ k apply -f tierentitlement.yaml
2tierentitlement.crd.antrea.tanzu.vmware.com/secops-edit created
3tierentitlementbinding.crd.antrea.tanzu.vmware.com/secops-bind created
Next up is to allow my User1 to get and list the Antrea "tiers" CRD objects:
1apiVersion: rbac.authorization.k8s.io/v1
2kind: ClusterRole
3metadata:
4 name: tier-placement
5rules:
6- apiGroups: ["crd.antrea.io"]
7 resources: ["tiers"]
8 verbs: ["get","list"]
9---
10apiVersion: rbac.authorization.k8s.io/v1
11kind: ClusterRoleBinding
12metadata:
13 name: tier-bind
14subjects:
15- kind: User
16 name: sso:user1@cpod-nsxam-stc.az-stc.cloud-garage.net # Name is case sensitive
17 apiGroup: rbac.authorization.k8s.io
18roleRef:
19 kind: ClusterRole
20 name: tier-placement
21 apiGroup: rbac.authorization.k8s.io
If you want a user to also be able to add/create/delete custom Tiers, this can be allowed by adding the verbs "create", "patch", "update" and "delete" (a small sketch of that follows after the apply output below).
Now apply the above yaml:
1linuxvm01:~/antrea/policies$ k apply -f antrea-crd-tier-list.yaml
2clusterrole.rbac.authorization.k8s.io/tier-placement created
3clusterrolebinding.rbac.authorization.k8s.io/tier-bind created
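As mentioned above, allowing a user to manage custom Tiers is only a matter of widening the verbs on the tiers resource; a minimal sketch of what such a ClusterRole could look like (the name tier-editor is just an example):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tier-editor
rules:
- apiGroups: ["crd.antrea.io"]
  resources: ["tiers"]
  verbs: ["get","list","create","patch","update","delete"]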
I will now log in as User1 and try to apply this network policy:
1apiVersion: crd.antrea.io/v1alpha1
2kind: ClusterNetworkPolicy
3metadata:
4 name: override-rule-allow-yelb
5spec:
6 priority: 1
7 tier: securityops
8 appliedTo:
9 - podSelector:
10 matchLabels:
11 app: ubuntu-20-04
12 egress:
13 - action: Allow
14 to:
15 - fqdn: "yelb-ui.yelb.carefor.some-dns.net"
16 ports:
17 - protocol: TCP
18 port: 80
19 - action: Allow
As User1:
1linuxvm01:~/antrea/policies$ k apply -f fqdn-rule-secops-tier.test.yaml
2Error from server (Forbidden): error when creating "fqdn-rule-secops-tier.test.yaml": clusternetworkpolicies.crd.antrea.io is forbidden: User "sso:user1@cpod-nsxam-stc.az-stc.cloud-garage.net" cannot create resource "clusternetworkpolicies" in API group "crd.antrea.io" at the cluster scope
First bump in the road... This user is not allowed to create any security policies at all.
So I need to use my admin user and apply this ClusterRoleBinding:
1apiVersion: rbac.authorization.k8s.io/v1
2kind: ClusterRole
3metadata:
4 name: clusternetworkpolicies-edit
5rules:
6- apiGroups: ["crd.antrea.io"]
7 resources: ["clusternetworkpolicies"]
8 verbs: ["get","list","create","patch","update","delete"]
9---
10apiVersion: rbac.authorization.k8s.io/v1
11kind: ClusterRoleBinding
12metadata:
13 name: clusternetworkpolicies-bind
14subjects:
15- kind: User
16 name: sso:user1@cpod-nsxam-stc.az-stc.cloud-garage.net # Name is case sensitive
17 apiGroup: rbac.authorization.k8s.io
18roleRef:
19 kind: ClusterRole
20 name: clusternetworkpolicies-edit
21 apiGroup: rbac.authorization.k8s.io
Now User1 has access to create policies... Let's try again:
1linuxvm01:~/antrea/policies$ k apply -f fqdn-rule-secops-tier.test.yaml
2Error from server: error when creating "fqdn-rule-secops-tier.test.yaml": admission webhook "acnpvalidator.antrea.io" denied the request: user not authorized to access Tier securityops
There it is, I am not allowed to place any security policies in the tier securityops. That is what I wanted to achieve, so that's good. What if User1 tries to apply a policy in the application tier? Let's see:
1apiVersion: crd.antrea.io/v1alpha1
2kind: ClusterNetworkPolicy
3metadata:
4 name: override-attempt-failed-allow-yelb
5spec:
6 priority: 1
7 tier: application
8 appliedTo:
9 - podSelector:
10 matchLabels:
11 app: ubuntu-20-04
12 egress:
13 - action: Allow
14 to:
15 - fqdn: "yelb-ui.yelb.carefor.some-dns.net"
16 ports:
17 - protocol: TCP
18 port: 80
19 - action: Allow
1linuxvm01:~/antrea/policies$ k apply -f fqdn-rule-baseline-tier.test.yaml
2clusternetworkpolicy.crd.antrea.io/override-attempt-failed-allow-yelb created
3linuxvm01:~/antrea/policies$ k get acnp
4NAME TIER PRIORITY DESIRED NODES CURRENT NODES AGE
5acnp-allow-yelb application 1 1 1 147m
6acnp-drop-yelb securityops 1 1 1 18h
7override-attempt-failed-allow-yelb application 1 1 1 11s
That worked, but even though the rule above tries to allow access to Yelb, the traffic will still not be allowed due to the Drop rule in the securityops Tier. So no matter how hard User1 tries to get this access, it will be blocked.
These users....
What if User1 tries to apply the same policy without stating any Tier in the policy? Let's see:
1apiVersion: crd.antrea.io/v1alpha1
2kind: ClusterNetworkPolicy
3metadata:
4 name: override-attempt-failed-allow-yelb
5spec:
6 priority: 1
7 appliedTo:
8 - podSelector:
9 matchLabels:
10 app: ubuntu-20-04
11 egress:
12 - action: Allow
13 to:
14 - fqdn: "yelb-ui.yelb.carefor.some-dns.net"
15 ports:
16 - protocol: TCP
17 port: 80
18 - action: Allow
1linuxvm01:~/antrea/policies$ k apply -f fqdn-rule-no-tier.yaml
2clusternetworkpolicy.crd.antrea.io/override-attempt-failed-allow-yelb created
3linuxvm01:~/antrea/policies$ k get acnp
4NAME TIER PRIORITY DESIRED NODES CURRENT NODES AGE
5acnp-allow-yelb application 1 1 1 151m
6acnp-drop-yelb securityops 1 1 1 18h
7override-attempt-failed-allow-yelb application 1 1 1 10s
The rule will be placed in the application Tier by default, even though the user has permission to create clusternetworkpolicies...
With this, the network or security admins have full control of the network policies before and after the application Tier (ref. the Tier diagram above).
This example has only shown how to do this at the cluster level; one can also add more granular permissions at the Namespace level, as sketched below.
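For namespace-level control the pattern is the same, just with a Role and RoleBinding scoped to the namespace and targeting the namespaced Antrea NetworkPolicies instead of the cluster-wide ones. A minimal sketch, assuming the namespace yelb and the same User1:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: antrea-networkpolicies-edit
  namespace: yelb
rules:
- apiGroups: ["crd.antrea.io"]
  resources: ["networkpolicies"]   # the namespaced Antrea Network Policies (ANP)
  verbs: ["get","list","create","patch","update","delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: antrea-networkpolicies-bind
  namespace: yelb
subjects:
- kind: User
  name: sso:user1@cpod-nsxam-stc.az-stc.cloud-garage.net # Name is case sensitive
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: antrea-networkpolicies-edit
  apiGroup: rbac.authorization.k8s.io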
So far I have gone over how to manage the Antrea FeatureGates in TKG, how to configure the Antrea-NSX integration, Antrea Policies in general and how to manage RBAC. In the next two chapters I will cover two different ways we can apply the Antrea Policies. Let's get into it.
How to manage the Antrea Native Policies
As mentioned previously, Antrea Native Policies can be applied from inside the Kubernetes cluster using YAML manifests, but there is also another way to manage them: the NSX manager. This opens up a whole new way of managing security policies: centrally managed across multiple clusters wherever they are located, with easier adoption of roles and responsibilities. If NSX is already in place, chances are that NSX security policies are already in place and being managed by the network or security admins. Now they can continue doing that, but also take pod network security across the different TKG/Kubernetes clusters into consideration.
Antrea Security policies from the NSX manager
After you have connected your TKG clusters to the NSX manager (as shown earlier in this post) you will see the status of these connections in the NSX manager under System -> Fabric -> Nodes:
The status indicator is also a benefit of this integration, as it shows the status of the Antrea controller and the components responsible for the Antrea-NSX integration.
Under inventory we can get all the relevant info from the TKG clusters:
In the screenshot above, stc-tkg-cluster 1 and 2 are my TKG Antrea clusters. I can get all kinds of information such as namespaces, pods, labels, IP addresses, names and services. This information is relevant as I can use it in my policy creation, but it also gives me status on whether pods and services are up.
Antrea Cluster Network Policies - Applied from the NSX manager
With the NSX manager we can create and manage the Antrea Native Policies from the NSX graphical user interface instead of the CLI. Using NSX security groups and labels makes it not only more fun, but also much easier to maintain and to know what we are doing, as we can see the policies.
Let's create some policies from the NSX manager, micro-segmenting my demo application Yelb. This is my demo application; it consists of four pods and a service called yelb-ui where the webpage is exposed.
I know the different parts of the application (i.e. the pods) are using labels, so I will use those. First let us list them from the CLI and then find them in the NSX manager.
1linuxvm01:~/antrea/policies$ k get pods -n yelb --show-labels
2NAME READY STATUS RESTARTS AGE LABELS
3redis-server-69846b4888-5m757 1/1 Running 0 22h app=redis-server,pod-template-hash=69846b4888,tier=cache
4yelb-appserver-857c5c76d5-4cgbq 1/1 Running 0 22h app=yelb-appserver,pod-template-hash=857c5c76d5,tier=middletier
5yelb-db-6bd4fc5d9b-92rkf 1/1 Running 0 22h app=yelb-db,pod-template-hash=6bd4fc5d9b,tier=backenddb
6yelb-ui-6df49457d6-4bktw 1/1 Running 0 20h app=yelb-ui,pod-template-hash=6df49457d6,tier=frontend
Ok, there I have the labels. Fine, just for the sake of it I will find the same labels in the NSX manager also:
Now I need to create some security groups in NSX using these labels.
The first group is called acnp-yelb-frontend-ui and uses these membership criteria (I am also adding the namespace criterion, to exclude any other applications using the same labels in other namespaces):
Now hurry back to the security group and check whether there are any members.... Disappointment. Just empty:
Fear not, let us quickly create a policy with this group:
Create a new policy and set Antrea Container Clusters in the applied to field:
The actual rule:
The rule above allows my AVI Service Engines to reach the web-port on my yelb-ui pod on port 80 (http) as they are the loadbalancer for my application.
Any members in the group now?
Yes 😃
Now go ahead and create similar groups and rules (except the ports) for the other pods using their respective label.
End result:
Do they work? Let us find that out a bit later as I need something to put in my TraceFlow chapter 😄
The rules I have added above were just for the application in the namespace yelb. If I wanted to extend this ruleset to also include the same application in other clusters, it is just a matter of adding the Kubernetes cluster in the Applied To field, like this:
NSX Distributed Firewall - Kubernetes objects Policies
In addition to managing the Antrea Native Policies from the NSX manager as above, recent NSX releases have added features to support security policies enforced in the NSX Distributed Firewall that also cover these components:
With this we can create security policies in NSX using the distributed firewall to cover the above components using security groups. With this feature it is no longer necessary to dig around for information about the above components, as they are already reported into the NSX manager. Let us do an example of how such a rule can be created and how it works.
I will create a security policy based on this feature, using a Kubernetes Service in my example. I will create a security group as above, but this time with some different selections. First grab the labels from the service; I will use the yelb-ui service in my example:
1linuxvm01:~/antrea/policies$ k get svc -n yelb --show-labels
2NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE LABELS
3redis-server ClusterIP 20.10.102.8 <none> 6379/TCP 23h app=redis-server,tier=cache
4yelb-appserver ClusterIP 20.10.247.130 <none> 4567/TCP 23h app=yelb-appserver,tier=middletier
5yelb-db ClusterIP 20.10.44.17 <none> 5432/TCP 23h app=yelb-db,tier=backenddb
6yelb-ui LoadBalancer 20.10.194.179 10.13.210.10 80:30912/TCP 21h app=yelb-ui,tier=frontend
I can either decide to use app=yelb-ui or tier=frontend. Now that I have my labels I will create my security group like this:
I used the name of the service itself and the name of the namespace. This gives me this member:
Which is right...
Now create a security policy using this group, where the source is another group in which I have defined a VM running in the same NSX environment. I have also created an "any" group which contains just 0.0.0.0/0. Remember that this policy is enforced in the DFW, so there must be something running in NSX for this to work, which in my environment is not only the TKG cluster, but also the Avi Service Engines which act as LoadBalancer and Ingress for my exposed services. This is important to keep in mind, as the Avi Service Engines communicate with the TKG cluster nodes using NodePortLocal in the default port range 61000-62000 (if not changed in the Antrea configmap).
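If you are curious which node ports NodePortLocal has actually allocated for a pod, Antrea writes them into the nodeportlocal.antrea.io annotation on the pod; a small sketch using the yelb-ui label from earlier (the output is a JSON list of nodeIP/nodePort/podPort mappings):

kubectl -n yelb get pod -l app=yelb-ui -o jsonpath='{.items[0].metadata.annotations.nodeportlocal\.antrea\.io}'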
Let's see if the rule below works then:
I will adjust it to Action Drop:
Testing Yelb UI access from my Linux VM via curl and from my physical laptop's browser, the results are in:
1ubuntu02:~$ curl yelb-ui.yelb.carefor.some-dns.net
2curl: (28) Failed to connect to yelb-ui.yelb.carefor.some-dns.net port 80: Connection timed out
From my physical laptop browser:
This will be dropped even though I still have these rules in place from earlier (remember):
Now, what about the Avi Service Engines?
If we look at the rules above, the NSX Kubernetes Service rule and the Antrea Policy rules, we are doing the firewall enforcement at two different levels. When creating policies with the Antrea Native Policies, like the one just above, we are applying and enforcing inside the Kubernetes cluster; with the NSX Kubernetes Service rule we are applying and enforcing at the DFW layer. So the Avi Service Engines will first need a policy that allows them to communicate with the TKG worker nodes on specific ports/protocols; in my Yelb example above it is port 61002 and TCP. We can see that by looking in the Avi UI:
Even though the Avi SEs are behind the same DFW as the worker nodes, we need to create this policy for the SEs to be able to reach the worker nodes and allow this connection. These policies can either be very "lazy", allowing the SEs everything TCP in the range 61000-62000 towards the worker nodes, or they can be made very granular per service. The Avi SEs are automatically grouped in NSX security groups when using Avi with NSX Cloud, so explore that.
If we are not allowing the SEs this traffic, we will see this in the Avi UI:
Why is that though? I don't have a default block-all rule in my NSX environment... Well, this is because of a set of default rules created by NCP from TKG. Have a look at this rule:
What is the membership in the group used in this Drop rule?
That is all my TKG nodes including the Supervisor Control Plane nodes (the workload interface).
Now, in the Antrea Policies, we need to allow the IP addresses the SEs are using to reach the yelb-ui pod, as it is not the actual client IP that is being used, it is the SEs' data plane network.
The diagram above tries to explain the traffic flow and how it will be enforced. First the user wants to access the VIP of the Yelb UI service. This is allowed by the NSX firewall saying: yes, port 80 on IP 10.13.210.10 is OK to pass. As this VIP is realized by the Avi SEs, which are on NSX, this rule is enforced by the NSX firewall. Then the Avi SEs forward (load balance) the traffic to the worker node(s) using NodePortLocal ports in the range 61000-62000 (default); the worker nodes are also behind the same NSX DFW, so we need to allow the SEs to forward this traffic. When all of the above is allowed, we get "into" the actual TKG (Kubernetes) cluster and need to pass the Antrea Native Policies that have been applied. Remember, these rules allow the SE data plane IPs to reach the pod yelb-ui on port 80. And that's it.
Just before we wrap up this chapter and head over to the next, let us quickly see how the policies created from the NSX manager look inside the TKG cluster:
1linuxvm01:~/antrea/policies$ k get acnp
2NAME TIER PRIORITY DESIRED NODES CURRENT NODES AGE
3823fca6f-88ee-4032-8150-ac8cf22f1c93 nsx-category-infrastructure 1.000000017763571 3 3 23h
49ae2599a-3bd3-4413-849e-06f53f467559 nsx-category-application 1.0000000532916369 2 2 24h
The policies will be placed according to the NSX tiers from the UI:
If I grab one of the policies as YAML I get the actual manifest:
1linuxvm01:~/antrea/policies$ k get acnp 9ae2599a-3bd3-4413-849e-06f53f467559 -oyaml
2apiVersion: crd.antrea.io/v1alpha1
3kind: ClusterNetworkPolicy
4metadata:
5 annotations:
6 ccp-adapter.antrea.tanzu.vmware.com/display-name: Yelb-Zero-Trust
7 creationTimestamp: "2023-06-05T12:12:14Z"
8 generation: 6
9 labels:
10 ccp-adapter.antrea.tanzu.vmware.com/managedBy: ccp-adapter
11 name: 9ae2599a-3bd3-4413-849e-06f53f467559
12 resourceVersion: "404591"
13 uid: 6477e785-fde4-46ba-b0a1-5ff5f784db8c
14spec:
15 ingress:
16 - action: Allow
17 appliedTo:
18 - group: 6f39fadf-04e8-4f49-be77-da0d4005ff37
19 enableLogging: false
20 from:
21 - ipBlock:
22 cidr: 10.13.11.101/32
23 - ipBlock:
24 cidr: 10.13.11.100/32
25 name: "4084"
26 ports:
27 - port: 80
28 protocol: TCP
29 - action: Allow
30 appliedTo:
31 - group: 31cf5eab-8bcd-4305-b72d-f1a44843fd8e
32 enableLogging: false
33 from:
34 - group: 6f39fadf-04e8-4f49-be77-da0d4005ff37
35 name: "4085"
36 ports:
37 - port: 4567
38 protocol: TCP
39 - action: Allow
40 appliedTo:
41 - group: 672f4d75-c83b-4fa1-b0ab-ae414c2e8e8c
42 enableLogging: false
43 from:
44 - group: 31cf5eab-8bcd-4305-b72d-f1a44843fd8e
45 name: "4087"
46 ports:
47 - port: 5432
48 protocol: TCP
49 - action: Allow
50 appliedTo:
51 - group: 52c3548b-4758-427f-bcde-b25d36613de6
52 enableLogging: false
53 from:
54 - group: 31cf5eab-8bcd-4305-b72d-f1a44843fd8e
55 name: "4088"
56 ports:
57 - port: 6379
58 protocol: TCP
59 - action: Drop
60 appliedTo:
61 - group: d250b7d7-3041-4f7f-8fdf-c7360eee9615
62 enableLogging: false
63 from:
64 - group: d250b7d7-3041-4f7f-8fdf-c7360eee9615
65 name: "4089"
66 priority: 1.0000000532916369
67 tier: nsx-category-application
68status:
69 currentNodesRealized: 2
70 desiredNodesRealized: 2
71 observedGeneration: 6
72 phase: Realized
Antrea Security policies from the Kubernetes API
I have already covered this topic in another post here. Head over and have a look; it is also worth reading the official documentation page from Antrea here, as it contains examples and is updated with new features.
One thing I would like to use this chapter for, though, is trying to apply a policy in the NSX-added Tiers that come with the integration (explained above). Remember the Tiers?
1linuxvm01:~/antrea/policies$ k get tiers
2NAME PRIORITY AGE
3application 250 2d2h
4baseline 253 2d2h
5emergency 50 2d2h
6networkops 150 2d2h
7nsx-category-application 4 2d
8nsx-category-emergency 1 2d
9nsx-category-environment 3 2d
10nsx-category-ethernet 0 2d
11nsx-category-infrastructure 2 2d
12platform 200 2d2h
13securityops 100 2d2h
These nsx* Tiers come from the NSX manager, but can I, as a cluster owner/editor, place rules in them by default? If you look at the PRIORITY of these Tiers, the numbers are very low, meaning they are evaluated before all the default Antrea Tiers.
Let us apply the same rule as used earlier in this post, just changing the tier placement:
1apiVersion: crd.antrea.io/v1alpha1
2kind: ClusterNetworkPolicy
3metadata:
4 name: acnp-nsx-tier-from-kubectl
5spec:
6 priority: 1
7 tier: nsx-category-environment
8 appliedTo:
9 - podSelector:
10 matchLabels:
11 app: ubuntu-20-04
12 egress:
13 - action: Allow
14 to:
15 - fqdn: "yelb-ui.yelb.carefor.some-dns.net"
16 ports:
17 - protocol: TCP
18 port: 80
19 - action: Allow
1linuxvm01:~/antrea/policies$ k apply -f fqdn-rule-nsx-tier.yaml
2Error from server: error when creating "fqdn-rule-nsx-tier.yaml": admission webhook "acnpvalidator.antrea.io" denied the request: user not authorized to access Tier nsx-category-environment
Even though I am the cluster owner/admin/superuser, I am not allowed to place any rules in these nsx Tiers. This gives us further control and mechanisms to support both NSX-created Antrea policies and Antrea policies applied with kubectl, and it allows good control of security enforcement by roles in the organization.
Antrea Dashboard
As the Octant dashboard is no more, Antrea now has its own dashboard. It's very easy to deploy. Let me quickly go through it. Read more about it here.
1# Add the helm charts
2helm repo add antrea https://charts.antrea.io
3helm repo update
Install it:
1helm install antrea-ui antrea/antrea-ui --namespace kube-system
1linuxvm01:~/antrea/policies$ helm repo add antrea https://charts.antrea.io
2"antrea" has been added to your repositories
3linuxvm01:~/antrea/policies$ helm repo update
4Hang tight while we grab the latest from your chart repositories...
5...Successfully got an update from the "ako" chart repository
6...Successfully got an update from the "antrea" chart repository
7...Successfully got an update from the "bitnami" chart repository
8Update Complete. ⎈Happy Helming!⎈
9linuxvm01:~/antrea/policies$ helm install antrea-ui antrea/antrea-ui --namespace kube-system
10NAME: antrea-ui
11LAST DEPLOYED: Tue Jun 6 12:56:21 2023
12NAMESPACE: kube-system
13STATUS: deployed
14REVISION: 1
15TEST SUITE: None
16NOTES:
17The Antrea UI has been successfully installed
18
19You are using version 0.1.1
20
21To access the UI, forward a local port to the antrea-ui service, and connect to
22that port locally with your browser:
23
24 $ kubectl -n kube-system port-forward service/antrea-ui 3000:3000
25
26After running the command above, access "http://localhost:3000" in your browser.For the Antrea documentation, please visit https://antrea.io
This will spin up a new pod and a ClusterIP service.
1linuxvm01:~/antrea/policies$ k get pods -n kube-system
2NAME READY STATUS RESTARTS AGE
3antrea-agent-9rvqc 2/2 Running 0 2d16h
4antrea-agent-m7rg7 2/2 Running 0 2d16h
5antrea-agent-wvpp8 2/2 Running 0 2d16h
6antrea-controller-6d56b6d664-vlmh2 1/1 Running 0 2d16h
7antrea-ui-9c89486f4-msw6m 2/2 Running 0 62s
1linuxvm01:~/antrea/policies$ k get svc -n kube-system
2NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
3antrea ClusterIP 20.10.96.45 <none> 443/TCP 2d16h
4antrea-ui ClusterIP 20.10.228.144 <none> 3000/TCP 95s
Now, instead of exposing the service as a NodePort, I am creating a Service of type LoadBalancer for it like this:
1apiVersion: v1
2kind: Service
3metadata:
4 name: antrea-dashboard-ui
5 labels:
6 app: antrea-ui
7 namespace: kube-system
8spec:
9 loadBalancerClass: ako.vmware.com/avi-lb
10 type: LoadBalancer
11 ports:
12 - port: 80
13 protocol: TCP
14 targetPort: 3000
15 selector:
16 app: antrea-ui
Apply it:
1linuxvm01:~/antrea$ k apply -f antrea-dashboard-lb-yaml
2service/antrea-dashboard-ui created
3linuxvm01:~/antrea$ k get svc -n kube-system
4NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
5antrea ClusterIP 20.10.96.45 <none> 443/TCP 2d16h
6antrea-dashboard-ui LoadBalancer 20.10.76.243 10.13.210.12 80:31334/TCP 7s
7antrea-ui ClusterIP 20.10.228.144 <none> 3000/TCP 8m47s
Now access it through my browser:
Default password is admin
Good overview:
The option to do Traceflow:
Oops, dropped by a NetworkPolicy... Where does that come from 🤔 ... More on this later.
Antrea Network Monitoring
Being able to know what is going on is crucial when planning security policies, but also for knowing whether the policies are working and being enforced. With that information available we can tell if we are compliant with the policies applied; without any network flow information we are kind of flying blind. Luckily, Antrea is fully capable of reporting full flow information and exporting it. To be able to export the flow information we need to enable the FlowExporter FeatureGate:
1apiVersion: cni.tanzu.vmware.com/v1alpha1
2kind: AntreaConfig
3metadata:
4 name: stc-tkg-cluster-1-antrea-package
5 namespace: ns-stc-1
6spec:
7 antrea:
8 config:
9 featureGates:
10 AntreaProxy: true
11 EndpointSlice: false
12 AntreaPolicy: true
13 FlowExporter: true #This needs to be enabled
14 Egress: true
15 NodePortLocal: true
16 AntreaTraceflow: true
17 NetworkPolicyStats: true
Flow-Exporter - IPFIX
From the official Antrea documentation:
Antrea is a Kubernetes network plugin that provides network connectivity and security features for Pod workloads. Considering the scale and dynamism of Kubernetes workloads in a cluster, Network Flow Visibility helps in the management and configuration of Kubernetes resources such as Network Policy, Services, Pods etc., and thereby provides opportunities to enhance the performance and security aspects of Pod workloads.
For visualizing the network flows, Antrea monitors the flows in Linux conntrack module. These flows are converted to flow records, and then flow records are post-processed before they are sent to the configured external flow collector. High-level design is given below:
From the Antrea official documentation again:
Flow Exporter
In Antrea, the basic building block for the Network Flow Visibility is the Flow Exporter. Flow Exporter operates within Antrea Agent; it builds and maintains a connection store by polling and dumping flows from conntrack module periodically. Connections from the connection store are exported to the Flow Aggregator Service using the IPFIX protocol, and for this purpose we use the IPFIX exporter process from the go-ipfix library.
Read more Network Flow Visibility in Antrea here.
Traceflow
When troubleshooting network issues or firewall rules (is my traffic being blocked or allowed?) it is very handy to have the option to do a Traceflow. Antrea supports Traceflow. To be able to use it, the AntreaTraceflow feature gate needs to be enabled if it is not already:
1apiVersion: cni.tanzu.vmware.com/v1alpha1
2kind: AntreaConfig
3metadata:
4 name: stc-tkg-cluster-1-antrea-package
5 namespace: ns-stc-1
6spec:
7 antrea:
8 config:
9 featureGates:
10 AntreaProxy: true
11 EndpointSlice: false
12 AntreaPolicy: true
13 FlowExporter: true
14 Egress: true
15 NodePortLocal: true
16 AntreaTraceflow: true #This needs to be enabled
17 NetworkPolicyStats: true
Now that it is enabled, how can we perform Traceflow?
We can do Traceflow using kubectl, the Antrea UI, or even from the NSX manager if using the NSX/Antrea integration.
Traceflow in Antrea supports the following:
- Source: pod, protocol (TCP/UDP/ICMP) and port numbers
- Destination: pod, service, ip, protocol (TCP/UDP/ICMP) and port numbers
- One-time or live Traceflow
Now, to get back to the Antrea policies I created earlier, I want to test whether they are actually in use and enforced. So let me do a Traceflow from my famous Yelb-ui pod and see if it can reach the application pod on its allowed port. Remember that the UI pod needed to communicate with the appserver pod on TCP 4567 and that I created a rule that only allows this; everything else is blocked.
If I want to do Traceflow from kubectl, this is an example to test if port 4567 is allowed from the ui pod to the appserver pod:
1apiVersion: crd.antrea.io/v1alpha1
2kind: Traceflow
3metadata:
4 name: tf-test
5spec:
6 source:
7 namespace: yelb
8 pod: yelb-ui-6df49457d6-m5clv
9 destination:
10 namespace: yelb
11 pod: yelb-appserver-857c5c76d5-4cd86
12 # destination can also be an IP address ('ip' field) or a Service name ('service' field); the 3 choices are mutually exclusive.
13 packet:
14 ipHeader: # If ipHeader/ipv6Header is not set, the default value is IPv4+ICMP.
15 protocol: 6 # Protocol here can be 6 (TCP), 17 (UDP) or 1 (ICMP), default value is 1 (ICMP)
16 transportHeader:
17 tcp:
18 srcPort: 0 # Source port needs to be set when Protocol is TCP/UDP.
19 dstPort: 4567 # Destination port needs to be set when Protocol is TCP/UDP.
20 flags: 2 # Construct a SYN packet: 2 is also the default value when the flags field is omitted.
Now apply it and get the output:
1linuxvm01:~/antrea/policies$ k apply -f traceflow.yaml
2traceflow.crd.antrea.io/tf-test created
1linuxvm01:~/antrea/policies$ k get traceflows.crd.antrea.io -n yelb tf-test -oyaml
2apiVersion: crd.antrea.io/v1alpha1
3kind: Traceflow
4metadata:
5 annotations:
6 kubectl.kubernetes.io/last-applied-configuration: |
7 {"apiVersion":"crd.antrea.io/v1alpha1","kind":"Traceflow","metadata":{"annotations":{},"name":"tf-test"},"spec":{"destination":{"namespace":"yelb","pod":"yelb-appserver-857c5c76d5-4cd86"},"packet":{"ipHeader":{"protocol":6},"transportHeader":{"tcp":{"dstPort":4567,"flags":2,"srcPort":0}}},"source":{"namespace":"yelb","pod":"yelb-ui-6df49457d6-m5clv"}}}
8 creationTimestamp: "2023-06-07T12:47:14Z"
9 generation: 1
10 name: tf-test
11 resourceVersion: "904386"
12 uid: c550596b-ed43-4bab-a6f1-d23e90d35f84
13spec:
14 destination:
15 namespace: yelb
16 pod: yelb-appserver-857c5c76d5-4cd86
17 packet:
18 ipHeader:
19 protocol: 6
20 transportHeader:
21 tcp:
22 dstPort: 4567
23 flags: 2
24 srcPort: 0
25 source:
26 namespace: yelb
27 pod: yelb-ui-6df49457d6-m5clv
28status:
29 phase: Succeeded
30 results:
31 - node: stc-tkg-cluster-1-node-pool-01-p6nms-84c55d4574-5r8gj
32 observations:
33 - action: Received
34 component: Forwarding
35 - action: Forwarded
36 component: NetworkPolicy
37 componentInfo: IngressRule
38 networkPolicy: AntreaClusterNetworkPolicy:9ae2599a-3bd3-4413-849e-06f53f467559
39 - action: Delivered
40 component: Forwarding
41 componentInfo: Output
42 timestamp: 1686142036
43 - node: stc-tkg-cluster-1-node-pool-01-p6nms-84c55d4574-bpx7s
44 observations:
45 - action: Forwarded
46 component: SpoofGuard
47 - action: Forwarded
48 component: Forwarding
49 componentInfo: Output
50 tunnelDstIP: 10.13.82.39
51 timestamp: 1686142036
52 startTime: "2023-06-07T12:47:14Z"
That was a success; the NetworkPolicy observation shows - action: Forwarded and the packet was delivered.
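The NetworkPolicy in the observations is referenced by its UID. If you want to map that UID back to a policy name, a quick lookup like this should do (assuming it is an Antrea ClusterNetworkPolicy, as the prefix suggests):

kubectl get acnp -o custom-columns=NAME:.metadata.name,UID:.metadata.uid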
Now I want to run it again but with another port. So I change the above yaml to use port 4568 (which should not be allowed):
1linuxvm01:~/antrea/policies$ k get traceflows.crd.antrea.io -n yelb tf-test -oyaml
2apiVersion: crd.antrea.io/v1alpha1
3kind: Traceflow
4metadata:
5 annotations:
6 kubectl.kubernetes.io/last-applied-configuration: |
7 {"apiVersion":"crd.antrea.io/v1alpha1","kind":"Traceflow","metadata":{"annotations":{},"name":"tf-test"},"spec":{"destination":{"namespace":"yelb","pod":"yelb-appserver-857c5c76d5-4cd86"},"packet":{"ipHeader":{"protocol":6},"transportHeader":{"tcp":{"dstPort":4568,"flags":2,"srcPort":0}}},"source":{"namespace":"yelb","pod":"yelb-ui-6df49457d6-m5clv"}}}
8 creationTimestamp: "2023-06-07T12:53:59Z"
9 generation: 1
10 name: tf-test
11 resourceVersion: "905571"
12 uid: d76ec419-3272-4595-98a5-72a49adce9d3
13spec:
14 destination:
15 namespace: yelb
16 pod: yelb-appserver-857c5c76d5-4cd86
17 packet:
18 ipHeader:
19 protocol: 6
20 transportHeader:
21 tcp:
22 dstPort: 4568
23 flags: 2
24 srcPort: 0
25 source:
26 namespace: yelb
27 pod: yelb-ui-6df49457d6-m5clv
28status:
29 phase: Succeeded
30 results:
31 - node: stc-tkg-cluster-1-node-pool-01-p6nms-84c55d4574-bpx7s
32 observations:
33 - action: Forwarded
34 component: SpoofGuard
35 - action: Forwarded
36 component: Forwarding
37 componentInfo: Output
38 tunnelDstIP: 10.13.82.39
39 timestamp: 1686142441
40 - node: stc-tkg-cluster-1-node-pool-01-p6nms-84c55d4574-5r8gj
41 observations:
42 - action: Received
43 component: Forwarding
44 - action: Dropped
45 component: NetworkPolicy
46 componentInfo: IngressMetric
47 networkPolicy: AntreaClusterNetworkPolicy:9ae2599a-3bd3-4413-849e-06f53f467559
48 timestamp: 1686142441
49 startTime: "2023-06-07T12:53:59Z"
That was also a success, as the packet was dropped by design: - action: Dropped
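As a side note, the same one-off test can also be done with antctl instead of writing a Traceflow YAML. A minimal sketch, run from inside the antrea-controller Pod and reusing the pod names from above:

kubectl exec -n kube-system -it antrea-controller-6d56b6d664-vlmh2 -- \
  antctl traceflow -S yelb/yelb-ui-6df49457d6-m5clv -D yelb/yelb-appserver-857c5c76d5-4cd86 \
  -f tcp,tcp_dst=4567,tcp_flags=2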
It's great being able to do this from kubectl if one quickly needs to check something before starting to look elsewhere and creating a support ticket 😃, or if one doesn't have access to other tools like the Antrea UI or the NSX manager. Speaking of the NSX manager, let us do the exact same trace from the NSX manager GUI:
Head over to Plan & Troubleshoot -> Traffic Analysis:
Results:
Now I change it to another port and test again:
Dropped again.
The same procedure can also be done from the Antrea UI as shown above, now with a port that is allowed:
To read more on Traceflow in Antrea, head over here.
Theia
Now that we know it's possible to export all flows using IPFIX, I thought it would be interesting to showcase how the flow information can be presented with a solution called Theia. From the official docs:
Theia is a network observability and analytics platform for Kubernetes. It is built on top of Antrea, and consumes network flows exported by Antrea to provide fine-grained visibility into the communication and NetworkPolicies among Pods and Services in a Kubernetes cluster.
To install Theia I have followed the instructions from here, which is also a great place to read more about Theia.
Theia is installed using Helm, start by adding the charts, do an update and deploy:
1linuxvm01:~/antrea$ helm repo add antrea https://charts.antrea.io
2"antrea" already exists with the same configuration, skipping
3linuxvm01:~/antrea$ helm repo update
4Hang tight while we grab the latest from your chart repositories...
5...Successfully got an update from the "antrea" chart repository
6Update Complete. ⎈Happy Helming!⎈
Make sure that FlowExporter has been enabled; if not, apply an AntreaConfig that enables it:
1apiVersion: cni.tanzu.vmware.com/v1alpha1
2kind: AntreaConfig
3metadata:
4 name: stc-tkg-cluster-1-antrea-package
5 namespace: ns-stc-1
6spec:
7 antrea:
8 config:
9 featureGates:
10 AntreaProxy: true
11 EndpointSlice: false
12 AntreaPolicy: true
13 FlowExporter: true #Enable this!
14 Egress: true
15 NodePortLocal: true
16 AntreaTraceflow: true
17 NetworkPolicyStats: true
After the config has been applied, delete the Antrea agent and controller pods so they pick up the new ConfigMap:
1linuxvm01:~/antrea/theia$ k delete pod -n kube-system -l app=antrea
2pod "antrea-agent-58nn2" deleted
3pod "antrea-agent-cnq9p" deleted
4pod "antrea-agent-sx6vr" deleted
5pod "antrea-controller-6d56b6d664-km64t" deleted
After the Helm charts have been added, I start by installing the Flow Aggregator:
1helm install flow-aggregator antrea/flow-aggregator --set clickHouse.enable=true,recordContents.podLabels=true -n flow-aggregator --create-namespace
As usual with Helm charts, if there are any specific settings you would like to change, get the Helm chart values for your specific chart first and refer to them by using -f values.yaml:
1linuxvm01:~/antrea/theia$ helm show values antrea/flow-aggregator > flow-agg-values.yaml
I don't have any specifics I need to change for this one, so I will just deploy using the defaults:
1linuxvm01:~/antrea/theia$ helm install flow-aggregator antrea/flow-aggregator --set clickHouse.enable=true,recordContents.podLabels=true -n flow-aggregator --create-namespace
2NAME: flow-aggregator
3LAST DEPLOYED: Tue Jun 6 21:28:49 2023
4NAMESPACE: flow-aggregator
5STATUS: deployed
6REVISION: 1
7TEST SUITE: None
8NOTES:
9The Antrea Flow Aggregator has been successfully installed
10
11You are using version 1.12.0
12
13For the Antrea documentation, please visit https://antrea.io
Now what has happened in my TKG cluster:
1linuxvm01:~/antrea/theia$ k get pods -n flow-aggregator
2NAME READY STATUS RESTARTS AGE
3flow-aggregator-5b4c69885f-mklm5 1/1 Running 1 (10s ago) 22s
4linuxvm01:~/antrea/theia$ k get pods -n flow-aggregator
5NAME READY STATUS RESTARTS AGE
6flow-aggregator-5b4c69885f-mklm5 1/1 Running 1 (13s ago) 25s
7linuxvm01:~/antrea/theia$ k get pods -n flow-aggregator
8NAME READY STATUS RESTARTS AGE
9flow-aggregator-5b4c69885f-mklm5 0/1 Error 1 (14s ago) 26s
10linuxvm01:~/antrea/theia$ k get pods -n flow-aggregator
11NAME READY STATUS RESTARTS AGE
12flow-aggregator-5b4c69885f-mklm5 0/1 CrashLoopBackOff 3 (50s ago) 60s
Well, that didn't go so well...
The issue is that the Flow Aggregator is looking for the ClickHouse service, which has not been created yet, and the Pod will keep crashing until it is deployed. Deploying it is our next step.
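If you want to see for yourself what it is unhappy about, the logs from the crashed container will show the connection errors:

kubectl -n flow-aggregator logs deploy/flow-aggregator --previous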
1linuxvm01:~/antrea/theia$ helm install theia antrea/theia --set sparkOperator.enable=true,theiaManager.enable=true -n flow-visibility --create-namespace
2
3NAME: theia
4LAST DEPLOYED: Tue Jun 6 22:02:37 2023
5NAMESPACE: flow-visibility
6STATUS: deployed
7REVISION: 1
8TEST SUITE: None
9NOTES:
10Theia has been successfully installed
11
12You are using version 0.6.0
13
14For the Antrea documentation, please visit https://antrea.io
What has been created now:
1linuxvm01:~/antrea/theia$ k get pods -n flow-visibility
2NAME READY STATUS RESTARTS AGE
3chi-clickhouse-clickhouse-0-0-0 2/2 Running 0 8m52s
4grafana-684d8948b-c6wzn 1/1 Running 0 8m56s
5theia-manager-5d8d6b86b7-cbxrz 1/1 Running 0 8m56s
6theia-spark-operator-54d9ddd544-nqhqd 1/1 Running 0 8m56s
7zookeeper-0 1/1 Running 0 8m56s
Now flow-aggregator should also be in a running state; if not, just delete the pod and it should get back on its feet.
1linuxvm01:~/antrea/theia$ k get pods -n flow-aggregator
2NAME READY STATUS RESTARTS AGE
3flow-aggregator-5b4c69885f-xhdkx 1/1 Running 0 5m2s
So, now it's all about getting access to the Grafana dashboard. I will just expose this with a service of type LoadBalancer, as it is only exposed with a NodePort out of the box:
1linuxvm01:~/antrea/theia$ k get svc -n flow-visibility
2NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
3chi-clickhouse-clickhouse-0-0 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 8m43s
4clickhouse-clickhouse ClusterIP 20.10.136.211 <none> 8123/TCP,9000/TCP 10m
5grafana NodePort 20.10.172.165 <none> 3000:30096/TCP 10m
6theia-manager ClusterIP 20.10.156.217 <none> 11347/TCP 10m
7zookeeper ClusterIP 20.10.219.137 <none> 2181/TCP,7000/TCP 10m
8zookeepers ClusterIP None <none> 2888/TCP,3888/TCP 10m
So let us create a LoadBalancer service for this:
1apiVersion: v1
2kind: Service
3metadata:
4 name: theia-dashboard-ui
5 labels:
6 app: grafana
7 namespace: flow-visibility
8spec:
9 loadBalancerClass: ako.vmware.com/avi-lb
10 type: LoadBalancer
11 ports:
12 - port: 80
13 protocol: TCP
14 targetPort: 3000
15 selector:
16 app: grafana
1linuxvm01:~/antrea/theia$ k get svc -n flow-visibility
2NAME                 TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)          AGE
3grafana              NodePort       20.10.172.165   <none>         3000:30096/TCP   15m
4theia-dashboard-ui   LoadBalancer   20.10.24.174    10.13.210.13   80:32075/TCP     13s
Let's try to access it through the browser:
Great. Theia comes with a couple of predefined dashboards that are interesting to start out with, so let me list some screenshots from the predefined dashboards below:
The homepage:
List of dashboards:
Flow_Records_Dashboard:
Network_Topology_Dashboard:
Network Policy Recommendation
From the official docs:
Theia NetworkPolicy Recommendation recommends the NetworkPolicy configuration to secure Kubernetes network and applications. It analyzes the network flows collected by Grafana Flow Collector to generate Kubernetes NetworkPolicies or Antrea NetworkPolicies. This feature assists cluster administrators and app developers in securing their applications according to Zero Trust principles.
I like the sound of that. Let us try it out.
The first thing I need to install is the Theia CLI; it, and the installation instructions, can be found here.
Theia CLI
1curl -Lo ./theia "https://github.com/antrea-io/theia/releases/download/v0.6.0/theia-$(uname)-x86_64"
2chmod +x ./theia
3mv ./theia /usr/local/bin/theia
4theia help
1linuxvm01:~/antrea/theia$ curl -Lo ./theia "https://github.com/antrea-io/theia/releases/download/v0.6.0/theia-$(uname)-x86_64"
2 % Total % Received % Xferd Average Speed Time Time Time Current
3 Dload Upload Total Spent Left Speed
4 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
5100 37.9M 100 37.9M 0 0 11.6M 0 0:00:03 0:00:03 --:--:-- 17.2M
6linuxvm01:~/antrea/theia$ chmod +x ./theia
7linuxvm01:~/antrea/theia$ sudo cp theia /usr/local/bin/theia
8linuxvm01:~/antrea/theia$ theia help
9theia is the command line tool for Theia which provides access
10to Theia network flow visibility capabilities
11
12Usage:
13 theia [command]
14
15Available Commands:
16 clickhouse Commands of Theia ClickHouse feature
17 completion Generate the autocompletion script for the specified shell
18 help Help about any command
19 policy-recommendation Commands of Theia policy recommendation feature
20 supportbundle Generate support bundle
21 throughput-anomaly-detection Commands of Theia throughput anomaly detection feature
22 version Show Theia CLI version
23
24Flags:
25 -h, --help help for theia
26 -k, --kubeconfig string absolute path to the k8s config file, will use $KUBECONFIG if not specified
27 -v, --verbose int set verbose level
28
29Use "theia [command] --help" for more information about a command.
These are the available commands for the policy-recommendation option:
1andreasm@linuxvm01:~/antrea/theia$ theia policy-recommendation --help
2Command group of Theia policy recommendation feature.
3Must specify a subcommand like run, status or retrieve.
4
5Usage:
6 theia policy-recommendation [flags]
7 theia policy-recommendation [command]
8
9Aliases:
10 policy-recommendation, pr
11
12Available Commands:
13 delete Delete a policy recommendation job
14 list List all policy recommendation jobs
15 retrieve Get the recommendation result of a policy recommendation job
16 run Run a new policy recommendation job
17 status Check the status of a policy recommendation job
18
19
20Use "theia policy-recommendation [command] --help" for more information about a command.
These are the options for the run command:
1linuxvm01:~/antrea/theia$ theia policy-recommendation run --help
2Run a new policy recommendation job.
3Must finish the deployment of Theia first
4
5Usage:
6 theia policy-recommendation run [flags]
7
8Examples:
9Run a policy recommendation job with default configuration
10$ theia policy-recommendation run
11Run an initial policy recommendation job with policy type anp-deny-applied and limit on last 10k flow records
12$ theia policy-recommendation run --type initial --policy-type anp-deny-applied --limit 10000
13Run an initial policy recommendation job with policy type anp-deny-applied and limit on flow records from 2022-01-01 00:00:00 to 2022-01-31 23:59:59.
14$ theia policy-recommendation run --type initial --policy-type anp-deny-applied --start-time '2022-01-01 00:00:00' --end-time '2022-01-31 23:59:59'
15Run a policy recommendation job with default configuration but doesn't recommend toServices ANPs
16$ theia policy-recommendation run --to-services=false
17
18
19Flags:
20 --driver-core-request string Specify the CPU request for the driver Pod. Values conform to the Kubernetes resource quantity convention.
21 Example values include 0.1, 500m, 1.5, 5, etc. (default "200m")
22 --driver-memory string Specify the memory request for the driver Pod. Values conform to the Kubernetes resource quantity convention.
23 Example values include 512M, 1G, 8G, etc. (default "512M")
24 -e, --end-time string The end time of the flow records considered for the policy recommendation.
25 Format is YYYY-MM-DD hh:mm:ss in UTC timezone. No limit of the end time of flow records by default.
26 --exclude-labels Enable this option will exclude automatically generated Pod labels including 'pod-template-hash',
27 'controller-revision-hash', 'pod-template-generation' during policy recommendation. (default true)
28 --executor-core-request string Specify the CPU request for the executor Pod. Values conform to the Kubernetes resource quantity convention.
29 Example values include 0.1, 500m, 1.5, 5, etc. (default "200m")
30 --executor-instances int32 Specify the number of executors for the Spark application. Example values include 1, 2, 8, etc. (default 1)
31 --executor-memory string Specify the memory request for the executor Pod. Values conform to the Kubernetes resource quantity convention.
32 Example values include 512M, 1G, 8G, etc. (default "512M")
33 -f, --file string The file path where you want to save the result. It can only be used when wait is enabled.
34 -h, --help help for run
35 -l, --limit int The limit on the number of flow records read from the database. 0 means no limit.
36 -n, --ns-allow-list string List of default allow Namespaces.
37 If no Namespaces provided, Traffic inside Antrea CNI related Namespaces: ['kube-system', 'flow-aggregator',
38 'flow-visibility'] will be allowed by default.
39 -p, --policy-type string Types of generated NetworkPolicy.
40 Currently we have 3 generated NetworkPolicy types:
41 anp-deny-applied: Recommending allow ANP/ACNP policies, with default deny rules only on Pods which have an allow rule applied.
42 anp-deny-all: Recommending allow ANP/ACNP policies, with default deny rules for whole cluster.
43 k8s-np: Recommending allow K8s NetworkPolicies. (default "anp-deny-applied")
44 -s, --start-time string The start time of the flow records considered for the policy recommendation.
45 Format is YYYY-MM-DD hh:mm:ss in UTC timezone. No limit of the start time of flow records by default.
46 --to-services Use the toServices feature in ANP and recommendation toServices rules for Pod-to-Service flows,
47 only works when option is anp-deny-applied or anp-deny-all. (default true)
48 -t, --type string {initial|subsequent} Indicates this recommendation is an initial recommendion or a subsequent recommendation job. (default "initial")
49 --wait Enable this option will hold and wait the whole policy recommendation job finishes.
50
51Global Flags:
52 -k, --kubeconfig string absolute path to the k8s config file, will use $KUBECONFIG if not specified
53 --use-cluster-ip Enable this option will use ClusterIP instead of port forwarding when connecting to the Theia
54 Manager Service. It can only be used when running in cluster.
55 -v, --verbose int set verbose level
To generate some output, I will just run theia policy-recommendation run --type initial --policy-type anp-deny-applied --limit 10000.
1linuxvm01:~/antrea/theia$ theia policy-recommendation run --type initial --policy-type anp-deny-applied --limit 10000
2Successfully created policy recommendation job with name pr-e81a42e4-013a-4cf6-be43-b1ee48ea9a18
Let's check the status:
1# First I will list all the runs to get the name
2linuxvm01:~/antrea/theia$ theia policy-recommendation list
3CreationTime CompletionTime Name Status
42023-06-08 07:50:28 N/A pr-e81a42e4-013a-4cf6-be43-b1ee48ea9a18 SCHEDULED
5# Then I will check the status on the specific run
6linuxvm01:~/antrea/theia$ theia policy-recommendation status pr-e81a42e4-013a-4cf6-be43-b1ee48ea9a18
7Status of this policy recommendation job is SCHEDULED
Seems like I have to wait a bit; time to grab a coffee.
Just poured my coffee and wanted to check again:
1linuxvm01:~/antrea/theia$ theia policy-recommendation status pr-e81a42e4-013a-4cf6-be43-b1ee48ea9a18
2Status of this policy recommendation job is RUNNING: 0/1 (0%) stages completed
Alright, it is running.
Now time to drink the coffee.
Let's check in on it again:
1linuxvm01:~/antrea/theia$ theia policy-recommendation status pr-e81a42e4-013a-4cf6-be43-b1ee48ea9a18
2Status of this policy recommendation job is COMPLETED
Oh yes, now I am excited to see which policies it recommends:
1linuxvm01:~/antrea/theia$ theia policy-recommendation retrieve pr-e81a42e4-013a-4cf6-be43-b1ee48ea9a18
2apiVersion: crd.antrea.io/v1alpha1
3kind: ClusterNetworkPolicy
4metadata:
5 name: recommend-reject-acnp-9np4b
6spec:
7 appliedTo:
8 - namespaceSelector:
9 matchLabels:
10 kubernetes.io/metadata.name: yelb
11 podSelector:
12 matchLabels:
13 app: traffic-generator
14 egress:
15 - action: Reject
16 to:
17 - podSelector: {}
18 ingress:
19 - action: Reject
20 from:
21 - podSelector: {}
22 priority: 5
23 tier: Baseline
24---
25apiVersion: crd.antrea.io/v1alpha1
26kind: ClusterNetworkPolicy
27metadata:
28 name: recommend-reject-acnp-ega4b
29spec:
30 appliedTo:
31 - namespaceSelector:
32 matchLabels:
33 kubernetes.io/metadata.name: avi-system
34 podSelector:
35 matchLabels:
36 app.kubernetes.io/instance: ako-1685884771
37 app.kubernetes.io/name: ako
38 statefulset.kubernetes.io/pod-name: ako-0
39 egress:
40 - action: Reject
41 to:
42 - podSelector: {}
43 ingress:
44 - action: Reject
45 from:
46 - podSelector: {}
47 priority: 5
48 tier: Baseline
49---
50apiVersion: crd.antrea.io/v1alpha1
51kind: NetworkPolicy
52metadata:
53 name: recommend-allow-anp-nl6re
54 namespace: yelb
55spec:
56 appliedTo:
57 - podSelector:
58 matchLabels:
59 app: traffic-generator
60 egress:
61 - action: Allow
62 ports:
63 - port: 80
64 protocol: TCP
65 to:
66 - ipBlock:
67 cidr: 10.13.210.10/32
68 ingress: []
69 priority: 5
70 tier: Application
71---
72apiVersion: crd.antrea.io/v1alpha1
73kind: NetworkPolicy
74metadata:
75 name: recommend-allow-anp-2ifjo
76 namespace: avi-system
77spec:
78 appliedTo:
79 - podSelector:
80 matchLabels:
81 app.kubernetes.io/instance: ako-1685884771
82 app.kubernetes.io/name: ako
83 statefulset.kubernetes.io/pod-name: ako-0
84 egress:
85 - action: Allow
86 ports:
87 - port: 443
88 protocol: TCP
89 to:
90 - ipBlock:
91 cidr: 172.24.3.50/32
92 ingress: []
93 priority: 5
94 tier: Application
95---
96apiVersion: crd.antrea.io/v1alpha1
97kind: ClusterNetworkPolicy
98metadata:
99 name: recommend-allow-acnp-kube-system-kaoh6
100spec:
101 appliedTo:
102 - namespaceSelector:
103 matchLabels:
104 kubernetes.io/metadata.name: kube-system
105 egress:
106 - action: Allow
107 to:
108 - podSelector: {}
109 ingress:
110 - action: Allow
111 from:
112 - podSelector: {}
113 priority: 5
114 tier: Platform
115---
116apiVersion: crd.antrea.io/v1alpha1
117kind: ClusterNetworkPolicy
118metadata:
119 name: recommend-allow-acnp-flow-aggregator-dnvhc
120spec:
121 appliedTo:
122 - namespaceSelector:
123 matchLabels:
124 kubernetes.io/metadata.name: flow-aggregator
125 egress:
126 - action: Allow
127 to:
128 - podSelector: {}
129 ingress:
130 - action: Allow
131 from:
132 - podSelector: {}
133 priority: 5
134 tier: Platform
135---
136apiVersion: crd.antrea.io/v1alpha1
137kind: ClusterNetworkPolicy
138metadata:
139 name: recommend-allow-acnp-flow-visibility-sqjwf
140spec:
141 appliedTo:
142 - namespaceSelector:
143 matchLabels:
144 kubernetes.io/metadata.name: flow-visibility
145 egress:
146 - action: Allow
147 to:
148 - podSelector: {}
149 ingress:
150 - action: Allow
151 from:
152 - podSelector: {}
153 priority: 5
154 tier: Platform
155---
156apiVersion: crd.antrea.io/v1alpha1
157kind: ClusterNetworkPolicy
158metadata:
159 name: recommend-reject-acnp-hmjt8
160spec:
161 appliedTo:
162 - namespaceSelector:
163 matchLabels:
164 kubernetes.io/metadata.name: yelb-2
165 podSelector:
166 matchLabels:
167 app: yelb-ui
168 tier: frontend
169 egress:
170 - action: Reject
171 to:
172 - podSelector: {}
173 ingress:
174 - action: Reject
175 from:
176 - podSelector: {}
177 priority: 5
178 tier: Baseline
Ok, well. I appreciate the output, but I would need to make some modifications before applying it. As my lab is not generating that much traffic, not all the flows needed for a better recommendation are there. My traffic-generator is not doing a good enough job here, so I will need to generate some more activity for the recommendation engine to have enough flows to consider.
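If I did want to use the policies, the retrieve command can also write the result straight to a file, which can then be reviewed, edited and applied. A small sketch with the job name from above (check theia policy-recommendation retrieve --help for the exact flags):

theia policy-recommendation retrieve pr-e81a42e4-013a-4cf6-be43-b1ee48ea9a18 --file recommended-policies.yaml
# Review and edit the file, then apply the policies
kubectl apply -f recommended-policies.yaml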
Throughput Anomaly Detection
From the official docs:
From Theia v0.5, Theia supports Throughput Anomaly Detection. Throughput Anomaly Detection (TAD) is a technique for understanding and reporting the throughput abnormalities in the network traffic. It analyzes the network flows collected by Grafana Flow Collector to report anomalies in the network. TAD uses three algorithms to find the anomalies in network flows such as ARIMA, EWMA, and DBSCAN. These anomaly analyses help the user to find threats if present.
Let's try it out. I already have the dependencies and the Theia CLI installed.
What are the different commands available?
1linuxvm01:~/antrea/theia$ theia throughput-anomaly-detection --help
2Command group of Theia throughput anomaly detection feature.
3 Must specify a subcommand like run, list, delete, status or retrieve
4
5Usage:
6 theia throughput-anomaly-detection [flags]
7 theia throughput-anomaly-detection [command]
8
9Aliases:
10 throughput-anomaly-detection, tad
11
12Available Commands:
13 delete Delete a anomaly detection job
14 list List all anomaly detection jobs
15 retrieve Get the result of an anomaly detection job
16 run throughput anomaly detection using Algo
17 status Check the status of a anomaly detection job
18
19Flags:
20 -h, --help help for throughput-anomaly-detection
21 --use-cluster-ip Enable this option will use ClusterIP instead of port forwarding when connecting to the Theia
22 Manager Service. It can only be used when running in cluster.
23
24Global Flags:
25 -k, --kubeconfig string absolute path to the k8s config file, will use $KUBECONFIG if not specified
26 -v, --verbose int set verbose level
27
28Use "theia throughput-anomaly-detection [command] --help" for more information about a command.
1linuxvm01:~/antrea/theia$ theia throughput-anomaly-detection run --help
2throughput anomaly detection using algorithms, currently supported algorithms are EWMA, ARIMA and DBSCAN
3
4Usage:
5 theia throughput-anomaly-detection run [flags]
6
7Examples:
8Run the specific algorithm for throughput anomaly detection
9 $ theia throughput-anomaly-detection run --algo ARIMA --start-time 2022-01-01T00:00:00 --end-time 2022-01-31T23:59:59
10 Run throughput anomaly detection algorithm of type ARIMA and limit on flow records from '2022-01-01 00:00:00' to '2022-01-31 23:59:59'
11 Please note, algo is a mandatory argument'
12
13Flags:
14 -a, --algo string The algorithm used by throughput anomaly detection.
15 Currently supported Algorithms are EWMA, ARIMA and DBSCAN.
16 --driver-core-request string Specify the CPU request for the driver Pod. Values conform to the Kubernetes resource quantity convention.
17 Example values include 0.1, 500m, 1.5, 5, etc. (default "200m")
18 --driver-memory string Specify the memory request for the driver Pod. Values conform to the Kubernetes resource quantity convention.
19 Example values include 512M, 1G, 8G, etc. (default "512M")
20 -e, --end-time string The end time of the flow records considered for the anomaly detection.
21 Format is YYYY-MM-DD hh:mm:ss in UTC timezone. No limit of the end time of flow records by default.
22 --executor-core-request string Specify the CPU request for the executor Pod. Values conform to the Kubernetes resource quantity convention.
23 Example values include 0.1, 500m, 1.5, 5, etc. (default "200m")
24 --executor-instances int32 Specify the number of executors for the Spark application. Example values include 1, 2, 8, etc. (default 1)
25 --executor-memory string Specify the memory request for the executor Pod. Values conform to the Kubernetes resource quantity convention.
26 Example values include 512M, 1G, 8G, etc. (default "512M")
27 -h, --help help for run
28 -n, --ns-ignore-list string List of default drop Namespaces. Use this to ignore traffic from selected namespaces
29 If no Namespaces provided, Traffic from all namespaces present in flows table will be allowed by default.
30 -s, --start-time string The start time of the flow records considered for the anomaly detection.
31 Format is YYYY-MM-DD hh:mm:ss in UTC timezone. No limit of the start time of flow records by default.
32
33Global Flags:
34 -k, --kubeconfig string absolute path to the k8s config file, will use $KUBECONFIG if not specified
35 --use-cluster-ip Enable this option will use ClusterIP instead of port forwarding when connecting to the Theia
36 Manager Service. It can only be used when running in cluster.
37 -v, --verbose int set verbose level
I will use the example above:
1linuxvm01:~/antrea/theia$ theia throughput-anomaly-detection run --algo ARIMA --start-time 2023-06-06T00:00:00 --end-time 2023-06-08T09:00:00
2Successfully started Throughput Anomaly Detection job with name: tad-2ecb054a-8c0d-4ae1-8444-c3493e7bb6d9
1linuxvm01:~/antrea/theia$ theia throughput-anomaly-detection list
2CreationTime CompletionTime Name Status
32023-06-08 08:25:10 N/A tad-2ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 RUNNING
4linuxvm01:~/antrea/theia$ theia throughput-anomaly-detection status tad-2ecb054a-8c0d-4ae1-8444-c3493e7bb6d9
5Status of this anomaly detection job is RUNNING: 0/0 (0%) stages completed
Let's wait for it to finish...
1linuxvm01:~$ theia throughput-anomaly-detection list
2CreationTime CompletionTime Name Status
32023-06-08 08:25:10 2023-06-08 08:40:03 tad-2ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 COMPLETED
Now check the output:
1# It is a long list so I am redirecting it to a text file
2linuxvm01:~$ theia throughput-anomaly-detection retrieve tad-2ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 > anomaly-detection-1.txt
A snippet from the output:
1id sourceIP sourceTransportPort destinationIP destinationTransportPort flowStartSeconds flowEndSeconds throughput algoCalc anomaly
22ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 20.20.3.12 59448 20.20.0.12 2181 2023-06-06T22:04:43Z 2023-06-07T21:41:38Z 54204 65355.16485680155 true
32ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 20.20.3.12 59448 20.20.0.12 2181 2023-06-06T22:04:43Z 2023-06-08T07:09:32Z 49901 54713.50251767502 true
42ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 20.20.3.12 59448 20.20.0.12 2181 2023-06-06T22:04:43Z 2023-06-08T03:33:48Z 50000 53550.532983008845 true
52ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 20.20.3.12 59448 20.20.0.12 2181 2023-06-06T22:04:43Z 2023-06-08T00:52:48Z 59725 52206.69079880149 true
62ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 20.20.3.12 59448 20.20.0.12 2181 2023-06-06T22:04:43Z 2023-06-07T18:49:03Z 48544 53287.107990749006 true
72ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 20.20.3.12 59448 20.20.0.12 2181 2023-06-06T22:04:43Z 2023-06-07T22:06:43Z 61832 53100.99541753638 true
82ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 20.20.3.12 59448 20.20.0.12 2181 2023-06-06T22:04:43Z 2023-06-07T20:57:28Z 58295 54168.70924924757 true
92ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 20.20.3.12 59448 20.20.0.12 2181 2023-06-06T22:04:43Z 2023-06-08T07:36:38Z 47309 53688.236655529385 true
102ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 20.20.3.12 59448 20.20.0.12 2181 2023-06-06T22:04:43Z 2023-06-08T08:05:43Z 59227 52623.71668244673 true
112ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 20.20.3.12 59448 20.20.0.12 2181 2023-06-06T22:04:43Z 2023-06-08T05:22:12Z 58217 53709.42205164235 true
122ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 20.20.3.12 59448 20.20.0.12 2181 2023-06-06T22:04:43Z 2023-06-08T02:19:03Z 48508 55649.8819138477 true
132ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 20.20.3.12 59448 20.20.0.12 2181 2023-06-06T22:04:43Z 2023-06-07T23:27:28Z 53846 48125.33491950862 true
142ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 20.20.3.12 59448 20.20.0.12 2181 2023-06-06T22:04:43Z 2023-06-08T00:09:38Z 59562 52143.367660610136 true
152ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 20.20.3.12 59448 20.20.0.12 2181 2023-06-06T22:04:43Z 2023-06-08T02:44:08Z 50966 57119.323329628125 true
162ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 20.20.3.12 59448 20.20.0.12 2181 2023-06-06T22:04:43Z 2023-06-08T08:24:17Z 55553 50480.7443391562 true
172ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 20.20.3.12 59448 20.20.0.12 2181 2023-06-06T22:04:43Z 2023-06-08T00:02:38Z 44172 53694.11880964807 true
182ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 20.20.3.12 59448 20.20.0.12 2181 2023-06-06T22:04:43Z 2023-06-08T04:07:57Z 53612 49714.00885995446 true
192ecb054a-8c0d-4ae1-8444-c3493e7bb6d9 20.20.3.12 59448 20.20.0.12 2181 2023-06-06T22:04:43Z 2023-06-08T01:54:58Z 59089 51972.42465384903 true
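When the jobs are no longer needed, they can be cleaned up with the delete subcommands listed in the help output above:

theia throughput-anomaly-detection delete tad-2ecb054a-8c0d-4ae1-8444-c3493e7bb6d9
theia policy-recommendation delete pr-e81a42e4-013a-4cf6-be43-b1ee48ea9a18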
Antrea Egress
This chapter is also covered, with some tweaks, in another post I have done here 😃
From the official docs:
Egress is a CRD API that manages external access from the Pods in a cluster. It supports specifying which egress (SNAT) IP the traffic from the selected Pods to the external network should use. When a selected Pod accesses the external network, the egress traffic will be tunneled to the Node that hosts the egress IP if it's different from the Node that the Pod runs on and will be SNATed to the egress IP when leaving that Node. You may be interested in using this capability if any of the following apply:
- A consistent IP address is desired when specific Pods connect to services outside of the cluster, for source tracing in audit logs, or for filtering by source IP in external firewall, etc.
- You want to force outgoing external connections to leave the cluster via certain Nodes, for security controls, or due to network topology restrictions.
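To give an idea of what this looks like, here is a minimal, hypothetical Egress manifest that pins the SNAT IP for the yelb-ui Pods; the selector and IP are just placeholders for this lab:

kubectl apply -f - <<EOF
apiVersion: crd.antrea.io/v1alpha2
kind: Egress
metadata:
  name: egress-yelb-ui
spec:
  appliedTo:
    podSelector:
      matchLabels:
        app: yelb-ui
  egressIP: 10.13.82.200   # an IP in the Node network, static here; it can also be allocated from an ExternalIPPool
EOF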