A Quick Glance at Cilium CNI

Overview

About Cilium

Instead of using my own words, I will just quote the official Cilium website:

eBPF-based Networking, Observability, Security

Cilium is an open source, cloud native solution for providing, securing, and observing network connectivity between workloads, fueled by the revolutionary Kernel technology eBPF


Now what is eBPF?

From ebpf.io

Dynamically program the kernel for efficient networking, observability, tracing, and security

What is eBPF?

eBPF is a revolutionary technology with origins in the Linux kernel that can run sandboxed programs in a privileged context such as the operating system kernel. It is used to safely and efficiently extend the capabilities of the kernel without requiring to change kernel source code or load kernel modules.

Overview

As it is always interesting to learn new technology, I thought it was about time to write a post about Cilium. At first look Cilium is kind of a Swiss Army knife with a lot of interesting features. I will go through this post beginning with a basic installation of Cilium on a new cluster (upstream K8s based on Ubuntu nodes). Then I will continue with some of the features I found interesting, and needed myself in my lab, and how to enable and configure them.

This post will be divided into dedicated sections for the installation part and the different features respectively, starting with the installation of Cilium as the CNI in my Kubernetes cluster.

Preparations

This post assumes the following:

  • Kubernetes nodes already prepared with all the software installed, ready to run kubeadm init.
  • A jumphost or Linux mgmt vm/server to operate from
  • Helm installed and configured on the Linux jumphost
  • Kubectl installed on the Linux jumphost
  • Cilium CLI installed on the Linux jumphost

Cilium-cli is installed using these commands:

1CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
2CLI_ARCH=amd64
3if [ "$(uname -m)" = "aarch64" ]; then CLI_ARCH=arm64; fi
4curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
5sha256sum --check cilium-linux-${CLI_ARCH}.tar.gz.sha256sum
6sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin
7rm cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}

For more information have a look here
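
Since Helm will be used against the cilium/cilium chart later in this post, the Cilium Helm repository should also be added on the jumphost if it is not there already (standard Helm commands):

helm repo add cilium https://helm.cilium.io/
helm repo update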

Below is my lab's topology for this post:

my-lab-topology

Installation of Cilium

Cilium can be installed using Helm or using Cilium's nifty cilium-cli tool.

Info

Both Helm and the cilium-cli tool can be used to install Cilium and configure its features. In this post I will mostly use Helm when adding features or changing certain settings, and cilium-cli for others, just to showcase how easy it is to use cilium-cli for certain features/tasks.

According to the official docs:

Install the latest version of the Cilium CLI. The Cilium CLI can be used to install Cilium, inspect the state of a Cilium installation, and enable/disable various features (e.g. clustermesh, Hubble).

The first feature of Cilium in this post is how it can fully replace kube-proxy by providing distributed load balancing using eBPF. Naturally I would like to use this feature. This means I need to deploy my Kubernetes cluster without kube-proxy. That is easiest done during the initial bring-up of the Kubernetes cluster, but it can also be done after the cluster has been brought up, see more info here

kubeadm init with no-kube-proxy

To bring up my Kubernetes cluster without kube-proxy, this is the command I will use on my first control-plane node:

1sudo kubeadm init --pod-network-cidr=10.22.0.0/16 --service-cidr=10.23.0.0/16 --control-plane-endpoint "test-cluster-1.my-domain.net" --upload-certs --skip-phases=addon/kube-proxy --cri-socket unix:///var/run/containerd/containerd.sock

This is the parameter that disables kube-proxy: --skip-phases=addon/kube-proxy
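
As a side note, the same init can also be expressed declaratively in a kubeadm config file. Below is a rough, untested sketch of what that could look like with the v1beta3 config API (this is not what I ran; the flag-based command above is):

# kubeadm-config.yaml - rough equivalent of the flags above
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
skipPhases:
  - addon/kube-proxy        # skip the kube-proxy addon, Cilium will take over its job
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controlPlaneEndpoint: "test-cluster-1.my-domain.net"
networking:
  podSubnet: 10.22.0.0/16
  serviceSubnet: 10.23.0.0/16

It would then be applied with sudo kubeadm init --config kubeadm-config.yaml --upload-certs. Anyway, this is the output from the flag-based kubeadm init: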

 1I1219 14:08:17.376790   13327 version.go:256] remote version is much newer: v1.29.0; falling back to: stable-1.28
 2[init] Using Kubernetes version: v1.28.4
 3[preflight] Running pre-flight checks
 4[preflight] Pulling images required for setting up a Kubernetes cluster
 5[preflight] This might take a minute or two, depending on the speed of your internet connection
 6[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
 7W1219 14:08:33.520592   13327 checks.go:835] detected that the sandbox image "registry.k8s.io/pause:3.5" of the container runtime is inconsistent with that used by kubeadm. It is recommended that using "registry.k8s.io/pause:3.9" as the CRI sandbox image.
 8[certs] Using certificateDir folder "/etc/kubernetes/pki"
 9[certs] Generating "ca" certificate and key
10[certs] Generating "apiserver" certificate and key
11[certs] apiserver serving cert is signed for DNS names [k8s-master-01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local test-cluster-1.my-domain.net] and IPs [10.23.0.1 10.160.1.10]
12[certs] Generating "apiserver-kubelet-client" certificate and key
13[certs] Generating "front-proxy-ca" certificate and key
14[certs] Generating "front-proxy-client" certificate and key
15[certs] Generating "etcd/ca" certificate and key
16[certs] Generating "etcd/server" certificate and key
17[certs] etcd/server serving cert is signed for DNS names [k8s-master-01 localhost] and IPs [10.160.1.10 127.0.0.1 ::1]
18[certs] Generating "etcd/peer" certificate and key
19[certs] etcd/peer serving cert is signed for DNS names [k8s-master-01 localhost] and IPs [10.160.1.10 127.0.0.1 ::1]
20[certs] Generating "etcd/healthcheck-client" certificate and key
21[certs] Generating "apiserver-etcd-client" certificate and key
22[certs] Generating "sa" key and public key
23[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
24[kubeconfig] Writing "admin.conf" kubeconfig file
25[kubeconfig] Writing "kubelet.conf" kubeconfig file
26[kubeconfig] Writing "controller-manager.conf" kubeconfig file
27[kubeconfig] Writing "scheduler.conf" kubeconfig file
28[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
29[control-plane] Using manifest folder "/etc/kubernetes/manifests"
30[control-plane] Creating static Pod manifest for "kube-apiserver"
31[control-plane] Creating static Pod manifest for "kube-controller-manager"
32[control-plane] Creating static Pod manifest for "kube-scheduler"
33[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
34[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
35[kubelet-start] Starting the kubelet
36[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
37[kubelet-check] Initial timeout of 40s passed.
38[apiclient] All control plane components are healthy after 106.047476 seconds
39[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
40[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
41[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
42[upload-certs] Using certificate key:
433c9fa959a7538baaaf484e931ade45fbad07934dc40d456cae54839a7d888715
44[mark-control-plane] Marking the node k8s-master-01 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
45[mark-control-plane] Marking the node k8s-master-01 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
46[bootstrap-token] Using token: q495cj.apdasczda14j87tc
47[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
48[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
49[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
50[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
51[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
52[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
53[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
54[addons] Applied essential addon: CoreDNS
55
56Your Kubernetes control-plane has initialized successfully!
57
58To start using your cluster, you need to run the following as a regular user:
59
60  mkdir -p $HOME/.kube
61  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
62  sudo chown $(id -u):$(id -g) $HOME/.kube/config
63
64Alternatively, if you are the root user, you can run:
65
66  export KUBECONFIG=/etc/kubernetes/admin.conf
67
68You should now deploy a pod network to the cluster.
69Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
70  https://kubernetes.io/docs/concepts/cluster-administration/addons/
71
72You can now join any number of the control-plane node running the following command on each as root:
73
74  kubeadm join test-cluster-1.my-domain.net:6443 --token q4da14j87tc \
75	--discovery-token-ca-cert-hash sha256: \
76	--control-plane --certificate-key 
77
78Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
79As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
80"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
81
82Then you can join any number of worker nodes by running the following on each as root:
83
84kubeadm join test-cluster-1.my-domain.net:6443 --token q4aczda14j87tc \
85	--discovery-token-ca-cert-hash sha256:

NB, notice in the output above that kubeadm only applies the CoreDNS addon ("Applied essential addon: CoreDNS"); the kube-proxy addon has been skipped.

Now on my worker nodes:

1kubeadm join test-cluster-1.my-domain.net:6443 --token q414j87tc \
2	--discovery-token-ca-cert-hash sha256:edf4d18883f94f0b5aa646001606147
 1[preflight] Running pre-flight checks
 2[preflight] Reading configuration from the cluster...
 3[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
 4W1219 16:25:57.203844    1279 configset.go:78] Warning: No kubeproxy.config.k8s.io/v1alpha1 config is loaded. Continuing without it: configmaps "kube-proxy" is forbidden: User "system:bootstrap:q495cj" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
 5[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
 6[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
 7[kubelet-start] Starting the kubelet
 8[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
 9
10This node has joined the cluster:
11* Certificate signing request was sent to apiserver and a response was received.
12* The Kubelet was informed of the new secure connection details.
13
14Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

NB, notice that it "complains" No kubeproxy.config.k8s.io/v1alpha1 config is loaded. Continuing without it: configmaps "kube-proxy"

When all worker nodes have joined:

1andreasm@linuxmgmt01:~/test-cluster-1$ k get nodes
2NAME            STATUS   ROLES           AGE     VERSION
3k8s-master-01   Ready    control-plane   135m    v1.28.2
4k8s-worker-01   Ready    <none>          12s     v1.28.2
5k8s-worker-02   Ready    <none>          38s     v1.28.2
6k8s-worker-03   Ready    <none>          4m28s   v1.28.2

The cluster is not fully functional yet, though: CoreDNS is still Pending because there is no CNI in place to provide pod networking, IPAM etc.

1andreasm@linuxmgmt01:~/test-cluster-1$ k get pods -A
2NAMESPACE     NAME                                    READY   STATUS    RESTARTS      AGE
3kube-system   coredns-5dd5756b68-c5xml                0/1     Pending   0             35m
4kube-system   coredns-5dd5756b68-fgdzj                0/1     Pending   0             35m
5kube-system   etcd-k8s-master-01                      1/1     Running   0             35m
6kube-system   kube-apiserver-k8s-master-01            1/1     Running   0             35m
7kube-system   kube-controller-manager-k8s-master-01   1/1     Running   1 (19m ago)   35m
8kube-system   kube-scheduler-k8s-master-01            1/1     Running   1 (19m ago)   35m

Now it's time to jump over to my jumphost where I will do all the remaining configurations/interactions with my test-cluster-1.

Install Cilium CNI

From my jumphost I already have all the tools I need to deploy Cilium. To install the Cilium CNI I will just use the cilium-cli tool as it is so easy. With a very short command it will automatically install Cilium on all my worker/control-plane nodes. The cilium-cli acts on your current kube context, so make sure you are in the correct context (the cluster that needs Cilium installed):

1andreasm@linuxmgmt01:~/test-cluster-1$ k config current-context
2test-cluster-1-admin@kubernetes
3
4andreasm@linuxmgmt01:~/test-cluster-1$ cilium install --version 1.14.5
5ℹ️  Using Cilium version 1.14.5
6🔮 Auto-detected cluster name: test-cluster-1
7🔮 Auto-detected kube-proxy has not been installed
8ℹ️  Cilium will fully replace all functionalities of kube-proxy

That's it.... 😄

Version 1.14.5 is the latest stable release at the time of writing this post.

Install Cilium on a cluster with no kube-proxy

If I have prepared my cluster as above with no kube-proxy, I need to install Cilium using the following command:

1API_SERVER_IP=10.160.1.111
2API_SERVER_PORT=6443
3helm install cilium cilium/cilium --version 1.14.5 -f cilium.1.14.5.values-prod-cluster-1.yaml \
4    --namespace kube-system \
5    --set kubeProxyReplacement=strict \
6    --set k8sServiceHost=${API_SERVER_IP} \
7    --set k8sServicePort=${API_SERVER_PORT}

Where the API_SERVER_IP is the IP of one of my k8s control-plane nodes (I did try to use the load-balanced IP for the k8s API endpoint, as I have 3 control-plane nodes, but that did not work out, so I went with the IP of my first control-plane node). The values file is the file I am using to set all the Cilium settings, more on that later.

Now, what's inside my Kubernetes cluster?

 1andreasm@linuxmgmt01:~/test-cluster-1$ k get pods -A
 2NAMESPACE     NAME                                    READY   STATUS    RESTARTS       AGE
 3kube-system   cilium-6gx5b                            1/1     Running   0              11m
 4kube-system   cilium-bsqzw                            1/1     Running   1              11m
 5kube-system   cilium-ct8n4                            1/1     Running   0              53s
 6kube-system   cilium-operator-545dc68d55-fsh6s        1/1     Running   1              11m
 7kube-system   cilium-v7rdz                            1/1     Running   0              5m9s
 8kube-system   coredns-5dd5756b68-j9vwm                1/1     Running   0              77s
 9kube-system   coredns-5dd5756b68-mjbzk                1/1     Running   0              78s
10kube-system   etcd-k8s-master-01                      1/1     Running   0              136m
11kube-system   hubble-relay-d478c79c8-pbn4v            1/1     Running   0              16m
12kube-system   kube-apiserver-k8s-master-01            1/1     Running   0              136m
13kube-system   kube-controller-manager-k8s-master-01   1/1     Running   1 (120m ago)   136m
14kube-system   kube-scheduler-k8s-master-01            1/1     Running   1 (120m ago)   136m

Everything is up and running. The Cilium CLI contains a lot of useful features, like checking the status of Cilium. Let's test that:

 1andreasm@linuxmgmt01:~/test-cluster-1$ cilium status
 2    /¯¯\
 3 /¯¯\__/¯¯\    Cilium:             OK
 4 \__/¯¯\__/    Operator:           OK
 5 /¯¯\__/¯¯\    Envoy DaemonSet:    disabled (using embedded mode)
 6 \__/¯¯\__/    Hubble Relay:       disabled
 7    \__/       ClusterMesh:        disabled
 8
 9DaemonSet              cilium             Desired: 4, Ready: 4/4, Available: 4/4
10Deployment             cilium-operator    Desired: 1, Ready: 1/1, Available: 1/1
11Containers:            cilium             Running: 4
12                       cilium-operator    Running: 1
13Cluster Pods:          2/2 managed by Cilium
14Helm chart version:    1.14.5
15Image versions         cilium-operator    quay.io/cilium/operator-generic:v1.14.5@sha256:303f9076bdc73b3fc32aaedee64a14f6f44c8bb08ee9e3956d443021103ebe7a: 1
16                       cilium             quay.io/cilium/cilium:v1.14.5@sha256:d3b287029755b6a47dee01420e2ea469469f1b174a2089c10af7e5e9289ef05b: 4

Looks great

Did you notice above that the Cilium installer discovered there was no kube-proxy and told me it will replace all the features of kube-proxy? Well, it did. Let's check the config of Cilium and see if that is also reflected there. Look for this key-value pair:

kube-proxy-replacement strict

  1andreasm@linuxmgmt01:~/test-cluster-1$ cilium config view
  2agent-not-ready-taint-key                         node.cilium.io/agent-not-ready
  3arping-refresh-period                             30s
  4auto-direct-node-routes                           false
  5bpf-lb-external-clusterip                         false
  6bpf-lb-map-max                                    65536
  7bpf-lb-sock                                       false
  8bpf-map-dynamic-size-ratio                        0.0025
  9bpf-policy-map-max                                16384
 10bpf-root                                          /sys/fs/bpf
 11cgroup-root                                       /run/cilium/cgroupv2
 12cilium-endpoint-gc-interval                       5m0s
 13cluster-id                                        0
 14cluster-name                                      test-cluster-1
 15cluster-pool-ipv4-cidr                            10.0.0.0/8
 16cluster-pool-ipv4-mask-size                       24
 17cni-exclusive                                     true
 18cni-log-file                                      /var/run/cilium/cilium-cni.log
 19cnp-node-status-gc-interval                       0s
 20custom-cni-conf                                   false
 21debug                                             false
 22debug-verbose
 23disable-cnp-status-updates                        true
 24egress-gateway-reconciliation-trigger-interval    1s
 25enable-auto-protect-node-port-range               true
 26enable-bgp-control-plane                          false
 27enable-bpf-clock-probe                            false
 28enable-endpoint-health-checking                   true
 29enable-health-check-nodeport                      true
 30enable-health-checking                            true
 31enable-hubble                                     true
 32enable-ipv4                                       true
 33enable-ipv4-big-tcp                               false
 34enable-ipv4-masquerade                            true
 35enable-ipv6                                       false
 36enable-ipv6-big-tcp                               false
 37enable-ipv6-masquerade                            true
 38enable-k8s-networkpolicy                          true
 39enable-k8s-terminating-endpoint                   true
 40enable-l2-neigh-discovery                         true
 41enable-l7-proxy                                   true
 42enable-local-redirect-policy                      false
 43enable-policy                                     default
 44enable-remote-node-identity                       true
 45enable-sctp                                       false
 46enable-svc-source-range-check                     true
 47enable-vtep                                       false
 48enable-well-known-identities                      false
 49enable-xt-socket-fallback                         true
 50external-envoy-proxy                              false
 51hubble-disable-tls                                false
 52hubble-listen-address                             :4244
 53hubble-socket-path                                /var/run/cilium/hubble.sock
 54hubble-tls-cert-file                              /var/lib/cilium/tls/hubble/server.crt
 55hubble-tls-client-ca-files                        /var/lib/cilium/tls/hubble/client-ca.crt
 56hubble-tls-key-file                               /var/lib/cilium/tls/hubble/server.key
 57identity-allocation-mode                          crd
 58identity-gc-interval                              15m0s
 59identity-heartbeat-timeout                        30m0s
 60install-no-conntrack-iptables-rules               false
 61ipam                                              cluster-pool
 62ipam-cilium-node-update-rate                      15s
 63k8s-client-burst                                  10
 64k8s-client-qps                                    5
 65kube-proxy-replacement                            strict
 66kube-proxy-replacement-healthz-bind-address
 67mesh-auth-enabled                                 true
 68mesh-auth-gc-interval                             5m0s
 69mesh-auth-queue-size                              1024
 70mesh-auth-rotated-identities-queue-size           1024
 71monitor-aggregation                               medium
 72monitor-aggregation-flags                         all
 73monitor-aggregation-interval                      5s
 74node-port-bind-protection                         true
 75nodes-gc-interval                                 5m0s
 76operator-api-serve-addr                           127.0.0.1:9234
 77preallocate-bpf-maps                              false
 78procfs                                            /host/proc
 79proxy-connect-timeout                             2
 80proxy-max-connection-duration-seconds             0
 81proxy-max-requests-per-connection                 0
 82proxy-prometheus-port                             9964
 83remove-cilium-node-taints                         true
 84routing-mode                                      tunnel
 85set-cilium-is-up-condition                        true
 86set-cilium-node-taints                            true
 87sidecar-istio-proxy-image                         cilium/istio_proxy
 88skip-cnp-status-startup-clean                     false
 89synchronize-k8s-nodes                             true
 90tofqdns-dns-reject-response-code                  refused
 91tofqdns-enable-dns-compression                    true
 92tofqdns-endpoint-max-ip-per-hostname              50
 93tofqdns-idle-connection-grace-period              0s
 94tofqdns-max-deferred-connection-deletes           10000
 95tofqdns-proxy-response-max-delay                  100ms
 96tunnel-protocol                                   vxlan
 97unmanaged-pod-watcher-interval                    15
 98vtep-cidr
 99vtep-endpoint
100vtep-mac
101vtep-mask
102write-cni-conf-when-ready                         /host/etc/cni/net.d/05-cilium.conflist

One can also verify with this command:

1andreasm@linuxmgmt01:~/test-cluster-1$ kubectl -n kube-system exec ds/cilium -- cilium status | grep KubeProxyReplacement
2Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
3KubeProxyReplacement:    Strict   [ens18 10.160.1.11 (Direct Routing)]

Being able to easily list all the features and their status like this is really helpful.

It took a couple of seconds and the Cilium CNI was installed. Now the fun begins: exploring some of its features. Let's tag along.

Enabling features using Helm

When I installed Cilium using the cilium-cli tool, it actually deployed using Helm in the background. Let's see if there is a Helm release in the kube-system namespace:

1andreasm@linuxmgmt01:~/test-cluster-1$ helm list -n kube-system
2NAME  	NAMESPACE  	REVISION	UPDATED                                	STATUS  	CHART        	APP VERSION
3cilium	kube-system	1       	2023-12-19 13:50:40.121866679 +0000 UTC	deployed	cilium-1.14.5	1.14.5

Well there it is..

That makes it all the more interesting. As I will use Helm to update certain parameters going forward in this post, I will take a "snapshot" of the current values in the release above and alter them in the next sections when I enable additional features. What do the values look like now?

  1From the configMap cilium-config
  2apiVersion: v1
  3data:
  4  agent-not-ready-taint-key: node.cilium.io/agent-not-ready
  5  arping-refresh-period: 30s
  6  auto-direct-node-routes: "false"
  7  bpf-lb-external-clusterip: "false"
  8  bpf-lb-map-max: "65536"
  9  bpf-lb-sock: "false"
 10  bpf-map-dynamic-size-ratio: "0.0025"
 11  bpf-policy-map-max: "16384"
 12  bpf-root: /sys/fs/bpf
 13  cgroup-root: /run/cilium/cgroupv2
 14  cilium-endpoint-gc-interval: 5m0s
 15  cluster-id: "0"
 16  cluster-name: test-cluster-1
 17  cluster-pool-ipv4-cidr: 10.0.0.0/8
 18  cluster-pool-ipv4-mask-size: "24"
 19  cni-exclusive: "true"
 20  cni-log-file: /var/run/cilium/cilium-cni.log
 21  cnp-node-status-gc-interval: 0s
 22  custom-cni-conf: "false"
 23  debug: "false"
 24  debug-verbose: ""
 25  disable-cnp-status-updates: "true"
 26  egress-gateway-reconciliation-trigger-interval: 1s
 27  enable-auto-protect-node-port-range: "true"
 28  enable-bgp-control-plane: "false"
 29  enable-bpf-clock-probe: "false"
 30  enable-endpoint-health-checking: "true"
 31  enable-health-check-nodeport: "true"
 32  enable-health-checking: "true"
 33  enable-hubble: "true"
 34  enable-ipv4: "true"
 35  enable-ipv4-big-tcp: "false"
 36  enable-ipv4-masquerade: "true"
 37  enable-ipv6: "false"
 38  enable-ipv6-big-tcp: "false"
 39  enable-ipv6-masquerade: "true"
 40  enable-k8s-networkpolicy: "true"
 41  enable-k8s-terminating-endpoint: "true"
 42  enable-l2-neigh-discovery: "true"
 43  enable-l7-proxy: "true"
 44  enable-local-redirect-policy: "false"
 45  enable-policy: default
 46  enable-remote-node-identity: "true"
 47  enable-sctp: "false"
 48  enable-svc-source-range-check: "true"
 49  enable-vtep: "false"
 50  enable-well-known-identities: "false"
 51  enable-xt-socket-fallback: "true"
 52  external-envoy-proxy: "false"
 53  hubble-disable-tls: "false"
 54  hubble-listen-address: :4244
 55  hubble-socket-path: /var/run/cilium/hubble.sock
 56  hubble-tls-cert-file: /var/lib/cilium/tls/hubble/server.crt
 57  hubble-tls-client-ca-files: /var/lib/cilium/tls/hubble/client-ca.crt
 58  hubble-tls-key-file: /var/lib/cilium/tls/hubble/server.key
 59  identity-allocation-mode: crd
 60  identity-gc-interval: 15m0s
 61  identity-heartbeat-timeout: 30m0s
 62  install-no-conntrack-iptables-rules: "false"
 63  ipam: cluster-pool
 64  ipam-cilium-node-update-rate: 15s
 65  k8s-client-burst: "10"
 66  k8s-client-qps: "5"
 67  kube-proxy-replacement: strict
 68  kube-proxy-replacement-healthz-bind-address: ""
 69  mesh-auth-enabled: "true"
 70  mesh-auth-gc-interval: 5m0s
 71  mesh-auth-queue-size: "1024"
 72  mesh-auth-rotated-identities-queue-size: "1024"
 73  monitor-aggregation: medium
 74  monitor-aggregation-flags: all
 75  monitor-aggregation-interval: 5s
 76  node-port-bind-protection: "true"
 77  nodes-gc-interval: 5m0s
 78  operator-api-serve-addr: 127.0.0.1:9234
 79  preallocate-bpf-maps: "false"
 80  procfs: /host/proc
 81  proxy-connect-timeout: "2"
 82  proxy-max-connection-duration-seconds: "0"
 83  proxy-max-requests-per-connection: "0"
 84  proxy-prometheus-port: "9964"
 85  remove-cilium-node-taints: "true"
 86  routing-mode: tunnel
 87  set-cilium-is-up-condition: "true"
 88  set-cilium-node-taints: "true"
 89  sidecar-istio-proxy-image: cilium/istio_proxy
 90  skip-cnp-status-startup-clean: "false"
 91  synchronize-k8s-nodes: "true"
 92  tofqdns-dns-reject-response-code: refused
 93  tofqdns-enable-dns-compression: "true"
 94  tofqdns-endpoint-max-ip-per-hostname: "50"
 95  tofqdns-idle-connection-grace-period: 0s
 96  tofqdns-max-deferred-connection-deletes: "10000"
 97  tofqdns-proxy-response-max-delay: 100ms
 98  tunnel-protocol: vxlan
 99  unmanaged-pod-watcher-interval: "15"
100  vtep-cidr: ""
101  vtep-endpoint: ""
102  vtep-mac: ""
103  vtep-mask: ""
104  write-cni-conf-when-ready: /host/etc/cni/net.d/05-cilium.conflist
105kind: ConfigMap
106metadata:
107  annotations:
108    meta.helm.sh/release-name: cilium
109    meta.helm.sh/release-namespace: kube-system
110  creationTimestamp: "2023-12-19T13:50:42Z"
111  labels:
112    app.kubernetes.io/managed-by: Helm
113  name: cilium-config
114  namespace: kube-system
115  resourceVersion: "4589"
116  uid: f501a3d0-8b33-43af-9fae-63625dcd6df1

These are the two settings that have been changed from being completely default:

1  cluster-name: test-cluster-1
2  kube-proxy-replacement: strict

I prefer editing the changes in a dedicated value.yaml file and running helm upgrade -f value.yaml each time I want to make a change, so going forward I will be adding/changing certain settings in this value.yaml file to update the settings in Cilium.

I grabbed the default values yaml from the Helm repo and will use that to alter the settings in the next sections.
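
To grab the chart's default values I used something along these lines, where helm show values prints the defaults for the given chart version and I redirect them into the file I reference in the helm upgrade commands later:

helm show values cilium/cilium --version 1.14.5 > cilium-values-feature-by-feature.yaml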

Enabling features using Cilium cli

The cilium-cli can also be used to enable or disable certain features, like Hubble and clustermesh. An example of how to enable Hubble with cilium-cli is shown below in the next chapter, but I can also use Helm to achieve the same. I enable Hubble using cilium-cli just to show how easy it is.

But as I mentioned above, I prefer the Helm method, as I can keep better track of the settings and have them consistent each time I apply an update by referring to my value.yaml file.

Observability and flow-monitoring - Hubble Observability

Cilium comes with a very neat monitoring tool out of the box called Hubble. It is enabled by default, but I need to enable the Hubble Relay and Hubble UI features to get the information from my nodes, pods etc. available in a nice dashboard (Hubble UI), so this is certainly something I want to enable as one of the first features to test out.

More details here and here

If I do a quick check with cilium cli now I can see the Hubble Relay is disabled.

 1andreasm@linuxmgmt01:~/test-cluster-1$ cilium status
 2    /¯¯\
 3 /¯¯\__/¯¯\    Cilium:             OK
 4 \__/¯¯\__/    Operator:           OK
 5 /¯¯\__/¯¯\    Envoy DaemonSet:    disabled (using embedded mode)
 6 \__/¯¯\__/    Hubble Relay:       disabled
 7    \__/       ClusterMesh:        disabled
 8
 9DaemonSet              cilium             Desired: 4, Ready: 4/4, Available: 4/4
10Deployment             cilium-operator    Desired: 1, Ready: 1/1, Available: 1/1
11Containers:            cilium             Running: 4
12                       cilium-operator    Running: 1
13Cluster Pods:          2/2 managed by Cilium
14Helm chart version:    1.14.5
15Image versions         cilium             quay.io/cilium/cilium:v1.14.5@sha256:d3b287029755b6a47dee01420e2ea469469f1b174a2089c10af7e5e9289ef05b: 4
16                       cilium-operator    quay.io/cilium/operator-generic:v1.14.5@sha256:303f9076bdc73b3fc32aaedee64a14f6f44c8bb08ee9e3956d443021103ebe7a: 1
17
18andreasm@linuxmgmt01:~/test-cluster-1$ k get svc -n kube-system
19NAME          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE
20hubble-peer   ClusterIP   10.23.182.223   <none>        443/TCP                  69m
21kube-dns      ClusterIP   10.23.0.10      <none>        53/UDP,53/TCP,9153/TCP   109m

To enable it, it is as simple as running this command:

1andreasm@linuxmgmt01:~/test-cluster-1$ cilium hubble enable

This command enables the Hubble Relay.

Let's check the status of Cilium:

 1andreasm@linuxmgmt01:~/test-cluster-1$ cilium status
 2    /¯¯\
 3 /¯¯\__/¯¯\    Cilium:             OK
 4 \__/¯¯\__/    Operator:           OK
 5 /¯¯\__/¯¯\    Envoy DaemonSet:    disabled (using embedded mode)
 6 \__/¯¯\__/    Hubble Relay:       OK
 7    \__/       ClusterMesh:        disabled
 8
 9DaemonSet              cilium             Desired: 4, Ready: 4/4, Available: 4/4
10Deployment             cilium-operator    Desired: 1, Ready: 1/1, Available: 1/1
11Deployment             hubble-relay       Desired: 1, Ready: 1/1, Available: 1/1
12Containers:            cilium             Running: 4
13                       cilium-operator    Running: 1
14                       hubble-relay       Running: 1
15Cluster Pods:          3/3 managed by Cilium
16Helm chart version:    1.14.5
17Image versions         cilium             quay.io/cilium/cilium:v1.14.5@sha256:d3b287029755b6a47dee01420e2ea469469f1b174a2089c10af7e5e9289ef05b: 4
18                       cilium-operator    quay.io/cilium/operator-generic:v1.14.5@sha256:303f9076bdc73b3fc32aaedee64a14f6f44c8bb08ee9e3956d443021103ebe7a: 1
19                       hubble-relay       quay.io/cilium/hubble-relay:v1.14.5@sha256:dbef89f924a927043d02b40c18e417c1ea0e8f58b44523b80fef7e3652db24d4: 1

Hubble Relay OK

Now I need to enable the Hubble UI.

Again, using cilium-cli it's a very quick and simple operation:

1andreasm@linuxmgmt01:~$ cilium hubble enable --ui

Let's check the services in my cluster:

1andreasm@linuxmgmt01:~$ k get svc -n kube-system
2NAME           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE
3hubble-peer    ClusterIP   10.23.182.223   <none>        443/TCP                  6h19m
4hubble-relay   ClusterIP   10.23.182.76    <none>        80/TCP                   4h34m
5hubble-ui      ClusterIP   10.23.31.4      <none>        80/TCP                   42s
6kube-dns       ClusterIP   10.23.0.10      <none>        53/UDP,53/TCP,9153/TCP   6h59m

The Hubble Relay and Hubble UI services are now enabled. The issue though is that they are only exposed using ClusterIP, and I need to reach them from outside of my cluster. I will solve that properly with the next feature to test, LB-IPAM; in the meantime a quick port-forward from the jumphost does the job:
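
cilium hubble ui
# or manually against the ClusterIP service:
kubectl -n kube-system port-forward svc/hubble-ui 12000:80

The cilium hubble ui command sets up the port-forward and opens the UI locally (on http://localhost:12000 by default, as far as I recall); the kubectl variant does the same thing by hand.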

Using Helm to enable Hubble Relay and Hubble-UI

Instead of using cilium-cli, I would have enabled the Relay and UI in my value.yaml file and run the following command:

 1andreasm@linuxmgmt01:~/test-cluster-1$ helm upgrade -n kube-system cilium cilium/cilium --version 1.14.5 -f cilium-values-feature-by-feature.yaml
 2Release "cilium" has been upgraded. Happy Helming!
 3NAME: cilium
 4LAST DEPLOYED: Tue Dec 19 20:32:12 2023
 5NAMESPACE: kube-system
 6STATUS: deployed
 7REVISION: 13
 8TEST SUITE: None
 9NOTES:
10You have successfully installed Cilium with Hubble Relay and Hubble UI.
11
12Your release version is 1.14.5.
13
14For any further help, visit https://docs.cilium.io/en/v1.14/gettinghelp

Where I have changed these settings in the value.yaml:

 1 relay:
 2    # -- Enable Hubble Relay (requires hubble.enabled=true)
 3    enabled: true
 4 ......
 5   ui:
 6    # -- Whether to enable the Hubble UI.
 7    enabled: true
 8 ........
 9 hubble:
10  # -- Enable Hubble (true by default).
11  enabled: true
12............
13  # -- Buffer size of the channel Hubble uses to receive monitor events. If this
14  # value is not set, the queue size is set to the default monitor queue size.
15  # eventQueueSize: ""
16
17  # -- Number of recent flows for Hubble to cache. Defaults to 4095.
18  # Possible values are:
19  #   1, 3, 7, 15, 31, 63, 127, 255, 511, 1023,
20  #   2047, 4095, 8191, 16383, 32767, 65535
21  # eventBufferCapacity: "4095"
22
23  # -- Hubble metrics configuration.
24  # See https://docs.cilium.io/en/stable/observability/metrics/#hubble-metrics
25  # for more comprehensive documentation about Hubble metrics.
26  metrics:
27    # -- Configures the list of metrics to collect. If empty or null, metrics
28    # are disabled.
29    # Example:
30    #
31    #   enabled:
32    #   - dns:query;ignoreAAAA
33    #   - drop
34    #   - tcp
35    #   - flow
36    #   - icmp
37    #   - http
38    #
39    # You can specify the list of metrics from the helm CLI:
40    #
41    #   --set metrics.enabled="{dns:query;ignoreAAAA,drop,tcp,flow,icmp,http}"
42    #
43    enabled:
44    - dns:query;ignoreAAAA  ### added these
45    - drop                  ### added these
46    - tcp                   ### added these
47    - flow                  ### added these
48    - icmp                  ### added these
49    - http                  ### added these
50 .........
51 
52 
 1andreasm@linuxmgmt01:~/test-cluster-1$ cilium status
 2    /¯¯\
 3 /¯¯\__/¯¯\    Cilium:             OK
 4 \__/¯¯\__/    Operator:           OK
 5 /¯¯\__/¯¯\    Envoy DaemonSet:    disabled (using embedded mode)
 6 \__/¯¯\__/    Hubble Relay:       OK
 7    \__/       ClusterMesh:        disabled
 8
 9Deployment             cilium-operator    Desired: 2, Ready: 2/2, Available: 2/2
10DaemonSet              cilium             Desired: 4, Ready: 4/4, Available: 4/4
11Deployment             hubble-ui          Desired: 1, Ready: 1/1, Available: 1/1
12Deployment             hubble-relay       Desired: 1, Ready: 1/1, Available: 1/1
13Containers:            cilium             Running: 4
14                       hubble-ui          Running: 1
15                       hubble-relay       Running: 1
16                       cilium-operator    Running: 2
17Cluster Pods:          4/4 managed by Cilium
18Helm chart version:    1.14.5
19Image versions         cilium             quay.io/cilium/cilium:v1.14.5@sha256:d3b287029755b6a47dee01420e2ea469469f1b174a2089c10af7e5e9289ef05b: 4
20                       hubble-ui          quay.io/cilium/hubble-ui:v0.12.1@sha256:9e5f81ee747866480ea1ac4630eb6975ff9227f9782b7c93919c081c33f38267: 1
21                       hubble-ui          quay.io/cilium/hubble-ui-backend:v0.12.1@sha256:1f86f3400827a0451e6332262467f894eeb7caf0eb8779bd951e2caa9d027cbe: 1
22                       hubble-relay       quay.io/cilium/hubble-relay:v1.14.5@sha256:dbef89f924a927043d02b40c18e417c1ea0e8f58b44523b80fef7e3652db24d4: 1
23                       cilium-operator    quay.io/cilium/operator-generic:v1.14.5@sha256:303f9076bdc73b3fc32aaedee64a14f6f44c8bb08ee9e3956d443021103ebe7a: 2

I will get back to the Hubble UI later...
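
The UI is not the only way to look at flows. With the separate hubble CLI binary installed on the jumphost (it is not part of my preparations list above, so treat this as an optional extra), the Hubble Relay can be queried directly through a port-forward:

cilium hubble port-forward &
hubble status
hubble observe --last 20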

LB-IPAM

Exposing a service from Kubernetes to be accessible from outside the cluster can be done in a couple of ways:

  • Exposing the service by binding it to a node using NodePort (not scalable or manageable).
  • Exposing the service using serviceType LoadBalancer: Layer 4 only, but scalable. This usually requires an external load balancer, or some additional component installed and configured, to support your Kubernetes platform.
  • Exposing it using Ingress (Layer 7), which requires a load balancer to provide an external IP address.
  • Exposing it using Gateway API (the Ingress successor), which also requires a load balancer to provide an external IP address.

Cilium has really made it simple here: it comes with built-in LoadBalancer IPAM (LB-IPAM). More info here.

This is already enabled, there is no feature to install or enable. The only thing I need to do is configure an IP pool that will provide IP addresses from a defined subnet whenever I request a serviceType LoadBalancer, an Ingress or a Gateway. We can configure multiple pools with different subnets, and configure a serviceSelector matching on labels or expressions.

In my lab I have already configured a couple of IP pools, using different subnets and different serviceSelectors so I can control which service gets IP addresses from which pool.

A couple of example pools from my lab:

 1apiVersion: "cilium.io/v2alpha1"
 2kind: CiliumLoadBalancerIPPool
 3metadata:
 4  name: "gateway-api-pool-10.150.14.x"
 5spec:
 6  cidrs:
 7  - cidr: "10.150.14.0/24"
 8  serviceSelector:
 9    matchExpressions:
10      - {key: io.kubernetes.service.namespace, operator: In, values: [harbor, booking]}
11---
12apiVersion: "cilium.io/v2alpha1"
13kind: CiliumLoadBalancerIPPool
14metadata:
15  name: "lb-pool-prod.10.150.11.x"
16spec:
17  cidrs:
18  - cidr: "10.150.11.0/24"
19  serviceSelector:
20    matchExpressions:
21      - {key: env, operator: In, values: [prod]}

The first pool will only provide IP addresses to services deployed in one of the two namespaces "harbor" or "booking". This is an "OR" selection, not an AND: a service in either namespace matches, it does not need to exist in both. The second pool matches on labels, on the key-value pair env=prod.
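
Once the pools are applied it is worth checking that they were accepted and do not conflict with each other. Listing the CRD gives a quick overview (the exact columns may differ between versions):

kubectl get ciliumloadbalancerippools.cilium.io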

Info

Bear in mind that these IP pools only react to services (serviceType LoadBalancer), not to Ingress objects per se. Each time you create an Ingress or a Gateway, a serviceType LoadBalancer service is auto-created as a reaction to the Ingress/Gateway creation. So if you put labels on the Ingress/Gateway object itself, they will not be noticed by the LB-IPAM pool. Instead you can adjust the selection based on the namespace you know it will be created in, or use the label that is auto-created on the svc: "Labels: io.cilium.gateway/owning-gateway="name-of-gateway""
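
For example, a pool dedicated to a single Gateway could select on exactly that auto-created label. A small sketch reusing the pool structure from above (the pool name and CIDR are made up for illustration):

apiVersion: "cilium.io/v2alpha1"
kind: CiliumLoadBalancerIPPool
metadata:
  name: "pool-for-one-gateway"
spec:
  cidrs:
  - cidr: "10.150.16.0/24"
  serviceSelector:
    matchLabels:
      io.cilium.gateway/owning-gateway: name-of-gateway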

As soon as an IP pool has been created and applied, it will immediately start serving requests by handing out IP addresses to matching services. This is very nice. There is a small catch though. If I create IP pools, as above, with subnets outside of my nodes' subnet, how does my network know how to reach them? Creating static routes pointing to the nodes that potentially hold these IP addresses? Nah.. Not scalable, nor manageable. Some kind of dynamic routing protocol would be best here, BGP or OSPF. Did I mention that Cilium also includes support for BGP out of the box?

BGP Control Plane

Yes, you guessed it, Cilium includes BGP. A brilliant way of advertising all my IP pools. Creating many IP pools with a bunch of subnets has never been more fun. This is the same concept as I write about here; the biggest difference is that with Cilium this only needs to be enabled as a feature and then a yaml defined to configure the BGP settings. Nothing additional to install, just Plug'nPlay.

For more info on the BGP control plane, read here.

First out, enable the BGP control plane feature. To enable it I will alter my Helm value.yaml file with this setting:

1# -- This feature set enables virtual BGP routers to be created via
2# CiliumBGPPeeringPolicy CRDs.
3bgpControlPlane:
4  # -- Enables the BGP control plane.
5  enabled: true

Then run the command:

1helm upgrade -n kube-system cilium cilium/cilium --version 1.14.5 -f cilium-values-feature-by-feature.yaml

This will enable the bgp control plane feature:

 1andreasm@linuxmgmt01:~/test-cluster-1$ cilium config view
 2agent-not-ready-taint-key                         node.cilium.io/agent-not-ready
 3arping-refresh-period                             30s
 4auto-direct-node-routes                           false
 5bpf-lb-external-clusterip                         false
 6bpf-lb-map-max                                    65536
 7bpf-lb-sock                                       false
 8bpf-map-dynamic-size-ratio                        0.0025
 9bpf-policy-map-max                                16384
10bpf-root                                          /sys/fs/bpf
11cgroup-root                                       /run/cilium/cgroupv2
12cilium-endpoint-gc-interval                       5m0s
13cluster-id                                        0
14cluster-name                                      test-cluster-1
15cluster-pool-ipv4-cidr                            10.0.0.0/8
16cluster-pool-ipv4-mask-size                       24
17cni-exclusive                                     true
18cni-log-file                                      /var/run/cilium/cilium-cni.log
19cnp-node-status-gc-interval                       0s
20custom-cni-conf                                   false
21debug                                             false
22disable-cnp-status-updates                        true
23egress-gateway-reconciliation-trigger-interval    1s
24enable-auto-protect-node-port-range               true
25enable-bgp-control-plane                          true #### Here it is

Now I need to create a yaml that contains the BGP peering info needed for my workers to peer with my upstream router. For reference I will paste my lab topology here again:

my-lab-topology

When I apply my BGPPeeringPolicy yaml below, my nodes will establish a BGP peering session with the switch (their upstream BGP neighbor) they are connected to in the diagram above. This switch has also been configured to accept them as BGP neighbors. Please consider creating some ip-prefix lists/route-maps so we don't accidentally advertise routes that conflict, or that should not be advertised into the network, to prevent BGP blackholes etc.

Here is my BGP config I apply on my cluster:

 1apiVersion: "cilium.io/v2alpha1"
 2kind: CiliumBGPPeeringPolicy
 3metadata:
 4 name: 01-bgp-peering-policy
 5spec:
 6 nodeSelector:
 7   matchLabels:
 8     bgp-policy: worker-nodes
 9 virtualRouters:
10 - localASN: 64520
11   serviceSelector:
12     matchExpressions:
13        - {key: somekey, operator: NotIn, values: ['never-used-value']}
14   exportPodCIDR: false
15   neighbors:
16    - peerAddress: '10.160.1.1/24'
17      peerASN: 64512
18      eBGPMultihopTTL: 10
19      connectRetryTimeSeconds: 120
20      holdTimeSeconds: 12
21      keepAliveTimeSeconds: 4
22      gracefulRestart:
23        enabled: true
24        restartTimeSeconds: 120

Here we can also configure a serviceSelector to exclude services we don't want advertised. I used the example from the official docs, which allows everything. If I also have a good BGP route-map config on my switch side or upstream BGP neighbour, subnets that are not allowed will never be advertised.
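
One more detail: the nodeSelector in the policy above only matches nodes carrying the label bgp-policy=worker-nodes, so that label needs to be present on the nodes that should establish BGP sessions. In my lab that would be something like this (node names taken from the output below):

kubectl label nodes k8s-prod-node-01 k8s-prod-node-02 k8s-prod-node-03 bgp-policy=worker-nodes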

Now that I have applied it I can check the bgp peering status using the Cilium cli:

1andreasm@linuxmgmt01:~/prod-cluster-1/cilium$ cilium bgp peers
2Node               Local AS   Peer AS   Peer Address   Session State   Uptime     Family         Received   Advertised
3k8s-prod-node-01   64520      64512     10.160.1.1     established     13h2m53s   ipv4/unicast   47         6
4                                                                                  ipv6/unicast   0          0
5k8s-prod-node-02   64520      64512     10.160.1.1     established     13h2m25s   ipv4/unicast   45         6
6                                                                                  ipv6/unicast   0          0
7k8s-prod-node-03   64520      64512     10.160.1.1     established     13h2m27s   ipv4/unicast   43         6
8                                                                                  ipv6/unicast   0          0

I can see some prefixes being Advertised and some being Received, and the Session State is Established. I can also confirm this on my switch, along with the routes the nodes advertise:

 1GUZ-SW-01# show ip bgp summary
 2
 3 Peer Information
 4
 5  Remote Address  Remote-AS Local-AS State         Admin Status
 6  --------------- --------- -------- ------------- ------------
 7  10.160.1.114    64520     64512    Established   Start
 8  10.160.1.115    64520     64512    Established   Start
 9  10.160.1.116    64520     64512    Established   Start
10  172.18.1.1      64500     64512    Established   Start
11GUZ-SW-01# show ip bgp
12
13  Local AS            : 64512         Local Router-id  : 172.18.1.2
14  BGP Table Version   : 1706
15
16  Status codes: * - valid, > - best, i - internal, e - external, s - stale
17  Origin codes: i - IGP, e - EGP, ? - incomplete
18
19     Network            Nexthop         Metric     LocalPref  Weight AsPath
20     ------------------ --------------- ---------- ---------- ------ ---------
21* e  10.150.11.10/32    10.160.1.114    0                     0      64520    i
22*>e  10.150.11.10/32    10.160.1.115    0                     0      64520    i
23* e  10.150.11.10/32    10.160.1.116    0                     0      64520    i
24* e  10.150.11.199/32   10.160.1.114    0                     0      64520    i
25* e  10.150.11.199/32   10.160.1.115    0                     0      64520    i
26*>e  10.150.11.199/32   10.160.1.116    0                     0      64520    i
27* e  10.150.12.4/32     10.160.1.114    0                     0      64520    i
28* e  10.150.12.4/32     10.160.1.115    0                     0      64520    i
29*>e  10.150.12.4/32     10.160.1.116    0                     0      64520    i
30* e  10.150.14.32/32    10.160.1.114    0                     0      64520    i
31* e  10.150.14.32/32    10.160.1.115    0                     0      64520    i
32*>e  10.150.14.32/32    10.160.1.116    0                     0      64520    i
33* e  10.150.14.150/32   10.160.1.114    0                     0      64520    i
34*>e  10.150.14.150/32   10.160.1.115    0                     0      64520    i
35* e  10.150.14.150/32   10.160.1.116    0                     0      64520    i
36* e  10.150.15.100/32   10.160.1.114    0                     0      64520    i
37* e  10.150.15.100/32   10.160.1.115    0                     0      64520    i
38*>e  10.150.15.100/32   10.160.1.116    0                     0      64520    i

Now I can just create my IP pools and some services, and they should be immediately advertised and reachable in my network (unless they are stopped by some route-maps of course).

Note, it will only advertise the IP addresses in use by a service, not the whole subnets defined in my IP pools. That means I will only see host routes advertised (as seen above).

LB-IPAM - does it actually loadbalance?

It says LoadBalancer IPAM, but does it actually load-balance? Let me quickly put that to the test.

I have exposed a web service using serviceType loadBalancer consisting of three simple nginx web pods.

Here is the yaml I am using (I think I grabbed it from the official Cilium docs):

 1apiVersion: v1
 2kind: Service
 3metadata:
 4  name: test-lb
 5  namespace: example
 6  labels:
 7     env: prod #### added this label to match with my ip pool
 8spec:
 9  type: LoadBalancer
10  ports:
11  - port: 80
12    targetPort: 80
13    protocol: TCP
14    name: http
15  selector:
16    svc: test-lb
17---
18apiVersion: apps/v1
19kind: Deployment
20metadata:
21  name: nginx
22  namespace: example
23spec:
24  selector:
25    matchLabels:
26      svc: test-lb
27  template:
28    metadata:
29      labels:
30        svc: test-lb
31    spec:
32      containers:
33      - name: web
34        image: nginx
35        imagePullPolicy: IfNotPresent
36        ports:
37        - containerPort: 80
38        readinessProbe:
39          httpGet:
40            path: /
41            port: 80

Initially it deploys one pod; I will scale it up to three.
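
Scaling it is a one-liner against the deployment from the manifest above:

kubectl scale deployment nginx -n example --replicas=3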

They are running here, perfectly distributed across all my worker nodes:

1andreasm@linuxmgmt01:~/prod-cluster-1/cilium$ k get pods -n example -owide
2NAME                     READY   STATUS    RESTARTS   AGE    IP           NODE               NOMINATED NODE   READINESS GATES
3nginx-698447f456-5xczj   1/1     Running   0          18s    10.0.0.239   k8s-prod-node-01   <none>           <none>
4nginx-698447f456-plknk   1/1     Running   0          117s   10.0.4.167   k8s-prod-node-02   <none>           <none>
5nginx-698447f456-xs4jq   1/1     Running   0          18s    10.0.5.226   k8s-prod-node-03   <none>           <none>

And here is the LB service:

1example        test-lb                             LoadBalancer   10.21.69.190    10.150.11.48    80:31745/TCP

Now let me do a curl against the LoadBalancer IP and see if something changes:

 1Every 0.5s: curl http://10.150.11.48                                               linuxmgmt01: Wed Dec 20 07:58:14 2023
 2
 3  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
 4                                 Dload  Upload   Total   Spent    Left  Speed
 5   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0 100   567  100   567    0     0   184k
 6    0 --:--:-- --:--:-- --:--:--  276k
 7<!DOCTYPE html>
 8<html>
 9<head>
10Pod 2   ##### Notice this 
11<style>
12html { color-scheme: light dark; }
13body { width: 35em; margin: 0 auto;
14font-family: Tahoma, Verdana, Arial, sans-serif; }
15</style>
16</head>
17<body>
18Pod 2
19<p>If you see this page, the nginx web server is successfully installed and
20working. Further configuration is required.</p>
21
22<p>For online documentation and support please refer to
23<a href="http://nginx.org/">nginx.org</a>.<br/>
24Commercial support is available at
25<a href="http://nginx.com/">nginx.com</a>.</p>
26
27<p><em>Thank you for using nginx.</em></p>
28</body>
29</html>
 1Every 0.5s: curl http://10.150.11.48                                               linuxmgmt01: Wed Dec 20 07:59:15 2023
 2
 3  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
 4                                 Dload  Upload   Total   Spent    Left  Speed
 5   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0 100   567  100   567    0     0   110k
 6    0 --:--:-- --:--:-- --:--:--  138k
 7<!DOCTYPE html>
 8<html>
 9<head>
10Pod 1     ##### Notice this
11<style>
12html { color-scheme: light dark; }
13body { width: 35em; margin: 0 auto;
14font-family: Tahoma, Verdana, Arial, sans-serif; }
15</style>
16</head>
17<body>
18Pod 1
19<p>If you see this page, the nginx web server is successfully installed and
20working. Further configuration is required.</p>
21
22<p>For online documentation and support please refer to
23<a href="http://nginx.org/">nginx.org</a>.<br/>
24Commercial support is available at
25<a href="http://nginx.com/">nginx.com</a>.</p>
26
27<p><em>Thank you for using nginx.</em></p>
28</body>
29</html>
 1Every 0.5s: curl http://10.150.11.48                                               linuxmgmt01: Wed Dec 20 08:01:02 2023
 2
 3  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
 4                                 Dload  Upload   Total   Spent    Left  Speed
 5   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0 100   567  100   567    0     0   553k
 6    0 --:--:-- --:--:-- --:--:--  553k
 7<!DOCTYPE html>
 8<html>
 9<head>
10Pod 3    ##### Notice this
11<style>
12html { color-scheme: light dark; }
13body { width: 35em; margin: 0 auto;
14font-family: Tahoma, Verdana, Arial, sans-serif; }
15</style>
16</head>
17<body>
18Pod 3
19<p>If you see this page, the nginx web server is successfully installed and
20working. Further configuration is required.</p>
21
22<p>For online documentation and support please refer to
23<a href="http://nginx.org/">nginx.org</a>.<br/>
24Commercial support is available at
25<a href="http://nginx.com/">nginx.com</a>.</p>
26
27<p><em>Thank you for using nginx.</em></p>
28</body>
29</html>

Well, it is actually load-balancing the requests across the three different pods, running on three different nodes. And it took me only about 5 seconds to apply the ip-pool yaml and the bgppeeringpolicy yaml to get a fully functioning load-balancer.

A bit more info on this feature from the official Cilium docs:

LB IPAM works in conjunction with features like the Cilium BGP Control Plane (Beta). Where LB IPAM is responsible for allocation and assigning of IPs to Service objects and other features are responsible for load balancing and/or advertisement of these IPs.

So LB-IPAM hands out the IPs, BGP takes care of advertising them into my network (and, with several nodes advertising the same /32, the upstream router can spread traffic across the nodes), while the load balancing to the actual backend pods is done by Cilium's eBPF datapath, the same kube-proxy replacement I enabled earlier.
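
That per-pod translation can be inspected from one of the cilium agent pods, using the same exec pattern as earlier; the LoadBalancer IP should show up as a frontend with the three pod IPs as backends:

kubectl -n kube-system exec ds/cilium -- cilium service list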

Cilium Ingress

As I covered serviceType LoadBalancer above, let me quickly cover how to enable the Cilium Ingress controller. More info can be found here

I will head into my Helm value.yaml and edit the following:

 1ingressController:
 2  # -- Enable cilium ingress controller
 3  # This will automatically set enable-envoy-config as well.
 4  enabled: true
 5
 6  # -- Set cilium ingress controller to be the default ingress controller
 7  # This will let cilium ingress controller route entries without ingress class set
 8  default: false
 9
10  # -- Default ingress load balancer mode
11  # Supported values: shared, dedicated
12  # For granular control, use the following annotations on the ingress resource
13  # ingress.cilium.io/loadbalancer-mode: shared|dedicated,
14  loadbalancerMode: dedicated

The Cilium Ingress controller can be dedicated or shared, meaning it can support a shared IP for multiple Ingress objects. Nice if we are limited on IP addresses etc. Additionally, we can edit the shared Ingress to be configured with a specific IP like this:

 1  # -- Load-balancer service in shared mode.
 2  # This is a single load-balancer service for all Ingress resources.
 3  service:
 4    # -- Service name
 5    name: cilium-ingress
 6    # -- Labels to be added for the shared LB service
 7    labels: {}
 8    # -- Annotations to be added for the shared LB service
 9    annotations: {}
10    # -- Service type for the shared LB service
11    type: LoadBalancer
12    # -- Configure a specific nodePort for insecure HTTP traffic on the shared LB service
13    insecureNodePort: ~
14    # -- Configure a specific nodePort for secure HTTPS traffic on the shared LB service
15    secureNodePort : ~
16    # -- Configure a specific loadBalancerClass on the shared LB service (requires Kubernetes 1.24+)
17    loadBalancerClass: ~
18    # -- Configure a specific loadBalancerIP on the shared LB service
19    loadBalancerIP : 10.150.11.100 ### Set your preferred IP here
20    # -- Configure if node port allocation is required for LB service
21    # ref: https://kubernetes.io/docs/concepts/services-networking/service/#load-balancer-nodeport-allocation
22    allocateLoadBalancerNodePorts: ~

This will dictate that the shared Ingress object will get this IP address.

Now save changes and run the helm upgrade command:

 1andreasm@linuxmgmt01:~/test-cluster-1$ helm upgrade -n kube-system cilium cilium/cilium --version 1.14.5 -f cilium-values-feature-by-feature.yaml
 2Release "cilium" has been upgraded. Happy Helming!
 3NAME: cilium
 4LAST DEPLOYED: Wed Dec 20 08:18:58 2023
 5NAMESPACE: kube-system
 6STATUS: deployed
 7REVISION: 15
 8TEST SUITE: None
 9NOTES:
10You have successfully installed Cilium with Hubble Relay and Hubble UI.
11
12Your release version is 1.14.5.
13
14For any further help, visit https://docs.cilium.io/en/v1.14/gettinghelp

Now is also a good time to restart the Cilium Operator and Cilium Agents to re-read the new configMap.

1andreasm@linuxmgmt01:~/test-cluster-1$ kubectl -n kube-system rollout restart deployment/cilium-operator
2deployment.apps/cilium-operator restarted
3andreasm@linuxmgmt01:~/test-cluster-1$ kubectl -n kube-system rollout restart ds/cilium
4daemonset.apps/cilium restarted

Also check the Cilium status by running this:

 1andreasm@linuxmgmt01:~/test-cluster-1$ cilium status
 2    /¯¯\
 3 /¯¯\__/¯¯\    Cilium:             OK
 4 \__/¯¯\__/    Operator:           OK
 5 /¯¯\__/¯¯\    Envoy DaemonSet:    disabled (using embedded mode)
 6 \__/¯¯\__/    Hubble Relay:       OK
 7    \__/       ClusterMesh:        disabled
 8
 9Deployment             hubble-ui          Desired: 1, Ready: 1/1, Available: 1/1
10DaemonSet              cilium             Desired: 4, Ready: 4/4, Available: 4/4
11Deployment             cilium-operator    Desired: 2, Ready: 2/2, Available: 2/2
12Deployment             hubble-relay       Desired: 1, Ready: 1/1, Available: 1/1
13Containers:            cilium             Running: 4
14                       hubble-ui          Running: 1
15                       cilium-operator    Running: 2
16                       hubble-relay       Running: 1
17Cluster Pods:          4/4 managed by Cilium
18Helm chart version:    1.14.5
19Image versions         hubble-ui          quay.io/cilium/hubble-ui:v0.12.1@sha256:9e5f81ee747866480ea1ac4630eb6975ff9227f9782b7c93919c081c33f38267: 1
20                       hubble-ui          quay.io/cilium/hubble-ui-backend:v0.12.1@sha256:1f86f3400827a0451e6332262467f894eeb7caf0eb8779bd951e2caa9d027cbe: 1
21                       cilium-operator    quay.io/cilium/operator-generic:v1.14.5@sha256:303f9076bdc73b3fc32aaedee64a14f6f44c8bb08ee9e3956d443021103ebe7a: 2
22                       hubble-relay       quay.io/cilium/hubble-relay:v1.14.5@sha256:dbef89f924a927043d02b40c18e417c1ea0e8f58b44523b80fef7e3652db24d4: 1
23                       cilium             quay.io/cilium/cilium:v1.14.5@sha256:d3b287029755b6a47dee01420e2ea469469f1b174a2089c10af7e5e9289ef05b: 4

As soon as I enable the Ingress controller it will create this object for me, and provide an IngressClass in my cluster.

1andreasm@linuxmgmt01:~/test-cluster-1$ k get ingressclasses.networking.k8s.io
2NAME     CONTROLLER                     PARAMETERS   AGE
3cilium   cilium.io/ingress-controller   <none>       86s

Now I suddenly have an IngressController also. Let me deploy a test app to test this.

First I deploy two pods with their corresponding clusterIP services:

 1kind: Pod
 2apiVersion: v1
 3metadata:
 4  name: apple-app
 5  labels:
 6    app: apple
 7  namespace: fruit
 8spec:
 9  containers:
10    - name: apple-app
11      image: hashicorp/http-echo
12      args:
13        - "-text=apple"
14
15---
16
17kind: Service
18apiVersion: v1
19metadata:
20  name: apple-service
21  namespace: fruit
22spec:
23  selector:
24    app: apple
25  ports:
26    - port: 5678 # Default port for image
 1kind: Pod
 2apiVersion: v1
 3metadata:
 4  name: banana-app
 5  labels:
 6    app: banana
 7  namespace: fruit
 8spec:
 9  containers:
10    - name: banana-app
11      image: hashicorp/http-echo
12      args:
13        - "-text=banana"
14
15---
16
17kind: Service
18apiVersion: v1
19metadata:
20  name: banana-service
21  namespace: fruit
22spec:
23  selector:
24    app: banana
25  ports:
26    - port: 5678 # Default port for image

And then the Ingress pointing to the two services apple and banana:

 1apiVersion: networking.k8s.io/v1
 2kind: Ingress
 3metadata:
 4  name: ingress-example
 5  namespace: fruit
 6  labels:
 7    env: test
 8  annotations:
 9    ingress.cilium.io/loadbalancer-mode: dedicated
10spec:
11  ingressClassName: cilium
12  rules:
13    - host: fruit.my-domain.net
14      http:
15        paths:
16        - path: /apple
17          pathType: Prefix
18          backend:
19            service:
20              name: apple-service
21              port:
22                number: 5678
23        - path: /banana
24          pathType: Prefix
25          backend:
26            service:
27              name: banana-service
28              port:
29                number: 5678

Notice that the only annotation I have used is ingress.cilium.io/loadbalancer-mode: dedicated; the other accepted value is shared. With this annotation I can decide per Ingress whether it should use the IP of the shared Ingress service or get a dedicated one with its own IP address. If I don't want this Ingress to consume its own IP address I use shared; if I do want a dedicated IP for this Ingress I use dedicated. The shared LB service for Ingress objects is automatically created when enabling the IngressController. You can see this here:

1andreasm@linuxmgmt01:~$ k get svc -n kube-system
2NAME                    TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                      AGE
3cilium-shared-ingress   LoadBalancer   10.21.104.15    10.150.15.100   80:30810/TCP,443:31104/TCP   46h

I have configured this shared ingress to use a specific IP address. When using dedicated, Cilium will instead create a cilium-ingress-name-of-Ingress service with its own IP address (as can be seen below).

As soon as this has been applied, Cilium will automatically take care of the serviceType LoadBalancer object by assigning an IP address from one of the IP pools that matches my service selectors (depending on shared or dedicated, of course). Then BGP will automatically advertise the host-route to my BGP router, and the Ingress object should now be listening for HTTP requests on this IP. Here are the services/objects created:

1andreasm@linuxmgmt01:~/prod-cluster-1/cilium$ k get ingress -n fruit
2NAME              CLASS    HOSTS               ADDRESS       PORTS   AGE
3ingress-example   cilium   fruit.my-domain.net   10.150.12.4   80      44h
4andreasm@linuxmgmt01:~/prod-cluster-1/cilium$ k get svc -n fruit
5NAME                             TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
6apple-service                    ClusterIP      10.21.243.103   <none>        5678/TCP                     4d9h
7banana-service                   ClusterIP      10.21.124.111   <none>        5678/TCP                     4d9h
8cilium-ingress-ingress-example   LoadBalancer   10.21.50.107    10.150.12.4   80:30792/TCP,443:31553/TCP   43h

Let me see if the Ingress responds to my http requests (I have registered the IP above with a DNS record so I can resolve it):

1andreasm@linuxmgmt01:~$ curl http://fruit.my-domain.net/apple
2apple
3andreasm@linuxmgmt01:~$ curl http://fruit.my-domain.net/banana
4banana

The Ingress works. Again, for more information on the Cilium IngressController (like supported annotations etc.) head over here
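As a small side note, if I instead wanted this particular Ingress to ride on the shared cilium-shared-ingress IP, the only change would be the annotation value. A minimal sketch (the rules are omitted, they would be the same as above):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-example
  namespace: fruit
  annotations:
    # shared = reuse the IP of the shared ingress service instead of allocating a dedicated one
    ingress.cilium.io/loadbalancer-mode: shared
spec:
  ingressClassName: cilium
  # ...same rules as in the dedicated example above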

Cilium Gateway API

Another ingress solution to use is Gateway API; read more about that here

Gateway API is an "evolution" of the regular Ingress, so it is natural to take it into consideration going forward. Again, Cilium supports Gateway API out of the box; I will use Helm to enable it, and it just needs a couple of CRDs to be installed first. Read more on Cilium's Gateway API support here.

To enable Cilium Gateway API I did the following:

  • Edit my Helm value.yaml with the following setting:
1gatewayAPI:
2  # -- Enable support for Gateway API in cilium
3  # This will automatically set enable-envoy-config as well.
4  enabled: true
  • Install these CRDs before running the helm upgrade command:
1$ kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v0.7.0/config/crd/standard/gateway.networking.k8s.io_gatewayclasses.yaml
2$ kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v0.7.0/config/crd/standard/gateway.networking.k8s.io_gateways.yaml
3$ kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v0.7.0/config/crd/standard/gateway.networking.k8s.io_httproutes.yaml
4$ kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v0.7.0/config/crd/standard/gateway.networking.k8s.io_referencegrants.yaml
5$ kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v0.7.0/config/crd/experimental/gateway.networking.k8s.io_tlsroutes.yaml
  • Run the helm upgrade command:
 1andreasm@linuxmgmt01:~/test-cluster-1$ helm upgrade -n kube-system cilium cilium/cilium --version 1.14.5 -f cilium-values-feature-by-feature.yaml
 2Release "cilium" has been upgraded. Happy Helming!
 3NAME: cilium
 4LAST DEPLOYED: Wed Dec 20 11:12:55 2023
 5NAMESPACE: kube-system
 6STATUS: deployed
 7REVISION: 16
 8TEST SUITE: None
 9NOTES:
10You have successfully installed Cilium with Hubble Relay and Hubble UI.
11
12Your release version is 1.14.5.
13
14For any further help, visit https://docs.cilium.io/en/v1.14/gettinghelp
15
16andreasm@linuxmgmt01:~/test-cluster-1$ kubectl -n kube-system rollout restart deployment/cilium-operator
17deployment.apps/cilium-operator restarted
18andreasm@linuxmgmt01:~/test-cluster-1$ kubectl -n kube-system rollout restart ds/cilium
19daemonset.apps/cilium restarted
Info

It is very important to install the above CRDs before attempting to enable the Gateway API in Cilium. Otherwise it will not create any GatewayClass, meaning no Gateway API is realized.
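A quick sanity check that the CRDs are actually in place before running the helm upgrade could be something like this:

# Should list gatewayclasses, gateways, httproutes, referencegrants and tlsroutes
kubectl get crd | grep gateway.networking.k8s.io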

Now I should have a gatewayClass:

1andreasm@linuxmgmt01:~/test-cluster-1$ k get gatewayclasses.gateway.networking.k8s.io
2NAME     CONTROLLER                     ACCEPTED   AGE
3cilium   io.cilium/gateway-controller   True       96s

Now I can just go ahead and create a gateway and some HTTPRoutes. The external IP address for my gateway is provided by my ip-pools, the same way as for the IngressController.

Let's go ahead and create one, and for this exercise I will be creating a gateway with corresponding HTTPRoutes to support my Harbor registry installation.

Below is the config I have used; it is also configured to do an HTTP-to-HTTPS redirect:

  1apiVersion: gateway.networking.k8s.io/v1beta1
  2kind: Gateway
  3metadata:
  4  name: harbor-tls-gateway
  5  namespace: harbor
  6spec:
  7  gatewayClassName: cilium
  8  listeners:
  9  - name: http
 10    protocol: HTTP
 11    port: 80
 12    hostname: "registry.my-domain.net"
 13  - name: https
 14#    allowedRoutes:
 15#      namespaces:
 16#        from: Same
 17    protocol: HTTPS
 18    port: 443
 19    hostname: "registry.my-domain.net"
 20#    allowedRoutes:
 21#      namespaces:
 22#        from: Same
 23    tls:
 24      mode: Terminate
 25      certificateRefs:
 26      - kind: Secret
 27        name: harbor-tls-prod
 28        namespace: harbor
 29
 30---
 31apiVersion: gateway.networking.k8s.io/v1beta1
 32kind: HTTPRoute
 33metadata:
 34  name: harbor-tls-redirect
 35  namespace: harbor
 36spec:
 37  parentRefs:
 38  - name: harbor-tls-gateway
 39    sectionName: http
 40    namespace: harbor
 41  hostnames:
 42  - registry.my-domain.net
 43  rules:
 44  - filters:
 45    - type: RequestRedirect
 46      requestRedirect:
 47        scheme: https
 48        port: 443
 49
 50---
 51apiVersion: gateway.networking.k8s.io/v1beta1
 52kind: HTTPRoute
 53metadata:
 54  name: harbor-api-route
 55  namespace: harbor
 56spec:
 57  parentRefs:
 58  - name: harbor-tls-gateway
 59    sectionName: https
 60    namespace: harbor
 61  hostnames:
 62  - registry.my-domain.net
 63  rules:
 64  - matches:
 65    - path:
 66        type: PathPrefix
 67        value: /api/
 68    backendRefs:
 69    - name: harbor
 70      port: 80
 71  - matches:
 72    - path:
 73        type: PathPrefix
 74        value: /service/
 75    backendRefs:
 76    - name: harbor
 77      port: 80
 78  - matches:
 79    - path:
 80        type: PathPrefix
 81        value: /v2/
 82    backendRefs:
 83    - name: harbor
 84      port: 80
 85  - matches:
 86    - path:
 87        type: PathPrefix
 88        value: /chartrepo/
 89    backendRefs:
 90    - name: harbor
 91      port: 80
 92  - matches:
 93    - path:
 94        type: PathPrefix
 95        value: /c/
 96    backendRefs:
 97    - name: harbor
 98      port: 80
 99  - matches:
100    - path:
101        type: PathPrefix
102        value: /
103    backendRefs:
104    - name: harbor-portal
105      port: 80

I have already created the certificate as the secret (harbor-tls-prod) I refer to in the yaml above.
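For reference, if the certificate and key already exist as files, a TLS secret like the harbor-tls-prod one referenced above can be created with a one-liner along these lines (the file names are placeholders):

kubectl create secret tls harbor-tls-prod \
  --cert=registry.my-domain.net.crt \
  --key=registry.my-domain.net.key \
  -n harbor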

Let's have a look at the gateway, the HTTPRoutes and the svc that provides the external IP address, as well as the Harbor services the HTTPRoutes refer to:

1#### gateway created #####
2andreasm@linuxmgmt01:~/prod-cluster-1/harbor$ k get gateway -n harbor
3NAME                 CLASS    ADDRESS        PROGRAMMED   AGE
4harbor-tls-gateway   cilium   10.150.14.32   True         28h
1#### HTTPROUTES ####
2andreasm@linuxmgmt01:~/prod-cluster-1/harbor$ k get httproutes.gateway.networking.k8s.io -n harbor
3NAME                  HOSTNAMES                  AGE
4harbor-api-route      ["registry.my-domain.net"]   28h
5harbor-tls-redirect   ["registry.my-domain.net"]   28h
1andreasm@linuxmgmt01:~/prod-cluster-1/harbor$ k get svc -n harbor
2NAME                                TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)                      AGE
3cilium-gateway-harbor-tls-gateway   LoadBalancer   10.21.27.25     10.150.14.32   80:32393/TCP,443:31932/TCP   28h

Then the services that Harbor installs:

 1andreasm@linuxmgmt01:~/prod-cluster-1/harbor$ k get svc -n harbor
 2NAME                                TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)                      AGE
 3harbor                              ClusterIP      10.21.101.75    <none>         80/TCP                       40h
 4harbor-core                         ClusterIP      10.21.1.68      <none>         80/TCP                       40h
 5harbor-database                     ClusterIP      10.21.193.216   <none>         5432/TCP                     40h
 6harbor-jobservice                   ClusterIP      10.21.120.54    <none>         80/TCP                       40h
 7harbor-portal                       ClusterIP      10.21.64.138    <none>         80/TCP                       40h
 8harbor-redis                        ClusterIP      10.21.213.160   <none>         6379/TCP                     40h
 9harbor-registry                     ClusterIP      10.21.212.118   <none>         5000/TCP,8080/TCP            40h
10harbor-trivy                        ClusterIP      10.21.138.224   <none>         8080/TCP                     40h

Now I can reach my Harbor registry using both the UI and the docker CLI, all through the Gateway API...

harbor
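A quick way to verify that the RequestRedirect filter on the http listener is doing its job is to hit the plain HTTP endpoint and check the response headers (output not included here):

# Expect a 3xx response with a Location: https://registry.my-domain.net/... header
curl -I http://registry.my-domain.net/
# The HTTPS listener should answer directly
curl -I https://registry.my-domain.net/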

I will use Harbor in the next chapter with Hubble UI..

Hubble UI

As you may recall, I did enable the two features Hubble Relay and Hubble UI, as we can see below:

1andreasm@linuxmgmt01:~$ k get svc -n kube-system
2NAME             TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                      AGE
3hubble-metrics   ClusterIP      None            <none>          9965/TCP                     35h
4hubble-peer      ClusterIP      10.23.182.223   <none>          443/TCP                      42h
5hubble-relay     ClusterIP      10.23.182.76    <none>          80/TCP                       40h
6hubble-ui        ClusterIP      10.23.31.4      <none>          80/TCP                       36h

The Hubble UI is not exposed so that I can reach it from outside the Kubernetes cluster. So let me start by creating a serviceType LoadBalancer service to expose the Hubble UI ClusterIP service. Below is the yaml I use for that:

 1apiVersion: v1
 2kind: Service
 3metadata:
 4  name: hubble-ui-lb
 5  namespace: kube-system
 6  labels:
 7    env: prod
 8  annotations:
 9    "io.cilium/lb-ipam-ips": "10.150.11.10"
10spec:
11  type: LoadBalancer
12  ports:
13  - port: 8081
14    targetPort: 8081
15    protocol: TCP
16    name: http
17  selector:
18    k8s-app: hubble-ui

Apply it and I should see the service:

1andreasm@linuxmgmt01:~/prod-cluster-1/cilium/services$ k get svc -n kube-system
2NAME                    TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                      AGE
3hubble-ui-lb            LoadBalancer   10.21.47.47     10.150.11.10    8081:32328/TCP               3d11h

There it is. Now I open my browser and point it to this IP:port

hubble-ui-frontpage
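As a side note, if you only need ad-hoc access to the UI from the jumphost, the cilium-cli can port-forward it locally instead of exposing a LoadBalancer service:

# Port-forwards the hubble-ui service to localhost (port 12000 by default) and opens the browser
cilium hubble ui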

Now let me go to a test application I have deployed in the yelb namespace, by clicking on it in the list or in the drop-down in the top left corner:

Soo much empty...

yelb-ns

I can see the pods are running:

1andreasm@linuxmgmt01:~/prod-cluster-1/cilium/services$ k get pods -n yelb
2NAME                            READY   STATUS    RESTARTS   AGE
3redis-server-84f4bf49b5-fq26l   1/1     Running   0          5d18h
4yelb-appserver-6dc7cd98-s6kt7   1/1     Running   0          5d18h
5yelb-db-84d6f6fc6c-m7xvd        1/1     Running   0          5d18h

They are probably not so interested in talking to each other unless they have to. Let me deploy the frontend service and create some interactions.
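The yelb-lb-frontend.yaml itself is not shown here, but the service part of it is just another LB-IPAM-backed LoadBalancer service in front of the yelb-ui deployment. A rough sketch (the selector label is an assumption based on the upstream Yelb manifests, and the yelb-ui deployment is omitted):

apiVersion: v1
kind: Service
metadata:
  name: yelb-ui
  namespace: yelb
  labels:
    app: yelb-ui
spec:
  type: LoadBalancer      # Cilium LB IPAM assigns the external IP, BGP advertises it
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP
    name: http
  selector:
    app: yelb-ui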

1andreasm@linuxmgmt01:~/prod-cluster-1/cilium$ k apply -f yelb-lb-frontend.yaml
2service/yelb-ui created
3deployment.apps/yelb-ui created
4andreasm@linuxmgmt01:~/prod-cluster-1/cilium$ k get svc -n yelb
5NAME             TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)        AGE
6redis-server     ClusterIP      10.21.67.23     <none>          6379/TCP       5d18h
7yelb-appserver   ClusterIP      10.21.81.95     <none>          4567/TCP       5d18h
8yelb-db          ClusterIP      10.21.188.43    <none>          5432/TCP       5d18h
9yelb-ui          LoadBalancer   10.21.207.114   10.150.11.221   80:32335/TCP   49s

I will now open the Yelb UI and do some quick "votes"

yelb-ui

Instantly, even by just opening the Yelb webpage, I get a lot of flow information in Hubble. And not only that, it automatically creates a "service map" so I can see the involved services in the Yelb app.

yelb-hubble
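The same flows can also be tailed from the command line with the hubble CLI, assuming it is installed on the jumphost and the relay is reachable (for example via cilium hubble port-forward):

# Forward hubble-relay to localhost:4245 in the background
cilium hubble port-forward &
# Stream the flows for the yelb namespace
hubble observe --namespace yelb --follow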

This only shows me L4 information. What about Layer 7? Let's test that as well by heading over to Harbor.

In Hubble I will switch to the namespace harbor

A nice diagram with all involved services, but no L7 information yet. Well, there is, but I have had no recent interactions with Harbor through the Gateway API. As soon as I use docker or the web UI against Harbor, what happens then?

harbor-hubble-1

What's this, an ingress object?

harbor-hubble-2

Now when I click on the ingress object:

harbor-ingress-l7

Look at the L7 info coming in there.

I logged out from Harbor:

log-out

Logged back in:

log-in

Browsing the Harbor Projects/repositories:

browsing

A very rich set of information presented in a snappy and responsive dashboard. It is instantly updated as soon as a request comes in.

For now, this concludes this post.

It has been a nice experience getting a bit more under the hood of Cilium, and so far I must say it looks very good.

Things I have not covered yet which I will cover at a later stage

I will update this post with some other features at a later stage. Some of the features I am interested in looking at are:

  • Security policies with Cilium - just have a quick look here, many interesting topics (Host Firewall?)
  • Egress
  • Cilium Multi-Cluster