Instead of me using my own words I will just copy the text from the official Cilium website:
eBPF-based Networking, Observability, Security
Cilium is an open source, cloud native solution for providing, securing, and observing network connectivity between workloads, fueled by the revolutionary Kernel technology eBPF
eBPF is a revolutionary technology with origins in the Linux kernel that can run sandboxed programs in a privileged context such as the operating system kernel. It is used to safely and efficiently extend the capabilities of the kernel without requiring to change kernel source code or load kernel modules.
As it is always interesting to learn new technology, I thought it was about time to write a post about Cilium. At first look Cilium is kind of a Swiss Army knife with a lot of interesting features. I will begin this post with a basic installation of Cilium on a new cluster (upstream K8s based on Ubuntu nodes). Then I will continue with some of the features I found interesting, and needed myself in my lab, and how to enable and configure them.
This post is divided into dedicated sections for the installation part and the different features respectively, starting with the installation of Cilium as the CNI in my Kubernetes cluster.
Preparations
This post assumes the following:
Already prepared the Kubernetes nodes with all the software installed ready to do the kubeadm init.
A jumphost or Linux mgmt vm/server to operate from
Helm installed and configured on the Linux jumphost
Cilium can be installed using Helm or using Cilium's nifty cilium-cli tool.
Info
One can use Helm to configure/install features but also the Cilium cli tool. In my post I will mostly use Helm when adding some features or changing certain settings and cilium-cli for others just to showcase how easy it is to use cilium cli for certain features/tasks.
According to the official docs:
Install the latest version of the Cilium CLI. The Cilium CLI can be used to install Cilium, inspect the state of a Cilium installation, and enable/disable various features (e.g. clustermesh, Hubble).
The first Cilium feature in this post is how it can fully replace kube-proxy by providing distributed load balancing using eBPF. Naturally I would like to use this feature, which means I need to deploy my Kubernetes cluster without kube-proxy. That is easiest to do during the initial bring-up of the Kubernetes cluster. It can also be done after bring-up, see more info here
kubeadm init with no-kube-proxy
To bring up my Kubernetes cluster without kube-proxy, this is the command I will use on my first control-plane node:
The parameter that disables kube-proxy is --skip-phases=addon/kube-proxy.
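As a minimal sketch, the init command could look like the following. Only `--skip-phases=addon/kube-proxy` is confirmed by the text. The endpoint is taken from the join commands in the output, `--upload-certs` is assumed because the output prints a certificate key, and the service CIDR is a guess based on the 10.23.0.1 address in the API server certificate. The function name is hypothetical, and nothing runs until the function is called.

```shell
# Sketch only: everything except --skip-phases=addon/kube-proxy is an assumption.
CONTROL_PLANE_ENDPOINT="test-cluster-1.my-domain.net:6443"  # as seen in the join commands
SERVICE_CIDR="10.23.0.0/16"                                 # guessed from 10.23.0.1 in the apiserver cert

init_first_control_plane() {
  sudo kubeadm init \
    --control-plane-endpoint "${CONTROL_PLANE_ENDPOINT}" \
    --upload-certs \
    --service-cidr "${SERVICE_CIDR}" \
    --skip-phases=addon/kube-proxy   # do not deploy the kube-proxy addon
}

# Run on the first control-plane node only:
# init_first_control_plane
```

Wrapping the command in a function keeps the flags readable and makes it hard to fire off by accident on the wrong host.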
I1219 14:08:17.376790 13327 version.go:256] remote version is much newer: v1.29.0; falling back to: stable-1.28
[init] Using Kubernetes version: v1.28.4
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
W1219 14:08:33.520592 13327 checks.go:835] detected that the sandbox image "registry.k8s.io/pause:3.5" of the container runtime is inconsistent with that used by kubeadm. It is recommended that using "registry.k8s.io/pause:3.9" as the CRI sandbox image.
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master-01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local test-cluster-1.my-domain.net] and IPs [10.23.0.1 10.160.1.10]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master-01 localhost] and IPs [10.160.1.10 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master-01 localhost] and IPs [10.160.1.10 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[apiclient] All control plane components are healthy after 106.047476 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
3c9fa959a7538baaaf484e931ade45fbad07934dc40d456cae54839a7d888715
[mark-control-plane] Marking the node k8s-master-01 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node k8s-master-01 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: q495cj.apdasczda14j87tc
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join test-cluster-1.my-domain.net:6443 --token q4da14j87tc \
    --discovery-token-ca-cert-hash sha256: \
    --control-plane --certificate-key

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join test-cluster-1.my-domain.net:6443 --token q4aczda14j87tc \
    --discovery-token-ca-cert-hash sha256:
NB: when joining the nodes, notice that kubeadm "complains": No kubeproxy.config.k8s.io/v1alpha1 config is loaded. Continuing without it: configmaps "kube-proxy". That is expected here, since the kube-proxy addon was skipped.
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W1219 16:25:57.203844 1279 configset.go:78] Warning: No kubeproxy.config.k8s.io/v1alpha1 config is loaded. Continuing without it: configmaps "kube-proxy" is forbidden: User "system:bootstrap:q495cj" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
When all worker nodes have been joined:
andreasm@linuxmgmt01:~/test-cluster-1$ k get nodes
NAME            STATUS   ROLES           AGE     VERSION
k8s-master-01   Ready    control-plane   135m    v1.28.2
k8s-worker-01   Ready    <none>          12s     v1.28.2
k8s-worker-02   Ready    <none>          38s     v1.28.2
k8s-worker-03   Ready    <none>          4m28s   v1.28.2
Notice the cluster is not fully ready yet: CoreDNS is pending and there is no CNI in place to cover IPAM etc.
andreasm@linuxmgmt01:~/test-cluster-1$ k get pods -A
NAMESPACE     NAME                                    READY   STATUS    RESTARTS      AGE
kube-system   coredns-5dd5756b68-c5xml                0/1     Pending   0             35m
kube-system   coredns-5dd5756b68-fgdzj                0/1     Pending   0             35m
kube-system   etcd-k8s-master-01                      1/1     Running   0             35m
kube-system   kube-apiserver-k8s-master-01            1/1     Running   0             35m
kube-system   kube-controller-manager-k8s-master-01   1/1     Running   1 (19m ago)   35m
kube-system   kube-scheduler-k8s-master-01            1/1     Running   1 (19m ago)   35m
Now it's time to jump over to my jumphost, where I will do all the remaining configuration of and interaction with my test-cluster-1.
Install Cilium CNI
From my jumphost I already have all the tools I need to deploy Cilium. To install the Cilium CNI I will just use the cilium-cli tool as it is so easy. With a very short command it will automatically install Cilium on all my worker/control-plane nodes. The cilium-cli will act according to the kube context you are in, so make sure you are in the correct context (the context that needs Cilium to be installed):
andreasm@linuxmgmt01:~/test-cluster-1$ k config current-context
test-cluster-1-admin@kubernetes

andreasm@linuxmgmt01:~/test-cluster-1$ cilium install --version 1.14.5
ℹ️ Using Cilium version 1.14.5
🔮 Auto-detected cluster name: test-cluster-1
🔮 Auto-detected kube-proxy has not been installed
ℹ️ Cilium will fully replace all functionalities of kube-proxy
That's it... 😄
Version 1.14.5 is the latest stable release at the time of writing.
Install Cilium on a cluster with no kube-proxy
If I have prepared my cluster as above with no kube-proxy I need to install Cilium using the following command:
Here the API server address and API_SERVER_PORT point to one of my k8s control-plane nodes (I did try to use the load-balanced IP for the k8s API endpoint, as I have 3 control-plane nodes, but that did not work out so I went with the IP of my first control-plane node). The value file is the file I use to set all the Cilium settings, more on that later.
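A minimal sketch of such a Helm install, assuming the chart repo is added as `cilium` and using the `kubeProxyReplacement`, `k8sServiceHost` and `k8sServicePort` chart values. `API_SERVER_IP` is a name introduced here for illustration, its value is the first control-plane IP seen in the kubeadm output, port 6443 is the endpoint port from the join commands, and the values file name is borrowed from the `helm upgrade` command used later in this post.

```shell
# Assumed values: adjust to your own control-plane node.
API_SERVER_IP="10.160.1.10"   # first control-plane node, as seen in the kubeadm output
API_SERVER_PORT="6443"        # API server port from the join commands

install_cilium_without_kube_proxy() {
  helm repo add cilium https://helm.cilium.io/
  helm repo update
  helm install cilium cilium/cilium --version 1.14.5 \
    --namespace kube-system \
    --set kubeProxyReplacement=true \
    --set k8sServiceHost="${API_SERVER_IP}" \
    --set k8sServicePort="${API_SERVER_PORT}" \
    -f cilium-values-feature-by-feature.yaml
}

# Check `kubectl config current-context` first, then:
# install_cilium_without_kube_proxy
```

Without kube-proxy there is no in-cluster Service for the API server yet, which is why the agent has to be told the address and port explicitly.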
Did you notice above that the Cilium installer discovered there was no kube-proxy and told me it would replace all the features of kube-proxy? Well, it did. Let's check the Cilium config and see if that is also reflected there. Look for this key-value:
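One way to look this up, as a sketch: `cilium config view` prints the cilium-config ConfigMap, and the agent's own status output has a KubeProxyReplacement line. The function name is hypothetical, and the exact value printed depends on the Cilium version, so none is shown here. On Cilium 1.15 and newer the in-pod binary is called `cilium-dbg` instead of `cilium`.

```shell
check_kube_proxy_replacement() {
  # ConfigMap view through cilium-cli
  cilium config view | grep -i "kube-proxy-replacement"
  # What the running agent reports (Cilium 1.14 binary name)
  kubectl -n kube-system exec ds/cilium -- cilium status | grep "KubeProxyReplacement"
}

# From the jumphost, in the right kube context:
# check_kube_proxy_replacement
```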
Being able to easily list all features and their status is really helpful.
It took a couple of seconds and the Cilium CNI was installed. Now the fun begins: exploring some of the features. Let's tag along.
Enabling features using Helm
When I installed Cilium using the cilium-cli tool, it actually deployed it using Helm in the background. Let's see if there is a Helm release in the kube-system namespace:
andreasm@linuxmgmt01:~/test-cluster-1$ helm list -n kube-system
NAME     NAMESPACE     REVISION   UPDATED                                   STATUS     CHART           APP VERSION
cilium   kube-system   1          2023-12-19 13:50:40.121866679 +0000 UTC   deployed   cilium-1.14.5   1.14.5
Well, there it is.
That makes it all more interesting. As I will use Helm to update certain parameters going forward in this post, I will take a "snapshot" of the current values of the release above and alter them in the next sections when I enable additional features. What do the values look like now?
I prefer making the changes in a dedicated value.yaml file and running helm upgrade -f value.yaml each time I want to change something, so going forward I will be adding/changing settings in this value.yaml file to update the settings in Cilium.
I grabbed the default value yaml from the Helm repo and use that to alter the settings in the next sections.
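As a sketch, both the snapshot of the running release and the chart defaults can be dumped with Helm. The function name and the `cilium-values-current.yaml` file name are hypothetical; the second file name matches the one passed to `helm upgrade` later in this post.

```shell
snapshot_cilium_values() {
  # Values the release currently runs with (user-supplied only; add --all for computed values)
  helm get values cilium -n kube-system -o yaml > cilium-values-current.yaml
  # Full default values for the chart version in use, to be edited feature by feature
  helm show values cilium/cilium --version 1.14.5 > cilium-values-feature-by-feature.yaml
}

# snapshot_cilium_values
```

Keeping the edited values file in version control gives a simple history of which feature was turned on when.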
Enabling features using Cilium cli
The cilium-cli can also be used to enable or disable certain features like Hubble and clustermesh. An example of how to install Hubble with cilium-cli is shown in the next chapter, but I can also use Helm to achieve the same. I enable Hubble using cilium-cli just to show how easy it is.
But as I mentioned above, I prefer the Helm method as I can keep better track of the settings and keep them consistent by referring to my value.yaml file each time I make a change.
Observability and flow-monitoring - Hubble Observability
Cilium comes with a very neat monitor tool out of the box called Hubble. It is enabled by default but I need to enable the Hubble Relay and Hubble UI feature to get the information from my nodes, pods etc available in a nice dashboard (Hubble UI), so this is a feature I certainly want to enable as one of the first features to test out.
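With cilium-cli this could look like the sketch below. `cilium hubble enable --ui` turns on Hubble Relay together with the UI, and `cilium status --wait` blocks until the components report ready. The function name is hypothetical.

```shell
enable_hubble_with_cli() {
  cilium hubble enable --ui   # Hubble Relay plus Hubble UI
  cilium status --wait        # wait until Cilium, Relay and UI are ready
}

# enable_hubble_with_cli
```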
andreasm@linuxmgmt01:~$ k get svc -n kube-system
NAME           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE
hubble-peer    ClusterIP   10.23.182.223   <none>        443/TCP                  6h19m
hubble-relay   ClusterIP   10.23.182.76    <none>        80/TCP                   4h34m
hubble-ui      ClusterIP   10.23.31.4      <none>        80/TCP                   42s
kube-dns       ClusterIP   10.23.0.10      <none>        53/UDP,53/TCP,9153/TCP   6h59m
The Hubble Relay and Hubble UI services are now enabled. The issue though is that they are exposed using ClusterIP, and I need to reach them from outside my cluster. Let's continue with the next feature to test: LB-IPAM.
Using Helm to enable Hubble Relay and Hubble-UI
Instead of using cilium-cli I could have enabled the Relay and UI in my value.yaml file and run the following command:
andreasm@linuxmgmt01:~/test-cluster-1$ helm upgrade -n kube-system cilium cilium/cilium --version 1.14.5 -f cilium-values-feature-by-feature.yaml
Release "cilium" has been upgraded. Happy Helming!
NAME: cilium
LAST DEPLOYED: Tue Dec 19 20:32:12 2023
NAMESPACE: kube-system
STATUS: deployed
REVISION: 13
TEST SUITE: None
NOTES:
You have successfully installed Cilium with Hubble Relay and Hubble UI.

Your release version is 1.14.5.

For any further help, visit https://docs.cilium.io/en/v1.14/gettinghelp
Where I have changed these settings in the value.yaml:
  relay:
    # -- Enable Hubble Relay (requires hubble.enabled=true)
    enabled: true
......
  ui:
    # -- Whether to enable the Hubble UI.
    enabled: true
........
hubble:
  # -- Enable Hubble (true by default).
  enabled: true
............
  # -- Buffer size of the channel Hubble uses to receive monitor events. If this
  # value is not set, the queue size is set to the default monitor queue size.
  # eventQueueSize: ""

  # -- Number of recent flows for Hubble to cache. Defaults to 4095.
  # Possible values are:
  #   1, 3, 7, 15, 31, 63, 127, 255, 511, 1023,
  #   2047, 4095, 8191, 16383, 32767, 65535
  # eventBufferCapacity: "4095"

  # -- Hubble metrics configuration.
  # See https://docs.cilium.io/en/stable/observability/metrics/#hubble-metrics
  # for more comprehensive documentation about Hubble metrics.
  metrics:
    # -- Configures the list of metrics to collect. If empty or null, metrics
    # are disabled.
    # Example:
    #
    #   enabled:
    #   - dns:query;ignoreAAAA
    #   - drop
    #   - tcp
    #   - flow
    #   - icmp
    #   - http
    #
    # You can specify the list of metrics from the helm CLI:
    #
    #   --set metrics.enabled="{dns:query;ignoreAAAA,drop,tcp,flow,icmp,http}"
    #
    enabled:
    - dns:query;ignoreAAAA ### added these
    - drop ### added these
    - tcp ### added these
    - flow ### added these
    - icmp ### added these
    - http ### added these
.........
Exposing a service from Kubernetes to be accessible from outside the cluster can be done in a couple of ways:
Exposing the service by binding it to a node using NodePort (neither scalable nor manageable).
Exposing the service using a serviceType of loadBalancer: Layer 4 only, though scalable. Usually requires an external load balancer, or some additional component installed and configured to support your Kubernetes platform.
Exposing using Ingress: Layer 7, requires a load balancer to provide an external IP address.
Exposing using Gateway API (the Ingress successor): requires a load balancer to provide an external IP address.
Cilium has really made this simple: it comes with a built-in LoadBalancer IPAM. More info here.
This is already enabled, no feature to install or enable. The only thing I need to do is to configure an IP pool that will provide ip addresses from a defined subnet when I request a serviceType loadBalancer, Ingress or Gateway. We can configure multiple pools with different subnets, and configure a serviceSelector matching on labels or expressions.
In my lab I have already configured a couple of IP pools, using different subnets and different serviceSelectors so I can control which service gets IP addresses from which pool.
The first pool will only provide IP addresses to services deployed in either of the two namespaces "harbor" or "booking". This is an "OR" selection, not "AND": a service matches if it is deployed in any one of the namespaces. The second pool uses labels and matches on the key-value env=prod.
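Two pools along those lines could be sketched like this, using the `cilium.io/v2alpha1` CiliumLoadBalancerIPPool resource as it looks in Cilium 1.14 (newer releases rename `cidrs` to `blocks`). The pool names, the file name and both subnets are assumptions for illustration; only the namespace and label selectors come from the description above.

```shell
# Write the pool definitions to a file; names and CIDRs are placeholders.
cat > lb-ipam-pools.yaml <<'EOF'
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: namespace-pool            # hypothetical name
spec:
  cidrs:
    - cidr: "10.150.12.0/24"      # assumed subnet
  serviceSelector:
    matchExpressions:
      # "In" matches a Service living in any one of the listed namespaces (OR)
      - {key: io.kubernetes.service.namespace, operator: In, values: [harbor, booking]}
---
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: prod-pool                 # hypothetical name
spec:
  cidrs:
    - cidr: "10.150.11.0/24"      # assumed subnet
  serviceSelector:
    matchLabels:
      env: prod                   # label set on the Service object itself
EOF

# Apply once the kube context is correct:
# kubectl apply -f lb-ipam-pools.yaml
```

`kubectl get ciliumloadbalancerippools` should then list both pools and whether they conflict with each other.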
Info
Bear in mind that these IP pools only act on Services (serviceType loadBalancer), not on Ingress objects per se. Each time you create an Ingress or a Gateway, a Service of type loadBalancer is auto-created as a reaction to the Ingress/Gateway creation. So if you put labels on the Ingress/Gateway object, they will not be noticed by the LB-IPAM pool. Instead you can adjust the selection based on the namespace you know it will be created in, or use this label that is auto-created on the svc: "Labels: io.cilium.gateway/owning-gateway="name-of-gateway""
As soon as you have created and applied an IP pool, it will immediately start serving requests by handing out IP addresses. This is very nice.
There is a small catch though. If I create IP pools, as above, that are outside of my nodes' subnet, how does my network know how to reach these subnets? Creating static routes pointing to the nodes that potentially hold these IP addresses? Nah, that is neither scalable nor manageable. Some kind of dynamic routing protocol would be best here, BGP or OSPF.
Did I mention that Cilium also includes support for BGP out of the box?
BGP Control Plane
Yes, you guessed it, Cilium includes BGP: a brilliant way of advertising all my IP pools. Creating many IP pools with a bunch of subnets has never been more fun. This is the same concept as I write about here; the biggest difference is that with Cilium it only needs to be enabled as a feature, followed by a yaml that configures the BGP settings. Nothing additional to install, just plug and play.
For more info on the BGP control plane, read here.
First out, enable the BGP control plane feature. To enable it I will alter my Helm value.yaml file with this setting:
# -- This feature set enables virtual BGP routers to be created via
# CiliumBGPPeeringPolicy CRDs.
bgpControlPlane:
  # -- Enables the BGP control plane.
  enabled: true
Now I need to create a yaml that contains the BGP peering info I need for my workers to peer to my upstream router. For reference I will paste my lab topology here again:
When I apply my BGPPeeringPolicy yaml below, my nodes will establish a BGP peering session with the switch (their upstream BGP neighbor) they are connected to in the diagram above. This switch has also been configured to allow them as BGP neighbors. Please consider creating some ip-prefix lists/route-maps so we don't accidentally advertise routes that conflict, or that should not be advertised into the network, to prevent BGP blackholes etc.
Here we can also configure a serviceSelector to exclude services we don't want advertised. I used the example from the official docs to allow everything. If I also have a good BGP route-map config on my switch or upstream BGP neighbor, subnets that are not allowed will never be advertised.
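A sketch of such a policy, based on the CiliumBGPPeeringPolicy example in the Cilium 1.14 docs. The AS numbers and the peer address are the ones visible in the peering output further down; the policy name, the file name, the `bgp-policy: a` node label and `exportPodCIDR: false` are assumptions. The `NotIn` expression on a value that is never used is the match-everything trick from the docs.

```shell
# Write the peering policy to a file; label and names are placeholders.
cat > bgp-peering-policy.yaml <<'EOF'
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: 01-bgp-peering-policy       # hypothetical name
spec:
  nodeSelector:
    matchLabels:
      bgp-policy: a                 # assumed label, must be set on the peering nodes
  virtualRouters:
    - localASN: 64520               # AS of the Kubernetes nodes
      exportPodCIDR: false          # assumption: advertise Service IPs only
      neighbors:
        - peerAddress: "10.160.1.1/32"   # upstream switch
          peerASN: 64512
      serviceSelector:
        # Matches every Service, since no Service carries this value
        matchExpressions:
          - {key: somekey, operator: NotIn, values: ["never-used-value"]}
EOF

# Label the nodes and apply:
# kubectl label nodes k8s-prod-node-01 k8s-prod-node-02 k8s-prod-node-03 bgp-policy=a
# kubectl apply -f bgp-peering-policy.yaml
```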
Now that I have applied it I can check the bgp peering status using the Cilium cli:
andreasm@linuxmgmt01:~/prod-cluster-1/cilium$ cilium bgp peers
Node               Local AS   Peer AS   Peer Address   Session State   Uptime     Family         Received   Advertised
k8s-prod-node-01   64520      64512     10.160.1.1     established     13h2m53s   ipv4/unicast   47         6
                                                                                  ipv6/unicast   0          0
k8s-prod-node-02   64520      64512     10.160.1.1     established     13h2m25s   ipv4/unicast   45         6
                                                                                  ipv6/unicast   0          0
k8s-prod-node-03   64520      64512     10.160.1.1     established     13h2m27s   ipv4/unicast   43         6
                                                                                  ipv6/unicast   0          0
I can see some prefixes being Advertised and some being Received and the Session State is Established. I can also confirm that on my switch, and the routes they advertise:
GUZ-SW-01# show ip bgp summary

 Peer Information

  Remote Address  Remote-AS Local-AS State         Admin Status
  --------------- --------- -------- ------------- ------------
  10.160.1.114    64520     64512    Established   Start
  10.160.1.115    64520     64512    Established   Start
  10.160.1.116    64520     64512    Established   Start
  172.18.1.1      64500     64512    Established   Start
GUZ-SW-01# show ip bgp

 Local AS : 64512 Local Router-id : 172.18.1.2
 BGP Table Version : 1706

 Status codes: * - valid, > - best, i - internal, e - external, s - stale
 Origin codes: i - IGP, e - EGP, ? - incomplete

 Network Nexthop Metric LocalPref Weight AsPath
 ------------------ --------------- ---------- ---------- ------ ---------
* e 10.150.11.10/32 10.160.1.114 0 0 64520 i
*>e 10.150.11.10/32 10.160.1.115 0 0 64520 i
* e 10.150.11.10/32 10.160.1.116 0 0 64520 i
* e 10.150.11.199/32 10.160.1.114 0 0 64520 i
* e 10.150.11.199/32 10.160.1.115 0 0 64520 i
*>e 10.150.11.199/32 10.160.1.116 0 0 64520 i
* e 10.150.12.4/32 10.160.1.114 0 0 64520 i
* e 10.150.12.4/32 10.160.1.115 0 0 64520 i
*>e 10.150.12.4/32 10.160.1.116 0 0 64520 i
* e 10.150.14.32/32 10.160.1.114 0 0 64520 i
* e 10.150.14.32/32 10.160.1.115 0 0 64520 i
*>e 10.150.14.32/32 10.160.1.116 0 0 64520 i
* e 10.150.14.150/32 10.160.1.114 0 0 64520 i
*>e 10.150.14.150/32 10.160.1.115 0 0 64520 i
* e 10.150.14.150/32 10.160.1.116 0 0 64520 i
* e 10.150.15.100/32 10.160.1.114 0 0 64520 i
* e 10.150.15.100/32 10.160.1.115 0 0 64520 i
*>e 10.150.15.100/32 10.160.1.116 0 0 64520 i
Now I can just create my IP pools and some services, and they should immediately be advertised and reachable in my network (unless they are stopped by some route-maps, of course).
Note, it will only advertise ip-addresses in use by a service, not the whole subnet I define in my IP-Pools. That means I will only see host-routes advertised (as seen above).
LB-IPAM - does it actually loadbalance?
It says LoadBalancer IPAM, but does it actually load balance? Let me quickly put that to the test.
I have exposed a web service using serviceType loadBalancer consisting of three simple nginx web pods.
Here is the yaml I am using (I think I grabbed it from the official Cilium docs):
apiVersion: v1
kind: Service
metadata:
  name: test-lb
  namespace: example
  labels:
    env: prod #### added this label to match with my ip pool
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP
    name: http
  selector:
    svc: test-lb
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: example
spec:
  selector:
    matchLabels:
      svc: test-lb
  template:
    metadata:
      labels:
        svc: test-lb
    spec:
      containers:
      - name: web
        image: nginx
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
        readinessProbe:
          httpGet:
            path: /
            port: 80
Initially it deploys one pod; I will scale it up to three.
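Scaling can be done with a one-liner; a small sketch using the deployment name and namespace from the yaml above (the function name is hypothetical):

```shell
scale_nginx_to_three() {
  kubectl -n example scale deployment nginx --replicas=3
  kubectl -n example rollout status deployment nginx   # wait until all replicas are ready
}

# scale_nginx_to_three
```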
They are running here, perfectly distributed across all my worker nodes:
andreasm@linuxmgmt01:~/prod-cluster-1/cilium$ k get pods -n example -owide
NAME                     READY   STATUS    RESTARTS   AGE    IP           NODE               NOMINATED NODE   READINESS GATES
nginx-698447f456-5xczj   1/1     Running   0          18s    10.0.0.239   k8s-prod-node-01   <none>           <none>
nginx-698447f456-plknk   1/1     Running   0          117s   10.0.4.167   k8s-prod-node-02   <none>           <none>
nginx-698447f456-xs4jq   1/1     Running   0          18s    10.0.5.226   k8s-prod-node-03   <none>           <none>
Now let me do a curl against the LoadBalancer IP and see if something changes:
Every 0.5s: curl http://10.150.11.48          linuxmgmt01: Wed Dec 20 07:58:14 2023

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   567  100   567    0     0   184k      0 --:--:-- --:--:-- --:--:--  276k
<!DOCTYPE html>
<html>
<head>
Pod 2 ##### Notice this
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
Pod 2
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

Every 0.5s: curl http://10.150.11.48          linuxmgmt01: Wed Dec 20 07:59:15 2023

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   567  100   567    0     0   110k      0 --:--:-- --:--:-- --:--:--  138k
<!DOCTYPE html>
<html>
<head>
Pod 1 ##### Notice this
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
Pod 1
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

Every 0.5s: curl http://10.150.11.48          linuxmgmt01: Wed Dec 20 08:01:02 2023

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   567  100   567    0     0   553k      0 --:--:-- --:--:-- --:--:--  553k
<!DOCTYPE html>
<html>
<head>
Pod 3 ##### Notice this
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
Pod 3
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
Well, it is actually load-balancing the requests to the three different pods, running on three different nodes.
And it took me about 5 seconds to apply the ip-pool yaml and the bgppeeringpolicy yaml and I had a fully functioning load-balancer.
A bit more info on this feature from the official Cilium docs:
LB IPAM works in conjunction with features like the Cilium BGP Control Plane (Beta). Where LB IPAM is responsible for allocation and assigning of IPs to Service objects and other features are responsible for load balancing and/or advertisement of these IPs.
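So LB-IPAM itself only allocates and assigns the IPs. BGP advertises them, and my switch picks a node as next hop. As far as I understand, the actual load balancing across the pods is then done on the nodes by Cilium's eBPF-based kube-proxy replacement.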
Cilium Ingress
As I covered above serviceType loadBalancer, let me quickly cover how to enable Cilium IngressController.
More info can be found here
I will head into my Helm value.yaml and edit the following:
ingressController:
  # -- Enable cilium ingress controller
  # This will automatically set enable-envoy-config as well.
  enabled: true

  # -- Set cilium ingress controller to be the default ingress controller
  # This will let cilium ingress controller route entries without ingress class set
  default: false

  # -- Default ingress load balancer mode
  # Supported values: shared, dedicated
  # For granular control, use the following annotations on the ingress resource
  # ingress.cilium.io/loadbalancer-mode: shared|dedicated,
  loadbalancerMode: dedicated
The Cilium Ingress controller can run in dedicated or shared mode; in shared mode a single IP is shared by multiple Ingress objects. Nice if we are IP limited etc. Additionally we can configure the shared Ingress with a specific IP like this:
  # -- Load-balancer service in shared mode.
  # This is a single load-balancer service for all Ingress resources.
  service:
    # -- Service name
    name: cilium-ingress
    # -- Labels to be added for the shared LB service
    labels: {}
    # -- Annotations to be added for the shared LB service
    annotations: {}
    # -- Service type for the shared LB service
    type: LoadBalancer
    # -- Configure a specific nodePort for insecure HTTP traffic on the shared LB service
    insecureNodePort: ~
    # -- Configure a specific nodePort for secure HTTPS traffic on the shared LB service
    secureNodePort: ~
    # -- Configure a specific loadBalancerClass on the shared LB service (requires Kubernetes 1.24+)
    loadBalancerClass: ~
    # -- Configure a specific loadBalancerIP on the shared LB service
    loadBalancerIP: 10.150.11.100 ### Set your preferred IP here
    # -- Configure if node port allocation is required for LB service
    # ref: https://kubernetes.io/docs/concepts/services-networking/service/#load-balancer-nodeport-allocation
    allocateLoadBalancerNodePorts: ~
This will dictate that the shared Ingress object will get this IP address.
Now save changes and run the helm upgrade command:
andreasm@linuxmgmt01:~/test-cluster-1$ helm upgrade -n kube-system cilium cilium/cilium --version 1.14.5 -f cilium-values-feature-by-feature.yaml
Release "cilium" has been upgraded. Happy Helming!
NAME: cilium
LAST DEPLOYED: Wed Dec 20 08:18:58 2023
NAMESPACE: kube-system
STATUS: deployed
REVISION: 15
TEST SUITE: None
NOTES:
You have successfully installed Cilium with Hubble Relay and Hubble UI.

Your release version is 1.14.5.

For any further help, visit https://docs.cilium.io/en/v1.14/gettinghelp
Now is also a good time to restart the Cilium Operator and Cilium Agents to re-read the new configMap.
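A sketch of that restart, assuming the default workload names from the Helm chart (deployment `cilium-operator`, daemonset `cilium`) in kube-system; the function name is hypothetical:

```shell
restart_cilium() {
  kubectl -n kube-system rollout restart deployment/cilium-operator
  kubectl -n kube-system rollout restart daemonset/cilium
  kubectl -n kube-system rollout status daemonset/cilium   # wait for the agents to come back
}

# restart_cilium
```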
As soon as I enable the Ingress controller it will create this object for me, and provide an IngressClass in my cluster.
andreasm@linuxmgmt01:~/test-cluster-1$ k get ingressclasses.networking.k8s.io
NAME     CONTROLLER                     PARAMETERS   AGE
cilium   cilium.io/ingress-controller   <none>       86s
Now I suddenly have an IngressController also. Let me deploy a test app to test this.
First I deploy two pods with their corresponding clusterIP services:
kind: Pod
apiVersion: v1
metadata:
  name: apple-app
  labels:
    app: apple
  namespace: fruit
spec:
  containers:
    - name: apple-app
      image: hashicorp/http-echo
      args:
        - "-text=apple"

---

kind: Service
apiVersion: v1
metadata:
  name: apple-service
  namespace: fruit
spec:
  selector:
    app: apple
  ports:
    - port: 5678 # Default port for image
---
kind: Pod
apiVersion: v1
metadata:
  name: banana-app
  labels:
    app: banana
  namespace: fruit
spec:
  containers:
    - name: banana-app
      image: hashicorp/http-echo
      args:
        - "-text=banana"

---

kind: Service
apiVersion: v1
metadata:
  name: banana-service
  namespace: fruit
spec:
  selector:
    app: banana
  ports:
    - port: 5678 # Default port for image
And then the Ingress pointing to the two services apple and banana:
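As a rough sketch, such an Ingress could look like the following. The name, namespace, host and class match the command output further down; the /apple and /banana path prefixes are assumptions for illustration only.

```yaml
# Sketch only: the /apple and /banana paths are assumed, not taken from the original manifest.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-example
  namespace: fruit
  annotations:
    # "dedicated" gives this Ingress its own LoadBalancer service and IP,
    # "shared" reuses the shared Ingress service instead.
    ingress.cilium.io/loadbalancer-mode: dedicated
spec:
  ingressClassName: cilium
  rules:
    - host: fruit.my-domain.net
      http:
        paths:
          - path: /apple          # assumed path
            pathType: Prefix
            backend:
              service:
                name: apple-service
                port:
                  number: 5678
          - path: /banana         # assumed path
            pathType: Prefix
            backend:
              service:
                name: banana-service
                port:
                  number: 5678
```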
Notice that the only annotation I have used is loadbalancer-mode: dedicated. The other accepted value is shared. With this annotation I can choose, per Ingress, whether it should use the IP of the shared Ingress service or get a dedicated service with its own IP address. If I don't want this Ingress to consume its own IP address I use shared; if I want a dedicated IP for this Ingress I use dedicated. The shared service for Ingress objects is created automatically when the IngressController is enabled. You can see this here:
andreasm@linuxmgmt01:~$ k get svc -n kube-system
NAME                    TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)                      AGE
cilium-shared-ingress   LoadBalancer   10.21.104.15   10.150.15.100   80:30810/TCP,443:31104/TCP   46h
I have configured this shared-ingress to use a specific IP address.
When using dedicated, Cilium creates a service named cilium-ingress-<name-of-Ingress> on a new IP address (as can be seen below).
As soon as this has been applied, Cilium automatically takes care of the serviceType LoadBalancer object by assigning an IP address from one of the IP pools that matches my service selectors (depending on shared or dedicated, of course). BGP then automatically advertises the host route to my BGP router, and the Ingress should now be listening for HTTP requests on this IP.
Here are the services/objects created:
andreasm@linuxmgmt01:~/prod-cluster-1/cilium$ k get ingress -n fruit
NAME              CLASS    HOSTS                 ADDRESS       PORTS   AGE
ingress-example   cilium   fruit.my-domain.net   10.150.12.4   80      44h
andreasm@linuxmgmt01:~/prod-cluster-1/cilium$ k get svc -n fruit
NAME                             TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
apple-service                    ClusterIP      10.21.243.103   <none>        5678/TCP                     4d9h
banana-service                   ClusterIP      10.21.124.111   <none>        5678/TCP                     4d9h
cilium-ingress-ingress-example   LoadBalancer   10.21.50.107    10.150.12.4   80:30792/TCP,443:31553/TCP   43h
Let me see if the Ingress responds to my http requests (I have registered the IP above with a DNS record so I can resolve it):
The Ingress works.
Again, for more information on the Cilium IngressController (supported annotations etc.), head over here.
Cilium Gateway API
Another ingress solution is Gateway API; read more about that here.
Gateway API is an "evolution" of the regular Ingress, so it is natural to take it into consideration going forward. Again, Cilium supports Gateway API out of the box. I will use Helm to enable it, and it just needs a couple of CRDs to be installed first.
Read more on Cilium Gateway API support here.
To enable Cilium Gateway API I did the following:
Edited my Helm values file with the following setting:
gatewayAPI:
  # -- Enable support for Gateway API in cilium
  # This will automatically set enable-envoy-config as well.
  enabled: true
Installed these CRDs before I ran the Helm upgrade command
andreasm@linuxmgmt01:~/test-cluster-1$ helm upgrade -n kube-system cilium cilium/cilium --version 1.14.5 -f cilium-values-feature-by-feature.yaml
Release "cilium" has been upgraded. Happy Helming!
NAME: cilium
LAST DEPLOYED: Wed Dec 20 11:12:55 2023
NAMESPACE: kube-system
STATUS: deployed
REVISION: 16
TEST SUITE: None
NOTES:
You have successfully installed Cilium with Hubble Relay and Hubble UI.

Your release version is 1.14.5.

For any further help, visit https://docs.cilium.io/en/v1.14/gettinghelp

andreasm@linuxmgmt01:~/test-cluster-1$ kubectl -n kube-system rollout restart deployment/cilium-operator
deployment.apps/cilium-operator restarted
andreasm@linuxmgmt01:~/test-cluster-1$ kubectl -n kube-system rollout restart ds/cilium
daemonset.apps/cilium restarted
Info
It is very important to install the above CRDs before attempting to enable the Gateway API in Cilium. Otherwise it will not create any GatewayClass, i.e. no Gateway API will be realized.
Now I should have a gatewayClass:
andreasm@linuxmgmt01:~/test-cluster-1$ k get gatewayclasses.gateway.networking.k8s.io
NAME     CONTROLLER                     ACCEPTED   AGE
cilium   io.cilium/gateway-controller   True       96s
Now I can just go ahead and create a gateway and some httproutes. When it comes to providing an external IP address for my gateway, this is provided by my ip-pools the same way as for the IngressController.
Let's go ahead and create a gateway. For this exercise I will be creating a gateway with corresponding httproutes to support my Harbor registry installation.
Below is the config I have used; it has also been configured to do an HTTPS redirect (from HTTP to HTTPS):
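A minimal sketch of what such a config could look like with the v1beta1 Gateway API resources. The gateway name, route names, namespace and hostname match the command output further down; the secret name harbor-tls-secret and the backend service harbor-portal on port 80 are hypothetical placeholders, and a real Harbor setup would most likely need more path rules and backends than shown here.

```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: harbor-tls-gateway
  namespace: harbor
spec:
  gatewayClassName: cilium
  listeners:
    - name: http
      protocol: HTTP
      port: 80
      hostname: registry.my-domain.net
    - name: https
      protocol: HTTPS
      port: 443
      hostname: registry.my-domain.net
      tls:
        mode: Terminate
        certificateRefs:
          - kind: Secret
            name: harbor-tls-secret   # hypothetical secret name
---
# Redirects everything arriving on the HTTP listener to HTTPS.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: harbor-tls-redirect
  namespace: harbor
spec:
  parentRefs:
    - name: harbor-tls-gateway
      sectionName: http
  hostnames:
    - registry.my-domain.net
  rules:
    - filters:
        - type: RequestRedirect
          requestRedirect:
            scheme: https
            statusCode: 301
---
# Routes HTTPS traffic to a Harbor backend service.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: harbor-api-route
  namespace: harbor
spec:
  parentRefs:
    - name: harbor-tls-gateway
      sectionName: https
  hostnames:
    - registry.my-domain.net
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: harbor-portal   # hypothetical backend service name
          port: 80              # assumed service port
```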
I have already created the certificate as the secret I refer to in the yaml above.
Let's have a look at the gateway, the httproutes and the svc that provides the external IP address, and also the Harbor services the httproutes refer to:
#### gateway created #####
andreasm@linuxmgmt01:~/prod-cluster-1/harbor$ k get gateway -n harbor
NAME                 CLASS    ADDRESS        PROGRAMMED   AGE
harbor-tls-gateway   cilium   10.150.14.32   True         28h

#### HTTPROUTES ####
andreasm@linuxmgmt01:~/prod-cluster-1/harbor$ k get httproutes.gateway.networking.k8s.io -n harbor
NAME                  HOSTNAMES                    AGE
harbor-api-route      ["registry.my-domain.net"]   28h
harbor-tls-redirect   ["registry.my-domain.net"]   28h

andreasm@linuxmgmt01:~/prod-cluster-1/harbor$ k get svc -n harbor
NAME                                TYPE           CLUSTER-IP    EXTERNAL-IP    PORT(S)                      AGE
cilium-gateway-harbor-tls-gateway   LoadBalancer   10.21.27.25   10.150.14.32   80:32393/TCP,443:31932/TCP   28h
Now I can reach my Harbor using the UI and docker cli all through the Gateway API...
I will use Harbor in the next chapter with Hubble UI..
Hubble UI
As you may recall, I enabled the two features Hubble Relay and Hubble UI, as we can see below:
andreasm@linuxmgmt01:~$ k get svc -n kube-system
NAME             TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
hubble-metrics   ClusterIP   None            <none>        9965/TCP   35h
hubble-peer      ClusterIP   10.23.182.223   <none>        443/TCP    42h
hubble-relay     ClusterIP   10.23.182.76    <none>        80/TCP     40h
hubble-ui        ClusterIP   10.23.31.4      <none>        80/TCP     36h
It is not exposed, so I cannot reach it from outside the Kubernetes cluster. So let me start by creating a serviceType LoadBalancer service to expose the Hubble UI. Below is the yaml I use for that:
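A minimal sketch of such a service. The name, namespace and port 8081 match the command output that follows; the k8s-app: hubble-ui selector and the target port 8081 are assumptions based on a default Helm install of Hubble UI, and any labels needed for an IP pool to match the service are left out.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: hubble-ui-lb
  namespace: kube-system
spec:
  type: LoadBalancer
  selector:
    k8s-app: hubble-ui     # assumed label on the Hubble UI pods
  ports:
    - name: http
      protocol: TCP
      port: 8081
      targetPort: 8081     # assumed container port of the Hubble UI frontend
```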
andreasm@linuxmgmt01:~/prod-cluster-1/cilium/services$ k get svc -n kube-system
NAME           TYPE           CLUSTER-IP    EXTERNAL-IP    PORT(S)          AGE
hubble-ui-lb   LoadBalancer   10.21.47.47   10.150.11.10   8081:32328/TCP   3d11h
There it is. Now I open my browser and point it to this IP:port.
Now let me go to a test application I have deployed in the yelb namespace. Click on it from the list or the dropdown top left corner:
Soo much empty...
I can see the pods are running:
andreasm@linuxmgmt01:~/prod-cluster-1/cilium/services$ k get pods -n yelb
NAME                            READY   STATUS    RESTARTS   AGE
redis-server-84f4bf49b5-fq26l   1/1     Running   0          5d18h
yelb-appserver-6dc7cd98-s6kt7   1/1     Running   0          5d18h
yelb-db-84d6f6fc6c-m7xvd        1/1     Running   0          5d18h
They are probably not so interested in talking to each other unless they have to. Let me deploy the frontend service and create some interactions.
andreasm@linuxmgmt01:~/prod-cluster-1/cilium$ k apply -f yelb-lb-frontend.yaml
service/yelb-ui created
deployment.apps/yelb-ui created
andreasm@linuxmgmt01:~/prod-cluster-1/cilium$ k get svc -n yelb
NAME             TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)        AGE
redis-server     ClusterIP      10.21.67.23     <none>          6379/TCP       5d18h
yelb-appserver   ClusterIP      10.21.81.95     <none>          4567/TCP       5d18h
yelb-db          ClusterIP      10.21.188.43    <none>          5432/TCP       5d18h
yelb-ui          LoadBalancer   10.21.207.114   10.150.11.221   80:32335/TCP   49s
I will now open the Yelb UI and do some quick "votes"
Instantly, even by just opening the yelb webpage I get a lot of flow information in Hubble. And not only that, it automatically creates a "service-map" so I can see the involved services in the Yelb app.
This only shows me L4 information. What about Layer 7? Let's test that as well by heading over to Harbor.
In Hubble I will switch to the namespace harbor
A nice diagram with all involved services, but no L7 information yet. Well, there is, but I have had no recent interactions with Harbor through the Gateway API. What happens as soon as I use docker or the web UI against Harbor?
What's this, an ingress object?
Now when I click on the ingress object:
Look at the L7 info coming there.
I logged out from Harbor:
Logged back in:
Browsing the Harbor Projects/repositories:
A very rich set of information presented in a very snappy and responsive dashboard. It's instantly updated as soon as a request comes in.
For now, this concludes this post.
It has been a nice experience getting a bit more under the hood of Cilium, and so far I must say it looks very good.
Things I have not covered yet which I will at a later stage
I will update this post with some other features at a later stage. Some of the features I am interested in looking at are:
Security policies with Cilium - just have a quick look here, many interesting topics. Host Firewall?