Traefik (’træfik’) Proxy #
First things first: this is probably the first time in any of my blog posts that I can use a letter from my native alphabet, namely “æ”. I tried to look up whether I should stick to Træfik or not, and ended up on this post by Traefik themselves. It is not the letter “Æ” itself that is the reason behind its use, but the phonetic pronunciation of Traefik = ’træfik’. Nice that the letter “æ” has some use internationally though 😄 A fun and nice post. For the rest of this post I will stick to using Traefik instead of Træfik, as Træfik is just the logo and how Traefik is pronounced; it is called Traefik (and to be kind to the non-native “æ” speakers out there).
From the official Traefik homepage:
Traefik is an open-source Edge Router that makes publishing your services a fun and easy experience. It receives requests on behalf of your system and finds out which components are responsible for handling them.
What sets Traefik apart, besides its many features, is that it automatically discovers the right configuration for your services. The magic happens when Traefik inspects your infrastructure, where it finds relevant information and discovers which service serves which request.
Traefik is natively compliant with every major cluster technology, such as Kubernetes, Docker, Docker Swarm, AWS, Mesos, Marathon, and the list goes on; and can handle many at the same time. (It even works for legacy software running on bare metal.)
Why Traefik… #
I needed an advanced reverse proxy for my lab that could cover all kinds of backends, from Kubernetes services to services running in regular workloads such as virtual machines. I wanted it to be highly available, and it had to solve one of my challenges when exposing services on the Internet with one public IP and multiple services using the same port. After some quick research I ended up with Traefik. I am not sure exactly why I landed on Traefik, it could have been Nginx or HAProxy just to mention some of the bigger ones out there, or was it the “Æ”? Traefik offers both paid Enterprise editions and free open-source alternatives. I did not want to spend time on a product that only includes some basic features in its free open-source edition, where as soon as I want a more advanced feature I have to upgrade to an enterprise solution. After some reading, Traefik seemed to have all the features I wanted in their open-source product Traefik Proxy.
I decided to write this post as I wanted to document all the configurations I have done so far with Traefik. Searching through different forums, blog pages, etc., some say it is very easy to manage Traefik. I can't say I found it very easy to begin with, but as with everything new, one needs to learn how to master it. The official Traefik documentation is very good at describing and explaining all the possibilities with Traefik, but several times I was missing some “real life” example configs. With the help of the great community out there I managed to solve the challenges I had and make them work with Traefik. So thanks to all the blog pages and forums, with people asking questions and people willing to answer and explain. This is much appreciated, as always.
So let's begin this post with some high-level explanations of the terminology used in Traefik, then the installation and how I have configured Traefik to serve as a reverse proxy for some of my services.
Important terminology used in Traefik Proxy #
Entrypoints #
EntryPoints are the network entry points into Traefik. They define the port which will receive the packets, and whether to listen for TCP or UDP.
In other words, whether it is an externally exposed service (NodePort or LoadBalancer) or an internal service (ClusterIP) that is defined, the destination endpoints for these entrypoints will be the Traefik pods, which are responsible for listening for any requests coming their way and doing something useful with the traffic if configured.
See more here
Routers #
A router is in charge of connecting incoming requests to the services that can handle them. In the process, routers may use pieces of middleware to update the request, or act before forwarding the request to the service.
So this is the actual component that knows which service to forward the requests to, based on, for example, the host header.
See more here
Middleware #
Attached to the routers, pieces of middleware are a means of tweaking the requests before they are sent to your service (or before the answer from the services are sent to the clients).
An example is the redirectScheme middleware, used to redirect all HTTP requests to HTTPS. For a full list of options, have a look here
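As a quick preview, a minimal redirectScheme middleware could look like the sketch below; the name and namespace are just placeholders, not objects used later in this post.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: redirect-to-https   # placeholder name
  namespace: default        # placeholder namespace
spec:
  redirectScheme:
    scheme: https
    permanent: true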
Services #
The Services are responsible for configuring how to reach the actual services that will eventually handle the incoming requests.
Services here can be of type LoadBalancer, ClusterIP, ExternalName, etc.
Providers #
Configuration discovery in Traefik is achieved through Providers.
The providers are infrastructure components, whether orchestrators, container engines, cloud providers, or key-value stores. The idea is that Traefik queries the provider APIs in order to find relevant information about routing, and when Traefik detects a change, it dynamically updates the routes.
More info on providers can be found here
My lab #
Before getting into the actual installation and configuration of Traefik, a quick bit of context. My lab in this post:
- A physical server running Proxmox
- A physical switch with VLAN and routing support
- Virtual PfSense firewall
- Kubernetes version 1.28.2
- 3x Control Plane nodes (Ubuntu)
- 3x Worker nodes (Ubuntu)
- A management Ubuntu VM (also running on Proxmox) with all tools needed like Helm and kubectl
- Cert-Manager configured and installed with LetsEncrypt provider
- Cilium configured with BGP; LB IPAM pools have been defined and provide external IP addresses to servicetype LoadBalancer requests in the Kubernetes cluster
Deploying Traefik #
Traefik can be deployed in Kubernetes using Helm. First I need to add the Traefik Helm repo:
helm repo add traefik https://traefik.github.io/charts
helm repo update
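One thing to note before installing: the traefik namespace must exist. It can be created manually, or Helm can create it as part of the install with the --create-namespace flag:
kubectl create namespace traefik
# or let Helm create it during the install:
helm install traefik traefik/traefik -n traefik --create-namespace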
Now it would be as simple as just installing Traefik using helm install traefik traefik/traefik -n traefik, but I have made some adjustments to the values. So before I install Traefik I have adjusted the chart values to the config below. See my comments inline. Note that I have removed all the comments from the default values.yaml and only added my own. The values yaml can be fetched by issuing this command: helm show values traefik/traefik > traefik-values.yaml
image:
  registry: docker.io
  repository: traefik
  tag: ""
  pullPolicy: IfNotPresent
commonLabels: {}
deployment:
  enabled: true
  kind: Deployment
  replicas: 3 ### Adjusted to three for high availability
  terminationGracePeriodSeconds: 60
  minReadySeconds: 0
  annotations: {}
  labels: {}
  podAnnotations: {}
  podLabels: {}
  additionalContainers: []
  additionalVolumes: []
  initContainers:
    # The "volume-permissions" init container is required if you run into permission issues.
    # Related issue: https://github.com/traefik/traefik-helm-chart/issues/396
    - name: volume-permissions
      image: busybox:latest
      command: ["sh", "-c", "touch /data/acme.json; chmod -v 600 /data/acme.json"]
      securityContext:
        runAsNonRoot: true
        runAsGroup: 65532
        runAsUser: 65532
      volumeMounts:
        - name: data
          mountPath: /data
  shareProcessNamespace: false
  dnsConfig: {}
  imagePullSecrets: []
  lifecycle: {}
podDisruptionBudget:
  enabled: false
ingressClass:
  enabled: true
  isDefaultClass: false # I have set this to false as I also have Cilium IngressController
experimental:
  plugins: {}
  kubernetesGateway:
    enabled: false
ingressRoute:
  dashboard:
    enabled: false # I will enable this later
    annotations: {}
    labels: {}
    matchRule: PathPrefix(`/dashboard`) || PathPrefix(`/api`)
    entryPoints: ["traefik"]
    middlewares: []
    tls: {}
  healthcheck:
    enabled: false
    annotations: {}
    labels: {}
    matchRule: PathPrefix(`/ping`)
    entryPoints: ["traefik"]
    middlewares: []
    tls: {}
updateStrategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 0
    maxSurge: 1
readinessProbe:
  failureThreshold: 1
  initialDelaySeconds: 2
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 2
livenessProbe:
  failureThreshold: 3
  initialDelaySeconds: 2
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 2
startupProbe:
providers:
  kubernetesCRD:
    enabled: true # set to true
    allowCrossNamespace: true # set to true
    allowExternalNameServices: true # set to true also
    allowEmptyServices: false
    namespaces: []
  kubernetesIngress:
    enabled: true # set to true
    allowExternalNameServices: true # set to true
    allowEmptyServices: false
    namespaces: []
    publishedService:
      enabled: false
  file:
    enabled: false
    watch: true
    content: ""
volumes: []
additionalVolumeMounts: []
logs:
  general:
    level: ERROR
  access:
    enabled: false
    filters: {}
    fields:
      general:
        defaultmode: keep
        names: {}
      headers:
        defaultmode: drop
        names: {}
metrics:
  prometheus:
    entryPoint: metrics
    addEntryPointsLabels: true # set to true
    addRoutersLabels: true # set to true
    addServicesLabels: true # set to true
    buckets: "0.1,0.3,1.2,5.0,10.0" # adjusted according to the official docs
tracing: {}
globalArguments:
  - "--global.checknewversion"
  - "--global.sendanonymoususage"
additionalArguments: []
env:
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  - name: POD_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
envFrom: []
ports: # these are the entrypoints
  traefik:
    port: 9000
    expose: false
    exposedPort: 9000
    protocol: TCP
  web:
    port: 8000
    expose: true
    exposedPort: 80
    protocol: TCP
  websecure:
    port: 8443
    expose: true
    exposedPort: 443
    protocol: TCP
    http3:
      enabled: false
    tls:
      enabled: true
      options: ""
      certResolver: ""
      domains: []
    middlewares: []
  metrics:
    port: 9100
    expose: true
    exposedPort: 9100
    protocol: TCP
tlsOptions: {}
tlsStore: {}
service:
  enabled: false # I will create this later, set to false, all values below will be ignored
  single: true
  type: LoadBalancer
  annotations: {}
  annotationsTCP: {}
  annotationsUDP: {}
  labels:
    env: prod
  spec:
    loadBalancerIP: "10.150.11.11"
  loadBalancerSourceRanges: []
  externalIPs: []
autoscaling:
  enabled: false # This is interesting, need to test
persistence:
  enabled: true
  resourcePolicy: "keep" # I have added this to keep the PVC even after uninstall
  name: data
  accessMode: ReadWriteOnce
  size: 128Mi
  path: /data
  annotations: {}
certResolvers: {}
hostNetwork: false
rbac:
  enabled: true
  namespaced: false
podSecurityPolicy:
  enabled: false
serviceAccount:
  name: ""
serviceAccountAnnotations: {}
resources: {}
nodeSelector: {}
tolerations: []
topologySpreadConstraints: []
priorityClassName: ""
securityContext:
  capabilities:
    drop: [ALL]
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
podSecurityContext:
  fsGroupChangePolicy: "OnRootMismatch"
  runAsGroup: 65532
  runAsNonRoot: true
  runAsUser: 65532
extraObjects: []
Now I can install Traefik using the following command:
helm install traefik traefik/traefik -f traefik-values.yaml -n traefik
# or
helm upgrade -i traefik traefik/traefik -f traefik-values.yaml -n traefik
The latter can be used both when applying changes later and from the start.
After a successful installation we should see this message:
Release "traefik" has been upgraded. Happy Helming!
NAME: traefik
LAST DEPLOYED: Wed Dec 27 20:36:23 2023
NAMESPACE: traefik
STATUS: deployed
REVISION: 15
TEST SUITE: None
NOTES:
Traefik Proxy v2.10.6 has been deployed successfully on traefik namespace !
🚨 When enabling persistence for certificates, permissions on acme.json can be
lost when Traefik restarts. You can ensure correct permissions with an
initContainer. See https://github.com/traefik/traefik-helm-chart/issues/396 for
more info. 🚨
Now I should also have a bunch of CRDs and an additional IngressClass (in addition to any you may have from before, as I did).
andreasm@linuxmgmt01:~/prod-cluster-1/traefik$ k get crd
NAME CREATED AT
ingressroutes.traefik.containo.us 2023-12-24T08:53:45Z
ingressroutes.traefik.io 2023-12-24T08:53:45Z
ingressroutetcps.traefik.containo.us 2023-12-24T08:53:45Z
ingressroutetcps.traefik.io 2023-12-24T08:53:45Z
ingressrouteudps.traefik.containo.us 2023-12-24T08:53:45Z
ingressrouteudps.traefik.io 2023-12-24T08:53:45Z
middlewares.traefik.containo.us 2023-12-24T08:53:45Z
middlewares.traefik.io 2023-12-24T08:53:45Z
middlewaretcps.traefik.containo.us 2023-12-24T08:53:45Z
middlewaretcps.traefik.io 2023-12-24T08:53:46Z
serverstransports.traefik.containo.us 2023-12-24T08:53:45Z
serverstransports.traefik.io 2023-12-24T08:53:46Z
serverstransporttcps.traefik.io 2023-12-24T08:53:46Z
tlsoptions.traefik.containo.us 2023-12-24T08:53:45Z
tlsoptions.traefik.io 2023-12-24T08:53:46Z
tlsstores.traefik.containo.us 2023-12-24T08:53:45Z
tlsstores.traefik.io 2023-12-24T08:53:46Z
traefikservices.traefik.containo.us 2023-12-24T08:53:45Z
traefikservices.traefik.io 2023-12-24T08:53:46Z
A note on the list of CRDs above: the former Traefik APIs used the traefik.containo.us group, but newer Traefik 2.x versions use the traefik.io APIs; the former APIs are kept for backward compatibility.
Below I can see the new Traefik Ingress controller.
andreasm@linuxmgmt01:~/prod-cluster-1/traefik$ k get ingressclasses.networking.k8s.io
NAME CONTROLLER PARAMETERS AGE
cilium cilium.io/ingress-controller <none> 10d
traefik traefik.io/ingress-controller <none> 59s
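Since I set isDefaultClass: false, any standard Kubernetes Ingress that should be handled by Traefik (via the kubernetesIngress provider) must reference the class explicitly. A minimal sketch below; the hostname and backend service are hypothetical, not objects created in this post.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
spec:
  ingressClassName: traefik   # must be set explicitly since Traefik is not the default class
  rules:
    - host: example.my-domain.net   # hypothetical hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-service   # hypothetical backend service
                port:
                  number: 80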
Deployment info:
andreasm@linuxmgmt01:~/prod-cluster-1/traefik$ k get all -n traefik
NAME READY STATUS RESTARTS AGE
pod/traefik-59657c9c59-75cxg 1/1 Running 0 27h
pod/traefik-59657c9c59-p2kdv 1/1 Running 0 27h
pod/traefik-59657c9c59-tqcrm 1/1 Running 0 27h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
# No services...
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/traefik 3/3 3 3 2d11h
NAME DESIRED CURRENT READY AGE
replicaset.apps/traefik-59657c9c59 3 3 3 27h
Now it is all about configuring Traefik to receive the requests, and configuring routers, middlewares and services. I will start by getting the Traefik dashboard up.
EntryPoints #
As I have disabled all the services in the Helm values yaml, none are created, so I need to create these entrypoint services before anything can reach Traefik.
A quick explanation of why I wanted to create these myself. One can have multiple entrypoints into Traefik, even in the same Kubernetes cluster. Assume I want to use different IP addresses and subnets for certain services, some may even call them VIPs, for IP separation, easier physical firewall rule creation, etc. Then I need to create the services that expose the entrypoints I want to use. The Helm chart enables four entrypoints by default: web on port 8000 (HTTP), websecure on port 8443 (HTTPS), traefik on port 9000 and metrics on port 9100, all TCP. But these are only configured on the Traefik pods themselves; there is no service exposing them either internally in the cluster or outside. So I need to create these external or internal services to expose the entrypoints.
Describe the pod to see the ports and labels:
andreasm@linuxmgmt01:~/prod-cluster-1/traefik$ k describe pod -n traefik traefik-59657c9c59-75cxg
Name: traefik-59657c9c59-75cxg
Namespace: traefik
Priority: 0
Service Account: traefik
Node: k8s-prod-node-01/10.160.1.114
Start Time: Tue, 26 Dec 2023 17:03:23 +0000
Labels: app.kubernetes.io/instance=traefik-traefik
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=traefik
Containers:
  traefik:
    Container ID: containerd://b6059c9c6cdf45469403fb153ee8ddd263a870d3e5917a79e0181f543775a302
    Image: docker.io/traefik:v2.10.6
    Image ID: docker.io/library/traefik@sha256:1957e3314f435c85b3a19f7babd53c630996aa1af65d1f479d75539251b1e112
    Ports: 9100/TCP, 9000/TCP, 8000/TCP, 8443/TCP
    Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP
Args:
--global.checknewversion
--global.sendanonymoususage
--entrypoints.metrics.address=:9100/tcp
--entrypoints.traefik.address=:9000/tcp
--entrypoints.web.address=:8000/tcp
--entrypoints.websecure.address=:8443/tcp
--api.dashboard=true
--ping=true
--metrics.prometheus=true
--metrics.prometheus.entrypoint=metrics
--metrics.prometheus.addRoutersLabels=true
--metrics.prometheus.addEntryPointsLabels=true
--metrics.prometheus.addServicesLabels=true
--metrics.prometheus.buckets=0.1,0.3,1.2,5.0,10.0
--providers.kubernetescrd
--providers.kubernetescrd.allowCrossNamespace=true
--providers.kubernetescrd.allowExternalNameServices=true
--providers.kubernetesingress
--providers.kubernetesingress.allowExternalNameServices=true
--entrypoints.websecure.http.tls=true
The first service I define and apply will primarily be used for management, interacting with Traefik's internal services, using the correct label selector to select the Traefik pods and referring to the two entrypoints web and websecure. This is how the first entrypoint service is defined:
apiVersion: v1
kind: Service
metadata:
  annotations:
    io.cilium/lb-ipam-ips: "10.150.11.11"
  name: traefik-mgmt
  labels:
    env: prod
  namespace: traefik
spec:
  ports:
    - name: web
      port: 80
      protocol: TCP
      targetPort: web
    - name: websecure
      port: 443
      protocol: TCP
      targetPort: websecure
  selector:
    app.kubernetes.io/name: traefik
  type: LoadBalancer
This will create a servicetype LoadBalancer where the IP address is fixed by using the annotation; my configured Cilium LB-IPAM pool will provide the IP address for the service, and the BGP control plane will take care of advertising it for me.
Let's apply the above yaml and check the service:
andreasm@linuxmgmt01:~/prod-cluster-1/traefik$ k get svc -n traefik
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
traefik-mgmt LoadBalancer 10.21.183.19 10.150.11.11 80:30343/TCP,443:30564/TCP 49s
This means I can now start registering relevant DNS records against this external IP, and Traefik will receive requests coming to this address/service.
But as I would like to separate out services by type/function using different IP addresses, I have created another service using the same entrypoints but with a different external IP.
apiVersion: v1
kind: Service
metadata:
  annotations:
    io.cilium/lb-ipam-ips: "10.150.16.10"
  name: traefik-exposed-pool-1
  labels:
    env: traefik-pool-1
  namespace: traefik
spec:
  ports:
    - name: web
      port: 80
      protocol: TCP
      targetPort: web
    - name: websecure
      port: 443
      protocol: TCP
      targetPort: websecure
  selector:
    app.kubernetes.io/name: traefik
  type: LoadBalancer
I can go ahead and register DNS records against this IP address also and they will be forwarded to Traefik to handle.
The beauty of this is that I can create as many services as I want, using different external IP addresses, and even specify different Traefik entrypoints. In my physical firewall I can more easily create rules allowing or denying which sources are allowed to reach these IP addresses, and thereby separate apps from apps and services from services. Like in the next chapter, when I expose the Traefik Dashboard.
Traefik Dashboard #
Traefik comes with a nice dashboard which gives a quick overview of enabled services, their status and detailed information:
As I did not enable the dashboard IngressRoute in the Helm values, I need to define the necessary objects myself to make the Dashboard accessible. I also want it accessible from outside my Kubernetes cluster, using basic authentication.
I will prepare three yaml files. The first one will be the secret for the authentication part, the second the middleware config to enable basic authentication and the third and final the actual IngressRoute.
For the secret I used the following command to generate a base64 encoded string containing both username and password:
andreasm@linuxmgmt01:~/temp$ htpasswd -nb admin 'password' | openssl base64
YWRtaW46JGFwcjEkmlBejdSYnZW5uN1oualB1Lm1LOUo0dVhqVDB3LgoK
Then I created 01-secret.yaml and pasted in the base64 output from above:
apiVersion: v1
kind: Secret
metadata:
  name: traefik-dashboard-auth
  namespace: traefik
data:
  users: YWRtaW4JGFwcEkdmlBejdSYnEkZW5uN1oualB1Lm1LOUo0dVhqVDB3LgoK
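As an alternative to hand-crafting the Secret yaml, the same object can be created directly with kubectl, which handles the base64 encoding for you. A sketch:
htpasswd -nb admin 'password' > users
kubectl create secret generic traefik-dashboard-auth -n traefik --from-file=users
rm users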
The second yaml, the 02-middleware.yaml, to enable basic authentication:
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: traefik-dashboard-basicauth
  namespace: traefik
spec:
  basicAuth:
    secret: traefik-dashboard-auth
Then the last yaml, the dashboard IngressRoute:
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: traefik-dashboard
  namespace: traefik
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`traefik-ui.my-domain.net`)
      kind: Rule
      middlewares:
        - name: traefik-dashboard-basicauth
          namespace: traefik
      services:
        # - name: traefik-mgmt
        - name: api@internal
          kind: TraefikService
  tls:
    secretName: my-domain-net-tls-prod
Notice I refer to a tls secret? More on this just a tad later.
Let's see the three objects created:
andreasm@linuxmgmt01:~/prod-cluster-1/traefik/traefik-dashboard$ k get secrets -n traefik
NAME TYPE DATA AGE
traefik-dashboard-auth Opaque 1 39s
andreasm@linuxmgmt01:~/prod-cluster-1/traefik/traefik-dashboard$ k get middleware.traefik.io -n traefik
NAME AGE
traefik-dashboard-basicauth 55s
andreasm@linuxmgmt01:~/prod-cluster-1/traefik/traefik-dashboard$ k get ingressroutes.traefik.io -n traefik
NAME AGE
traefik-dashboard 1m
I have created a DNS record pointing to the external IP of the traefik-mgmt service and made sure the Host definition in the IngressRoute matches this DNS record.
Now the dashboard is available and prompting for username and password.
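A quick way to verify both the route and the basic authentication from the command line, using my own DNS record and the credentials generated above:
# Without credentials this should return 401 Unauthorized
curl -I https://traefik-ui.my-domain.net/dashboard/
# With credentials it should return 200 OK
curl -I -u admin:password https://traefik-ui.my-domain.net/dashboard/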
See the official docs here for more info.
Cert-Manager #
Instead of configuring Traefik to generate the certificates I need for my HTTPS services, I have already configured Cert-Manager to create the certificates I need; you can read how I have done it here. I mostly use wildcard certificates and don't see the need to request certificates all the time.
Then I use reflector to share/sync the certificates across namespaces. Read more on reflector here and here.
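For reference, below is a sketch of how the reflector annotations can be stamped onto the certificate secret via cert-manager's secretTemplate, so reflector can sync it into other namespaces. The issuer name and the namespace list are assumptions for illustration, not my exact config.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: my-domain-net-wildcard
  namespace: cert-manager
spec:
  secretName: net-tls-prod
  secretTemplate:
    annotations:
      reflector.v1.k8s.emberstack.com/reflection-allowed: "true"
      reflector.v1.k8s.emberstack.com/reflection-allowed-namespaces: "traefik,harbor"  # example list
      reflector.v1.k8s.emberstack.com/reflection-auto-enabled: "true"
      reflector.v1.k8s.emberstack.com/reflection-auto-namespaces: "traefik,harbor"    # example list
  dnsNames:
    - "*.my-domain.net"
  issuerRef:
    name: letsencrypt-prod   # assumed ClusterIssuer name
    kind: ClusterIssuer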
Monitoring with Prometheus and Grafana #
Another nice feature is Traefik's built-in Prometheus metrics. These Prometheus metrics can then be used as a datasource in Grafana. So here is how I configured Prometheus and Grafana.
I followed these two blog posts, here and here, and used them in combination to configure Traefik with Prometheus.
Prometheus #
I will start by getting Prometheus up and running, then Grafana.
In my Traefik values.yaml I made these changes before running helm upgrade on the Traefik installation:
metrics:
  ## -- Prometheus is enabled by default.
  ## -- It can be disabled by setting "prometheus: null"
  prometheus:
    # -- Entry point used to expose metrics.
    entryPoint: metrics
    ## Enable metrics on entry points. Default=true
    addEntryPointsLabels: true
    ## Enable metrics on routers. Default=false
    addRoutersLabels: true
    ## Enable metrics on services. Default=true
    addServicesLabels: true
    ## Buckets for latency metrics. Default="0.1,0.3,1.2,5.0"
    buckets: "0.1,0.3,1.2,5.0,10.0"
First I registered a DNS record with the name prometheus-traefik.my-domain.net against the external IP of the service below, as I consider this a service that belongs in the management category. Now I have two DNS records pointing to the same IP (the traefik-ui one above included).
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S)
traefik-mgmt LoadBalancer 10.21.183.19 10.150.11.11 80:30343/TCP,443:30564/TCP
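Before wiring up Prometheus, a quick sanity check that the metrics entrypoint is actually serving data can be done with a port-forward straight to the Traefik deployment. A sketch:
kubectl port-forward -n traefik deploy/traefik 9100:9100
# in another terminal:
curl -s http://localhost:9100/metrics | head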
I will prepare six different yaml files; all of them are explained below. The first yaml is the traefik-metrics service:
apiVersion: v1
kind: Service
metadata:
  name: traefik-metrics
  namespace: traefik
spec:
  ports:
    - name: metrics
      protocol: TCP
      port: 9100
      targetPort: metrics
  selector:
    app.kubernetes.io/instance: traefik-traefik
    app.kubernetes.io/name: traefik
  type: ClusterIP
As I followed the two blogs above, there are a couple of approaches to make this work. One approach is to expose the metrics using a ClusterIP service by applying the yaml above; the Prometheus target then refers to this svc (this requires Prometheus to be running in the same cluster). The other approach is to configure Prometheus to scrape the Traefik pods directly.
One can also use this ClusterIP service later on with an IngressRoute, to expose it outside the Kubernetes cluster as an easy way to check whether metrics are coming in, or if the metrics need to be accessed externally. If scraping the pods, this service is not needed, as Prometheus will scrape the Traefik pods directly.
Then I need to create a Prometheus configMap telling Prometheus what and how to scrape. Below I will paste two ways Prometheus can scrape the metrics. The first yaml will scrape the pods directly using kubernetes_sd_configs, filtering on the annotations on the Traefik pods.
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: prometheus
data:
  prometheus.yml: |
    global:
      scrape_interval: 5s
      evaluation_interval: 5s
    scrape_configs:
      - job_name: 'traefik'
        kubernetes_sd_configs:
          - role: pod
            selectors:
              - role: pod
                label: "app.kubernetes.io/name=traefik"
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
            action: replace
            target_label: __scheme__
            regex: (https?)
This approach will scrape the metrics from the relevant Traefik pods using annotations. But it also means I need to give the Prometheus pod access to scrape pods outside its own namespace. So I will go ahead and create a service account, cluster role and cluster role binding for that:
andreasm@linuxmgmt01:~/prod-cluster-1/traefik/traefik-dashboard$ kubectl -n prometheus create serviceaccount prometheus
serviceaccount/prometheus created
andreasm@linuxmgmt01:~/prod-cluster-1/traefik/traefik-dashboard$ k create clusterrole prometheus --verb=get,list,watch --resource=pods,services,endpoints
clusterrole.rbac.authorization.k8s.io/prometheus created
andreasm@linuxmgmt01:~/prod-cluster-1/traefik/traefik-dashboard$ kubectl create clusterrolebinding prometheus --clusterrole=prometheus --serviceaccount=prometheus:prometheus
clusterrolebinding.rbac.authorization.k8s.io/prometheus created
The second approach is to point to the ClusterIP metrics service (defined above) and let Prometheus scrape this service instead. This approach does not need the serviceAccount.
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: prometheus
data:
  prometheus.yml: |
    global:
      scrape_interval: 5s
      evaluation_interval: 5s
    scrape_configs:
      - job_name: 'traefik'
        static_configs:
          - targets: ['traefik-metrics.traefik.svc.cluster.local:9100']
Then I created the third yaml file that creates the PersistentVolumeClaim for my Prometheus instance:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-storage-persistence
  namespace: prometheus
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
The fourth file is the actual Prometheus deployment, referring to the objects created in the previous yamls:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: prometheus
spec:
  selector:
    matchLabels:
      app: prometheus
  replicas: 1
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus # Remember to add the serviceaccount for scrape access
      containers:
        - name: prometheus
          image: prom/prometheus:latest
          ports:
            - containerPort: 9090
              name: default
          volumeMounts:
            - name: prometheus-storage
              mountPath: /prometheus
            - name: config-volume
              mountPath: /etc/prometheus
      volumes:
        - name: prometheus-storage
          persistentVolumeClaim:
            claimName: prometheus-storage-persistence
        - name: config-volume
          configMap:
            name: prometheus-config
The fifth yaml file is the Prometheus service where I expose Prometheus internally in the cluster:
kind: Service
apiVersion: v1
metadata:
  name: prometheus
  namespace: prometheus
spec:
  selector:
    app: prometheus
  type: ClusterIP
  ports:
    - protocol: TCP
      port: 9090
      targetPort: 9090
The last yaml is the IngressRoute, in case I want to access Prometheus from outside my Kubernetes cluster. Strictly optional if Grafana is deployed in the same cluster, as it can then just use the previously created ClusterIP service. But nice to have if in need of troubleshooting, etc.
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: prometheus
  namespace: prometheus
spec:
  entryPoints:
    - websecure
  routes:
    - kind: Rule
      match: Host(`prometheus-traefik.my-domain.net`)
      services:
        - kind: Service
          name: prometheus
          port: 9090
Here the DNS record I created earlier comes into play. After applying all the above yamls, Prometheus should be up and running and I can use the IngressRoute to access the Prometheus dashboard from my laptop.
The screenshot below is from scraping the pods directly.
The screenshot below is from scraping the metrics service:
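If the targets do not show up, a port-forward straight to the Prometheus deployment is a handy way to rule out the IngressRoute as the culprit. A sketch:
kubectl port-forward -n prometheus deploy/prometheus 9090:9090
# then browse to http://localhost:9090/targets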
Grafana #
Now I more or less just need to install Grafana and add the Prometheus ClusterIP service as a datasource. Installing Grafana is easily done using Helm. Below are the steps I did to install Grafana:
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
## grabbing the default values
helm show values grafana/grafana > grafana-values.yaml
Below are the changes I have made in the values.yaml I am using to install Grafana:
## Configure grafana datasources
## ref: http://docs.grafana.org/administration/provisioning/#datasources
##
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus-traefik
        type: prometheus
        url: http://prometheus.prometheus.svc.cluster.local:9090
        access: proxy
        editable: true
        orgId: 1
        version: 1
        isDefault: true
ingress:
  enabled: false
persistence:
  # type: pvc
  enabled: true
  # resourcePolicy: "keep"
  # storageClassName: default
  accessModes:
    - ReadWriteOnce
  size: 10Gi
  annotations:
    helm.sh/resource-policy: "keep"
  finalizers:
    - kubernetes.io/pvc-protection
  # selectorLabels: {}
  ## Sub-directory of the PV to mount. Can be templated.
  # subPath: ""
  ## Name of an existing PVC. Can be templated.
  # existingClaim:
  ## Extra labels to apply to a PVC.
  extraPvcLabels: {}
## Expose the grafana service to be accessed from outside the cluster (LoadBalancer service).
## or access it from within the cluster (ClusterIP service). Set the service type and the port to serve it.
## ref: http://kubernetes.io/docs/user-guide/services/
##
service:
  enabled: true
  type: ClusterIP
  port: 80
  targetPort: 3000
  # targetPort: 4181 To be used with a proxy extraContainer
  ## Service annotations. Can be templated.
  annotations: {}
  labels: {}
  portName: service
  # Adds the appProtocol field to the service. This allows to work with istio protocol selection. Ex: "http" or "tcp"
  appProtocol: ""
# Administrator credentials when not using an existing secret (see below)
adminUser: admin
adminPassword: 'password'
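With the values in place, Grafana can be installed with a single Helm command; I am assuming a grafana namespace here:
helm upgrade -i grafana grafana/grafana -f grafana-values.yaml -n grafana --create-namespace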
This will deploy Grafana with a PVC that is not deleted if the Helm installation of Grafana is uninstalled, and it will create a ClusterIP service exposing the Grafana UI internally in the cluster. So I need to create an IngressRoute to expose it outside the cluster using Traefik.
Below is the IngressRoute for this:
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: grafana-ingressroute
  namespace: grafana
spec:
  entryPoints:
    - web
  routes:
    - kind: Rule
      match: Host(`grafana-prod.my-domain.net`)
      services:
        - kind: Service
          name: grafana
          passHostHeader: true
          namespace: grafana
          port: 80
Again, the Host match is a DNS record already registered against the same external IP as the Prometheus one.
Now Grafana should be up and running.
Dashboard depicted above is the Traefik Official Standalone Dashboard which can be imported from here or use the following ID: 17346.
That's it for monitoring with Prometheus and Grafana. Now onto just a simple web application.
Test application Yelb #
I wanted to expose my test application Yelb, deployed twice, using two different DNS records. I also wanted these services to be exposed using a completely different subnet, to create the IP separation I have mentioned a couple of times. I have already deployed the Yelb application twice in my cluster, in their own respective namespaces:
andreasm@linuxmgmt01:~/prod-cluster-1/traefik/grafana$ k get pods -n yelb
NAME READY STATUS RESTARTS AGE
redis-server-84f4bf49b5-fq26l 1/1 Running 0 13d
yelb-appserver-6dc7cd98-s6kt7 1/1 Running 0 13d
yelb-db-84d6f6fc6c-m7xvd 1/1 Running 0 13d
yelb-ui-6fbbcc4c87-qjdzg 1/1 Running 0 2d20h
andreasm@linuxmgmt01:~/prod-cluster-1/traefik/grafana$ k get pods -n yelb-2
NAME READY STATUS RESTARTS AGE
redis-server-84f4bf49b5-4sx7f 1/1 Running 0 2d16h
yelb-appserver-6dc7cd98-tqkkh 1/1 Running 0 2d16h
yelb-db-84d6f6fc6c-t4td2 1/1 Running 0 2d16h
yelb-ui-2-84cc897d6d-64r9x 1/1 Running 0 2d16h
I want to expose the yelb-ui in both namespaces on their different DNS records using IngressRoutes. I also want to use a completely different external IP address than what I have been using so far under the management category. So this time I will be using this external IP:
apiVersion: v1
kind: Service
metadata:
  annotations:
    io.cilium/lb-ipam-ips: "10.150.16.10"
  name: traefik-exposed-pool-1
  labels:
    env: traefik-pool-1
  namespace: traefik
spec:
  ports:
    - name: web
      port: 80
      protocol: TCP
      targetPort: web
    - name: websecure
      port: 443
      protocol: TCP
      targetPort: websecure
  selector:
    app.kubernetes.io/name: traefik
  type: LoadBalancer
So I will need to register two DNS records against the IP above, 10.150.16.10, with the following names: yelb-1.my-domain.net and yelb-2.my-domain.net.
Then I can expose the Yelb UI services from both the namespaces yelb and yelb-2 with the following IngressRoutes:
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: yelb-ingressroute-1
  namespace: yelb
spec:
  entryPoints:
    - web
  routes:
    - kind: Rule
      match: Host(`yelb-1.my-domain.net`)
      services:
        - kind: Service
          name: yelb-ui-1
          passHostHeader: true
          namespace: yelb
          port: 80
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: yelb-ingressroute-2
  namespace: yelb-2
spec:
  entryPoints:
    - web
  routes:
    - kind: Rule
      match: Host(`yelb-2.my-domain.net`)
      services:
        - kind: Service
          name: yelb-ui-2
          passHostHeader: true
          namespace: yelb-2
          port: 80
The two IngressRoutes applied:
andreasm@linuxmgmt01:~/prod-cluster-1/cilium/test-apps/yelb$ k get ingressroutes.traefik.io -n yelb
NAME AGE
yelb-ingressroute-1 2d20h
andreasm@linuxmgmt01:~/prod-cluster-1/cilium/test-apps/yelb$ k get ingressroutes.traefik.io -n yelb-2
NAME AGE
yelb-ingressroute-2 2d16h
Now I can access both of them using their own DNS records:
Yelb-1
Yelb-2
Traefik in front of my Home Assistant server #
Another requirement I had was to expose my Home Assistant server using Traefik, including MQTT. This is how I configured Traefik to handle this.
Home Assistant port 8123 #
As Home Assistant is running outside my Kubernetes cluster I needed to create an ExternalName service in my Kubernetes cluster for Traefik to use when forwarding requests to my “external” Home Assistant server.
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: external-homeassistant
  name: external-homeassistant
  namespace: traefik
spec:
  type: ExternalName
  ports:
    - name: homeassistant
      port: 8123
      targetPort: 8123
      protocol: TCP
  externalName: 10.100.2.14
  selector:
    app.kubernetes.io/instance: traefik
    app.kubernetes.io/name: traefik
The IP is that of my Home Assistant server, and the port is the one it is listening on. I decided to place the service in the same namespace as Traefik, as Home Assistant does not reside in any namespace in my Kubernetes cluster.
For this to work I needed to make sure my Traefik installation had this value enabled in my values.yaml config before running the helm upgrade of the Traefik installation:
providers:
  kubernetesCRD:
    # -- Allows to reference ExternalName services in IngressRoute
    allowExternalNameServices: true
Here is the service after it has been applied:
andreasm@linuxmgmt01:~/prod-cluster-1/traefik/hass$ k get svc -n traefik
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
external-homeassistant ExternalName <none> 10.100.2.14 8123/TCP 42h
Now I needed to create a middleware to redirect all http requests to https:
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: hass-redirectscheme
  namespace: traefik
spec:
  redirectScheme:
    scheme: https
    permanent: true
And finally the IngressRoute, which routes the requests to the Home Assistant ExternalName service and does TLS termination using my wildcard certificate:
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: homeassistant-ingressroute
  namespace: traefik
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`hass.my-domain.net`)
      kind: Rule
      middlewares:
        - name: hass-redirectscheme
          namespace: traefik
      services:
        - name: external-homeassistant
          kind: Service
          port: 8123
  tls:
    secretName: net-tls-prod
That's it, now I can access my Home Assistant through Traefik with TLS termination. And I don't have to worry about certificate expiration, as the certificate will be automatically renewed by Cert-Manager.
The DNS record points to the IP I have decided to use for this purpose. Same concept as earlier.
Home Assistant MQTT 1883 #
I am also running MQTT in Home Assistant to support a bunch of devices, even remote devices (not in the same house). So I wanted to use Traefik for that also. This is how I configured Traefik to handle that:
I needed to create a new entrypoint in Traefik on port 1883, called mqtt. So I edited the Traefik values yaml and updated it accordingly, then ran helm upgrade on the Traefik installation. Below is the config I added:
ports:
  mqtt:
    port: 1883
    protocol: TCP
    expose: true
    exposedPort: 1883
Now my Traefik pods also include port 1883:
Containers:
  traefik:
    Container ID: containerd://edf07e67ade4b005e7a7f8ac8a0991b2793c9320cabc35b6a5ea3c6271d63e6d
    Image: docker.io/traefik:v2.10.6
    Image ID: docker.io/library/traefik@sha256:1957e3314f435c85b3a19f7babd53c630996aa1af65d1f479d75539251b1e112
    Ports: 9100/TCP, 1883/TCP, 9000/TCP, 8000/TCP, 8443/TCP
    Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
Args:
--global.checknewversion
--global.sendanonymoususage
--entrypoints.metrics.address=:9100/tcp
--entrypoints.mqtt.address=:1883/tcp
--entrypoints.traefik.address=:9000/tcp
--entrypoints.web.address=:8000/tcp
--entrypoints.websecure.address=:8443/tcp
--api.dashboard=true
--ping=true
--metrics.prometheus=true
--metrics.prometheus.entrypoint=metrics
--metrics.prometheus.addRoutersLabels=true
--metrics.prometheus.addEntryPointsLabels=true
--metrics.prometheus.addServicesLabels=true
--metrics.prometheus.buckets=0.1,0.3,1.2,5.0,10.0
--providers.kubernetescrd
--providers.kubernetescrd.allowCrossNamespace=true
--providers.kubernetescrd.allowExternalNameServices=true
--providers.kubernetesingress
--providers.kubernetesingress.allowExternalNameServices=true
--entrypoints.websecure.http.tls=true
This service is not exposed to the Internet, so I decided to create a third Service using another subnet for internal services, that is, services within my network only.
I then created a DNS record for the mqtt service pointing to this IP address. Below is the service I am using for mqtt:
apiVersion: v1
kind: Service
metadata:
  annotations:
    io.cilium/lb-ipam-ips: "10.150.20.10"
  name: traefik-internal-pool-2
  labels:
    env: traefik-pool-2
  namespace: traefik
spec:
  ports:
    - name: web
      port: 80
      protocol: TCP
      targetPort: web
    - name: websecure
      port: 443
      protocol: TCP
      targetPort: websecure
    - name: mqtt
      port: 1883
      protocol: TCP
      targetPort: mqtt
  selector:
    app.kubernetes.io/name: traefik
  type: LoadBalancer
This Service includes the entrypoints web 80, websecure 443 AND the newly created entrypoint mqtt 1883. That way I can reuse it for other internal purposes as well.
Now I can go ahead and create another ExternalName service:
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: mqtt-homeassistant
  name: mqtt-homeassistant
  namespace: traefik
spec:
  type: ExternalName
  ports:
    - name: mqtt-homeassistant
      port: 1883
      targetPort: 1883
      protocol: TCP
  externalName: 10.100.2.14
  selector:
    app.kubernetes.io/instance: traefik
    app.kubernetes.io/name: traefik
This is also pointing to the IP of my Home Assistant server but using port 1883 instead.
Last step is to create a TCP IngressRoute like this:
apiVersion: traefik.io/v1alpha1
kind: IngressRouteTCP
metadata:
  name: homeassistant-mqtt-ingressroute
  namespace: traefik
spec:
  entryPoints:
    - mqtt
  routes:
    - match: ClientIP(`172.20.1.0/24`)
      services:
        - name: mqtt-homeassistant
          port: 1883
I can now go ahead and point all my MQTT clients to the DNS record I have created for the external IP above.
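To verify the TCP route end to end, the mosquitto clients can be pointed at the new DNS record. A sketch; the hostname and credentials below are assumptions, as the actual record is not shown in this post:
# Subscribe in one terminal...
mosquitto_sub -h mqtt.my-domain.net -p 1883 -u mqtt-user -P 'password' -t 'test/#'
# ...and publish from another
mosquitto_pub -h mqtt.my-domain.net -p 1883 -u mqtt-user -P 'password' -t 'test/hello' -m 'via traefik'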
Traefik and Harbor Registry #
The last use case I had for Traefik this round is my Harbor registry. I will quickly show how I did that here.
I deploy Harbor using Helm; below are the steps to add the repo, and the values.yaml I am using:
helm repo add harbor https://helm.goharbor.io
helm repo update
Here is my Harbor Helm values yaml file:
expose:
  type: clusterIP
  tls:
    enabled: false
    certSource: secret
    secret:
      secretName: "net-tls-prod"
    auto:
      commonName: registry.my-domain.net
  clusterIP:
    name: harbor
    ports:
      httpPort: 80
      httpsPort: 443
externalURL: "https://registry.my-domain.net"
harborAdminPassword: "password"
persistence:
  enabled: true
  # Setting it to "keep" to avoid removing PVCs during a helm delete
  # operation. Leaving it empty will delete PVCs after the chart deleted
  # (this does not apply for PVCs that are created for internal database
  # and redis components, i.e. they are never deleted automatically)
  resourcePolicy: "keep"
  persistentVolumeClaim:
    registry:
      # Use the existing PVC which must be created manually before bound,
      # and specify the "subPath" if the PVC is shared with other components
      existingClaim: ""
      # Specify the "storageClass" used to provision the volume. Or the default
      # StorageClass will be used (the default).
      # Set it to "-" to disable dynamic provisioning
      storageClass: "nfs-client"
      subPath: ""
      accessMode: ReadWriteOnce
      size: 50Gi
      annotations: {}
    database:
      existingClaim: ""
      storageClass: "nfs-client"
      subPath: "postgres-storage"
      accessMode: ReadWriteOnce
      size: 1Gi
      annotations: {}
portal:
  tls:
    existingSecret: net-tls-prod
Then I install Harbor using Helm, and it should end up like this, only ClusterIP services:
andreasm@linuxmgmt01:~/prod-cluster-1/traefik/harbor$ k get svc -n harbor
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
harbor ClusterIP 10.21.152.146 <none> 80/TCP 29h
harbor-core ClusterIP 10.21.89.119 <none> 80/TCP 29h
harbor-database ClusterIP 10.21.174.146 <none> 5432/TCP 29h
harbor-jobservice ClusterIP 10.21.191.45 <none> 80/TCP 29h
harbor-portal ClusterIP 10.21.71.241 <none> 80/TCP 29h
harbor-redis ClusterIP 10.21.131.55 <none> 6379/TCP 29h
harbor-registry ClusterIP 10.21.90.29 <none> 5000/TCP,8080/TCP 29h
harbor-trivy ClusterIP 10.21.6.124 <none> 8080/TCP 29h
I want to expose my Harbor registry to the Internet, so I will be using the Service with the external IP I use for Internet-facing services. This is also the same external IP as I am using for my Home Automation exposure. This means I can expose several services to the Internet using the same port, like 443, with no need to create custom ports, etc. Traefik will happily handle the requests coming to the respective DNS records as long as I have configured it to listen 😄
Now I just need to create a middleware to redirect all http to https and the IngressRoute itself.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: harbor-redirectscheme
  namespace: harbor
spec:
  redirectScheme:
    scheme: https
    permanent: true
Then the IngressRoute:
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: harbor-ingressroute
  namespace: harbor
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`registry.my-domain.net`)
      kind: Rule
      middlewares:
        - name: harbor-redirectscheme
          namespace: harbor
      services:
        - name: harbor-portal
          kind: Service
          port: 80
    - match: Host(`registry.my-domain.net`) && PathPrefix(`/api/`, `/c/`, `/chartrepo/`, `/service/`, `/v2/`)
      kind: Rule
      middlewares:
        - name: harbor-redirectscheme
          namespace: harbor
      services:
        - name: harbor
          kind: Service
          port: 80
  tls:
    secretName: net-tls-prod
Now, let me see if I can reach Harbor:
And can I login via Docker?
andreasm@linuxmgmt01:~/prod-cluster-1/traefik/harbor$ docker login registry.my-domain.net
Username: andreasm
Password:
WARNING! Your password will be stored unencrypted in /home/andreasm/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store
Login Succeeded
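And to confirm that pushing works through Traefik as well, a quick tag-and-push of a small image; the library project name is an assumption here:
docker pull busybox:latest
docker tag busybox:latest registry.my-domain.net/library/busybox:latest
docker push registry.my-domain.net/library/busybox:latest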
Summary #
I have found Traefik in combination with Cilium a very pleasant experience: the ease of creating IP pools in Cilium and using BGP to advertise the host routes, and how I could configure Traefik to use different external IP entrypoints covering needs like IP separation. The built-in Traefik dashboard, and using Grafana for dashboard creation on top of the Prometheus metrics, was very nice. I feel confident that Traefik is one of my go-to reverse proxies going forward. By deploying Traefik on my Kubernetes cluster I also achieved high availability and scalability. When I started out with Traefik I found it a bit “difficult”, as I mentioned in the beginning of this post, but after playing around with it for a while and getting the terminology under my skin, I find Traefik quite easy to manage and operate. Traefik also has a good community out there, which helped me get the help I needed when I was stuck.
This post is not meant to be an exhaustive list of Traefik's capabilities; it is just scratching the surface of what Traefik can do, so I will most likely create a follow-up post when I dive into deeper and more advanced topics with Traefik.