distributed cloud
7 TopicsAutomating TLS Certificates in Kubernetes with cert-manager and F5 Distributed Cloud DNS
Introduction If you run workloads in Kubernetes or Open Shift, you've almost certainly dealt with TLS certificates. You need them everywhere — Ingress controllers, internal services, mutual TLS between microservices, and API gateways. Managing them by hand is error-prone and doesn't scale: certificates expire silently, rotation is forgotten, and the person who originally created the wildcard cert is now working somewhere else. cert-manager solves this elegantly. It's a Kubernetes-native certificate controller that automates the issuance, renewal, and management of TLS certificates. It speaks ACME — the same protocol Let's Encrypt uses — but ACME is a standard, not a Let's Encrypt exclusive. You can point cert-manager at: Let's Encrypt (free, public, widely trusted) ZeroSSL, Buypass, Google Trust Services — other public CAs supporting ACME Step CA / HashiCorp Vault / Smallstep — for private PKI running inside your infrastructure Any commercial CA that has implemented an ACME endpoint This means the same workflow, the same Kubernetes manifests, and the same tooling can back both your public-facing services and your internal, corporate-signed certificates. One operator to rule them all. Why DNS-01 Challenge? ACME offers multiple ways to prove you own a domain. The most common is the HTTP-01 challenge, where the ACME server checks a well-known URL on your domain. It works well, but has limitations: The endpoint must be publicly reachable on port 80 It cannot issue wildcard certificates (*.example.com) The DNS-01 challenge takes a different approach: cert-manager (via a solver webhook) creates a _acme-challenge.example.com TXT record in your DNS zone. The ACME server checks for this record. Once verified, the TXT record is cleaned up automatically. Benefits: Works behind firewalls — no inbound HTTP needed Supports wildcard certificates — a single *.example.com certificate covers all subdomains Fully automated — the webhook handles record creation and deletion If your DNS is managed by F5 Distributed Cloud (F5 XC), you can now wire this entire flow together with the open-source cert-manager-webhook-f5xc solver. Architecture Overview Here's what happens when cert-manager issues a certificate using the F5 XC webhook: Developer applies Certificate manifest ▼ cert-manager creates Order + Challenge ▼ cert-manager calls webhook (DNS-01 solver) ▼ Webhook calls F5 XC DNS API → Creates TXT record: _acme-challenge.example.com ▼ ACME server (Let's Encrypt / other) validates the TXT record ▼ Webhook cleans up the TXT record ▼ cert-manager stores the issued certificate in a Kubernetes Secret The webhook runs as a standard Kubernetes Deployment inside your cluster, registered with cert-manager via a ValidatingWebhookConfiguration. It receives solver calls from cert-manager and translates them into F5 XC DNS API calls using an API token you provide as a Kubernetes Secret. Prerequisites Before we start, make sure you have the following in place: A Kubernetes cluster (1.21+) cert-manager installed (v1.14+) kubectl apply -f https://github.com/cert-manager/cert-manager/releases/latest/download/cert-manager.yaml Helm 3.8+ An F5 Distributed Cloud tenant with DNS management enabled Your domain's DNS zone managed in F5 XC F5 XC credentials for DNS API access — either an API Token or a Client Certificate (P12); see Step 1 below for details Least privilege note: The service account or user whose credentials you use must have permission to manage DNS records. As of the time of writing, the built-in role f5xc-dns-management-admin is sufficient. Avoid using tenant-admin or other overly broad roles — the webhook only needs to create and delete TXT records in your DNS zone. Step 1: Prepare F5 XC Credentials The webhook supports two authentication methods against the F5 XC DNS API: an API Token or a Client Certificate (P12). Both are stored as a Kubernetes Secret in the cert-manager namespace. Obtaining credentials in F5 XC Console Regardless of which method you choose, the service account must have sufficient permissions to manage DNS records. Follow the least-privilege principle: In the F5 XC Console, navigate to Account Settings → Administration. Create a dedicated service credential under IAM and assign it the f5xc-dns-management-admin role in the system namespace — this is the minimum required role as of the time of writing and grants access to DNS Zone Management without unnecessary privileges elsewhere in the tenant. Or use an existing account privileges under Personal Management credentials. Generate the credentials of your preferred type (API Token or API Certificate) Option A: API Token (simpler, recommended for most setups) Take the API Token obtained from the F5XC console and use it with the following command kubectl create secret generic f5xc-api-token \ --namespace cert-manager \ --from-literal=token=<YOUR_F5XC_API_TOKEN> Option B: Client Certificate (P12) F5 XC also supports certificate-based authentication using a P12 (PKCS#12) client certificate, which may be preferred in environments. Use the certificate and password generated in the previous step and store it as a Secret: kubectl create secret generic f5xc-client-cert \ --namespace cert-manager \ --from-file=certificate.p12=<PATH_TO_YOUR.p12> \ --from-literal=password=<P12_PASSWORD> Refer to the webhook documentation for the exact certificateSecretRef fields to use in the solver config when choosing this method. Verify the secret was created: kubectl get secret f5xc-api-token -n cert-manager Step 2: Install the Webhook via Helm The chart is published as an OCI artifact on GitHub Container Registry. Install it into the cert-manager namespace: helm install cert-manager-webhook-f5xc \ oci://ghcr.io/wenkow/charts/cert-manager-webhook-f5xc \ --namespace cert-managerbaba Check the rollout: kubectl rollout status \ deployment cert-manager-webhook-f5xc \ -n cert-manager kubectl get pods -n cert-manager Step 3: Configure a ClusterIssuer A ClusterIssuer tells cert-manager which ACME server to use and how to solve challenges. Here we're pointing it at Let's Encrypt production and configuring the F5 XC DNS-01 solver. Note on groupName: The field appears twice in the YAML below and serves different purposes. At the webhook level, it's a fixed identifier (acme.f5xc.io) that tells cert-manager which registered webhook to call. Inside config, it's the RRSet group name within your F5 XC DNS zone — a logical container for DNS records created by the webhook. You can choose any name; F5 XC will create the group if it doesn't exist. clusterissuer.yaml: apiVersion: cert-manager.io/v1 kind: ClusterIssuer metadata: name: letsencrypt-f5xc-prod spec: acme: email: admin@example.com server: https://acme-v02.api.letsencrypt.org/directory privateKeySecretRef: name: letsencrypt-f5xc-prod-account-key solvers: - dns01: webhook: # Fixed identifier — tells cert-manager which webhook to call groupName: acme.f5xc.io solverName: f5xc config: # Your F5 XC tenant name (subdomain part of your console URL) tenantName: my-tenant # RRSet group name in F5 XC DNS zone groupName: "cert-manager" # ttl: 120 # optional, default is 120 seconds apiTokenSecretRef: name: f5xc-api-token key: token # If using certificate-based auth instead of a token, replace # apiTokenSecretRef with certificateSecretRef — see webhook docs. Apply it: kubectl apply -f clusterissuer.yaml Verify it's ready: kubectl get clusterissuer letsencrypt-f5xc-prod Step 4: Request a Certificate Now the fun part. Create a Certificate resource. Note that we're requesting both the apex domain and a wildcard — something that's only possible with DNS-01. certificate.yaml: apiVersion: cert-manager.io/v1 kind: Certificate metadata: name: example-tls namespace: default spec: secretName: example-tls issuerRef: name: letsencrypt-f5xc-prod kind: ClusterIssuer dnsNames: - example.com - "*.example.com" Apply it: kubectl apply -f certificate.yaml Step 5: Watch the Certificate Lifecycle This is where it gets satisfying to watch. cert-manager creates an Order, which spawns one or more Challenge objects. Each Challenge triggers the F5 XC webhook to create a DNS TXT record. Watch Certificate status kubectl get certificate example-tls -n default -w Inspect the Order kubectl get orders -n default kubectl describe order example-tls-1-3552197254 -n default Status: Authorizations: Challenges: Token: <token> Type: dns-01 URL: https://acme-v02.api.letsencrypt.org/acme/chall/... Identifier: example.com Initial State: valid URL: https://acme-v02.api.letsencrypt.org/acme/authz/... Wildcard: true Challenges: Token: <token> Type: dns-01 URL: https://acme-v02.api.letsencrypt.org/acme/chall/... Identifier: example.com Initial State: valid URL: https://acme-v02.api.letsencrypt.org/acme/authz/... Wildcard: false Certificate: REDACTED Finalize URL: https://acme-v02.api.letsencrypt.org/acme/finalize/... State: valid URL: https://acme-v02.api.letsencrypt.org/acme/order/... Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Complete 8m37s cert-manager-orders Order completed successfully Inspect the Challenges before it's done kubectl get challenges -n default kubectl describe challenge example-tls-<1234567890> -n default Verify the Issued Certificate kubectl get secret example-tls -n default -o jsonpath='{.data.tls\.crt}' \ | base64 -d | openssl x509 -noout -text | grep -E "Subject:|DNS:|Not After" Not After : Aug 19 19:22:31 2026 GMT Subject: CN = example.com DNS:*.example.com, DNS:example.com Both the apex domain and the wildcard are covered by a single certificate. Using a Custom or Internal ACME Server One of the most powerful aspects of this setup is that the ACME server is completely configurable. If you run an internal CA — for example HashiCorp Vault with the ACME secrets engine, or Step CA — just change the server field in your ClusterIssuer: spec: acme: email: admin@internal.example.com server: https://vault.internal.example.com/v1/pki/acme/directory # or # server: https://step-ca.internal.example.com/acme/acme/directory The webhook doesn't care which ACME server you use — it only handles the DNS side of the challenge. This makes the setup equally useful for: Internet-facing services using Let's Encrypt Internal services using a corporate CA Mixed environments where different ClusterIssuer objects point to different CAs, all sharing the same F5 XC DNS solver Troubleshooting Tips Challenge stays in pending for a long time Check the webhook logs for API errors: kubectl logs -n cert-manager -l app=cert-manager-webhook-f5xc --tail=50 READY: False on ClusterIssuer Usually means cert-manager couldn't register an ACME account. Check that the email field is valid and the ACME server URL is reachable. TXT record not appearing in F5 XC - Verify that the credentials you stored in the Secret have the right DNS permissions. In F5 XC Console, check that the service account has the f5xc-dns-management-admin role (or equivalent). API token permission issues will typically surface as 403 Forbidden errors in the webhook logs. Summary The cert-manager-webhook-f5xc project closes the loop between F5 Distributed Cloud DNS and the Kubernetes-native certificate management ecosystem. With a few manifests and a Helm install, you get: Fully automated certificate issuance and renewal — no manual interventions, no expiry surprises Wildcard certificate support out of the box via DNS-01 ACME provider flexibility — works with Let's Encrypt, commercial CAs, or your internal PKI Clean Kubernetes-native UX — certificates are just resources; the entire lifecycle is observable with standard kubectl commands The webhook is open source, available on GitHub and packaged on Artifact Hub. Contributions and feedback are welcome. Related Resources cert-manager documentation cert-manager DNS01 webhook reference F5 Distributed Cloud DNS Management docs GitHub: Wenkow/cert-manager-webhook-f5xc Artifact Hub: cert-manager-webhook-f5xc158Views2likes1CommentUpdate an ASM Policy Template via REST-API - the reverse engineering way
I always want to automate as many tasks as possible. I have already a pipeline to import ASM policy templates. Today I had the demand to update this base policies. Simply overwriting the template with the import tasks does not work. I got the error message "The policy template ax-f5-waf-jump-start-template already exists.". Ok, I need an overwrite tasks. Searching around does not provide me a solution, not even a solution that does not work. Simply nothing, my google-foo have deserted me. Quick chat with an AI, gives me a solution that was hallucinated. The AI answer would be funny if it weren't so sad. I had no hope that AI could solve this problem for me and it was confirmed, again. I was configuring Linux systems before the internet was widely available. Let's dig us in the internals of the F5 REST API implementation and solve the problem on my own. I took a valid payload and removed a required parameter, "name" in this case. The error response changes, this is always a good signal in this stage of experimenting. The error response was "Failed Required Fields: Must have at least 1 of (title, name, policyTemplate)". There is also a valid field named "policyTemplate". My first thought: This could be a reference for an existing template to update. I added the "policyTemplate" parameter and assigned it an existing template id. The error message has changed again. It now throws "Can't use string (\"ox91NUGR6mFXBDG4FnQSpQ\") as a HASH ref while \"strict refs\" in use at /usr/local/share/perl5/F5/ASMConfig/Entity/Base.pm line 888.". An perl error that is readable and the perl file is in plain text available. Looking at the file at line 888: The Perl code looks for an "id" field as property of the "policyTemplate" parameter. Changing the payload again and added the id property. And wow that was easy, it works and the template was updated. Final the payload for people who do not want to do reverse engineering. Update POST following payload to /mgmt/tm/asm/tasks/import-policy-template to update an ASM policy template: { "filename": "<username>~<filename>", "policyTemplate": { "id": "ox91NUGR6mFXBDG4FnQSpQ" } } Create POST following payload /mgmt/tm/asm/tasks/import-policy-template to create an ASM policy template: { "name": "<name>", "filename": "<username>~<filename>" } Hint: You must upload the template before to /var/config/rest/downloads/<username>~<filename>". Conclusion Documentation is sometimes overrated if you can read Perl. Missed I the API documentation for this endpoint and it was just a exercise for me?484Views2likes8CommentsF5 Distributed Cloud (XC) Custom Routes: Capabilities, Limitations, and Key Design Considerations
This article explores how Custom Routes work in F5 Distributed Cloud (XC), why they differ architecturally from standard Load Balancer routes, and what to watch out for in real-world deployments, covering backend abstraction, Endpoint/Cluster dependencies, and critical TLS trust and Root CA requirements.742Views2likes1CommentXC Distributed Cloud and how to keep the Source IP from changing with customer edges(CE)!
Code is community submitted, community supported, and recognized as ‘Use At Your Own Risk’. Old applications sometimes do not accept a different IP address to be used by the clients during the session/connection. How can make certain the IP stays the same for a client? The best will always be the application to stop tracking users based on something primitive as an ip address and sometimes the issue is in the Load Balancer or ADC after the XC RE and then if the persistence is based on source IP address on the ADC to be changed in case it is BIG-IP to Cookie or Universal or SSL session based if the Load Balancer is doing no decryption and it is just TCP/UDP layer . As an XC Regional Edge (RE) has many ip addresses it can connect to the origin servers adding a CE for the legacy apps is a good option to keep the source IP from changing for the same client HTTP requests during the session/transaction. Before going through this article I recommend reading the links below: F5 Distributed Cloud – CE High Availability Options: A Comparative Exploration | DevCentral F5 Distributed Cloud - Customer Edge | F5 Distributed Cloud Technical Knowledge Create Two Node HA Infrastructure for Load Balancing Using Virtual Sites with Customer Edges | F5 Distributed Cloud Technical Knowledge RE to CE cluster of 3 nodes The new SNAT prefix option under the origin pool allows no mater what CE connects to the origin pool the same IP address to be seen by the origin. Be careful as if you have more than a single IP with /32 then again the client may get each time different IP address. This may cause "inet port exhaustion " ( that is what it is called in F5BIG-IP) if there are too many connections to the origin server, so be careful as the SNAT option was added primary for that use case. There was an older option called "LB source IP persistence" but better not use it as it was not so optimized and clean as this one. RE to 2 CE nodes in a virtual site The same option with SNAT pool is not allowed for a virtual site made of 2 standalone CE. For this we can use the ring hash algorithm. Why this works? Well as Kayvan explained to me the hashing of the origin is taking into account the CE name, so the same origin under 2 different CE will get the same ring hash and the same source IP address will be send to the same CE to access the Origin Server. This will not work for a single 3-node CE cluster as it all 3 nodes have the same name. I have seen 503 errors when ring hash is enabled under the HTTP LB so enable it only under the XC route object and the attached origin pool to it! CE hosted HTTP LB with Advertise policy In XC with CE you can do do HA with 3-cluster CE that can be layer2 HA based on VRRP and arp or Layer 3 persistence based bgp that can work 3 node CE cluster or 2 CE in a virtual site and it's control options like weight, as prepend or local preference options at the router level. For the Layer 2 I will just mention that you need to allow 224.0.0.8 for the VRRP if you are migrating from BIG-IP HA and that XC selects 1 CE to hold active IP that is seen in the XC logs and at the moment the selection for some reason can't be controlled. if a CE can't reach the origin servers in the origin pool it should stop advertising the HTTP LB IP address through BGP. For those options Deploying F5 Distributed Cloud (XC) Services in Cisco ACI - Layer Three Attached Deployment is a great example as it shows ECMP BGP but with the BGP attributes you can easily select one CE to be active and processing connections, so that just one ip address is seen by the origin server. When a CE gets traffic by default it does prefer to send it to the origin as by default "Local Preferred" is enabled under the origin pool. In the clouds like AWS/Azure just a cloud native LB is added In front of the 3 CE cluster and the solution there is simple as to just modify the LB to have a persistence. Public Clouds do not support ARP, so forget about Layer 2 and play with the native LB that load balances between the CE 😉 CE on Public Cloud (AWS/Azure/GCP) When deploying on a public cloud the CE can be deployed in two ways one is through the XC GUI and adding the AWS credentials but this way you have not a big freedom to be honest as you can't deploy 2 CE and make a virtual site out of them and add cloud LB in-front of them as it always will be 3-CE cluster with preconfigured cloud LB that will use all 3 LB! Using the newer "clickops" method is much better https://docs.cloud.f5.com/docs-v2/multi-cloud-network-connect/how-to/site-management/deploy-site-aws-clickops or using terraform but with manual mode and aws as a provider (not XC/volterra as it is the same as the XC GUI deployment) https://docs.cloud.f5.com/docs-v2/multi-cloud-network-connect/how-to/site-management/deploy-aws-site-terraform This way you can make the Cloud LB to use just one CE or have some client Persistence or if traffic comes from RE to CE to implement the virtual site 2 CE node! There is no Layer 2 ARP support as I mentioned in public cloud with 3-node cluster but there is NAT policy https://docs.cloud.f5.com/docs-v2/multi-cloud-network-connect/how-tos/networking/nat-policies but I haven't tried it myself to comment on it. Hope you enjoyed this article!415Views2likes0CommentsF5 XC CE Debug commands through GUI cloud console and API
Why this feature is important and helpful? With this capability if the IPSEC/SSL tunnels are up from the Customer Edge(CE) to the Regional Edge(RE), there is no need to log into the CE, when troubleshooting is needed. This is possible for Secure Mesh(SM) and Secure Mesh V2 (SMv2) CE deployments. As XC CE are actually SDN-based ADC/proxy devices the option to execute commands from the SDN controller that is the XC cloud seems a logical next step. Using the XC GUI to send SiteCLI debug commands. The first example is sending the "netstat" command to "master-3" of a 3-node CE cluster. This is done under Home > Multi-Cloud Network Connect > Overview > Infrastructure > Sites and finding the site, where you want to trigger the commands. In the VPM logs it is possible to see the command that was send in API format by searching for it or for logs starting with "debug", as to automate this task. If you capture and review the full log, you will even see not only the API URL endpoint but also the POST body data that needs to be added. The VPM logs that can also be seen from the web console and API, are the best place to start investigating issues. XC Commands reference: Node Serviceability Commands Reference | F5 Distributed Cloud Technical Knowledge Troubleshooting Guidelines for Customer Edge Site | F5 Distributed Cloud Technical Knowledge Troubleshooting Guide for Secure Mesh Site v2 Deployment | F5 Distributed Cloud Technical Knowledge Using the XC API to send SiteCLI debug commands. The same commands can be send using the XC API and first the commands can be tested and reviewed using the API doc and developer portals. API documentation even has examples of how to run these commands with vesctl that is the XC shell client that can be installed on any computer or curl. Postman can also be used instead of curl but the best option to test commands through the API is the developer portal. Postman can also be used by the "old school" people 😉 Link reference: F5 Distributed Cloud Services API for ves.io.schema.operate.debug | F5 Distributed Cloud Technical Knowledge F5 Distributed Cloud Dev Portal ves-io-schema-operate-debug-CustomPublicAPI-Exec | F5 Distributed Cloud Technical Knowledge Summary: The option to trigger commands though the XC GUI or even the API is really useful if for example there is a need to periodically monitor the cpu or memory jump with commands like "execcli check-mem" or "execcli top" or even automating the tcpdump with "execcli vifdump xxxx". The use cases for this functionality really are endless.470Views0likes1Comment