BIG-IP in Tanzu Kubernetes Grid in an NSX-T network

Introduction

BIG-IP in Tanzu Kubernetes Grid provides an Ingress solution implemented with a single tier of load balancing. Typically, Ingress requires an in-cluster Ingress Controller and an external load balancer. By using BIG-IP, Ingress services are greatly simplified while performance is improved by removing one hop, and at the same time all of BIG-IP's advanced load balancing functionalities and security modules are exposed.

Tanzu Kubernetes Grid is a Kubernetes distribution supported by VMware that comes with the choice of two CNIs:

  • Antrea - Geneve overlay based.
  • Calico - BGP based, no overlay networking.

When using Antrea in NSX-T environments, Antrea runs its own Geneve overlay on top of NSX-T's Geneve overlay networking. This means that communications in the TKG cluster carry the overhead of two encapsulations, as can be seen in the next Wireshark capture.

Modern NICs are able to offload the CPUs from the task of handling Geneve encapsulation, but they are not able to cope with the double encapsulation that occurs when using Antrea on top of NSX-T.

In this blog we will describe how to set up BIG-IP with TKG's Calico CNI, which doesn't have this overhead. The article is divided into the following sections:

  • Deploying a TKG cluster with built-in Calico
  • Deploying BIG-IP in TKG with NSX-T
  • Configuring BIG-IP and TKG to peer with Calico (BGP)
  • Configuring BIG-IP to handle Kubernetes workloads
  • Verifying the resulting configuration
  • Alternative topologies and multi-tenancy considerations
  • Closing notes

All the configuration files referenced in this blog can be found in the repository https://github.com/f5devcentral/f5-bd-tanzu-tkg-bigip

 

Deploying a TKG cluster with built-in Calico

Deploying a TKG cluster is as simple as running kubectl with a definition like the following one:
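A definition along the following lines should work; the cluster name, VM classes, storage class and CIDR ranges are illustrative placeholders that need to be adapted to your environment:

apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TanzuKubernetesCluster
metadata:
  name: tkg2                       # cluster name (placeholder)
  namespace: tkg2                  # vSphere Namespace where the cluster is created
spec:
  distribution:
    version: v1.18                 # Kubernetes release to deploy
  topology:
    controlPlane:
      count: 1                     # small cluster for testing purposes
      class: best-effort-small
      storageClass: vsan-default-storage-policy
    workers:
      count: 2
      class: best-effort-small
      storageClass: vsan-default-storage-policy
  settings:
    network:
      cni:
        name: calico               # the only change needed to select Calico instead of Antrea
      pods:
        cidrBlocks: ["100.96.0.0/11"]    # POD range (illustrative)
      services:
        cidrBlocks: ["198.18.0.0/16"]    # Services range (illustrative)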

Note that the only thing required to choose between Antrea and Calico is the cni: name: field. It is perfectly fine to run clusters with different CNIs in the same environment. At the time of this writing (H1 2021), Calico v3.11 is the version included with Kubernetes v1.18.

As you can see from the definition above, we will be creating a small cluster for testing purposes. It can be scaled up or down later as desired by re-applying an updated TKG cluster definition.

 

Deploying BIG-IP in TKG with NSX-T

When a TKG cluster is deployed in NSX-T, the Tanzu Kubernetes Grid Service automatically creates the necessary networking configuration. This includes, amongst other things, creating a T1 Gateway (also known as a DLR) named vnet-domain-c8:<uuid>-<namespace>-<name>-vnet-rtr and a subnet named vnet-domain-c8:<uuid>-<namespace>-<name>-vnet-0 where the Kubernetes nodes are attached. This is shown in the next screenshot, where we can see that the subnet is using the range 10.244.1.113/28.

 

The BIG-IPs will be part of this network as if they were another Kubernetes node. Additionally, we will create another segment named VIP in the same T1 DLR to keep all TKG resources under the same network leaf (this can be customized). As the name suggests, this VIP segment is used for exposing the Ingress services implemented in the BIG-IPs. This is shown in the next Node's view figure.

 

The BIG-IPs will additionally have the regular management and HA segments which are not shown for clarity. Following regular BIG-IP rules, these segments should be outside the data plane path and the HA segment doesn't need to be connected to any T1.

The IP addresses used by the BIG-IPs in the Node's segment, 10.244.1.{124,125,126}, will need to be allocated in NSX-T so that they are not reused by the Kubernetes nodes when the TKG cluster is scaled. This is done in the next figure.

 

This is done by first logging in to the NSX-T Manager > Networking > IP Address Pools, where we will find the subnet allocated to our tkg2-tkg2 cluster. In this screen we can obtain the API path to operate with it. In the figure we show the use of Postman to make the IP address allocation with an API request, more precisely a PUT policy/api/v1/<API path>/ip-allocations/<name of allocation> request, indicating in the body the desired IP to allocate. In this link at code.vmware.com you can find the full details of this API call. The names of the IP allocations are not relevant; in this case the names I chose are bigip-floating, bigip1 and bigip2 respectively.
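As a sketch, the same allocation can be made with curl instead of Postman; the manager address and credentials below are placeholders, and the allocation_ip body field follows the IpAddressAllocation schema documented in the link above:

curl -k -u 'admin:<password>' -X PUT \
  -H 'Content-Type: application/json' \
  -d '{"allocation_ip": "10.244.1.124"}' \
  'https://<nsx-manager>/policy/api/v1/<API path>/ip-allocations/bigip-floating'

The same request is then repeated for bigip1 (10.244.1.125) and bigip2 (10.244.1.126).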

 

Configuring BIG-IP and TKG to peer with Calico (BGP)

In the BIG-IP we will configure the VLANs and Self-IPs as usual. The only additional consideration is that, since we are going to use Calico, we have to allow BGP communication (TCP port 179) in the port lockdown settings of the TKG Self-IPs (non-floating only), as shown next.
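For reference, the same change can be made from tmsh with something like the following, where the Self-IP name is a placeholder:

modify net self tkg-selfip-bigip1 allow-service add { tcp:179 }
list net self tkg-selfip-bigip1 allow-service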

 

Next we will enable BGP on the existing route domain 0 under Network -> Route Domains, as shown next.
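Alternatively, this can be done from tmsh with a one-liner like:

modify net route-domain 0 routing-protocol add { BGP }
save sys config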

 

At this point we can configure BGP using the imish command line, which gives us access to the dynamic routing protocol configuration:
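A minimal sketch of this configuration (as it would appear in show running-config) follows; it assumes AS 64512, which matches the BGPPeer resources further below, and the Kubernetes node neighbor addresses are placeholders that must be replaced with the actual node IPs in the 10.244.1.113/28 segment:

router bgp 64512
 bgp router-id 10.244.1.125
 ! Advertise the SNAT Pool range (see further below) so the nodes route it back to the BIG-IP
 network 100.128.0.0/24
 ! One neighbor per Kubernetes node; the addresses below are placeholders
 neighbor calico-k8s peer-group
 neighbor calico-k8s remote-as 64512
 neighbor 10.244.1.114 peer-group calico-k8s
 neighbor 10.244.1.115 peer-group calico-k8s
!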

 

The configuration shown above is for BIG-IP1; only the router-id needs to be changed to apply it to BIG-IP2.

 

To finish the Calico configuration we have to instruct the TKG cluster to peer with the BIG-IPs. This is done with the following Kubernetes resources:

kubectl apply -f - <<EOF
kind: BGPPeer
apiVersion: crd.projectcalico.org/v1
metadata:
 name: bigip1
spec:
 peerIP: 10.244.1.125
 asNumber: 64512
---
kind: BGPPeer
apiVersion: crd.projectcalico.org/v1
metadata:
 name: bigip2
spec:
 peerIP: 10.244.1.126
 asNumber: 64512
EOF

 

As you might have noticed, we are not using BGP passwords; this is because Calico v3.17+ is needed for that feature. In any case, the Node's network is protected by default by TKG's firewall rules in the T1 Gateway.

Finally we will verify that all the routes are advertised as expected:
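From the imish shell on each BIG-IP, checks along these lines show whether the BGP sessions are Established and which prefixes have been learned from the Calico nodes:

imish
show ip bgp summary
show ip bgp
show ip route bgp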

 

This verification has to be done on both BIG-IPs. Note that it is expected to see the next-hop 0.0.0.0 for the SNAT range. In the verification we can see a /26 prefix for each Node (these prefixes are created by Calico on demand as more PODs are created) and a /24 prefix for the SNAT range. These are shown next as a picture.

The network 100.128.0.0/24 has been chosen semi-arbitrarily. It is a range just after the POD range that we indicated at cluster creation, and it will only be seen within the iBGP mesh (the PODs). It is best to use a range not used at all within NSX-T to avoid any possible IP range clashes.

It is worth remarking that the SNAT Pool range will be used for VIP-to-POD traffic, whilst for health monitoring the BIG-IP will use the Self-IPs in the Kubernetes Node's segment.

 

Configuring BIG-IP to handle Kubernetes workloads

BIG-IP plugs into the Kubernetes API by means of Container Ingress Services (CIS). We deploy one CIS POD per BIG-IP. Each CIS instance watches Kubernetes events and, when a configuration is applied or a Deployment is scaled, it updates the BIG-IP's configuration. CIS works with BIG-IP appliances, chassis or Virtual Edition, and exposes BIG-IP's advanced services in the Kubernetes API using standard Ingress resources as well as Custom Resource Definitions (CRDs).

Each CIS instance works independently on its BIG-IP. This is good for redundancy purposes, but at the same time it makes the two BIG-IPs believe that they are out of sync with each other, because the CIS instances update the tkg partition of each BIG-IP independently. This behavior is only cosmetic; the BIG-IP's cluster failover mechanisms (up to 8 BIG-IPs) are independent of this.

The next steps configure CIS in Tanzu Kubernetes Grid, which is the same as with any regular Kubernetes cluster running Calico.

We will install CIS using Helm. First, create a Secret with the BIG-IP login credentials for each unit:

kubectl create secret generic bigip1-login -n kube-system --from-literal=username=admin --from-literal=password=<password>
kubectl create secret generic bigip2-login -n kube-system --from-literal=username=admin --from-literal=password=<password>

Add the CIS chart repository in Helm using the following command:

helm repo add f5-stable https://f5networks.github.io/charts/stable

Create a values-bigip<unit>.yaml file for each BIG-IP as follows:

bigip_login_secret: bigip1-login
rbac:
 create: true
serviceAccount:
 create: true
 name: k8s-bigip1-ctlr
# This namespace is where the Controller lives;
namespace: kube-system
args:
 # See https://clouddocs.f5.com/containers/latest/userguide/config-parameters.html
 # NOTE: helm has difficulty with values using `-`; `_` are used for naming
 # and are replaced with `-` during rendering.
 bigip_url: 192.168.200.14
 bigip_partition: tkg
 default_ingress_ip: 10.106.32.100
 # Use the following settings if you want to restrict CIS to specific namespaces
 # namespace:
 # namespace_label:
 pool_member_type: cluster
 # Trust default BIG-IP's self-signed TLS certificate
 insecure: true
 # Force using the SNAT pool
 override-as3-declaration: kube-system/bigip-snatpool
 log-as3-response: true
image:
 # Use the tag to target a specific version of the Controller
 user: f5networks
 repo: k8s-bigip-ctlr
 pullPolicy: Always
resources: {}
version: latest

Note that in the above values file the following values are per BIG-IP unit:

  • Login's Secret
  • ServiceAccount
  • Management IP

The values files above indicate that we are going to use two resources:

  • A partition named "tkg" in BIG-IP
  • An AS3 override ConfigMap to inject custom BIG-IP configuration on top of a regular Ingress declaration. In this case we use this feature to force all the traffic to use a SNAT Pool.

Next, create the BIG-IP partition indicated in the Helm values file:

root@(bigip1)(cfg-sync In Sync)(Standby)(/Common)(tmos)# create auth partition tkg
root@(bigip1)(cfg-sync Changes Pending)(Standby)(/Common)(tmos)# run cm config-sync to-group <device-group>

Create a ConfigMap to apply our custom SNATPool:

kubectl apply -n kube-system -f bigip-snatpool.yaml

Install the Helm chart for each BIG-IP unit, following this pattern:

helm install -n kube-system -f values-bigip<unit>.yaml  --name-template bigip<unit> f5-stable/f5-bigip-ctlr

In a 2 BIG-IP cluster that is:

helm install -n kube-system -f values-bigip1.yaml  --name-template bigip1 f5-stable/f5-bigip-ctlr
helm install -n kube-system -f values-bigip2.yaml  --name-template bigip2 f5-stable/f5-bigip-ctlr

which will result in the following for each BIG-IP:

$ helm -n kube-system ls
NAME      NAMESPACE      REVISION    UPDATED                                  STATUS      CHART                   APP VERSION
bigip1    kube-system    1           2021-06-09 15:19:20.651138 +0200 CEST    deployed    f5-bigip-ctlr               
bigip2    kube-system    1           2021-06-10 10:01:57.902215 +0200 CEST    deployed    f5-bigip-ctlr

Configuring a Kubernetes Ingress Service

We are ready to deploy Ingress services. In this case we will deploy a single Ingress named cafe.tkg.bd.f5.com, which will perform TLS termination and send the traffic to the tea and coffee applications depending on the URL requested. This is shown in the next figure:

This is deployed with the following commands:

kubectl create ns cafe
kubectl apply -n cafe -f cafe-rbac.yaml
kubectl apply -n cafe -f cafe-svcs.yaml
kubectl apply -n cafe -f cafe-ingress.yaml

We are going to describe the Ingress definition cafe-ingress.yaml:
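The file in the repository is the authoritative version; a simplified sketch of its shape is shown below, where the health monitor values, TLS Secret name and Service names are illustrative:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: cafe-ingress
  annotations:
    # Use the default_ingress_ip configured in the CIS Helm values
    virtual-server.f5.com/ip: "controller-default"
    ingress.kubernetes.io/ssl-redirect: "true"
    # Per-path health monitors (illustrative values)
    virtual-server.f5.com/health: |
      [
        {"path": "cafe.tkg.bd.f5.com/tea",    "send": "GET /tea HTTP/1.1\r\nHost: cafe.tkg.bd.f5.com\r\n\r\n",    "interval": 5, "timeout": 15},
        {"path": "cafe.tkg.bd.f5.com/coffee", "send": "GET /coffee HTTP/1.1\r\nHost: cafe.tkg.bd.f5.com\r\n\r\n", "interval": 5, "timeout": 15}
      ]
spec:
  tls:
  - hosts:
    - cafe.tkg.bd.f5.com
    secretName: cafe-secret        # TLS certificate and key used for termination
  rules:
  - host: cafe.tkg.bd.f5.com
    http:
      paths:
      - path: /tea
        backend:
          serviceName: tea-svc
          servicePort: 80
      - path: /coffee
        backend:
          serviceName: coffee-svc
          servicePort: 80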

CIS supports a wide range of annotations for customizing Ingress services. You can find detailed information in the supported annotations reference page, but when these grow large the Ingress definitions can become hard to maintain.

BIG-IP's solution to the Ingress resource's limited schema capabilities is the AS3 override ConfigMap mechanism. In our configuration we use this mechanism to create a SNAT Pool that we apply to CIS's default Ingress VIP. This is shown next:
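The bigip-snatpool.yaml file in the repository is the authoritative version; a rough sketch of its shape follows, assuming the tkg tenant/partition, SNAT addresses taken from the 100.128.0.0/24 range, and a placeholder for the name of the virtual server that CIS generates for the Ingress VIP:

kind: ConfigMap
apiVersion: v1
metadata:
  name: bigip-snatpool
  namespace: kube-system
data:
  template: |
    {
      "declaration": {
        "tkg": {
          "Shared": {
            "ingress_snatpool": {
              "class": "SNAT_Pool",
              "snatAddresses": [
                "100.128.0.1",
                "100.128.0.2"
              ]
            },
            "<name of the CIS-generated Ingress virtual server>": {
              "snat": { "use": "ingress_snatpool" }
            }
          }
        }
      }
    }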

Given that AS3 override customizations are applied outside the Ingress definitions, by means of an overlay ConfigMap, we avoid having Ingress definitions with huge annotation sections. In this blog we have implemented the health checks with annotations, but these could have been implemented with the AS3 override ConfigMap mechanism as well.

The AS3 Override ConfigMap can be used to define any advanced service configuration which is possible with any module of BIG-IP, not only LTM. Please check this link for more information about AS3 automation.

 

Verifying the resulting configuration

At this point we should see the following Kubernetes resources:
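For example, with commands along these lines (exact resource names depend on your deployment):

kubectl -n cafe get ingress,svc,deploy,pods
kubectl -n kube-system get deploy,pods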

And in the BIG-IP UI we will see the objects shown next in the Network Map section:

But in order to reach the VIPs we need to add an NSX-T firewall rule in TKG's T1 gateway. This is shown next:

After the rule has been applied, we can run a curl command to perform an end-to-end validation:
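Assuming the default_ingress_ip 10.106.32.100 set earlier in the Helm values, a check along these lines should return a response from each application (-k is used here assuming a self-signed certificate):

curl -k --resolve cafe.tkg.bd.f5.com:443:10.106.32.100 https://cafe.tkg.bd.f5.com/tea
curl -k --resolve cafe.tkg.bd.f5.com:443:10.106.32.100 https://cafe.tkg.bd.f5.com/coffee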

 

Alternative topologies and multi-tenancy considerations

In this blog post we have shown a topology where each TKG cluster and its BIG-IPs are contained within the scope of a T1 Gateway. A single BIG-IP cluster can serve several TKG clusters simply by using more interfaces. We could also use a shared VIP network directly connected to the T0 Gateway, as shown next.

 

Note that in the above example there will be at least one CIS POD for each BIG-IP/cluster combination. We say "at least one" because it is also possible to have multiple CIS instances per TKG cluster: CIS can be configured to listen to a specific set of namespaces (possibly selected using labels) and to own a specific partition in the BIG-IP. This is shown next.

Closing notes

Before deleting a TKG cluster we should delete the NSX-T resources that were created when integrating the BIG-IP. These are:

  • The TKG's NSX-T segments from the BIG-IPs (disconnecting them is not enough).
  • The IP allocations.
  • The T1's Gateway Firewall Rules if these have been created.

We have seen how easy it is to use TKG's Calico CNI and take advantage of its reduced overlay overhead. We've also seen how to configure a BIG-IP cluster in TKG to provide a simpler, higher-performance single-tier Ingress Controller. We have only shown the use of CIS with BIG-IP's LTM load balancing module and the standard Ingress resource type, and we've seen how to extend the Ingress resource's limited capabilities in a manageable fashion by using the AS3 override mechanism while also reducing the annotations required. It is also worth remarking on the additional CRDs that CIS provides besides the standard Ingress resource type. The possibilities are limitless: any BIG-IP configuration or module that you are used to using for VM or bare-metal workloads can be applied to containers through CIS.

 

We hope that you enjoyed this blog. We look forward to your comments.

 

 

Published Jun 16, 2021
Version 1.0
