Service Discovery in Google Cloud with F5 BIG-IP
Service discovery allows cool things to happen like dynamic node discovery for your applications. The BIG-IP device can utilize service discovery to automate the scale in/out of pool members. What does this mean? Your BIG-IP configs will get updated without user intervention. Google Cloud uses "Labels" that are assigned to virtual machines (VM). The BIG-IP will use these "Labels" to automate the dynamic nature of pool members coming and going. It periodically scans the cloud provider for VMs matching those labels.
Benefit? Yes indeed! No tickets to IT asking for pool member modifications...no waiting on emails...no approvals.
I setup a BIG-IP (3nic standalone) with service discovery in Google Cloud using our F5 cloud templates on GitHub, and I am here to share the how-to and results. After reading the Github repo as well as visiting our CloudDocs site, I went to work.
This article has the following sections:
- Prerequisites
- Download, Customize, and Deploy Template
- Attach Service Account to BIG-IP VM
- Login to the BIG-IP via SSH and Set Password
- Configure Service Discovery with iApp
- Configure Service Discovery with AS3
- Summary
- Appendix Sections
Prerequisites - Google Cloud SDK
Must have Google Cloud SDK...easy. Go here https://cloud.google.com/sdk/docs/quickstarts.
curl https://sdk.cloud.google.com | bash exec -l $SHELL gcloud init
The Google Cloud SDK lets you do "?" and "tab" helpers. Meaning, type gcloud then hit tab a few times to see all the options. When you run "gcloud init", it will authenticate your device to the google network resulting in your laptop having Google Cloud API access. If you are running "gcloud init" for the first time, the SSH shell will pop open a browser window in which you authenticate to Google with your credentials. You'll be given an option to select the project name, and you can also configure default zones and regions. Play around with gcloud on CLI and then hit "tab" to see all the options. You can also the Google Cloud docs here https://cloud.google.com/sdk/docs/initializing.
Prerequisites - VPC Networks, Subnets, Firewalls, and Routes
A VM in Google Cloud can only have one NIC per VPC network. Therefore, a BIG-IP 3nic deployment requires 3 VPC networks with 1 subnet each. Before deploying the GDM template, you'll need to create the required networks and subnets. Then make sure any necessary ports are open via firewall rules.
VPC Networks and Subnets
Here are some screenshots of my setup. I have a management, external, and internal network.
Here are the network and subnet properties for the management network as an example.
Firewalls
Modify firewall rules to allow any necessary management ports. These are not setup by the template. Therefore, common management ports like tcp:22 and tcp:443 should be created. Here is my management firewall ruleset as an example.
In my example, my BIG-IP has additional interfaces (NICs) and therefore additional networks and firewall rules. Make sure to allow appropriate application access (ex. 80/443) to the NICs on the BIG-IP that are processing application traffic. Here is an example for my external NIC.
Routes
I didn't touch these. However, it is important to review the VPC network routes and make sure you have a default gateway if required. If the network is meant to be internal/private, then it's best to remove the default route pointing to internet gateway. Here is an example of my management network routes in use.
Prerequisites - Service Account
To do auto-discovery of pool members (aka Service Discovery), the BIG-IP device requires a role assigned to its VM. When a VM instance has an assigned role in the cloud, it will inherit permissions assigned by the role to do certain tasks like list compute instances, access cloud storage, re-map elastic IPs, and more. This avoids the need to hard-code credentials in application code. In the example of service discovery, we need the service account to have a minimum of "Compute Viewer" or "Compute Engine - Read Only" permissions. Other deployment examples may require more permissions such as storage permissions or pub-sub permissions.
Create a new service account in the IAM section, then find "Service accounts". Here's an example of a new service account and role assigned.
If done correctly, it will be visible as an IAM user. See below for example.
Prerequisites - Pools Members Tagged Correctly
Deploy the VM instances that will run your app...the pool members (e.g. http server running port 80) and add a "Label" name with value to each VM instance. For example, my label name = app. My label value = demo. Any VM instance in my project with label name = app, value = demo will be discovered by the BIG-IP and pulled in as new pool member(s).
Example...
Pre-reqs are done!
Download and Customize YAML Template File
The Google Cloud templates make use of a YAML file, a PY file, and a schema file (requirements and defaults). The GitHub README contains helpful guidance. Visit the GitHub site to download the necessary files. As a reminder, this demo uses the following 3nic standalone template:
Scroll down to the "Deploying the template" section to review the requirements. Download the YAML file to your desktop and ALSO make sure to download the python (PY) and schema files to your desktop. Here's an example of my modified YAML file. Notice some fields are optional and not required by the template as noted in the GitHub README file parameters table. Therefore, 'mgmtSubnetAddress' is commented out and ignored, but I left it in the example for visualization purposes.
# Copyright 2019 F5 Networks All rights reserved. # # Version 3.2.0 imports: - path: f5-existing-stack-payg-3nic-bigip.py resources: - name: f5-existing-stack-payg-3nic-bigip type: f5-existing-stack-payg-3nic-bigip.py properties: region: us-west1 availabilityZone1: us-west1-b mgmtNetwork: jgiroux-net-mgmt mgmtSubnet: jgiroux-subnet-mgmt #mgmtSubnetAddress: <DYNAMIC or address> restrictedSrcAddress: 0.0.0.0/0 restrictedSrcAddressApp: 0.0.0.0/0 network1: jgiroux-net-ext subnet1: jgiroux-subnet-ext #subnet1Address: <DYNAMIC or address> network2: jgiroux-net-int subnet2: jgiroux-subnet-int #subnet2Address: <DYNAMIC or address> provisionPublicIP: 'yes' imageName: f5-bigip-15-0-1-0-0-11-payg-best-25mbps-190803012348 instanceType: n1-standard-8 #mgmtGuiPort: <port> #applicationPort: <port port> #ntpServer: <server server> #timezone: <timezone> bigIpModules: ltm:nominal allowUsageAnalytics: 'yes' #logLevel: <level> declarationUrl: default
Deploy Template
Make sure you point to the correct file location, and you're ready to go! Again, reference the GitHub README for more info.
Example syntax...
gcloud deployment-manager deployments create <your-deployment-name> --config <your-file-name.yaml>
Attach Service Account to BIG-IP VM
Once deployed, you will need to attach the service account to the newly created BIG-IP VM instance. You do this by shutting down the BIG-IP VM instance, binding a service account to the VM, and then starting the VM again. It's worth noting that the template can be easily modified to include service account binding during VM instance creation. You can also do this via orchestration tools like Ansible or Terraform.
Example...
gcloud compute instances stop bigip1-jg-f5-sd gcloud compute instances set-service-account bigip1-jg-f5-sd --service-account svc-jgiroux@xxxxx.iam.gserviceaccount.com gcloud compute instances start bigip1-jg-f5-sd
Login to the BIG-IP and Set Password
You should have a running BIG-IP at this point with attached service account. In order to access the web UI, you'll need to first access SSH via SSH key authentication and then set the admin password There are orchestrated ways to do this, but let's do the old fashion manual way.
First, go to Google Cloud Console, and view properties of the BIG-IP VM instance. Look for the mgmt NIC public IP.
Note: In Google Cloud, the BIG-IP mgmt interface is swapped with NIC1
Open your favorite SSH client and access the BIG-IP. Make sure your SSH key already exists in your Google Console. Instructions for uploading SSH keys are found here.
Example syntax...
ssh admin@x.x.x.x -i /key/location
Update the admin password and save config while on the TMOS CLI prompt. Here's an example.
admin@(bigip1-jg-f5-sd)(tmos)# modify auth user admin password myNewPassword123! admin@(bigip1-jg-f5-sd)(tmos)# save sys config
Now access the web UI using the mgmt public IP via https://x.x.x.x. Login with admin and the newly modified password.
Configure Service Discovery with iApp
The BIG-IP device is very programmable, and you can apply configurations using various methods like web UI or CLI, iApps, imperative APIs, and declarative APIs. For demo purposes, I will illustrate the iApp method in this section. The F5 cloud templates automatically include the Service Discovery iApp as part of the onboard and build process, but you'll still need to configure an application service.
First, the CLI method is a quick TMSH one-liner to configure the app service using the Service Discovery iApp. It does the following:
- creates new app service called "serviceDiscovery"
- uses "gce" (Google) as provider
- chooses a region "default" (causes script to look in same region as BIG-IP VM)
- chooses intervals and health checks
- creates new pool, looks for pool tag:value (app=demo, port 80)
tmsh create /sys application service serviceDiscovery template f5.service_discovery variables add { basic__advanced { value no } basic__display_help { value hide } cloud__cloud_provider { value gce } cloud__gce_region { value \"/#default#\" } monitor__frequency { value 30 } monitor__http_method { value GET } monitor__http_verison { value http11 } monitor__monitor { value \"/#create_new#\"} monitor__response { value \"\" } monitor__uri { value / } pool__interval { value 60 } pool__member_conn_limit { value 0 } pool__member_port { value 80 } pool__pool_to_use { value \"/#create_new#\" } pool__public_private {value private} pool__tag_key { value 'app'} pool__tag_value { value 'demo'} }
If you still love the web UI, then go to the BIG-IP web UI > iApps > Application Services. If you executed the TMSH command above, then you should have an app service called "serviceDiscovery". Select it, then hit "Reconfigure" to review the settings. If no app service exists yet, then create a new app service and set it to match your environmental requirements. Here is my example.
Validate Results of Service Discovery with iApp
Review the /var/log/ltm file. It will show pool up/down messages for the service discovery pool. It will also indicate if the service discovery script is failing to run or not.
tail -f /var/log/ltm
Example showing successful member add to pool...
Another place to look is the /var/log/cloud/service_discovery/get_nodes.log file. You'll see messages showing the script query and status.
tail -f /var/log/cloud/service_discovery/get_nodes.log
Example showing getNodes.js call and parameters with successful "finished" message...
Last but not least, you can check the UI within the LTM > Pools section.
Note: Service Discovery with iApp complete
Attach this new pool to a BIG-IP virtual server, and now your app can dynamically scale. I'll leave the virtual server creation up to you! In other words...challenge time! For additional methods to configure service discovery on a BIG-IP, continue reading.
Configure Service Discovery with AS3 (declarative option)
As mentioned earlier, the BIG-IP device is very programmable. We used the iApp method to automate BIG-IP configs for pool members changes in the previous section. Now let's look at a declarative API approach using AS3 from the F5 Automation Toolchain. You can read more about AS3 - here.
At a high level, AS3 enables L4-L7 application services to be managed in a declarative model. This enables teams to place BIG-IP security and traffic management services in orchestration pipelines and greatly eases the configuration of L4-L7 services. This also has the benefit of using consistent patterns to deploy and migrate applications.
Similar to the Service Discovery iApp...the AS3 rpm comes bundled with the handy F5 cloud templates. If you deployed via alternative methods, if you do not have AS3 rpm pre-loaded, if you want to upgrade, or if you simply want a place to start learning, review the Quick Start AS3 Docs. Review the Additional Declarations for examples on how to use AS3 with iRules, WAF policies, and more.
Note: AS3 is a declarative API and therefore no web UI exists to configure L4-L7 services.
Postman will be used in my example to POST the JSON declaration. Open Postman, authenticate to the BIG-IP, and then post the app declaration. Learn how by reviewing the Quick Start AS3 Docs. Here's my example declaration. It does the following:
- creates new application (aka VIP) with public IP of 10.1.10.34
- uses "gce" (Google) as cloud provider
- defines tenant as "Sample_sd_01"
- chooses a region "us-west1" in which to query for VMs
- creates new pool 'web_pool' with members matching tag=app, value=demo on port 80
{ "class": "ADC", "schemaVersion": "3.0.0", "id": "urn:uuid:33045210-3ab8-4636-9b2a-c98d22ab425d", "controls": { "class": "Controls", "trace": true, "logLevel": "debug" }, "label": "GCP Service Discovery", "remark": "Simple HTTP application with a pool using GCP service discovery", "Sample_sd_01": { "class": "Tenant", "verifiers": { }, "A1": { "class": "Application", "template": "http", "serviceMain": { "class": "Service_HTTP", "virtualAddresses": [ "10.1.1.34" ], "pool": "web_pool" }, "web_pool": { "class": "Pool", "monitors": [ "http" ], "members": [ { "servicePort": 80, "addressDiscovery": "gce", "updateInterval": 1, "tagKey": "app", "tagValue": "demo", "addressRealm": "private", "region": "us-west1" } ] } } } }
Validate Results of Service Discovery with AS3
Similar to the iApp method, review the /var/log/ltm file to validate AS3 pool member discovery. You'll see basic pool member up/down messages.
tail -f /var/log/ltm
Example showing successful member add to pool...
Another place to look is the /var/log/restnoded/restnoded.log file. You'll see messages showing status.
tail -f /var/log/restnoded/restnoded.log
Example showing restnoded.log sample logs...
Last but not least, you can check the UI within the LTM > Pools section. AS3 is multi-tenant and therefore uses partitions (tenants). Make sure to change the partition in upper-right corner of web UI if you don't see your config objects.
Change partition...
View pool and pool member...
As a bonus, you can test the application from a web browser. AS3 deployed full L4-L7 services in my example. Therefore, it also deployed a virtual server listening on the value in declaration parameter 'virtualAddresses' which is IP 10.1.10.34. Here is my virtual server example...
This IP of 10.1.10.34 maps to a public IP associated with the BIG-IP VM in Google Cloud of 34.82.79.120 on nic0. See example NIC layout below...
Open a web browser and test the app on http://34.82.79.120.
Note: Service Discovery with AS3 complete
Great job! You're done! Review the Appendix sections for more information.
Summary
I hope you enjoyed this writeup and gained some new knowledge along the way. I demonstrated a basic Google Cloud network, deployed an F5 BIG-IP instance using F5 cloud templates, and then configured service discovery to dynamically populate pool members.
As for other general guidance around BIG-IP and high availability designs in the cloud, I'll leave those details for another article.
Appendix: Google Networking and BIG-IP Listeners
In my example, I have an external IP mapping to the BIG-IP VM private IP on nic0...no forwarding rules. Therefore Google NATs the incoming traffic from 34.82.79.120 to the VM private IP 10.1.10.34. The BIG-IP virtual server listener will be 10.1.10.34. On the other hand, Google Cloud forwarding rules map public IPs to VM instances and do not NAT. Therefore a forwarding rule of 35.85.85.125 mapping to my BIG-IP VM will result in a virtual server listener of 35.85.85.125. Remember...
- External public IP > VM private IP mapping = NAT to VM
- Forwarding rule public IP > VM instance = no NAT to VM
Learn more about Google Cloud forwarding rules in the following links:
- Forwarding Rule Concepts - how rules interact with Google LBs
- Using Protocol Forwarding - allowing multiple public IPs to one VM with forwarding rules
Appendix: Example Errors
This is an example of incorrect permissions. You will find this in /var/log/ltm. The iCall script runs and calls the cloud provider API to get a list of pool members.
Jan 24 23:40:00 jg-f5-sd err scriptd[26229]: 014f0013:3: Script (/Common/serviceDiscovery.app/serviceDiscovery_service_discovery_icall_script) generated this Tcl error: (script did not successfully complete: (jq: error: unxpected response from node worker while executing "exec /bin/bash -c "NODE_JSON=\$(curl -sku admin: https://localhost:$mgmt_port/mgmt/shared/cloud/nodes?mgmtPort=$mgmt_port\\&cloud=gce\\&memberTag=app=..." invoked from within "set members [exec /bin/bash -c "NODE_JSON=\$(curl -sku admin: https://localhost:$mgmt_port/mgmt/shared/cloud/nodes?mgmtPort=$mgmt_port\\&cloud=gce\\&m..." line:4))
Review the iCall script to see the curl command running against the cloud provider. Run it manually via CLI for troubleshooting.
curl -sku admin: "https://localhost/mgmt/shared/cloud/nodes?mgmtPort=443&cloud=gce&memberTag=app=demo&memberAddressType=private&memberPort=80&providerOptions=region%3d" {"error":{"code":500,"message":"Required 'compute.instances.list' permission for 'projects/xxxxx'","innererror":{"referer":"restnoded","originalRequestBody":"","errorStack":[]}}}
From the output, I had the wrong permissions as indicated in the logs (error 500 permissions). Correct the permissions for the service account that is in use by the BIG-IP VM.
- Ted_ByerlyRet. Employee
Great article.. Keep up the focus.. I learn something every time.
- Jeff_Giroux_F5Ret. Employee
Feel free to reach out if you have any question on the setup or actual use. Thx!