Handy Reference: RegEx in the Network
Over the years I’ve shared a lot of posts on using programmability in the network to do, well, a lot of different things: implementing A/B testing, running canary deployments, and proxying requests for memcached, to name a few. All these patterns can be, and are, implemented by proxies that offer a platform for taking advantage of data path programmability.
One ingredient in the secret sauce that is data path programmability (a.k.a. programmability in the network) is the ability to match data. Usually that data is the URI, but sometimes it’s a cookie or the user-agent or even data in the payload. Basically, most deployment and scalability patterns that make intelligent decisions require extracting some piece of data and comparing it with pre-determined values to decide where to send the request. Likewise, many security-related patterns – such as credit card or account number scrubbing – rely on being able to find a needle in the haystack that is the data.
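To make that concrete, here is a minimal sketch in Python of both ideas: picking a destination by matching the URI, and scrubbing card-like numbers from a payload. Everything named here (the pool names, `ROUTES`, `choose_pool`, `scrub_cards`) is hypothetical and made up for illustration, and the card pattern is deliberately loose. It is not how any particular proxy does it.

```python
import re

# Hypothetical routing table: (compiled pattern, pool name). First match wins.
ROUTES = [
    (re.compile(r"^/api/v2/"), "canary_pool"),
    (re.compile(r"^/api/"), "api_pool"),
    (re.compile(r"\.(?:png|jpe?g|css|js)$", re.IGNORECASE), "static_pool"),
]
DEFAULT_POOL = "default_pool"


def choose_pool(uri: str) -> str:
    """Return the name of the first pool whose pattern matches the URI."""
    for pattern, pool in ROUTES:
        if pattern.search(uri):
            return pool
    return DEFAULT_POOL


# Assumption: a "card-like" number is 13 to 16 digits, optionally separated
# by single spaces or dashes. Real scrubbing would also validate the number.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")


def scrub_cards(payload: str) -> str:
    """Mask every card-like number, keeping only the last four digits."""
    def mask(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group())
        return "*" * (len(digits) - 4) + digits[-4:]

    return CARD_RE.sub(mask, payload)


print(choose_pool("/api/v2/users"))
print(scrub_cards("card 4111 1111 1111 1111 ok"))
```

The order of `ROUTES` matters because the first match wins, which is why the more specific `/api/v2/` pattern sits above the general `/api/` one.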
In code, I might use a method or function that is standard in the language. Whether it’s Java or C/C++, Python or PHP, there is a veritable cornucopia of options available for matching strings within strings. For network and ops type folks, however, the gold standard for matching strings has got to be regex.
Now, I know some folks whose regex fu is so strong they can pop off the right expression without thinking about it. They are that good.
I am not one of those people. Which is why I found this post to be so, so, SO valuable that it needed to be shared. The target audience for the post is developers, but trust me when I say that it’s just as valuable for those folks who write code for the network. Maybe more so, because it includes snippets for seeking out IP addresses, checking password strength, extracting a domain from a URL, stripping HTML comments (optimization on the fly, anyone?), and matching a URI string.
These are tasks that are fairly common for someone developing a new or enhancing an existing service that resides in the network. So having a quick reference to these regular expressions is certain to be of use at some point in the course of the next year.