F5 OpenStack Testing Methodology: Part Two
In our first article, we briefly discussed the why and the how of testing for the OpenStack team. In this article, we'd like to elaborate on our testing methodology and some of the drivers behind the tests we create. Curiously, our approach isn't defined by the categories of unit/functional/system tests, but by something closer to how the feature is actually used. We combine those categories of tests in a way that gives us well-rounded coverage of our feature. Here's how we think about testing.
Use-Case Tests:
This is less a type of test and more a mindset to be in while developing a test plan. First and foremost, we develop our tests with the customer use-case(s) in mind. These are what we often refer to as 'happy-path' tests, because the customer certainly doesn't hope for something to go wrong. They hope for all things to come up roses, and the first tests we write ensure this is true. It may seem like low-hanging fruit, but it is an important step in the test automation process to convince ourselves that, all things being perfect in the universe, our new feature can accomplish what it needs to. Manual testing is often the true first step in vetting a new feature, but it's prone to error and it may assume tribal knowledge that new developers don't have. The use-case tests give us a first glimpse into new functionality, and if we find bugs here, they are very often critical in nature.
The nature of a use-case test is pretty simple: we use the requirements specification for the feature, verbatim, as inspiration for the test. This may mean talking to the customer directly for further clarification of their expectations. They tell us they want X, Y, and Z when they provide the product with A, B, and C, and we write a test that does exactly that. We may go a step further and do additional validations based on our intimate knowledge of the feature, but the core of the test is driven by the customer's needs.
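To make that concrete, here's a minimal sketch of what a happy-path, use-case-driven test might look like. Everything in it (the `lbaas_client` wrapper, the `create_listener` helper, the endpoint) is hypothetical rather than our real code; the point is that the assertions read almost word for word like the customer's "given A, B, and C, I expect X, Y, and Z" statement.

```python
# Hypothetical happy-path test. The client wrapper, create_listener() helper,
# and endpoint are illustrative, not the project's real API; the structure is
# what matters: inputs straight from the requirement, assertions straight from
# the expected outcome.
from my_plugin import lbaas_client  # hypothetical client wrapper under test


def test_create_http_listener_happy_path():
    client = lbaas_client.Client(endpoint="http://controller:9696")

    # "A, B, C": the inputs the customer said they would provide.
    listener = client.create_listener(
        loadbalancer="lb-test", protocol="HTTP", protocol_port=80)

    # "X, Y, Z": the outcomes the customer said they expect, asserted verbatim.
    assert listener["protocol"] == "HTTP"
    assert listener["protocol_port"] == 80
    assert listener["provisioning_status"] == "ACTIVE"
```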
One important thing to note here is that these tests may not all be full Tempest tests (end-to-end system tests). They are a mixture of unit, functional, and system tests. The unit tests are often written as we develop, to offer some white-box testing of functions or modules. The functional tests may require a real BIG-IP device, but no actual OpenStack installation.
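For instance, one way (and only one way; this isn't necessarily how our own harness does it) to keep the functional tests that need a real BIG-IP separate from the pure unit tests is to gate them on a real device being reachable:

```python
# Sketch of one way to gate functional tests on a real device being available.
# BIGIP_MGMT_IP is a hypothetical environment variable; a real harness might
# use config files or pytest command-line options instead.
import os

import pytest

requires_bigip = pytest.mark.skipif(
    "BIGIP_MGMT_IP" not in os.environ,
    reason="functional test: set BIGIP_MGMT_IP to point at a real BIG-IP")


@requires_bigip
def test_policy_deploys_to_real_device():
    mgmt_ip = os.environ["BIGIP_MGMT_IP"]
    # ... connect to the device at mgmt_ip and deploy the translated policy ...
    assert mgmt_ip  # placeholder assertion for the sketch
```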
Negative Tests:
The negative tests (unhappy path) aim to ensure the feature falls over very gracefully, like a ballet dancer, if something ever goes wrong. When developing this type of test, we put on our creative hats and try to come up with interesting ways in which things could go awry. An example might be a link flapping on some network in the communication pathway or simple network lag. Another might be a mistake in the configuration for the feature. This type of test can be approached in two general ways: black-box and white-box. The black-box tests are better suited for Tempest and functional tests, because they reproduce real-world situations, such as an untested environment, using real infrastructure. The unit tests are good for white-box testing, because we can craft a specific set of arguments to a function/method that we know will fail. Then we ensure that evasive action is taken appropriately or that the proper log messages show up to tell someone that something went wrong.
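A white-box negative unit test might look something like the sketch below. The `translate_rule` function, the bad input, and the message text are all illustrative, but the pattern is the real point: hand the code something broken on purpose, then assert on both the evasive action and the log line a human will eventually read.

```python
# Hypothetical white-box negative test: feed the translator an argument set we
# know is invalid and verify that it both refuses the input and logs something
# a human could act on. translate_rule and the message text are illustrative.
import logging

import pytest

from agent.translator import translate_rule  # hypothetical module under test


def test_translate_rule_rejects_unknown_compare_type(caplog):
    bad_rule = {"type": "HEADER", "compare_type": "SOUNDS_LIKE", "key": "Host"}

    with caplog.at_level(logging.ERROR):
        with pytest.raises(ValueError):
            translate_rule(bad_rule)

    # The log line is what a support engineer will actually see.
    assert "compare_type" in caplog.text
```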
Negative tests are where we have a whole lot more fun. Happy-path tests generally evoke a small smile of victory, but we only really get excited when we start breaking things. We make notes while writing code that 'this' function would be good for a unit test, and 'this' may be suited to a Tempest test. Our notes derive from the knowledge that a specific algorithm is terrific when we are in a happy-path situation, but outside of that, the code may be peanut brittle. Mentally, we say, "Yes, this works in one way, but we need to really batter it with tests." This often stems from a deep suspicion of ourselves and our own abilities, but that is a topic for another article. We are never happy with our negative tests until we find bugs. If we don't find issues, then we aren't writing our tests correctly.
Stress/Fuzz/Performance Tests:
The remainder of our testing encompasses a whole host of additional use-cases, edge testing, and stress testing. I think this would make a great topic for a third installment of this article. There are many paths to go by here, and we can really turn on the creativity again to ensure our feature survives in the face of great adversity (like Sean Astin in Rudy).
Story Time:
As an example of most of the above, I'll demonstrate how we wrote tests for the new Layer 7 Content Switching feature in our OpenStack LBaaSv2 Agent and OpenStack LBaaSv2 Driver. The feature provides a way for customers to use the BIG-IP (via OpenStack's Neutron LBaaS product) to shape traffic on their networks. This is done by deploying LTM Policies and LTM Policy Rules on the BIG-IP.
I developed the translation layer of the agent, which takes in the OpenStack description of layer 7 policies and rules and converts it into something the BIG-IP can understand. This translation takes in a JSON object and produces a new JSON object. So I naively started with unit tests that validated that the output JSON object was what I expected, based on the rules of translation. Then I quickly saw that it doesn't make a whole heap of difference what the result of the translation looks like if it cannot be deployed on the device. So I morphed these unit tests into functional-type tests with a real BIG-IP, where I deployed the policies and rules. This was just the first step in my happy-path use-case test plan. It was kind of boring, but informative. I found a couple of bugs early enough that they will never get the chance to embarrass me across the company. However, I knew much more exciting things were around the corner.
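That first round of unit tests had roughly the shape below. The `L7PolicyTranslator` name and the output keys are stand-ins rather than the agent's actual API; the idea is simply to feed in an OpenStack-style policy description and assert on the translated result.

```python
# A sketch of the kind of unit test I started with: given an OpenStack-style
# L7 policy description, assert that the translated JSON carries the pieces a
# BIG-IP LTM policy would need. L7PolicyTranslator is a hypothetical stand-in
# for the agent's real translation code, and the exact output keys will differ.
from agent.l7policy import L7PolicyTranslator  # hypothetical


def test_redirect_to_pool_policy_translation():
    os_policy = {
        "name": "policy1",
        "action": "REDIRECT_TO_POOL",
        "redirect_pool_id": "pool-uuid",
        "rules": [
            {"type": "HEADER", "compare_type": "EQUAL_TO",
             "key": "X-Debug", "value": "1"},
        ],
    }

    bigip_policy = L7PolicyTranslator().translate(os_policy)

    assert bigip_policy["name"] == "policy1"
    # One OpenStack rule should become one LTM policy rule that forwards to
    # the translated pool.
    assert len(bigip_policy["rules"]) == 1
```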
I then moved into writing system tests (Tempest) for testing the deployment of these policies and running real traffic through the BIG-IP to determine if the set of policies and rules shape the traffic in the way I expected them to. This is also happy path, but it made me mindful of more interesting types of use-case tests that would be good to implement. What if I were to create five policies, each with two rules, then reorder the policies from an OpenStack perspective? Is my traffic now steered the way I expect? Does the BIG-IP look like I expect it to?
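The traffic check at the heart of those system tests boils down to something like the following sketch. The VIP address and the `X-Served-By` marker header the backends return are assumptions made for illustration, not pieces of the real Tempest suite.

```python
# Rough shape of the traffic check inside a system test: send requests through
# the VIP with and without the header an L7 rule matches on, and see which
# pool answered. The VIP and the X-Served-By marker are illustrative only.
import requests

VIP = "http://10.190.5.100"


def assert_traffic_is_steered():
    plain = requests.get(VIP, timeout=10)
    flagged = requests.get(VIP, headers={"X-Debug": "1"}, timeout=10)

    # Backends in each pool are configured to identify themselves in a header.
    assert plain.headers.get("X-Served-By") == "default-pool"
    assert flagged.headers.get("X-Served-By") == "debug-pool"
```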
Then we moved on to conducting negative tests. I say 'conducting' because they may not all be automated, but we at least need to verify the feature fails in informative ways. If a policy fails to deploy, can I validate that the agent log has the proper message, so a customer isn't pulling their hair out trying to figure out what went wrong? One good gauge is how informative the failure messages are to me as a developer: if I can't figure out what went wrong at first glance, then a customer likely won't either. And what happens if an OpenStack user attempts to create a rule we don't currently implement? These are the types of tests where we begin to rub our hands together like Dr. Evil. What happens when we want to steer traffic based on whether a request header called "W_AND_P" is defined and the value of that header is the first chapter of War and Peace? We want to find those flaws in our code before the customer does. That's our main driver.
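If we did automate one of those 'conducted' checks, it might be sketched like this, with `create_l7rule` and `read_agent_log` standing in as hypothetical helpers and the rule type chosen precisely because nothing implements it:

```python
# Illustrative negative test: ask for a rule type the driver doesn't implement
# and check that the failure surfaces as a readable message rather than a
# silent half-deployment. create_l7rule and read_agent_log are hypothetical.
import pytest

from tests.helpers import create_l7rule, read_agent_log  # hypothetical


def test_unsupported_rule_type_fails_loudly():
    with pytest.raises(Exception) as excinfo:
        create_l7rule(policy="policy1", rule_type="SSL_DN_MATCH",
                      compare_type="EQUAL_TO", value="example.com")

    # Both the API error and the agent log should point at the unsupported type.
    assert "SSL_DN_MATCH" in str(excinfo.value)
    assert "unsupported" in read_agent_log().lower()
```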
We hope you have enjoyed our second installment on testing in OpenStack. Leave us some feedback on possible next topics of discussion, or let us know if you're looking for further information.