Testing strategies
Across the SDLC there are many different types of testing. Throughout this post, I am going to talk about why we do testing, the different types of testing, and the common patterns and pitfalls people fall into.
Why do we do testing? On the surface this sounds easy, but if we want to make a data-driven decision it becomes more difficult to answer assertively. What does the existence of tests tell us about the quality of the software? Nothing really; it's difficult to answer. How about the absence of testing? Well, if there is no testing, then we do not know whether the software is working or not. So, applying the converse to no testing, we get → testing is better than no testing. This naturally sparks a discussion of how much testing.
How much testing do we need? You are not going to like this answer, but it depends. This is where two leadership principles, Insist on the Highest Standards and Customer Obsession, come into play. Something like a control system for a billion-dollar spaceship needs extensive testing, because the cost of failure is massive. On the other hand, an internal service that is not directly customer facing can get away with less testing, because the cost of failure is negligible.
There are four primary types of testing: unit testing, modular testing, integration testing, and canary testing. Many people bundle unit testing and modular testing together, as they tend to solve the same problems and can operate interchangeably, but they are distinct forms of testing.
Unit testing
Unit testing is run during build time. Since it runs at build time, coverage is measurable, and many code bases set a coverage target. The goal of unit testing is to ensure the logic is operating correctly, meaning the tests verify that a function or class behaves appropriately. The two common forms of measurement are line and branch coverage. Line coverage measures the percentage of lines that are executed by tests. Similarly, branch coverage measures the percentage of branches (if/else conditions) that are exercised by tests.
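As a minimal sketch of what this looks like in practice, consider a hypothetical function with a single if/else and two pytest tests that exercise both branches; `apply_discount` and the file names are made up for illustration.

```python
# discount.py -- hypothetical function with a single branch
def apply_discount(price: float, is_member: bool) -> float:
    """Members get 10% off; everyone else pays full price."""
    if is_member:
        return round(price * 0.9, 2)
    return price


# test_discount.py -- one test per branch; together they give 100% line
# and branch coverage for apply_discount.
from discount import apply_discount

def test_member_gets_discount():
    assert apply_discount(100.0, is_member=True) == 90.0

def test_non_member_pays_full_price():
    assert apply_discount(100.0, is_member=False) == 100.0
```

Running the suite with the pytest-cov plugin (`pytest --cov --cov-branch`) reports both line and branch coverage, which is what most coverage targets are measured against.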
Modular testing
Modular testing is run during build time. It is functionally similar to both unit testing and integration testing. During modular testing, only the integration points with dependent services are mocked out and the rest of the functionality is tested. Let's say there is a micro-service that consists of two interconnected Lambdas. Rather than writing individual tests for each function inside a Lambda, the Lambda's input/output is tested and verified. With multiple Lambdas (perhaps in a chain of SQS/Kinesis), the test triggers one Lambda and then the next. Similarly, if you are testing an HTTP or RPC server, the best form of testing is to spin up the server and execute tests against it. Run the CRUD operations against your server in a local test environment to ensure that CREATE interacts well with READ, which interacts well with UPDATE and DELETE. Each operation may work when tested individually, but is there a nuance that only appears when they are put together?
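Here is a minimal sketch of that idea, assuming a hypothetical `handlers` module with two Lambda handlers chained through SQS; only the DynamoDB table (the true integration point) is mocked, and the handlers are exercised through their public input/output exactly as the Lambda runtime would invoke them.

```python
# Modular test sketch: two hypothetical Lambda handlers chained via SQS,
# with the only external integration (DynamoDB) mocked out.
from unittest.mock import MagicMock

import handlers  # hypothetical module containing enrich_handler and persist_handler


def test_enrich_then_persist(monkeypatch):
    fake_table = MagicMock()  # the only thing mocked: the DynamoDB table
    monkeypatch.setattr(handlers, "orders_table", fake_table)

    # Invoke the first handler exactly as the Lambda runtime would.
    enriched = handlers.enrich_handler({"orderId": "123", "amount": 50}, context=None)

    # Feed its output into the next handler, simulating the SQS hop.
    result = handlers.persist_handler({"Records": [{"body": enriched["body"]}]}, context=None)

    assert result["statusCode"] == 200
    fake_table.put_item.assert_called_once()  # the write reached the (mocked) integration
```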
Integration testing
Integration testing verifies the integrations with dependent services. In both unit testing and modular testing, all dependencies are mocked. Integration testing is the time to make sure that the integrations with dependent services are working as expected. By definition, since we are testing integrations, this has to run on actual hardware in an actual environment, not on a MacBook (unless that is your target). In cloud development, this means deploying your code to Lambda and then invoking the Lambda. Or, if the Lambda is invoked by a Step Function, invoking the Step Function to make sure the Lambda is executed correctly.
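A minimal sketch of such a test, assuming a deployed function named `orders-service-beta-ProcessOrder` and a simple JSON contract (both hypothetical); the test runs from the pipeline against the real deployed Lambda using boto3.

```python
# Integration test sketch: invoke the actual deployed Lambda in a real
# AWS environment and assert on its response.
import json

import boto3


def test_process_order_end_to_end():
    client = boto3.client("lambda", region_name="us-east-1")
    response = client.invoke(
        FunctionName="orders-service-beta-ProcessOrder",  # hypothetical deployed function
        Payload=json.dumps({"orderId": "it-123", "amount": 50}),
    )
    assert response["StatusCode"] == 200
    body = json.loads(response["Payload"].read())
    assert body.get("status") == "ACCEPTED"
```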
Integration tests MUST be run in a deployment pipeline. I recommend, where possible, running as much of the integration testing as you can in development environments as well. Best practice is to run integration tests in all stages, so if your pipeline consists of beta/gamma/prod, the integration tests run in beta/gamma/prod. The integration tests act as a first level of detection to make sure that integrations are working correctly; if you do not run the integration tests in prod, how do you know the integration is working correctly?
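One way to keep the same suite running in every stage is to parameterize the tests by environment; this sketch assumes the pipeline injects hypothetical `SERVICE_ENDPOINT` and `STAGE` variables for each of beta/gamma/prod.

```python
# Stage-aware integration test: the pipeline sets SERVICE_ENDPOINT and STAGE,
# and the identical test file runs in beta, gamma, and prod.
import os
import urllib.request

ENDPOINT = os.environ.get("SERVICE_ENDPOINT", "https://beta.example.com")  # hypothetical
STAGE = os.environ.get("STAGE", "beta")


def test_health_check_in_current_stage():
    with urllib.request.urlopen(f"{ENDPOINT}/health", timeout=5) as resp:
        assert resp.status == 200, f"health check failed in {STAGE}"
```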
While all tests must be deterministic, it is imperative for integration tests to be deterministic! If a test is flaky, then people just hit “retry”. If the retry works, then retrying becomes common practice, and the retry just masks an actual issue. If you build a culture of debugging all test failures and you have confidence in your testing, then all tests must always pass.
Where possible, you want to integrate with your dependency's prod services. Think about your integration with DynamoDB or S3: are you using their gamma endpoints in your gamma environment? No. So why would you integrate with another service's gamma environment? The only time you should integrate with another service's gamma environment is for tightly coupled feature development, where both teams are incrementally building software features. But this type of development is at odds with a CI/CD process, because changes back up in gamma waiting for both systems to work. Instead, the services should integrate with each other's prod services, with appropriate feature flagging or authorization in place to enable/disable the new feature. This pattern also helps your system deploy rollback-safe changes, which lends itself well to other processes in CI/CD deployments.
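A minimal sketch of the feature-flag gate, with the flag read from an environment variable purely for illustration (AppConfig or a config table works the same way); the pricing functions are hypothetical stand-ins for the legacy and new prod integrations.

```python
import os


def call_legacy_pricing_service(item_id: str) -> float:
    # Placeholder for the existing prod integration.
    return 10.0


def call_new_pricing_service(item_id: str) -> float:
    # Placeholder for the dependency's new prod endpoint.
    return 9.5


def use_new_pricing_api() -> bool:
    # The flag source (an environment variable here) is an assumption.
    # Flipping it off disables the feature without redeploying either
    # service, which is what makes the change rollback-safe.
    return os.environ.get("ENABLE_NEW_PRICING_API", "false").lower() == "true"


def get_price(item_id: str) -> float:
    if use_new_pricing_api():
        return call_new_pricing_service(item_id)
    return call_legacy_pricing_service(item_id)
```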
“Craig, but seriously, they will not let us integrate with their prod and their gamma is always broken.” In this case, you really only have a few options. I recommend you escalate the problem and say “our dependency will not let us integration test with them, blocking us from having integration tests”. But if you must push forward, these are your options. 1/ Do nothing. Don't test your integrations and fly blind, or keep tests that give you no confidence. Both are equally bad. 2/ Integrate with their gamma. If their gamma is broken, then your pipeline is broken and you cannot push changes. This is frustrating but safe, as long as you have the willpower not to override (which puts you back in option 1). 3/ Mock their service. Mocking the service is commonly done either by mocking in the code based on the request/environment, or by creating a proxy layer in front of the service. So, instead of calling your dependency directly, you call a proxy, and the proxy can have mocking logic. This lets you narrow integration bugs down to the proxy, which is a simple service and easy to debug. Additionally, this lets you test integrations with the other systems that do let you perform integration testing.
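A rough sketch of option 3/ as a standalone proxy, using only the Python standard library; the upstream URL and the `X-Test-Mock` header are assumptions, and a real deployment would add auth, error handling, and support for more HTTP methods.

```python
# A thin proxy in front of an uncooperative dependency: normal requests are
# forwarded to the real service, while requests carrying a test marker header
# get a canned response, so mocking logic lives only in this proxy.
import json
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

UPSTREAM = "https://dependency.example.com"  # hypothetical real dependency


class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.headers.get("X-Test-Mock") == "true":
            status, body = 200, json.dumps({"status": "OK", "source": "mock"}).encode()
        else:
            with urllib.request.urlopen(UPSTREAM + self.path) as resp:
                status, body = resp.status, resp.read()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), ProxyHandler).serve_forever()
```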
Canary testing
Canaries are tests that run on a recurring basis and mimic customer traffic. They derive their name from canaries in coal mines, where the canary is the first line of defense against a problem (an odorless gas suffocating you). Canaries are always a strict subset of integration testing. There are a few common canaries that most services use: the sunny day scenario, the rainy day scenario, security (AuthN/AuthZ), and performance. The canary scenarios should also be exercised at deployment time as integration tests, because why would you wait 5 minutes or 60 minutes (whatever the canary interval is) to detect an issue?
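A minimal sketch of a sunny-day canary, written as a scheduled Lambda that hits a hypothetical customer-facing endpoint and publishes a success metric to CloudWatch; the endpoint and metric names are assumptions, and you would alarm on the metric to get paged when the canary fails.

```python
# Sunny-day canary sketch: exercise the same path a customer would and
# publish a success/failure metric for alarming.
import urllib.request

import boto3

cloudwatch = boto3.client("cloudwatch")
ENDPOINT = "https://api.example.com/orders/health"  # hypothetical customer-facing path


def handler(event, context):
    success = 0
    try:
        with urllib.request.urlopen(ENDPOINT, timeout=5) as resp:
            success = 1 if resp.status == 200 else 0
    except Exception:
        success = 0
    cloudwatch.put_metric_data(
        Namespace="Canary/OrdersService",
        MetricData=[{"MetricName": "SunnyDaySuccess", "Value": success, "Unit": "Count"}],
    )
    return {"success": success}
```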
Best practice is to run canaries in a different account than your production AWS account. This simulates traffic as if it came from a customer, and avoids some special cases of handling “internal” traffic. Your canary traffic should never take a special path that differs from customer workflows, except in some scenarios when publishing metrics.