In a Test-Driven Development, everything starts with writing the tests first. In this post, I discuss handling flakiness in automated tests. If you start your development by writing the tests first, it can help with setting up some assumptions. The challenge with this is that you are not always aware of edge cases. As the saying goes – you don’t know what you don’t know.
What is a flaky test?
To start with, we need to understand the basic definition of a flaky test. Nobody intentionally writes a flaky test. A flaky test is an automated test that gives different results at different executions even when there are no changes in the underlying code. The non-deterministic nature of such tests makes it harder to debug them.
Let’s look at the possible reasons why tests can become flaky.
Reasons for a flaky test
With CI/CD, automated tests have become part of the development process. It goes without saying that automated tests do increase confidence in the process and the product you are trying to build. As we discussed in the previous section, sometimes a test can become flaky even if there is no code change. What are the possible reasons for a flaky test?
- Bad Tests – If a developer does not invest enough time to understand the functionality they are trying to test, they can end up writing bad tests. Sometimes with bad data or with wrong logic. Bad tests do represent not enough understanding of the underlying system.
- Wrong assumptions – One of the major reasons for a flaky test is the wrong assumptions. When writing a test, the developer does not have all the information and they write the assumptions for a test with the requirements they have in hand, this can result in either wrong or incomplete assumptions. One of the easiest ways to figure out the flaky test is to challenge the assumptions for that test. I have also realized that time can be a dynamic concept and the human mind can not completely conceive the time and it does make it harder to write tests that can involve time elements. Surprisingly, the flaky tests either have time-bound or data-constraint issues.
- Concurrency – Whether the concurrency is part of CI/CD execution or user-added for executing tests, it can cause flakiness for automated tests.
How to debug the flakiness of automated tests?
Before you can debug a flaky test, you should check all the following notes to see if anything might be causing the flakiness of the test
- memory leak – Memory leaks can cause performance degradation, but can also cause tests to flake more often. You should check if your tests and test environment are functioning optimally.
- wait time – Check wait times between tests and for each test. Possibly even increase the wait time allowing each test to complete all the possible requests.
- challenge assumptions – The easiest way to debug a flaky test is to look at all the assumptions, and data and challenge them. If you challenge the assumptions and change them, you might find different results.
Remember flakiness of the test is most of the time because of assumptions made about data or the functionality.
Conclusion
In this post, we discussed the flakiness of tests in automated tests and how to debug such tests. Check the framework you are using or the assumptions you are making about the tests.