Tired of a Selenium flaky test stopping your deployment? A flaky test is a test that produces inconsistent results in different runs. It might pass initially and fail on subsequent executions for no apparent reason. Clearly, there are some underlying reasons for that unpredictable behavior.
Flaky tests pose a significant challenge to CI systems as they contribute to seemingly arbitrary pipeline failures. That is why it is so crucial to avoid them!
In this guide, you will understand what flaky tests are and delve into their primary causes. Next, you will explore some best practices to avoid writing flaky tests in Selenium.
Let’s dive in!
Flaky Test: Definition and Main Causes
A flaky test is a test that produces different results for the same commit SHA. Another guideline for identifying such a test is: “If it succeeds on the branch but fails after merging, it might be a flaky test.”
Thus, flakiness is something related to a test rather than a testing technology. You can say that a Selenium test is flaky, but you should not say that Selenium is flaky.
The impact of flaky tests is particularly significant within a CI pipeline. Due to their inconsistent nature, they lead to unpredictable failures for the same commit across multiple deploy attempts. Because of them, you need to configure the pipeline to run multiple times upon failure. This causes delays and confusion, as each deployment seems susceptible to seemingly random failures.
Some of the main reasons for a test to show flaky behavior include:
- Slowdowns: If the application under test experiences slowdowns, timeouts used in the test may intermittently cause failures.
- Race conditions: Simultaneous operations on a dynamic page can result in unexpected behavior.
- Bugs: Specific choices in test logic implementation can contribute to test flakiness.
These factors can individually or collectively contribute to flakiness. Let’s now see how to protect against them with some Selenium best practices!
Techniques to Avoid Writing Flaky Tests in Selenium
Explore the best methods backed by the official documentation to avoid flaky tests in Selenium.
Note that the code snippets below will be in Java, but you can easily adapt them to any other programming language supported by Selenium.
Make Sure You Are Using the Latest Version of Selenium
Selenium is a cross-browser and cross-platform technology that is available in several programming languages. At the same time, not all of the Selenium binding libraries out there work with the latest version of Selenium.
For instance, consider the unofficial Selenium WebDriver Go client tebeka/selenium. Despite its popularity with thousands of GitHub stars, the library has not been updated for years and still relies on Selenium 3.
That could be the cause of your Selenium flaky tests. The reason is that Selenium 3 uses the JSON wire protocol for communicating with the web browser from the local end. JSON wire protocol is not standardized and might produce different results on different browsers. For this reason, Selenium 4 deprecated it in favor of the standardized and more reliable W3C WebDriver protocol. In short, Selenium 4 tests are inherently less flaky than Selenium 3 tests!
To avoid issues in your test suite, always ensure that your Selenium client library is using the latest version of Selenium. In particular, you should always adopt the official Selenium bindings, which leads to the next recommendation.
Prefer Official Selenium Bindings
As of this writing, Selenium bindings are officially available in C#, Ruby, Java, Python, and JavaScript. However, Selenium is so popular that there are a myriad of unofficial bindings in other programming languages. As mentioned earlier, some of them are just as popular as the official ones. A good reason may be that if you have written an application in a particular programming language, you probably want to write tests in that language as well.
Although that choice makes sense from a logical point of view, it may not be the best from a technical standpoint. Relying on an unofficial port means depending on updates from the community. If the contributors behind the project do not have time to keep up with the pace of official releases, you will always use an older version of the Selenium testing technology. Although a test is usually flaky for reasons that go beyond the technology in use, that is not always the case. Older versions of Selenium are known to be buggy, slow, and to offer now-deprecated APIs that should no longer be used.
Also keep in mind that not all official Selenium bindings are the same. For example, at the Selenium TLC meeting on January 5, 2023, it was pointed out that the Ruby binding tended to produce flaky results with Firefox on Windows. Thus, the recommended approach is to write Selenium tests with one of the officially supported languages, keeping an eye on the official site to see which binding is the most reliable and complete.
Write Generic Locators
One of the key aspects of writing robust E2E tests is the use of effective selection strategies for HTML nodes. Selenium supports several methods to select HTML nodes:
- By class name: Locates elements whose class attribute contains the specified value.
- By CSS selector: Locates elements matching a given CSS selector.
- By id: Locates elements whose HTML id attribute matches the specified value.
- By name: Locates elements whose HTML name attribute matches the search value.
- By link text: Locates anchor elements whose visible text matches the search value.
- By partial link text: Locates anchor elements whose visible text contains the search value. If multiple elements are matching, only the first one will be selected.
- By tag name: Locates elements whose HTML tag name matches the search value.
- By XPath expression: Locates elements matching the given XPath expression.
These include XPath and CSS selectors, the two most popular ways to select HTML nodes on a page. Note that choosing one selector strategy or the other can make all the difference. This is because the dynamic nature of the DOM in modern JavaScript-based pages can lead to flaky testing with improper selectors.
Consider this CSS selector:
div.container > header#menu > li:nth-child(3) > button.subscribe-button
This achieves its goal, but it is too long and tightly coupled with the HTML structure. A simple change in the DOM structure will lead to test failure.
In general, strive to write CSS or XPath selectors that are as generic as possible. Selectors tied too closely to the implementation lead to flaky behavior, especially when dealing with dynamic DOMs that change with user interaction.
Instead, prefer simpler and more robust CSS selectors like:
.subscribe-button
As a rule of thumb, remember that the class attribute of an HTML element in the DOM can change dynamically, while its ARIA role on the page is less likely to change that easily. Also, target HTML attributes that are unlikely to change, like the id attribute.
Use Implicit and Explicit Waits, Not Hard Waits
In E2E testing, you typically need to wait for dynamic operations to complete or for specific events to occur. A simple solution you may think of is to use a method like Thread.sleep(), which pauses test execution for a specified duration. This approach is called “hard waiting.” While this correctly implements the waiting behavior, it also leads to Selenium flaky tests.
The problem is that you cannot know beforehand how long to wait. The wait time specified in a hard wait may seem reasonable for your configuration, but turn out to be too short or too long for other environments. A common CPU or network slowdown can be enough to make your test fail. Plus, hard waits introduce delays in the tests and slow down your entire suite. These are all good reasons to never use them.
As a more reliable alternative, Selenium supports two built-in waiting strategies: implicit waits and explicit waits. Let’s analyze them both!
Implicit waits are set via a timeout as a global setting that applies to every element location call in a testing session. The default value is 0, which means that if an HTML node is not found, the test will raise an error immediately. When an implicit wait is set, the driver will wait until the specified time value while locating an element before returning the error.
Note that as soon as the element is located, the driver returns the reference to the element. So, a large implicit wait value does not necessarily increase the duration of the testing session.
This is how you can define the implicit wait timeout in Java:
// set an implicit wait of up to 10 seconds on element location calls
driver.manage().timeouts().implicitlyWait(Duration.ofSeconds(10));
Check out the docs to see how you can set it in other programming languages.
Suppose you now want to select the #subscribe element on the page:
WebElement subscribeButton = driver.findElement(By.id("subscribe"));
Selenium will automatically wait up to 10 seconds for the #subscribe node to be in the DOM before raising the NoSuchElementException below:
Exception in thread "main" org.openqa.selenium.NoSuchElementException: no such element: Unable to locate element: {"method":"css selector","selector":"#subscribe"}
At the same time, implicit waits may not be enough to avoid flaky tests in Selenium. After selecting an element on the DOM, you generally want to interact with it. Well, keep in mind that an HTML node might be in a non-interactive state at a given moment. Therefore, it is crucial to wait for elements to be in the correct state before interacting with them. This is where explicit waits come in!
In Selenium, explicit waits are loops that poll the application for a specific condition to evaluate as true before exiting the loop and continuing to the next instruction. If the condition is not met before the specified timeout, the test will fail with a TimeoutException. Explicit waits are implemented through the WebDriverWait class. By default, WebDriverWait automatically waits for the designated element to exist in the page.
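To make the polling idea concrete, here is a minimal, Selenium-free sketch of what such a loop does. The PollingWait class, its until() method, and WaitTimeoutException are hypothetical names chosen for illustration, not Selenium APIs, and the real WebDriverWait offers more (such as ignoring selected exceptions while polling):

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.BooleanSupplier;

/** Simplified model of an explicit wait (hypothetical helper, not Selenium's API). */
final class PollingWait {

    /** Thrown when the condition does not become true before the timeout. */
    static final class WaitTimeoutException extends RuntimeException {
        WaitTimeoutException(String message) {
            super(message);
        }
    }

    /**
     * Re-evaluates the condition every pollInterval until it returns true
     * or the timeout elapses, whichever happens first.
     */
    static void until(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        Instant deadline = Instant.now().plus(timeout);
        while (true) {
            if (condition.getAsBoolean()) {
                return; // condition met: stop waiting immediately
            }
            if (Instant.now().isAfter(deadline)) {
                throw new WaitTimeoutException("Condition not met within " + timeout);
            }
            try {
                Thread.sleep(pollInterval.toMillis()); // pause before the next poll
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new WaitTimeoutException("Interrupted while waiting");
            }
        }
    }

    public static void main(String[] args) {
        // Simulated "element" that becomes ready on the third poll
        int[] polls = {0};
        until(() -> ++polls[0] >= 3, Duration.ofSeconds(2), Duration.ofMillis(50));
        System.out.println("Condition met after " + polls[0] + " polls");
    }
}
```

Because the loop returns as soon as the condition holds, a generous timeout only costs time when something is actually wrong, which is the key difference from a hard wait.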
For example, use an explicit wait to check that an HTML node is clickable before calling the click() method on it:
// wait up to 10 seconds (Selenium 4 expects a java.time.Duration here)
WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
// find the element
WebElement subscribeButton = driver.findElement(By.id("subscribe"));
// wait for the element to be clickable
wait.until(ExpectedConditions.elementToBeClickable(subscribeButton));
// click the element
subscribeButton.click();
The above example relies on an expected condition method. Expected conditions are special methods supported by the Java, Python, and JavaScript Selenium bindings. These allow you to check for conditions like:
- Element exists
- Element is stale
- Element is clickable
- Element is visible
- Text inside the element is visible
- Page title contains the specified value
Take a look at the ExpectedConditions class to see all expected conditions supported by Java.
⚠️ Warning: Do not mix implicit and explicit waits in a single test. This can lead to unpredictable wait times and potential flaky behavior!
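To see why the combined wait time becomes hard to predict, consider a simplified model. It assumes that every failed element lookup inside the explicit wait blocks for the full implicit wait, and that the explicit deadline is only checked between lookups. MixedWaitModel and worstCaseTimeoutMillis() are hypothetical names, and the 500 ms polling interval is an assumption used for the arithmetic:

```java
/** Back-of-the-envelope model of mixing implicit and explicit waits (not Selenium code). */
final class MixedWaitModel {

    /**
     * Estimates when an explicit wait gives up if every failed element lookup
     * blocks for the full implicit wait. All values are in milliseconds.
     */
    static long worstCaseTimeoutMillis(long implicitMs, long explicitMs, long pollIntervalMs) {
        if (implicitMs < 0 || explicitMs < 0 || pollIntervalMs <= 0) {
            throw new IllegalArgumentException(
                    "durations must be non-negative and the poll interval positive");
        }
        long elapsed = 0;
        while (true) {
            elapsed += implicitMs;       // a failed lookup blocks for the whole implicit wait
            if (elapsed >= explicitMs) { // the deadline is only checked once the lookup returns
                return elapsed;
            }
            elapsed += pollIntervalMs;   // pause before the next attempt
        }
    }

    public static void main(String[] args) {
        // Explicit wait of 15 s on its own
        System.out.println(worstCaseTimeoutMillis(0, 15_000, 500));
        // Implicit wait of 10 s combined with an explicit wait of 15 s
        System.out.println(worstCaseTimeoutMillis(10_000, 15_000, 500));
    }
}
```

Under these assumptions, an explicit wait of 15 seconds on its own gives up after 15 seconds, while adding a 10-second implicit wait pushes the failure out to roughly 20 seconds, which is longer than either setting suggests.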
Set the Right Timeouts
Selenium’s default timeout values are designed to cover most scenarios. Yet, they may be too short in some specific scenarios and lead to flakiness in your tests. The timeouts you should keep in mind are:
- Script timeout: Maximum time a JavaScript script executed with executeScript() can take before Selenium interrupts it. The default value is 30000 milliseconds (30 seconds).
- Page load timeout: Maximum time a page can take for the readyState property to signal complete while the driver is loading it in the current browsing context. The default timeout is 300000 milliseconds (5 minutes). If a page takes longer than that value to load, the test will raise a TimeoutException.
- Implicit wait timeout: Specifies the time to wait for the implicit element location strategy when locating elements. The default timeout is 0, which means the driver does not wait.
Considering the dynamic nature of modern web pages, bad timeout values are a primary cause of Selenium flaky tests. All it takes is a temporary slowdown on the local machine, or in a service the application relies on, for your tests to fail.
To configure the script timeout globally in Java, use the scriptTimeout() method:
// set the Selenium script timeout to 100 seconds
driver.manage().timeouts().scriptTimeout(Duration.ofSeconds(100));
Similarly, you can set the page load timeout with pageLoadTimeout():
// set the Selenium page load timeout to 10 minutes
driver.manage().timeouts().pageLoadTimeout(Duration.ofMinutes(10));
Again, set the implicit wait timeout with implicitlyWait():
// set the Selenium implicit wait timeout to 10 seconds
driver.manage().timeouts().implicitlyWait(Duration.ofSeconds(10));
Other General Tips
Here are some other considerations you should keep in mind to avoid flaky tests in Selenium:
- Prefer simple unit tests over long E2E tests: Due to their complexity, long end-to-end tests are inherently more prone to flakiness compared to simple unit tests. When testing an entire user flow in a web application, there may be a lot of moving parts. That means more chances for things to unexpectedly go wrong.
- Run your tests on the same configuration as your CI: If tests work locally but fail in the CI/CD pipeline, investigate differences between the two testing environments. Different operating systems or configurations can be the cause of flaky behavior.
- Handle spinners properly: When dealing with spinners, ensure you wait for their visibility before checking for invisibility. Checking directly for their invisibility before taking a particular action can lead to flaky results. This is because spinners may take time to get displayed on a page.
To better understand the last example, take a look at the snippet below:
// click on the "Load More" button
WebElement loadMoreButton = driver.findElement(By.cssSelector(".load-more"));
loadMoreButton.click();
// wait up to 10 seconds for a specific action to occur
WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
// visibility check required to avoid flaky results
WebElement spinner = wait.until(ExpectedConditions.visibilityOfElementLocated(By.cssSelector(".data-spinner")));
// wait for the spinner to disappear
wait.until(ExpectedConditions.invisibilityOf(spinner));
// deal with the newly loaded data...
As you can see, you should first check for spinner visibility before you check for invisibility. The reason is that the “Load More” button will disappear and be replaced by a spinner element dynamically. This will only be present on the page for as long as new elements are loaded and rendered. Without the visibility check, the above logic would be flaky.
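The race can also be reasoned about with a small tick-based model, independent of Selenium. SpinnerTimeline and its methods are hypothetical, and the tick values are made up for illustration. The spinner is assumed to be visible from appearsAt (inclusive) to disappearsAt (exclusive), and the new data is assumed to be ready once the spinner is gone:

```java
/** Tick-based model of a spinner's lifecycle (illustrative only, not Selenium code). */
final class SpinnerTimeline {
    private final int appearsAt;    // first tick at which the spinner is visible
    private final int disappearsAt; // first tick at which the spinner is gone again

    SpinnerTimeline(int appearsAt, int disappearsAt) {
        if (appearsAt < 0 || disappearsAt <= appearsAt) {
            throw new IllegalArgumentException("expected 0 <= appearsAt < disappearsAt");
        }
        this.appearsAt = appearsAt;
        this.disappearsAt = disappearsAt;
    }

    boolean isVisible(int tick) {
        return tick >= appearsAt && tick < disappearsAt;
    }

    /** Returns the first tick in [from, from + timeoutTicks] whose visibility equals wantVisible. */
    int waitFor(boolean wantVisible, int from, int timeoutTicks) {
        for (int tick = from; tick <= from + timeoutTicks; tick++) {
            if (isVisible(tick) == wantVisible) {
                return tick;
            }
        }
        throw new IllegalStateException("timed out waiting for visible=" + wantVisible);
    }

    /** Flaky strategy: only waits for the spinner to be invisible. */
    int resumeAfterInvisibilityOnly(int timeoutTicks) {
        return waitFor(false, 0, timeoutTicks);
    }

    /** Robust strategy: waits for the spinner to show up, then for it to go away. */
    int resumeAfterVisibleThenInvisible(int timeoutTicks) {
        int shownAt = waitFor(true, 0, timeoutTicks);
        return waitFor(false, shownAt, timeoutTicks);
    }

    public static void main(String[] args) {
        SpinnerTimeline spinner = new SpinnerTimeline(2, 7);
        System.out.println("invisibility only: tick " + spinner.resumeAfterInvisibilityOnly(20));
        System.out.println("visible, then invisible: tick " + spinner.resumeAfterVisibleThenInvisible(20));
    }
}
```

With a spinner that appears at tick 2 and disappears at tick 7, the invisibility-only check is satisfied at tick 0, long before the data is ready, while the two-step wait resumes at tick 7. If the spinner happens to render instantly (an appearsAt of 0), both strategies agree, which is exactly why the naive version passes on some runs and fails on others.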
How to Deal With a Flaky Test in Selenium
The techniques outlined above help minimize flaky tests, but you cannot really eliminate them altogether. So, what should you do when discovering a flaky test in Selenium? A good strategy involves following these three steps:
- Find the root cause: Run the flaky test several times and inspect it with a debugger to understand why it produces inconsistent results.
- Implement a solution: Fix the test logic to address the issue. Next, execute the test locally several times and under the same conditions that lead to the flaky results to ensure that it now works all the time.
- Deploy the updated test: Verify that the test now generates the expected outcomes in the CI/CD pipeline.
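The first two steps both rely on running the same test repeatedly under identical conditions. A minimal sketch of such a rerun loop is shown here in plain Java. FlakinessProbe, Result, and run() are hypothetical names, and the lambda in main() is only a stand-in for a real Selenium test; in practice you would more likely use the repeat feature of your test framework:

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper that reruns a check to expose inconsistent outcomes. */
final class FlakinessProbe {

    /** Pass and failure counts collected over repeated runs. */
    static final class Result {
        final int passes;
        final int failures;

        Result(int passes, int failures) {
            this.passes = passes;
            this.failures = failures;
        }

        /** A test is flagged as flaky when identical runs both passed and failed. */
        boolean isFlaky() {
            return passes > 0 && failures > 0;
        }
    }

    /** Runs the test the given number of times; a thrown exception or assertion error counts as a failure. */
    static Result run(BooleanSupplier test, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        int passes = 0;
        int failures = 0;
        for (int i = 0; i < runs; i++) {
            boolean passed;
            try {
                passed = test.getAsBoolean();
            } catch (RuntimeException | AssertionError e) {
                passed = false;
            }
            if (passed) {
                passes++;
            } else {
                failures++;
            }
        }
        return new Result(passes, failures);
    }

    public static void main(String[] args) {
        // Stand-in for a real test: fails on every third run
        int[] counter = {0};
        Result result = run(() -> ++counter[0] % 3 != 0, 9);
        System.out.println(result.passes + " passed, " + result.failures
                + " failed, flaky=" + result.isFlaky());
    }
}
```

Running the probe before and after a fix gives you a rough signal: a mix of passes and failures confirms the flakiness, while a long streak of passes after the change suggests, but does not prove, that the root cause is gone.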
For more information, refer to our guide on how to fix flaky tests.
Conclusion
In this article, you saw the definition of a flaky test and what implications it has in a CI/CD process. In detail, you explored some Selenium best practices to tackle the causes behind flaky tests. Thanks to them, you can now write robust tests that produce consistent results. Even if you cannot eliminate flaky tests forever, reducing them to the bare minimum is possible. Keep your CI pipeline safe from unpredictable failures!
Learn more about flaky tests:
- What is a Flaky Test? How to Fix Flaky Tests?
- How to Avoid Flaky Tests in Playwright
- Handling Flaky Tests in LLM-powered Applications