Use AI to optimize Selenium & UI tests

The key challenge facing development teams: CI/CD is not enough to reduce delivery times

As teams adopt CI/CD to speed up delivery, they realize that CI/CD alone doesn’t magically improve delivery speed: the bottleneck for velocity is the tests.

Selenium (and UI) test suites take too long to run

Selenium test suites are typically pushed later in the pipeline because they take too long to run, which delays feedback to developers by hours, days, or even weeks. By the time a test fails, developers have lost the context of their changes, so fixing the issue takes longer. On top of that, teams blow through their testing budget running these tests.

UI test feedback is coming in late

Solution: generate dynamic UI subsets

Facebook pioneered the predictive test selection approach, and Launchable's Predictive Test Selection product has made the approach turn-key and accessible to every team. Launchable’s technology uses ML to predict which tests in your test suite are likely to fail based on the incoming commits. This pragmatic, risk-based approach adds another lever (test execution time) for reducing the cost of testing without compromising either quality or speed of delivery.

Benefit: UI tests shifted left for faster feedback time

You run the tests that are most likely to fail, packaged as dynamically generated subsets.
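
To make the idea of a dynamically generated subset concrete, here is a minimal sketch, not Launchable's actual implementation or API. It assumes some model has already assigned each test a failure probability for the incoming change; the `ScoredTest` type, the scores, the durations, and the greedy selection rule are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredTest:
    name: str
    failure_probability: float  # hypothetical model output, 0.0 to 1.0
    duration_seconds: float


def select_subset(tests, time_budget_seconds):
    """Greedily pick the tests most likely to fail that still fit the time budget."""
    ranked = sorted(tests, key=lambda t: t.failure_probability, reverse=True)
    subset, used = [], 0.0
    for test in ranked:
        if used + test.duration_seconds <= time_budget_seconds:
            subset.append(test)
            used += test.duration_seconds
    return subset


# Illustrative data only; real scores would come from a trained model.
tests = [
    ScoredTest("test_login", 0.02, 120),
    ScoredTest("test_checkout", 0.40, 300),
    ScoredTest("test_search", 0.15, 90),
    ScoredTest("test_profile", 0.01, 200),
]
subset = select_subset(tests, time_budget_seconds=400)
print([t.name for t in subset])  # ['test_checkout', 'test_search']
```

With a 400-second budget, the two highest-risk tests fit and the two low-risk tests are left for a later full run. Because the scores change with every commit, the subset changes too.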

Optimize for faster feedback

These subsets can then be moved earlier in the pipeline and run either on a fixed cadence (multiple times daily) or on a per-commit basis. If there is a likely issue, it will be caught in minutes, while the developer still has the context to fix the problem. The net result is faster feedback and fast, iterative development cycles.

Post-merge UI tests shifted left, bringing faster feedback to devs

Intelligent UI tests shifted left for faster feedback

Optimize for testing machine costs

If the goal is to optimize costs, subsets (less aggressive than those in the previous sub-section) can replace the nightly runs, and the full suite that used to run nightly now runs weekly. In this case, the team reduces machine costs while trusting the nightly subset runs to catch enough of the issues.
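
The cost trade-off comes down to simple arithmetic. The sketch below uses made-up numbers (a 10-hour full run and a subset that takes half as long) and assumes the weekly full run happens in addition to the seven nightly subsets. The function name and parameters are hypothetical.

```python
def weekly_machine_hours(full_run_hours, subset_fraction, nights_per_week=7):
    """Compare nightly full runs against nightly subsets plus one weekly full run."""
    before = full_run_hours * nights_per_week
    after = full_run_hours * subset_fraction * nights_per_week + full_run_hours
    return before, after


# Illustrative numbers only.
before, after = weekly_machine_hours(full_run_hours=10, subset_fraction=0.5)
print(before, after)  # 70 45.0
```

Under these assumptions, weekly machine time drops from 70 hours to 45. A smaller `subset_fraction` saves more but leans harder on the weekly full run to catch what the subsets miss.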

Launchable thus provides a flexible approach in which teams can trade off speed and cost throughout the application lifecycle to suit their organizational needs.

See how one of our customers shifted Selenium tests left to drastically improve their feedback times.

FAQs

How does this work with code coverage?

This is a complementary approach, with one caveat: you no longer run a code coverage report on every test suite run. Instead, code coverage reports run at a cadence that depends on which test suites are being run and what makes sense for your organization.

For example, if you optimized nightly integration test runs, you might run the code coverage report as part of the end-of-week run.

You are asking me to think differently about testing. Is anybody else doing this?

Indeed, we are asking you to rethink testing and upgrade it for the new world of AI. Facebook and other leading companies do this in-house, but that requires a dedicated team of ML experts. A number of companies are using Launchable’s turn-key approach.

What about slippage or tests that you don’t catch?

Great question! We are not saying you should ship bugs to production!

The answer is “it depends on your tolerance for risk”. For example, Sai (in the earlier use case) achieved a 90% reduction in execution time at 90% confidence. In other words, he was okay with letting 1 in 10 test failures slip past this stage because it runs relatively early in the pipeline, and he has tests later in the pipeline that catch the slippage.

Typically, customers do one or more of the following:

Have a defensive test run later in the pipeline. This can take the form of a “full run” of the same test suite later in the pipeline.
Depend on later test suites to catch the issue. Thus, their approach is to bring subsets in earlier in the pipeline rather than later.
Couple this with practices like feature flags (applies to SaaS companies) so that they can roll back potential issues.
Be fairly conservative on the time/confidence trade-off in the adoption phase of Launchable. For example, it is not uncommon for companies to start with a 20-30% reduction in test times and slowly ramp that up over time.
Take the release phase into consideration when making the trade-off. For example, for packaged software, it might be easy to be very aggressive early on and dial the reduction back as the release approaches.

What do you mean by a pragmatic risk-based approach to testing?

Not every test is likely to fail on a given commit, but we still run them all because we don’t know which tests will fail. With advances in AI/ML, we can now predict which tests are likely to fail with a reasonable degree of confidence. If your team is okay with that degree of confidence, you can radically reduce execution times. How significant is the reduction? 40-80% is typical, and sometimes even more. Sai was able to cut execution time by 90% at 90% confidence for 75 developers. We give teams a dial to make a data-driven decision on the trade-off between confidence and execution time.
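
One way to picture the dial is to replay past runs and ask how often a time-limited subset would have flagged a failing run. The sketch below is an illustration under stated assumptions, not Launchable's method. It treats "confidence" as the fraction of historical failing runs in which the subset includes at least one failing test, and the scores, durations, and `confidence_at` helper are hypothetical.

```python
def confidence_at(time_fraction, historical_runs):
    """Fraction of past failing runs where the time-limited subset hit a failure.

    Each run is a list of (predicted_score, duration_seconds, failed) tuples.
    """
    caught = failing = 0
    for run in historical_runs:
        if not any(failed for _, _, failed in run):
            continue  # passing runs do not count toward confidence
        failing += 1
        budget = time_fraction * sum(duration for _, duration, _ in run)
        used = 0.0
        # Walk tests from highest to lowest predicted score until the budget runs out.
        for _, duration, failed in sorted(run, key=lambda t: t[0], reverse=True):
            if used + duration > budget:
                break
            used += duration
            if failed:
                caught += 1
                break
    return caught / failing if failing else 1.0


# Illustrative history: three runs of four 10-second tests each.
runs = [
    [(0.9, 10, True), (0.2, 10, False), (0.1, 10, False), (0.05, 10, False)],
    [(0.8, 10, False), (0.6, 10, True), (0.1, 10, False), (0.05, 10, False)],
    [(0.7, 10, False), (0.3, 10, False), (0.2, 10, False), (0.1, 10, False)],
]
for fraction in (0.25, 0.5, 1.0):
    print(fraction, confidence_at(fraction, runs))
```

In this toy history, running 25% of the suite catches one of the two failing runs, and running 50% catches both. Plotting such points over real history gives the curve a team would use to pick its own operating point.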

In our opinion, testing needs to evolve to catch up with the incredible advancements in AI. The technology is ready and here now. Are you?

What do you mean by “In-place” reduction of tests?

Simply put, we ask you to reduce the execution time of a particular test suite wherever it currently runs (“in-place”) in the pipeline, as opposed to shifting the test suite left in the pipeline.

The Launchable Test Intelligence Platform also offers Test Insights to find inefficient test suites and Test Notifications to speed up feedback to developers.

Works with your existing tools, languages, and processes

Results in weeks—no months-long DevOps transformations

Launchable's ML-based approach means it can work with existing languages and tools. Developers start seeing their dev cycles go faster without changing their processes.