The Past, Present and Future of Browser Testing with David Burns, Core Selenium Contributor

In this podcast episode, I welcome David Burns, Head of OSPO at BrowserStack, chair of the W3C Browser Testing and Tools Working Group, and core contributor to the Selenium open source project. We talk about the past, present and future of browser testing, how to eradicate flaky tests, and why it’s important to invest more effort into the testing pyramid from the very beginning.

Key points:

David Burns on browser testing
WebDriver BiDi specification
How flaky tests are born
How to start new or migrate existing projects
Testing pyramid and how tools support it (or do they?)
Going BiDirectional with testing

Listen to the full conversation or read the edited transcript.

You can also get Semaphore Uncut on Apple Podcasts, Spotify, Google Podcasts, Stitcher, and more.

Edited Transcript

Darko Fabijan (00:02):
Hello, and welcome to Semaphore Uncut, a podcast for developers about building great products. Today, I’m excited to welcome David Burns. David, thank you so much for joining us.

David Burns (00:11):
Thank you very much for having me.

Darko Fabijan (00:14):
Great. Can you please just go ahead and introduce yourself?

David Burns (00:17):
So I’m David Burns. I am Head of Open Source at BrowserStack, and we look after core open source projects that are important to BrowserStack and make sure that they are sufficiently funded, with people or with money, depending on what the project is. We’re a small engineering group at the moment, looking to grow bigger. Our core projects are Selenium, Appium and Nightwatch.

In the past, I spent nine and a half years working at Mozilla as an engineer and then an engineering manager. So I’ve spent a long time in the open source space. I’m also involved in standards bodies: I’m the chair of the W3C’s working group for browser testing and tools. So anything that we want to standardize to work across the industry will probably come through our group at some point, and we’ll see where and how we can make it standard, which makes it easier for all users who want to do testing.

Darko Fabijan (01:23):
Thanks for this introduction. I’m sure that a lot of our listeners would love to hear about the space that you have spent a lot of time in, browser testing, as the web is definitely a dominant platform. A lot of people write tests that involve a browser on a daily basis, and debug them too. So I was hoping you could give us a tour of this space that a lot of us are touching in one way or another.

Darko Fabijan (01:53):
But I think that most of us dive in just as much as we need to get components in place. Is it ChromeDriver? A Selenium driver? What’s this? Keep going. But we do end up spending a lot of time on debugging. I think in most cases it’s our fault, but we also tend to blame it on the technologies that we are using. So please, if you can, elaborate on this topic.

On browser testing

David Burns (02:21):
So browser testing has been going for, oh, I don’t know, must be… I think Selenium turned 17 or 18 the other week, and it’s been important through the browser wars. It got started by Jason Huggins while he was at ThoughtWorks working on internal tooling, and then another ThoughtWorks engineer, Simon Stewart, created WebDriver, and he took a different approach to how you should control a browser.

David Burns (02:50):
So Selenium would be controlling it from the outside in, injecting JavaScript into the page to manipulate events as you need them, whereas WebDriver was doing it in a slightly harder, but better, way. It would inject events behind the scenes.

So not actually in the page, which meant you had to have deeper integration with the browser. For certain browsers, that was really simple. With Firefox, you just built an extension and you speak to the extension, which already has the internals. With Internet Explorer, you had to do weird and wonderful COM interfaces, which, if you’ve ever done COM, you know is painful, but it works and you always know roughly what you’re going to get.

David Burns (03:31):
And so we built up from there and then over the years, we just carried on improving the interfaces and things like that, until one day, one of the directors of engineering at Opera approached the project and said, you should make this a standard, we can see where you’re going. We think it’s important and you should make it a standard.

David Burns (03:55):
And so Simon came back to the project, and I was working at Mozilla at the time, and we had a chat and we said, yeah, totally, we should make this a standard. It makes perfect sense. That was in around 2011. And so we built up the standard to what it is now, which browser vendors have all implemented. There’s a new standard coming out slowly but surely, because the market has been changing since we did that work.

WebDriver BiDi specification

David Burns (04:25):
And so we need to update that, which leads to the WebDriver BiDi specification. BiDi means bidirectional. In the past, if you wanted to use Selenium, you would have to poll the browser: is this thing ready? And that can lead to flakiness.
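
The repeated "is this thing ready?" check described here can be sketched in plain Python. This is an illustration only, not Selenium's implementation: `poll_until` and `element_is_ready` are hypothetical names, and the readiness check is a stand-in for whatever a real test would ask the browser.

```python
import time


def poll_until(condition, timeout=5.0, interval=0.1):
    """Call `condition` repeatedly until it is truthy or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        # Sleeping between checks is where time is wasted (interval too long)
        # or the browser is hammered with requests (interval too short).
        time.sleep(interval)


# Hypothetical stand-in for a browser check: becomes ready on the third ask.
checks = {"count": 0}


def element_is_ready():
    checks["count"] += 1
    return checks["count"] >= 3


poll_until(element_is_ready, timeout=1.0, interval=0.01)
```

The flakiness comes from the two knobs: a timeout that is slightly too short for a slow CI machine makes the same test pass locally and fail in the pipeline.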

David Burns (04:43):
And it’s partly because people don’t fully understand how that side of things can work. And as you said, people only deep dive as far as they need to go. And sometimes you need that little bit deeper to understand where the problems are going to become later. And so that’s allowing us to have new APIs that are similar to newer products that are hitting the market and improve how people do their testing. And I think that’s the key part here: Selenium is willing to move with the times and improve.

Sometimes you need to dive a little bit deeper to understand where the problems are going to become later.
-David Burns, Head of OSPO at BrowserStack

How flaky tests are born

David Burns (05:17):
One of the key things, I think, touching on the subject of moving with the times, is that originally Selenium was built to be this underlying API. I think the best way to describe it is that it was supposed to be the database connection to your database, and then someone would build your ORM on top of it.

David Burns (05:38):
That’s how it was originally designed, how we wanted to do it. Fortunately or unfortunately, depending on who you speak to, some people like that: you could go, I’m just going to do this, and then build your own world on top of it. And we saw the Googlers doing weird and wonderful things. When I was at Mozilla, we did amazing things with Selenium, from doing basic performance testing, because we could control the whole environment where possible.

David Burns (06:10):
I’ve seen people doing games testing because they could control it down to the finest detail. And that was really, really cool. But as you mentioned earlier, and I like that phrase, people only dive down as far as they need to go rather than all the way, because tech people now have to deliver features as quickly as possible, and as quickly as possible tends to mean yesterday, not tomorrow.

David Burns (06:37):
And I think that mentality, especially being pushed from product managers and project managers, doesn’t allow people to understand where things are going. And so they’ll reach the zenith of their flakiness. Browsers are asynchronous by nature, to improve performance and battery management and things like that, because they can pause and do what they need to in different threads.

The mentality of shipping as quickly as possible doesn’t allow people to understand where things are going. And so they’ll reach the zenith of their flakiness.
-David Burns, Head of OSPO at BrowserStack

David Burns (07:05):
But people’s thinking is not asynchronous, especially when it comes to testing. They will go, I want to do step one, step two, step three. And these two worlds really struggle to meld, because they hit these points and then people are like, oh, but I’m not expecting the browser to do that. Why is it doing that? And then obviously it’s Selenium’s fault or Puppeteer’s fault or Cypress’s. Every single one of those tools has the exact same problems, because they’re all using browsers and it’s just not gelling with the mental models that people have. So it’s been a nice history, and where the market is going, I think, is quite cool.

How to start new or migrate existing projects

Darko Fabijan (08:25):
Maybe from a very practical standpoint: a long time ago, it seemed there was a default choice, and people just went with Selenium, and you didn’t have that paradox of choice. Now some products are just starting out and people are deciding, okay, what’s the stack that I should pick and run with? Or some people are actually migrating from one to the other. So what would you recommend?

David Burns (08:55):
I’ll start with new projects. I think the problem that I see a lot in the space at the moment is people have forgotten the basics when it comes to how you set up new projects and how you go about testing it.

And so people will knock up their MVP super quick, get it ready, go, yeah, it’s in production. Then they’ll throw some tests at it and then cry and cry some more because their new project’s not working when they try testing it. It’s not been designed for testing, and it’s not been designed for modularity, which testing enforces. And so we’ve got to the stage where I think the testing pyramid seems to have been thrown out the window.

Browser testing: the testing pyramid and how tools support it (or do they?)

David Burns (09:48):
So the testing pyramid, for listeners who might not know, is this concept where you design your tests as a large set of unit tests, a medium set of integration tests, and then a very small set of UI tests, or tests with tools like that, where you’re going from end to end.
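
As a rough illustration of that shape, a suite can be checked against the pyramid by comparing how many tests sit in each layer. The function name and the test counts below are invented for the example; they are not figures from the conversation.

```python
def is_pyramid_shaped(unit, integration, end_to_end):
    """Return True when each layer is strictly smaller than the one below it."""
    return unit > integration > end_to_end


# Made-up test counts, purely for illustration.
healthy_suite = {"unit": 800, "integration": 150, "end_to_end": 20}
top_heavy_suite = {"unit": 50, "integration": 40, "end_to_end": 300}

print(is_pyramid_shaped(**healthy_suite))    # True
print(is_pyramid_shaped(**top_heavy_suite))  # False
```

A check like this says nothing about test quality; it only flags a suite where the slow, flakiness-prone end-to-end layer has outgrown the layers beneath it.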

David Burns (10:10):
And you want to minimize those tests, because as you go up the testing pyramid, you potentially increase the flakiness. And I think, because people seem to have thrown that out, some of the other tooling out there compared to Selenium is gaining a lot of traction: people struggle with that initial core concept, and then they go, well, I’ve got this black box and this black box is solving this problem for me really well.

David Burns (10:43):
And then they try to scale it out, and anyone who’s done any scaling work will know that there is blood, sweat and tears behind every single bit of scaling. Ask anyone who’s touched Kubernetes recently: they will tell you all the problems with it, and they don’t necessarily know how to solve them, but everyone knows it.

David Burns (11:08):
And it’s because people go, oh, this is this cool thing. It’s nice for my CV. I can press this one button and it works brilliantly on my proof of concept, but it doesn’t necessarily work in the real world. And so when we look at the markets out there, there are a lot of tools that say that they’ve solved everything for you and they make big promises. I’m going to name some names and we’re not saying they’re better than Selenium or worse than Selenium.

David Burns (11:38):
I think they have their place definitely in the market. So Cypress is a nice black box is the way I look at Cypress when I compare them to Selenium and they go here is your gateway drug, I think is the best way… this will get you started. It has… you want to do API testing, we can do it. If you want to do browser testing, we can do it.

David Burns (12:02):
And then it’s, oh, you want to now run these all in parallel? Yeah, you’re going to have to pay for that part, but you know, you can totally do that and that’s fine. And it gives you this belief that they’ve solved the Selenium flakiness, and then you leave it for a while. And then Cypress releases new APIs around waiting, like, I’ve got to wait for this new object to appear on the page.

David Burns (12:30):
They’re learning from the Selenium lessons of the past. Selenium has had a wait library for, I don’t know, at least nine years, somewhere in that region. And the same story came from Puppeteer. Puppeteer is very good. They were initially Chrome-based, with Firefox support being added, and when I was at Mozilla I was leading the team that was adding it before moving away.

David Burns (12:52):
And again, there’s this belief that they’re solving the problem of waiting and flakiness, and then suddenly, oh, this real problem still exists. And they add in APIs to address it. Those are really cool tools, depending on your use case. And I think use case is where I don’t think a lot of people are putting a lot of effort into the way they think about the tools. Or I come back to Kubernetes: everyone’s talking about Kubernetes, can I release it quickly into some cloud provider and just press a button?

I think use case is where I don’t think a lot of people are putting a lot of effort into the way they think about the tools.
-David Burns, Head of OSPO at BrowserStack

David Burns (13:29):
And it’s, yeah, you can, but then is it going to scale? Is it going to have this? Is it going to still be secure? These are core problems that I think people easily forget about because they try to put certain amounts of developer experience in the tooling above all else. And I think that’s not a good thing.

Why it’s important to invest more effort into the testing pyramid from the very beginning

Darko Fabijan (13:49):
Something that I can contribute here is how things look on the ground. Speaking about, let’s say, the most successful customers that we have, it goes like this: you put in the absolute minimum effort in the beginning and, of course, you are not too involved with your testing pyramid.

And the distribution of tests and so on, because you’re just getting started, given the amount of money that you are dealing with, the number of developers and all that. And as you just said, can this be done for yesterday? And then there’s a pattern: an investment series comes, where they get something like 50 million, and then the development team jumps from 15 to 20 engineers to 100 or 150 in less than two years… a year, a year and a half.

Darko Fabijan (14:46):
Features need to continue flowing out. Yes, the test suite goes crazy because just a lot of people building web apps initially is very cheap. Okay, so I’m going to build my unit tests like this in this layer. And then I’m going to do this through the integration tests. And then, as you said, just this top part I’m going to cover if something goes from the hard drive all the way to clicking in the browser.

Darko Fabijan (15:17):
That clicking in the browser gets, as you presented, essentially abused in a way, used so much. On the other hand, I can say it is good for our business. People have to parallelize their test suite across 50 or 100 parallel jobs in order to get a feedback loop that is quick enough for them to survive.

Darko Fabijan (15:37):
And usually a part of the conversation is, okay, a step that needs to be put in place, because if you guys keep growing your test suite in this way, with these very expensive, also very flaky tests that are hard to scale, it is not going to run. So essentially I guess I repeated everything that you said, but I mean, we really have a lot of samples like that. The more successful the companies are, the bigger problems they have in terms of testing.

The more successful the companies are, the bigger problems they have in terms of testing.
-Darko Fabijan, Co-founder and CTO at Semaphore

David Burns (16:08):
Oh, definitely. One case I’ve seen recently, and I’ve seen this with a competitor too. One of the nice things about working in an open source program office is that if a competitor has one, I get to speak to my competitors regularly about things.

David Burns (16:25):
And they’re very open because open source people are very open. Obviously, no names are ever mentioned, but I’ve heard this internally and I’ve heard this from competitors: we’re dropping Selenium, we’re migrating to something else; six months later, we’re dropping that thing and we’re moving half the test suite back to Selenium. And this is where I say the testing pyramid seems to have been thrown out. And yes, I’ve worked… when I was at Mozilla I joined before B2G, the Boot to Gecko project, which became the Firefox OS project.

David Burns (17:02):
And Mozilla grew so fast. It was my first time ever seeing a company grow so fast. When I joined BrowserStack, we were at the beginning of that, shooting up. Yeah, it’s so easy to have problems when your code has scaled out, and when you are having these problems it’s like, oh, I’ll just move to this other tool and it’ll become my nirvana. And it rarely does.

I’ve moved around in the developer experience space, I spent all my time in developer experience, it was around tooling. So we made geckodriver and we made that really awesome and simple to use.

David Burns (17:44):
And you didn’t have to worry about upgrading. So earlier you were like, oh, what version do I have to use for this or that or the next thing? One of the key things that I wanted for geckodriver is that you upgrade when you need to, not because you have to. And then you see the Chromium-based browsers, where ChromeDriver needs to be upgraded with every version. And I’m sure if you are putting this into your CI system, that’s every four to six weeks nowadays. It’s a non-trivial amount of work.

Darko Fabijan (18:11):
And it breaks things.

David Burns (18:13):
Yeah. People just think you just upgrade one line and that’s it. Not always, because now you’ve got to deal with all the fallout of any potential bugs that it’s introduced, right?

David Burns (18:23):
And it’s a lot of work. It’s hard. And I think caring about the speed, as you said, right? Yes, it’s good for you when tools are abused because, as a company, you can sell more. Your solution is scalable so you can sell that better. But that also then potentially costs seeds. And so people go, oh, is this really worth it? And then it’s like, oh, it must be the tool space. And that’s what you were saying over there. It’s not necessarily. Past me has written really bad code. And past me is really bad at coding. And I go back and fix it all and make it better and faster.

David Burns (19:01):
And you know, that’s where I think a lot of the tooling is. And that’s where I’m trying to take the Selenium project: to try to help people fix those problems by being aware of them. But then as we move more into APIs similar to Playwright or Cypress or Puppeteer, there are other problems that we’re running into, because we are hitting these scalability problems again. Instead of being this HTTP socket that we’re constantly speaking to, it’s a WebSocket, and the WebSocket can speak back to you. And if you’re not careful, suddenly you’re doing a denial of service on your test servers by accident, and that’s going to be the next footgun. It’s like, oh, my test is not flaky anymore, but my server keeps dying. Like, why? You know what I mean?

Going BiDirectional with testing

Darko Fabijan (20:33):
Going from here, let’s speak maybe more about the future. You mentioned a standard implementation of a driver or, let’s say, a specification of the API. And you mentioned at that point something about being bidirectional and so on. Can you give us a tour of that, which might be useful for people?

David Burns (20:53):
So I alluded to it so far in my discussion, but WebDriver BiDi is the term we are calling it, and it’s about having Selenium be able to speak to a browser and the browser speak back to it. So one of the key things that a lot of people wanted, and we’ve seen this through Puppeteer and we’re seeing it with Playwright, Cypress kind of, but Cypress architecturally can’t do all of it, is that they want to know when certain things happen in the browser and then carry on.

David Burns (21:23):
Originally WebDriver was HTTP because it was simple and standard and it scales massively, right? That’s how BrowserStack, for example, is able to scale. We just need to be able to process HTTP requests when we get them, and process them properly, quickly and correctly, which is awesome.

David Burns (21:45):
Moving to the bidirectional way means the browser can speak back to you, which is great if you want to know, while your test is running, is there a JavaScript exception suddenly thrown? How can I handle that? Or you want to know when a console message has come out. So if you add a certain console message to help with flakiness, for example, you could handle that.

David Burns (22:07):
So we’re adding these new APIs around console logging and JavaScript exceptions coming back from the browser, rather than asking, was there a message for me? No, there wasn’t. I’ll wait a little bit and then try again. You don’t have to wait because you know it’s going to come towards you. So there are features like that. We are hopefully going to be taking the whole WebDriver project to sit on top of these new APIs.
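
The difference between asking and being told can be sketched with a small subscribe-and-notify model in plain Python. Everything here is hypothetical: `FakeBrowser`, its `on` and `emit` methods and the event names are stand-ins for illustration, not the WebDriver BiDi protocol or any Selenium API.

```python
from collections import defaultdict


class FakeBrowser:
    """Hypothetical stand-in for a browser that pushes events to subscribers."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event_name, handler):
        # Register a callback to run whenever `event_name` is emitted.
        self._handlers[event_name].append(handler)

    def emit(self, event_name, payload):
        # The "browser" speaks back: every subscriber is notified right away.
        for handler in self._handlers[event_name]:
            handler(payload)


browser = FakeBrowser()
js_errors = []
console_messages = []

# The test subscribes once instead of repeatedly asking for new messages.
browser.on("javascript_exception", js_errors.append)
browser.on("console_message", console_messages.append)

# Simulate the browser reporting things while a test is running.
browser.emit("console_message", "app ready")
browser.emit("javascript_exception", "TypeError: x is undefined")
```

After the two simulated events, `console_messages` and `js_errors` each hold one entry, with no sleep-and-retry loop involved.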

David Burns (22:32):
So when people use WebDriver, it’s still going to look the same. It’s still going to act the same. However, how it does it underneath is potentially going to change. And it opens up a whole new world of new features that we want and we can make things better, but sometimes moving it in that direction has its own problems.

David Burns (22:52):
It’s all about balancing the books. If you ask for console messages and you’ve put your debug version into production by accident, suddenly you’re going to get a thousand console messages a second, potentially, because you are just emitting “into this function, into this function”, you know what I mean? And so it can get really hairy very quickly.
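
One common way to protect a listener from that kind of flood is to cap how many events it keeps. The sketch below is an assumption about how a test harness might defend itself, not something the WebDriver BiDi specification or Selenium prescribes; `BoundedEventBuffer` is a hypothetical helper.

```python
from collections import deque


class BoundedEventBuffer:
    """Keep only the most recent `max_events` events and count the rest as dropped."""

    def __init__(self, max_events):
        self._events = deque(maxlen=max_events)
        self.dropped = 0

    def push(self, event):
        if len(self._events) == self._events.maxlen:
            # A full deque discards its oldest item on append; record that.
            self.dropped += 1
        self._events.append(event)

    def snapshot(self):
        return list(self._events)


buffer = BoundedEventBuffer(max_events=100)
for i in range(1000):  # e.g. a debug build logging on every function call
    buffer.push(f"console message {i}")
```

Here the buffer ends up holding the last 100 messages and reports 900 dropped, so memory stays flat and the drop count itself becomes a signal that something is logging far too much.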

Darko Fabijan (23:17):
Yeah. Clear. You spent a lot of time at Mozilla and if you can potentially give us your view on the current state of browsers and alignment and all that. What’s your insider view on browsers?

David Burns (23:51):
I’m only still in the browser world through standards nowadays. Everything I say I’m caveating with that. So it’s not knife edge, but I think Safari is getting some unfair press at the moment. And I really think this. One of my last groups that I managed at Mozilla was the interoperability team, and one of our key tasks was managing the W3C test suite internally.

David Burns (24:21):
So we would make a copy of the W3C’s test suite. Well, it’s not the W3C’s anymore, it’s called web-platform-tests. We would run those against Firefox and then try to process them. And we would upstream whenever there was an internal change and downstream when there were external changes. So we always wanted to keep that test suite up-to-date and we were doing pretty well. I haven’t looked at the numbers, but we were doing pretty well back then.

David Burns (24:47):
I think Firefox and Chrome were only 1% different. And that 1% is not APIs that a lot of people would care about. It was fine. Safari was lagging behind, but why I think there’s a bit of unfair pressure is that every time Google adds a new API, especially for something like Project Fugu, which is around making mobile features in browsers, so WebUSB, things like that, for example.

David Burns (25:19):
Safari might be lagging behind in adding those. And I think it’s okay because there are standards out there, but browsers can pick and choose what they want and they can make sure that it aligns with their markets, how they see their markets. I’ve always felt that the Safari team is under-resourced, especially for the amounts of money and size. I think they could definitely hire a lot more engineers, but driving that I think is important.

David Burns (25:48):
And so Safari is focused on battery life and has done fantastically. So if you’ve got a MacBook, open up Safari, use that as your day-to-day browser, and your battery life is amazing. You open up Chrome on the exact same machine and you might only get half to two thirds of the battery life. And it’s about the focus that’s different.

David Burns (26:13):
I don’t necessarily agree with all the Fugu APIs that are being added. Some of them are really good. Some of them are just like, why would you really need that? And the pretense behind it, I don’t always agree with, but that’s just opinions, but I think it’s unfair on the Apple engineers because they are doing really good things and they’re focusing on what’s good for Apple.

David Burns (26:33):
Firefox is focusing on what it can do to stay relevant in the market. Its engineering team got cut down quite considerably last year with layoffs. And so looking at that, they’re going to start falling behind, and unlike Apple, who are pre-installed on laptops, it’s going to be harder to keep up. But I still believe browser diversity is good. Everyone moving to Chrome I think is a bad thing. I use Firefox as my default browser, because I still believe in the mission of Firefox improving the web and the internet for everyone and going from there.

Darko Fabijan (27:11):
Now that you’ve shone a bit more light on it from a different perspective: all the browsers are coming from very different places and trying to solve different things, whatever is important for them. And yeah, it seems that from the specifications’ perspective, for the majority of the use cases that we generally have, they are aligned well enough to serve everything.

Darko Fabijan (27:32):
Yeah. Maybe a last question for me, you mentioned that you are focusing on open source strategy of BrowserStack and there are a number of projects involved here. Developers are always searching for something new to know about or potentially join. Can you talk a bit more about that? What are the projects you’re focusing on? Where do you see the help is needed?

David Burns (27:56):
So all the projects I mentioned always need help going forward, and we are eternally grateful to anyone who can contribute some time to open source. And I appreciate this is not a space that a lot of people can work in, or they don’t feel comfortable doing it. Sometimes people get scared going into open source. All of these projects I mentioned are really awesome about helping new contributors, especially people who might want to learn coding while contributing.

David Burns (28:24):
So anyone can help out here. The main one that I help out on is the Selenium project. Personally, I look after Python and JavaScript; those are my core language areas, and I help out there as well as being part of the technical leadership. So Selenium always needs help and there’s always bugs. I think we just passed 10,000 issues and pull requests on GitHub at the end of last week.

David Burns (28:50):
So it’s still a vibrant community. People are still using it, still raising bugs, still submitting pull requests, which is awesome. The next one is Appium, which is kind of like Selenium, but for mobile applications, and the community there is awesome. One of the nice things about the Selenium project and Appium is the people there genuinely want to help.

David Burns (29:12):
And I’ve seen in both communities people come in and go, I have no idea how to program, but I need to do this for my job. And people teach them basic programming and then help them through. And so I really love these projects because of how wonderfully they focus on community. The Appium project could always do with people. They’re working towards their version two. Selenium just released version four. They’re moving forward at a really good pace. And so if you have time, help out there.

David Burns (29:44):
The other one that we contribute to is a project called Nightwatch.js. We’ve seen a lot of BrowserStack customers using it, and it’s built on top of Selenium. It’s a test runner and it has all the things that you need. And so we’ve got some people helping out there, and that project is a really cool project. There’s some really nice people there again.

David Burns (30:06):
If you have some time help out there. github.com/nightwatchjs, come help there. I hear JavaScript is an important language for the web these days. So I’m sure one or two people might know it and if you can help and if you are a bit worried about getting started in any of these projects, I’m going to do a slight plug, but it’s more to help open source projects is that, twice a week, I do a Twitch stream on getting people started in open source and getting them feeling comfortable.

David Burns (30:40):
And so I don’t plug any projects there. I don’t plug BrowserStack. It’s just about getting people started. Come hang out, code with me or ask questions, and I’ll help you get started, because I truly believe in open source. It’s part of my DNA nowadays, because I’ve been in it for so long.

Darko Fabijan (30:59):
Great. Great. And I love hearing about this Twitch stream, and generally that whole new format is amazing. I now regret that 15 years ago something like that wasn’t around. I remember when I wanted to get into open source there were, like, mailing lists, where you cannot really connect a face to a name and to a project and all that. And I’m also super excited and happy to hear that you have those Twitch streams and people can join.

David Burns (31:26):
I get paid by BrowserStack, but I’m not promoting them. And so I’m happy to look at anything that fits into my area of expertise. I’m not going to help you with some CSS stuff because CSS is magic to me and anyone that can do it is amazing, but I can’t. But I will help you with how to build a browser and how to do testing, that side and developer experience and scaling systems. So anything and everything is welcome in that stream as long as it’s about helping people.

Darko Fabijan (31:58):
Amazing. Amazing. Great. Thank you so much for your time and good luck.

David Burns (32:04):
Awesome. Thank you.