At Wonolo, we use data to help make decisions. But sometimes very specific data is required, and one source of that data is A/B testing. A/B testing is an experimentation process in which two versions of a treatment are shown to different segments of end users over the same time period. The result tells you which treatment impacts the business more, and for a Product Manager it is an invaluable tool.
The challenge I want to discuss is not how to run an A/B test, but whether you as a Product Manager (PM) should run one at all, because testing for the sake of testing is not an effective approach to the job of product management.
By my count there are at least 10 factors that should be assessed before determining if an A/B test is warranted. Below I’ve outlined a tactical guide for Product Managers to use to help make this decision.
1. Are the fundamentals sound?
An A/B test is a way to control various elements so that you can evaluate the impact on business metrics. But if the business metrics aren't established, or if the analytical instrumentation is not sound, then running an A/B test is moot. Like trying to build a house on quicksand, it's a bad idea. Instead, focus on base instrumentation and on ensuring that the data for the feature/product/company is reliable. For a deeper look at analytical instrumentation, check out my previous article Basic Tips and Advanced Tricks for Implementing Client-side Analytics.
2. What stage company are you at?
This actually matters quite a lot. If you’re a Product Manager at a large company like Facebook you probably need to A/B test everything. There are likely many tests concurrently running so isolating the impact of a change matters greatly to the organization. But if you’re the first Product Manager at an early startup with one engineering team, you don’t really have time to A/B test everything. Instead, focus on speed of shipping to validate the main idea in-market.
A variation of this split is being on a small team at a large company, working on Product-Market-Fit (PMF) expansion efforts. Spending time A/B testing a change should be heavily scrutinized versus simply shipping the feature, because for PMF expansion you should optimize for speed to market to test the core thesis.
3. How expensive is the A/B test?
By “expensive” I am referring to engineering time, although certain tests have monetary impact that can also be measured. From this standpoint, you as the Product Manager need to weigh the benefits of the A/B test against the additional time it takes to build and run the test. Counterintuitively, the biggest changes are the most problematic to run as A/B tests due to the increased build time. As an example, we at Wonolo are changing the navigation of the Wonoloer app from a left-hand “hamburger” menu to a “bottom nav” layout. We discussed internally that the change could impact engagement on some pages, but A/B testing it would have doubled the timeline to ship the feature. Since the newly exposed navigation items will be one tap away instead of two after the change, we decided to ship without an A/B test given the timeline impact.
4. What is the core value proposition to your end-user?
A/B testing an in-line “read more” button that expands content can make sense for one company but not another. The way to decide is to take a step back and understand the core value proposition you offer your users. If you work at Buzzfeed, your core value proposition is around "reading and engagement": users come to the platform to read and, hopefully, share articles. Since a "read more" button interacts with that value proposition, understanding its impact is critical to the business. If you work at Salesforce, however, this is probably not the case. Salesforce's core value proposition is around "relationship management", so a "read-more" button likely doesn't impact it. An alternative way to measure the impact is to compare data before launch against data after launch, also known as “pre/post” analysis.
5. Do you need to know “how much”?
If you need to know the specific impact of a change, A/B testing is a great tool to use. When I was a Product Manager at Eventbrite, I wanted to know the impact of social indicators on event consideration. The team hypothesized that showing end users which of their friends were attending an event would increase event consideration. Since we wanted to know the specific impact on consideration and purchases, we determined an A/B test was the most effective tool: it let us understand “how much” impact the change had on the conversion funnel.
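To make “how much” concrete, here is a minimal sketch of how the lift between a control and a treatment arm can be quantified. This is not Eventbrite's actual analysis, and the conversion numbers are hypothetical; it uses the standard two-proportion approximation for a rough 95% confidence interval.

```python
from math import sqrt

def lift_with_confidence(conv_a, n_a, conv_b, n_b, z=1.96):
    """Difference in conversion rate between treatment B and control A,
    with an approximate 95% confidence interval (z = 1.96)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    diff = p_b - p_a
    # Standard error of the difference between two independent proportions
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, (diff - z * se, diff + z * se)

# Hypothetical numbers: control converts 480/10,000; treatment 540/10,000
diff, (low, high) = lift_with_confidence(480, 10_000, 540, 10_000)
```

If the interval excludes zero, you can be reasonably confident the change moved the metric; either way, `diff` is the answer to “how much.”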
6. Do you need to account for Seasonality?
As a Product Manager you are constantly evaluating timelines and cross-team coordination. Since A/B tests add time to rolling out updates, this tool sometimes doesn't fit into a jammed roadmap. Your next best option will likely be a pre/post measurement to determine impact. But if your business is seasonal, pre/post analysis can lead to incorrect takeaways, as the fundamental business metrics could rise or fall due to the seasonal swing alone. The best way to account for seasonality is to A/B test.
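If you are forced into pre/post anyway, the seasonal trap can at least be illustrated with a toy year-over-year adjustment. This is a simplified sketch with made-up numbers, not Wonolo data: compare this year's post/pre ratio against last year's ratio over the same calendar window, so the seasonal swing cancels out.

```python
def seasonally_adjusted_lift(pre, post, pre_last_year, post_last_year):
    """Ratio of this year's post/pre growth to last year's growth over
    the same calendar window; values near 1.0 mean the move was seasonal."""
    return (post / pre) / (post_last_year / pre_last_year)

# Hypothetical: bookings jumped 1,000 -> 1,300 after launch, but last year
# the same window jumped 900 -> 1,170 with no feature change at all
lift = seasonally_adjusted_lift(1_000, 1_300, 900, 1_170)
```

A naive pre/post read would claim a 30% lift; the adjusted figure comes out to 1.0, meaning the change did nothing beyond the seasonal swing. An A/B test avoids this problem entirely because both arms experience the same season.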
7. Is it an incremental change?
If you’re making a small change to an existing feature, A/B testing is a great way to measure the impact. At Wonolo we had a thesis that the ordering of steps in our signup funnel was not optimal. Since it was the signup funnel, understanding how even a small change could affect it was critical. By reordering the steps we found a 4% increase in funnel conversion. The reality is that a change like this feels small to the people building the feature, but to a new user it is the very element they use to make their decision. So small, incremental changes are actually good candidates for A/B testing.
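One caveat with incremental changes: small lifts require large samples to detect. As a back-of-envelope sketch (a standard two-proportion approximation; the baseline rate and lift below are illustrative, not our actual funnel numbers), you can estimate the traffic each arm needs:

```python
from math import ceil

def sample_size_per_arm(baseline_rate, relative_lift,
                        z_alpha=1.96, z_beta=0.84):
    """Rough per-arm sample size to detect the given relative lift at
    ~95% confidence and ~80% power (assumed z values)."""
    p = baseline_rate
    delta = p * relative_lift  # minimum detectable absolute difference
    return ceil(2 * (z_alpha + z_beta) ** 2 * p * (1 - p) / delta ** 2)

# Hypothetical: 30% signup completion, hoping to detect a 4% relative lift
n = sample_size_per_arm(0.30, 0.04)
```

That works out to roughly 23,000 signups per arm, which is why low-traffic products often can't A/B test tiny tweaks at all; the test would run for months before reaching significance.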
When NOT to A/B test
8. Is the outcome certain?
By definition, an A/B test measures the differences between the treatment and control groups. But if the decision to roll out a feature has already been made, it might not make sense to A/B test it. At Wonolo, we received feedback that our customer portal was not intuitive or easy to use, and reviewing it revealed several foundational user experience (UX) issues. Since we were certain about those foundational changes, we opted not to A/B test the updates and instead looked at engagement pre/post to measure effectiveness. If we encountered any issues, the plan was to “roll forward” and ship further updates to address them. We shipped the change, got feedback, and continued to iterate. Skipping the A/B test minimized the time it took to roll out the update and move on to the next feature.
9. Are you measuring performance?
If you are trying to measure load times or render times, you might think to yourself, “hey, this is small and I need to know the difference, so I should A/B test.” But that is not always the case. Depending on your implementation, the mere fact that you have set up an A/B test can itself cause performance degradation. In these cases it is much better to measure the rollout a different way, such as with a canary test, where a small slice of random traffic is sent to the new, performance-boosted route.
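A canary rollout can be sketched in a few lines. This is purely illustrative: the handler names and in-process routing are assumptions, since real canaries usually live at the load-balancer or feature-flag layer rather than in application code.

```python
import random
import time

def canary_route(handler_current, handler_new, canary_fraction=0.05):
    """Wrap two request handlers so a small random slice of traffic hits
    the new code path, recording latency per variant for comparison."""
    latencies = {"current": [], "new": []}

    def handle(request):
        use_new = random.random() < canary_fraction
        handler = handler_new if use_new else handler_current
        start = time.perf_counter()
        result = handler(request)
        latencies["new" if use_new else "current"].append(
            time.perf_counter() - start)
        return result

    return handle, latencies
```

Because only a sliver of traffic touches the new route, a latency regression hurts few users, and the recorded timings can be compared directly against the baseline without the overhead of a full experiment framework.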
10. Is the business fighting for its life?
If things are not going well for the business, speed is of the essence. So even though you “really need to know” if something works, the additional time an A/B test requires means you should rely on pre/post analysis to cut down time-to-impact. My old boss Casey Winters, Chief Product Officer @ Eventbrite, has a great article about “firing every bullet” from a product perspective when things aren’t going well, which illustrates the stress that potential failure puts on a company. Since you have to move as fast as possible, A/B testing is not the way to go; optimize for speed.
Summary
A/B testing is one of the best tools a Product Manager can use to deal with uncertainty about an initiative, but sometimes it doesn’t make sense to run one. These are the 10 questions that I ask myself, and have my team ask themselves, to determine whether an A/B test is warranted. I am sure I am missing some, so if you have a favorite question/answer rubric, please add it in the comments below.