The Unofficial Shopify Podcast

No Best Practices: 7 Surprising Split Tests

Episode Summary

Questioning our conversion assumptions.

Episode Notes

Most business owners would agree that data-driven decisions are preferable, so why then are we all overlooking split testing when making design decisions? In this episode, we talk through seven interesting split tests and the implications of their results. With one big caveat: there are no best practices...

Show Links

Sponsors

Never miss an episode

Help the show

What's Kurt up to?

Episode Transcription

The Unofficial Shopify Podcast
6/14/2022

Paul Reda: I was gonna talk about something.

Kurt Elster: Was it data driven decisions?

Paul Reda: That’s the only thing I talk about.

Kurt Elster: So, you prefer data driven decisions.

Paul Reda: I mean, I don’t do anything without checking the data first.

Kurt Elster: You check the data?

Paul Reda: Yeah. I check the data.

Kurt Elster: So, you never just go with your gut?

Paul Reda: No.

Kurt Elster: No?

Paul Reda: No. That’s why I built my info board, so I can look at it.

Kurt Elster: You stand in your hallway observing your info board?

Paul Reda: Yeah. I know the exact time it takes to drive to Old Orchard every day to get to work.

Kurt Elster: Whoa. Look at you, George Jetson. Living in the future. Estimating your commute. So, I’m asking about data driven decisions because this is a thing everybody says. At least in our space I hear this a lot, is, “Oh, I prefer… I want data driven decision making. What does the data say?” And then, of course-

Paul Reda: And then, of course, they just do what they want in their gut. They look at the data and then they’re like, “Oh, actually if you look at it this way, my thing was right, so I was right.”

Kurt Elster: Go with my gut.

Paul Reda: Every time.

Kurt Elster: Yeah, and that’s what we see consistently except in one instance, and when you split test stuff. For whatever reason, split testing is the particular kind of data that people just seem to really respect.

Paul Reda: Well, I think it’s for the same reason why we went into this business, was because our old business of working with marketing agencies was entirely feelings based, and then when we started working on eCommerce, we could at least go, “Look at this. Now there’s more money. We did good, right?” And it’s hard for them to argue against more money, although we have had people do that.

Kurt Elster: Yeah. The issue is web design, design in general, is really quite a subjective thing until you start tying it to KPIs, key performance indicators like revenue, and conversion rate, and average order value, and then split testing. Now we can compare the design’s impact on those KPIs.

Paul Reda: Yeah, and Google Optimize, which we will be talking about quite a bit here, just literally gives you a thing and says, “This one is better,” and tells you the percentage amount it is better. You really can’t argue with it.

Kurt Elster: Yeah. It’s great. It really is. It’s the coolest thing. And the reality is oftentimes what you’re doing really is just a best guess.

Paul Reda: Yeah. You think what might be better. You’re just kind of… You’re like, “Well, is this better, maybe? I don’t know.”

Kurt Elster: And so, you could tell me all day that you love data driven decisions, but the reality is in respect to web design and Shopify themes, you’re going with your gut unless you’re split testing it. And I get that it’s not practical to split test everything and sometimes you just don’t feel the need to do it. That’s fine. But I’m making a point here.

Paul Reda: But my gut’s really smart, though. I’ve been doing this for a while.

Kurt Elster: I have a good sense of what’s truthy with my gut. So, today on The Unofficial Shopify Podcast, we are discussing split tests. We’re gonna go through several split tests and what those results were. Real split tests from client stores. And talk through the hypothesis and what the meaning is there, but ultimately the thing I want you to take away from this is that split testing is fun, it is easy, you should try it. Any dumb dumb could do it, because I figured it out, and I got a C in quantitative analysis in college. I’m not talking a high C, either. That was a barely get my degree C.

But you know what? Cs get degrees, so I made it.

Paul Reda: It was a gentleman’s C.

Kurt Elster: A gentleman’s C. I like that. I’m your host, Kurt Elster.

Ezra Firestone Sound Board Clip: Tech Nasty!

Paul Reda: Tech Nasty.

Kurt Elster: And I am joined by Mr. Paul Reda.

Paul Reda: I’ll just make the sounds now. Just tell me. Yell out the sound when you hit the button and I’ll make it.

Kurt Elster: Well, the joy of split tests is making more money. Cha-ching.

Paul Reda: Well, you gotta say cha-ching.

Kurt Elster: Oh, sorry. Cha-ching.wav.

Paul Reda: That’s it now, we just say whatever. We just say file name dot wav.

Kurt Elster: So, we’ve mentioned this before, but the annoying part about sharing split tests and sharing the results is you’re always gonna get the person who’s like, “Well, actually, that’s not statistically significant.” Or, “I’m just asking the question. Is it statistically significant?”

Paul Reda: Now, that is a legit question because most people are dumbasses.

Kurt Elster: You probably, like, wildly skewed the results for funsies. And so, I think it is simultaneously a valid but pointless question to ask because the truth is it really doesn’t matter if it’s statistically significant or not. It could have 99% statistical significance but that would be in that store with that audience at that time. It absolutely will not have any bearing in your own store. So, when you look at other people’s split test results, stop treating those as best practices. They’re not. Those are fabulous split test ideas to try in your own store.

Paul Reda: Okay, but when you’re doing the split test, you should care about statistical significance.

Kurt Elster: I meant-

Paul Reda: So you’re not like, “I ran it for 4 hours. It’s up 200%!”

Kurt Elster: Okay. Yes, absolutely.

Paul Reda: Okay, good.

Kurt Elster: Yes. For your own test, you should care if it’s statistically significant or not. Because yeah, when you start a test you’ll see the initial result will just bounce around wildly. And then as it gains data it’ll level out and have a saner result. And sometimes the result isn’t what you wanted. And sometimes the result just is it doesn’t matter, in which case okay, now go with your gut and whichever you prefer.
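
The bouncing around Kurt describes is just sampling noise shrinking as data comes in. A back-of-the-napkin sketch (the 2% conversion rate and session counts here are made up for illustration, not from any client store):

```python
from math import sqrt

def conversion_margin_of_error(rate: float, sessions: int) -> float:
    # Approximate 95% margin of error on an observed conversion rate:
    # 1.96 standard errors of a binomial proportion.
    return 1.96 * sqrt(rate * (1 - rate) / sessions)

# A true 2% conversion rate: two days into a test vs. two weeks in.
early = conversion_margin_of_error(0.02, 200)     # ~±1.9 points
late = conversion_margin_of_error(0.02, 20_000)   # ~±0.2 points
```

At 200 sessions the margin of error is almost as big as the conversion rate itself, so the observed number can read anywhere from roughly 0% to 4%; at 20,000 sessions it's a tenth of that, which is why early results swing wildly and later ones level out.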

Paul Reda: Yeah, sometimes a split test is a real Tuffy Rhodes.

Kurt Elster: What is a Tuffy Rhodes?

Paul Reda: Tuffy Rhodes was a guy on the Cubs who some year in the ‘90s, I don’t remember, hit three home runs on opening day. So, obviously that means Tuffy Rhodes is going to hit over 400 home runs for the year.

Kurt Elster: You want him on your fantasy team.

Paul Reda: He did not. He hit maybe like 10 the whole year.

Kurt Elster: So, it was a fluke.

Paul Reda: He got sent back down in August. But he hit three home runs on opening day. Clearly, winning the MVP.

Kurt Elster: And so, this-

Paul Reda: End that split test now.

Kurt Elster: This is a mistake we’ve seen people make with split testing. They start the test, the result immediately skews one way or the other hard, and probably the way they didn’t want it to, and so like 48 hours in, they just quit it and go, “Well, that one didn’t work.”

Paul Reda: Well, and they go, “Oh, well half the people are getting this horrible outcome and that cuts my conversion rate in half. Now half my audience is having half the conversion rate. End that shit now. I’m losing money.”

Kurt Elster: Yeah. They panic.

Paul Reda: Yeah.

Kurt Elster: And so, there can be opportunity cost that happens with split testing. But what’s the opportunity cost of missing out on a great test that could have potentially increased your overall revenue for the rest of the year if you just waited it out?

So, with that caveat, these tests, we have waited until they’ve achieved statistical significance in Google Optimize, and we have always run them two weeks. Two weeks I really think is the minimum that you could get and still have something that is reasonably statistically significant.
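
Google Optimize does this significance math for you (it actually uses Bayesian modeling under the hood), but the classic frequentist version of the same check is a two-proportion z-test, and it conveys the idea. A minimal sketch; the order and session counts in the test below are invented, not real client numbers:

```python
from math import erf, sqrt

def split_test_significance(orders_a, sessions_a, orders_b, sessions_b):
    # Two-proportion z-test: is variant B's conversion rate really
    # different from control A's, or is the gap plausibly just noise?
    p_a = orders_a / sessions_a
    p_b = orders_b / sessions_b
    pooled = (orders_a + orders_b) / (sessions_a + sessions_b)
    se = sqrt(pooled * (1 - pooled) * (1 / sessions_a + 1 / sessions_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the normal CDF, built on math.erf.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value
```

With a made-up 3.0% vs. 3.6% conversion over 10,000 sessions each, z lands around 2.4 and the two-sided p-value under 0.05, the kind of result a tool would call a probable winner; after only a day or two of traffic, the same lift usually wouldn't clear that bar.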

Paul Reda: And I know your reasoning behind that, because I think you think of it as pay cycles being on two weeks.

Kurt Elster: Yes. Yeah, you get paid every two weeks.

Paul Reda: I haven’t seen that in the numbers, but definitely the amount of weekend in your sample size is very important.

Kurt Elster: Also true.

Paul Reda: You want as little weekend in your sample as possible, or at least you don’t want two. You don’t want to start it on Friday and end it 10 days later on the next Monday, so you have two sets of weekend days in there and only one week in the middle of it. Because that’s just too much weekend.

Kurt Elster: Yeah. Everyone knows you do your shopping by procrastinating at work.

Paul Reda: It’s actually true.

Kurt Elster: Yes.

Paul Reda: On the weekends, people actually-

Kurt Elster: That’s not a joke.

Paul Reda: That’s not a joke. People live their lives on the weekend. They don’t buy as much. All stores, sales go down on the weekends.

Kurt Elster: I’m living for the weekend.

Paul Reda: Yeah.

Kurt Elster: And the other thing to consider, of course, is how you segment your tests. Like when I get a test that really throws me, we want to run that test again and we also want to run it segmented. What is this? How does this test look at new versus returning customers and mobile versus desktop? And then that can often lead to opportunities for personalization, where like maybe this one widget works better on mobile versus desktop, or it has a negative effect on new visitors but a positive effect on returning visitors. Well, we can use ongoing personalization in Google Optimize to just make that a permanent change for only that segment.

Paul Reda: All right. I feel like we’ve jumped straight to step four here, so let’s reel it back a little bit. Kurt, what is split testing?

Kurt Elster: Excellent question. So, split testing, when you’re making design changes to a site, you’re going with your gut. You have no idea what the impact of things are. So, if I’ve got… I have a bunch of press logos on my homepage. Does that have any impact at all, good or bad? I have no idea. And so, the way to try, the way to figure it out, is for half of the visitors to your site, leave it as is. That’s our control. And for the other half, get rid of the thing and see what impact does that have on our key performance indicators. And using… There’s a lot of split testing tools. The one we’re using is Google Optimize because it is free to use and it is easy to use because it plugs into Google Analytics, which you are very definitely probably already using in Shopify.

And so, Google Optimize, and then it’ll also try and save you from yourself on the statistical significance. It’ll tell you the probability.

Paul Reda: It will tell you the percentage chance that this choice is better.

Kurt Elster: And it will also watch out for you, where it’ll try and stop you from doing boneheaded things. It’ll be like, “Hey, maybe don’t edit this test a week into it and skew your results.”

Paul Reda: Yeah. Every time you touch the test, it’s now actually a new test.

Kurt Elster: Yeah. It’s like you try and end it early, it’s like, “Please don’t do this.” It will try and save you from yourself. So, I like Google Optimize a lot, but there are plenty of other tools that will work, as well.

Paul Reda: Yeah, so what we’re doing is we’re making a change on our store. We’re then splitting the audience in half. Half the audience sees the change, half the audience doesn’t see the change, and then we see which split of the audience performs better, makes more money, and then if the change ends up making you more money, you now apply that change to everyone and lather, rinse, repeat. Do it again. Do it again. Do it again. Add a weird line of text that’s like, “Great at parties,” above the add to cart button. Oh-
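
The mechanics of that 50/50 split can be sketched in a few lines: hash the visitor ID so the same person always lands in the same bucket. This is a tool-agnostic illustration, not how Google Optimize does it internally, and the test name is made up:

```python
import hashlib

def assign_variant(visitor_id: str, test_name: str = "hero-image-test") -> str:
    # Hashing (test name + visitor ID) gives a stable, roughly uniform
    # bucket: the same visitor always sees the same version, and the
    # audience splits about 50/50 overall.
    digest = hashlib.md5(f"{test_name}:{visitor_id}".encode()).hexdigest()
    return "control" if int(digest, 16) % 2 == 0 else "variant"
```

Keying the hash on the test name means the same visitor can land in different buckets across different tests, so one long-running test doesn't bias the next.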

Kurt Elster: Great at parties.

Paul Reda: Yeah. Saying it’s great at parties actually gives me 5% more money than not having it there. Great.

Kurt Elster: And there’s other ways to run split tests. There’s nuance to it. And people who are really into it I’m sure are like in their car screaming at this moment. But what we’ve described, that is the most common… That’s like the prototypical split test, is should my add to cart button be red or green?

Paul Reda: All you’re doing is changing button colors, man. That’s all split testing is.

Kurt Elster: Oh, God. Gonna get some angry emails now. And so, half of people will see red, and half of people will see green. All right. Which one performed better? And that’s all it is.

Paul Reda: Red means stop, so obviously that’s worse. Duh.

Kurt Elster: And we don’t know this. That’s why we gotta split test it. I really had not… I had not performed split testing myself particularly on a regular basis until a few years ago when… I am such a dork, in the year 2020, not knowing the craziness that was about to happen, one of my new year’s resolutions was I’m going to figure out split testing. And I’m gonna figure it out by just trying it and making mistakes and seeing what happens. And getting my own experience with it. And so, recently, looking back on it, now I’m very comfortable with split testing. And so, I went back through, and I pulled several of our more interesting split tests. The ones that surprised us, the ones that I thought were neat, and I wanted to share those, their results, and the hypothesis behind it, in the hope not that you go implement these just blindly. Oh my gosh, please don’t. But that you’ll take these ideas and split test them in your own store, and maybe develop the data where you know conclusively these are positive changes.

Paul Reda: I’m just thinking about how for our listeners that are self-admitted fiddly little monkeys-

Kurt Elster: Aren’t we all?

Paul Reda: This is a great thing for you to fiddle with. You know how you want to make your store better, but you don’t want to do any actual work to make it better, like worrying about your email campaigns or anything like that?

Kurt Elster: Writing 500 word product descriptions.

Paul Reda: Yeah. Making a bunch of content. Yeah. No. Too much work. You could set up a split test in like 15 minutes.

Kurt Elster: Oh, yeah.

Paul Reda: And maybe you end up making more money on that split test.

Kurt Elster: And then it just runs on its own for two weeks.

Paul Reda: Yeah. It doesn’t do anything. And at the end it’s free money and you didn’t really do jack shit, so you did work on your website, and you didn’t really have to do anything. It’s perfect to get that fiddly little monkey brain out of your head.

Kurt Elster: You know, I fall into the category of fiddle monkey, and that is really… That’s the appeal of it for me. I’m like, “Oh, I get to fiddle in Optimize.”

Paul Reda: And you know, it’s really worthwhile. I mean, I’m sure you’ve had split tests that ended up even generating 5% more revenue.

Kurt Elster: It adds up.

Paul Reda: Is 5% completely reasonable as being better on something?

Kurt Elster: Absolutely.

Paul Reda: All right. If you have a store that makes a million dollars a year, 5% is 50 grand. You did a stupid split test that you dicked around with for 15 minutes and now you have an extra $50,000 this year.

Kurt Elster: Maybe you spent an entire hour thinking about it.

Paul Reda: Wow.

Kurt Elster: $50,000 an hour? I will take that bet. So, rather than make them wait, you ready for some split test action?

Paul Reda: Change all the font to Baskerville.

Kurt Elster: All right. Tell me about your thoughts on that one.

Paul Reda: No, I read a thing that Baskerville is the most trustworthy font. People were like… It was like reading a piece of text that was trying to convince you of something and Baskerville worked best in trying to convince people of something. That being said, if I’m selling beach gear and not reading the Federalist Papers, maybe Baskerville is not the best choice.

Kurt Elster: Yeah. Yeah, on my Kindle I want Baskerville.

Paul Reda: Well, on your Kindle you want Bookerly, the special font made for Kindles.

Kurt Elster: Oh, my mistake. You know, maybe I just… I’ll keep it simple. Do Georgia.

Paul Reda: Get out of here. Anything that was a website font 20 years ago, loser font.

Kurt Elster: Oh, I see. Really going deep on the typography jokes.

Paul Reda: That’s right. That’s right.

Kurt Elster: Make some kerning references next. All right. We’ve discussed this one on the show before, but should price appear on the collection grid?

Paul Reda: We’ve determined no in all of our split tests.

Kurt Elster: The answer is maybe.

Paul Reda: May not apply. May not be applicable to you. But applicable to us.

Kurt Elster: When we ran this in an apparel store, and we’ve now run this in several stores, and I got… One was inconclusive and the others were various levels of positive, but in an example, the first time we did this in an apparel store, it increased revenue per session. So, Google Optimize, you could choose different goals. Doesn’t necessarily have to be conversion rate, which they call transactions. You could do revenue and it’ll give you revenue per session.

It increased revenue per session 23 and a half percent with 97% confidence.

Paul Reda: What’s revenue procession?

Kurt Elster: It’s visitor sessions.

Paul Reda: Oh per session. You were saying per session too fast.

Kurt Elster: Per session.

Paul Reda: Like we’re in a wedding party and we’re going down the aisle procession.

Kurt Elster: I see. I need to enunciate.

Paul Reda: All right, so yeah, 23% more revenue. Holy shit.

Kurt Elster: Yeah.

Paul Reda: For that store, that’s hundreds of thousands of dollars.

Kurt Elster: Yeah, it’s a pretty good deal.

Paul Reda: Yeah.

Kurt Elster: And so, this is one that I don’t think is applicable to every store. This is one you want to test. And my hypothesis here is I see the pricing in the collection grid, or I just see the items in the collection grid without the pricing, and whether I think about it or not, in my head I have to make a value judgment as to what this may be. Value is subjective. And if it’s worth it to me. And then when I click through to the product and I see the actual price, if that price is less than I suspected, suddenly it seems like a good value and I’m more likely to buy. And so, if you are in that position, this one is gonna kill for you. But the other way around, it could work against you and people could perceive the product as overpriced. So, it’s one of the ones you don’t want to just implement blindly. You would want to test it.

Paul Reda: And this psychological reasoning is entirely something you’ve just backfilled, right?

Kurt Elster: 100%.

Paul Reda: Yeah. You don’t know why.

Kurt Elster: No, it’s a thing I made up. If I just go, “My hypothesis is,” then I could just say whatever I want.

Paul Reda: Yeah. Exactly.

Kurt Elster: Yeah. And that’s why you shouldn’t just blindly accept these supposed best practices on the internet. Because it’s some yahoo in an office in Skokie making it up.

Paul Reda: Aren’t we all?

Kurt Elster: Aren’t we all? Yeah, we’re pretty much all just making it up as we go. I mean, as evidenced by Elon Musk’s Twitter shenanigans, it is very clear, even the richest man in the world is winging it.

Paul Reda: Oh my God.

Kurt Elster: I’m not getting into that.

Sound Board Robot: Oh. My. God.

Kurt Elster: You triggered ohmygod.wav. Sorry.

Paul Reda: Oh. My. God.

Kurt Elster: So, we know font… We like typography. We know fonts are important. How important is font size?

Paul Reda: Bigger is better.

Kurt Elster: Bigger is better. You know-

Paul Reda: One letter per screen.

Kurt Elster: You have to just tab through it. Piece it together. It’s like a cypher. This one is a trick question. Readability is important. I did attempt multiple times to split test just font size and I could not get anything conclusive or statistically significant. I think the answer is overall readability is important and you can’t just simplify it to font size.

Paul Reda: And who do we ask in order to figure out overall readability on the web, Kurt?

Kurt Elster: Well, Baymard Institute, who does usability studies, I love Baymard Institute, did a large scale study and came up with readability guidelines. And they said the most important thing is line length. So, if you’ve ever read a newspaper… You know those things? They’re antiques. They’re printed on dead trees. And it’s like it’s what we did before email, I think.

Paul Reda: Oh. Okay.

Kurt Elster: Yeah. Magazines and newspapers have these really narrow columns. Or even if you read like a paperback, you’ll notice it is fairly narrow text. Narrow text line length and narrow text more so than font size is what determines readability. Like my eyeballs don’t need to be going back and forth.

Paul Reda: I wouldn’t say narrow text. I would say line length.

Kurt Elster: Line length.

Paul Reda: Because obviously you could be too narrow.

Kurt Elster: Yes. And so, they got it down to like 60 to 80 characters. And then there’s other factors like the spacing between the lines, line height.

Paul Reda: But these are all CSS values. So, font size, a CSS value, easily changed on your store. Line length is the size of the container, or like the padding on the container that the text is in. Easily changed on your store. Letter spacing, word spacing, these are all CSS values you could tweak.

Kurt Elster: Yeah. So, these could be… A theme developer or front end developer can relatively easily implement this stuff as just part of your theme.

Paul Reda: Yeah. And that is how Google Optimize generally works, is it works with CSS values in that if we’re hiding and showing an element, Google Optimize loads your page in sort of a frame, and then you find a CSS class or ID on that element, and then you tell Google Optimize, “This thing right here, our test is show/hide this thing.” And that’s how you set up the test. At least that’s how you do it when you ask me for help setting up the tests.

Kurt Elster: Yes. Yes. Also, it’s got a visual one, but I have access to a theme developer, and so we do it the fancy way. Do you have an Amazon Kindle, the eInk one?

Paul Reda: I do.

Kurt Elster: Paperwhite, that’s called, right?

Paul Reda: I do.

Kurt Elster: Yeah. I got one of those too. I think a lot of people have those or have one in a drawer gathering dust as a Christmas gift. Pull it out, turn it on, and mess with the settings, because in the settings for your book, when you’re reading on the Kindle, at least on the eInk one, you can mess with the font size, the line length, the width of the column and the line height, and so it’s a really… If you want to demonstrate this to yourself, you could actually do it with your Kindle.

Paul Reda: Well, I was gonna tell you just do it on your phone. If you got an iPhone right now, load up a text heavy webpage in Safari, and turn on reader mode in Safari.

Kurt Elster: Oh, I love reader mode.

Paul Reda: And reader mode has those settings that you could tweak.

Kurt Elster: It’s like an uppercase, lowercase A in the upper left corner, right?

Paul Reda: Yeah. And it’s got line height, it’s got font size, and you could screw with that and really see how different it is for you.

Kurt Elster: The fact that reader mode exists as a prominent default feature in a web browser-

Paul Reda: Shows how far we’ve fallen.

Kurt Elster: Yeah. That’s like everything-

Paul Reda: Everything used to be like that.

Kurt Elster: Yeah. This should really justify to you how important this is.

Paul Reda: And you know, and when Baymard says the best line length is 60 to 80 characters, there are character counters online. If you have a question about an area on your website, copy one line of text out of it when you’re looking at it, Google character counter, and just paste that text into a character counter and it’ll tell you how many characters it is.
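
That check is a one-liner if you'd rather script it than paste into an online counter (the 60-80 character range is Baymard's guideline as Paul cites it):

```python
def check_line_length(sample_line: str, low: int = 60, high: int = 80):
    # Count characters in one rendered line of text and flag whether it
    # falls inside the readability sweet spot.
    n = len(sample_line)
    return n, low <= n <= high
```

Paste one line of text from your site as rendered, and you get back the count plus a pass/fail against the guideline.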

Kurt Elster: 100%.

Paul Reda: Because I am a fiddly little monkey sometimes, I did all of this on my eInk info board. Have you heard about that?

Kurt Elster: Yeah. What is this eInk info board you keep yammering about?

Paul Reda: I think we talked about it a couple weeks ago.

Kurt Elster: Oh, we did?

Paul Reda: Yeah. I spent way too much money on a giant, 32-inch eInk screen that’s like a Kindle, and it loads a webpage I built that lives inside my house, and it just has a bunch of data on it that I can look at like headlines, and stocks, and the weather, and a picture of my baby. My baby, she’s old news now. All I care about is this info board. Whatever. She just sleeps all the time.

Kurt Elster: You know, maybe she could… When she’s a little older, she could start helping you maintain the info board.

Paul Reda: But yeah, so I went completely overboard. I figured out how to install San Francisco, which is the official Mac font on it, but you’re only supposed to be able to do that inside Mac apps.

Kurt Elster: I don’t think this eInk board is a Mac app.

Paul Reda: It’s not. Don’t tell Tim Cook, though.

Kurt Elster: Wow. You’re a super hacker.

Paul Reda: Yeah. And then I set all the text sizing and stuff according to Baymard. I am a super hacker. I wrote a static HTML page. Great job, me.

Kurt Elster: There was more to it than that.

Paul Reda: There is a lot more than that. It’s pulling .json. It’s really cool.

Kurt Elster: Hero images. You know, the top of a collection or category page you get this banner or big picture at the top, it’s like a lifestyle image usually, it looks really cool.

Paul Reda: Why are you using the singular? Shouldn’t we have like five of those rotating in a slideshow?

Kurt Elster: You know, I’ve never seen a carousel as a hero image on a collection page. But yeah, if you really want to torture me, go ahead and implement that.

Paul Reda: God.

Kurt Elster: So, I love these because they got a lot of style. I want the #aesthetic, right? And so, I split tested it. I want to challenge my assumptions here. I split tested the presence of a hero image. And this was on a site that had really good hero images, a site that really leaned into lifestyle aesthetic. And I thought for sure this is a waste of time because I know what the answer is gonna be. And I was wrong. What?

So, getting rid of the hero image-

Paul Reda: And just dumping a bunch of products at people?

Kurt Elster: Yes. Increased revenue per session 16% with 92% confidence. And this one annoyed me so much, I ran the test again on mobile versus desktop and new versus returning visitors, and those numbers changed a little bit, but in all cases it was net better to just not have the hero image.

Paul Reda: So, what you’re telling me is my aesthetic of just put less garbage on your site and it’s just products, products, all day long, is right?

Kurt Elster: The Paul Reda, “Keep it simple, stupid,” philosophy absolutely makes you money.

Paul Reda: Yeah!

Kurt Elster: And this one is data driven.

Paul Reda: And no one will listen to me.

Kurt Elster: They’re like, “More widgets!”

Paul Reda: Yeah, like, “Make it all invisible, then make it appear right before it scrolls off the screen so they gotta scroll back. Yeah!”

Kurt Elster: I got an email from somebody today who has a very successful website, that’s very plain, is going through a redesign, and said, “This redesign is too plain.” And that broke my heart because it’s like you want to go like, “Hey, maybe that’s why this is successful.”

Paul Reda: I mean, no one is going to your site and going, “Oh, this site looks so good. Look how cool it looks.” No. They’re like, “Does this product look cool? Do I want to buy it? What’s the price? Are they gonna actually ship it to me?” That’s all they care about. They don’t care about your fucking widgets.

Kurt Elster: Overall… Yeah, and you can’t have all these widgets but then also be worrying about page speed, right? I’m not going down that road. Less is more. Simpler is better. Ultimately, I think a lot of design is just things that are getting in the way of shopping.

Paul Reda: You want to know the page speed score of my info board?

Kurt Elster: It has one?

Paul Reda: Yeah. I ran it through Lighthouse locally.

Kurt Elster: It’s gotta be 100 of 100.

Paul Reda: Nope, it’s 68.

Kurt Elster: What?

Paul Reda: It’s text and like three images.

Kurt Elster: Such a useless metric. But the reason that hero image thing is outperforming… Not having the hero image is outperforming the hero image, is because the hero image is just getting in the way. It’s just pushing the product grid down the page.

Paul Reda: Yeah. They’re like, “I want to buy.” It’s like, “I clicked on shirts. Here’s a big photo of a guy wearing a shirt. Yeah, no shit. I clicked on shirts. I know.”

Kurt Elster: Free shipping. We have to have free shipping, right?

Paul Reda: Obviously.

Kurt Elster: Of course, you have to have free shipping. However, if you split test free shipping, which you can’t do with Google Optimize, but there’s apps that’ll do it. ShipScout is the one I use. There’s another one, Intelligems. I haven’t used it, but I think that’ll do it, as well. You can test different free shipping thresholds and what’s cool about the ShipScout one is you can also put in, “Here’s my typical fulfillment cost. Here’s my cost of goods sold.” And it will help you figure out which free shipping threshold optimizes for revenue, when you’re including those costs and the checkout conversion rate for each different shipping threshold. Which is the one that optimizes for revenue? Which is the one that gets you the most money? Cha-ching.wav.

And just saying the name of the file, I don’t know why that’s so… Just immediately derailed myself.

Paul Reda: The fact that it’s .wav, so it’s like 1994.

Kurt Elster: It’s what my sound board supports.

Paul Reda: It’s 1992. I’m on the CompuServe forums on my PS1 downloading Seinfeld .wavs.

Kurt Elster: I’m gonna download them from AOL. It’s like, “Oh, I came home from school and it only half finished.” And then of those, only three were actually what they said they were. So, you could test this, and when you start including the cost of it versus the conversion rate, it almost always is just… It’s just a cost center. The free shipping is just costing you money more than it is offsetting with additional conversion. Because it turns out it doesn’t… Depending on the brand and the shipping thresholds, it really doesn’t make that big an impact. It’s a very minor change.

Paul Reda: I want to dive into this a little bit, though, because I feel like we’ve been very much no gods but shipping, you need to offer free shipping and just raise your prices. You’ll make it up more on the back end. It will increase your conversion rate because people worry about shipping. I feel like we’ve hammered that a lot over the years.

Kurt Elster: Yes.

Paul Reda: So, this is a shift.

Kurt Elster: It is. Oh, this really blew my mind, and I thought maybe it’s just this store. This has been the case in every store that we’ve tested this, is that free shipping and higher free shipping thresholds just don’t make a big enough impact to be worth it.

Paul Reda: So, what you’re saying is charging for shipping has always been… in every test you’ve done, charging for shipping is better than free shipping in terms of the total end revenue you end up getting.

Kurt Elster: In terms of revenue per visitor, yes. Isn’t that crazy?

Paul Reda: I could see it. I could see it. And the argument was let’s say we magically know that the optimal free shipping threshold is 30 bucks. The choices were always like between free shipping or $10 shipping, so I wonder if the cases were always… It’s not that, “Oh, our free shipping threshold should be zero.” The problem was actually it was not high enough. It should have been higher to offset the cost more. And actually, become a profit center instead of a cost center.

Kurt Elster: And I think there’s other ways to go about this. I think… I still believe in no gods but shipping, but now it’s about the delivery promise. Especially after the supply chain crunch.

Paul Reda: Yeah. And I mean, I’m sure the costs of shipping things have gone up. Having free shipping is rough to eat now, much rougher to eat than it was two years ago.

Kurt Elster: Yes. And so, I think for a lot of people the new… the more important thing is am I gonna get my stuff and am I going to get it quickly, more so than, “Oh man, I have to pay for shipping.” So, in this case I think free shipping, not necessarily a must have, but definitely a must test.

Paul Reda: Yeah. But I’m looking at a chart here. Yeah, this one, on one of your tests, you did… what is no free shipping? As in free shipping at zero dollars? Or as in, “We’re always charging you for shipping.”

Kurt Elster: I set the free shipping threshold at $1,000.

Paul Reda: Okay, so the three variants were always paying for shipping, free shipping at $25, and free shipping at $75, and free shipping at $25 won.

Kurt Elster: Yes.

Paul Reda: So…

Kurt Elster: You’re always gonna have… Well, the checkout conversion-

Paul Reda: Every store has a number that’s the best number, and you gotta figure out what the number is, but we’re fairly certain the number is not zero.

Kurt Elster: Absolutely. And so, using your average order value as a starting place is a good spot.

Paul Reda: Yeah. Don’t you want to always go like five bucks over your average order value?

Kurt Elster: If I can’t test for it, and these apps require… I think you have to be on Shopify Plus for these to work.

Paul Reda: I would assume, because you gotta change the checkout whether the person’s getting free shipping or not.

Kurt Elster: Yeah. And so, you need… If you can’t test for it and you want to offer free shipping, I would do average order value plus 15% as my starting point. The thing to consider is, all right, what is the cost on each order for fulfillment and shipping? And then trying to factor that in and figuring out is this worthwhile. Yeah, I’m gonna turn some people away, but then I could run an abandoned cart campaign flow and have that offer a free shipping coupon to try and make up for it, and now I’m getting the best of both worlds.
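As a rough sketch of the starting-point math Kurt describes here — average order value plus 15%, then sanity-checked against what each free-shipping order actually costs you. All dollar figures and the 40% margin below are made-up examples, not numbers from the episode:

```python
# A minimal sketch of the "AOV + 15%" starting point for a free
# shipping threshold. Every figure here is a hypothetical example.

def suggested_free_shipping_threshold(avg_order_value, markup=0.15):
    """Starting point: average order value plus ~15%."""
    return round(avg_order_value * (1 + markup), 2)

def margin_after_free_shipping(order_value, product_margin, ship_and_fulfill_cost):
    """What's left on a free-shipping order once you eat shipping
    and fulfillment -- the 'is this worthwhile' check."""
    return order_value * product_margin - ship_and_fulfill_cost

threshold = suggested_free_shipping_threshold(48.00)  # hypothetical $48 AOV
leftover = margin_after_free_shipping(threshold, 0.40, 9.50)
```

If `leftover` goes negative at your real costs, the threshold is too low and free shipping is a cost center, which is exactly what the tests kept showing.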

I don’t know. I think it’s… The free shipping is a must have I think is a-

Paul Reda: There’s no longer a rule of thumb now.

Kurt Elster: Yeah. It’s an assumption that needs to be challenged. This next one, not something I ever messed with. A client suggested it, we tried it, and had a really positive and interesting result. So, on Amazon, you ever notice they have like your browsing history?

Paul Reda: Yeah.

Kurt Elster: Some Shopify themes have this feature built in. It’s called recently viewed products. And I like this idea for when you’re browsing around, and you look at a product, and you go, “Maybe this is what I want to get. Maybe not. I gotta look at something else.” You look at the other thing, you go, “Well, that’s definitely not it.” But then you’re like, “What the heck was I looking at two minutes ago?” And oh, then there’s a little widget, recently viewed, and so you go, “Oh, that’s my… I gotta go back and click. That’s the one I want.” And then you get to the cart and the recently viewed is still there, and you go, “Oh, I should add this too.”

So, I think in theory this recently viewed products thing helps keep people on the site. It helps keep them in that shopping loop, captive to it, and also helps increase average order value. But you don’t know till you test it and so I tested it, it had a positive result, but I thought maybe this is different for new versus returning customers, right? Because it cookies it. So, if I’m visiting the site for the first time, recently viewed products might be less important to me.

Paul Reda: It’s got nothing in it. Yeah.

Kurt Elster: Whereas if I’m returning, I might be pleasantly surprised by it. So, we ran it both ways. For new visitors, it decreased conversion rate 9%. For returning visitors, increased it 33%. So, this is one where it turned out doing the split test revealed a personalization opportunity. And Google Optimize could do that, so if it’s a new visitor to the site, we just hide that object, that widget, and if it’s a returning visitor, then we don’t do anything.
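The -9% / +33% split is worth dwelling on, because an unsegmented test would have blended those opposite effects into one modest number. A quick sketch — the 70/30 traffic mix and 3% baseline conversion rate are assumptions for illustration; only the -9% and +33% deltas come from the test Kurt describes:

```python
# Why a blended test result can hide a segmentation story.
# Traffic mix and baseline conversion rate are hypothetical.
new_share, returning_share = 0.70, 0.30
baseline_cr = 0.03

new_cr = baseline_cr * (1 - 0.09)        # widget hurt new visitors
returning_cr = baseline_cr * (1 + 0.33)  # widget helped returning visitors

blended_cr = new_share * new_cr + returning_share * returning_cr
blended_lift = blended_cr / baseline_cr - 1  # small net lift, big hidden split
```

At this mix the blended test reports only about a +3.6% lift, and you'd never know the widget was actively hurting the majority of your traffic.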

Paul Reda: That sounds great. I’m just gonna ask. All right, that all worked out. How do we make that go all the time now?

Kurt Elster: So, that’s a Google Optimize feature.

Paul Reda: Oh, then you could just have that running all the time.

Kurt Elster: Yeah. There’s a personalization option in Google Optimize that is essentially just, “Hey, we’re gonna run an ongoing… We’re gonna use that same split testing engine for an ongoing change and then not track. We’re not actually testing two different audiences here.” It’s pretty nifty.

Cruising around on a site, you ever see where you can add to cart from the collection page as opposed to go visit the thing?

Paul Reda: I hate that.

Kurt Elster: You hate it?

Paul Reda: Yeah.

Kurt Elster: You built such a cool implementation of this on SHITI Coolers.

Paul Reda: I did.

Kurt Elster: But you’re not a fan.

Paul Reda: No. I did what I was told.

Kurt Elster: You know, normally I’m not a fan of it either. Just for whatever reason, I just go to the product page and add from there. I think if it’s a site where I’m buying a bunch of small things, like nuts and bolts, the websites where I can do that, like McMaster-Carr, then I think it makes a lot of sense. So, again, this is one of those ones where you want to test, and we did test it on an apparel site. It depends on the implementation, I’m sure, but it increased revenue per visitor 15%. It only had 85% confidence, though, so I don’t know that this one is statistically significant.

Paul Reda: Oh, I mean it is. It’s not like… The science standard is 95, but you know, 85, it’s looking that way.
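For the curious, the 95% "science standard" Paul mentions can be sketched with a back-of-the-envelope significance check. This is a generic pooled two-proportion z-test with made-up conversion counts — it's a common frequentist approach, not how Google Optimize computes its numbers (Optimize reports Bayesian probabilities):

```python
import math

def significance_confidence(conv_a, n_a, conv_b, n_b):
    """Confidence that two variants truly differ, via a pooled
    two-proportion z-test. A common frequentist check; Google
    Optimize itself uses Bayesian inference instead."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = abs(p_a - p_b) / se
    # Two-sided p-value from the standard normal CDF (via erf).
    p_value = 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))
    return 1 - p_value

# Hypothetical counts: 300 vs. 345 conversions out of 10,000 each.
conf = significance_confidence(300, 10_000, 345, 10_000)
```

With those made-up counts the confidence lands in the low 90s — exactly the "looking that way, but short of 95" territory the hosts are debating.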

Kurt Elster: Yeah. And this one I didn’t run… This would be an interesting one to see mobile versus desktop, new versus returning.

Paul Reda: And because I don’t like it, I’m gonna find a reason to not implement it to show that I’m right.

Kurt Elster: Well, I think this is dependent on the quality of the implementation of the add to cart, of course, but then also on the item that you’re adding to cart. Does it have variants? Does it have multiple variants? Do I need to check a size guide? I think all those things are gonna make a huge difference as to whether or not this feature makes sense.

Paul Reda: Yeah, and I think on apparel, it’s not a fancy $1,000 doodad. It’s not like an eInk info board that you need a lot of data about. A shirt’s a shirt. The shirt product page is really not gonna give you any more data than the shirt just in the collection grid, whereas a more complex piece of technology, you might want to read a big page about it, or I’m dropping a couple grand on this thing, I need fancy images to really convince me how fancy it is.

Kurt Elster: Absolutely. I want the 360 spinner. I need the AR view. I gotta really fully experience. It needs to be an experience.

Paul Reda: Experience, obviously. Experiential.

Kurt Elster: It’s an experiential website. Oh, you don’t browse that website.

Paul Reda: As opposed to the other websites, which you just know. It’s just it’s in your brain suddenly, you didn’t actually experience it. It’s injected like Picard.

Kurt Elster: Into the base of your skull?

Paul Reda: Yeah. It’s like Picard when he was with the Nausicaans.

Kurt Elster: Oh, I just assumed this was gonna be a Borg reference.

Paul Reda: No, with the Ressikans. The ones where he lived the entire life on the Ressikan planet and then he could play the flute afterwards.

Kurt Elster: Is this on TNG?

Paul Reda: Yeah.

Kurt Elster: Oh, wow. I don’t remember that one.

Paul Reda: It’s “The Inner Light.” Literally one of the most famous TNG episodes.

Kurt Elster: I am so sorry to both our Star Trek and non-Star Trek listeners.

Paul Reda: How dare you?

Kurt Elster: I’ve offended everyone. All right, this is our last test here that we’re gonna discuss. Breadcrumbs. Do we need them?

Paul Reda: They attract ants.

Kurt Elster: Well, how am I gonna find my way back through the woods without breadcrumbs?

Paul Reda: Exactly.

Kurt Elster: No, so breadcrumbs, those little navigation links that live in the upper left. It’s usually like really tiny, 12-point text.

Paul Reda: Hate those, too.

Kurt Elster: Yeah. And it’ll be like home, collection title, product title, and that’s it. That’s all it’s got going for it. And so, it doesn’t seem like it has a ton of utility, but if you heatmap it you’ll see people click on it, and I hate it because I’m like, “This thing can’t be doing anything and it’s so tiny.” And then you see the heatmap, okay, people use this thing. And then, so I split tested it, and darn it, they most definitely use it. So, at least on product detail pages where I’m bouncing back and forth between a collection and a product, they’re using it like in place of a back button, I suppose. Having it increases revenue per visitor by 53% with 92% confidence.

It’s because they’re making multiple purchases. It makes it easy to browse.

Paul Reda: That’s a crazy number.

Kurt Elster: Yeah. You get rid of that breadcrumb, it really… You’re shooting yourself in the foot.

Paul Reda: That’s awful.

Kurt Elster: Yeah. My guess would be the breadcrumb is in all themes and turned on by default.

Paul Reda: If you’re listening to this, there’s a 90% chance you just have them on in your store, but I guess we’re just saying don’t turn them off.

Kurt Elster: Yeah. Don’t be like, “Oh, I want to really clean this up because Paul said keep it simple and turn off my breadcrumbs.”

Paul Reda: You know-

Kurt Elster: I guess that’s gotta stay.

Paul Reda: You could listen to me.

Kurt Elster: So, that’s the extent of the interesting split tests. I’ve run many others. Mostly they were uninteresting or inconclusive.

Paul Reda: Yeah. I mean, a lot of times I bet it’s gonna be like, “Eh, whatever. Didn’t matter.”

Kurt Elster: Yeah. Oftentimes the test is this makes no difference one way or the other so just pick what you like.

Paul Reda: Yeah. We just mentioned the cool ones that actually did something.

Kurt Elster: That’s the fun of it.

Paul Reda: We didn’t mention all the ones that didn’t do shit.

Kurt Elster: Yeah. And maybe I had to let them run longer or structure them differently, but I at least knew that, as is, they weren’t telling me anything useful. And there’s freedom in that: you’re like, “Well, here’s one thing I know I don’t have to worry about anymore.” Looking back on all these tests that we’ve been running and the learnings from them, I think the important mindset to have going into split testing is just be willing to question everything. Every element. If you want to, you could go through systematically and test every element on your homepage. And what you’ll quickly discover is that some are significantly more important than others, and the ones that you might suspect are important may not be.

There’s one site where I did this. I tested each individual section on the homepage. And to my surprise, the press bar… They call this the brag bar, or the trust bar, or whatever you want to call it, that had like a bunch of logos.

Paul Reda: Yeah. All the press logos.

Kurt Elster: That, when it was present, had a negative impact.

Paul Reda: Can you even think of a hypothesis for that?

Kurt Elster: My guess is they saw it and they just went… because every scam website also has one of these.

Paul Reda: True.

Kurt Elster: Maybe there’s that association. And like in this particular instance it was like, “As seen in,” and just the logos, but there was no-

Paul Reda: There was no indicator that it actually meant anything.

Kurt Elster: It didn’t go anywhere.

Paul Reda: We saved the logo of this media company and then put it on our website. That was all it was.

Kurt Elster: Yeah. And it’s like they were all true.

Paul Reda: I could say I’ve been featured in the New York Times all the time. Haven’t.

Kurt Elster: No?

Paul Reda: No.

Kurt Elster: Yeah. I met Snoop Dogg. I used to be his blunt roller.

Paul Reda: Really?

Kurt Elster: No. That’s not true at all.

Paul Reda: That’s cool. Wow.

Kurt Elster: Yeah. Put that on my homepage.

Paul Reda: I’ve been to space. As seen in NASA.

Kurt Elster: I can’t top that one. I’ve been to space. And then I think the other interesting thing about split testing is you really don’t want to test exclusively conversion rate. I often… I find revenue is the one that’s the most valuable and interesting because that’s… You’re really looking at conversion rate and average order value, like blended into one.

Paul Reda: Revenue is the important… Revenue is the only metric. I mean, we could let one person into the store a year and if they buy something our conversion rate’s 100%. Great. It’s like, “Didn’t make any money, though.”
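Paul's one-visitor example gets at why revenue per visitor is the blended metric Kurt prefers: it's conversion rate and average order value multiplied together. A tiny sketch with made-up figures (10,000 visitors, 300 orders, $15,000 revenue — none of these are from the episode):

```python
# Revenue per visitor decomposes into conversion rate x AOV.
# All figures here are hypothetical examples.

def revenue_per_visitor(total_revenue, visitors):
    return total_revenue / visitors

def rpv_from_components(conversion_rate, avg_order_value):
    # (orders / visitors) * (revenue / orders) == revenue / visitors
    return conversion_rate * avg_order_value

visitors, orders, revenue = 10_000, 300, 15_000.0
rpv = revenue_per_visitor(revenue, visitors)          # $1.50 per visitor
same = rpv_from_components(orders / visitors, revenue / orders)
```

Both routes land on the same number, which is why a test can win on RPV even when conversion rate alone looks flat: the lift can come from bigger orders instead of more of them.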

Kurt Elster: Well, and if I want to reduce noise and really lean on significance with my split tests, I would only want to be testing the immediate next step. Like add to cart. And that’s not necessarily something I have explored. Or page views. But what’s cool about Google Optimize is you don’t have to pick just one. You can have it run against multiple objectives and it’ll give you different results for each in the same test at the same time. And so, you really don’t have to second guess yourself or worry about it. Just run it both ways. See what it says.

And then I think the other thing that surprised me was the wild, and I shouldn’t have been surprised, but the big difference that segmentation makes. Mobile versus desktop, new versus returning. I have not gotten into testing referral source or path, like people who came from a Google search.

Paul Reda: Well, and I mean yeah, that would be just off the top of my head, that would be a huge force multiplier, because one of the things that we struggle with when we’re looking at people’s data is, “Oh, our mobile conversion rate stinks compared to our desktop conversion rate.” And part of that is just built into the device itself, but the other piece of it is so much of mobile traffic to your store is absolutely dominated by Instagram and Facebook, and that’s generally top of funnel traffic, and that’s generally less likely to buy, and then that kills your mobile conversion rate.

And so, it’s like if there’s a way that you could be like, “Show this only on phones from Facebook,” and then you get some crazy score on it where it’s like, “Oh, that caused 53% increase in conversion rate,” well, you’ve just made a huge step in solving that problem.

Kurt Elster: I’ve got more testing to do is what it sounds like.

Paul Reda: Exactly.

Kurt Elster: More toys to play with. More levers to pull. That’s the excitement of it, especially… Now, we’re seeing Facebook ads are less effective. Channels are changing as far as like what acquisition channels are worthwhile. And so, one way around a decreasing ROAS is can we bolster the site? Can we make the site more effective for the traffic we do have? And split testing is the easy way into it, and I would really encourage people, check out Google Optimize. Try and set it up. Play with it. See what happens. You’re not gonna break anything.

Paul Reda: Well, if you do, Google Optimize will tell you, “You broke it. Don’t do that.”

Kurt Elster: You know, your worst case scenario is like your test is insignificant, I think.

Paul Reda: No, I think the worst case scenario is you ate a marginal decrease in revenue from half of your audience while the test was going but you learned something.

Kurt Elster: Yeah. I was gonna say that’s the cost of learning.

Paul Reda: That’s the cost of learning.

Kurt Elster: Yeah. You know, it was the friends we made along the way. That’s the real-

Paul Reda: That’s right. It’s the dollars we made with Optimize.

Kurt Elster: All right, we’re gonna go out… We’ll end it there and we’re gonna go out on-

Paul Reda: Applause.wav.

Kurt Elster: No. We’re going out on StarTrekRedAlert.wav. Oh, that one’s annoying. All right, see you guys.