CONVERT
If you change more than one element at a time, you won't know which specific change made the difference.
You need to take a logical, emotionless approach to split testing. Let the results be your guide. Let's say your sales page has been up for a while; call that original version A. Make a copy of it, change one element, such as the headline, and call that version B.
Next, be patient. You need a certain number of conversions before you can truly tell which version performs better. Your split testing software should show you how valid the results are, and its documentation should explain how long to run your test. There is a concept in statistics called confidence, which is expressed as a percentage. If there really is a difference between your two versions, the longer your test runs, the closer the confidence will get to 100%.
For example, your software might show that version A converted at 1.2% and version B converted at 1.6%, with a confidence of 86%. That means there is an 86% chance that version B will outperform version A in the long run. You'll have to decide how high a confidence level you want before you stop the test and accept the results. I like to shoot for at least 90%.
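Tools differ in exactly how they calculate confidence, but one common approach is a one-sided two-proportion z-test. Here is a minimal sketch in Python; the visitor counts (2,000 per version) are hypothetical numbers I chose so that the rates above work out to roughly 86%.

```python
from math import erf, sqrt

def confidence_b_beats_a(sales_a, visitors_a, sales_b, visitors_b):
    """Confidence that B's true conversion rate exceeds A's,
    via a one-sided two-proportion z-test (one common method;
    some tools use Bayesian calculations instead)."""
    rate_a = sales_a / visitors_a
    rate_b = sales_b / visitors_b
    pooled = (sales_a + sales_b) / (visitors_a + visitors_b)
    se = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / se
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF at z

# Hypothetical counts: 2,000 visitors per version,
# A converting at 1.2% (24 sales), B at 1.6% (32 sales).
print(f"{confidence_b_beats_a(24, 2000, 32, 2000):.0%}")  # ~86%
```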
What matters with split testing is not how much time elapses, but how many actions (visits, sales, etc.) take place. This means that if your site gets little traffic, your test will have to run for a long time to be statistically valid. It could take weeks or longer. A high-traffic site like Amazon can run a valid test in minutes, if not seconds.
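You can estimate the required run time up front with a standard sample-size approximation. This is a textbook formula, not any particular tool's exact method, and the traffic figures in the comments (10 visitors a day per version for a small site, 2,000,000 a day for a giant one) are hypothetical:

```python
from math import ceil

def visitors_needed(rate_a, rate_b, z_conf=1.28, z_power=0.84):
    """Rough visitors needed per version for a one-sided
    two-proportion test; the defaults correspond to roughly
    90% confidence and 80% power (textbook approximation)."""
    variance = rate_a * (1 - rate_a) + rate_b * (1 - rate_b)
    return ceil((z_conf + z_power) ** 2 * variance / (rate_b - rate_a) ** 2)

n = visitors_needed(0.012, 0.016)   # rates from the example above
print(n)                            # ~7,753 visitors per version
print(n / 10)                       # ~775 days at 10 visitors/day per version
print(2 * n / 2_000_000 * 24 * 60)  # ~11 minutes at 2,000,000 visitors/day
```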
This is why patience is so important. The biggest mistake people make with split testing (other than not doing it at all) is not letting their tests run long enough and making decisions based on incomplete data. That can lead you to choose the wrong version of your page as the winner.
Let's say you have a low-traffic site that gets 20 visitors a day, 10 each to versions A and B. On the first day, your site converts very well: version A makes 2 sales, while version B makes 1.
Those are conversion rates of 20% and 10%, which would be awesome for most products but are unlikely to last long term. Two out of 10 visitors buying is more of a fluke than anything else.
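A quick simulation shows how easily small samples produce flukes. The sketch below assumes two identical pages, both truly converting at a hypothetical 5%; even so, one of them "wins" the day more often than not.

```python
import random

random.seed(1)  # reproducible run

def day_sales(rate=0.05, visitors=10):
    """Sales from one day of traffic to one version."""
    return sum(random.random() < rate for _ in range(visitors))

days = 100_000
# Both versions share the same true rate, so any daily gap is pure noise.
lopsided = sum(day_sales() != day_sales() for _ in range(days))
print(f"{lopsided / days:.0%} of days, one identical page 'beats' the other")
```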
An impatient person might look at those numbers and think, wow, version A performs twice as well as version B. That was true for that first day, but it's way too soon to conclude anything. If a baseball player hits a home run the first time he bats, no one would assume he's going to hit one every time. But if you compare two players over a full season, their batting averages are a pretty reliable measure of how well they hit that year.
Once your test has run long enough and the confidence number is high enough
for you, it's time to start another test. Let's say you're testing headlines, and
version A performed better. I would come up with a third headline to test and