What is Statistical Significance in Google Ads? [Tutorial]

As a marketer, you have to run a lot of tests on your ads.

When doing so, you have probably come across the term statistical significance.

And you may have one question in mind.

What the heck is this?

In this article, we want to answer that question and look at how the term applies to advertising.

If you are thinking, “Oh no, I am not good at science,” don’t worry.

We will try to make the most important points as easy to grasp as possible.

What is statistical significance?

Statistical significance is a concept from statistics that many industries use to judge the results of A/B tests.

For example, the pharmaceutical industry uses it when developing vaccines and drugs, and investors can use it when making business decisions about certain companies.

Likewise, it is used in advertising to compare ad performance and find the best ad variations.

Definition

In the shortest possible form, statistical significance tells you how confident you can be that a difference between two sets of data is real rather than a product of chance.

There is a null hypothesis, which says there is no real difference between the two things you are comparing, so any difference you observe is just random variation.

And there is an alternative hypothesis, which says the null hypothesis is wrong: the difference is real and did not happen by chance.

To decide between them, we use the P-value: the probability of seeing a difference at least as large as the one you observed if the null hypothesis were true. It is usually compared with a threshold of 0.05.

If P is more than 0.05 (P>0.05), there is not enough evidence to reject the null hypothesis. This does not prove the new idea is wrong; it only means the data cannot yet tell it apart from chance.

If P is less than 0.05 (P<0.05), we reject the null hypothesis in favor of the alternative, which means the change is unlikely to be random, at least from a statistical point of view.

In simpler terms, imagine you are comparing two ideas, a new one and an existing one.

If P is smaller than 0.05, the difference between them is probably real, and you have good evidence for the new idea.

If P is higher than 0.05, there is not enough evidence for a change, so the existing idea should stay for now.
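
To make this concrete, here is a minimal Python sketch of one common way to get a P-value when comparing two ads: a two-proportion z-test on click-through rates. The function name and the click and impression numbers are made up for illustration, and this is not necessarily the calculation Google Ads performs.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(clicks_a, impressions_a, clicks_b, impressions_b):
    """Two-sided P-value for the difference between two click-through rates."""
    rate_a = clicks_a / impressions_a
    rate_b = clicks_b / impressions_b
    # Pooled rate: the best estimate of the CTR if the null hypothesis is true.
    pooled = (clicks_a + clicks_b) / (impressions_a + impressions_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / impressions_a + 1 / impressions_b))
    z = (rate_b - rate_a) / std_err
    # Probability of a difference at least this large, in either direction.
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical numbers: original ad (A) vs. new ad (B).
p = two_proportion_p_value(200, 10_000, 260, 10_000)
print(f"P-value: {p:.4f}")
print("significant at 0.05" if p < 0.05 else "not enough evidence yet")
```

With these invented numbers the P-value falls below 0.05, so we would reject the null hypothesis. With two identical ads the same function returns 1, which is no evidence of a difference at all.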

How does statistical significance work in Google Ads?

Now, we have come to an important question.

Why do we need this as an advertiser?

In our field, to create a better version of an existing ad, we need to run some tests.

Statistical significance helps us determine whether the improvements we see in an experiment campaign are likely to last.

If so, how reliable are those improvements?

Let’s see which metrics you can compare.

Changes in metrics

There are a lot of metrics you can compare in your tests.

The most commonly used ones are:

  • Impressions
  • Click-through-rate
  • Average CPC
  • Cost
  • Cost-per-conversion

[Image: performance metrics of a new ad experiment]

Besides each metric’s own value, you can see its percentage increase or decrease relative to the original campaign, along with a confidence interval.

The confidence interval is given in brackets as a range of percentages, and it means the true change is expected to fall within that range.

If you look closely, each percentage change shown next to an interval falls within that interval.

[Image: confidence interval of the experiment campaign]

Keep in mind, though, that there has to be enough data before your tests can show a significant difference.

Otherwise, nothing is shown until enough data accumulates.

[Image: statistical significance is not available for a metric because there is not enough data]

What does the P-value mean for your A/B tests?

If you are wondering what the P-value means, it is the probability of seeing a difference at least as large as the one between your experiment campaign and the original campaign if the two actually performed the same.

The lower it is, the less likely it is that the difference is due to chance. This also means the difference is more likely to repeat in a similar pattern in the future.

[Image: levels of significance in the A/B tests of ads]

You can multiply the P-value by 100 to express it as a percentage.

P-value is less than 0.05

It means there is enough evidence to support your new idea: if there were no real difference, fewer than 5 out of 100 experiments would show a result this large.

The lower it gets, the stronger the evidence for the new idea.

The 0.05 threshold can also be called an error rate, because it is the chance of declaring a difference that is not really there.

P-value is more than 0.05

It means there is not enough evidence and the risk of error is too big.

So whatever change you see may be due to randomness.

What method does Google Ads use when calculating statistical significance?

According to Google support, it uses the Jackknife resampling method to calculate confidence intervals and statistical significance.

To judge how effective its calculations are, you can research this method for more details.
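
To give a feel for the method, here is a minimal sketch of a generic delete-one jackknife in Python. It is not Google’s implementation: the daily (clicks, impressions) buckets, the function names, and the choice of CTR as the metric are all my assumptions for illustration. The idea is to recompute the percent change many times, leaving out one bucket each time, and use the spread of those results to estimate the uncertainty.

```python
from math import sqrt


def percent_change(control, experiment):
    """Percent change in CTR from control to experiment.

    Each argument is a list of (clicks, impressions) buckets.
    """
    ctr_control = sum(c for c, _ in control) / sum(i for _, i in control)
    ctr_experiment = sum(c for c, _ in experiment) / sum(i for _, i in experiment)
    return (ctr_experiment / ctr_control - 1) * 100


def jackknife_interval(control, experiment, z=1.96):
    """Point estimate and approximate 95% interval via delete-one jackknife.

    Assumes both lists have the same number of buckets (e.g. the same days).
    """
    n = len(control)
    estimate = percent_change(control, experiment)
    # Recompute the statistic n times, leaving out one bucket each time.
    leave_one_out = [
        percent_change(control[:k] + control[k + 1:],
                       experiment[:k] + experiment[k + 1:])
        for k in range(n)
    ]
    mean = sum(leave_one_out) / n
    # Jackknife variance of the statistic.
    variance = (n - 1) / n * sum((x - mean) ** 2 for x in leave_one_out)
    std_err = sqrt(variance)
    return estimate, (estimate - z * std_err, estimate + z * std_err)


# Hypothetical daily buckets: (clicks, impressions).
control = [(20, 1000), (18, 950), (22, 1050), (19, 1000), (21, 980), (20, 1020)]
experiment = [(26, 1000), (24, 960), (27, 1040), (25, 1010), (23, 990), (28, 1000)]

estimate, (low, high) = jackknife_interval(control, experiment)
print(f"CTR change: {estimate:+.1f}% [{low:+.1f}%, {high:+.1f}%]")
```

The printed line has the same shape as the Google Ads report: a percentage change followed by its interval in brackets.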

How can you affect the statistical significance of your experiments?

Changing either the original ad or the test ad

During the experiment, it is not recommended to change any variables of either ad.

Otherwise, your changes may affect the calculations, and your existing data may no longer be significant.

As a result, you may have to wait longer until it becomes significant again.

Volume

The more traffic your website gets, the easier it is to reach statistical significance.

What if you don’t have a big website that attracts lots of traffic and your tests are not reaching significance yet?

Well, there are two options for you.

  • Combine multiple campaigns and run their tests at the same time.
  • If you don’t want to increase the volume, you can settle for a confidence level of around 80 percent instead of 95 percent. The downside, though, is that you have to calculate it yourself.
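
To see how much traffic the second option can save, here is a rough Python sketch based on a standard textbook approximation for comparing two rates. It estimates how many impressions each variant needs before a given lift becomes detectable. The function name, the 2 percent base CTR, the 20 percent lift, and the 80 percent statistical power setting are all my assumptions, not values from Google Ads.

```python
from math import ceil
from statistics import NormalDist


def impressions_needed(base_rate, relative_lift, confidence=0.95, power=0.80):
    """Approximate impressions per variant to detect a relative lift in a rate.

    confidence: 1 minus the significance threshold (0.95 means P < 0.05).
    power: assumed chance of detecting the lift if it is real.
    """
    new_rate = base_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided test
    z_power = NormalDist().inv_cdf(power)
    variance = base_rate * (1 - base_rate) + new_rate * (1 - new_rate)
    return ceil((z_alpha + z_power) ** 2 * variance / (new_rate - base_rate) ** 2)


# Hypothetical case: 2% CTR today, hoping to detect a 20% relative lift.
for level in (0.95, 0.80):
    n = impressions_needed(base_rate=0.02, relative_lift=0.20, confidence=level)
    print(f"{level:.0%} confidence: about {n} impressions per variant")
```

Under these assumptions, dropping from 95 to 80 percent confidence cuts the required traffic noticeably, at the price of a higher chance of acting on a difference that is not real.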

Time

Once you set a time limit and start testing, you may find that the period you set is not long enough to generate sufficient data.

In that case, the rule with statistical significance is to wait a little longer until the result becomes significant.

With enough time and data, even a small real difference can become significant.

Are all statistically significant results relevant?

If you see very big improvements in your experiments with a very low P-value, you can go ahead and apply that strategy.

But there are times when you may not want to change anything, for the following reasons.

Marginal effect

Sometimes, even if the P-value is less than 0.05, the effect might be marginal, like a 3 percent increase or decrease in performance.

In that case, I would keep my old strategy, which has stood the test of time, rather than take a risk for almost nothing.

Keep in mind that a statistically significant ad is not guaranteed to perform as expected 100 percent of the time. It can still go wrong.

So always take the effect size of the experiment campaign into account.

When the confidence interval is too wide or unreasonable

Ideally, the whole confidence interval should show an improvement.

However, you may see intervals that include both improvements and negative results.

For example, Conversion rate +16% [-26%, +18%]

What does it mean?

Again, for me, nothing. Even though it shows a positive change in conversion rate, at the 95 percent confidence level the true change could be anywhere within that range, including a loss.

You also need to pay attention to the width of the interval. One end might show a very good improvement while the other shows only a slight improvement or even a decline.
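
The checks above can be written down as a small Python helper. This is only a sketch of my own rule of thumb: the function name, the wording of the verdicts, and the 20-point width limit are hypothetical, so adjust them to your own tolerance.

```python
def describe_interval(change, low, high, max_width=20.0):
    """Classify a reported percent change and its confidence interval."""
    if not low <= change <= high:
        return "inconsistent: the change lies outside its own interval"
    if low < 0 < high:
        return "inconclusive: the interval includes both losses and gains"
    if high - low > max_width:
        return "directional but imprecise: the interval is very wide"
    return "consistent: the whole interval points one way"


# The example from above: Conversion rate +16% [-26%, +18%]
print(describe_interval(16, -26, 18))
```

For the example above, the helper reports the result as inconclusive, because the interval stretches from a loss to a gain.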

Are they reliable?

If you see a P-value of less than 0.05, would you consider the result reliable?

We can guess that some outside factors, like seasonality or other unpredictable risks, can affect the results.

Other than that, it is reliable, right?

Not really.

At least one author argues that statistical significance doesn’t work and gives his own reasons for that. You can check out his article if you want.

Which calculator to use?

Some people want to calculate statistical significance themselves, and they can do so with online tools.

There are lots of them, but not all of them are accurate.

One I found useful is the VWO split test calculator, which you can also try. That is only among the free tools, though; there may be paid options available as well.

Conclusion

I hope you got something from this article about using statistical significance.

If you learn how to apply this method in your A/B tests, you can make smart decisions by finding new opportunities all the time in the ever-changing market.

Now it is your turn, what do you think about this term?

Let me know in the comments section below…

For any help regarding your PPC management, you can contact me anytime…
