# What is Variance and why is it important in sports betting?

While it is a mysterious concept to many, variance is a very real and important part of the game. From upswings to downswings, it is a source of both excitement and utter frustration. In this article, we want to answer some of the most common questions surrounding the concept of variance and how it affects us. Chances are you’ll want to read on.

## What is it?

In statistics, variance is defined as the expectation of the squared deviation of a random variable from its mean. For those interested, the mathematical formula for the variance of a random variable X with outcomes x_1, x_2, …, x_n, each with probabilities p_1, p_2, …, p_n, looks like this:

$$\operatorname{Var}(X) = \sum_{i=1}^{n} p_i \,(x_i - \mu)^2, \qquad \mu = \sum_{i=1}^{n} p_i \, x_i$$

In plain English, however, variance is a measure of how far a set of numbers is spread out from its average.

The numbers can be anything, for example heights in a given group of people. Let’s say you have two groups of people, both with an average height of 180 cm. In one group, everyone has approximately the same height, so if you were to choose one person at random, you’d be fairly sure that the person you selected would be roughly 180 cm. The other group, however, consists of only very short and very tall people. The average is the same, but if you were to choose people from this group at random, your results would fluctuate more than with the first group. In this case, the variance for the second group is greater than for the first.
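To make the height example concrete, here is a minimal sketch using Python's standard `statistics` module. The two lists of heights are hypothetical values picked only so that both groups average 180 cm; they are not data from this article.

```python
from statistics import mean, pvariance

# Hypothetical heights in cm, chosen only for illustration.
group_similar = [178, 179, 180, 181, 182]  # everyone close to 180 cm
group_mixed = [150, 155, 205, 210]         # only very short and very tall people

for name, heights in [("similar", group_similar), ("mixed", group_mixed)]:
    # pvariance is the population variance: the mean squared deviation from the average.
    print(f"{name}: mean = {mean(heights)} cm, variance = {pvariance(heights)}")
```

Both groups share the same mean, but the mixed group has a far larger variance, which is exactly the difference described above.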


## Why is it important?

The above example can be translated into investing and sports trading. Substitute people’s heights with different profits, and you’ll see where we’re going. As an example, consider the following investment options and the probability of different outcomes:

Investment A

Investment probability

Investment B

Investment probability

As you can easily calculate, both options yield an expected value of 100. Variance, however, is much larger in the first case, making it a far riskier option.
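The sketch below shows how such a comparison can be calculated. The outcome tables are hypothetical stand-ins (the original figures are not reproduced here), chosen so that both options have an expected value of 100 and option A is the more spread-out one.

```python
def expected_value(outcomes):
    """Probability-weighted average of (profit, probability) pairs."""
    return sum(p * x for x, p in outcomes)

def variance(outcomes):
    """Probability-weighted squared deviation from the expected value."""
    mu = expected_value(outcomes)
    return sum(p * (x - mu) ** 2 for x, p in outcomes)

# Hypothetical (profit, probability) pairs, for illustration only.
investment_a = [(1000, 0.5), (-800, 0.5)]  # large swings
investment_b = [(150, 0.5), (50, 0.5)]     # small swings

for name, outcomes in [("A", investment_a), ("B", investment_b)]:
    print(f"Investment {name}: EV = {expected_value(outcomes)}, "
          f"variance = {variance(outcomes)}")
```

With these made-up numbers both options return 100 on average, yet A's variance is several hundred times larger than B's.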

High return and low risk is what you want from any investment instrument, whether it is in financial or sports markets. The lesson is simple: Variance = risk. You want to reduce it.

## What does the number represent?

Variance is basically a measurement of how much you can expect any outcome to vary from what you expect it to be. In order to draw any conclusions from the number itself, take its square root to obtain the standard deviation:

$$\sigma = \sqrt{\operatorname{Var}(X)}$$

Now, this new number can be used to determine the probability of profits staying within different ranges. Assuming the outcomes are normally distributed, there’s a 68 % chance that the value will stay within one standard deviation, 95 % chance it will stay within two, and 99.7 % within three:

Normal distribution
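The 68 %, 95 % and 99.7 % figures can be checked with a few lines of Python. This is a small sketch that assumes a normal distribution, as the text does; `prob_within` is a helper name introduced here for illustration.

```python
from math import erf, sqrt

def prob_within(k):
    """Probability that a normally distributed value lies within k standard deviations of its mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.1%}")
```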

In sports trading, the curve would represent different outcomes after a certain number of trades, and their probability. It would be symmetrical around its expected value (rather than around 0 as in the figure), illustrating that your profits are equally likely to end up above and below what is expected.

The point is once again that reducing variance means reducing risk. That’s valuable when sports trading.

## Why is the number of trades important?

If you haven’t read our article on the law of large numbers, we recommend that you do so. Anyway, one of its major points is that the outcome of an individual bet is negligible in the long run. What matters is that the average of all results converges towards the expectation.

Consider a coin flip with 50 % probability of each outcome. You do a series of trials and register heads as a success (1) and tails as a failure (0). You expect the mean of your trials to approach 0.5 after a while, since the numbers of ones and zeros should be roughly the same. However, the question remains: How close to 0.5 can you expect it to be?

As it turns out, it depends on how many trials you conduct. After only a few coin flips, anything can happen. For example, there’s still a 6.25 % chance you’ll lose all of the first four, making the mean 0. If you increase the number of trials, however, the mean can be approximated as a normally distributed random variable. In other words, its probability distribution would look like the bell-shaped curve illustrated above (symmetrical around 0.5).

In fact, we can calculate how much the mean is expected to vary by utilizing the following formula:

$$\operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n}$$

Here σ² is the variance of a single trial and n is the sample size. You don’t need to know exactly what the above formula means, but note that the variance of the mean is inversely proportional to the sample size. In other words, when the sample size n increases, the variance of the mean decreases.

Allow us to illustrate. The following three graphs each show the probability distribution of the mean after a series of coin flips, but with a different number of trials. The purple one corresponds with 10 trials, the red with 100 trials, and the blue with 1000:

Simulation of coin flips

The point here is that as you increase the number of coin flips, the mean is increasingly likely to be close to 0.5. If the number of trials is too small, variance will dominate results, making them unstable.
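As a rough numerical check of the three curves, the sketch below computes the standard deviation of the mean for 10, 100 and 1000 fair coin flips. It relies on the standard result that a single 0/1 trial with success probability p has variance p(1 - p); the function name `sd_of_mean` is introduced here for illustration.

```python
from math import sqrt

def sd_of_mean(n_trials, p=0.5):
    """Standard deviation of the average of n_trials independent 0/1 trials."""
    single_trial_variance = p * (1 - p)  # 0.25 for a fair coin
    return sqrt(single_trial_variance / n_trials)

for n in (10, 100, 1000):
    print(f"{n:>5} flips: mean is 0.5 give or take {sd_of_mean(n):.4f}")
```

Each tenfold increase in the number of flips shrinks the spread of the mean by a factor of about 3.2 (the square root of 10), which is why the curve for 1000 trials is so much narrower than the one for 10.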

Therefore, it’s important for any value bettor to place a large number of trades. If you combine valuable bets (positive expected value) with a large volume of trades, you will significantly increase your chances of being profitable in the long run. As an example, this is the probability distribution of different profits after betting on 1000 coin flips with a 5 % edge (odds 2.10):

Simulation of 1000 coin flips

## How to reduce it?

Hopefully, you’ve gained some insight on what variance is and why you want to minimize it. As to how you might do that, we’ve made an article on the subject already. Check it out here.

# The Law of Large Numbers for Sports Betting

If you've read our article on value betting, you've learned how edges occur in sports betting, and that good bets are characterized by a positive expected value. The question remains how to transform your edge into our ultimate goal: long-term profits.

Let’s take a look at a coin flip with odds of 2.10 on heads. The probability is 50 %, so the edge is 5 % (2.10/2.00 = 1.05). With a stake of \$10 and a potential profit of \$11, one trial gives an expected value of \$0.5 (\$11*0.5 - \$10*0.5 = \$0.5). However, you won’t have \$0.5 in profits after the first toss. You will either be \$11 up or \$10 down. To see how this looks, we can plot the different outcomes and their corresponding probability:

Results of a coin toss

The chart shows the only two possible outcomes after the first trial, and that both are equally likely. In other words, after only one coin toss, there’s a 50 % chance that you’ll win money and a 50 % chance that you’ll lose. Strictly speaking, that is a risky investment.

When we flip the coin a second time, there are four equally likely sequences of results but three distinct profits: \$22 up (25 % probability), \$1 up (50 %) and \$20 down (25 %). The expected value is \$1 (2 tosses * \$0.5), and with more possible outcomes, the probability distribution now looks like this:

Expected value of a coin toss

As illustrated, only one of the outcomes results in negative profits. In fact, the probability of losing money has gone down to 25 %, only half of what it was after the first trial. Then again, having a 25 % chance of losing money is still way out of our comfort zone.
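The one-toss and two-toss distributions can be reproduced by listing every possible sequence of wins and losses. The following is a minimal sketch using the example's figures (\$11 profit on a win, \$10 loss otherwise); the function name is introduced here, and brute-force enumeration is only practical for a handful of tosses.

```python
from collections import Counter
from itertools import product

WIN, LOSS = 11, -10  # profit per toss: $10 stake at odds 2.10

def profit_distribution(n_tosses):
    """Map each possible total profit to its probability after n_tosses fair tosses."""
    counts = Counter(sum(seq) for seq in product((WIN, LOSS), repeat=n_tosses))
    total_sequences = 2 ** n_tosses
    return {profit: count / total_sequences for profit, count in sorted(counts.items())}

for n in (1, 2):
    dist = profit_distribution(n)
    prob_loss = sum(p for profit, p in dist.items() if profit < 0)
    print(f"{n} toss(es): {dist}, probability of a loss = {prob_loss}")
```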

The law of large numbers states that the mean of the results obtained from a large number of trials will get close to its expected value. This means that if we toss the coin many times, we should expect it to show heads and tails approximately the same number of times.

Therefore, let’s see what happens if we keep increasing the sample size. Naturally, as the number of possible outcomes increases, winning every bet becomes highly unlikely (probability = (1/2)^(sample size)), and it becomes equally unlikely to lose every bet. If we map out the different outcomes and their probabilities after 1000 trades, we get the following bell-shaped curve:

Now, there are several observations to be made. First, we know that the expected value should be \$500 (\$0.5 * 1000 tosses). This is also reflected in the probability distribution, which is symmetric around 500. That means that you are equally likely to end up above and below \$500.

Furthermore, the chance of negative returns is a lot lower than before. In fact, we can calculate the probability of losing money after 1000 trades to be 7 % (you’ll have to trust us on this one). We can also plot the same probability for a number of sample sizes, which turns out like this:

Probability of being profitable after X number of sports trades

For example, if you want to be 98 % certain to make a profit, you would need a sample size of almost 1900. Increase the number to 5000, and you are theoretically going to make a profit with a certainty of 99.96 %. Moreover, you will earn more than the expected value of \$2500 half of the time.
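For readers who would rather check these probabilities than take them on trust, here is a minimal sketch that computes the exact chance of being down after a given number of coin-flip bets. It assumes the example's setup (fair coin, \$11 profit on a win, \$10 loss otherwise) and counts breaking even as not losing; the function name `prob_of_loss` is introduced here.

```python
from math import comb

def prob_of_loss(n_bets, win=11, loss=10):
    """Exact probability of a negative total profit after n_bets fair coin-flip bets."""
    # Profit with k wins is win*k - loss*(n_bets - k); it is negative
    # exactly when k * (win + loss) < loss * n_bets.
    max_losing_wins = (loss * n_bets - 1) // (win + loss)
    losing_sequences = sum(comb(n_bets, k) for k in range(max_losing_wins + 1))
    return losing_sequences / 2 ** n_bets

for n in (1, 2, 1000, 1900, 5000):
    print(f"{n:>5} bets: probability of a loss = {prob_of_loss(n):.4%}")
```

The results for 1 and 2 bets match the 50 % and 25 % found earlier, and the larger sample sizes should land close to the figures quoted above.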

The lesson is simple. As long as your sample size is small, placing valuable bets is only part of the story and risk is still highly present. By increasing your sample size of +EV bets you will reduce the risk of negative returns, and your results will converge towards their expected value.

That is the power of the law of large numbers.

# Variance in Sports Betting and Trading Explained

## SPORTS BETTING 101

This video explains variance in sports betting and trading. Understanding variance is key for any bettor whose goal is to become a professional sports bettor and do sports betting for a living. It is hosted by Marius from Trademate Sports.

# Big Data Analysis: Is Trademate Profitable?

Short answer: Yes. The Trademate community has made a total turnover of 18 556 824 GBP with a profit of 294 175 GBP. So for every 1 GBP bet, 1.0158 GBP has been won. In other words, an average ROI of 1.58% per pound bet. The purpose of this article is to investigate to which degree Trademate is profitable, and how the profitability differs between the soft and sharp bookmakers.
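The ROI figure is simply profit divided by turnover. A quick sketch with the two totals quoted above (the variable names are introduced here):

```python
turnover_gbp = 18_556_824  # total amount staked by the community
profit_gbp = 294_175       # total profit

roi = profit_gbp / turnover_gbp
print(f"ROI: {roi:.4%}")
print(f"Returned per 1 GBP bet: {1 + roi:.4f} GBP")
```

The unrounded result sits a fraction above the 1.58% quoted in the text, which appears to be truncated rather than rounded.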

# Basic Assumptions

For Trademate to be profitable we expect the data to show that we have an edge on both the soft bookmakers and the sharp bookmakers as measured by whether or not we are able to beat the vig free closing line over time. No matter which strategy you are following to become a profitable sports bettor, we always advocate measuring results vs the closing line. You can read more about the closing line here

Thus what we expect to see when we investigate the Trademate Community data is that there exists a strong correlation between the average flat ROI and the vig free closing line over a large sample size.

## Sample size selection and limitations

The data that you will see next are the trades registered by the community since we launched Trademate at the beginning of October, with a few exceptions. First, similar to the first big data stream, we have removed all of the half wins and half losses. Second, we have removed all trades where the edge placed = closing edge. This was done to adjust for the issue we have had over the last couple of weeks where the closing lines did not update. This issue has been fixed. However, a consequence is that we excluded some really good trades that were placed close to kick-off, as well as earlier trades where the edge placed ended up being equal to the closing edge. So if anything, the closing edges that you will see in the images are a bit conservative. Third, Bovada has been removed, as there were multiple cases of the closing line not registering at all. Fourth, our latest bookmaker additions have not been included, but the sample size on those is rather small anyway (<4000).

## Flat stake sizing for analysis. Proportional stake sizing for actual trading

We are using flat stakes of 100. Both the ROI and the profits that you can see are thus from flat stake sizing. Similar to the last big data stream, this is done to remove the impact that bet sizing would have on the results. Note that we do not consider flat stakes an optimal strategy when trading, and we advocate the use of a proportional staking strategy based on the Kelly Criterion. The topic of stake sizing and bankroll management has been covered in this article and this video.

The advantage of looking at the overall data is that we get a view of the overall profitability and performance of Trademate. The weakness is that it includes multiple bets per game, all odds ranges, all edges and all times before kick-off. Thus it is not game theory optimal. For instance, multiple users placing the same trade can skew the profits both up and down depending on the outcome. This is why we will also look at what happens when we select 1 random trade per game for different subsets.


Overall, Trademate is beating the closing line by an average of 2.7% over a sample size of 120k trades, which is great. So why is there a discrepancy between the ROI and the closing edge?

A known issue with Trademate is that Tennis results are not always updated correctly. This occurs because Tennis games are often moved back and forth in starting time. So with regard to Tennis, our data is only as good as our users are at manually double checking that the results are correct.


When we look at the Tennis data alone, we can see that either the wrongly updated results are skewing our profit data (likely explanation), or Tennis is the only individual sport where Trademate fails to make a profit (unlikely explanation) despite having a strong positive closing edge. In the rest of this article, we will exclude Tennis from the data set.

Even in the 113k sample size, there is still a discrepancy between the flat ROI and the avg. closing edge. One explanation for this is that the actual profits will vary based on the actual game outcomes. Since we are looking at all of the data the ROI could be skewed if there are some games where a lot of our users place the same bet.

Therefore we will next look at 1 randomly selected trade per game. To clarify: if person A bet on the 1x2 home team, person B bet on the -0.5 handicap and person C bet on over 2.5, then only one of these bets will be included. If the game ended 1-0, two of the trades would win and one would lose. If we run the selection multiple times, different trades will be selected, which is why the ROI can vary.
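The selection step can be sketched as follows. This is an illustrative sketch, not Trademate's actual pipeline: the record fields (`game_id`, `market`, `odds`, `won`), the odds values and the helper names are all hypothetical, and the first game mirrors the 1-0 example above.

```python
import random
from collections import defaultdict

def one_random_trade_per_game(trades, rng):
    """Group trades by game and keep one randomly chosen trade from each game."""
    by_game = defaultdict(list)
    for trade in trades:
        by_game[trade["game_id"]].append(trade)
    return [rng.choice(game_trades) for game_trades in by_game.values()]

def flat_roi(trades, stake=100):
    """ROI when every trade is placed with the same flat stake."""
    profit = sum(stake * (t["odds"] - 1) if t["won"] else -stake for t in trades)
    return profit / (stake * len(trades))

# Hypothetical trades; game G1 ended 1-0, so two of its three trades won.
trades = [
    {"game_id": "G1", "market": "1x2 home", "odds": 2.10, "won": True},
    {"game_id": "G1", "market": "-0.5 handicap", "odds": 2.05, "won": True},
    {"game_id": "G1", "market": "over 2.5", "odds": 1.95, "won": False},
    {"game_id": "G2", "market": "1x2 away", "odds": 3.20, "won": False},
]

for seed in (1, 2, 3):
    sample = one_random_trade_per_game(trades, random.Random(seed))
    print(f"run {seed}: {[t['market'] for t in sample]}, flat ROI = {flat_roi(sample):.1%}")
```

Different seeds pick different trades from the same game, so the flat ROI of the subset changes from run to run even though the underlying data does not.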

#1 - 1 random bet per game. Sample size: 25k trades

#2 - 1 random bet per game Sample size: 25k trades

Now why is there still a discrepancy between the flat ROI and the closing edge? It could be that even with a sample size of 113k trades in total and 25k unique trades, it is still not enough for the variance in the higher odds ranges to even out and for the ROI to converge towards the closing edge. As stated previously, taking all trades is not game theory optimal either. So what happens if we start to tighten up our presets?

All trades (No tennis). All edges, Odds 0-5. Hours before kickoff 0-5. Sample size: 89k trades

Once we narrowed our presets and reduced our odds range the correlation between the ROI and closing line appears to be stronger.

So next let’s have a look at the odds above 5.

All trades (No tennis). Odds: 5-100. Time: 0-100. Edge: 0-100. Sample size: ≈ 7k trades.

Variance

The closing edge looks great at 6.8%, but the ROI is -4.1%. The sample size is only 6830 trades, which indicates that the variance has not evened out. It takes a larger sample size for the variance to even out at odds of 10 than at odds of 2, because the win probability is only 10% versus 50%. So if you place 100 trades in each range, the deviations of the actual ROI from the theoretical ROI will be larger at odds of 10 than at odds of 2. You can read more about the topic of variance and how to reduce it here.
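To put rough numbers on this, the sketch below compares the standard deviation of the flat-stake ROI after 100 bets at odds of 2 and at odds of 10. It assumes fair odds (win probability = 1/odds, so zero edge) and independent bets purely to isolate the effect of variance; `roi_sd` is a name introduced here.

```python
from math import sqrt

def roi_sd(odds, win_prob, n_bets):
    """Standard deviation of flat-stake ROI over n_bets independent bets."""
    # Per unit staked, a bet returns (odds - 1) with probability win_prob and -1
    # otherwise. The two outcomes are `odds` apart, so the per-bet variance is
    # win_prob * (1 - win_prob) * odds**2.
    per_bet_variance = win_prob * (1 - win_prob) * odds ** 2
    return sqrt(per_bet_variance / n_bets)

for odds in (2, 10):
    sd = roi_sd(odds, 1 / odds, n_bets=100)
    print(f"odds {odds:>2}: ROI after 100 bets typically swings by about ±{sd:.0%}")
```

Under these assumptions the typical swing at odds of 10 is three times as large as at odds of 2, so roughly nine times as many bets are needed to reach the same level of stability.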

Favorite-Longshot bias: Overvaluing underdogs and undervaluing favorites.

Another possible explanation for why the discrepancy is larger in the higher odds ranges, besides variance, is that there really is something to the favorite-longshot bias. This would mean that bookmakers have a higher margin on the underdogs, causing the odds you are offered to be lower than the true odds, and lower margins on the favorites. From a business point of view it makes sense to need higher margins on goods that are sold more rarely (underdogs) and to accept lower margins on goods that sell in higher volume (favorites). If the favorite-longshot bias is in fact real, which industry research also suggests, it would imply that edges on really low odds are currently undervalued by our algorithm, while edges on higher odds are overvalued. For example, it could be that a 0.1% edge at odds of 1.2 really is a 1% edge, or that a 4% edge at odds of 7 really is only a 3% edge. Please note that these edge percentages are arbitrarily selected for illustrative purposes. We are currently working on adjusting our algorithm to take the favorite-longshot bias into consideration. As a temporary solution, it would be wise to employ a margin of safety of 1-2% when betting in the higher odds ranges, 5 and above.

#1 Random Odds 0-5. Hours before kickoff 0-5. Sample size: 22k trades

#2 Random Odds 0-5. Hours before kickoff: 0-5. 22k trades

When we select one random trade per game, there still appears to be a discrepancy.

Next, we tighten up the odds range and time before kick-off even further:

All edges. Odds: 0-2,5. Hours: 0-2,5. Sample size: 53k trades

#1 Random. All edges. Odds: 0-2,5. Hours: 0-2,5. Sample size: ≈17k trades

#2 Random. All edges. Odds: 0-2,5. Hours: 0-2,5. Sample size: ≈17k trades

The correlation between the ROI and closing edge starts to get stronger. In the overall subset with odds up to 2.5 and up to 2.5 hours before kick-off, the ROI is even higher than the closing edge, which indicates a good run.

# The Asian bookmakers

On all trades the community is beating the vig-free closing line by 0.3%, with an ROI of 0.6%, so there actually appears to be a slight good run. It is likely that the closing lines for the sharps have been more affected than those for the softs by removing the trades where edge placed = closing edge, because the majority of trades in Asia are placed closer to kick-off. This would imply that the actual average closing edge is slightly higher. In the raw data that includes all trades, even those where the closing lines did not update, the flat ROI on the sharps is -0.06% and the average closing edge is 0.99%. The actual average closing edge therefore lies somewhere between 0.3% and 0.99%. That the flat ROI deviates from the closing edge is not surprising when we are looking at a sample that includes all trades placed, which is not an optimal strategy on the Asians. When the average closing edge is below 1%, there is also only a fine margin: an ROI that runs 1% above or below expectation is the difference between making profits and losing money. Also, with a breakeven flat ROI on the sharps under flat stake sizing, one would expect a proportional stake sizing strategy to bring profits, because it places larger bets on lower odds (larger win probability) and on higher edges (more value).

Let's have another look at Tennis.

TENNIS ASIAN ONLY. Sample size: 700 trades

I do not know exactly what conclusion to draw from this data. I would expect the sharp traders to be more thorough in double checking their Tennis results: as the overall edges are smaller there, it is a lot more important to keep track of performance. The closing edge is solid, but the results are just insane. So either Tennis is extremely profitable on the sharps, there is a crazy good run, or something is again wrong with the reporting of results.

Tennis has once again been excluded in the remaining data presented.

What does it look like if we use 1 random bet per game?  Here are 3 tries. Sample size: 12k trades

I think we can all agree that one should not take every trade that exists on the sharps, and should instead use a rather narrow preset.

Odds: 0 - 2,5 Hours: 0 - 2,5. Sample size: ≈ 17k trades

#1 Random trade per game. Odds: 0-2,5 Hours: 0 - 2,5. Sample size: 9k trades

#2 Random trade per game. Odds: 0-2,5 Hours: 0 - 2,5. Sample size: 9k trades

#3 Random trade per game. Odds: 0-2,5 Hours: 0 - 2,5. Sample size: 9k trades

Overall, the correlation between the ROI and closing edge appears to be strong. That the ROI is higher in one of the simulations could be explained by a good run, or by the actual closing edge, and thus the closing EV, being slightly higher. Two other important points to keep in mind: a) the sharps do not limit, and b) a flat staking strategy, as used in this analysis, is not a game theory optimal strategy. With a proportional staking strategy one would expect the actual profits to be better than the flat profits.

Also remember that the goal of a sports trader, and the purpose of Trademate, is to beat the no-vig closing line. The reason we say that we want to place our bets as close to kick-off as possible is that the edge is then more likely to remain when the game kicks off, leading to a higher average closing edge. But this is only one way of beating the closing line. If you are able to beat the closing line by taking earlier positions, then that is great, as long as you are aware of the higher risk you take by placing trades when the odds are more volatile. Another point to emphasize is that Trademate is a tool traders can use to identify +EV bets prior to kick-off, but it is your job as a trader to make sure that the trades you take end up with a positive closing EV.

# Soft bookmakers

Finally, let’s have a quick look at the softs. I think the numbers speak for themselves here, and how to interpret them has already been covered previously in this post.

All trades (No tennis). Sample size: 89k

Odds: 0 - 5 Hours 0-5. Sample size: 69k

1 Random trade per game. Odds: 0 - 5 Hours 0- 5. Sample size: 18k

# CONCLUSION

Again I would like to emphasize that we benchmark results versus the no vig closing line. If you do not believe that beating the closing line will lead to profits in the long term, then we are not the service for you. Also we deliver the tool, but it is your job as a trader to pick trades that end up with a positive closing edge. We do our best to help you on the way by sharing our information and knowledge. We host regular big data streams where we dive into the community stats. In the latest video we also looked at the data for individual sports. However this video is available for customers only.

Calculating Expected Profits

If you are unfamiliar with the concept of Expected Value, read this before continuing

You can calculate your expected monthly profits by using the following formula:

Expected Profits = ( Starting Bankroll * (1 + Average ROI)^Number of times you turn over your bankroll in a month ) - Starting Bankroll

With a starting bankroll of €5 000 on the soft bookies, a 2.2% average flat ROI, and given that you manage to turn over your bankroll twice per week (8 times per month), you could expect to earn:

€5 000 * 1.022^8 ≈ €5 950, a profit of €950. If you bring all your money into the next month, you will have €5 950 * 1.022^8 ≈ €7 081, a profit of €1 131. So the compound growth gives a potentially high profit growth. With an average stake size of €25 per bet, or 0.5% of the total bankroll, one would need about 200 bets to turn over the bankroll once, which is roughly 400 bets per week at two turnovers per week. If you start with a lower bankroll, you will need a higher turnover, meaning more trades to make the same expected profits. Also, note that one would expect the actual profits to be higher in practice, because using a proportional staking strategy such as the Kelly Criterion is superior to a flat staking strategy and results in an actual ROI that is larger than the flat ROI used in the analysis of the community data.
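The compounding calculation can be sketched in a few lines. The function name and parameters are introduced here for illustration; the inputs are the ones from the example (2.2% ROI per turnover, 8 turnovers per month).

```python
def expected_profit(bankroll, roi_per_turnover, turnovers):
    """Expected profit when the whole bankroll is turned over repeatedly and profits are reinvested."""
    return bankroll * (1 + roi_per_turnover) ** turnovers - bankroll

month_1 = expected_profit(5000, 0.022, 8)
month_2 = expected_profit(5000 + month_1, 0.022, 8)

print(f"Month 1 expected profit: €{month_1:.0f}")
print(f"Month 2 expected profit: €{month_2:.0f}")
```

Note that this is an expectation only: it ignores variance, betting limits and any change in ROI as stakes grow, so actual monthly results will fluctuate around these figures.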

Steps to reduce variance

Also note that value betting inherently contains a lot of variance. This will cause the actual profits in a given month to fluctuate around the expected profits. The variance does even out over a large sample size, but an early bad run can have a detrimental effect on your bankroll. Thus it makes sense to take steps to reduce the variance, such as:

1. Bet on lower odds ranges (E.g. < 3 )

2. Use 30% Kelly stake sizing

3. Bet closer to kick-off, when the odds are more stable.

Following these steps will also reduce your potential turnover and thus the pace of your profit growth. So as a sports trader you need to evaluate to which degree you can tolerate risk and how it fits with your overall strategy. At a minimum one should use a 30 - 50% Kelly. However one can go for higher odds ranges, just know that you might need to reach a couple of thousand trades for the variance to even out.
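As an illustration of fractional Kelly stake sizing, the sketch below uses the standard Kelly formula for a single bet at decimal odds and then scales it down to 30%. The function names and the example inputs (a €5 000 bankroll, odds of 2.10, an estimated 50% win probability) are assumptions chosen for illustration; this is not a description of how Trademate sizes stakes.

```python
def kelly_fraction(odds, win_prob):
    """Full-Kelly fraction of the bankroll for a bet at decimal odds."""
    edge = win_prob * odds - 1          # expected profit per unit staked
    return max(edge / (odds - 1), 0.0)  # never stake on a bet without an edge

def fractional_kelly_stake(bankroll, odds, win_prob, kelly_multiplier=0.30):
    """Stake for a fractional Kelly strategy, e.g. 30% Kelly."""
    return bankroll * kelly_multiplier * kelly_fraction(odds, win_prob)

stake = fractional_kelly_stake(bankroll=5000, odds=2.10, win_prob=0.5)
print(f"Full Kelly fraction: {kelly_fraction(2.10, 0.5):.2%}")
print(f"30% Kelly stake on a €5000 bankroll: €{stake:.2f}")
```

Because the stake scales with the edge and shrinks as the odds grow, a proportional strategy like this automatically puts less money on high-variance longshots than flat staking does.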