The Idea
I think most people have played lotteries such as 6aus49 or Eurojackpot (if you’re in Germany or elsewhere in Europe) at least once in their lives. Some even play regularly, although, statistically speaking, gambling will lose them money in the long run. Despite this, lotteries still entice people into accepting small losses for the chance of large gains.
In this article, I want to present some results on often-cited strategic recommendations for playing lotteries that are nevertheless rarely quantified. There are at least two types. First, there are rules that promise to help select numbers which are more likely to win. For example, check out this article by tz.de, where three such rules are explained. However, as far as I know, this doesn’t actually help, so it is not my focus.
The second type, which is also mentioned in the above article, is to avoid popular numbers. The goal is not to alter the probability of winning but to improve the expected profit if a win occurs, because the prize money in a category is shared among all of its winners. Typically, it is suggested to stay away from numbers associated with dates or special patterns. For example, dates consist mainly of smaller numbers: 1 to 12 for the months and 1 to 31 for the days. In the following, I check to what extent this rule can impact the expected profit.
The Implementation
The data used here is from the regional Lotto provider Sachsenlotto. The site offers a data set covering lottery numbers from 1955 to 2017, which includes overall stakes and profit per prize category. On 4th May 2013, the official rules changed (additional numbers were dropped and new prize categories were introduced: 2 correct numbers + correct super number, etc.). For that reason, I focus on the recent period from 2014 to 2017, in which a total of 418 draws took place. The following table presents some summary statistics of the prizes per category.
Category | Mean (€) | SD (€) | Missing Obs. |
---|---|---|---|
6 correct with super number | 9,020,644 | 7,166,456 | 2 |
6 correct | 1,075,450 | 2,470,764 | 9 |
5 correct with super number | 11,627 | 5,039 | 0 |
5 correct | 3,719 | 1,320 | 0 |
4 correct with super number | 200.3 | 56.78 | 0 |
4 correct | 44.09 | 10.77 | 0 |
3 correct with super number | 21.33 | 4.48 | 0 |
3 correct | 10.56 | 1.7 | 0 |
To be clear, the main point here is to analyze the impact of the drawn lottery numbers on the expected profit within the 8 prize categories above. The lottery 6aus49 actually consists of 9 prize categories. However, I have omitted the category “2 correct with super number” because its prize is fixed at 5 €. First, the following variable is defined:
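$$X_{a,b} = \sum_{i=1}^{6} \mathbb{1}(a \le z_i \le b)$$

Here $z_1, \dots, z_6$ represent the drawn lottery numbers, and hence $X_{a,b}$ simply counts how many of the drawn numbers lie between $a$ and $b$. Note that $\mathbb{1}(\cdot)$ denotes the indicator function. Two special cases of this, namely $X_{1,12}$ and $X_{1,31}$, will be used to explain the prize $P_c$ per category $c$ using OLS as follows:

$$P_c = \beta_0 + \beta_1 X_{1,k} + \beta_2 S + \beta_3 D_{\text{Sat}} + \varepsilon, \qquad k \in \{12, 31\}$$

where $S$ is the overall stake in the corresponding draw and $D_{\text{Sat}}$ is a dummy indicating whether the draw took place on a Saturday.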
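As a rough illustration of the setup just described, the following Python sketch builds the count variable and fits the regression by ordinary least squares with NumPy. It is only a sketch under stated assumptions: the function names `count_in_range` and `fit_ols` are hypothetical, the draws, stakes, and coefficients are made-up toy values rather than the Sachsenlotto data, the bounds are treated as inclusive, and the stake is expressed in millions of euros purely to keep the toy numbers readable.

```python
import numpy as np


def count_in_range(draw, low, high):
    """Count how many drawn numbers lie in [low, high] (bounds inclusive)."""
    return sum(1 for z in draw if low <= z <= high)


def fit_ols(y, X):
    """Return the least-squares coefficients for y = X @ beta + error."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


# Toy draws (invented for illustration, NOT the real data).
draws = [
    (3, 7, 11, 25, 38, 44),
    (14, 19, 27, 33, 41, 49),
    (1, 2, 5, 9, 30, 47),
    (6, 13, 22, 31, 36, 40),
    (4, 8, 10, 12, 20, 45),
    (15, 18, 23, 29, 35, 48),
]
stake_mio = np.array([30.0, 45.0, 28.0, 50.0, 32.0, 47.0])  # assumed, in million euros
saturday = np.array([0.0, 1.0, 0.0, 1.0, 1.0, 0.0])         # assumed Saturday dummy

# Count of drawn numbers from 1 to 12 for each draw.
x_1_12 = np.array([count_in_range(d, 1, 12) for d in draws], dtype=float)

# Design matrix: intercept, count variable, stake, Saturday dummy.
X = np.column_stack([np.ones(len(draws)), x_1_12, stake_mio, saturday])

# Generate noise-free toy prizes from assumed coefficients, then recover them.
true_beta = np.array([16000.0, -1500.0, -100.0, 2000.0])
prize = X @ true_beta
beta_hat = fit_ols(prize, X)
print(beta_hat)
```

Because the toy prizes are generated without noise, the fit simply returns the assumed coefficients. With real data one would also want standard errors and t values, which a statistics package reports directly.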
The Results
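To simplify things, I present only the results for the category “5 correct with super number”, as shown in the following two tables. The first table uses the count of drawn numbers from 1 to 12 as the explanatory variable, the second the count of drawn numbers from 1 to 31.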
Term | Estimate | Std. Error | t value | Pr(>\|t\|) | Signif. |
---|---|---|---|---|---|
Intercept | 16233 | 1501 | 10.81 | 3.613e-24 | * * * |
Count of drawn numbers from 1 to 12 | -1551 | 229.4 | -6.763 | 4.623e-11 | * * * |
Overall stake | -9.449e-05 | 5.871e-05 | -1.609 | 0.1083 | |
Saturday dummy | 2328 | 1589 | 1.465 | 0.1436 | |

Term | Estimate | Std. Error | t value | Pr(>\|t\|) | Signif. |
---|---|---|---|---|---|
Intercept | 19033 | 1803 | 10.55 | 3.198e-23 | * * * |
Count of drawn numbers from 1 to 31 | -1043 | 213.6 | -4.883 | 1.498e-06 | * * * |
Overall stake | -9.105e-05 | 6.018e-05 | -1.513 | 0.131 | |
Saturday dummy | 2046 | 1628 | 1.257 | 0.2095 | |
Firstly, the results clearly indicate a highly significant effect of both count variables on the expected profit; the data support the claim that people frequently pick popular numbers. Next, and further supporting the intuition behind the idea, the effect is stronger for the count of numbers from 1 to 12 than for the count of numbers from 1 to 31. This is intuitive because smaller numbers are more likely to represent a date and hence are chosen more often. Thirdly, the effect is relatively large as well. The average prize in this category is 11,626.62 €. According to the regression model, this average decreases by 1,551.44 € for each additional number from 1 to 12 contained in the winning set, which translates to a 13.34 % difference. Note that the Saturday dummy (not surprisingly) has no significant impact. If this were not the case, then deciding which day to play on (Wednesday or Saturday) would make a difference, because people play more on Saturdays. For a clearer picture, see the following plot, where a distinct Wed/Sat pattern is observable.
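The percentage quoted above is just the regression coefficient divided by the average prize. A minimal sketch of that arithmetic, using the figures from the text and the tables (the helper name `relative_effect` is hypothetical, and the coefficient for the 1 to 31 model is taken from the rounded table estimate):

```python
def relative_effect(coefficient, mean_prize):
    """Change in prize per additional number, as a percentage of the mean prize."""
    return 100.0 * coefficient / mean_prize


mean_prize = 11626.62     # average prize in euros, as stated in the text
coef_1_to_12 = -1551.44   # effect per additional number from 1 to 12 (from the text)
coef_1_to_31 = -1043.0    # effect per additional number from 1 to 31 (rounded table value)

print(round(relative_effect(coef_1_to_12, mean_prize), 2))  # -13.34
print(round(relative_effect(coef_1_to_31, mean_prize), 2))  # -8.97
```

Under the same reading, one additional number from 1 to 31 in the winning set corresponds to a prize that is roughly 9 % lower than the average.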