sample size dice simulation

Sample size dice

I often get the question: which sample size should I take? Unfortunately, the answer I always have to give is that it depends on several factors. To illustrate this, I created the following "game", which can be used during training sessions.

I create 4 sets of 5 dice each. The numbers on the dice are adapted (I write the new numbers on paper and stick them to the dice with transparent tape):

  • set 1: 5 dice with numbers: 1.2 - 2.2 - 3.2 - 4.2 - 5.2 - 6.2 (average 3.7)
  • set 2: 5 dice with numbers: 1.0 - 2.0 - 3.0 - 4.0 - 5.0 - 6.0 (average 3.5)
  • set 3: 5 dice with numbers: 2.2 - 3.2 - 4.2 - 5.2 - 6.2 - 7.2 (average 4.7)
  • set 4: 5 dice with numbers: 0.0 - 1.0 - 4.0 - 5.0 - 8.0 - 9.0 (average 4.5, more spread)
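The averages quoted for the four sets, and the larger spread of set 4, can be checked with a few lines of Python. This is only a sketch: the names `SETS` and `face_stats` are made up for illustration, and each die is assumed to be fair.

```python
from statistics import mean, pstdev

# Face values copied from the four sets listed above.
SETS = {
    "set 1": [1.2, 2.2, 3.2, 4.2, 5.2, 6.2],
    "set 2": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "set 3": [2.2, 3.2, 4.2, 5.2, 6.2, 7.2],
    "set 4": [0.0, 1.0, 4.0, 5.0, 8.0, 9.0],
}


def face_stats(faces):
    """Return (mean, population standard deviation) of one fair die."""
    return mean(faces), pstdev(faces)


for name, faces in SETS.items():
    mu, sigma = face_stats(faces)
    print(f"{name}: mean={mu:.2f}, sd={sigma:.2f}")
```

Sets 1 to 3 share the same standard deviation because they are shifted copies of a normal die; only set 4 differs in spread.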


The participants are divided into 4 groups and each group gets one set of dice. No information is shared with the groups about which sets exist (in other words, they do not know the numbers on the dice, except for their own set). The groups are also not allowed to share information with each other, and they should not be able to see each other throw.

Each group has to throw the 5 dice and, after each throw, report the average of all dice thrown so far. For example, a group throws the 5 dice for the first time and gets the numbers 1.0, 3.0, 4.0, 3.0, 5.0. They report 3.2 as the average. On the next throw they get the numbers 2.0, 6.0, 3.0, 2.0, 5.0. They now report the average of the 10 dice thrown so far, in this case 3.4.
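The reporting rule can be written down as a small running-average helper. The function name `running_averages` is hypothetical; the two throws are the ones from the example above.

```python
def running_averages(throws):
    """Yield the average of all dice thrown so far, once per throw."""
    total = 0.0
    count = 0
    for throw in throws:
        total += sum(throw)
        count += len(throw)
        yield total / count


throws = [
    [1.0, 3.0, 4.0, 3.0, 5.0],  # first throw
    [2.0, 6.0, 3.0, 2.0, 5.0],  # second throw
]
reported = [round(avg, 2) for avg in running_averages(throws)]
print(reported)  # [3.2, 3.4]
```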

After each throw all the reported averages are written down on a white board or flip chart.

The goal of the game is to order the different sets based on their theoretical average, for example by stating that set 1 is smaller than set 2, or that set 3 is smaller than set 4. After each throw, each team can state whether one set is smaller or greater than another set. A team can make each statement only once and is not allowed to change it afterwards, so if they are not certain it is better to wait for the next throw. The statements are kept on the scoreboard. A statement has to be written on the board by the team to be valid.

The scoreboard is shown in the table below.



          Set 1      Set 2      Set 3
Set 2     > > <
Set 3     > > > >    > > >
Set 4     > > > >    > > > >


The answers are filled in from left to right. The team with set 1 (red set) was the first team to make a statement about the relationship between set 1 and set 2. The team with set 2 (green set) was second and the team with set 4 (black set) was third. The team with set 3 (blue set) has not made a statement yet. The red and green teams think that set 2 is bigger than set 1. The black team thinks that set 2 is smaller than set 1.


The scoring goes as follows. If a statement is correct and the team is the first to make it, the team gets 2 points. If the statement is correct but the team was not the first to make it, the team gets 1 point. 1 point is deducted for a wrong statement. Whether a statement is right or wrong is only revealed at the end of the game. Teams are allowed to give no answer.
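As a sketch, the scoring of one relationship on the scoreboard could be automated as follows. The function `score_cell` is hypothetical, and it assumes that "first" means the first team to write any statement for that relationship, which is one possible reading of the rule. The example uses the set 1 / set 2 entries described earlier; with averages of 3.7 and 3.5, set 2 is in fact smaller than set 1.

```python
def score_cell(statements, truth):
    """Score one relationship on the scoreboard.

    statements: list of (team, symbol) in the order they were written down.
    truth: the correct symbol for this relationship, '<' or '>'.
    Assumption: only the very first statement can earn the 2-point bonus.
    """
    points = {}
    for position, (team, symbol) in enumerate(statements):
        if symbol != truth:
            points[team] = -1   # wrong statement
        elif position == 0:
            points[team] = 2    # correct and first
        else:
            points[team] = 1    # correct but not first
    return points


# Set 2 compared with set 1: red and green wrote '>', black wrote '<'.
cell = [("red", ">"), ("green", ">"), ("black", "<")]
print(score_cell(cell, truth="<"))  # {'red': -1, 'green': -1, 'black': 1}
```

Teams that made no statement, such as blue here, simply do not appear in the result and score 0 for that relationship.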

The game stops after all teams have filled in all relationships or after 30 throws. The team with the most points wins.
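To get a feeling for what the board may look like before running a session, the throws can be simulated. This is a rough sketch, not part of the original game: `simulate_game` and its parameters are invented names, the dice are assumed fair, and the seed is arbitrary.

```python
import random

SETS = {
    "set 1": [1.2, 2.2, 3.2, 4.2, 5.2, 6.2],
    "set 2": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "set 3": [2.2, 3.2, 4.2, 5.2, 6.2, 7.2],
    "set 4": [0.0, 1.0, 4.0, 5.0, 8.0, 9.0],
}


def simulate_game(sets, dice_per_throw=5, max_throws=30, seed=None):
    """Return {set name: [reported average after each throw]}."""
    rng = random.Random(seed)
    board = {}
    for name, faces in sets.items():
        total, count, averages = 0.0, 0, []
        for _ in range(max_throws):
            throw = [rng.choice(faces) for _ in range(dice_per_throw)]
            total += sum(throw)
            count += dice_per_throw
            averages.append(total / count)
        board[name] = averages
    return board


board = simulate_game(SETS, seed=1)
for throw_no in (1, 5, 10, 30):
    row = ", ".join(f"{name}: {avgs[throw_no - 1]:.2f}" for name, avgs in board.items())
    print(f"after throw {throw_no:2d} -> {row}")
```

Running it with different seeds shows how often the reported averages of two close sets are still in the wrong order after many throws.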

Afterwards, the group can discuss how the minimum meaningful difference to be detected, the significance level, the power, and the standard deviation influence the sample size.
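For that discussion, a rough calculation can show how these four factors interact. The sketch below uses the common normal-approximation formula for comparing two means, n = (z_alpha + z_power)^2 * (var_a + var_b) / delta^2 per set. The function name `dice_per_set` and the defaults (two-sided significance level of 5 %, power of 80 %) are assumptions chosen for illustration, not values from the game.

```python
from math import ceil
from statistics import NormalDist, fmean, pvariance

SETS = {
    "set 1": [1.2, 2.2, 3.2, 4.2, 5.2, 6.2],
    "set 2": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "set 3": [2.2, 3.2, 4.2, 5.2, 6.2, 7.2],
    "set 4": [0.0, 1.0, 4.0, 5.0, 8.0, 9.0],
}


def dice_per_set(faces_a, faces_b, alpha=0.05, power=0.80):
    """Approximate number of dice to throw per set to tell two means apart."""
    delta = abs(fmean(faces_a) - fmean(faces_b))      # difference to detect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)     # two-sided significance
    z_power = NormalDist().inv_cdf(power)             # desired power
    variance_sum = pvariance(faces_a) + pvariance(faces_b)
    return ceil((z_alpha + z_power) ** 2 * variance_sum / delta ** 2)


for a, b in [("set 1", "set 2"), ("set 2", "set 3"), ("set 3", "set 4")]:
    print(f"{a} vs {b}: about {dice_per_set(SETS[a], SETS[b])} dice per set")
```

Under these assumptions, separating set 1 from set 2 needs well over a thousand dice per set, far more than the 150 dice available in 30 throws of 5, while set 2 and set 3 can be separated with a few dozen. The larger spread of set 4 pushes the number up further for set 3 versus set 4.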

Attribute sample size dice


There are many variations you can make by changing the numbers on the dice. One variation I use works with pass / no-pass values (green and red sides on the dice) instead of numbers.

The dice are as follows:

  • set 1: 10 dice (60 sides): 59 green / 1 red
  • set 2: 10 dice (60 sides): 58 green / 2 red
  • set 3: 10 dice (60 sides): 48 green / 12 red
  • set 4: 10 dice (60 sides): 45 green / 15 red

Now the percentage of bad parts (red sides) has to be reported after each throw.

This variant illustrates that attribute data requires a larger sample size in order to make reliable statements.
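A back-of-the-envelope calculation for the attribute sets could look like this. It is a sketch under simplifying assumptions: every die thrown is treated as an independent pass / no-pass trial with the set-wide red fraction (red sides divided by 60), and the normal-approximation formula for two proportions is used, which is crude for fractions as small as 1/60. The function name `dice_per_set_attribute` and the defaults (5 % two-sided significance, 80 % power) are illustrative choices.

```python
from math import ceil
from statistics import NormalDist

# Number of red sides per set of 10 dice (60 sides in total), from the list above.
RED_SIDES = {"set 1": 1, "set 2": 2, "set 3": 12, "set 4": 15}


def dice_per_set_attribute(red_a, red_b, sides=60, alpha=0.05, power=0.80):
    """Approximate dice to throw per set to tell two red fractions apart."""
    p_a, p_b = red_a / sides, red_b / sides
    z_sum = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    variance_sum = p_a * (1 - p_a) + p_b * (1 - p_b)
    return ceil(z_sum ** 2 * variance_sum / (p_a - p_b) ** 2)


for a, b in [("set 1", "set 2"), ("set 2", "set 3"), ("set 3", "set 4")]:
    n = dice_per_set_attribute(RED_SIDES[a], RED_SIDES[b])
    print(f"{a} vs {b}: about {n} dice per set")
```

Even for set 3 versus set 4, where the red fractions differ by 5 percentage points, this estimate lands at roughly a thousand dice per set, which gives a concrete number to anchor the discussion.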



This work is distributed free of charge under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. In short, you can use it for free, share it, and adapt it if you give appropriate credit. You cannot resell it as your own.