# Keno or LOTTO style

#### panz

##### New Member
So, here is how the game goes.
Suppose that the game uses the numbers 1 through 50 and that the operator selects eight of these. If the bettor selects five numbers, find the probability that there are exactly five matches.

I'm not so sure of what to do here. To start, and perhaps to save the calculations for people who will help: the value of 50 choose 8 is 536,878,650, and the value of 50 choose 5 is 2,118,760.

Anyway, what I did is (45 choose 3) / (50 choose 8) = 0.00002643. Is this right?

#### JohnM

##### TS Contributor
You were pretty close.

n = total numbers = 50
y = number chosen = 8
x = number to match = 5

P(x matches) = [ yCx * (n-y)C(y-x) ] / nCy

= [ 8C5 * 42C3 ] / 50C8

= 56 * 11480 / 536878650

= 0.0011974
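
As a quick check of the arithmetic above, here is a minimal Python sketch using the standard library's `math.comb`. The variable names are only illustrative and are not from the thread; the sketch verifies the numbers as posted and nothing more.

```python
from math import comb

# Binomial coefficients used in the calculation above
ways_to_match = comb(8, 5)    # 8C5
ways_to_miss = comb(42, 3)    # 42C3
all_draws = comb(50, 8)       # 50C8

probability = ways_to_match * ways_to_miss / all_draws

print(ways_to_match, ways_to_miss, all_draws)
print(f"{probability:.7f}")
```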

Here's a link that talks about a similar example (where 20 are drawn from 80)
http://wizardofodds.com/keno

#### jerryb

##### New Member
John,

I'm not crazy about that formula, let me tell you why:

the numerator multiplies the number of ways that 5 matches could be made from the 8 chosen numbers (that seems OK) by the number of ways that 3 numbers NOT SELECTED into the eight could be chosen from the remaining 42??

seems to me that should be 8C5*45C3 which would give a slightly higher probability.

my reasoning is that if i select my 5 first (as is done in most lottery games) then there are 8C5 ways that my numbers could match the draw of eight and there are 45C3 possible ways that the non-matching numbers could be drawn.

Am I missing something here? or is the "wizard of odds" in error?

cheers
jerry

#### JohnM

##### TS Contributor
Jerry,

Let me think through this some more - you raise some good points.

John

#### JohnM

##### TS Contributor
I'm finding a lot of hits on google that explain "keno" as an example of a hypergeometric probability, which is the formula I originally posted.....

n = total numbers = 50
y = number chosen = 8
x = number to match = 5

P(x matches) = [ yCx * (n-y)C(y-x) ] / nCy

#### quark

jerryb said:
"seems to me that should be 8C5*45C3 which would give a slightly higher probability.

my reasoning is that if i select my 5 first (as is done in most lottery games) then there are 8C5 ways that my numbers could match the draw of eight and there are 45C3 possible ways that the non-matching numbers could be drawn."

Hi Jerry,

If you use 45C3, then 3 of the 45 numbers will be chosen by the operator, and the number of matches can be more than 5. I think we need to use 42C3 to guarantee exactly 5 matches.

#### jerryb

##### New Member
seems like maybe you have never played keno?

you can't possibly match more than five numbers if you only choose five numbers on your ticket. in the keno games that i have played the operator draws some fixed number of numbers on each game, but the player can draw from as few as 1 to as many as the operator. if you read the original problem the player is only selecting five numbers, so there are a total of 45 numbers that will not match which can be chosen for the remaining 3 that the operator will select.

i believe that the formula posted is likely to be for the following scenario:

game has 50 numbers
operator selects 8
player selects 8

find the probability of matching exactly five

in this case then the formula posted works (to my common sense anyway)

BUT, that was not the original problem posted.

i think this is where the discrepancy lies. if i have some free time this afternoon i'll read up on the hypergeometric reference to keno games.

cheers
jerry

#### JohnM

##### TS Contributor
For what it's worth, I ran a simulation on this - 600,000 runs, and the probability converged to 0.001163, which is pretty darn close to:

= [ 8C5 * 42C3 ] / 50C8

= 56 * 11480 / 536878650

= 0.0011974

#### jerryb

##### New Member
I went and re-read that wizard of odds page that you linked to john and i found this

"The overall general formula for the probability of x matches and y marks is combin(y,x)*combin(80-y,20-x)/combin(80,20).

As an example let's find the probability of getting 4 matches given that 7 were chosen. This would be the product of combin(7,4) and combin(73,16) divided by combin(80,20). combin(7,4) = 7!/(4!*3!)= 35. combin(73,16) = 73!/(16!*57!)=5271759063474610. combin(80,20) = 3535316142212170000. The probability is thus (35*5271759063474610)/3535316142212170000 =~ 0.052190967 ."

in the game of his example there are a total of 80 numbers and the operator chooses 20, thus the denominator of 80C20. but note that the numerator is built upon the number of "marks" made by the bettor and the number of matches out of the 20. the marks are the numbers selected by the player. so for his example:

total numbers 80
operator selects 20
player selects 7 = y
player matches 4 = x

so for the situation of this problem 80 becomes 50, 20 becomes 8, the 7 numbers selected becomes 5 and the 4 matches becomes 5.

thus his formula is

combin(5,5)*combin(45,3)/combin(50,8)

which my calculator shows to be .0000264, the original poster's value.
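
The general formula quoted above can be written as a small function. This is only a sketch: the name `keno_probability` and its parameter names are made up here for illustration, and the first call reproduces the 80/20/7/4 example from the quoted page before the second applies the same formula to the 50/8/5/5 problem.

```python
from math import comb

def keno_probability(total, drawn, marks, matches):
    """P(exactly `matches` of the player's `marks` numbers are among the `drawn`).

    total   -- how many numbers are in the game (80 in the quoted example)
    drawn   -- how many numbers the operator draws (20 in the quoted example)
    marks   -- how many numbers the player selects
    matches -- how many of the player's numbers must be drawn
    """
    return (comb(marks, matches)
            * comb(total - marks, drawn - matches)
            / comb(total, drawn))

# Quoted example: 7 marks, 4 matches, 20 drawn from 80
print(keno_probability(80, 20, 7, 4))

# This thread's problem: 5 marks, 5 matches, 8 drawn from 50
print(keno_probability(50, 8, 5, 5))
```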

in your original response you wrote

"y = number chosen = 8"

y in that formula should be the number chosen by the player. you let that be the number chosen by the operator, and so did i on my previous post. in this problem the number chosen by the player is 5 and the number chosen by the house is 8. i still think something is not right here, i'll remain unconvinced for now that any of these potential answers is correct.

how did you set up your simulation? it does seem to match up with your calculation.

cheers
jerry

#### JohnM

##### TS Contributor
It was done in Excel. I basically did a random sort on 50 numbers, then I arbitrarily selected 8, and looked for matches against the "top 8" in the random sort. If there was a match of 5, I set up another range where a "1" would be written, "0" otherwise. After 30,000 runs, I just sum the range with 1's and 0's to get the number of "successes."

But you're right now that I took a closer look - I was basing calculations on the player choosing 8 and having to match exactly 5 of the 8. But now that I read it again, the operator chooses 8, the player chooses 5, and needs to match all 5.

This would result in a much lower probability of winning....so I adjusted the simulation in Excel, and in the first 30,000 runs, there were no instances of matching 5.
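
For anyone who would rather not do this in Excel, a rough Python equivalent of the simulation described above might look like the sketch below. It is not the spreadsheet that was actually used: the function name, the fixed seed, and the trial counts are assumptions chosen for illustration.

```python
import random

def simulate_exact_matches(total, drawn, marks, matches, trials, seed=0):
    """Estimate P(exactly `matches` of the player's `marks` numbers are drawn)."""
    rng = random.Random(seed)           # fixed seed so a run can be repeated
    numbers = range(1, total + 1)
    ticket = set(range(1, marks + 1))   # which numbers are marked does not matter
    hits = 0
    for _ in range(trials):
        draw = rng.sample(numbers, drawn)            # operator's draw, no repeats
        if sum(1 for n in draw if n in ticket) == matches:
            hits += 1
    return hits / trials

# Player marks 8, operator draws 8, exactly 5 match (the scenario first simulated)
print(simulate_exact_matches(50, 8, 8, 5, trials=100_000))

# Player marks 5, operator draws 8, all 5 match (the problem as posed)
print(simulate_exact_matches(50, 8, 5, 5, trials=100_000))
```

With a true probability near 0.0000264, 30,000 runs would be expected to produce fewer than one success on average (30,000 × 0.0000264 ≈ 0.8), so seeing no matches of 5 in a batch of that size is not surprising.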

#### jerryb

##### New Member
John,

here is another link that connects to this problem, though with a slightly different angle of attack.

http://www.uri.edu/artsci/math/clark/w1xx/hyper.htm

this site requires a plug-in if you want to view his interactive math content. when i ran "our" problem through his calculator came up with .0000264

i read a few other sites that discuss this issue and i believe that the stuff from wizard of odds is correct if you adjust it the way i did in my last post.

i'm still reading up on this.

cheers
jerry

#### panz

##### New Member
Thanks

Thanks a lot, for the discussion on this.

Anyway, yes, I have found the answer to be (45 choose 3) / (50 choose 8), my original answer. (5 choose 5 is 1, so I ignored it.)

Thanks again,
Pan

#### JohnM

##### TS Contributor
Phew. We can put this one to rest, then.

#### JohnM

##### TS Contributor

Pan's answer is correct, but the formulation isn't.

In a hypergeometric situation, which this is:

N = size of population = 50
M = # of items in population with property "E" = 8
N-M = # of items in population without property "E" = 42
n = number of items sampled = 5
k = number of items in sample with property "E" = 5

P(k) = C(M,k) * C(N-M,n-k) / C(N,n), where C(a,b) means "a choose b"

So,

P(k=5) = C(8,5) * C(50-8,5-5) / C(50,5)

= (56)*(1) / 2118760

= 0.0000264

#### jerryb

##### New Member
I disagree, again (big surprise, eh? )

Panz's formulation is as correct as yours, here is why:

in your formulation you are looking at a situation where there are 8 numbers with property E, which i will call eligible to be matched by the player. and the player samples five numbers when he places his bet, hence n=5 and k=5.

in panz's he is considering a situation where 5 numbers contain property E, and are eligible to be matched by virtue of being on the player's ticket. then the operator samples 8 numbers, hence M=5, n=8 and k=5

you are both right, and so was i in an earlier post, but you are attacking the problem from different directions. like finding the probability of drawing an ace from a standard deck is equal to the probability of not drawing anything else.
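
The equivalence of the two formulations can be checked by expanding the binomial coefficients. A short worked version, using only the numbers already in this thread (operator's draw as the sample first, player's ticket as the sample second):

$$
\frac{\binom{5}{5}\binom{45}{3}}{\binom{50}{8}}
= \frac{45!}{3!\,42!}\cdot\frac{8!\,42!}{50!}
= \frac{45!\,8!}{3!\,50!}
$$

$$
\frac{\binom{8}{5}\binom{42}{0}}{\binom{50}{5}}
= \frac{8!}{5!\,3!}\cdot\frac{5!\,45!}{50!}
= \frac{8!\,45!}{3!\,50!}
$$

Both reduce to the same expression, which is why both give 0.0000264.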

i have enjoyed learning about the hypergeometric probability.
cheers
jerry

#### JohnM

##### TS Contributor
Fair enough - I've preached that there's always more than one way to solve a problem....

#### jerryb

##### New Member
dare i say that we CAN put this one to rest??

cheers
jerry

#### cet

##### New Member
Actually, I have a similar question on probability that deals with a population, sample size, and matching.

While I don't claim to completely understand the combin formulae, I DO understand that when you plug in the appropriate figures, the odds do match my state lottery.

Where my curiosity lies is in the odds of the \$1,000,000 Pyramid game bonus round.

The pyramid contains a population of 59 selectable boxes which contain either a credit value or a pyramid symbol.

You are instructed to pick 7 of the boxes, hence the sample size is 7.

Now if there were only 7 pyramid symbols in the game, the combin formulae would work to give you the odds of hitting 7 out of 7; however, there are 11 hidden pyramids among the population of 59 boxes.

You get a bonus for each pyramid revealed, so what I'd like to know is what are the odds of finding 1 through 7 pyramids in a population of 59 squares with 11 hidden pyramids and 7 selected boxes.

Could we also invert the combin formulae so we show a 1 in XXXXX odds summary rather than a decimal value?

Also, I'd like to get this into an Excel spreadsheet, so if you could help with the combin formulae, I'd be most thankful!

C.

#### cet

##### New Member
I believe I figured it out with a bit of searching ... Amazing what one can do when they put their minds to it!

Guessing we're dealing with HyperGeometric Distributions here, and as luck would have it, excel does have a HYPGEOMDIST function (YEAH!)
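
For reference, the same calculation can be sketched outside Excel. The helper below is hypothetical (its name is made up here); its argument order simply mirrors Excel's `HYPGEOMDIST(sample_s, number_sample, population_s, number_pop)`. It prints both the decimal probability and the "1 in X" form for the pyramid game as described (59 boxes, 11 pyramids, 7 picks).

```python
from math import comb

def hypergeom_pmf(sample_s, number_sample, population_s, number_pop):
    """P(exactly `sample_s` successes in a sample of `number_sample`),
    drawn without replacement from `number_pop` items of which
    `population_s` are successes."""
    return (comb(population_s, sample_s)
            * comb(number_pop - population_s, number_sample - sample_s)
            / comb(number_pop, number_sample))

# Pyramid bonus round: 59 boxes, 11 hidden pyramids, 7 boxes picked
for k in range(0, 8):
    p = hypergeom_pmf(k, 7, 11, 59)
    print(f"{k} pyramids: p = {p:.8f}  (about 1 in {1 / p:,.1f})")
```

The "1 in X" figure is just the reciprocal of the probability, so in a spreadsheet the same inversion would be 1 divided by the cell holding the HYPGEOMDIST result.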

Which of course now leads me to my next question, and that is, are HGDs additive? For example, if a population is made up of 3 or more elements instead of just successful and unsuccessful.

Can you calculate the HGD for each element and add them together for a combination?

For example, a population has 100 elements, 95 of which we're not concerned with, however, there are 2 successful elements X, 2 successful elements Y and 1 successful element Z in the population.

What I'd like to know is, if we must have at least one of each (X, Y, and Z), what are the odds if we are able to make S selections?

This gets really weird for me, and I can't fathom how to deal with more than just successful vs. unsuccessful.

Any advice, formulae or bottles of strong spirits to help comprehend this?

Many Thanks!

C.

#### jerryb

##### New Member
cet,

seems like you have your first problem under control. so to answer your second: in a hypergeometric probability you are dealing with success or failure only.

to solve your example problem, with 2 x's, 2 y's and 1 z in the set of 100 and a requirement that you draw at least one of each to win, try something like this:

use s=3 as a base to get a sense of how it would work, thus you would have to draw one x, one y and one z in three draws. try to simply count the number of ways you could draw x,y,z out of the set and divide by the total number of sets of three draws possible (100C3). this might get you going.
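
The s=3 count suggested above can be checked in a few lines, and one possible way to extend it to larger s is inclusion-exclusion over which of the three types is missing from the draw. This is only a sketch of one approach and not necessarily the strategy meant above; the function and constant names are made up for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

POPULATION = 100
GROUP_SIZES = (2, 2, 1)   # number of X, Y and Z elements in the population

def p_at_least_one_of_each(s):
    """P(a draw of s items without replacement contains at least one X, Y and Z)."""
    total = comb(POPULATION, s)
    favourable = 0
    # Inclusion-exclusion: add or subtract the draws that avoid a chosen set of types.
    for r in range(len(GROUP_SIZES) + 1):
        for missing in combinations(GROUP_SIZES, r):
            favourable += (-1) ** r * comb(POPULATION - sum(missing), s)
    return Fraction(favourable, total)

# Direct count for s=3: pick one of 2 x's, one of 2 y's and the single z
print(Fraction(2 * 2 * 1, comb(100, 3)))
print(p_at_least_one_of_each(3))

# Larger numbers of selections
for s in (10, 50):
    print(s, float(p_at_least_one_of_each(s)))
```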

i have to leave right now for an appointment, but i have an idea of a strategy and will post tomorrow if you are still at it.

cheers
jerry