Seeking help with probability distribution

#1
Hello everyone. Please forgive me as I’ve never taken a formal course in statistics, and I’m not sure I’m using the correct terminology.

I am trying to learn how to calculate a probability distribution from a series of numbers.

Let me explain.

My favorite baseball team, the Seattle Mariners, has seven upcoming games against subpar opponents (shown below with the win likelihood of each game). I am trying to figure out the statistical probability of each scenario in this seven-game stretch based on the win probability assigned to each individual game.

Seattle Mariners' chance of winning each of the next seven games (borrowed from the 538 blog):

Game one: 58%
Game two: 61%
Game three: 59%
Game four: 56%
Game five: 68%
Game six: 68%
Game seven: 70%

Possible outcomes:

(Wins/losses)

0-7
1-6
2-5
3-4
4-3
5-2
6-1
7-0

So how do I calculate the probability of each scenario based on the data above?

If this is an easy one to figure out, please let me know how to crunch these numbers. Or provide a link to a generator or calculator if you’re using one.

Just eyeballing them, it’s clear that extremes on each end are unlikely, but I’d love to see the distribution of probability.

This is just for fun. Thanks!
 
#2
I think I was able to figure out the odds of 7-0 by multiplying all the win probabilities together, which gives about .038.

I also figured out the opposite extreme, 0-7, by multiplying all of the loss probabilities (1 minus each win probability) together, which gives about .0009.
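
For anyone who wants to check those two products, here is a minimal Python sketch. It assumes the games are independent (so the per-game probabilities can simply be multiplied); the variable names are just illustrative.

```python
from math import prod

# Win probabilities for the seven games, as listed in post #1.
win_probs = [0.58, 0.61, 0.59, 0.56, 0.68, 0.68, 0.70]

# 7-0: win every game, so multiply the win probabilities.
p_sweep = prod(win_probs)

# 0-7: lose every game, so multiply the loss probabilities (1 - p).
p_winless = prod(1 - p for p in win_probs)

print(f"7-0: {p_sweep:.4f}")    # 7-0: 0.0378
print(f"0-7: {p_winless:.4f}")  # 0-7: 0.0009
```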

I can’t figure out how to calculate the scenarios in between.
 
Dason

Ambassador to the humans
#3
When the probabilities are all different the only real way is to brute force the possibilities. Luckily you can do this iteratively fairly easily by recognizing that the only way to get to 'k' wins at game 'n' is to either be at k-1 wins at game 'n-1' and win that game or to be at k wins at game 'n-1' and lose that game.
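
The same iterative idea can be sketched outside a spreadsheet. The Python below is only an illustration of the update described above, assuming the games are independent; the function name `win_distribution` is made up for this example.

```python
def win_distribution(win_probs):
    """Return a list dist where dist[k] = P(exactly k wins) after all games."""
    dist = [1.0]  # before any game is played: 0 wins with certainty
    for p in win_probs:
        new = [0.0] * (len(dist) + 1)
        for k, prob_k in enumerate(dist):
            new[k] += prob_k * (1 - p)   # was at k wins and lost this game
            new[k + 1] += prob_k * p     # was at k wins and won this game
        dist = new
    return dist


# Win probabilities from post #1.
win_probs = [0.58, 0.61, 0.59, 0.56, 0.68, 0.68, 0.70]
dist = win_distribution(win_probs)

for wins, prob in enumerate(dist):
    print(f"{wins}-{len(win_probs) - wins}: {prob:.4f}")
```

Each pass through the outer loop plays one more game, so after the last pass `dist` holds the full 0-7 through 7-0 distribution and its entries sum to 1.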

I made an Excel sheet that does just that. The first row lets us modify our probabilities for each game. The first column tells us how many wins we are at. The "Round" is how many games have been played. The values in the cells are the conditional probabilities of having that many wins given you're at the round specified in row 4.

If you look you'll see cell G9 is currently selected and the formula is F8*G$1 + F9*(1-G$1). The value in cell G1 is the probability of winning game 5. The value in F8 is the probability of having 2 wins after round 4, and the value in F9 is the probability of having 3 wins after round 4. So the formula in G9 is calculating P(2 wins after game 4)*P(win game 5) + P(3 wins after game 4)*P(lose game 5), which is the probability of having 3 wins after game 5.
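
Written as a formula, the cell update looks like the following. The notation is my own shorthand, not something taken from the sheet: P_n(k) is the probability of exactly k wins after n games, and p_n is the probability of winning game n.

```latex
P_n(k) = P_{n-1}(k-1)\,p_n + P_{n-1}(k)\,(1 - p_n),
\qquad P_0(0) = 1, \quad P_n(-1) = 0

% Cell G9 is the case n = 5, k = 3:
P_5(3) = P_4(2)\,p_5 + P_4(3)\,(1 - p_5)
```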

I added a row for -1 wins which always has a probability of 0. This is so I didn't have to modify the form of that formula for row 6.
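
The padding trick carries over to code too. This is a rough Python sketch that mimics the sheet's layout, assuming independent games; `win_table` and the index offset are my own choices, not anything taken from the spreadsheet itself.

```python
def win_table(win_probs):
    """Grid like the spreadsheet: table[n][k + 1] = P(k wins after n games).

    Index 0 of every row is the padding entry for "-1 wins", always 0.
    """
    n_games = len(win_probs)
    # Round 0: zero wins with certainty, with the padding slot in front.
    table = [[0.0, 1.0] + [0.0] * n_games]
    for p in win_probs:
        prev = table[-1]
        row = [0.0]  # the "-1 wins" padding cell
        for k in range(n_games + 1):
            # prev[k] is P(k-1 wins), prev[k + 1] is P(k wins).
            # The padding lets k = 0 use the same formula as every other k.
            row.append(prev[k] * p + prev[k + 1] * (1 - p))
        table.append(row)
    return table


# Win probabilities from post #1.
win_probs = [0.58, 0.61, 0.59, 0.56, 0.68, 0.68, 0.70]
table = win_table(win_probs)

# The G9 cell (3 wins after round 5) corresponds to table[5][3 + 1].
print(f"P(3 wins after 5 games) = {table[5][4]:.4f}")
```

Without the padding entry, the zero-wins case would need its own special formula, which is exactly what the extra row avoids.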

Hopefully that makes sense, but really, if you get used to Excel formulas it's easy enough to do this brute force kind of thing.

ExcelBruteForce.PNG
 
#5
Let me take a stab at breaking down the formula for H10, and please check my reasoning.

H10 is the probability of having four wins through six games.

H10 = G9*H$1+G10*(1-H$1)

Part one: G9*H$1. The probability of having three wins at the start of game six multiplied by the chance of winning game six (which would be the fourth win).

Combined with:

Part two: G10*(1-H$1). The probability of having four wins ("already") at the start of game six multiplied by the chance of losing game six (required or else the team would have five wins).

Is my formula correct and does my reasoning make sense?
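
One way to check this kind of reasoning numerically is to list every possible win/loss sequence and add up the ones with the right number of wins. The Python sketch below does that under the assumption that the games are independent; the helper `prob_wins_after` and the variable names are hypothetical, chosen to echo the cell names.

```python
from itertools import product

# Win probabilities from post #1.
win_probs = [0.58, 0.61, 0.59, 0.56, 0.68, 0.68, 0.70]


def prob_wins_after(n_games, wins):
    """P(exactly `wins` wins in the first `n_games`), by listing every outcome."""
    total = 0.0
    for outcome in product([True, False], repeat=n_games):
        if sum(outcome) != wins:
            continue
        p = 1.0
        # zip stops after n_games entries because `outcome` is the shorter one.
        for won, pw in zip(outcome, win_probs):
            p *= pw if won else 1 - pw
        total += p
    return total


p6 = win_probs[5]             # chance of winning game six (H$1 in the sheet)
g9 = prob_wins_after(5, 3)    # three wins going into game six
g10 = prob_wins_after(5, 4)   # four wins going into game six

h10_formula = g9 * p6 + g10 * (1 - p6)
h10_direct = prob_wins_after(6, 4)

print(h10_formula, h10_direct)
```

If the two-part formula is right, the two printed values should agree up to floating-point rounding.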