# Probability of distribution

#### alibabawithak

##### New Member
Hi All,
I am trying to brush up on my college statistics, which I haven't used in a while, and am looking for some help to make sure I am doing this correctly. I have the data points below, and I am trying to figure out the probability of getting a data point below 2.
I have calculated the average as 3.773566 and the standard deviation as 2.064209548. The distribution is skewed right, and when graphed the left tail appears cut off. I also normalized all the data (calculated a z-score for each data point).
From my calculation, a data point of 1 (there are no data points below 1) has a distribution value of 0.078365044 and a z-score of -1.343645357.
From my calculation, a data point of 2 has a distribution value of 0.133614451 and a z-score of -0.859198416.

Here is what the distribution looks like:

and when normalized:

Using the online calculator https://www.mathportal.org/calculators/statistics-calculator/z-score-calculator.php I entered P(-1.343645357 < Z < -0.859198416), which gives 0.1048, or about 10%. Does this accurately represent the statement that the probability of a data point falling between 1 and 2 is 10%?
Why, if I instead change it to P(Z < -0.859198416) (the probability of a data point less than 2), does the value become 0.1949, or about 19%?

Also, why, when I use a different calculator that asks me to enter the mean and standard deviation, do I get a different probability (for example, this calculator: http://onlinestatbook.com/2/calculators/normal_dist.html )?
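
For reference, the two calculator results can be checked with a minimal Python sketch that uses only the standard library's `statistics.NormalDist`. It assumes the data follow a normal distribution with the mean and standard deviation quoted above, which the skewed histogram does not really support; the variable names are illustrative only.

```python
from statistics import NormalDist

mean, sd = 3.773566, 2.064209548   # values quoted in the post
std_normal = NormalDist()          # standard normal: mean 0, sd 1

z1 = (1 - mean) / sd               # z-score of x = 1
z2 = (2 - mean) / sd               # z-score of x = 2

# Area between the two z-scores: P(1 < X < 2) under the normal model.
p_between = std_normal.cdf(z2) - std_normal.cdf(z1)

# Entire left tail: P(X < 2). This also counts the area below x = 1,
# which the normal model allows even though no data points fall there.
p_below = std_normal.cdf(z2)

# A calculator that asks for mean and sd standardizes internally,
# so it takes the raw value 2 rather than the z-score.
p_below_raw = NormalDist(mean, sd).cdf(2)

print(round(p_between, 3), round(p_below, 3), round(p_below_raw, 3))
```

The interval probability is smaller than the tail probability because P(Z < -0.859) also includes everything to the left of z = -1.344. A calculator that asks for the mean and standard deviation should give the same tail probability when it is given the raw value 2; entering a z-score there instead is one plausible way to end up with a different number.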

Am I doing something wrong? Would greatly appreciate any help.

Data points are:
1
1.1
1.1
1.1
1.2
1.2
1.2
1.4
1.4
1.5
1.5
1.5
1.6
1.6
1.6
1.6
1.6
1.6
1.6
1.6
1.7
1.7
1.7
1.7
1.7
1.7
1.7
1.7
1.8
1.8
1.8
1.8
1.8
1.9
1.9
1.9
1.9
1.9
1.9
1.9
1.9
2
2
2
2
2
2
2
2
2
2
2
2.1
2.1
2.1
2.1
2.1
2.1
2.1
2.1
2.1
2.1
2.1
2.1
2.1
2.1
2.1
2.2
2.2
2.2
2.2
2.2
2.2
2.2
2.2
2.2
2.2
2.2
2.2
2.2
2.2
2.2
2.3
2.3
2.3
2.3
2.3
2.3
2.3
2.3
2.3
2.3
2.3
2.3
2.3
2.3
2.3
2.3
2.3
2.3
2.3
2.3
2.3
2.3
2.3
2.3
2.4
2.4
2.4
2.4
2.4
2.4
2.4
2.4
2.4
2.4
2.4
2.4
2.4
2.4
2.4
2.4
2.4
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.6
2.6
2.6
2.6
2.6
2.6
2.6
2.6
2.6
2.6
2.6
2.7
2.7
2.7
2.7
2.7
2.7
2.7
2.7
2.7
2.7
2.7
2.7
2.7
2.7
2.7
2.7
2.7
2.7
2.7
2.7
2.8
2.8
2.8
2.8
2.8
2.8
2.8
2.8
2.8
2.8
2.8
2.8
2.8
2.9
2.9
2.9
2.9
2.9
2.9
2.9
2.9
2.9
2.9
2.9
2.9
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3.1
3.1
3.1
3.1
3.1
3.1
3.1
3.1
3.1
3.1
3.1
3.1
3.1
3.1
3.1
3.1
3.1
3.1
3.1
3.1
3.1
3.2
3.2
3.2
3.2
3.2
3.2
3.2
3.2
3.2
3.2
3.2
3.2
3.2
3.2
3.2
3.2
3.2
3.2
3.2
3.3
3.3
3.3
3.3
3.3
3.3
3.3
3.3
3.3
3.3
3.3
3.3
3.3
3.3
3.3
3.3
3.4
3.4
3.4
3.4
3.4
3.4
3.4
3.4
3.4
3.4
3.4
3.4
3.4
3.4
3.4
3.4
3.4
3.4
3.4
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.6
3.6
3.6
3.6
3.6
3.6
3.6
3.7
3.7
3.7
3.7
3.7
3.7
3.7
3.7
3.7
3.7
3.7
3.8
3.8
3.8
3.8
3.8
3.8
3.8
3.8
3.8
3.8
3.8
3.8
3.9
3.9
3.9
3.9
3.9
3.9
3.9
3.9
3.9
4
4
4
4
4.1
4.1
4.1
4.1
4.1
4.1
4.1
4.1
4.1
4.2
4.2
4.2
4.2
4.2
4.2
4.3
4.3
4.3
4.3
4.3
4.3
4.3
4.3
4.3
4.3
4.4
4.4
4.4
4.4
4.4
4.4
4.4
4.5
4.5
4.5
4.5
4.5
4.6
4.6
4.6
4.6
4.7
4.7
4.7
4.8
4.8
4.8
4.8
4.8
4.8
4.9
4.9
4.9
5
5
5
5
5
5
5.1
5.1
5.1
5.1
5.1
5.1
5.2
5.2
5.2
5.2
5.3
5.3
5.4
5.4
5.5
5.5
5.5
5.6
5.7
5.7
5.8
5.9
5.9
6
6.2
6.2
6.3
6.4
6.5
6.5
6.5
6.7
6.7
6.8
6.9
7
7
7.1
7.1
7.2
7.3
7.3
7.3
7.3
7.5
7.5
7.5
7.6
7.6
7.6
7.7
7.7
7.7
7.8
7.8
7.9
7.9
8.2
8.2
8.3
8.3
8.4
8.4
8.4
8.6
8.7
8.8
8.9
8.9
9.1
9.6
10.6
10.8
11
11.4
11.4
11.5
11.6
11.6
11.8
12.1
12.6
12.9

#### Archidamus

##### Member
When it comes to modeling distributions, I personally use Minitab and its 'Individual Distribution Identification' tool. When I ran the data, the best fit was a log-normal distribution. Therefore I took the natural log of the data, then found the mean and sample standard deviation of the logged values. The final transform became Z = [ln(x) - mu]/sigma. I got mu = 1.212 and sigma = 0.4627. So now, to find P(X < 2), find P(Z < [ln(2) - mu]/sigma). I got 0.131, or 13.1%.
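
The log-normal calculation above can be sketched in Python with the standard library only. This is an illustration, not the Minitab procedure: the helper names `fit_lognormal` and `lognormal_prob_below` are hypothetical, and the numbers plugged in are the mu and sigma quoted above rather than values refitted from the data.

```python
from math import log
from statistics import NormalDist, mean, stdev

def fit_lognormal(data):
    """Return (mu, sigma): mean and sample sd of the natural logs of the data."""
    logs = [log(v) for v in data]
    return mean(logs), stdev(logs)

def lognormal_prob_below(x, mu, sigma):
    """P(X < x), assuming ln(X) is normal with mean mu and sd sigma."""
    z = (log(x) - mu) / sigma          # Z = [ln(x) - mu] / sigma
    return NormalDist().cdf(z)

# mu and sigma as quoted in the post (not recomputed here)
mu, sigma = 1.212, 0.4627
p_model = lognormal_prob_below(2, mu, sigma)
print(round(p_model, 3))
```

With the full list of data points in a Python list, `fit_lognormal` should give values close to the quoted mu and sigma, provided the sample standard deviation is the one intended.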

But don't stop there: we made an assumption that the data are log-normal. So, empirically, I counted the data points that were below 2 and divided by the total number of data points. I got 41/488, or 0.084, or 8.4%.
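
The empirical count can be sketched the same way. The function name `empirical_prob_below` is hypothetical, and `toy_sample` is just a handful of values copied from the list to show the call; it is not the full data set.

```python
def empirical_prob_below(data, threshold):
    """Fraction of observations strictly below the threshold."""
    return sum(1 for x in data if x < threshold) / len(data)

# A few values from the list, only to demonstrate the call.
toy_sample = [1, 1.9, 2, 2.1, 3]
toy_fraction = empirical_prob_below(toy_sample, 2)   # 2 of 5 values are below 2

# The counts quoted in the post: 41 of 488 points are below 2.
p_empirical = 41 / 488
print(round(p_empirical, 3))
```

The comparison is strict. The list contains 11 values equal to 2, so counting "at or below 2" instead would give 52/488, or about 10.7%.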

13.1% is the model's value and 8.4% is the empirical value. Now you have to determine whether that is good enough for whatever you are trying to do with it.