Probability question to be calculated in R

tcratius

New Member
I am trying to solve these two questions and would like to discuss my logic of how I found the answer and some help with comprehending the question.

"The lifetime of a particular type of TV follows a normal distribution with μ = 4800 hours and σ = 400 hours.
(a) Find the probability that a single randomly-chosen TV will last less than 4,500 hours. Use R to assist with your computations.
(b) Find the probability that the mean lifetime of a random sample of 16 TVs is less than 4,500 hours. Use R to assist with your computations."

OK, firstly: the first line says that the lifetime of a particular type of TV follows a normal distribution. Does that mean they are talking about the population, with mean = 4800 hours and standard deviation = 400 hours?

Assuming the first line gives the population mean and standard deviation, I can work out the probability of a single randomly chosen TV lasting less than 4500 hours, but I am not confident that it really describes the population.

Question a)
SE = standard_deviation/sqrt(1)
Z = (4500-4800)/SE
pnorm(Z) = 0.23
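
For reference, the part (a) steps above can be run as actual R. This is only a minimal sketch: it assumes 4800 and 400 are the population mean and standard deviation, and the variable names (mu, sigma, z_single, p_single) are illustrative rather than anything from the assignment.

```r
mu <- 4800     # assumed population mean (hours)
sigma <- 400   # assumed population standard deviation (hours)

# For a single TV the "sample size" is 1, so the standard error is just sigma
z_single <- (4500 - mu) / (sigma / sqrt(1))   # -0.75
p_single <- pnorm(z_single)                   # lower-tail probability

# Equivalent call: pnorm() also accepts the mean and sd directly
p_single_direct <- pnorm(4500, mean = mu, sd = sigma)

print(round(p_single, 4))   # about 0.2266, i.e. roughly 23%
```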

Question b) using central limit theorem
SE = standard_deviation/sqrt(16)
Z = (4500 - 4800)/SE
pnorm(Z) = 0.001349898
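
The part (b) steps can be written out the same way. Again a sketch under the assumption that 4800 and 400 describe the population; the names se, z_mean and p_mean are made up for illustration. Note that R is case-sensitive, so the function is pnorm, not Pnorm.

```r
mu <- 4800     # assumed population mean (hours)
sigma <- 400   # assumed population standard deviation (hours)
n <- 16        # sample size

se <- sigma / sqrt(n)         # standard error of the sample mean: 100
z_mean <- (4500 - mu) / se    # -3
p_mean <- pnorm(z_mean)       # lower-tail probability for the sample mean

print(p_mean)   # about 0.00135
```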

Now I cannot see how 16 TVs have such a small probability of being less than 4500 hours when 1 TV has a probability of 23%.

Buckeye

Member
As the sample size increases, the standard error of the sampling distribution gets smaller. So, there is less variability around the mean. If you want to see visually, plot the density curves with different sample sizes.
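
One way to see the shrinking standard error in numbers, before plotting anything, is to tabulate it for a few sample sizes. This is a rough sketch; the sample sizes in n_values are arbitrary choices, not part of the question.

```r
mu <- 4800     # population mean (hours)
sigma <- 400   # population standard deviation (hours)

n_values <- c(1, 4, 16, 64)           # arbitrary sample sizes for illustration
se_values <- sigma / sqrt(n_values)   # 400, 200, 100, 50

# Probability that the sample mean falls below 4500 hours for each n
p_below <- pnorm(4500, mean = mu, sd = se_values)

result <- data.frame(n = n_values, se = se_values, p_below_4500 = signif(p_below, 3))
print(result)
```

Both the standard error and the probability of a mean below 4500 hours drop as n grows.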

tcratius

New Member
My view of the result pnorm(Z) = 0.001349898 is that the vector being < 4500 would have about a 0.1% chance of successfully failing. That's it, nothing more? I haven't read the question wrong?

Buckeye

Member
The question is asking about the average lifetime of a random sample of 16 TVs. So, not every television will have a lifetime below 4500 hours. The sampling distribution plots averages of random samples of size 16 (over and over). If you plot a normal distribution with mean 4800 and standard deviation sigma/sqrt(n) you will see what I'm talking about.
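
A possible way to draw the two curves exactly, without generating random numbers, is to evaluate the normal density on a grid. This sketch uses base R graphics only to keep it dependency-free; the grid range of 4 standard deviations either side of the mean is an arbitrary choice.

```r
mu <- 4800     # population mean (hours)
sigma <- 400   # population standard deviation (hours)
n <- 16        # sample size

# Grid of lifetimes covering mu +/- 4 population standard deviations
x <- seq(mu - 4 * sigma, mu + 4 * sigma, length.out = 500)

dens_single <- dnorm(x, mean = mu, sd = sigma)            # one TV
dens_mean   <- dnorm(x, mean = mu, sd = sigma / sqrt(n))  # mean of 16 TVs

plot(x, dens_mean, type = "l", col = "blue",
     xlab = "Lifetime (hours)", ylab = "Density")
lines(x, dens_single, col = "red")
abline(v = 4500, lty = 2)   # the cutoff from the question
legend("topright", legend = c("Single TV", "Mean of 16 TVs"),
       col = c("red", "blue"), lty = 1)
```

The area to the left of the dashed line is visibly large under the red curve and almost nothing under the blue one.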

tcratius

New Member
OK, in RStudio this is how I wrote it. I created 9600 random numbers using rnorm, with the count set to twice the mean (mu), added them to a data.frame, and then plotted them using ggplot2. Am I going about this the wrong way to get the normal distribution? And this is what happens when you try to teach statistics to a programmer. I am trying.
library(ggplot2)

mu <- 4800 # mean in Hour
sigma <- 400 # sigma
rnorm.axis <- mu * 2 # Number of values to be randomly generated
n <- 16 # number of TV's
std <- sigma/n

TV <- data.frame(x = rnorm(rnorm.axis, mu, sigma))
TV2 <- data.frame(X = rnorm(rnorm.axis, mu, std))

plotTV <- ggplot() +
  geom_density(aes(x=x), data=TV, alpha=0.2, fill='red', size=1) +
  geom_density(aes(x=X), data=TV2, alpha=0.2, fill='blue', size=1)

Buckeye

Member
Seems right, except that the standard deviation should be sigma/sqrt(n). But you can see how the blue curve has a smaller spread, which suggests that the chance of seeing a sample of 16 TVs with an average less than 4500 is quite low.

Just remember that for a sampling distribution:
1.) randomly sample a specified size from the population
2.) take the mean lifetime of the sample
3.) plot it in the graph.
4.) repeat
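
The four steps above can be sketched as a small simulation. The number of repetitions (n_reps) and the seed are arbitrary assumptions chosen for illustration, and the plotting step is replaced here by two summary numbers.

```r
set.seed(1)      # arbitrary seed so the run is repeatable

mu <- 4800       # population mean (hours)
sigma <- 400     # population standard deviation (hours)
n <- 16          # sample size
n_reps <- 10000  # how many times to repeat steps 1 to 3 (arbitrary)

# Steps 1, 2 and 4: draw a sample of n lifetimes, take its mean, repeat
sample_means <- replicate(n_reps, mean(rnorm(n, mean = mu, sd = sigma)))

# Spread of the simulated means; should be close to sigma / sqrt(n) = 100
print(sd(sample_means))

# Share of simulated means below 4500; should be close to pnorm(-3)
print(mean(sample_means < 4500))
```

A histogram or density plot of sample_means would correspond to step 3.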

This is done over and over until we see that blue shaded curve. In fact, if we sampled from a skewed population many times, the sampling distribution of the mean would approach a normal distribution as the sample size grows. You can visualize this at this website: http://www.lock5stat.com/StatKey/
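
The skewed-population point can also be checked in R. This is a hedged sketch: the exponential distribution with mean 4800 is just a convenient skewed stand-in, not something from the question, and skew() is a hypothetical helper defined here, not a base R function.

```r
set.seed(2)      # arbitrary seed so the run is repeatable
n_reps <- 5000   # number of repeated samples (arbitrary)
rate <- 1 / 4800 # exponential population with mean 4800 (illustrative assumption)

# Hypothetical helper: sample skewness (0 for a symmetric distribution)
skew <- function(v) mean((v - mean(v))^3) / sd(v)^3

pop_draws <- rexp(n_reps, rate = rate)                        # individual values
means_16  <- replicate(n_reps, mean(rexp(16, rate = rate)))   # means of n = 16
means_100 <- replicate(n_reps, mean(rexp(100, rate = rate)))  # means of n = 100

print(c(population = skew(pop_draws),
        n16 = skew(means_16),
        n100 = skew(means_100)))
```

The skewness of the sample means shrinks toward 0 as the sample size increases, which is the sense in which the sampling distribution becomes more normal.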

tcratius

New Member
Yeah, not sure how I ended up writing sigma/n when my code has sigma/sqrt(n). Cheers for your help. Stats really blows my mind, yet it is really interesting. I'll check out the website too. Take care.