Can you help me with this question?

#1
Five features are extracted from an email. Suppose n = 20 data points are observed from this email.
(a) What is your proposed model for the data? (Hint: you may freely choose the parameters of the model so that the conditions of the proposed model are met.)
(b) What is the probability that we observe 2, 1, 0, 0, and 17 data points from features one to five, respectively?
(c) What is the probability that we observe at most 4 data points from the last feature?
(d) Compute the correlation between the first and second features.

Please use R for the following parts:
(e) Generate 1000 samples from your proposed model in part (a).
(f) Find the sample correlation between the first and the second features.
(g) Compare the sample and model correlation between these two features.
 

Dason

Ambassador to the humans
#4
I'm not so sure about that. But that's partially because the original post is hard to follow.

From the description it seems like multinomial might be more appropriate?
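
To make that suggestion concrete, here is a sketch of what the multinomial reading would imply. This assumes the 20 observations are independent and each falls into exactly one of the 5 features with probabilities p_1, ..., p_5 summing to 1; the question itself does not confirm this.

```latex
\[
P(X_1 = x_1, \dots, X_5 = x_5)
  = \frac{n!}{x_1!\,x_2!\,x_3!\,x_4!\,x_5!}\;
    p_1^{x_1} p_2^{x_2} p_3^{x_3} p_4^{x_4} p_5^{x_5},
  \qquad \sum_{i=1}^{5} x_i = n = 20,
\]
\[
X_5 \sim \mathrm{Binomial}(n, p_5), \qquad
\mathrm{Corr}(X_1, X_2) = -\sqrt{\frac{p_1 p_2}{(1 - p_1)(1 - p_2)}} .
\]
```

As an illustration only, if all five probabilities were chosen equal to 0.2 (my assumption, not something stated in the assignment), part (b) would be 3420 × 0.2^20, roughly 3.59e-11, and the correlation in part (d) would be -0.25.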
 

ondansetron

TS Contributor
#6
I'm going to agree with @hlsmith and @Dason; it is unclear what the question is actually asking. Is there a single variable that takes one of 5 categories on each observation, and you're asking for the probability of seeing 2 of the first, 1 of the second, 0 of the third, 0 of the fourth, and 17 of the fifth category?

Or are you asking something else?
 
#7
Actually, I'm doing a Master's in Data Analytics, and this question is taken from my assignment. I think ondansetron's interpretation is right, but even I'm not clear on it.

Maybe we can do it with the hypergeometric distribution.
 

ondansetron

TS Contributor
#8
My interpretation was an expansion of @Dason's idea, but if you are unsure of the question yourself, I suggest that you first ask the professor for the correct interpretation (i.e. is this a single multinomial variable with 5 levels?). This will help us avoid giving guidance that may not be correct given the true intention of the problem.
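
If the professor does confirm the single-multinomial reading, parts (b) through (g) could be sketched in base R roughly as follows. The equal probabilities in `p` and the seed are placeholders I picked for illustration; the assignment leaves the parameters up to you, so swap in whatever you choose for part (a).

```r
# Assumed setup (not given in the assignment): 5 equally likely features.
n <- 20
p <- rep(0.2, 5)

# (b) P(X = (2, 1, 0, 0, 17)) under Multinomial(n, p)
prob_b <- dmultinom(c(2, 1, 0, 0, 17), size = n, prob = p)

# (c) The fifth count is marginally Binomial(n, p[5]), so P(X5 <= 4) is
prob_c <- pbinom(4, size = n, prob = p[5])

# (d) Model correlation between the first and second counts
rho_model <- -sqrt(p[1] * p[2] / ((1 - p[1]) * (1 - p[2])))

# (e) 1000 draws; rmultinom returns a 5 x 1000 matrix, one column per draw
set.seed(1)  # arbitrary seed, only for reproducibility
samples <- rmultinom(1000, size = n, prob = p)

# (f) Sample correlation between the first and second features
rho_sample <- cor(samples[1, ], samples[2, ])

# (g) Put the two side by side
print(c(model = rho_model, sample = rho_sample))
```

With 1000 draws the sample correlation should land close to the model value, but it will not match exactly and will change with the seed.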