really urgent probability problem - is it Bayesian?


I have an urgent problem but I'm a little lost.
Is this probability question referring to the use of Bayes' theorem?

Consider a language with only two symbols, I and O. The proportion pI of
the symbol I in the language is unknown. Suppose one particular sample text consists of a sequence of N symbols drawn randomly and independently from the language, in which the symbol I occurs NI times and the symbol O occurs N – NI times.
(a) Write an expression for the probability of the sample text in terms of pI, NI, and N.
(b) Derive mathematically the value of pI that maximizes the probability of the sample
text, in terms of NI and N. This value serves as an estimate of pI.
Firstly, I don't exactly understand what is meant by "the probability of the sample text". Does this mean we need to find P(I . O)? This is as far as I can get:
P(I) = NI/N, P(O) = (N - NI)/N. Do we now multiply the two together? And where does the original pI figure in?

And what is part (b) dealing with? Is it trying to estimate the prior? Does this mean something like arg max P(I | O)P(I)? My mind keeps going round in circles over this!

Please help!


TS Contributor
It seems to me that this is a standard elementary question about the maximum likelihood estimator.

a) Write down the likelihood function. What is the distribution of the number of I's you observed? The term "probability" here refers to the probability mass function, i.e. the likelihood function, since the distribution is discrete.

Imagine that before the experiment, you only know you are going to draw \( N \) letters and each draw has probability \( p_I \) of producing the letter I.

b) Once you write down the likelihood function, find the \( p_I \) that maximizes it.
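As a quick sanity check on parts (a) and (b), the likelihood of one particular sequence can be evaluated numerically and maximized over a grid of candidate values of \( p_I \). This is just a sketch; the sample values N = 10 and NI = 7 are made up for illustration:

```python
# Grid-search sketch of the binomial-type likelihood
# L(p) = p^NI * (1 - p)^(N - NI) for one particular sequence,
# and a check that the maximizer is NI / N.
# N and NI below are hypothetical example values, not from the problem.

N, NI = 10, 7

def likelihood(p):
    """Probability of one particular sequence with NI I's out of N draws."""
    return p ** NI * (1 - p) ** (N - NI)

# Evaluate on a fine grid of candidate values of p_I in (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
best_p = max(grid, key=likelihood)

print(best_p)   # grid maximizer
print(NI / N)   # the estimate NI / N, which should agree
```

Here the grid maximizer comes out at 0.7, matching NI / N, which is the answer part (b) is asking you to derive analytically.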
Thank you! In that case the likelihood function would be binomial:

\( p_I^{N_I} \, (1 - p_I)^{N - N_I} \)

And the maximum value then would be 1, so I'll set the above expression equal to 1.

Am I correct in thinking I now just have to solve this problem using logarithms?


TS Contributor
Of course the upper bound for any probability is 1, but it is not attainable in most cases.
(e.g. in this binomial case, it only attains 1 when the random variable is actually degenerate: when \( p_I \) is 0 or 1, so that the count of I's is a constant 0 or N.)

Anyway, you just need to solve for the maximum by differentiating with respect to the parameter \( p_I \).
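For reference, the differentiation can be sketched as follows. It is easier to work with the log-likelihood, which has the same maximizer:

\[
\log L(p_I) = N_I \log p_I + (N - N_I) \log(1 - p_I)
\]

\[
\frac{d}{dp_I} \log L(p_I) = \frac{N_I}{p_I} - \frac{N - N_I}{1 - p_I} = 0
\quad \Longrightarrow \quad
\hat{p}_I = \frac{N_I}{N}
\]

This is where taking logarithms comes in, as guessed above: the log turns the product into a sum, and setting the derivative to zero gives the estimate.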