I am trying to solve a problem and the results I get seem counter-intuitive.

We randomly throw \(n\) balls into an area partitioned into 3 bins \(b_1,b_2,b_3\). The size of each bin is proportional to the probability that a ball will fall in it. Let's call these probabilities \(p_1,p_2,p_3\). The counts \(x_1,x_2,x_3\) in the bins then follow a multinomial distribution.

Now let's say I throw 12 balls, and I know how many landed in each bin (\(x_1=3,x_2=6,x_3=3\)).

I would like to estimate the sizes of the bins from the observations. For this I use maximum likelihood. It can be shown that the MLE is \(p_1=3/12,p_2=6/12,p_3=3/12\). This is pretty intuitive.
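
For reference, here is a sketch of the standard argument behind "it can be shown" (the Lagrange multiplier \(\lambda\) is notation I am introducing just for this step). The log-likelihood is maximized subject to \(p_1+p_2+p_3=1\):

\(\log L = \text{const} + \sum_{i=1}^{3} x_i \log p_i\)

\(\frac{\partial}{\partial p_i}\left[\sum_{j} x_j \log p_j - \lambda\left(\sum_{j} p_j - 1\right)\right] = \frac{x_i}{p_i} - \lambda = 0 \;\Rightarrow\; p_i = \frac{x_i}{\lambda}\)

Summing over \(i\) and using the constraint gives \(\lambda = n\), so \(\hat{p}_i = x_i/n\).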

It turns out that the actual likelihood at this point is:

\(L(p_1=0.25,p_2=0.5,p_3=0.25|x_1=3,x_2=6,x_3=3)=\)

\(=\frac{12!}{3!\,6!\,3!}\cdot 0.25^3\cdot 0.5^6\cdot 0.25^3=0.07050\)
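
This value can be checked numerically at the MLE probabilities \((0.25, 0.5, 0.25)\). The snippet below is only a minimal sketch; the function name `multinomial_likelihood` is my own and not taken from any library.

```python
from math import factorial, prod

def multinomial_likelihood(counts, probs):
    """Multinomial pmf: n! / (x_1! ... x_k!) * prod(p_i ** x_i)."""
    n = sum(counts)
    coef = factorial(n)
    for x in counts:
        coef //= factorial(x)  # each partial quotient is still an integer
    return coef * prod(p ** x for p, x in zip(probs, counts))

# Observed counts and the MLE probabilities from the text
value = multinomial_likelihood([3, 6, 3], [0.25, 0.5, 0.25])
print(f"{value:.5f}")
```

The same function can be reused for any other combination of counts and probabilities, as long as the probabilities sum to 1.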

Now, let's assume I knew in advance that \(p_1=p_3\). How would that change my result? It would not - I would still get the same parameter values \(p_1=0.25,p_2=0.5,p_3=0.25\).
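
To spell out why the constraint changes nothing here, a short sketch of the constrained maximization (writing \(p_1=p_3=p\) and \(p_2=1-2p\), where \(p\) is shorthand I am introducing):

\(\log L = \text{const} + (x_1+x_3)\log p + x_2\log(1-2p)\)

\(\frac{d\log L}{dp} = \frac{x_1+x_3}{p} - \frac{2x_2}{1-2p} = 0 \;\Rightarrow\; \hat{p} = \frac{x_1+x_3}{2n} = \frac{3+3}{2\cdot 12} = 0.25\)

so \(\hat{p}_2 = 1-2\hat{p} = 0.5\), the same as the unconstrained estimate.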

The twist comes now: let's assume I cannot observe balls that landed in \(b_3\). If I know that 12 balls were thrown I am fine, since I can calculate \(x_3=n-x_1-x_2=12-3-6=3\). But what happens if I don't know \(n\)?

I figure that in this case, I would need to estimate \(x_3\) (or equivalently \(n\)) as well. However, if I use MLE, the results start looking weird. Intuitively, I would expect that if I observe \(x_1=3,x_2=6\) and I know that \(p_1=p_3\), then the MLE will probably be \(p_1=0.25,p_2=0.5,p_3=0.25,x_3=3\). However, it is clearly not the maximum, since for example:

\(L(p_1=0.24,p_2=0.52,p_3=0.24|x_1=3,x_2=6,x_3=2)=\)

\(=\frac{11!}{3!\,6!\,2!}\cdot 0.24^3\cdot 0.52^6\cdot 0.24^2=0.07273\)

So from this it seems that \(x_1=3,x_2=6,x_3=2\) is more likely than \(x_1=3,x_2=6,x_3=3\) even if I know that \(p_1=p_3\), which seems very counter-intuitive.
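
To check that this is not just an artifact of the particular numbers I picked, here is a rough sketch that treats \(x_3\) as an unknown integer and, for each candidate value, plugs in the best constrained probability. With \(p_1=p_3=p\) and \(p_2=1-2p\), the likelihood for fixed \(x_3\) is proportional to \(p^{x_1+x_3}(1-2p)^{x_2}\), which is maximized at \(p=(x_1+x_3)/(2(x_1+x_2+x_3))\). The function names and the cutoff `max_x3=20` are arbitrary choices of mine.

```python
from math import factorial

def likelihood(x1, x2, x3, p):
    """Multinomial likelihood under the constraint p1 = p3 = p, p2 = 1 - 2p."""
    n = x1 + x2 + x3
    coef = factorial(n) // (factorial(x1) * factorial(x2) * factorial(x3))
    return coef * p ** (x1 + x3) * (1 - 2 * p) ** x2

def profile(x1, x2, max_x3=20):
    """For each candidate x3, evaluate the likelihood at the best constrained p."""
    rows = []
    for x3 in range(max_x3 + 1):
        # Closed-form maximizer of p for this x3 (assumes p1 = p3)
        p_hat = (x1 + x3) / (2 * (x1 + x2 + x3))
        rows.append((x3, p_hat, likelihood(x1, x2, x3, p_hat)))
    return rows

rows = profile(3, 6)
for x3, p_hat, lik in rows[:6]:
    print(f"x3={x3}  p1=p3={p_hat:.4f}  p2={1 - 2 * p_hat:.4f}  L={lik:.5f}")

# Candidate x3 with the largest joint likelihood in the searched range
best = max(rows, key=lambda r: r[2])
print("largest joint likelihood at x3 =", best[0])
```

Within this range the largest value is at \(x_3=2\) rather than \(x_3=3\), which matches what I found by hand above.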

**My questions are whether my logic is sound, whether my intuition is misleading me and whether this is the correct way to estimate the parameters and missing data.**

Thanks!