I haven’t looked at a statistics book for a good 5 years, since leaving uni and moving into the job world. However, I have come across a real-world problem where I feel there is a statistical method that could help me, but after searching high and low I cannot find it.
I’ve looked at two-tailed tests for determining a confidence interval but cannot seem to apply them to my particular real-world example, as there are 4 possible outcomes.

Real world example:

We have a set of 9,163 applications which will be transitioned to new environments; there are 4 possible places, or servers, where they might be moved to or ‘land’. Let’s call them servers 0, 1, 2 and 3. Each of the 4 servers has a different support cost associated with it – if too many applications land on a server which is too expensive to maintain, then the transition will be seen as economically unviable. It’s quite costly to assess each application to determine which server it will land on, therefore we are thinking of taking a sample of applications to determine whether there is a case for continuing with the rest.
We were thinking of assessing 5% of applications (458 applications) and then taking that and extrapolating the results out to apply to the remaining, unassessed applications, to decide whether there is a business case to move them.

Question: is a sample of 458 of the 9,163 applications enough to provide statistically relevant results at a confidence level of 95%, with a margin of error (confidence interval) of 2.5%? That is to say, we expect the results of the application assessment to be wrong very rarely.

Maths orientated example.

Number of applications = 9,163
Total number of outcomes, or servers they might land on once assessed = 4
Required confidence level = 95%
Margin of error i.e. percentage of applications which will be incorrectly assessed = 2.5%
Sample size = ?????

Question
Is there any help, guidance or insight you can provide in order to help me determine an appropriate sample size which will ensure that I can extrapolate out the results with a high degree of statistical confidence?

Could you explain a bit more about the purpose of the sampling? As I understand it, you take one application and assess whether it should be moved to a different server? That would make it an attribute measurement (like 1 if the app should be moved, 0 if not). There is a simple formula for determining the required sample size in that case:

n = (2/D)^2 * p * (1-p), where p is your best estimate, before the sampling, of the proportion of 1s (or zeroes) and D is the precision you want to measure the proportions with, expressed as a fraction (e.g. 0.025 for 2.5%). The 2 is a rounded value of 1.96, which corresponds to 95% confidence.

In practice you would take a pre-sample first to get a rough idea of p if it were completely unknown.
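
For anyone who wants to plug numbers into that formula, here is a rough Python sketch. It uses the figures from the question (9,163 applications, 2.5% precision). The default p = 0.5 is my own assumption for the case where nothing is known about p; it gives the largest sample size. The finite population correction is a standard adjustment that is not part of the formula above, and the function names are just illustrative.

```python
import math

def sample_size(margin, p=0.5, z=2.0):
    """Sample size from n = (z / D)^2 * p * (1 - p).

    margin: desired precision D as a fraction (0.025 for 2.5%)
    p:      prior guess of the proportion; 0.5 is the worst case (assumption)
    z:      2.0 is the rounded 95% value used in the formula above
    """
    return (z / margin) ** 2 * p * (1 - p)

def with_finite_population(n0, population):
    """Shrink n0 for a finite population (standard correction, added here)."""
    return n0 / (1 + (n0 - 1) / population)

N = 9163   # total applications, from the question
D = 0.025  # required margin of error, from the question

n0 = sample_size(D)
n_adj = with_finite_population(n0, N)

# Subtract a tiny epsilon before rounding up to absorb floating point noise.
print("uncorrected sample size:", math.ceil(n0 - 1e-9))
print("with finite population correction:", math.ceil(n_adj - 1e-9))
print("proposed 5% sample:", round(0.05 * N))
```

With a worst-case p the formula asks for considerably more than 458 applications, so a pre-sample that pins down p more tightly may be worth doing before fixing the sample size.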

Let's assume all of the applications are on the same server today. The assessment determines which server each application would move to. An app always has to move, although there are four options as to where it would go.

I'd like to know whether a sample size of 5% of the total application base of 9,163 is enough to be statistically significant. For example:

PRIOR TO ASSESSMENTS
9,163 APPLICATIONS IN PLACE A

AFTER ASSESSMENTS
458 applications assessed
200 IN PLACE 1
150 IN PLACE 2
100 IN PLACE 3
8 IN PLACE 4

Therefore, would I be able to take the results of the 458 applications and extrapolate them up to the total number e.g.

EXTRAPOLATED RESULTS
9,163 TOTAL

4001 IN PLACE 1
3001 IN PLACE 2
2001 IN PLACE 3
160 IN PLACE 4

How reliable will this extrapolation be? I was hoping that the sample size would make it statistically significant to proceed.
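
One way to get a feel for the reliability is to put a margin of error around each observed share and scale it up to the full 9,163. The sketch below does that in Python with the example counts above. It treats each place separately as a yes/no proportion and uses the normal approximation with z = 2; both are simplifying assumptions, and the approximation is rough for a small count such as 8. It also ignores the finite population correction, and the function names are made up for illustration.

```python
import math

# Example counts from the 458 assessed applications
counts = {"PLACE 1": 200, "PLACE 2": 150, "PLACE 3": 100, "PLACE 4": 8}
N = 9163  # total applications

def margin_of_error(count, n, z=2.0):
    """Approximate 95% margin of error for one observed proportion."""
    p = count / n
    return z * math.sqrt(p * (1 - p) / n)

def extrapolate(counts, population, z=2.0):
    """Return {place: (estimate, low, high)} scaled to the whole population."""
    n = sum(counts.values())
    rows = {}
    for place, count in counts.items():
        p = count / n
        moe = margin_of_error(count, n, z)
        low = max(p - moe, 0.0)   # a share cannot go below 0
        high = min(p + moe, 1.0)  # or above 1
        rows[place] = (round(p * population),
                       round(low * population),
                       round(high * population))
    return rows

n = sum(counts.values())
for place, (est, low, high) in extrapolate(counts, N).items():
    moe = margin_of_error(counts[place], n)
    print(f"{place}: about {est} apps (range {low} to {high}), "
          f"margin of error +/- {moe:.1%}")
```

For the larger groups the margin of error from 458 assessments comes out wider than the 2.5% asked for in the original question, which is the main thing to weigh when deciding whether 5% is enough.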