I am working on a project in which I am comparing a specific physiological value across a large set of species (75 species in total).

One of the things I wanted to do was to compute the mean and confidence interval of the data for all the species together, to show that the data are centred around a specific value (this is clearly visible when plotting the distribution).

However, my dataset is highly unbalanced: some species have only 5 datapoints, while others have up to 150. I therefore wonder how I could calculate this mean so that every species has the same "weight" in the calculation, because the species with 150 datapoints will obviously influence a pooled mean far more than those with only five.

I thought about taking the mean of the per-species means, but I vaguely remember that taking a mean of means is not considered good practice.
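
To make the comparison concrete, here is a minimal Python sketch of the two calculations I have in mind. The species names and values are made up for illustration and are not my real data. The function names are hypothetical, and the confidence interval uses a normal approximation across species means, which is an assumption on my part and may not be the right choice.

```python
from statistics import NormalDist, mean, stdev


def pooled_mean(groups):
    """Mean over all datapoints; species with many datapoints dominate."""
    values = [v for vals in groups.values() for v in vals]
    return mean(values)


def mean_of_means(groups, level=0.95):
    """Equal-weight mean across species, with a normal-approximation CI.

    Each species contributes one number (its own mean), so the species,
    not the individual datapoint, is treated as the sampling unit.
    """
    species_means = [mean(vals) for vals in groups.values()]
    centre = mean(species_means)
    # Standard error computed from the spread of the species means.
    se = stdev(species_means) / len(species_means) ** 0.5
    # Two-sided critical value (about 1.96 for a 95% interval).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return centre, (centre - z * se, centre + z * se)


# Made-up toy data: two small species and one large one.
data = {
    "species_a": [10.0, 12.0],
    "species_b": [20.0, 22.0],
    "species_c": [30.0] * 16,
}

print("pooled mean:  ", pooled_mean(data))
print("mean of means:", mean_of_means(data))
```

In this toy example the pooled mean is pulled towards the large species, while the mean of means treats the three species equally. What I am unsure about is whether the second approach is statistically defensible.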

Thank you in advance for your help.

Théo.