How do I know whether it is relevant to split my data by category or keep it blended?

#1
I have a dataset of open cases supported in 9 different languages. The sample for all languages blended is 384 benchmarks per week, but if I split by language it would be 1903. This is a significant difference. How can I tell whether it is relevant to split the sample by language?
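
For context, the 384 figure is consistent with Cochran's sample size formula at 95% confidence, a 5% margin of error and p = 0.5, so I am assuming that is where it comes from. Below is a minimal sketch of why the per-language total ends up so much larger: the formula is applied once per language instead of once overall. The weekly volumes and language codes are made up for illustration and are not my real data, and `cochran_sample_size` is just a helper name I chose.

```python
import math

def cochran_sample_size(population, z=1.96, margin=0.05, p=0.5):
    """Cochran's sample size with finite population correction.

    Assumed defaults: z=1.96 (95% confidence), 5% margin of error,
    p=0.5 (most conservative proportion).
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2   # size for an infinite population
    n = n0 / (1 + (n0 - 1) / population)        # correct for a finite population
    return math.ceil(n)

# Hypothetical weekly open-case volumes per language (illustrative only).
weekly_cases = {
    "en": 20000, "es": 6000, "fr": 4000, "de": 3000, "pt": 2000,
    "it": 1500, "nl": 800, "ja": 500, "pl": 300,
}

# Blended: one sample drawn from the pooled population.
blended = cochran_sample_size(sum(weekly_cases.values()))

# Split: one sample per language, each large enough on its own.
per_language = {lang: cochran_sample_size(n) for lang, n in weekly_cases.items()}
split_total = sum(per_language.values())

print("blended sample:", blended)
print("per-language samples:", per_language)
print("split total:", split_total)
```

With the blended sample, the margin of error only holds for the overall figure. The split total is what it would take to get the same margin for each language separately, so the choice seems to depend on whether results need to be reported per language.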