I'm still trying to figure out what you mean when you say "valid." Do you maybe mean that you are looking for results that are "statistically significant"?
Or are you trying to make your graph look like a normal distribution? You can't force your data to be normal. You can sometimes transform it (a log transform, for example) to make it look more normal, but you don't change the underlying data to do that. Yeah, worldwide, children's heights and weights are probably each close to normally distributed. But that doesn't mean your population of 5,000 is normally distributed or that your sample of that 5,000 is normally distributed. Your results aren't wrong just because they aren't normally distributed.
The results you have are the results you have. The shape of your histogram or scatterplot is not going to change if you copy your existing results five times over. When you say you "duplicated" the data, what you really did was (essentially) falsify your data. You didn't really sample 500 children, so 4/5 of your data is made up. You don't want to report that.
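If it helps to see this rather than take my word for it, here is a rough sketch in Python with NumPy. The 100 heights are simulated numbers I made up purely for illustration (they are not your data), and `standard_error` is just a helper name I chose. Under those assumptions, copying the sample five times leaves the histogram shape and the mean alone, but it shrinks the standard error as if you had really measured 500 children.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sample: 100 simulated heights in cm (illustrative only).
sample = rng.normal(loc=120.0, scale=8.0, size=100)

# "Duplicating" the data: the same 100 values repeated five times (500 rows).
duplicated = np.tile(sample, 5)

def standard_error(x):
    """Standard error of the mean, using the sample standard deviation."""
    return x.std(ddof=1) / np.sqrt(len(x))

# Compare histogram shapes on the same bins, scaled as densities.
hist_orig, edges = np.histogram(sample, bins=10, density=True)
hist_dup, _ = np.histogram(duplicated, bins=edges, density=True)
same_shape = np.allclose(hist_orig, hist_dup)

se_orig = standard_error(sample)
se_dup = standard_error(duplicated)

print("same histogram shape:", same_shape)
print("means:", sample.mean(), duplicated.mean())
print("standard errors:", se_orig, se_dup)
```

The smaller standard error is the problem: any test or confidence interval built on the copied data claims more precision than 100 real measurements can support.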
There is no magic number of how many cases you are supposed to have. The sample size you need to collect is determined by the size of the entire population, how variable the thing you are measuring is, and how accurate you want to be. The less accurate you need to be, the fewer samples you need to take. But those samples need to be properly selected following the correct sampling procedure for your experiment/investigation. Otherwise, you can't say the sample represents the bigger group.
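To make the "population size and accuracy" point concrete, here is a small sketch using one common textbook approach: Cochran's formula for estimating a proportion, with a finite population correction. This is an assumption on my part about what you are estimating; a mean (like average height) uses a similar formula that needs an estimate of the standard deviation instead. The function name `required_sample_size`, the 95% confidence level (z = 1.96), and the worst-case p = 0.5 are all illustrative choices, and 5,000 is the population size mentioned above.

```python
import math

def required_sample_size(population, margin_of_error, z=1.96, p=0.5):
    """Sample size for estimating a proportion in a finite population.

    Assumptions (illustrative, not specific to any study):
    z = 1.96 corresponds to 95% confidence, p = 0.5 is the worst case.
    """
    # Cochran's formula for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    # Finite population correction shrinks n0 when the population is small.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

# Looser accuracy (a bigger margin of error) needs fewer cases.
for margin in (0.10, 0.05, 0.03):
    print(margin, required_sample_size(5000, margin))
```

Whatever number comes out, it only means something if the cases are picked by a proper sampling procedure.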
You have a set of results. Those are your results. More data may change those results, but those data don't exist yet. You report on what you have, not what you think you're going to get in the future.