How to compare standard deviation in observed data with simulated results?

Hi guys.
I am a Master's student currently running a crop modelling research project. I have field trial data for wheat for two different seasons, for ten different wheat genotypes. In the field trials, each genotype was grown with three replications, and I have averaged these replications to get typical values for each season. For example, in one trial, a particular wheat genotype had a yield of 4.825 tons per hectare (t/ha) in one replication, and 5.075 t/ha and 4.725 t/ha in the other two, giving an average yield of 4.875 t/ha for the season, with a standard deviation of 0.147 between the replications.
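In case it helps to see exactly how I got those numbers, this is the calculation in Python (note that 0.147 is the population standard deviation; the sample version with n−1 gives about 0.180, and with only three replications the choice makes a noticeable difference):

```python
from statistics import mean, pstdev, stdev

reps = [4.825, 5.075, 4.725]  # yields for the three replications, t/ha

avg = mean(reps)          # 4.875 t/ha, the season value I use
sd = pstdev(reps)         # population SD, ~0.147
sd_sample = stdev(reps)   # sample SD (n-1), ~0.180
```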

I have calibrated my crop model using field trial, soil and weather data for the period. When I run it, I get a simulated yield of 4.760 t/ha, a close match with the average observed yield for the season. The difference between my observed and simulated results is 0.115 t/ha, which is less than the standard deviation among the replications. Since standard deviation describes the spread of values in the data, a model that predicts within this range seems reasonably accurate.

However, I'm not sure how to compare the difference between my simulated and observed data with the standard deviation of the observed data, or whether the two are directly comparable at all. I am currently using Root Mean Square Error (RMSE), Willmott's index of agreement and the coefficient of determination (R²) to compare my simulated and observed values.

Is there an appropriate statistical method for comparing simulated and observed results when the observed results are averages that fall within a range of values, rather than a single fixed value? I don't have much background in statistics, and any help or advice any of you can offer is much appreciated :)
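For reference, this is roughly how I'm computing RMSE and Willmott's index at the moment (the yield values here are made up just to illustrate; in my actual analysis each entry is one genotype-season pair of observed average vs. simulated yield):

```python
from math import sqrt

# Hypothetical observed (season averages) and simulated yields, t/ha
obs = [4.875, 5.200, 4.500, 5.000]
sim = [4.760, 5.300, 4.400, 5.100]

def rmse(obs, sim):
    # Root Mean Square Error: typical size of the simulated-observed gap
    return sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))

def willmott_d(obs, sim):
    # Willmott's index of agreement: 1 = perfect match, lower = worse
    obar = sum(obs) / len(obs)
    num = sum((s - o) ** 2 for o, s in zip(obs, sim))
    den = sum((abs(s - obar) + abs(o - obar)) ** 2 for o, s in zip(obs, sim))
    return 1 - num / den
```

So my question is really whether metrics like these can be related to the replication-level standard deviation, or whether that needs a different approach entirely.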