My name is Jan and I am a master's student in tropical ecology. This is my first post and request for help on this forum. I am struggling with a research proposal I am writing. The study is different from what I have done so far, so I am not sure about the statistical implementation. I really hope that one of you can share your knowledge about this.

In a large area in Africa, a not-for-profit organisation runs a large-scale project that tries to motivate farmers to increase the number of naturally growing trees on their farms. Each year, the participating farmers report their number of “new” trees to designated farmers, who collect the data from multiple farmers and send the numbers to the organisation responsible for the programme.

The following applies:

- 115,000 farmers are involved
- 338 villages are included in the project
- 1,200 farmers collect the data from all the participating farmers
- A total of 8.5 million new trees has been reported
- The size of the total programme area (in hectares) is unknown
- The size of each farm (in hectares) is unknown

I will “count” the trees on randomly selected farms (so I use the 115,000 farms as the *population*). I will compare these counts with the number of trees reported each year for those particular farms, in order to assess the accuracy of the numbers reported by the farmers themselves. With that information, I can eventually estimate the actual number of trees in the whole programme. More details:

- I calculated a sample size of 69 farms, based on the following parameters (these were not chosen arbitrarily, but follow an established methodology):
  - Confidence level: 90%
  - Margin of error: 10%
  - Population proportion: 50%
  - Population size: 115,000

- Due to time and resource constraints, I want to use a *clustered random sampling* method:
  - 1: Random selection of 12 villages
  - 2: Random selection of 6 farms per selected village

- Each sampled farm will yield a percentage, namely the number of trees actually found divided by the number of trees reported by the farmer.
- From the sample, I plan to calculate the mean percentage and use it to “correct” the total number of reported trees.
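
To make the plan above concrete, here is a minimal Python sketch of the three steps: the sample size calculation, the two-stage selection, and the correction of the reported total. It rests on several assumptions that are not stated in the post. The sample size is assumed to come from Cochran's formula for a proportion with a finite population correction, the function and variable names are hypothetical, and the village and farm lists are made-up placeholders rather than project data. With the exact z-value for 90% confidence (about 1.645) this formula gives 68 farms. A rounded z of 1.65 would give 69, which is one possible explanation for the figure above, but that is a guess.

```python
import math
import random
from statistics import NormalDist, mean


def sample_size(confidence, margin, proportion, population):
    """Assumed method: Cochran's formula with finite population correction."""
    # Two-sided z-value for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * proportion * (1 - proportion) / margin ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population))


def two_stage_sample(farms_by_village, n_villages, n_farms, seed=None):
    """Stage 1: draw villages at random. Stage 2: draw farms within each."""
    rng = random.Random(seed)
    villages = rng.sample(sorted(farms_by_village), n_villages)
    return {
        v: rng.sample(farms_by_village[v], min(n_farms, len(farms_by_village[v])))
        for v in villages
    }


def corrected_total(counted, reported, reported_total):
    """Mean of the per-farm ratios (counted / reported) times the reported total."""
    # Farms that reported zero trees would need separate handling.
    ratios = [c / r for c, r in zip(counted, reported)]
    return mean(ratios) * reported_total


if __name__ == "__main__":
    print(sample_size(0.90, 0.10, 0.50, 115_000))

    # Hypothetical sampling frame: 338 villages with 10 placeholder farms each.
    frame = {
        f"village_{i}": [f"farm_{i}_{j}" for j in range(10)] for i in range(338)
    }
    selection = two_stage_sample(frame, n_villages=12, n_farms=6, seed=1)
    print(sum(len(farms) for farms in selection.values()))

    # Made-up counts for two farms, only to show the arithmetic.
    print(corrected_total([80, 45], [100, 50], 8_500_000))
```

Selecting 12 villages with 6 farms each yields 72 farms, slightly more than the calculated sample size. The sketch follows the plan as described (a simple mean of per-farm percentages) and does not adjust for the clustering or for differences in farm size.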

Thanks in advance for your help!