p-value for whether chemicals added to existing clusters in the future will show unexpected effects


I clustered chemicals based on their structural features.

Later I realized that:
  1. in some clusters, some (but not all) of the chemicals emit red light when illuminated by a laser,
  2. in some clusters, some of the chemicals emit green light, and
  3. in some clusters, none of the chemicals emit light at all.
For example, cluster 1 contains 5 chemicals, and only 1 of them emits red light when excited by the laser. Cluster 2 contains 34 chemicals, but only 7 emit green light when excited. Cluster 3 contains 12 chemicals, and none of them emits light.

In the future, I will assign new chemicals to these clusters based on their structural features, and I am worried that some of them will emit light of a different color than what I see in their cluster today. For example, I might add a chemical to cluster 3 that emits green light.

How can I statistically test (i.e., obtain a p-value for) the claim that every chemical added in the future will either not emit light or emit the same color already observed in its cluster? In other words:
  1. future chemicals in cluster 1 will either not emit or emit red light,
  2. future chemicals in cluster 2 will either not emit or emit green light, and
  3. future chemicals in cluster 3 will not emit at all.
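
To make the question concrete, here is a rough Python sketch of one calculation that might be relevant. It is not a p-value in the usual sense. It assumes (and this is my assumption, possibly a poor one) that the chemicals in a cluster behave like independent draws with a fixed, unknown probability of emitting an "unexpected" color, and that future chemicals come from the same population. Since I have seen zero unexpected emitters in each cluster so far, it computes the exact one-sided 95% upper bound on that probability, and from it a lower bound on the chance that the next m chemicals all conform. The function names and the choice of m are only for illustration.

```python
# Counts from the example above: cluster -> number of chemicals observed.
# In every cluster, zero chemicals emitted an unexpected color
# (for cluster 3, "unexpected" means any emission at all).
cluster_sizes = {"cluster 1": 5, "cluster 2": 34, "cluster 3": 12}


def upper_bound_zero_events(n, alpha=0.05):
    """Exact one-sided (1 - alpha) upper confidence bound on the event
    probability p when 0 events were observed in n independent trials.
    Solves (1 - p)^n = alpha for p."""
    return 1.0 - alpha ** (1.0 / n)


def prob_all_conform(n, m, alpha=0.05):
    """Lower bound on the probability that m future chemicals all conform,
    evaluated at the upper bound for p (assumes independence)."""
    return (1.0 - upper_bound_zero_events(n, alpha)) ** m


if __name__ == "__main__":
    m = 3  # hypothetical number of chemicals added later
    for name, n in cluster_sizes.items():
        p_up = upper_bound_zero_events(n)
        print(f"{name}: n={n}, upper bound on p = {p_up:.3f}, "
              f"P(next {m} all conform) >= {prob_all_conform(n, m):.3f}")
```

With only 5 or 12 chemicals in a cluster, the bound on the unexpected-emission probability stays large, so under these assumptions the current data cannot rule out a different color appearing later. I am not sure this is the right framing, which is why I am asking.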

I appreciate any suggestions.