# Statistics for bridges with wooden poles underneath

#### RossyAmsterdam

##### New Member
Dear all,

I'm struggling with the kind of statistics I should use for the following problem.

Some bridges in Amsterdam are built on wooden poles (in order for them to not sink due to the high level of the groundwater). I need to evaluate the strength of the wooden foundation on the basis of the strength of a small sample of wooden poles.

The following information is available:
- The bridge has 509 poles underneath
- We have a sample of 35 poles for which we know the strength
- The average strength of these 35 poles is 439 kN
- The standard deviation is 48 kN
- The minimum strength a pole should have is 185 kN

Now I would like to answer the following two questions:
1) What is the probability that all 509 poles have a strength of at least 185 kN (and can I make such a statement with a sample of 35 poles)?
2) How big should my sample actually be if I want to be able to state with a probability of 95% that all 509 poles have a strength of at least 185 kN?

My background is in high-energy physics, where I'm used to working with PDFs (probability density functions) for very large samples. I'm new to problems like the one above, so I'm not certain which statistics to use.
Can I just use a normal distribution (constructed from the average and standard deviation of the sample of 35 poles) and integrate it from minus infinity to 185 kN to get the probability asked for in question 1? In that case I would not use the fact that the total population consists of 509 poles. Would this be the right way to go?
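
For reference, the plug-in calculation described here can be sketched in a few lines of Python. This is only a minimal sketch: it assumes the strengths are normally distributed and independent, and it treats the sample mean and standard deviation as if they were the exact population values. The function name `plugin_prob_all_above` is illustrative.

```python
import math
from statistics import NormalDist


def plugin_prob_all_above(mean, sd, threshold, n_poles):
    """Plug-in normal estimates, treating mean and sd as exactly known.

    Returns (p_single, p_all):
      p_single: probability that a single pole is weaker than `threshold`
      p_all:    probability that all `n_poles` poles reach `threshold`,
                assuming the poles are independent
    """
    p_single = NormalDist(mean, sd).cdf(threshold)
    # (1 - p)^n, computed through log1p to stay accurate for very small p
    p_all = math.exp(n_poles * math.log1p(-p_single))
    return p_single, p_all


p_single, p_all = plugin_prob_all_above(439.0, 48.0, 185.0, 509)
print(f"P(one pole < 185 kN)  = {p_single:.3g}")
print(f"P(all 509 >= 185 kN) = {p_all:.6f}")
```

In this sketch the population size of 509 enters only as the exponent. The uncertainty that comes from estimating the mean and standard deviation from just 35 poles is ignored.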

Or should I work with a Student's t-test and calculate the probability P and the accuracy Z? If so, could someone explain to me how to proceed with the Student's t-test for this specific problem?

I hope my questions are somewhat clear.

Any help is very much appreciated!

#### j58

##### Active Member
The problem with assuming that the strengths have a normal distribution with mean and SD equal to their observed sample values is that the observed sample values are estimates subject to random error, which you must take into account. One way to do this is to perform a Bayesian analysis. You would still assume that the population distribution is normal, but put prior distributions on the mean and SD of the normal distribution. These priors can be chosen to be broad, so that the data more or less speak for themselves. In fact, you can go one step further and put broad prior distributions on the parameters of the prior distributions, allowing the data to effectively choose the most appropriate priors. The result of the analysis is a posterior probability distribution of the strengths, from which you can compute the probability that the breaking strength of any one pole is less than 185 kN; then, assuming independence of pole strengths, you can calculate the probability that at least one of the 509 poles is weaker than 185 kN.
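
To make this concrete, here is a minimal sketch of one simple special case, not the full analysis with priors on the priors. It assumes the standard noninformative prior p(mu, sigma^2) proportional to 1/sigma^2 (my choice for illustration), under which the posterior has a closed form and can be sampled directly without MCMC: sigma^2 is (n-1)s^2 divided by a chi-square draw with n-1 degrees of freedom, and mu given sigma is normal with mean equal to the sample mean and SD sigma/sqrt(n). It also assumes the reported 48 kN is the usual n-1 sample SD. The function name and defaults are illustrative.

```python
import math
import random
from statistics import NormalDist


def posterior_prob_all_above(n_sample=35, mean=439.0, sd=48.0,
                             threshold=185.0, n_poles=509,
                             draws=20000, seed=1):
    """Monte Carlo estimate of P(all n_poles poles >= threshold).

    Assumes normal strengths and the prior p(mu, sigma^2) ~ 1/sigma^2,
    so (mu, sigma) can be drawn from the posterior in closed form.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    dof = n_sample - 1
    total = 0.0
    for _ in range(draws):
        # chi-square(dof) is Gamma(shape=dof/2, scale=2)
        chi2 = rng.gammavariate(dof / 2.0, 2.0)
        sigma = sd * math.sqrt(dof / chi2)
        mu = rng.gauss(mean, sigma / math.sqrt(n_sample))
        # Given (mu, sigma) the poles are independent normals
        p_below = std_normal.cdf((threshold - mu) / sigma)
        total += (1.0 - p_below) ** n_poles
    # Averaging over posterior draws accounts for parameter uncertainty
    return total / draws


print(posterior_prob_all_above())
```

Note that independence is only assumed conditional on the mean and SD; the averaging over posterior draws is what carries the estimation error through to the final probability. Setting `n_poles=474` would restrict the statement to the poles that have not been tested.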

An attractive alternative that doesn't assume any distributional form for the strengths is bootstrapping, which uses the empirical frequency distribution of the sample as an estimate of the distribution of the population. To perform bootstrapping, you would draw a large number (say 1,000,000) of samples of size 509, with replacement, from your sample of 35, and compute the proportion of those samples that include one or more values less than 185 kN. That proportion is your estimate of the probability that at least one pole in the population of 509 has a strength less than 185 kN.
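
A minimal sketch of this bootstrap in Python might look as follows. The individual pole strengths were not posted, so the example draws a synthetic stand-in sample from a normal distribution with the reported mean and SD; those numbers are placeholders, not real measurements. The function name and the smaller default number of resamples are illustrative choices to keep the run time short.

```python
import random


def bootstrap_prob_any_below(sample, threshold, n_poles=509,
                             n_resamples=2000, seed=0):
    """Fraction of bootstrap resamples of size n_poles that contain
    at least one value below `threshold`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        # Draw n_poles values with replacement from the observed sample
        resample = rng.choices(sample, k=n_poles)
        if min(resample) < threshold:
            hits += 1
    return hits / n_resamples


# Synthetic stand-in for the 35 measured strengths (kN); NOT real data.
rng = random.Random(42)
fake_sample = [rng.gauss(439.0, 48.0) for _ in range(35)]
print(bootstrap_prob_any_below(fake_sample, threshold=185.0))
```

One thing worth keeping in mind: a resample can only contain values that were actually observed, so if none of the 35 measured poles is below 185 kN, this estimate is exactly zero. The bootstrap cannot extrapolate beyond the weakest pole in the sample.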


#### Dason

@RossyAmsterdam, can you go into a little more detail on what you're trying to answer with your questions? To me it reads like what you want to know is the probability that all 509 poles have a strength greater than 185 kN. Is that an accurate reading of at least one of your questions?

#### RossyAmsterdam

##### New Member
I think I understand the theory of your first suggestion, but I would not know how to start applying it in practice. Would you use Python, or do you have a better suggestion?
Bootstrapping is new to me, but it sounds interesting as well; I'll dive into this method.
I'll let you know how it works out. Thanks for your help!

@Dason, thanks for your reply! Good question. I'll try to explain in more detail what I'm trying to answer.

What we want to evaluate is the strength of the wooden foundations of 250 bridges. We have taken one bridge as an example to answer several questions. This is the bridge with 509 poles underneath.
We want to evaluate the strength of the foundation of the bridge on the basis of a sample of wooden poles, since it is impossible to sample all poles (it is very expensive and many poles cannot be reached physically).
For this specific bridge we have already sampled 35 poles and calculated their strengths.
Now I have two questions:

- How many poles should we actually sample in order to make a statement about the strength of all 509 poles?
This statement could be something along the lines of: “If we sample XX poles we can say with 95% certainty that all 509 poles have a strength of at least 185 kN” (this is what I intended with question 2 in my initial post).

- Since we already have this sample of 35 poles, we also want to answer the question: “What is the probability that all poles have a strength greater than 185 kN?”
(This is what I intended with question 1 in my initial post.)

Does this make it a bit clearer? Thanks for your help!!

#### j58

##### Active Member
@RossyAmsterdam - In practice, in a Bayesian analysis, the posterior distribution is usually computed by Markov chain Monte Carlo sampling, performed by specialized software such as JAGS or Stan. Usually one does not work directly with these programs, but rather interfaces with them through a purpose-built R or Python package. See the JAGS or Stan website for details.

Since you want to make inferences about a population of bridges you will have to sample from multiple bridges. This implies a somewhat more complex model, with two sources of variation that have to be accounted for: variation between bridges and variation within bridges. Both Bayes and bootstrapping can handle this situation, but the model has to be specified appropriately.

Just out of curiosity, how is the testing performed? Presumably, you don't remove 35 poles from a bridge and hope that the bridge still stands. Do you remove entire poles, replacing each with a new pole, and then break the old pole to determine its strength?

#### RossyAmsterdam

##### New Member

Concerning how the testing is performed: as long as the bridge is not demolished, we cannot remove any poles. What is done instead is that divers go down into the canal to the tops of the poles, which stick out above the denser ground into the water, and use a hand drill to take a small bar out of each pole. This bar is analysed and, with the use of other information, an estimate of the strength is made.
The more precise method is to remove entire poles as soon as a bridge or another construction built on poles is demolished, and then perform analyses on these.

We actually want to use these removed poles (this process is ongoing) to get a precise indication of the state of the poles underneath the bridges that still stand, so that these bridges can be evaluated by taking only the small bar samples.

Hence, a lot of statistical questions come along with this project.

BTW, of course the bridges are still safe enough. This is to evaluate their state and how to handle renovation in the future.

#### j58

##### Active Member

> Hence, a lot of statistical questions come along with this project.

Sounds like an interesting project.

> BTW, of course the bridges are still safe enough. This is to evaluate their state and how to handle renovation in the future.

Since I'll be visiting the Netherlands this summer, I'll just have to trust you.

#### RossyAmsterdam

##### New Member
Hi @j58,
If you don't mind I have one last question about the first method you've described:
If I perform the Bayesian analysis and build the model that you've suggested, will I also be able to determine what my sample size should be if I want to state with 95% certainty that none of the poles has a strength lower than 185 kN? And at a later stage, when I take into account in the model that there are multiple bridges, will I also be able to extract from the model how I should build up my sample (x poles from y bridges)?

Thanks a lot for your help!

And enjoy your visit to the Netherlands

#### martinl

##### Member
Interesting problem. It will require a lot more than 35 sample poles to give you confidence (more than 300), but if it is the average you are after, then fewer would be required.


#### j58

##### Active Member
> It seems that the minimum sample size required to give you confidence is around 55, which gives a probability of 0.048 of having all sample poles >= 185 kN when there is 1 actual pole < 185 kN, a probability of 0.0478 of having all sample poles >= 185 kN when there are 2 actual poles < 185 kN, etc.

ORLY?
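
One way to check figures like these is sketched below. It assumes they refer to drawing the sample at random, without replacement, from the 509 poles (my reading; the text above does not spell this out). Under that assumption, the chance that a sample contains none of the weak poles is a hypergeometric probability. The function names are illustrative.

```python
from math import comb


def prob_sample_misses_all_weak(n_poles, n_weak, n_sample):
    """P(a random sample of n_sample poles, drawn without replacement
    from n_poles, contains none of the n_weak weak poles)."""
    return comb(n_poles - n_weak, n_sample) / comb(n_poles, n_sample)


def min_sample_size(n_poles, n_weak, max_miss_prob=0.05):
    """Smallest sample size whose miss probability is <= max_miss_prob."""
    for n_sample in range(n_poles + 1):
        if prob_sample_misses_all_weak(n_poles, n_weak, n_sample) <= max_miss_prob:
            return n_sample
    return n_poles


for n_weak in (1, 2):
    print(n_weak, prob_sample_misses_all_weak(509, n_weak, 55))
print(min_sample_size(509, 1))
```

With a single weak pole among 509, a sample of 55 misses it with probability 454/509, or about 0.89, so under this reading a sample of 55 would not be enough.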