Need help analysing data


I have been struggling with my data for weeks. I collected a lot of data during my internship. Now I have to analyze it, but I am stuck.

My data always comes from set-ups of three tanks. For those tanks I took many measurements every day over many months.
I don't even dare to think about comparing the different set-ups with each other.

I tried to find a way with SPSS, and my professor told me to do it with GraphPad. However, every time I think I understand what I have to do, I fail.
By now I feel as if I am hitting my head repeatedly against a wall.

Can anybody tell me how I can handle this data in a good way? That is, how to sort it, put it into a good table for the program, and then how to choose the correct test.

I thank everybody in advance for every tip!




TS Contributor
Can you tell us more about your internship? No need to identify the company particulars, but tell us more about the tanks. Is this some type of chemical process, lab experiment, etc.? What type of class are you taking? This would help us guide you. Some approaches that are accepted in certain fields are not accepted in others. For example, there are a lot of tools used in industrial statistics (my field) that are unknown in other fields such as social research.

One insight is that you collected data in time sequence from three different, yet similar processes. The type of testing will depend on your answers to my questions as well as your research question(s). Do you have a research question, or is this a case where you have data and you want to know what to do with it? Possible research questions: 1) Is there a difference between the 3 tanks?; 2) Is the process stable over time?; 3) Did experimental treatments have an impact? Or are you simply exploring the data to see if you find something interesting?

Your professor's recommendation of GraphPad may indicate that they are interested in graphical data exploration. You could use time series plots to look at process stability over time, or box plots to compare the median and spread of the three tanks. If your data includes process inputs and outputs, you could use scatter plots to explore potential relationships between variables.
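On the question of how to lay out the table: one common arrangement is a "long" table with one row per tank, per day, per measurement. Below is a minimal Python sketch of that idea, using only the standard library. The column names (setup, tank, day, measurement, value) and all numbers are made up for illustration, since I have not seen your data. It computes the five numbers a box plot draws (minimum, quartiles, median, maximum) for each tank.

```python
import csv
import io
import statistics
from collections import defaultdict

# Hypothetical long-format table: one row per tank, per day, per measurement.
# Column names and values are invented for illustration only.
RAW = """setup,tank,day,measurement,value
A,1,1,ammonium,4.0
A,1,2,ammonium,3.5
A,1,3,ammonium,3.0
A,1,4,ammonium,2.5
A,1,5,ammonium,2.0
A,2,1,ammonium,4.2
A,2,2,ammonium,3.8
A,2,3,ammonium,3.1
A,2,4,ammonium,2.9
A,2,5,ammonium,2.0
A,3,1,ammonium,5.0
A,3,2,ammonium,4.5
A,3,3,ammonium,4.0
A,3,4,ammonium,3.5
A,3,5,ammonium,3.0
"""


def load_long(text):
    """Parse the CSV text into a list of dicts with numeric day and value."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["day"] = int(row["day"])
        row["value"] = float(row["value"])
    return rows


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the numbers behind a box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def summarise_by_tank(rows, measurement):
    """Group one measurement by (setup, tank) and summarise each group."""
    groups = defaultdict(list)
    for row in rows:
        if row["measurement"] == measurement:
            groups[(row["setup"], row["tank"])].append(row["value"])
    return {key: five_number_summary(vals) for key, vals in sorted(groups.items())}


if __name__ == "__main__":
    for key, summary in summarise_by_tank(load_long(RAW), "ammonium").items():
        print(key, summary)
```

A long table like this can usually be filtered to one measurement at a time and then pasted into a statistics program with one column per tank, which is the kind of layout a box plot or time series plot needs.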
Those were tanks of bacteria that took up nitrogen compounds in their biomass to clean the water.
Although we made the different set-ups quite comparable, there are differences. Therefore we want to compare not only the tanks within each set-up with each other but, if possible, also the different set-ups with each other. Thus 1) is our main research question, but 3) is one as well.

What would it mean to "simply explore the data"? If it is not extremely complicated, that would also be very interesting.

I have come one step further in my exploration. I have decided that it's impossible to look at all the data at the same time. Hence, I will first look at the most important measurements separately. Later, I will see whether I have enough time to analyze everything.


TS Contributor
Ok. I am going to speculate a little here since I know little about the domain. Since the point of the experiment is to investigate how the bacteria clean the water, I presume that the response variable is changing over time. Therefore, you cannot simply lump the data together and average it. I recommend that you treat this as a repeated measures experiment and use a repeated measures ANOVA to analyze it.
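To show what the software does under the hood, here is a rough Python sketch of the arithmetic behind a one-way repeated measures ANOVA. It rests on assumptions that I cannot confirm from the thread: each measurement day is treated as the block (the "subject" in textbook terms), the tank is the factor being compared, there is exactly one value per tank per day, and the numbers are invented. It stops at the F statistic. The p-value would come from the F distribution with the stated degrees of freedom, which your statistics program reports for you. It also ignores the fact that consecutive days are likely correlated.

```python
# Rows = measurement days (treated as blocks), columns = tanks 1 to 3.
# All values are invented for illustration only.
DATA = [
    [4.0, 4.2, 5.0],
    [3.5, 3.8, 4.5],
    [3.0, 3.1, 4.0],
    [2.5, 2.9, 3.5],
    [2.0, 2.0, 3.0],
]


def rm_anova(table):
    """Split the total variation into tank, day, and error parts.

    Returns the sums of squares, degrees of freedom, and the F statistic
    for the tank effect (mean square for tanks / mean square for error).
    """
    n_days = len(table)
    n_tanks = len(table[0])
    grand_mean = sum(sum(row) for row in table) / (n_days * n_tanks)

    tank_means = [sum(row[j] for row in table) / n_days for j in range(n_tanks)]
    day_means = [sum(row) / n_tanks for row in table]

    ss_total = sum((x - grand_mean) ** 2 for row in table for x in row)
    ss_tank = n_days * sum((m - grand_mean) ** 2 for m in tank_means)
    ss_day = n_tanks * sum((m - grand_mean) ** 2 for m in day_means)
    # What is left after removing tank and day effects is the error term.
    ss_error = ss_total - ss_tank - ss_day

    df_tank = n_tanks - 1
    df_error = (n_tanks - 1) * (n_days - 1)
    f_stat = (ss_tank / df_tank) / (ss_error / df_error)

    return {
        "ss_total": ss_total,
        "ss_tank": ss_tank,
        "ss_day": ss_day,
        "ss_error": ss_error,
        "df_tank": df_tank,
        "df_error": df_error,
        "f_tank": f_stat,
    }


if __name__ == "__main__":
    for name, value in rm_anova(DATA).items():
        print(name, round(value, 4))
```

The point of removing the day-to-day variation (ss_day) is that the response changes over time, so the tanks are compared within each day rather than against one pooled average.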

Exploring the data means using graphical techniques. Plot the response for each tank over time in a time series plot. Plot a box plot for each tank. Plot scatter plots using tank settings as independent variables and the response as a dependent variable. In the words of Ellis Ott, father of industrial statistics, "Plot the data!".