How to approach my data?

Hello guys, please can you help me?

I would like to know the effect of the forced online environment during the pandemic on work performance.
I would like to compare teams that were already working online before the pandemic with teams that moved online because of the situation.

I have those variables:
Virtuality - The level of virtuality before the pandemic, in percent (the percentage of team members sitting together in one office). Team sizes are heterogeneous, and the pre-pandemic ratio ranges from 42% to 100%, so no team was ever fully virtual and we can only differentiate between semi-virtual and physically collocated teams. Any team where 90% or more of the members are physically present in the same office is considered a "collocated" team, teams where at least 75% of members work from the same office are marked as "semi-virtual", and accordingly any team with less than 75% of members in one office is considered "virtual".
Creation date of order - pre- or post-COVID
Reopen count - How many times the client returned the order (only the values 0 and 1 occur)
Duration - How long it took to resolve order
Updates - How many interactions were needed to complete the order
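
The virtuality thresholds described above can be written down as a small function, which makes the cut-offs explicit before any grouping is done. This is only a sketch: the function name `classify_team` and the example teams are made up for illustration, and the input is assumed to be the percentage of members in one office (0-100).

```python
def classify_team(pct_in_one_office: float) -> str:
    """Map the share of members in one office (0-100) to a group label."""
    if not 0 <= pct_in_one_office <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if pct_in_one_office >= 90:   # 90% or more in the same office
        return "collocated"
    if pct_in_one_office >= 75:   # at least 75% but under 90%
        return "semi-virtual"
    return "virtual"              # less than 75% in one office


# Hypothetical teams, for illustration only (not real data)
teams = {"team_a": 100, "team_b": 80, "team_c": 42}
labels = {name: classify_team(pct) for name, pct in teams.items()}
print(labels)
```

Keeping the raw percentage next to the label is probably worthwhile, since cutting a continuous variable into groups throws information away.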

The dataset has almost 50,000 orders (from 50 teams in total).

I thought I would use ANCOVA to compare the groups, but I am new to statistics, so I would be grateful for any suggestions on how to approach my data.
Thank you!


TS Contributor
How do you want to treat Virtuality, as noise to be controlled for, or as a variable of interest? Using ANCOVA on a covariate is conceptually the same as blocking. It is used to pull the variation due to noise out of the model to make a more powerful test.

ANCOVA will treat Virtuality as noise to be controlled. You could use multiple regression with Virtuality as a variable and Pre/Post COVID as a categorical predictor. This would make Virtuality part of the explanatory model rather than treating it as noise.
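
As a rough sketch of the regression idea, here is ordinary least squares with Virtuality as a numeric predictor and Pre/Post COVID coded as a 0/1 dummy. Everything here is an assumption for illustration: the orders are made-up numbers, the helpers `solve` and `ols` are hypothetical, and the model ignores that orders are clustered within teams. In practice a statistics package would do this fit and also report standard errors.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back-substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up orders: (percent of team in one office, created post-COVID?, duration in seconds)
orders = [
    (100, False, 4000), (100, True, 4300),
    (80, False, 4200), (80, True, 4500),
    (50, False, 4500), (50, True, 4800),
]

# Design matrix columns: intercept, virtuality, post-COVID dummy (0 = pre, 1 = post)
X = [[1.0, float(v), 1.0 if post else 0.0] for v, post, _ in orders]
y = [float(d) for _, _, d in orders]

b0, b_virt, b_post = ols(X, y)
print(round(b0, 3), round(b_virt, 3), round(b_post, 3))
```

Here `b_post` is the estimated pre/post shift in Duration at a fixed level of Virtuality. If the question is whether that shift differs by Virtuality, one could add a product column (virtuality times the dummy) as an interaction term.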


Fortran must die
The difference between treating a variable as a variable, as noise, or as a moderator has always confused me. They are the same thing statistically to me.

What is the dependent variable and how is it coded (interval....)? That determines a lot of how you approach things.
Thank you very much for your responses.

The dependent variables are Duration (in seconds) and Updates (a discrete variable).

There is also one problem: (how) should I distinguish between the level of virtuality and the teams?
To explain: each record has the order number, then the team it was assigned to, and then the other variables.

One more problem is that one team works on several orders at the same time, which complicates things as well. Maybe some multi-level approach?
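
One way to get a feel for whether a multi-level approach matters is to group the orders by team and look at how much of the variation in Duration sits between teams, for example with the one-way ANOVA intraclass correlation, ICC(1). The sketch below is just that, a sketch: the function names and the orders are made up, and a real multi-level (mixed) model would be fitted with a statistics package rather than by hand.

```python
from collections import defaultdict
from statistics import mean


def group_by_team(orders):
    """Collect durations per team from (team, duration) records."""
    by_team = defaultdict(list)
    for team, duration in orders:
        by_team[team].append(duration)
    return dict(by_team)


def icc1(by_team):
    """One-way ANOVA intraclass correlation, allowing unequal team sizes."""
    groups = list(by_team.values())
    n_total = sum(len(g) for g in groups)
    n_groups = len(groups)
    grand = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)
    # Effective group size for unbalanced data
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (n_groups - 1)
    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)


# Made-up (team, duration in seconds) records, for illustration only
orders = [
    ("A", 100), ("A", 110), ("A", 90),
    ("B", 200), ("B", 210), ("B", 190),
    ("C", 300), ("C", 310), ("C", 290),
]

by_team = group_by_team(orders)
icc = icc1(by_team)
print(round(icc, 3))
```

A value near 0 would suggest that orders from the same team are no more alike than orders from different teams, while a clearly positive value suggests the team level should be modelled, e.g. with a random intercept per team and Virtuality as a team-level predictor.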