Looking for guidance on the best method to make sense of data

#1
Hi folks,

I am not a student, but am trying to make better sense of data. I am very familiar with R for a number of purposes, but am not sure of the technical name for what I am trying to calculate here.

The background:
I am measuring how a group of people complete a task. Many of these tasks are entered into a request system, each with a due date. The request system tracks time and manages the state of the tasks. We are ultimately focused on "days past due", bucketed by group, for a variety of reasons not worth getting into.

The data points:
The numbers vary, but generally could be summarised like the numbers below (though we have the raw data and can measure other data points if necessary):

Code:
+------------+-----------------------+---------+
| Group Name | Average Days Past Due | Total # |
+============+=======================+=========+
| A          | 98                    | 20      |
+------------+-----------------------+---------+
| B          | 64                    | 39      |
+------------+-----------------------+---------+
| C          | 36                    | 3885    |
+------------+-----------------------+---------+
| D          | 24                    | 548     |
+------------+-----------------------+---------+
| E          | 14                    | 64      |
+------------+-----------------------+---------+
| F          | 145                   | 1102    |
+------------+-----------------------+---------+
Pooling the raw data tells us that our overall average is something like 56.23082361 days past due.
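For reference, the overall figure can be reproduced from the summary table alone, since the pooled mean is the count-weighted mean of the group means. Below is a minimal Python sketch, assuming the table values are exact; the variable names are illustrative only.

```python
# Group summaries copied from the table above: (mean days past due, count).
groups = {
    "A": (98, 20),
    "B": (64, 39),
    "C": (36, 3885),
    "D": (24, 548),
    "E": (14, 64),
    "F": (145, 1102),
}

# Total number of requests across all groups.
total_count = sum(n for _, n in groups.values())

# Pooled mean = sum(group mean * group count) / total count.
pooled_mean = sum(m * n for m, n in groups.values()) / total_count

# Each group's plain weight (its share of all requests).
shares = {name: n / total_count for name, (_, n) in groups.items()}

print(total_count)            # 5658
print(round(pooled_mean, 4))  # 56.2308
```

The `shares` dictionary holds the plain weights (for example 20/5658 for Group A), which by themselves say nothing about how far each group's mean sits from the overall mean.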

The problem/question:
I want to measure the relative impact of all of these numbers, but am having trouble finding the actual name for what I want to figure out. Sure, a basic weight could be calculated by saying that Group A is 20/5658, but weight is not what I am going for, rather "impact" (which is probably not the real name for what I am thinking of).

In other words, what really moves the needle? What is the term I am looking for to determine the relative impact of each group on the numbers (focused on high ones)? In the example above, C has a large number of total issues but a relatively low average days past due, so it likely puts downward pressure on the mean. Group A has a higher mean days past due, but its total is much lower.

Am I overcomplicating things, and should I just stick to looking at weight or at issues individually? Is there a term that I can research more and add to the summary to make better sense of these numbers?
 

Miner

TS Contributor
#2
I would start with a standard box plot of the individual values by group. Next I would use a modified box plot (where the box encloses the 5th through 95th percentiles). This is called SPAN-90 and trims the outliers. Customers are usually more upset by unpredictable times than by longer times. Which would you prefer: the cable company can come tomorrow sometime between 8 am and 6 pm, or they can come next week between 1 and 2 pm?

Usually, differences between groups are so blatantly obvious that no statistical test is necessary.
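The 5th-to-95th-percentile span described above could be computed along these lines. This is only a sketch: the sample values are made up for illustration (they are not the thread's raw data), the group labels "X" and "Y" are hypothetical, and the "inclusive" interpolation method is an assumption, since the post does not say which percentile definition to use.

```python
from statistics import quantiles

def span90(values):
    """Return (5th percentile, 95th percentile, width of the span).

    quantiles(..., n=20) yields 19 cut points at 5%, 10%, ..., 95%,
    so the first and last entries are the 5th and 95th percentiles.
    """
    cuts = quantiles(values, n=20, method="inclusive")
    p5, p95 = cuts[0], cuts[-1]
    return p5, p95, p95 - p5

# Hypothetical days-past-due values for two made-up groups.
sample = {
    "X": [28, 29, 30, 30, 31, 31, 32, 33, 34, 35],  # predictable
    "Y": [2, 5, 9, 14, 20, 31, 45, 60, 75, 90],     # erratic
}

for name, values in sample.items():
    p5, p95, width = span90(values)
    print(f"{name}: 5th={p5:.1f}  95th={p95:.1f}  span={width:.1f}")
```

A narrow span points to a predictable group even when its typical delay is long, which is the comparison the cable company example is making.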
 
#3
I guess what I am looking for is more "relative downward pressure" and "relative upward pressure" on the final mean days past due number. I want to show how much influence each team's numbers have on the final product. Something like Group A is 2 (some number that shows there is little impact from both their own average and their total) and Group F is 42 (some number that shows there is a lot of impact from both their own average and their total). Alternatively, negative numbers showing relative downward pressure/influence and positive numbers showing relative upward pressure/influence.

I see this as being different than a standard weight, as weight would just take into account the number of issues and not the mean days past due.

Again, I am not sure if I am making sense here, so feel free to tell me that I am not making any. :)
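One possible way to get a signed number of the kind described here is to multiply each group's share of requests by the gap between its mean and the overall mean. That combines count and mean in a single figure, is positive for upward pressure and negative for downward pressure, and sums to zero across groups. This is offered as a sketch of one candidate measure, not as the established name for the concept; the function name `mean_contributions` is made up for illustration.

```python
# Group summaries from the table in post #1: (mean days past due, count).
groups = {
    "A": (98, 20),
    "B": (64, 39),
    "C": (36, 3885),
    "D": (24, 548),
    "E": (14, 64),
    "F": (145, 1102),
}

def mean_contributions(groups):
    """Return the overall mean and each group's signed contribution.

    contribution = (count / total count) * (group mean - overall mean)
    """
    total = sum(n for _, n in groups.values())
    overall = sum(m * n for m, n in groups.values()) / total
    contrib = {
        name: n / total * (m - overall)
        for name, (m, n) in groups.items()
    }
    return overall, contrib

overall_mean, contributions = mean_contributions(groups)

# List groups from strongest upward pull to strongest downward pull.
ranked = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
for name, value in ranked:
    print(f"{name}: {value:+.2f} days")
```

With the table's numbers, F comes out at roughly +17.3 days and C at roughly -13.9 days, while A lands near +0.15 despite its high mean because it has so few requests. An alternative worth comparing is a leave-one-out figure: recompute the overall mean without each group and see how far it shifts.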