Arthritis trial data: Advanced multivariate stats problem.

I am a doctor planning a post-hoc exploratory analysis of some clinical trial data and would like some suggestions as to the best approach.
Patients with arthritis have many inflamed joints; in clinical trials, doctors test 66 of these for tenderness and swelling. The result is trinary: 0; no involvement, 1; unilateral involvement and 2; bilateral involvement. Effective treatments improve swelling and tenderness and lead to a reduction in the summed score of all 66 joints. My drug has an overall similar effect as another drug in improving this overall score in a randomised trial, but I want to know the typical pattern of response for my drug vs the second drug, which has a different mode of action and may work better or worse in specific joints. We don't have a firm hypothesis as to which joints these may be, so we are looking at a purely exploratory analysis.

The endpoint would be the change in score in a given joint over time, say after 3 months. Other time points (1,3,6,12 months) are available if a repeated-measures approach incorporating the change over multiple timepoints is possible/desirable, but the response at 3 months is most relevant.

What options are open to us for defining the pattern of changes (deterioration and improvement) in the trinary score in 66 different joints? As the patients globally improve, these changes are likely to be highly intercorrelated. There are, of course, many additional demographic and clinical factors that may influence the pattern of joint responses, and it would be good to account for these.

I am not looking for a finalised statistical analysis plan here but would really like some pointers on what is possible / excluded so I can do some directed pre-reading before talking to the stats guys. I have been reading about cluster analysis and PCA or LDA, but I am not sure whether these methods are applicable to trinary data, and it's not clear to me how to include covariates like age, weight, gender etc., which may well interact with joint-specific drug responses. Also, my groups (drug 1 and drug 2) are known and I want to establish the characteristics that define them, not generate PCs and explain them. I suppose in this sense it's more of a classification problem.

I'm very grateful for all input and happy to add more info if anything isn't clear. Thanks in advance. TJ
trinary: 0; no involvement, 1; unilateral involvement and 2; bilateral involvement.
I am sorry, but you lost me here. What does "unilateral involvement" mean? Which one is the healthiest: "no involvement" or "bilateral involvement"?

Does this mean that you will have 66 response variables, one for each joint, and that each can take a value of "0", "1" or "2"?
And then you would also have explanatory factors like drugs used, gender, age, heritability etc.
Then it sounds like multinomial regression, but with all these 66 joints....hmm.

I am not looking for a finalised statistical analysis plan here but would really like some pointers on what is possible / excluded so I can do some directed pre-reading before talking to the stats guys.
I would like to applaud you for this, to prepare and to go to the stat guys. And to be prepared to pay for the advice.
Many thanks for your quick reply. As joints always come in pairs, we’re actually combining left and right for reasons related to the hypothesis, so we have 33 “joint pairs” which may be 0, 1 or 2: neither affected, one affected, both affected. Healthy is 0, less healthy is 1 and unhealthy would be 2.

My reading so far seems to indicate MANOVA and descriptive discriminant analysis...

We’re very much steeling ourselves for the stats bill! But it’s an important topic so they’re earning it...
Patients are assessed at multiple timepoints and it's the change between baseline and t=x (probably 3 months) that we're interested in, unless there are major methodological advantages in including the other timepoints.
So if I have a pain in both of my two thumbs in the lowest joint, that would be "2"?

Or if you code it as 66 variables it would be "1" for left thumb and "1" for right thumb? Am I correct?

There must be a strong and well known "correlation" here, or rather frequencies. So that left hand is symmetric to the right hand. But also that two close fingers on the same hand have correlated values. So there must be a well known Bayesian prior that can be used. (That prior would be based on real data.)
So if I have a pain in both of my two thumbs in the lowest joint, that would be "2"? YES

It's slightly more complicated - the raw score is binary yes/no for two aspects of joint involvement - tenderness and swelling. We are only interested in presence of disease in the joint yes/no so have coded this as no disease / disease (disease being either or both). These aspects are definitely highly correlated. In a similar vein, I want to ignore the left/right aspect for this post-hoc analysis as we actually assume that the genetic environment of each joint left and right is identical. The adjacent joints will be correlated to some degree, although it is more complex than a simple radial gradient as the disease has characteristic anatomical distribution (interesting paper on this topic:

Our aim is to see if a given joint pair involved at t=0 and improved at t=x is specific to treatment 1 vs treatment 2 and vice versa, so it is essentially the pattern of these deltas in the joints that we want to map to treatment 1 or 2.
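To make the coding and the "pattern of deltas" concrete, here is a minimal descriptive sketch in Python. It is not an analysis plan. The function names, the record layout and the data are all made up for illustration; the only things taken from the thread are the coding rule (a joint counts as diseased if it is tender, swollen or both; a pair scores 0, 1 or 2) and the restriction to joint pairs that were involved at t=0.

```python
from collections import defaultdict


def pair_score(left_tender, left_swollen, right_tender, right_swollen):
    """Score one joint pair: 0 = neither side, 1 = one side, 2 = both sides diseased.

    A side counts as diseased if it is tender, swollen, or both.
    """
    left = int(left_tender or left_swollen)
    right = int(right_tender or right_swollen)
    return left + right


def mean_change_by_arm(records):
    """Mean change in pair score per (arm, joint), for pairs involved at baseline.

    records: iterable of (arm, joint, baseline_score, followup_score).
    Negative values mean improvement.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for arm, joint, baseline, followup in records:
        if baseline == 0:
            continue  # pair not involved at t=0, so it cannot improve
        sums[(arm, joint)] += followup - baseline
        counts[(arm, joint)] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical records, invented purely to show the shape of the output.
records = [
    ("drug1", "wrist", 2, 0),
    ("drug1", "wrist", 1, 1),
    ("drug2", "wrist", 2, 2),
    ("drug2", "wrist", 1, 0),
    ("drug1", "knee", 0, 0),  # skipped: not involved at baseline
]
changes = mean_change_by_arm(records)
for (arm, joint), delta in sorted(changes.items()):
    print(arm, joint, delta)
```

A table like this, one row per joint pair and one column per drug, is only a descriptive starting point; it ignores the correlation between joints and any covariates.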
What is the treatment? Is it that you treat the whole body, so to speak, maybe with some drugs? Or could it be that you can treat just one joint?

I believe that the great statistician Box started his career by treating one side of the body (randomized) and then used the other side as a control. That is clever since the individual will be his own control with the same environment and genetics. But I guess it is not like that now.


Whatever you are trying to do @Markov_Spirello - it seems convoluted. However I will point out that if patients are randomized you don't need to control for baseline covariates, since they shouldn't be imbalanced. Unless you have loss to follow-up or some systematic bias. If you think otherwise please elaborate.

I would look to the literature in your field, rheumatology,..., maybe even ophthalmology (since they deal with paired observations from a single unit) to see how others have approached comparable topics. Examining lots of joints post hoc should be considered suspect for false discovery, since it involves many tests.
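To illustrate the false discovery point, here is a small sketch of the Benjamini-Hochberg step-up procedure, one common way of handling many tests at once. Nobody in the thread proposed this particular method; it is shown only as an example. The p-values are invented, and the procedure in this basic form assumes independent or positively dependent tests, which would need checking for correlated joints.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where a test is a discovery at FDR level q."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is declared a discovery.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for five joint pairs (made up for illustration).
p = [0.001, 0.008, 0.039, 0.041, 0.20]
naive = [value < 0.05 for value in p]
adjusted = benjamini_hochberg(p, q=0.05)
print(sum(naive), "nominally significant;", sum(adjusted), "after BH")
```

With these invented numbers, four joint pairs look significant at the nominal 0.05 level but only two survive the correction, which is the kind of shrinkage to expect when 33 joint pairs are tested post hoc.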

There is no overall score difference, but you are trying to tease out regional differences without knowing what they are. If I was on this project, I wouldn't even run any statistics beyond descriptive stats. Just write a paper contrasting the joint scores and drug group and call it a hypothesis generating study. There appear to be many potential subgroups and you will likely face a curse of dimensionality issue: if you have 33 potential joints affected yes/no, you could have 2^33 groups (8589934592), and more again with 3 score options each (I think), whew.
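For reference, the count quoted above (8589934592) is 2 to the power 33, the number of distinct yes/no patterns over 33 joint pairs. A quick check in Python, with the trinary case added for comparison:

```python
n_pairs = 33  # joint pairs, as described earlier in the thread

# Each pair is either affected or not: 2 options per pair.
binary_patterns = 2 ** n_pairs

# Each pair scores 0, 1 or 2: 3 options per pair.
trinary_patterns = 3 ** n_pairs

print(binary_patterns)   # 8589934592
print(trinary_patterns)  # roughly 5.6e15
```

Either way, the number of possible patterns dwarfs any realistic trial size, so most patterns will never be observed.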

I can usually see a troublesome or at least difficult project coming down the pipeline and this definitely seems like one. In addition, this doesn't appear to be the primary study outcome for the dataset, so it may be underpowered and not well structured for this hypothesis, thus contributing to the repeatability crisis in medicine.
Hlsmith is a good contributor, but right now I don't understand what he is saying.

Of course it is good to do before and after measurements. If the standard deviation is 10 and you just randomize then the std will be 10. But if the difference within patients is 1 and you do before/after, the standard deviation will be just 1. So, a much better design.
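The numbers in this example follow from the usual formula for the variance of a difference between two correlated measurements. A small sketch, where the function name and the correlation values are my own assumptions chosen to reproduce the 10 versus 1 figures:

```python
import math


def sd_of_change(sd_baseline, sd_followup, correlation):
    """Standard deviation of (followup - baseline) for two correlated measurements.

    Var(Y1 - Y0) = sd0^2 + sd1^2 - 2 * correlation * sd0 * sd1
    """
    variance = (
        sd_baseline ** 2
        + sd_followup ** 2
        - 2 * correlation * sd_baseline * sd_followup
    )
    return math.sqrt(variance)


# Between-patient sd of 10 at both visits, as in the example above.
for rho in (0.0, 0.5, 0.995):
    print(rho, round(sd_of_change(10, 10, rho), 2))
```

With equal standard deviations at both visits, the change score is less variable than a single measurement only when the within-patient correlation exceeds 0.5; getting the standard deviation down from 10 to 1 needs a correlation of about 0.995.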

It is not strange with 33 or 66 response variables. Look at any sample survey they send you.

So far, neither Hlsmith nor I know whether the study is underpowered or not.
@Markov_Spirello - This is certainly an interesting statistical problem. I have some ideas, but I have a couple preliminary questions: First, I'm concerned about the lack of sensitivity of your outcome(s). A reduction in the trinary score for any joint-pair would imply complete disappearance of tenderness or swelling in at least the right or left joint. Are the drugs you plan to compare so effective that you would expect to see such changes?

My second question is how many patients do you have data on for each drug?


Yeah, but that description is a little unclear. If I had to stretch my brain - I would guess that the OP means that everyone has at least 1 joint with pain. The joint with the least number of individuals with pain only has 150 and that is for the treatment group with the fewest individuals. While some joints may have up to 450 individuals with pain in the larger of the two treatment groups.

I don't think my prior rant was incoherent. I am just conflicted by the number of subsets and how you can probably get individuals in multiple analyses. When the OP runs an analysis to see if pain in the hip joint is reduced, they then go on to a second analysis to see if pain is reduced in the knee joint, then ankle, wrist, elbow, shoulder, etc. There are subjects in some of these repeated groupings and not others, I would think you would want to control for the number of joints with pain and not that they aren't independent. If my knee joint has pain, well hey, I am going to get secondary hip and back pain from body mechanics. Then, given those ailments, I am going to get back pain from poor sleep. Now drug one helps physiologically with the knee and the other joints are indirectly alleviated, but not due to the direct efficacy of the drug on those joints. So the etiology of joint pain seems important as well; control for that. There ends up being a large web of things that should be controlled for.

Question, were individuals and clinicians blinded?