I am studying the association between product quality as measured by an expert intermediary (*overallScore*) and as measured by online users (*rating*). In particular, I hypothesize that this association depends on the longevity of product use. For instance, I would expect a positive correlation if the user rating were given right after the product purchase, and a negative correlation if the user rating were given after considerable use of the product. The longevity of use (*moder*) is captured at four levels (1 = < 1 month, 2 = 1-3 months, 3 = 3-6 months, and 4 = > 6 months). The descriptive statistics of the three variables are given below:

Further, below I include a screenshot of the binned scatterplot of the relationship between *overallScore* and *rating* by levels of *moder*, which provides some support for my initial assumption:

Now, my dataset has the following structure:

– User *rating* is captured at the individual level for a given product identified with a *name_id* (the total number of products is 109). The minimum number of ratings per product is 3, and the maximum is 583.

– Expert intermediary *overallScore* is captured at the *name_id* level.

– And finally, each *name_id* is associated with a product *category_id*.

As far as I understand, under this data structure each *rating* is nested within a *name_id*, and each *name_id* is nested within a *category_id*. Therefore, to formally test my hypothesis on the moderating role of longevity of use, I fit a hierarchical linear model using the -mixed- command in Stata:
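In equation form, a three-level specification of this kind could be sketched as follows. This is only an illustration under stated assumptions: random intercepts only (no random slopes), *moder* entered as a categorical variable with level 1 as the reference, and *moder* varying at the level of the individual rating. The symbols are introduced here for illustration and are not taken from any Stata output.

$$
\begin{aligned}
\text{rating}_{ijk} ={}& \beta_0 + \beta_1\,\text{overallScore}_{jk}
 + \sum_{m=2}^{4} \gamma_m\,\mathbb{1}[\text{moder}_{ijk} = m] \\
 &+ \sum_{m=2}^{4} \delta_m\,\text{overallScore}_{jk}\,\mathbb{1}[\text{moder}_{ijk} = m]
 + u_k + v_{jk} + \varepsilon_{ijk}
\end{aligned}
$$

Here $i$ indexes individual ratings, $j$ products (*name_id*), and $k$ categories (*category_id*); $u_k$ and $v_{jk}$ are the category-level and product-level random intercepts, and $\varepsilon_{ijk}$ is the rating-level residual. Under this parameterization, the slope of *overallScore* is $\beta_1$ at level 1 of *moder* and $\beta_1 + \delta_m$ at level $m$, so the hypothesis corresponds to $\beta_1 > 0$ together with $\beta_1 + \delta_4 < 0$.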

Note (1): In my data, the expert rating is published first, and users then have an opportunity to provide their evaluations; therefore, I am using the user rating as the outcome here.

Note (2): I did run several robustness tests with estimators more suitable for a bounded interval outcome (i.e., 1-5).

Does this seem like a plausible approach? I would appreciate your advice.