Search results

1. Linear Probability Model

This is when you use linear assumptions for a DV that has two levels only. SAS's senior statistician told me firmly this only works when you assume a binomial distribution. However, the organization I report to chose to assume normality. As I understand it, this is not then a Linear Probability...
2. Control variables

I run a time-consuming report for a bunch of variables the federal government controls for. My argument is that it does not matter what their levels are, because the federal government will control for them in evaluating our success (that is their purpose: to level the playing field). But then I...
3. MCAR, MAR, MNAR test

I read an article today that stressed the need, when data are missing for more than two percent of cases, to test whether the missingness is MCAR, MAR, or MNAR. Does anyone know how to do such a test? I understand MAR is tied to being associated with the predictors and MNAR with Y, but not how to do a formal test...
4. Importance of regression assumptions

Long ago I was brought up on the view that analysis of the residuals was critical for regression. But I am confused about the advice on it now. This is for data sets that have thousands (or tens of thousands) of points. That is, how important is residual analysis? Normality: The sense I get is...
5. Running simulations in R

I have wanted to do this for a long time. But the package that does this in SAS (PROC IML) is too expensive. I was wondering, bearing in mind that I am a beginner in both R and simulations, what a good R package would be for this.
6. proc genmod binary DV linear probability model

Three questions. First, for the predictors that have two levels, proc genmod shows the results associated with the zero level of the predictor. I want to know the probability associated with being in the 1 level (for example being Hispanic when you are measuring if one is Hispanic or not). To...
7. Interpreting dummy variables.

After all these years of reading about regression, this should be simple to do... "Impact" refers to the regression slopes for dummy variables. I should say that the excluded reference group here does not seem like a good choice to me: they are those under 16, of whom we have extremely few, and most likely they earn very...
8. Hlsmith, is this something you would be interested in?

The generation of large metabolomic data sets has created a high demand for software that can fit statistical models to one-metabolite-at-a-time on hundreds of metabolites. We provide the %polynova_2way macro in SAS to identify metabolites differentially expressed in study designs with a two-way...
9. Multilevel Models

I am starting to use this 7 years after I studied it, having largely forgotten the method. I have lots of questions, but to start with, there seems to be disagreement on whether you should center your predictors at the lower (individual) level or not. And if you do, should you use group or grand...
10. Trouble connecting to R

My office closed and we now work from home. I connect through a VPN (it is a state computer and there are serious concerns with security). My internet connection, which used to be the state one (now turned off), is in practice 60 megs down (in theory 100, but it never is) and 5 up. Things are a bit...
11. Relative impact

Say, for OLS regression, you get asked which of the variables has the most impact on the dependent variable, so that you can rank the predictors from most to least influential. Say most of the predictors are dummy variables but some are not. Is there a good way to do this? In a decade plus I have...
12. These are the articles that I always find discouraging

It suggests key views of academics are incorrect (and of course others will disagree with this author). For someone who is not a researcher and will never know who is right, it raises the question of the validity of doing statistics when you are not a very high-end analyst/researcher. Or makes...
13. Partially standardized coefficients

This is for multilevel models. I don't understand what they are saying here exactly. What are partially standardized coefficients? And how do you calculate them? I don't understand what "dummy code the binary covariate" means. They are already dummy variables...
14. multilevel models

I circled back to them; we are finally using them in analysis... I have some questions, starting with this one. Unfortunately I no longer have the book this is based on. But I am hoping someone will know generally. It has to do with the estimators REML and ML. It is a two level model. ML...
15. Explaining regression to a non-technical audience.

They are smart and well educated, but know nothing at all about regression. I would appreciate comments on this for that audience, including whether I am getting something wrong. There are at least 16,000 cases for this analysis (it varies). Income is linear regression, employment is a linear...
16. Short run versus long run in cointegration

In dealing with cointegration there are discussions of short-term (error correction) models and long-term effects (the cointegration). But I have yet to find a description of what this means in practice, even when slopes are generated. What are the practical ramifications of short- and long-term...
17. Accepting the null

I was always taught you can never accept the null, but I have seen this comment before for bounds testing. "For accepting or rejecting the null hypothesis there are 2 critical bounds, upper and lower bound. If the F-value of bound test is larger than upper bound value at 5% level of...
18. Time series models

I have a lot of analyses where X is measured at one point, sometimes years before Y is measured. I remain unclear on when a time series model is required in this case, assuming you are not using lags of Y to predict. Is it when X and/or Y is not stationary? That is my guess but I am not...
19. What is a population

I have long said I have a population. We are only interested in our customers. We have all the data for those customers. But you could argue we have a sample of the population of future customers (who might vary from our existing ones). I am not sure about this one way or the other and was...
20. Regression assumptions

This is a strange question, I know. I work for a state agency that in practice works for a federal one. They have a model that is going to be run even if there are problems with the assumptions in the model. That is, the federal government is going to run the model whether there are errors in the...