More than 2 categorical variables + 1 continuous variable as independent variables

#1
I am a civil engineering student with no background in statistics. I am working on my master's thesis and have to use SPSS, which I know nothing about. I have tried to learn from YouTube and the internet, but I still could not find a proper answer that addresses my problem.

1. My dependent variable is continuous (Number of years)

2. I have 8 independent variables: 1 is continuous (number of vehicles passing through the bridge) and the remaining 7 are categorical (Type of the bridge, Type of edge beam, Material of the beam, Structural type of the beam, Climate zone, Road type, Road category)

3. Each categorical variable has 6 or more categories (Type of bridge has 22 categories, Type of beam 15, Material of the beam 7, Structural type 10, Climate zone 6, etc.)

4. How can I find which categorical variable has little effect on my dependent variable? Is it by ANOVA? If I get a significance level less than 0.05 in the ANOVA test, does that mean that the categorical variable has an effect on my dependent variable?

5. How can I find the regression equation? My aim is to find a linear equation like N = Ax + By + Cz + ..., where A, B, C are coefficients and x, y, z are the inputs for the above-mentioned continuous and categorical variables. I have created dummy variables for each categorical variable, i.e., 21 dummy variables for bridge type, 14 for type of beam, etc. Now how do I run the regression including both the continuous variable and the 7 categorical variables?

6. Once I get the regression coefficients, how do I build the regression equation from them? We use only k-1 dummy variables, so we get k-1 coefficients and a constant. What is the coefficient for the k-th category?

If I asked anything stupid, please excuse me. I started learning statistics only a week ago and I have no basic grounding in this subject :(

It is urgent :( :(

Thanks in advance!!


Regards
Sobhit
 

noetsi

No cake for spunky
#3
Commonly this means no one knows an answer (or is unsure if they do).

How can I find which categorical variable has little effect on my dependent variable? Is it by ANOVA? If I get a significance level less than 0.05 in the ANOVA test, does that mean that the categorical variable has an effect on my dependent variable?
You can use ANOVA or linear regression (the two are closely related methods). If the p-value for an IV is less than whatever alpha you chose (.05 is commonly used but is not the only choice), then your IV has statistically significant predictive power on the DV, controlling for the other variables. I don't know how ANOVA deals with categorical variables. In linear regression you have to split a categorical variable into a series of dummy variables. So if you have 15 levels you would have 14 dummies. Each dummy's coefficient is tested on its own against the omitted reference level; testing the categorical variable as a whole takes a joint F-test on all of its dummies.
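As a rough sketch of what a one-way ANOVA computes for a single categorical variable, the snippet below works out the F statistic by hand. The data, the group names, and the function `one_way_anova_f` are made up purely for illustration; they are not from the thesis data. Note that this simple version looks at one factor alone and does not control for the other variables.

```python
# Hypothetical toy data: service life in years, grouped by beam material.
groups = {
    "concrete": [30.0, 32.0, 34.0],
    "steel": [40.0, 42.0, 44.0],
    "timber": [20.0, 22.0, 24.0],
}


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)

    ss_between = 0.0  # variation of the group means around the grand mean
    ss_within = 0.0   # variation of the observations around their group mean
    for values in groups.values():
        mean = sum(values) / len(values)
        ss_between += len(values) * (mean - grand_mean) ** 2
        ss_within += sum((v - mean) ** 2 for v in values)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.1f}")  # F(2, 6) = 75.0
```

The p-value (the "Sig." column in SPSS) is the upper-tail probability of this F value under an F distribution with those two degrees of freedom. A large F relative to that distribution is what produces a value below the chosen alpha.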

How can I find the regression equation? My aim is to find a linear equation like N = Ax + By + Cz + ..., where A, B, C are coefficients and x, y, z are the inputs for the above-mentioned continuous and categorical variables. I have created dummy variables for each categorical variable, i.e., 21 dummy variables for bridge type, 14 for type of beam, etc. Now how do I run the regression including both the continuous variable and the 7 categorical variables?
The regression equation would include all the dummy variables for all the categorical variables as X. So a 15-level categorical variable will have 14 separate x terms. You will have a lot of dummy variables, which, beyond making the equation laborious to write out, may cause power issues if you don't have enough data points. You need more data points for every variable (including dummy variables) you add to your model.
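To make the "one x per dummy" idea concrete, here is a minimal sketch that fits a regression with one continuous variable and one three-level categorical variable using NumPy's least-squares solver. The data are hypothetical and were constructed to follow an exact linear rule, so the fit recovers the coefficients exactly; real data will not behave this neatly. The level names and the helper `design_row` are assumptions for illustration only.

```python
import numpy as np

# Hypothetical toy data:
# (traffic in thousands of vehicles/day, beam material, service life in years)
records = [
    (1.0, "concrete", 48.0),
    (3.0, "concrete", 44.0),
    (2.0, "steel", 51.0),
    (5.0, "steel", 45.0),
    (1.5, "timber", 37.0),
    (4.0, "timber", 32.0),
]

levels = ["concrete", "steel", "timber"]
reference = levels[0]  # omitted level: it gets no dummy column
dummy_levels = [lv for lv in levels if lv != reference]


def design_row(traffic, material):
    """One row of X: constant, continuous variable, then k-1 dummies."""
    return [1.0, traffic] + [1.0 if material == lv else 0.0 for lv in dummy_levels]


X = np.array([design_row(t, m) for t, m, _ in records])
y = np.array([life for _, _, life in records])

# Ordinary least squares: one coefficient per column of X.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

names = ["constant", "traffic"] + [f"material={lv}" for lv in dummy_levels]
for name, value in zip(names, coef):
    print(f"{name:>18}: {value:8.3f}")
```

With seven categorical variables the same pattern applies: each contributes its own block of k-1 dummy columns, all placed side by side in X next to the single constant column.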

Once I get the regression coefficients, how do I build the regression equation from them? We use only k-1 dummy variables, so we get k-1 coefficients and a constant. What is the coefficient for the k-th category?
I don't understand what you are asking for here. The slopes you generate in your regression will be the coefficients for all variables in your model.
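If the question is what to plug in for the omitted k-th category: with k-1 dummies, the omitted (reference) category effectively has a coefficient of 0, because its level is already contained in the constant. Each dummy coefficient is then the difference from that reference category. A small sketch, with coefficient values that are hypothetical and chosen only for illustration:

```python
# Hypothetical coefficients, as they might be read off a regression output table.
constant = 50.0
traffic_slope = -2.0  # years per thousand vehicles/day

# k-1 dummy coefficients plus an explicit 0 for the omitted reference level.
material_effect = {
    "concrete": 0.0,   # reference category: absorbed into the constant
    "steel": 5.0,      # steel minus concrete
    "timber": -10.0,   # timber minus concrete
}


def predict_life(traffic, material):
    """N = constant + A * traffic + (effect of the chosen category)."""
    return constant + traffic_slope * traffic + material_effect[material]


for material in material_effect:
    print(material, predict_life(2.0, material))
```

For a reference-category bridge all of its dummies are 0, so the equation reduces to the constant plus the continuous term. Changing which category is omitted changes the constant and the dummy coefficients, but not the predictions.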