I have a project i have to do for my regression analysis course. Basically I need to gather a bunch of data and plot it and find a regression line using software.

Now for my idea i chose average movie gross revenue can be determined with these factors:

x1 = amount spent on marketing/promotion.

x2 = amount spent on making the movie

x3 = if the movie is a sequel (binary variable)

x4 = who is directing the movie

x5 = if it got largely positive reviews (binary variable)

x6 = who is starring in the movie

x7 = how many theaters the movie was released in worldwide

Now i'm an avid movie goer and i have a general idea of where i can get the data for most of these.

However i have a question regarding x4 and x6.

I dont know if those variables should be binary because there are many different possibilities, and i haven't really learned this in my course yet but basically if they are, should i just leave it as:

whether the director is well known?

whether the stars are well known?

So what i was thinking of doing for my B4 and B6 constants was trying to get an average revenue for all the well known directors movies and well known actors movies and just putting them in there.