Rating and Votes for hotels - how to connect?

Artem

New Member
#1
Hello everyone,
I have collected data on hotels, and I want to run some regressions on it.

One problem appeared:
I have the average rating for each hotel and also the number of votes from which it was calculated. Two hotels with different ratings and different numbers of votes cannot be compared directly, because their ratings differ in accuracy.
(Example: hotel1 has rating = 8, votes = 100;
hotel2 has rating = 7.8, votes = 1000.)
The rating is between 0 and 10 for each hotel.
Since these ratings are of different accuracy, my question is:
How can I combine the votes and the rating into one variable that can be compared between hotels, such that the bigger the variable, the better the hotel?
Thanks in advance.
 
#2
Dear Artem,

After reading your problem, one solution that came to mind is to multiply the two values to get a single value that is affected by both the rating and the number of votes.

For example, in this way, you will have the value for
hotel1 = 8 x 100 = 800
hotel2 = 7.8 x 1000 = 7800

Since the rating is the average of all the individual ratings, multiplying it by the number of votes gives the sum of all the individual ratings.
I think this value is more useful: if you have a hotel x with rating = 10 but votes = 2, you only get a total of 20, which is still smaller than the totals of hotels with a higher number of votes.
The only problem here is that with enough votes, some hotel y with rating = 2 but votes = 5000 gets a value of 10000 and may reach the top of your list even though it is the worst hotel ever.
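
To make the arithmetic concrete, here is a minimal Python sketch of the rating x votes idea. The list `hotels` and the function name `rating_sum` are made up for illustration; the numbers are the ones used in this thread. It only shows how the ranking comes out, including the case where a badly rated hotel with many votes ends up on top.

```python
# Hypothetical example data: (name, average rating, number of votes).
# The values are the ones mentioned in this thread.
hotels = [
    ("hotel1", 8.0, 100),
    ("hotel2", 7.8, 1000),
    ("hotel_x", 10.0, 2),
    ("hotel_y", 2.0, 5000),
]


def rating_sum(rating, votes):
    """Average rating times number of votes, i.e. the sum of all individual ratings."""
    return rating * votes


# Sort from the largest score to the smallest.
ranked = sorted(hotels, key=lambda h: rating_sum(h[1], h[2]), reverse=True)

for name, rating, votes in ranked:
    print(name, round(rating_sum(rating, votes), 2))
```

With this data hotel_y (rating 2, 5000 votes) is ranked first, which is exactly the weakness described above.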

One way to fix this may be: if you think 5 is your average rating (that is, the line that divides the good hotels from the bad ones), then you can use (rating - 5) in place of the rating before multiplying by the number of votes.
This gives any hotel with a rating higher than 5 a positive value and any hotel with a rating below 5 a negative value.
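
A minimal Python sketch of this adjustment, assuming the shifted rating is then multiplied by the number of votes (my reading of the idea, so treat it as an assumption). The names `hotels`, `centered_score` and `midpoint` are made up for illustration, and the midpoint of 5 is the threshold suggested above, not something estimated from data.

```python
# Hypothetical example data: (name, average rating, number of votes).
hotels = [
    ("hotel1", 8.0, 100),
    ("hotel2", 7.8, 1000),
    ("hotel_x", 10.0, 2),
    ("hotel_y", 2.0, 5000),
]


def centered_score(rating, votes, midpoint=5.0):
    """Shift the rating by the midpoint, then weight it by the number of votes.

    Ratings above the midpoint give a positive score, ratings below it
    give a negative score, and more votes push the score further from zero.
    """
    return (rating - midpoint) * votes


# Sort from the largest score to the smallest.
ranked = sorted(hotels, key=lambda h: centered_score(h[1], h[2]), reverse=True)

for name, rating, votes in ranked:
    print(name, round(centered_score(rating, votes), 2))
```

With this data hotel_y now gets a large negative score and drops to the bottom of the list, while hotel2 (7.8 with 1000 votes) comes out ahead of hotel1 (8 with 100 votes).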

I hope this will help

Cheers,

Alex