I want to run a regression between two variables: the total revenue of an organisation and the revenue it collects indirectly. There is also direct collection, and the total revenue is the sum of the indirect and the direct collection.
My linear regression aims to establish a relationship between indirect collection and total revenue. I ran it with Excel's Data Analysis tool and the result was 0.822, which I think is pretty good.
However, thinking it over, I wondered: "If indirect collection is a portion of the total collected, shouldn't I run the regression on percentage values instead of the raw nominal values?"
Then I converted the indirect collection values (in the table below, first column) into percentage values (third column), that is, the indirect collection corresponds to X% of the total revenue:
Indirect collection | Total revenue | Indirect collection (% of total) | Total revenue
384 520,76          | 488 035,07    | 78,79                            | 488 035,07
318 212,53          | 575 745,64    | 55,27                            | 575 745,64
423 518,04          | 708 524,95    | 59,77                            | 708 524,95
487 925,30          | 764 408,51    | 63,83                            | 764 408,51
580 426,63          | 745 557,47    | 77,85                            | 745 557,47
668 926,14          | 864 968,77    | 77,34                            | 864 968,77
...
Rerunning the regression with the percentage values gives 0.34... In other words, the relationship no longer seems to hold, and now I am really confused. This is for a paper, so I need to know which of the two methods is the appropriate one for the regression. The aim is to find out whether changes in indirect collection affect total revenue.
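For anyone who wants to reproduce the comparison outside Excel, here is a minimal sketch in Python with NumPy (not the tool I actually used). It assumes the figure I reported is the R² of a simple linear regression of total revenue on the predictor, which may not be exactly what Excel printed. It uses only the six rows shown in the table, so it will not reproduce 0.822 or 0.34. The names `indirect`, `total`, `share` and `r_squared` are just illustrative.

```python
import numpy as np

# Only the six rows visible in the table; the rest of the data is elided,
# so the results will differ from the 0.822 and 0.34 quoted in the post.
indirect = np.array([384520.76, 318212.53, 423518.04,
                     487925.30, 580426.63, 668926.14])
total = np.array([488035.07, 575745.64, 708524.95,
                  764408.51, 745557.47, 864968.77])


def r_squared(x, y):
    """R^2 of a simple linear regression of y on x.

    With a single predictor and an intercept, this equals the
    squared Pearson correlation between x and y.
    """
    r = np.corrcoef(x, y)[0, 1]
    return r ** 2


# Method 1: raw nominal values (first column against second column).
r2_levels = r_squared(indirect, total)

# Method 2: indirect collection as a percentage of total revenue
# (third column against fourth column).
share = 100 * indirect / total
r2_share = r_squared(share, total)

print(f"R^2 using nominal values:    {r2_levels:.3f}")
print(f"R^2 using percentage values: {r2_share:.3f}")
```

Swapping in the full dataset for the two arrays should give the same two numbers Excel reports, provided the assumption about R² is right.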
Can you help this noob out? Thanks in advance.