Line graph using means calculated at intervals

#1
I'm having trouble constructing a line graph. Before you leave this thread immediately, let me explain:

1) I'd like to graph the AVERAGES of a binary variable over time.
2) I'd like to calculate these averages for a given number of time bins. I have 260 weeks' worth of data and I would like to graph, for example, the average of this binary variable in weeks 1-10, in weeks 11-20, and so on.

I've tried using egen to calculate the mean of the binary variable by week group but Stata doesn't like me trying to use egen at the same time that I use another time variable generated using the xtile command. I end up with a list of 39,000 values that are all the same, which is unhelpful.

Is there a way I can get around this without manually calculating the averages for each time bin, entering them in a word processor, and graphing the whole thing by hand?

Thanks for your help and patience with this relative Stata newbie.
 

Link

#2
I gotta say, you caught me with the phrase "Before you leave this thread immediately, let me explain". hehehe. I wish I could help you, but I'm not that fluent in Stata.

A few things I can offer that might make the problem clearer for someone more fluent, though:
1) To graph the average of a binary variable over time, the time points have to be discrete and binned. If you were to graph the average over, say, each minute or second, you'd likely have one or very few observations per time point, and probably some empty cells too. You address this well in your second point.
2) To do this in SAS, I would first create an indicator for the bin each observation falls in: a variable called bin that equals 1 in weeks 1-10, 2 in weeks 11-20, 3 in weeks 21-30, etc. Then I'd use proc means to get the average for each bin and graph the results. It's very likely that you can do the same in Stata.
 
#3
Hi Link--

Thanks for the advice! Yes, you're totally right. The problem I seem to be having is that I can't figure out how to get Stata to run these calculations automatically. Because I have, say, 26 time bins, and about 10 variables for which the summary measures need to be calculated (per bin), it's going to be really tedious and time-consuming if I type out

mean [var] if [week_bin]==1
mean [var] if [week_bin]==2
mean [var] if [week_bin]==3...

and so on, record the results by hand, and then graph them. I was hoping there would be an 'if' or 'by' syntax I could use to avoid doing this, but so far I can't seem to find any 'if' or 'by' option that works without giving me the same average for all of my values across all time bins. Do you know of any such option?

Thank you again for your reply!

--M_Sundaram
 
#4
Hi there,

My suggestion is the following: create a new dataset in which you have the average of your 10 variables over each time bin. First create your time bin variable. For instance:

gen time_bin = autocode(week, 26, 1, 260)

(take a look at "help autocode"). Then use "collapse":

collapse [your list of 10 variables of interest], by(time_bin)
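
Putting the pieces together, here is a minimal sketch of the whole workflow, from binning to the line graph. It assumes week is an integer running from 1 to 260 and uses a hypothetical 0/1 variable called outcome, so substitute your own variable names. For the binning it uses ceil(week/10) rather than autocode(), purely so the bins come out numbered 1 to 26.

```stata
* Sketch only: assumes integer week (1-260) and a 0/1 variable named
* outcome (a hypothetical name; replace it with your own variables).

* Keep a copy of the full data, because collapse replaces what is in memory
preserve

* Weeks 1-10 -> bin 1, weeks 11-20 -> bin 2, ..., weeks 251-260 -> bin 26
generate time_bin = ceil(week/10)

* One observation per bin; mean is the default statistic for collapse
collapse (mean) outcome, by(time_bin)

* The mean of a 0/1 variable is the share of 1s in each bin
twoway line outcome time_bin, xtitle("10-week bin") ytitle("Mean of outcome")

* Bring the original observation-level data back
restore
```

To handle all 10 variables at once, list them after (mean) in the collapse line and plot each one against time_bin. If you would rather keep the observation-level data instead of collapsing it, "bysort time_bin: egen mean_outcome = mean(outcome)" adds the bin mean as a new variable.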

Hope this helps!

Etienne