Data management snapspan life table survival analysis


Hi :wave:
I'm having trouble with the data management needed to set up my dataset so that I can run a life table for some descriptive analysis.
I tried a simple snapspan syntax:

snapspan id year x1-x15, gen (year0)

but I have two problems: an error message and a theoretical issue. The error says "not sorted", even if I sort the data by id (or by id and year) and list the variables x1-x15 in the order they have in the dataset. The theoretical issue is that Stata says it will ignore a large number of observations, because I have just one record for those ids. That happens because some people drop out of university during their freshman year. What can I do so that I don't lose that information? I found a solution to this problem on Statalist and tried it, but it didn't work: first because it used expand with by, and second because, after I fixed that, Stata told me that it didn't make any changes :shakehead:
This is the syntax (the second line is the one I modified, and I've only tested it up to that line):

sort id
expandcl 2 if _N==1, gen (expanded) cluster (id)
replace year = . in <originalobs>/l
sort id year
by id: replace year = year+1 if year==1 & _n==2

The variable year contains integers from 1 to 5 (first year of university, second year, and so on). A value is repeated if you need an extra year to finish (Italian university system; speaking of which, sorry for my English! :eek:), so there are two 3s for the same id if the person needed one extra year for their bachelor's degree.

Thanks for your help!
best regards ^_^
Hi timbp,
thanks for helping me :)
I tried your syntax, but Stata again says that by and expand cannot be combined in the same command. I also tried without bys id, but it generated just one observation (._.)
(I used _N to refer to the last observation within each id cluster, because I want to duplicate only the observations of ids with just one record, so that they won't be ignored by snapspan.)
thanks anyway!
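
A possible workaround, offered only as a sketch: without a by prefix, _N is the total number of observations in the dataset, not the size of each id group, so a condition like "if _N==1" is never true once the data have more than one row. That may be why Stata reported no changes. Since expand cannot be combined with by, one option is to store the group size in a variable first and then expand on that variable. The names n_records and expanded are hypothetical, introduced here for illustration; id and year are the variables from the post. Whether adding 1 to year on the copy is the right coding for snapspan is an assumption carried over from the last line of the attempt above.

```stata
* Sketch only: n_records and expanded are hypothetical variable names.

* Store the number of records per id; by is allowed with generate.
bysort id: generate n_records = _N

* Duplicate only the ids that have a single record.
* expanded is 0 for original observations and 1 for the added copies.
expand 2 if n_records == 1, generate(expanded)

* Give the added copy the following year, mirroring the attempt in the post.
replace year = year + 1 if expanded == 1

sort id year
```

After this, every id should have at least two records, which can be checked with "bysort id: assert _N >= 2" before running snapspan again.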