More observations than individuals - convert rows to variables - reshape????

#1
Dear everyone,

I have a HH dataset consiting of about 50 or so subsets which I am trying to merge together. The following problem occures. In some of these datasets there are more observations then individuals in the survey because data on jobs has been coded as job1, job2, job3 etc... each taking a seperate row.

So in my dataset there are three ID variables and the job variable and together they form a unique ID.

Now what I need is to have the additional (job) rows transformed to variables so that I end up with exactly the number of individuals as in the other data sets. Meaning all information that is now contained in a single job row (lets say for job3) such as days worked in this job per month or hours worked in this job per day should become variables (columns).

I've drawn a rudementary graph below. Hope it will help...

Thats what I have:

# ID1 ID2 ID3 JOBID day/month hours/day
---------------------------------------------------------------------|
1 201 1 1 JOB1 # #
2 201 1 2 JOB1 # #
3 201 1 2 JOB2 # #
4 201 1 3 JOB1 # #
5 201 1 4 JOB1 # #
6 201 1 5 JOB1 # #
7 201 2 5 JOB2 # #
8 202 2 5 JOB3 # #
9 202 3 6 JOB1 # #
10 202 3 7 JOB1 # #

Should become:

# ID1 ID2 ID3 JOB1 day/month hours/day JOB2 day/month hours/day
---------------------------------------------------------------------------------
1 201 1 1 JOB1 # # JOB2 # #
2 201 1 2 JOB1 # # JOB2 # #
4 201 1 3 JOB1 # # JOB2 # #
5 201 1 4 JOB1 # # JOB2 # #
6 201 1 5 JOB1 # # JOB2 # #
9 202 3 6 JOB1 # # JOB2 # #
10 202 3 7 JOB1 # # JOB2 # #

Hope that makes sense at all.

Btw I tried using reshape but I aways get a reshape error saying that some variables are not constant within the id variables I have described.

Well thank you ever so much to whoever can help me sort that out (and of course whoever gives it a shot) you really made my day.

Best, Florian
 
#2
Oh wow my table got messed up big time. So here we go again hope it's better now.

#---ID1---ID2---ID3---JOBID---var1---var2
----------------------------------------
1---201----1-----1-----JOB1-----#-----#
2---201----1-----2-----JOB1-----#-----#
3---201----1-----2-----JOB2-----#-----#
4---201----1-----3-----JOB1-----#-----#
5---201----1-----4-----JOB1-----#-----#
6---201----1-----5-----JOB1-----#-----#
7---201----2-----5-----JOB2-----#-----#
8---202----2-----5-----JOB3-----#-----#
9---202----3-----6-----JOB1-----#-----#
10--202----3-----7-----JOB1-----#-----#

Should become:

#--ID1--ID2--ID3--JOB1--var1--var2--JOB2--var1--var2
---------------------------------------------------------
1--201--1-----1---JOB1----#----#---JOB2----.----.
2--201--1-----2---JOB1----#----#---JOB2----#----#
4--201--1-----3---JOB1----#----#---JOB2----.----.
5--201--1-----4---JOB1----#----#---JOB2----.----.
6--201--1-----5---JOB1----#----#---JOB2----#----#
9--202--3-----6---JOB1----#----#---JOB2----.----.
10-202--3-----7---JOB1----#----#---JOB2----.----.

Just to clarify again. The table shows how rows 3,7,8 become variables in the second table (JOB3 missing because of space). Those rows that had second or third job become variables and in those the individuals who have one job only are missing. Can't be that difficult. Thanks again everyone for helping...
 

bukharin

RoboStataRaptor
#3
You're right that -reshape- is the tool for this. However, for most purposes data in a "long format" (each job on a separate line) is easier to analyse in Stata - if I were you I would use -reshape long- to make all the datasets like your first example and then -append- them together. You can always -reshape wide- again later if needed.

If you are still keen to use a wide format your -reshape- command probably needs to be something along the lines of:
reshape wide days hours, i(id1 id2 id3) j(job)

(note that you can reshape multiple variables simultaneously - days and hours in the example above)

For more specific advice you would need to show us the exact command that you used, and its exact output, and preferably a real example of your data. It's usually easiest to copy and paste from the Stata results window, and to paste the results inside CODE tags so that they appear correctly in the web browser.