New to Stata, data set organization 'theory'


New Member
Hi all,

I've finally migrated from Excel to Stata, but my professor had to leave for an emergency, so I'm a little lost. I'm hoping you can help me with a few conceptual questions...

I have several Excel sheets, each holding different information. The first is the master table, with country along the vertical, year along the horizontal, and each cell holding an integer for that country in that year (political stability ranking).

Additionally, I have another data sheet, laid out in exactly the same way, but with a different integer set (GDP). This goes on for several other points, but the layout is the same.

I'm looking to integrate all of these into one dataset. The year and country names are the same across sheets, but I can't conceptually wrap my head around managing this in Stata. Any help would be greatly appreciated.


Your goal is to have your dataset in long format, something like this:
country    year    stability    gdp    ...
Australia  2001    10           200
Australia  2002    10           202
USA        2001    8            1000
Do you mean the countries are in the columns and the years are in the rows? If so, I would transpose the sheets so that the countries are in the rows and the years are in the columns.

Then, for each sheet, import it into Stata and use -reshape- to convert the dataset to long format. After that you can simply -merge- the resulting datasets by country and year, so that each variable (political stability, GDP, etc.) ends up in its own column.
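A rough sketch of those steps is below. The file names (stability.xlsx, gdp.xlsx), the saved dataset names, and the header cell "country" are assumptions for illustration, so adjust them to match your sheets. It also assumes each sheet has one row per country and one column per year.

```stata
* Sketch only: file names and variable names below are hypothetical.
* Assumed layout of each sheet: first row = country, 2001, 2002, ...

foreach name in stability gdp {
    import excel using "`name'.xlsx", firstrow clear

    * Numeric headers such as 2001 are not valid variable names, so the
    * year columns usually come in as B, C, ... with the year stored in
    * the variable label. Rename them to e.g. stability2001.
    ds country, not
    foreach v in `r(varlist)' {
        local yr : variable label `v'
        rename `v' `name'`yr'
    }

    * Wide to long: one row per country-year.
    reshape long `name', i(country) j(year)
    save "`name'_long", replace
}

* Combine the long datasets on the country-year key.
use stability_long, clear
merge 1:1 country year using gdp_long

* Check which observations matched before dropping the indicator.
tabulate _merge
drop _merge
```

If your year columns import with different names or without labels, the rename loop will need changing. Run -describe- after the import to see what you actually got. For additional sheets, add their names to the outer loop and repeat the -merge- step for each one.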


New Member
Yes, I guess I was thinking about it more from a database perspective, but repeating the country for each entry makes sense.

Thank you