Becoming a Statistician - What to master, R, SAS, or SQL?

#1
Hi, I'm new to the forum!



I hold an M.S. in Pure Math and have been working as a community college instructor for a few years. I recently completed a graduate certificate in statistics (15 graduate credit hours: Mathematical Statistics I and II, Linear Statistical Model, Machine Learning I and II) and I'm interested in becoming a statistician. I used R for my entire statistics program, but while searching for jobs on USAJOBS and Indeed I found that a number of positions require both SAS and SQL, especially government jobs. (I am a US Army veteran and I'm hoping to work for the Feds, since I can carry my service years into the job.) So my question is: should I keep working toward mastering R, or should I go ahead and focus on learning SAS and SQL?



Any comments are appreciated, thank you in advance!
 

maartenbuis

TS Contributor
#2
If you know multiple programs, then you are signaling to your future employer that you are flexible and can learn new programs, or new features of programs you already know, when needed. That is often more valuable than knowing one program extremely well.

SAS has a large and very loyal user base, mainly because many organisations have built up a large collection of SAS code, which would be expensive to port to another program. For that reason it would be a fair choice if you want to become a professional statistician. On the other hand, its capabilities largely overlap with R; they are both general-purpose statistical programs. SQL aims at something different: database management. So that would be more of an opportunity to learn something new.
 

Lazar

Phineas Packard
#3
This recent survey would suggest that learning SQL would be a plus and that R is the next most important, BUT I take maartenbuis's point about collections of SAS code.

For SQL I found this [warning! PDF] to be useful to start with
 

hlsmith

Not a robit
#6
The VA system almost always uses SAS. I have not seen any inklings that this will change any time soon; it's very structured and typically slow to change. Plus, the interactions between all of the centers call for a unified language among analysts.
 

noetsi

Fortran must die
#7
Two issues are involved here. First, most organizations are inherently conservative and are unlikely to change something as basic as their software simply because universities (well, statistics departments) move to R. Second, SPSS and SAS are large organizations and do various things to ensure that their products are well known, which R does not. The simple fact that they are corporations means that the product appears safer to managers, who likely know little about statistical software. In my large public organization I had to appeal to get R allowed because they were sure that it would contain viruses.

My guess is that the federal government (which is what USAJOBS is) primarily uses SPSS or SAS, so it would not hurt to learn one of those. If you learned to code stats in R you should have no trouble picking up the GUI-based SAS and SPSS systems, which are much easier. My guess is that you will find SAS more utilized than SPSS, but it is only a guess. SQL is extremely useful because, unlike at universities, you will rarely if ever be given the data you need to analyze. You will have to go out and get it. Probably 50-80 percent of my job as an analyst in a state agency doing stats is data queries. Which SQL is used I don't know, but my guess is that it will be Access, simply because governments change slowly.

Depending on the agency, knowing how to use Excel and how to write up reports will be critical as well. Don't assume that statistics in real-world organizations will be similar to statistics at universities. From my observation working in a variety of large organizations doing analysis, they are totally different (although other posters here strongly disagree with that view) :p
 

Mean Joe

TS Contributor
#8
Since you've used R throughout your stats program, I assume you know enough about R. I'm wondering what mastery you have in mind...

I suspect you don't need to be a master of R to do the stats work in the job you are seeking.

So go learn another language like SAS, if you need it for your job. It's good to have a variety of languages you can be reliable in. I suspect you don't need to be a Master of any R/SAS/SQL/etc to do the stats work in that job.

Getting sidetracked here
 

noetsi

Fortran must die
#9
lol MeanJoe on the spoiler.

In my experience (in all but a few elite units like the Census Bureau) the stats and software you learn in school will be far beyond what you do on the job. The trick on the job is to know how to get things done by a deadline, and that perfect (what one tries for in school) is the enemy of good enough. Most businesses prefer good enough on time to perfect but late.

Being good in Excel, so you can present your findings in an understandable way, is critical in private organizations. In my first job involving empirical analysis my boss had me spend much of my time learning formatting... so it looked good. Because how it looked was more important than what it contained.
 

noetsi

Fortran must die
#10
Since it apparently caused confusion before: I was not arguing that researchers/professors don't gather their own data. I meant that it is common in classes to work from a data set provided by the instructor.
 
#11
Thank you for the massive amount of information here!!! I am starting to work on learning SQL and SAS now!

I guess here is my secondary question... I'm currently making mid-50k in a southern city with a lower-than-average cost of living (my mortgage is in the mid 800s a month, gas is in the low-to-mid 3 dollars a gallon). I don't work a lot... about 30 hours a week with loads of holidays... But there just isn't a lot of room for promotion, and state jobs don't give many raises to match inflation...

So changing careers to become a statistician (for the Feds)... Is it a promising move given my situation (I'm willing to relocate)? I just feel that I have to make my decision quickly so I can build experience in a new profession. Just wondering about the demand for statisticians, the salary level, and the opportunity to move up the ladder.

Thanks again!
 

noetsi

Fortran must die
#12
State jobs in the south these days are not going to give many raises. We went six years recently without any.

Federal jobs generally have good pay and benefits - but you have to stay around a reasonably long time to really get the benefit of the pensions. In the state agency that I am at now it is fairly common to have people here for 20-plus years. You lose a lot of this benefit by coming into the system later in life, as I did. If you want to enter a public system (given the focus on pensions there) you want to do it as young as you can. Note that, in my experience anyhow, which may not be typical, you have greater access to software and more chances to do things in stats in the public sector than in the private sector. In the private sector they do what contributes to the bottom line, and overly complex or long analysis will be frowned on even if it is more accurate. Speed and deadlines are more critical there than in the public sector, and so is stress. If you want to earn a lot of money in the private sector, long hours and lots of stress go hand in hand with this. People get fired - something that is very rare in the public sector.

I don't know what the demand for statisticians is in the federal system. Unfortunately the fragmented nature of the federal government makes it very difficult to get a sense of this. In general I think you will find, based on the sector as a whole and not on statisticians specifically, that pay raises and probably promotions are slower there than in the private sector, in part because there is more emphasis on pensions and in part because it is easier to judge success in the private sector. Commonly, public jobs promote on the basis of supervisory factors rather than substantive skills, and the way to get a pay raise is commonly to move to a new position, as performance-based pay is rare. One major downside is that a large proportion of federal jobs are in DC, and you will have to travel a significant distance to work given housing costs in the DC area.

My comments are based on reading and studying the federal system - I have been employed in state not federal jobs.
 
#13
It sounds like you are already doing this, but I would suggest having an understanding of all of them at a basic level. Organizations and industries vary a bit and it's impossible to tell where you will end up, so it's good to stay as diversified as possible. Here's what I mean with bare-minimum basics:

SAS/R
Import/export data
Recode data
EDA tools
How to approach more advanced things (learn lm() in R and proc reg in SAS and know that that basic framework will serve you well as far as implementing more advanced techniques)

SQL
Basics of querying, recoding, and reporting data (select/from/where/group by/having/order by)
How to turn narrow data into wide data (i.e., group by and summary calls)
What different joins entail and when to use them (e.g., left join vs. inner join)
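To make the SQL items above a bit more concrete, here is a minimal sketch that runs the clauses and the two joins against a throwaway in-memory SQLite database from Python. The tables (`employees`, `departments`) and their contents are made up purely for illustration, and other SQL dialects may differ in small details.

```python
import sqlite3

# Hypothetical tables, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
    INSERT INTO departments VALUES (1, 'Stats'), (2, 'IT');
    INSERT INTO employees VALUES
        (1, 'Ann', 1), (2, 'Bob', 1), (3, 'Cy', 2), (4, 'Dee', NULL);
""")

# select / from / where / group by / having / order by in one query:
# count employees per department, keeping departments with at least 2 people.
dept_counts = conn.execute("""
    SELECT dept_id, COUNT(*) AS n
    FROM employees
    WHERE dept_id IS NOT NULL
    GROUP BY dept_id
    HAVING COUNT(*) >= 2
    ORDER BY n DESC
""").fetchall()

# INNER JOIN keeps only employees that have a matching department row.
inner_rows = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employees AS e
    INNER JOIN departments AS d ON e.dept_id = d.dept_id
    ORDER BY e.id
""").fetchall()

# LEFT JOIN keeps every employee; unmatched ones get NULL (None in Python).
left_rows = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employees AS e
    LEFT JOIN departments AS d ON e.dept_id = d.dept_id
    ORDER BY e.id
""").fetchall()

print(dept_counts)
print(inner_rows)
print(left_rows)
```

The employee with no department drops out of the inner join but stays in the left join with a missing department name, which is usually the deciding factor between the two.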

Whenever you learn a technique regarding recoding and EDA, try to understand how to do it in all 3.
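As one example of the narrow-to-wide item in the SQL list, a common approach is a group by combined with conditional aggregation. The sketch below again uses Python's built-in sqlite3 module; the `scores` table and its column names are hypothetical, and some databases offer a dedicated PIVOT syntax that is not shown here.

```python
import sqlite3

# Hypothetical narrow table: one row per (student, test) pair.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (student TEXT, test TEXT, score REAL);
    INSERT INTO scores VALUES
        ('Ann', 'midterm', 80), ('Ann', 'final', 90),
        ('Bob', 'midterm', 70), ('Bob', 'final', 75);
""")

# One output row per student, one column per test.
# CASE returns NULL for non-matching rows, and MAX ignores NULLs,
# so each column picks out the single matching score.
wide = conn.execute("""
    SELECT student,
           MAX(CASE WHEN test = 'midterm' THEN score END) AS midterm_score,
           MAX(CASE WHEN test = 'final' THEN score END) AS final_score
    FROM scores
    GROUP BY student
    ORDER BY student
""").fetchall()

print(wide)
```

The rough R counterpart would be a reshape step (for example with `reshape()` or a tidyverse pivot function), so this is a handy technique to practice in more than one tool.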

It won't take you long to get there. Once you do, you will be in relatively good shape. When I started my current job I knew R fairly well but had only one class in SAS and zero SQL experience. I was able to pick up everything I needed to know to do the job in SAS and SQL with one short SQL training class and on-the-job learning.
 
#14
Also, I second noetsi's suggestion about Excel. Obviously, Excel is not ideal for data manipulation or advanced analytics, but you will find that many organizations prefer using it for creating graphics. And it doesn't hurt to understand basic Excel formulas and functions.
 
#15
lots of good suggestions here. You'll need to know as much SAS as possible, if for no other reason than to look good at your work.

R will actually benefit you personally better b/c it's a skill you can always take with you ... it also looks good to know a lot of R.

You may never need to know SQL, esp. in some organizations, gov included, where all data management and data analysis is often done by separate people. BUT, again, it'll look very good + benefit you personally.

if you get in the gov, make tools, they love it ... with businesses it's hit or miss, sometimes they don't want you to spend time making tools, sometimes they love it