How to get started with data analysis

#1
Hey, I am new here and I would like some advice on getting started in the field of data analysis.

I study computer engineering and I work as a programmer. Through my work I have access to a large amount of data that I would like to analyze.

Is there any tutorial for the first steps?

Thank you in advance, and sorry for my English.
 
#2
You will have to be more specific about what inference you want to make from your data. Based on that, you could look at basic descriptive statistics or at techniques such as regression, time series, etc.
Also, which language do you program in?
 
#3
Thank you chetan.apa for your answer.

I program in several languages, mainly .NET and PHP, but I have also done some projects in Java, Python and C. I have no problem learning a new language.

About the techniques, I am sure that my data has some pattern because, for example, if sales go down in April, sales also went down in the same month (April) of previous years.

I think I should first do a descriptive analysis to understand the data. Correct? Then, depending on the result of the descriptive analysis, I decide on the next method.

If the last paragraph is correct, what do I need to get started? I have seen something about R.

Thank you
 
#4
"I think I should first do a descriptive analysis to understand the data. Correct? Then, depending on the result of the descriptive analysis, I decide on the next method.

If the last paragraph is correct, what do I need to get started? I have seen something about R."

Definitely, you need to perform descriptive statistics first. R is a good tool to learn and use for statistical analysis.

"About the techniques, I am sure that my data has some pattern because, for example, if sales go down in April, sales also went down in the same month (April) of previous years."

I think you are talking about seasonality in the data. Can you provide a plot of your data?
 
#5
Here is a plot of the data.

The data is about calls received in a call center, not about sales as in my previous example. In the chart the data is grouped by year and month just to give a better view, but the original data is in 15-minute intervals.

Thank you for your time
 

noetsi

Fortran must die
#6
I think you need to start with: what do I want to know? That seems pretty basic, but until you have a very specific sense of what the question is, you cannot form hypotheses or test them. Lots of people rush into software or methods without clearly thinking through what they are trying to achieve.

Reading this thread I was not certain what your question was.
 

Mean Joe

TS Contributor
#7
"Any tutorial for first steps?"

It looks like you have data on the number of calls per month.

1) Start with the mean and standard deviation of your data.

2) Sort your data (high to low), then see if there is a pattern, e.g. 3 of the top 5 months are April, or 7 of the top 10 months occurred in 2011 (after event ...)

Just get acquainted with your data. You may want to also try

1b) Mean and standard deviation of your data grouped by some attribute, e.g. the mean for the month of July or the mean for the year 2008.
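
Since you mentioned you already know some Python, steps 1), 2) and 1b) could be sketched roughly like this. It is only a minimal sketch using the standard library. The (year, month, calls) records are invented numbers, not your real data, and the helper group_means is a hypothetical name chosen for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical (year, month, calls) records; the numbers are invented.
records = [
    (2010, 3, 1200), (2010, 4, 900), (2010, 7, 1500),
    (2011, 3, 1300), (2011, 4, 950), (2011, 7, 1600),
]

counts = [calls for _, _, calls in records]

# 1) Overall mean and sample standard deviation.
overall_mean = mean(counts)
overall_sd = stdev(counts)

# 2) Sort high to low to see which months come out on top.
ranked = sorted(records, key=lambda r: r[2], reverse=True)


def group_means(key_index):
    """Mean number of calls per value of the chosen field (0 = year, 1 = month)."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key_index]].append(rec[2])
    return {key: mean(vals) for key, vals in groups.items()}


# 1b) Mean per calendar month and per year.
mean_by_month = group_means(1)
mean_by_year = group_means(0)

print(f"mean={overall_mean:.1f}, sd={overall_sd:.1f}")
print("top 3:", ranked[:3])
print("by month:", mean_by_month)
print("by year:", mean_by_year)
```

With real data you would load the records from your database or a file instead of typing them in. If the same months keep showing up at the top or bottom of the sorted list across years, that is a first hint of the seasonality mentioned earlier in the thread.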