I just wrote an introduction with a couple of reasons for why I am here. Reason 1 is that I'm not sure where the line is between big data and just a big MySQL database. We easily have over 1 million records in our database right now, and I'm not sure whether that counts as big data. Should I be using Hadoop, Pig, or Hive for my data? If so, which one should I go with? I'm leaning towards Hadoop. For example, one table of over 185,000 inmate records takes a couple of minutes to display in MySQL Workbench. Should I consider that big data, and would it be more efficient to move to Hadoop?

Anyway, the second reason I am here is that I am new to R and want to start making predictions and assessing risk based on the data we have. What I want to do is pull the data out of the tables and make graphs like the following.

The charge is on the x-axis (it is a character column), and the number of failures to appear (FTAs) in court is on the y-axis. So for every charge, the graph shows the number of FTAs associated with that charge.

Then, I want a graph where age is on the x-axis and the number of FTAs is on the y-axis.
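
To make the two graphs concrete, here is a rough base R sketch of what I have in mind. The data frame `inmates` and its columns `charge`, `age`, and `fta` are made-up stand-ins with toy values, not our real schema, and I don't know whether this is the right approach.

```r
# Hypothetical toy data: the real table and column names are assumptions.
inmates <- data.frame(
  charge = c("DUI", "Theft", "DUI", "Assault", "Theft", "DUI"),
  age    = c(23, 35, 41, 23, 35, 52),
  fta    = c(1, 0, 2, 0, 3, 1)  # failures to appear on each record
)

# Total number of FTAs for each charge
fta_by_charge <- aggregate(fta ~ charge, data = inmates, FUN = sum)

# Total number of FTAs for each age
fta_by_age <- aggregate(fta ~ age, data = inmates, FUN = sum)

# Graph 1: charge on the x-axis, number of FTAs on the y-axis
barplot(fta_by_charge$fta, names.arg = fta_by_charge$charge,
        xlab = "Charge", ylab = "Number of FTAs", las = 2)

# Graph 2: age on the x-axis, number of FTAs on the y-axis
plot(fta_by_age$age, fta_by_age$fta,
     xlab = "Age", ylab = "Number of FTAs")
```

With the real data, I assume `inmates` would come from a query against the MySQL table instead of being typed in by hand.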

Thank you so much for the help. I'm new to R, so I'm sorry if this takes some explanation.