I have a .csv file consisting of 1000 rows that I exported from MySQL. One column (Charge) contains 266 unique values. Two questions:
1. How do I make a graph where you can see the frequency of each value (a character string)? (The problem is that the plot only shows two or three of the values and I want to see them all.)
2. What would be the best graphic for this?
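To make question 1 concrete, here is a minimal sketch of the kind of frequency plot being asked about, using base R only. It assumes a data frame named jackson with a character column Charge, as in the output further down. The six-row data frame is a made-up stand-in so the snippet runs on its own, and the margin and label sizes are guesses that would need tuning for 266 distinct values.

```r
# Hypothetical stand-in for the real data; only the Charge column matters here.
jackson <- data.frame(
  Charge = c("THEFT 1", "ASSAULT 4", "THEFT 1", "PCS/METH", "THEFT 1", "PCS/METH"),
  stringsAsFactors = FALSE
)

# Count how often each distinct value occurs, most frequent first.
charge_counts <- sort(table(jackson$Charge), decreasing = TRUE)

# Horizontal bars leave room for long labels. A wider left margin and smaller
# label text help keep every category name on the axis instead of being dropped.
old_par <- par(mar = c(4, 12, 2, 1))
barplot(rev(charge_counts), horiz = TRUE, las = 1, cex.names = 0.6,
        xlab = "Count", main = "Frequency of each charge")
par(old_par)
```

With 266 distinct values, a very tall output device or a plot of only the most frequent ones (for example head(charge_counts, 30)) would probably be easier to read, but that is a judgment call rather than a rule.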
Also, if you have the time to get into more detail, here are a few things that you can look at if you want to help me understand what other cool stuff I can do with my data.
> dim(jackson)
[1] 1000 9
> labels(jackson)[[2]]
[1] "First_Name" "Middle_Name" "Last_Name"
[4] "SO_Num" "Arrest_Agency" "Charge"
[7] "Bail_Amount" "Lodging_Date" "Release_Date"
> nrow(unique(jackson[6]))
[1] 266
> summary(jackson)
   First_Name    Middle_Name      Last_Name
 WILLIAM: 42   JAMES  : 43   TAYLOR  : 24
 MICHAEL: 32   ALLEN  : 39   GARWOOD : 21
 JAMES  : 27   EDWARD : 33   THORNTON: 21
 JOSHUA : 27   LYNN   : 32   BREWER  : 18
 ERNEST : 24   LUTHER : 24   JOHNSON : 17
 JASON  : 23   RAY    : 24   GADBERRY: 16
 (Other):825   (Other):805   (Other) :883
                     SO_Num    Arrest_Agency
 Registered Sex Offender: 70   MFP    :407
 00130781               : 24   MFS    :260
 00109511               : 21   MFC    :125
 00111128               : 17   MFO    : 87
 00103890               : 16   CPP    : 37
 00113458               : 16   TAP    : 26
 (Other)                :836   (Other): 58
                                                     Charge
 0475.894 PCS/METH / UNL POSSESS METHAMPHETAMINE - 1    : 46
 0163.427 SEX AB 1 / SEX ABUSE 1ST DEG                  : 34
 0162.205 PCS/METH / FAIL TO APPEAR 1ST DEG - 1~PCS/METH: 30
 0164.055 THEFT 1 / THEFT 1ST DEG - 1                   : 26
 0162.205 THEFT 1 / FAIL TO APPEAR 1ST DEG - 1~THEFT I  : 23
 0163.160 ASSAULT 4 / ASSAULT 4TH DEG - 1               : 22
 (Other)                                                :819
     Bail_Amount      Lodging_Date     Release_Date
 $0.00     :747   01/06/2015: 55             :906
 $10,000.00: 61   01/14/2015: 40   01/16/2015: 14
 $5,000.00 : 53   01/07/2015: 34   01/23/2015:  7
 $20,000.00: 19   01/13/2015: 33   02/06/2015:  5
 $25,000.00: 19   11/11/2014: 28   01/30/2015:  4
 $50,000.00: 18   01/09/2015: 27   01/20/2015:  3
 (Other)   : 83   (Other)   :783   (Other)   : 61
Thank you so much for your input!