Book club

trinker

ggplot2orBust
#1
I'm starting this as a different approach to book clubs. Maybe it will stink and flop, maybe not.

Book clubs tend to flop because they feel like work. There's something that happens when we lose autonomy that makes us like the act less than if we set our own goals and deadlines.

A new approach to bookclubbing...

Share the (1) name of the book(s)/authors/paper(s) you're reading and (2) a ditty about it.
  1. likes
  2. dislikes
  3. intended audience
  4. TS members you'd recommend it to
  5. etc.

If you want to throw in a pic/link of it that could be nice too.
 

trinker

ggplot2orBust
#2
I'll go first...

Stephen Few is a data visualization guy. I've been enjoying (and purchased 2 of his books):

  1. Now you see it
  2. Show me the numbers


I have enjoyed these a lot and would recommend to anyone interested in the mechanics of how the brain perceives visualizations. I would not recommend the Dashboard book he has as it's not really for stats people. The two I mention are cheap to buy.
 

Miner

TS Contributor
#3
Uncanny. That book is sitting on my desk at this moment. I just borrowed it from a coworker a couple days ago.

I recommend the following from Milliken and Johnson:
  1. Analysis of Messy Data, Volume 1: Designed Experiments
  2. Analysis of Messy Data, Volume 2: Nonreplicated Experiments
  3. Analysis of Messy Data, Volume 3: Analysis of Covariance
These books address common issues that arise when you take statistics from the theoretical and apply them in the real world where constraints force you to make suboptimal choices.
 

bugman

Super Moderator
#4
My copies arrived from amazon today!

Looking forward to some couch time! They both look great and a bargain at that. Thanks for the recommendation!
 

bugman

Super Moderator
#5
Finally finished reading "Now you see it". For the price and content, this book is a bargain. Though I found a lot of the content was actually subconsciously locked away in my small brain, there were a lot of "oh, of course" moments that served to jog my memory. Nice examples also. I was particularly interested in the use of Tableau software in many of the chapters.

All in all an easy, informative read. And a useful reference.
 

trinker

ggplot2orBust
#6
If you notice, in one of the books he thanks good ol' Hadley Wickham. It seems the output of Tableau is very similar to ggplot2. For some people Tableau may be the way to go, but it is pricey. I'd be curious to hear what advantages/disadvantages it has compared with R (Bryan and Lazar have used both).
 

bryangoodrich

Probably A Mammal
#7
Advantages? It's easy and intuitive to use. It's VERY interactive and gives you a very nice GUI way to set up your graphs. Seriously, watch the on-demand training videos to see just how much you can do.

http://www.tableausoftware.com/learn/training

On top of that, as long as your data is cleaned up, you can establish your own queries and calculations within Tableau without messing up the underlying data. It can connect to MANY systems very easily (such as SAP). It has a framework of setting up visualizations like you would a spreadsheet in that you have tabs for each of your visuals. Then you can create a dashboard that puts it all together. Again, setting up that dashboard is intuitive and drag-and-drop oriented. Even better, besides all the ways it is interactive, is you can very easily set up filters and triggers on these interactions, such as if you have a map and a table, you can make it so when you click on a spatial unit in the map the table shows you the details for that unit only. Even more, you can set up multiple dashboards that interact across each other, so if you click on a spatial unit, say, it'll move you to the second dashboard and show you the details for that one unit. Then all of this, if you have the server edition, can be integrated into Sharepoint (like my firm uses) and provide a complete web format for these dashboards. Otherwise, like with the desktop version, you'd have to have your clients install the free Tableau Reader. Then they can load up your workbook and use the dashboards.

So with all of that, and all the different ways you can set up calculations, aggregations, and queries, it makes really good visualizations! Given the data types that are in your visual, it'll give you options for what types of visuals you can do (e.g., you can't make a map if you don't have a spatial data type). You can very easily stack data in your visual (so you can group bar plots under major and minor categories), alter the way things are marked (such as color, size, or labels) by, as usual, just dragging and dropping your variable to the specified location. Seriously, I loaded the program up, dragged a csv from my explorer window into the program, it loaded it up and within minutes I was toying around with a visual I could change into 4 different views (plots and tables).

Its only "weakness" is that it isn't a data management tool. But it isn't supposed to be! So you will still have to use other tools (SQL, R, Python, Excel, etc.) to make sure your data is properly tabular. But if you organize your data well, then Tableau is ready to connect to it and you can start launching into making visualizations, very complex ones, within minutes. Once you have visuals, you can start making all sorts of dashboards and interact with your data. That's really what got me: I could highlight a bunch of points and see the underlying data for just them and immediately export that subset if I wanted.

That's how it differs from R. If I have a system in place to run an algorithm, great. Tableau is a tool you interact with. Not only can you not get anywhere near that sort of interactivity with R, but it allows you to go from complex ad hoc interactions back to the data. But as I said, it's a read-only sort of relationship, so eventually you'd want to export the data, to analyze it further, in something like R. But when it comes to that interactivity and visual exploration of data, R doesn't come close. The amount of work involved to go from data to visual back to the data is way too much. Even if I could visually identify a bunch of points of interest, I then have to go back to the terminal, run code to identify them (again!), to subset the data. The visual and data exploration are intimately linked in a GUI interface, and that makes it stupid easy (and fun!) to use.
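
To make the "go back to the terminal and identify them again" step concrete, here is a minimal sketch of what that usually amounts to: re-describing the region you spotted on the plot as a filter on the data. It is written in Python rather than R purely for illustration; the data values, the box coordinates, and the helper name `select_in_box` are all made up and come from no particular tool.

```python
# Hypothetical data: each row is (label, x, y). Values are invented for illustration.
rows = [
    ("a", 1.0, 2.0),
    ("b", 2.5, 3.5),
    ("c", 4.0, 1.0),
    ("d", 2.8, 3.1),
]


def select_in_box(rows, x_min, x_max, y_min, y_max):
    """Return the rows whose (x, y) point falls inside the box, edges included."""
    return [
        r for r in rows
        if x_min <= r[1] <= x_max and y_min <= r[2] <= y_max
    ]


# Suppose the interesting points on the plot sat roughly in x 2-3, y 3-4.
# The selection made by eye has to be typed back in as numeric bounds.
subset = select_in_box(rows, 2.0, 3.0, 3.0, 4.0)
print(subset)
```

In an interactive tool, the same box would be drawn with the mouse and the subset would come back without retyping the bounds, which is the gap being described here.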
 

bryangoodrich

Probably A Mammal
#8
and looking at some of the advanced on-demand training videos, I think there might be a way to integrate Python with Tableau O.O

That would be pretty cool, because then you can interact directly with the Tableau data format (it uses its own) to help move from messy data into tidy data for Tableau. Of course you can just work on the data first in R or something and then (re)import it, but this API will just offer that much more flexibility. I continue to be impressed by this software!

For $2,000 for the desktop version, it really isn't that expensive, certainly not much more than Stata (about $1,500 for the cheapest license).
 

Dason

Ambassador to the humans
#9
That's how it differs from R. If I have a system in place to run an algorithm, great. Tableau is a tool you interact with. Not only can you not get anywhere near that sort of interactivity with R, but it allows you to go from complex ad hoc interactions back to the data. But as I said, it's a read-only sort of relationship, so eventually you'd want to export the data, to analyze it further, in something like R. But when it comes to that interactivity and visual exploration of data, R doesn't come close. The amount of work involved to go from data to visual back to the data is way too much. Even if I could visually identify a bunch of points of interest, I then have to go back to the terminal, run code to identify them (again!), to subset the data. The visual and data exploration are intimately linked in a GUI interface, and that makes it stupid easy (and fun!) to use.
You should give cranvas a try.
 

bryangoodrich

Probably A Mammal
#13
Revive the dead!

Anyone up for extending book club to a general reading club? Say each week post 1-3 articles worth reading/learning. With Slack, we can easily share these (since not everybody is on TS Dropbox), and we can have discussions (here or on Slack threads).

What do y'all think?

Not knocking books. They just don't fit within a week. We can have a book over the long haul (monthly check-in?), but with an article or two, we could easily have little weekly "brown bags" in a way. That's my thought about it. Need some "structured" learning in my life.
 

Dason

Ambassador to the humans
#14
Doing three articles a week might be a stretch at least at first. I'd be game for it. Any articles in particular you want to give a shot? Maybe a classic to start off with?
 

bryangoodrich

Probably A Mammal
#15
I'm not picky, but I am getting into Bayesian statistics and functional programming concepts now. I think for starting 1 article a week is fine, but unless they're relatively short and we're particularly eager, 3 would be kind of a cap. Typically 1 or 2 (related) papers might be sufficient. I don't even care if it's always statistics related. I just heard some stuff about nutrition science on a podcast that makes me interested to read the actual science (and their ultimate questionable hippie interpretation!)
 

bryangoodrich

Probably A Mammal
#17
Yeah, I used to (esp. when I had my own access to stuff). Between work, learning practical things (coding), and exercising like a fiend, I don't really dedicate time to literature reviewing anymore. Hell, without taking the bus anymore, I really don't make much time for reading books. Now I just listen to podcasts and occasionally ebooks when I can get them through my library (long queues!!). It would be nice to keep up on the literature, and I also miss the brown bag approach of just having some material to read, meeting up, and talking about it over lunch. I figure Slack would be a good surrogate for that, with smart people here.
 

hlsmith

Not a robit
#19
Sounds good. I read a lot about causality, causal assumptions and study design methods. I also enjoy reading machine learning studies (typically supervised), since it is like a whole new area of stats for me. I also read a lot of those short 2 page blog posts that pop up on my Twitter feed. Next time I come across one that seems applicable to all I will post it. This week I am reading quite a bit about instrumental variables, since I am working on a broken randomized trial (i.e., not everybody took the treatments they were supposed to). Though, I would also be interested in a generic Bayesian article. I usually like articles with supplemental code provided.
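
For anyone who hasn't met instrumental variables in the noncompliance setting, a minimal sketch of the simplest version (the Wald estimator) might help. It uses random assignment as the instrument and divides the intention-to-treat effect by the difference in treatment uptake between arms. The numbers below are invented toy data, not from any trial, and the function name `wald_estimate` is hypothetical. The result only reads as an effect among compliers under the usual assumptions: assignment affects the outcome only through the treatment actually taken, and nobody does the opposite of what they were assigned.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def wald_estimate(assigned, treated, outcome):
    """Wald (instrumental variable) estimate with assignment as the instrument.

    assigned: 1 if randomized to treatment, else 0
    treated:  1 if treatment was actually taken, else 0
    outcome:  observed outcome for each subject
    """
    y1 = mean([y for z, y in zip(assigned, outcome) if z == 1])
    y0 = mean([y for z, y in zip(assigned, outcome) if z == 0])
    d1 = mean([d for z, d in zip(assigned, treated) if z == 1])
    d0 = mean([d for z, d in zip(assigned, treated) if z == 0])

    itt = y1 - y0      # intention-to-treat effect on the outcome
    uptake = d1 - d0   # effect of assignment on treatment actually received
    if uptake == 0:
        raise ValueError("assignment did not change treatment uptake")
    return itt / uptake


# Toy data: half of those assigned to treatment actually took it,
# and nobody in the control arm did.
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
treated = [1, 1, 0, 0, 0, 0, 0, 0]
outcome = [3, 3, 1, 1, 1, 1, 1, 1]

print(wald_estimate(assigned, treated, outcome))
```

Here the intention-to-treat difference is 1 and the uptake difference is 0.5, so the estimate is twice the intention-to-treat effect. That scaling is the whole idea: the diluted comparison by assignment is rescaled by how much assignment actually moved treatment.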
 

bryangoodrich

Probably A Mammal
#20
That's actually a good point. I think for this effort, even some blog posts of relevant topics could be useful to share as well. I think causality could be an interesting topic. It's something I've not read much into, and I still question its efficacy. Then again, I'm a cynic!