Pre-publication testing of a one-click auto-analysis tool

#1
Hi all, I’m a psychometrician at Penn (https://scholar.google.com/citations?user=TlO7ZTsAAAAJ&hl=en) and wanted to get your feedback (as students, faculty, or private sector folks) on a new tool I’m developing for one-click automatic data analysis:

https://mooremetrics.com/deedive

All you have to do is upload a dataset in .csv format, and DeeDive will email you a .pdf of visuals and a .txt of results. It can handle any combination of variable types (numeric, strings, dates, etc.) as long as it’s in .csv format, and feel free to upload fake or simulated data sets to test it out (it’s free – no ads; use as often as you like). As indicated on the page, you can email questions to mooremetrics@gmail.com or email me directly at reductionist@gmail.com.

Thanks so much!

Tyler Moore
 

ondansetron

TS Contributor
#5
I think items like this are more likely to contribute in a negative manner, given the way people already egregiously misuse point-and-click software that requires little thought prior to use and facilitates p-hacking. An automated process removes even the little cognitive effort that people needed to use a point-and-click package like SPSS. The people who don't need these products generally know better, and those who need them know too little to realize how dangerous they are when given these tools.

I hate to be critical, but I can't see more tools that remove the responsibility of knowledge from the end user being a good thing. If someone brought output to me, I'd still probably rerun the analysis myself to make sure I can reproduce the results and that they're correct.
 
#6
I think items like this are more likely to contribute in a negative manner, given the way people already egregiously misuse point-and-click software
True!

But the same can be said about all software, and even about a pocket calculator. You don't want to forbid computers and pocket calculators, do you?

Even kitchen knives can be misused. It is up to the user to be responsible. If someone does not know how to handle fire, then maybe they should not play with fire.
 

ondansetron

TS Contributor
#7
True!

But the same can be said about all software, and even about a pocket calculator. You don't want to forbid computers and pocket calculators, do you?

Even kitchen knives can be misused. It is up to the user to be responsible. If someone does not know how to handle fire, then maybe they should not play with fire.
I think it is more challenging to misuse R or SAS syntax, since it tends to be more deliberate and requires a bit more thought than SPSS, for example.

I agree that we shouldn't get rid of the tools because of bad operators. I take the same stance on the silly "ban the p-value" arguments: why should we ban the p-value when the problem is that the people misusing it aren't equipped to function as statisticians in the first place?

Overall, I think the OP's tool can be useful, but I just think we should be careful about these kinds of things.
 
#8
but I just think we should be careful about these kinds of things
I certainly agree.

But there will be a lot more software that people claim is "fantastic", coming from machine learning and AI.

But I don't know anything about the OP's software. It would be nice of him to show an example, and I would like to know what methods it is based on.
 
#9
Hi all, not sure how I missed all these responses until now, but thank you so much for the feedback! @GretaGarbo yes, I should've included an example output; please find it attached (both a .pdf and a .txt are emailed to you when you use the tool).

@Dason @ondansetron yes, the risk of misuse was (is) on my mind a lot with respect to this and other user-friendly stat tools. If the peer-review process worked as it should in theory, we wouldn't have to worry about it, because misuse of methods by people who don't know what they don't know would be sniffed out by the reviewers. As you know, in reality lots of garbage makes it past reviewers on a regular basis, so yes, I see the danger. When I was conceiving it (really just a hobby right now), I intentionally provided only visual results so there would be no test statistics (or anything numeric) that might end up published in a reckless journal. However, understandably, people almost always want to see under the hood, so I now include a .txt with actual results (e.g. attached). @GretaGarbo that output [just using sink() in R] gives some idea of what's going on in the algorithm. Briefly, it:

1. calculates the mixed correlation matrix (i.e. Pearson for continuous-with-continuous, tetrachoric for binary-with-binary, polyserial for ordinal-with-continuous, etc.) for the full data set,
2. extracts a 10-component PCA (oblimin rotation) from that mixed correlation matrix,
3. takes the strongest indicators of components 1-3 as the dependent variables,
4. takes the strongest indicators of components 4-10 as the independent variables,
5. runs 3 linear models predicting each of the 3 DVs with all IVs (noting strongest effects),
6. runs the same models as in #5 above, but this time including all 2-way interactions,
7. makes visuals of the strongest effects from above, and
8. runs some "kitchen sink" factor analyses (including bifactor), but this is of questionable use.

That's about all there is to it. Unfortunately, this means it would be easier for someone to just stick this output into a manuscript, bringing us back to that same issue.
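For anyone curious how steps 1–5 hang together, here is a rough sketch in Python. This is hypothetical code, not the actual DeeDive source (which, per the above, is R): it substitutes a plain Pearson matrix on numeric columns and unrotated principal components for the mixed correlations and oblimin rotation, and skips the interactions, visuals, and factor analyses; the function name `deedive_sketch` is made up.

```python
import numpy as np
import pandas as pd

def deedive_sketch(df, n_components=10, n_dv=3):
    """Simplified DeeDive-style pipeline: correlate, extract components,
    pick the strongest indicators as DVs/IVs, and fit linear models."""
    X = df.select_dtypes("number")
    # Step 1 (simplified): Pearson correlation matrix only.
    corr = X.corr()
    # Step 2 (simplified): unrotated PCA via eigendecomposition of the
    # correlation matrix; loadings = eigenvectors scaled by sqrt(eigenvalue).
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_components]
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    # Steps 3-4: strongest indicator (largest |loading|) of each component;
    # first n_dv become dependent variables, the rest independent variables.
    strongest = [corr.columns[np.abs(loadings[:, k]).argmax()]
                 for k in range(n_components)]
    dvs = strongest[:n_dv]
    ivs = [v for v in strongest[n_dv:] if v not in dvs]
    # Step 5: one least-squares linear model per DV, using all IVs.
    A = np.column_stack([np.ones(len(X))] + [X[v] for v in ivs])
    results = {}
    for dv in dvs:
        beta, *_ = np.linalg.lstsq(A, X[dv].to_numpy(), rcond=None)
        results[dv] = dict(zip(["(Intercept)"] + ivs, beta))
    return results
```

In the real tool the correlation in step 1 would be mixed (tetrachoric/polyserial where appropriate) and the components in step 2 obliquely rotated, which can change which variables come out as the "strongest indicators."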

Anyway, thanks again for your feedback, and if you haven't tried the tool, I'd love to hear what you think of it.
 

Attachments