1. P

    Correct calculation of BIC (Bayesian Information Criterion) to determine K for K-Means

    I am trying to calculate BIC in Python. Python has no built-in library for computing BIC. I referenced the following link to compute variance and BIC further:-...
  2. T

    Two sample t-test between percents in Python?

    Hi guys, I have a bunch of data in the following format (sorry, it does not format well on here): Key Indicator-----------First Group------------Second Group ------------------Observations---Mean-----Observations----Mean 1 Grade change...
  3. T

    Which correlation test when researching the effect of previous success on change?

    Hi everyone, I'm doing research on business owners who have started multiple businesses. I had a sample of 2500 businesses and looked at the business owners' success rate from their first to last business. That gave me two groups: 1) business owners who were SUCCESSFUL in their first business...
  4. R

    Best way to treat integer/float column with null values in logistic regression

    Hi, I was wondering if anyone can assist me with this issue. I am building a logistic regression model to predict purchase or not purchase based on web site behaviour data. One of the factors that I would like to include in the model is the visits to purchase and the days to purchase...
  5. S

    Syntax to use Python for renaming variables

    Hello! I am working with a very large dataset and I plan to merge in more years of similar data; however, the data will be panel data so I need to rename the variables with a suffix. I have worked all day on writing script (even started renaming each individual variable but gave up at #409 out...
  6. trinker

    Python nested list compared to R's

    Are these two things equivalent in R and Python respectively? r <- list(list(c("a", "b"), c("d", "e")), list(c("f", "g"), c("h", "i"))) python = [[("a", "b"), ("d", "e")], [("f", "g"), ("h", "i")]]
  7. rogojel

    What does Python offer that R can't?

    Hi, I am looking at Coursera data analysis specializations and most (basically all except Johns Hopkins) offer Python as the language of choice for their courses. I really like R but I started to wonder if I am missing something? I just see no advantages in Python for data analysis, that would...
  8. P

    OMS OXML broken output, slow work

    Hello! I have SPSS 22 x64, and a SAV file with about 400 cases. Syntax is FREQUENCIES VARIABLES=var01 var02 /ORDER=ANALYSIS. OMS /SELECT ALL /DESTINATION FORMAT=OXML OUTFILE="xmlout.xml". The XML file appears with 0 size, then, after some time, it becomes bigger, but unfinished. For...
  9. Lazar

    Calling Python from R

    So I am trying to call python from R so I can use beautiful soup. I have tried both >system("python") and >library(rPython) >python.exec("import bs4 some more python script here") The issue I am running into in both cases is I get back "No module named bs4" I think the...
  10. D

    Converting syntax between python and STATA

    Dear colleagues, I'm converting code written in Python to Stata and I'm stuck at the point where I have to apply the iterative procedure. I cannot find a way to write the loop so that it includes the initial value and generates an estimate for lat. Has any of you got any experience with that...
  11. A

    Gibbs sampler problems

    I am trying to implement the Gibbs sampler found here on page 175. It is written in R, and I am trying to rewrite it in Python but am running into problems. Here is my Python code from numpy import * import...
  12. A

    Kulldorff's Spatial Scan Statistic

    Hello, I have a semester project to develop a tool using Python which implements Kulldorff's algorithm and finds clusters in point (geo)data. Currently it is a mess: I understand Kulldorff's algorithm up to the part before it dives into Bernoulli & Poisson. I imagine I have to call certain...
  13. P

    conjoint in python

    Hello, I am creating a web page for conducting surveys (PHP) and there is also a module on conjoint analysis (orthogonal plan, two attributes at a time approach, trade-off matrix approach, etc.) and I have a question: do you have any tips? I heard about an SPSS plugin for Python, has anyone used this...
  14. A

    [Python] - dealing with large text files

    Hi, I have a large file (2 GB) with 4 columns that I want to read and extract some info from. The data looks like this: CHR START END A 1 10583 10583 0.14 1 10611 10611 0.02 1 13302 13302 0.11 I also have another file from where I have extracted a string to be compared with...
  15. X

    Libsvm: scale data using python

    Libsvm provides a binary for scaling data, i.e. svm-scale. How can I scale data to normalized values using Python? And then how do I do the scaling for the test samples?
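
For the scaling question above, one possible approach is sketched below. It is not svm-scale itself and does not use the Libsvm Python bindings: it is a plain-Python min-max scaler written for illustration. The function names `fit_scaler` and `apply_scaler`, the sample data, and the target range of [-1, 1] are all assumptions. The key idea is that the per-feature minimum and maximum are computed from the training samples only and then reused unchanged for the test samples, so test values can fall outside the target range.

```python
def fit_scaler(rows, lower=-1.0, upper=1.0):
    """Record the per-feature min and max of the training rows (hypothetical helper)."""
    columns = list(zip(*rows))
    return {
        "lower": lower,
        "upper": upper,
        "mins": [min(col) for col in columns],
        "maxs": [max(col) for col in columns],
    }


def apply_scaler(rows, params):
    """Linearly map each feature using the stored training min/max (hypothetical helper)."""
    lower, upper = params["lower"], params["upper"]
    scaled = []
    for row in rows:
        out = []
        for x, lo, hi in zip(row, params["mins"], params["maxs"]):
            if hi == lo:
                # Assumption: a feature that is constant in training is mapped to 0.0.
                out.append(0.0)
            else:
                out.append(lower + (upper - lower) * (x - lo) / (hi - lo))
        scaled.append(out)
    return scaled


# Made-up data: two features, three training samples, two test samples.
train = [[1.0, 10.0], [3.0, 20.0], [5.0, 40.0]]
test = [[2.0, 25.0], [6.0, 10.0]]

params = fit_scaler(train)                 # fit on training data only
train_scaled = apply_scaler(train, params)
test_scaled = apply_scaler(test, params)   # reuse the same parameters for test data
```

Keeping the fitted parameters in a separate object mirrors the usual workflow of saving the scaling parameters once and restoring them for new data, which avoids leaking test-set statistics into the model.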