I need to know why we can't use the t-test in all phases. I mean, why does the t-test require the data to be normally distributed? Why can't the t-test handle non-normal data, given that it deals with samples as well?

I know that there are other non-parametric tests which don't expect the data to be normally distributed, but my question is specific to the t-test and z-test.

I am a vascular surgeon planning a research project.

I need some advice regarding the sample size calculation.

I am looking at the same group of patients pre and post-procedure looking for a change in the diameter of their main blood vessel at 2 different points.

We have some published data from a small pilot study:

Pre post Range for post n

Point 1: 27.4mm 28.3mm 25.24 - 21.83 43

Point 2: 22.2mm 23.6mm 21.83 -...

Sample size calculation

How short can the interval between trials be when assessing test-retest reliability?

If one runs an experiment and repeats it 3 times in succession, is this okay?

I'm wondering if anyone would be able to help me understand my mediation results. I ran two different mediation analyses on SPSS using the macro PROCESS.

The issue I am having is my a and b pathways are not significant, but my c' and c pathways are. How would I write these results up?

Thank you in advance!

Could anyone explain to me why researchers use strong criteria in RCTs and what then happens to the power?

Thanks in advance!

So it is difficult to get past the opening paragraphs.

I am OK with standard Logistic Regression but what does conditional mean here?

What are strata (clusters)?

What is a stratum indicator variable?

What are matched sets?

How do you get results and how accurate are they?

A simple example of each would help greatly.

I want to split each cohort based on values of an "education" variable.

1) High school dropout if var<=11

2) High school graduate if var==12

3) College participant if var between 13 and 15

4) College graduate if var>=16

Additionally, I want to estimate a multinomial logit model on these choices, using high school dropouts as the reference group.

Eventually I want to estimate an ordered Probit model as well.

How do I go about doing this in Stata? I'm pretty new...

I need your help with cohorts
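The grouping rule in the post can be sketched as follows. This is Python rather than Stata, so it only illustrates the logic. It assumes the "education" variable holds whole years of schooling; the function name and the sample values are made up for illustration.

```python
# Order matters: index 0 (dropouts) is the intended reference group.
GROUP_ORDER = [
    "High school dropout",
    "High school graduate",
    "College participant",
    "College graduate",
]


def education_group(years):
    """Map years of education to one of the four groups from the post."""
    if years <= 11:
        return "High school dropout"
    if years == 12:
        return "High school graduate"
    if 13 <= years <= 15:
        return "College participant"
    return "College graduate"  # years >= 16


def education_code(years):
    """Integer code 0..3; the ordering is what an ordered model would rely on."""
    return GROUP_ORDER.index(education_group(years))


# Hypothetical values of the education variable.
sample_years = [9, 12, 14, 16, 11, 13]
codes = [education_code(y) for y in sample_years]
print(list(zip(sample_years, codes)))
```

Coding dropouts as 0 keeps them as the lowest category, which is convenient both as a reference group in a multinomial model and as the bottom level in an ordered one.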

I hope you are doing well. I was wondering if anybody could help me with the information component formula, a Bayesian method for data mining analysis.

I am having trouble calculating the confidence interval. The formula is attached as a PDF. The lower CI limit makes sense, but when I calculate the upper CI limit, it is still lower than the CI itself, which I don't fully understand. Looking at the formula, it doesn't make sense to me either...

How would you calculate it instead?

Thanks so much...

Information component formula - Bayesian method - Data mining

The water quality data are not normally distributed.

I have 100 samples from the urban areas and 200 samples from the rural areas.

So far, I can't find a test that lets me assess the significance of the difference between the two groups, given that I want to compare medians and that the two groups are of different sizes.

I am using RStudio for my analysis.
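One option that places no restriction on group sizes is a permutation test on the difference in medians. The sketch below is in Python rather than R and only illustrates the idea; the function name, the number of permutations, and the simulated data are assumptions, not anything from the post. It tests the null hypothesis that the group labels are exchangeable.

```python
import random
from statistics import median


def median_permutation_test(a, b, n_perm=5000, seed=0):
    """Two-sided permutation p-value for the absolute difference in medians."""
    rng = random.Random(seed)
    observed = abs(median(a) - median(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(median(pooled[:n_a]) - median(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # The +1 correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Made-up skewed data standing in for the 100 urban and 200 rural samples.
data_rng = random.Random(1)
urban = [data_rng.lognormvariate(1.0, 0.5) for _ in range(100)]
rural = [data_rng.lognormvariate(0.5, 0.5) for _ in range(200)]

p_value = median_permutation_test(urban, rural, n_perm=2000)
print(f"permutation p-value: {p_value:.4f}")
```

The Wilcoxon rank-sum (Mann-Whitney) test also accepts unequal group sizes, although it compares medians only under additional assumptions about the shapes of the two distributions.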

Can someone be so kind as to explain to me a little bit what is used for?

I am working on a database collected from walk/path observations of animals. The data are recorded as X and Y coordinates.

Do you have any suggestions about which R package I can use and which analysis would be best to test differences between conditions? For now I found a nice package called spatstat, but I am not sure it would be ideal for my data.

Thank you all

For a client's website CRO (conversion rate optimization) trajectory project, I’m trying to set up a way for them to sequentially check whether an A/B test has ‘reached’ significant results (i.e. does website B lead to more transactions than website A?). Much like clinical trials, data is coming in over time and the sooner a conclusion can be drawn the bigger the impact. I’ve been reading some papers on this subject (e.g. INTERIM ANALYSIS: THE ALPHA SPENDING FUNCTION APPROACH by...

What alpha spending formula to use for a chi-square test of independence?

I need help with my research. I am investigating three variables, and the data were collected from fewer than 50 students. I don't know which statistical tests I should (am allowed to) use, so I would be very grateful if someone could help me by private message, where I could share more information. Thank you very much

I am not sure whether I should use Dummy coding or Helmert coding.

I haven't found papers with moderated mediation and Helmert coding yet.

My study:

X = 3 levels

W = 4 levels

M1, M2, M3 = continuous variables

Y = continuous variable

My hypotheses propose that one of the levels of X, level 1, will have the highest impact on Y through the mediators, followed by the next level, which...

Multicategorical factorial design: Dummy or Helmert coding?

I have data from 100 participants. For each one I have one column with one measure and a second column with the second measure. My question is whether it’s possible to compute a Pearson correlation for each participant at once.

Example:

Compute a correlation between columns A and B, another between columns C and D, another between columns E and F, and so on.

I know I can produce a correlation table for all variables, but that's not what I need in the end.

Thank you
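The pairing described in the example (A with B, C with D, and so on) can be sketched like this. It assumes the columns are stored in order so that each consecutive pair belongs to one participant; the column names and values below are made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: columns A and B for participant 1, C and D for participant 2.
columns = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 6, 8, 10],
    "C": [1, 2, 3, 4, 5],
    "D": [5, 4, 3, 2, 1],
}

names = list(columns)
pairwise = {
    (names[i], names[i + 1]): pearson(columns[names[i]], columns[names[i + 1]])
    for i in range(0, len(names) - 1, 2)  # step of 2: only (A,B), (C,D), ...
}
for pair, r in pairwise.items():
    print(pair, round(r, 3))
```

Stepping through the column names two at a time avoids building the full correlation table and keeps only the within-participant correlations.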

I have a problem with the interpretation of my results from a multiple linear regression. I want to evaluate the impact of a policy change (X) on an economic indicator (Y) and have included various covariates in the regression (6). The policy change occurred in 2015 and I have data from 2012 to 2018 available. I first ran the regression with the full data set and it turned out highly significant. Afterwards, I played around a bit with the data range and got some weird results. If the...

Interpretation of my results (Significance varies a lot)

I am currently evaluating a 2x2 experimental design.

Let's say the two independent variables are A and B with the manipulated levels (A1 vs. A2) and (B1 vs. B2). The dependent measure is performance (as a percentage score). I hypothesize that performance is higher in treatment A2 compared to A1. Moreover, I hypothesize that performance in treatment B2 is higher than in B1. However, most importantly, I predict that the increase from A1 to A2 is significantly higher than the increase from...

2x2 design: compare the increase within two treatments

Solution: 3, 6, 12, 24, 48, 96, ..., ~3M

For the target situation, I know that there are always going to be 3 positive results. I’m looking at a soccer league and comparing some stats (goals, last year's rank, etc.) to predict the likelihood of a top 3 finish in a given year. Each previous subgroup/year will have 3 positive results. However, since the probabilities calculated are independent of each other...

Normalizing logistic regression probabilities to fixed sum
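The simplest way to force the predicted probabilities within one season to sum to 3 is proportional rescaling, sketched below. This is only one possible approach and not necessarily the best one: a rescaled value can exceed 1 for a strong favourite, which would need separate handling. The function name and the example probabilities are hypothetical.

```python
def normalize_to_sum(probs, target=3.0):
    """Rescale probabilities proportionally so that they sum to `target`."""
    total = sum(probs)
    return [p * target / total for p in probs]


# Hypothetical per-team top-3 probabilities from independent logistic predictions.
raw = [0.6, 0.5, 0.4, 0.3, 0.2]
scaled = normalize_to_sum(raw, target=3.0)
print([round(p, 3) for p in scaled], round(sum(scaled), 3))
```

Proportional rescaling preserves the ranking of the teams and the ratios between their probabilities, but not the odds ratios implied by the logistic model.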

I do not understand the difference between questions 1 and 2: both ask to compute the influence of one single variable on y without the impact of a second independent variable. So why do we have to compute the partial and then the semipartial correlation for question 2? I mean, we can just compute the partial correlation of study exam, and then for question 2 calculate again the partial correlation of study semester...

I'm trying to find a suitable statistical method to analyse my data. I have data on what caused failures for HGVs in an annual, arranged test vs a roadside inspection, and the number and rate of each failure. For each of these categories I have one value per year for six years. I have attached a sample graph to demonstrate what I mean.

I thought at first of a paired-samples t-test, as I had "paired" data (same trucks, in two different scenarios), but I'm not trying to compare the means of...

Which Method?

The overuse of percentages by our wise ruling class and media to convey critical information may be hindering understanding, and therefore a rational response to the

WANTED: Rational Coronavirus Analysis

I have a question concerning the following case:

As we can observe, the IV gender has a significant effect, since its p-value is < 0.05; however, we can see that it explains only 0.31% of superstrength.

How can it be significant and at the same time explain only 0.31% of the DV?

Thank you for your time

- I need to understand how to put Xbar = 8.13, n = 30, SD = 4.17 into written formula work

- The National Football League (NFL) has recently decided that they must improve their understanding of vicious, outside-the-rules, illegal hits/tackles that have been occurring in professional football games. This phenomenon has never been studied or monitored in the past. To have a comparative basis, the NFL decides that they must look back at some history. It is not feasible to look at...

easily done on Minitab

I am doing a bachelor project about the effectiveness of an empathy class at school. To do so, we gave an "empathy course" to a class of 10-year-old children.

We then gave the children in this class a test to measure their level of empathy and compared those results to those of another class that did not have an empathy class.

I want to compare the means of those two classes but I am not sure which test to use. I was thinking of using the "Independent t-test for two samples" (the two...

Independent t-test for two samples

In Life

Some mock me for doing statistics

Some loathe me and statistics

Some don’t understand what statistics are

Why is it that statistics

Put a calm smile on my face?

Because of statistics I can solve the deepest mysteries

Because of statistics I will not be lonely again, playing in the data

Because of statistics I can rearrange the stars in the skies above

(by Chinese statistician Wang Jiaowei [translated],

The...

Statistics Poetry

I have a question about statistical testing on hydrological statistics - I am trying to find a correlation on the link between a multitude of values and arguments (like precipitation and number of cases, etc). My question is what kind of test should be used to find a connection between different arguments. For example, I need to find a connection between number of flash flood cases and the size of the basin (file attached). What kind of test should be used, assuming there is no...

Analysing a bunch of data

For an assignment I have to answer the following question:

"What is the effect of

I don't quite understand what they mean by "

Plus, I couldn't find any helpful theory online...

I have the following data where I already calculated all the missing values (X1-X15) for the first part of the exercise:

Thank you in advance for your help!

I want to conduct an A/B test where the main metric is ‘logged in visitors/total visitors’ on my website.

I assume the following, and want to verify with you that I’m correct:

- My null hypothesis is that there’s no difference between the two groups' proportions

- Therefore, it is a chi squared test

If so, how do I calculate the sample size needed for the test? The online calculators talk about CTR, but that’s not a CTR

I would appreciate some theoretical explanation to shed light on this.

Thank you!
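For a metric of the form "logged in visitors / total visitors", the sample size question reduces to comparing two proportions, so the usual normal-approximation formula applies whether or not the metric is called a CTR. A minimal sketch follows, assuming a two-sided test with equal group sizes; the baseline and target rates, and the default alpha and power, are made-up inputs.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Visitors needed per group to detect p1 vs p2 (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2  # pooled proportion under the null hypothesis
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Hypothetical rates: 10% of visitors log in on A, and we want to detect 12% on B.
n_per_group = sample_size_two_proportions(0.10, 0.12)
print(n_per_group)
```

The required size grows with the inverse square of the difference `p1 - p2`, so halving the detectable difference roughly quadruples the number of visitors needed.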

Demonstrate understanding of the Central Limit Theorem, using R, by showing how the distribution of the sample mean changes according to sample size.

Consider a Poisson distribution with λ = 1.5.

Generate samples of 10,000 means over different numbers of observations (e.g. a matrix with 1, 2, 3, ..., 100 rows). For each of these samples of means, compute the mean of the means, the sample standard deviation of the means...

Help with R code to produce proportions of sample means above one sigma
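The simulation described in the assignment can be sketched as below. It is written in Python rather than R, purely to illustrate the steps. It uses 2,000 means per sample size instead of 10,000 to keep the run short, and it reads "above one sigma" as "more than one standard deviation above the mean of the means", which is an assumption because the task text is cut off. The Poisson sampler uses Knuth's multiplication method.

```python
import math
import random
from statistics import mean, stdev


def poisson_draw(lam, rng):
    """One Poisson(lam) variate via Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def simulate_means(n_obs, n_means, lam=1.5, seed=0):
    """Return n_means sample means, each over n_obs Poisson(lam) draws."""
    rng = random.Random(seed)
    return [
        sum(poisson_draw(lam, rng) for _ in range(n_obs)) / n_obs
        for _ in range(n_means)
    ]


for n_obs in (1, 5, 30, 100):
    means = simulate_means(n_obs, n_means=2000)
    centre, spread = mean(means), stdev(means)
    above = sum(m > centre + spread for m in means) / len(means)
    theory = math.sqrt(1.5 / n_obs)  # sd of the sample mean: sqrt(lambda / n)
    print(n_obs, round(centre, 3), round(spread, 3), round(theory, 3), round(above, 3))
```

As the number of observations grows, the standard deviation of the means should shrink towards `sqrt(lambda / n)` and the proportion above one sigma should approach the normal value of about 0.159.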

Since McNemar is the preferred test for paired categorical data, I used it and got a significant difference in comparison of 45% for all proportions but 45% to 37%. I interpret this as "other tests show significantly...

Is it wrong to use Chi squared test for paired data?

I’m testing a hypothesis that requires Tukey-like comparisons of the variances (or standard deviations) of multiple groups. I’ve already used Tukey to conduct pairwise comparisons of the

Tukey-like pairwise comparisons of variances

Tests of normality - pros and cons of each, and what to do if they give different results?

I have information on multiple YouTube channels. The information collected for each user's channel is:

channel name | the date/day | the number of subscribers on the day | number of videos on the day | a yes/no variable for whether they have a video trending on that day.

I want to see what kind of impact having a trending video on a certain day has on the number of subscribers the channel has. I want to see if there is a difference in growth in subscribers between the days where the...

YouTube Trending Analysis/Statistics

I know to use the chi-square distribution for the CI of the SD, but somebody wants to use the central limit theorem to calculate this CI and wants to find the book that has this formula. Thank you all.

If you don't know them... well could you read please? It's a simple introduction on how it works.

In tennis, each game is won when one player achieves both of two goals: (1) her score must reach at least 4 points, and (2) it must exceed that of her opponent by at least 2 points. Scores start with “Love” (0 points), then “15” (1 point), “30” (2 points) and “40” (3 points)...

Markov Chain Models in tennis
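The scoring rule above defines a small Markov chain over point scores. The sketch below computes the probability of winning a game from 0-0, assuming every point is won independently with the same probability `p` (a simplifying assumption that is not stated in the post). The function names are illustrative.

```python
from functools import lru_cache


def game_win_probability(p):
    """Probability that a player wins a game when she wins each point with prob. p."""
    q = 1 - p
    # From deuce she wins two points in a row (p*p), loses two in a row (q*q),
    # or returns to deuce; solving that recursion gives the ratio below.
    win_from_deuce = p * p / (p * p + q * q)

    @lru_cache(maxsize=None)
    def win_from(a, b):
        """Win probability when the score is a points to b points."""
        if a >= 4 and a - b >= 2:
            return 1.0
        if b >= 4 and b - a >= 2:
            return 0.0
        if a == 3 and b == 3:
            return win_from_deuce
        return p * win_from(a + 1, b) + q * win_from(a, b + 1)

    return win_from(0, 0)


for p in (0.5, 0.55, 0.6):
    print(p, round(game_win_probability(p), 4))
```

A small edge per point is amplified at the game level, which is the usual motivation for modelling tennis this way.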

The measurement - no. of redundancies per year. Year 1 = 1000, year 2 = 1300, year 3 = 2400...etc. It's not normally distributed, and it's a time series, so I guess a Poisson distribution.

How would you go about assessing whether or not the differences from year to year are random fluctuations or something more significant?

Many thanks

Cattle are the main natural reservoir of STEC (Shiga toxin-producing E. coli) and excrete these bacteria in their feces. Human transmission occurs through consumption of fecally contaminated food or water, resulting in serious diarrhea and kidney damage. Currently there is no specific treatment for human infection. Therefore, limiting bovine carriage (by vaccination) is a viable option.

I am doing research on an experimental bovine vaccine. I performed an RCT (randomised...

Advice on statistical analysis of in vivo bacterial data

- What is the statistical tool that is used for the inference of *estimation*?

Sample Mean

Hypothesis Test (6 steps) (this was my answer)

p-values

- Which of the following confidence levels will provide the widest interval if the same sample data is used to create each interval?

b. 95%...

not understanding this homework

Firstly I know very little about statistics and secondly I know that I do risk being labelled as a 'fruitcake' by some. But so be it.

However, it is my contention that the current Covid-19 pandemic is nothing more than a variation of the other Coronavirus outbreaks that have occurred in the past. A theory that some well respected scientists also hold BTW.

Anyway it occurred to me that one way to test this would be to look at the weekly death statistics this year compared to the...

Help with Covid-19 statistics (Newbie alert)

I am writing my thesis and had to stop my experiments because of COVID-19.

Because of this I have to continue with a case study instead of an RCT.

Now my question is if I can use statistics in a case study and if this is useful? (it's one patient with a pre and post measurement)

I found a Reliable Change Index (RCI) and was wondering if I can use this or if there are other alternatives that you know of?

Thank you for your help,

Maite
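For reference, the Reliable Change Index in its common Jacobson-Truax form divides the pre/post difference by the standard error of the difference, which is built from a reference standard deviation and the test-retest reliability of the instrument. A minimal sketch follows; the scores, SD, and reliability are hypothetical and would have to come from the instrument's published norms.

```python
from math import sqrt


def reliable_change_index(pre, post, sd_reference, reliability):
    """Jacobson-Truax RCI: (post - pre) / standard error of the difference."""
    se_measurement = sd_reference * sqrt(1 - reliability)
    se_difference = sqrt(2) * se_measurement
    return (post - pre) / se_difference


# Hypothetical single patient: score drops from 30 to 20 on a scale with
# a normative SD of 7.5 and a test-retest reliability of 0.80.
rci = reliable_change_index(pre=30, post=20, sd_reference=7.5, reliability=0.80)
print(round(rci, 2), "reliable change" if abs(rci) > 1.96 else "within measurement error")
```

By convention, an absolute RCI above 1.96 is read as change beyond what measurement error alone would be expected to produce at the 5% level.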

- How can I find out if there is cause-effect relationship between the 2 variables?
- What is the difference between an association and a correlation?
- Does a strong association automatically imply a correlation?

Let's suppose I only have 2 labels, `cat` and `dog` (I actually have 1500+).

And 2 different classifiers, `A` and `B`, that implement completely different models.

Given a set of sample pictures `S1..Sn` for input `S` I'd like to figure out the total probability assigned to each label.

Code:

` Classifier: A B...`

I'm attempting to compare differences between 2 unequal group sizes (one ~97, the other ~714). The reason for the large discrepancy is that I am looking at a program done by one class to see if it is significantly different from what has occurred in previous classes. I've been reading about robust stats recently and decided to use a Yuen bootstrap in RStudio from the WRS2 package for a more valid comparison, especially with the difference in sample size.

My formula is...

Independent Robust T-Test