I'm sorry for not using the correct terms, but I have a problem that involves multiple layers of probability that I could solve by myself eventually, but it would take days of manual labour to systematically work through, so there must be a faster way. I would very much appreciate any advice on how to create an equation to address this puzzle.

Imagine a game where a letter is posted to a random address. The recipient of that letter will then forward it to a new address, and so on and...

Multiple layers of probability]]>

1) Is it still true that there is no (non-bootstrap) way of generating SEs for the lasso? If so, how do you do a statistical test?

2) One article I read said that all variables have to be standardized to use the bootstrap. Is that true?

3) I understand that lasso assigns a penalty to shrink various estimates. But I don't understand substantively which will shrink more than others (that is, the basis on which they shrink). I have to admit although I...

Lasso regression]]>
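
On question 3, one special case makes the shrinkage rule easy to see. The sketch below assumes an orthonormal design (X'X = I) and the objective 0.5 * ||y - Xb||^2 + lam * ||b||_1; neither assumption comes from the post, and the coefficient values are made up. Under those assumptions each lasso coefficient is the OLS coefficient moved toward zero by lam, and set to exactly zero if it is smaller than lam in absolute value.

```python
def soft_threshold(beta_ols, lam):
    """Lasso estimate of one coefficient, assuming an orthonormal design."""
    if beta_ols > lam:
        return beta_ols - lam
    if beta_ols < -lam:
        return beta_ols + lam
    return 0.0  # coefficients within [-lam, lam] are dropped entirely


# Hypothetical OLS coefficients and penalty, for illustration only.
ols_coefficients = [3.0, -1.2, 0.4, -0.1]
lam = 0.5
lasso_coefficients = [soft_threshold(b, lam) for b in ols_coefficients]

for ols, lasso in zip(ols_coefficients, lasso_coefficients):
    print(f"OLS {ols:+.2f} -> lasso {lasso:+.2f}")
```

Every surviving coefficient loses the same absolute amount here, so small coefficients shrink proportionally more. Because the penalty acts on the size of the coefficient, it depends on the units of each predictor, which is the usual reason given for standardizing. With correlated predictors the picture is more complicated than this sketch.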

Using univariate analysis (McNemar/Wilcoxon), I found that there is no statistical difference between the case and control groups (checked for one of my dichotomous independent variables).

When I used COXREG (in order to run a conditional logistic regression in SPSS) and analysed the same independent variable (analysed alone in the regression) - I realized there was a...

Help- conditional log regression]]>

I am not a statistician and I am only a beginner in this field.

I would really appreciate any help on this subject of regression.

I have a sample size of 37 with 9 predictors.

The predictors are family size (categorical, so it's converted to dummy variables), total no. of appliances (scale), total no. of rooms (scale), total appliance usage hours (scale), tariff price of electricity (scale), income group (categorical, so it's converted to dummy variables), etc.

The DV is energy consumption...

Multi-variate regression for a small sample]]>

I have recently conducted a validation study on a 7-item survey.

For the factor analysis, I have calculated

1. Determinant of the correlation matrix Det = 0.030

2. Bartlett test of sphericity: Chi-square = 685.883; degrees of freedom = 21; p-value = 0.000; H0: variables are not intercorrelated

3. Kaiser-Meyer-Olkin Measure of Sampling Adequacy KMO = 0.847

Afterwards I have conducted FACTOR ANALYSIS with principal component factor Factor analysis/correlation

Factor...

factor analysis: PCA vs EFA]]>

Y = X_1**2 + X_2**2 + ... + X_k**2

with X_i independent standard normally distributed (mean 0, std 1).

The expected value of a sum of variables is

E(X_1 + X_2) = E(X_1) + E(X_2)

The expected value of a product of independent variables is

E(X_1 * X_2) = E(X_1) * E(X_2)

Combining all of this

E(Y) = E(X_1**2 + X_2**2 + ... + X_k**2)

= E(X_1**2) + E(X_2**2) + ... + E(X_k**2)

= E(X_1)*E(X_1) + E(X_2)*E(X_2) + ... +...

Expected value of chi-square distribution]]>
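
A note on the step that writes E(X_1**2) as E(X_1)*E(X_1): the product rule needs the two factors to be independent, and X_1 is not independent of itself. For a standard normal variable, E(X_i**2) = Var(X_i) + E(X_i)**2 = 1 + 0 = 1, so E(Y) = k. Here is a minimal simulation sketch to check that numerically; the function name, k = 5, and the number of draws are just illustrative choices.

```python
import random


def simulated_mean_of_y(k, n_draws=20000, seed=0):
    """Average of Y = X_1**2 + ... + X_k**2 over many simulated draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        total += sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))
    return total / n_draws


estimate = simulated_mean_of_y(k=5)
print(f"simulated E(Y) for k=5: {estimate:.3f}")  # should land near 5, not 0
```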

Relative impact]]>

I am currently completing my dissertation and my supervisor wants me to include a statistical significance test in relation to the data which I received.

This is what he has asked:

- I was wondering if we can extrapolate these local findings nationwide- and provide the 95% CI of these two estimates?

- Is it possible to run some significance tests for these observations between compliant and non-compliant, which looks significant...

Statistical Significance]]>

I compared different ways to calculate confidence intervals. On the one hand I used the direct formula; on the other hand I used a percentile bootstrap method. (Calculate the statistic (e.g. the mean) on n subsamples and choose the a/2 and 1-a/2 percentiles for the confidence interval.)

I noticed that applying the bootstrap method with replacement and a sample size that equals the original sample leads to (almost) the same confidence interval as the one I calculated by formula. The...

Why does bootstrapping with 50% sample size and no replacement always give appropriate confidence intervals?]]>
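
One possible explanation, at least for the mean: drawing m of the n values without replacement gives a mean with variance (s**2 / m) * (1 - m / n), and for m = n / 2 that equals s**2 / n, the same as the textbook formula. The sketch below compares the three standard errors on simulated data. The data, the helper names, and the number of resamples are assumptions for illustration, and it says nothing about statistics other than the mean.

```python
import random
import statistics


def resampled_means(data, size, replace, n_resamples, rng):
    """Means of repeated draws of `size` values, with or without replacement."""
    means = []
    for _ in range(n_resamples):
        draw = rng.choices(data, k=size) if replace else rng.sample(data, size)
        means.append(statistics.fmean(draw))
    return means


def percentile_interval(values, alpha=0.05):
    """Percentile interval: the alpha/2 and 1-alpha/2 points of the values."""
    ordered = sorted(values)
    lower = ordered[int((alpha / 2) * len(ordered))]
    upper = ordered[int((1 - alpha / 2) * len(ordered)) - 1]
    return lower, upper


rng = random.Random(1)
data = [rng.gauss(50.0, 10.0) for _ in range(200)]  # made-up sample
n = len(data)

se_formula = statistics.stdev(data) / n ** 0.5
boot_means = resampled_means(data, n, True, 2000, rng)
half_means = resampled_means(data, n // 2, False, 2000, rng)
se_bootstrap = statistics.stdev(boot_means)
se_half = statistics.stdev(half_means)

print(f"formula SE:             {se_formula:.3f}")
print(f"bootstrap SE:           {se_bootstrap:.3f}")
print(f"half-sample SE:         {se_half:.3f}")
print("bootstrap 95% interval:  ", percentile_interval(boot_means))
print("half-sample 95% interval:", percentile_interval(half_means))
```

Other subsample fractions without replacement would not line up this way: the factor (n / m) * (1 - m / n) equals 1 only at m = n / 2.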

I'm making an MLB Model and I'm at a point where I'm stuck.

I have the AVG Runs Scored for each team, and the Standard Deviation of Runs Scored for each team for the season as well. If I know that one team is facing a pitcher who is superb, let's say 15% better than the league average, I will multiply the AVG Runs Scored by .85 to try and get a better indication of how many runs I can expect the team to put up.

My question is, do I have to multiply the Standard Deviation by .85...

I have a Mean and Standard Deviation. If I multiply the Mean by a number (non-constant), do I have to multiply the Standard Deviation by that number?]]>
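
For a fixed multiplier c applied to every game, the arithmetic is SD(c * X) = |c| * SD(X), so the standard deviation scales by the same factor as the mean. A quick check with made-up run totals (the numbers below are not real MLB data):

```python
import statistics

# Hypothetical runs scored per game, for illustration only.
runs = [2, 5, 3, 7, 4, 6, 1, 4, 5, 3]
factor = 0.85  # pitcher assumed 15% better than league average

scaled_runs = [factor * r for r in runs]

mean_ratio = statistics.fmean(scaled_runs) / statistics.fmean(runs)
sd_ratio = statistics.stdev(scaled_runs) / statistics.stdev(runs)

print(f"mean ratio: {mean_ratio:.2f}")
print(f"SD ratio:   {sd_ratio:.2f}")
```

Whether a strong pitcher really shrinks the whole run distribution proportionally is a modelling assumption, not something the arithmetic settles.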

Is it possible to calculate the probability of an unvaccinated man dying?]]>

Chance weights are integer values that represent outcomes to a random event. A flip of a coin has two outcomes, each with a chance weight of 1. A roll of a pair of dice has 36 chance weights partitioned across 11 possible outcomes like so: {1,2,3,4,5,6,5,4,3,2,1}.

I start with an outcome randomly chosen from this set of chance weights: {1, 3, 6, 10, 15, 21, 25, 27, 27, 25, 21, 15, 10, 6, 3, 1}.

There are 216 chance weights, allotted to 16 possible outcomes.

But this...

Scaling chance weights]]>
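
For reference, chance weights like these can be generated by counting how many equally likely rolls produce each total. The 216-weight set in the post happens to match the totals of three six-sided dice; the post does not say that is where it came from, so treat that as an observation. The function name below is just illustrative.

```python
from collections import Counter
from itertools import product


def chance_weights(n_dice, sides=6):
    """Number of equally likely rolls giving each possible total, in order."""
    rolls = product(range(1, sides + 1), repeat=n_dice)
    counts = Counter(sum(roll) for roll in rolls)
    return [counts[total] for total in sorted(counts)]


two_dice = chance_weights(2)
three_dice = chance_weights(3)

print(two_dice, sum(two_dice))
print(three_dice, sum(three_dice))
```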

Code:

```
Row │ hryear4 prmjind1 prcnt_minority median_wage wage_10 median_age
│ Int64 Int64 Float64 Float64 Float64 Float64
─────┼─────────────────────────────────────────────────────────────────────
1 │ 2010 1 54.9525 10.0 8.0...
```

Include Month Variable in GLM?]]>

This is not about causality per se; I know that correlational studies cannot show that...

How do you know if X is moving Y in non-experimental data?]]>

I'm thinking of using the Mann-Whitney test but I have also read it is not the best for differently shaped distributions. I also have various sample sizes, some are below 30 per group, others are above 400 per group. Also, the majority of my data have unequal...

Statistical test for non-normal, dissimilarly distributed data?]]>

Code:

`Row │ hryear4 prmjind1 prcnt_minority median_wage wage_10...`

This is the code

PROC LOGISTIC DATA=WORK.SORTTempTableSorted

PLOTS(ONLY)=ALL

;

CLASS pd2 (PARAM=REF) pd1 (PARAM=REF) pd3 .....;

MODEL DVD (Event = '1')=pd1 pd2 pd3 pd4 pd5 pd6 pd7 pd8 pd9 pd10 pd11 pd12 pd13 pd15 pd16 pd17 pd18 pd19 pd20 pd21 pd22 pd23 pd24 pd25 pd26 pd27 pd28 pd29 pd30 pd31 pd14 /

SELECTION=NONE

LINK=LOGIT

;

RUN...

Reference coding in SAS PROC LOGISTIC]]>

I have a dataset that consists of Price and Quantity.

The industry has a price that changes every hour or every day, and is calculated from a Supply and Demand formula where both supply and demand are constantly changing. The Demand part of this is the Industry demand for that hour of that day, but suppliers may be coming and going throughout the day with different quantities and price offers.

As a result there can be large fluctuations in Price between months, both from an overall...

Minimizing variance - can't seem to find the solution]]>

How can I create a table with 2 independent numeric variables in SPSS?]]>

I am finishing my master's thesis and have included an ARIMAX model in the research. Now I would like to make some manual calculations based on the results from the ARIMAX model in SPSS. I know that the results are not all significant, but this is not a concern: a simulated example is used, so significance is not that important.

My question, for now, is how to derive a formula from this model (and which formula to use), and to use the results of the model in SPSS to make...

Deriving a formula from an ARIMAX Model]]>

Omitted variable bias]]>

Missing data]]>

Struggling to understand where to start for internships.]]>

Last time I ran logistic regression with predictors on the original scale (which the system assumes is...

Best way to run regression]]>

When we read descriptions of several post-hoc tests, we can get lost, because each one has its advantages and disadvantages. Sometimes you reach a point where you can't decide which one is better to use. Even if you use all the candidate post-hoc tests that most likely fit your situation, you might get different significance results.

If we read the research done on them, we usually come across some terms that describe them, namely:

Liberal test, Conservative test, a...

Post-hoc tests' properties]]>

I have data collected from 40 businesses that publish R&D spending over 11 years on all the above.

I have been using R&D as the independent variable and the remainder as dependent variables.

The correlations are significant at the following levels:

RD EPS NP EBITDA OP GP

RD...

Best Model for comparing R&D Spending to business performance metrics]]>

I want to know whether the intervention would make a significant change to the participant or not by assessing:

(i) the difference in pre-test and post-test for each cycle,

(ii) the difference in the post-tests (cycle 1 & cycle 2)

(iii) the...

Do I have to conduct a normality test for only one paired sample?]]>

I need help choosing the correct test to accurately compare some data.

I want to compare attendance between two time periods.

E.g. attendance = 14 during a 6.5-week period, while in a separate period of only 6.25 weeks, attendance was 7.

Any suggestions?

thanks in advance!]]>
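
One option for two counts over unequal periods is an exact conditional test of two rates. It assumes each count behaves like a Poisson count with a constant weekly rate, which may or may not suit attendance data, and it reads "7" as 7 attendances. Given the total, the first count is then binomial with success probability equal to the first period's share of the total time. A minimal sketch (the function name is made up):

```python
from math import comb


def exact_rate_test(count_a, weeks_a, count_b, weeks_b):
    """Two-sided exact test that two Poisson rates are equal.

    Conditions on the total count; under equal rates the first count is
    Binomial(total, weeks_a / (weeks_a + weeks_b)).
    """
    total = count_a + count_b
    share = weeks_a / (weeks_a + weeks_b)
    pmf = [
        comb(total, k) * share ** k * (1 - share) ** (total - k)
        for k in range(total + 1)
    ]
    observed = pmf[count_a]
    # Sum the probabilities of all outcomes no more likely than the observed one.
    p_value = sum(prob for prob in pmf if prob <= observed * (1 + 1e-9))
    return min(1.0, p_value)


p_value = exact_rate_test(14, 6.5, 7, 6.25)
print(f"rate A: {14 / 6.5:.2f} per week, rate B: {7 / 6.25:.2f} per week")
print(f"two-sided p-value: {p_value:.3f}")
```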

Individual Covariate in Fractional Factorial Design]]>

I am testing a new measurement and I am trying to calculate the normal value of the new measurement. I had the measurement done by 6 observers and repeated it after 6 weeks to test intra- and interobserver reliability. The variable has a normal distribution.

1- To get the normal values for the new measurement, which data set should I use? (I have the variable values for each observer on 2 occasions.)

2- The normal value of the new measurement, is it the minimum and maximum value in...

Normal values for a new test]]>

I am currently teaching myself A level statistics from a textbook, which can be difficult when I can't work out how to get from the question to the answer given in the book, so here's hoping you can help!

I keep thinking I've understood the process for type II errors, and then a question comes along that knocks this belief. I've been thinking on the below question for so long that I'm most likely now completely overthinking it, but I'm pretty stuck. It's only the second part of the...

Type II error textbook question]]>

I am currently doing a Master's in data science. I came across the probability density function (PDF), which is used to find the probability that a continuous random variable falls within a range.

The PDF is plotted with probability density on the y axis and the random variable on the x axis.

I am not able to understand how to convert an experiment's observations of a continuous random variable into a probability density function.

Kindly help me understand...

probability density]]>
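
One simple way to go from observations to an estimated density is a normalized histogram: count the observations in each bin, then divide by (number of observations * bin width) so the bars have total area 1. The probability of a range is then the area over that range. A minimal sketch, using simulated measurements in place of real experimental data (the function name, sample, and bin count are all illustrative choices):

```python
import random


def histogram_density(observations, n_bins=20):
    """Estimate a density as bin count / (n * bin width) on equal-width bins."""
    low, high = min(observations), max(observations)
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for x in observations:
        # The maximum value would fall just past the last bin, so clamp it.
        index = min(int((x - low) / width), n_bins - 1)
        counts[index] += 1
    n = len(observations)
    edges = [low + i * width for i in range(n_bins + 1)]
    densities = [count / (n * width) for count in counts]
    return edges, densities


rng = random.Random(0)
sample = [rng.gauss(170.0, 8.0) for _ in range(5000)]  # simulated observations

edges, densities = histogram_density(sample)
bin_width = edges[1] - edges[0]
area = sum(d * bin_width for d in densities)
print(f"total area under the estimated density: {area:.6f}")
```

The density values themselves are not probabilities and can exceed 1 when the bins are narrow; only areas are probabilities.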

3-to-1 propensity score matching and sensitivity analysis]]>

In a 2D world, a bag of magic marbles is poured out, where they should form a normally distributed pile. There are 256 marbles in the bag. As the marbles pour out one by one, they form...

Discrete normal distributions]]>

My question probably spans a couple of thread topics with regards to the title, but I decided to post my question in the R forum as it's the program I am using.

I am currently in the process of fitting a logistic regression model to horse racing outcomes (1 = Winner, 0 = Loser) given a range of fundamental variables associated with the horse.

The closing market odds are the best predictor of a horse race outcome. I am first going to run a one-step logistic regression model which...

Logistic regression problem in R - Horse racing]]>