# Logistic regression curve looks like linear regression

#### Pinky1111

##### New Member
Hi, I performed a linear as well as a logistic regression analysis (both in MATLAB) and plotted the results. The logistic regression curve looks very linear and can be described by the same function as the linear regression. I explain the linearity by the fact that the plot covers only a very small part of the whole logistic curve, far away from the asymptotic values. But can it actually be that both are described by the same function?

A confused beginner.

#### ondansetron

##### TS Contributor
Can you show us the plots?

What is being plotted on the Y axis in each case? For the logistic regression, the log odds should be roughly linear as a function of the x-variables, but if you're plotting the predicted probabilities, the curve would be sigmoidal (and a smoothed function of the observed 0/1 outcomes would be sigmoidal-ish).
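A minimal sketch of that distinction in R. The coefficients `b0` and `b1` are made up purely for illustration and are not values from the original analysis. The same model is a straight line on the log-odds scale and a sigmoid on the probability scale:

```r
# Hypothetical coefficients, chosen only for illustration
b0 <- -0.5
b1 <- 1.2
x <- seq(-5, 5, by = 0.5)

log_odds <- b0 + b1 * x     # linear predictor (log-odds scale)
prob <- plogis(log_odds)    # inverse logit: probabilities in (0, 1)

# Second differences are zero for a straight line, nonzero for a curve
max(abs(diff(log_odds, differences = 2)))  # effectively zero
max(abs(diff(prob, differences = 2)))      # clearly nonzero

# Transforming the probabilities back recovers the straight line
all.equal(qlogis(prob), log_odds)
```

So which shape shows up depends entirely on which of the two scales ends up on the Y axis.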

#### Dason

It's sigmoidal over the entire domain/range, but if you just look at a subset it can be quite linear. If you just look at the plot for y between .25 and .75, a linear fit is pretty darn good. So if it turns out that what you're modeling has predicted probabilities mainly in that region, then a linear fit won't be too bad. Using the linear fit beyond the input range you fit the model with could be problematic, though.
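To put rough numbers on this, here is a small sketch in R. It uses the standard logistic curve `plogis(x)` rather than any model fitted in this thread, and the variable names are only illustrative. It fits a straight line using only the part of the curve where y is between .25 and .75, then compares the error inside that region with the error when extrapolating:

```r
xs <- seq(-5, 5, by = 0.01)
ys <- plogis(xs)

# Keep only the part of the curve with .25 <= y <= .75
mid <- ys >= 0.25 & ys <= 0.75
fit <- lm(ys ~ xs, subset = mid)

# Predict over the whole x range and compare with the true curve
pred <- predict(fit, newdata = data.frame(xs = xs))
err <- abs(pred - ys)

max(err[mid])    # worst error inside the fitted region: small
max(err[!mid])   # worst error when extrapolating: large
range(pred)      # the extrapolated line leaves the [0, 1] interval
```

The reason the middle works so well is that near x = 0 the logistic curve is approximately 1/2 + x/4, with only a small cubic correction. Far from the middle, the line keeps going while the curve flattens toward 0 and 1.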

#### hlsmith

##### Not a robit
Yes @Dason - I assumed as well that this is likely what he is referring to, though you never know without the plots being posted explicitly. Given your scenario, it can be interesting to see which subset of observations lands in the tails. In particular, if they are using multiple regression models, are those the subgroups that are positive or negative for most of the covariates of interest? With propensity weights, trimming of extreme weights gets used, but once you trim you have to acknowledge that you are using a different data sample, conditional on some process.

#### Dason

I'm dumping the code for the above graphic here since I don't care enough to save it to my system but I took the time to write it so might as well...
```r
# Create the data for a logistic curve
xs <- seq(-5, 5, by = .01)
ys <- plogis(xs)

# Let's do some plotting and save to a png
png("LogisticVsLinear.png")
# Create plot area with labels but no points
# basegraphics4life
plot(xs,
     ys,
     type = "n",
     ylim = c(-.2, 1.2),
     main = "Logistic vs Linear - Midrange",
     ylab = "y",
     xlab = "x")
# Add in the logistic curve
lines(xs, ys, col = "blue")

# Plot the asymptotic boundaries
abline(h = 0)
abline(h = 1)

# you can define the area you want the line
# to 'best fit' for.  In this case it was
# for -1 <= x <= 1
id <- which(abs(xs) <= 1)
xs_red <- xs[id]
ys_red <- ys[id]
o <- lm(ys_red ~ xs_red)

# Plot the best fit line
abline(o, lty = 2)

legend("topleft",
       c("Logistic", "Linear"),
       col = c("blue", "black"),
       lty = c(1, 2))
dev.off()
```

#### Buckeye

##### Member
The Y-axis is a performance measure? Is that a continuous variable? Or are there two outcomes, success and failure?

#### Pinky1111

##### New Member
@Buckeye For the logistic regression, it is a binary variable (only ones and zeros).
In the linear regression plot, the actual (continuous) performance values are used.

#### hlsmith

##### Not a robit
And the "plot" thickens to a plasma-like state!