- Thread starter skan

Jenny Kotlerman

www.statisticalconsultingnetwork.com

Hello.

I hope you can understand my poor English.

Suppose I know that a straight line is the curve that best fits these data (*), but the errors are not normally distributed: they follow a heavy-tailed pdf, and I think the least-squares method gives too much "weight" to nearer data.

(*) Or I can transform the data so that a straight line fits.

As an example you can read this:

http://arxiv.org/ftp/arxiv/papers/0704/0704.1867.pdf

Or

"In particular, standard methods such as least-squares fitting are known to produce systematically biased estimates of parameters for power-law distributions and should not be used in most circumstances"

"Fitting a line to your log-log plot by least squares is a bad idea. It generally doesn't even give you a probability distribution, and even if your data do follow a power-law distribution, it gives you a bad estimate of the parameters. You cannot use the error estimates your regression software gives you, because those formulas incorporate assumptions which directly contradict the idea that you are seeing samples from a power law"

from http://cscs.umich.edu/~crshalizi/weblog/491.html
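
For the pure power-law case those quotes refer to, the maximum-likelihood estimate of the exponent has a well-known closed form for continuous data, alpha_hat = 1 + n / sum(ln(x_i / x_min)), with an approximate standard error of (alpha_hat - 1) / sqrt(n). Below is a minimal Python sketch of that estimator. It assumes x_min is known and that the data really follow p(x) proportional to x^(-alpha) for x >= x_min; the function names and the synthetic sample are hypothetical and only there for illustration.

```python
import math
import random


def sample_power_law(n, alpha, x_min, rng):
    """Draw n values from a continuous power law via inverse-transform sampling.

    Uses P(X > x) = (x / x_min) ** (-(alpha - 1)) for x >= x_min.
    """
    # 1 - rng.random() lies in (0, 1], so the power is always defined.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def fit_power_law_mle(data, x_min):
    """Closed-form MLE of alpha for a continuous power law with known x_min."""
    tail = [x for x in data if x >= x_min]  # only points in the power-law region
    n = len(tail)
    log_sum = sum(math.log(x / x_min) for x in tail)
    alpha_hat = 1.0 + n / log_sum
    std_err = (alpha_hat - 1.0) / math.sqrt(n)  # large-n approximation
    return alpha_hat, std_err


# Synthetic check (assumed values): true alpha = 2.5, x_min = 1.0
rng = random.Random(0)
sample = sample_power_law(10000, 2.5, 1.0, rng)
alpha_hat, std_err = fit_power_law_mle(sample, 1.0)
print(f"alpha_hat = {alpha_hat:.3f} +/- {std_err:.3f}")
```

Note that no binning and no log-log regression is involved: the estimate comes straight from the raw observations.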

I know how to maximize a likelihood function, but not how to use maximum likelihood to fit data.
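
One way to read "using maximum likelihood to fit a line" is: pick an error distribution, write the likelihood of the residuals y_i - (a + b*x_i), and maximize it over a, b and the scale. With Gaussian errors that reduces to ordinary least squares; with a heavy-tailed choice it becomes a reweighted fit that down-weights points with large residuals. The sketch below assumes Student-t errors with a fixed 3 degrees of freedom (an assumption, not something stated in the thread) and uses the standard EM / iteratively reweighted least-squares update. All names and the synthetic data are hypothetical.

```python
import math
import random


def fit_line_student_t(x, y, nu=3.0, n_iter=500, tol=1e-12):
    """ML fit of y = a + b*x with Student-t errors (nu fixed), via EM / IRLS."""
    n = len(x)
    a, b, sigma2 = 0.0, 0.0, 1.0
    w = [1.0] * n  # equal weights: the first pass is ordinary least squares
    for _ in range(n_iter):
        # M-step: weighted least squares for intercept and slope
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        b_new = sxy / sxx
        a_new = my - b_new * mx
        r = [yi - a_new - b_new * xi for xi, yi in zip(x, y)]
        # M-step for the squared scale; the floor avoids division by zero
        sigma2 = max(sum(wi * ri * ri for wi, ri in zip(w, r)) / n, 1e-300)
        # E-step: points with large residuals get small weights
        w = [(nu + 1.0) / (nu + ri * ri / sigma2) for ri in r]
        converged = abs(a_new - a) + abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b, math.sqrt(sigma2)


# Synthetic data (assumed values): y = 2 + 0.5*x plus Student-t(3) noise
rng = random.Random(0)
xs = [rng.uniform(0.0, 10.0) for _ in range(2000)]
noise = [rng.gauss(0.0, 1.0) / math.sqrt(rng.gammavariate(1.5, 2.0) / 3.0)
         for _ in xs]  # normal / sqrt(chi2_3 / 3) is Student-t with 3 d.o.f.
ys = [2.0 + 0.5 * xi + ei for xi, ei in zip(xs, noise)]
a_hat, b_hat, scale_hat = fit_line_student_t(xs, ys)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}, scale = {scale_hat:.3f}")
```

This addresses heavy-tailed errors around a straight line. It is a different problem from estimating the exponent of a power-law distribution, which is what the quoted passages warn about.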

