Kernel Density Outlier Threshold

Hi All,

First of all, thanks for showing an interest in this thread. If you're kind enough to share your thoughts, I'd be very grateful.

- We work with financial bank statements for fraud investigation.
- I have converted a bank statement's debit transactions into a scatter plot and a 1D probability density function (PDF) curve, using kernel density estimation (KDE) with an optimised bandwidth and the Epanechnikov kernel.
- It works well: we now have a probability density plot that appears to represent our underlying dataset closely. Hard work done! :)
- It produces a graph with very low density values: the maximum is in the region of 0.002, and the lowest outliers are much lower still (close to 0, if not exactly 0).
- Plugging a value x into the KDE returns the estimated density p(x) at that point.
- I need a definitive answer on whether a given transaction is an outlier or not. How do I choose the threshold for this decision? (e.g. flag x as an outlier if p(x) falls below some cutoff value?) :confused:
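
To make the question concrete, here is a minimal sketch of one common convention, not a definitive answer. Since the KDE returns a density rather than a probability, small values such as 0.002 are expected when the amounts span hundreds of units, and a fixed cutoff like 0.01 will not transfer between datasets. The sketch instead takes the cutoff from the data: evaluate the density at every observed transaction and use a low quantile of those values. The synthetic debit amounts, the bandwidth of 15.0, the 1% level `alpha`, and all function names are assumptions for illustration only.

```python
import random

def epanechnikov_kde(samples, bandwidth):
    """Return a function x -> estimated density, using the Epanechnikov kernel."""
    n = len(samples)

    def density(x):
        total = 0.0
        for s in samples:
            u = (x - s) / bandwidth
            if abs(u) < 1.0:  # the kernel is zero outside |u| < 1
                total += 0.75 * (1.0 - u * u)
        return total / (n * bandwidth)

    return density

def density_threshold(density, samples, alpha=0.01):
    """Density value below which roughly a fraction alpha of the samples fall."""
    scores = sorted(density(s) for s in samples)
    return scores[int(alpha * len(scores))]

def is_outlier(x, density, threshold):
    """Flag x when its estimated density is strictly below the threshold."""
    return density(x) < threshold

# Synthetic stand-in for debit amounts (assumption: not real statement data).
rng = random.Random(0)
debits = [rng.lognormvariate(4.0, 0.6) for _ in range(500)]

bandwidth = 15.0  # hypothetical value; use your own optimised bandwidth
kde = epanechnikov_kde(debits, bandwidth)
threshold = density_threshold(kde, debits, alpha=0.01)

print("threshold:", threshold)
print("5000.00 flagged:", is_outlier(5000.0, kde, threshold))  # far beyond the data
print("55.00 flagged:", is_outlier(55.0, kde, threshold))      # near the bulk of the data
```

The choice of `alpha` is a judgment call rather than something the KDE provides: it fixes roughly what fraction of the historical transactions would themselves be flagged, so it should reflect how many cases can realistically be reviewed by hand.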