Kernel Density Outlier Threshold

#1
Hi All,

First of all, thanks for showing an interest in this thread. If you'd be so kind as to share your thoughts, I'd be very grateful.

Background
- We work with financial bank statements for fraud investigation.
- I have converted a bank statement's debit transactions into a scatter plot and a 1D probability density function curve, using kernel density estimation with an optimised bandwidth and an Epanechnikov kernel.
- It works perfectly: we now have a density plot which I'm confident represents our underlying dataset. Hard work done! :)
- The graph shows very low values. The maximum is in the region of 0.002, and the lowest outliers are much, much lower (close to 0, if not exactly 0).
- Plugging a value x into the KDE gives the estimated density p(x) at that point.
- I need a definitive answer on whether a given transaction is an outlier or not. How do I determine the threshold for this? (e.g. flag x as an outlier if p(x) falls below some cutoff value?)
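
To make the setup concrete, here is a minimal sketch of what I mean. It is plain Python rather than our actual code, and the debit amounts, the bandwidth and the THRESHOLD value are all made-up placeholders for illustration. The names epanechnikov_kde and is_outlier are just hypothetical helpers.

```python
def epanechnikov_kde(data, bandwidth):
    """Return a function x -> estimated density using the Epanechnikov kernel."""
    n = len(data)

    def density(x):
        total = 0.0
        for xi in data:
            u = (x - xi) / bandwidth
            # Epanechnikov kernel: K(u) = 0.75 * (1 - u^2) for |u| <= 1, else 0
            if abs(u) <= 1.0:
                total += 0.75 * (1.0 - u * u)
        return total / (n * bandwidth)

    return density


# Hypothetical debit amounts; real values would come from the bank statement.
debits = [12.5, 20.0, 22.0, 25.0, 30.0, 31.5, 40.0, 45.0, 60.0, 950.0]

# Assumed bandwidth; in practice this is the optimised value.
pdf = epanechnikov_kde(debits, bandwidth=15.0)

# Placeholder cutoff: how to choose this number is exactly the open question.
THRESHOLD = 0.01


def is_outlier(x):
    """Flag x when its estimated density falls below the cutoff."""
    return pdf(x) < THRESHOLD


for amount in (25.0, 950.0):
    print(amount, pdf(amount), is_outlier(amount))
```

Note that the values returned are densities per unit of transaction amount rather than probabilities, so small numbers are expected when the amounts span a wide range. With the Epanechnikov kernel the estimate is also exactly 0 for any x further than one bandwidth from every observation.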
 