Hi,
I'm wondering whether we could apply the standard deviation or the IQR to our daily login times.
I have a dataset with many machine IDs, each with many login records covering an entire month.
I have removed the weekend data and am left with login data for over 1000 machines.
In the dataset, each machine has several logins per day at different times. Taking one machine as an example, if I group by time and count the logins for the entire month, I get logins at various times between 0000hrs and 2359hrs.
Suppose the 0th percentile is 0800hrs and the 100th percentile is 1730hrs, with over 300 logins between those times. I want to use this data to detect abnormal logins that fall more than 2 standard deviations from the mean. Say the mean is around 1300hrs and 2 SD is 4 hours; then the lower bound (mean minus 2 SD) is 0900hrs and the upper bound (mean plus 2 SD) is 1700hrs.
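To make the idea concrete, here is a rough Python sketch of what I have in mind, using only the standard library. The function names and the sample login times are made up for illustration and are not from my real dataset. It converts each login time to minutes since midnight, then flags anything outside mean plus or minus 2 SD, or outside the usual 1.5 x IQR fences. It treats time as a plain number, so it ignores the wrap-around at midnight.

```python
from statistics import mean, quantiles, stdev


def to_minutes(hhmm: str) -> int:
    """Convert an 'HH:MM' string to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def sd_bounds(values, k=2.0):
    """Return (low, high) = mean -/+ k sample standard deviations."""
    m = mean(values)
    s = stdev(values)
    return m - k * s, m + k * s


def iqr_bounds(values, k=1.5):
    """Return (low, high) = Q1 - k*IQR, Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, bounds):
    """Return the values that fall outside the (low, high) bounds."""
    low, high = bounds
    return [v for v in values if v < low or v > high]


# Hypothetical login times for a single machine (not real data).
sample_logins = [
    "08:00", "08:30", "09:00", "09:30", "10:00", "11:00", "12:00",
    "13:00", "14:00", "15:00", "16:00", "17:00", "17:30", "02:00",
]
minutes = [to_minutes(t) for t in sample_logins]

sd_flagged = flag_outliers(minutes, sd_bounds(minutes))
iqr_flagged = flag_outliers(minutes, iqr_bounds(minutes))
print("Flagged by 2 SD rule (minutes since midnight):", sd_flagged)
print("Flagged by 1.5 x IQR rule (minutes since midnight):", iqr_flagged)
```

On this made-up sample the 2 SD rule flags the 02:00 login, while the 1.5 x IQR fences come out wide enough that nothing is flagged, so the two rules do not necessarily agree.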
Does this make any sense?