Truncated Normal Distribution

#1
Hello,

I am working on an assignment to use statistics to evaluate the view count of social media posts.

The data sets that I am working with appear to be fairly normally distributed; however, they are truncated because the view count will never be 0. The X axis represents the number of views that posts receive.

A few questions:

1) What's the proper way to determine the size of bins?

In the first two attachments, I determined the number of bins and their width by looking at the data set and working backwards. In the "Bins Are Too Big" attachment, I took the square root of the number of data points in the View Count 2 data set and rounded it up to get the number of bins, then divided the range (the largest value minus the smallest) by that number to get the bin width. These bins are likely too wide, but I don't know what other calculations to use.
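The square-root calculation described above can be sketched as follows, next to the Freedman-Diaconis rule, which is one commonly used alternative that tends to give narrower bins for spread-out data. This is only a rough sketch: the function names and the `views` list are made up for illustration and are not the actual data set.

```python
import math
import statistics


def sqrt_rule_bins(values):
    """Square-root rule: k = ceil(sqrt(n)), width = (max - min) / k."""
    k = math.ceil(math.sqrt(len(values)))
    width = (max(values) - min(values)) / k
    return k, width


def freedman_diaconis_width(values):
    """Freedman-Diaconis rule: width = 2 * IQR / n^(1/3)."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles
    return 2 * (q3 - q1) / len(values) ** (1 / 3)


# Hypothetical view counts, for illustration only.
views = [120, 340, 410, 455, 500, 520, 560, 610,
         640, 700, 760, 820, 900, 1100, 1500, 2400]

k, width = sqrt_rule_bins(views)
fd_width = freedman_diaconis_width(views)
print("square-root rule:", k, "bins of width", width)
print("Freedman-Diaconis width:", round(fd_width, 1))
```

Because the square-root rule divides the full range by the bin count, a few very large view counts stretch every bin, whereas the Freedman-Diaconis width depends on the interquartile range and is less affected by them.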

2) How do standard deviations work in this scenario? For instance, -2σ would be to the left of the X axis, but views can never be negative.

3) What test statistics and formulas can I/should I use with these data? If I look at all of the view counts together, is that the sample or the population? (No other comparison points are given.)

Any help is greatly appreciated!

Attachments: Post View Count.png, Post View Count 2.PNG, Bins Are Too Big.JPG
 

hlsmith

Omega Contributor
#2
There aren't many hard rules in statistics. Bin size, IMO, should be chosen to best depict the data. Your first attached image seems fine. Also, it seems like you are working with count data, but given the high mean and the smoothness of the distribution, an approximate truncated normal is probably fine. I haven't really worked with the truncated normal, but I would imagine there are formulas for interpreting its distribution given its asymmetric tails.
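For what it's worth, the textbook formulas for a normal distribution truncated from below at a point a are: with alpha = (a - mu) / sigma and lambda = phi(alpha) / (1 - Phi(alpha)), the mean is mu + sigma * lambda and the variance is sigma^2 * (1 + alpha * lambda - lambda^2). A minimal sketch using only the Python standard library is below; the function name and the values mu = 500, sigma = 300 are placeholders I made up, not estimates from the attached data.

```python
from statistics import NormalDist


def left_truncated_normal_moments(mu, sigma, lower=0.0):
    """Mean and SD of Normal(mu, sigma) truncated to values >= lower.

    Also returns the share of the untruncated normal that lies below
    `lower`, i.e. the probability mass that truncation removes.
    """
    std = NormalDist()                   # standard normal
    alpha = (lower - mu) / sigma         # standardized truncation point
    removed = std.cdf(alpha)             # mass below the cutoff
    lam = std.pdf(alpha) / (1.0 - removed)
    mean = mu + sigma * lam
    var = sigma ** 2 * (1.0 + alpha * lam - lam ** 2)
    return mean, var ** 0.5, removed


# Hypothetical parameters, for illustration only.
mean, sd, removed = left_truncated_normal_moments(mu=500.0, sigma=300.0)
print(round(mean, 1), round(sd, 1), round(removed, 4))
```

Truncating at 0 pushes the mean up and shrinks the standard deviation relative to the untruncated normal, so "mean minus 2 SD" is no longer a symmetric cutoff; quantiles of the truncated distribution are usually a better way to describe the lower tail.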