Looking for the best way to use my available data in a predictive model

#1
I have several pieces of data available, but I'm not sure how best to use them in a predictive model. I envision fitting a regression, then plugging the predicted values into a Monte Carlo simulation for optimization. I'm wondering how an experienced statistician would use the available data to make predictions. Here is what I have available and what I'm working with:

  • I have a list of 100-200 items or so, maybe more, and I want to predict a value for each.
  • My data is assumed to be normally distributed.
  • I have a mean and standard deviation for each item, neither of which takes situational data into account.
  • I have several columns of situational data that could contribute to predicting the value I need
  • I have data about the event I'm predicting a value for, which could be considered related to each item at the macro level.
  • Some items may be correlated with other items, and the event data may help identify whether these items are more likely to land at the top or bottom of the historical data list below.
  • I have historical data for prior actual results. If I had a regression that gave me a predicted value of 27, I'd have this type info (each item has its own list):
    • The item's actual value has exceeded a given value x in y% of past results:
    • Value of 12 exceeded 80%
    • Value of 18 exceeded 73%
    • Value of 24 exceeded 52%
    • Value of 30 exceeded 37%
    • Value of 36 exceeded 18%
  • If I could produce a variable giving the success rate of my item in a category (one of my columns) and the failure rate of an opposing factor in that same category, would that be helpful, and what would be the best way to relate them to each other? If this is confusing, think of it this way: my item is successful in a category 33% of the time, while an opposing force fails 27% of the time. Would it make sense to create some columns using Bayes' theorem for the categories where that can be done?
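As a rough illustration of how the exceedance list above could be checked against the normality assumption, here is a minimal sketch. Under a normal model, P(X > x) = 1 - Phi((x - mu) / sigma), so the z-score Phi^{-1}(1 - p) should be roughly linear in the threshold x. The variable names and the least-squares fit are my own choices for illustration, and with only five points this is a sanity check, not a formal test.

```python
from statistics import NormalDist

# Exceedance list quoted in the post: threshold -> share of past results above it.
exceedance = {12: 0.80, 18: 0.73, 24: 0.52, 30: 0.37, 36: 0.18}

# If X is normal, z = Phi^{-1}(1 - p) = (x - mu) / sigma is linear in x.
std_normal = NormalDist()
xs = list(exceedance.keys())
zs = [std_normal.inv_cdf(1.0 - p) for p in exceedance.values()]

# Ordinary least squares for z = intercept + slope * x (done by hand, stdlib only).
n = len(xs)
x_bar = sum(xs) / n
z_bar = sum(zs) / n
slope = sum((x - x_bar) * (z - z_bar) for x, z in zip(xs, zs)) / sum(
    (x - x_bar) ** 2 for x in xs
)
intercept = z_bar - slope * x_bar

# Recover the normal parameters implied by the fitted line.
sigma = 1.0 / slope
mu = -intercept / slope
print(f"implied mean = {mu:.1f}, implied sd = {sigma:.1f}")

# Compare the empirical exceedance rates with those of the implied normal.
implied = NormalDist(mu, sigma)
fitted = {x: 1.0 - implied.cdf(x) for x in xs}
for x, p in exceedance.items():
    print(f"value {x}: empirical {p:.2f}, normal model {fitted[x]:.2f}")
```

If the fitted rates track the empirical ones closely, a per-item normal with that mean and standard deviation is at least not contradicted by the history. Large gaps would suggest skew or heavy tails that a plain normal draw would miss.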
What would you do with this type of data to create the most accurate predictive model? Randomness does exist: items that should definitely excel sometimes do not, and the opposite is true too. Some of this randomness may be explainable by the data and just not identified yet. Would regression with MC be correct? I'm thinking of using the regression, then applying the MC to account for the randomness. How would you envision a model that takes best advantage of the available info?
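To make the "regression, then MC" idea concrete, here is a minimal sketch under stated assumptions. The data is synthetic (made up for illustration, not from my real set), there is only one situational feature, and the simulation draws normal noise with the regression's residual standard deviation. Every name and number in it is hypothetical.

```python
import random
import statistics

random.seed(0)  # reproducible illustration

# Synthetic stand-in data (NOT real): one situational feature per past
# observation, plus the actual value observed. The "true" parameters are invented.
true_intercept, true_slope, true_noise_sd = 20.0, 3.0, 6.0
feature = [random.uniform(0, 5) for _ in range(300)]
actual = [
    true_intercept + true_slope * f + random.gauss(0, true_noise_sd)
    for f in feature
]

# Step 1: ordinary least squares with a single predictor.
f_bar = statistics.fmean(feature)
a_bar = statistics.fmean(actual)
slope = sum((f - f_bar) * (a - a_bar) for f, a in zip(feature, actual)) / sum(
    (f - f_bar) ** 2 for f in feature
)
intercept = a_bar - slope * f_bar

# Residual standard deviation: the spread the regression does not explain.
residuals = [a - (intercept + slope * f) for f, a in zip(feature, actual)]
resid_sd = (sum(r * r for r in residuals) / (len(residuals) - 2)) ** 0.5


def simulate(feature_value, n_draws=10_000):
    """Step 2: Monte Carlo draws centered on the regression prediction."""
    center = intercept + slope * feature_value
    return [random.gauss(center, resid_sd) for _ in range(n_draws)]


draws = simulate(feature_value=2.5)
prob_over_30 = sum(d > 30 for d in draws) / len(draws)
print(f"predicted mean: {intercept + slope * 2.5:.1f}")
print(f"P(value > 30):  {prob_over_30:.2f}")
```

This version treats each item independently and ignores uncertainty in the fitted coefficients. Correlated items would need joint draws (for example from a multivariate normal), which this sketch does not attempt.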