About Sampling

My manager wants me to research the relationship between advance purchase days and fare.
However, it involves many categories, and we have billions of tickets in the database. The categories include:
1. non-stop, 1-stop, 2-stop
2. round trip, one-way
3. class
4. city pair

First question: how many city pairs do I need to include for the ticket data to be representative for this analysis?
Second question: if there is a significant difference between one-way and round trip, between numbers of stops, or between classes, do I need to split the data by each category, run a separate analysis for each, and then summarize the results?
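
To make the second question concrete, here is a minimal Python sketch of the kind of stratified sample I have in mind: group tickets by category, draw a fixed number at random from each group, and compare the groups. The field names (stops, trip, advance_days, fare), the two helper functions, and the toy data are all made up for illustration; they are not our real schema or real fares, and the real database is far too large to load into a list like this.

```python
import random
from collections import defaultdict


def stratified_sample(tickets, keys, per_stratum, seed=0):
    """Draw up to `per_stratum` random tickets from each combination of `keys`."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    strata = defaultdict(list)
    for ticket in tickets:
        strata[tuple(ticket[k] for k in keys)].append(ticket)
    return {
        stratum: rng.sample(rows, min(per_stratum, len(rows)))
        for stratum, rows in strata.items()
    }


def mean_fare_by_stratum(sample):
    """Average fare within each sampled stratum."""
    return {
        stratum: sum(row["fare"] for row in rows) / len(rows)
        for stratum, rows in sample.items()
    }


# Hypothetical toy data: the fare formula is invented, not a real pricing rule.
toy_tickets = [
    {
        "stops": stops,
        "trip": trip,
        "advance_days": days,
        "fare": 100 + 50 * stops + (80 if trip == "round" else 0) - days,
    }
    for stops in (0, 1, 2)
    for trip in ("one-way", "round")
    for days in range(0, 60, 3)
]

sample = stratified_sample(toy_tickets, keys=("stops", "trip"), per_stratum=5)
for stratum, avg in sorted(mean_fare_by_stratum(sample).items()):
    print(stratum, round(avg, 1))
```

If the per-group averages (or the fare vs. advance purchase days trend within each group) look clearly different, that would seem to argue for analyzing the groups separately rather than pooling them, which is what I am unsure about.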