Hello all,
I've not played around with biostatistics since grad school and am interested in doing a study looking at smoking and military fitness scores.
OK, here are the basics: I have data on 2,900 individuals aged 18-59.
667 smokers
2,233 non-smokers
I also have a standardized fitness score for each person.
My first question: is a simple Student's t-test appropriate for determining whether the difference in fitness scores between smokers and non-smokers is statistically significant?
My next question is about sampling. Is it appropriate to perform the t-test using all smokers and all non-smokers, or should the two groups be of similar size? If they should be of similar size, what is the best way to sample from each group?
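To make the question concrete, here is a rough sketch of the calculation I have in mind, in plain Python. The scores are made up for illustration and are not my real data, and the function name `student_t` is just my own. It computes the pooled-variance (Student) two-sample t statistic, and as written the formula takes groups of different sizes.

```python
from math import sqrt
from statistics import mean, variance

def student_t(a, b):
    """Pooled-variance (Student) two-sample t statistic and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Weighted average of the two sample variances
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    # Standard error of the difference in means
    se = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, df

# Made-up scores for illustration only (not my real data)
smokers = [70.0, 75.0, 80.0]
non_smokers = [78.0, 82.0, 85.0, 87.0]

t_stat, df = student_t(smokers, non_smokers)
print(f"t = {t_stat:.3f}, df = {df}")
```

This only gives the test statistic and degrees of freedom; the p-value would still have to come from a t table or a stats package.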
Ideally, I would break both groups into age categories (21-30, 31-40, 41-50) and compare within each one. I think that would give a fairer comparison, since smokers tend to be younger and younger people tend to have higher fitness scores. Hope that makes sense.
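Here is a small sketch of the age breakdown I mean, again in plain Python with made-up records rather than my real data. The record layout `(age, smokes, score)` and the helper names are my own assumptions. Note that the three bands I listed leave out ages 18-20 and 51-59, so they would need widening to cover everyone in my data.

```python
from collections import defaultdict
from statistics import mean

# Age bands from my post; ages outside them are skipped in this sketch
BANDS = [(21, 30), (31, 40), (41, 50)]

def age_band(age):
    """Return a label such as '21-30', or None if the age falls outside every band."""
    for lo, hi in BANDS:
        if lo <= age <= hi:
            return f"{lo}-{hi}"
    return None

def mean_score_by_band(records):
    """records: iterable of (age, smokes, score). Returns {(band, smokes): mean score}."""
    groups = defaultdict(list)
    for age, smokes, score in records:
        band = age_band(age)
        if band is not None:
            groups[(band, smokes)].append(score)
    return {key: mean(scores) for key, scores in groups.items()}

# Made-up records for illustration only (not my real data)
records = [
    (22, True, 80.0),
    (25, False, 90.0),
    (28, False, 86.0),
    (35, True, 70.0),
    (45, False, 75.0),
    (19, True, 85.0),  # outside the listed bands, so skipped
]

means = mean_score_by_band(records)
for (band, smokes), avg in sorted(means.items()):
    print(band, "smoker" if smokes else "non-smoker", avg)
```

The idea would then be to compare smokers and non-smokers within each band instead of across the whole group.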
Any help would be much appreciated.