
Sampling - My Comments

There is one major problem with sample selection at the moment. The rule at the 3 major markets (as above) is that the data collectors must try to ensure that the sample is representative of the total population in terms of grade and sex. As a result, on many market-days there is insufficient data for low and high weights (and therefore also for low and high grades), especially for cows and bulls, so it is difficult or impossible to get accurate unit prices for minority sexes and grades. I therefore strongly recommend that, with immediate effect, sampling techniques be radically changed. Sexes should be sampled as nearly as possible in equal thirds: steers, cows and bulls. The extreme weights (which would also be extreme grades) should be actively sought out for sampling: about 25% of the sample should be light animals and 25% heavy animals, with about 50% in the middle weight range. If pricing for light animals proves to be a problem (perhaps because they tend more to be sold as a 'mob'), then the requirement to record light animals could be dropped; this applies more to the log-log data processing system proposed in this report, of which more below.

The question of total net sample size is important - in theory a greater sample size (up to some point) should ensure greater accuracy; in practice of course every increase in sample size involves more work in weighing, grading, sexing, and pricing (but refer to the self-weighing concept discussed above).

The sample size required for accuracy depends on sampling philosophy and also on the data processing technique used. If my recommendation on GSC's to be sampled is adopted, it should result in smaller sample sizes being required. If the present primitive types of data processing continue to be used, relatively large samples will continue to be required; this is because the present techniques effectively treat the sample for a market-day as 15 separate GSC samples. But if it is possible to treat the sample for a market-day as only 3 separate sex samples, then it should be possible either to greatly increase accuracy for the same sample size or to significantly reduce total sample size for little or no decrease in accuracy.

If we do continue to use the present primitive data processing techniques, but do change from population-representative sampling to GSC-representative sampling, then for a total net sample of 45 animals, there would still be only 3 data points for each of the 15 GSC's; this is really not enough to ensure a good level of accuracy.
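
The arithmetic behind the 15-group versus 3-group comparison can be sketched as follows. This is only an illustration: the function name is my own, and it assumes the sample is spread evenly across the groups, as GSC-representative sampling intends.

```python
def points_per_group(total_sample: int, n_groups: int) -> float:
    """Average number of data points per group for an evenly spread sample."""
    return total_sample / n_groups


TOTAL_SAMPLE = 45  # net animals sampled on one market-day (figure from the text)

# Present processing: each of the 15 GSC's is treated as its own sample.
per_gsc = points_per_group(TOTAL_SAMPLE, 15)

# Alternative: the market-day sample is treated as 3 separate sex samples.
per_sex = points_per_group(TOTAL_SAMPLE, 3)

print(f"per GSC: {per_gsc:.0f} points, per sex: {per_sex:.0f} points")
```

The same 45 animals give five times as many data points per estimate when only the 3 sexes are treated separately.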

Given the physical constraints of labour, time and logistics for a market with a single weighing scale, this 45:15:3 situation makes it quite important that, if at all possible, some good technique for processing future market price data be found. I believe the log-log best-fit technique outlined below is such a technique.
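
As a rough illustration of what a log-log best fit involves, the sketch below fits a straight line through log(weight) and log(price) by ordinary least squares. It is a sketch under assumptions, not the procedure specified in this report: the model form log(price) = a + b * log(weight), the function names, and the data (synthetic, generated from an exact power law purely to show the mechanics) are all my own.

```python
import math


def fit_loglog(weights, prices):
    """Least-squares fit of log(price) = a + b * log(weight) (assumed model)."""
    xs = [math.log(w) for w in weights]
    ys = [math.log(p) for p in prices]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx           # slope in log-log space
    a = mean_y - b * mean_x  # intercept in log-log space
    return a, b


def predict_price(a, b, weight):
    """Price implied by the fitted line at a given weight."""
    return math.exp(a + b * math.log(weight))


# Synthetic, made-up data for one sex: price = 2.0 * weight ** 1.1 exactly.
weights = [150, 200, 250, 300, 350, 400, 450, 500]
prices = [2.0 * w ** 1.1 for w in weights]

a, b = fit_loglog(weights, prices)
print(f"a = {a:.4f}, b = {b:.4f}, price at 275 = {predict_price(a, b, 275):.1f}")
```

A fit of this kind uses every animal of a given sex to estimate one curve, which is why sampling the light and heavy extremes matters: a wider spread of log(weight) makes the fitted slope steadier.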

Statistical information on trends in cattle numbers, weights, grades, sexes, ages, etc should be obtained by some other means, if indeed that information is required. Certainly the available trend information from the MIS program in Dar, Moshi and Arusha appears hardly to have been used in any meaningful sense to date; so why collect it?

To get specific on sample size, I have a feeling that a net sample size of 24 animals (8 steers, 8 cows, 8 bulls) with 6 light animals, 6 heavy animals and 12 middle-weight animals, will produce the kind of accuracy we need. If as we amass data, we find that there is no longer a need to make separate curve fits for each of the 3 sexes, then the sample size can come down yet again.
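
The quotas for a 24-animal sample can be worked out as in the sketch below. The sex and weight totals are those given above; the further split of each sex into light, middle and heavy animals is my own assumption (the same 25/50/25 shares applied within each sex), and the function and variable names are illustrative only.

```python
from fractions import Fraction


def allocate(total, shares):
    """Split `total` animals into whole-number quotas according to `shares`."""
    quotas = {}
    for name, share in shares.items():
        quota = Fraction(total) * share
        if quota.denominator != 1:
            raise ValueError(f"{total} animals do not split evenly for {name}")
        quotas[name] = int(quota)
    return quotas


SEX_SHARES = {"steer": Fraction(1, 3), "cow": Fraction(1, 3), "bull": Fraction(1, 3)}
WEIGHT_SHARES = {"light": Fraction(1, 4), "middle": Fraction(1, 2), "heavy": Fraction(1, 4)}

by_sex = allocate(24, SEX_SHARES)        # 8 steers, 8 cows, 8 bulls
by_weight = allocate(24, WEIGHT_SHARES)  # 6 light, 12 middle, 6 heavy

# Assumption: the weight shares are applied within each sex as well.
per_sex_weight = {sex: allocate(n, WEIGHT_SHARES) for sex, n in by_sex.items()}

for sex, quotas in per_sex_weight.items():
    print(sex, quotas)
```

Under that assumption each sex contributes 2 light, 4 middle-weight and 2 heavy animals, which gives data collectors a simple checklist for the market-day.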