
Tuesday, November 8, 2011

Sampling in Social Science Research



Sampling is the procedure a researcher uses to gather people, places, or things to study. It is the process of selecting units (e.g., people, organizations) from a population of interest so that by studying the sample we may fairly generalize our results back to the population from which they were chosen. The goal is to create a sample that correctly reflects the makeup of the whole population.
Researchers usually cannot make direct observations of every individual in the population they are studying. Instead, they collect data from a subset of individuals – a sample – and use those observations to make inferences about the entire population. Ideally, the sample corresponds to the larger population on the characteristic(s) of interest. In that case, the researcher's conclusions from the sample are probably applicable to the entire population. This type of correspondence between the sample and the larger population is most important when a researcher wants to know what proportion of the population has a certain characteristic – like a particular opinion or a demographic feature. Public opinion polls that try to describe the percentage of the population that plans to vote for a particular candidate, for example, require a sample that is highly representative of the population.
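Two general approaches to sampling are used in social science research. With probability sampling, all elements (e.g., persons, households) in the population have some opportunity of being included in the sample, and the mathematical probability that any one of them will be selected can be calculated. With non-probability sampling, elements are chosen by judgment, quota, purpose, or convenience, so the probability that any one of them will be selected is unknown.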
Probability sampling (Representative samples)
Probability samples are selected in such a way as to be representative of the population. They provide the most valid or credible results because they reflect the characteristics of the population from which they are selected (e.g., residents of a particular community, students at an elementary school, etc.). Four types of probability samples are described here: simple random, systematic, stratified, and cluster.
Random Sampling
The first statistical sampling method is simple random sampling. In this method, each item in the population has the same probability of being selected as part of the sample as any other item. The term random has a very precise meaning: each individual in the population of interest has an equal likelihood of selection. This is a very strict meaning -- you can't just collect responses on the street and have a random sample.
The assumption of an equal chance of selection means that sources such as a telephone book or voter registration lists are not adequate for providing a random sample of a community. In both these cases there will be a number of residents whose names are not listed. Telephone surveys get around this problem by random-digit dialing -- but that assumes that everyone in the population has a telephone. The key to random selection is that there is no bias involved in the selection of the sample. Any variation between the sample characteristics and the population characteristics is only a matter of chance.
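As a rough illustration of simple random sampling, here is a minimal Python sketch. The sampling frame (resident IDs 1 to 1000), the function name simple_random_sample, and the seed are assumptions made up for the example; the sketch only works if the list really contains every member of the population.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size units without replacement; every unit is equally likely."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(list(population), sample_size)

# Hypothetical sampling frame: resident IDs 1..1000 (an assumption for this example).
residents = range(1, 1001)
chosen = simple_random_sample(residents, 100, seed=42)
print(len(chosen))  # 100 distinct residents
```

Because the draw is made from the complete list, any difference between the sample and the population is a matter of chance rather than of how the units were picked.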
Systematic Sampling
Systematic sampling is another statistical sampling method. In this method, every kth element from the list is selected, starting with an element randomly selected from the first k elements, where k is the population size divided by the desired sample size. For example, if the population has 1000 elements and a sample size of 100 is needed, then k would be 1000/100 = 10. If number 7 is randomly selected from the first ten elements on the list, the sample would continue down the list selecting the 7th element from each group of ten elements (the 7th, 17th, 27th, and so on). Care must be taken when using systematic sampling to ensure that the original population list has not been ordered in a way that introduces any non-random factors into the sampling. An example of systematic sampling would be if the auditor of the acceptance test process selected the 14th acceptance test case out of the first 20 test cases in a random list of all acceptance test cases to retest during the audit process. The auditor would then keep adding twenty and select the 34th test case, 54th test case, 74th test case and so on to retest until the end of the list is reached.
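The 1000-element example can be sketched in a few lines of Python. The function name systematic_sample and the numbered population list are assumptions for illustration; the interval is computed as the population size divided by the sample size, and the starting point is drawn at random from the first interval.

```python
import random

def systematic_sample(items, sample_size, seed=None):
    """Pick a random start within the first interval, then take every k-th item."""
    if not 0 < sample_size <= len(items):
        raise ValueError("sample_size must be between 1 and len(items)")
    rng = random.Random(seed)
    k = len(items) // sample_size   # sampling interval, e.g. 1000 // 100 = 10
    start = rng.randrange(k)        # 0-based position within the first k items
    return items[start::k][:sample_size]

# Hypothetical ordered list of 1000 population elements, numbered 1..1000.
population = list(range(1, 1001))
sample = systematic_sample(population, 100, seed=7)
print(sample[:3])  # three elements spaced exactly 10 apart
```

If the list is sorted in a way that repeats every ten entries, this sketch would pick up that pattern, which is the ordering problem the paragraph above warns about.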
Stratified Sampling
The statistical sampling method called stratified sampling is used when each subgroup within the population needs to be represented in the sample. The first step in stratified sampling is to divide the population into subgroups (strata) based on mutually exclusive criteria. Random or systematic samples are then taken from each subgroup. The sampling fraction for each subgroup may be taken in the same proportion as the subgroup has in the population. For example, the person conducting a customer satisfaction survey might select random customers from each customer type in proportion to the number of customers of that type in the population.
A stratified sample is a mini-reproduction of the population. Before sampling, the population is divided by characteristics of importance for the research, for example gender, social class, education level, or religion. Then the population is randomly sampled within each category or stratum. If 38% of the population is college-educated, then 38% of the sample is randomly selected from the college-educated population.
Stratified samples are as good as or better than random samples, but they require a fairly detailed advance knowledge of the population characteristics, and therefore are more difficult to construct.
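A minimal sketch of proportional stratified sampling, using the 38% college-educated figure from above. The population of 1000 people, the stratum labels, and the function name stratified_sample are assumptions for illustration. Each stratum's share is rounded, so with awkward proportions the total can differ from the requested size by a unit or two.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, sample_size, seed=None):
    """Sample each stratum at random, in proportion to its share of the population."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        # Proportional allocation; rounding may shift the total slightly.
        n = round(sample_size * len(members) / len(units))
        sample.extend(rng.sample(members, n))
    return sample

# Hypothetical population of 1000 people, 38% of them college-educated.
people = [(i, "college" if i < 380 else "no_college") for i in range(1000)]
sample = stratified_sample(people, lambda person: person[1], 100, seed=3)
college = sum(1 for _, level in sample if level == "college")
print(len(sample), college)  # 100 38
```

Note that the code needs to know each person's stratum before sampling, which is the advance knowledge of the population that makes stratified samples harder to construct.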
Cluster Sampling
The fourth statistical sampling method is called cluster sampling, also called block sampling. In cluster sampling, the population that is being sampled is divided into groups called clusters. Instead of these subgroups being homogeneous based on selected criteria as in stratified sampling, each cluster is as heterogeneous as possible, so that it resembles the population as a whole. A random sample is then taken from within one or more selected clusters. For example, if an organization has 30 small projects currently under development, an auditor looking for compliance to the coding standard might use cluster sampling to randomly select 4 of those projects as representatives for the audit and then randomly sample code modules for auditing from just those 4 projects. Cluster sampling can tell us a lot about the selected clusters, but unless the clusters are selected randomly and a lot of clusters are sampled, generalizations cannot always be made about the entire population. For example, random sampling from all the source code modules written during the previous week, or all the modules in a particular subsystem, or all modules written in a particular language may cause biases to enter the sample that would not allow statistically valid generalization.
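The audit example can be sketched as a two-stage draw: first pick clusters at random, then pick units at random inside each chosen cluster. The project and module names, the 20 modules per project, the 5 modules audited per project, and the function name cluster_sample are all assumptions for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Randomly choose clusters, then randomly sample units within each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1: pick clusters
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen                             # stage 2: pick units
    }

# Hypothetical organization: 30 projects with 20 code modules each.
projects = {
    f"project_{i:02d}": [f"p{i:02d}_module_{j}" for j in range(20)]
    for i in range(1, 31)
}
audit = cluster_sample(projects, 4, 5, seed=1)
print(len(audit))  # 4 projects, each with 5 modules to audit
```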
Non-probability Samples (Non-representative samples)
As they are not truly representative, non-probability samples are less desirable than probability samples. However, a researcher may not be able to obtain a random or stratified sample, or it may be too expensive. A researcher may not care about generalizing to a larger population. Non-probability samples are limited with regard to generalization. Because they do not truly represent a population, we cannot make valid inferences about the larger group from which they are drawn. Validity can be increased by approximating random selection as much as possible, and making every attempt to avoid introducing bias into sample selection.
Judgmental Sampling
An important non-statistical sampling method is judgmental sampling. In judgmental sampling, the person doing the sampling uses his/her knowledge or experience to select the items to be sampled. For example, based on experience, an auditor may know which types of items are more apt to have non-conformances or which types of items have had problems in the past or which items are a higher risk to the organization.
Quota sample
The defining characteristic of a quota sample is that the researcher deliberately sets the proportions of levels or strata within the sample. This is generally done to ensure the inclusion of a particular segment of the population. The proportions may or may not differ dramatically from the actual proportion in the population. The researcher sets a quota, independent of population characteristics.
Example: A researcher is interested in the attitudes of members of different religions towards the death penalty. In Iowa a random sample might miss Muslims (because there are not many in that state). To be sure of their inclusion, a researcher could set a quota of 3% Muslim for the sample. However, the sample will no longer be representative of the actual proportions in the population. This may limit generalizing to the state population. But the quota will guarantee that the views of Muslims are represented in the survey.
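One way to picture quota filling is the sketch below. It assumes respondents are taken in the order they are encountered until each quota is full, which is a simplification chosen for the example and not a procedure described above. The respondent stream (1 in 100 Muslim), the sample of 100, and the function name quota_sample are also assumptions.

```python
def quota_sample(respondents, group_of, quotas):
    """Accept respondents in arrival order until every group's quota is filled."""
    counts = {group: 0 for group in quotas}
    sample = []
    for person in respondents:
        group = group_of(person)
        if group in counts and counts[group] < quotas[group]:
            sample.append(person)
            counts[group] += 1
        if counts == quotas:  # all quotas met, stop recruiting
            break
    return sample

# Hypothetical stream of 1000 respondents in which every 100th person is Muslim.
stream = [(i, "Muslim" if i % 100 == 0 else "other") for i in range(1000)]
sample = quota_sample(stream, lambda person: person[1], {"Muslim": 3, "other": 97})
muslims = sum(1 for _, religion in sample if religion == "Muslim")
print(len(sample), muslims)  # 100 3
```

The sample ends up 3% Muslim even though the stream is only 1% Muslim, which shows both the guarantee of inclusion and the loss of representativeness.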
Purposive sample
A purposive sample is a non-representative subset of some larger population, and is constructed to serve a very specific need or purpose. A researcher may have a specific group in mind, such as high level business executives. It may not be possible to specify the population -- they would not all be known, and access will be difficult. The researcher will attempt to zero in on the target group, interviewing whoever is available.
A subset of a purposive sample is a snowball sample -- so named because one picks up the sample along the way, analogous to a snowball accumulating snow. A snowball sample is achieved by asking a participant to suggest someone else who might be willing or appropriate for the study. Snowball samples are particularly useful in hard-to-track populations, such as truants, drug users, etc.
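The referral process can be sketched as walking a chain of suggestions from a starting participant. The referral table, the participant labels, and the function name snowball_sample are assumptions for illustration; in practice the referrals are only discovered as each interview happens.

```python
from collections import deque

def snowball_sample(seeds, referrals, max_size):
    """Start from seed participants and follow their referrals until max_size."""
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:  # do not recruit the same person twice
                seen.add(contact)
                queue.append(contact)
    return sample

# Hypothetical referral table: who each participant suggests.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": ["A"]}
print(snowball_sample(["A"], referrals, 4))  # ['A', 'B', 'C', 'D']
```

Everyone in the sample is connected to the first participant, which is why such a sample cannot be treated as random.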
Convenience sample
A convenience sample is a matter of taking what you can get. It is an accidental sample. Although selection may be unguided, it probably is not random, using the correct definition of everyone in the population having an equal chance of being selected. Volunteers would constitute a convenience sample.

Haphazard Sampling
There are also other types of sampling that, while non-statistical (information about the entire population cannot be extrapolated from the sample), may still provide useful information. In haphazard sampling, samples are selected based on convenience but preferably should still be chosen as randomly as possible. For example, the auditor may ask to see a list of all of the source code modules, and then close his eyes and point at the list to select a module to audit. The auditor could also grab one of the listing binders off the shelf, flip through it and “randomly” stop on a module to audit. Haphazard sampling is typically quicker and uses smaller sample sizes than other sampling techniques. The main disadvantage of haphazard sampling is that since it is not statistically based, generalizations about the total population should be made with extreme caution.
Importance of Sampling
1. Researchers usually cannot make direct observations of every individual in the population they are studying, so sampling makes the study feasible.
2. Observations from a sample can be used to make inferences about the entire population, which relieves the researcher of the burden of collecting and handling excessive data.
3. Sampling makes research more practical.
4. A good research conclusion can be drawn only if there is a good sampling procedure.
Terms associated with sampling
Population: Population refers to the larger group from which the sample is taken. Before gathering your sample, it's important to find out as much as possible about your population. You should at least know some of the overall demographics (age, sex, class, etc.) of your population. This information will be needed later when you get to the data analysis part of your research, but it's also important in helping you decide sample size. The greater the diversity and differences that exist in your population, the larger your sample size should be. Capturing the variability in your population allows for more variation in your sample, and since many statistical tests operate on the principles of variation, you'll be making sure the statistics used later can do their powerful stuff.
Sampling Frame: After you've learned all the theoretically important things about your population, you then have to obtain a list or contact information on those who are accessible or can be contacted. This procedure for listing all the accessible members of your population is called the sampling frame. If you were planning on doing a phone survey, for example, the phone book would be your sampling frame. Make sure your sampling frame is appropriate for the population you want to study. In this case, the Census Dept. says that 93% of us have a phone, so that's not too bad, but you have to decide if any of the unique characteristics of people you're interested in studying are lost by selecting a restrictive sampling frame. The term refers to the procedure rather than the list. It's important for researchers to discuss their sampling frame because that's what ensures that systematic error, or bias, hasn't entered into your study. 
