Stratified random sampling is a sampling method that involves dividing a population into smaller subgroups known as strata. In stratified random sampling, or stratification, strata are formed based on attributes or characteristics shared by members, such as income or educational level.
Stratified random sampling is also called proportional random sampling or quota random sampling.
How Stratified Random Sampling Works
When analyzing or researching a group of entities with similar characteristics, a researcher may find that the population is too large to study in full. To save time and money, an analyst can take a more feasible approach and select a small group from the population. This small group is called the sample: a subset of the population that is used to represent the entire population. A sample can be selected in several ways, one of which is stratified random sampling.
Stratified random sampling involves dividing the entire population into homogeneous groups called strata (plural of stratum). Random samples are then selected from each stratum. For example, consider an academic researcher who wants to know the number of MBA students in 2007 who received a job offer within three months of graduation.
The researcher will soon discover that there were nearly 200,000 MBA graduates that year. The researcher could take a simple random sample of 50,000 graduates and conduct a survey. Better yet, the researcher could divide the population into strata and take a random sample from each stratum. To do this, the researcher would create population groups based on gender, age range, race, country of nationality, and career path. A random sample is taken from each stratum in a number proportional to the size of the stratum relative to the population. These subsets are then pooled to form the overall random sample.
Example of Stratified Random Sampling
Suppose a research team wants to determine the grade point average (GPA) of college students in the U.S. Because collecting data from all 21 million college students is impractical, the team decides to take a random sample of 4,000 students.
Now suppose the team examines the attributes of the sample participants and wonders whether average grades differ by the students’ majors. Suppose 560 students in the sample are English majors, 1,135 are science majors, 800 are computer science majors, 1,090 are engineering majors, and 415 are mathematics majors. The team wants to use a proportional stratified random sample, in which each stratum’s share of the sample matches its share of the population.
Suppose the team researches the demographics of college students in the U.S. and finds the percentage of students in each major: 12% major in English, 28% in science, 24% in computer science, 21% in engineering, and 15% in mathematics. Thus, five strata are created for the stratified random sampling process.
Next, the team checks whether each stratum’s share of the sample matches its share of the population, and discovers that the proportions are not equal. For example, English majors make up 14% of the sample (560 of 4,000) but only 12% of the population. The team therefore needs to resample 4,000 students from the population, randomly selecting 480 English students, 1,120 science students, 960 computer science students, 840 engineering students, and 600 mathematics students.
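The comparison between the first sample and the population shares can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the example; the variable names are hypothetical, and the figures are the ones quoted in the text.

```python
# Population shares by major, as quoted in the example.
population_share = {
    "English": 0.12,
    "Science": 0.28,
    "Computer science": 0.24,
    "Engineering": 0.21,
    "Mathematics": 0.15,
}

# Majors observed in the first (unstratified) random sample.
first_sample = {
    "English": 560,
    "Science": 1135,
    "Computer science": 800,
    "Engineering": 1090,
    "Mathematics": 415,
}

total = sum(first_sample.values())  # 4,000 students

# Target count per stratum under proportional allocation.
target = {major: round(share * total) for major, share in population_share.items()}

for major in population_share:
    gap = first_sample[major] - target[major]
    print(f"{major:17s} observed={first_sample[major]:5d} target={target[major]:5d} gap={gap:+d}")
```

A non-zero gap for any major shows that the first sample is not proportional, which is why the team resamples using the target counts.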
With this, the team has a proportional stratified random sample of college students, which better represents students’ majors in the U.S. The researchers can then focus on specific strata, look at the different fields of study of U.S. college students, and observe the differences in grade point averages.
Simple Random Samples vs. Stratified Random Samples
Simple random samples and stratified random samples are both statistical measurement tools. A simple random sample is used to represent the entire data population. A stratified random sample divides the population into smaller groups, or strata, based on shared characteristics.
The simple random sample is typically used when too little information about the data population is available, when the data population has too many differences to divide it into several subsets, or when there is only one distinct feature among the data population.
For example, a candy company may want to study the buying habits of its customers to determine the future of its product line. If it has 10,000 customers, it can choose 100 of them as a random sample and then apply what it learns from those 100 customers to the rest of its customer base. Unlike stratification, simple random sampling selects the 100 members purely at random, without taking their individual characteristics into account.
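As a rough sketch of the candy company example, a simple random sample can be drawn with Python’s standard library. The customer IDs are hypothetical placeholders, and the fixed seed is there only to make the sketch reproducible.

```python
import random

rng = random.Random(42)  # fixed seed, used only for reproducibility

# Hypothetical IDs standing in for the company's 10,000 customers.
customer_ids = list(range(1, 10_001))

# Every customer has the same chance of selection; no characteristics are used.
sample_ids = rng.sample(customer_ids, k=100)  # sampling without replacement

print(len(sample_ids), "customers selected")
```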
Proportional and Disproportionate Stratification
Stratified random sampling ensures that each subgroup of a given population is adequately represented within the total sample population of a research study. Stratification can be proportional or disproportionate. In a proportional stratified method, the sample size of each stratum is proportional to the population size of the stratum.
For example, if the researcher wants a sample of 50,000 graduates stratified by age, the proportional stratified random sample is obtained using this formula: (sample size / population size) x stratum size. The following table assumes a population size of 180,000 MBA graduates per year.
Age group    Number of people in the stratum    Sample size of the strata
24 to 28     90,000                             25,000
29 to 33     60,000                             16,667
34 to 37     30,000                             8,333
Total        180,000                            50,000
The strata sample size for MBA graduates aged 24 to 28 is calculated as (50,000/180,000) x 90,000 = 25,000. The same method is used for the other age groups. Now that the sample size of each stratum is known, the researcher can perform simple random sampling within each stratum to select the survey participants. That is, 25,000 graduates will be randomly selected from the 24-to-28 age group, 16,667 graduates will be randomly selected from the 29-to-33 age group, and so on.
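The formula can be applied to every age group at once. The sketch below is illustrative only: the 90,000 stratum size and the 180,000 and 50,000 totals come from the text, while the 60,000 and 30,000 stratum sizes are assumptions chosen to be consistent with the 16,667 figure and the 180,000 total. The function name `proportional_allocation` is hypothetical.

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Apply (sample size / population size) x stratum size to each stratum."""
    population_size = sum(stratum_sizes.values())
    return {
        name: round(sample_size * size / population_size)
        for name, size in stratum_sizes.items()
    }

# 90,000 is given in the text; the other two sizes are assumed (see lead-in).
stratum_sizes = {"24-28": 90_000, "29-33": 60_000, "34-37": 30_000}

allocation = proportional_allocation(stratum_sizes, sample_size=50_000)
for age_group, n in allocation.items():
    print(age_group, n)
```

Because each result is rounded to a whole person, the allocations may not always add up exactly to the requested sample size; with these figures they do.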
In a disproportionate stratified sample, the sample size of each stratum is not proportional to the stratum’s size in the population. For example, the researcher may decide to sample half of the graduates in the 34-37 age group and one-third of the graduates in the 29-33 age group.
It is important to note that a person cannot belong to several strata. Each entity should fit into only one stratum. Overlapping subgroups mean that some people are more likely to be selected for the survey, which undermines stratified sampling as a type of probabilistic sampling.
Steps to Define a Stratified Random Sample
Step 1: Define the population and subgroups
Like other methods of probabilistic sampling, you should start by clearly defining the population from which the sample will be taken.
Choosing features for stratification
You should also choose the feature that you will use to divide your groups. This choice is very important: since each member of the population can only be placed in one subgroup, the classification of each subject in each subgroup should be clear and obvious.
Stratification by multiple characteristics
You can choose to stratify by multiple different features at once, as long as you can clearly assign each subject to exactly one subgroup. In this case, the total number of subgroups is the product of the number of strata for each characteristic.
For example, if you stratified by both race and sex, using four groups for the first and two for the second, you would have 4 x 2 = 8 groups in total.
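The combined subgroups are simply every pairing of one group from each characteristic. A minimal sketch, using placeholder group labels that are not taken from the text:

```python
from itertools import product

# Placeholder labels; a real study would use its own category definitions.
race_groups = ["Group A", "Group B", "Group C", "Group D"]
sex_groups = ["Female", "Male"]

# Each combined stratum is one (race, sex) pair.
combined_strata = list(product(race_groups, sex_groups))

print(len(combined_strata))  # 4 x 2 = 8
```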
Step 2: Separate the population into strata
Next, collect a list of all members of the population and assign each of them to a stratum.
You need to make sure that the strata are mutually exclusive (there is no overlap between them) and collectively exhaustive (together they contain the entire population).
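One way to enforce both conditions is to assign strata with a single rule that returns exactly one label per person, and then check that nobody was left out. The sketch below assumes age bands like those in the MBA example; the `assign_stratum` function and the tiny population list are hypothetical.

```python
from collections import defaultdict

def assign_stratum(age):
    """Return the single age band a person belongs to, or None if no band fits."""
    if 24 <= age <= 28:
        return "24-28"
    if 29 <= age <= 33:
        return "29-33"
    if 34 <= age <= 37:
        return "34-37"
    return None

# Hypothetical population list; a real study would load the full sampling frame.
population = [
    {"id": 1, "age": 25},
    {"id": 2, "age": 31},
    {"id": 3, "age": 36},
    {"id": 4, "age": 28},
]

strata = defaultdict(list)
unassigned = []
for person in population:
    label = assign_stratum(person["age"])
    if label is None:
        unassigned.append(person["id"])
    else:
        strata[label].append(person["id"])

# Exhaustive: nobody is left without a stratum.
assert not unassigned, f"no stratum for ids {unassigned}"

# Mutually exclusive: each id appears exactly once across all strata.
all_ids = [pid for ids in strata.values() for pid in ids]
assert len(all_ids) == len(set(all_ids)) == len(population)
```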
Step 3: Decide on the sample size for each stratum
First, you need to decide whether you want the sample to be proportional or disproportionate.
Proportional versus disproportionate sampling
In proportional sampling, the sample size of each stratum is proportional to the subgroup’s share of the population as a whole.
Subgroups that are less represented in the population as a whole (e.g. rural populations, which constitute a smaller part of the population in most countries) will also be less represented in the sample.
In disproportionate sampling, the sample sizes of each stratum are disproportionate to their representation in the population as a whole.
You can choose this method if you want to study a particularly underrepresented subgroup whose sample size would be too low to allow you to draw statistical conclusions.
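Because a disproportionate sample over-represents some strata, estimates for the whole population are usually computed by weighting each stratum’s result by its population share rather than by pooling all responses. A minimal sketch with made-up numbers (none of them come from the text):

```python
# Hypothetical population shares and sample results.
population_share = {"urban": 0.85, "rural": 0.15}
sample_size = {"urban": 300, "rural": 300}    # rural is deliberately oversampled
sample_mean = {"urban": 72.0, "rural": 64.0}  # e.g. a mean score per stratum

# Pooling all responses treats the sample as if it were proportional.
pooled_mean = sum(sample_size[s] * sample_mean[s] for s in sample_size) / sum(sample_size.values())

# Weighting by population share restores each stratum's true importance.
weighted_mean = sum(population_share[s] * sample_mean[s] for s in population_share)

print(f"pooled={pooled_mean:.1f} weighted={weighted_mean:.1f}")
```

In this made-up case the pooled mean is pulled toward the oversampled rural stratum, while the weighted mean reflects the 85/15 population split.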
You can then decide the total sample size. It must be large enough to ensure that statistical conclusions can be drawn about each subgroup.
If you know the desired margin of error and confidence level, as well as the estimated size and standard deviation of the population you work with, you can use a sample size calculator to estimate the necessary figures.
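Sample size calculators differ in the formulas they use. As one common example, the sketch below uses the standard formula for estimating a mean, n0 = (z x standard deviation / margin of error)^2, with an optional finite population correction. The function name and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(std_dev, margin_of_error, confidence=0.95, population_size=None):
    """Approximate sample size needed to estimate a mean within a margin of error."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z * std_dev / margin_of_error) ** 2
    if population_size is not None:
        # Finite population correction for small populations.
        n = n / (1 + (n - 1) / population_size)
    return math.ceil(n)

# Hypothetical inputs: standard deviation 15, margin of error 2, 95% confidence.
print(sample_size_for_mean(15, 2))
print(sample_size_for_mean(15, 2, population_size=10_000))
```

In a stratified design this calculation would typically be repeated for each subgroup you want to draw conclusions about.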
Step 4: Take a random sample from each stratum
Finally, you should use another probabilistic sampling method, such as simple random sampling or systematic sampling, to take a sample within each stratum.
If done correctly, the randomness inherent in these methods will allow you to obtain a representative sample of that particular subgroup.
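Putting the steps together, the sketch below draws a simple random sample within each stratum. It assumes the strata and the per-stratum sample sizes have already been decided; the `stratified_sample` function and the urban/rural sampling frame are hypothetical.

```python
import random

def stratified_sample(strata, allocation, seed=None):
    """Draw a simple random sample of allocation[name] members from each stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        k = allocation[name]
        if k > len(members):
            raise ValueError(f"stratum {name!r} has only {len(members)} members")
        sample[name] = rng.sample(members, k)  # without replacement
    return sample

# Hypothetical sampling frame: member IDs already grouped by stratum.
strata = {
    "urban": [f"U{i}" for i in range(800)],
    "rural": [f"R{i}" for i in range(200)],
}
allocation = {"urban": 80, "rural": 20}  # proportional: 10% of each stratum

sample = stratified_sample(strata, allocation, seed=1)
print({name: len(ids) for name, ids in sample.items()})  # {'urban': 80, 'rural': 20}
```

Systematic sampling could be substituted for `rng.sample` inside the loop without changing the overall structure.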
Advantages of Stratified Random Sampling
The main advantage of stratified random sampling is that it captures the key characteristics of the population in the sample. Like a weighted average, this sampling method produces characteristics in the sample that are proportional to the total population. Stratified random sampling works well for populations with a variety of attributes, but is ineffective if subgroups cannot be formed.
Stratification provides lower estimation error and greater accuracy than the simple random sampling method. The greater the differences between the strata, the greater the accuracy gain.
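The accuracy gain can be illustrated with a small calculation. The sketch below assumes proportional allocation, ignores the finite population correction, and uses made-up stratum weights, means, and standard deviations. It relies on the fact that the total population variance splits into a within-strata part and a between-strata part; proportional stratification removes the between-strata part from the variance of the estimated mean.

```python
# Hypothetical strata: population shares, means, and standard deviations.
weights = [0.5, 0.3, 0.2]
means = [60.0, 70.0, 85.0]
std_devs = [5.0, 5.0, 5.0]
n = 400  # total sample size

overall_mean = sum(w * m for w, m in zip(weights, means))

# Variance within strata and variance between stratum means.
within = sum(w * s ** 2 for w, s in zip(weights, std_devs))
between = sum(w * (m - overall_mean) ** 2 for w, m in zip(weights, means))

# Approximate variance of the estimated population mean under each design.
var_simple_random = (within + between) / n
var_stratified = within / n

print(f"simple random: {var_simple_random:.4f}")
print(f"stratified:    {var_stratified:.4f}")
```

The further apart the stratum means are, the larger the between-strata term and the bigger the gain from stratifying.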
Disadvantages of Stratified Random Sampling
Unfortunately, this research method cannot be used in every study. Its main disadvantage is that several conditions must be met for it to be used correctly. Researchers must identify every member of the studied population and classify each of them into one, and only one, subpopulation. Consequently, stratified random sampling is at a disadvantage when researchers cannot confidently classify every member of the population into a subgroup. In addition, finding an exhaustive and definitive list of an entire population can be challenging.
Overlap can be a problem if some subjects fall into multiple subgroups. When simple random sampling is then performed within each subgroup, those who belong to multiple subgroups are more likely to be chosen. The result could be a misrepresentation or an inaccurate reflection of the population.
The above examples make it easy: college students, graduates, men, and women are clearly defined groups. However, in other situations it can be much more difficult. Imagine that characteristics such as race, ethnicity, or religion are incorporated. The classification process becomes more difficult, making stratified random sampling an ineffective and less than ideal method.
When to Use Stratified Random Sampling
To use stratified sampling, it is necessary to be able to divide the population into mutually exclusive and exhaustive subgroups. This means that each member of the population can be clearly classified into exactly one subgroup.
Stratified sampling is the best choice among probabilistic sampling methods when you believe that subgroups will have different mean values for the variable(s) you are studying. It has several potential advantages:
Ensure sample diversity
A stratified sample includes subjects from each subgroup, ensuring that it reflects the diversity of its population. It is theoretically possible (though unlikely) that this does not occur when using other sampling methods, such as simple random sampling.
Ensure similar variance
If you want the data collected from each subgroup to have a similar level of variance, you need a similar sample size for each subgroup.
With other sampling methods, you may end up having a low sample size for certain subgroups because they are less common in the general population.
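The effect of sample size on precision can be seen from the standard error of a mean, which is the standard deviation divided by the square root of the sample size. The sketch below compares proportional and equal allocation for a hypothetical population that is 90% urban and 10% rural; all numbers are made up for illustration.

```python
import math

def standard_error(std_dev, n):
    """Standard error of a sample mean for a subgroup of size n."""
    return std_dev / math.sqrt(n)

std_dev = 10.0  # assumed to be the same in both subgroups
total_n = 500

# Proportional allocation for a hypothetical 90% urban / 10% rural population.
proportional = {"urban": 450, "rural": 50}
# Equal allocation: the same sample size in each subgroup.
equal = {"urban": total_n // 2, "rural": total_n // 2}

for label, allocation in [("proportional", proportional), ("equal", equal)]:
    errors = {name: round(standard_error(std_dev, n), 3) for name, n in allocation.items()}
    print(label, errors)
```

Under proportional allocation the rural standard error is three times the urban one, while equal allocation gives both subgroups the same precision.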
Reduce the overall variance
Although your overall population may be quite heterogeneous, it may be more homogeneous within certain subgroups.
For example, if you’re studying how a new schooling program affects children’s test scores, chances are that both their original scores and any changes in scores are highly correlated with family income. Scores are likely to be grouped by household income category.
In this case, stratified sampling allows you to obtain more precise estimates of the variables you are studying, because the variance is lower within each subgroup and, as a result, the estimate for the whole population is more precise.
Enable a variety of data collection methods
Sometimes it may be necessary to use different methods to collect data from different subgroups.
For example, to reduce the cost and difficulty of your study, you may want to sample urban subjects by going door-to-door, but rural subjects by mail.