Simple Random Sampling: 6 Basic Steps With Examples

What Is a Simple Random Sample?

A simple random sample is a subset of a statistical population in which each member of the population has an equal probability of being chosen. A simple random sample is meant to be an unbiased representation of a group.

Key Takeaways

  • A simple random sample takes a small, random portion of the entire population to represent the entire data set, where each member has an equal probability of being chosen.
  • Researchers can create a simple random sample using methods such as lotteries or random draws.
  • A sampling error can occur with a simple random sample if the sample does not end up accurately reflecting the population it is supposed to represent.
  • Simple random samples are determined by assigning sequential values to each item within a population, then randomly selecting those values.
  • Simple random sampling provides a different sampling approach compared with systematic sampling, stratified sampling, or cluster sampling.

Understanding a Simple Random Sample

Researchers can create a simple random sample using a couple of methods. With a lottery method, each member of the population is assigned a number, after which numbers are selected at random.

An example of a simple random sample would be the names of 25 employees being chosen out of a hat from a company of 250 employees. In this case the population is all 250 employees, and the sample is random because each employee has an equal chance of being chosen. Random sampling is used in science to conduct randomized controlled trials or blinded experiments.

The example in which the names of 25 employees out of 250 are chosen out of a hat is an example of the lottery method at work. Each of the 250 employees would be assigned a number between one and 250, after which 25 of those numbers would be chosen at random.

Because individuals who make up the subset of the larger group are chosen at random, each individual in the large population set has the same probability of being selected. In most cases this creates a balanced subset that carries the greatest potential for representing the larger group as a whole.

A manual lottery method can be quite onerous for larger populations. Selecting a random sample from a large population usually requires a computer-generated process. The same methodology as the lottery method is used, only the number assignments and subsequent selections are performed by computers, not humans.
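A computer-generated draw of this kind can be sketched in Python. This is a minimal illustration, assuming the 250 employees are simply numbered 1 to 250; the fixed seed and the variable names are illustrative choices, not part of the original example.

```python
import random

POPULATION_SIZE = 250  # employees numbered 1..250
SAMPLE_SIZE = 25

# Seed chosen only so the sketch is reproducible; omit it for a real draw.
rng = random.Random(2024)

# random.sample draws without replacement, so no employee is picked twice.
employee_numbers = range(1, POPULATION_SIZE + 1)
sample = sorted(rng.sample(employee_numbers, SAMPLE_SIZE))

# Every employee has the same chance of ending up in the sample.
inclusion_probability = SAMPLE_SIZE / POPULATION_SIZE

print(sample)
print(inclusion_probability)
```

Drawing without replacement mirrors pulling names out of a hat: once a name is drawn, it cannot be drawn again.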

Room for Error

With a simple random sample, there has to be room for error, expressed as a plus-or-minus range around the estimate (sampling error). For example, if a survey is taken in a high school of 1,000 students to determine how many are left-handed, a random sample of 100 might find that eight are left-handed. The conclusion would then be that 8% of the school's students are left-handed, when the true share could be somewhat different; the global average, for instance, is closer to 10%.

The same is true regardless of the subject matter. A survey on the percentage of the student population that has green eyes or a physical disability would yield an estimate based on a simple random sample, but always with a plus-or-minus range. The only way to have a 100% accuracy rate would be to survey all 1,000 students which, while possible, would be impractical.
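The size of the plus-or-minus range can be estimated from the sample itself. The sketch below applies the standard normal-approximation margin of error for a proportion to the left-handed example (8 of 100 sampled, 1,000 students in total). The 95% confidence level (z = 1.96) and the finite population correction are assumptions added for illustration; the example above does not specify either.

```python
import math

def margin_of_error(p_hat, n, population_size=None, z=1.96):
    """Approximate margin of error for a sample proportion.

    Uses the normal approximation; z=1.96 corresponds to roughly 95% confidence.
    If population_size is given, the finite population correction is applied,
    which shrinks the margin when the sample is a sizable share of the population.
    """
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    if population_size is not None:
        standard_error *= math.sqrt((population_size - n) / (population_size - 1))
    return z * standard_error

p_hat = 8 / 100  # eight left-handed students out of 100 sampled
moe = margin_of_error(p_hat, n=100, population_size=1000)
print(f"{p_hat:.0%} plus or minus {moe:.1%}")
```

Under these assumptions the margin comes out at roughly five percentage points, so a true rate of 10% would still fall within the estimated range.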

Although simple random sampling is intended to be an unbiased approach to surveying, sample selection bias can occur. When a sample set of the larger population is not inclusive enough, representation of the full population is skewed and requires additional sampling techniques.

How to Conduct a Simple Random Sample

The simple random sampling process entails six steps, each performed in sequential order.

Step 1: Define the Population

Statistical analysis starts with determining the population base. This is the group about which you wish to learn more, confirm a hypothesis, or determine a statistical outcome. This step is simply to identify what that population base is and ensure that the group will adequately cover the outcome you are trying to ascertain.

Example: You want to learn how the stocks of the largest companies in the United States have performed over the past 20 years. Your population would be the largest companies in the United States as determined by the S&P 500.

Step 2: Choose Sample Size

Before picking the units within a population, we need to determine how many to select. This sample size may be constrained by the amount of time, capital rationing, or other resources available to analyze the sample. However, be mindful to pick a sample size large enough to be genuinely representative of the population. In the example above, there are constraints in analyzing the performance for every stock in the S&P 500, so we only want to analyze a subset of this population.

Example: Your sample size will be 20 companies from the S&P 500.

Step 3: Determine Population Units

In our example the items within the population are easy to determine, as they've already been identified for us (i.e. the companies listed within the S&P 500). However, imagine analyzing the students currently enrolled at a university or food products being sold at a grocery store. This step entails crafting the entire list of all items within your population.

Example: Using exchange information, you copy the companies comprising the S&P 500 into an Excel spreadsheet.

Step 4: Assign Numerical Values

The simple random sample process calls for every unit within the population to receive its own numerical value. This is often assigned based on how the data may be filtered. For example, you could assign the numbers one to 500 to the companies based on market cap, alphabetical order, or company formation date. How the values are assigned isn’t relevant; all that matters is that the values are sequential and each has an equal chance of being selected.

Example: You assign the numbers one through 500 to the companies in the S&P 500 based on alphabetical order of the current CEO's last name, with the first company receiving the value one and the last company receiving the value 500.

Step 5: Select Random Values

In step 2 we chose 20 as the number of items we wanted to analyze within our population. We now randomly select 20 number values out of the 500. There are multiple ways to do this, as discussed later in this article.

Example: Using a random number table, you select the numbers 2, 7, 17, 67, 68, 75, 77, 87, 92, 101, 145, 201, 222, 232, 311, 333, 376, 401, 478, and 489.

Step 6: Identify Sample

Each of the random values selected in the prior step corresponds to an item within our population. The sample is identified by matching each chosen value to the population item that was assigned that value.

Example: Your sample consists of the companies that correspond to the values chosen in step 5.
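The six steps can be sketched end to end in Python. This is an illustration only: the placeholder names "Company 001" through "Company 500" stand in for the real S&P 500 constituents, the numbering follows list order rather than CEO names, and the seed is arbitrary.

```python
import random

# Steps 1 and 3: define the population and list its units.
# Placeholder names stand in for the real S&P 500 constituents.
population = [f"Company {i:03d}" for i in range(1, 501)]

# Step 2: choose the sample size.
sample_size = 20

# Step 4: assign sequential values 1..500 (here, in list order).
numbered = {number: name for number, name in enumerate(population, start=1)}

# Step 5: select random values without replacement.
rng = random.Random(7)  # seed only for reproducibility of the sketch
selected_values = sorted(rng.sample(list(numbered), sample_size))

# Step 6: identify the sample by mapping values back to units.
sample = [numbered[value] for value in selected_values]
print(sample)
```

Keeping the number-to-unit mapping in one place makes step 6 a simple lookup and leaves a record of which values were drawn.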

Random Sampling Techniques

There is no single method for determining the random values to be selected in step 5. The analyst can’t choose completely random numbers on their own, as there may be factors influencing their decision. For example, the analyst’s wedding anniversary may be the 24th, so they may consciously (or subconsciously) pick the random value 24. Instead, the analyst may choose one of the following methods:

  • Random lottery – Each population number receives an equivalent item, say a ping pong ball or slip of paper, on which it is written, and those items are stored in a box. Random numbers are then selected by pulling items from the container without looking at them.
  • Physical methods – Simple, early methods of random selection may use dice, flipping coins, or spinning wheels. Each outcome is assigned to a value relating to the population.
  • Random number table – Many statistics and research books contain sample tables with randomized numbers.
  • Online random number generator – Many online tools exist where the analyst inputs first the population size and then the sample size to be selected.
  • Random numbers from Excel – Numbers can be selected in Excel using the =RANDBETWEEN formula. A cell containing =RANDBETWEEN(1,5) will select a single random integer between 1 and 5.
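One practical caveat with drawing numbers one at a time, as repeated =RANDBETWEEN cells do, is that independent draws can repeat a value. The Python sketch below mimics that behavior with random.randint (an analogy, not Excel's actual implementation) and keeps drawing until the required number of distinct values is reached. The function name draw_distinct and the seed are hypothetical.

```python
import random

def draw_distinct(low, high, count, rng):
    """Draw `count` distinct integers from low..high, one independent draw at a time."""
    if count > high - low + 1:
        raise ValueError("cannot draw more distinct values than the range contains")
    chosen = set()
    while len(chosen) < count:
        # randint includes both endpoints, like =RANDBETWEEN(low, high).
        # Adding a repeated value to the set has no effect, so duplicates are redrawn.
        chosen.add(rng.randint(low, high))
    return sorted(chosen)

rng = random.Random(11)  # illustrative seed
values = draw_distinct(1, 500, 20, rng)
print(values)
```

Discarding repeats and redrawing keeps every remaining number equally likely, so the result is still a simple random sample of the requested size.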

When pulling together a sample, consider getting assistance from a colleague or an independent person. They may be able to identify biases or discrepancies of which you may not be aware.

Simple Random vs. Other Sampling Methods

Simple Random vs. Stratified Random Sample

A simple random sample is used to represent the entire data population. A stratified random sample divides the population into smaller groups, known as “strata,” based on shared characteristics.

Unlike simple random samples, stratified random samples are used with populations that can be easily broken into different subgroups or subsets. These groups are based on certain criteria, then elements from each are randomly chosen in proportion to the group’s size versus the population. In our example above, S&P 500 companies could have subsets defined by type of industry or geographical region of the company’s headquarters.

This method of sampling means there will be selections from each different group—the size of which is based on its proportion to the entire population. Researchers must ensure that the strata do not overlap. Every point in the population must only belong to one stratum, because they should be mutually exclusive. Overlapping strata would increase the likelihood that some data are included, thus skewing the sample.
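Proportional selection from non-overlapping strata can be sketched as follows. The sector labels and stratum sizes are made up for illustration, and stratified_sample is a hypothetical helper, not a library function.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, sample_size, rng):
    """Proportional stratified sample: each stratum contributes in proportion to its size."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)  # each unit lands in exactly one stratum

    sample = []
    for members in strata.values():
        # Rounding can make the total differ slightly from sample_size.
        quota = round(sample_size * len(members) / len(units))
        sample.extend(rng.sample(members, quota))
    return sample

# Hypothetical population: 500 companies tagged with made-up sector labels.
sectors = ["Tech"] * 150 + ["Health"] * 100 + ["Energy"] * 250
companies = [(f"Company {i:03d}", sector) for i, sector in enumerate(sectors, start=1)]

rng = random.Random(3)  # illustrative seed
sample = stratified_sample(companies, lambda c: c[1], 20, rng)
print(sample)
```

Because each company carries exactly one sector label, the strata are mutually exclusive, and a sector holding 30% of the population supplies 30% of the sample.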

Simple Random vs. Systematic Sampling

Systematic sampling entails selecting a single random value that determines the interval at which population items are selected. For example, if the number 37 was chosen, the 37th company on the list sorted by last name of the CEO would be selected for the sample. Then, the 74th (i.e. the next 37th) and the 111th (i.e. the next 37th after that) would be added as well.

Simple random sampling does not have a starting point; therefore, there is the risk that the population items selected at random may cluster. In our example there may be an abundance of CEOs with a last name that starts with the letter 'F.' Systematic sampling strives to even further reduce bias by ensuring that these clusters do not happen.
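The interval-based selection in the example (every 37th company on a list of 500) can be sketched in a few lines. The function name systematic_positions is hypothetical, and the interval is fixed at 37 here to match the example; in practice it would be drawn at random.

```python
def systematic_positions(population_size, interval):
    """Return the 1-based positions interval, 2*interval, ... up to population_size."""
    return list(range(interval, population_size + 1, interval))

# Interval of 37 on a list of 500 companies, as in the example above.
positions = systematic_positions(500, 37)
print(positions)
```

A common variant, not described above, instead fixes the interval at the population size divided by the sample size and randomizes only the starting position.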

Simple Random vs. Cluster Sampling

Cluster sampling can occur as a one-stage or two-stage cluster. In both, the population is first divided into groupings, or clusters (using our example, companies could be grouped by year formed). In one-stage cluster sampling, a number of clusters are selected at random and every item within the selected clusters is included in the sample.

Two-stage cluster sampling adds a second round of random selection. Clusters are first selected at random, and sample items are then randomly selected within each chosen cluster rather than including every item.

Simple random sampling does not cluster any population sets. Though it may be simpler, clustering (especially two-stage clustering) can enhance the randomness of sample items. In addition, cluster sampling may provide a deeper analysis on a specific snapshot of a population, which may or may not enhance the analysis.
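The difference between the two variants can be sketched in Python, assuming the common convention in which whole clusters are drawn at random first: one-stage sampling then keeps every unit in the chosen clusters, while two-stage sampling draws a random subsample within each. The decade-based clusters and company names are made-up data, and cluster_sample is a hypothetical helper.

```python
import random

def cluster_sample(clusters, n_clusters, rng, per_cluster=None):
    """One-stage if per_cluster is None (take every unit in the chosen clusters);
    two-stage otherwise (take a random subsample within each chosen cluster)."""
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1: pick clusters at random
    sample = []
    for key in chosen:
        members = clusters[key]
        if per_cluster is None:
            sample.extend(members)
        else:
            # Stage 2: random draw within the cluster, capped at the cluster's size.
            sample.extend(rng.sample(members, min(per_cluster, len(members))))
    return sample

# Hypothetical clusters: companies grouped by decade formed (made-up data).
clusters = {
    decade: [f"{decade}s Co {i}" for i in range(1, 11)]
    for decade in range(1950, 2000, 10)
}

rng = random.Random(5)  # illustrative seed
one_stage = cluster_sample(clusters, 2, rng)
two_stage = cluster_sample(clusters, 2, rng, per_cluster=3)
print(len(one_stage), len(two_stage))
```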

Advantages and Disadvantages of Simple Random Samples

While simple random samples are easy to use, they do come with key disadvantages that can render the data useless.

Advantages of a Simple Random Sample

Ease of use represents the biggest advantage of simple random sampling. Unlike more complicated sampling methods, such as stratified random sampling and cluster sampling, no need exists to divide the population into subpopulations or take any other additional steps before selecting members of the population at random.

A simple random sample is meant to be an unbiased representation of a group. It is considered a fair way to select a sample from a larger population, as every member of the population has an equal chance of getting selected. Therefore, it has less chance of sampling bias.

Disadvantages of a Simple Random Sample

A sampling error can occur with a simple random sample if the sample does not end up accurately reflecting the population it is supposed to represent. For example, in a simple random sample of 25 employees, it would be possible to draw 25 men even if the population consisted of 125 women, 125 men, and 125 nonbinary people.
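How likely such an extreme draw is can be computed directly. The sketch below counts the ways to choose 25 people from the 125 men and divides by the ways to choose 25 from all 375 people, assuming a draw without replacement; the variable names are illustrative.

```python
from math import comb

men = 125
total = 125 + 125 + 125  # women, men, and nonbinary people
sample_size = 25

# Hypergeometric probability that every one of the 25 people drawn is a man.
p_all_men = comb(men, sample_size) / comb(total, sample_size)
print(f"{p_all_men:.2e}")
```

The result is possible but vanishingly small, which is why such outcomes are treated as sampling error rather than as a flaw in the selection procedure.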

For this reason simple random sampling is more commonly used when the researcher knows little about the population. If the researcher knows more, it would be better to use a different sampling technique, such as stratified random sampling, which helps to account for the differences within the population, such as age, race, or gender.

Other disadvantages include the fact that for sampling from large populations, the process can be time-consuming and costly compared with other methods. Researchers may find that a project is not worth the endeavor if its cost-benefit analysis does not generate positive results. As every unit has to be assigned an identifying or sequential number prior to the selection process, this task may be difficult based on the method of data collection or size of the data set.

Simple Random Sampling

Advantages
  • Each item within a population has an equal chance of being selected.

  • There is less of a chance of sampling bias, as every item is randomly selected.

  • It is easy and convenient for data sets already listed or digitally stored.

Disadvantages
  • Incomplete population demographics may exclude certain groups from being sampled.

  • Random selection means the sample may not be truly representative of the population.

  • Depending on the data set size and format, random sampling may be a time-intensive process.

Why Is a Simple Random Sample Simple?

No easier method exists to extract a research sample from a larger population than simple random sampling. Selecting enough subjects completely at random from the larger population also yields a sample that can be representative of the group being studied.

What Are Some Drawbacks of a Simple Random Sample?

Among the disadvantages of this technique are difficulty gaining access to respondents that can be drawn from the larger population, greater time, greater costs, and the fact that bias can still occur under certain circumstances.

What Is a Stratified Random Sample?

A stratified random sample first divides the population into smaller groups, or strata, based on shared characteristics. Therefore, a stratified sampling strategy will ensure that members from each subgroup are included in the data analysis. Stratified sampling is used to highlight differences between groups in a population, as opposed to simple random sampling, which treats all members of a population as equal, with an equal likelihood of being sampled.

How Are Random Samples Used?

Using simple random sampling allows researchers to make generalizations about a specific population while limiting selection bias. Using statistical techniques, inferences and predictions can be made about the population without having to survey or collect data from every individual in that population.

The Bottom Line

Simple random sampling is the most basic form of sampling a population, allowing every item within it to have the same probability of being selected. There are also more complicated sampling methods that attempt to correct for possible shortcomings in the simple method. However, they don’t match the ease of simple random sampling for smaller populations.
