# Random Samples in Excel Using the RAND Function

By Angela Henderson, Director of Institutional Research and Effectiveness, Stetson University

This tip provides two methods for selecting a simple random sample from an Excel data file using the RAND() function. This tip is particularly helpful if you need to determine survey or research samples from student, staff, or faculty populations. Two simple ways to accomplish this goal based on an example scenario are provided below.

Scenario: Create a sample of 25% of undergraduate students enrolled in Fall 2016.

Data file: Fall 2016 undergraduate census data file (containing all undergraduate students).

Data file sample Method 1

This method uses the formula = RAND() to generate a numeric value between 0 and 1 for each record.

For this example, enter the formula: =RAND() in cell E2 to generate a random number for the first record. Click on cell E2 and then copy the formula down and create a random value for each record. To prevent the random values from changing when the worksheet recalculates, select and copy all values in column E. With column E cells still highlighted, right-click and select Values under Paste Options. This replaces the formulas in column E with the calculated values and prevents the data from changing. To prevent the random values from changing when the worksheet recalculates, select and copy all values in column E. With column E cells still highlighted, right-click and select Values under Paste Options. This replaces the formulas in column E with the calculated values and prevents the data from changing. Select all the data columns and click on “Sort & Filter” on the Home tab in the Ribbon. Select Filter to apply filters to all the selected columns.

Once filters are applied, click the filter arrow in the header of the random value column (column E) and select Sort Smallest to Largest. As shown below, records are now sorted according to the random value generated. Select the number of records corresponding to your sample needs. For example, if your data file contains 1,000 records, simply select the first 250 records for a random sample of 25% of the population. Method 2

A second, but somewhat less exact method, can be used to generate a TRUE/FALSE indicator for each associated record. Using the formula = RAND()<0.255 returns TRUE for about 25% of the records selected. As such, some variance in the percentage of results returned as TRUE can be expected. Using = RAND()<0.255 (as opposed to = RAND()<0.25) helps reduce the variance in the percentage of records returned.

If seeking a different sample size, for example, 50%, simply change the value in the formula to reflect the desired proportion: = RAND()<0.505.

Using this method creating a random sample list of 25% of the population, enter the formula
= RAND()<0.255 in cell E2. Click on cell E2 and then copy the formula down and create an indicator for each record. As in the prior method, to prevent TRUE/FALSE indicators from changing when the worksheet recalculates, select and copy all values in column E. With column E cells still highlighted, right-click and select Values under Paste Options. This replaces the formulas in column E with the calculated values and prevents the data from changing.

To add filters to the columns, select all the data columns and click on “Sort & Filter” on the Home tab in the Ribbon. Select Filter to apply filters to all the selected columns.

Once filters are applied, click the filter arrow in the header of the TRUE/FALSE indicator column (column E) and uncheck FALSE and (Blanks) so only TRUE remains selected. As shown below, results are now filtered to show only records with a TRUE indicator that should be included in the sample. 