Theory and practice in non-probabilistic surveys - 2

Current practices for managing bias in non-probabilistic online surveys


The most common form of recruitment is to invite people to join an online panel.
Panels offer the opportunity to gather a large amount of profile information about their members. The main alternative is river sampling, in which potential respondents are recruited through similar sources but are directed to a one-off survey rather than enrolled in a long-term panel. River sampling does not provide profile data on respondents in advance. Both approaches face a threat to the positivity requirement, because people who do not use the Internet cannot participate.
Reaching a wide range of potential respondents is critical to the success of any recruitment method, and respondents recruited through different websites have been observed to differ sharply in their demographic distributions (and in other characteristics).
Recruiting from a diverse set of sources is likely to improve the chances of meeting the positivity requirement; however, it also makes the recruitment process more complex, potentially creating a trade-off between positivity and interchangeability.
To date, the overwhelming majority of non-probabilistic survey research has been based on online panel data, and there is not yet enough research to recommend one recruitment method over the other.


Non-probabilistic surveys generally rely on selection procedures aimed at obtaining the desired sample composition while data collection is in progress. This is usually achieved through quotas, in which the researcher specifies a target distribution across one or more variables. Usually these are cells defined by a cross-classification of demographic characteristics, such as gender by age, with each cell requiring a specific number of completed interviews. The final result is a sample that matches the pre-specified distribution on the chosen variables. The use of quotas rests on the assumption that the individuals included in each quota cell are exchangeable with unsampled individuals who share the same characteristics. If this assumption holds, the sample will have the correct composition on the confounding variables, allowing the estimation of means and proportions that generalize to the target population.
However, there is a growing consensus that basic demographic variables such as age, gender, race and education are insufficient to achieve interchangeability.
Sampling methods that allow researchers to control for more dimensions can improve the ability to condition on a more appropriate set of potential confounders.
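
As a rough illustration of the quota mechanism described above, here is a minimal Python sketch. The cell definitions, the targets and the respondent stream are all hypothetical; real panel software is considerably more elaborate.

```python
from collections import Counter

# Hypothetical targets: completed interviews required per gender-by-age cell.
QUOTA_TARGETS = {
    ("female", "18-34"): 2,
    ("female", "35+"): 1,
    ("male", "18-34"): 1,
    ("male", "35+"): 2,
}

def fill_quotas(respondents, targets):
    """Accept respondents in arrival order until every quota cell is full."""
    filled = Counter()
    accepted = []
    for person in respondents:
        cell = (person["gender"], person["age_group"])
        # Screen out anyone whose cell is unknown or already full.
        if filled[cell] < targets.get(cell, 0):
            filled[cell] += 1
            accepted.append(person)
        # Stop fielding as soon as all cells have reached their target.
        if all(filled[c] >= n for c, n in targets.items()):
            break
    return accepted, filled

# Hypothetical stream of arriving respondents.
stream = [
    {"id": 1, "gender": "female", "age_group": "18-34"},
    {"id": 2, "gender": "female", "age_group": "18-34"},
    {"id": 3, "gender": "female", "age_group": "18-34"},  # cell already full
    {"id": 4, "gender": "male", "age_group": "35+"},
    {"id": 5, "gender": "male", "age_group": "18-34"},
    {"id": 6, "gender": "female", "age_group": "35+"},
    {"id": 7, "gender": "male", "age_group": "35+"},
    {"id": 8, "gender": "male", "age_group": "35+"},  # arrives after quotas close
]

accepted, filled = fill_quotas(stream, QUOTA_TARGETS)
print([p["id"] for p in accepted])
```

Note that the sketch only guarantees the sample composition on gender and age. Whether the accepted respondents are exchangeable with the people they stand in for is an assumption the code cannot check.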

Now let's look at three methods:

1. Matching using a distance measure such as the Euclidean distance

This method has been used by YouGov for surveys conducted on its panel in the United States and consists of the following steps:

  • Draw a random sample of anonymized cases from a high-quality source that is believed to reflect the true joint distribution of a large number of variables in the target population.
  • This sample is then used as a synthetic sampling frame (SSF), which serves as a template for the eventual survey sample.
  • Each panel member who completes the survey is matched to a case in the SSF with similar characteristics, using a distance measure such as the Euclidean distance. When each SSF record has been matched to a suitably similar respondent, the survey is complete.

This approach is attractive because it can flexibly match the sample to the target population on more covariates than traditional quota methods allow. For it to succeed, the composition of the matching variables in the SSF must match that of the target population, and any model used to combine the datasets must be correctly specified. More importantly, the matching variables must be the right ones to ensure interchangeability.
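
The matching step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not YouGov's actual procedure: the covariates are assumed to be already rescaled to comparable ranges, matching is done greedily in a single batch rather than as respondents arrive, and the names `match_to_frame` and `max_distance` are invented for the example.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length covariate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_to_frame(frame, panel, max_distance):
    """Greedily pair each SSF record with the closest unused panelist."""
    available = dict(panel)  # panelist id -> covariate vector
    matches = {}
    for frame_id, target in frame.items():
        if not available:
            break
        best_id = min(available, key=lambda pid: euclidean(target, available[pid]))
        # Only accept the pairing if the panelist is "suitably similar".
        if euclidean(target, available[best_id]) <= max_distance:
            matches[frame_id] = best_id
            del available[best_id]  # each panelist is used at most once
    return matches

# Hypothetical data: (rescaled age, indicator for a university degree).
frame = {"f1": (0.2, 0.0), "f2": (0.8, 1.0), "f3": (0.5, 1.0)}
panel = {"p1": (0.25, 0.0), "p2": (0.75, 1.0), "p3": (0.9, 0.0), "p4": (0.55, 1.0)}

matches = match_to_frame(frame, panel, max_distance=0.2)
survey_complete = len(matches) == len(frame)
print(matches, survey_complete)
```

Here panelist `p3` is never used: no SSF record resembles them, so they contribute nothing to the final sample.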

2. Propensity score matching (PSM)

This method uses the propensity score to construct the quota cells and is organized as follows:

  • A probabilistic survey that is assumed to accurately reflect the target population is fielded in parallel with a non-probabilistic survey.
  • Using a set of common covariates collected in each survey, a propensity model is estimated by combining the two samples and predicting the probability that each respondent belongs to the probabilistic survey.
  • When subsequent surveys are fielded, the propensity model is used to calculate a propensity score for each respondent as they are screened for the new survey. Quotas are not set on particular characteristics of the respondent, but are based on quantiles of the propensity score distribution.

Here too, much depends on how well the parallel reference survey corresponds to the target population. If the reference survey suffers from non-response and non-coverage bias, these problems will be transferred to the non-probabilistic survey.
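
The last step, quotas defined on quantiles of the propensity score, might look like the following sketch. It assumes the propensity scores have already been estimated by some model (not shown), and it assumes the cut points are taken from the reference sample's scores; the source text does not say which distribution is used. All numbers are made up.

```python
import statistics
from bisect import bisect_right

def quantile_cutpoints(reference_scores, n_cells=5):
    """Cut points that split the reference scores into n_cells equal groups."""
    return statistics.quantiles(reference_scores, n=n_cells, method="inclusive")

def assign_cell(score, cutpoints):
    """Index of the propensity-score cell (0 = lowest) a score falls into."""
    return bisect_right(cutpoints, score)

def screen(scored_stream, cutpoints, per_cell):
    """Accept respondents until each score cell holds per_cell completes."""
    filled = [0] * (len(cutpoints) + 1)
    accepted = []
    for respondent_id, score in scored_stream:
        cell = assign_cell(score, cutpoints)
        if filled[cell] < per_cell:
            filled[cell] += 1
            accepted.append(respondent_id)
    return accepted, filled

# Hypothetical propensity scores for the reference (probabilistic) sample.
reference_scores = [i / 10 for i in range(11)]
cutpoints = quantile_cutpoints(reference_scores)

# Hypothetical respondents being screened, with their estimated scores.
stream = [("a", 0.05), ("b", 0.07), ("c", 0.5), ("d", 0.95), ("e", 0.03)]
accepted, filled = screen(stream, cutpoints, per_cell=2)
print(accepted, filled)
```

Respondent `e` is turned away because the lowest score cell is already full, regardless of their individual demographic characteristics.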

3. Routing

Another, less studied, feature of many non-probabilistic surveys is the use of routers. Most non-probabilistic survey vendors have many surveys active at the same time. When a router is used, rather than designing samples separately for each survey, respondents are invited to participate in an unspecified survey. The actual survey is determined dynamically, based on the characteristics of the respondent and on the needs of the active surveys with respect to their quotas or selection criteria.
This allows a more efficient use of the sample, but it means that the sample for each survey depends on which other surveys are active at the same time.
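
A toy router makes this dependence concrete. The assignment rule below (send the respondent to the survey with the largest unmet need for their cell) is a hypothetical choice made for the example; actual routers use their own, often proprietary, rules.

```python
def route(respondent, active_surveys):
    """Send a respondent to the active survey with the largest unmet need."""
    best_survey, best_cell, best_need = None, None, 0
    for survey in active_surveys:
        cell = respondent.get(survey["quota_variable"])
        need = survey["remaining"].get(cell, 0)
        if need > best_need:  # strict ">" keeps the earlier survey on ties
            best_survey, best_cell, best_need = survey, cell, need
    if best_survey is None:
        return None  # no active survey needs this respondent
    best_survey["remaining"][best_cell] -= 1
    return best_survey["name"]

def make_surveys():
    """Two hypothetical surveys with open quotas on different variables."""
    return [
        {"name": "brand_tracker", "quota_variable": "gender",
         "remaining": {"female": 1, "male": 0}},
        {"name": "election_poll", "quota_variable": "age_group",
         "remaining": {"18-34": 3, "35+": 1}},
    ]

arrivals = [
    {"gender": "male", "age_group": "18-34"},
    {"gender": "female", "age_group": "35+"},
    {"gender": "male", "age_group": "35+"},
    {"gender": "male", "age_group": "35+"},
]

# Both surveys active at once.
surveys = make_surveys()
assignments = [route(r, surveys) for r in arrivals]

# The same arrivals when only the election poll is active.
poll_only = [make_surveys()[1]]
assignments_alone = [route(r, poll_only) for r in arrivals]

print(assignments)
print(assignments_alone)
```

With both surveys active, the election poll receives the first and third arrivals; on its own it would have received the first and second. The composition of its sample changes even though its own quotas did not.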


Since it may not be possible to achieve the desired sample composition through sampling alone, post-survey adjustment is still required. There are several types:

1. Weighting

Weighting has been studied in two main forms:

1.1 Calibration

Calibration methods directly adjust the sample composition to match a known distribution of variables in the target population. The simplest form of calibration is post-stratification, in which the sample is divided into mutually exclusive cells that are weighted so that each cell's share of the weighted sample equals its share of the target population.
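
A minimal post-stratification sketch in Python follows: each respondent's weight is the population share of their cell divided by the sample share of that cell. The cells, shares and outcomes are hypothetical.

```python
from collections import Counter

def poststratification_weights(sample_cells, population_shares):
    """One weight per respondent: population share / sample share of their cell."""
    n = len(sample_cells)
    sample_shares = {cell: count / n for cell, count in Counter(sample_cells).items()}
    return [population_shares[cell] / sample_shares[cell] for cell in sample_cells]

# Hypothetical sample: younger respondents are over-represented (60% vs 40%).
sample_cells = ["18-34"] * 6 + ["35+"] * 4
outcome = [1, 1, 1, 0, 0, 0, 1, 1, 1, 1]  # e.g. 1 = agrees with a statement
population_shares = {"18-34": 0.4, "35+": 0.6}

weights = poststratification_weights(sample_cells, population_shares)

unweighted_mean = sum(outcome) / len(outcome)
weighted_mean = sum(w * y for w, y in zip(weights, outcome)) / sum(weights)
print(round(unweighted_mean, 3), round(weighted_mean, 3))
```

The weighted mean equals the cell means combined with the population shares (0.5 × 0.4 + 1.0 × 0.6), which is exactly what post-stratification is meant to deliver.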

1.2 Propensity score weighting

Propensity score weighting involves combining a non-probabilistic sample with a probabilistic or gold-standard data source that acts as a reference sample. A model predicting sample membership is fitted to the combined data, and the observations in the non-probabilistic sample are weighted inversely to their estimated probability of appearing in that sample.
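
The sketch below illustrates the idea with two simplifications that are assumptions of this example, not part of the method as described. First, instead of a regression model, the propensity is estimated from raw cell frequencies in the combined data (a saturated model on a single categorical covariate). Second, the weight uses the odds form (1 − p) / p, one common variant of inverse weighting, chosen because it makes the weighted sample reproduce the reference distribution; the reference sample is treated as unweighted.

```python
from collections import Counter

def propensity_weights(nonprob_cells, reference_cells):
    """Weight each non-probabilistic case by the inverse odds of its membership."""
    np_counts = Counter(nonprob_cells)
    ref_counts = Counter(reference_cells)
    weights = []
    for cell in nonprob_cells:
        # Estimated probability that a combined-data case in this cell
        # belongs to the non-probabilistic sample.
        p = np_counts[cell] / (np_counts[cell] + ref_counts[cell])
        weights.append((1 - p) / p)
    return weights

def weighted_shares(cells, weights):
    """Weighted share of each cell in the sample."""
    totals = Counter()
    for cell, w in zip(cells, weights):
        totals[cell] += w
    grand_total = sum(totals.values())
    return {cell: total / grand_total for cell, total in totals.items()}

# Hypothetical covariate: intensity of Internet use.
nonprob_cells = ["heavy"] * 8 + ["light"] * 2
reference_cells = ["heavy"] * 5 + ["light"] * 5

weights = propensity_weights(nonprob_cells, reference_cells)
shares = weighted_shares(nonprob_cells, weights)
print(shares)
```

After weighting, heavy and light users each make up half of the non-probabilistic sample, as in the reference sample. With several covariates the cell counts would be replaced by a fitted model such as a logistic regression.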

2. Matching

With matching, the idea is to create groups containing one or more observations from both a reference sample and a non-probabilistic sample that are similar on a set of auxiliary variables thought to be associated with selection. The groups in the non-probabilistic sample are then weighted so that their distribution corresponds to that of the reference sample.

Matching is very similar to post-stratification and propensity score weighting, with one important exception. In many applications, observations for which no acceptable match exists are removed from the final data set. When this happens, information is lost and inference is possible only for the kinds of observations retained in the matched samples.
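
The loss of information can be seen in a small sketch of one-to-one matching on a single score. The caliper (the maximum distance accepted for a match) and all the values are hypothetical.

```python
def caliper_match(reference, nonprob, caliper):
    """One-to-one nearest-neighbour matching; cases without a match are dropped."""
    unused = dict(nonprob)  # non-probabilistic id -> score
    pairs, dropped_reference = [], []
    for ref_id, ref_score in reference.items():
        distances = {i: abs(score - ref_score) for i, score in unused.items()}
        best = min(distances, key=distances.get) if distances else None
        if best is not None and distances[best] <= caliper:
            pairs.append((ref_id, best))
            del unused[best]
        else:
            dropped_reference.append(ref_id)  # no acceptable match exists
    return pairs, dropped_reference, sorted(unused)

reference = {"r1": 0.20, "r2": 0.50, "r3": 0.90}
nonprob = {"n1": 0.22, "n2": 0.48, "n3": 0.55, "n4": 0.30}

pairs, dropped_reference, unused_nonprob = caliper_match(reference, nonprob, caliper=0.05)
print(pairs, dropped_reference, unused_nonprob)
```

Reference case `r3` has no counterpart among the respondents, so the part of the population it represents drops out of any estimate based on the matched data.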

3. Multilevel regression and post-stratification (MRP)

When the number of cells becomes large, the number of observations in each of them becomes small and the estimates become unstable. MRP makes post-stratification with a large number of cells possible by fitting a multilevel model that pools information across cells sharing similar characteristics, which allows cell means to be estimated even when cells are sparse.
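
The pooling idea can be conveyed with a deliberately crude sketch. This is not a fitted multilevel regression: each cell mean is simply shrunk toward the overall mean, with a hypothetical `prior_strength` parameter standing in for the amount of pooling that a real model would estimate from the data. The cell estimates are then post-stratified with hypothetical population counts.

```python
def pooled_poststratified_estimate(cell_data, population_counts, prior_strength):
    """Shrink sparse cell means toward the grand mean, then post-stratify."""
    all_values = [y for values in cell_data.values() for y in values]
    grand_mean = sum(all_values) / len(all_values)

    cell_estimates = {}
    for cell in population_counts:
        values = cell_data.get(cell, [])
        # Cells with few (or no) observations lean more on the grand mean.
        cell_estimates[cell] = (
            (sum(values) + prior_strength * grand_mean)
            / (len(values) + prior_strength)
        )

    total = sum(population_counts.values())
    estimate = sum(
        population_counts[cell] * cell_estimates[cell] for cell in population_counts
    ) / total
    return cell_estimates, estimate

# Hypothetical binary outcomes by cell; cell "C" has no respondents at all.
cell_data = {"A": [1, 1, 0, 1], "B": [0], "C": []}
population_counts = {"A": 50, "B": 30, "C": 20}

cell_estimates, estimate = pooled_poststratified_estimate(
    cell_data, population_counts, prior_strength=2
)
print(cell_estimates, round(estimate, 3))
```

Plain post-stratification would have no estimate at all for cell `C` and an estimate of exactly 0 for cell `B`, based on a single respondent. Pooling gives both cells usable, if heavily model-dependent, values.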

All these methods fail if the requirements of interchangeability and positivity are not met, or if the model specification does not correctly reproduce the target composition on the confounding variables. If interchangeability and positivity are met, the best method is the one that most accurately reproduces the correct sample composition using the available data and information. If interchangeability and positivity are not met, there is no a priori reason to believe that one of these methods will work better than any other.


Researchers will benefit from identifying a set of theoretically grounded confounding factors before data collection and using them as the starting point for the research design.
In the absence of a strong theory about the subject of the survey, achieving interchangeability will prove extremely challenging. It is worth recalling how central interchangeability and positivity are to obtaining unbiased estimates from non-probabilistic surveys.


It is one thing to know in principle that interchangeability, positivity and composition must be achieved in order to avoid selection bias in non-probabilistic survey estimates. It is another to achieve them in practice. Even when the topic is well understood and many likely confounding factors have been identified, it is difficult to be fully certain that no unknown factor is still distorting the survey estimates. The suggestion, therefore, is to identify the likely confounding factors and to plan data collection and analysis so that they are measured and explicitly taken into account.
