PsyBlog

Social Psychology Experiments: 10 Of The Most Famous Studies

Ten of the most influential social psychology experiments explain why we sometimes do dumb or irrational things. 


“I have been primarily interested in how and why ordinary people do unusual things, things that seem alien to their natures. Why do good people sometimes act evil? Why do smart people sometimes do dumb or irrational things?” –Philip Zimbardo

Like famous social psychologist Professor Philip Zimbardo (author of The Lucifer Effect: Understanding How Good People Turn Evil), I’m also obsessed with why we do dumb or irrational things.

The answer quite often is because of other people — something social psychologists have comprehensively shown.

Each of the 10 brilliant social psychology experiments below tells a unique, insightful story relevant to all our lives, every day.

Click the link in each social psychology experiment to get the full description and explanation of each phenomenon.

1. Social Psychology Experiments: The Halo Effect

The halo effect is a finding from a famous social psychology experiment.

It is the idea that global evaluations about a person (e.g. she is likeable) bleed over into judgements about their specific traits (e.g. she is intelligent).

It is sometimes called the “what is beautiful is good” principle, or the “physical attractiveness stereotype”.

It is called the halo effect because a halo was often used in religious art to show that a person is good.

2. Cognitive Dissonance

Cognitive dissonance is the mental discomfort people feel when trying to hold two conflicting beliefs in their mind.

People resolve this discomfort by changing their thoughts to align with one of the conflicting beliefs and rejecting the other.

The study provides a central insight into the stories we tell ourselves about why we think and behave the way we do.

3. Robbers Cave Experiment: How Group Conflicts Develop

The Robbers Cave experiment was a famous social psychology experiment on how prejudice and conflict emerged between two groups of boys.

It shows how groups naturally develop their own cultures, status structures and boundaries — and then come into conflict with each other.

For example, each country has its own culture, government and legal system, and draws boundaries to differentiate itself from neighbouring countries.

One of the reasons the experiment became so famous is that it appeared to show how groups could be reconciled and how peace could flourish.

The key was the focus on superordinate goals, those stretching beyond the boundaries of the group itself.

4. Social Psychology Experiments: The Stanford Prison Experiment

The Stanford prison experiment was run to find out how people would react to being made a prisoner or prison guard.

The psychologist Philip Zimbardo, who led the Stanford prison experiment, thought ordinary, healthy people would come to behave cruelly, like prison guards, if they were put in that situation, even if it was against their personality.

It has since become a classic social psychology experiment, studied by generations of students, and has recently come under a lot of criticism.

5. The Milgram Social Psychology Experiment

The Milgram experiment, led by the well-known psychologist Stanley Milgram in the 1960s, aimed to test people’s obedience to authority.

The results of Milgram’s social psychology experiment, sometimes known as the Milgram obedience study, continue to be both thought-provoking and controversial.

The Milgram experiment discovered people are much more obedient than you might imagine.

Fully 63 percent of the participants continued administering what appeared to be electric shocks to another person while that person screamed in agony, begged them to stop and eventually fell silent — just because they were told to.

6. The False Consensus Effect

The false consensus effect is a famous social psychological finding that people tend to assume that others agree with them.

Whether it is opinions, values, beliefs or behaviours, people assume that others think and act in the same way as they do.

It is hard for many people to believe the false consensus effect exists because they quite naturally believe they are good ‘intuitive psychologists’, thinking it is relatively easy to predict other people’s attitudes and behaviours.

In reality, people show a number of predictable biases, such as the false consensus effect, when estimating other people’s behaviour and its causes.

7. Social Psychology Experiments: Social Identity Theory

Social identity theory helps to explain why people’s behaviour in groups is fascinating and sometimes disturbing.

People gain part of their self from the groups they belong to and that is at the heart of social identity theory.

The famous theory explains why, as soon as humans are bunched together in groups, we start to do odd things: copy other members of our group, favour members of our own group over others, look for a leader to worship and fight other groups.

8. Negotiation: 2 Psychological Strategies That Matter Most

Negotiation is one of those activities we often engage in without quite realising it.

Negotiation doesn’t just happen in the boardroom, when we ask our boss for a raise, or down at the market; it happens every time we want to reach an agreement with someone.

In a classic, award-winning series of social psychology experiments, Morton Deutsch and Robert Krauss investigated two central factors in negotiation: how we communicate with each other and how we use threats.

9. Bystander Effect And The Diffusion Of Responsibility

The bystander effect in social psychology is the surprising finding that the mere presence of other people inhibits our own helping behaviours in an emergency.

The social psychology experiments on the bystander effect are mentioned in every psychology textbook and are often dubbed ‘seminal’.

This famous social psychology experiment on the bystander effect was inspired by the highly publicised murder of Kitty Genovese in 1964.

It found that in some circumstances, the presence of others inhibits people’s helping behaviours — partly because of a phenomenon called diffusion of responsibility.

10. Asch Conformity Experiment: The Power Of Social Pressure

The Asch conformity experiments — some of the most famous ever done — were a series of social psychology experiments carried out by the noted psychologist Solomon Asch.

The Asch conformity experiment reveals how strongly a person’s opinions are affected by people around them.

In fact, the Asch conformity experiment shows that many of us will deny our own senses just to conform with others.


Author: Dr Jeremy Dean

Psychologist Jeremy Dean, PhD, is the founder and author of PsyBlog. He holds a doctorate in psychology from University College London and two other advanced degrees in psychology. He has been writing about scientific research on PsyBlog since 2004.


  • Invited Paper
  • Published: 20 October 2010

Setting up social experiments: the good, the bad, and the ugly


  • Burt S. Barnow

Zeitschrift für ArbeitsmarktForschung, volume 43, pages 91–105 (2010)


It is widely agreed that randomized controlled trials – social experiments – are the gold standard for evaluating social programs. There are, however, many important issues that cannot be tested using social experiments, and often things go wrong when conducting social experiments. This paper explores these issues and offers suggestions on ways to deal with commonly encountered problems. Social experiments are preferred because random assignment assures that any differences between the treatment and control groups are due to the intervention and not some other factor; also, the results of social experiments are more easily explained and accepted by policy officials. Experimental evaluations often lack external validity and cannot control for entry effects, scale and general equilibrium effects, and aspects of the intervention that were not randomly assigned. Experiments can also lead to biased impact estimates if the control group changes its behavior or if changing the number selected changes the impact. Other problems with conducting social experiments include increased time and cost, and legal and ethical issues related to excluding people from the treatment. Things that sometimes go wrong in social experiments include programs cheating on random assignment, and participants and/or staff not understanding the intervention rules. The random assignment evaluation of the Job Training Partnership Act in the United States is used as a case study to illustrate the issues.


1 Introduction

Since the 1960s, social experiments have been increasingly used in the United States to determine the effects of pilots and demonstrations as well as ongoing programs in areas as diverse as education, health insurance, housing, job training, welfare cash assistance, and time of day pricing of electricity. Although social experiments have not been widely used in Europe, there is growing interest in expanding their use in evaluating social programs. Social experiments remain popular in the United States, but there has been a spirited debate in recent years regarding whether recent methodological developments, particularly propensity score matching and regression discontinuity designs, overcome many of the key objections to nonexperimental methods. This paper provides an assessment of some of the issues that arise in conducting social experiments and explains some of the things that can go wrong in conducting and interpreting the results of social experiments.

The paper first defines what is generally meant by the term social experiments and briefly reviews their use in the United States. This is followed by a discussion of the advantages of social experiments over nonexperimental methods. The next section discusses the limitations of social experiments – what we cannot learn from social experiments. Next is a section discussing some of the things that can go wrong in social experiments and limits of what we learn from them. To illustrate the problems that can arise, the penultimate section provides a case study of lessons from the National JTPA Study, a social experiment that was used to assess a large training program for disadvantaged youth and adults in the United States. The last section provides conclusions.

2 Definitions and context

As Orr ( 1999 , p. 14) notes, “The defining element of a social experiment is random assignment of some pool of individuals to two or more groups that are subject to different policy regimes.” Greenberg and Shroder ( 2004 , p. 4) note that because social experiments are intended to provide unbiased estimates of the impacts of the policy of interest, they must have four specific features:

Random assignment : Creation of at least two groups of human subjects who differ from one another by chance alone.

Policy intervention : A set of actions ensuring that different incentives, opportunities, or constraints confront the members of each of the randomly assigned groups in their daily lives.

Follow-up data collection : Measurement of market and fiscal outcomes for members of each group.

Evaluation : Application of statistical inference and informed professional judgment about the degree to which the policy interventions have caused differences in outcomes between the groups.

These four features are not particularly restrictive, and social experiments can have a large number of variations. Although we often think of random assignment taking place at the individual level, the random assignment can take place at a more aggregated level, such as the classroom, the school, the school district, political or geographic jurisdictions, or any other unit where random assignment can be feasibly carried out. Footnote 1 Second, there is no necessity for a treatment to be compared against a null treatment. In an educational or medical context, for example, it might be harmful to the control group if they receive no intervention; in such instances, the experiment can measure differential impacts where the treatment and control groups both receive treatments, but they do not receive the same treatment. Footnote 2
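
Where assignment happens at an aggregated level, the mechanics are the same as individual-level randomization, simply applied to the larger unit. The sketch below is a minimal illustration (not from the paper), assuming hypothetical school names and Python as the illustration language.

# Minimal sketch: random assignment at the cluster level, here whole schools
# rather than individual students. All names and numbers are hypothetical.
import random

schools = ["school_%02d" % i for i in range(1, 21)]
rng = random.Random(42)                  # fixed seed so the draw is reproducible
rng.shuffle(schools)
treatment_schools = set(schools[:10])    # half the schools receive the intervention

def assigned_group(school):
    # Every student in a school inherits that school's assignment.
    return "treatment" if school in treatment_schools else "control"

print(assigned_group("school_03"))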

Third, there does not have to be a single treatment. In many instances it is sensible to develop a number of alternative treatments to which participants are assigned. In health insurance experiments, for example, there are often a number of variations we would like to test for the key aspects of the treatment. Thus, we might want to randomly assign participants to various combinations of deductible amounts and co-payment rates to see which combination leads to the best results in terms of costs and health outcomes. Likewise, in U.S. welfare experiments, the experiments frequently vary the “guarantee,” the payment received if the person does no market work, and the “implicit tax rate,” the rate at which benefits are reduced if there are earnings. Footnote 3
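
For concreteness, the guarantee and implicit tax rate combine into a simple benefit rule. The sketch below illustrates that rule with invented parameter values; the figures are not taken from any actual experiment.

# Illustrative negative-income-tax benefit rule; all numbers are hypothetical.
def nit_benefit(earnings, guarantee, implicit_tax_rate):
    # Benefit equals the guarantee minus the implicit tax on earnings, floored at zero.
    return max(0.0, guarantee - implicit_tax_rate * earnings)

print(nit_benefit(0, 6000, 0.5))       # 6000.0: full guarantee with no market work
print(nit_benefit(8000, 6000, 0.5))    # 2000.0: benefits phase out as earnings rise
print(nit_benefit(15000, 6000, 0.5))   # 0.0: earnings above the break-even point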

Fourth, social experiments can be implemented in conjunction with an ongoing program or to test a new intervention; in some instances a social experiment will test a new intervention in the context of an ongoing program. Welfare programs in the United States have been subject to several types of social experiments. In the 1960s and 1970s, a series of “negative income tax” experiments were conducted where a randomly selected group of people were diverted from regular welfare programs to entirely new welfare programs with quite different rules and benefits. During the 1980s and 1990s, many states received waivers where they were permitted to try new variations on their welfare programs so long as the new interventions were evaluated using random assignment. U.S. vocational training programs have included freestanding demonstrations with experimental designs as well as experimental evaluations of ongoing programs. Inserting an experimental design in an ongoing program is sometimes difficult, particularly if the program is an entitlement or if the authorizing legislation prohibits denying services to those who apply.

Another important distinction among experiments is that the participants can volunteer for the intervention or they can be assigned to the program. For purely voluntary programs, such as many job training programs in the United States, there is no meaningful concept of mandatory participants. For welfare programs, however, a new intervention can be voluntary in nature or it could be mandatory; the numerous welfare to work demonstration programs tested in the United States have fallen into both categories. While both mandatory and voluntary programs can be evaluated using an experimental design, the findings must be interpreted carefully. The impacts estimated for a voluntary program cannot necessarily be expected to apply for a program where all welfare recipients must participate, and the impacts for a mandatory program may not apply if the same intervention were implemented as a voluntary program.

Although this paper does not focus on the ethics of random assignment, it is important to consider whether it is ethical to deny people the opportunity to participate in a social program. Both Greenberg and Shroder ( 2004 ) and Orr ( 1999 ) discuss the ethics of random assignment, but they do not do so in depth. More recently, the topic was explored in more depth in an exchange between Blustein ( 2005a , b ), Barnow ( 2005 ), Rolston ( 2005 ), and Schochet ( 2005 ). Many observers would agree that random assignment is ethical (or at least not unethical) when there is excess demand for a program and the effectiveness of the program is unknown. Blustein ( 2005a ) uses the experimental evaluation of the Job Corps to raise issues such as recruiting additional applicants so that there will be sufficient applicants to deny services to some, the fact that applicants who do not consent to the random assignment procedure are denied access to the program, and whether those randomized out of participation should receive monetary compensation. She believes that a good case can be made that the Job Corps evaluation, which included random assignment, may have been unethical, although her critics generally take issue with her points and claim that the knowledge gained is sufficient to offset any losses to the participants. As Blustein makes clear, her primary motivation in the paper is not to dispute the ethics of the Job Corps evaluation but rather to urge that ethical considerations be taken into account more fully when random assignment is being considered.

An important distinction between social experiments and the randomized controlled trials frequently used in medicine and public health is that social experiments rarely make use of double-blind or even single-blind approaches. In the field of medicine, it is well known that there can often be a “placebo effect,” where subjects benefit merely from the perception that they are receiving a treatment. Although social experiments can also be subject to similar problems, it is often difficult or impossible to keep the subjects and researchers unaware of their treatment status. A related phenomenon, known as the “Hawthorne effect,” refers to the possibility that subjects respond differently to stimuli because they are being observed. Footnote 4 The important point is that the inability to conduct double-blind experiments, and even the knowledge that one is a subject in an experiment, can potentially lead to biased estimates of intervention impacts.

It is important to distinguish between true social experiments and “natural experiments.” The term natural experiment is sometimes used to refer to situations where random selection is not used to determine assignment to treatment status but the mechanism used, it is argued, results in treatment and comparison groups that are virtually identical. Angrist and Krueger ( 2001 ) extol the use of natural experiments in evaluations when random assignment is not feasible as a way to eliminate omitted variable bias; however, the examples they cite make use of instrumental variables rather than assuming that simple analysis of variance or ordinary least squares regression analysis can be used to obtain impact estimates:

Instruments that are used to overcome omitted variable bias are sometimes said to derive from “natural experiments.” Recent years have seen a resurgence in the use of instrumental variables in this way – that is, to exploit situations where the forces of nature or government policy have conspired to produce an environment somewhat akin to a randomized experiment. This type of application has generated some of the most provocative empirical findings in economics, along with some controversy over substance and methods.

Perhaps one of the best known examples of use of a natural experiment is the analysis by Angrist and Krueger ( 1991 ) to evaluate the effects of compulsory school attendance laws in the United States on education and earnings. In that study, the authors argue that the number of years of compulsory education (within limits) is essentially random, as it is determined by the month of birth. As Angrist and Krueger clearly imply, a natural experiment is not a classical experiment with randomized control trials, and there is no guarantee that simple analyses or more complex approaches such as instrumental variables will yield unbiased treatment estimates.

3 Why conduct social experiments?

There are a number of reasons why social experiments are preferable to nonexperimental evaluations. In the simplest terms, the objective in an evaluation of a social program is to observe the outcome for an intervention for the participants with and without the intervention. Because it is impossible to observe the same person in two states of the world at the same time, we must rely on some alternative approach to estimate what would have happened to participants had they not been in the program. The simplest and most effective way to assure comparability of the treatment and control groups is to randomly assign the potential participants to either receive the treatment or be denied the treatment; with a sufficiently large sample size, the treatment and control groups are likely to be identical on all characteristics that might affect the outcome. Nonexperimental evaluation approaches generally seek to provide unbiased and consistent impact estimates either by using mechanisms to develop comparison groups that are as similar as possible to the treatment group (e.g., propensity score matching) or by using econometric approaches to control for observed and unobserved omitted variables (e.g., fixed effects models, instrumental variables, ordinary least squares regression analysis, and regression discontinuity designs). Unfortunately, all the nonexperimental approaches require strong assumptions to assure that unbiased estimates are obtained, and these assumptions are not always testable.

Burtless ( 1995 ) describes four reasons why experimental designs are preferable to nonexperimental designs. First, random assignment assures the direction of causality. If earnings rise for the treatment group in a training program more than they do for the control group, there is no logical source of the increase other than the program. If a comparison group of individuals who chose not to enroll is used, the causality is not clear – those who enroll may be more interested in working and it is the motivation that leads to the earnings gain rather than the treatment. Burtless's second argument is related to the first – random assignment assures that there is no selection bias in the evaluation, where selection bias is defined as the likelihood that individuals with particular unobserved characteristics may be more or less likely to participate in the program. Footnote 5 The most common example of potential selection bias is that years of educational attainment are likely to be determined in part by ability, but ability is usually either not available to the evaluator or available only with measurement error.

The third argument raised by Burtless in favor of social experiments is that social experiments permit tests of interventions that do not naturally occur. Although social experiments do permit evaluations of such interventions, pilot projects and demonstrations can also be implemented without a randomly selected control group. Finally, Burtless notes that evaluations using random assignment provide findings that are more persuasive to policy makers than evaluations using nonexperimental methods. One of the best features of using random assignment is that program impacts can be observed by simply subtracting the post-program control group values from the values for the treatment group – there is no need to have faith that a fancy instrumental variables approach or a propensity score matching scheme has adequately controlled for all unobserved variables. Footnote 6 For researchers, experiments also assure that the estimates are unbiased and more precise than alternative approaches.
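
The contrast between the experimental difference in means and a self-selected comparison can be made concrete with a small simulation. The sketch below uses invented numbers and assumes that an unobserved "motivation" variable raises both earnings and the chance of enrolling; it is an illustration, not a result from any of the studies cited.

# Simulation sketch: the randomized treatment-control contrast recovers the true
# impact, while comparing self-selected enrollees to non-enrollees does not.
# All parameter values are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
true_impact = 1_000.0
motivation = rng.normal(0.0, 1.0, n)                      # unobserved by the evaluator
earnings_base = 15_000 + 2_000 * motivation + rng.normal(0.0, 3_000.0, n)

# Experiment: assignment is a coin flip, independent of motivation.
assigned = rng.random(n) < 0.5
y_exp = earnings_base + true_impact * assigned
print(y_exp[assigned].mean() - y_exp[~assigned].mean())   # close to 1,000

# Nonexperimental comparison: the more motivated are more likely to enroll.
enrolled = rng.random(n) < 1.0 / (1.0 + np.exp(-motivation))
y_obs = earnings_base + true_impact * enrolled
print(y_obs[enrolled].mean() - y_obs[~enrolled].mean())   # overstates the impact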

4 Can nonexperimental methods replicate experimental findings?

The jury is still out on this issue, and in recent years there has been a great deal of research and spirited debate about how well nonexperimental methods do at replicating experimental findings, given the data that are available. There is no question that there have been important developments in nonexperimental methods in recent years, but the question remains as to how well the methods do in replicating experimental findings and how the replication depends on the particular methods used and data available. Major contributions in recent years include the work of Heckman et  al. ( 1997 ) on propensity score matching and Hahn et  al. ( 2001 ) on regression discontinuity designs. Footnote 7 In this section several recent studies that have found a good match between nonexperimental methods and experimental findings are first reviewed, followed by a review of studies that were unable to replicate experimental findings. The section concludes with suggestions from the literature on conditions where nonexperimental approaches are most likely to replicate experimental findings.

Propensity score matching has been widely used in recent years when random assignment is not feasible. Heckman et  al. ( 1997 ) tested a variety of propensity score matching approaches to see what approaches best mirror the experimental findings from the evaluation of the Job Training Partnership Act (JTPA) in the United States. The authors conclude that: “We determine that a regression-adjusted semiparametric conditional difference in differences matching estimator often performs the best among a class of estimators we examine, especially when omitted time-invariant characteristics are a source of bias.” The authors caution, however: “As is true of any empirical study, our findings may not generalize beyond our data.” They go on to state: “Thus, it is likely that the insights gained from our study of the JTPA programme on the effectiveness of different estimators also apply in evaluating other training programmes targeted toward disadvantaged workers.”

Another effort to see how well propensity score matching replicates experimental findings is in Dehejia and Wahba ( 2002 ). These authors are also optimistic about the capability of propensity score matching to replicate experimental impact estimates: “This paper has presented a propensity score-matching method that is able to yield accurate estimates of the treatment effect in nonexperimental settings in which the treated group differs substantially from the pool of potential comparison units.” Dehejia and Wahba ( 2002 ) use propensity score matching in trying to replicate the findings from the National Supported Work demonstration. Although the authors find that propensity score matching works well in the instance they examined, they caution that the approach critically depends on selection being based on observable variables and note that the approach may not work well when important explanatory variables are missing.

Cook et  al. ( 2008 ) provide a third example of finding that nonexperimental approaches do a satisfactory job of replicating experimental findings under some circumstances. The authors looked at the studies by the type of nonexperimental approach that was used. The three studies that used a regression discontinuity design were all found to replicate the findings from the experiment. Footnote 8 They note that although regression discontinuity designs are much less efficient than experiments, as shown by Goldberger ( 1972 ), the studies they reviewed had large samples so impacts remained statistically significant. The authors find that propensity score matching works well in replicating experimental findings when key covariates are included in the propensity score modeling and where the comparison pool members come from the same geographic area as the treatment group, and they also find that propensity score matching works well when clear rules for selection into the treatment group are used and the variables that are used in selection are available for the analysis. Finally, in studies where propensity score matching was used but the covariates available did not correspond well to the selection rules and/or there was a poor geographic match, the nonexperimental results did not consistently match the experimental findings.

In another recent study, Shadish et  al. ( 2008 ) conducted an intriguing experiment by randomly assigning one group of individuals to be randomly assigned to treatment status and the other to self-select one of the two treatment options (mathematics or vocabulary training). The authors found that propensity score matching greatly reduced the bias of impact estimates when the full set of available covariates was used, including pretests, but did poorly when only predictors of convenience (sex, age, marital status, and ethnicity) were used. Thus, their findings correspond with the findings of Cook et  al. ( 2008 ).

Smith and Todd ( 2005a ) reanalyzed the National Supported Work data used by Dehejia and Wahba ( 2002 ). They find that the estimated impacts are highly sensitive to the particular subset of the data analyzed and the variables used in the analysis. Of the various analytical strategies employed, Smith and Todd ( 2005a ) find that difference in difference matching estimators perform the best. Like many other researchers, Smith and Todd ( 2005a ) find that variations in the matching procedure (e.g., number of individuals matched, use of calipers, local linear regressions) generally do not have a large effect on the estimated impacts. Although they conclude that propensity score matching can be a useful approach for nonexperimental evaluations, they believe that it is not a panacea and that there is no single best approach to propensity score matching that should be used. Footnote 9

Wilde and Hollister ( 2007 ) used data from an experimental evaluation of a class size reduction effort in Tennessee (Project STAR) to assess how well propensity score matching replicates the experimental impact estimates. They accomplished this by treating each school as a separate experiment and pooling the control groups from other schools in the study and then using propensity score matching to identify the best match for the treatment group in each school. The authors state that: “Our conclusion is that propensity score estimators do not perform very well, when judged by standards of how close they are to the ‘true’ impacts estimated from experimental estimators based on a random assignment design.” Footnote 10

Bloom et  al. ( 2002 ) make use of an experiment designed to assess the effects of mandatory welfare to work programs in six states to compare a series of comparison groups and estimation strategies to see if popular nonexperimental methods do a reasonable job of approximating the impact estimates obtained from the experimental design. Nonexperimental estimation strategies tested include several propensity score matching strategies, ordinary least squares regression analysis, fixed effect models, and random growth models. The authors conclude that none of the approaches tried do a good job of reproducing the experimental findings and that more sophisticated approaches are sometimes worse than simple approaches such as ordinary least squares.

Overall, the weight of the evidence appears to indicate that nonexperimental approaches generally do not do a good job of replicating experimental estimates and that the most common problem is the lack of suitable data to control for key differences between the treatment group and comparison group. The most promising nonexperimental approach appears to be the regression discontinuity design, but this approach requires a much larger sample size to obtain the same amount of precision as an experiment. Footnote 11 The studies identify a number of factors that generally improve the performance of propensity score matching (a rough matching sketch follows the list):

It is important to only include observations in the region of common support, where the probabilities of participating are nonzero for both treatment group members and comparison group members.

Data for the treatment and comparison groups should be drawn from the same data source, or the same questions should be asked of both groups.

Comparison group members should be drawn from the same geographic area as the treatment group.

It is important to understand and statistically control for the variables used to select people into the treatment group and to control for variables correlated with the outcomes of interest.

Difference in difference estimators appear to produce less bias than cross section matching in several of the studies, but it is not clear that this is always the case.
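
To make the matching mechanics concrete, here is a rough one-to-one nearest-neighbour matching sketch that respects the common-support point above. The data, variable names, and the use of scikit-learn's logistic regression are illustrative assumptions, not the procedure used in any of the studies cited; real applications require far more care in specification and diagnostics.

# Rough sketch of nearest-neighbour propensity score matching on hypothetical data.
import numpy as np
from sklearn.linear_model import LogisticRegression

def matched_impact(X, treated, outcome):
    # X: covariate matrix; treated: 0/1 treatment indicator; outcome: observed outcome.
    treated = np.asarray(treated, dtype=bool)
    outcome = np.asarray(outcome, dtype=float)

    # Estimate each unit's probability of treatment from the observed covariates.
    pscore = LogisticRegression(max_iter=1000).fit(X, treated).predict_proba(X)[:, 1]

    # Keep only observations in the region of common support.
    lo = max(pscore[treated].min(), pscore[~treated].min())
    hi = min(pscore[treated].max(), pscore[~treated].max())
    support = (pscore >= lo) & (pscore <= hi)

    t_idx = np.where(treated & support)[0]
    c_idx = np.where(~treated & support)[0]

    # Match each treated unit to the comparison unit with the closest score.
    gaps = [outcome[i] - outcome[c_idx[np.argmin(np.abs(pscore[c_idx] - pscore[i]))]]
            for i in t_idx]
    return float(np.mean(gaps))   # estimated impact on the treated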

5 What we cannot learn from social experiments

Although experiments provide the best means of obtaining unbiased estimates of program impacts, there are some important limitations that must be kept in mind in designing experiments and interpreting the findings. This section describes some of the limitations that are typically inherent to experiments as well as problems that sometimes arise in experiments.

Although a well designed experiment can eliminate internal validity problems, there are often issues regarding external validity, the applicability of the findings in other situations. External validity for the eligible population is threatened if the participating sites or individuals volunteer for the program rather than being randomly selected. If the sites included in the experiment volunteered rather than were randomly selected, the impact findings may not be applicable to other sites. It is possible that the sites that volunteer are more effective sites, as less capable sites may want to avoid having their poor performance known to the world. In some of the welfare to work experiments conducted in the United States, random assignment was conducted among welfare recipients who volunteered to participate in the new program. The fact that the experiment was limited to welfare recipients who volunteered would not harm the internal validity of the evaluation, but the results might not apply to individuals who did not volunteer. If consideration is being given to making the intervention mandatory, then learning the effects of the program for volunteers does not identify the parameter of interest unless the program has the same impact on all participants. Although there is no way to assure external validity, exploratory analyses examining whether impacts are consistent across sites and subgroups can suggest (but not prove) if there is a problem.

Experiments typically randomly assign people to the treatment or control group after they have applied for or enrolled in the program. Thus, experiments typically do not pick up any effects the intervention might have that encourage or discourage participation. For example, if a very generous training option is added to a welfare program, more people might sign up for the program. These types of effects, referred to as entry effects, can be an important aspect of a program's effects. Because experiments are likely not to measure these effects, nonexperimental methods must be used to estimate the entry effects. Footnote 12

Another issue that is difficult to deal with in the context of experiments is the finite time horizon that typically accompanies an experiment. If the experiment is offered on a temporary basis and potential participants are aware of the finite period of the experiment, their behavior may be quite different from what would occur if the program were permanent. Consider a health insurance experiment, for example. If members of the treatment group have more generous coverage during the experiment than they will have after the experiment, they are more likely to increase their spending on health care for services that might otherwise be postponed. The experiment will provide estimates of the impact of a temporary policy, but what is needed for policy purposes is the impact of a permanent program. This issue can be dealt with in several ways. One approach would be to run the experiment for a long time so that the treatment group's response would be similar to what would occur for a permanent program; this would usually not be feasible due to cost issues. Another approach would be to enroll members of the treatment group for a  varying number of years and then try to estimate how the response varies with time in the experiment. Finally, one could enroll the participants in a “permanent” program and then buy them out after the data for the evaluation has been gathered.

Another area where experiments may provide only limited information is on general equilibrium effects. For example, a labor market intervention can have effects not captured in a typical evaluation. Examples include potential displacement of other workers by those who receive training, wage increases for the control group due to movement of those trained into a different labor market, and negative wage effects for occupations if the number of people trained is large. Another example is the “herd immunity” observed in immunization programs: the benefits of an immunization program extend to those not immunized, because their probability of contracting the disease diminishes as the number of immunized people in the community increases. Not only do small scale experiments fail to measure these effects, even the evaluation of a large scale program might miss them. Footnote 13

With human subjects, it is not always a simple matter to assure that individuals in the treatment group obtain the treatment and those in the control group do not receive the treatment. In addition, being in the control group in the experiment may provide benefits that would not have been received had there been no experiment. These three cases are described below.

One factor that differentiates social experiments from agricultural experiments is that often some of those assigned to the treatment group do not receive the treatment. So-called no-shows are frequently found in program evaluations, including experiments. It is essential that no-shows be included in the treatment group to preserve the equality of the treatment and control groups. Unfortunately, the experimental impact estimates produced when there are no-shows provide the impact of an offer of the treatment, not the impact of the treatment itself. A policy maker who is trying to decide whether to continue a training program is not interested in the impact of an offer for training – the program only incurs costs for those who enroll, so the policy maker wants to know the impact for those who participate.

Bloom ( 1984 ) has shown that if one is willing to assume that the treatment has no impact on no-shows, the experimental impact estimator can be adjusted to provide an estimate of the impact on the treated. The overall impact of the program, \( { I } \) , is a weighted average of the impact on those who receive the treatment, \( { I_{\text{P}} } \) , and those who do not receive the treatment, \( { I_{\text{NP}} } \) :

\( { I = p\,I_{\text{P}} + (1 - p)\,I_{\text{NP}} } \)

where p is the fraction of the treatment group that receives the treatment. If the impact on those who do not receive the treatment is zero, then \( { I_{\text{NP}} = 0 } \) , and \( { I_{\text{P}} = I/p } \) ; in other words, the impact of the program on those who receive the treatment is estimated by dividing the impact on the overall treatment group (including no-shows) by the proportion who actually receive the treatment.

Individuals assigned to the control group who somehow receive the treatment are referred to as “crossovers.” Orr ( 1999 ) observes that some analysts assign the crossovers to the treatment group or leave them out of the analysis, but either of these strategies is likely to destroy the similarity of the treatment and control groups. He further observes that if we are willing to assume that the program is equally effective for the crossovers and the “crossover-like” individuals in the treatment group, then the impact on the crossover-like individuals is zero and the overall impact of the program can be expressed as a weighted average of the impact on the crossover-like individuals and other individuals:

\( { I = c\,I_{\text{c}} + (1 - c)\,I_{\text{o}} } \)

where \( { I_{\text{c}} } \) is the impact on crossover-like participants, \( { I_{\text{o}} } \) is the impact on others, and c is the proportion of the control group that crossed over; assuming that \( { I_{\text{c}} = 0 } \) , we can then compute the impact on those who do not cross over as \( { I_{\text{o}} = I/(1 - c) } \) . If the crossovers receive a similar but not identical treatment, then the impact on the crossover-like individuals may well not be zero, and Orr ( 1999 ) indicates that the best that can be done is to vary the value of \( { I_{\text{c}} } \) and obtain a range of estimates. Footnote 14
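
A minimal sketch of the two adjustments just described, under the same zero-impact assumptions stated in the text; the numeric example is invented for illustration.

# No-show and crossover adjustments; both assume the affected subgroup has zero impact.
def impact_on_treated(overall_impact, p):
    # Bloom (1984): I_P = I / p, where p is the share of the treatment group
    # that actually receives the treatment.
    return overall_impact / p

def impact_on_non_crossovers(overall_impact, c):
    # Orr (1999): I_o = I / (1 - c), where c is the share of the control group
    # that crossed over and received the treatment.
    return overall_impact / (1.0 - c)

# For example, a $500 impact per assignee with a 50% show-up rate implies a
# $1,000 impact per participant under the zero-impact-on-no-shows assumption.
print(impact_on_treated(500.0, 0.5))          # 1000.0
print(impact_on_non_crossovers(900.0, 0.1))   # 1000.0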

Heckman and Smith ( 1995 ) raise a related issue. In some experiments, the control group may receive valuable services in the process of being randomized out that they would not receive if there were no experiment. This may occur because when people are being recruited for the experiment, they receive some services with the goal of increasing their interest. Alternatively, to reduce ethical concerns, those randomized out may receive information about alternative treatments, which they then receive. In either case, the presence of the experiment has altered the services received by the control group and this creates what Heckman and Smith ( 1995 ) refer to as “substitution bias.”

Heckman and Smith ( 1995 ) also discuss the concept of “randomization bias” that can arise because the experiment changes the scale of the intervention. This problem can arise when the program has heterogeneous impacts and as the scale of the program is increased, those with smaller expected impacts are more likely to enroll. Suppose, for example, that at its usual scale a training program has an earnings impact of $1,000 per year. When the experiment is introduced, the number of people accepted into the program increases, so the impact is likely to decline. It is possible, at least in theory, to assess this problem and correct for it by asking programs to indicate which individuals would have been accepted at the original scale and at the experiment scale. Another possible way to avoid this problem is to reduce the operating scale of the program during the experiment so that the size of the treatment and control groups combined is equal to the normal operating size of the program. More practically, randomization bias can be minimized if the proportion randomized out is very small, say 10% or less; this was the strategy employed in the experimental evaluation of the Job Corps in the United States where Schochet ( 2001 ) indicates that only about 7% of those admitted to the program were assigned to the control group. Footnote 15

6 What can go wrong in social experiments?

In addition to the issues described above that frequently arise in social experiments, there are a number of problems that can also arise. Several common problems are described in this section, and the following section provides a case study of one experiment.

For demonstration projects and for new programs, the intervention may change after the program is initiated. In some cases it may take several months for the program to be working at full capacity; those who enroll when the program first opens may not receive the same services as later participants receive. The program might also change because program officials learn that some program components do not work as well in practice as they do in theory, economic conditions change, or the participants differ from what was anticipated. Some types of interventions, such as comprehensive community initiatives, are expected to change over their implementation as new information is gathered. Footnote 16 Although program modifications often improve the intervention, they can complicate the evaluation in several ways. Instead of determining the impact of one known intervention, the impact evaluation may provide estimates that represent an average of two or more different strategies. At worst, policy makers might believe that the impact findings apply to a different intervention than what was evaluated.

Several strategies can be used to avoid or minimize these types of problems. First, it is important to monitor the implementation of the intervention. Even ongoing programs should be subject to implementation studies so that policy makers know what is being evaluated and if it has changed over time. Second, for a new intervention, it is often wise to postpone the impact evaluation until the intervention has achieved a steady state. Finally, if major changes in the intervention occur over the period analyzed, the evaluation can be conducted for two or more separate periods, although this strategy reduces the precision of the impact estimates.

Experiments can vary in their complexity, and this can lead to problems in implementation and the interpretation of findings. In some instances, experiments are complex because we wish to determine an entire “response surface” rather than evaluate a single intervention. Examples in the United States include the RAND health insurance experiment and the negative income tax (welfare reform) experiments (Greenberg and Shroder 2004 ), where various groups in the experiment were subject to variations in key parameters. For example, in the negative income tax experiments, participants were subject to variation in the maximum benefit and the rate at which benefits were reduced if they earned additional income. If the participants did not understand the concepts involved, particularly the implicit tax rate on earnings, then it would be inappropriate to develop a response surface based on variation in behavior by participants subject to different rules.

Problems in understanding the rules of the intervention can also arise in simpler experiments. For example, the State of Maryland wished to promote good parenting among its welfare recipients and instituted an experiment called the Primary Prevention Initiative (PPI). The treatment group in this experiment was required to assure that the children in the household maintained satisfactory school attendance (80% attendance), and preschool children were required to receive immunizations and physical examinations (Wilson et  al. 1999 ). Parents who failed to meet these criteria were subject to a fine of $25.00 per month. The experiment included an implementation study, and as part of the implementation study, clients were surveyed on their knowledge of the PPI. Wilson et  al. ( 1999 ) report that “only a small minority of clients (under 20%) could correctly identify even the general areas in which PPI had behavioral requirements.” The lack of knowledge was almost as high among those sanctioned as for clients not sanctioned. Not surprisingly, the impact evaluation indicated that the PPI had no effect on the number of children that were immunized, that received a physical exam, or that had satisfactory school attendance. If there had been no data on program knowledge, readers of the impact evaluation might logically have inferred that the incentives were not strong enough rather than that participants did not understand the intervention.

The potential for participants in experiments to not fully understand the rules of the intervention is not trivial. If we obtain zero impacts because participants do not understand the rules and it is possible to educate them, it is important to identify the reasons why we estimate no impact. Thus, whenever there is a reasonable possibility of participants misunderstanding the rules, it is advisable to consider including a survey of intervention knowledge as part of the evaluation.

Finally, in instances where state or local programs are asked to volunteer to participate in the program, there may be a high refusal rate, thus jeopardizing external validity. Sites with low impacts may be reluctant to participate as may sites that are having trouble recruiting adequate participants. Sites may also be reluctant to participate if they believe random assignment is unethical, as was discussed above, or adds a delay in processing applicants.

7 Lessons from the National JTPA Study

This section describes some of the problems that occurred in implementing the National JTPA Study in the United States. The Job Training Partnership Act (JTPA) was the primary workforce program for disadvantaged youth and adults in the United States from 1982 through 1998 when the Workforce Investment Act (WIA) was enacted. The U.S. Department of Labor decided to evaluate JTPA with a classical experiment after a series of impact evaluations of JTPA's predecessor produced such a wide range of estimated impacts that it was impossible to know the impact of the program. Footnote 17 The National JTPA Study used a classical experimental design to estimate the impact of the JTPA program on disadvantaged adults and out-of-school disadvantaged youth. The study began in 1986 and made use of JTPA applicants in 16 sites across the country. The impact evaluation found that the program increased earnings of adult men and women by over $1,300 in 1998 dollars during the second year after training. The study found that the out-of-school youth programs were ineffective, and these findings are not discussed below.

I  focus on the interim report of the National JTPA Study for several reasons. Footnote 18 First, the study was generally well done, and it was cited by Hollister ( 2008 ) as one of the best social experiments that was conducted. The problems that I  review below are not technical flaws in the study design or implementation, but program features that precluded analyzing the hypotheses of most interest and, in my view, approaches to presenting the findings that may have led policy makers to misinterpret the findings. I  focus on the interim report rather than the final report because many of the presentation issues that I  discuss were not repeated in the final report. Footnote 19

7.1 Nonrandom site selection

The study design originally called for 16 to 20 local sites to be selected at random. Sites were offered modest payments to compensate for extra costs incurred and to pay for inconvenience experienced. The experiment took place when the economy was relatively strong, and many local programs (called service delivery areas or SDAs) were having difficulty spending all their funding. Because participating sites were required to recruit 50% more potential participants to construct a control group one-half the size of the treatment group, many sites were reluctant to participate in the experiment. In the end, the project enrolled all 16 sites identified that were willing and able to participate. All evaluations, including experiments, run the risk of failing to have external validity, but the fact that most local sites refused to participate raised suspicion that the sites selected did not constitute a representative sample of sites. The National JTPA Study report does note that no large cities are included in the participating sample of 16 SDAs (by design), but the report's overall conclusion is more optimistic: “The most basic conclusion … is that the study sites and the 17,026 members of the 18-month study sample resemble SDAs and their participants nationally and also include much of their diversity” (Bloom et  al. 1993 , p. 73).

Although the external validity of the National JTPA Study has been subject to a great deal of debate among analysts, there is no way to resolve the issue. Obviously it is best to avoid sites refusing to participate, but that may be easier said than done. Potential strategies to improve participation include larger incentive payments, exemption from performance standards sanctions for the period of participation, Footnote 20 making participation in evaluations mandatory in authorizing legislation, and decreasing the proportion of the applicants assigned to the control group.

7.2 Random assignment by service strategy recommended

Experimental methods can only be used to evaluate hypotheses where random assignment was used to assign the specific treatment received. In JTPA, the evaluators determined that prior to the experiment adults in the 16 sites were assigned to one of three broad categories – (1) occupational classroom training, (2) job search assistance (JSA) or on-the-job training (OJT), and (3) other services. Although OJT is generally the most expensive service strategy, because the program pays up to one-half of the participant's wages for up to six months, and JSA is the least expensive, because it is generally of short duration and often provided in a group setting, the individuals deemed appropriate for OJT were observed to be virtually job ready, as were those recommended for JSA; in addition, because OJT slots are difficult to obtain, candidates for OJT are often given JSA while waiting for an OJT slot to become available. The “other” category included candidates recommended for services such as basic skills (education), work experience, and other miscellaneous services but not occupational classroom training or OJT.

The strategy used in the National JTPA Study was to perform random assignment after a prospective participant was given a preliminary assessment and a service strategy recommended for the person; individuals that the program elected not to serve were excluded from the experiment. Two-thirds of the participants recommended for services were in the treatment group, and one-third was excluded from the JTPA program for a period of 18 months. During the embargo period, control group members were permitted to enroll in any workforce activities other than JTPA that they wished.

There are several concerns with the random assignment procedures used in the National JTPA Study. None of these concerns threatens the internal validity of the impacts estimated, but they show how difficult it is to test the most interesting hypotheses when trying to graft a random assignment experimental design to an existing program.

Because the findings are presented primarily per assignee rather than per participant, they may be misinterpreted. This issue relates more to presentation than analysis. A reader of the full report can find detailed information about what the findings mean, but the executive summary stresses impact estimates per assignee, so casual readers may not learn the impact per person who enrolls in the program. Footnote 21 There are often large differences between the impact per assignee and impact per enrollee because for some analyses the percentage of assignees that actually enrolled in the program is much less than 100%. For adult women, for example, less than half (48.6%) of the women assigned to classroom training actually received classroom training; for men, the figure was even lower (40.1%). Assignees who did not receive the recommended treatment strategy sometimes received other strategies, and the report notes that impacts per enrollee “were about 60 percent to 70 percent larger than impacts per assignee, depending on the target group” (Bloom et al. 1993 , p. xxxv). Policy makers generally think about what returns they are getting on people who enroll in the program, as little, if any, money is spent on no-shows. Thus, policy makers want to know the impact per enrollee, and they might assume that impact estimates are impact per enrollee rather than impact per assignee. Footnote 22, Footnote 23

Failure to differentiate between the in-program period and the post-program period can be misleading, particularly for short-term findings. The impact findings are generally presented on a quarterly basis, measured in calendar quarters after random assignment, or for the entire six-quarter follow-up period. For strategies that typically last for more than one quarter, the reader can easily misinterpret the impact findings when the in-program and post-program impacts are not presented separately. Footnotes 24 and 25

The strategy does not allow head-to-head testing of alternative strategies. Because random assignment is performed after a treatment strategy is recommended, the only experimental estimates that can be obtained are for a particular treatment versus control status. Thus, if, say, OJT has a higher experimental impact than classroom training, the experiment tells us nothing about what the impact of OJT would be for those assigned to classroom training. The only way to test this experimentally would be to randomly assign participants to treatment strategies, which in the case of JTPA would mean sometimes assigning people to a strategy that the SDA staff believed was inappropriate.

The strategy does not provide the impact of receiving a particular type of treatment; it only provides the impact of being assigned to a particular treatment stream. If all JTPA participants had received the activities they were initially assigned to, this point would not matter, but that was not the case. Among the adult women and men who received services, slightly over one-half of those assigned to occupational classroom training received this service: 58% and 56%, respectively. Footnote 26 Of those who did not receive occupational classroom training, about one-half did not enroll, and the remainder received other services. The figures are similar for the OJT-JSA group, except that over 40% never enrolled. The “other services” group received a variety of services, with no single type of service dominating. There is, of course, no way to analyze actual services received using experimental methods, but the fact that a relatively large proportion of individuals received services other than those recommended makes interpretation of the findings difficult.

The OJT-JSA strategy assignee group includes those receiving the most expensive services and those receiving the least expensive services, so the impact estimates are not particularly useful. The proportions receiving JSA and OJT are roughly equal, but by estimating the impact for these two service strategies combined, policy and program officials cannot determine whether one of the two strategies or both are providing the benefits. It is impossible to disentangle the effects of these two very different strategies using experimental methods. In a future experiment this problem could be avoided by establishing narrower service strategies, e.g., making OJT and JSA separate strategies.

Control group members were barred from receiving JTPA services, but many received comparable services from other sources, making the results difficult to interpret. The National JTPA Study states that impact estimates of the JTPA program are relative to whatever non-JTPA services the control group received. Because both the treatment group and the control group were motivated to receive workforce services, it is perhaps not surprising that for many of the analyses the control group received substantial services. For example, among the men recommended for occupational classroom training, 40.1% of the treatment group received such training, but so did 24.2% of the control group. For women, 48.6% of the treatment group and 28.7% of the control group received occupational classroom training. Thus, to some extent, the estimated impacts capture not the effect of training versus no training, but the effect of JTPA-funded services relative to similar services obtained elsewhere.

The point is not that the National JTPA Study was seriously flawed; on the contrary, Hollister ( 2008 ) is correct to identify this study as one of the better social experiments conducted in recent years. Rather, the two key lessons to be drawn from the study are as follows:

It is important to present impact estimates so that they answer the questions of primary interest to policy makers. This means clearly separating in-program and post-program impact findings and giving impacts per enrollee more prominence than impacts per assignee. Footnote 27

Some of the most important evaluation questions may be answered only through nonexperimental methods rather than experimental methods. Although experimental estimates are preferred when they are feasible, nonexperimental methods should be used when they are not. The U.S. Department of Labor has sometimes shied away from having researchers use nonexperimental methods in conjunction with experiments. When experimental methods cannot answer all the questions of interest, nonexperimental methods should be tried, with care taken to describe all assumptions made and to conduct sensitivity analyses.

8 Conclusions

This paper has addressed the strengths and weaknesses of social experiments. There is no doubt that experiments offer some advantages over nonexperimental evaluation approaches. Major advantages include the fact that experiments avoid the need to make strong assumptions about potential explanatory variables that are unavailable for analysis and the fact that experimental findings are much easier to explain to skeptical policy makers. Although there is a growing literature testing how well nonexperimental methods replicate experimental impact estimates, there is no consensus on the extent to which positive findings can be generalized.

But experiments are not without problems. The key point of this paper is that any impact evaluation, experimental or nonexperimental in nature, can have serious limitations. First, there are some questions that experiments generally cannot answer. For example, experiments frequently have “no-shows,” members of the treatment group who do not participate in the intervention after random assignment, and “crossovers,” members of the control group who somehow receive the treatment intervention or something other than what was intended for the control group. Experiments are also generally poor at capturing entry effects and general equilibrium effects.

In addition, in implementing experimental designs, things can go wrong. Examples include problems with participants understanding the intervention and difficulties in testing the hypotheses of most interest. These points were illustrated by showing how the National JTPA Study, which included random assignment to treatment status and is considered by many as an example of a well conducted experiment, failed to answer many of the questions of interest to policy makers.

Thus, social experiments have many advantages, and one should always give careful thought to using random assignment to evaluate interventions of interest. It should be recognized, however, that simply conducting an experiment is not sufficient to assure that important policy questions are answered correctly. In short, an experiment is not a substitute for thinking.

Executive summary

It is widely agreed that randomized controlled trials – social experiments – are the gold standard for evaluating social programs. There are, however, important issues that cannot be tested using experiments, and often things go wrong when conducting experiments. This paper explores these issues and offers suggestions on dealing with commonly encountered problems. There are several reasons why experiments are preferable to nonexperimental evaluations. Because it is impossible to observe the same person in two states of the world at the same time, we must rely on some alternative approach to estimate what would have happened to participants had they not been in the program.

Nonexperimental evaluation approaches seek to provide unbiased and consistent impact estimates, either by developing comparison groups that are as similar as possible to the treatment group (propensity score matching) or by using approaches to control for observed and unobserved variables (e.g., fixed effects models, instrumental variables, ordinary least squares regression analysis, and regression discontinuity designs). Unfortunately, all the nonexperimental approaches require strong assumptions to assure that unbiased estimates are obtained, and these assumptions are not always testable. Overall, the evidence indicates that nonexperimental approaches generally do not do a good job of replicating experimental estimates and that the most common problem is the lack of suitable data to control for key differences between the treatment group and comparison group. The most promising nonexperimental approach appears to be the regression discontinuity design, but this approach requires a much larger sample size to obtain the same amount of precision as an experiment.

Although a well designed experiment can eliminate internal validity problems, there are often issues regarding external validity. External validity for the eligible population is threatened if either the participating sites or the individuals volunteer for the program rather than being selected at random. Experiments typically randomly assign people to the treatment or control group after they have applied for or enrolled in the program. Thus, experiments typically do not pick up any effects the intervention might have in encouraging or discouraging participation. Another issue is the finite time horizon that typically accompanies an experiment; if the experiment is offered on a temporary basis and potential participants are aware of the finite period, their behavior may differ from what it would be if the program were permanent. Experiments frequently have no-shows and crossovers, and these phenomena can only be addressed by resorting to nonexperimental methods. Finally, experiments generally cannot capture scale or general equilibrium effects.

Several things can go wrong in implementing an experiment. A common occurrence is that the intervention itself changes while the experiment is under way, either because the original design was not working or because circumstances change. The intervention should be carefully monitored to detect such changes, and the evaluation modified if they occur. Another potential problem is that participants may not understand the intervention; to guard against this, their knowledge should be tested and instruction provided if it proves to be a problem.

Many of the problems described here occurred in the random assignment evaluation of the Job Training Partnership Act in the United States. Although the intent was to include a random sample of local programs, most local programs refused to participate, raising questions of external validity. Random assignment in the study occurred after an appropriate service strategy was selected. This assured that each strategy could be compared to exclusion from the program, but the alternative strategies could not be compared with each other. Crossover and no-show rates in the study were high, and it is likely that many policy officials did not interpret the impact findings correctly. For example, 40% of the men recommended for classroom training received that treatment, as did 24% of the men in the control group. Thus, the difference in outcomes for the treatment and control groups is very different from the impact of receiving training versus not receiving training. Another feature that makes interpretation difficult is that one service strategy included both those who received the most expensive service, on-the-job training, and those who received the least expensive, job search assistance; this makes it impossible to differentiate the impacts of these disparate strategies. Finally, the interim report made it difficult for the reader to separate impacts from the post-program period from those from the in-program period, and much more attention was paid to the impact for the entire treatment group than to the nonexperimentally estimated impact on the treated. It is likely that policy makers failed to understand the subtle but important differences here.

There is no doubt that experiments offer many advantages over nonexperimental evaluations. However, many problems can and do arise, and an experiment is not a substitute for thinking.

Kurzfassung (Summary)

There is broad consensus that randomized controlled trials – social experiments – are the “gold standard” for evaluating social programs. There are, however, many important questions that such studies cannot address, and things often go wrong when they are carried out. This paper examines these issues and offers suggestions for dealing with commonly encountered problems. There are many reasons why experiments are preferred to nonexperimental evaluations. Because it is impossible to observe the same person in two states of the world at the same time, we must rely on some alternative approach to estimate what would have happened to participants had they not taken part in the program.

Nonexperimental evaluation approaches seek to produce unbiased, consistent impact estimates either by constructing comparison groups that are as similar as possible to the treatment group (propensity score matching) or by using methods that control for observed and unobserved variables (e.g., fixed effects models, instrumental variables, ordinary least squares regression analysis, and regression discontinuity designs). Unfortunately, all nonexperimental approaches require strong assumptions to guarantee unbiased estimates, and these assumptions cannot always be tested. Overall, the evidence suggests that nonexperimental approaches do a poor job of replicating experimental estimates; the most common problem is the lack of suitable data to control for key differences between the treatment group and the comparison group. The most promising nonexperimental approach appears to be the regression discontinuity design, although this method requires a considerably larger sample to achieve the same precision as an experiment.

Although a well designed experiment can rule out problems of internal validity, questions of external validity often remain. External validity with respect to the eligible population is threatened if either the participating sites or the individuals volunteer for the program rather than being selected at random. Experiments typically assign people to the treatment or control group after they have applied for the program. As a result, experiments generally do not capture factors that may encourage or discourage participation. A further issue is the limited time horizon that usually accompanies an experiment: if the experiment runs only for a limited period and potential participants are aware of this, their behavior may differ from what it would be under a permanent program. Experiments must often contend with no-shows and crossovers, and only nonexperimental methods can take these phenomena into account. Finally, experiments generally cannot capture scale effects or general equilibrium effects.

Several things can go wrong when implementing an experiment. First, the intervention may change while the experiment is under way. This happens frequently, either because the original design proved unsuitable or because circumstances changed. The intervention should therefore be monitored carefully and the evaluation adjusted if necessary. Another potential problem is that participants may not understand the intervention; to guard against this, participants' understanding should be tested and instruction provided where needed.

Many of the problems described here arose in the randomized evaluation of the Job Training Partnership Act in the United States. Although a random sample of local programs was supposed to participate, most of these programs refused, raising questions about the study's external validity. Random assignment took place after an appropriate service strategy had been selected for each participant. This ensured that each strategy could be compared with exclusion from the program, but the alternative strategies could not be compared with one another. Crossover and no-show rates in the study were high, and it is likely that many officials misinterpreted the results. For example, only 40% of the men for whom classroom training was recommended received that treatment, but so did 24% of the men in the control group. The difference in outcomes between the treatment and control groups therefore cannot be attributed simply to one group receiving training and the other not. A further feature that complicates interpretation is that one service strategy included both the most expensive service (on-the-job training) and the least expensive (job search assistance), making it impossible to distinguish the effects of these disparate services. Finally, the interim report made it difficult for readers to separate post-program impacts from in-program impacts, and the impacts for the entire treatment group received far more attention than the nonexperimentally estimated impacts on those actually treated. It is very likely that policy makers missed these subtle but important distinctions.

There is no doubt that experiments offer numerous advantages over nonexperimental evaluations. Many problems can and do arise, however, and an experiment is not a substitute for thinking.

There are a number of factors that help determine the units used for random assignment. Assignment at the individual level generates the most observations, and hence the most precision, but in many settings it is not practical to conduct random assignment at the individual level. For example, in an educational setting, it is generally not feasible to assign students in the same classroom to different treatments. The most important problem resulting from random assignment at a more aggregated level is that there are fewer observations, leading to a greater probability that the treatment and control groups are not well matched and the potential for imprecise estimates of the treatment effect.
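One conventional way to quantify this precision loss is the design effect for cluster assignment, deff = 1 + (m - 1)ρ, where m is the average cluster size and ρ is the intraclass correlation. The sketch below is purely illustrative; the cluster size and correlation are hypothetical values, not figures from any study discussed here.

```python
# Illustrative design-effect calculation for group-level random assignment.
# Values are hypothetical; deff = 1 + (m - 1) * icc is the standard
# approximation for equal-sized clusters.

def design_effect(cluster_size: float, icc: float) -> float:
    return 1.0 + (cluster_size - 1.0) * icc

def effective_sample_size(n_individuals: int, cluster_size: float, icc: float) -> float:
    """Number of independent observations the clustered sample is effectively worth."""
    return n_individuals / design_effect(cluster_size, icc)

# 2,000 students assigned in classrooms of 25 with a modest intraclass correlation:
print(design_effect(25, 0.10))                 # 3.4
print(effective_sample_size(2000, 25, 0.10))   # about 588 effectively independent observations
```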

It is important to distinguish between a known null treatment and a broader “whatever they would normally get” control treatment. As discussed below, the latter situation often makes it difficult to know what comparison is specifically being made and how the estimated impacts should be interpreted.

Orr ( 1999 ) notes that by including a variety of treatment doses, we can learn more than the effect of a single dose level on participants; instead, we can estimate a behavioral response function that provides information on how the impact varies with the dosage. Heckman ( 2008 ) provides a broader look at the concept of economic causality.

There are many views on how seriously Hawthorne effects distort impact estimates, both in the original illumination studies at the Hawthorne works in the 1930s and in other contexts.

See Barnow et  al. ( 1980 ) for a discussion of selection bias and a summary of approaches to deal with the problem.

As discussed more in the sections below, many circumstances can arise that make experimental findings difficult to interpret.

Propensity score matching is a two-step procedure where in the first stage the probability of participating in the program is estimated, and, in the simplest approach, in the second stage the comparison group is selected by matching each member of the treatment group with the nonparticipating person with the closest propensity score; there are numerous variations involving techniques such as multiple matches, weighting, and calipers. Regression discontinuity designs involve selection mechanisms where treatment/control status is determined by a screening variable.
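As an illustration of the two-step procedure just described, the sketch below estimates a propensity score with a logistic regression and then performs simple one-to-one nearest-neighbor matching. The data are synthetic, and the modeling choices (logistic regression, matching with replacement, no caliper) are assumptions made only to keep the example short.

```python
# Minimal sketch of nearest-neighbor propensity score matching on synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

def nearest_neighbor_att(X, treated, y):
    """Average treatment effect on the treated via one-to-one matching
    on the estimated propensity score (matching with replacement)."""
    # Step 1: estimate the probability of participation (the propensity score).
    scores = LogisticRegression(max_iter=1000).fit(X, treated).predict_proba(X)[:, 1]

    treated_idx = np.where(treated == 1)[0]
    comparison_idx = np.where(treated == 0)[0]

    # Step 2: match each treated unit to the comparison unit with the closest score.
    gaps = []
    for i in treated_idx:
        j = comparison_idx[np.argmin(np.abs(scores[comparison_idx] - scores[i]))]
        gaps.append(y[i] - y[j])
    return float(np.mean(gaps))

# Synthetic data with a true effect of 2.0 and selection on an observed covariate.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
treated = (X[:, 0] + rng.normal(size=500) > 0).astype(int)
y = 2.0 * treated + X[:, 0] + rng.normal(size=500)
print(nearest_neighbor_att(X, treated, y))  # roughly 2
```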

It is important to keep in mind that regression discontinuity designs provide estimates of impact near the discontinuity, but experiments provide estimates over a broader range of the population.
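A minimal sketch of the idea, assuming a hypothetical screening variable, cutoff, and bandwidth: fit a line to the outcome on each side of the cutoff within the bandwidth and take the difference of the fitted values at the cutoff. It is meant only to show why the estimate is local to the discontinuity, not to reproduce any particular study's estimator.

```python
# Illustrative local-linear regression discontinuity estimate on synthetic data.
import numpy as np

def rdd_estimate(score, outcome, cutoff, bandwidth):
    """Difference in fitted outcomes at the cutoff, from separate linear fits
    on each side within the chosen bandwidth."""
    left = (score < cutoff) & (score >= cutoff - bandwidth)
    right = (score >= cutoff) & (score <= cutoff + bandwidth)

    fit_left = np.polyfit(score[left], outcome[left], 1)
    fit_right = np.polyfit(score[right], outcome[right], 1)

    return np.polyval(fit_right, cutoff) - np.polyval(fit_left, cutoff)

# Hypothetical example: treatment raises the outcome by 3 units at a cutoff of 50.
rng = np.random.default_rng(1)
score = rng.uniform(0, 100, 2000)
outcome = 0.05 * score + 3.0 * (score >= 50) + rng.normal(size=2000)
print(rdd_estimate(score, outcome, cutoff=50, bandwidth=15))  # close to 3
```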

See also the reply by Dehejia ( 2005 ) and the rejoinder by Smith and Todd ( 2005b ).

The paper by Wilde and Hollister ( 2007 ) is one of the papers reviewed by Cook et al. ( 2008 ), who claim that because Wilde and Hollister control for too few covariates and draw their comparison group from areas other than where the treatment group resides, the paper does not offer a good test of propensity score matching.

Schochet ( 2009 ) shows that a regression discontinuity design typically requires a sample three to four times as large as an experimental design to achieve the same level of statistical precision.

See Moffitt ( 1992 ) for a review of the topic and Card and Robins ( 2005 ) for a recent evaluation of entry effects.

See Lise et  al. ( 2005 ) for further discussion of these issues.

See Heckman et  al. ( 2000 ) for discussion of this issue and estimates for JTPA. The authors find that JTPA provides only a small increase in the opportunity to receive training and that both JTPA and its substitutes increase earnings for participants; thus, focusing only on the experimental estimates of training impacts can lead to a large underestimate of the impact of training on earnings.

The Job Corps evaluation was able to deny services to a small proportion of applicants by including all eligible Job Corps applicants in the study, with only a relatively small proportion of the treatment group interviewed. The reason that this type of design has not been more actively used is that if there is a substantial fixed cost per site included in the experiment, including all sites generates large costs and for a fixed budget results in a smaller overall sample.

Comprehensive community initiatives are generally complex interventions that include interventions in a number of areas including employment, education, health, and community organization. See Connell and Kubisch ( 1998 ) for a discussion of comprehensive community initiatives and why they are difficult to evaluate.

See Barnow ( 1987 ) for a summary of the diverse findings from the evaluations of the Comprehensive Employment and Training Act (CETA) that were obtained when a number of analysts used diverse nonexperimental methods to evaluate the program.

I  was involved in the National JTPA study as a subcontractor on the component that investigated the possibility of using nonexperimental approaches to determine the impact of the program rather than experimental approaches.

The final report was published as Orr et  al. ( 1996 ).

Although exempting participating sites from performance standards sanctions may increase participation, it also reduces external validity because the participating sites no longer face the same performance incentives.

Some tables in the executive summary (e.g., Exhibit S.2 and Exhibit S.6) only provide the impact per assignee, and significance levels are only provided for estimates of impact per assignee.

A U.S. Department of Labor senior official complained to me that one contractor refused to provide her with impacts per enrollee because they were based on nonexperimental methods and could not, therefore, be believed. She opined that the evaluation had little value for policy decisions if the evaluation could not provide the most important information she needed.

Although I argue that estimates for the eligible population, sometimes referred to as “intent to treat” (ITT) estimates, are prone to misinterpretation, estimating participation rates and the determinants of participation can be valuable for policy officials in learning the extent to which eligible individuals are participating and which groups appear to be underserved. See Heckman and Smith ( 2004 ).

It is, of course, important to capture the impacts for the in-program period so that a cost-benefit analysis can be conducted.

For example, Stanley et  al. ( 1998 ) summarize the impact findings from the National JTPA Study by presenting the earnings impacts in the second year after random assignment, which is virtually all a post-program period.

See Exhibit 3.18 of Bloom et  al. ( 1993 ).

This is not a simple matter when program length varies significantly, as it did in the JTPA program. If participants are followed long enough, however, the later part of the follow-up period will fall almost entirely after program exit.

Angrist, J.D., Krueger, A.B.: Does compulsory attendance affect schooling and earnings? Q.  J. Econ. 106 (4), 979–1014 (1991)


Angrist, J.D., Krueger, A.B.: Instrumental variables and the search for identification: from supply and demand to natural experiments. J. Econ. Perspect. 15 (4), 69–85 (2001)


Barnow, B.S.: The impacts of CETA programs on earnings: a review of the literature. J.  Hum. Resour. 22 (2), 157–193 (1987)

Barnow, B.S.: The ethics of federal social program evaluation: a response to Jan Blustein. J.  Policy Anal. Manag. 24 (4), 846–848 (2005)

Barnow, B.S., Cain, G.G., Goldberger, A.S.: Issues in the analysis of selection bias. In: Stromsdorfer, E.W., Farkas, G. (eds.) Evaluation Studies Review Annual, vol.  5. Sage Publications, Beverly Hills (1980)

Bloom, H.S.: Accounting for no-shows in experimental evaluation designs. Evaluation Rev. 8 (2), 225–246 (1984)

Bloom, H.S., Orr, L.L., Cave, G., Bell, S.H., Doolittle, F.: The National JTPA Study: Title II-A Impacts on Earnings and Employment at 18 Months. Abt Associates, Bethesda, MD (1993)

Bloom, H.S., Michalopoulos, C., Hill, C.J., Lei, Y.: Can Nonexperimental Comparison Group Methods Match the Findings from a Random Assignment Evaluation of Mandatory Welfare to Work Programs? MDRC, New York (2002)

Blustein, J.: Toward a more public discussion of the ethics of federal social program evaluation. J.  Policy Anal. Manag. 24 (4), 824–846 (2005a)

Blustein, J.: Response. J.  Policy Anal. Manag. 24 (4), 851–852 (2005b)

Burtless, G.: The case for randomized field trials in economic and policy research. J.  Econ. Perspect. 9 (2), 63–84 (1995)

Card, D., Robins, P.K.: How important are “entry effects” in financial incentive programs for welfare recipients? J.  Econometrics 125 (1), 113–139 (2005)

Connell, J.P., Kubisch, A.C.: Applying a theory of change approach to the evaluation of comprehensive community initiatives: progress, prospects, and problems. In: Fulbright-Anderson, K., Kubisch, A.C., Connell, J.P. (eds.) New Approaches to Evaluating Community Initiatives, vol.  2, Theory, Measurement, and Analysis. The Aspen Institute, Washington, DC (1998)

Cook, T.D., Shadish, W.R., Wong, V.C.: Three conditions under which experiments and observational studies produce comparable causal estimates: new findings from within-study comparisons. J.  Policy Anal. Manag. 27 (4), 724–750 (2008)

Dehejia, R.H.: Practical propensity score matching: a reply to Smith and Todd. J.  Econometrics 125 (1), 355–364 (2005)

Dehejia, R.H., Wahba, S.: Propensity score matching methods for nonexperimental causal studies. Rev. Econ. Statistics 84 (1), 151–161 (2002)

Goldberger, A.S.: Selection Bias in Evaluating Treatment Effects: Some Formal Illustrations. Institute for Research on Poverty, Discussion Paper 123–72, University of Wisconsin, Madison, WI (1972)

Greenberg, D.H., Shroder, M.: The Digest of Social Experiments, 3rd  edn. The Urban Institute Press, Washington DC (2004)

Hahn J., Todd, P.E., Van der Klaauw, W.: Identification and estimation of treatment effects with a regression discontinuity design. Econometrica 69 (1), 201–209 (2001)

Heckman, J.J.: Economic causality. Int. Stat. Rev. 76 (1), 1–27 (2008)

Heckman, J.J., Smith, J.A.: Assessing the case for social experiments. J.  Econ. Perspect. 9 (2), 85–110 (1995)

Heckman, J.J., Smith, J.A.: The determinants of participation in a social program: evidence from a prototypical job training program. J.  Labor Econ. 22 (2), 243–298 (2004)

Heckman, J.J., Ichimura, H., Todd, P.E.: Matching as an econometric evaluation estimator: evidence from evaluating a job training programme. Rev. Econ. Stud. 64 (4), 605–654 (1997)

Heckman, J.J., Hohmann, N., Smith, J., Khoo, M.: Substitution and dropout bias in social experiments: a study of an influential social experiment. Q.  J. Econ. 115 (2), 651–694 (2000)

Hollister, R.G. jr.: The role of random assignment in social policy research: opening statement. J.  Policy Anal. Manag. 27 (2), 402–409 (2008)

Lise, J., Seitz, S., Smith, J.: Equilibrium Policy Experiments and the Evaluation of Social Programs. Unpublished manuscript (2005)

Moffitt, R.: Evaluation methods for program entry effects. In: Manski, C., Garfinkel, I. (eds.) Evaluating Welfare and Training Programs. Harvard University Press, Cambridge, MA (1992)

Orr, L.L.: Social Experiments: Evaluating Public Programs with Experimental Methods. Sage Publications, Thousand Oaks, CA (1999)

Orr, L.L., Bloom, H.S., Bell, S.H., Doolittle, F., Lin, W.: Does Training for the Disadvantaged Work? Evidence from the National JTPA Study. The Urban Institute Press, Washington, DC (1996)

Rolston, H.: To learn or not to learn. J.  Policy Anal. Manag. 24 (4), 848–849 (2005)

Schochet, P.Z.: National Job Corps Study: Methodological Appendixes on the Impact Analysis. Mathematica Policy Research, Princeton, NJ (2001)

Schochet, P.Z.: Comments on Dr. Blustein's paper, toward a more public discussion of the ethics of federal social program evaluation. J.  Policy Anal. Manag. 24 (4), 849–850 (2005)

Schochet, P.Z.: Statistical power for regression discontinuity designs in education evaluations. J.  Educ. Behav. Stat. 34 (2), 238–266 (2009)

Shadish, W.R., Clark, M.H., Steiner, P.M.: Can nonrandomized experiments yield accurate answers? A randomized experiment comparing random and nonrandom assignments. J.  Am. Stat. Assoc. 103 (484), 1334–1343 (2008)

Smith, J.A., Todd, P.E.: Does matching overcome LaLonde's critique of nonexperimental estimators? J.  Econometrics 125 (1), 305–353 (2005a)

Smith, J.A., Todd, P.E.: Rejoinder. J.  Econometrics 125 (1), 305–353 (2005b)

Stanley, M., Katz, L., Krueger, A.: Developing Skills: What We Know about the Impacts of American Employment and Training Programs on Employment, Earnings, and Educational Outcomes. Cambridge, MA, unpublished manuscript (1998)

Wilde, E.T., Hollister, R.: How close is close enough? Evaluating propensity score matching using data from a class size reduction experiment. J.  Policy Anal. Manag. 26 (3), 455–477 (2007)

Wilson, L.A., Stoker, R.P., McGrath, D.: Welfare bureaus as moral tutors: what do clients learn from paternalistic welfare reforms? Soc. Sci. Quart. 80 (3), 473–486 (1999)


Acknowledgements

I  am grateful to Laura Langbein, David Salkever, Peter Schochet, Gesine Stephan, and participants in workshops at George Washington University and the University of Maryland at Baltimore County for comments. I  am particularly indebted to Jeffrey Smith for his thoughtful detailed comments and suggestions. Responsibility for remaining errors is mine.

Author information

Authors and affiliations.

Trachtenberg School of Public Policy and Public Administration, George Washington University, 805 21st St, NW, Washington, DC, 20052, USA

Burt S. Barnow


Corresponding author

Correspondence to Burt S. Barnow .


Barnow, B.S. Setting up social experiments: the good, the bad, and the ugly. ZAF 43 , 91–105 (2010). https://doi.org/10.1007/s12651-010-0042-6


Published : 20 October 2010

Issue Date : November 2010


  • Propensity Score
  • Random Assignment
  • Propensity Score Match
  • Social Experiment
  • Impact Estimate


TheHighSchooler

8 Effective Social Psychology Experiments & Activities For High School Students

In school, social interaction plays a crucial role and forms the core of one’s academic life. Have you ever found yourself wondering what others are thinking, or why they hold the opinions they do? These questions cross everyone’s mind, and the study of social psychology offers a peek into some of the answers.

Social psychology is a field of psychology that investigates how the social environment shapes people’s thoughts, beliefs, and behavior. By studying social psychology, one can gain a deeper understanding of people’s actions and the consequences they have. Furthermore, engaging in practical experiments and activities can make this subject even more fascinating. 

In this post, you will find engaging, hands-on activities that offer students practical experience in social psychology and insight into this fascinating subject.

Social psychology experiments and activities for high school students 

Here are a few interesting experiments and activities for high school students to learn about social psychology : 

1. Bystander effect simulation


The bystander effect [ 1 ] is a social psychology phenomenon in which an individual is less likely to help in an urgent situation when other people are present. Students can study this effect in a controlled setting: choose a social setting and select one person to pretend to need help, such as someone with a feigned injury struggling to cross the road or to gather scattered items.

The remaining students can observe how members of the public behave. The experiment aims to demonstrate the phenomenon called “diffusion of responsibility”. It also highlights the importance of helping others, acts of kindness, and empathetic understanding. Understanding the bystander effect helps one understand social initiative, which can be useful when a real situation calls for intervention.

2. Conformity experiment 


People tend to change their beliefs to match what they think is normal, which is called conformity bias. An experiment can be done to test this by asking a group of students to guess the length of a rod from three choices (25 cm, 30 cm, and 40 cm), with 25 cm being the correct answer. 

Some students might be told to give the wrong answer (like 40 cm) and to act as though they are sure it is right, offering confident explanations for their choice. This creates a situation of peer pressure and social conformity, making the other students want to fit in and agree with the group.

Other students may then start to give the same wrong answer in order to fit in with their friends. This experiment shows how conformity bias works, and it teaches students about the effects of peer pressure and social conformity, and how going along with others can affect their confidence in their own judgment.
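For students who want to summarize their results, a simple tally like the hypothetical sketch below (the answer lists are invented) shows how to compare the rate of wrong answers given under group pressure with a baseline where participants answer alone.

```python
# Hypothetical tally for the rod-length activity: how often do participants
# repeat the confederates' wrong answer compared with a no-pressure baseline?
answers_with_confederates = ["40 cm", "25 cm", "40 cm", "40 cm", "25 cm", "40 cm"]
answers_alone = ["25 cm", "25 cm", "25 cm", "40 cm", "25 cm", "25 cm"]

def conformity_rate(answers, wrong_answer="40 cm"):
    return sum(a == wrong_answer for a in answers) / len(answers)

print(conformity_rate(answers_with_confederates))  # about 0.67
print(conformity_rate(answers_alone))              # about 0.17
```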

3. The marshmallow test 


The marshmallow test is a study of delayed gratification, the ability to put off an immediate pleasure in exchange for a larger reward later. In this experiment, immediate gratification means being given something delicious and eating it right away. High school students can perform this experiment with preschoolers between three and five years old.

The students will randomly select a few children and observe them individually. Each child is given one marshmallow and told that they will receive a second marshmallow if they resist eating the first until the observer returns. This is delayed gratification in action [ 2 ].

The students then note how many children attempt to wait and how many succeed, and compare the result with their hypothesis. This test can help students appreciate the importance of delayed gratification and how it can be applied to build habits like discipline and organization.

4. Group polarization experiments 


Society contributes tremendously to forming one’s beliefs, prejudices, stereotypes, and notions. This experiment focuses on how group discussion can strengthen existing beliefs, push them toward extremes, and increase the rigidity of one’s thinking.

These experiments can take place both in classrooms and among peer groups. The first step is for students to express their opinions on a specific societal topic, such as gender norms. Then, the teacher can split the students into pairs, each holding a different viewpoint. 

The pairs then discuss the topic, sharing their personal opinions and biases, which exposes them to views they agree and disagree with. As the next step, the students are asked their personal stance on the same topic again after the discussion.

The hypothesis is that, after discussion, opinions will be more intense and more widely spread. Comparing responses before and after the discussion helps students notice changes in the emphasis, aggressiveness, and rigidity of their views. The experiment shows how social interaction can harden one’s thinking and how social context can push beliefs and values toward polarized extremes.

5. Mirror neurons 


The brain has a fascinating class of cells called mirror neurons. As the name suggests, they become active not only when we perform an action or feel an emotion ourselves, but also when we watch someone else do so.

This helps explain why laughter can be contagious, or why you begin to feel down in the dumps when a friend is sad for no apparent reason. It suggests a natural ability to empathize and share others’ emotions simply by imagining ourselves in their shoes, or even by being in the same environment. Activities based on imitation and role play can help students appreciate how mirror neurons work.

In a classroom or peer group setting, students can choose to perform a skit based on a story they know, but they must play a character they don’t particularly like. For instance, a student who dislikes Draco Malfoy from the Harry Potter series may be assigned to play that character. After the skit, the students can discuss the character traits of the parts they played and the qualities they possess.

In the end, students typically find they have developed a sense of understanding and empathy toward the character they previously disliked, simply by having inhabited that character for a while. Through these exercises, students can learn how mirror neurons foster empathy, increase understanding, and make it easier to take on different perspectives.

6. Nonverbal cues and communication


Society places great emphasis on communication skills, yet much of our everyday communication is actually nonverbal. High school students should not only be aware of how important nonverbal communication is, but also learn about it from the perspective of social psychology. One way to do this is a class activity similar to the game of charades.

In this activity, the teacher or a peer will split the students into two teams. Then, one member from each team will be chosen to stand in front of the class and be given a list of emotions to express through facial expressions. Starting with simple emotions like happiness and sadness, they will gradually move on to more complex emotions like anticipation, confusion, grief, and sarcasm. 

The other team members will have to guess the emotion being portrayed by their teammate and will earn five points for every correct guess. By working together, the class can gain a better understanding of nonverbal communication and its significant impact on even the smallest interactions. The activity fosters collaborative engagement and teamwork while sharpening students’ ability to read nonverbal cues.

7. Foot-in-the-door experiment

The foot-in-the-door technique takes its name from the English idiom for getting an initial start on something. In social psychology it is studied as a persuasion strategy, commonly used in sales and marketing. Students can observe the phenomenon by organizing an activity like role play.

The class will be split into pairs, and each pair will act out a marketing scenario. For example, they might choose to sell a skincare product. In the scenario, the salesperson will start by offering a free sample product and explaining its qualities. This small request is more likely to be accepted by the customer as it does not require much attention or effort, or any form of financial demand. 

Then the salesperson follows up by encouraging the customer to buy the product after trying it and agreeing with its description. A social situation like this creates pressure on the customer to remain consistent with their earlier agreeable behavior, which is why they become more likely to buy the product. The experiment helps students see how the desire to appear consistent and socially acceptable shapes behavior.

8. Door-in-the-face experiment


This technique is the exact opposite of the foot-in-the-door approach, and it is used just as cleverly in marketing. High school students can conduct a version of this social experiment with the permission and supervision of a teacher or faculty member.

The experiment involves inviting someone to a fundraiser organized by their school or institution. The students will start by making an unreasonable request, such as asking a random person to donate a thousand dollars to the charitable initiative of the fundraiser. 

The person is likely to refuse, but the refusal can leave them feeling a little guilty. The students then follow up with a much smaller request: simply attending the fundraiser event. This is easy for the person to agree to, and agreeing also eases the guilt of the earlier refusal and restores their image of themselves as an agreeable person.

This experiment teaches students about the importance of social acceptability in building self-image and confidence. It also gives them insight into how social pressure can play a role in building values and morals while, at the same time, inducing feelings of unease and guilt.

Wrapping it up

Already an intriguing subject, social psychology can be made even more fun by incorporating practical experiments and activities. The experiments described here are observational, intended to build observation skills and a fuller understanding of social behavior.

They aim to improve one’s understanding of social settings and their impact on the individual mind, together forming a cohesive psycho-social educational experience. Students can also engage in psychology games and activities for more clarity on the subject. These activities will help you dive deeper into how society operates and let you view it from an observer’s perspective, giving you a clearer, less biased, and non-judgmental view of social occurrences and phenomena.

  • Hudson, J. M., & Bruckman, A. (2004). The bystander effect: A lens for understanding patterns of participation. The Journal of the Learning Sciences, 13(2), 165–195.
  • Mischel, W., & Ebbesen, E. B. (1970). Attention in delay of gratification. Journal of Personality and Social Psychology, 16(2), 329–337.


Sananda Bhattacharya, Chief Editor of TheHighSchooler, is dedicated to enhancing operations and growth. With degrees in Literature and Asian Studies from Presidency University, Kolkata, she leverages her educational and innovative background to shape TheHighSchooler into a pivotal resource hub. Providing valuable insights, practical activities, and guidance on school life, graduation, scholarships, and more, Sananda’s leadership enriches the journey of high school students.

Explore a plethora of invaluable resources and insights tailored for high schoolers at TheHighSchooler, under the guidance of Sananda Bhattacharya’s expertise. You can follow her on Linkedin



Chapter 1. Introducing Social Psychology

1.3 Conducting Research in Social Psychology

Learning Objectives

  • Explain why social psychologists rely on empirical methods to study social behavior.
  • Provide examples of how social psychologists measure the variables they are interested in.
  • Review the three types of research designs, and evaluate the strengths and limitations of each type.
  • Consider the role of validity in research, and describe how research programs should be evaluated.

Social psychologists are not the only people interested in understanding and predicting social behavior or the only people who study it. Social behavior is also considered by religious leaders, philosophers, politicians, novelists, and others, and it is a common topic on TV shows. But the social psychological approach to understanding social behavior goes beyond the mere observation of human actions. Social psychologists believe that a true understanding of the causes of social behavior can only be obtained through a systematic scientific approach, and that is why they conduct scientific research. Social psychologists believe that the study of social behavior should be empirical —that is, based on the collection and systematic analysis of observable data .

The Importance of Scientific Research

Because social psychology concerns the relationships among people, and because we can frequently find answers to questions about human behavior by using our own common sense or intuition, many people think that it is not necessary to study it empirically (Lilienfeld, 2011). But although we do learn about people by observing others and therefore social psychology is in fact partly common sense, social psychology is not entirely common sense.

Is social psychology just common sense?

To test for yourself whether or not social psychology is just common sense, try doing this activity. Based on your past observations of people’s behavior, along with your own common sense, you will likely have answers to each of the questions on the activity. But how sure are you? Would you be willing to bet that all, or even most, of your answers have been shown to be correct by scientific research? If you are like most people, you will get at least some of these answers wrong.

Read through each finding, and decide if you think the research evidence shows that it is either mainly true or mainly false. When you have figured out the answers, think about why each finding is either mainly true or mainly false. You may also find some other ideas on this as you work your way through the textbook chapters!

  • Opposites attract.
  • An athlete who wins the bronze medal (third place) in an event is happier about his or her performance than the athlete who won the silver medal (second place).
  • Having good friends you can count on can keep you from catching colds.
  • Subliminal advertising (i.e., persuasive messages that are presented out of our awareness on TV or movie screens) is very effective in getting us to buy products.
  • The greater the reward promised for an activity, the more one will come to enjoy engaging in that activity.
  • Physically attractive people are seen as less intelligent than less attractive people.
  • Punching a pillow or screaming out loud is a good way to reduce frustration and aggressive tendencies.
  • People pull harder in a tug-of-war when they’re pulling alone than when pulling in a group.

See Table 1.5 in the chapter summary for answers and explanations.

One of the reasons we might think that social psychology is common sense is that once we learn about the outcome of a given event (e.g., when we read about the results of a research project), we frequently believe that we would have been able to predict the outcome ahead of time. For instance, if half of a class of students is told that research concerning attraction between people has demonstrated that “opposites attract,” and if the other half is told that research has demonstrated that “birds of a feather flock together,” most of the students in both groups will report believing that the outcome is true and that they would have predicted the outcome before they had heard about it. Of course, both of these contradictory outcomes cannot be true. The problem is that just reading a description of research findings leads us to think of the many cases that we know that support the findings and thus makes them seem believable. The tendency to think that we could have predicted something that we probably would not have been able to predict is called the hindsight bias .

Our common sense also leads us to believe that we know why we engage in the behaviors that we engage in, when in fact we may not. Social psychologist Daniel Wegner and his colleagues have conducted a variety of studies showing that we do not always understand the causes of our own actions. When we think about a behavior before we engage in it, we believe that the thinking guided our behavior, even when it did not (Morewedge, Gray, & Wegner, 2010). People also report that they contribute more to solving a problem when they are led to believe that they have been working harder on it, even though the effort did not increase their contribution to the outcome (Preston & Wegner, 2007). These findings, and many others like them, demonstrate that our beliefs about the causes of social events, and even of our own actions, do not always match the true causes of those events.

Social psychologists conduct research because it often uncovers results that could not have been predicted ahead of time. Putting our hunches to the test exposes our ideas to scrutiny. The scientific approach brings a lot of surprises, but it also helps us test our explanations about behavior in a rigorous manner. It is important for you to understand the research methods used in psychology so that you can evaluate the validity of the research that you read about here, in other courses, and in your everyday life.

Social psychologists publish their research in scientific journals, and your instructor may require you to read some of these research articles. The most important social psychology journals are listed in “ Social Psychology Journals .” If you are asked to do a literature search on research in social psychology, you should look for articles from these journals.

Social Psychology Journals:

  • Journal of Personality and Social Psychology
  • Journal of Experimental Social Psychology
  • Personality and Social Psychology Bulletin
  • Social Psychology and Personality Science
  • Social Cognition
  • European Journal of Social Psychology
  • Social Psychology Quarterly
  • Basic and Applied Social Psychology
  • Journal of Applied Social Psychology

Note. The research articles in these journals are likely to be available in your college or university library.

We’ll discuss the empirical approach and review the findings of many research projects throughout this book, but for now let’s take a look at the basics of how scientists use research to draw overall conclusions about social behavior. Keep in mind as you read this book, however, that although social psychologists are pretty good at understanding the causes of behavior, our predictions are a long way from perfect. We are not able to control the minds or the behaviors of others or to predict exactly what they will do in any given situation. Human behavior is complicated because people are complicated and because the social situations that they find themselves in every day are also complex. It is this complexity—at least for me—that makes studying people so interesting and fun.

Measuring Affect, Behavior, and Cognition

One important aspect of using an empirical approach to understand social behavior is that the concepts of interest must be measured (Figure 1.7, “The Operational Definition”). If we are interested in learning how much Sarah likes Robert, then we need to have a measure of her liking for him. But how, exactly, should we measure the broad idea of “liking”? In scientific terms, the characteristics that we are trying to measure are known as conceptual variables , and the particular method that we use to measure a variable of interest is called an operational definition.

For anything that we might wish to measure, there are many different operational definitions, and which one we use depends on the goal of the research and the type of situation we are studying. To better understand this, let’s look at an example of how we might operationally define “Sarah likes Robert.”

Conceptual and measured variables

One approach to measurement involves directly asking people about their perceptions using self-report measures. Self-report measures are measures in which individuals are asked to respond to questions posed by an interviewer or on a questionnaire . Generally, because any one question might be misunderstood or answered incorrectly, in order to provide a better measure, more than one question is asked and the responses to the questions are averaged together. For example, an operational definition of Sarah’s liking for Robert might involve asking her to complete the following measure:

  • I enjoy being around Robert. Strongly disagree 1 2 3 4 5 6 Strongly agree
  • I get along well with Robert. Strongly disagree 1 2 3 4 5 6 Strongly agree
  • I like Robert. Strongly disagree 1 2 3 4 5 6 Strongly agree

The operational definition would be the average of her responses across the three questions. Because each question assesses the attitude differently, and yet each question should nevertheless measure Sarah’s attitude toward Robert in some way, the average of the three questions will generally be a better measure than would any one question on its own.
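As a small illustration of this operational definition, the sketch below averages hypothetical responses to the three items; the ratings are invented, not data from any study.

```python
# Sketch of the operational definition described above: Sarah's liking for
# Robert measured as the average of her responses to the three items.
# The ratings shown are hypothetical.
responses = {
    "I enjoy being around Robert.": 5,
    "I get along well with Robert.": 4,
    "I like Robert.": 6,
}

liking_score = sum(responses.values()) / len(responses)
print(liking_score)  # 5.0 on the 1-6 scale
```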

Although it is easy to ask many questions on self-report measures, these measures have a potential disadvantage. As we have seen, people’s insights into their own opinions and their own behaviors may not be perfect, and they might also not want to tell the truth—perhaps Sarah really likes Robert, but she is unwilling or unable to tell us so. Therefore, an alternative to self-report that can sometimes provide a more valid measure is to measure behavior itself. Behavioral measures are measures designed to directly assess what people do . Instead of asking Sarah how much she likes Robert, we might instead measure her liking by assessing how much time she spends with Robert or by coding how much she smiles at him when she talks to him. Some examples of behavioral measures that have been used in social psychological research are shown in Table 1.3, “Examples of Operational Definitions of Conceptual Variables That Have Been Used in Social Psychological Research.”

Table 1.3 Examples of Operational Definitions of Conceptual Variables That Have Been Used in Social Psychological Research
Conceptual variable Operational definitions
Aggression • Number of presses of a button that administers shock to another student
• Number of seconds taken to honk the horn at the car ahead after a stoplight turns green
Interpersonal attraction • Number of times that a person looks at another person
• Number of millimeters of pupil dilation when one person looks at another
Altruism • Number of pieces of paper a person helps another pick up
• Number of hours of volunteering per week that a person engages in
Group decision-making skills • Number of groups able to correctly solve a group performance task
• Number of seconds in which a group correctly solves a problem
Prejudice • Number of negative words used in a creative story about another person
• Number of inches that a person places their chair away from another person

Social Neuroscience: Measuring Social Responses in the Brain

Still another approach to measuring thoughts and feelings is to measure brain activity, and recent advances in brain science have created a wide variety of new techniques for doing so. One approach, known as electroencephalography (EEG) , is a technique that records the electrical activity produced by the brain’s neurons through the use of electrodes that are placed around the research participant’s head . An electroencephalogram (EEG) can show if a person is asleep, awake, or anesthetized because the brain wave patterns are known to differ during each state. An EEG can also track the waves that are produced when a person is reading, writing, and speaking with others. A particular advantage of the technique is that the participant can move around while the recordings are being taken, which is useful when measuring brain activity in children who often have difficulty keeping still. Furthermore, by following electrical impulses across the surface of the brain, researchers can observe changes over very fast time periods.

Man wearing an EEG Cap

Although EEGs can provide information about the general patterns of electrical activity within the brain, and although they allow the researcher to see these changes quickly as they occur in real time, the electrodes must be placed on the surface of the skull, and each electrode measures brain waves from large areas of the brain. As a result, EEGs do not provide a very clear picture of the structure of the brain.

But techniques exist to provide more specific brain images. Functional magnetic resonance imaging (fMRI) is a neuroimaging technique that uses a magnetic field to create images of brain structure and function . In research studies that use the fMRI, the research participant lies on a bed within a large cylindrical structure containing a very strong magnet. Nerve cells in the brain that are active use more oxygen, and the need for oxygen increases blood flow to the area. The fMRI detects the amount of blood flow in each brain region and thus is an indicator of which parts of the brain are active.

Very clear and detailed pictures of brain structures (see Figure 1.9, “MRI BOLD activation in an emotional Stroop task”) can be produced via fMRI. Often, the images take the form of cross-sectional “slices” that are obtained as the magnetic field is passed across the brain. The images of these slices are taken repeatedly and are superimposed on images of the brain structure itself to show how activity changes in different brain structures over time. Normally, the research participant is asked to engage in tasks while in the scanner, for instance, to make judgments about pictures of people, to solve problems, or to make decisions about appropriate behaviors. The fMRI images show which parts of the brain are associated with which types of tasks. Another advantage of the fMRI is that it is noninvasive. The research participant simply enters the machine and the scans begin.

Figure 1.9: fMRI BOLD activation in an emotional Stroop task.

Although the scanners themselves are expensive, the advantages of fMRIs are substantial, and scanners are now available in many university and hospital settings. The fMRI is now the most commonly used method of learning about brain structure, and it has been employed by social psychologists to study social cognition, attitudes, morality, emotions, responses to being rejected by others, and racial prejudice, to name just a few topics (Eisenberger, Lieberman, & Williams, 2003; Greene, Sommerville, Nystrom, Darley, & Cohen, 2001; Lieberman, Hariri, Jarcho, Eisenberger, & Bookheimer, 2005; Ochsner, Bunge, Gross, & Gabrieli, 2002; Richeson et al., 2003).

Observational Research

Once we have decided how to measure our variables, we can begin the process of research itself. As you can see in Table 1.4, “Three Major Research Designs Used by Social Psychologists,” there are three major approaches to conducting research that are used by social psychologists—the observational approach , the correlational approach , and the experimental approach . Each approach has some advantages and disadvantages.

Table 1.4 Three Major Research Designs Used by Social Psychologists
Research Design Goal Advantages Disadvantages
Observational To create a snapshot of the current state of affairs Provides a relatively complete picture of what is occurring at a given time. Allows the development of questions for further study. Does not assess relationships between variables.
Correlational To assess the relationships between two or more variables Allows the testing of expected relationships between variables and the making of predictions. Can assess these relationships in everyday life events. Cannot be used to draw inferences about the causal relationships between the variables.
Experimental To assess the causal impact of one or more experimental manipulations on a dependent variable Allows the drawing of conclusions about the causal relationships among variables. Cannot experimentally manipulate many important variables. May be expensive and take much time to conduct.

The most basic research design, observational research , is research that involves making observations of behavior and recording those observations in an objective manner . Although it is possible in some cases to use observational data to draw conclusions about the relationships between variables (e.g., by comparing the behaviors of older versus younger children on a playground), in many cases the observational approach is used only to get a picture of what is happening to a given set of people at a given time and how they are responding to the social situation. In these cases, the observational approach involves creating a type of “snapshot” of the current state of affairs.

One advantage of observational research is that in many cases it is the only possible approach to collecting data about the topic of interest. A researcher who is interested in studying the impact of an earthquake on the residents of Tokyo, the reactions of Israelis to a terrorist attack, or the activities of the members of a religious cult cannot create such situations in a laboratory but must be ready to make observations in a systematic way when such events occur on their own. Thus observational research allows the study of unique situations that could not be created by the researcher. Another advantage of observational research is that the people whose behavior is being measured are doing the things they do every day, and in some cases they may not even know that their behavior is being recorded.

One early observational study that made an important contribution to understanding human behavior was reported in a book by Leon Festinger and his colleagues (Festinger, Riecken, & Schachter, 1956). The book, called When Prophecy Fails , reported an observational study of the members of a “doomsday” cult. The cult members believed that they had received information, supposedly sent through “automatic writing” from a planet called “Clarion,” that the world was going to end. More specifically, the group members were convinced that Earth would be destroyed as the result of a gigantic flood sometime before dawn on December 21, 1954.

When Festinger learned about the cult, he thought that it would be an interesting way to study how individuals in groups communicate with each other to reinforce their extreme beliefs. He and his colleagues observed the members of the cult over a period of several months, beginning in July of the year in which the flood was expected. The researchers collected a variety of behavioral and self-report measures by observing the cult, recording the conversations among the group members, and conducting detailed interviews with them. Festinger and his colleagues also recorded the reactions of the cult members, beginning on December 21, when the world did not end as they had predicted. This observational research provided a wealth of information about the indoctrination patterns of cult members and their reactions to disconfirmed predictions. This research also helped Festinger develop his important theory of cognitive dissonance.

Despite their advantages, observational research designs also have some limitations. Most importantly, because the data that are collected in observational studies are only a description of the events that are occurring, they do not tell us anything about the relationship between different variables. However, it is exactly this question that correlational research and experimental research are designed to answer.

The Research Hypothesis

Because social psychologists are generally interested in looking at relationships among variables, they begin by stating their predictions in the form of a precise statement known as a research hypothesis . A research hypothesis is a  specific prediction about the relationship between the variables of interest and about the specific direction of that relationship . For instance, the research hypothesis “People who are more similar to each other will be more attracted to each other” predicts that there is a relationship between a variable called similarity and another variable called attraction. In the research hypothesis “The attitudes of cult members become more extreme when their beliefs are challenged,” the variables that are expected to be related are extremity of beliefs and the degree to which the cult’s beliefs are challenged.

Because the research hypothesis states both that there is a relationship between the variables and the direction of that relationship, it is said to be falsifiable, which means  that the outcome of the research can demonstrate empirically either that there is support for the hypothesis (i.e., the relationship between the variables was correctly specified) or that there is actually no relationship between the variables or that the actual relationship is not in the direction that was predicted . Thus the research hypothesis that “People will be more attracted to others who are similar to them” is falsifiable because the research could show either that there was no relationship between similarity and attraction or that people we see as similar to us are seen as less attractive than those who are dissimilar.

Correlational Research

Correlational research is designed to search for and test hypotheses about the relationships between two or more variables. In the simplest case, the correlation is between only two variables, such as that between similarity and liking, or between gender (male versus female) and helping.

In a correlational design, the research hypothesis is that there is an association (i.e., a correlation) between the variables that are being measured. For instance, many researchers have tested the research hypothesis that a positive correlation exists between the use of violent video games and the incidence of aggressive behavior, such that people who play violent video games more frequently would also display more aggressive behavior.

Correlational design

A statistic known as the Pearson correlation coefficient (symbolized by the letter r ) is normally used to summarize the association, or correlation, between two variables . The Pearson correlation coefficient can range from −1 (indicating a very strong negative relationship between the variables) to +1 (indicating a very strong positive relationship between the variables). Recent research has found that there is a positive correlation between the use of violent video games and the incidence of aggressive behavior and that the size of the correlation is about r = .30 (Bushman & Huesmann, 2010).
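To illustrate what the Pearson correlation coefficient summarizes, here is a minimal sketch in Python that computes r from scratch for a small, invented data set. The numbers are hypothetical and are not taken from any of the studies cited here.

```python
import math

# Hypothetical data: hours of violent video game play per week (x)
# and number of aggressive acts observed (y) for eight participants.
x = [1, 3, 5, 2, 8, 4, 6, 7]
y = [0, 2, 3, 1, 4, 1, 3, 5]

def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

print(f"r = {pearson_r(x, y):.2f}")  # positive: more play goes with more aggression
```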

One advantage of correlational research designs is that, like observational research (and in comparison with experimental research designs in which the researcher frequently creates relatively artificial situations in a laboratory setting), they are often used to study people doing the things that they do every day. Correlational research designs also have the advantage of allowing prediction. When two or more variables are correlated, we can use our knowledge of a person’s score on one of the variables to predict his or her likely score on another variable. Because high-school grades are correlated with university grades, if we know a person’s high-school grades, we can predict his or her likely university grades. Similarly, if we know how many violent video games a child plays, we can predict how aggressively he or she will behave. These predictions will not be perfect, but they will allow us to make a better guess than we would have been able to if we had not known the person’s score on the first variable ahead of time.

Despite their advantages, correlational designs have a very important limitation. This limitation is that they cannot be used to draw conclusions about the causal relationships among the variables that have been measured. An observed correlation between two variables does not necessarily indicate that either one of the variables caused the other. Although many studies have found a correlation between the number of violent video games that people play and the amount of aggressive behaviors they engage in, this does not mean that playing the video games necessarily caused the aggression. Although one possibility is that playing violent games increases aggression, another possibility is that the causal direction is exactly opposite to what has been hypothesized. Perhaps increased aggressiveness causes more interest in, and thus increased playing of, violent games. Although this causal relationship might not seem as logical, there is no way to rule out the possibility of such reverse causation on the basis of the observed correlation.

Still another possible explanation for the observed correlation is that it has been produced by the presence of another variable that was not measured in the research. Common-causal variables (also known as third variables ) are variables that are not part of the research hypothesis but that cause both the predictor and the outcome variable and thus produce the observed correlation between them (Figure 1.13, “Correlation and Causality”). It has been observed that students who sit in the front of a large class get better grades than those who sit in the back of the class. Although this could be because sitting in the front causes the student to take better notes or to understand the material better, the relationship could also be due to a common-causal variable, such as the interest or motivation of the students to do well in the class. Because a student’s interest in the class leads him or her to both get better grades and sit nearer to the teacher, seating position and class grade are correlated, even though neither one caused the other.

Correlation and causation

The possibility of common-causal variables must always be taken into account when considering correlational research designs. For instance, in a study that finds a correlation between playing violent video games and aggression, it is possible that a common-causal variable is producing the relationship. Some possibilities include the family background, diet, and hormone levels of the children. Any or all of these potential common-causal variables might be creating the observed correlation between playing violent video games and aggression. Higher levels of the male sex hormone testosterone, for instance, may cause children to both play more violent video games and behave more aggressively.

You may think of common-causal variables in correlational research designs as “mystery” variables, since their presence and identity are usually unknown to the researcher because they have not been measured. Because it is not possible to measure every variable that could possibly cause both variables, it is always possible that there is an unknown common-causal variable. For this reason, we are left with the basic limitation of correlational research: correlation does not imply causation.
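The logic of a common-causal variable can also be made concrete with a short simulation. The sketch below, in Python, uses an invented "motivation" variable to stand in for the third variable: it generates two outcomes that are each caused only by motivation, and the two outcomes nevertheless end up correlated with each other. The pearson_r helper simply repeats the formula described earlier.

```python
import math
import random

random.seed(1)

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

# A common-causal ("third") variable: each student's motivation.
motivation = [random.gauss(0, 1) for _ in range(500)]

# Motivation causes both outcomes; neither outcome causes the other.
sits_near_front = [m + random.gauss(0, 1) for m in motivation]   # seating position
course_grade    = [m + random.gauss(0, 1) for m in motivation]   # class grade

# The two outcomes are correlated even though neither one causes the other.
print(f"r(seating, grade) = {pearson_r(sits_near_front, course_grade):.2f}")
```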

Experimental Research

The goal of much research in social psychology is to understand the causal relationships among variables, and for this we use experiments. Experimental research designs are research designs that include the manipulation of a given situation or experience for two or more groups of individuals who are initially created to be equivalent, followed by a measurement of the effect of that experience .

In an experimental research design, the variables of interest are called the independent variables and the dependent variables. The independent variable refers to the situation that is created by the experimenter through the experimental manipulations , and the dependent variable refers to the variable that is measured after the manipulations have occurred . In an experimental research design, the research hypothesis is that the manipulated independent variable (or variables) causes changes in the measured dependent variable (or variables). We can diagram the prediction like this, using an arrow that points in one direction to demonstrate the expected direction of causality:

viewing violence (independent variable) → aggressive behavior (dependent variable)

Consider an experiment conducted by Anderson and Dill (2000), which was designed to directly test the hypothesis that viewing violent video games would cause increased aggressive behavior. In this research, male and female undergraduates from Iowa State University were given a chance to play either a violent video game (Wolfenstein 3D) or a nonviolent video game (Myst). During the experimental session, the participants played the video game that they had been given for 15 minutes. Then, after the play, they participated in a competitive task with another student in which they had a chance to deliver blasts of white noise through the earphones of their opponent. The operational definition of the dependent variable (aggressive behavior) was the level and duration of noise delivered to the opponent. The design and the results of the experiment are shown in Figure 1.14, “An Experimental Research Design (After Anderson & Dill, 2000).”

Figure 1.14: An Experimental Research Design (After Anderson & Dill, 2000).

Experimental designs have two very nice features. For one, they guarantee that the independent variable occurs prior to measuring the dependent variable. This eliminates the possibility of reverse causation. Second, the experimental manipulation allows ruling out the possibility of common-causal variables that cause both the independent variable and the dependent variable. In experimental designs, the influence of common-causal variables is controlled, and thus eliminated, by creating equivalence among the participants in each of the experimental conditions before the manipulation occurs.

The most common method of creating equivalence among the experimental conditions is through random assignment to conditions before the experiment begins, which involves determining separately for each participant which condition he or she will experience through a random process, such as drawing numbers out of an envelope or using a website such as randomizer.org . Anderson and Dill first randomly assigned about 100 participants to each of their two groups. Let’s call them Group A and Group B. Because they used random assignment to conditions, they could be confident that before the experimental manipulation occurred , the students in Group A were, on average , equivalent to the students in Group B on every possible variable , including variables that are likely to be related to aggression, such as family, peers, hormone levels, and diet—and, in fact, everything else.
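As a simple illustration of the mechanics of random assignment (not the actual procedure Anderson and Dill used), the following Python sketch shuffles a hypothetical pool of participants and splits it into two equally sized conditions.

```python
import random

random.seed(42)

# Hypothetical participant pool; in practice these would be real sign-ups.
participants = [f"P{i:03d}" for i in range(1, 201)]

# Random assignment: shuffle the pool, then split it evenly between conditions,
# so that before the manipulation the two groups are equivalent on average.
random.shuffle(participants)
half = len(participants) // 2
group_a = participants[:half]   # e.g., will play the violent video game
group_b = participants[half:]   # e.g., will play the nonviolent video game

print(len(group_a), len(group_b))  # 100 100
```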

Then, after they had created initial equivalence, Anderson and Dill created the experimental manipulation—they had the participants in Group A play the violent video game and the participants in Group B play the nonviolent video game. Then they compared the dependent variable (the white noise blasts) between the two groups and found that the students who had played the violent video game gave significantly longer noise blasts than did the students who had played the nonviolent game. When the researchers observed differences in the duration of white noise blasts between the two groups after the experimental manipulation, they could draw the conclusion that it was the independent variable (and not some other variable) that caused these differences because they had created initial equivalence between the groups. The idea is that the only thing that was different between the students in the two groups was which video game they had played.
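The comparison of the two groups on the dependent variable is typically made with a statistical test such as an independent-samples t-test. The sketch below uses invented noise-blast durations, not Anderson and Dill's actual data, to show the general form of that comparison.

```python
from scipy import stats

# Hypothetical noise-blast durations (in seconds); not the actual Anderson & Dill data.
violent_game    = [6.8, 7.2, 5.9, 7.5, 6.4, 7.1, 6.9, 7.8, 6.2, 7.0]
nonviolent_game = [5.1, 5.8, 4.9, 6.0, 5.4, 5.2, 5.7, 4.8, 5.5, 5.3]

# Compare the group means with an independent-samples t-test.
t_stat, p_value = stats.ttest_ind(violent_game, nonviolent_game)
print(f"Violent mean = {sum(violent_game) / len(violent_game):.2f}, "
      f"nonviolent mean = {sum(nonviolent_game) / len(nonviolent_game):.2f}")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```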

When we create a situation in which the groups of participants are expected to be equivalent before the experiment begins, when we manipulate the independent variable before we measure the dependent variable, and when we change only the nature of independent variables between the conditions, then we can be confident that it is the independent variable that caused the differences in the dependent variable. Such experiments are said to have high internal validity, where internal validity is the extent to which changes in the dependent variable in an experiment can confidently be attributed to changes in the independent variable .

Despite the advantage of determining causation, experimental research designs do have limitations. One is that the experiments are usually conducted in laboratory situations rather than in the everyday lives of people. Therefore, we do not know whether results that we find in a laboratory setting will necessarily hold up in everyday life. To counter this, researchers sometimes conduct field experiments, which are experimental research studies that are conducted in a natural environment, such as a school or a factory. However, they are difficult to conduct because they require a means of creating random assignment to conditions, and this is frequently not possible in natural settings.

A second and perhaps more important limitation of experimental research designs is that some of the most interesting and important social variables cannot be experimentally manipulated. If we want to study the influence of the size of a mob on the destructiveness of its behavior, or to compare the personality characteristics of people who join suicide cults with those of people who do not join suicide cults, these relationships must be assessed using correlational designs because it is simply not possible to manipulate mob size or cult membership.

H5P: TEST YOUR LEARNING: CHAPTER 1 DRAG THE WORDS – INDEPENDENT AND DEPENDENT VARIABLES

Read through the following descriptions of experimental studies, and identify the independent and dependent variables in each scenario.

  • Amount of aggression:
  • Type of video game:
  • Size of group of onlookers
  • Speed of helping response
  • Amount of attitude change
  • Type of message
  • Hostile intention bias score
  • Type of word
  • Target of attribution
  • Type of attribution

Factorial Research Designs

Social psychological experiments are frequently designed to simultaneously study the effects of more than one independent variable on a dependent variable. Factorial research designs are experimental designs that have two or more independent variables . By using a factorial design, the scientist can study the influence of each variable on the dependent variable (known as the main effects of the variables) as well as how the variables work together to influence the dependent variable (known as the interaction between the variables). Factorial designs sometimes demonstrate the person by situation interaction.

In one such study, Brian Meier and his colleagues (Meier, Robinson, & Wilkowski, 2006) tested the hypothesis that exposure to aggression-related words would increase aggressive responses toward others. Although they did not directly manipulate the social context, they used a technique common in social psychology in which they primed (i.e., activated) thoughts relating to social settings. In their research, half of their participants were randomly assigned to see words relating to aggression and the other half were assigned to view neutral words that did not relate to aggression. The participants in the study also completed a measure of individual differences in agreeableness —a personality variable that assesses the extent to which people see themselves as compassionate, cooperative, and high on other-concern.

Then the research participants completed a task in which they thought they were competing with another student. Participants were told that they should press the space bar on the computer keyboard as soon as they heard a tone over their headphones, and the person who pressed the space bar the fastest would be the winner of the trial. Before the first trial, participants set the intensity of a blast of white noise that would be delivered to the loser of the trial. The participants could choose an intensity ranging from 0 (no noise) to the most aggressive response (10, or 105 decibels). In essence, participants controlled a “weapon” that could be used to blast the opponent with aversive noise, and this setting became the dependent variable. At this point, the experiment ended.

Figure 1.15: A Person-Situation Interaction.

As you can see in Figure 1.15, “A Person-Situation Interaction,” there was a person-by-situation interaction. Priming with aggression-related words (the situational variable) increased the noise levels selected by participants who were low on agreeableness, but priming did not increase aggression (in fact, it decreased it a bit) for students who were high on agreeableness. In this study, the social situation was important in creating aggression, but it had different effects for different people.
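The idea of an interaction can be illustrated by computing the mean of the dependent variable within each cell of the 2 × 2 design and then comparing the priming effect at each level of agreeableness. The sketch below uses invented noise-intensity values patterned after the pattern shown in Figure 1.15, not the actual Meier et al. data.

```python
# Hypothetical noise-intensity settings (0-10) for a 2 x 2 factorial design,
# patterned after the Meier et al. result but not their actual data.
data = {
    ("aggressive prime", "low agreeableness"):  [7, 8, 6, 7, 8],
    ("neutral prime",    "low agreeableness"):  [4, 5, 4, 5, 4],
    ("aggressive prime", "high agreeableness"): [3, 2, 3, 2, 3],
    ("neutral prime",    "high agreeableness"): [3, 4, 3, 3, 4],
}

# Cell means: one mean per combination of the two independent variables.
means = {cell: sum(v) / len(v) for cell, v in data.items()}
for cell, m in means.items():
    print(cell, round(m, 2))

# The interaction: the effect of priming (aggressive minus neutral), computed
# separately at each level of agreeableness. Unequal simple effects = interaction.
effect_low = (means[("aggressive prime", "low agreeableness")]
              - means[("neutral prime", "low agreeableness")])
effect_high = (means[("aggressive prime", "high agreeableness")]
               - means[("neutral prime", "high agreeableness")])
print(f"Priming effect (low agreeableness):  {effect_low:+.2f}")
print(f"Priming effect (high agreeableness): {effect_high:+.2f}")
```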

Deception in Social Psychology Experiments

You may have wondered whether the participants in the video game and word-priming studies that we just discussed were told about the research hypotheses ahead of time. In fact, both experiments used a cover story — a false statement of what the research was really about. The students in the video game study were not told that the study was about the effects of violent video games on aggression, but rather that it was an investigation of how people learn and develop skills at motor tasks like video games and how these skills affect other tasks, such as competitive games. The participants in the word-priming study were likewise not told what the research was really about. In some experiments, the researcher also makes use of an experimental confederate — a person who is actually part of the experimental team but who pretends to be another participant in the study. The confederate helps create the right “feel” of the study, making the cover story seem more real.

In many cases, it is not possible in social psychology experiments to tell the research participants about the real hypotheses in the study, and so cover stories or other types of deception may be used. You can imagine, for instance, that if a researcher wanted to study racial prejudice, he or she could not simply tell the participants that this was the topic of the research because people may not want to admit that they are prejudiced, even if they really are. Although the participants are always told—through the process of informed consent —as much as is possible about the study before the study begins, they may nevertheless sometimes be deceived to some extent. At the end of every research project, however, participants should always receive a complete debriefing in which all relevant information is given, including the real hypothesis, the nature of any deception used, and how the data are going to be used.

H5P: TEST YOUR LEARNING: CHAPTER 1 DRAG THE WORDS – TYPES OF RESEARCH DESIGN

Now that you have reviewed the three main types of research design used in social psychology, read each brief summary of empirical findings below and identify which type of design the results were derived from – experimental, observational or correlational. Table 1.4 contains some helpful information here.

  • There is a positive relationship between level of academic self-concept and self-esteem scores in university students.
  • People are more persuaded if given a two-sided versus a one-sided message.
  • People assigned to a group of four are more likely to conform to the dominant response in a perceptual task than people tasked with performing the task alone.
  • People in individualistic cultures make predominantly internal attributions about the causes of social behavior.
  • The more hours per month individuals spend doing voluntary work with people who are socially marginalized, the less they tend to believe in the just world hypothesis.
  • 13 year-olds engage in more acts of relational aggression towards their peers than 8 year-olds.

Interpreting Research

No matter how carefully it is conducted or what type of design is used, all research has limitations. Any given research project is conducted in only one setting and assesses only one or a few dependent variables. And any one study uses only one set of research participants. Social psychology research is sometimes criticized because it frequently uses university students from Western cultures as participants (Henrich, Heine, & Norenzayan, 2010). But relationships between variables are only really important if they can be expected to be found again when tested using other research designs, other operational definitions of the variables, other participants, and other experimenters, and in other times and settings.

External validity  refers to the extent to which relationships can be expected to hold up when they are tested again in different ways and for different people . Science relies primarily upon replication—that is, the repeating of research —to study the external validity of research findings. Sometimes the original research is replicated exactly, but more often, replications involve using new operational definitions of the independent or dependent variables, or designs in which new conditions or variables are added to the original design. And to test whether a finding is limited to the particular participants used in a given research project, scientists may test the same hypotheses using people from different ages, backgrounds, or cultures. Replication allows scientists to test the external validity as well as the limitations of research findings.

In some cases, researchers may test their hypotheses, not by conducting their own study, but rather by looking at the results of many existing studies, using a meta-analysis — a statistical procedure in which the results of existing studies are combined to determine what conclusions can be drawn on the basis of all the studies considered together . For instance, in one meta-analysis, Anderson and Bushman (2001) found that across all the studies they could locate that included both children and adults, college students and people who were not in college, and people from a variety of different cultures, there was a clear positive correlation (about r = .30) between playing violent video games and acting aggressively. The summary information gained through a meta-analysis allows researchers to draw even clearer conclusions about the external validity of a research finding.
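One common way to combine correlations across studies is to convert each r to Fisher's z, average the z values while weighting each study by its sample size, and then convert the result back to the r metric. The sketch below illustrates that logic in Python with invented study values; it is a simplified picture of what full meta-analytic software does.

```python
import math

# Hypothetical correlations (r) and sample sizes (n) from several studies.
studies = [(0.25, 120), (0.35, 80), (0.28, 200), (0.33, 150)]

# Fisher's r-to-z transformation makes correlations roughly normally distributed,
# so they can be averaged; each study is weighted by n - 3, the inverse of the
# sampling variance of z.
weighted_z = sum(math.atanh(r) * (n - 3) for r, n in studies)
total_weight = sum(n - 3 for _, n in studies)
mean_z = weighted_z / total_weight

# Transform the weighted average back to the r metric.
combined_r = math.tanh(mean_z)
print(f"Combined correlation across studies: r = {combined_r:.2f}")
```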

Figure 1.16 Some Important Aspects of the Scientific Approach

Scientists generate research hypotheses , which are tested using an observational, correlational, or experimental research design .

The variables of interest are measured using self-report or behavioral measures .

Data is interpreted according to its validity (including internal validity and external validity ).

The results of many studies may be combined and summarized using meta-analysis .

It is important to realize that the understanding of social behavior that we gain by conducting research is a slow, gradual, and cumulative process. The research findings of one scientist or one experiment do not stand alone—no one study proves a theory or a research hypothesis. Rather, research is designed to build on, add to, and expand the existing research that has been conducted by other scientists. That is why whenever a scientist decides to conduct research, he or she first reads journal articles and book chapters describing existing research in the domain and then designs his or her research on the basis of the prior findings. The result of this cumulative process is that over time, research findings are used to create a systematic set of knowledge about social psychology (Figure 1.16, “Some Important Aspects of the Scientific Approach”).

H5P: Test your Learning: Chapter 1 True or False Quiz

Try these true/false questions, to see how well you have retained some key ideas from this chapter!

  • Social psychology is a scientific discipline.
  • Cultural differences are rarely studied nowadays in social psychology because it has been established that all of its important concepts are universal.
  • In social psychology, the primary focus is on the behavior of groups, not individuals.
  • Factorial designs are a type of correlational research.
  • Nonrandom assignment of participants to conditions in experimental social psychological research ensures that everyone has an equal chance of being in any of the conditions.

Key Takeaways

  • Social psychologists study social behavior using an empirical approach. This allows them to discover results that could not have been reliably predicted ahead of time and that may violate our common sense and intuition.
  • The variables that form the research hypothesis, known as conceptual variables, are assessed by using measured variables such as self-report, behavioral, or neuroimaging measures.
  • Observational research is research that involves making observations of behavior and recording those observations in an objective manner. In some cases, it may be the only approach to studying behavior.
  • Correlational and experimental research designs are based on developing falsifiable research hypotheses.
  • Correlational research designs allow prediction but cannot be used to make statements about causality. Experimental research designs in which the independent variable is manipulated can be used to make statements about causality.
  • Social psychological experiments are frequently factorial research designs in which the effects of more than one independent variable on a dependent variable are studied.
  • All research has limitations, which is why scientists attempt to replicate their results using different measures, populations, and settings and to summarize those results using meta-analyses.

Exercises and Critical Thinking

  • Using Google Scholar, find journal articles that report observational, correlational, and experimental research designs. Specify the research design, the research hypothesis, and the conceptual and measured variables in each design.
  • Consider how you might create operational definitions for each of the following conceptual variables:
  • Liking another person
  • Life satisfaction
  • Visit the website Online Social Psychology Studies and take part in one of the online studies listed there.

Anderson, C. A., & Dill, K. E. (2000). Video games and aggressive thoughts, feelings, and behavior in the laboratory and in life.  Journal of Personality and Social Psychology, 78 (4), 772–790.

Bushman, B. J., & Huesmann, L. R. (2010). Aggression. In S. T. Fiske, D. T. Gilbert, & G. Lindzey (Eds.),  Handbook of social psychology  (5th ed., Vol. 2, pp. 833–863). Hoboken, NJ: John Wiley & Sons.

Eisenberger, N. I., Lieberman, M. D., & Williams, K. D. (2003). Does rejection hurt? An fMRI study of social exclusion.  Science, 302 (5643), 290–292.

Festinger, L., Riecken, H. W., & Schachter, S. (1956).  When prophecy fails: A social and psychological study of a modern group that predicted the destruction of the world . Minneapolis, MN: University of Minnesota Press.

Greene, J. D., Sommerville, R. B., Nystrom, L. E., Darley, J. M., & Cohen, J. D. (2001). An fMRI investigation of emotional engagement in moral judgment.  Science, 293 (5537), 2105–2108.

Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world?  Behavioral and Brain Sciences, 33 (2–3), 61–83.

Lieberman, M. D., Hariri, A., Jarcho, J. M., Eisenberger, N. I., & Bookheimer, S. Y. (2005). An fMRI investigation of race-related amygdala activity in African-American and Caucasian-American individuals.  Nature Neuroscience, 8 (6), 720–722.

Lilienfeld, S. O. (2011, June 13). Public skepticism of psychology: Why many people perceive the study of human behavior as unscientific.  American Psychologist.  doi: 10.1037/a0023963

Meier, B. P., Robinson, M. D., & Wilkowski, B. M. (2006). Turning the other cheek: Agreeableness and the regulation of aggression-related primes.  Psychological Science, 17 (2), 136–142.

Morewedge, C. K., Gray, K., & Wegner, D. M. (2010). Perish the forethought: Premeditation engenders misperceptions of personal control. In R. R. Hassin, K. N. Ochsner, & Y. Trope (Eds.),  Self-control in society, mind, and brain  (pp. 260–278). New York, NY: Oxford University Press.

Ochsner, K. N., Bunge, S. A., Gross, J. J., & Gabrieli, J. D. E. (2002). Rethinking feelings: An fMRI study of the cognitive regulation of emotion.  Journal of Cognitive Neuroscience, 14 (8), 1215–1229

Preston, J., & Wegner, D. M. (2007). The eureka error: Inadvertent plagiarism by misattributions of effort.  Journal of Personality and Social Psychology, 92 (4), 575–584.

Richeson, J. A., Baird, A. A., Gordon, H. L., Heatherton, T. F., Wyland, C. L., Trawalter, S., & Shelton, J. N. (2003). An fMRI investigation of the impact of interracial contact on executive function.  Nature Neuroscience, 6 (12), 1323–1328.

Media Attributions

  • “ EEG cap ” by Thuglas is licensed under a CC0 1.0 licence.
  • “ FMRI BOLD activation in an emotional Stroop task ” by Shima Ovaysikia, Khalid A. Tahir, Jason L. Chan and Joseph F. X. DeSouza is licensed under a CC BY 2.5 licence.
  • “ Varian4T ” by A314268 is licensed under a CC0 1.0 licence.

Glossary

Empirical: Based on the collection and systematic analysis of observable data.
Hindsight bias: The tendency to think that we could have predicted something that we probably would not have been able to predict.
Conceptual variables: Characteristics that we are trying to measure.
Operational definition: The particular method that we use to measure a variable of interest.
Self-report measures: Measures in which individuals are asked to respond to questions posed by an interviewer or on a questionnaire.
Behavioral measures: Measures designed to directly assess what people do.
Electroencephalography (EEG): A technique that records the electrical activity produced by the brain’s neurons through the use of electrodes that are placed around the research participant’s head.
Functional magnetic resonance imaging (fMRI): A neuroimaging technique that uses a magnetic field to create images of brain structure and function.
Observational research: Research that involves making observations of behavior and recording those observations in an objective manner.
Research hypothesis: A specific prediction about the relationship between the variables of interest and about the specific direction of that relationship.
Falsifiable: The outcome of the research can demonstrate empirically either that there is support for the hypothesis (i.e., the relationship between the variables was correctly specified) or that there is actually no relationship between the variables or that the actual relationship is not in the direction that was predicted.
Correlational research: Research designed to search for and test hypotheses about the relationships between two or more variables.
Pearson correlation coefficient (r): A statistic used to summarize the association, or correlation, between two variables.
Common-causal variables: Variables that are not part of the research hypothesis but that cause both the predictor and the outcome variable and thus produce the observed correlation between them.
Experimental research designs: Research designs that include the manipulation of a given situation or experience for two or more groups of individuals who are initially created to be equivalent, followed by a measurement of the effect of that experience.
Independent variable: The situation that is created by the experimenter through the experimental manipulations.
Dependent variable: The variable that is measured after the manipulations have occurred.
Random assignment to conditions: Determining separately for each participant which condition he or she will experience through a random process.
Internal validity: The extent to which changes in the dependent variable in an experiment can confidently be attributed to changes in the independent variable.
Field experiments: Experimental research studies that are conducted in a natural environment.
Factorial research designs: Experimental designs that have two or more independent variables.
Cover story: A false statement of what the research was really about.
Experimental confederate: A person who is actually part of the experimental team but who pretends to be another participant in the study.
External validity: The extent to which relationships can be expected to hold up when they are tested again in different ways and for different people.
Replication: The repeating of research, used to study the external validity of research findings.
Meta-analysis: A statistical procedure in which the results of existing studies are combined to determine what conclusions can be drawn on the basis of all the studies considered together.

Principles of Social Psychology - 1st International H5P Edition Copyright © 2022 by Dr. Rajiv Jhangiani and Dr. Hammond Tarry is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


We’ll discuss the empirical approach and review the findings of many research projects throughout this book, but for now let’s take a look at the basics of how scientists use research to draw overall conclusions about social behavior. Keep in mind as you read this book, however, that although social psychologists are pretty good at understanding the causes of behavior, our predictions are a long way from perfect. We are not able to control the minds or the behaviors of others or to predict exactly what they will do in any given situation. Human behavior is complicated because people are complicated and because the social situations that they find themselves in every day are also complex. It is this complexity—at least for me—that makes studying people so interesting and fun.

Measuring Affect, Behavior, and Cognition

One important aspect of using an empirical approach to understand social behavior is that the concepts of interest must be measured ( Figure 1.4 “The Operational Definition” ). If we are interested in learning how much Sarah likes Robert, then we need to have a measure of her liking for him. But how, exactly, should we measure the broad idea of “liking”? In scientific terms, the characteristics that we are trying to measure are known as conceptual variables , and the particular method that we use to measure a variable of interest is called an operational definition .

For anything that we might wish to measure, there are many different operational definitions, and which one we use depends on the goal of the research and the type of situation we are studying. To better understand this, let’s look at an example of how we might operationally define “Sarah likes Robert.”

Figure 1.4 The Operational Definition

The Operational Definition: Sarah Likes Robert. Either Sarah says,

An idea or conceptual variable (such as “how much Sarah likes Robert”) is turned into a measure through an operational definition.

One approach to measurement involves directly asking people about their perceptions using self-report measures. Self-report measures are measures in which individuals are asked to respond to questions posed by an interviewer or on a questionnaire . Generally, because any one question might be misunderstood or answered incorrectly, in order to provide a better measure, more than one question is asked and the responses to the questions are averaged together. For example, an operational definition of Sarah’s liking for Robert might involve asking her to complete the following measure:

I enjoy being around Robert.

Strongly disagree 1 2 3 4 5 6 Strongly agree

I get along well with Robert.

I like Robert.

The operational definition would be the average of her responses across the three questions. Because each question assesses the attitude differently, and yet each question should nevertheless measure Sarah’s attitude toward Robert in some way, the average of the three questions will generally be a better measure than would any one question on its own.

Although it is easy to ask many questions on self-report measures, these measures have a potential disadvantage. As we have seen, people’s insights into their own opinions and their own behaviors may not be perfect, and they might also not want to tell the truth—perhaps Sarah really likes Robert, but she is unwilling or unable to tell us so. Therefore, an alternative to self-report that can sometimes provide a more valid measure is to measure behavior itself. Behavioral measures are measures designed to directly assess what people do . Instead of asking Sara how much she likes Robert, we might instead measure her liking by assessing how much time she spends with Robert or by coding how much she smiles at him when she talks to him. Some examples of behavioral measures that have been used in social psychological research are shown in Table 1.3 “Examples of Operational Definitions of Conceptual Variables That Have Been Used in Social Psychological Research” .

Table 1.3 Examples of Operational Definitions of Conceptual Variables That Have Been Used in Social Psychological Research

Conceptual variable Operational definitions
Aggression • Number of presses of a button that administers shock to another student
• Number of seconds taken to honk the horn at the car ahead after a stoplight turns green
Interpersonal attraction • Number of times that a person looks at another person
• Number of millimeters of pupil dilation when one person looks at another
Altruism • Number of pieces of paper a person helps another pick up
• Number of hours of volunteering per week that a person engages in
Group decision-making skills • Number of groups able to correctly solve a group performance task
• Number of seconds in which a group correctly solves a problem
Prejudice • Number of negative words used in a creative story about another person
• Number of inches that a person places their chair away from another person

Social Neuroscience: Measuring Social Responses in the Brain

Still another approach to measuring our thoughts and feelings is to measure brain activity, and recent advances in brain science have created a wide variety of new techniques for doing so. One approach, known as electroencephalography (EEG) , is a technique that records the electrical activity produced by the brain’s neurons through the use of electrodes that are placed around the research participant’s head . An electroencephalogram (EEG) can show if a person is asleep, awake, or anesthetized because the brain wave patterns are known to differ during each state. An EEG can also track the waves that are produced when a person is reading, writing, and speaking with others. A particular advantage of the technique is that the participant can move around while the recordings are being taken, which is useful when measuring brain activity in children who often have difficulty keeping still. Furthermore, by following electrical impulses across the surface of the brain, researchers can observe changes over very fast time periods.

A woman wearing an EEG cap

This woman is wearing an EEG cap.

goocy – Research – CC BY-NC 2.0.

Although EEGs can provide information about the general patterns of electrical activity within the brain, and although they allow the researcher to see these changes quickly as they occur in real time, the electrodes must be placed on the surface of the skull, and each electrode measures brain waves from large areas of the brain. As a result, EEGs do not provide a very clear picture of the structure of the brain.

But techniques exist to provide more specific brain images. Functional magnetic resonance imaging (fMRI) is a neuroimaging technique that uses a magnetic field to create images of brain structure and function . In research studies that use the fMRI, the research participant lies on a bed within a large cylindrical structure containing a very strong magnet. Nerve cells in the brain that are active use more oxygen, and the need for oxygen increases blood flow to the area. The fMRI detects the amount of blood flow in each brain region and thus is an indicator of which parts of the brain are active.

Very clear and detailed pictures of brain structures (see Figure 1.5 “Functional Magnetic Resonance Imaging (fMRI)” ) can be produced via fMRI. Often, the images take the form of cross-sectional “slices” that are obtained as the magnetic field is passed across the brain. The images of these slices are taken repeatedly and are superimposed on images of the brain structure itself to show how activity changes in different brain structures over time. Normally, the research participant is asked to engage in tasks while in the scanner, for instance, to make judgments about pictures of people, to solve problems, or to make decisions about appropriate behaviors. The fMRI images show which parts of the brain are associated with which types of tasks. Another advantage of the fMRI is that is it noninvasive. The research participant simply enters the machine and the scans begin.

Figure 1.5 Functional Magnetic Resonance Imaging (fMRI)

an fMRI image and an MRI machine

The fMRI creates images of brain structure and activity. In this image, the red and yellow areas represent increased blood flow and thus increased activity.

Reigh LeBlanc – Reigh’s Brain rlwat – CC BY-NC 2.0; Wikimedia Commons – public domain.

Although the scanners themselves are expensive, the advantages of fMRIs are substantial, and scanners are now available in many university and hospital settings. The fMRI is now the most commonly used method of learning about brain structure, and it has been employed by social psychologists to study social cognition, attitudes, morality, emotions, responses to being rejected by others, and racial prejudice, to name just a few topics (Eisenberger, Lieberman, & Williams, 2003; Greene, Sommerville, Nystrom, Darley, & Cohen, 2001; Lieberman, Hariri, Jarcho, Eisenberger, & Bookheimer, 2005; Ochsner, Bunge, Gross, & Gabrieli, 2002; Richeson et al., 2003).

Observational Research

Once we have decided how to measure our variables, we can begin the process of research itself. As you can see in Table 1.4 “Three Major Research Designs Used by Social Psychologists” , there are three major approaches to conducting research that are used by social psychologists—the observational approach , the correlational approach , and the experimental approach . Each approach has some advantages and disadvantages.

Table 1.4 Three Major Research Designs Used by Social Psychologists

Research Design | Goal | Advantages | Disadvantages
Observational | To create a snapshot of the current state of affairs | Provides a relatively complete picture of what is occurring at a given time. Allows the development of questions for further study. | Does not assess relationships between variables.
Correlational | To assess the relationships between two or more variables | Allows the testing of expected relationships between variables and the making of predictions. Can assess these relationships in everyday life events. | Cannot be used to draw inferences about the causal relationships between the variables.
Experimental | To assess the causal impact of one or more experimental manipulations on a dependent variable | Allows the drawing of conclusions about the causal relationships among variables. | Cannot experimentally manipulate many important variables. May be expensive and take much time to conduct.

The most basic research design, observational research , is research that involves making observations of behavior and recording those observations in an objective manner . Although it is possible in some cases to use observational data to draw conclusions about the relationships between variables (e.g., by comparing the behaviors of older versus younger children on a playground), in many cases the observational approach is used only to get a picture of what is happening to a given set of people at a given time and how they are responding to the social situation. In these cases, the observational approach involves creating a type of “snapshot” of the current state of affairs.

One advantage of observational research is that in many cases it is the only possible approach to collecting data about the topic of interest. A researcher who is interested in studying the impact of a hurricane on the residents of New Orleans, the reactions of New Yorkers to a terrorist attack, or the activities of the members of a religious cult cannot create such situations in a laboratory but must be ready to make observations in a systematic way when such events occur on their own. Thus observational research allows the study of unique situations that could not be created by the researcher. Another advantage of observational research is that the people whose behavior is being measured are doing the things they do every day, and in some cases they may not even know that their behavior is being recorded.

One early observational study that made an important contribution to understanding human behavior was reported in a book by Leon Festinger and his colleagues (Festinger, Riecken, & Schachter, 1956). The book, called When Prophecy Fails , reported an observational study of the members of a “doomsday” cult. The cult members believed that they had received information, supposedly sent through “automatic writing” from a planet called “Clarion,” that the world was going to end. More specifically, the group members were convinced that the earth would be destroyed, as the result of a gigantic flood, sometime before dawn on December 21, 1954.

When Festinger learned about the cult, he thought that it would be an interesting way to study how individuals in groups communicate with each other to reinforce their extreme beliefs. He and his colleagues observed the members of the cult over a period of several months, beginning in July of the year in which the flood was expected. The researchers collected a variety of behavioral and self-report measures by observing the cult, recording the conversations among the group members, and conducting detailed interviews with them. Festinger and his colleagues also recorded the reactions of the cult members, beginning on December 21, when the world did not end as they had predicted. This observational research provided a wealth of information about the indoctrination patterns of cult members and their reactions to disconfirmed predictions. This research also helped Festinger develop his important theory of cognitive dissonance.

Despite their advantages, observational research designs also have some limitations. Most important, because the data that are collected in observational studies are only a description of the events that are occurring, they do not tell us anything about the relationship between different variables. However, it is exactly this question that correlational research and experimental research are designed to answer.

The Research Hypothesis

Because social psychologists are generally interested in looking at relationships among variables, they begin by stating their predictions in the form of a precise statement known as a research hypothesis. A research hypothesis is a statement about the relationship between the variables of interest and about the specific direction of that relationship. For instance, the research hypothesis “People who are more similar to each other will be more attracted to each other” predicts that there is a relationship between a variable called similarity and another variable called attraction. In the research hypothesis “The attitudes of cult members become more extreme when their beliefs are challenged,” the variables that are expected to be related are extremity of beliefs and the degree to which the cult’s beliefs are challenged.

Because the research hypothesis states both that there is a relationship between the variables and the direction of that relationship, it is said to be falsifiable . Being falsifiable means that the outcome of the research can demonstrate empirically either that there is support for the hypothesis (i.e., the relationship between the variables was correctly specified) or that there is actually no relationship between the variables or that the actual relationship is not in the direction that was predicted . Thus the research hypothesis that “people will be more attracted to others who are similar to them” is falsifiable because the research could show either that there was no relationship between similarity and attraction or that people we see as similar to us are seen as less attractive than those who are dissimilar.

Correlational Research

The goal of correlational research is to search for and test hypotheses about the relationships between two or more variables. In the simplest case, the correlation is between only two variables, such as that between similarity and liking, or between gender (male versus female) and helping.

In a correlational design, the research hypothesis is that there is an association (i.e., a correlation) between the variables that are being measured. For instance, many researchers have tested the research hypothesis that a positive correlation exists between the use of violent video games and the incidence of aggressive behavior, such that people who play violent video games more frequently would also display more aggressive behavior.

playing violent video games → aggressive behavior (or: aggressive behavior → playing violent video games)

A statistic known as the Pearson correlation coefficient (symbolized by the letter r ) is normally used to summarize the association, or correlation, between two variables. The correlation coefficient can range from −1 (indicating a very strong negative relationship between the variables) to +1 (indicating a very strong positive relationship between the variables). Research has found that there is a positive correlation between the use of violent video games and the incidence of aggressive behavior and that the size of the correlation is about r = .30 (Bushman & Huesmann, 2010).
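To make the r statistic concrete, here is a minimal sketch in Python. The data and variable names are invented purely for illustration; they are not taken from Bushman and Huesmann (2010) or any other cited study.

```python
# A minimal sketch: computing a Pearson correlation coefficient r for paired
# measurements. All numbers below are invented for illustration.
import numpy as np

hours_of_violent_games = np.array([0, 2, 1, 5, 3, 7, 4, 6])  # hypothetical predictor
aggression_score = np.array([1, 3, 2, 4, 2, 6, 3, 5])        # hypothetical outcome

r = np.corrcoef(hours_of_violent_games, aggression_score)[0, 1]
print(f"r = {r:.2f}")  # close to +1 or -1 = strong relationship; close to 0 = weak
```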

One advantage of correlational research designs is that, like observational research (and in comparison with experimental research designs in which the researcher frequently creates relatively artificial situations in a laboratory setting), they are often used to study people doing the things that they do every day. And correlational research designs also have the advantage of allowing prediction. When two or more variables are correlated, we can use our knowledge of a person’s score on one of the variables to predict his or her likely score on another variable. Because high-school grade point averages are correlated with college grade point averages, if we know a person’s high-school grade point average, we can predict his or her likely college grade point average. Similarly, if we know how many violent video games a child plays, we can predict how aggressively he or she will behave. These predictions will not be perfect, but they will allow us to make a better guess than we would have been able to if we had not known the person’s score on the first variable ahead of time.

Despite their advantages, correlational designs have a very important limitation. This limitation is that they cannot be used to draw conclusions about the causal relationships among the variables that have been measured. An observed correlation between two variables does not necessarily indicate that either one of the variables caused the other. Although many studies have found a correlation between the number of violent video games that people play and the amount of aggressive behaviors they engage in, this does not mean that viewing the video games necessarily caused the aggression. Although one possibility is that playing violent games increases aggression,

playing violent video games → aggressive behavior

another possibility is that the causal direction is exactly opposite to what has been hypothesized. Perhaps increased aggressiveness causes more interest in, and thus increased viewing of, violent games. Although this causal relationship might not seem as logical to you, there is no way to rule out the possibility of such reverse causation on the basis of the observed correlation.

aggressive behavior → playing violent video games

Still another possible explanation for the observed correlation is that it has been produced by the presence of another variable that was not measured in the research. Common-causal variables (also known as third variables) are variables that are not part of the research hypothesis but that cause both the predictor and the outcome variable and thus produce the observed correlation between them ( Figure 1.6 “Correlation and Causality” ). It has been observed that students who sit in the front of a large class get better grades than those who sit in the back of the class. Although this could be because sitting in the front causes the student to take better notes or to understand the material better, the relationship could also be due to a common-causal variable, such as the interest or motivation of the students to do well in the class. Because a student’s interest in the class leads him or her to both get better grades and sit nearer to the teacher, seating position and class grade are correlated, even though neither one caused the other.

Figure 1.6 Correlation and Causality

Where we sit in the class may correlate with our course grade; however, interest in the class, intelligence, and motivation to get good grades could also influence both.

The correlation between where we sit in a large class and our grade in the class is likely caused by the influence of one or more common-causal variables.

The possibility of common-causal variables must always be taken into account when considering correlational research designs. For instance, in a study that finds a correlation between playing violent video games and aggression, it is possible that a common-causal variable is producing the relationship. Some possibilities include the family background, diet, and hormone levels of the children. Any or all of these potential common-causal variables might be creating the observed correlation between playing violent video games and aggression. Higher levels of the male sex hormone testosterone, for instance, may cause children to both play more violent video games and behave more aggressively.

I like to think of common-causal variables in correlational research designs as “mystery” variables, since their presence and identity are usually unknown to the researcher because they have not been measured. Because it is not possible to measure every variable that could possibly cause both variables, it is always possible that there is an unknown common-causal variable. For this reason, we are left with the basic limitation of correlational research: Correlation does not imply causation.
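A short simulation can make the “mystery variable” problem concrete. In this sketch (all numbers invented), neither x nor y influences the other, yet they end up correlated because both are driven by an unmeasured common cause z:

```python
# Illustrative simulation of a common-causal ("third") variable.
# x never causes y and y never causes x, yet they correlate because
# both depend on the unmeasured variable z.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

z = rng.normal(size=n)            # unmeasured common cause (e.g., interest in the class)
x = 0.7 * z + rng.normal(size=n)  # e.g., sitting near the front of the room
y = 0.7 * z + rng.normal(size=n)  # e.g., course grade

print(round(np.corrcoef(x, y)[0, 1], 2))  # roughly .33, despite no causal link between x and y
```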

Experimental Research

The goal of much research in social psychology is to understand the causal relationships among variables, and for this we use experiments. Experimental research designs are research designs that include the manipulation of a given situation or experience for two or more groups of individuals who are initially created to be equivalent, followed by a measurement of the effect of that experience .

In an experimental research design, the variables of interest are called the independent variables and the dependent variables. The independent variable refers to the situation that is created by the experimenter through the experimental manipulations , and the dependent variable refers to the variable that is measured after the manipulations have occurred . In an experimental research design, the research hypothesis is that the manipulated independent variable (or variables) causes changes in the measured dependent variable (or variables). We can diagram the prediction like this, using an arrow that points in one direction to demonstrate the expected direction of causality:

viewing violence (independent variable) → aggressive behavior (dependent variable)

Consider an experiment conducted by Anderson and Dill (2000), which was designed to directly test the hypothesis that viewing violent video games would cause increased aggressive behavior. In this research, male and female undergraduates from Iowa State University were given a chance to play either a violent video game (Wolfenstein 3D) or a nonviolent video game (Myst). During the experimental session, the participants played the video game that they had been given for 15 minutes. Then, after the play, they participated in a competitive task with another student in which they had a chance to deliver blasts of white noise through the earphones of their opponent. The operational definition of the dependent variable (aggressive behavior) was the level and duration of noise delivered to the opponent. The design and the results of the experiment are shown in Figure 1.7 “An Experimental Research Design (After Anderson & Dill, 2000)” .

Figure 1.7 An Experimental Research Design (After Anderson & Dill, 2000)


Two advantages of the experimental research design are (a) an assurance that the independent variable (also known as the experimental manipulation) occurs prior to the measured dependent variable and (b) the creation of initial equivalence between the conditions of the experiment (in this case, by using random assignment to conditions).

Experimental designs have two very nice features. For one, they guarantee that the independent variable occurs prior to measuring the dependent variable. This eliminates the possibility of reverse causation. Second, the experimental manipulation allows ruling out the possibility of common-causal variables that cause both the independent variable and the dependent variable. In experimental designs, the influence of common-causal variables is controlled, and thus eliminated, by creating equivalence among the participants in each of the experimental conditions before the manipulation occurs.

The most common method of creating equivalence among the experimental conditions is through random assignment to conditions , which involves determining separately for each participant which condition he or she will experience through a random process, such as drawing numbers out of an envelope or using a website such as http://randomizer.org . Anderson and Dill first randomly assigned about 100 participants to each of their two groups. Let’s call them Group A and Group B. Because they used random assignment to conditions, they could be confident that before the experimental manipulation occurred , the students in Group A were, on average , equivalent to the students in Group B on every possible variable , including variables that are likely to be related to aggression, such as family, peers, hormone levels, and diet—and, in fact, everything else.
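As a rough illustration of the logic of random assignment, a few lines of Python are enough to split a sample into two groups that are equivalent on average. The participant IDs and group labels below are hypothetical, not Anderson and Dill’s actual materials.

```python
# Minimal sketch of random assignment to two conditions.
# Participant IDs are hypothetical; the seed is fixed only for reproducibility.
import random

participants = [f"P{i:03d}" for i in range(1, 201)]  # 200 hypothetical participants

random.seed(42)
random.shuffle(participants)

half = len(participants) // 2
group_a = participants[:half]  # e.g., assigned to play the violent video game
group_b = participants[half:]  # e.g., assigned to play the nonviolent video game

print(len(group_a), len(group_b))  # 100 100 -- groups differ only by chance before the manipulation
```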

Then, after they had created initial equivalence, Anderson and Dill created the experimental manipulation—they had the participants in Group A play the violent video game and the participants in Group B the nonviolent video game. Then they compared the dependent variable (the white noise blasts) between the two groups and found that the students who had viewed the violent video game gave significantly longer noise blasts than did the students who had played the nonviolent game. Because they had created initial equivalence between the groups, when the researchers observed differences in the duration of white noise blasts between the two groups after the experimental manipulation, they could draw the conclusion that it was the independent variable (and not some other variable) that caused these differences. The idea is that the only thing that was different between the students in the two groups was which video game they had played.

When we create a situation in which the groups of participants are expected to be equivalent before the experiment begins, when we manipulate the independent variable before we measure the dependent variable, and when we change only the nature of independent variables between the conditions, then we can be confident that it is the independent variable that caused the differences in the dependent variable. Such experiments are said to have high internal validity , where internal validity refers to the confidence with which we can draw conclusions about the causal relationship between the variables .

Despite the advantage of determining causation, experimental research designs do have limitations. One is that the experiments are usually conducted in laboratory situations rather than in the everyday lives of people. Therefore, we do not know whether results that we find in a laboratory setting will necessarily hold up in everyday life. To counter this, in some cases experiments are conducted in everyday settings—for instance, in schools or other organizations . Such field experiments are difficult to conduct because they require a means of creating random assignment to conditions, and this is frequently not possible in natural settings.

A second and perhaps more important limitation of experimental research designs is that some of the most interesting and important social variables cannot be experimentally manipulated. If we want to study the influence of the size of a mob on the destructiveness of its behavior, or to compare the personality characteristics of people who join suicide cults with those of people who do not join suicide cults, these relationships must be assessed using correlational designs because it is simply not possible to manipulate mob size or cult membership.

Factorial Research Designs

Social psychological experiments are frequently designed to simultaneously study the effects of more than one independent variable on a dependent variable. Factorial research designs are experimental designs that have two or more independent variables . By using a factorial design, the scientist can study the influence of each variable on the dependent variable (known as the main effects of the variables) as well as how the variables work together to influence the dependent variable (known as the interaction between the variables). Factorial designs sometimes demonstrate the person by situation interaction.

In one such study, Brian Meier and his colleagues (Meier, Robinson, & Wilkowski, 2006) tested the hypothesis that exposure to aggression-related words would increase aggressive responses toward others. Although they did not directly manipulate the social context, they used a technique common in social psychology in which they primed (i.e., activated) thoughts relating to social settings. In their research, half of their participants were randomly assigned to see words relating to aggression and the other half were assigned to view neutral words that did not relate to aggression. The participants in the study also completed a measure of individual differences in agreeableness —a personality variable that assesses the extent to which the person sees themselves as compassionate, cooperative, and high on other-concern.

Then the research participants completed a task in which they thought they were competing with another student. Participants were told that they should press the space bar on the computer as soon as they heard a tone over their headphones, and the person who pressed the button the fastest would be the winner of the trial. Before the first trial, participants set the intensity of a blast of white noise that would be delivered to the loser of the trial. The participants could choose an intensity ranging from 0 (no noise) to the most aggressive response (10, or 105 decibels). In essence, participants controlled a “weapon” that could be used to blast the opponent with aversive noise, and this setting became the dependent variable. At this point, the experiment ended.

Figure 1.8 A Person-Situation Interaction

In this experiment by Meier, Robinson, and Wilkowski (2006) the independent variables are type of priming (aggression or neutral) and participant agreeableness (high or low). The dependent variable is the white noise level selected (a measure of aggression). The participants who were low in agreeableness became significantly more aggressive after seeing aggressive words, but those high in agreeableness did not.


As you can see in Figure 1.8 “A Person-Situation Interaction” , there was a person by situation interaction. Priming with aggression-related words (the situational variable) increased the noise levels selected by participants who were low on agreeableness, but priming did not increase aggression (in fact, it decreased it a bit) for students who were high on agreeableness. In this study, the social situation was important in creating aggression, but it had different effects for different people.
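To see how main effects and an interaction are read from a factorial design, here is a sketch with invented cell means for a 2 × 2 layout like the one above (priming × agreeableness). The numbers are made up for illustration and are not Meier, Robinson, and Wilkowski’s data.

```python
# Invented cell means for a 2 x 2 factorial design: rows = agreeableness (low, high),
# columns = priming condition (neutral, aggressive). Numbers are illustrative only.
import numpy as np

means = np.array([[4.0, 6.5],   # low agreeableness: neutral, aggressive
                  [3.5, 3.0]])  # high agreeableness: neutral, aggressive

col_means = means.mean(axis=0)  # average over agreeableness
row_means = means.mean(axis=1)  # average over priming

main_effect_priming = col_means[1] - col_means[0]
main_effect_agreeableness = row_means[1] - row_means[0]
interaction = (means[0, 1] - means[0, 0]) - (means[1, 1] - means[1, 0])

print(main_effect_priming)        # 1.0: aggressive priming raises aggression on average
print(main_effect_agreeableness)  # -2.0: high-agreeableness participants are less aggressive overall
print(interaction)                # 3.0: priming increases aggression only for low-agreeableness participants
```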

Deception in Social Psychology Experiments

You may have wondered whether the participants in the video game study and the priming study that we just discussed were told about the research hypothesis ahead of time. In fact, these experiments both used a cover story — a false statement of what the research was really about . The students in the video game study were not told that the study was about the effects of violent video games on aggression, but rather that it was an investigation of how people learn and develop skills at motor tasks like video games and how these skills affect other tasks, such as competitive games. Likewise, the participants in the priming study were not told that the research was about the effects of aggression-related words on their behavior. In some experiments, the researcher also makes use of an experimental confederate — a person who is actually part of the experimental team but who pretends to be another participant in the study . The confederate helps create the right “feel” of the study, making the cover story seem more real.

In many cases, it is not possible in social psychology experiments to tell the research participants about the real hypotheses in the study, and so cover stories or other types of deception may be used. You can imagine, for instance, that if a researcher wanted to study racial prejudice, he or she could not simply tell the participants that this was the topic of the research because people may not want to admit that they are prejudiced, even if they really are. Although the participants are always told—through the process of informed consent —as much as is possible about the study before the study begins, they may nevertheless sometimes be deceived to some extent. At the end of every research project, however, participants should always receive a complete debriefing in which all relevant information is given, including the real hypothesis, the nature of any deception used, and how the data are going to be used.

Interpreting Research

No matter how carefully it is conducted or what type of design is used, all research has limitations. Any given research project is conducted in only one setting and assesses only one or a few dependent variables. And any one study uses only one set of research participants. Social psychology research is sometimes criticized because it frequently uses college students from Western cultures as participants (Henrich, Heine, & Norenzayan, 2010). But relationships between variables are only really important if they can be expected to be found again when tested using other research designs, other operational definitions of the variables, other participants, and other experimenters, and in other times and settings.

External validity refers to the extent to which relationships can be expected to hold up when they are tested again in different ways and for different people . Science relies primarily upon replication —that is, the repeating of research —to study the external validity of research findings. Sometimes the original research is replicated exactly, but more often, replications involve using new operational definitions of the independent or dependent variables, or designs in which new conditions or variables are added to the original design. And to test whether a finding is limited to the particular participants used in a given research project, scientists may test the same hypotheses using people from different ages, backgrounds, or cultures. Replication allows scientists to test the external validity as well as the limitations of research findings.

In some cases, researchers may test their hypotheses, not by conducting their own study, but rather by looking at the results of many existing studies, using a meta-analysis — a statistical procedure in which the results of existing studies are combined to determine what conclusions can be drawn on the basis of all the studies considered together . For instance, in one meta-analysis, Anderson and Bushman (2001) found that across all the studies they could locate that included both children and adults, college students and people who were not in college, and people from a variety of different cultures, there was a clear positive correlation (about r = .30) between playing violent video games and acting aggressively. The summary information gained through a meta-analysis allows researchers to draw even clearer conclusions about the external validity of a research finding.
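One common way to combine correlations across studies is to average Fisher z-transformed values weighted by sample size. The sketch below shows that textbook procedure with invented study results; it is not a reproduction of Anderson and Bushman’s (2001) analysis.

```python
# Simple fixed-effect meta-analysis of correlation coefficients via Fisher's z.
# The per-study correlations (r) and sample sizes (n) are invented for illustration.
import numpy as np

r = np.array([0.25, 0.35, 0.30, 0.20, 0.40])  # hypothetical study correlations
n = np.array([120, 80, 200, 150, 60])         # hypothetical study sample sizes

z = np.arctanh(r)                    # Fisher r-to-z transformation
weights = n - 3                      # inverse-variance weights for z
z_mean = np.sum(weights * z) / np.sum(weights)

r_combined = np.tanh(z_mean)         # back-transform to the correlation scale
print(round(float(r_combined), 2))   # a single summary estimate across the studies
```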

Figure 1.9 Some Important Aspects of the Scientific Approach

Scientists generate research hypotheses, which are tested using an observational, correlational, or experimental research design. The variables of interest are measured using self-report or behavioral measures. Data is interpreted according to its validity (including internal validity and external validity). The results of many studies may be combined and summarized using meta-analysis.

It is important to realize that the understanding of social behavior that we gain by conducting research is a slow, gradual, and cumulative process. The research findings of one scientist or one experiment do not stand alone—no one study “proves” a theory or a research hypothesis. Rather, research is designed to build on, add to, and expand the existing research that has been conducted by other scientists. That is why whenever a scientist decides to conduct research, he or she first reads journal articles and book chapters describing existing research in the domain and then designs his or her research on the basis of the prior findings. The result of this cumulative process is that over time, research findings are used to create a systematic set of knowledge about social psychology ( Figure 1.9 “Some Important Aspects of the Scientific Approach” ).

Key Takeaways

  • Social psychologists study social behavior using an empirical approach. This allows them to discover results that could not have been reliably predicted ahead of time and that may violate our common sense and intuition.
  • The variables that form the research hypothesis, known as conceptual variables, are assessed using measured variables by using, for instance, self-report, behavioral, or neuroimaging measures.
  • Observational research is research that involves making observations of behavior and recording those observations in an objective manner. In some cases, it may be the only approach to studying behavior.
  • Correlational and experimental research designs are based on developing falsifiable research hypotheses.
  • Correlational research designs allow prediction but cannot be used to make statements about causality. Experimental research designs in which the independent variable is manipulated can be used to make statements about causality.
  • Social psychological experiments are frequently factorial research designs in which the effects of more than one independent variable on a dependent variable are studied.
  • All research has limitations, which is why scientists attempt to replicate their results using different measures, populations, and settings and to summarize those results using meta-analyses.

Exercises and Critical Thinking

1. Find journal articles that report observational, correlational, and experimental research designs. Specify the research design, the research hypothesis, and the conceptual and measured variables in each design.

2. Consider each of the following variables. For each one, (a) propose a research hypothesis in which the variable serves as an independent variable and (b) propose a research hypothesis in which the variable serves as a dependent variable.

  • Liking another person
  • Life satisfaction

Anderson, C. A., & Bushman, B. J. (2001). Effects of violent video games on aggressive behavior, aggressive cognition, aggressive affect, physiological arousal, and prosocial behavior: A meta-analytic review of the scientific literature. Psychological Science, 12 (5), 353–359.

Anderson, C. A., & Dill, K. E. (2000). Video games and aggressive thoughts, feelings, and behavior in the laboratory and in life. Journal of Personality and Social Psychology, 78 (4), 772–790.

Bushman, B. J., & Huesmann, L. R. (2010). Aggression. In S. T. Fiske, D. T. Gilbert, & G. Lindzey (Eds.), Handbook of social psychology (5th ed., Vol. 2, pp. 833–863). Hoboken, NJ: John Wiley & Sons.

Eisenberger, N. I., Lieberman, M. D., & Williams, K. D. (2003). Does rejection hurt? An fMRI study of social exclusion. Science, 302 (5643), 290–292.

Festinger, L., Riecken, H. W., & Schachter, S. (1956). When prophecy fails: A social and psychological study of a modern group that predicted the destruction of the world . Minneapolis, MN: University of Minnesota Press.

Greene, J. D., Sommerville, R. B., Nystrom, L. E., Darley, J. M., & Cohen, J. D. (2001). An fMRI investigation of emotional engagement in moral judgment. Science, 293 (5537), 2105–2108.

Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33 (2–3), 61–83.

Lieberman, M. D., Hariri, A., Jarcho, J. M., Eisenberger, N. I., & Bookheimer, S. Y. (2005). An fMRI investigation of race-related amygdala activity in African-American and Caucasian-American individuals. Nature Neuroscience, 8 (6), 720–722.

Lilienfeld, S. O. (2011, June 13). Public skepticism of psychology: Why many people perceive the study of human behavior as unscientific. American Psychologist. doi: 10.1037/a0023963

Meier, B. P., Robinson, M. D., & Wilkowski, B. M. (2006). Turning the other cheek: Agreeableness and the regulation of aggression-related primes. Psychological Science, 17 (2), 136–142.

Morewedge, C. K., Gray, K., & Wegner, D. M. (2010). Perish the forethought: Premeditation engenders misperceptions of personal control. In R. R. Hassin, K. N. Ochsner, & Y. Trope (Eds.), Self-control in society, mind, and brain (pp. 260–278). New York, NY: Oxford University Press.

Ochsner, K. N., Bunge, S. A., Gross, J. J., & Gabrieli, J. D. E. (2002). Rethinking feelings: An fMRI study of the cognitive regulation of emotion. Journal of Cognitive Neuroscience, 14 (8), 1215–1229.

Preston, J., & Wegner, D. M. (2007). The eureka error: Inadvertent plagiarism by misattributions of effort. Journal of Personality and Social Psychology, 92 (4), 575–584.

Richeson, J. A., Baird, A. A., Gordon, H. L., Heatherton, T. F., Wyland, C. L., Trawalter, S., & Shelton, J. N. (2003). An fMRI investigation of the impact of interracial contact on executive function. Nature Neuroscience, 6 (12), 1323–1328.

Principles of Social Psychology Copyright © 2015 by University of Minnesota is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


Social Science Experiments: A Hands-on Introduction


Description

This book is designed for an undergraduate, one-semester course in experimental research, primarily targeting programs in sociology, political science, environmental studies, psychology, and communications. Aimed at those with limited technical background, this introduction to social science experiments takes a practical, hands-on approach. After explaining key features of experimental designs, Green takes students through exercises designed to build appreciation for the nuances of design, implementation, analysis, and interpretation. Using applications and statistical examples from many social science fields, the textbook illustrates…


Key features

  • The book combines lively examples and hands-on exercises to build interest and engagement
  • The book is written with the first-time R user in mind. Very basic examples are presented and reinforced with exercises; solutions are provided
  • The book is concise and can be used as a standalone text for a class on experimental design or for a three-week module in an introductory statistics or research methods class

Keywords: experiments, causal inference, credibility revolution, field experiment, survey experiment, lab experiment, natural experiment, research ethics, introduction to statistics

About the book

  • DOI: https://doi.org/10.1017/9781009186957
  • Subjects: American Government, Politics and Policy, American Studies, Area Studies, Politics and International Relations
  • Publication dates: 29 September 2022; 08 September 2022
  • ISBNs: 9781009186964 (254 x 177 mm, 0.37 kg, 162 pages); 9781009186971 (0.5 kg, 160 pages); 9781009186957


Donald P. Green is Burgess Professor of Political Science at Columbia University. His path-breaking research uses experiments to study topics such as voting, prejudice, mass media, and gender-based violence.



Guide to Experimental Design | Overview, 5 steps & Examples

Published on December 3, 2019 by Rebecca Bevans . Revised on June 21, 2023.

Experiments are used to study causal relationships . You manipulate one or more independent variables and measure their effect on one or more dependent variables.

Experimental design creates a set of procedures to systematically test a hypothesis . A good experimental design requires a strong understanding of the system you are studying.

There are five key steps in designing an experiment:

  • Consider your variables and how they are related
  • Write a specific, testable hypothesis
  • Design experimental treatments to manipulate your independent variable
  • Assign subjects to groups, either between-subjects or within-subjects
  • Plan how you will measure your dependent variable

For valid conclusions, you also need to select a representative sample and control any extraneous variables that might influence your results; doing so minimizes several types of research bias, particularly sampling bias , survivorship bias , and attrition bias . If random assignment of participants to control and treatment groups is impossible, unethical, or highly difficult, consider an observational study instead.

Table of contents

  • Step 1: Define your variables
  • Step 2: Write your hypothesis
  • Step 3: Design your experimental treatments
  • Step 4: Assign your subjects to treatment groups
  • Step 5: Measure your dependent variable
  • Frequently asked questions about experiments

You should begin with a specific research question . We will work with two research question examples, one from health sciences and one from ecology:

To translate your research question into an experimental hypothesis, you need to define the main variables and make predictions about how they are related.

Start by simply listing the independent and dependent variables .

Research question | Independent variable | Dependent variable
Phone use and sleep | Minutes of phone use before sleep | Hours of sleep per night
Temperature and soil respiration | Air temperature just above the soil surface | CO2 respired from soil

Then you need to think about possible extraneous and confounding variables and consider how you might control  them in your experiment.

Research question | Extraneous variable | How to control
Phone use and sleep | Natural variation in sleep patterns among individuals | Measure the average difference between sleep with phone use and sleep without phone use rather than the average amount of sleep per treatment group.
Temperature and soil respiration | Soil moisture also affects respiration, and moisture can decrease with increasing temperature | Monitor soil moisture and add water to make sure that soil moisture is consistent across all treatment plots.

Finally, you can put these variables together into a diagram. Use arrows to show the possible relationships between variables and include signs to show the expected direction of the relationships.

Diagram of the relationship between variables in a sleep experiment

Here we predict that increasing temperature will increase soil respiration and decrease soil moisture, while decreasing soil moisture will lead to decreased soil respiration.


Now that you have a strong conceptual understanding of the system you are studying, you should be able to write a specific, testable hypothesis that addresses your research question.

Research question | Null hypothesis (H0) | Alternate hypothesis (Ha)
Phone use and sleep | Phone use before sleep does not correlate with the amount of sleep a person gets. | Increasing phone use before sleep leads to a decrease in sleep.
Temperature and soil respiration | Air temperature does not correlate with soil respiration. | Increased air temperature leads to increased soil respiration.

The next steps will describe how to design a controlled experiment . In a controlled experiment, you must be able to:

  • Systematically and precisely manipulate the independent variable(s).
  • Precisely measure the dependent variable(s).
  • Control any potential confounding variables.

If your study system doesn’t match these criteria, there are other types of research you can use to answer your research question.

How you manipulate the independent variable can affect the experiment’s external validity – that is, the extent to which the results can be generalized and applied to the broader world.

First, you may need to decide how widely to vary your independent variable. In the soil respiration example, for instance, you could increase air temperature:

  • just slightly above the natural range for your study region;
  • over a wider range of temperatures to mimic future warming; or
  • over an extreme range that is beyond any possible natural variation.

Second, you may need to choose how finely to vary your independent variable. Sometimes this choice is made for you by your experimental system, but often you will need to decide, and this will affect how much you can infer from your results. In the phone use example, for instance, you could treat phone use before sleep as:

  • a categorical variable : either binary (yes/no) or as levels of a factor (no phone use, low phone use, high phone use); or
  • a continuous variable (minutes of phone use measured every night).

How you apply your experimental treatments to your test subjects is crucial for obtaining valid and reliable results.

First, you need to consider the study size : how many individuals will be included in the experiment? In general, the more subjects you include, the greater your experiment’s statistical power , which determines how much confidence you can have in your results.
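To give a feel for how study size relates to statistical power, the sketch below uses the statsmodels library to estimate how many subjects per group a simple two-group comparison would need. The effect size (Cohen's d = 0.5) and alpha level are illustrative assumptions, not recommendations.

```python
# Rough sketch: required sample size per group rises as the desired power rises.
# Assumes a two-sample t-test, Cohen's d = 0.5, alpha = .05 (illustrative values).
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for power in (0.5, 0.8, 0.9):
    n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=power)
    print(f"power = {power:.1f} -> about {n_per_group:.0f} subjects per group")
```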

Then you need to randomly assign your subjects to treatment groups . Each group receives a different level of the treatment (e.g. no phone use, low phone use, high phone use).

You should also include a control group , which receives no treatment. The control group tells us what would have happened to your test subjects without any experimental intervention.

When assigning your subjects to groups, there are two main choices you need to make:

  • A completely randomized design vs a randomized block design .
  • A between-subjects design vs a within-subjects design .

Randomization

An experiment can be completely randomized or randomized within blocks (aka strata):

  • In a completely randomized design , every subject is assigned to a treatment group at random.
  • In a randomized block design (aka stratified random design), subjects are first grouped according to a characteristic they share, and then randomly assigned to treatments within those groups.

Research question | Completely randomized design | Randomized block design
Phone use and sleep | Subjects are all randomly assigned a level of phone use using a random number generator. | Subjects are first grouped by age, and then phone use treatments are randomly assigned within these groups.
Temperature and soil respiration | Warming treatments are assigned to soil plots at random by using a number generator to generate map coordinates within the study area. | Soils are first grouped by average rainfall, and then treatment plots are randomly assigned within these groups.

Sometimes randomization isn’t practical or ethical , so researchers create partially-random or even non-random designs. An experimental design where treatments aren’t randomly assigned is called a quasi-experimental design .
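The two randomization schemes in the table above can be sketched in a few lines of Python. Everything here is hypothetical: the subject ages, the treatment labels from the phone-use example, and the choice of age as the blocking variable.

```python
# Sketch: completely randomized assignment vs. a randomized block design
# (blocking on age group, as in the phone-use example). All data are invented.
import random
from collections import defaultdict

random.seed(1)
treatments = ["no phone use", "low phone use", "high phone use"]
subjects = [{"id": i, "age_group": random.choice(["18-30", "31-50", "51+"])}
            for i in range(30)]

# Completely randomized design: every subject gets a treatment at random.
completely_random = {s["id"]: random.choice(treatments) for s in subjects}

# Randomized block design: group subjects by age, then randomize within each block
# so every block contains (roughly) all treatment levels.
blocks = defaultdict(list)
for s in subjects:
    blocks[s["age_group"]].append(s["id"])

block_assignment = {}
for age_group, ids in blocks.items():
    random.shuffle(ids)  # random order within the block
    for position, subject_id in enumerate(ids):
        block_assignment[subject_id] = treatments[position % len(treatments)]

print(completely_random[0], block_assignment[0])  # assignment for subject 0 under each scheme
```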

Between-subjects vs. within-subjects

In a between-subjects design (also known as an independent measures design or classic ANOVA design), individuals receive only one of the possible levels of an experimental treatment.

In medical or social research, you might also use matched pairs within your between-subjects design to make sure that each treatment group contains the same variety of test subjects in the same proportions.

In a within-subjects design (also known as a repeated measures design), every individual receives each of the experimental treatments consecutively, and their responses to each treatment are measured.

Within-subjects or repeated measures can also refer to an experimental design where an effect emerges over time, and individual responses are measured over time in order to measure this effect as it emerges.

Counterbalancing (randomizing or reversing the order of treatments among subjects) is often used in within-subjects designs to ensure that the order of treatment application doesn’t influence the results of the experiment.

Research question | Between-subjects (independent measures) design | Within-subjects (repeated measures) design
Phone use and sleep | Subjects are randomly assigned a level of phone use (none, low, or high) and follow that level of phone use throughout the experiment. | Subjects are assigned consecutively to zero, low, and high levels of phone use throughout the experiment, and the order in which they follow these treatments is randomized.
Temperature and soil respiration | Warming treatments are assigned to soil plots at random and the soils are kept at this temperature throughout the experiment. | Every plot receives each warming treatment (1, 3, 5, 8, and 10°C above ambient temperatures) consecutively over the course of the experiment, and the order in which they receive these treatments is randomized.
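The counterbalancing idea above can be illustrated with a short sketch: every (hypothetical) subject receives all three phone-use treatments, but in a randomly shuffled order, so that order effects do not line up with any particular treatment.

```python
# Sketch of counterbalancing in a within-subjects design.
# Subject IDs, treatment labels, and the fixed seed are illustrative only.
import random

random.seed(7)
treatments = ["no phone use", "low phone use", "high phone use"]

orders = {}
for subject_id in range(1, 11):          # 10 hypothetical subjects
    order = treatments[:]                # each subject gets every treatment...
    random.shuffle(order)                # ...but in a randomized order
    orders[subject_id] = order

for subject_id, order in orders.items():
    print(subject_id, " -> ".join(order))
```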


Finally, you need to decide how you’ll collect data on your dependent variable outcomes. You should aim for reliable and valid measurements that minimize research bias or error.

Some variables, like temperature, can be objectively measured with scientific instruments. Others may need to be operationalized to turn them into measurable observations. In the sleep example, for instance, you could:

  • ask participants to record what time they go to sleep and get up each day; or
  • ask participants to wear a sleep tracker.

How precisely you measure your dependent variable also affects the kinds of statistical analysis you can use on your data.

Experiments are always context-dependent, and a good experimental design will take into account all of the unique considerations of your study system to produce information that is both valid and relevant to your research question.


Experimental design means planning a set of procedures to investigate a relationship between variables . To design a controlled experiment, you need:

  • A testable hypothesis
  • At least one independent variable that can be precisely manipulated
  • At least one dependent variable that can be precisely measured

When designing the experiment, you decide:

  • How you will manipulate the variable(s)
  • How you will control for any potential confounding variables
  • How many subjects or samples will be included in the study
  • How subjects will be assigned to treatment levels

Experimental design is essential to the internal and external validity of your experiment.

The key difference between observational studies and experimental designs is that a well-done observational study does not influence the responses of participants, while experiments do have some sort of treatment condition applied to at least some participants by random assignment .

A confounding variable , also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.

A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.

In your research design , it’s important to identify potential confounding variables and plan how you will reduce their impact.

In a between-subjects design , every participant experiences only one condition, and researchers assess group differences between participants in various conditions.

In a within-subjects design , each participant experiences all conditions, and researchers test the same participants repeatedly for differences between conditions.

The word “between” means that you’re comparing different conditions between groups, while the word “within” means you’re comparing different conditions within the same group.

An experimental group, also known as a treatment group, receives the treatment whose effect researchers wish to study, whereas a control group does not. They should be identical in all other ways.

Cite this Scribbr article


Bevans, R. (2023, June 21). Guide to Experimental Design | Overview, 5 steps & Examples. Scribbr. Retrieved September 22, 2024, from https://www.scribbr.com/methodology/experimental-design/



The Most Famous Social Psychology Experiments Ever Performed

Social experiments often seek to answer questions about how people behave in groups or how the presence of others impacts individual behavior. Over the years, social psychologists have explored these questions by conducting experiments .

The results of some of the most famous social psychology experiments remain relevant (and often quite controversial) today. Such experiments give us valuable information about human behavior and how group influence can impact our actions in social situations.

At a Glance

Some of the most famous social psychology experiments include Asch's conformity experiments, Bandura's Bobo doll experiments, the Stanford prison experiment, and Milgram's obedience experiments. Some of these studies are quite controversial for various reasons, including how they were conducted, serious ethical concerns, and what their results suggested.

The Asch Conformity Experiments

What do you do when you know you're right but the rest of the group disagrees with you? Do you bow to group pressure?

In a series of famous experiments conducted during the 1950s, psychologist Solomon Asch demonstrated that people would give the wrong answer on a test to fit in with the rest of the group.

In Asch's famous conformity experiments , people were shown a line and then asked to select a line of a matching length from a group of three. Asch also placed confederates in the group who would intentionally choose the wrong lines.

The results revealed that when other people picked the wrong line, participants were likely to conform and give the same answers as the rest of the group.

What the Results Revealed

While we might like to believe that we would resist group pressure (especially when we know the group is wrong), Asch's results revealed that people are surprisingly susceptible to conformity .

Not only did Asch's experiment teach us a great deal about the power of conformity, but it also inspired a whole host of additional research on how people conform and obey, including Milgram's infamous obedience experiments.

The Bobo Doll Experiment

Does watching violence on television cause children to behave more aggressively? In a series of experiments conducted during the early 1960s, psychologist Albert Bandura set out to investigate the impact of observed aggression on children's behavior.

In his Bobo doll experiments , children would watch an adult interacting with a Bobo doll. In one condition, the adult model behaved passively toward the doll, but in another, the adult would kick, punch, strike, and yell at the doll.

The results revealed that children who watched the adult model behave violently toward the doll were likelier to imitate the aggressive behavior later on.​

The Impact of Bandura's Social Psychology Experiment

The debate over the degree to which violence on television, movies, gaming, and other media influences children's behavior continues to rage on today, so it perhaps comes as no surprise that Bandura's findings are still so relevant.

The experiment has also helped inspire hundreds of additional studies exploring the impacts of observed aggression and violence.

The Stanford Prison Experiment

During the early 1970s, Philip Zimbardo set up a fake prison in the basement of the Stanford Psychology Department, recruited participants to play prisoners and guards, and played the role of the prison warden.

The experiment was designed to look at the effect that a prison environment would have on behavior, but it quickly became one of the most famous and controversial experiments of all time.

Results of the Stanford Prison Experiment

The Stanford prison experiment was initially slated to last a full two weeks. It ended after just six days. Why? Because the participants became so enmeshed in their assumed roles, the guards became almost sadistically abusive, and the prisoners became anxious, depressed, and emotionally disturbed.

While the Stanford prison experiment was designed to look at prison behavior, it has since become an emblem of how powerfully people are influenced by situations.  

Ethical Concerns

Part of the notoriety stems from the study's treatment of the participants. The subjects were placed in a situation that created considerable psychological distress. So much so that the study had to be halted less than halfway through the experiment.

The study has long been upheld as an example of how people yield to the situation, but critics have suggested that the participants' behavior may have been unduly influenced by Zimbardo himself in his capacity as the mock prison's "warden."  

Recent Criticisms

The Stanford prison experiment has long been controversial due to the serious ethical concerns of the research, but more recent evidence casts serious doubts on the study's scientific merits.

An examination of study records indicates that participants faked their behavior either to get out of the experiment or to "help" prove the researcher's hypothesis. The experimenters also appear to have actively encouraged the guards' abusive behavior.

The Milgram Experiments

Following the trial of Adolf Eichmann for war crimes committed during World War II, psychologist Stanley Milgram wanted to better understand why people obey. "Could it be that Eichmann and his million accomplices in the Holocaust were just following orders? Could we call them all accomplices?" Milgram wondered.

The results of Milgram's controversial obedience experiments were astonishing and continue to be both thought-provoking and controversial today.

What the Social Psychology Experiment Involved

The study involved ordering participants to deliver increasingly painful shocks to another person. While the victim was simply a confederate pretending to be injured, the participants fully believed that they were giving electrical shocks to the other person.

Even when the victim was protesting or complaining of a heart condition, 65% of the participants continued to deliver painful, possibly fatal shocks on the experimenter's orders.

Obviously, no one wants to believe that they are capable of inflicting pain or torture on another human being simply on the orders of an authority figure. The results of the obedience experiments are disturbing because they reveal that people are much more obedient than they may believe.

Controversy and Recent Criticisms

The study is also controversial because of ethical concerns, primarily the psychological distress it created for the participants. More recent analyses point to additional problems that call the study's findings into question.

Some participants were coerced into continuing against their wishes, many appeared to have guessed that the learner was faking their responses, and other variations showed that many refused to continue the shocks.

What This Means For You

There are many interesting and famous social psychology experiments that can reveal a lot about our understanding of social behavior and influence. However, it is important to be aware of the controversies, limitations, and criticisms of these studies. More recent research may reflect differing results. In some cases, the re-evaluation of classic studies has revealed serious ethical and methodological flaws that call the results into question.

Jeon HL. The environmental factor within the Solomon Asch Line Test. International Journal of Social Science and Humanity. 2014;4(4):264-268. doi:10.7763/IJSSH.2014.V4.360

Bandura and Bobo. Association for Psychological Science.

Zimbardo PG. The Stanford Prison Experiment: a simulation study on the psychology of imprisonment.

Le Texier T. Debunking the Stanford Prison Experiment. Am Psychol. 2019;74(7):823-839. doi:10.1037/amp0000401

Blum B. The lifespan of a lie. Medium.

Baker PC. Electric Schlock: Did Stanley Milgram's famous obedience experiments prove anything? Pacific Standard.

Perry G. Deception and illusion in Milgram's accounts of the obedience experiments. Theory Appl Ethics. 2013;2(2):79-92.

By Kendra Cherry, MSEd. Kendra Cherry is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."


Reflections on the Ethics of Social Experimentation

Social scientists are increasingly engaging in experimental research projects of importance for public policy in developing areas. While this research holds the possibility of producing major social benefits, it may also involve manipulating populations, often without consent, sometimes with potentially adverse effects, and often in settings with obvious power differentials between researcher and subject. Such research is currently conducted with few clear ethical guidelines. In this paper I discuss research ethics as currently understood in this field, highlighting the limitations of standard procedures and the need for the construction of appropriate ethics, focusing on the problems of determining responsibility for interventions and assessing appropriate forms of consent.

1 Introduction

Social science researchers are increasingly using field experimental methods to try to answer all kinds of questions about political processes and public policies. Unlike traditional “observational” methods, in which you observe the world as it comes to you, the idea right at the heart of the experimental approach is that you learn about the world by seeing how it reacts to interventions. In international development research these interventions can sometimes take the form of researchers from wealthy institutions manipulating citizens from poorer populations to answer questions of little interest to those populations.

These studies raise a host of ethical concerns that social scientists are not well equipped to deal with. US based social science researchers rely on principles such as respect for persons, justice, and beneficence that have been adopted by health researchers and institutionalized through formal review processes but that do not always do the work asked of them by social scientists.

Consider one example where many of the points of tension come to a head. Say a researcher is contacted by a set of community organizations that want to figure out whether placing street lights in slums will reduce violent crime. In this research the subjects are the criminals: seeking informed consent of the criminals would likely compromise the research and it would likely not be forthcoming anyhow (violation of respect for persons); the criminals will likely bear the costs of the research without benefitting (violation of justice); and there will be disagreement regarding the benefits of the research – if it is effective, the criminals in particular will not value it (producing a difficulty for assessing beneficence). There is no pretense at neutrality in this research since assessing the effectiveness of the lamps is taking sides, but despite the absence of neutrality no implicit contract between researchers and subjects is broken. The special issues here are not just around the subjects however. There are also risks to non-subjects, if for example criminals retaliate against the organizations putting the lamps in place. The organization may be very aware of these risks but be willing to bear them because they put faith in the ill-founded expectations of researchers from wealthy universities who are themselves motivated in part to publish.

The example raises a lot of issues. It is chosen because, despite the many issues it raises, the principles currently employed provide almost no guidance for dealing with them. It is not, however, a particularly unusual case, and many of its features are shared by other projects, including work in spheres such as reduction of violence against women, efforts to introduce democratic institutions in rural communities, job training programs for ex-combatants, efforts to alter the electoral behaviour of constituents, and efforts to stamp out corruption by politicians [for a discussion of many relevant cases see Baele (2013)]. Unlike classic health and education interventions, these projects routinely deal with interventions that have winners and losers, create risks for some, and are done without the consent of all parties affected by them.

The absence of clear principles to handle these issues leaves individuals and the professions in a difficult situation, at least if they care about the ethical implications of their research designs above and beyond whether they receive formal research approval.

So how should researchers proceed in these cases? At present there are no satisfactory answers. To make progress I discuss three sets of problems raised by research designs like this, which I call the problem of audience, the problem of agency, and the problem of consent.

The audience question is about determining what the professional ethical issues are. My focus throughout will be on professional ethics rather than more metaphysical questions of what is right or wrong in some objective sense. Thus in Section 2, I highlight a conceptualization of the problem not as a problem of normative ethics – whether any of these designs are right or wrong in any fundamental sense – but as a question of audience. A key purpose of professional ethics is to clarify expectations of members of a profession for relevant groups that are important to their work. For medical ethics the key audience is patients, or particularly subjects: those patients with whom medical professionals engage. The current guidelines used by social scientists are inherited from medical ethics, which places a primary focus on human subjects. While subjects perhaps represent the primary audience for medical interventions, this may not be the case for social science interventions, for which the key audience can be the general public or policy makers. This section highlights the need for the construction of an ethics that addresses the preoccupations of social scientists engaging in this type of research. It also highlights the more thorny nature of this problem for interventions in which there are winners and losers, as in the motivating example above.

The agency problem is the problem of determining who is responsible for manipulations. I discuss this in Section 3, describing an argument – which I call the “spheres of ethics” argument – that researchers sometimes employ as grounds for collaborating in partnerships in which subjects are exposed to risks to an extent not normally admissible in the course of research projects. The key idea is that if an intervention is ethical for implementing agencies with respect to the ethical standards of their sphere – which may differ from the ethical standards of researchers – then responsibility may be divided between researchers and implementers, with research ethics standards applied to research components and partner standards applied to manipulations. Put crudely this approach can be considered a way of passing the buck, but in fact the arguments for employing it are much more subtle than that. In a way, the buck-passing interpretation fundamentally misses the point of professional ethics. Even still, this argument is subject to abuse and so this section outlines protections related to agency autonomy and legitimacy which in turn depend on the conceptualization of professional ethics described in Section 2.

The third problem is the critical problem of consent . The bulk of this essay focuses on consent and the role it plays in research ethics. Current norms for informed consent are again inherited from medical ethics and reflect answers in the medical community to the first two questions. Yet alternative conceptualizations of consent are possible, and may be more appropriate for social scientists, given the different answers to questions of audience and agency in social science research. I outline a range of these in Section 4.

I close with reflections on implications for practice and for the development of ethical standards that can address the issues raised by experimental research in social science.

2 Problem 1: Audience

What are we worrying about when we worry about whether implementing experiments like that described above is ethical? It often seems as though we are worrying about whether in some fundamental sense these research activities are right or wrong. But framing the question in that way renders it largely unanswerable. The more practical approach of professional ethics is to determine whether one or another action is more or less consistent with the expectations of a relevant “audience” regarding the behaviour of the members of the profession. [1]

While the response that ethical action is action that is in line with expectations of a relevant audience is not technically question begging, it does require the existence of some recognized set of norms for a profession. In practice, social scientists largely work within the ethical framework provided by the human subjects protection system. [2] The system was devised primarily with a view to regulating medical research, but now covers all research involving human subjects, at least for US-based researchers or researchers receiving federal funding.

The principles embedded in the Belmont report [3] and that permeate the work of Institutional Review Boards in the United States self-consciously seek to prescribe a set of common expectations for a community of researchers and their patients and clients. Indeed, sidestepping the question of ethical foundations seems to have been a strategy of the US Commission that produced these reports. [4] The pragmatic approach adopted by the commission is a strength. As argued by Jonsen (1983), medical ethics, as captured by the documents produced by the Commission, is “a Concord in Medical Ethics,” a concord “reached by a responsible group drawn from the profession and from the public.”

But this pragmatic approach also limits the pretensions to universality of research ethics in an obvious way. The principles of the Belmont report were developed to address particular problems confronting the medical profession and carry authority because they were developed through a deliberative process that sought to reach consensus in the profession around conventions of behaviour. The result is both elegant in sidestepping the unanswerable questions and messy in its outcome. The final principles are a mixture of deontological and consequentialist principles, with no overarching principle for determining what kinds of tradeoffs should be made in cases where interventions that benefit one group harm another. The practical solution is to outsource the problem of making these determinations to the judgments of individuals placed on university institutional review boards. While effective for some purposes, there is ex ante no reason to expect that the principles developed provide the appropriate guidelines for social science. [5]

The poor fit stems in part from the fact that medical research differs from social science research in various ways.

• researchers are interested in the behaviour of institutions or groups, whether governmental, private sector, or nongovernmental, and do not require information about individuals (for example if you want to figure out whether a government licensing agency processes applications faster from high caste applicants than from low caste applicants)

• those most likely to be harmed by an intervention are not the subjects (for example when researchers are interested in the behaviour of bureaucrats whose decisions affect citizens, or in the behaviour of pivotal voters, which in turn can affect the outcome of elections)

• subjects are not potential beneficiaries of the research and may even oppose it (for example for studies of interventions seeking to reduce corruption in which the corrupt bureaucrats are the subjects)

• consent processes can compromise the research (for example for studies that seek to measure gender or race based discrimination by landlords or employers)

• there is disagreement over whether the outcomes are valuable (compare finding a cure for a disease to finding out that patronage politics is an effective electoral strategy); indeed some social scientific interventions are centered on the distributive implications of interventions: when different outcomes benefit some and hurt others, the desideratum of benefitting all who are implicated by an intervention is unobtainable

• there is no expectation of care between the research subjects and the researcher

These features can sometimes make the standard procedures used by Institutional Review Boards for approving social science research irrelevant or unworkable.

The first two differences mean that formal reviews, as currently set up, can ignore the full range of benefits and harms of research or do not cover the research at all. Formal reviews focus on human subjects: living individuals about whom investigators obtain data through intervention or interaction or obtain identifiable private information.

The third and fourth, which again focus on subjects rather than broader populations, can quickly put the principles of justice and respect for persons – two of the core principles elaborated in the Belmont report, upon which standard review processes are based – at odds with research that may seem justifiable on other grounds.

The fifth difference can make the third Belmont principle, beneficence, unworkable, at least in the absence of some formula for comparing the benefits to some against the costs for others (see Baele 2013 on the difficulties of applying beneficence arguments).

The sixth difference means that the stakes are different. If a health researcher fails to provide care for an individual in a control group, this may violate their duty of care and break the public trust in their profession. The same may not be true for social scientists.

Thus, standard considerations inherited from the human subjects protection system can be blind to the salient considerations for social science researchers and their primary audiences. The focus on private data and the protection of subjects may sometimes seem excessive; but the blindness to the risks for non-subjects may be more costly. Specific risks, beyond welfare costs, are that researchers gain a reputation for providing unsound advice to government officials on sensitive issues, encourage the withholding of benefits from the public, interfere with judicial processes, or put vulnerable (non-subject) populations at risk, in order to further research agendas.

Refocussing on the question of audience however can give some guidance here. A preoccupation of medical ethics is the maintenance of relations of trust between medical professionals and patients. In this sense, patients are a key audience for medical ethics. [6] Patients can expect care from medical professionals no matter who they are. But the nature of social science questions puts researchers in different relations with subjects, most obviously when interventions are interventions aimed against subjects. It seems improbable that social scientists can maintain relations of trust with corrupt politicians, human rights abusers, and perpetrators of violence when the interventions they are examining are designed precisely to confront these groups.

What audiences are most critical for social scientists? Subjects are of course a key audience for social scientists also, not least because much data collection depends on the trust, generosity, and goodwill of subjects. But two wider audiences are also critical, and the fashioning of social science research ethics for field experimentation should focus closely on these. The first are research partners and the second are research consumers.

2.1 Partner Matters

As in the example above, much field experimentation can involve partnerships with local governmental or nongovernmental groups. Partnering in experimental research can be very costly for partners, however. And if they do not have a full understanding of the research design, partners can be convinced to do things not in their interests, which is a risk when the interests of partners and researchers diverge. One point of divergence is with respect to statistical power. For a partner, an underpowered study can mean costly investments that result in ambiguous findings. Underpowered studies are in general a problem for researchers too, with the difference that they can still be useful if their findings can be incorporated into meta-analyses. Researchers may also be more willing to accept underpowered studies if they are less risk averse than partners and if they discount the costs of the interventions. Thus to account for global beneficence, researchers need to establish some form of informed consent with partners. At a minimum this requires establishing that partners really understand the limitations and the costs of an experiment.

One useful practice is to sign a formal Memorandum of Understanding between the researcher and the partner organization at the beginning of a project laying out the roles and responsibilities of both parties. However, even when they exist, these rarely include many of the most important elements that researchers are required to provide to subjects during the informed consent process, such as the potential risks or alternatives to experimentation. These documents could even include discussions of the power of a study to ensure that partners are aware of the probability that their experiment will result in unfavourable findings, even if their program has a positive impact. Having clearer standards for what information should be required before a partner consents to an experiment could facilitate continued positive relationships between researchers and partners.

In addition, concern must be given to how researchers explain technical information to partners. The informed consent process with research subjects defines additional precautions that must be taken to obtain consent from people with limited autonomy. Similarly, there is a burden on researchers to explain the risks and benefits of technical choices to partners in layman’s terms. Alderman et al. (2013) highlight the false expectations that subjects can have when they engage with researchers coming from privileged institutions and the responsibilities that this can produce. A similar logic can be in operation for partner organizations. Sharing (and explaining) statistical power calculations is one way of ensuring understanding. Another is to generate “mock” tables of results in advance so that partners can see exactly what is being tested and how those tests will be interpreted. [7]
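To make this concrete, the following is a minimal sketch, not taken from the paper, of the kind of back-of-the-envelope power calculation a researcher might walk a partner through before committing to a study; the standardized effect sizes, significance level, and target power used here are purely illustrative assumptions.

    import math
    from scipy.stats import norm

    def n_per_arm(effect_size, alpha=0.05, power=0.8):
        # Normal-approximation sample size per arm for a two-arm comparison of
        # means, given a standardized effect size and a two-sided test.
        z_alpha = norm.ppf(1 - alpha / 2)
        z_beta = norm.ppf(power)
        return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

    # A "small" standardized effect of 0.2 requires roughly 393 subjects per arm;
    # a partner budgeting for 100 per arm would be funding an underpowered study.
    print(n_per_arm(0.2))  # 393
    print(n_per_arm(0.5))  # 63

Walking through numbers of this kind, alongside mock tables, gives partners a concrete sense of how likely the study is to return an ambiguous result.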

A second concern relates to the researchers' independence from partners. The concern is simple: in the social sciences, as in the medical sciences, partnering induces pressures on researchers to produce results that make the partner happy. These concerns relate to the credibility of results, a problem I return to below. The problems are especially obvious when researchers receive remuneration; but they apply more generally and may put the quality of the research at risk. But the lack of independence cuts the other way also: if staff in partner organizations depend on researchers for access to expertise or funding, this may generate conflicts of interest for them in agreeing to implement some kind of research or other.

One way that independence can be increased is through separation of funding: when researchers are not remunerated for conducting experimental evaluations, they may be freer to report negative results. Another is to clarify from the outset that researchers have the right to the data and the right to publish the results no matter what the findings are. However, even when these measures are taken, there may be psychological or ideological reasons that researchers might still not be fully independent from partners.

2.2 Users: Quality of Research Findings

Given the fact that field experiments can impose costs on some groups, including subjects, assessing the beneficence of a study is especially tricky. A part of the consideration of beneficence, however, involves an assessment of the quality of the work and the lessons that can be drawn from it. If an argument in favor of a research design is that the lessons from the research produce positive effects, for example by providing answers to normatively important questions, then an assessment of beneficence requires an expectation that the design is capable of generating credible results (Baele 2013). [8] In practice, though researchers sometimes defend research that involves potential risks on the basis of the gains from knowledge, there is rarely any kind of systematic accounting for such gains and rarely a treatment of how to assess these gains when there are value disagreements. Moreover researchers, given their interests in the research, are likely the wrong people to try to make this determination. Nevertheless, any claim based on the value of the findings needs to assume that the findings are credible.

The credibility of research depends on many features. I would like to draw attention to one: the loss in credibility that can arise from weak analytic transparency. Post hoc analysis is still the norm in much of political science and economics. Until recently it was almost impossible to find a registered design of any experiment in the political economy of development (in the first draft of this paper I pointed to one study; there are now close to 200 pre-registered designs housed on the EGAP registry (109), RIDIE (37), and the AEA registry (49)). When experiments are not pre-registered there may be concerns that results are selected based on their statistical significance or the substantive claims they make, with serious implications for bias (Gerber and Malhotra 2008; Casey et al. 2012).
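The selection problem can be illustrated with a small simulation, offered here as an illustrative sketch rather than anything reported in the paper; the true effect, sample sizes, and number of hypothetical studies are all assumptions. Among many studies of the same modest effect, the subset that happens to reach conventional significance reports estimates well above the truth.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    true_effect, n, n_studies = 0.2, 50, 5000

    estimates, significant = [], []
    for _ in range(n_studies):
        treated = rng.normal(true_effect, 1.0, n)   # treatment arm outcomes
        control = rng.normal(0.0, 1.0, n)           # control arm outcomes
        estimates.append(treated.mean() - control.mean())
        significant.append(stats.ttest_ind(treated, control).pvalue < 0.05)

    estimates, significant = np.array(estimates), np.array(significant)
    print(f"mean estimate, all studies:      {estimates.mean():.2f}")               # close to 0.20
    print(f"mean estimate, significant only: {estimates[significant].mean():.2f}")  # roughly 0.5, inflated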

As research of this form increases in prominence, there will be a need to develop principles to address these questions of audience. For this, social scientists might follow the lead of the National Commission that established the principles for health research and seek not to root assessments of what is or is not ethical research in conflicting moral intuitions or on normative theories that may or may not be broadly shared. Instead in response to the issues raised by field experiments, social scientists could initiate a public process to decide what should constitute expected practice in this field in light of the interests of the audiences specific to their research – notably partners, governments, and the general public. [9]

3 Problem 2: Agency

In the example above of an experiment on street-lighting the intervention was initiated and implemented by a local organization and not by the researchers. Is this fact of ethical relevance for researchers taking part in the experiment?

Currently many social science experiments are implemented in this way by political actors of various forms such as a government, an NGO or a development agency. In these cases, and unlike many medical trials, research often only exists because of the intervention rather than the other way round. [10] This approach can be contrasted with a “framed field experiment” in which the intervention is established by researchers for the purpose of addressing a research question and done in a way in which participants know that they are part of a research experiment. [11] In practice, of course, the distinction between these two types of experiment is often not clear, [12] even still it raises an important point of principle: can things be arranged such that the ethical responsibility for experiments can be shared with partners?

Assume heroically that there is agreement among researchers about appropriate standards of research. Say now, still more heroically, that there are other standards of behaviour for other actors in other spheres that are also generally accepted. For NGOs for example we might think of the INGO Accountability Charter; for governments we might think of international treaty obligations. One might think of these ethical principles in different spheres as stemming from a single theory of ethics, or simply as the possibly incompatible principles adopted by different communities. In either case, these different standards may specify different behaviours for different actors. Thus for example by the ethical principles of research, a researcher interviewing a genocidaire in Rwanda should seek fully informed consent prior to questioning and stop questioning when asked by the subject or if they sense discomfort on the part of the subject. However, a government interrogator might not, but still act ethically according to the principles adopted by governments by eschewing other behaviour, such as torture. In this example, the ethical constraints on the researcher seem more demanding. There may be more intractable incompatibilities if constraints are not “nested.” For example a researcher may think it unethical to give over information about a subject suspected of criminal activities while a government official may think it unethical not to.

The question then is whose ethical principles to follow when there are collaborations? One possibility is to adhere to the most stringent principle of the partners. Thus researchers working in partnerships with governments may expect governments to follow principles of research ethics when engaging with subjects. In some situations, discussed below, this may be a fruitful approach. But as a general principle it suffers from two flaws. The first is that in making these requirements the researcher is altering the behaviour of partners in ways that may limit their effectiveness. The second is that, as noted above, the constraints may be non-nested: the ethical position for a government may be to prosecute a criminal; but the researcher wants to minimize harm to subjects. In practice this might rule out appending research components to interventions that would have happened without the researcher and that are ethical from the perspective of implementers; it could for example prevent the use of experimental approaches to study a large range of government strategies without any gain, and possibly some loss, to affected populations.

An alternative approach is to divide responsibilities: to make implementers responsible for implementation and researchers responsible for the research. This is what I call above the “spheres of ethics” argument. The principle of allocating responsibility of implementation to partners may then be justified on the grounds that in the absence of researchers, partners would be implementing (or, more weakly, that they could implement) such interventions anyhow, and are capable of bearing ethical responsibility for the interventions outside of the research context.

Quite distinct rationales for this approach are that partner organizations may be better placed to make decisions in the relevant areas and may be more effectively held to account if things go wrong. In addition partners may be seen by others as having legitimacy to take actions which might (correctly) be seen as meddling by outsiders (see Baele (2013) on the “Foreign Intervention problem”).

As a practical matter researchers can do this in an underhand way by advising on interventions qua consultants and then returning to analyse data qua researchers; or by setting up an NGO to implement an intervention qua activist and then return for the data qua researcher. But this approach risks creating a backdoor for simply avoiding researcher responsibilities altogether.

Instead, by appealing to spheres of ethics, researchers collaborating with autonomous partners can do something like this in a transparent way by formally dividing responsibility. Although researchers play a role in the design of interventions it may still be possible to draw a line between responsibility for design and responsibility for implementation. Here, responsibility is understood not in the causal sense of who contributed to the intervention, but formally as who shoulders moral and legal responsibility for the intervention.

An argument against the spheres of ethics approach is that it is simply passing the buck and not engaging with the ethical issues at all. But this response misses the point of professional ethics; professional ethics is not about what outcomes should obtain in the world but about who should do what. Allocating responsibility to partners is no more buck-passing than calling on police to intervene in a threatening situation rather than relying on self-help.

The sphere of ethics approach is consistent with ideas in medical research for assessing non-validated practice. On this issue the Belmont report notes: “Research and practice may be carried on together when research is designed to evaluate the safety and efficacy of a therapy. This need not cause any confusion regarding whether or not the activity requires review; the general rule is that if there is any element of research in an activity, that activity should undergo review for the protection of human subjects.” In terms of the standards to be applied in such a review, however, Levine (1988) notes: “the ethical norms and procedures that apply to non-validated practice are complex. Use of a modality that has been classified as non-validated practice is justified according to the norms of practice. However, the research designed to develop information about the safety and efficacy of the practice is conducted according to the norms of research.”

Levine’s interpretation of the division of labour appears consistent with the spheres of ethics approach. But the approach raises at least two critical difficulties. The first is a problem of implementer autonomy. In practice implementers may not be so autonomous from the researchers, in which case the spheres of ethics argument may simply serve as a cover for avoiding researcher responsibilities. The second is deeper: the argument is incomplete insofar as it depends on an unanswered normative question: it requires that the researcher have grounds to deem that actions that are ethical from the partner’s perspective are indeed ethical – perhaps in terms of their content or on the grounds of the process used by partners to construct them. This is the partner legitimacy concern. A researcher adopting a spheres of ethics argument may reasonably be challenged for endorsing or benefitting from weak ethical standards of partners. Indeed without an answer to this question, any collection of people could engage in any action which they claim to be ethical with respect to their “sphere”; a version of this argument could for example serve as grounds for doctors participating in medical experimentation in partnership with the Nazi government.

In line with the principle of socially constructed professional ethics, described in Section 2, a solution might be the formal recognition by the professions of classes of legitimate partners for various spheres – such as all governments, or all governments satisfying some particular criteria. The incompleteness of the spheres of ethics argument then adds urgency to the need for an answer to the problem of audience.

4 Problem 3: Consent

Medical ethics places considerable focus on the principle of informed consent, and indeed consent can in principle allay the twin concerns of audience and agency discussed in Sections 2 and 3: If the relevant audience provides consent then the expectations of the audience are arguably met and there is also a clearer allocation of responsibility for action. Both of these arguments confront difficulties however. Moreover different conceptualizations of audience and agency have different implications for consent.

The US National Commission motivated the principle of consent as follows:

Respect for persons requires that subjects, to the degree that they are capable, be given the opportunity to choose what shall or shall not happen to them… there is widespread agreement that the consent process can be analyzed as containing three elements: information, comprehension and voluntariness.

In promoting the concept of consent, the commission also sought to produce definitional clarity around it. Whereas the terms can mean many things in different settings, as described by Levine (1988), “the Commission […] abandoned the use of the word “consent,” except in situations in which an individual can provide “legally effective consent” on his or her own behalf.” [13]

In practice, however, consent in many social experiments is very imperfect. Imperfect consent is routinely sought for measurement purposes, for example when survey data is collected. It is sometimes sought, at least implicitly, for interventions, although individual subjects may often not be consulted on whether, for example, they are to be exposed to particular ads or whether a school is to be built in their town. But even if consent for exposure to a treatment is sought, individual level consent may not be sought for participation in the experiment per se; for example, subjects are often not informed that they were randomly assigned to receive (or not receive) a treatment for research purposes. [14]

To assess how great a problem this is, it is useful to consider the rationales for informed consent that inspired medical professionals and other rationales that may be relevant for social scientists.

4.1 The Argument from Respect of Persons

The argument provided for informed consent in the Belmont report and related documents is the principle of “respect for persons.” Manipulating subjects without their consent diminishes their autonomy and instantiates a lack of respect. Consent, conversely, can serve two functions.

The first is diagnostic: that consent can provide a test of whether people are in fact being used “merely as means.” [15] Critically, this diagnostic function of consent can in principle be achieved without actual consent; though actual consent eliminates the need for guesswork.

The second is effective: that consent may enhance autonomy (or conversely, forgoing consent reduces autonomy). Thus the Belmont report advises the importance of maximizing the autonomy of subjects: “Respect for persons requires that subjects, to the degree that they are capable, be given the opportunity to choose what shall or shall not happen to them.” There are multiple aspects of autonomy that may be affected by engagement with an experiment, with somewhat different implications for what is required of consent. I distinguish here between three: participation autonomy, behaviour autonomy, and product autonomy. [16]

The first, participation autonomy, relates to the decision of whether or not to be involved with the research. The absence of choice reduces subject autonomy at least with respect to the decision to take part. Behavioural autonomy may be compromised due to lack of consent because of information deficits (see example below), resulting in subjects making decisions that they would not otherwise make, given the options available to them. Behavioural autonomy can also be compromised if individuals’ choice sets are constrained because of the manipulation. Third, as a subject’s actions yield a research product, a lack of consent means that the subject loses control over how their labour is to be used, or a loss of product autonomy. [17] To illustrate: say an intervention broadcasts information about political performance on the radio in order to assess how the information alters voting behaviour by the politician’s constituents. Done without consent, the listeners had no option but to take part in the study (participation autonomy), their subsequent actions are affected by the treatment and might have been different had they known the information was provided for research purposes (behavioural autonomy), and they will have no say in the publication of knowledge that is derived from their actions (product autonomy).

A problem with this formulation is that consent, or even notional consent, is not clearly either a necessary or sufficient condition for respect for persons. That is, unless respect for persons is defined in terms of consent (rather than, for example, a concern with the welfare or capabilities of others), the diagnostic function of consent as described above faces difficulties. There is a logical disconnect between consent and respect since determining respect requires information about the disposition of the researcher but consent provides information on the disposition of the subject. Consent might not be a necessary condition for establishing respect for persons since it is possible that the subject would never consent to an action that is nevertheless taken by a researcher with a view to enhancing their welfare or their capabilities. And of course, subjects may consent to actions not in their interests and not consent to other actions that are, or they may unknowingly take actions that limit their autonomy. The specific markers sometimes invoked to indicate that respect for persons is violated, such as the use of deceit or force, also suffer difficulties since one can construct instances in which a deceived person can recognize that deceit was necessary to achieve a good in question. [18] In addition, consent might not be sufficient since it is possible that a subject consents to an action that is not being done because it is in their interest, but nevertheless has their welfare as a byproduct.

Consider again the three types of autonomy that are threatened by an incomplete consent process. Loss in participation autonomy does not necessarily imply that individuals are treated simply as means. Holding a surprise birthday party for a friend deliberately compromises participation autonomy in order to provide a benefit for the friend – one that they might consent to if only the consent did not destroy the surprise. [19] In some situations, where providing consent may put individuals at risk, not seeking consent may even increase participation autonomy by providing the choice to participate de facto or not, even if risks make formal consent impossible. Even in the absence of consent, however, it is possible that participation in an experiment enhances behaviour autonomy, either by expanding information or by expanding choice sets. Product autonomy can be restored by ex post consent, for example by allowing a subject to determine whether they want data collected from them to be used in an analysis. Thus consent, as currently required, does not seem to be necessary or sufficient for the work asked of it.

4.2 Other Rationales for Consent

Legal protection from charges of abuse: A nonethical reason for seeking consent is to protect researchers from civil or criminal charges of abuse. For medical trials, the need for protection is obvious since actions as simple as providing an injection involve physical injury, which would under normal circumstances have criminal implications. [20] Consent clarifies that the action is non-criminal in nature (although this depends on the action – consent to be killed does not generally protect the killer). The rationale for documenting consent is primarily legal. As noted by Levine, HEW regulations “require that if there are risks associated with research then ‘legally effective informed consent will be obtained… The purpose of documenting consent on a consent form is […] to protect the investigator and the institution against legal liability” (Levine 1979).

Information aggregation/subject filtering: Consent may also provide researchers with information regarding the relative costs or benefits of an intervention. If a researcher discovers that an individual is unwilling to take part in a study, this provides information on the perceived benefits of the study. In such cases there are double grounds not to proceed: not just because proceeding compromises autonomy but also because it violates beneficence. As discussed below, however, this goal of information aggregation may be met at a population level by seeking consent from a subset of potential subjects.

Maintaining the reputation of the academy: A third rationale for consent is that consent preserves the reputation of the academy. It clarifies to the public the nature of relations between researchers and populations, that this relation is based on respect, and that populations should not expect that their trust in researchers will be abused or that they will be put at risk without consent. Though clearly of pragmatic benefit to the academy, this argument is ethical insofar as it reflects a standard of behaviour that is expected of a particular group. Note that this argument, more than any of the others, provides a rationale for ethical standards specific to researcher-subject relations that maintain higher standards than is expected of general interactions.

In the context of naturally occurring field experiments, there are also arguments for why consent might not be sought.

One is that because the intervention is naturally occurring, an attempt to gain consent would be intrusive for subjects and especially damaging for research. Consider for example an experiment that focuses on the effects of billboard ads. It is precisely because seeing government ads is a routine event that prefacing the viewing of the ad (if that were even possible) with an announcement that the ad is being posted to understand such and such an effect would have particularly adverse consequences. Preceding the ad with a disclaimer may moreover falsely suggest to subjects that some unusual participation or measurement is taking place, even if a purpose of the disclaimer is to deny it.

A second, more difficult reason is that the withholding of consent may not be within the rights of the subjects. Consider for example a case where a police force seeks to understand the effects of patrols on reducing crime. The force could argue that the consent of possible criminals (the subjects in this case) is not required, and indeed is undesirable, for the force to decide where to place police. This argument is the most challenging since it highlights the fact that consent is not even notionally required by all actors for all interventions, even if it is generally always required of researchers for subjects. In this example the police can argue that the subject has no rights over whether or how the intervention is administered (participation autonomy). One might counter that even if that is correct, the subject may still have rights regarding whether his responses to the interventions can be used for research purposes (product autonomy). However, one might in turn counter that even these concerns might be discounted if the actions are public information.

In Section 2, I noted that maintaining the trust of subjects is of paramount concern to medical researchers. This provides a basis for insisting on informed consent by subjects. As argued in Section 2, for social scientists the confidence of the general public and of policy makers in particular is also critical. Moreover the welfare of non-subjects may be of critical importance. These considerations have two implications: first, that depending on the treatment of the problem of audience, the form of consent needed may differ from the current standard; second, that depending on the population affected, the focus on subjects as the locus of consent may not be appropriate: the informed consent of practitioner partners and affected third parties may be just as, or perhaps more, critical.

4.3 Varieties of Consent

Given the multiple desiderata associated with consent we may expect that variations of the informed consent process might succeed in meeting some or other of these.

For example, if what is valued is participation autonomy, then this seems to require actual ex ante consent. The loss in autonomy consists of the absence of choice to be subjected to a treatment. The demands of product autonomy, unlike participation or behaviour autonomy, can be met with ex post consent. The demands of the diagnostic test can in principle be met by notional consent, and so on.

With this in mind, Table 1 considers how eight approaches to the consent process fare on different desiderata. [21]

Table 1: Consent Strategies.

1. Informed consent: the subject is provided intelligible information and asked for consent before implementation.

2. Implied consent: consent is signalled by behaviour, as when a patient holds out an arm to the doctor to receive an injection.

3. Proxy (delegated) consent: [statistical] the subject is asked to appoint someone they trust to receive information about experiment A and provide consent on their behalf; [authoritative] a representative is used who is not specifically appointed by the subject for this purpose.

4. Superset consent: the subject is asked in advance of experiment A whether they would be willing to take part in each of experiments A, B, C.

5. Package consent: the subject is asked in advance of experiment A whether they would be willing to be assigned to take part in one of a set of experiments.

6. Deferred (ex post) consent: the subject unwittingly takes part in the study and is asked after the fact whether their data may be used.

7. Inferred (surrogate) consent: a sample of nonsubjects who are “like” the subject are asked whether they would be willing to take part in experiment A, and inferences about the hypothetical consent of the subject are drawn.

8. Spheres of ethics (compartmentalized consent): a researcher partners with a practitioner to implement a study; the practitioner is responsible for the intervention and the researcher for measurement.

Each strategy is assessed against the following considerations: application of the diagnostic test; individual participation, behaviour, and product autonomy; whether the researcher gains knowledge about the likely felt costs of the intervention; whether the researcher gains prior knowledge about particular risks facing the individual; legal protection of researchers (assuming documentation); the reputation of the discipline; beneficence towards subjects; avoidance of Hawthorne and related biases; and cost.

Source: Author.

Ex ante informed consent: Ex ante informed consent fares well on autonomy principles as well as on legal protection of researchers (if documented) and the reputation of the discipline. As argued above, however, it is not a necessary or sufficient condition for respect for persons; in addition, it may impose costs on subjects, weaken the quality of some kinds of research, and be costly to achieve.

Implied consent: An alternative is implied consent, which arises when there are grounds to think that consent is given even if it is not formally given or elicited. Implied consent might include cases in which voluntary participation is itself considered evidence of consent to be in a study. Implied consent can reduce costs to subjects and researchers but may leave researchers in a legally weaker position and may put their reputation more into question.

Proxy (delegated) consent: Both ex ante consent and implied consent suppose that subjects are informed of the purpose of the experiment ex ante. In some settings, this can threaten the validity of the research. An approach to maintain a form of participation autonomy but keep subjects blind to treatment is to ask subjects to delegate someone who will be given full information and determine on their behalf whether to give consent. [22] Insofar as the subject sees the delegate as their agent in the matter, proxy consent inherits the benefits of ex ante informed consent, but with reduced risks to the research. A weaker alternative – the “authoritative” approach – is to seek consent from a proxy that is not specifically delegated for the purpose by a subject. In some settings for example the consent of community leaders is sought for interventions that take place at a community level; this procedure invokes the principles of proxy consent but assumes that individuals that are delegated for one purpose inherit the authority to be delegates for the consent process. Baele (2013) for example recommends this form of consent.

Superset (Blanket) Consent: Another way to protect research integrity while preserving subject autonomy is to seek what might be called “superset consent.” Say a researcher identifies a set X of possible experiments, including the experiment of interest. The researcher then asks the subject to identify the set C ⊆ X of interventions in which the subject is willing to take part. [23] Given this procedure, if set C includes the experiment of interest, the researcher can conclude that consent has been given for the experiment of interest even though that experiment has not been specified as being the one of interest. In practice, abstract descriptions may suffice to generate consent for large classes of experiments (for example a subject may consent to any experiment that seeks to answer some question in some class for which there is no more than minimal harm); greater coarsening of this form implies less specific information (see Easton on waived consent). [24]

Package consent: An alternative to superset consent is a process in which subjects are asked whether they are willing to take part in an experiment that will involve some intervention in set X, including the intervention of interest. If the subject agrees, then consent for the intervention is assumed. This differs from superset consent insofar as it is possible that a subject would be willing to accept the package but not accept an individual component if offered that component alone. For example, if X contained experiment A, in which I could expect to win $1000, and experiment B, in which I expect to lose US$10, I might consent to set X, but only in the hope that I will be assigned to experiment A. To enhance informedness, the subject may be provided with the probabilities associated with the implementation of each possible experiment. Critically, this approach may be inconsistent with a desire to have continuous consent – in the sense of consent not just at study outset but in the course of the study also. In a sense, under this design a deal is struck between researcher and subject and the subject is expected to follow through on their side of the deal; this limitation runs counter to common practice but is not inconsistent with respect for persons.
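A small worked example, using the illustrative payoffs above and assumed equal assignment probabilities (the paper does not specify any), shows why a subject might accept the package while refusing one of its components:

    # Assumed assignment probabilities; purely illustrative.
    p_A, p_B = 0.5, 0.5
    payoff_A, payoff_B = 1000, -10        # payoffs from the example above

    expected_package = p_A * payoff_A + p_B * payoff_B
    print(expected_package)   # 495.0: the package looks attractive in expectation
    print(payoff_B)           # -10: experiment B offered alone might be refused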

Deferred (retrospective, ex post) consent: When consent is not sought before the fact, it is common to provide a debriefing after the fact. In some cases this might be important to avoid harm. In the Milgram experiments, debriefing could help remove guilt, once subjects found out that they did not in fact torture the confederates. But beyond debriefing it is possible to seek consent after the fact (Fost and Robertson 1980). For some purposes this is too late: it does not restore participation or behaviour autonomy, [25] but it does provide product autonomy and it does satisfy the diagnostic test. In some situations, however, retrospective consent might impose costs on subjects and generate a sense of lost autonomy.

Inferred (surrogate) consent: [26] Consent is inferred (sometimes, “presumed”) if there are empirical grounds to expect that consent would be given were it elicited. As described above, the diagnostic test for respect for persons is not that consent has been obtained but that it would not be refused if sought. This question is partly answerable, and a number of different approaches might be used. For example, one might describe an experiment to a random subset of subjects and ask whether they would be happy to take part in it, or whether they would be happy to take part even if their consent were not sought. One could also combine this with ex post consent by implementing the experiment with a subset of actors and then asking whether they are happy that they took part, even though they were not told the purpose, or whether, knowing what they know now, they would have been willing to give their consent ex ante. Inferences may then be made about the willingness of the larger population to provide consent. This might be called the statistical approach. [27] Again, a weaker, authoritative alternative may be invoked by seeking consent from a third person who does not have the legitimacy to speak on behalf of the subject but who is believed to have insight into the subject’s disposition.
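As a rough illustration of this statistical approach, the sketch below estimates a population consent rate from the responses of a hypothetical random subset and reports a simple normal-approximation 95% confidence interval; the data and the choice of interval are assumptions for illustration, not part of the original proposal.

```python
# Hypothetical illustration of the statistical approach to inferred consent:
# poll a random subset, estimate the population consent rate, and report a
# normal-approximation confidence interval for that rate.
import math

responses = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1]  # 1 = would consent
n = len(responses)
p_hat = sum(responses) / n                      # estimated consent rate
se = math.sqrt(p_hat * (1 - p_hat) / n)         # standard error of the estimate
ci = (p_hat - 1.96 * se, p_hat + 1.96 * se)     # 95% confidence interval

print(f"Estimated consent rate: {p_hat:.2f}, 95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")

# Whether such an estimate licenses proceeding without individual consent is an
# ethical judgement, not a statistical one; the calculation only informs it.
```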

The final approach, marked in column 8 of Table 1, is the spheres of ethics approach, described in Section 3.

Thus, although researchers currently use a very narrow operationalization of the principle of consent, the broader menu of possibilities is quite large. For now, researchers could test and develop these approaches in settings in which consent is not routinely sought. Though most of them fall short of fully informed consent, many meet the principles of respect for persons more effectively than consent as sometimes practiced. Looking forward, collective answers to the questions of audience and agency can help determine which type of consent is optimal when.

5 Conclusion

I have described the primary problem of assessing the ethical implications of social experiments as a problem of audience. Medical ethics were developed in large part to regulate relations between medical researchers and patients. Social scientists have adopted the framework created for medical researchers, but their audiences are different: at least in the area of experimental research on public policy, relations with policy makers, practitioner organizations, and the general public can be just as important as the relationship with research subjects. Moreover, the interests of these different groups often diverge, making the problem of constructing ethics more obviously political.

These considerations suggest two conclusions.

First, rather than seeking some fundamental answer to ethical dilemmas, or seeking to address the practical problems facing social scientists with tools generated for another discipline, there is a need for a social process of constructing ethical principles that address the preoccupations of social scientists in this field, especially in settings in which there are power imbalances between lead researchers and research partners and in which there are value disagreements regarding what constitutes beneficent outcomes. Such a process will be inherently political. Just as social scientific interventions are more likely to have distributive implications – generating costs for some and benefits for others – so ethical principles of engagement, if there is to be engagement at all, may require the principled taking of sides, that is, the choice of an audience. Constructing an appropriate ethics for this field is a matter of some urgency, since there is no reason to expect that all researchers working in this domain will independently converge on consistent standards for experimental research in grey areas.

Second, depending on answers to the problem of audience, it may turn out that the answers to the questions of agency (Section 3) and consent (Section 4) will be different for social scientists than for medical researchers. I have sketched some possible answers to these questions that diverge somewhat from standard practice. Currently, when researchers engage in studies that generate risks, they defend the research on the basis of its social value. But they often do so as interested researchers, without the equipment to weigh benefits in the presence of value disagreements. Greater efforts to share the responsibility of research, whether through more carefully crafted relations of agency with developing-country actors or a more diligent focus on consent, may reduce these pressures on value assessments and may also reduce risks to both populations and the professions.

Acknowledgments

Warm thanks to the WIDER research group on Experimental and Non-Experimental Methods in the Study of Government Performance. An earlier version was presented at the UCSD conference on ethics and experiments in comparative politics. My thanks to Jasper Cooper and Lauren Young for very generous comments on this manuscript. This paper draws on previous work titled “Ethical Challenges of Embedded Experimentation.”

References

Abram, M. B. and S. M. Wolf (1984) “Public Involvement in Medical Ethics. A Model for Government Action,” The New England Journal of Medicine, 310(10):627–632. doi:10.1056/NEJM198403083101005

Alderman, H., J. Das, and V. Rao (2013) Conducting Ethical Economic Research: Complications from the Field. World Bank Policy Research Working Paper No. 6446. doi:10.1596/1813-9450-6446

Baele, S. J. (2013) “The Ethics of New Development Economics: Is the Experimental Approach to Development Economics Morally Wrong?,” Journal of Philosophical Economics, 7(1):2–42.

Bertrand, M., S. Djankov, R. Hanna, and S. Mullainathan (2007) “Obtaining a Driver’s License in India: An Experimental Approach to Studying Corruption,” The Quarterly Journal of Economics, 122(4):1639–1676. doi:10.1162/qjec.2007.122.4.1639

Binmore, K. G. (1998) Game Theory and the Social Contract: Just Playing. Vol. 2. Cambridge, MA: MIT Press.

Casey, K., R. Glennerster, and E. Miguel (2012) “Reshaping Institutions: Evidence on Aid Impacts Using a Preanalysis Plan,” The Quarterly Journal of Economics, 127(4):1755–1812. doi:10.1093/qje/qje027

Cassileth, B. R., R. V. Zupkis, K. Sutton-Smith, and V. March (1980) “Informed Consent – Why Are Its Goals Imperfectly Realized?,” The New England Journal of Medicine, 302(16):896–900. doi:10.1056/NEJM198004173021605

DeScioli, P. and R. Kurzban (2013) “A Solution to the Mysteries of Morality,” Psychological Bulletin, 139(2):477. doi:10.1037/a0029065

Fost, N. and J. A. Robertson (1980) “Deferring Consent with Incompetent Patients in an Intensive Care Unit,” IRB, 2(7):5. doi:10.2307/3564363

Gerber, A. and N. Malhotra (2008) “Do Statistical Reporting Standards Affect What Is Published? Publication Bias in Two Leading Political Science Journals,” Quarterly Journal of Political Science, 3(3):313–326. doi:10.1561/100.00008024

Gray, J. D. (2001) “The Problem of Consent in Emergency Medicine Research,” Canadian Journal of Emergency Medicine, 3(3):213–218. doi:10.1017/S1481803500005583

Harms, D. (1978) “The Belmont Report: Ethical Principles and Guidelines for the Protection of Human Subjects of Research,” DHEW Publication No. (OS) 78-0012.

Jonsen, A. R. (1983) “A Concord in Medical Ethics,” Annals of Internal Medicine, 99(2):261–264. doi:10.7326/0003-4819-99-2-261

Kant, I. (1956) Critique of Practical Reason, translated by Lewis White Beck. Indianapolis, IN: Bobbs-Merrill.

Levine, R. J. (1979) “Clarifying the Concepts of Research Ethics,” Hastings Center Report, 9(3):21–26. doi:10.2307/3560793

Levine, R. J. (1988) Ethics and Regulation of Clinical Research. New Haven, CT: Yale University Press.

Levine, F. J. and P. R. Skedsvold (2008) “Where the Rubber Meets the Road: Aligning IRBs and Research Practice,” PS: Political Science and Politics, 41(3):501–505.

Lipscomb, A. and A. E. Bergh, eds. (1903–04) The Writings of Thomas Jefferson. 20 vols. Washington, DC: Thomas Jefferson Memorial Association of the United States.

Love, R. R. and N. C. Fost (1997) “Ethical and Regulatory Challenges in a Randomized Control Trial of Adjuvant Treatment for Breast Cancer in Vietnam,” Journal of Investigative Medicine, 45:423–431.

Pallikkathayil, J. (2010) “Deriving Morality from Politics: Rethinking the Formula of Humanity,” Ethics, 121(1):116–147. doi:10.1086/656041

Tolleson-Rinehart, S. (2008) “A Collision of Noble Goals: Protecting Human Subjects, Improving Health Care, and a Research Agenda for Political Science,” PS: Political Science and Politics, 41(3):507–511.

Veatch, R. (2007) “Implied, Presumed and Waived Consent: The Relative Moral Wrongs of Under- and Over-Informing,” The American Journal of Bioethics, 7(12):39–41. doi:10.1080/15265160701710253

Vollmann, J. and R. Winau (1996) “Informed Consent in Human Experimentation Before the Nuremberg Code,” British Medical Journal, 313(7070):1445. doi:10.1136/bmj.313.7070.1445

Wantchekon, L. (2003) “Clientelism and Voting Behavior: Evidence from a Field Experiment in Benin,” World Politics, 55:399–422. doi:10.1353/wp.2003.0018

©2015, Macartan Humphreys, published by De Gruyter

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.


The New York Times – The Learning Network

Social Experiments: Investigating Questions Related to Personality Attributes

Overview | What is the definition of personality, and how have scientists studied its attributes in animals? What can we learn about ourselves by studying the science behind personality? In this lesson, students learn about current research on social structure and personality attributes in a type of communal spider, then design their own research questions meant to investigate traits in humans like shyness, boldness or fear of the new.

Materials | Computers with Internet access.

Warm-up | When students arrive, ask them to indicate with a show of hands how many have taken a personality test online or in a magazine. Have a short discussion about these tests. What kinds of questions did they ask? What were these quizzes meant to uncover? Did students feel the tests accurately portrayed who they are? Why or why not?

Then, have students complete the third item in this Well blog post, the “I Just Get Myself” personality test. You can also consider pairing students, and having them guess their partner’s personality (though this requires signing up on the website).

Have a discussion about the results. Ask students if they generally agree with the assessments they received. Do they think the test accurately reflects who they are? Why or why not? Ask students to think about the questions they answered. You might have them return to the quiz so they can review the questions again. How do students think their answers might have contributed to their personality profile? Are there questions that the test should have included, but didn’t? What are these questions, and what would they address?

Finally, ask students if they think animals have personalities, too. Then, have them read about research showing that even some spiders show personality differences.

Related | From the article “The Lives of Sociable Spiders” by Natalie Angier:

Of the world’s 43,000 known varieties of spiders, an overwhelming majority are peevish loners: spinning webs, slinging lassos, liquefying prey and attacking trespassers, each spider unto its own. But about 25 arachnid species have swapped the hermit’s hair shirt for a more sociable and cooperative strategy, in which dozens or hundreds of spiders pool their powers to exploit resources that would elude a solo player. And believe it or not, O ye of rolled-up newspaper about to dispatch the poor little Charlotte dangling from your curtain rod for no better reason than your purported “primal fear,” these oddball spider socialites may offer fresh insight into an array of human mysteries: where our personalities come from, why some people can’t open their mouths at a party while others can’t keep theirs shut and, why, no matter our age, we can’t seem to leave high school behind.

Read the entire article with your class, and have them consider the questions below.

Questions | For discussion and reading comprehension:

  • What are social spiders? How do they differ from solitary species?
  • What is the definition of personality?
  • What personality attributes did the researchers study in social spiders?
  • How did the researchers measure these attributes?
  • One of the scientists in the article said, “When you go into a group, your behavior changes depending on the nature of that group, but it can only change so far.” What does this mean in the context of social spiders? Do you think this statement applies equally to people? How?

Related article: //www.nytimes.com/2010/04/06/science/06angi.html

Activity | Working in pairs, students should read more about personality tests, then develop a research question related to personality in either animals or people. If time permits, they might then begin to develop an experiment to test their key research question scientifically.

RELATED RESOURCES

From The Learning Network

  • 6 Q’s About the News | Spiders That Thrive in a Social Web
  • Lesson: What’s Your Reading History? Reflecting on the Self as Reader
  • Do You Take More Risks When You Are Around Your Friends?

From NYTimes.com

  • Friends Can Be Dangerous
  • Even Among Animals: Leaders, Followers and Schmoozers
  • Meditation, for the Mind and Heart

Around the Web

  • Animals Have Personalities, Too
  • National Wildlife Federation: They’ve Got Personality
  • Personality Tests

Here are some resources that may help students identify possible research questions of their own and give them ideas for how to design experiments that test the ideas:

  • An overview of the growing field of animal personality research , highlighting key research questions and methods.
  • An article about the connections between personality and genes, and insights into how understanding animals’ personalities may affect captive breeding programs .
  • An overview of individual differences in behavior (PDF), with reference to a measure known as the “shy-bold continuum.”
  • A description of The Big Five personality test.

As students read the studies, have them consider the guiding questions below:

  • What is personality? Is there any one definition of personality?
  • What are some of the main personality attributes that scientists investigate, in either people or animals? What can these personality attributes tell you?
  • How do scientists define attributes such as shyness, boldness and fear of novelty? What are some of the ways scientists explore these attributes experimentally?
  • Who conducted the study, and which questions does it address?
  • What are the study’s main findings, and what evidence supports these conclusions?

When students have completed their research, hold a personality summit in which pairs of students share and present their findings and describe the question they would like to investigate. Why is this question important? How might it be researched? Finally, circle back to the beginning of the lesson by returning to the basic question of personality. Can students come to a consensus on the definition of personality? Why or why not?

Going further

Related article: //www.nytimes.com/2014/04/27/opinion/sunday/the-dangers-of-friends.html

Students might take additional personality quizzes, such as the ones included in this roundup of some of the best. As students work through the quizzes, have them think about each question being asked. What personality attribute or attributes might the question be measuring? How? What can these tests actually tell us? What can’t they tell us? Students also might try to game the system by purposely giving inaccurate answers to see how the test results change.

To go in a slightly different direction, encourage students to cast a critical eye toward some of the quizzes they might encounter. How, for example, could a personality test determine which fragrance might be right for you? Can a personality test really tell you which state best suits your personality? What about tests that are meant to show who might make your ideal romantic partner ? Or whether you have the “right stuff” to become a doctor ?

After reading widely on the topic, students might discuss what makes a good personality test. How should they be used? How shouldn’t they be used? Why?

This resource may be used to address the academic standards listed below.

Common Core E.L.A. Anchor Standards

1   Read closely to determine what the text says explicitly and to make logical inferences from it; cite specific textual evidence when writing or speaking to support conclusions drawn from the text.

McREL Standards

6   Understands relationships among organisms and their physical environment.

12   Understands the nature of scientific inquiry.

Comments are no longer being accepted.

Determining someone’s personality can be particularly difficult given the endless questions that can be asked. In these types of personality quizzes, it’s common to see questions such as: “What’s your favorite color? What type of music do you listen to? Do you travel a lot?” Yet I tend to think that to understand someone’s personality you must go further than the surface questions that are usually asked. The way a person behaves can be shaped by the community where he or she was born. Going further: how many friends did they have growing up? How close were they to them? You can also ask about more personal experiences they have gone through. What was the scariest thing that has happened to you? Have you ever been in a life-or-death situation? These key questions reveal much more about a person and their behavior. With this information, you can understand their taste in music, television, and other general preferences without even asking. After picking a person’s brain long enough, you will be able to anticipate what their actions would be in certain situations. Once that is possible, you can more accurately decipher their personality. Determining an animal’s personality is trickier. Unfortunately, we cannot yet communicate fluently with animals. Nevertheless, we can still get a general idea of their personalities through experiments and by observing them in natural settings. We also need to be aware of their role within their species: a role can tell us a lot about an animal and how much responsibility it holds. It is also important to see how efficiently they work; the effort they put in can show how much pride they take in completing their task, just as with humans. When observing animals such as a pride of lions, it is important to see how the other members treat and respect an individual, as this shows the rank it holds. A higher rank may correspond to a more aggressive and bold personality. Observing an animal in these situations is much like observing humans in their work environment; it is easy to see how closely related we are.

Anyone can invent an app, game, or quiz to tell us, the users, what personality attributes we may possess. None of that matters much, though, because humans are bewildering organisms: any event can change how we think or react to others, and we can dramatically change our personality through feeling. We are unpredictable; we never know whether our personality will stay the same or change the next day. I think a personality is best known by looking at a person’s surroundings and loved ones. If someone has been abused, they are more likely to have a dark personality; if the opposite is true, the person may well have a cheerful and bright personality. A personality cannot be produced by software or any electronic device; a person is born with one and refines it over the years of their life. A personality can be compared to a soul: we all have one, and in a way they are all unique; some people may share a similar personality, but they will all have different causes for the personality they possess. Humans are not the only ones with personalities; the same goes for other living organisms in this world, for example the animal kingdom. We all have brains, which store our personalities and every feeling we have ever felt in our lifetimes. In conclusion, the brain is a complex organ, and for humans and the other organisms of this world alike, our personalities will always change rather than stay exactly the same.

