eMathZone

Basic Principles of Experimental Designs

The basic principles of experimental designs are randomization, replication and local control. These principles make a valid test of significance possible. Each of them is described briefly in the following subsections.

(1) Randomization. The first principle of an experimental design is randomization, which is a random process of assigning treatments to the experimental units. The random process implies that every possible allotment of treatments has the same probability. An experimental unit is the smallest division of the experimental material, and a treatment means an experimental condition whose effect is to be measured and compared. The purpose of randomization is to remove bias and other sources of extraneous variation which are not controllable. Another advantage of randomization (accompanied by replication) is that it forms the basis of any valid statistical test. Hence, the treatments must be assigned at random to the experimental units. Randomization is usually done by drawing numbered cards from a well-shuffled pack of cards, by drawing numbered balls from a well-shaken container or by using tables of random numbers.
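
As an illustration (added here, not part of the original text), such a random allotment can be carried out with a software random number generator; the unit labels and treatment names below are hypothetical:

```python
import random

# 20 hypothetical experimental units and 4 hypothetical treatments,
# each replicated 5 times, allotted by a software random number generator.
units = [f"unit-{i}" for i in range(1, 21)]
treatments = ["A", "B", "C", "D"] * 5

random.shuffle(treatments)  # every possible allotment is equally likely
allocation = dict(zip(units, treatments))
print(allocation)
```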

(2) Replication. The second principle of an experimental design is replication, which is a repetition of the basic experiment. In other words, it is a complete run for all the treatments to be tested in the experiment. In all experiments, some kind of variation is introduced because the experimental units, such as individuals or plots of land in agricultural experiments, cannot be physically identical. This type of variation can be controlled by using a number of experimental units. We therefore perform the experiment more than once, i.e., we repeat the basic experiment. An individual repetition is called a replicate. The number, the shape and the size of replicates depend upon the nature of the experimental material. A replication is used to:

(i) Secure a more accurate estimate of the experimental error, a term which represents the differences that would be observed if the same treatments were applied several times to the same experimental units;

(ii) Decrease the experimental error and thereby increase precision, which is a measure of the variability of the experimental error; and

(iii) Obtain a more precise estimate of the mean effect of a treatment, since $$\sigma^2_{\bar{y}} = \frac{\sigma^2}{n}$$, where $$n$$ denotes the number of replications.
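
A small simulation sketch (added for illustration; the variance value and seed are arbitrary choices) confirms this formula empirically: with $$\sigma^2 = 4$$ and $$n = 10$$ replications, the variance of the sample mean comes out near $$\sigma^2/n = 0.4$$.

```python
import random
import statistics

# Illustrative check of sigma^2_ybar = sigma^2 / n: simulate many
# replicated experiments and compare the empirical variance of the
# sample mean with the theoretical value.
random.seed(1)
sigma2, n, trials = 4.0, 10, 20000

means = [
    statistics.mean(random.gauss(0.0, sigma2 ** 0.5) for _ in range(n))
    for _ in range(trials)
]
print(statistics.variance(means))  # close to 0.4
print(sigma2 / n)                  # exactly 0.4
```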

(3) Local Control. It has been observed that not all extraneous sources of variation are removed by randomization and replication. This necessitates a refinement of the experimental technique. In other words, we need to choose a design in such a manner that all extraneous sources of variation are brought under control. For this purpose, we make use of local control, a term referring to the amount of balancing, blocking and grouping of the experimental units. Balancing means that the treatments should be assigned to the experimental units in such a way that the result is a balanced arrangement of the treatments. Blocking means that like experimental units should be collected together to form a relatively homogeneous group. A block is also a replicate. The main purpose of the principle of local control is to increase the efficiency of an experimental design by decreasing the experimental error. The point to remember here is that the term local control should not be confused with the word control. The word control in experimental design refers to a group of experimental units that does not receive any treatment, used when we need to find out the effectiveness of other treatments through comparison.


Guide to Experimental Design | Overview, 5 steps & Examples

Published on December 3, 2019 by Rebecca Bevans. Revised on June 21, 2023.

Experiments are used to study causal relationships. You manipulate one or more independent variables and measure their effect on one or more dependent variables.

Experimental design creates a set of procedures to systematically test a hypothesis. A good experimental design requires a strong understanding of the system you are studying.

There are five key steps in designing an experiment:

  • Consider your variables and how they are related
  • Write a specific, testable hypothesis
  • Design experimental treatments to manipulate your independent variable
  • Assign subjects to groups, either between-subjects or within-subjects
  • Plan how you will measure your dependent variable

For valid conclusions, you also need to select a representative sample and control any extraneous variables that might influence your results. If random assignment of participants to control and treatment groups is impossible, unethical, or highly impractical, consider an observational study instead. Careful design minimizes several types of research bias, particularly sampling bias, survivorship bias, and attrition bias (participants dropping out as time passes).

Table of contents

  • Step 1: Define your variables
  • Step 2: Write your hypothesis
  • Step 3: Design your experimental treatments
  • Step 4: Assign your subjects to treatment groups
  • Step 5: Measure your dependent variable
  • Frequently asked questions about experiments

You should begin with a specific research question. We will work with two research question examples, one from health sciences and one from ecology:

To translate your research question into an experimental hypothesis, you need to define the main variables and make predictions about how they are related.

Start by simply listing the independent and dependent variables.

| Research question | Independent variable | Dependent variable |
| --- | --- | --- |
| Phone use and sleep | Minutes of phone use before sleep | Hours of sleep per night |
| Temperature and soil respiration | Air temperature just above the soil surface | CO₂ respired from soil |

Then you need to think about possible extraneous and confounding variables and consider how you might control them in your experiment.

| Research question | Extraneous variable | How to control |
| --- | --- | --- |
| Phone use and sleep | Natural variation in sleep patterns among individuals | Measure the average difference between sleep with phone use and sleep without phone use, rather than the average amount of sleep per treatment group |
| Temperature and soil respiration | Soil moisture also affects respiration, and moisture can decrease with increasing temperature | Monitor soil moisture and add water to make sure that soil moisture is consistent across all treatment plots |

Finally, you can put these variables together into a diagram. Use arrows to show the possible relationships between variables and include signs to show the expected direction of the relationships.

Diagram of the relationship between variables in a sleep experiment

Here we predict that increasing temperature will increase soil respiration and decrease soil moisture, while decreasing soil moisture will lead to decreased soil respiration.


Now that you have a strong conceptual understanding of the system you are studying, you should be able to write a specific, testable hypothesis that addresses your research question.

| Research question | Null hypothesis (H₀) | Alternate hypothesis (Hₐ) |
| --- | --- | --- |
| Phone use and sleep | Phone use before sleep does not correlate with the amount of sleep a person gets. | Increasing phone use before sleep leads to a decrease in sleep. |
| Temperature and soil respiration | Air temperature does not correlate with soil respiration. | Increased air temperature leads to increased soil respiration. |

The next steps will describe how to design a controlled experiment. In a controlled experiment, you must be able to:

  • Systematically and precisely manipulate the independent variable(s).
  • Precisely measure the dependent variable(s).
  • Control any potential confounding variables.

If your study system doesn’t match these criteria, there are other types of research you can use to answer your research question.

How you manipulate the independent variable can affect the experiment’s external validity – that is, the extent to which the results can be generalized and applied to the broader world.

First, you may need to decide how widely to vary your independent variable. In the soil respiration example, for instance, you could increase air temperature:

  • just slightly above the natural range for your study region.
  • over a wider range of temperatures to mimic future warming.
  • over an extreme range that is beyond any possible natural variation.

Second, you may need to choose how finely to vary your independent variable. Sometimes this choice is made for you by your experimental system, but often you will need to decide, and this will affect how much you can infer from your results. In the phone use example, for instance, you could treat phone use as:

  • a categorical variable: either as binary (yes/no) or as levels of a factor (no phone use, low phone use, high phone use).
  • a continuous variable (minutes of phone use measured every night).

How you apply your experimental treatments to your test subjects is crucial for obtaining valid and reliable results.

First, you need to consider the study size: how many individuals will be included in the experiment? In general, the more subjects you include, the greater your experiment’s statistical power, which determines how much confidence you can have in your results.

Then you need to randomly assign your subjects to treatment groups. Each group receives a different level of the treatment (e.g. no phone use, low phone use, high phone use).

You should also include a control group, which receives no treatment. The control group tells us what would have happened to your test subjects without any experimental intervention.

When assigning your subjects to groups, there are two main choices you need to make:

  • A completely randomized design vs a randomized block design.
  • A between-subjects design vs a within-subjects design.

Randomization

An experiment can be completely randomized or randomized within blocks (aka strata):

  • In a completely randomized design, every subject is assigned to a treatment group at random.
  • In a randomized block design (aka stratified random design), subjects are first grouped according to a characteristic they share, and then randomly assigned to treatments within those groups.

| Research question | Completely randomized design | Randomized block design |
| --- | --- | --- |
| Phone use and sleep | Subjects are all randomly assigned a level of phone use using a random number generator. | Subjects are first grouped by age, and then phone use treatments are randomly assigned within these groups. |
| Temperature and soil respiration | Warming treatments are assigned to soil plots at random by using a number generator to generate map coordinates within the study area. | Soils are first grouped by average rainfall, and then treatment plots are randomly assigned within these groups. |
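
The two schemes in the table can be sketched in code; the subjects, age groups, and treatment levels below are hypothetical stand-ins for the phone-use example:

```python
import random
from collections import defaultdict

random.seed(42)
levels = ["none", "low", "high"]
# Hypothetical subjects, each with an age-group characteristic.
subjects = [{"id": i, "age": random.choice(["18-30", "31-50", "51+"])}
            for i in range(12)]

# Completely randomized design: shuffle a balanced list of labels
# and assign one to each subject, irrespective of age.
labels = levels * (len(subjects) // len(levels))
random.shuffle(labels)
for subj, level in zip(subjects, labels):
    subj["crd"] = level

# Randomized block design: shuffle labels separately within each age block.
blocks = defaultdict(list)
for subj in subjects:
    blocks[subj["age"]].append(subj)
for members in blocks.values():
    block_labels = (levels * ((len(members) + 2) // 3))[:len(members)]
    random.shuffle(block_labels)
    for subj, level in zip(members, block_labels):
        subj["rbd"] = level

for subj in subjects:
    print(subj)
```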

Sometimes randomization isn’t practical or ethical, so researchers create partially random or even non-random designs. An experimental design where treatments aren’t randomly assigned is called a quasi-experimental design.

Between-subjects vs. within-subjects

In a between-subjects design (also known as an independent measures design or classic ANOVA design), individuals receive only one of the possible levels of an experimental treatment.

In medical or social research, you might also use matched pairs within your between-subjects design to make sure that each treatment group contains the same variety of test subjects in the same proportions.

In a within-subjects design (also known as a repeated measures design), every individual receives each of the experimental treatments consecutively, and their responses to each treatment are measured.

Within-subjects or repeated measures can also refer to an experimental design where an effect emerges over time, and individual responses are measured over time in order to measure this effect as it emerges.

Counterbalancing (randomizing or reversing the order of treatments among subjects) is often used in within-subjects designs to ensure that the order of treatment application doesn’t influence the results of the experiment.
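
A minimal sketch of counterbalancing (added for illustration; the subjects are hypothetical and the treatment levels borrow from the phone-use example), where each subject's treatment order is shuffled independently:

```python
import random

random.seed(0)
treatments = ["none", "low", "high"]

# Each hypothetical subject receives all treatments, but in an
# independently shuffled order, so order effects average out.
orders = {}
for subject_id in range(1, 7):
    order = treatments.copy()
    random.shuffle(order)
    orders[subject_id] = order

for subject_id, order in orders.items():
    print(subject_id, order)
```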

| Research question | Between-subjects (independent measures) design | Within-subjects (repeated measures) design |
| --- | --- | --- |
| Phone use and sleep | Subjects are randomly assigned a level of phone use (none, low, or high) and follow that level of phone use throughout the experiment. | Subjects are assigned consecutively to zero, low, and high levels of phone use throughout the experiment, and the order in which they follow these treatments is randomized. |
| Temperature and soil respiration | Warming treatments are assigned to soil plots at random and the soils are kept at this temperature throughout the experiment. | Every plot receives each warming treatment (1, 3, 5, 8, and 10°C above ambient temperatures) consecutively over the course of the experiment, and the order in which they receive these treatments is randomized. |


Finally, you need to decide how you’ll collect data on your dependent variable outcomes. You should aim for reliable and valid measurements that minimize research bias or error.

Some variables, like temperature, can be objectively measured with scientific instruments. Others may need to be operationalized to turn them into measurable observations.

For example, in the sleep study you could:

  • Ask participants to record what time they go to sleep and get up each day.
  • Ask participants to wear a sleep tracker.

How precisely you measure your dependent variable also affects the kinds of statistical analysis you can use on your data.

Experiments are always context-dependent, and a good experimental design will take into account all of the unique considerations of your study system to produce information that is both valid and relevant to your research question.


Experimental design means planning a set of procedures to investigate a relationship between variables. To design a controlled experiment, you need:

  • A testable hypothesis
  • At least one independent variable that can be precisely manipulated
  • At least one dependent variable that can be precisely measured

When designing the experiment, you decide:

  • How you will manipulate the variable(s)
  • How you will control for any potential confounding variables
  • How many subjects or samples will be included in the study
  • How subjects will be assigned to treatment levels

Experimental design is essential to the internal and external validity of your experiment.

The key difference between observational studies and experimental designs is that a well-done observational study does not influence the responses of participants, while experiments do have some sort of treatment condition applied to at least some participants by random assignment.

A confounding variable, also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.

A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.

In your research design, it’s important to identify potential confounding variables and plan how you will reduce their impact.

In a between-subjects design, every participant experiences only one condition, and researchers assess group differences between participants in various conditions.

In a within-subjects design, each participant experiences all conditions, and researchers test the same participants repeatedly for differences between conditions.

The word “between” means that you’re comparing different conditions between groups, while the word “within” means you’re comparing different conditions within the same group.

An experimental group, also known as a treatment group, receives the treatment whose effect researchers wish to study, whereas a control group does not. They should be identical in all other ways.


Main Principles of experimental design: the 3 “R’s”

There are three basic principles behind any experimental design:

Randomisation: the random allocation of treatments to the experimental units. Randomise to avoid confounding between treatment effects and other unknown effects.

Replication: the repetition of a treatment within an experiment allows us:

  • to quantify the natural variation between experimental units;
  • to increase the accuracy of estimated effects.

Reduce noise: by controlling as much as possible the conditions in the experiment. A classical example is the grouping of similar experimental units in blocks. Known characteristics/properties of the experimental units can also be used to explain variation, for example through the inclusion of block effects in the statistical model.

Statistical Design and Analysis of Biological Experiments

Chapter 1 Principles of Experimental Design

1.1 Introduction

The validity of conclusions drawn from a statistical analysis crucially hinges on the manner in which the data are acquired, and even the most sophisticated analysis will not rescue a flawed experiment. Planning an experiment and thinking about the details of data acquisition is so important for a successful analysis that R. A. Fisher—who single-handedly invented many of the experimental design techniques we are about to discuss—famously wrote

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. (Fisher 1938)

(Statistical) design of experiments provides the principles and methods for planning experiments and tailoring the data acquisition to an intended analysis. Design and analysis of an experiment are best considered as two aspects of the same enterprise: the goals of the analysis strongly inform an appropriate design, and the implemented design determines the possible analyses.

The primary aim of designing experiments is to ensure that valid statistical and scientific conclusions can be drawn that withstand the scrutiny of a determined skeptic. Good experimental design also considers that resources are used efficiently, and that estimates are sufficiently precise and hypothesis tests adequately powered. It protects our conclusions by excluding alternative interpretations or rendering them implausible. Three main pillars of experimental design are randomization , replication , and blocking , and we will flesh out their effects on the subsequent analysis as well as their implementation in an experimental design.

An experimental design is always tailored towards predefined (primary) analyses and an efficient analysis and unambiguous interpretation of the experimental data is often straightforward from a good design. This does not prevent us from doing additional analyses of interesting observations after the data are acquired, but these analyses can be subjected to more severe criticisms and conclusions are more tentative.

In this chapter, we provide the wider context for using experiments in a larger research enterprise and informally introduce the main statistical ideas of experimental design. We use a comparison of two samples as our main example to study how design choices affect an analysis, but postpone a formal quantitative analysis to the next chapters.

1.2 A Cautionary Tale

For illustrating some of the issues arising in the interplay of experimental design and analysis, we consider a simple example. We are interested in comparing the enzyme levels measured in processed blood samples from laboratory mice, when the sample processing is done either with a kit from a vendor A, or a kit from a competitor B. For this, we take 20 mice and randomly select 10 of them for sample preparation with kit A, while the blood samples of the remaining 10 mice are prepared with kit B. The experiment is illustrated in Figure 1.1 A and the resulting data are given in Table 1.1 .

Table 1.1: Measured enzyme levels from samples of twenty mice. Samples of ten mice each were processed using a kit of vendor A and B, respectively.

| Vendor | Enzyme levels |
| --- | --- |
| A | 8.96, 8.95, 11.37, 12.63, 11.38, 8.36, 6.87, 12.35, 10.32, 11.99 |
| B | 12.68, 11.37, 12.00, 9.81, 10.35, 11.76, 9.01, 10.83, 8.76, 9.99 |

One option for comparing the two kits is to look at the difference in average enzyme levels, and we find an average level of 10.32 for vendor A and 10.66 for vendor B. We would like to interpret their difference of -0.34 as the difference due to the two preparation kits and conclude whether the two kits give equal results or if measurements based on one kit are systematically different from those based on the other kit.
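
These averages can be verified directly from Table 1.1; the following snippet (added here for illustration) reproduces the values quoted above:

```python
import statistics

# Enzyme levels from Table 1.1.
kit_a = [8.96, 8.95, 11.37, 12.63, 11.38, 8.36, 6.87, 12.35, 10.32, 11.99]
kit_b = [12.68, 11.37, 12.00, 9.81, 10.35, 11.76, 9.01, 10.83, 8.76, 9.99]

mean_a = statistics.mean(kit_a)  # 10.32
mean_b = statistics.mean(kit_b)  # 10.66
print(round(mean_a, 2), round(mean_b, 2), round(mean_a - mean_b, 2))  # difference -0.34
```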

Such interpretation, however, is only valid if the two groups of mice and their measurements are identical in all aspects except the sample preparation kit. If we use one strain of mice for kit A and another strain for kit B, any difference might also be attributed to inherent differences between the strains. Similarly, if the measurements using kit B were conducted much later than those using kit A, any observed difference might be attributed to changes in, e.g., mice selected, batches of chemicals used, device calibration, or any number of other influences. None of these competing explanations for an observed difference can be excluded from the given data alone, but good experimental design allows us to render them (almost) arbitrarily implausible.

A second aspect for our analysis is the inherent uncertainty in our calculated difference: if we repeat the experiment, the observed difference will change each time, and this will be more pronounced for a smaller number of mice, among others. If we do not use a sufficient number of mice in our experiment, the uncertainty associated with the observed difference might be too large, such that random fluctuations become a plausible explanation for the observed difference. Systematic differences between the two kits, of practically relevant magnitude in either direction, might then be compatible with the data, and we can draw no reliable conclusions from our experiment.

In each case, the statistical analysis—no matter how clever—was doomed before the experiment was even started, while simple ideas from statistical design of experiments would have provided correct and robust results with interpretable conclusions.

1.3 The Language of Experimental Design

By an experiment we understand an investigation where the researcher has full control over selecting and altering the experimental conditions of interest, and we only consider investigations of this type. The selected experimental conditions are called treatments. An experiment is comparative if the responses to several treatments are to be compared or contrasted. The experimental units are the smallest subdivision of the experimental material to which a treatment can be assigned. All experimental units given the same treatment constitute a treatment group. Especially in biology, we often compare treatments to a control group to which some standard experimental conditions are applied; a typical example is using a placebo for the control group, and different drugs for the other treatment groups.

The values observed are called responses and are measured on the response units; these are often identical to the experimental units but need not be. Multiple experimental units are sometimes combined into groupings or blocks, such as mice grouped by litter, or samples grouped by batches of chemicals used for their preparation. More generally, we call any grouping of the experimental material (even with group size one) a unit.

In our example, we selected the mice, used a single sample per mouse, deliberately chose the two specific vendors, and had full control over which kit to assign to which mouse. In other words, the two kits are the treatments and the mice are the experimental units. We took the measured enzyme level of a single sample from a mouse as our response, and samples are therefore the response units. The resulting experiment is comparative, because we contrast the enzyme levels between the two treatment groups.


Figure 1.1: Three designs to determine the difference between two preparation kits A and B based on four mice. A: One sample per mouse. Comparison between averages of samples with same kit. B: Two samples per mouse treated with the same kit. Comparison between averages of mice with same kit requires averaging responses for each mouse first. C: Two samples per mouse each treated with different kit. Comparison between two samples of each mouse, with differences averaged.

In this example, we can coalesce experimental and response units, because we have a single response per mouse and cannot distinguish a sample from a mouse in the analysis, as illustrated in Figure 1.1 A for four mice. Responses from mice with the same kit are averaged, and the kit difference is the difference between these two averages.

By contrast, if we take two samples per mouse and use the same kit for both samples, then the mice are still the experimental units, but each mouse now groups the two response units associated with it. Now, responses from the same mouse are first averaged, and these averages are used to calculate the difference between kits; even though eight measurements are available, this difference is still based on only four mice (Figure 1.1 B).

If we take two samples per mouse, but apply each kit to one of the two samples, then the samples are both the experimental and response units, while the mice are blocks that group the samples. Now, we calculate the difference between kits for each mouse, and then average these differences (Figure 1.1 C).

If we only use one kit and determine the average enzyme level, then this investigation is still an experiment, but is not comparative.

To summarize, the design of an experiment determines the logical structure of the experiment; it consists of (i) a set of treatments (the two kits); (ii) a specification of the experimental units (animals, cell lines, samples) (the mice in Figure 1.1 A,B and the samples in Figure 1.1 C); (iii) a procedure for assigning treatments to units; and (iv) a specification of the response units and the quantity to be measured as a response (the samples and associated enzyme levels).

1.4 Experiment Validity

Before we embark on the more technical aspects of experimental design, we discuss three components for evaluating an experiment’s validity: construct validity, internal validity, and external validity. These criteria are well-established in areas such as educational and psychological research, and have more recently been discussed for animal research (Würbel 2017), where experiments are increasingly scrutinized for their scientific rationale and their design and intended analyses.

1.4.1 Construct Validity

Construct validity concerns the choice of the experimental system for answering our research question. Is the system even capable of providing a relevant answer to the question?

Studying the mechanisms of a particular disease, for example, might require careful choice of an appropriate animal model that shows a disease phenotype and is accessible to experimental interventions. If the animal model is a proxy for drug development for humans, biological mechanisms must be sufficiently similar between animal and human physiologies.

Another important aspect of the construct is the quantity that we intend to measure (the measurand ), and its relation to the quantity or property we are interested in. For example, we might measure the concentration of the same chemical compound once in a blood sample and once in a highly purified sample, and these constitute two different measurands, whose values might not be comparable. Often, the quantity of interest (e.g., liver function) is not directly measurable (or even quantifiable) and we measure a biomarker instead. For example, pre-clinical and clinical investigations may use concentrations of proteins or counts of specific cell types from blood samples, such as the CD4+ cell count used as a biomarker for immune system function.

1.4.2 Internal Validity

The internal validity of an experiment concerns the soundness of the scientific rationale, statistical properties such as precision of estimates, and the measures taken against risk of bias. It refers to the validity of claims within the context of the experiment. Statistical design of experiments plays a prominent role in ensuring internal validity, and we briefly discuss the main ideas before providing the technical details and an application to our example in the subsequent sections.

Scientific Rationale and Research Question

The scientific rationale of a study is (usually) not immediately a statistical question. Translating a scientific question into a quantitative comparison amenable to statistical analysis is no small task and often requires careful consideration. It is a substantial, if non-statistical, benefit of using experimental design that we are forced to formulate a precise-enough research question and decide on the main analyses required for answering it before we conduct the experiment. For example, the question “Is there a difference between placebo and drug?” is insufficiently precise for planning a statistical analysis and determining an adequate experimental design. What exactly is the drug treatment? What should the drug’s concentration be and how is it administered? How do we make sure that the placebo group is comparable to the drug group in all other aspects? What do we measure, and what do we mean by “difference”? A shift in average response, a fold-change, a change in response before and after treatment?

The scientific rationale also enters the choice of a potential control group to which we compare responses. The quote

The deep, fundamental question in statistical analysis is ‘Compared to what?’ (Tufte 1997)

highlights the importance of this choice.

There are almost never enough resources to answer all relevant scientific questions. We therefore define a few questions of highest interest, and the main purpose of the experiment is answering these questions in the primary analysis. This intended analysis drives the experimental design to ensure relevant estimates can be calculated and have sufficient precision, and tests are adequately powered. This does not preclude us from conducting additional secondary analyses and exploratory analyses, but we are not willing to enlarge the experiment to ensure that strong conclusions can also be drawn from these analyses.

Risk of Bias

Experimental bias is a systematic difference in response between experimental units in addition to the difference caused by the treatments. The experimental units in the different groups are then not equal in all aspects other than the treatment applied to them. We saw several examples in Section 1.2.

Minimizing the risk of bias is crucial for internal validity and we look at some common measures to eliminate or reduce different types of bias in Section 1.5.

Precision and Effect Size

Another aspect of internal validity is the precision of estimates and the expected effect sizes. Is the experimental setup, in principle, able to detect a difference of relevant magnitude? Experimental design offers several methods for answering this question based on the expected heterogeneity of samples, the measurement error, and other sources of variation: power analysis is a technique for determining the number of samples required to reliably detect a relevant effect size and provide estimates of sufficient precision. More samples yield more precision and more power, but we have to be careful that replication is done at the right level: simply measuring a biological sample multiple times as in Figure 1.1 B yields more measured values, but is pseudo-replication for analyses. Replication should also ensure that the statistical uncertainties of estimates can be gauged from the data of the experiment itself, without additional untestable assumptions. Finally, the technique of blocking, shown in Figure 1.1 C, can remove a substantial proportion of the variation and thereby increase power and precision if we find a way to apply it.
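
As a sketch of the power-analysis idea (added here, assuming the statsmodels package is available; the effect size, significance level, and power target are hypothetical choices, not values from the text):

```python
from statsmodels.stats.power import TTestIndPower

# Samples per group needed to detect a standardized effect size of 0.8
# with a two-sided two-sample t-test at alpha = 0.05 and 80% power.
n_per_group = TTestIndPower().solve_power(effect_size=0.8, alpha=0.05,
                                          power=0.8, alternative="two-sided")
print(n_per_group)  # roughly 26 per group
```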

1.4.3 External Validity

The external validity of an experiment concerns its replicability and the generalizability of inferences. An experiment is replicable if its results can be confirmed by an independent new experiment, preferably by a different lab and researcher. Experimental conditions in the replicate experiment usually differ from the original experiment, which provides evidence that the observed effects are robust to such changes. A much weaker condition on an experiment is reproducibility, the property that an independent researcher draws equivalent conclusions based on the data from this particular experiment, using the same analysis techniques. Reproducibility requires publishing the raw data, details on the experimental protocol, and a description of the statistical analyses, preferably with accompanying source code. Many scientific journals subscribe to reporting guidelines to ensure reproducibility and these are also helpful for planning an experiment.

A main threat to replicability and generalizability are too tightly controlled experimental conditions, when inferences only hold for a specific lab under the very specific conditions of the original experiment. Introducing systematic heterogeneity and using multi-center studies effectively broadens the experimental conditions and therefore the inferences for which internal validity is available.

For systematic heterogeneity, experimental conditions are systematically altered in addition to the treatments, and treatment differences estimated for each condition. For example, we might split the experimental material into several batches and use a different day of analysis, sample preparation, batch of buffer, measurement device, and lab technician for each batch. A more general inference is then possible if effect size, effect direction, and precision are comparable between the batches, indicating that the treatment differences are stable over the different conditions.

In multi-center experiments, the same experiment is conducted in several different labs and the results compared and merged. Multi-center approaches are very common in clinical trials and often necessary to reach the required number of patient enrollments.

Generalizability of randomized controlled trials in medicine and animal studies can suffer from overly restrictive eligibility criteria. In clinical trials, patients are often included or excluded based on co-medications and co-morbidities, and the resulting sample of eligible patients might no longer be representative of the patient population. For example, Travers et al. (2007) applied the eligibility criteria of 17 randomized controlled trials of asthma treatments and found that out of 749 patients, only a median of 6% (45 patients) would be eligible for an asthma-related randomized controlled trial. This puts a question mark on the relevance of the trials’ findings for asthma patients in general.

1.5 Reducing the Risk of Bias

1.5.1 Randomization of Treatment Allocation

If systematic differences other than the treatment exist between our treatment groups, then the effect of the treatment is confounded with these other differences and our estimates of treatment effects might be biased.

We remove such unwanted systematic differences from our treatment comparisons by randomizing the allocation of treatments to experimental units. In a completely randomized design, each experimental unit has the same chance of being subjected to any of the treatments, and any differences between the experimental units other than the treatments are distributed over the treatment groups. Importantly, randomization is the only method that also protects our experiment against unknown sources of bias: we do not need to know all or even any of the potential differences and yet their impact is eliminated from the treatment comparisons by random treatment allocation.

Randomization has two effects: (i) differences unrelated to treatment become part of the ‘statistical noise’ rendering the treatment groups more similar; and (ii) the systematic differences are thereby eliminated as sources of bias from the treatment comparison.

Randomization transforms systematic variation into random variation.

In our example, a proper randomization would select 10 out of our 20 mice fully at random, such that the probability of any one mouse being picked is 1/20. These ten mice are then assigned to kit A, and the remaining mice to kit B. This allocation is entirely independent of the treatments and of any properties of the mice.

To ensure random treatment allocation, some kind of random process needs to be employed. This can be as simple as shuffling a pack of 10 red and 10 black cards or using a software-based random number generator. Randomization is slightly more difficult if the number of experimental units is not known at the start of the experiment, such as when patients are recruited for an ongoing clinical trial (sometimes called rolling recruitment), and we want to have reasonable balance between the treatment groups at each stage of the trial.
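
A minimal sketch of the mouse allocation described above, using a software random number generator (the mouse labels are hypothetical):

```python
import random

# 20 hypothetical mice; draw 10 fully at random for kit A,
# the remainder receive kit B.
mice = list(range(1, 21))
kit_a_mice = set(random.sample(mice, 10))
allocation = {mouse: ("A" if mouse in kit_a_mice else "B") for mouse in mice}
print(allocation)
```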

Seemingly random assignments “by hand” are usually no less complicated than fully random assignments, but are always inferior. If surprising results ensue from the experiment, such assignments are subject to unanswerable criticism and suspicion of unwanted bias. Even worse are systematic allocations; they can only remove bias from known causes, and immediately raise red flags under the slightest scrutiny.

The Problem of Undesired Assignments

Even with a fully random treatment allocation procedure, we might end up with an undesirable allocation. For our example, the treatment group of kit A might—just by chance—contain mice that are all bigger or more active than those in the other treatment group. Statistical orthodoxy recommends using the design nevertheless, because only full randomization guarantees valid estimates of residual variance and unbiased estimates of effects. This argument, however, concerns the long-run properties of the procedure and seems of little help in this specific situation. Why should we care if the randomization yields correct estimates under replication of the experiment, if the particular experiment is jeopardized?

Another solution is to create a list of all possible allocations that we would accept and randomly choose one of these allocations for our experiment. The analysis should then reflect this restriction in the possible randomizations, which often renders this approach difficult to implement.

The most pragmatic method is to reject highly undesirable designs and compute a new randomization (Cox 1958). Undesirable allocations are unlikely to arise for large sample sizes, and we might accept a small bias in estimation for small sample sizes, when uncertainty in the estimated treatment effect is already high. In this approach, whenever we reject a particular outcome, we must also be willing to reject the outcome if we permute the treatment level labels. If we reject eight big and two small mice for kit A, then we must also reject two big and eight small mice. We must also be transparent and report a rejected allocation, so that critics may come to their own conclusions about potential biases and their remedies.

1.5.2 Blinding

Bias in treatment comparisons is also introduced if treatment allocation is random, but responses cannot be measured entirely objectively, or if knowledge of the assigned treatment affects the response. In clinical trials, for example, patients might react differently when they know they are on a placebo treatment, an effect known as cognitive bias. In animal experiments, caretakers might report more abnormal behavior for animals on a more severe treatment. Cognitive bias can be eliminated by concealing the treatment allocation from technicians or participants of a clinical trial, a technique called single-blinding.

If response measures are partially based on professional judgement (such as a clinical scale), patient or physician might unconsciously report lower scores for a placebo treatment, a phenomenon known as observer bias. Its removal requires double blinding, where treatment allocations are additionally concealed from the experimentalist.

Blinding requires randomized treatment allocation to begin with and substantial effort might be needed to implement it. Drug companies, for example, have to go to great lengths to ensure that a placebo looks, tastes, and feels similar enough to the actual drug. Additionally, blinding is often done by coding the treatment conditions and samples, and effect sizes and statistical significance are calculated before the code is revealed.

In clinical trials, double-blinding creates a conflict of interest. The attending physicians do not know which patient received which treatment, and thus accumulation of side-effects cannot be linked to any treatment. For this reason, clinical trials have a data monitoring committee not involved in the final analysis, that performs intermediate analyses of efficacy and safety at predefined intervals. If severe problems are detected, the committee might recommend altering or aborting the trial. The same might happen if one treatment already shows overwhelming evidence of superiority, such that it becomes unethical to withhold this treatment from the other patients.

1.5.3 Analysis Plan and Registration

An often overlooked source of bias has been termed the researcher degrees of freedom or garden of forking paths in the data analysis. For any set of data, there are many different options for its analysis: some results might be considered outliers and discarded, assumptions are made on error distributions and appropriate test statistics, different covariates might be included into a regression model. Often, multiple hypotheses are investigated and tested, and analyses are done separately on various (overlapping) subgroups. Hypotheses formed after looking at the data require additional care in their interpretation; almost never will \(p\)-values for these ad hoc or post hoc hypotheses be statistically justifiable. Many different measured response variables invite fishing expeditions, where patterns in the data are sought without an underlying hypothesis. Only reporting those sub-analyses that gave ‘interesting’ findings invariably leads to biased conclusions and is called cherry-picking or \(p\)-hacking (or much less flattering names).

The statistical analysis is always part of a larger scientific argument and we should consider the necessary computations in relation to building our scientific argument about the interpretation of the data. In addition to the statistical calculations, this interpretation requires substantial subject-matter knowledge and includes (many) non-statistical arguments. Two quotes highlight that experiment and analysis are a means to an end and not the end in itself.

There is a boundary in data interpretation beyond which formulas and quantitative decision procedures do not go, where judgment and style enter. (Abelson 1995)
Often, perfectly reasonable people come to perfectly reasonable decisions or conclusions based on nonstatistical evidence. Statistical analysis is a tool with which we support reasoning. It is not a goal in itself. (Bailar III 1981)

There is often a grey area between exploiting researcher degrees of freedom to arrive at a desired conclusion, and creative yet informed analyses of data. One way to navigate this area is to distinguish between exploratory studies and confirmatory studies. The former have no clearly stated scientific question, but are used to generate interesting hypotheses by identifying potential associations or effects that are then further investigated. Conclusions from these studies are very tentative and must be reported honestly as such. In contrast, standards are much higher for confirmatory studies, which investigate a specific predefined scientific question. Analysis plans and pre-registration of an experiment are accepted means for demonstrating lack of bias due to researcher degrees of freedom, and separating primary from secondary analyses allows emphasizing the main goals of the study.

Analysis Plan

The analysis plan is written before conducting the experiment and details the measurands and estimands, the hypotheses to be tested together with a power and sample size calculation, a discussion of relevant effect sizes, detection and handling of outliers and missing data, as well as steps for data normalization such as transformations and baseline corrections. If a regression model is required, its factors and covariates are outlined. Particularly in biology, handling measurements below the limit of quantification and saturation effects require careful consideration.

In the context of clinical trials, the problem of estimands has become a recent focus of attention. An estimand is the target of a statistical estimation procedure, for example the true average difference in enzyme levels between the two preparation kits. A main problem in many studies is post-randomization events that can change the estimand, even if the estimation procedure remains the same. For example, if kit B fails to produce usable samples for measurement in five out of ten cases because the enzyme level was too low, while kit A could handle these enzyme levels perfectly fine, then this might severely exaggerate the observed difference between the two kits. Similar problems arise in drug trials, when some patients stop taking one of the drugs due to side-effects or other complications.

Registration

Registration of experiments is an even more severe measure used in conjunction with an analysis plan and is becoming standard in clinical trials. Here, information about the trial, including the analysis plan, procedure to recruit patients, and stopping criteria, are registered in a public database. Publications based on the trial then refer to this registration, such that reviewers and readers can compare what the researchers intended to do and what they actually did. Similar portals for pre-clinical and translational research are also available.

1.6 Notes and Summary

The problem of measurements and measurands is further discussed for statistics in Hand (1996) and specifically for biological experiments in Coxon, Longstaff, and Burns (2019). A general review of methods for handling missing data is Dong and Peng (2013). The different roles of randomization are emphasized in Cox (2009).

Two well-known reporting guidelines are the ARRIVE guidelines for animal research (Kilkenny et al. 2010) and the CONSORT guidelines for clinical trials (Moher et al. 2010). Guidelines describing the minimal information required for reproducing experimental results have been developed for many types of experimental techniques, including microarrays (MIAME), RNA sequencing (MINSEQE), metabolomics (MSI) and proteomics (MIAPE) experiments; the FAIRSHARE initiative provides a more comprehensive collection (Sansone et al. 2019).

The problems of experimental design in animal experiments and particularly translational research are discussed in Couzin-Frankel (2013). Multi-center studies are now considered for these investigations, and using a second laboratory already increases reproducibility substantially (Richter et al. 2010; Richter 2017; Voelkl et al. 2018; Karp 2018) and allows standardizing the treatment effects (Kafkafi et al. 2017). First attempts are reported of using designs similar to clinical trials (Llovera and Liesz 2016). Exploratory-confirmatory research and external validity for animal studies is discussed in Kimmelman, Mogil, and Dirnagl (2014) and Pound and Ritskes-Hoitinga (2018). Further information on pilot studies is found in Moore et al. (2011), Sim (2019), and Thabane et al. (2010).

The deliberate use of statistical analyses and their interpretation for supporting a larger argument was called statistics as principled argument (Abelson 1995). Employing useless statistical analysis without reference to the actual scientific question is surrogate science (Gigerenzer and Marewski 2014), and adaptive thinking is integral to meaningful statistical analysis (Gigerenzer 2002).

In an experiment, the investigator has full control over the experimental conditions applied to the experiment material. The experimental design gives the logical structure of an experiment: the units describing the organization of the experimental material, the treatments and their allocation to units, and the response. Statistical design of experiments includes techniques to ensure internal validity of an experiment, and methods to make inference from experimental data efficient.


1.2 - The Basic Principles of DOE

Randomization

This is an essential component of any experiment that is going to have validity. If you are doing a comparative experiment where you have two treatments - a treatment and a control, for instance - you need to include in your experimental process the assignment of those treatments to the experimental units by some random process. A deliberate process is needed to eliminate potential biases from the conclusions, and random assignment is a critical step.

Replication

Replication is in some sense the heart of all of statistics. To make this point, remember what the standard error of the mean is: the square root of the estimate of the variance of the sample mean, i.e., \(\sqrt{\dfrac{s^2}{n}}\). The width of the confidence interval is determined by this statistic. Our estimates of the mean become less variable as the sample size increases.

Replication is the basic issue behind every method we will use in order to get a handle on how precise our estimates are at the end. We always want to estimate or control the uncertainty in our results. We achieve this estimate through replication. Another way we can achieve short confidence intervals is by reducing the error variance itself. However, when that isn't possible, we can reduce the error in our estimate of the mean by increasing n .
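
A short illustration (added here, with a hypothetical error variance) of how the width of an approximate 95% confidence interval shrinks as \(n\) grows: quadrupling \(n\) halves the width.

```python
import math

# Approximate width of a 95% confidence interval for the mean,
# using the normal critical value 1.96: width = 2 * 1.96 * sqrt(s^2 / n).
s2 = 4.0  # hypothetical error variance
for n in (5, 20, 80, 320):
    width = 2 * 1.96 * math.sqrt(s2 / n)
    print(n, round(width, 3))
```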

Another way to reduce the size or length of the confidence interval is to reduce the error variance itself - which brings us to blocking.

Blocking

Blocking is a technique to include other factors in our experiment which contribute to undesirable variation. Much of the focus in this class will be to creatively use various blocking techniques to control sources of variation that will reduce error variance. For example, in human studies, the gender of the subjects is often an important factor. Age is another factor affecting the response. Age and gender are often considered nuisance factors which contribute to variability and make it difficult to assess systematic effects of a treatment. By using these as blocking factors, you can avoid biases that might occur due to differences between the allocation of subjects to the treatments, and account for some of the noise in the experiment. We want the unknown error variance at the end of the experiment to be as small as possible. Our goal is usually to find out something about a treatment factor (or a factor of primary interest), but in addition to this, we want to include any blocking factors that will explain variation.

Multi-factor Designs

We will spend at least half of this course talking about multi-factor experimental designs: \(2^k\) designs, \(3^k\) designs, response surface designs, etc. The approach behind all of these multi-factor designs runs contrary to the classical scientific method, where everything is held constant except one factor, which is varied. The one-factor-at-a-time method is a very inefficient way of making scientific advances. It is much better to design an experiment that simultaneously includes combinations of multiple factors that may affect the outcome. Then you learn not only about the primary factors of interest but also about these other factors. These may be blocking factors which deal with nuisance parameters, or they may help you understand the interactions or relationships between the factors that influence the response.

Confounding

Confounding is something that is usually considered bad! Here is an example. Let's say we are doing a medical study with drugs A and B. We put 10 subjects on drug A and 10 on drug B. If we categorize our subjects by gender, how should we allocate our drugs to our subjects? Let's make it easy and say that there are 10 male and 10 female subjects. A balanced way of doing this study would be to put five males on drug A and five males on drug B, five females on drug A and five females on drug B. This is a perfectly balanced experiment such that if there is a difference between male and female at least it will equally influence the results from drug A and the results from drug B.

An alternative scenario might occur if patients were randomly assigned treatments as they came in the door. At the end of the study, they might realize that drug A had only been given to the male subjects and drug B was only given to the female subjects. We would call this design totally confounded. This refers to the fact that if you analyze the difference between the average response of the subjects on A and the average response of the subjects on B, this is exactly the same as the average response on males and the average response on females. You would not have any reliable conclusion from this study at all. The difference between the two drugs A and B, might just as well be due to the gender of the subjects since the two factors are totally confounded.
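
The two allocations can be written down explicitly; this sketch (added for illustration, with hypothetical labels) counts how gender is distributed across drug groups in each design:

```python
# Hypothetical allocation of 10 males (M) and 10 females (F) to drugs A and B.

# Balanced: each gender appears equally in both drug groups, so a gender
# effect influences the A and B averages equally.
balanced = {"A": ["M"] * 5 + ["F"] * 5, "B": ["M"] * 5 + ["F"] * 5}

# Totally confounded: the A-vs-B comparison is exactly the M-vs-F comparison.
confounded = {"A": ["M"] * 10, "B": ["F"] * 10}

for name, design in (("balanced", balanced), ("confounded", confounded)):
    counts = {drug: {sex: design[drug].count(sex) for sex in ("M", "F")}
              for drug in design}
    print(name, counts)
```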

Confounding is something we typically want to avoid, but when we are building complex experiments we can sometimes use confounding to our advantage. We will confound things we are not interested in, in order to have more efficient experiments for the things we are interested in. This will come up in multiple-factor experiments later on. We may be interested in main effects but not interactions, so we will confound the interactions in order to reduce the sample size, and thus the cost of the experiment, but still have good information on the main effects.





