safe.” Where potential hazards are detected in these early tier tests, additional information is
required. In these cases, higher-tier tests can serve to confirm whether an effect might still
be detected at more realistic rates and routes of exposure. Higher-tier studies, including
semifield or field-based tests, offer greater environmental realism, but they may have
lower statistical power. Lower statistical power means that there is a greater likelihood
that real effects will go undetected (a false negative). One reason for lower power is the high
variability of environmental conditions (e.g., climate) that might counteract GM trait-
specific effects. Nevertheless, these higher-tier tests are triggered only when early tier
studies in the laboratory indicate potential hazards at environmentally relevant levels of
exposure. In exceptional cases, higher-tier studies may be conducted at the initial stage
when early tier tests are not possible or meaningful. For example, plant tissue might be
used because purified protein is not available for lower-tier work. Higher levels of replica-
tion or repetition may be needed to enhance statistical power in certain circumstances. In
cases where a potential hazard is detected in a lower-tier test, the tiered approach provides
the flexibility to undertake further lower-tier tests in the laboratory to increase the taxo-
nomic breadth (e.g., testing more insect species) or local relevance of test species, thus
avoiding the costs and uncertainties of higher-tier testing. Depending on the nature of
the effect, one may also progress to higher-tier testing anyway, particularly in cases
where there is no previous experience with the crop or protein under investigation. The
various tiered approaches that have been described for nontarget risk assessment differ
in their specific definitions of individual tiers, but they all follow the same underlying prin-
ciples. Higher-tier tests usually involve semifield or field tests and sometimes are conducted
when lifecycle (especially reproduction parameters) or tritrophic evaluations are warranted.
In general, these tests are problematic because of their complexity and high intrinsic
uncertainty. Higher-tier tests require expertise and care in experimental design, execution,
and data analysis. As a consequence, they are prone to low statistical power, particularly
if they are used for “proof of hazard.” These tests should therefore be conducted
only when they can further reduce uncertainty in the risk assessment, and only when
justified by detection of unacceptable risk at the lower tiers of testing. For further reading,
see the paper by Romeis et al. (2008).

Statistical power has been mentioned several times, and
this concept requires clarification. Multiple samples and replicates of experiments are
needed for high statistical power, which we can define here as the ability to detect real
differences that might exist. Biological systems are highly variable, and statistical tests
help researchers test hypotheses, for example, whether the differences observed are due to
chance variation or result from expression of a transgene. Lower-tier experiments that
can be tightly controlled have a greater capacity to detect real differences than higher-tier
experiments, in which field effects add further variation. The ground rule is that the more
lifelike the experiment, the larger and more expensive it must be to distinguish natural
variability from variability caused by the transgene.
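To make the idea of statistical power concrete, the following is a minimal sketch in Python. It uses a normal approximation for a two-sided comparison of two group means, which is a simplification; a real analysis would typically use a t-test or a model suited to the data. The function name approx_power and all numbers (effect size, standard deviations, replicate counts) are hypothetical and chosen only for illustration; they are not taken from any study.

```python
from statistics import NormalDist


def approx_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect       -- true difference in means (e.g., treatment minus control)
    sd           -- standard deviation within each group (assumed equal)
    n_per_group  -- number of replicates in each group
    alpha        -- significance level of the test
    """
    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)        # critical value for the test
    se = sd * (2 / n_per_group) ** 0.5       # standard error of the difference
    shift = effect / se                      # true effect in standard-error units
    # Probability that the test statistic falls in either rejection region
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical scenario: the same true effect, measured under different noise.
lab = approx_power(effect=1.0, sd=1.0, n_per_group=10)    # controlled laboratory test
field = approx_power(effect=1.0, sd=3.0, n_per_group=10)  # field test, 3x the variability
field_more_reps = approx_power(effect=1.0, sd=3.0, n_per_group=90)  # 9x the replication

print(f"laboratory, 10 replicates: {lab:.2f}")
print(f"field, 10 replicates:      {field:.2f}")
print(f"field, 90 replicates:      {field_more_reps:.2f}")
```

Under these assumed values, tripling the variability sharply reduces the chance of detecting the same real effect, and roughly nine times as many replicates are needed to recover the original power, because the standard error scales with the standard deviation divided by the square root of the number of replicates.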
13.3 An Example Risk Assessment: The Case of Bt Maize
Let us examine the scenario that has garnered the most attention in the risk assessment
world: Bt maize pollen exposure. During flowering, maize pollen might land on leaves
of host plants (plants that serve as hosts or food for insects) growing in and around maize
fields, and these plants might be consumed by caterpillars. Fields and field margins are
important habitats for some butterfly species. As a consequence of the intensification of agricultural
314 FIELD TESTING OF TRANSGENIC PLANTS