A Summary of Research on the Effects of Test Accommodations: 2002 Through 2004
Technical Report 45
Christopher J. Johnstone • Jason Altman • Martha L. Thurlow • Sandra J. Thompson*
September 2006
All rights reserved. Any or all
portions of this document may be reproduced and distributed
without prior permission, provided the source is cited as:
Johnstone, C. J., Altman, J.,
Thurlow, M. L., & Thompson, S. J. (2006). A summary of
research on the effects of test
accommodations: 2002 through 2004 (Technical Report
45). Minneapolis, MN: University of Minnesota, National Center
on Educational Outcomes.
Retrieved [today's date], from the
World Wide Web: http://education.umn.edu/NCEO/OnlinePubs/Tech45/
* Dr. Thompson was a Research Associate at
NCEO when working on this report. She passed away December 2005
after a career of improving outcomes for students with
disabilities.
Table of Contents
Executive Summary
Introduction
Methods
Results
Purpose of Accommodations Research
Types of Assessment
Content Areas Assessed
Type of Accommodation
Research Participants
Research Results
Limitations
Recommendations for Future Research
Discussion and Implications for Future Research
References
Appendix A—Summary of Research Purpose
Appendix B—Summary of Type of Assessment
Appendix C—Subject Area Studied (by
Author)
Appendix D—Type of Accommodation Studied
(by Author)
Appendix E—Summary of Participants
Appendix F—Summary of Research Results
Appendix G—Summary of Limitations Cited
by Researchers
Appendix H—Summary of Suggestions for
Future Research (as recommended by
authors)
Executive Summary
The No Child Left Behind
Act of 2001 (NCLB) requires the
reporting of participation in
assessments overall and by subgroup,
including students with disabilities. As
states and school districts strive to
meet the goals for adequate yearly
progress required by NCLB, the use of
individual accommodations continues to
be scrutinized for effectiveness,
threats to test validity, and score
comparability. This report summarizes 49
empirical research studies completed on
test accommodations between 2002 and
2004, and provides direction in the
design of critically needed future
research on accommodations.
NCEO found that studies
during this three-year period had the
following characteristics:
Purpose. The primary
purpose of the 2002-2004 accommodations
research was to determine the effects of
accommodations use on the large-scale
test scores of students with
disabilities.
Types of assessment,
content areas, and accommodations.
The majority of the studies tested
students using norm-referenced or
criterion-referenced tests, on math or
reading/language arts.
Participants. Roughly equal
numbers of research studies involved
1-99 participants, 100-999
participants, and more than 1,000
participants, drawn from multiple age categories.
Participants were varying percentages of
students without disabilities and
students with disabilities. Students
with learning disabilities were studied
most frequently among students who
receive special education services.
Findings. Findings
shared no common theme, with various
accommodations shown to have both a
positive and non-positive effect on
scores. Individual accommodations showed
either differential item functioning or
no differential item functioning
depending on the study. The lack of
consistent findings points to a need for
further research.
Limitations. Most
often, authors noted that studies were
too narrow in scope, involved a small
sample size, or provided confounding
factors. These limitations and other
considerations led researchers to
recommend investigating the
characteristics of accommodations in
further detail.
Important overall
observations from the NCEO analysis
include a need in future research for a
clear definition of the constructs
tested, a reduction in confounding
factors, increased study of
institutional factors affecting
accommodations judgment, and exploration
of the desirability and perceived
usefulness of accommodations by students
themselves. Future research should focus
on improvement in these areas but also
on the positive effects of field-testing
potential items in accommodated formats
in addition to standard formats.
Introduction
Over the past decade,
students with disabilities have
increasingly participated and performed
at proficient levels on general
education assessments. Participation and
proficient performance of all students
are required by the No Child Left Behind
Act of 2001. As participation rates
increase, so does the use of testing
accommodations. Increased use of
accommodations should reflect an attempt
to ensure that the scores received by
students with disabilities are valid
measures of achievement. It is also
possible that increased use of
accommodations is simply a reflection of
concern about including students in
assessments and a belief that these
students need additional aids to help
them perform better. Because of this,
states are clarifying appropriate
accommodation use in state policy, with
the goal of encouraging Individualized
Education Program (IEP) teams to select
accommodations that remove specific
disability barriers, but do not give
students with disabilities an unfair
advantage over their peers. States have
begun to monitor accommodations use, and
this will help them to better track the
effects of accommodations. In 2005, 20
states maintained a database of
accommodations actually used during
testing within the state, and 26 states
documented the specific accommodations
used by students on test day (Thompson,
Johnstone, Thurlow, & Altman, 2005).
State policymakers and
practitioners define accommodations in a
variety of ways. For the purposes of
this report, we draw from accommodations
research to help shape our definitions
and outlook on testing accommodations
for students with disabilities. For
example, Thurlow and Bolt (2001) defined
testing accommodations as:
changes in
assessment materials or
procedures that address aspects
of students’ disabilities that
may interfere with the
demonstration of their knowledge
and skills on standardized
tests. Accommodations attempt to
eliminate barriers to meaningful
testing, thereby allowing for
the participation of students
with disabilities in state and
district assessments. (p. 3)
Sireci, Li, and Scarpati
(2005) explained the validity of
accommodations through an "interaction
hypothesis" or the theoretical
assumption that test accommodations will
lead to improved test scores for
students who need accommodations, but
not for students who do not need
accommodations (i.e., students with
disabilities receive a boost in scores
as a result of accommodations whereas
students without disabilities do not
receive a boost or receive a less
pronounced boost in scores).
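The interaction hypothesis can be made concrete with a small calculation. The sketch below is illustrative only: the scores are hypothetical values invented for this example (they do not come from any study in this review), and the function name differential_boost is an assumption introduced here. It compares the mean score gain under accommodation for students with disabilities against the gain for students without disabilities.

```python
from statistics import mean


def differential_boost(swd_standard, swd_accommodated,
                       non_swd_standard, non_swd_accommodated):
    """Return (boost for students with disabilities,
    boost for students without disabilities,
    difference between the two boosts)."""
    swd_boost = mean(swd_accommodated) - mean(swd_standard)
    non_swd_boost = mean(non_swd_accommodated) - mean(non_swd_standard)
    return swd_boost, non_swd_boost, swd_boost - non_swd_boost


# Hypothetical test scores, invented for illustration only.
swd_standard = [40, 45, 50, 55]
swd_accommodated = [50, 55, 60, 65]
non_swd_standard = [60, 65, 70, 75]
non_swd_accommodated = [61, 66, 71, 76]

swd_boost, non_swd_boost, interaction = differential_boost(
    swd_standard, swd_accommodated, non_swd_standard, non_swd_accommodated
)

# The pattern matches the interaction hypothesis when students with
# disabilities gain, and gain more than students without disabilities.
consistent_with_hypothesis = swd_boost > 0 and swd_boost > non_swd_boost
print(swd_boost, non_swd_boost, interaction, consistent_with_hypothesis)
```

A comparison of raw means like this only describes the pattern. An actual study would also need a significance test of the group-by-condition interaction and a design that controls for confounding factors.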
The accommodations
allowed in state assessment policies
vary from state to state. The most
common accommodations found in previous
syntheses of research were read aloud
accommodations (sometimes referred to as
oral administration), computer
administration of tests, extended time
and tests across multiple days,
calculator use, and use of a scribe
(Thompson, Blount, & Thurlow, 2002).
Most students use a combination of
accommodations (Bielinski, Ysseldyke,
Bolt, Friedebach, & Friedebach, 2001),
as dictated by their IEPs.
Research on
accommodations has yielded mixed results
in terms of validity and efficacy of
providing them to students with
disabilities. Over the past several
years, accommodations research has yet
to produce definitive answers for
policymakers and practitioners (Sireci,
Li, & Scarpati, 2003). Nevertheless,
research reports on accommodations
continue to be found in professional
journals, indicating that there is still
a need to investigate the complex issues
surrounding accommodations. This report
examines empirical research published
between 2002 and 2004. We searched
peer-reviewed articles, technical
reports, and dissertations in order to
provide the readers with up-to-date
information on accommodations.
In this report, research
is summarized according to several
components, including research purpose,
type of assessment, content area
assessed, type of accommodation, number
of participants, percent of sample
consisting of students with
disabilities, participant grade level,
type of disability, research results,
research limitations, and
recommendations for further research.
During our review process, we found 49
studies on accommodations published
from 2002 through 2004. This number reflects
a high number of studies conducted in
the new millennium, and is slightly more
than the number of studies published
between 1999 and 2001 (Thompson, Blount,
& Thurlow, 2002) (see Table 1). The
publications reviewed for this report
are found in Appendix A.
Table 1. Number of Accommodations Studies by Years
| Years | Number of Studies |
| 1990 through 1992 | 11 |
| 1993 through 1995 | 18 |
| 1996 through 1998 | 29 |
| 1999 through 2001 | 46 |
| 2002 through 2004 | 49 |
Methods
NCEO used a four-stage
process to find publications related to
accommodations from 2002 through 2004.
First, we conducted a search of
electronic databases including ERIC,
PsycINFO, Educational Abstracts, and
Digital Dissertations using the keywords
"accommodation," "test adaptation,"
"test changes," "test modifications,"
"test accommodations," "state testing
accommodations," "standards-based
testing accommodations," and
"large-scale testing accommodations."
A second electronic
search consisted of organizational Web
sites, including Behavior Research and
Training (http://brt.uoregon.edu/), the
National Center for Research on
Evaluation, Standards, and Student
Testing (http://www.cse.ucla.edu/), the
Center for the Study of Assessment
Validity and Evaluation (http://www.c-save.umd.edu/index.html),
and the Wisconsin Center for Educational
Research (http://www.wcer.wisc.edu/tesacc/).
In addition, an archival search of
Education Policy Analysis Archives was
undertaken.
In addition to the
electronic searches, NCEO staff
performed two hand searches. First,
references from all selected materials
dated 2002 through 2004 were examined in
an effort to find further source
material. Second, 2002 through 2004
issues of major measurement and special
education journals were hand searched in
the University of Minnesota library.
These journals included Applied
Measurement in Education, British
Journal of Special Education,
Diagnostique, Educational Assessment,
Educational and Psychological
Measurement, Educational Measurement:
Issues and Practice, Educational
Psychologist, Educational Psychology,
Exceptional Children, Journal of
Educational Measurement, Journal of
Learning Disabilities, Journal of School
Psychology, Journal of Special
Education, and Remedial and Special
Education. Last, the schedules of
the annual conferences of major
organizations (such as the American
Educational Research Association, the
Council of Chief State School Officers,
Council for Exceptional Children, and
the National Council on Measurement in
Education) were scanned for
presentations on accommodations.
NCEO research staff
searched all identified sources over an
18-month period beginning in fall, 2004
and concluding in spring, 2005. All of
the studies cited were either empirical
research or meta-analyses, each with
succinct research findings that added to
the field’s knowledge about the effects
of accommodations. The References
section of this report includes all
journal article, research report,
conference presentation, and
dissertation references.
Results
Purpose of Accommodations Research
Two primary purposes
appeared most often in accommodations
research from 2002 through 2004 (see
Table 2). First, the majority of studies
examined the effect of the use of
accommodations on scores; 23 studies
sought to determine the effect of the
use of accommodations on test scores
with students with disabilities, and 13
studies investigated the effects of
accommodations on test score validity.
Second, a set of studies
examined institutional factors
surrounding accommodations
(such as teacher knowledge, effects of
policy, and IEP team decision making);
nine publications fit this purpose. In
addition to these, two publications
examined patterns of errors across items
or tests and two meta-analyses
synthesized accommodations studies.
Details of the studies according to
purpose are provided in Appendix A.
Table 2. Research Purposes
| Research Purpose | Number of Studies |
| Determine the effect of the use of accommodations on test scores of students with disabilities | 23 |
| Investigate the effects of accommodations on test score validity | 13 |
| Study institutional factors, teacher judgment, or student desirability of accommodation use | 9 |
| Examine patterns of errors across items or tests | 2 |
| Meta-analysis | 2 |
Authors employed a
variety of methods in their research
(see Table 3). The most common methods
were experimental and quasi-experimental
research, in which research participants
took tests under different conditions,
and reviews of extant data, in which
researchers reviewed data from
assessments that students took with or
without accommodations that were not
specifically designed for experimental
and control conditions. Twenty-one
publications used experimental design in
their methodology and 17 studies
reviewed existing data from state
large-scale assessments or local
assessments. A variety of descriptive
and comparative statistics were employed
to examine extant data. In addition to
these methods, seven studies
used survey or interview methods to
better understand stakeholders'
understanding of, and opinions on,
accommodations. Two studies were
meta-analyses, one study evaluated a
product and one study described
interventions for IEP teams.
Table 3. Research Methods
| Method | Number of Studies |
| Experimental or Quasi-experimental | 21 |
| Review of extant data | 17 |
| Survey/Interview | 7 |
| Meta-analysis | 2 |
| IEP intervention | 1 |
| Product evaluation | 1 |
Note: Studies are described by their primary methodology.
The experimental and
extant data analysis methods combined
had two main goals: understanding the
effect of accommodations on test scores
and understanding the effects of
accommodations on the psychometric
qualities of items. Among the 23 studies
that examined the effects of
accommodations on test scores (see Table
2), researchers found that computerized
administration (Pomplun, Frey, & Becker,
2002), read-aloud accommodation (Helwig,
Rozek-Tedesco, & Tindal, 2002; Meloy,
Deville, & Frisbie, 2002), video
administration (Burch, 2002; Tindal,
2002), extended time (Bridgeman, Cline,
& Hessinger, 2004), and assistive
technology (Landau, Russell, Gourgey,
Erin, & Cowan, 2003; MacArthur &
Cavalier, 2004) all had a positive
effect on the test scores of at least
some of the students with disabilities
included in the research samples.
Conversely, Burch (2002) and Barton
(2002) found that some students with
learning disabilities did not benefit
from computer or video accommodations.
Likewise, Scheuneman, Camara, Cascallar,
Wendler, and Lawrence (2002) found that
calculator usage did not have an effect
on student scores. According to Elliott
and Marquart (2003, 2004), extended time
also did not yield improved scores for
students with disabilities.
In terms of
psychometric properties,
researchers obtained
mixed results on score
comparability for items. Barton (2002),
Barton and Huynh (2003), Calahan,
Mandinach, and Camara (2002), Huynh,
Meyer, and Gallant-Taylor (2002), and
Kobrin and Young (2003) found no change
in item comparability when various
accommodations (including read aloud,
extended time, and computerized
administration) were employed. Bolt and
Bielinski (2002), Choi and Tinker
(2002), and Thornton, Reese, Pashley,
and Dalessandro (2002), however, all
found that tests administered orally,
with extended time, or via computer
changed item difficulty or constructs.
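Item comparability questions of this kind are often examined with differential item functioning (DIF) statistics. The sketch below shows one widely used approach, the Mantel-Haenszel common odds ratio, computed over groups of test takers matched on total score. It is an illustration under stated assumptions, not a description of the method used in any study cited above: the counts are hypothetical, the function names are introduced here, and treating the standard administration as the reference group and the accommodated administration as the focal group is an assumption made for this example.

```python
import math


def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across matched score levels.

    Each stratum is a tuple of counts for one item:
    (reference correct, reference incorrect, focal correct, focal incorrect).
    """
    numerator = 0.0
    denominator = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in strata:
        n = ref_right + ref_wrong + focal_right + focal_wrong
        if n == 0:
            continue  # skip empty score levels
        numerator += ref_right * focal_wrong / n
        denominator += ref_wrong * focal_right / n
    return numerator / denominator


def mh_delta(odds_ratio):
    """Convert the odds ratio to the delta scale (-2.35 times its natural log).

    Values near zero suggest little DIF; negative values mean the item
    was harder for the focal group than for matched reference test takers.
    """
    return -2.35 * math.log(odds_ratio)


# Hypothetical counts at two total-score levels, invented for illustration.
comparable_item = [(30, 10, 15, 5), (20, 20, 10, 10)]
flagged_item = [(30, 10, 10, 10), (20, 20, 5, 15)]

comparable_ratio = mantel_haenszel_odds_ratio(comparable_item)
flagged_ratio = mantel_haenszel_odds_ratio(flagged_item)
flagged_delta = mh_delta(flagged_ratio)
print(comparable_ratio, flagged_ratio, flagged_delta)
```

An odds ratio near 1 means matched test takers in both groups had similar odds of answering the item correctly. A ratio well away from 1 would prompt closer review of the item, usually alongside a significance test.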
Finally, findings
related to the institutional issues
around accommodations were also mixed.
Differing foci for studies yielded
different results. Six studies of
teachers, administrators, and students
yielded contrasting results on the
relative knowledge of school personnel
about accommodations. Cisar (2004) found
that special education teachers were
more knowledgeable, while Gagnon and
McLaughlin (2004) found that teachers
and administrators scored similarly on
knowledge measures. Woods (2004)
discovered that students do not often
predict their need for accommodations
well.
Types of Assessment
Researchers who studied
the effects of accommodations used two
main types of assessment to determine
effects and error in the use of
accommodations in 2002-2004. This
information is shown in Table 4. One
common approach to testing the effects
of accommodations was for researchers to
use norm-referenced tests. Education
professionals typically use
norm-referenced tests for national
comparison, diagnostic decisions in
schools, and for college entrance
decisions. Researchers examined the
effects of accommodations on the
following norm-referenced tests:
California Achievement Tests (CAT),
Graduate Record Exam (GRE), Law School
Admission Test (LSAT), Nelson-Denny
Reading Test, Scholastic Aptitude Test
(SAT), Terra Nova, and the General
Certificate of Secondary Education
Examination-United Kingdom (GCSE-UK).
In addition, researchers
employed a number of statewide
criterion-referenced tests (including
tests from Maryland, Missouri, Oregon,
and South Carolina). One researcher
gathered descriptive data from a test
that was still in the prototype stage.
Appendix B provides details of the
assessments used in the research during
the years 2002-2004.
Table 4. Type of Assessment
| Type of Assessment | Number of Studies |
| Norm-referenced and Other Standardized Tests | 18 |
| State Criterion-referenced Tests or Performance Assessments | 18 |
| School or District-designed Tests | 0 |
| Other | 8 |
| Survey | 3 |
| N/A, Meta-Analyses | 2 |
Content Areas Assessed
Authors of studies
conducted research in five major
academic content areas: reading (n=23),
mathematics (n=21), science (n=3),
writing (n=3), and social studies (n=1)
(see Table 5). There were also seven
studies that examined test
accommodations, but not in a specific
content area. The "no specific content
area" studies included examinations of
test accommodations on general academic
assessments, but did not include
surveys, or meta-analyses (n=5).
Appendix C gives additional details on
the content area examined in research
studies (11 studies included two or more
content areas).
Table 5. Content Areas Assessed*
| Content Areas Assessed | Number of Studies |
| Mathematics | 21 |
| Reading/Language Arts | 23 |
| Science | 3 |
| Writing | 3 |
| Social Studies | 1 |
| No Specific Content Area | 7 |
*Studies may have reported on multiple content areas.
Type of Accommodation
We found 15 types of
accommodations in the research
literature from 2002 through 2004 (see
Table 6). Four groups of accommodations
emerged: presentation (n=21),
timing/scheduling (n=8), response (n=2),
and technological aid (n=2) (see also
Appendix D). In addition, 11 of the
studies investigated the effects of
multiple accommodations.
Presentation
accommodations were investigated most
frequently in the research from
2002-2004. Among publications about
presentation accommodations, studies
about oral administration of tests (read
aloud accommodations) were most common
(n=11). The use of computers as a
testing accommodation was also common
(n=5). In addition, researchers
investigated video administration of
tests (n=2), large print accommodations
(n=1), dictionary use (n=1), and braille
formats of tests (n=1). Additional
studies examined multiple types of
accommodations.
Of the eight studies
that explored the timing or scheduling
of tests, seven studies investigated
extended time and one study examined the
outcomes of testing over multiple days.
Of the two studies about response
formats, one examined student-dictated
responses and the other
examined the use of calculators. Two
studies investigated technological aids.
Eleven studies examined
the effects of multiple accommodations.
Studies of multiple accommodations
included combinations of accommodations
such as read aloud, video presentation,
extra time, large print, individual
settings or small group settings.
Table 6. Type of Accommodation
| Type of Accommodation | | Number of Studies |
| Presentation (21): | Oral Administration | 11 |
| | Computer Administration | 5 |
| | Video | 2 |
| | Large Print | 1 |
| | Dictionary Use | 1 |
| | Braille | 1 |
| Timing/Scheduling (8): | Extended Time | 7 |
| | Multiple Day | 1 |
| Response (2): | Dictated Response | 1 |
| | Calculator | 1 |
| Technological Aid (2) | | 2 |
| Multiple Accommodations (11) | | 11 |
| N / A (Survey or Meta-Analysis) (5) | | 5 |
Research Participants
Studies varied in the
number of participants included in the
sample. Table 7 shows the number of
research participants in studies,
grouped into intervals.
Eight high school and
college students participated in the
smallest study (Landau et al., 2003),
and Hall (2002) used test data from
192,000 students in the largest study.
Full details of research participant
numbers are provided in Appendix E.
Overall, approximately one-third of the
studies had 99 or fewer participants
(n=18); approximately one-third had
100 to 999 participants (n=13); and
approximately one-third had
more than 1,000 research participants
(n=16).
Table 7. Number of Participants in Studies
| Number of Participants | Number of Studies |
| 1-99 | 18 |
| 100-199 | 2 |
| 200-299 | 3 |
| 300-499 | 4 |
| 500-999 | 4 |
| More than 1000 | 16 |
| Not Applicable | 2 |
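The three participant bands described in the text can be recovered from the Table 7 rows with a short tally. The sketch below uses only the counts reported in Table 7. The variable names and the choice to compute each share against all 49 studies (including the two "Not Applicable" studies) are assumptions made for illustration.

```python
# Counts copied from Table 7 (number of studies per participant interval).
table_7 = {
    "1-99": 18,
    "100-199": 2,
    "200-299": 3,
    "300-499": 4,
    "500-999": 4,
    "More than 1000": 16,
    "Not Applicable": 2,
}

# Collapse the Table 7 rows into the three bands used in the narrative.
bands = {
    "1-99": ["1-99"],
    "100-999": ["100-199", "200-299", "300-499", "500-999"],
    "More than 1000": ["More than 1000"],
}

band_counts = {
    band: sum(table_7[row] for row in rows) for band, rows in bands.items()
}
total_studies = sum(table_7.values())

# Share of all studies in each band, as a whole-number percentage.
band_shares = {
    band: round(100 * count / total_studies)
    for band, count in band_counts.items()
}

print(band_counts)
print(total_studies)
print(band_shares)
```

The tally reproduces the n=18, n=13, and n=16 figures given in the text, and the rows together account for all 49 studies.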
The percentage of
research participants with disabilities
differed across studies (see Table 8).
In 18 studies, students with
disabilities made up a majority of the
sample (participants in eight studies
were 50-74 percent students with
disabilities and the samples in ten
studies were 75-100 percent students
with disabilities). In 15 studies,
students with disabilities comprised
less than half of the sample (there were
less than 25 percent students with
disabilities in 10 studies and between
25 and 49 percent students with
disabilities in the sample of five
studies). Nine studies did not report
the percentage of the sample with
disabilities and seven studies
(including two meta-analyses) did not
use research methods that involved
students.
Table 8. Percent of Sample Consisting of Students with Disabilities
| Percent of Sample Consisting of Students with Disabilities | Number of Studies |
| 1-24% | 10 |
| 25-49% | 5 |
| 50-74% | 8 |
| 75-100% | 10 |
| Not Reported | 9 |
| No Students with Disabilities Participated in Study | 0 |
| Not Applicable | 7 |
The grade level of
research participants also varied (see
Table 9). For example, six studies
targeted students who were in elementary
school (grades K-5), six studies
examined accommodations with middle
school students (grades 6-8), and 11
studies examined accommodations with
high school students (grades 9-12).
Postsecondary students participated in
six studies and 15 studies investigated
students across grade levels (from
grades K to 12). Five studies employed
surveys or were meta-analyses that did
not involve actual research
participants.
Table 9. Grade Level of Participants in Studies
| Participant Grade Level | Number of Studies |
| Elementary (K-5) | 6 |
| Middle School (6-8) | 6 |
| High School (9-12) | 11 |
| Multiple Grade Level Categories (K-Postsecondary) | 15 |
| Post Secondary | 6 |
| Not Applicable | 5 |
In addition to
differences in grade level, students
with a variety of disability labels
participated in studies (see Table 10).
Several studies (n=16) included more
than one disability category. Fifteen
studies included students with learning
disabilities, 12 studies included
students with communication
disabilities, 10 studies included
students with cognitive disabilities,
and 10 studies included students with
emotional/behavioral disabilities.
Students with less common
disabilities, including physical
impairment, sensory disabilities,
autism, attention deficit disorder,
health impairments, and multiple
disabilities, were each included in at
least one study. Twenty-one studies did
not report the types of disabilities of
participants.
Table 10. Disability Categories Included in Studies
| Type of Disability | Number of Studies |
| Learning Disability | 15 |
| Cognitive Disability (e.g., mental retardation) | 10 |
| Emotional/Behavioral Disability | 10 |
| Communication Disability | 12 |
| Reading or Math Deficit | 4 |
| Other (includes physical and sensory disabilities, autism, attention deficit disorder, health impairments, and multiple disabilities) | 16 |
| Not Reported | 21 |
| Not Applicable | 5 |
Note: Studies sometimes include students with more than one disability category; all are reflected in this table.
Research Results
Results from the 49
studies reviewed in this synthesis
varied. Researchers found that accommodations
showed both statistically significant positive effects and
statistically non-significant effects on
scores. Likewise, some accommodations
had no effect on item comparability,
while other types of accommodations
compromised item comparability.
Presented here are
results according to the type of
accommodation used. These results are
shown in Table 11, with detailed results
available in Appendix F. Similar to
previous reviews of accommodations
research (Thompson, Blount, & Thurlow,
2002), accommodations appeared to have
mixed effects in studies from 2002
through 2004. Furthermore, even when
separated by type of accommodation,
studies still demonstrated mixed
effects. The results from the 11 studies
that investigated multiple
accommodations are not synthesized due
to lack of study focus comparability.
Oral Presentation (Read
Aloud). A total of 11 studies on
oral administration of assessments
(often called "read aloud"
accommodations) produced mixed results.
For example, Helwig et al. (2002),
Huynh, Meyer, and Gallant (2004), Janson
(2002), Meloy, Deville, and Frisbie
(2002), Tindal (2002), and Weston (2003)
all found that read aloud accommodations
had a positive effect on scores for
students with disabilities. Bolt and
Bielinski (2002) and McKevitt and
Elliott (2003), however, found read
aloud accommodations had no significant
impact on student scores.
In terms of item
comparability, Barton (2002) and Barton
and Huynh (2003) found that items were
comparable, whether presented under
standard conditions or with read aloud
accommodations. However, Bolt and
Bielinski (2002), Meloy et al. (2002),
and Weston (2003) found that read aloud
accommodations did affect item
comparability.
The determination of who
should use oral presentation
accommodations is also an issue. Woods
(2004) found great inaccuracy in
self-prediction for the need of the read
aloud accommodation. Such disparate
findings point to the on-going
controversies regarding read-aloud
accommodations.
One study examined the
impact of a student-reads-aloud (i.e.,
student reads text but aloud)
accommodation on the performance of
middle and high school students with and
without learning disabilities on a test
of reading comprehension. Elbaum,
Arguelles, Campbell, and Saleh (2004)
discovered that students’ test
performance did not differ in the two
conditions, and students with learning
disabilities did not benefit more from
the accommodation than students without
learning disabilities.
Extended Time.
Authors of eight studies examined how
the use of extended time affected
student achievement levels on tests and
the extent to which items under extended
time or multiple day administrations of
tests compared to those administered
under standard conditions. Several
studies (Bridgeman et al., 2004;
Dempsey, 2004) found that students with
disabilities profit from extended time
accommodations. In these studies,
students with disabilities had higher
test scores because of extended time
accommodations.
Buehler (2002) and
Elliott and Marquart (2004), however,
found no significant effect on scores
when students were provided extended
time. Such disparities in study results
again demonstrate the lack of
consistency in accommodations research.
The comparability of specific items
under different administration
categories further complicates
accommodations issues. Buehler (2002)
found varying student results for items
that were deemed to be comparable under
standard and extended time
administrations, but Elliott et al.
(2004) and Thornton et al. (2002) found
that the items they studied were not
comparable under different
administrations. A study by Crawford,
Helwig, and Tindal (2004) of multi-day
testing produced conflicting results.
Computer Administration.
In total, five studies investigated
the use of computer administered tests
from 2002-2004. Among these studies,
Pomplun, Frey and Becker (2002) found
that students had positive test results
when administered tests via computer
(rather than paper and pencil format).
Barton and Huynh (2003), Bridgeman,
Lennon and Jackenthal (2003), and Kobrin
and Young (2003), however, found that
computer administration of tests had no
statistical effect, or a statistically
negative impact on student scores.
Related studies on item comparability
between computer and paper/pencil
administration yielded similar results,
with Choi and Tinker (2002) determining
that items were changed as a result of
format differences.
Technological Aid.
In the years 2002 through 2004, three
research studies investigated
technology-based testing accommodations,
and all resulted in positive effects on
test scores. Hansen, Lee, and Forer
(2002) found that the usability of
speech output technology was evaluated
positively, and that ‘self-voicing’
testing systems have significant
potential and may be capable of
replacing human readers in certain
testing situations. Landau et al. (2003)
found that the Tactile Text Tablet, a
hybrid between a braille paper-based
test and laptop computer, had positive
effects on student achievement.
MacArthur and Cavalier (2004) found that
the use of speech recognition software
was feasible, created impressive
dictation results, and improved the
quality of student-written essays.
Calculator Use.
There was only one study between the
years of 2002-2004 that considered
calculator use. In this study Scheuneman
et al. (2002) found that calculator use
had no significant effect on scores for
students taking the SAT. It is unknown
why research in calculator usage has
become less common than in past years
(there were four research studies on
calculator accommodations in the years
1999-2001 but only one such
study during the years 2002-2004).
Dictionary Use. One
international study of dictionary
accommodations took place from 2002
through 2004. Idstein’s (2003) study of
Israeli students found that dictionaries
were not an effective accommodation and
that dictionary use interrupted student
thought patterns.
Table 11. Types of Accommodations in Studies
| Type of Accommodation | Research Results | Number of Studies |
| Oral Presentation (11) | Positive effect on scores | 6 |
| | No Differential Item Functioning | 2 |
| | No significant effect on scores | 2 |
| | Differential Item Functioning | 2 |
| | Self-prediction for need unreliable | 1 |
| Extended Time (12) | Positive effect on scores | 3 |
| | No significant effect on scores | 2 |
| | Differential Item Functioning | 2 |
| | No Differential Item Functioning | 1 |
| | Scores on accommodated test predictor of grades | 1 |
| Computer Administration (5) | No significant effect on scores | 3 |
| | Positive effect on scores | 1 |
| | Differential Item Functioning | 1 |
| Technological Aid (3) | Positive effect on scores | 3 |
| Calculator Use (1) | No significant effect on scores | 1 |
| Dictionary Use (1) | Negative effect on Scores | 1 |
| Multiple Accommodations | | 11 |
| N/A, Meta-analyses, Survey, Teacher | | 5 |
Limitations
Educational research has
inherent limitations that require
readers to consider findings carefully.
For example, experimental conditions
rarely mimic the conditions in schools
during "live" testing. In addition,
sample
sizes for studies of students with
disabilities are often small because
students with disabilities are a
minority population in schools (roughly
one in 10 students has a disability).
Thirty-six authors (74
percent) of accommodation studies
published from 2002 through 2004 noted
limitations in their studies. Table 12
summarizes the most commonly reported
limitations, and Appendix G provides
brief annotations of studies that
reported limitations.
Fifteen authors reported
a small or narrow sample size, including
Hall (2002) who noted, "The study
focuses only on fifth grade students,
and the results may not generalize to
students with disabilities in other
grades or dissimilar disabilities,
socio-economic statuses, etc." Thirteen
authors warned that confounding factors
may have influenced results. Common
factors were testing multiple
accommodations at once and an inability
to randomize the sample. Four authors
found a flaw in research design that
affected study results. Two authors
listed conflicting results, and two
listed nonstandard administration across
proctors and schools, as a limitation.
Eleven authors did not mention a
limitation, nor did the two
meta-analyses.
Table 12. Limitations of Research
| Research Limitation | Number of Studies |
| Small Sample Size/Sample Too Narrow in Scope | 15 |
| Confounding Factors | 13 |
| Flaw in Research Design | 4 |
| Conflicting Results | 2 |
| Nonstandard Administration Across Proctors and Schools | 2 |
| No Limitations Mentioned | 11 |
| Not Applicable/Meta-Analyses | 2 |
Recommendations for Future Research
From 2002 through 2004,
34 research studies included
recommendations for further research.
Table 13 presents the categories of
recommendations listed by researchers,
with more detailed explanations
available in Appendix H. Calls for
further investigation demonstrate the
continuing investigatory nature of
accommodations research. Although
scholars have conducted accommodations
research for several decades, there is
still a clear need for more examination
in various areas. In studies conducted
from 2002 through 2004, authors
suggested there was need for further
understanding of the factors of
accommodations that contribute to
possible variation in results, further
understanding of student factors that
contribute to accommodations use and
success, improved study design or
replication of studies, further research
on accommodations policy and overall
hypotheses, replication of studies with
larger samples, investigation
into teacher characteristics that relate
to accommodation selection,
investigation into the possible uses for
accommodations in instruction, and
investigation into accommodation use
practicality.
Table 13. Recommendations for Future
Research
| Recommendations | Number of Studies |
| Investigate characteristics of accommodations themselves in further detail | 12 |
| Investigate student factors contributing to accommodations use | 6 |
| Improved study design or study replication | 5 |
| Study policy and accommodations hypotheses | 4 |
| Replicate study with larger sample | 3 |
| Investigate teacher factors related to accommodations selections | 2 |
| Investigate possible instructional uses for accommodations | 1 |
| Investigate accommodation practicality | 1 |
Discussion and Implications for Future
Research
Several themes arose
from the 49 studies of accommodations
published between 2002 and 2004. One
theme was that accommodations research
is inconclusive. This is similar to past
findings from NCEO summaries of research
(Thompson, Blount, & Thurlow, 2002).
Given that there is not a preponderance
of evidence concerning accommodations,
nearly two-thirds of the authors (n=31)
suggested that future research is needed
to solidify understanding of
accommodation effects.
Researchers published
three more accommodations studies from
2002 through 2004 than from 1999 through
2001. As in previous years, the
majority of studies in the most recent
period focused on the effects of
accommodations on the test scores of
students with disabilities. A
significant number of studies also
investigated the effects of
accommodations on test score validity.
Researchers were particularly concerned
about accommodations changing the
construct of the items assessed. As in
previous reviews, a smaller number of
studies from 2002 through 2004
concentrated on institutional factors
related to accommodation use (e.g.,
teacher judgment, student selection,
policy) or on patterns of errors across
items, or were meta-analyses of previous
work.
The assessments that
researchers selected for examination
were primarily standardized,
norm-referenced tests and performance
assessments. As would be expected with
assessment requirements of the No Child
Left Behind Act of 2001, most
accommodations research was conducted in
the areas of reading/language arts and
mathematics. Across subject areas,
researchers studied oral administration
and timing accommodations most often.
Sample sizes varied,
with approximately equal representation
of small (n=99 or fewer subjects),
medium (n=100-999 subjects), and large
(n≥1,000 subjects) studies. Studies with
small sample sizes typically targeted
students with disabilities in K-12
schools, while large studies typically
used tests administered to large numbers
of students (such as college entrance
examinations). Despite the even
distribution of sample sizes, there was
less uniformity in terms of the grade
level of participants. Accommodations
studies
were spread across grade and educational
level, but were conducted primarily with
secondary and post-secondary education
students.
Finally, in terms of
demographics, researchers most often
studied students with learning
disabilities from 2002 through 2004.
Eight studies explicitly targeted
students with learning disabilities.
This pattern mirrors the pattern in
accommodations research from 1999-2001
(Thompson, Blount, & Thurlow, 2002),
most likely because students with
learning
disabilities are the largest group of
students with disabilities and because
this population is frequently assigned
accommodations such as oral
administration and extended time. The
practical value of studying these
accommodations with students with
learning disabilities is obvious, and
was evident in research from 2002
through 2004.
Although this report is
not meant to provide a scientific
meta-analysis of accommodations research
(for meta-analyses see Sireci et al.,
2005 and Tindal and Ketterlin-Geller,
2004), general patterns that emerged
from a review of accommodations research
in the years 2002, 2003, and 2004
indicate possible considerations for
future research.
The majority of research
concentrated on the effect of
accommodations use for students with
disabilities and the effects on score
validity due to accommodations use.
Although there were 36 studies combined
that investigated scoring and validity,
there was little consensus among
researchers. Findings continue to be
contradictory: some studies indicated
that accommodations were beneficial for
students with disabilities, while others
found no benefit. Likewise, researchers
did
not reach consensus on whether
accommodations change the construct of
the item assessed. Because findings are
relatively disparate, there does appear
to be a need for further research.
Research that continues
to delineate the "interaction
hypothesis" (Sireci et al., 2005) and
that reduces construct-irrelevant
variance for students with disabilities
without introducing any new effects for
non-disabled students still appears to
be necessary. In 2002-2004, 21 studies
employed experimental or
quasi-experimental methods. Replications
of scientific methods to discover the
effects of accommodations may help the
field to better understand how
accommodations affect scoring and
validity.
While scientific
research holds great importance for
scoring and validity issues, the variety
of research conducted over the course of
2002-2004 is also important. Studies in
2002, 2003, and 2004 investigated
accommodations score effects, validity,
and teacher decision-making in terms of
accommodations, and examined
accommodations effects for students from
kindergarten through postsecondary
education and across five different
subject areas. In addition to more
studies on these topics, future research
should investigate the positive effects
of field-testing potential test items in
accommodated formats in addition to
standard formats.
Although the diversity
of studies about accommodations presents
a challenge to policymakers who may wish
to have definitive conclusions about
accommodations, the breadth of studies
reflects (at least to some extent) the
variety of issues present in education
today. Students with disabilities are
not a homogeneous group. Likewise, one
accommodation does not fit all students,
especially students at different grade
or educational levels.
The move toward more
universally accessible assessments
provides an opportunity to minimize the
need for accommodations. Thompson,
Johnstone, and Thurlow (2002), however,
noted that flexible, universally
designed assessments may only minimize,
not completely eliminate, the need for
accommodations. Likewise, until there is
an individualized system for validly
choosing accommodations there is a
continued need for research on teacher
decision-making related to
accommodations.
Although accommodations
research has been a part of educational
research for decades, it appears that it
is still in its nascence. There is still
much scientific disagreement on the
effects, validity, and decision-making
surrounding accommodations. Such
challenges lead to difficult decisions
for future accommodations research.
Scientific studies with large sample
sizes hold promise for determining the
exact effects of particular
accommodations. Likewise, tests that are
norm-referenced and statistically
defensible (such as standardized tests)
support stronger claims about effects on
items.
Unfortunately, much is
lost in studies such as those described
above. Students with disabilities are a
heterogeneous group that may require a
wide variety of accommodations in order
to access tests. Research from 2002-2004
was most focused on oral administration
and extended time conditions for
students with learning disabilities.
Research related to issues for students
with learning disabilities is meaningful
(given the large numbers of students
with learning disabilities), but
research should not excessively focus on
the needs of the most populous
disability group. Rather, research on
students with a wide variety of
disabilities receiving a wide variety of
accommodations is also a valuable focus,
even when statistical claims are more
difficult to generate.
As testing technology
emerges in the 21st century,
further research will need to address
flexible tests that allow on-demand
accommodations. Findings from 2002-2004
demonstrate only that questions still
abound concerning accommodations, and
that answers are often circumstantial,
population-dependent, or constrained to
particular tests. Such findings justify
not only tests that reduce the need for
accommodations, but also more research
on accommodations that currently exist.
As we move forward into
the next generation of accommodations
research, one fact is certain: so long
as policies (such as the Individuals
with Disabilities Education Act and the
Americans with Disabilities Act) require
that students with disabilities receive
accommodations, there will always be a
need for research on how to best, most
fairly, and most accurately assess
all students.
References
Barton, K. E. (2002).
Stability of constructs across groups of
students with different disabilities on
a reading assessment under standard and
accommodated administrations (Doctoral
dissertation, University of South
Carolina, 2001). Dissertation
Abstracts International, 62/12,
4136.
Barton, K. E., & Huynh,
H. (2003). Patterns of errors made by
students with disabilities on a reading
test with oral reading administration.
Educational and Psychological
Measurement, 63(4), 602-614.
Barton, K. E., &
Sheinker, A. (2003). Comparability
and accessibility: On line versus on
paper writing prompt administration and
scoring across students with various
abilities. Monterey, CA: CTB/McGraw-Hill.
Bielinski, J.,
Ysseldyke, J., Bolt, S., Friedebach, J.,
& Friedebach, M. (2001).
Read-aloud
accommodation: Effects on
multiple-choice reading & math items.
Minneapolis, MN: University of
Minnesota, National Center on
Educational Outcomes.
Bielinski, J., Sheinker,
A., & Ysseldyke, J. (2003).
Varied
opinions on how to report accommodated
test scores: Findings based on
CTB/McGraw-Hill’s framework for
classifying accommodations
(Synthesis Report 49). Minneapolis, MN:
University of Minnesota, National Center
on Educational Outcomes.
Bolt, S., & Bielinski,
J. (2002). The effects of the read
aloud accommodation on math test items.
Paper presented at the annual meeting of
the National Council on Measurement in
Education, New Orleans, LA.
Bridgeman, B., Cline,
F., & Hessinger, J. (2004). Effect of
extra time on verbal and quantitative
GRE scores. Applied Measurement in
Education, 17(1), 25-37.
Bridgeman, B., Lennon,
M. L., & Jackenthal, A. (2003). Effects
of screen size, screen resolution and
display rate on computer-based test
performance. Applied Measurement in
Education, 16(3), 191-205.
Buehler, K. L. (2002).
Standardized group achievement tests and
the accommodation of additional time
(Doctoral dissertation, Indiana State
University, 2001). Dissertation
Abstracts International, 63/04,
1312.
Burch, M. (2002).
Effects of computer-based test
accommodations on the math
problem-solving performance of students
with and without disabilities (Doctoral
dissertation, Vanderbilt University,
2002). Dissertation Abstracts
International, 63/03, 902.
Cahalan, C., Mandinach,
E., & Camara, W. J. (2002).
Predictive validity of SAT I: Reasoning
test for test-takers with learning
disabilities and extended time
accommodations. New York, NY: The
College Reporting Board.
Choi, S. W., & Tinker,
T. (2002). Evaluating comparability
of paper-and-pencil and computer-based
assessment in a K-12 setting. Paper
presented at the annual meeting of the
National Council on Measurement in
Education, New Orleans, LA.
Cisar, C. A. (2004).
Teacher’s knowledge about accommodations
and modifications as they relate to
assessment (Doctoral dissertation,
Loyola University of Chicago, 2004).
Dissertation Abstracts International,
65/10, 3754.
Crawford, L., Helwig,
R., & Tindal, G. (2004). Writing
performance assessments: How important
is extended time? Journal of Learning
Disabilities, 37(2), 132-142.
Dempsey, K. M. (2004).
The impact of additional time on LSAT
scores: Does time really matter? The
efficacy of making decisions on a
case-by-case basis (Doctoral
dissertation, La Salle University,
2004). Dissertation Abstracts
International, 64/10, 5212.
Elbaum, B., Arguelles,
M. E., Campbell, Y., & Saleh, M. B.
(2004). Effects of a student-reads-aloud
accommodation on the performance of
students with and without learning
disabilities on a test of reading
comprehension. Exceptionality,
12(2), 71-87.
Elliott, S. N., &
Marquart, A. M. (2003). Extended time
as an accommodation on a standardized
mathematics test: An investigation of
its effects on scores and perceived
c |