A Summary of Research on the Effects of Test Accommodations: 2002 Through 2004

Technical Report 45

Christopher J. Johnstone • Jason Altman • Martha L. Thurlow • Sandra J. Thompson*

September 2006

All rights reserved. Any or all portions of this document may be reproduced and distributed without prior permission, provided the source is cited as:

Johnstone, C. J., Altman, J., Thurlow, M. L., & Thompson, S. J. (2006). A summary of research on the effects of test accommodations: 2002 through 2004 (Technical Report 45). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes. Retrieved [today's date], from the World Wide Web: http://education.umn.edu/NCEO/OnlinePubs/Tech45/

* Dr. Thompson was a Research Associate at NCEO when working on this report. She passed away December 2005 after a career of improving outcomes for students with disabilities.


Table of Contents

Executive Summary
Introduction
Methods
Results
Purpose of Accommodations Research
Types of Assessment
Content Areas Assessed
Type of Accommodation
Research Participants
Research Results
Limitations
Recommendations for Future Research
Discussion and Implications for Future Research
References
Appendix A—Summary of Research Purpose
Appendix B—Summary of Type of Assessment
Appendix C—Subject Area Studied (by Author)
Appendix D—Type of Accommodation Studied (by Author)
Appendix E—Summary of Participants
Appendix F—Summary of Research Results
Appendix G—Summary of Limitations Cited by Researchers
Appendix H—Summary of Suggestions for Future Research (as recommended by authors)


Executive Summary

The No Child Left Behind Act of 2001 (NCLB) requires the reporting of participation in assessments overall and by subgroup, including students with disabilities. As states and school districts strive to meet the goals for adequate yearly progress required by NCLB, the use of individual accommodations continues to be scrutinized for effectiveness, threats to test validity, and score comparability. This report summarizes 49 empirical research studies completed on test accommodations between 2002 and 2004, and provides direction in the design of critically needed future research on accommodations.

NCEO found that studies during this three-year period had the following characteristics:

Purpose. The primary purpose of the 2002-2004 accommodations research was to determine the effects of accommodations use on the large-scale test scores of students with disabilities.

Types of assessment, content areas, and accommodations. The majority of the studies tested students using norm-referenced or criterion-referenced tests, on math or reading/language arts.

Participants. Equal numbers of research studies involved between 1-100 participants, 100-1,000 participants, and more than 1,000 participants of multiple age categories. Participants were varying percentages of students without disabilities and students with disabilities. Students with learning disabilities were studied most frequently among students who receive special education services.

Findings. Findings shared no common theme, with various accommodations shown to have both a positive and non-positive effect on scores. Individual accommodations showed either differential item functioning or no differential item functioning depending on the study. The lack of consistent findings points to a need for further research.

Limitations. Most often, authors noted that studies were too narrow in scope, involved a small sample size, or provided confounding factors. These limitations and other considerations led researchers to recommend investigating the characteristics of accommodations in further detail.

Important overall observations from the NCEO analysis include a need in future research for a clear definition of the constructs tested, a reduction in confounding factors, increased study of institutional factors affecting accommodations judgment, and exploration of the desirability and perceived usefulness of accommodations by students themselves. Future research should focus on improvement in these areas but also on the positive effects of field-testing potential items in accommodated formats in addition to standard formats.


Introduction

Over the past decade, students with disabilities have increasingly participated and performed at proficient levels on general education assessments. Participation and proficient performance of all students are required by the No Child Left Behind Act of 2001. As participation rates increase, so does the use of testing accommodations. Increased use of accommodations should reflect an attempt to ensure that the scores received by students with disabilities are valid measures of achievement. It is also possible that increased use of accommodations is simply a reflection of concern about including students in assessments and a belief that these students need additional aids to help them perform better. Because of this, states are clarifying appropriate accommodation use in state policy, with the goal of encouraging Individualized Education Program (IEP) teams to select accommodations that remove specific disability barriers, but do not give students with disabilities an unfair advantage over their peers. States have begun to monitor accommodations use, and this will help them to better track the effects of accommodations. In 2005, 20 states maintained a database of accommodations actually used during testing within the state, and 26 states documented the specific accommodations used by students on test day (Thompson, Johnstone, Thurlow, & Altman, 2005).

State policymakers and practitioners define accommodations in a variety of ways. For the purposes of this report, we draw from accommodations research to help shape our definitions and outlook on testing accommodations for students with disabilities. For example, Thurlow and Bolt (2001) defined testing accommodations as:

changes in assessment materials or procedures that address aspects of students’ disabilities that may interfere with the demonstration of their knowledge and skills on standardized tests. Accommodations attempt to eliminate barriers to meaningful testing, thereby allowing for the participation of students with disabilities in state and district assessments. (p. 3)

Sireci, Li, and Scarpati (2005) explained the validity of accommodations through an "interaction hypothesis" or the theoretical assumption that test accommodations will lead to improved test scores for students who need accommodations, but not for students who do not need accommodations (i.e., students with disabilities receive a boost in scores as a result of accommodations whereas students without disabilities do not receive a boost or receive a less pronounced boost in scores).
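The interaction hypothesis described above can be read as a comparison of score gains between two groups. The sketch below is one way to express that comparison; it is an illustration only. The function names are hypothetical, the mean scores in the usage note are invented rather than drawn from any study reviewed in this report, and the check ignores statistical significance, which an actual study would need to test.

```python
def boost(accommodated_mean, standard_mean):
    """Gain in mean score when the accommodation is provided."""
    return accommodated_mean - standard_mean


def differential_boost(swd_accommodated, swd_standard,
                       swod_accommodated, swod_standard):
    """Gain for students with disabilities (swd) minus the gain
    for students without disabilities (swod)."""
    return (boost(swd_accommodated, swd_standard)
            - boost(swod_accommodated, swod_standard))


def supports_interaction_hypothesis(swd_accommodated, swd_standard,
                                    swod_accommodated, swod_standard):
    """True when students with disabilities gain, and gain more than
    students without disabilities (who may gain less or not at all)."""
    swd_gain = boost(swd_accommodated, swd_standard)
    swod_gain = boost(swod_accommodated, swod_standard)
    return swd_gain > 0 and swd_gain > swod_gain
```

With invented means of 58 (accommodated) and 50 (standard) for students with disabilities, and 71 and 70 for students without disabilities, the differential boost is 7 points and the pattern is consistent with the hypothesis. Equal gains in both groups, or no gain for students with disabilities, would not be.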

The accommodations allowed in state assessment policies vary from state to state. The most common accommodations found in previous syntheses of research were read aloud accommodations (sometimes referred to as oral administration), computer administration of tests, extended time and tests across multiple days, calculator use, and use of a scribe (Thompson, Blount, & Thurlow, 2002). Most students use a combination of accommodations (Bielinski, Ysseldyke, Bolt, Friedebach, & Friedebach, 2001), as dictated by their IEPs.

Research on accommodations has yielded mixed results in terms of validity and efficacy of providing them to students with disabilities. Over the past several years, accommodations research has yet to produce definitive answers for policymakers and practitioners (Sireci, Li, & Scarpati, 2003). Nevertheless, research reports on accommodations continue to be found in professional journals, indicating that there is still a need to investigate the complex issues surrounding accommodations. This report examines empirical research published between 2002 and 2004. We searched peer-reviewed articles, technical reports, and dissertations in order to provide the readers with up-to-date information on accommodations.

In this report, research is summarized according to several components, including research purpose, type of assessment, content area assessed, type of accommodation, number of participants, percent of sample consisting of students with disabilities, participant grade level, type of disability, research results, research limitations, and recommendations for further research. During our review process, we found 49 studies on accommodations published from 2002 through 2004. This count shows that accommodations research has remained active in the new millennium, and it is slightly higher than the number of studies published from 1999 through 2001 (Thompson, Blount, & Thurlow, 2002) (see Table 1). The publications reviewed for this report are found in Appendix A.

Table 1. Number of Accommodations Studies by Years

Years                  Number of Studies
1990 through 1992      11
1993 through 1995      18
1996 through 1998      29
1999 through 2001      46
2002 through 2004      49


Methods

NCEO used a four-stage process to find publications related to accommodations from 2002 through 2004. First, we conducted a search of electronic databases including ERIC, PsycINFO, Educational Abstracts, and Digital Dissertations using the keywords "accommodation," "test adaptation," "test changes," "test modifications," "test accommodations," "state testing accommodations," "standards-based testing accommodations," and "large-scale testing accommodations."

A second electronic search consisted of organizational Web sites, including Behavior Research and Training (http://brt.uoregon.edu/), the National Center for Research on Evaluation, Standards, and Student Testing (http://www.cse.ucla.edu/), the Center for the Study of Assessment Validity and Evaluation (http://www.c-save.umd.edu/index.html), and the Wisconsin Center for Educational Research (http://www.wcer.wisc.edu/tesacc/). In addition, an archival search of Educational Policy Analysis Archives was undertaken.

In addition to the electronic searches, NCEO staff performed two hand searches. First, references from all selected materials dated 2002 through 2004 were examined in an effort to find further source material. Second, 2002 through 2004 issues of major measurement and special education journals were hand searched in the University of Minnesota library. These journals included Applied Measurement in Education, British Journal of Special Education, Diagnostique, Educational Assessment, Educational and Psychological Measurement, Educational Measurement: Issues and Practice, Educational Psychologist, Educational Psychology, Exceptional Children, Journal of Educational Measurement, Journal of Learning Disabilities, Journal of School Psychology, Journal of Special Education, and Remedial and Special Education. Last, the schedules of the annual conferences of major organizations (such as the American Educational Research Association, the Council of Chief State School Officers, Council for Exceptional Children, and the National Council on Measurement in Education) were scanned for presentations on accommodations.

NCEO research staff searched all identified sources over an 18-month period beginning in fall, 2004 and concluding in spring, 2005. All of the studies cited were either empirical research or meta-analyses, each with succinct research findings that added to the field’s knowledge about the effects of accommodations. The References section of this report includes all journal article, research report, conference presentation, and dissertation references.


Results

Purpose of Accommodations Research

Two primary purposes appeared most often in accommodations research from 2002 through 2004 (see Table 2). First, the majority of studies examined the effect of the use of accommodations on scores; 23 studies sought to determine the effect of the use of accommodations on the test scores of students with disabilities, and 13 studies investigated the effects of accommodations on test score validity. Second, a set of studies examined institutional factors related to accommodations (such as teacher knowledge, effects of policy, and IEP team decision making); nine publications fit this purpose. In addition to these, two publications examined patterns of errors across items or tests, and two meta-analyses synthesized accommodations studies. Details of the studies according to purpose are provided in Appendix A.

Table 2. Research Purposes

Research Purpose                                                                                  Number of Studies
Determine the effect of the use of accommodations on test scores of students with disabilities   23
Investigate the effects of accommodations on test score validity                                 13
Study institutional factors, teacher judgment, or student desirability of accommodation use      9
Examine patterns of errors across items or tests                                                 2
Meta-analysis                                                                                    2

Authors employed a variety of methods in their research (see Table 3). The most common methods were experimental and quasi-experimental research, in which research participants took tests under different conditions, and reviews of extant data, in which researchers reviewed data from assessments that students took with or without accommodations but that were not specifically designed for experimental and control conditions. Twenty-one publications used experimental design in their methodology and 17 studies reviewed existing data from state large-scale assessments or local assessments. A variety of descriptive and comparative statistics were employed to examine extant data. In addition to these methods, seven studies used survey or interview methods to better understand stakeholders' knowledge of and opinions on accommodations. Two studies were meta-analyses, one study evaluated a product, and one study described interventions for IEP teams.

Table 3. Research Methods

Method                                  Number of Studies
Experimental or Quasi-experimental      21
Review of extant data                   17
Survey/Interview                        7
Meta-analysis                           2
IEP intervention                        1
Product evaluation                      1

Note: Studies are described by their primary methodology.

The experimental and extant data analysis methods combined had two main goals: understanding the effect of accommodations on test scores and understanding the effects of accommodations on the psychometric qualities of items. Among the 23 studies that examined the effects of accommodations on test scores (see Table 2), researchers found that computerized administration (Pomplun, Frey, & Becker, 2002), read-aloud accommodation (Helwig, Rozek-Tedesco, & Tindal, 2002; Meloy, Deville, & Frisbie, 2002), video administration (Burch, 2002; Tindal, 2002), extended time (Bridgeman, Cline, & Hessinger, 2004), and assistive technology (Landau, Russell, Gourgey, Erin, & Cowan, 2003; MacArthur & Cavalier, 2004) all had a positive effect on the test scores of at least some of the students with disabilities included in the research samples. Conversely, Burch (2002) and Barton (2002) found that some students with learning disabilities did not benefit from computer or video accommodations. Likewise, Scheuneman, Camara, Cascallar, Wendler, and Lawrence (2002) found that calculator usage did not have an effect on student scores. According to Elliott and Marquart (2003, 2004), extended time also did not yield improved scores for students with disabilities.

In terms of the psychometric properties of accommodations, researchers obtained mixed results on the score comparability of items. Barton (2002), Barton and Huynh (2003), Calahan, Mandinach, and Camara (2002), Huynh, Meyer, and Gallant-Taylor (2002), and Kobrin and Young (2003) found no change in item comparability when various accommodations (including read aloud, extended time, and computerized administration) were employed. Bolt and Bielinski (2002), Choi and Tinker (2002), and Thornton, Reese, Pashley, and Dalessandro (2002), however, all found that tests administered orally, with extended time, or via computer changed item difficulty or constructs.

Finally, findings related to the institutional issues around accommodations were also mixed. Differing foci for studies yielded different results. Six studies of teachers, administrators, and students yielded contrasting results on the relative knowledge of school personnel about accommodations. Cisar (2004) found that special education teachers were more knowledgeable, while Gagnon and McLaughlin (2004) found that teachers and administrators scored similarly on knowledge measures. Woods (2004) discovered that students do not often predict their need for accommodations well.

Types of Assessment

Researchers who studied the effects of accommodations used two main types of assessment to determine effects and error in the use of accommodations in 2002-2004. This information is shown in Table 4. One common approach to testing the effects of accommodations was for researchers to use norm-referenced tests. Education professionals typically use norm-referenced tests for national comparison, diagnostic decisions in schools, and college entrance decisions. Researchers examined the effects of accommodations on the following norm-referenced tests: California Achievement Tests (CAT), Graduate Record Exam (GRE), Law School Admission Test (LSAT), Nelson-Denny Reading Test, Scholastic Aptitude Test (SAT), Terra Nova, and the General Certificate of Secondary Education Examination-United Kingdom (GCSE-UK).

In addition, researchers employed a number of statewide criterion-referenced tests (including tests from Maryland, Missouri, Oregon, and South Carolina). One researcher gathered descriptive data from a test that was still in the prototype stage. Appendix B provides details of the assessments used in the research during the years 2002-2004.

Table 4. Type of Assessment

Type of Assessment                                             Number of Studies
Norm-referenced and Other Standardized Tests                   18
State Criterion-referenced Tests or Performance Assessments    18
School or District-designed Tests                              0
Other                                                          8
Survey                                                         3
N/A, Meta-Analyses                                             2

Content Areas Assessed

Authors of studies conducted research in five major academic content areas: reading (n=23), mathematics (n=21), science (n=3), writing (n=3), and social studies (n=1) (see Table 5). There were also seven studies that examined test accommodations, but not in a specific content area. The "no specific content area" studies included examinations of test accommodations on general academic assessments, but did not include surveys or meta-analyses (n=5). Appendix C gives additional details on the content area examined in research studies (11 studies included two or more content areas).

Table 5. Content Areas Assessed*

Content Areas Assessed       Number of Studies
Mathematics                  21
Reading/Language Arts        23
Science                      3
Writing                      3
Social Studies               1
No Specific Content Area     7

*Studies may have reported on multiple content areas.

Type of Accommodation

We found 15 types of accommodations in the research literature from 2002 through 2004 (see Table 6). Four groups of accommodations emerged: presentation (n=21), timing/scheduling (n=8), response (n=2), and technological aid (n=2) (see also Appendix D). In addition, 11 of the studies investigated the effects of multiple accommodations.

Presentation accommodations were investigated most frequently in the research from 2002-2004. Among publications about presentation accommodations, studies about oral administration of tests (read aloud accommodations) were most common (n=11). The use of computers as a testing accommodation was also common (n=5). In addition, researchers investigated video administration of tests (n=2), large print accommodations (n=1), dictionary use (n=1), and braille formats of tests (n=1). Additional studies examined multiple types of accommodations.

Of the eight studies that explored the timing or scheduling of tests, seven investigated extended time and one examined the outcomes of testing over multiple days. Of the two studies about response formats, one examined student-dictated response and the other examined the use of calculators. Two studies investigated technological aids.

Eleven studies examined the effects of multiple accommodations. Studies of multiple accommodations included combinations of accommodations such as read aloud, video presentation, extra time, large print, individual settings or small group settings.

Table 6. Type of Accommodation

Type of Accommodation                      Number of Studies
Presentation (21):
    Oral Administration                    11
    Computer Administration                5
    Video                                  2
    Large Print                            1
    Dictionary Use                         1
    Braille                                1
Timing/Scheduling (8):
    Extended Time                          7
    Multiple Day                           1
Response (2):
    Dictated Response                      1
    Calculator                             1
Technological Aid (2)                      2
Multiple Accommodations (11)               11
N / A (Survey or Meta-Analysis) (5)        5

Research Participants

Studies varied in the number of participants included in the sample. Table 7 shows the number of studies grouped by the number of research participants. Eight high school and college students participated in the smallest study (Landau et al., 2003), and Hall (2002) used test data from 192,000 students in the largest study. Full details of research participant numbers are provided in Appendix E. Overall, approximately one-third of the studies had 99 or fewer participants (n=18), approximately one-third had between 100 and 999 participants (n=13), and approximately one-third had more than 1,000 research participants (n=16).

Table 7. Number of Participants in Studies

Number of Participants     Number of Studies
1-99                       18
100-199                    2
200-299                    3
300-499                    4
500-999                    4
More than 1000             16
Not Applicable             2
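The "approximately one-third" grouping described in the text can be checked against the Table 7 counts. The short sketch below, offered only as an illustration, collapses the table's intervals into the three bands named in the text and computes each band's share; the variable names are hypothetical, and the two "Not Applicable" studies are left out on the assumption that they had no participant counts to classify.

```python
# Counts copied from Table 7; the two "Not Applicable" studies are omitted.
table7 = {
    "1-99": 18,
    "100-199": 2,
    "200-299": 3,
    "300-499": 4,
    "500-999": 4,
    "More than 1000": 16,
}

# The three bands used in the text, each mapped to the Table 7 rows it covers.
bands = {
    "99 or fewer": ["1-99"],
    "100-999": ["100-199", "200-299", "300-499", "500-999"],
    "more than 1,000": ["More than 1000"],
}

# Sum the study counts within each band.
band_counts = {band: sum(table7[row] for row in rows)
               for band, rows in bands.items()}

# Share of each band among studies that reported participants, in percent.
total = sum(band_counts.values())
band_shares = {band: round(100 * n / total, 1)
               for band, n in band_counts.items()}
```

The bands hold 18, 13, and 16 of the 47 studies with participants, or roughly 38, 28, and 34 percent, which is the sense in which each band is "approximately one-third."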

 

The percentage of research participants with disabilities differed across studies (see Table 8). In 18 studies, students with disabilities made up a majority of the sample (participants in eight studies were 50-74 percent students with disabilities and the samples in ten studies were 75-100 percent students with disabilities). In 15 studies, students with disabilities comprised less than half of the sample (there were less than 25 percent students with disabilities in 10 studies and between 25 and 49 percent students with disabilities in the sample of five studies). Nine studies did not report the percentage of the sample with disabilities and seven studies (including two meta-analyses) did not use research methods that involved students.

Table 8. Percent of Sample Consisting of Students with Disabilities

Percent of Sample Consisting of Students with Disabilities     Number of Studies
1-24%                                                          10
25-49%                                                         5
50-74%                                                         8
75-100%                                                        10
Not Reported                                                   9
No Students with Disabilities Participated in Study            0
Not Applicable                                                 7

The grade level of research participants also varied (see Table 9). For example, six studies targeted students who were in elementary school (grades K-5), six studies examined accommodations with middle school students (grades 6-8), and 11 studies examined accommodations with high school students (grades 9-12). Postsecondary students participated in six studies and 15 studies investigated students across grade levels (from grades K to 12). Five studies employed surveys or were meta-analyses that did not involve actual research participants.

Table 9. Grade Level of Participants in Studies

Participant Grade Level                               Number of Studies
Elementary (K-5)                                      6
Middle School (6-8)                                   6
High School (9-12)                                    11
Multiple Grade Level Categories (K-Postsecondary)     15
Post Secondary                                        6
Not Applicable                                        5

In addition to differences in grade level, students with a variety of disability labels participated in studies (see Table 10). Several studies (n=16) included more than one disability category. Fifteen studies included students with learning disabilities, 12 studies included students with communication disabilities, 10 studies included students with cognitive disabilities, and 10 studies included students with emotional/behavioral disabilities. Students with less common disabilities, including physical impairment, sensory disabilities, autism, attention deficit disorder, health impairments, and multiple disabilities, were each included in at least one study. Twenty-one studies did not report the types of disabilities of participants.

Table 10. Disability Categories Included in Studies

Type of Disability                                     Number of Studies
Learning Disability                                    15
Cognitive Disability (e.g., mental retardation)        10
Emotional/Behavioral Disability                        10
Communication Disability                               12
Reading or Math Deficit                                4
Other (includes physical and sensory disabilities,
  autism, attention deficit disorder, health
  impairments, and multiple disabilities)              16
Not Reported                                           21
Not Applicable                                         5

Note: Studies sometimes include students with more than one disability category; all are reflected in this table.

Research Results

Results from the 49 studies reviewed in this synthesis varied. Researchers found accommodations showed both statistically positive and statistically non-significant effects on scores. Likewise, some accommodations had no effect on item comparability, while other types of accommodations compromised item comparability.

Presented here are results according to the type of accommodation used. These results are shown in Table 11, with detailed results available in Appendix F. Similar to previous reviews of accommodations research (Thompson, Blount, & Thurlow, 2002), accommodations appeared to have mixed effects in studies from 2002 through 2004. Furthermore, even when separated by type of accommodation, studies still demonstrated mixed effects. The results from the 11 studies that investigated multiple accommodations are not synthesized due to lack of study focus comparability.

Oral Presentation (Read Aloud). A total of 11 studies on oral administration of assessments (often called "read aloud" accommodations) produced mixed results. For example, Helwig et al. (2002), Huynh, Meyer, and Gallant (2004), Janson (2002), Meloy, Deville, and Frisbie (2002), Tindal (2002), and Weston (2003) all found that read aloud accommodations had a positive effect on scores for students with disabilities. Bolt and Bielinski (2002) and McKevitt and Elliott (2003), however, found read aloud accommodations had no significant impact on student scores.

In terms of item comparability, Barton (2002) and Barton and Huynh (2003) found that items were comparable, whether presented under standard conditions or with read aloud accommodations. However, Bolt and Bielinski (2002), Meloy et al. (2002), and Weston (2003) found that read aloud accommodations did affect item comparability.

The determination of who should use oral presentation accommodations is also an issue. Woods (2004) found great inaccuracy in students' self-prediction of their need for the read aloud accommodation. Such disparate findings point to the ongoing controversies regarding read aloud accommodations.

One study examined the impact of a student-reads-aloud (i.e., student reads text but aloud) accommodation on the performance of middle and high school students with and without learning disabilities on a test of reading comprehension. Elbaum, Arguelles, Campbell, and Saleh (2004) discovered that students’ test performance did not differ in the two conditions, and students with learning disabilities did not benefit more from the accommodation than students without learning disabilities.

Extended Time. Authors of eight studies examined how the use of extended time affected student achievement levels on tests and the extent to which items under extended time or multiple day administrations of tests compared to those administered under standard conditions. Several studies (Bridgeman et al., 2004; Dempsey, 2004) found that students with disabilities profit from extended time accommodations. In these studies, students with disabilities had higher test scores because of extended time accommodations.

Buehler (2002) and Elliott and Marquart (2004), however, found no significant effect on scores when students were provided extended time. Such disparities in study results again demonstrate the lack of consistency in accommodations research. The comparability of specific items under different administration categories further complicates accommodations issues. Buehler (2002) found varying student results for items that were deemed to be comparable under standard and extended time administrations, but Elliott et al. (2004) and Thornton et al. (2002) found that the items they studied were not comparable under different administrations. A study by Crawford, Helwig, and Tindal (2004) of multi-day testing produced conflicting results.

Computer Administration. In total, five studies investigated the use of computer administered tests from 2002-2004. Among these studies, Pomplun, Frey and Becker (2002) found that students had positive test results when administered tests via computer (rather than paper and pencil format). Barton and Huynh (2003), Bridgeman, Lennon and Jackenthal (2003), and Kobrin and Young (2003), however, found that computer administration of tests had no statistical effect, or a statistically negative impact on student scores. Related studies on item comparability between computer and paper/pencil administration yielded similar results, with Choi and Tinker (2002) determining that items were changed as a result of format differences.

Technological Aid. In the years 2002 through 2004, three research studies investigated technology-based testing accommodations, and all resulted in positive effects on test scores. Hansen, Lee, and Forer (2002) found that the usability of speech output technology was evaluated positively, and that ‘self-voicing’ testing systems have significant potential and may be capable of replacing human readers in certain testing situations. Landau et al. (2003) found that the Tactile Text Tablet, a hybrid between a braille paper-based test and laptop computer, had positive effects on student achievement. MacArthur and Cavalier (2004) found that the use of speech recognition software was feasible, created impressive dictation results, and improved the quality of student-written essays.

Calculator Use. There was only one study between the years of 2002-2004 that considered calculator use. In this study Scheuneman et al. (2002) found that calculator use had no significant effect on scores for students taking the SAT. It is unknown why research in calculator usage has become less common than in past years (there were four research studies on calculator accommodations in the years 1999-2001 but only one accommodations study during the years of 2002-2004).

Dictionary Use. One international study of dictionary accommodations took place from 2002 through 2004. Idstein’s (2003) study of Israeli students found that dictionaries were not an effective accommodation and that dictionary use interrupted student thought patterns.

Table 11. Types of Accommodations in Studies

Type of Accommodation                  Research Results                                    Number of Studies
Oral Presentation (11)                 Positive effect on scores                           6
                                       No Differential Item Functioning                    2
                                       No significant effect on scores                     2
                                       Differential Item Functioning                       2
                                       Self-prediction for need unreliable                 1
Extended Time (12)                     Positive effect on scores                           3
                                       No significant effect on scores                     2
                                       Differential Item Functioning                       2
                                       No Differential Item Functioning                    1
                                       Scores on accommodated test predictor of grades     1
Computer Administration (5)            No significant effect on scores                     3
                                       Positive effect on scores                           1
                                       Differential Item Functioning                       1
Technological Aid (3)                  Positive effect on scores                           3
Calculator Use (1)                     No significant effect on scores                     1
Dictionary Use (1)                     Negative effect on Scores                           1
Multiple Accommodations                                                                    11
N/A, Meta-analyses, Survey, Teacher                                                        5

Limitations

Educational research has inherent limitations that require readers to consider findings carefully. For example, true experimental conditions rarely mimic the true conditions in schools under "live" testing conditions. In addition, sample sizes for studies of students with disabilities are often small because students with disabilities are a minority population in schools (roughly one in 10 students has a disability).

Thirty-six authors (74 percent) of accommodation studies published from 2002 through 2004 noted limitations in their studies. Table 12 summarizes the most commonly reported limitations, and Appendix G provides brief annotations of studies that reported limitations.

Fifteen authors reported a small or narrow sample size, including Hall (2002) who noted, "The study focuses only on fifth grade students, and the results may not generalize to students with disabilities in other grades or dissimilar disabilities, socio-economic statues, etc." Thirteen authors warned that confounding factors may have influenced results. Common factors were testing multiple accommodations at once and an inability to randomize the sample. Four authors found a flaw in research design that affected study results. Two authors cited conflicting results as a limitation, and two cited nonstandard administration across proctors and schools. Eleven authors did not mention a limitation, nor did the two meta-analyses.

Table 12. Limitations of Research

Research Limitation                                      Number of Studies
Small Sample Size/Sample Too Narrow in Scope             15
Confounding Factors                                      13
Flaw in Research Design                                  4
Conflicting Results                                      2
Nonstandard Administration Across Proctors and Schools   2
No Limitations Mentioned                                 12
Not Applicable/Meta-Analyses                             2

Recommendations for Future Research

From 2002 through 2004, 34 research studies included recommendations for further research. Table 13 presents the categories of recommendations listed by researchers, with more detailed explanations available in Appendix H. These calls for further investigation show that accommodations research remains exploratory. Although scholars have conducted accommodations research for several decades, there is still a clear need for more examination in various areas. Authors of studies conducted from 2002 through 2004 called for: further understanding of the characteristics of accommodations that contribute to possible variation in results; further understanding of student factors that contribute to accommodations use and success; improved study design or replication of studies; further research on accommodations policy and overall hypotheses; replication of studies with larger samples; investigation of teacher characteristics that relate to accommodation selection; investigation of possible uses for accommodations in instruction; and investigation of the practicality of accommodation use.

Table 13. Recommendations for Future Research

Recommendations                                                              Number of Studies
Investigate characteristics of accommodations themselves in further detail   12
Investigate student factors contributing to accommodations use               6
Improved study design or study replication                                   5
Study policy and accommodations hypotheses                                   4
Replicate study with larger sample                                           3
Investigate teacher factors related to accommodations selections             2
Investigate possible instructional uses for accommodations                   1
Investigate accommodation practicality                                       1


Discussion and Implications for Future Research

Several themes arose from the 49 studies of accommodations published between 2002 and 2004. One theme was that accommodations research is inconclusive. This is similar to past findings from NCEO summaries of research (Thompson, Blount, & Thurlow, 2002). Given that there is not a preponderance of evidence concerning accommodations, nearly two-thirds of the authors (n=31) suggested that future research is needed to solidify understanding of accommodation effects.

Researchers published three more accommodations studies from 2002 through 2004 than from 1999 through 2001. Similar to previous years, the majority of studies in the most recent period focused on test scores of students with disabilities related to accommodations. A substantial number of studies also investigated the effects of accommodations on test score validity. Researchers were particularly concerned about accommodations changing the construct of the items assessed. Similar to previous reviews, a smaller number of studies from 2002 through 2004 concentrated on institutional factors related to accommodation use (e.g., teacher judgment, student selection, policy) or on patterns of errors across items, or were meta-analyses of previous work.

The assessments that researchers selected for examination were primarily standardized, norm-referenced tests and performance assessments. As would be expected with assessment requirements of the No Child Left Behind Act of 2001, most accommodations research was conducted in the areas of reading/language arts and mathematics. Across subject areas, researchers studied oral administration and timing accommodations most often.

Sample sizes varied, with approximately equal representation of small (n=99 or fewer subjects), medium (n=100-999 subjects), and large (n≥1,000 subjects) studies. Studies with small samples typically targeted students with disabilities in K-12 schools, while large studies typically used tests administered to large numbers of students (such as college entrance examinations). Although studies were evenly distributed across sample-size categories, they were less evenly distributed across the grade levels of participants. Accommodations studies were spread across grade and educational level, but were conducted primarily with secondary and post-secondary education students.
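
The three sample-size bands used in this summary (99 or fewer subjects, 100 to 999 subjects, and 1,000 or more subjects) can be written as a simple rule. The sketch below is illustrative only: the function name size_category and the example sample sizes are hypothetical and are not drawn from the reviewed studies.

```python
def size_category(n: int) -> str:
    """Classify a study by number of subjects, using the bands in this summary."""
    if n < 0:
        raise ValueError("sample size cannot be negative")
    if n <= 99:
        return "small"
    if n <= 999:
        return "medium"
    return "large"


# Hypothetical sample sizes, for illustration only.
examples = [45, 100, 999, 1000, 25000]
categories = [size_category(n) for n in examples]
print(categories)
```

Applying one such rule to every study is what makes the counts of small, medium, and large studies comparable across review periods.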

Finally, in terms of demographics, researchers most often studied students with learning disabilities from 2002 through 2004. Eight studies explicitly targeted students with learning disabilities. This pattern mirrors accommodations patterns from 1999-2001 (Thompson, Blount, & Thurlow, 2002), most likely because students with learning disabilities are the largest group of students with disabilities and because this population is frequently assigned accommodations such as oral administration and extended time. The practical value of studying these accommodations with students with learning disabilities is obvious, and was evident in research from 2002 through 2004.

Although this report is not meant to provide a scientific meta-analysis of accommodations research (for meta-analyses see Sireci et al., 2005 and Tindal and Ketterlin-Geller, 2004), general patterns that emerged from a review of accommodations research in the years 2002, 2003, and 2004 indicate possible considerations for future research.

The majority of research concentrated on the effect of accommodations use for students with disabilities and the effects on score validity due to accommodations use. Although a combined 36 studies investigated scoring and validity, there was little consensus among researchers. Findings continue to be contradictory. Some studies indicated that accommodations were beneficial for students with disabilities, while others indicated that they were not. Likewise, researchers did not reach consensus on whether accommodations change the construct of the item assessed. Because findings are relatively disparate, there does appear to be a need for further research.

Research that continues to delineate the "interaction hypothesis" (Sireci et al., 2005) and that reduces construct-irrelevant variance for students with disabilities without introducing any new effects for non-disabled students still appears to be necessary. In 2002-2004, 21 studies employed experimental or quasi-experimental methods. Replications of scientific methods to discover the effects of accommodations may help the field to better understand how accommodations affect scoring and validity.
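
One way to make the interaction hypothesis concrete is to compare the score gain under accommodation for students with disabilities with the gain for students without disabilities. The sketch below assumes a simple difference of mean gains, sometimes called a differential boost. The function name and all scores are hypothetical and are not taken from any study in this review, and a real analysis would also need a test of statistical significance.

```python
from statistics import mean


def differential_boost(swd_standard, swd_accommodated,
                       non_swd_standard, non_swd_accommodated):
    """Return (gain for students with disabilities,
    gain for students without disabilities, difference between the two gains)."""
    swd_gain = mean(swd_accommodated) - mean(swd_standard)
    non_swd_gain = mean(non_swd_accommodated) - mean(non_swd_standard)
    return swd_gain, non_swd_gain, swd_gain - non_swd_gain


# Hypothetical scores, for illustration only.
swd_gain, non_swd_gain, boost = differential_boost(
    swd_standard=[40, 45, 50],
    swd_accommodated=[50, 55, 60],
    non_swd_standard=[70, 75, 80],
    non_swd_accommodated=[71, 76, 81],
)
print(swd_gain, non_swd_gain, boost)
```

A pattern consistent with the interaction hypothesis would show a clearly positive gain for students with disabilities and a gain near zero for students without disabilities.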

While scientific research holds great importance for scoring and validity issues, the variety of research conducted over the course of 2002-2004 is also important. Studies in 2002, 2003, and 2004 investigated the effects of accommodations on scores, validity, teacher decision-making about accommodations, and accommodation effects for students from kindergarten through postsecondary education, across five different subject areas. In addition to more studies on these topics, future research should investigate the positive effects of field-testing potential test items in accommodated formats in addition to standard formats.

Although the diversity of studies about accommodations presents a challenge to policymakers who may wish to have definitive conclusions about accommodations, the breadth of studies reflects (at least to some extent) the variety of issues present in education today. Students with disabilities are not a homogeneous group. Likewise, one accommodation does not fit all students, especially students at different grade or educational levels.

The move toward more universally accessible assessments provides an opportunity to minimize the need for accommodations. Thompson, Johnstone, and Thurlow (2002), however, noted that flexible, universally designed assessments may only minimize, not completely eliminate, the need for accommodations. Likewise, until there is an individualized system for validly choosing accommodations, there is a continued need for research on teacher decision-making related to accommodations.

Although accommodations research has been a part of educational research for decades, it appears that it is still in its nascence. There is still much scientific disagreement on the effects, validity, and decision-making surrounding accommodations. Such challenges lead to difficult decisions for future accommodations research. Scientific studies with large sample sizes hold promise for determining the exact effect that researchers can derive from particular accommodations. Likewise, tests that are norm-referenced and statistically defensible (such as standardized tests) lead to claims of effects on items that are more significant.

Unfortunately, much is lost on studies such as those presented above. Students with disabilities are a heterogeneous group that may require a wide variety of accommodations in order to access tests. Research from 2002-2004 was most focused on oral administration and extended time conditions for students with learning disabilities. Research related to issues for students with learning disabilities is meaningful (given the large numbers of students with learning disabilities), but research should not excessively focus on the needs of the most populous disability group. Rather, research on students with a wide variety of disabilities receiving a wide variety of accommodations is also a valuable focus, even when statistical claims are more difficult to generate.

As testing technology emerges in the 21st century, further research will need to address flexible tests that allow on-demand accommodations. Findings from 2002-2004 demonstrate only that questions still abound concerning accommodations, and that answers are often circumstantial, population-dependent, or constrained to particular tests. Such findings justify the need for tests that diminish the need for accommodations, but also more research on accommodations that currently exist. As we move forward into the next generation of accommodations research, one fact is certain: so long as policies (such as the Individuals with Disabilities Education Act and the Americans with Disabilities Act) require that students with disabilities receive accommodations, there will always be a need for research on how to best, most fairly, and most accurately assess all students.


References

Barton, K. E. (2002). Stability of constructs across groups of students with different disabilities on a reading assessment under standard and accommodated administrations (Doctoral dissertation, University of South Carolina, 2001). Dissertation Abstracts International, 62/12, 4136.

Barton, K. E., & Huynh, H. (2003). Patterns of errors made by students with disabilities on a reading test with oral reading administration. Educational and Psychological Measurement, 63(4), 602-614.

Barton, K. E., & Sheinker, A. (2003). Comparability and accessibility: On line versus on paper writing prompt administration and scoring across students with various abilities. Monterey, CA: CTB-McGraw-Hill.

Bielinski, J., Ysseldyke, J., Bolt, S., Friedebach, J., & Friedebach, M. (2001). Read-aloud accommodation: Effects on multiple-choice reading & math items. Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes.

Bielinski, J., Sheinker, A., & Ysseldyke, J. (2003). Varied opinions on how to report accommodated test scores: Findings based on CTB/McGraw-Hill’s framework for classifying accommodations (Synthesis Report 49). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes.

Bolt, S., & Bielinski, J. (2002). The effects of the read aloud accommodation on math test items. Paper presented at the annual meeting of the National Council on Measurement in Education, New Orleans, LA.

Bridgeman, B., Cline, F., & Hessinger, J. (2004). Effect of extra time on verbal and quantitative GRE scores. Applied Measurement in Education, 17(1), 25-37.

Bridgeman, B., Lennon, M. L., & Jackenthal, A. (2003). Effects of screen size, screen resolution and display rate on computer-based test performance. Applied Measurement in Education, 16(3), 191-205.

Buehler, K. L. (2002). Standardized group achievement tests and the accommodation of additional time (Doctoral dissertation, Indiana State University, 2001). Dissertation Abstracts International, 63/04, 1312.

Burch, M. (2002). Effects of computer-based test accommodations on the math problem-solving performance of students with and without disabilities (Doctoral dissertation, Vanderbilt University, 2002). Dissertation Abstracts International, 63/03, 902.

Cahalan, C., Mandinach, E., & Camara, W. J. (2002). Predictive validity of SAT I: Reasoning test for test-takers with learning disabilities and extended time accommodations. New York, NY: The College Reporting Board.

Choi, S. W., & Tinker, T. (2002). Evaluating comparability of paper-and-pencil and computer-based assessment in a K-12 setting. Paper presented at the annual meeting of the National Council on Measurement in Education, New Orleans, LA.

Cisar, C. A. (2004). Teacher’s knowledge about accommodations and modifications as they relate to assessment (Doctoral dissertation, Loyola University of Chicago, 2004). Dissertation Abstracts International, 65/10, 3754.

Crawford, L., Helwig, R., & Tindal, G. (2004). Writing performance assessments: How important is extended time? Journal of Learning Disabilities, 37(2), 132-142.

Dempsey, K. M. (2004). The impact of additional time on LSAT scores: Does time really matter? The efficacy of making decisions on a case-by-case basis (Doctoral dissertation, La Salle University, 2004). Dissertation Abstracts International, 64/10, 5212.

Elbaum, B., Arguelles, M. E., Campbell, Y., & Saleh, M. B. (2004). Effects of a student-reads-aloud accommodation on the performance of students with and without learning disabilities on a test of reading comprehension. Exceptionality, 12(2), 71-87.

Elliott, S. N., & Marquart, A. M. (2003). Extended time as an accommodation on a standardized mathematics test: An investigation of its effects on scores and perceived c