NCME Annual Meeting
2009 Annual Meeting and Training Sessions
April 12-16, 2009
San Diego, CA, USA
Program Highlights
Presidential Address
"What I Think I Know"
Mark Reckase
Career Award Recipient Address
"Scores and Scales for Educational Tests"
Moderator: David Frisbie
Presenter: Michael Kolen
Discussant: Michael Kane
Invited Symposia
"Alternate Assessment based on Alternate Achievement Standards: Improving Technical Rigor"
Organizer: Claudia Flowers
Moderator: Martha Thurlow
Presenters: Diane Browder, Scott Marion, Jim Pellegrino, Linda Cook, Marianne Perie, Stanley Rabinowitz
Discussants: Michael Kolen, Suzanne Lane
"Measuring and Evaluating Growth in Student Achievement: A Conversation about Technical and Conceptual Issues"
Organizer/Moderator: Derek Briggs
Presenters: Dale Ballou, Lou Mariano, Damian Betebenner, Derek Briggs, Mark Wilson
Discussants: Michael Kolen, Richard Patz, Frank Rijmen
"Bradley Hanson: The Man Behind the Award and His Legacy as a Psychometrician"
Organizer/Moderator: Jimmy de la Torre
Participants: Deborah Harris, Gary Skaggs, Won-Chan Lee, Jianbin Fu, Xiaohong Gao, Anton Beguin
Discussant: Richard Patz
"Making Test Score Scales and Reports More Understandable and Useful"
Organizer/Moderator: Ronald Hambleton
Participants: Krista Breithaupt, Shelby Haberman, John Hattie, Thanos Patelis, Joe Ryan, Sandip Sinharay, Stephen Sireci, April Zenisky
"New Directions in Test Security and Cheating Detection Research"
Organizer/Moderator: Alan D. Mead
Participants: Siang Chee Chuah, Ben-Roy Do, Fritz Drasgow, John Mattar, Aster Tessema, Dennis Maynes, Alan Mead, Gunnar Schrah, Leanne Buehler, Bobby Baker
Discussants: Gerald Melican, Anthony Zara
"Standard Setting in an Accountability Growth Context: A Process or One-Time Event?"
Organizer/Moderator: Isaac Bejar
Presenters: Michael Kane, Damian Betebenner, Steve Ferrara, Dubravka Svetina, Anne Davidson, Jim Pellegrino, David Abrams
Discussants: Robert Linn, Ronald Hambleton
"Issues in the Use of Automated Essay Scoring in High Stakes Assessments"
Organizer/Moderator: Brent Bridgeman
Participants: David Williamson, Tim Davey, Brent Bridgeman, Catherine Trapani, Karen Lochbaum, John De Jong, Yigal Attali
Discussants: Mark Shermis, Brian Clauser
"Revising our Test Standards" (Co-sponsored with AERA Division D, to be scheduled as an AERA session)
Organizer: Rosemary Reshetar
Participants: Linda Cook, Barbara Plake, Brian Gong, Denny Way, Lauress Wise
Committee-Sponsored Symposia
DIVERSITY ISSUES AND TESTING COMMITTEE
"Large Scale Assessment and Accommodating Students with Disabilities: Past, Present & Future"
Organizer/Moderator: Sara Bolt
Participants: Martha Thurlow, Barbara Plake, Cara Cahalan Laitusis, Sami Kitmitto, Victor Bandeira de Mello, Jerry Tindal
NATIONAL ASSOCIATION OF TEST DIRECTORS
"NCLB at Year 8 in the Assessment of English Language Learners: Taking Stock of the Assessment and Accountability Systems"
Organizer/Moderator: Phil Morse
Participants: Jamal Abedi, David Francis, Rebecca Kopriva
Discussants: Gregory Cizek, Robert Linquanti
GRADUATE STUDENT ISSUES COMMITTEE
"Accurate Assessment of Student Achievement: Today's Challenges and Solutions"
Organizer: Dubravka Svetina
Moderator: Kimberly A. Swygert
Participants: Robert Lissitz, Cornelia Orr, Anne Davidson, John Tanner, Jamal Abedi
Graduate Student Poster
This 12th annual poster session of NCME's Graduate Student Issues Committee provides an opportunity for graduate students to share their work and receive feedback from professionals and their peers.
NCME and AERA Division D Joint Welcome Reception for Current and New Members
This year, we will begin a new tradition for the Annual Meetings of NCME and AERA Division D: the NCME and AERA Division D Joint Welcome Reception for Current and New Members. This reception will replace the NCME No-Host Welcome Reception and the Division D Reception and Business Meeting. Free drinks will be provided for graduate students and new members of AERA Division D and NCME. Please be sure to attend this joint event to meet old friends and welcome new members to both organizations.
NCME Fitness Run/Walk
Thursday, April 16th, 5:35 a.m. - 7:30 a.m.
Organizers: Brian F. French and Jill van den Heuvel
- Run 5k or walk 2.5k course along the waterfront
- Commemorative t-shirts for all participants (even if you don't wake up in time to make it!)
Pre-Conference Training Sessions
The 2009 NCME pre-conference training sessions will be held at the Hard Rock Hotel in San Diego, California on Sunday, April 12 and Monday, April 13, 2009.
Advance registration for the training sessions is strongly encouraged. The only way to register in advance for the training sessions is to use NCME's on-line registration system. To do this, please go to http://www.ncme.org.
Registration on-site will be available only for those training sessions that have not been filled through advance registration.
Refunds of registration fees for the training sessions cannot be made after March 2, 2009.
Please note that Internet connectivity will not be available at the conference and that, where applicable, participants should download the software required prior to the training sessions.
Sunday, April 12, 2009
Developing Noncognitive Assessments
Presenter(s): Patrick Kyllonen, Educational Testing Service; Richard Roberts, Educational Testing Service
Fee: $80
Time: 8:00 a.m. - 5:00 p.m.
Noncognitive qualities are increasingly recognized as important determinants and reflections of success in education from K-12 through graduate and professional school. In this training session we will provide background theory and frameworks for developing noncognitive assessments, and provide hands-on experience in developing and evaluating noncognitive assessments. We will review the major personality models and related noncognitive constructs, discuss methods used to measure noncognitive qualities, demonstrate how to find or to write noncognitive items, present the advantages and disadvantages of different approaches to collecting data, and review strategies for dealing with various validity threats, such as the problem of faking on self-assessments. We will demonstrate analysis approaches, including exploratory and confirmatory factor analysis, and review various uses of noncognitive assessments.
The session will consist of a series of lectures interspersed with examples and empirical findings. Q&A will be encouraged throughout. We will cover the following topics:
- Noncognitive construct frameworks, models, and theories (personality, attitudes, values, beliefs, and other constructs)
- Developing assessments from construct definitions and item pools, including the international personality item pool (IPIP)
- Various methods for assessing noncognitive qualities (self-assessments, others' ratings, situational judgment tests, conditional reasoning, implicit association tests)
- Item writing do's and don'ts
- The problem of faking on self-assessments (preventing, detecting, & correcting for it)
- Delivery platforms (web and paper-and-pencil)
- Exploratory factor analysis and other data structure exploration methods
- Confirmatory factor analysis
- Advanced methods (IRT, latent class models, unfolding models)
- Special topics (rating scale issues [optimal number of points; presence of neutral point, "do not know"], reverse key items)
- Indirect measures (e.g., from school records)
- Example noncognitive assessments (self-help for community college; institutional reporting for K-12; high stakes for graduate school)
Generalizability Theory and Applications
Presenter(s): Robert Brennan, University of Iowa; Xiaohong Gao, ACT, Inc.; Won-Chan Lee, University of Iowa
Fee: $135
Time: 8:00 a.m. - 5:00 p.m.
Generalizability theory liberalizes and extends classical test theory. In particular, generalizability theory enables an investigator to disentangle multiple sources of error through the application of analysis of variance procedures to assess the dependability of measurements.
The primary goals of this training session are to enable participants to understand the basic principles of generalizability theory, to conduct relatively straightforward generalizability analyses, and to interpret and use the results of such analyses. Mathematical and statistical foundations will be treated only minimally. Major emphasis will be placed upon quickly enabling participants to conduct and interpret relatively straightforward generalizability analyses, then more complicated ones. Examples will include various types of performance assessments.
Prerequisites include knowledge equivalent to one course in educational measurement and familiarity with ANOVA at the level treated in introductory graduate courses in education and psychology. A book written by the director and entitled Generalizability Theory will be distributed to participants and used as a principal reference in the training session. Computer programs for performing generalizability analyses will be discussed and illustrated. (Participants need not bring laptops.)
Item Response Theory: Parameter Estimation Techniques
Presenter(s): Seock-Ho Kim, University of Georgia
Fee: $135
Time: 8:00 a.m. - 5:00 p.m.
Theory and methods for the educational and psychological measurement of latent variables using item response theory methodology are discussed. The one-parameter logistic or Rasch, the two-parameter logistic, and Birnbaum's three-parameter logistic models for dichotomously scored item response data will be reviewed from a theoretical viewpoint with an emphasis on the various estimation techniques of the model parameters. Applications of these models to practical measurement situations will be studied using item response theory computer programs. Topics of the course consist of item calibration, scoring, information, and some applications to instrument construction. Models for polytomously scored items are briefly discussed.
Prerequisites include knowledge equivalent to one graduate course in theoretical educational measurement and familiarity with differential and integral calculus treated in undergraduate mathematics courses. A book coauthored by the director with Frank B. Baker entitled Item Response Theory: Parameter Estimation Techniques will be distributed to participants and used as a principal reference in the training session. Computer programs for performing item response theory analyses will be discussed and illustrated. Participants are encouraged to bring their own laptop computers.
The intended audience is principally upper-level graduate students and new measurement professionals who are interested in learning about the various parameter estimation techniques in the context of unidimensional item response theory models.
Using R for Everyday Research
Presenter(s): Brian Habing, University of South Carolina; Jessalyn Smith, University of South Carolina
Fee: $40
Time: 8:00 a.m. - Noon
The free statistics package R has become a favorite of statisticians over the past decade - and it offers a large number of benefits to quantitative researchers in all areas of educational research. With you working along through each step on your own laptop computer, this training course will cover some of the most useful aspects of R for any researcher, including: making fully customized graphs (including color, axes, and labels); manipulating data sets in an intuitive way to quickly get the precise subset of subjects and variables that you want; and performing statistical analyses with a single command. The course will end with basic examples of how R can be used to simulate data sets (with an example perfect for classroom use) and how it can be easily customized to perform functions that aren't built in.
This course is designed for those who have had a two-course sequence in quantitative methods but have no previous experience with R. Participants must bring their own (Windows-compatible) laptop computer; all required software will be provided.
Quality Control in Test Development, Scoring, and Reporting of Test Scores
Presenter(s): Avi Allalouf, National Institute for Testing and Evaluation; Ruth Fortus, National Institute for Testing and Evaluation
Fee: $65
Time: 8:00 a.m. - Noon
Testing in educational and psychological measurement involves a number of important stages, each depending greatly upon the previous one: test development, test scoring, test analysis and score reporting. This training session deals with quality control procedures for these stages.
Quality control procedures are required in order to monitor the testing process and to keep the number of mistakes to a minimum. Mistakes in scoring, for example, can lead to legal action against the testing agency or the educational institution; a high incidence of mistakes in items will have an adverse impact on test reliability and validity.
Professional practitioners should be aware of possible mistakes that can occur during test development, test scoring, test analysis and the reporting of scores. They should act in accordance with up-to-date standards and have a broad knowledge of quality control practices, as these are critical in the never-ending fight against errors. This session is intended to increase accuracy in test measurement.
In the session, mistakes that might occur at each stage will be presented, followed by examples and quality control procedures for avoiding, detecting or correcting these mistakes. Many of the quality control procedures discussed are also relevant for internet-delivered and internet-scored testing.
The session will also touch on models that deal with the causes, prediction and reduction of human error.
The workshop will be potentially useful for people who are involved in:
- test development
- test administration
- scoring tests
- item and test analysis (including test norming and equating)
- maintaining test security
- reporting test results and providing feedback to people who have been tested
- policy-making and legislation
The workshop will consist of short modules, each accompanied by real-world examples. Participants will be given hands-on practice in detecting various types of errors. The workshop content is based upon experience gained by the presenters from their work at NITE, and upon an ongoing project of developing quality control guidelines for the ITC (International Test Commission).
Linking and Aligning Scores and Scales
Presenter(s): Neil Dorans, Educational Testing Service; Jinghua Liu, Educational Testing Service; Mary Pommerich, Defense Manpower Data Center; Michael Walker, Educational Testing Service
Fee: $110
Time: 8:00 a.m. - Noon
The communication of linking issues to test score users is a critical component to ensuring the validity of a linkage. This training session seeks to facilitate communication about the appropriate use and interpretation of linked scores by emphasizing the different meanings that can be attached to different linkages, and the necessary requirements to achieve solid linkages. A foundations portion will present a historical perspective on score linking, provide definitions and distinctions between types of linkages, discuss relevant data collection designs, and give an overview of linking methodology and assumptions. A linking scenarios portion will make expanded distinctions between types of linkages and discuss practical issues, using real world examples. Topics will be equating, tests in transition, concordance, vertical scaling, and linking group assessments to individual assessments. A tools portion will discuss indices that can be used to choose an appropriate linkage type and methods that can be used to evaluate linkage quality. A score interpretation portion will focus on the appropriate usage and interpretation of linked scores, comparing and contrasting across the different linking scenarios.
A book written by the presenter and entitled Linking and Aligning Scores and Scales will be distributed to participants.
Managing Simulation Studies with R
Presenter(s): Brian Habing, University of South Carolina; Jessalyn Smith, University of South Carolina
Fee: $40
Time: 1:00 p.m. - 5:00 p.m.
Simulation studies to validate various procedures' effectiveness are a major part of quantitative and psychometric research. The R statistical package can be used to easily run and manage simulation studies, including those that need to call pre-existing programs such as BILOG, MPlus, NOHARM, PARSCALE, POLYEQUATE, and TESTFACT. This course will guide the participants through using R to easily generate and manipulate a wide variety of data sets, create the command and data files required by other software, run the other software, and read in the output for further analysis.
This course assumes that the participants have at least some familiarity with R - programming experience is not assumed. Participants should bring their own (Windows-compatible) laptop computer and any executables that they need to integrate into their own simulation studies. Copies of R, NOHARM, and POLYEQUATE will be provided.
A Nonlinear Mixed Models Approach to IRT
Presenter(s): Paul De Boeck, KU Leuven; Frank Rijmen, Educational Testing Service; Francis Tuerlinckx, KU Leuven; Mark Wilson, UC Berkeley
Fee: $65
Time: 1:00 p.m. - 5:00 p.m.
The central message of the introduction is that it is beneficial to see IRT models as extensions of generalized linear regression models that seek to model facets of the measurement situation: These facets are most typically persons and items, but the set may be extended to incorporate other facets such as raters, and may also be re-labelled to suit particular applications. While the link function and the random component of the regression model remain the same, the most interesting part of the extension concerns the structural part of the model: (1) the kind of predictive function (linear or nonlinear, e.g. bilinear), (2) the effects (weights) of the predictors (fixed effects or random effects).
Starting from some well-known IRT models, other and less well-known models will be framed in this approach, based on a volume published by Springer: "Explanatory Item Response Models: A generalized linear and nonlinear approach" (De Boeck & Wilson, 2004). We will illustrate how the models can be estimated with the SAS procedure NLMIXED.
The workshop will consist of two parts. In the first part, the explanatory item response framework will be presented, and it will be explained how the framework fits within the family of generalized linear and nonlinear mixed models. Specific attention will be devoted to the distinction between descriptive and explanatory item response models, and the distinction between fixed and random effects. It will be shown how well-known item response models fit within this framework. In addition, the framework naturally leads to new item response models, such as models with both random item and random person effects.
In the second part, an in-depth account will be given of multidimensional item response models, and models for polytomous data. Again, both families of models can be conceptualized as generalized linear and nonlinear mixed models, and doing so naturally leads to model extensions that may be of interest to the applied researcher. In this part, some attention will be devoted to model estimation as well. We will also emphasize random item concepts and models.
Throughout, the models are illustrated with datasets on anger and verbal aggression.
Monday, April 13, 2009
An Introduction to Student Growth Percentiles: Concepts, Estimation and Use
Presenter(s): Damian Betebenner, Center for Assessment; Jinnie Choi, University of California, Berkeley; Hi Shin Shim, Georgia Tech; Dianne Lefly, Colorado Department of Education; Marie Huchton, University of Colorado, Boulder
Fee: $80
Time: 8:00 a.m. - 5:00 p.m.
The proliferation of annual student testing during the last decade has left states and testing organizations with vast amounts of longitudinal assessment data and few sophisticated means to analyze these multiyear data sets. As a consequence, use of growth analyses to inform discussions about student growth and its relationship to education quality has been limited. In this training session, participants will be introduced to student growth percentiles and shown how to use the open source R software package to calculate student growth percentiles and percentile growth trajectories with large (e.g., state-level) longitudinal datasets. Topics covered will include a conceptual overview of student growth percentiles, data preparation, student growth percentile calculation, percentile growth trajectory calculation and their use with growth standard setting. The session will incorporate real-world examples of how the results of such analyses can be used as part of state and federal accountability systems to inform discussions about educational quality.
Applying Hierarchical Models to Causal Inference
Presenter(s): Guanglei Hong, OISE / University of Toronto; Stephen Raudenbush, University of Chicago
Fee: $80
Time: 8:00 a.m. - 5:00 p.m.
In this training session we will introduce recent developments in causal inference concepts and methods for evaluating educational policy and program effects in multi-level settings when randomized experiments are infeasible. We teach hierarchical linear and nonlinear models in combination with propensity score-based methods for causal effect estimation. Education examples will be used throughout in lecture, discussion, and hands-on practice. The session is intended for researchers interested in investigating the effectiveness of educational policies, intervention programs, and various educational practices. After presenting the basics of hierarchical models and of causal inference, we use examples to illustrate (1) how to conceptualize, in terms of potential outcomes, the causal effects of educational interventions carried out in a multi-level school system, (2) how to identify and summarize information about selection bias from multiple sources through analyzing logistic regression models or hierarchical generalized linear models, (3) how to stratify sample data on the basis of the estimated propensity score, (4) how to use hierarchical models to statistically adjust for the selection bias in multi-level data, (5) how to make explicit statistical assumptions, and (6) how to assess the consequences of possible unmeasured confounders. Participants will practice the procedure of causal effect estimation using HLM version 6 along with SPSS 15.0. Participants are expected to bring a laptop computer with SPSS and HLM standard version or trial edition installed. The standard version or the free 15-day trial edition of the HLM 6 software is available at http://www.ssicentral.com/hlm/downloads.html.
Bayesian Networks in Educational Assessment
Presenter(s): Duanli Yan, Educational Testing Service; Russell Almond, Educational Testing Service; Robert Mislevy, University of Maryland; David Williamson, Educational Testing Service
Fee: $80
Time: 8:00 a.m. - 5:00 p.m.
The Bayesian paradigm provides a convenient mathematical system for reasoning about evidence. Bayesian networks provide a graphical language for describing complex systems and reasoning about evidence in complex models. This allows assessment designers to build scoring models that have fidelity to cognitive theories about the domain and yet are mathematically tractable and can be refined with observational data. Topics covered in this tutorial are evidence-centered assessment design, basic Bayesian network representations and computations, available software for manipulating Bayesian networks, refining Bayesian networks using data, and example systems using Bayesian networks. It is recommended that participants bring a laptop to run sample exercises using the student version of Netica (http://www.norsys.com/).
Skills Diagnosis with Latent Variable Models
Presenter(s): Jimmy de la Torre, Rutgers University; Robert Henson, University of North Carolina at Greensboro; Jonathan Templin, University of Georgia
Fee: $85
Time: 8:00 a.m. - Noon
The primary aim of skills diagnosis is to develop and analyze tests in ways that reveal information with more diagnostic value than traditional approaches provide. In the methods for skills diagnosis that we consider, mastery of particular skills or states of knowledge can be represented by a list of binary latent variables, indicating mastery of each of a finite set of skills under diagnosis. The main objective of skills diagnosis is to classify examinees according to this list of skills. In this training session, several popular modeling and classification approaches will be discussed. Three conjunctive latent class models known as the DINA, NIDA, and Fusion models will be introduced, and software for fitting these models with Mplus will be demonstrated. The training session is meant to provide practical guidelines for implementing skills diagnosis, and considers essential topics such as construction of fixed-length tests, identifying the attributes measured by items, and model-data fit.
The intended audience for this training session includes anyone interested in cognitive or skills diagnosis who has some familiarity with item response theory or classical test theory. No previous knowledge of latent class models or cognitive diagnosis is required. The material will be useful for faculty and students specializing in educational testing, as well as testing professionals working in government or private testing organizations.
The objective of this training session is to provide a short course in some of the most common methods of latent variable modeling that are being applied in cognitive and skills diagnosis. The emphasis is on education as well as training with a particular piece of software. By the end of this session, participants should have a basic understanding of general latent class models, conjunctive latent class models tailored to cognitive diagnosis, methods for constructing exams, and evaluation of goodness of fit. There will also be a discussion of identifying skills on an exam, and construction of exams when diagnosis is the primary objective.
Vertical Scaling Methodologies, Applications, and Research
Presenter(s): Michael Kolen, University of Iowa; Ye Tong, Pearson
Fee: $65
Time: 8:00 a.m. - Noon
The potential need for constructing a vertical scale arises whenever a testing program has multiple grade levels and wishes to have a common scale to compare test scores across these grade levels. Vertical scaling uses a statistical process to place test scores that measure similar content domains but at different educational levels onto a common scale. The goals of the session are for attendees to be able to understand the principles of vertical scaling, to conduct vertical scaling and to interpret the results of vertical scaling in reasonable ways. Vertical scaling will be contrasted with related equating and linking processes. Traditional and IRT vertical linking methodologies will be described and practical issues will be discussed.
The focus is on developing a conceptual understanding of vertical scaling through numerical examples and discussion of practical issues. The importance of vertical scaling and the challenges related to it will be included. The text for the session is a chapter in the second edition of Kolen and Brennan's (2004) Test Equating, Scaling, and Linking: Methods and Practices. The session is designed for upper-level graduate students, new Ph.D.'s, testing professionals with operational or oversight responsibility for vertical scaling, and others with interest in learning about vertical scaling methods and practices. Participants should have at least two graduate courses in measurement and two graduate courses in statistics.
Development and Use of Innovative Item Types in Computer-based Testing
Presenter(s): Kathleen Scalise, University of Oregon; Mark Wilson, University of California, Berkeley
Fee: $65
Time: 8:00 a.m. - Noon
One potential limitation for realizing the benefits of computer-based assessment (CBT) in both instructional assessment and large scale testing comes in designing questions and tasks with which computers can effectively interface (i.e., for scoring and score reporting purposes) while still gathering meaningful measurement evidence. This workshop will allow participants to explore introducing some innovative item types into their assessment content. A taxonomy of 28 innovative item types in computer-based assessment will be introduced. These item types have responses that fall somewhere between fully constrained responses (i.e., the conventional multiple-choice question), which can be too limiting to tap much of the potential of new information technologies, and fully constructed responses (i.e., the traditional essay), which can be a challenge for computers to meaningfully analyze. Participants will bring example items to the workshop or be provided with examples, work hands-on to convert to innovative types through a variety of content approaches, investigate and implement automated scoring options for their selected types, and finish the workshop with modeling practices for collection of high quality assessment evidence, in a CBT interface using IRT.
Building and Documenting a Valid Assessment System for Students with Disabilities
Presenter(s): Karen Barton and Lara Osleson, CTB/McGraw-Hill
Invited Speaker: Dianne Lefly, Colorado Department of Education
Fee: $65
Time: 1:00 p.m. - 5:00 p.m.
This course is intended for psychometricians, researchers, state Departments of Education personnel, and test development experts who wish to design, build, and document in technical format reliable, valid, defensible assessments, particularly alternate and modified assessments for students with disabilities. Topics range from assessment policy, design, and development to appropriate statistical design and analyses, special studies, and technical documentation. The session will provide the audience with sound psychometric tools and practices to assure alternate (as well as modified and general) assessments can meet high standards of technical adequacy with practical tips and solutions for documenting evidence in a legally defensible manner. In particular, this session will focus on building validity evidence.
Participants will be guided through each step in designing and building a valid and defensible assessment, with approaches to collecting appropriate validity evidence linked to the Standards (AERA, NCME, APA) and Critical Elements. Parallels and distinctions will be made between alternate assessments and both modified and general assessments. Invited speakers will discuss modified and alternate approaches from a state perspective.
Cognitive Assessment: An Introduction to the Rule Space and Q-Matrix Method
Presenter(s): Kikumi Tatsuoka, Columbia University; Anabelle Guerrero, University of Costa Rica; Enis Dogan, American Institutes for Research
Fee: $110
Time: 1:00 p.m. - 5:00 p.m.
This book introduces a new methodology that allows for the analysis of test results that is free from ambiguous interpretations and demonstrates an individual's true state of knowledge. Measuring the underlying knowledge and cognitive skills is not an easy task because it is impossible to directly observe them; therefore, they are named "latent variables". However, the latent variables useful in cognitive diagnosis must number in the hundreds, not just one variable like the "θ" ability variable in Item Response Theory. To achieve these difficult goals, we need a new methodology that will transform many unobservable knowledge and skills variables (defined as "attributes" throughout the book) into observable and measurable attributes without losing their original meanings.
The purpose of this book is to introduce one such methodology, Rule Space, that has been used since the 1980s and has made it possible to measure these unobservable latent variables and to clearly interpret the results, without losing the original meaning of attributes. The Rule Space Method (RSM) transforms unobservable attributes involved in test items into observable attribute mastery probabilities that are defined as the probability of using each attribute correctly to get the correct answer for given problems. In other words, RSM converts students' item response patterns into the attribute mastery probabilities. RSM, which can determine an individual's strengths and weaknesses, has been applied to the PSAT to generate scoring reports, which inform schools, teachers and parents exactly what the total score of 500 means. Since RSM belongs to an approach of statistical pattern recognition and classification problems popular in engineering areas, this book will be useful to graduate students in a variety of disciplines. The book has ten chapters, but in this training session emphasis is given to Q-matrix theory, Rule Space classification, and attribute reliability and validity theory. Inquiries about this session should be sent to kumitats@yahoo.com.
Technical Aspects of School Accountability
Presenter(s): Huynh Huynh, University of South Carolina; Robert Kennedy, University of Arkansas for Medical Sciences; Charity Smith, Arkansas Department of Education
Fee: $65
Time: 1:00 p.m. - 5:00 p.m.
The purpose of this training session is to introduce recent technical developments regarding school accountability. The technical issues concern creating a school index based on test data, assessing reliability and conditional standard error for the index, setting standards via school-descriptor and bookmark processes, and assessing the reliability and validity of school classifications. Using the Arkansas Act 35 school accountability system as a case study, participants will be guided through the development and operation of the index for school performance (status) and the index of school growth or improvement gain. Handouts given include two technical documents, one for school performance and the other for school growth. Participants are expected to have basic knowledge of applied statistics and technical aspects of assessment, and some awareness of operational and legal issues relating to school accountability.
Tips for Graduate Students: Advice for Finishing School, Obtaining a Job, and Starting a Career
Presenter(s): Deborah Harris, ACT, Inc.; Julio Sanclemente, CTB/McGraw-Hill; Andrew Ho, University of Iowa
Fee: $15
Time: 1:00 p.m. - 5:00 p.m.

