Quantitative Methods for Policy Research
Most researchers and academics tend to stick with the research methods they know best, learned mainly in graduate school—even though those methods might not represent current best practices or the most appropriate method. This is why statistician and education researcher Larry Hedges, with the support of a group of distinguished interdisciplinary scholars, launched the Center for Improving Methods for Quantitative Policy Research, or Q-Center, at the Institute for Policy Research. Hedges co-directs the center with social psychologist Thomas D. Cook. Q-Center faculty work on:
• improving designs, analysis, and synthesis in policy research
• designing better research methods for education
• fostering a community of scholars
• developing new data sources and methods of data collection
Overview of Activities
Partial Knowledge and Identification
Charles F. Manski, Board of Trustees Professor in Economics, continues his original work on the difficulties of selecting the best policy with limited knowledge of policy impacts, which he expounded in two books, most recently in Identification for Prediction and Decision (Harvard University Press, 2007).
Response Errors in Survey Research
Manski and Francesca Molinari of Cornell University are also working on nonresponse and response errors in survey research. They hope to improve researchers’ ability to use data from the Health and Retirement Study (HRS) by recognizing that some respondent records are incomplete and possibly error-ridden, and then to extend their findings to general survey research. The two researchers are analyzing HRS data to see whether they can improve assessment of data quality. In particular, they want to learn whether certain types of respondents in large-scale surveys such as the HRS systematically tend to provide inaccurate or incomplete information. Most recently, they addressed the use of skip sequencing, in which respondents are asked a certain question or series of questions only on the basis of their response to a broad opening question. Manski and Molinari consider various predictions of nonresponse and response errors, outlining the situations in which skip sequencing works best. They published their findings in the Annals of Applied Statistics.
Quasi-Experimental Methods and Designs
Thomas D. Cook, Joan and Sarepta Harrison Chair in Ethics and Justice, continues his work on quasi-experimental alternatives to random assignment, focusing mostly on two methods: regression-discontinuity designs and propensity score matching.
Cook, a social psychologist, and IPR graduate research assistant Vivian Wong recently published a paper reviewing whether regression-discontinuity studies reproduce the results of randomized experiments conducted on the same topic. They enumerate the general conditions necessary for a strong test of correspondence in results when an experiment is used to validate any nonexperimental method. They identify three studies where regression-discontinuity and experimental results with overlapping samples were explicitly contrasted. By the criteria of both effect sizes and statistical significance patterns, they then show that each study produced similar results. This correspondence is what theory predicts. That it was achieved in the complex social settings in which these within-study comparisons were carried out, however, suggests that regression-discontinuity results might be more generally robust than some critics contend.
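For readers less familiar with the design, the following is a minimal sketch of a sharp regression-discontinuity estimate. It is not the procedure used in the studies reviewed above. It assumes that units at or above a cutoff on the assignment variable receive the treatment, fits a separate straight line on each side of the cutoff within a chosen bandwidth, and takes the gap between the two fitted lines at the cutoff as the effect estimate. The function names, the linear specification, and the toy data are all illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def sharp_rd_estimate(assignment, outcome, cutoff, bandwidth):
    """Gap between the two fitted lines at the cutoff (hypothetical helper).

    Assumes units with assignment >= cutoff are treated and that each side
    has at least two distinct assignment values inside the bandwidth.
    """
    # Center the assignment variable so each intercept is the fitted
    # outcome exactly at the cutoff.
    below = [(a - cutoff, y) for a, y in zip(assignment, outcome)
             if cutoff - bandwidth <= a < cutoff]
    above = [(a - cutoff, y) for a, y in zip(assignment, outcome)
             if cutoff <= a <= cutoff + bandwidth]
    intercept_below, _ = fit_line(*zip(*below))
    intercept_above, _ = fit_line(*zip(*above))
    return intercept_above - intercept_below


# Toy, noise-free data: a linear trend plus a jump of 3 at a score of 10.
scores = list(range(20))
outcomes = [2.0 + 0.5 * s + (3.0 if s >= 10 else 0.0) for s in scores]
print(sharp_rd_estimate(scores, outcomes, cutoff=10, bandwidth=5))
```

In applied work the bandwidth and functional form are consequential choices, and the estimate applies only to units near the cutoff.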
Cook and IPR postdoctoral fellow Maynee Wong are investigating further potential for regression-discontinuity designs to see if such designs can handle multiple variables in general. They are using recent data from No Child Left Behind (NCLB), a program that uses multiple criteria to select children for remedial education services. In this type of analysis, the estimand is no longer a single point; instead, it becomes an intersection point of several independent variables on a multidimensional plane.
In conjunction with IPR visiting scholar Peter Steiner, Cook is also examining the use of matching as an analytic substitute for randomization. Cook and Steiner demonstrate why propensity score methods—coupled with observational data—can be used to recreate the results of a randomized experiment. They find that the key to reducing bias when faced with the unreliability of predictors is to select the “right” covariates and to make sure those covariates are measured well. In future work, they hope to develop better indicators for which covariates are the “right” ones in various research contexts.
Handbook of Meta-Analysis
Larry Hedges, Board of Trustees Professor of Statistics and Social Policy, has finished editing the second edition of The Handbook of Research Synthesis and Meta-Analysis (Russell Sage Foundation) with Harris Cooper of Duke University and Jeff Valentine of the University of Louisville. Updating the first edition, which became the most-cited reference book in the field, the new edition incorporates state-of-the-art techniques from all quantitative synthesis traditions. Distilling a vast technical literature and many informal sources, the handbook provides a portfolio of the most effective solutions to the problems of quantitative data integration. Among the statistical issues addressed by the authors are the synthesis of non-independent data sets, fixed and random effects methods, the performance of sensitivity analyses and model assessments, and the problem of missing data. In response to the increased use of research synthesis in formulating public policy, the second edition includes several new chapters. One is on the strengths and limitations of research synthesis in policy debates and decisions. Another looks at computing effect sizes and standard errors from clustered data, such as schools or clinics.
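As a rough illustration of the fixed and random effects methods mentioned above, the sketch below pools study effect sizes by inverse-variance weighting and then applies the commonly used DerSimonian-Laird moment estimator of between-study variance. It is not taken from the handbook. The function names and the two-study example are hypothetical, and the sketch assumes independent studies with known sampling variances.

```python
import math


def fixed_effect(effects, variances):
    """Inverse-variance weighted mean and its standard error."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, math.sqrt(1.0 / total)


def random_effects_dl(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird estimate of tau^2.

    Assumes at least two studies; returns (pooled, standard_error, tau2).
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled_fixed, _ = fixed_effect(effects, variances)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - pooled_fixed) ** 2 for w, y in zip(weights, effects))
    k = len(effects)
    c = total - sum(w * w for w in weights) / total
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Add the between-study variance to each study's variance and re-pool.
    pooled, se = fixed_effect(effects, [v + tau2 for v in variances])
    return pooled, se, tau2


# Hypothetical standardized mean differences and their sampling variances.
example_effects = [0.2, 0.4]
example_variances = [0.01, 0.01]
print(fixed_effect(example_effects, example_variances))
print(random_effects_dl(example_effects, example_variances))
```

When the studies disagree more than sampling error alone would explain, the random-effects standard error is larger than the fixed-effect one, reflecting the extra between-study variance.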
Randomization in Education Research
Many researchers believe that randomized experimentation is usually the best methodology for investigating issues in education. However, it is not always feasible. The usually advocated alternative—quasi-experimentation—has recently come under attack from scholars who contrast the results from a randomized experiment and a quasi-experiment on the same topic, where the quasi-experiment shares the same intervention as the experiment. Thus, the quasi-experiment and the experiment are supposed to vary only in whether the control group is randomly formed or not. Cook is critically examining this literature, comprising more than 20 studies. The project receives support from the Institute of Education Sciences (IES) and is part of a larger project examining methods for improving the design, implementation, and analysis of four specific quasi-experimental designs in education: regression discontinuity, case matching on the propensity score, short interrupted time series, and pattern matching. Cook is using Monte Carlo simulations and comparing quasi-experimental results to those from randomized experiments sharing the same treatment group to explore the specific advantages and limitations of each of the four designs.
Reference Values for Intraclass Correlations
Hedges is reanalyzing surveys with nationally representative samples to develop reference values of intraclass correlations. These data can then be used to help plan experiments in education. For example, one study with University of Chicago graduate student Eric Hedberg provides a compilation of intraclass correlation values of academic achievement and related covariate effects that could be used for planning group-randomized experiments in education. This project has funding from the Interagency Educational Research Initiative (IERI). IERI is a collaborative effort of the National Science Foundation, IES, and Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) to support scientific research that investigates the effectiveness of educational interventions in reading, mathematics, and the sciences.
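To illustrate how a reference intraclass correlation might feed into planning, the sketch below computes an approximate minimum detectable effect size for a group-randomized experiment. It is a textbook-style approximation rather than the method of the study described above. It assumes two equally sized arms, equal cluster sizes, no covariates, and a normal approximation in place of the t distribution. The function name and the input values are hypothetical.

```python
import math
from statistics import NormalDist


def minimum_detectable_effect_size(n_clusters, cluster_size, icc,
                                   alpha=0.05, power=0.80):
    """Approximate minimum detectable standardized effect.

    n_clusters is the total number of clusters, split evenly between
    treatment and control (an assumption of this sketch).
    """
    # Variance of the standardized treatment-effect estimate under a
    # balanced two-level design with total outcome variance scaled to 1.
    variance = 4.0 * (icc + (1.0 - icc) / cluster_size) / n_clusters
    normal = NormalDist()
    # Two-sided test at level alpha with the requested power.
    multiplier = normal.inv_cdf(1.0 - alpha / 2.0) + normal.inv_cdf(power)
    return multiplier * math.sqrt(variance)


# Hypothetical plan: 40 schools of 25 students each, two candidate ICCs.
for rho in (0.05, 0.20):
    print(rho, minimum_detectable_effect_size(40, 25, rho))
```

Holding the number of schools fixed, a larger intraclass correlation raises the minimum detectable effect size, which is why plausible reference values matter at the design stage.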
Analysis of Multilevel Methods in Education
In another project supported by IES, Hedges is developing improved statistical methods for analyzing and reporting multilevel experiments in education. He is also working on more efficient designs for such experiments that require the assignment of fewer schools. Such designs should reduce the costs of educational experiments and thus make them more feasible to conduct.
For those designs involving cluster randomization, Hedges has defined three effect sizes and shown how estimates of those effect sizes and their standard errors can be computed from information that is likely to be reported in journal articles. A common mistake in the analysis of cluster-randomized trials is to ignore the effect of clustering and analyze the data as if each treatment group were a simple random sample. Hedges has provided a simple correction to the t-statistic that would be computed if clustering were incorrectly ignored.
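The sketch below shows why ignoring clustering overstates significance. It uses the familiar design-effect approximation, not Hedges's published correction, which also adjusts the degrees of freedom. It assumes equal cluster sizes and a known intraclass correlation. The function names and the example numbers are hypothetical.

```python
import math


def design_effect(cluster_size, icc):
    """Variance inflation from clustering with equal cluster sizes."""
    return 1.0 + (cluster_size - 1) * icc


def approx_adjusted_t(naive_t, cluster_size, icc):
    """Shrink a t-statistic that was computed as if observations were
    independent. Design-effect approximation only (an assumption of this
    sketch), with no degrees-of-freedom adjustment.
    """
    return naive_t / math.sqrt(design_effect(cluster_size, icc))


# Hypothetical example: clusters of 13 students, ICC of 0.25, naive t of 4.0.
print(design_effect(13, 0.25))
print(approx_adjusted_t(4.0, 13, 0.25))
```

Even a modest intraclass correlation can cut a naive t-statistic substantially once clusters are moderately large.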
Correlated Random Coefficient Model
The recent literature on instrumental variables (IV) describes models in which agents sort into treatment status based on gains from treatment as well as on baseline pretreatment levels. Yet the observing economist might not know the components of the gains known and acted on by the agents. Such models are called correlated random coefficient models. Sorting on unobserved components of gains complicates the interpretation of what IV estimates. In work with James Heckman of the University of Chicago, economist Sergio Urzúa tests implications of the hypothesis that agents do not sort into treatment based on gains. The economists develop new tests to gauge the empirical relevance of the correlated random coefficient model and assess whether the additional complications associated with it are required. They also examine the power of the proposed tests and derive a new representation of the variance of the IV estimator for the correlated random coefficient model. Applying their methods to the problem of estimating returns to schooling, they find evidence of sorting into schooling based on unobserved components of gains.
Comparing Instrumental Variables with Structural Models
In a forthcoming article with Heckman, Urzúa compares the economic questions addressed by IV estimators with those addressed by structural approaches. They discuss Marschak’s Maxim—estimators should be selected on the basis of their ability to address well-posed economic problems with minimal assumptions. A key identifying assumption that allows structural methods to be more informative than IV can be tested with data and does not have to be imposed.
Statistical Accuracy
Statistics professor Bruce Spencer’s work focuses on the accuracy of public statistics and the use of statistics to inform and improve social processes and systems. Spencer has completed the first stage of a project on the accuracy of jury verdicts. In a set of 271 cases from four areas, Spencer finds juries gave wrong verdicts in at least one out of eight cases, and judges’ performance was estimated to be not much better. The sample was specialized and is not a basis for generalizations, but Spencer is developing designs for large-scale studies that could lead to a better understanding of the type and prevalence of incorrect verdicts, both false convictions and false acquittals. He also continues to try to quantify invalidity arising from the use of latent class models.
Forecasting for Areas of Human Capital
Additionally, Spencer is working on estimates and forecasts for selected areas of human capital, such as those that categorize U.S. workers employed in science and technology jobs according to skill. Past studies of U.S. educational attainment have tended to focus on differences in averages across groups. This is consistent with most demographic research, which has focused on rates rather than totals. Total numbers of people with certain types of human capital are important for U.S. competitiveness, however. Using the framework for multiregional demography, as described in his and Juha Alho’s 2005 book, Statistical Demography and Forecasting (Springer), Spencer is developing a new model, which can allow for aging and retirement, international movement, and policy effects of improved incentives for attracting and training students. Such work will pull together a set of previously scattered numbers and could aid in better evaluations of U.S. competitiveness and discussions of the future of higher education.
Population Models and Estimation
Alberto Palloni, Board of Trustees Professor in Sociology, continues his work on transmission models for the spread of HIV/AIDS that he pioneered in the early 1990s. Progress in formulating models, methods, and techniques to trace the epidemic’s effects has been fast and impressive, but more work is needed before researchers can use these models to generate robust forecasts on the epidemic’s future course. Palloni is currently developing generalized stable population models that will be useful in estimating HIV/AIDS prevalence in countries with deficient data on infected individuals.
In his work on health and socioeconomic status (SES), Palloni is developing microsimulation models that combine Bayesian averaging of structural equation models with multiple imputation procedures to determine the magnitude of effects of early health on adult SES, health status, and mortality. Bayesian averaging allows researchers to blend forecasts from competing models to establish their combined predictive uncertainty.
Palloni and collaborators are using a set of different techniques to produce robust estimates for cohorts from data that only portray incomplete cohort trajectories. They are using these techniques to capture the effects of early childhood health conditions from 1970 to 2000 on late adult health in the United States.
Additionally, they formulated another new set of techniques to estimate adult mortality completeness, adjusting for age misstatements. These techniques are being used to produce uniformly adjusted data for countries in Latin America from 1850 to 2000. The adjusted data is then being used to analyze changes in mortality and longevity in Latin America and will constitute the raw material for a book on the subject.
Data Centers
Q-Center faculty are involved in two major centers for developing data sources.
The ongoing research agenda of the Data Research and Development Center (DRDC) is to develop and apply research methods for identifying educational interventions that can be scaled up without diminishing their effectiveness. The work involves basic research on the design and analysis of studies for determining if an intervention has been scaled successfully, as well as providing technical assistance to similar studies at the Interagency Education Research Initiative. Hedges leads the project with Barbara Schneider of Michigan State University and Colm O’Muircheartaigh of the University of Chicago, which houses the center. The DRDC receives funding from the National Science Foundation.
Northwestern University is the lead institution of another local consortium behind the Chicago Census Research Data Center. The center, located at the Federal Reserve Bank of Chicago, provides researchers an opportunity to engage in approved projects using Census Bureau microdata. Other consortium members include the Argonne National Laboratory, Federal Reserve Bank of Chicago, University of Chicago, and University of Illinois at Chicago. A grant from the National Science Foundation also supports the center. Spencer has played a leading role in integrating the center at Northwestern and is currently working to make data more readily accessible to area graduate students.
Time-Sharing Experiments
Sociologist Jeremy Freese, with Penny Visser of the University of Chicago, has received a grant from the National Science Foundation for the Time-Sharing Experiments for the Social Sciences (TESS) Web site. TESS is an NSF infrastructure project that offers researchers opportunities to test their experimental ideas on large, diverse, randomly selected subject populations. Investigators submit proposals for experimental studies, and TESS fields selected proposals on a random sample of the U.S. population using the Internet. TESS thereby allows investigators to capture the internal validity of experiments while realizing the benefits of contact with large, diverse populations of research participants.
Promoting the Methodological Community
Hedges and Cook are active in fostering the methodological community at a national level as leading members of the Society for Research on Educational Effectiveness, which held its second national conference in March. Keynote speakers included Cook, Judith Gueron of MDRC, and Grover “Russ” Whitehurst, then director of the Institute of Education Sciences in the Department of Education.
By establishing a network of scholars in education, social policy, and behavioral sciences, the society seeks to advance and disseminate research on the causal effects of educational interventions, practices, programs, and policies. It recently hired Robert Greenwald, who specializes in analyzing the cost effectiveness of educational interventions, as the organization’s new director.
Hedges and Barbara Foorman of the University of Texas Health Center continue to edit the organization’s Journal of Research on Educational Effectiveness. The peer-reviewed journal published four issues in 2008 that covered a variety of subjects, including reporting detail for power analyses, scaling up a pre-K math curriculum, and statistical inference when classroom quality is measured with error. As the society’s director, Greenwald will serve as the journal’s editor.
Hedges and Cook are also founding members of the Society for Research Synthesis Methodology (SRSM), a professional society concerned with statistical methods for evidence-based social and health policy research. Hedges will be installed as president of SRSM for 2009–10 at its third annual conference.
Workshop on Cluster-Randomized Trials
Thanks to a grant from the National Center for Education Research in IES, Hedges and professors Mark Lipsey and David Cordray of Vanderbilt University have launched a summer institute on cluster-randomized trials in education research. Thirty researchers from around the country attended an intensive hands-on training session from July 7 to 17 at Northwestern University.
Topics included describing and quantifying outcomes, specifying conceptual and operational models, basic experimental designs for education studies, sample size and statistical power, and using software like HLM to conduct hierarchical data modeling. Participants also worked on a group project, in which they had to conceptualize and submit a mock funding application for an education experiment.
At the end of the workshop, the participants were joined by the then-director of IES, Grover “Russ” Whitehurst, who flew in from Washington, D.C., to present each participant with a certificate of completion. The institute will be held this summer at Vanderbilt.
Workshop on Quasi-Experimentation
Cook and William Shadish of the University of California, Merced, held weeklong workshops in summer 2008 for 90 educational researchers from universities, contract research firms, and school districts. The two organizers covered the most empirically viable quasi-experimental practices such as regression-discontinuity designs and interrupted time series, highlighting the advantages and disadvantages of using them. They lectured on theory and practice, supplementing their discussions with as many examples as possible from education. They also relied on empirical research that compares the results of randomized experiments to quasi-experiments sharing the same intervention group. The Spencer Foundation supported the workshops.
Training Future Scholars
The Q-Center’s postdoctoral training program is supported by the Institute of Education Sciences. By providing two-year fellowships, the program aims to train postdoctoral fellows in applied education research and produce a new generation of education researchers dedicated to solving the pressing challenges facing the American education system through methodologically rigorous research.
Newly appointed Q-Center postdoctoral fellow Christopher Rhoads is exploring various aspects of clustered experiments, such as the inclusion of heterogeneous treatment effects and implications of “contamination” for experimental design. His other work attempts to develop better methods for dealing with missing data in experiments, procedures for evaluating measures of implementation fidelity, and ways to integrate measures of implementation fidelity into analyses of experiments.
The program’s first postdoctoral fellow, Ezekiel Dixon-Román, started as an assistant professor of sociology at the University of Pennsylvania in fall 2008.