Cleveland Chapter



The goal of this course is to provide a broad introduction to statistical methods for analyzing longitudinal data. The emphasis is on the practical aspects of longitudinal analysis. I begin the course with a review of established methods for longitudinal data analysis when the response of interest is continuous and present a general introduction to linear mixed effects models for continuous responses. Next, I discuss how smoothing and semiparametric regression allow greater flexibility for the form of the relationship between the mean response and covariates. I demonstrate how the mixed model representation of penalized splines makes this extension straightforward. When the response of interest is categorical (e.g., binary or count data), two main extensions of generalized linear models to longitudinal data have been proposed: "marginal models" and "generalized linear mixed models." While both classes account for the within-subject correlation among the repeated measures, they differ in approach. We will highlight the main distinctions between these two types of models and discuss the types of scientific questions addressed by each.
This course reviews the core R language and teaches essentials of R programming to R users at a range of levels. It uses real-world data problems and emphasizes graphical exploration and development. This Cleveland iteration of the course is designed for participants who have some prior experience working with R (roughly at the "advanced beginner" level or higher). The distinction between programming (or scripting) with R and using R is an important one. Most people can use R as a tool for a small number of focused tasks that fit neatly into different boxes. This workshop emphasizes problem solving outside-the-box, where no single function or package is likely to be sufficient. The process is as important as the solution, and this approach to the R language is invaluable in the classroom and in the real world.
This course describes and demonstrates effective strategies for using propensity score methods to address the potential for selection bias in observational studies comparing the effectiveness of treatments or exposures. We review the main analytical techniques associated with propensity score methods (matching, weighting, multivariate regression adjustment and stratification using the propensity score, sensitivity analysis for matched samples) and describe key strategic concerns related to effective propensity score estimation, assessment and display of covariate balance, choice of analytic technique, and communicating results effectively. Although we will focus on established approaches to dealing with design and analytical challenges, we conclude the session by reviewing some literature regarding recent methodological advances in propensity scores and application of propensity score methods to problems in health policy research.
R has become one of the most popular languages for data analysis and graphics. Its ability to create sophisticated, highly customized, publication-quality graphs on multiple platforms is unparalleled.
The first half of this one-day workshop will include an introduction to R (R syntax, common mistakes, using packages) and basic data management (importing, cleaning, reformatting data). The second half of this workshop will focus on data visualization techniques including a practical review of R's major graphing capabilities (including base functionality), as well as new capabilities provided by packages such as ggplot2 (grammar of graphics) to create a wide range of univariate and multivariate graphs.
This workshop and symposium, jointly presented by the Cleveland Chapter of the ASA and the University of Akron Department of Statistics, celebrated the International Year of Statistics. The hands-on workshop covered the basics of R applied to a variety of fields and the basics of Bioconductor for cutting-edge biomedical applications. The symposium explored the evolving role of statistics across society.
This workshop will provide an introduction to structural equation modeling (SEM), a very general technique combining path models with latent (unobserved) variables. SEM-based approaches link conceptual models, path diagrams and regression-style equations together to capture complex and dynamic relationships among a web of variables. Topics will include the advantages of SEM-based approaches, modeling causal relationships, measurement error, model specification, and assessing model fit. The speaker will go through steps of conducting SEM analyses with real data using Mplus, one of the major software packages for SEM.
The Ninth International Conference on Health Policy and Statistics (ICHPS) was held in Cleveland this year on October 5-7. Part of the conference included a full day of workshops by some well-known speakers on a variety of statistical, methodological, and measurement topics. The Cleveland Chapter of the ASA was a cosponsor of ICHPS. These workshops served as an alternative to our usual annual Fall workshop.
The workshop will be held in a computer lab.
The morning session will be an introduction to R, data management using R, graphics in R, and similar topics. The afternoon session will cover advanced topics implemented in R, including multiple linear regression, logistic regression, survival analysis, and resampling.
This course will comprise two components, each lasting about 1.5 hours:
(1) Definitions of terms used in human genetics and an introduction to the goals of genetic epidemiology,
with an explanation of genetic segregation, linkage and association analysis.
(2) An overview of what can be done with, and how to use, the Statistical Analysis for Genetic Epidemiology (S.A.G.E.)
package of software: this will include a description of the purpose of each S.A.G.E. program,
how to format and input data to the S.A.G.E. package, and a brief demonstration of the S.A.G.E. GUI.
(http://darwin.cwru.edu/sage/)
Paul will present methods for calculating sample size for confidence intervals, and sample size and power for hypothesis tests, for means, standard deviations, proportions, counts, regression, correlation and agreement, ANOVA for fixed and random effects, reliability, process capability, and gage error studies. Practical methods using large sample approximations, variable transformations, and the delta method will be emphasized, but exact methods will be noted where the approximate methods fail. Paul will also demonstrate software solutions using Russ Lenth's free Piface program (www.stat.uiowa.edu/~rlenth/Power/), Power and Sample Size (PASS, by NCSS Inc., www.ncss.com), R (www.r-project.org), and MINITAB (www.minitab.com).
This course is designed to provide a friendly, applied, and practical survey of propensity score methods used for dealing with selection bias in observational studies of exposure effects. Propensity score methods are applicable whenever treatment or policy decisions are of interest, and numerous examples and illustrations will be presented from a variety of subject areas drawn from published articles in biostatistics, education, and public health research and from the speaker's experiences working with industrial clients in insurance, market research, and management consulting.
Regularization methods for model building and prediction
are popular in statistics and machine learning. They may be viewed as procedures that
modify the maximum likelihood principle or the principle of empirical risk minimization.
In particular, methods of regularization in reproducing kernel Hilbert spaces provide a
unified framework for nonparametric statistical model building. Examples include
smoothing splines and support vector machines.
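As a minimal sketch of what "modifying empirical risk minimization" can mean, the example below adds a squared penalty to a least-squares criterion for a single slope (a ridge-type penalty). It is a simpler stand-in chosen for illustration, not the reproducing kernel Hilbert space methods named above; the function name `ridge_slope`, the penalty weight `lam`, and the data are all hypothetical.

```python
def ridge_slope(xs, ys, lam):
    """Slope b minimizing sum((y - b*x)**2) + lam * b**2 (no intercept).

    Setting the derivative to zero gives the closed form
    b = sum(x*y) / (sum(x*x) + lam), so lam = 0 is ordinary least squares
    and larger lam shrinks the estimate toward zero.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Illustrative data only (assumed, not taken from the course).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.1, 1.9, 3.2, 3.8]

for lam in (0.0, 1.0, 10.0):
    print(f"lam={lam:5.1f}  slope={ridge_slope(xs, ys, lam):.4f}")
```

The printed slopes decrease as `lam` grows, which is the trade of a little bias for lower variance that regularization makes.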
Screening is the process of using designed experiments and statistical analyses to sift through a
very large number of features, such as factors, genes or compounds, in order to discover the few
features that influence a measured response. In current research, screening methods are actively
being developed for the detection of factors which have a substantial effect on the average
response or response variability in a complex system. In particular, the design and analysis of
supersaturated and group screening experiments have been shown to be effective for this purpose,
and much research has recently been done in this area.
This workshop is designed to provide statisticians, scientists, or engineers
with the tools necessary to begin to use Bayesian inference in applied problems. Participants in the course
will learn the basics of Bayesian modeling and inference using Markov chain Monte Carlo simulation with the
freely available software package WinBUGS.
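For readers who have not seen Markov chain Monte Carlo before, the following is a rough sketch of the idea in plain Python. It is a random-walk Metropolis sampler for a binomial proportion under a flat prior, not WinBUGS code and not necessarily the algorithm WinBUGS would choose; the data (7 successes in 10 trials), the step size, and all function names are assumptions made for illustration.

```python
import math
import random


def log_posterior(theta, successes, trials):
    """Unnormalized log posterior for a binomial proportion, flat Beta(1, 1) prior."""
    if not 0.0 < theta < 1.0:
        return float("-inf")  # zero posterior mass outside (0, 1)
    return successes * math.log(theta) + (trials - successes) * math.log(1.0 - theta)


def metropolis(successes, trials, n_iter=20000, burn_in=2000, step=0.2, seed=1):
    """Random-walk Metropolis sampler; returns the draws kept after burn-in."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, successes, trials)
    draws = []
    for i in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, successes, trials)
        log_ratio = candidate - current
        # Accept with probability min(1, posterior ratio).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            theta, current = proposal, candidate
        if i >= burn_in:
            draws.append(theta)
    return draws


draws = metropolis(successes=7, trials=10)
print("posterior mean estimate:", sum(draws) / len(draws))
```

With a flat prior the exact posterior here is Beta(8, 4), whose mean is 8/12, so the simulated mean can be checked against a known answer.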
This course is designed to provide a friendly, applied and practical survey of propensity score methods used for dealing with selection bias in observational studies of exposure effects. Propensity score methods are applicable whenever treatment or policy decisions are of interest, and numerous examples and illustrations will be presented from a variety of subject areas drawn from published articles in biostatistics, education and public health research and from the speaker's experiences working with industrial clients in insurance, market research and management consulting.
Advances in technology have dramatically increased the amount and quality of data that are recorded in all areas of human endeavor. Thousands of measurements are available nowadays in situations where previously only a few measurements, at given points in time or space, were taken. These measurements allow the reconstruction of the whole profile or "signature". Basically, the profile becomes the unit of analysis. This seminar will discuss two problems with profiles: (1) how to determine if predetermined sets of curves are different, and (2) how to identify clusters in a set of curves. Examples from various fields will be presented. Instructions on the implementation of these techniques in S-Plus and SAS will be given.
Statistical procedures for missing data have vastly improved in the last two decades, yet misconception and unsound practice still abound. In this seminar, we will
This seminar will be based on the book "Experiments: Planning, Analysis, and Parameter Design Optimization" by Jeff Wu and Mike Hamada (2000). Course notes will be made available to attendees. This book contains many new methods not found in existing textbooks, and covers more than 80 data sets and 200 exercises.
The new tools covered include robust parameter design, use of minimum aberration criterion for optimal factor assignment, orthogonal arrays of economic run size, analysis strategies to exploit interactions, experiments for reliability improvement, and analysis of experiments with non-normal responses. Data from real experiments will be used to illustrate concepts. Time will be reserved for questions and discussion.
This is a one-half-day course on permutation methods. It is intended for practicing statisticians and others with interest in applying statistical methods. High school algebra is assumed but no higher-level mathematics is required. Some familiarity with computer simulation would also be helpful. Attendees will be given historical background on resampling methods and a formal introduction to these methods. Emphasis will be placed on the wide variety of applications of the techniques, the computer-intensive nature of implementation along with many examples and "real world" applications. The course is intended for those who use statistical methods in their work. This includes practitioners in medicine, business, engineering and the social sciences. It also will be useful to professors of statistics and those who do statistical research but may not be familiar with resampling methods and want to be updated on the latest advances in methodology and application. Dr. Good is the author of two popular texts on resampling methods. Resampling is a powerful technique which has only recently seen an explosion in applications due to enhancements in computational techniques that make these computer-intensive methods practical.
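To give a flavor of a permutation method, here is a small sketch of an exact two-sample permutation test for a difference in means, written in Python. It is an illustration only and is not drawn from the course materials; the function name and the toy data are hypothetical, and full enumeration is practical only for small samples (larger problems would typically sample random relabelings instead).

```python
from itertools import combinations


def exact_permutation_pvalue(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in group means.

    Every way of relabeling the pooled observations into groups of the
    original sizes is enumerated; the p-value is the fraction of relabelings
    whose absolute mean difference is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(sum(group_a))
    extreme = 0
    n_splits = 0
    for idx in combinations(range(len(pooled)), n_a):
        n_splits += 1
        sum_a = sum(pooled[i] for i in idx)
        # Small tolerance guards against floating-point ties.
        if abs_mean_diff(sum_a) >= observed - 1e-12:
            extreme += 1
    return extreme / n_splits


# Toy data (assumed): two completely separated groups of four.
p = exact_permutation_pvalue([1, 2, 3, 4], [5, 6, 7, 8])
print("exact permutation p-value:", p)
```

For this toy data only 2 of the 70 possible splits (the observed one and its mirror image) are as extreme as what was observed, so the p-value is 2/70.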