2009 training series
The 2009 training series includes webinars on microarray data
analysis, survival models, analysis of stratified binary data,
genomic data analysis, predictive biomarkers in clinical trials,
and hierarchical Bayes methods.
Microarray data analysis
Presented by Dr. Dhammika Amaratunga (Johnson
and Johnson) and Dr. Javier Cabrera (Rutgers University) on March
11, 2009 (noon-2:00 Eastern time).
This webinar will give a broad survey of
key topics related to microarray data analysis. A brief introduction
will outline a typical microarray experiment and the preprocessing
and quality assessment steps. This will be followed by the main
body of the session, which will cover:
(1) Statistical methods for detecting differentially expressed genes, and
(2) Functional analysis of genes for detecting significant gene sets.
Concepts and methods will be discussed and illustrated with real data examples.
The authors' implementation of the methods
described may be found at their web sites.
Other implementations are available through Bioconductor.
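As a minimal illustration of topic (1), detecting differentially expressed genes, the sketch below computes a per-gene Welch t statistic with permutation p-values on a tiny made-up expression matrix. The gene names and values are hypothetical, and a production analysis would use the authors' software or Bioconductor packages rather than this hand-rolled code:

```python
import math
import random

def welch_t(x, y):
    """Welch two-sample t statistic (unequal variances allowed)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    return (mx - my) / math.sqrt(vx / nx + vy / ny)

def permutation_p(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value: shuffle group labels and count
    how often the permuted |t| meets or exceeds the observed |t|."""
    rng = random.Random(seed)
    obs = abs(welch_t(x, y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(welch_t(pooled[:len(x)], pooled[len(x):])) >= obs:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy expression matrix: 3 genes, 4 control and 4 treated arrays.
control = {"geneA": [5.1, 5.3, 4.9, 5.0],
           "geneB": [2.0, 2.2, 1.9, 2.1],
           "geneC": [7.0, 6.8, 7.2, 7.1]}
treated = {"geneA": [5.2, 5.0, 5.1, 4.8],   # unchanged
           "geneB": [3.4, 3.6, 3.3, 3.5],   # up-regulated
           "geneC": [7.1, 6.9, 7.0, 7.2]}   # unchanged

for gene in control:
    t = welch_t(control[gene], treated[gene])
    p = permutation_p(control[gene], treated[gene])
    print(gene, round(t, 2), round(p, 3))
```

Only geneB, the deliberately shifted gene, should produce a large |t| and a small permutation p-value here.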
Some of the material covered will be based
on the authors' text: D. Amaratunga and J. Cabrera (2004): Exploration
and Analysis of DNA Microarray and Protein Array Data. New York: Wiley.
Case study in parametric survival modeling
Presented by Prof. Frank E. Harrell, Jr (Department
of Biostatistics Vanderbilt University School of Medicine) on
April 3, 2009 (noon-2:00 Eastern time).
In this webinar, advantages of parametric
survival modeling are discussed, contrasting with the Cox semiparametric
proportional hazards model. Common parametric models such as exponential,
Weibull, and log-normal will be reviewed. Then a comprehensive
case study of the development of a log-normal multivariable survival
model will be presented. Covariate effects are modeled flexibly
without assuming linearity, model assumptions are checked, and
the model is interpreted by a variety of graphical devices.
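The log-normal model in the case study requires numerical optimization, but the simplest of the parametric models mentioned, the exponential, has a closed-form maximum likelihood fit that makes the idea concrete. The sketch below fits it to a small hypothetical right-censored data set (times and event indicators invented for illustration):

```python
import math

# Hypothetical right-censored survival data: (time, event) pairs,
# event = 1 if the failure was observed, 0 if censored.
data = [(2.0, 1), (5.0, 0), (1.5, 1), (8.0, 0),
        (3.2, 1), (6.0, 1), (4.5, 0), (7.1, 1)]

events = sum(e for _, e in data)
exposure = sum(t for t, _ in data)

# Exponential model: constant hazard lambda. Its MLE has the closed
# form lambda_hat = (# observed events) / (total follow-up time).
lam = events / exposure
print(f"lambda_hat = {lam:.3f}")

def surv(t, lam=lam):
    """Fitted exponential survivor function S(t) = exp(-lambda * t)."""
    return math.exp(-lam * t)

print(f"S(5) = {surv(5.0):.3f}")
```

The Weibull and log-normal generalizations relax the constant-hazard assumption at the cost of losing this closed form.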
Harrell, F.E. Regression Modeling Strategies: With Applications to Linear
Models, Logistic Regression, and Survival Analysis. Springer, New York, 2001.
Another look at stratified analysis of binary data
Presented by Dr. Christy Chuang-Stein (Pfizer
Inc) on May 7, 2009 (noon-2:00 Eastern time).
In this webinar, we will look at the two
classic weighting choices to combine binary data from multiple
strata. The two choices are inverse-variance weighting and
Cochran-Mantel-Haenszel weighting. The former is popular among
meta-analysts while the
latter is frequently used by statisticians. We will look at the
implications of these two choices under different treatment effect
scenarios. In addition to stratified analyses of efficacy data,
we will also examine the situation where safety data are pooled
from multiple studies to create an integrated safety summary.
Experience has shown us that integration of safety data is vulnerable
to the mischief of Simpson's Paradox. We will show that Simpson's
Paradox not only affects the inferential comparison between treatment
groups with respect to adverse event proportions, it also affects
the estimation of the proportions. We will discuss a proposal
to adjust proportions when reporting proportions is necessary,
as is the current practice in a product's package insert. We will
conclude the webinar with some practical recommendations.
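To make the two weighting choices and the pooling hazard concrete, here is a small sketch using hypothetical stratified counts (loosely patterned on the classic kidney-stone illustration of Simpson's Paradox). Both stratified estimates of the risk difference favor treatment, while naive pooling reverses the sign:

```python
# Hypothetical stratified 2x2 data: (events_trt, n_trt, events_ctl, n_ctl).
strata = [(81, 87, 234, 270),    # stratum 1
          (192, 263, 55, 80)]    # stratum 2

def risk_diff(et, nt, ec, nc):
    """Risk difference: treatment event rate minus control event rate."""
    return et / nt - ec / nc

def weighted_rd(strata, weight):
    """Weighted average of per-stratum risk differences."""
    num = den = 0.0
    for et, nt, ec, nc in strata:
        w = weight(et, nt, ec, nc)
        num += w * risk_diff(et, nt, ec, nc)
        den += w
    return num / den

def iv_weight(et, nt, ec, nc):
    """Inverse-variance weight (the meta-analysis convention)."""
    pt, pc = et / nt, ec / nc
    return 1.0 / (pt * (1 - pt) / nt + pc * (1 - pc) / nc)

def cmh_weight(et, nt, ec, nc):
    """Cochran-Mantel-Haenszel weight for a risk difference."""
    return nt * nc / (nt + nc)

print("IV  RD:", round(weighted_rd(strata, iv_weight), 3))
print("CMH RD:", round(weighted_rd(strata, cmh_weight), 3))

# Naive pooling across strata flips the sign: Simpson's Paradox.
et = sum(s[0] for s in strata); nt = sum(s[1] for s in strata)
ec = sum(s[2] for s in strata); nc = sum(s[3] for s in strata)
print("Pooled RD:", round(risk_diff(et, nt, ec, nc), 3))
```

Here both the inverse-variance and CMH estimates are positive, yet the pooled risk difference is negative, which is exactly the integrated-safety-summary trap the abstract describes.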
Design and analysis of count data
Presented by Dr. Mani Lakshminarayanan (Merck)
on June 18, 2009 (noon-2:00 Eastern time).
Large-scale significance testing of genomic data
Presented by Prof. John Storey (Princeton
University) on July 23, 2009 (noon-2:00 Eastern time).
The presenter will discuss recent advances
in performing many hypothesis tests in the context of genomics
data. This will include discussion on the false discovery rate,
accounting for latent structure, and borrowing information across
variables to increase power.
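The false discovery rate controls the expected fraction of false positives among the rejected hypotheses, which scales far better to thousands of tests than familywise error control. A minimal sketch of the standard Benjamini-Hochberg step-up procedure, with invented p-values:

```python
def benjamini_hochberg(pvals, alpha=0.10):
    """Return the indices of hypotheses rejected at FDR level alpha
    by the Benjamini-Hochberg step-up procedure."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        # Compare the rank-th smallest p-value to its step-up threshold.
        if pvals[i] <= rank * alpha / m:
            k = rank            # largest rank passing the threshold
    return sorted(order[:k])    # reject everything up to that rank

# Toy p-values: the first three are "signals", the rest are nulls.
p = [0.001, 0.004, 0.008, 0.30, 0.55, 0.62, 0.71, 0.88, 0.93, 0.97]
print(benjamini_hochberg(p, alpha=0.10))   # -> [0, 1, 2]
```

The refinements the talk covers, such as q-value estimation and adjusting for latent structure, build on this basic procedure.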
Genomic data analysis with targeted maximum
likelihood and super learning
Presented by Prof. Mark van der Laan (UC
Berkeley) on August 24, 2009 (noon-2:00 Eastern time).
Current statistical practice to assess an
effect of an intervention or exposure on an outcome of interest
often involves either maximum likelihood estimation for an a priori
specified regression model, or manual and/or data-adaptive interventions
to fine-tune a choice of model. In both cases, bias in the point
estimates and in the estimate of the signal-to-noise ratio is rampant,
causing an epidemic of false claims based on data analyses.
In this talk we present our efforts to construct machine learning
algorithms for estimating a causal or adjusted effect that take
away the need for specifying regression models, while still providing
maximum likelihood based estimators and inference. Two fundamental
concepts underlying this methodology are super learning, i.e.,
the very aggressive use of cross-validation to select optimal
combinations of many model fits, and subsequent targeted maximum
likelihood estimation to target the fit towards the causal effect
of interest. Our maximally unbiased and efficient estimates are
accompanied by statistical inference. In addition, multiple
testing methods are employed in case one pursues effect estimation
across a large set of variables.
We illustrate this method in observational studies for assessing
the effect of mutations in HIV that cause resistance
to a particular drug regimen. We also illustrate the performance
for assessing the effect on the outcome or response to treatment
of single nucleotide polymorphisms and gene-expressions in genomic
studies, including randomized trials. In particular, we demonstrate
the performance of the super learning in prediction.
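The simplest version of this idea, sometimes called the discrete super learner, uses cross-validation to pick the best performer from a library of candidate learners. The sketch below is a deliberately tiny illustration with a two-learner library (a constant-mean fit and a simple linear fit) on simulated data; it is not the van der Laan software, and all names and data are hypothetical:

```python
import random

random.seed(1)
# Hypothetical data: outcome depends linearly on x plus noise.
x = [i / 10 for i in range(40)]
y = [2.0 + 0.5 * xi + random.gauss(0, 0.3) for xi in x]

def fit_mean(xs, ys):
    """Candidate 1: predict the training mean everywhere."""
    m = sum(ys) / len(ys)
    return lambda t: m

def fit_linear(xs, ys):
    """Candidate 2: ordinary least-squares simple linear regression."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((a - mx) * (c - my) for a, c in zip(xs, ys))
         / sum((a - mx) ** 2 for a in xs))
    a0 = my - b * mx
    return lambda t: a0 + b * t

def cv_mse(fit, xs, ys, folds=5):
    """V-fold cross-validated mean squared prediction error."""
    n = len(xs)
    err, cnt = 0.0, 0
    for v in range(folds):
        test = list(range(v, n, folds))
        train = [i for i in range(n) if i % folds != v]
        f = fit([xs[i] for i in train], [ys[i] for i in train])
        err += sum((ys[i] - f(xs[i])) ** 2 for i in test)
        cnt += len(test)
    return err / cnt

library = {"mean": fit_mean, "linear": fit_linear}
scores = {name: cv_mse(fit, x, y) for name, fit in library.items()}
winner = min(scores, key=scores.get)
print(scores, "->", winner)
```

The full super learner goes further by taking an optimal convex combination of the library fits rather than simply selecting the single cross-validated winner.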
Use of genomics and predictive biomarkers in the design and analysis of Phase III clinical trials
Presented by Dr. Richard Simon (National Cancer Institute) on September 18, 2009 (noon-2:00 Eastern time).
Current methods for the design and analysis of phase III clinical trials often result in the approval and use of drugs in broad populations of patients, many of whom do not benefit. This has serious limitations for patients and for health care economics. Current methods are also problematic for the development of molecularly targeted drugs, which are expected to benefit only a subset of traditionally diagnosed patients. New paradigms for the design and analysis of clinical trials are needed for the new era of genomic technologies for characterizing diseases and for evaluation of molecularly targeted therapeutics.
We will focus on recent developments in the prospective use of predictive biomarkers in the design and analysis of phase III therapeutic clinical trials. The presentation will not be about exploratory analysis of data from clinical trials, but rather on the use of genomic biomarkers in the design and analysis in a sufficiently structured and prospective manner that the conclusions about treatment effects and how they relate to biomarker specified subsets have the degree of reliability normally associated with phase III clinical trials. We will cover the targeted "enrichment" design in which a classifier test result is used as an eligibility criterion. The efficiency of that design and how it depends on the specificity of treatment effect and test performance characteristics will be discussed, as well as limitations of that design. We will discuss "stratified designs" in which the test result is not used to restrict eligibility but as part of the primary analysis plan of the trial. Specific analysis plans and sample size considerations will be discussed. Both the enrichment design and stratification design require that the classifier be completely specified and analytically validated prior to the start of the pivotal trial. We will discuss various approaches to easing that requirement including the "adaptive biomarker threshold" design and the "adaptive signature" design. Recent extensions of those design concepts will also be described.
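The efficiency argument for the enrichment design can be sketched with a simplified normal-approximation sample-size calculation. This is not Simon's actual derivation, just an illustrative toy: assume a continuous endpoint, a treatment effect delta that exists only in the marker-positive subset (prevalence prev), a perfect classifier, and no accounting for screening costs. All parameter values are invented:

```python
from statistics import NormalDist

def n_per_arm(delta, sigma, alpha=0.05, power=0.9):
    """Normal-approximation sample size per arm for a two-arm
    comparison of means (two-sided alpha)."""
    z = NormalDist()
    za, zb = z.inv_cdf(1 - alpha / 2), z.inv_cdf(power)
    return 2 * ((za + zb) * sigma / delta) ** 2

delta, sigma, prev = 0.5, 1.0, 0.25   # effect in marker-positives only

# Enrichment design randomizes marker-positives only: full effect delta.
n_enrich = n_per_arm(delta, sigma)
# All-comers design dilutes the mean difference to prev * delta.
n_all = n_per_arm(prev * delta, sigma)

print(f"enrichment: {n_enrich:.0f}/arm, all-comers: {n_all:.0f}/arm, "
      f"ratio ~ {n_all / n_enrich:.0f}")
```

Under these idealized assumptions the randomized sample size inflates by a factor of 1/prev^2 (here 16-fold), which is why the specificity of the treatment effect and the test's performance characteristics drive the comparison.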
We will present a viewpoint that some of the conventional wisdom concerning the analysis of clinical trials is not appropriate for clinical trials in which a predictive biomarker is incorporated in the primary analysis of treatment effects. These conventions include the requirement of significant overall treatment effects or significant interactions in order to justify analysis of treatment effects in subsets. We will also present a new framework for the analysis of clinical trials that incorporates both hypothesis testing and predictive modeling. This framework provides for complementary roles of frequentist and Bayesian methods but requires that models be justified based on predictive accuracy.
PDF reprints of the following relevant publications are available at http://brb.nci.nih.gov.
Web-based interactive software for planning trials using several of the designs to be discussed is also available at that website.
Simon R. Roadmap for developing and validating therapeutically relevant genomic classifiers. Journal of Clinical Oncology 23:7332-41, 2005.
Simon R, Maitournam A. Evaluating the efficiency of targeted designs for randomized clinical trials. Clinical Cancer Research 10:6759-63, 2004; Supplement & Correction 12:3229, 2006.
Maitournam A, Simon R. On the efficiency of targeted clinical trials. Statistics in Medicine 24:329-339, 2005.
Simon R. Using genomics in clinical trial design. Clinical Cancer Research 14:5984-93, 2008.
Simon, R. Designs and adaptive analysis plans for pivotal clinical trials of therapeutics and companion diagnostics. Expert Opinion on Medical Diagnostics 2:721-29, 2008.
Freidlin B, Simon R. Adaptive signature design: An adaptive clinical trial design for generating and prospectively testing a gene expression signature for sensitive patients. Clinical Cancer Research 11:7872-8, 2005.
Jiang W, Freidlin B, and Simon R. Biomarker adaptive threshold design: A procedure for evaluating treatment with possible biomarker-defined subset effect. Journal of the National Cancer Institute 99:1036-43, 2007.
Simon R. An agenda for Clinical Trials: clinical trials in the genomic era. Clinical Trials 1:468-470, 2004.
Simon R. New challenges for 21st century clinical trials. Clinical Trials 4:167-169, 2007.
Dobbin K, Simon R. Sample size planning for developing classifiers using high dimensional DNA microarray data. Biostatistics 8:101-117, 2007.
Dupuy A, Simon RM. Critical review of published microarray studies for cancer outcome and guidelines on statistical analysis and reporting. Journal of the National Cancer Institute 99:147-157, 2007.
Simon R. Interpretation of genomic data: Questions and answers. Seminars in Hematology 45:196-204, 2008.
Simon R, Lam A, Li MC, Ngan M, Menenzes S, Zhao Y. Analysis of gene expression data using BRB-ArrayTools. Cancer Informatics 2:11-17, 2007.
Introduction to hierarchical Bayes methods for data analysis
Presented by Prof. Brad Carlin (University of Minnesota) on October 13, 2009 (noon-2:00 Eastern time).
Hierarchical Bayes methods enable the combining of information from similar and independent experiments, yielding improved inference for both individual and shared model characteristics.
As a result of recent advances in computing and the consequent ability to evaluate complex models, Bayesian methods have increased in popularity in data analysis. This course introduces hierarchical Bayes methods, demonstrates their usefulness in challenging applied settings, and shows how they can be implemented using modern Markov chain Monte Carlo (MCMC) computational methods. We also provide an introduction to WinBUGS, the most general Bayesian software package available to date. Use of the methods will be demonstrated in advanced high-dimensional model settings (e.g., nonlinear longitudinal modeling or clinical trial design and analysis), where the MCMC Bayesian approach often provides the only feasible alternative that incorporates all relevant model features.
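The "combining of information from similar and independent experiments" can be illustrated with the simplest hierarchical setup: observed effects y_i from m experiments, each normal around an experiment-level mean theta_i, with the theta_i sharing a common normal prior. The Gibbs sampler below is a hand-rolled sketch of the MCMC machinery the course teaches (WinBUGS would do this for you); data are hypothetical and the variance components are held fixed for simplicity:

```python
import random

random.seed(42)
# Hypothetical observed effects from m similar experiments; each has
# known sampling sd sigma, and the experiment-level means theta_i
# share a common N(mu, tau^2) prior (sigma and tau fixed here).
y = [1.2, 0.8, 2.1, 1.5, 0.3, 1.9]
sigma, tau = 0.5, 1.0
m = len(y)

def gibbs(n_iter=5000, burn=1000):
    """Alternate draws from the two full conditionals (flat prior on mu)."""
    mu = 0.0
    mus, thetas = [], [[] for _ in y]
    prec = 1 / sigma**2 + 1 / tau**2
    for it in range(n_iter):
        # theta_i | mu, y_i: precision-weighted compromise of y_i and mu,
        # i.e., each experiment's estimate is shrunk toward the overall mean.
        theta = [random.gauss((y_i / sigma**2 + mu / tau**2) / prec,
                              (1 / prec) ** 0.5) for y_i in y]
        # mu | theta: normal around the mean of the thetas, sd tau/sqrt(m).
        mu = random.gauss(sum(theta) / m, tau / m ** 0.5)
        if it >= burn:
            mus.append(mu)
            for i, t in enumerate(theta):
                thetas[i].append(t)
    return mus, thetas

mus, thetas = gibbs()
post_mu = sum(mus) / len(mus)
print(f"posterior mean of mu ~ {post_mu:.2f}")
```

The borrowing of strength is visible in the output: each experiment's posterior mean sits between its own observation y_i and the overall mean, which is the improved inference for individual and shared characteristics described above.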
Webinar participants should have an M.S. (or advanced undergraduate) understanding of mathematical statistics at, say, the Hogg and Craig (1978) level. Basic familiarity with common statistical models (e.g., the linear regression model) and computing will be assumed, but we will not assume any significant previous exposure to Bayesian methods or Bayesian computing. The course is generally aimed at students and practicing statisticians who are intrigued by all the fuss about Bayes and Gibbs, but who may still mistrust the approach as theoretically mysterious and practically cumbersome.
The registration fee is $44 (Biopharmaceutical Section member),
$59 (ASA member), and $74 (nonmember). To register for individual
webinars, visit the Biopharmaceutical Section's web page.