PubH 6470: SAS Procedures and Data Analysis, Fall 2011
2011 Course Information and schedule of lectures, homework assignments, and tests
PubH 6470 introduces students with a background in statistics to
programming, graphics, and data analysis using SAS. The course
concentrates on data-step programming, data editing
and reformatting, as well as statistical applications.
Instructor: William Thomas, Mayo A-467, 625-0651
Office hours: 2:15 - 3:15 Wednesdays or by appointment
TA office hours in computing lab (Mayo C-381): Mondays 2:30 - 4:30 and Wednesdays 4:35 - 5:45
In addition to Mayo C-381, PC-SAS is installed at these computing labs:
Diehl Hall Biomed Library
Coffey Hall, Coffman Union
Many documents on this website are PDFs (Portable Document Format). To read them, download Adobe Acrobat Reader.
I recommend doing the coursework with PC-SAS on your own computer; it's not required but it's much more efficient.
Get PC-SAS through the University of Minnesota for $75 per year. See the course information for other options.
Introduction to SAS
Resources for solving problems in SAS
Exams and Homework
2011 Syllabus: Class Notes, Data, Programs
(* revised from 2010)
- Running SAS, editing code, importing and checking data, basic programming (weeks 1-3)
- 1. Intro to PC-SAS*;
Child IQ.xls,
SAS program
- 2. Procedures (Ttest, NPar1Way, Freq), data step, IF, missing values*;
SAS program
- 3. Character variables, SET, MERGE, using output datasets*
- 4. SAS limitations, more on merge, data set options, dates and times*;
Editor keyboard shortcuts,
D Morgan: "Essentials of SAS Dates and Times",
SAS program.
- 5. Data checking, Graphics: Plot, Insight, SGplot, Gplot*
- 6. Simplifying repetitive code with arrays and macros*;
P Grant: "The SKIP macro" (comment macro)
- General Linear Models: regression and ANOVA models (weeks 4-7)
- 7. Linear models - Minn math scores, 2000;
Grade-8-mathscores.xls
- 8. Checking code, Proc REG, math scores example*;
M. Yee: "Debugging SAS code"
- 9. Proc GLM, categorical predictors, class variables*
- 10. LSMeans, interaction plots, adjusting*
- 11. Multi-factor ANOVA, adjusted LSMEANS*
- 12. Reading bad spreadsheets, program structure, GLM models*,
bad_spreadsheet.xls,
SAS program
- 13. ODS select, ODS output, confounding, mediation, segmented regression*,
segmented regression.xls,
SAS program
- Bootstrap, missing values (week 8)
- Logistic regression, ordinal regression, log-binomial regression, propensity scores (weeks 9-11)
- 16. Risk, odds, smoothing, Proc Logistic*,
PL Flom: "Proc Logistic: Traps for the Unwary".
- 17. Logistic regression: fitted probabilities, lack-of-fit test*,
NCHS Hypertension data (SAS dataset hypertension_2008.sas7bdat).
- 18. Odds-ratio and relative risk, log-binomial regression, ordinal regression*;
SAS code,
age-BMI sample data (SAS dataset age_bmi_sample.sas7bdat).
- 19. Propensity scores*;
matching macro, adapted from
M. Coca-Perraillon (2007) "Local and Global Optimal Propensity Score Matching."
- Longitudinal data, crossover trials, hierarchical linear models (weeks 12-13)
- 20. Imbalance and adjusting, longitudinal data and correlation, area under the curve*.
- 21. Reshaping longitudinal data: long vs wide data, writing out CSV and Excel files*,
SAS code,
family income data (SAS dataset econ_long.sas7bdat);
Zirbel (2009) "Learn the basics of Proc Transpose",
Tilanus (2007) "Turning the data around: Proc Transpose";
my reference SAS code.
- 22. Longitudinal data, correlation matrix, Proc Mixed*;
SAS code,
Alzheimer Trial.xls;
Judith Singer: "Multilevel Models in Proc Mixed"
- 23. Random effects example, multilevel/hierarchical models*;
SAS code,
Minnesota-radon.xls (from Gelman and Hill, 2007).
- 24. Proc SQL, fuzzy merge, crossover designs*;
SAS code,
Family Economic Data.xls;
TJ Harrington: "Intro to Proc SQL",
W Hu: "Top Ten Reasons to Use Proc SQL".
2010 Syllabus: Class Notes, Data, Programs
- Running SAS, editing code, importing and checking data, basic programming (weeks 1-3)
- 1. Intro to PC-SAS;
Child IQ.xls,
SAS program
- 2. Arithmetic, missing values, basic tests
- 3. Character variables, SET, MERGE, standardizing
- 4. Dataset options, more on merge, data checking, Insight
- 5. Dates, macros, arrays, computing change from baseline;
P Grant: "The SKIP macro" (comment macro)
- Graphics: SGplot, Gplot, Insight, ODS (week 4)
- General Linear Models: regression and ANOVA models (weeks 5-7)
- 7. Linear models - Minn math scores;
SAS code,
Grade-8-mathscores
- 8. Proc REG, math scores example;
SAS code
- 9. Proc GLM, categorical predictors, class variables.
- 10. Factorial ANOVA, interaction plots, LSmeans;
Rat-Diets.xls,
SAS code.
- 11. Proc GLM: LSmeans.
- 12. Proc Corr, back-transformation with ODS, interactions, program structure.
- 13. Confounding, mediation, reading bad spreadsheets;
bad_spreadsheet.xls.
- Bootstrap, missing values (week 8)
- Logistic regression, ordinal regression, log-binomial regression, propensity scores (weeks 9-11)
- Longitudinal data, crossover trials, hierarchical linear models (weeks 12-13)
- 19. Longitudinal plots, long vs wide data, Transpose,
Zirbel (2009) "Learn the basics of Proc Transpose",
Tilanus (2007) "Turning the data around: Proc Transpose".
- 20. SQL, fuzzy merge, response feature, AUC;
SAS code,
Family Economic Data.xls;
TJ Harrington: "Intro to Proc SQL",
W Hu: "Top Ten Reasons to Use Proc SQL".
- 21. Longitudinal data, correlation matrix, Proc Mixed;
SAS code,
Alzheimer Trial.xls;
Judith Singer: "Multilevel Models in Proc Mixed"
- 22. Mixed model example; crossover designs
- Survival data: survival function, competing risks, proportional hazards (weeks 14-15)
2008 Lecture Notes and Examples
- Intro to SAS I;
SAS program.
- Intro to SAS II, reading Excel spreadsheets;
Workbook1.xls,
Workbook2.xls,
SAS program.
- Data checking, Proc Insight, SAS Manual, basic tests
- Missing values, graphics, reporting in MSWord, SET, MERGE;
"Fix SAS output" MSWord macro,
SAS program
- Merging, data set options, GLM.
- GLM: residual & interaction plots, means, LSmeans, dates, arrays;
SAS program.
- GLM: LSmeans, estimate; missing values.
- MI and MIanalyze with GLM; smoothing, jitter;
SAS code,
HAMD2 data.
- Correlation, partial correlation, regression: Proc REG;
SAS program,
Grade 8 data (SAS permanent file).
- Regression example, VIF, plots, subset selection.
- Making CLASS variables for Proc Reg, predictions, sample size;
SAS program.
- Macros and Bootstrap;
SAS program,
bootstrapmacros.sas,
SAS bootstrap documentation.
- Bootstrap confidence intervals: correlation, kappa, agreement.
- Bootstrap prediction error, t-tests.
- Longitudinal data: graphs, area under a curve (AUC);
SAS program.
- Within-person correlation, covariance matrix.
- Proc Mixed: repeated measures, random effects.
- Crossover designs.
- Logistic regression.
- Log-binomial, repeated binary observations.
- Conditional logistic regression, ordinal logistic regression.
Survival data, Kaplan-Meier estimates, randomization log-rank test,
SAS code,
macro file from Cantor: SAS Survival Analysis Techniques, 2nd ed.
- Reporting comparisons of survival curves, proportional hazards regression.
- Checking proportional hazards, subset selection, time-varying predictors.
- Competing risks: cumulative incidence,
SAS code for lecture,
CumIncid macro (from www.sas.com),
BMT data from Klein & Moeschberger, Survival Analysis, 2nd ed., sec. 1.3, App. D.
Updated 13 Dec 2011