Comparing Independent Group and Paired t-tests using SAS PROC TTEST
See www.stattutorials.com/SASDATA for files mentioned in this tutorial. © TexaSoft, 2007
These SAS statistics tutorials briefly explain the use and
interpretation of standard statistical analysis techniques for Medical,
Pharmaceutical, Clinical Trials, Marketing or Scientific Research. The examples
include how-to instructions for SAS Software.
Comparing Independent Group and Paired t-tests
It is not uncommon for researchers to perform an incorrect
t-test when comparing two “groups.” The correct t-test depends on how the data
were collected (that is, on the design of the experiment).
Independent Samples: When subjects are divided (ideally at random) into two
groups and each subject is measured once, the design is called an independent
or parallel study. That is, the subjects in one group (treatment, etc.) are
different from the subjects in the other group. These data may be analyzed
using an independent group t-test (sometimes called an independent samples
t-test or parallel test). This version of the t-test tests the following
hypotheses (two-sided):
    Ho: μ1 = μ2 (means of the two groups are equal)
    Ha: μ1 ≠ μ2 (means are not equal)
Dependent Samples: When data are collected twice on
the same subjects (or on matched subjects), the proper analysis is a paired t-test
(also called a dependent samples t-test). In this case, subjects may be measured
in a before – after fashion, or in a crossover design where one treatment is
administered for a time, there is a washout period, and then another treatment is
administered (in random order for each subject). Or, data might be measured on the
same individual in two areas, such as one treatment in one eye and another treatment
in the other eye (or leg, or arm, etc.). In these cases the measurement of
interest is the difference between the first and second measure. Thus,
the hypotheses (two-sided) are:
    Ho: μdifference = 0 (The average difference is 0)
    Ha: μdifference ≠ 0 (The average difference is not 0)
Why it makes a difference: Performing an incorrect
t-test on your data can cause you to miss a significant difference when one
exists. As an example, consider the data from a paper by Raskin and Unger
(1978) where four diabetic patients were used to compare the effects of insulin
infusion regimens. One treatment was insulin and somatostatin (IS) and the other
treatment was insulin, somatostatin and glucagon (ISG). Each subject was given
each treatment with a period of washout between treatments. The data follow:
| Patient Number | IS   | ISG   | Difference |
|----------------|------|-------|------------|
| 1              | 14   | 17    | 3          |
| 2              | 6    | 8     | 2          |
| 3              | 7    | 11    | 4          |
| 4              | 6    | 9     | 3          |
| Mean           | 8.25 | 11.25 | 3.0        |
| S.E.M.         | 1.9  | 2.0   | 0.40       |
A paper by Thomas Louis and colleagues (1984) looked at these data using
both types of t-tests. The correct version of the t-test to use for this data
set is the paired t-test, since each patient is observed twice. However, it is
all too common for researchers to compare the means 8.25 versus 11.25 using an
independent group approach. To see how these approaches differ, consider how the
two analyses would be performed in SAS.
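Independent group analysis: The code to perform this
analysis using an independent group t-test is (PROCTTEST_IND.SAS):
    data diabetic;
       input treatment $ urea;   /* treatment group and urea value */
       datalines;
    IS 14
    IS 6
    IS 7
    IS 6
    ISG 17
    ISG 8
    ISG 11
    ISG 9
    ;
    run;
    ODS HTML;
    PROC TTEST data=diabetic;
       CLASS TREATMENT;   /* grouping variable */
       VAR UREA;          /* variable whose means are compared */
    RUN;
    PROC BOXPLOT data=diabetic;
       PLOT UREA*TREATMENT;   /* side-by-side box plots by treatment */
    RUN;
    ODS HTML CLOSE;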
You get the following output (only part of the output is
shown here). Remember that this is the incorrect t-test for these data.
The first table shows you that the two means differ by
11.25 - 8.25 = 3 (reported as -3 because SAS computes IS minus ISG) with a
(pooled) standard error of 2.80.
| Variable | treatment  | N | Mean  | Std Dev | Std Err | Min | Max |
|----------|------------|---|-------|---------|---------|-----|-----|
| urea     | IS         | 4 | 8.25  | 3.8622  | 1.93    | 6   | 14  |
| urea     | ISG        | 4 | 11.25 | 4.0311  | 2.02    | 8   | 17  |
| urea     | Diff (1-2) |   | -3    | 3.9476  | 2.80    |     |     |
Since the “Equality of Variances” table below gives no evidence that the
variances differ (p=0.95), you use the “Pooled/Equal” t-test, which gives a
p-value of p=0.32. (Not a statistically significant result.)
t-Tests

| Variable | Method        | Variances | DF   | t Value | Pr > |t| |
|----------|---------------|-----------|------|---------|---------|
| urea     | Pooled        | Equal     | 6    | -1.07   | 0.3238  |
| urea     | Satterthwaite | Unequal   | 5.99 | -1.07   | 0.3239  |

Equality of Variances

| Variable | Method   | Num DF | Den DF | F Value | Pr > F |
|----------|----------|--------|--------|---------|--------|
| urea     | Folded F | 3      | 3      | 1.09    | 0.9455 |
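The pooled t value in the table can be checked by hand from the group summary statistics. The following DATA step is only a sketch of that arithmetic, not part of the tutorial's files; the variable names (mean1, sd1, sp2, and so on) are chosen here for illustration, and the standard deviations are the rounded values from the output above.

```sas
/* Hand check of the pooled (equal variance) t-test from summary statistics. */
data _null_;
   n1 = 4;  mean1 = 8.25;   sd1 = 3.8622;   /* IS group  */
   n2 = 4;  mean2 = 11.25;  sd2 = 4.0311;   /* ISG group */
   df  = n1 + n2 - 2;                                  /* degrees of freedom */
   sp2 = ((n1 - 1)*sd1**2 + (n2 - 1)*sd2**2) / df;     /* pooled variance */
   se  = sqrt(sp2 * (1/n1 + 1/n2));                    /* SE of mean1 - mean2 */
   t   = (mean1 - mean2) / se;                         /* t statistic */
   p   = 2 * (1 - probt(abs(t), df));                  /* two-sided p-value */
   put t= 6.2 df= p= 6.4;                              /* written to the SAS log */
run;
```

The log should show a t value of about -1.07 on 6 degrees of freedom and a two-sided p-value of about 0.32, in agreement with the “Pooled” row of the table.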
Furthermore, a comparative box plot shows a lot of overlap
between the two groups.

This independent group analysis is NOT the correct
analysis. This graph, by the way, is also misleading and not appropriate for
a paired analysis.
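A plot that keeps each patient's two measurements together gives a more honest picture of paired data. The following is a minimal sketch, assuming SAS 9.2 or later (where PROC SGPLOT is available); the data set name diabetic_long and the patient variable are additions made here for illustration and do not appear in the tutorial's files.

```sas
/* Long-format copy of the data with a patient identifier (added for illustration). */
data diabetic_long;
   input patient treatment $ urea;
   datalines;
1 IS 14
2 IS 6
3 IS 7
4 IS 6
1 ISG 17
2 ISG 8
3 ISG 11
4 ISG 9
;
run;

/* One line per patient, joining that patient's IS and ISG values. */
proc sgplot data=diabetic_long;
   series x=treatment y=urea / group=patient markers;
run;
```

In such a plot every patient's line rises from IS to ISG, a pattern that the side-by-side box plots cannot show.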
Since the data in this example are paired you should
instead do the PAIRED version of the t-test.
Paired t-test analysis: The appropriate analysis
for these data is a paired t-test. The calculations for this test can be
performed using the following SAS code (PROCTTEST_PAIRED.SAS):
    data diabetic;
       input IS ISG;   /* one row per patient: IS value, then ISG value */
       datalines;
    14 17
    6 8
    7 11
    6 9
    ;
    run;
    ODS HTML;
    PROC TTEST data=diabetic;
       PAIRED IS*ISG;   /* tests the mean of the differences IS - ISG */
    RUN;
    ODS HTML CLOSE;
The (partial) output is as follows. Note that the analysis
is performed on the mean of the differences (-3, with 95% confidence limits
-4.299 and -1.701) and that the standard error of the difference is 0.41, much
less than the standard error (2.80) in the previous analysis.
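| Difference | N | Lower CL Mean | Mean | Upper CL Mean | Lower CL Std Dev | Std Dev | Upper CL Std Dev | Std Err |
|------------|---|---------------|------|---------------|------------------|---------|------------------|---------|
| IS - ISG   | 4 | -4.299        | -3   | -1.701        | 0.4625           | 0.8165  | 3.0443           | 0.4082  |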
The paired t-test yields p=0.005, which is statistically significant.
T-Tests

| Difference | DF | t Value | Pr > |t| |
|------------|----|---------|---------|
| IS - ISG   | 3  | -7.35   | 0.0052  |
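The paired t-test found significance when the independent group t-test on the
same data did not because the paired analysis works with the within-patient
differences. The patients differ widely from one another (urea values range
from 6 to 17), but each patient's change from IS to ISG is nearly the same
(2 to 4). Removing the between-patient variability leaves a much smaller
standard error: that of the mean difference (0.41) rather than the pooled
standard error (2.80).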
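One way to see that the paired test depends only on the within-patient differences is to compute the differences yourself and run a one-sample t-test on them. The following is a sketch under that idea; the data set name diabetic_diff and the variable diff are introduced here for illustration.

```sas
/* Compute each patient's difference, then test whether its mean is zero. */
data diabetic_diff;
   input IS ISG;
   diff = IS - ISG;   /* same direction as the IS - ISG row in the output above */
   datalines;
14 17
6 8
7 11
6 9
;
run;

/* One-sample t-test of Ho: mean difference = 0 */
proc ttest data=diabetic_diff h0=0;
   var diff;
run;
```

This should reproduce the paired results shown above (t = -7.35 on 3 degrees of freedom), because the PAIRED statement performs the same calculation on IS - ISG.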
In his paper, Louis explains that to achieve the power of
this paired t-test, an independent group t-test (parallel test) would require 14
times as many subjects. Thus, when the model is appropriate, the paired t-test
can be a more powerful way to analyze your data. On the other hand, if you
use a paired analysis on independent group data you will get incorrect and
misleading results. Therefore, carefully consider how your experiment is
designed before you select which t-test to perform.
References:
Louis TA, Lavori PW, Bailar JC, Polansky M (1984),
“Crossover and Self-Controlled Designs in Clinical Research,” N Engl J Med, 310:24-31.
Raskin P, Unger RH (1978), “Hyperglucagonemia and Its
Suppression: Importance in the Metabolic Control of Diabetes,” N Engl J Med,
299:433-6.
See
http://www.stattutorials.com/SAS