What is the output of the T-test in Stata and how can it be interpreted?

In Stata, the ttest command reports summary statistics for each variable or group together with a t-statistic, its degrees of freedom, and one- and two-tailed p-values. The t-statistic is the difference between the sample mean and the hypothesized value (or between two means) divided by its standard error. The p-value is the probability, if the null hypothesis were true, of obtaining a t-statistic at least as extreme as the one observed; it is not the probability that the difference arose by chance. A p-value below a pre-specified alpha level (0.05 is a common choice) is taken as evidence against the null hypothesis, while a larger p-value means only that the data do not provide enough evidence to reject it.

T-test | Stata Annotated Output

The ttest command performs t-tests for one sample, two samples and
paired observations.  The single-sample t-test compares the mean of the sample
to a given number (which you supply).  The independent samples t-test compares
the difference in the means from the two groups to a given value (usually 0).
In other words, it tests whether the difference in the means is 0.  The
dependent-sample or paired t-test compares the difference in the means from the two
variables measured on the same set of subjects to a given number (usually 0), while taking into account the fact that
the scores are not independent.  In our examples, we will use the hsb2 data set.

use https://stats.idre.ucla.edu/stat/stata/notes/hsb2, clear
(highschool and beyond (200 cases))

Single sample t-test

The single sample t-test tests the null hypothesis that the population mean
is equal to the number given after == in the command (here, ttest write==50).
For this example, we will compare the mean of the variable write with
a pre-selected value of 50.  In practice, the value against which the mean is
compared should be based on theoretical considerations and/or previous research.  Stata calculates the t-statistic and its
p-value under the assumption that the sample comes from an approximately normal
distribution. If the p-value associated with the t-test is small (0.05 is often
used as the threshold), there is evidence that the mean is different from the hypothesized
value.  If the p-value associated with the t-test is not small (p > 0.05),
then the null hypothesis is not rejected: the data do not provide evidence that the mean
differs from the hypothesized value, although this does not show that the two are equal.

In this example, the t-statistic is 4.1403 with 199
degrees of freedom.  The corresponding two-tailed p-value is .0001, which is less than
0.05.  We conclude that the mean of variable write is different from
50.

ttest write==50
One-sample t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
   write |     200      52.775    .6702372    9.478586    51.45332    54.09668
------------------------------------------------------------------------------
    mean = mean(write)                                            t =   4.1403
Ho: mean = 50                                    degrees of freedom =      199

    Ha: mean < 50               Ha: mean != 50                 Ha: mean > 50
 Pr(T < t) = 1.0000         Pr(|T| > |t|) = 0.0001          Pr(T > t) = 0.0000
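
When only summary statistics are available, Stata's immediate command ttesti should give the same test without the raw data. A minimal sketch, using the Obs, Mean and Std. Dev. values printed in the output above; because those inputs are rounded, the last digits of the result may differ slightly.

```stata
* Immediate one-sample t test from summary statistics.
* Argument order: number of observations, sample mean, sample SD, hypothesized mean
ttesti 200 52.775 9.478586 50
```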

Summary Statistics

----------------------------------------------------------------------------------
Variablea |     Obsb      Meanc    Std. Err.d   Std. Dev.e   [95% Conf. Interval]f
----------+-----------------------------------------------------------------------
   write  |     200      52.775    .6702372     9.478586     51.45332    54.09668
----------------------------------------------------------------------------------

a. Variable – This is the variable for which the test was conducted.

b. Obs – The number of valid (i.e., non-missing) observations used in
calculating the t-test.

c. Mean – This is the mean of the variable.

d. Std. Err. – This is the estimated standard deviation of the sample
mean.  If we drew repeated samples of size 200, we would expect the
standard deviation of the sample means to be close to the standard error.
The standard deviation of the distribution of the sample mean is estimated as the
standard deviation of the sample divided by the square root of the sample size: 9.478586/(sqrt(200))
= .6702372.
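
The arithmetic can be checked directly with Stata's display command. This is only a sketch that retypes the rounded values from the table, so the result should agree with the reported standard error to about six or seven digits.

```stata
* Standard error of the mean = sample standard deviation / sqrt(N)
display 9.478586 / sqrt(200)
```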

e. Std. Dev. – This is the standard deviation of the variable.

f. 95% Confidence Interval – These are the lower and upper bounds of
the confidence interval for the mean. A confidence interval for the mean
specifies a range of values within which the unknown population parameter, in
this case the mean, may lie.  It is given by

mean ± t(1-alpha/2, N-1) * s / sqrt(N)

where s is
the sample standard deviation of the observations and N is the number of valid
observations.  The t-value in the formula can be computed or found in any
statistics book with the degrees of freedom being N-1 and the cumulative probability being 1-alpha/2,
where 1-alpha is the confidence level, which by default is .95 (so alpha is .05).
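
As a rough check, the interval for write can be rebuilt by hand. The sketch below assumes the default 95% level and retypes the rounded mean and standard error from the table; the scalar name tcrit is purely illustrative.

```stata
* Critical t value for a 95% interval with N - 1 = 199 degrees of freedom.
* invttail(df, p) returns the t value whose upper-tail probability is p.
scalar tcrit = invttail(199, 0.025)
display tcrit
* Lower and upper bounds: mean minus/plus tcrit times the standard error
display 52.775 - tcrit * .6702372
display 52.775 + tcrit * .6702372
```

The two bounds should land close to the 51.45332 and 54.09668 shown in the table.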


Test Statistics

meang = mean(write)                                            th =   4.1403
Hoi: mean = 50                                    degrees of freedomj =      199

    Ha: mean < 50               Ha: mean != 50                 Ha: mean > 50
 Pr(T < t)k = 1.0000         Pr(|T| > |t|)l = 0.0001          Pr(T > t)k = 0.0000

g. mean – This is the mean being tested. In this example it is the mean of
write.

h. t – This is the Student t-statistic.  It is the ratio of the
difference between the sample mean and the given number to the standard error of
the mean: (52.775 – 50) / .6702372 = 4.1403. Since the standard error of the mean measures the variability of the
sample mean, the smaller the standard error of the mean, the more likely that
our sample mean is close to the true population mean.  This is illustrated by
the following three figures.

[Three figures: pairs of distributions with the same difference in means but different variability]

In all three cases, the difference between the population means is the same.
But with large variability of the sample means, as in the second graph, the two populations
overlap a great deal.  Therefore, the difference may well have arisen by chance.  On
the other hand, with small variability, the difference is clearer, as in the
third graph.  The smaller the standard error of the mean, the larger the
magnitude of the t-value and, therefore, the smaller the p-value.
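
The t-statistic for write can be recomputed from the table values. A minimal sketch using the rounded mean and standard error, so the result should match the reported 4.1403 up to rounding:

```stata
* t = (sample mean - hypothesized value) / standard error of the mean
display (52.775 - 50) / .6702372
```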

i. Ho – This is the null hypothesis that is being tested.  The
single sample t-test evaluates the null hypothesis that the population mean is equal
to the given number.

j. degrees of freedom – The degrees of freedom for the single sample
t-test is simply the number of valid observations minus 1.  We lose one degree
of freedom because we have estimated the mean from the sample.  We have used
some of the information from the data to estimate the mean, therefore it is not
available to use for the test and the degrees of freedom accounts for this.

k. Pr(T < t), Pr(T > t) –
These are the one-tailed p-values evaluating the null against the alternatives
that the mean is less than 50 (left test) and greater than 50 (right test).
These probabilities are computed using the t distribution. Again, if the p-value is less than the pre-specified alpha level (usually .05 or
.01), we will conclude that the mean is statistically significantly less than (left test) or greater than (right test)
the hypothesized value.

l. Pr(|T| > |t|) – This is the two-tailed p-value evaluating the null
against an alternative that the mean is not equal to 50.  It is equal to
the probability of observing a greater absolute value of t
under the null hypothesis.  If the
p-value is less than the pre-specified alpha level (usually .05 or .01, here the
former), we will
conclude that the mean is statistically significantly different from the hypothesized value.  For
example, the p-value for write is smaller than 0.05, so we conclude that
the mean for write is different from 50.
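
The three p-values can be approximated from the t-statistic and its degrees of freedom with Stata's ttail() function. This is a sketch that plugs in the rounded t of 4.1403, so very small probabilities will not be exactly zero even though the output displays 0.0000.

```stata
* ttail(df, t) returns the upper-tail probability Pr(T > t)
* Right-tailed p-value, Pr(T > t)
display ttail(199, 4.1403)
* Left-tailed p-value, Pr(T < t)
display 1 - ttail(199, 4.1403)
* Two-tailed p-value, Pr(|T| > |t|)
display 2 * ttail(199, abs(4.1403))
```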


Paired t-test

A paired (or “dependent”) t-test is used when the observations are not
independent of one another. In the example below, the same students took both
the writing and the reading test. Hence, you would expect there to be a
relationship between the scores provided by each student.  The paired
t-test accounts for this.  For each student, we are essentially looking at
the differences in the values of the two variables and testing if the mean of
these differences is equal to zero.

In this example, the t-statistic is 0.8673 with 199
degrees of freedom.  The corresponding two-tailed p-value is 0.3868, which is
greater than 0.05.  We conclude that the mean difference of write and read
is not statistically significantly different from 0.

ttest write==read
Paired t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
   write |     200      52.775    .6702372    9.478586    51.45332    54.09668
    read |     200       52.23    .7249921    10.25294    50.80035    53.65965
---------+--------------------------------------------------------------------
    diff |     200        .545    .6283822    8.886666   -.6941424    1.784142
------------------------------------------------------------------------------
     mean(diff) = mean(write - read)                              t =   0.8673
 Ho: mean(diff) = 0                              degrees of freedom =      199

 Ha: mean(diff) < 0           Ha: mean(diff) != 0           Ha: mean(diff) > 0
 Pr(T < t) = 0.8066         Pr(|T| > |t|) = 0.3868          Pr(T > t) = 0.1934

Summary Statistics

----------------------------------------------------------------------------------
Variablea |     Obsb      Meanc    Std. Err.d   Std. Dev.e   [95% Conf. Interval]f
----------+-----------------------------------------------------------------------
   write  |     200      52.775    .6702372     9.478586     51.45332    54.09668
    read  |     200       52.23    .7249921     10.25294     50.80035    53.65965
----------+-----------------------------------------------------------------------
    diff  |     200        .545    .6283822     8.886666    -.6941424    1.784142
----------------------------------------------------------------------------------

a. Variable – This is the list of variables used in the test.

b. Obs – The number of valid (i.e., non-missing) observations used in
calculating the t-test.

c. Mean – This is the list of the means of the variables.  The last
row displays the simple difference between the two means.

d. Std. Err. – This is the estimated standard deviation of the sample
mean.  If we drew repeated samples of size 200, we would expect the standard
deviation of the sample means to be close to the standard error.  The standard
deviation of the distribution of the sample mean is estimated as the standard
deviation of the sample divided by the square root of the sample size.  This
provides a measure of the variability of the sample mean.  The Central Limit
Theorem tells us that the sample means are approximately normally distributed
when the sample size is large; 30 or greater is a common rule of thumb.

e. Std. Dev. – This is the standard deviation of the variable.  The
last row displays the standard deviation of the difference, which is not equal
to the difference of the standard deviations of the two variables.
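
One way to see why the two quantities differ is that the standard deviation of a difference also depends on the correlation between the two variables. The sketch below assumes the hsb2 data are loaded; it uses the standard identity for the variance of a difference, which is not part of the original output, and retypes the rounded standard deviations from the table.

```stata
* Correlation between the paired scores; correlate stores it in r(rho)
quietly correlate write read
* SD of the difference = sqrt(sd1^2 + sd2^2 - 2*rho*sd1*sd2)
display sqrt(9.478586^2 + 10.25294^2 - 2*r(rho)*9.478586*10.25294)
```

The result should be close to the 8.886666 reported in the diff row.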

f. 95% Confidence Interval – These are the lower and upper bounds of
the confidence interval for the mean. A confidence interval for the mean
specifies a range of values within which the unknown population parameter, in
this case the mean, may lie.  It is given by

mean ± t(1-alpha/2, N-1) * s / sqrt(N)

where s is
the sample standard deviation of the observations and N is the number of valid
observations.  The t-value in the formula can be computed or found in any
statistics book with the degrees of freedom being N-1 and the cumulative probability being 1-alpha/2,
where 1-alpha is the confidence level, which by default is .95 (so alpha is .05).


Test Statistics

 mean(diff) = mean(write - read)g                              t =   0.8673h
 Ho: mean(diff) = 0                              degrees of freedom =      199i

 Ha: mean(diff) < 0k          Ha: mean(diff) != 0j          Ha: mean(diff) > 0k
 Pr(T < t) = 0.8066         Pr(|T| > |t|) = 0.3868          Pr(T > t) = 0.1934

g.  mean(diff) = mean(var1 – var2) – The t-test for dependent groups
forms a single sample from the paired differences and then applies the
single sample t-test to it. The interpretation of the t-value and p-value is the
same as in the single sample case.
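
This equivalence can be tried out directly. A minimal sketch, assuming the hsb2 data are loaded; the variable name diff is illustrative and must not already exist in the data set.

```stata
* Build the paired difference for each student
generate diff = write - read
* One-sample t test of the differences against 0
ttest diff == 0
```

The t-statistic, degrees of freedom and p-values should match those from ttest write==read.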

h.  t – This is the t-statistic.  It is the ratio of
the mean of the difference to the standard error of the difference
(.545/.6283822).

i.  degrees of freedom – The degrees of freedom for the paired observations is
simply the number of observations minus 1. This is because the test is conducted
on the one sample of the paired differences.

j.  Pr(|T| > |t|) – This is the two-tailed p-value computed using
the t distribution.  It is the probability of observing a greater absolute value of t
under the null hypothesis.  If
the p-value is less than the pre-specified alpha level (usually .05 or .01, here the
former), we will conclude that the mean difference between write and read is statistically significantly different from
zero.  For example, the p-value for the difference between write and read
is greater than 0.05, so we conclude that the mean difference is not
statistically significantly different from 0.

k. Pr(T < t), Pr(T > t) –
These are the one-tailed p-values for evaluating the alternatives (mean < H0
value) and (mean > H0 value),
respectively.  Like Pr(|T| > |t|), they are computed using the t distribution.
Again, if the p-value is less than the pre-specified alpha level (usually .05 or .01), we
will conclude that the mean difference is statistically significantly greater than or less than zero.


Independent group t-test

This t-test is designed to compare the means of the same variable between two groups.
In our example, we compare the mean writing score between the group of
female students and the group of male students. Ideally, these subjects are
randomly selected from a larger population of subjects. The test assumes that
the variances for the two populations are the same.  The interpretation of the p-value is the same as
in the other types of t-tests.

In this example, the t-statistic is -3.7341 with 198 degrees of freedom.  The corresponding
two-tailed p-value is 0.0002, which is less than 0.05.  We conclude that
the difference of means in write between males and females is different
from 0.

ttest write, by(female)
Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
    male |      91    50.12088    1.080274    10.30516    47.97473    52.26703
  female |     109    54.99083    .7790686    8.133715    53.44658    56.53507
---------+--------------------------------------------------------------------
combined |     200      52.775    .6702372    9.478586    51.45332    54.09668
---------+--------------------------------------------------------------------
    diff |           -4.869947    1.304191               -7.441835   -2.298059
------------------------------------------------------------------------------
    diff = mean(male) - mean(female)                              t =  -3.7341
Ho: diff = 0                                     degrees of freedom =      198

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0001         Pr(|T| > |t|) = 0.0002          Pr(T > t) = 0.9999

Summary Statistics


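----------------------------------------------------------------------------------
   Groupa |     Obsb     Meanc    Std. Err.d   Std. Dev.e   [95% Conf. Interval]f
----------+-----------------------------------------------------------------------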
    male |      91    50.12088    1.080274    10.30516    47.97473    52.26703
  female |     109    54.99083    .7790686    8.133715    53.44658    56.53507
---------+--------------------------------------------------------------------
combined |     200      52.775    .6702372    9.478586    51.45332    54.09668
---------+--------------------------------------------------------------------
    diff |           -4.869947    1.304191               -7.441835   -2.298059
------------------------------------------------------------------------------

 

a.  Group – This column gives the categories of
the independent variable, in our case female. This variable is specified by the by(female)
option.

b.  Obs – This is the number of valid (i.e., non-missing)
observations in each group.

c.  Mean – This is the mean of the dependent variable for each
level of the independent variable.  On the last line the difference between
the means is given.

d.  Std Err – This is the standard error of the mean for each
level of the independent variable.  On the last line the standard error of
the difference between the means is given.

e.  Std Dev – This is the standard deviation of the dependent
variable for each of the levels of the independent variable.  No standard
deviation is reported on the last line, for the difference.

f.  [95% Conf. Interval] – These are the lower and upper
confidence limits of the means.  On the last line the confidence limits for
the difference between the means are given.


Test Statistics

    diff = mean(male) - mean(female)g                              t =  -3.7341h
Ho: diff = 0                                     degrees of freedom =      198i

    Ha: diff < 0k                Ha: diff != 0j                Ha: diff > 0k
 Pr(T < t) = 0.0001         Pr(|T| > |t|) = 0.0002          Pr(T > t) = 0.9999

g. diff = mean(male) – mean(female) – The t-test compares the means
between the two groups, the null hypothesis being that the difference between
the means is zero.

h.  t –  This is the t-statistic.  It is the ratio of
the difference between the means to the standard error of the difference:
(-4.869947/1.304191).

i.  degrees of freedom – The degrees of freedom for the independent groups
t-test is simply the total number of observations minus 2. We use one degree of
freedom for estimating the mean of each group, and because there are two groups,
we subtract two degrees of freedom.
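
The standard error of the difference and the t-statistic can be rebuilt from the group rows of the table. This sketch assumes the usual pooled-variance formula for the equal-variance test and retypes the rounded group statistics; the scalar names sp2 and se_diff are illustrative.

```stata
* Pooled variance: group variances weighted by their degrees of freedom,
* divided by the total degrees of freedom, 91 + 109 - 2 = 198
scalar sp2 = (90 * 10.30516^2 + 108 * 8.133715^2) / 198
* Standard error of the difference in means
scalar se_diff = sqrt(sp2 * (1/91 + 1/109))
display se_diff
* t statistic: difference in means / standard error of the difference
display -4.869947 / se_diff
```

The two displayed values should be close to the 1.304191 and -3.7341 in the output.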

j.  Pr(|T| > |t|) – This is the two-tailed p-value
computed using the t
distribution.  It is the probability of observing a greater absolute value of t
under the null hypothesis.  If the
p-value is less than the pre-specified alpha level (usually .05 or .01, here the
former), we will conclude that the difference in means is statistically significantly different from
zero.
For example, the p-value for the difference between females and males is less
than 0.05, so we conclude that the difference in means is statistically
significantly different from 0.

k. Pr(T < t), Pr(T > t) –
These are the one-tailed p-values
for the alternative hypotheses (mean difference < 0) and (mean difference > 0),
respectively.  Like Pr(|T| > |t|), they are computed using the t distribution.
As usual, if the p-value is less than the pre-specified alpha level (usually .05 or .01), we
will conclude that the difference in means is statistically significantly less than or greater than zero.
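
The annotated quantities are also saved by ttest as r() results, which can be convenient for checking a value or reusing it in later commands. A minimal sketch, assuming the hsb2 data are loaded and a reasonably recent Stata release; the one-sided results r(p_l) and r(p_u) may not be available in older versions.

```stata
* Rerun the test without printing the table, then read the stored results
quietly ttest write, by(female)
* t statistic and its degrees of freedom
display r(t)
display r(df_t)
* Two-tailed p-value, then the lower and upper one-tailed p-values
display r(p)
display r(p_l)
display r(p_u)
```

Typing return list after ttest shows everything that was stored.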


Independent sample T-test assuming unequal variances

We are again going to compare means of the same variable between two groups.
In our example, we compare the mean writing score between the group of
female students and the group of male students. Ideally, these subjects are
randomly selected from a larger population of subjects. We previously assumed
that the
variances for the two populations are the same.  Here, we will allow for
unequal variances in the two populations.  The interpretation of the p-value is the same as
in the other types of t-tests.

In this example, the t-statistic is -3.6564 with 169.707 degrees of freedom.  The corresponding
two-tailed p-value is 0.0003, which is less than 0.05.  We conclude that
the difference of means in write between males and females is different
from 0, allowing for differences in variances across groups.

ttest write, by(female) unequal
Two-sample t test with unequal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
    male |      91    50.12088    1.080274    10.30516    47.97473    52.26703
  female |     109    54.99083    .7790686    8.133715    53.44658    56.53507
---------+--------------------------------------------------------------------
combined |     200      52.775    .6702372    9.478586    51.45332    54.09668
---------+--------------------------------------------------------------------
    diff |           -4.869947    1.331894               -7.499159   -2.240734
------------------------------------------------------------------------------
    diff = mean(male) - mean(female)                              t =  -3.6564
Ho: diff = 0                     Satterthwaite's degrees of freedom =  169.707

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0002         Pr(|T| > |t|) = 0.0003          Pr(T > t) = 0.9998

Summary Statistics

----------------------------------------------------------------------------------
   Groupa |     Obsb     Meanc    Std. Err.d   Std. Dev.e   [95% Conf. Interval]f
----------+-----------------------------------------------------------------------
    male  |      91    50.12088     1.080274    10.30516    47.97473    52.26703
  female  |     109    54.99083     .7790686    8.133715    53.44658    56.53507
----------+-----------------------------------------------------------------------
combined  |     200      52.775     .6702372    9.478586    51.45332    54.09668
----------+-----------------------------------------------------------------------
    diff  |           -4.869947     1.331894               -7.499159   -2.240734
----------------------------------------------------------------------------------

a. Group – The list of groups whose means are being compared.

b. Obs. – This is the number of valid (i.e., non-missing) observations
from each group as well as the combined.

c. Mean – This is the mean of the variable of interest for each group
we are comparing.  On the third line the combined mean is given and on
the last line the difference between the means is given.

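d. Std. Err. – This is the standard error of the mean.  On the last line
the standard error of the difference, computed without assuming equal
variances, is given.

e. Std. Dev. – This is the standard deviation of the dependent
variable for each of the groups.

f. 95% Confidence interval – These are the lower and upper limits for the
95% confidence interval of the mean for each of the groups and, on the last
line, of the difference between the means.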


Test Statistics

    diffg = mean(male) - mean(female)                              th =  -3.6564
Ho: diff = 0                      Satterthwaite's degrees of freedomi =  169.707

    Ha: diff < 0                   Ha: diff != 0                 Ha: diff > 0
 Pr(T < t)k = 0.0002         Pr(|T| > |t|)j = 0.0003          Pr(T > t)k = 0.9998

g. diff – This is the value we are testing: the difference in the
means of the male group and the female group.

h. t – This is the t-statistic.  It is the test statistic we will
use to evaluate our hypothesis.  It is the ratio of the difference in means to the
standard error of the difference of the two groups: (-4.869947/1.331894).

i. Satterthwaite’s degrees of freedom – Satterthwaite’s is an alternative
way to calculate the degrees of freedom that takes into account that the
variances are assumed to be unequal.  It is a more conservative approach than
using the traditional degrees of freedom.  This is the degrees of freedom
under this calculation.
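
The unpooled standard error and Satterthwaite's degrees of freedom can be approximated from the group standard errors in the table. This sketch assumes the usual Satterthwaite formula and retypes the rounded values; the scalar names se_w and df_s are illustrative.

```stata
* Standard error of the difference without pooling: sqrt(se1^2 + se2^2)
scalar se_w = sqrt(1.080274^2 + .7790686^2)
display se_w
* Satterthwaite's approximate degrees of freedom
scalar df_s = se_w^4 / (1.080274^4/(91 - 1) + .7790686^4/(109 - 1))
display df_s
* t statistic under unequal variances
display -4.869947 / se_w
```

The results should be close to the 1.331894, 169.707 and -3.6564 shown in the output.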

j. Pr(|T| > |t|) – This is the two-tailed p-value
computed using the t
distribution.  It is the probability of observing a greater absolute value of t
under the null hypothesis.  If
p-value is less than the pre-specified alpha level (usually .05 or .01, here the
former) we will conclude that the difference in means is statistically significantly different from
zero.
For example, the p-value for the difference between females and males is less
than 0.05, so we conclude that the difference in means is statistically
significantly different from 0.

k. Pr(T < t), Pr(T > t) – These are the one-tailed p-values
for the alternative hypotheses (difference < 0) and (difference > 0),
respectively.  Like Pr(|T| > |t|), they are computed using the t distribution.
As usual, if the p-value is less than the pre-specified alpha level (usually .05 or .01), we
will conclude that the difference in means is statistically significantly less than or greater than zero.

Cite this article

stats writer (2024). What is the output of the T-test in Stata and how can it be interpreted?. PSYCHOLOGICAL SCALES. Retrieved from https://scales.arabpsychology.com/stats/what-is-the-output-of-the-t-test-in-stata-and-how-can-it-be-interpreted/

stats writer. "What is the output of the T-test in Stata and how can it be interpreted?." PSYCHOLOGICAL SCALES, 29 Jun. 2024, https://scales.arabpsychology.com/stats/what-is-the-output-of-the-t-test-in-stata-and-how-can-it-be-interpreted/.
