This page provides the full dataset used in the analyses presented in the paper
"Parametric Survival Analysis and Taxonomy of Hazard Functions for the
Generalized Gamma Distribution" by Cox,
In the paper we introduce a graphical taxonomy of the hazard functions of the generalized gamma family. The family includes many special distributions and, most importantly, all four of the most common types of hazard function: monotonically increasing, monotonically decreasing, bathtub-shaped, and arc-shaped. We apply the proposed taxonomy to study survival after AIDS during different eras of HIV therapy, where the hazard functions are clearly not proportional.
We will be grateful if individuals using our methods in publications share
them with us by e-mail: amunoz@jhu.edu.
Send comments or questions to Mr. Michael Schneider at mschnei4@jhu.edu.
------------------------------------------ Data Files -------------------------------------------------
The data file fulldata.dat
contains the complete dataset used in the analyses presented in Table 2 of
the paper. A sample of observations from this dataset is summarized in section
5.1 of the paper. Briefly, this file contains data for nine variables: participant
identification number, entry (time with AIDS when entering indicated period),
exit (time with AIDS when death or censoring occurs during indicated period),
event indicator (1= event, 0=censored), four period indicator variables and the
period number.
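As a rough illustration of the nine-variable layout described above, the following Python sketch parses one record. It assumes the file is whitespace-delimited, has no header row, and stores the variables in the order listed; none of this is confirmed by this page. The names `Record`, `parse_record`, and `read_records` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Record:
    pid: int            # participant identification number
    entry: float        # time with AIDS when entering the indicated period
    exit: float         # time with AIDS at death or censoring in that period
    event: int          # 1 = event, 0 = censored
    indicators: tuple   # four 0/1 period indicator variables
    period: int         # period number

    @property
    def time_at_risk(self):
        """Time contributed to the indicated period (exit minus entry)."""
        return self.exit - self.entry


def parse_record(line):
    """Parse one record; ASSUMES whitespace-delimited fields in listed order."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    return Record(
        pid=int(float(fields[0])),
        entry=float(fields[1]),
        exit=float(fields[2]),
        event=int(float(fields[3])),
        indicators=tuple(int(float(x)) for x in fields[4:8]),
        period=int(float(fields[8])),
    )


def read_records(path):
    """Read every non-blank line of the file at `path` into Record objects."""
    with open(path) as handle:
        return [parse_record(line) for line in handle if line.strip()]
```

If the assumptions hold, `read_records("fulldata.dat")` would return one `Record` per line. Adjust the parsing if the file turns out to carry a header or a different delimiter.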
The data file nonparametricFIG3.dat
contains the complete dataset used in the nonparametric estimation of the
log of the cumulative hazard function in each of the four therapeutic eras
shown in Figure 3 of the paper. The data file contains data for three
variables: exit (time with AIDS when death or censoring occurs during indicated
period), the estimated value of the survival function and a period indicator
variable. This file contains a record of data for each unique event time in
each of the four periods.
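The file stores the estimated survival function, whereas Figure 3 displays the log of the cumulative hazard. The two are linked by the standard identity H(t) = -log S(t). The sketch below applies that identity to a single survival estimate. Whether the authors used exactly this transformation for Figure 3 is an assumption, and the function name is hypothetical.

```python
import math


def log_cumulative_hazard(survival):
    """Return log(H), where H = -log(S), for a survival probability 0 < S < 1."""
    if not 0.0 < survival < 1.0:
        raise ValueError("survival must lie strictly between 0 and 1")
    return math.log(-math.log(survival))
```

Survival estimates of exactly 0 or 1 have no finite log cumulative hazard, so the function rejects them rather than returning an infinite value.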
The three files RelTime2vs1FIG5.dat,
RelTime3vs1FIG5.dat, and RelTime4vs1FIG5.dat contain
the data that were used for the analyses shown in Figure 5 of the paper.
Specifically, the file RelTime2vs1FIG5.dat contains data for five variables:
period, cumulative percent deceased, estimated relative time of period two to
period one, lower limit of 90% confidence band for estimated relative time,
upper limit of 90% confidence band for estimated relative time. The files
RelTime3vs1FIG5.dat and RelTime4vs1FIG5.dat contain the same type of data for
period 3 and period 4, respectively.
The three files RelHzd2vs1FIG6.dat,
RelHzd3vs1FIG6.dat, and RelHzd4vs1FIG6.dat
contain the data that were used for the analyses shown in Figure 6 of
the paper. Specifically, the file RelHzd2vs1FIG6.dat contains data for five
variables: period, years following incident AIDS diagnosis, estimated relative
hazard of period two to period one, lower limit of 90% confidence band for
estimated relative hazard, upper limit of 90% confidence band for estimated
relative hazard. The files RelHzd3vs1FIG6.dat and RelHzd4vs1FIG6.dat contain
the same type of data for period 3 and period 4, respectively.
---------------------------- Stata, SAS, and S-Plus programs -----------------------------------
The four Stata programs (as described in section 5.2 of the paper) will produce
the first three parametric analyses shown in Table 2 of the paper. The program mleconventional(AFT)weibull.do
will fit a conventional Weibull AFT model for the location parameter
beta, the program mleconventional(AFT)gralgamma.do
will fit a conventional generalized gamma AFT model for the location parameter
beta, and the program mlesaturatedgralgamma.do
will fit a generalized gamma distribution to each of the four periods separately.
The program mlesaturatedgralgamma_onemodel.do
will fit a generalized gamma distribution to all four periods using one model
allowing the scale (sigma) and shape (lambda) parameters in each of the four
periods to be different. Each of the four .log files contains the relevant
output from running the corresponding Stata program.
1. mleconventional(AFT)weibull.log,
2. mleconventional(AFT)gralgamma.log,
3. mlesaturatedgralgamma.log,
4. mlesaturatedgralgamma_onemodel.log
The four SAS programs (as described in section 5.3 of the paper) will produce
the four parametric analyses shown in Table 2 of the paper. The program mleconventional(AFT)weibull.sas will fit a conventional Weibull AFT model for the location parameter beta, the program
mleconventional(AFT)gralgamma.sas will fit a conventional
generalized gamma AFT model for the location parameter beta, the program mlesaturatedgralgamma.sas
will fit a generalized gamma model to each of the four periods separately,
and finally the program mlegralgammaforfinal.sas fits the best two-parameter
model in each of the four periods. Each of the four .lst files contains the relevant output from running the corresponding
SAS program.
1. mleconventional(AFT)weibull.lst,
2. mleconventional(AFT)gralgamma.lst,
3. mlesaturatedgralgamma.lst,
4. mlegralgammaforfinal.lst
The four S-Plus programs (the functions that are used in these programs are
described briefly in section 5.4 of the paper) will produce the four parametric
analyses shown in Table 2 of the paper. The program mleconventional(AFT)weibull.ssc will fit a conventional Weibull AFT model for the location parameter beta, the
program mleconventional(AFT)gralgamma.ssc will fit a conventional generalized
gamma AFT model for the location parameter beta, the program mlesaturatedgralgamma.ssc
will fit a generalized gamma model to each of the four periods separately, and
finally the program mlegralgammaforfinal.ssc
fits the best two-parameter model in each of the four periods. Each of
the four .txt files contains the relevant output from running the corresponding
S-Plus program.
1. mleconventional(AFT)weibull.txt,
2. mleconventional(AFT)gralgamma.txt,
3. mlesaturatedgralgamma.txt,
4. mlegralgammaforfinal.txt
Under the heading "S-Plus functions" we have included four functions that will
return values of the quantile function (qggamma.ssc),
the hazard function (hggamma.ssc),
the density function (dggamma.ssc),
and the survival function (sggamma.ssc)
for the generalized gamma distribution. Values for beta (location parameter), sigma
(scale parameter), and lambda (shape parameter) must be supplied as arguments
to each of the four functions. A cumulative probability must also be supplied
as the last argument for the quantile function and a
value of time must be supplied as the last argument for the hazard, density,
and survival functions.
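To illustrate how the four quantities relate, here is a minimal Python sketch of the Weibull special case only. It assumes the log-location-scale parameterization in which lambda = 1 reduces the generalized gamma to a Weibull with cumulative hazard (t / exp(beta))^(1/sigma). It is not a port of the S-Plus functions: the function names are hypothetical and lambda is fixed at 1 rather than passed as an argument. General values of lambda require the incomplete gamma function and are not covered.

```python
import math


def _cum_hazard(beta, sigma, t):
    """Weibull cumulative hazard (t / exp(beta)) ** (1 / sigma), for t > 0."""
    return math.exp((math.log(t) - beta) / sigma)


def weibull_survival(beta, sigma, t):
    """S(t) = exp(-H(t))."""
    return math.exp(-_cum_hazard(beta, sigma, t))


def weibull_hazard(beta, sigma, t):
    """h(t) = dH/dt = H(t) / (sigma * t)."""
    return _cum_hazard(beta, sigma, t) / (sigma * t)


def weibull_density(beta, sigma, t):
    """f(t) = h(t) * S(t)."""
    return weibull_hazard(beta, sigma, t) * weibull_survival(beta, sigma, t)


def weibull_quantile(beta, sigma, p):
    """Time t at which the cumulative probability 1 - S(t) equals p (0 < p < 1)."""
    return math.exp(beta) * (-math.log(1.0 - p)) ** sigma
```

As in the description above, the last argument is a cumulative probability for the quantile function and a time for the other three. The identity hazard = density / survival holds for any value of lambda, not only for this special case.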
-------- Expanded Version of Statistical Software section of paper (section 5) --------------
expandedsection5.pdf
GEM2014.03.11.AlvaroMunoz.pdf
--------------------------------- Figures presented in paper ---------------------------------------
The six S-Plus programs below will produce the six figures presented in the
paper.
1. GGfig1.ssc,
2. GGfig2.ssc,
3. GGfig3.ssc,
4. GGfig4.ssc,
5. GGfig5.ssc,
6. GGfig6.ssc.