General Gamma



This page provides the full dataset used in the analyses presented in the paper "Parametric Survival Analysis and Taxonomy of Hazard Functions for the Generalized Gamma Distribution" by Cox, Chu, Schneider, and Muñoz, published in Statistics in Medicine 2007; 26:4352-4374. It also contains the Stata, SAS, and S-Plus programs used to conduct those analyses, along with a detailed description of the methods used.

In the paper we introduce a graphical taxonomy of the hazard functions of the generalized gamma family which includes many special distributions and most importantly includes all four of the most common types of hazard function: monotonically increasing or decreasing, as well as bathtub and arc-shaped hazard functions. We apply the proposed taxonomy to study survival after AIDS during different eras of HIV therapy, where proportionality of hazard functions is clearly not fulfilled.

We would be grateful if those who use our methods in publications would share those publications with us by e-mail: amunoz@jhu.edu. Send comments or questions to Mr. Michael Schneider at mschnei4@jhu.edu.

------------------------------------------ Data Files -------------------------------------------------

The data file fulldata.dat contains the complete dataset used in the analyses presented in Table 2 of the paper. A sample of observations from this dataset is summarized in section 5.1 of the paper. Briefly, this file contains data for nine variables: participant identification number, entry (time with AIDS when entering the indicated period), exit (time with AIDS when death or censoring occurs during the indicated period), event indicator (1 = event, 0 = censored), four period indicator variables, and the period number.
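To illustrate the entry/exit layout described above, the following Python sketch tallies events and person-time per period. It is not one of the programs distributed on this page. It assumes that fulldata.dat is whitespace-delimited with the nine variables in the order listed; the field names (id, entry, exit, event, p1 to p4, period) and the function names are hypothetical and chosen only for this example.

```python
from collections import namedtuple

# Hypothetical field names; the column order follows the description
# above and is an assumption about the file layout.
Record = namedtuple("Record", "id entry exit event p1 p2 p3 p4 period")


def parse_records(lines):
    """Parse whitespace-delimited lines into Record tuples."""
    records = []
    for line in lines:
        parts = line.split()
        if len(parts) != 9:
            continue  # skip blank or malformed lines
        records.append(Record(
            id=parts[0],
            entry=float(parts[1]),
            exit=float(parts[2]),
            event=int(parts[3]),
            p1=int(parts[4]),
            p2=int(parts[5]),
            p3=int(parts[6]),
            p4=int(parts[7]),
            period=int(parts[8]),
        ))
    return records


def summarize_by_period(records):
    """Count events and sum time at risk (exit - entry) within each period."""
    summary = {}
    for r in records:
        s = summary.setdefault(r.period, {"events": 0, "person_time": 0.0})
        s["events"] += r.event
        s["person_time"] += r.exit - r.entry
    return summary
```

Under those assumptions, the records could be loaded with `parse_records(open("fulldata.dat"))`. Because each record contributes only the interval from entry to exit, a participant who is followed across two periods appears once in each period's total.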

The data file nonparametricFIG3.dat contains the complete dataset used in the nonparametric estimation of the log of the cumulative hazard function in each of the four therapeutic eras shown in Figure 3 of the paper. The data file contains data for three variables: exit (time with AIDS when death or censoring occurs during indicated period), the estimated value of the survival function and a period indicator variable. This file contains a record of data for each unique event time in each of the four periods.
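Because the file stores the estimated survival function rather than the cumulative hazard, the quantity plotted has to be derived through the identity H(t) = -log S(t). The Python sketch below shows that conversion. It is not the authors' plotting code, and the function name and the (exit, survival, period) tuple layout are assumptions made for this example.

```python
import math


def log_cumulative_hazard(records):
    """Convert (exit_time, survival, period) tuples to log cumulative hazard.

    Returns a list of (period, exit_time, log_H) tuples, where
    log_H = log(-log(S)). Records with S outside (0, 1) are skipped
    because the transformation is undefined there.
    """
    out = []
    for exit_time, survival, period in records:
        if not 0.0 < survival < 1.0:
            continue
        out.append((period, exit_time, math.log(-math.log(survival))))
    return out
```

For example, a survival estimate of exp(-1), about 0.368, corresponds to a cumulative hazard of 1 and therefore a log cumulative hazard of 0.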

The three files, RelTime2vs1FIG5.dat, RelTime3vs1FIG5.dat, and RelTime4vs1FIG5.dat contain the data that were used for the analyses shown in Figure 5 of the paper. Specifically, the file RelTime2vs1FIG5.dat contains data for five variables: period, cumulative percent deceased, estimated relative time of period two to period one, lower limit of 90% confidence band for estimated relative time, upper limit of 90% confidence band for estimated relative time. The files RelTime3vs1FIG5.dat and RelTime4vs1FIG5.dat contain the same type of data for period 3 and period 4, respectively.

The three files, RelHzd2vs1FIG6.dat, RelHzd3vs1FIG6.dat, and RelHzd4vs1FIG6.dat contain the data that were used for the analyses shown in Figure 6 of the paper. Specifically, the file RelHzd2vs1FIG6.dat contains data for five variables: period, years following incident AIDS diagnosis, estimated relative hazard of period two to period one, lower limit of 90% confidence band for estimated relative hazard, upper limit of 90% confidence band for estimated relative hazard. The files RelHzd3vs1FIG6.dat and RelHzd4vs1FIG6.dat contain the same type of data for period 3 and period 4, respectively.

---------------------------- Stata, SAS, and S-Plus programs -----------------------------------

The four Stata programs (as described in section 5.2 of the paper) will produce the first three parametric analyses shown in Table 2 of the paper. The program mleconventional(AFT)weibull.do will fit a conventional Weibull AFT model for the location parameter beta, the program mleconventional(AFT)gralgamma.do will fit a conventional generalized gamma AFT model for the location parameter beta, and the program mlesaturatedgralgamma.do will fit a generalized gamma distribution to each of the four periods separately. The program mlesaturatedgralgamma_onemodel.do will fit a generalized gamma distribution to all four periods using one model allowing the scale (sigma) and shape (lambda) parameters in each of the four periods to be different. Each of the four .log files contains the relevant output from running the corresponding Stata program.

1. mleconventional(AFT)weibull.log,
2. mleconventional(AFT)gralgamma.log,
3. mlesaturatedgralgamma.log,
4. mlesaturatedgralgamma_onemodel.log

The four SAS programs (as described in section 5.3 of the paper) will produce the four parametric analyses shown in Table 2 of the paper. The program mleconventional(AFT)weibull.sas will fit a conventional Weibull AFT model for the location parameter beta, the program mleconventional(AFT)gralgamma.sas will fit a conventional generalized gamma AFT model for the location parameter beta, the program mlesaturatedgralgamma.sas will fit a generalized gamma model to each of the four periods separately, and finally the program mlegralgammaforfinal.sas fits the best two-parameter model in each of the four periods. Each of the four .lst files contains the relevant output from running the corresponding SAS program.

1. mleconventional(AFT)weibull.lst,
2. mleconventional(AFT)gralgamma.lst,
3. mlesaturatedgralgamma.lst,
4. mlegralgammaforfinal.lst

The four S-Plus programs (the functions that are used in these programs are described briefly in section 5.4 of the paper) will produce the four parametric analyses shown in Table 2 of the paper. The program mleconventional(AFT)weibull.ssc will fit a conventional Weibull AFT model for the location parameter beta, the program mleconventional(AFT)gralgamma.ssc will fit a conventional generalized gamma AFT model for the location parameter beta, the program mlesaturatedgralgamma.ssc will fit a generalized gamma model to each of the four periods separately, and finally the program mlegralgammaforfinal.ssc fits the best two-parameter model in each of the four periods. Each of the four .txt files contains the relevant output from running the corresponding S-Plus program.

1. mleconventional(AFT)weibull.txt,
2. mleconventional(AFT)gralgamma.txt,
3. mlesaturatedgralgamma.txt,
4. mlegralgammaforfinal.txt


Below the heading "S-Plus functions", we have included four functions that will return values of the quantile function (qggamma.ssc), the hazard function (hggamma.ssc), the density function (dggamma.ssc), and the survival function (sggamma.ssc) for the generalized gamma distribution. Values for beta (location parameter), sigma (scale parameter), and lambda (shape parameter) must be supplied as arguments to each of the four functions. A cumulative probability must also be supplied as the last argument for the quantile function, and a value of time must be supplied as the last argument for the hazard, density, and survival functions.
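For readers without S-Plus, the four functions can be approximated in Python using only the standard library. The sketch below is not a translation of the .ssc files. It assumes the parameterization in which log T = beta + sigma W, the survival function is expressed through the regularized incomplete gamma function with shape 1/lambda^2, and lambda = 0 gives the log-normal distribution. The function names reuse the file names for readability; the helper `_reg_gamma`, the bisection used for the quantile, and its bracket are assumptions of this example.

```python
import math


def _reg_gamma(a, x, eps=1e-14, max_iter=10000):
    """Return (P, Q): regularized lower and upper incomplete gamma at (a, x)."""
    if x <= 0.0:
        return 0.0, 1.0
    # Common prefactor x^a e^{-x} / Gamma(a), computed on the log scale.
    front = math.exp(a * math.log(x) - x - math.lgamma(a))
    if x < a + 1.0:
        # Power series for P(a, x); converges quickly when x < a + 1.
        term = 1.0 / a
        total = term
        for n in range(1, max_iter):
            term *= x / (a + n)
            total += term
            if term < eps * total:
                break
        p = min(1.0, front * total)
        return p, 1.0 - p
    # Continued fraction (modified Lentz method) for Q(a, x) when x >= a + 1.
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    q = min(1.0, front * h)
    return 1.0 - q, q


def sggamma(beta, sigma, lam, t):
    """Survival function S(t) under the assumed parameterization."""
    z = (math.log(t) - beta) / sigma
    if lam == 0:
        return 0.5 * math.erfc(z / math.sqrt(2.0))  # log-normal case
    a = 1.0 / (lam * lam)
    p, q = _reg_gamma(a, a * math.exp(lam * z))
    return q if lam > 0 else p


def dggamma(beta, sigma, lam, t):
    """Density function f(t), evaluated on the log scale for stability."""
    z = (math.log(t) - beta) / sigma
    if lam == 0:
        return math.exp(-0.5 * z * z) / (sigma * t * math.sqrt(2.0 * math.pi))
    a = 1.0 / (lam * lam)
    log_f = (math.log(abs(lam)) - math.log(sigma * t) - math.lgamma(a)
             + a * (math.log(a) + lam * z) - a * math.exp(lam * z))
    return math.exp(log_f)


def hggamma(beta, sigma, lam, t):
    """Hazard function h(t) = f(t) / S(t)."""
    return dggamma(beta, sigma, lam, t) / sggamma(beta, sigma, lam, t)


def qggamma(beta, sigma, lam, p):
    """Quantile t(p) by bisection on z = (log t - beta) / sigma.

    The bracket [-50, 50] for z is an assumption of this sketch.
    """
    lo, hi = -50.0, 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        t = math.exp(beta + sigma * mid)
        if 1.0 - sggamma(beta, sigma, lam, t) < p:
            lo = mid
        else:
            hi = mid
    return math.exp(beta + sigma * 0.5 * (lo + hi))
```

As a sanity check on the assumed parameterization, setting lambda = sigma = 1 gives an exponential distribution with mean exp(beta), so `hggamma(0.0, 1.0, 1.0, t)` should be close to 1 for any t > 0 and `qggamma(0.0, 1.0, 1.0, 0.5)` should be close to log 2. The survival function is computed as 1 minus a probability in part of its range, so far-tail values lose precision; a production implementation would use a tested library routine for the incomplete gamma function.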

-------- Expanded Version of Statistical Software section of paper (section 5)--------------

expandedsection5.pdf

GEM2014.03.11.AlvaroMunoz.pdf

--------------------------------- Figures presented in paper ---------------------------------------

The six S-Plus programs below will produce the six figures presented in the paper.

1. GGfig1.ssc,
2. GGfig2.ssc,
3. GGfig3.ssc,
4. GGfig4.ssc,
5. GGfig5.ssc,
6. GGfig6.ssc