khan2001.Rd
Gene expression data (2308 genes for 88 samples) from the microarray study of Khan et al. (2001).
data(khan2001)
This data set contains measurements of the gene expression of 2308 genes for 88 observations: 29 cases of Ewing sarcoma (EWS), 11 cases of Burkitt lymphoma (BL), 18 cases of neuroblastoma (NB), 25 cases of rhabdomyosarcoma (RMS), and 5 other (non-SRBCT) samples.
khan2001$x is an 88 x 2308 matrix containing the expression levels. Note that
rows correspond to samples and columns to genes. The row names are the original
image IDs, and the column names are the original probe labels.
khan2001$y is a factor containing the diagnosis for each sample ("BL", "EWS", "NB", "non-SRBCT", "RMS").
khan2001$descr provides some annotation for each gene.
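The layout described above can be checked programmatically. The following is a minimal sketch that assumes only the structure stated here: x is a samples-by-genes matrix, y is a factor with one entry per sample, and descr is a vector with one entry per gene. check_layout is a hypothetical helper, not part of the sda package, and the toy object is an invented stand-in for khan2001 so that the snippet runs without the package installed.

```r
# Hypothetical helper (not part of the sda package): check that a list
# follows the x / y / descr layout described above.
check_layout <- function(d) {
  stopifnot(is.matrix(d$x), is.factor(d$y))
  c(samples_match = nrow(d$x) == length(d$y),      # one label per row
    genes_match   = ncol(d$x) == length(d$descr))  # one annotation per column
}

# Invented stand-in with the same structure as khan2001 (3 samples, 4 genes).
toy <- list(
  x = matrix(rnorm(12), nrow = 3, ncol = 4,
             dimnames = list(paste0("img", 1:3), paste0("gene", 1:4))),
  y = factor(c("EWS", "BL", "EWS")),
  descr = paste("annotation", 1:4)
)

check_layout(toy)
table(toy$y)  # number of samples per diagnosis
```

With the package loaded, the same calls would be check_layout(khan2001) and table(khan2001$y), assuming descr is indeed stored as a plain vector.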
The data are described in Khan et al. (2001). Note that the values in
khan2001$x are log-transformed (natural logarithm) for normalization.
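Because the stored values are natural logarithms, they can be mapped back to the original scale with exp(), or converted to base-2 logarithms by dividing by log(2). The following is a minimal sketch on a small stand-in matrix. The raw values are invented purely for illustration; with the package loaded, khan2001$x would take the place of x_log.

```r
# Invented raw intensities, used only to illustrate the transformations.
raw   <- matrix(c(1, 2, 4, 8, 16, 32), nrow = 2)
x_log <- log(raw)          # natural log, as stated for the expression matrix

x_raw  <- exp(x_log)       # back to the original scale
x_log2 <- x_log / log(2)   # change of base: log2(v) = log(v) / log(2)

all.equal(x_raw, raw)
all.equal(x_log2, log2(raw))
```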
Khan et al. 2001. Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nature Medicine 7:673–679.
# load sda library
library("sda")
# load full Khan et al (2001) data set
data(khan2001)
dim(khan2001$x) # 88 2308
#> [1] 88 2308
hist(khan2001$x)
khan2001$y # 5 levels
#> [1] EWS EWS EWS EWS EWS EWS EWS
#> [8] EWS EWS EWS EWS EWS EWS EWS
#> [15] EWS EWS EWS EWS EWS EWS EWS
#> [22] EWS EWS BL BL BL BL BL
#> [29] BL BL BL NB NB NB NB
#> [36] NB NB NB NB NB NB NB
#> [43] NB RMS RMS RMS RMS RMS RMS
#> [50] RMS RMS RMS RMS RMS RMS RMS
#> [57] RMS RMS RMS RMS RMS RMS RMS
#> [64] non-SRBCT non-SRBCT non-SRBCT NB RMS non-SRBCT non-SRBCT
#> [71] NB EWS RMS BL EWS RMS EWS
#> [78] EWS EWS RMS BL RMS NB NB
#> [85] NB NB BL EWS
#> Levels: BL EWS NB non-SRBCT RMS
# data set containing the SRBCT samples
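get.srbct = function()
{
  data(khan2001)
  # indices of the samples that are not SRBCT cases
  idx = which( khan2001$y == "non-SRBCT" )
  x = khan2001$x[-idx,]
  # re-create the factor to drop the unused "non-SRBCT" level
  y = factor(khan2001$y[-idx])
  # descr annotates genes (columns), so it is not subset by sample
  descr = khan2001$descr
  list(x=x, y=y, descr=descr)
}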
srbct = get.srbct()
dim(srbct$x) # 83 2308
#> [1] 83 2308
hist(srbct$x)
srbct$y # 4 levels
#> [1] EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS
#> [20] EWS EWS EWS EWS BL BL BL BL BL BL BL BL NB NB NB NB NB NB NB
#> [39] NB NB NB NB NB RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS
#> [58] RMS RMS RMS RMS RMS RMS NB RMS NB EWS RMS BL EWS RMS EWS EWS EWS RMS BL
#> [77] RMS NB NB NB NB BL EWS
#> Levels: BL EWS NB RMS