anova - How to change an order of samples in Tukey's test in R? -

January 15, 2013

problem: learn how can change order of samples tukey's test in r calculates means , assign corresponding letters. simple example below.

i played iris data , have found there differences in sepal.length among different species. here boxplot:

enter image description here

i conducted anova test , have found differences statistically significant.

> fit <- lm(sepal.length ~ species, data = iris) > summary(aov(fit))               df sum sq mean sq f value pr(>f)     species       2  63.21  31.606   119.3 <2e-16 *** residuals   147  38.96   0.265                    --- signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

then conducted tukey's test , got following:

> library(agricolae) > hsd.test(fit, "species", group=t, console=t)  study: fit ~ "species"  hsd test sepal.length   mean square error:  0.2650082   species,  means             sepal.length       std  r min max setosa            5.006 0.3524897 50 4.3 5.8 versicolor        5.936 0.5161711 50 4.9 7.0 virginica         6.588 0.6358796 50 4.9 7.9  alpha: 0.05 ; df error: 147  critical value of studentized range: 3.348424   significant difference: 0.2437727   means same letter not different.  groups, treatments , means    virginica       6.588  b    versicolor      5.936  c    setosa          5.006

according table of groups, hsd.test function sorts means descending order , assign letters. thus, "virginica" have largest mean, therefore first in table.

questions: there way change default sorting , assigning of letters? can sort samples in ascending order of means , assign letters. expected output following:

a setosa     5.006 b versicolor 5.936 c virginica  6.588

possible solution: in package multcomp there 2 functions can working together:

1 - glht tukey's test

> <- aov(fit) > library(multcomp) > glht(an, linfct = mcp(species = "tukey"))           general linear hypotheses      multiple comparisons of means: tukey contrasts       linear hypotheses:                                 estimate     versicolor - setosa == 0       0.930     virginica - setosa == 0        1.582     virginica - versicolor == 0    0.652

2 - cld can provide me letters assigned species according levels of factor iris$species

> cld(glht(an, linfct = mcp(species = "tukey")))     setosa versicolor  virginica         "a"        "b"        "c"

unfortunately, glht function not show data can useful , needed creating of barplot (means, std, p-values). of course, can separately special functions, or use both hsd.test , cld. prefer solve problem sorting of means in hsd.test function , use one.

i notice it's bit late answer question. ran exact same problem , share solution future reference. hope helps someday someone.

first option

one use multcompletters() example results tukeyhsd(). however, doesn't allow arbitrary ordering of result , not easy use.

second option

since needed arbitrary order wrote own function, takes vector of letters returned hsd.test , swaps letters in way result nice. meaning letters first in alphabet appearing first.

library(agricolae) reorder<-function(inv){   collapsed <- paste(inv,sep="",collapse = "")   u <- unique(strsplit(collapsed,"")[[1]])   if(length(u)<2){     return(inv)   }   u <- u[order(u)]   m <- matrix(nrow=nrow(inv),ncol=length(u))   m[]<-f   for(i in 1:length(inv)){     s <- strsplit(inv[i],"")[[1]]     index <- match(s,u)     m[i,index] <- t   }   for(i in 1:(length(u)-1)){     firstcolt <- match(t,m[,i])[1] #first row true in current column     firstt <- match(t,rowsums(m[,i:length(u)] > 0))[1] #first row true in rest     if(firstt < firstcolt){       colt <- match(t,m[firstt,i:length(u)])[1]       colt <- colt + - 1 #correct index leftout columns in match       tmp <- m[,colt]       m[,colt] <- m[,i]       m[,i] <- tmp     }   }   res <- vector(mode = "character", length=length(trt))   for(i in 1:length(inv)){     l <- u[m[i,]]     res[i] <- paste(l,sep="",collapse = "")   }   return(res) }  fit <- lm(sepal.length ~ species, data = iris) <- hsd.test(fit, "species", group=t, console=f)$groups <- a[rev(rownames(a)),] #order result way want a$m <- reorder(as.character(a$m))

for example bit of overkill should work more complicated cases.

Search This Blog

UV code