anova - How to change an order of samples in Tukey's test in R? -
problem: learn how can change order of samples tukey's test in r calculates means , assign corresponding letters. simple example below.
i played iris data , have found there differences in sepal.length among different species. here boxplot:

i conducted anova test , have found differences statistically significant.
> fit <- lm(sepal.length ~ species, data = iris) > summary(aov(fit)) df sum sq mean sq f value pr(>f) species 2 63.21 31.606 119.3 <2e-16 *** residuals 147 38.96 0.265 --- signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 then conducted tukey's test , got following:
> library(agricolae) > hsd.test(fit, "species", group=t, console=t) study: fit ~ "species" hsd test sepal.length mean square error: 0.2650082 species, means sepal.length std r min max setosa 5.006 0.3524897 50 4.3 5.8 versicolor 5.936 0.5161711 50 4.9 7.0 virginica 6.588 0.6358796 50 4.9 7.9 alpha: 0.05 ; df error: 147 critical value of studentized range: 3.348424 significant difference: 0.2437727 means same letter not different. groups, treatments , means virginica 6.588 b versicolor 5.936 c setosa 5.006 according table of groups, hsd.test function sorts means descending order , assign letters. thus, "virginica" have largest mean, therefore first in table.
questions: there way change default sorting , assigning of letters? can sort samples in ascending order of means , assign letters. expected output following:
a setosa 5.006 b versicolor 5.936 c virginica 6.588 possible solution: in package multcomp there 2 functions can working together:
1 - glht tukey's test
> <- aov(fit) > library(multcomp) > glht(an, linfct = mcp(species = "tukey")) general linear hypotheses multiple comparisons of means: tukey contrasts linear hypotheses: estimate versicolor - setosa == 0 0.930 virginica - setosa == 0 1.582 virginica - versicolor == 0 0.652 2 - cld can provide me letters assigned species according levels of factor iris$species
> cld(glht(an, linfct = mcp(species = "tukey"))) setosa versicolor virginica "a" "b" "c" unfortunately, glht function not show data can useful , needed creating of barplot (means, std, p-values). of course, can separately special functions, or use both hsd.test , cld. prefer solve problem sorting of means in hsd.test function , use one.
i notice it's bit late answer question. ran exact same problem , share solution future reference. hope helps someday someone.
first option
one use multcompletters() example results tukeyhsd(). however, doesn't allow arbitrary ordering of result , not easy use.
second option
since needed arbitrary order wrote own function, takes vector of letters returned hsd.test , swaps letters in way result nice. meaning letters first in alphabet appearing first.
library(agricolae) reorder<-function(inv){ collapsed <- paste(inv,sep="",collapse = "") u <- unique(strsplit(collapsed,"")[[1]]) if(length(u)<2){ return(inv) } u <- u[order(u)] m <- matrix(nrow=nrow(inv),ncol=length(u)) m[]<-f for(i in 1:length(inv)){ s <- strsplit(inv[i],"")[[1]] index <- match(s,u) m[i,index] <- t } for(i in 1:(length(u)-1)){ firstcolt <- match(t,m[,i])[1] #first row true in current column firstt <- match(t,rowsums(m[,i:length(u)] > 0))[1] #first row true in rest if(firstt < firstcolt){ colt <- match(t,m[firstt,i:length(u)])[1] colt <- colt + - 1 #correct index leftout columns in match tmp <- m[,colt] m[,colt] <- m[,i] m[,i] <- tmp } } res <- vector(mode = "character", length=length(trt)) for(i in 1:length(inv)){ l <- u[m[i,]] res[i] <- paste(l,sep="",collapse = "") } return(res) } fit <- lm(sepal.length ~ species, data = iris) <- hsd.test(fit, "species", group=t, console=f)$groups <- a[rev(rownames(a)),] #order result way want a$m <- reorder(as.character(a$m)) for example bit of overkill should work more complicated cases.
Comments
Post a Comment