R - Logical subscript too long
I realize this question has been asked before. Having looked at those answers and their specific questions, I could not find an answer for my situation.
I entered the following into R, and it worked for the first example but not for the second, and I cannot understand why.
Setting up the data for the glm:
    setwd("p:/stat319")
    ucb2 <- read.table('berkeley.poissontwo.txt', header=TRUE)
    attach(ucb2)
ucb2 is the following:
    count admit department gender
      313 FALSE          a female
      512  TRUE          a female
       19 FALSE          a   male
       89  TRUE          a   male
      207 FALSE          b female
      353  TRUE          b female
        8 FALSE          b   male
       17  TRUE          b   male
      205 FALSE          c female
      120  TRUE          c female
      391 FALSE          c   male
      202  TRUE          c   male
      279 FALSE          d female
      138  TRUE          d female
      244 FALSE          d   male
      131  TRUE          d   male
      138 FALSE          e female
       53  TRUE          e female
      299 FALSE          e   male
       94  TRUE          e   male
      351 FALSE          f female
       22  TRUE          f female
      317 FALSE          f   male
       24  TRUE          f   male
Using a factor variable, with true and false standing for admit and notadmit:
    admit <- c(0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1)
    fadmit <- factor(admit)
    radmit <- factor(admit, labels=c("false","true"))
    glm2 <- glm(count~admit+department+gender, family=poisson)
    glm2
Preparing the way for leave-one-out cross-validation:
    library(car)
    vif(glm2)
    #            GVIF Df GVIF^(1/(2*Df))
    # admit         1  1               1
    # department    1  5               1
    # gender        1  1               1

    step(glm2)
    # Start:  AIC=2272.73
    # count ~ admit + department + gender
    #
    #              Df Deviance    AIC
    # <none>            2097.7 2272.7
    # - department  5   2257.2 2422.2
    # - gender      1   2260.6 2433.6
    # - admit       1   2327.7 2500.8
    #
    # Call:  glm(formula = count ~ admit + department + gender, family = poisson)
    #
    # Coefficients:
    # (Intercept)        admit  departmentb  departmentc
    #     5.82785     -0.45674     -0.46679     -0.01621
    # departmentd  departmente  departmentf   gendermale
    #    -0.16384     -0.46850     -0.26752     -0.38287
    #
    # Degrees of Freedom: 23 Total (i.e. Null);  16 Residual
    # Null Deviance:      2650
    # Residual Deviance: 2098     AIC: 2273

    library(ipred)
    errorest(count~admit+department+gender, data=ucb2, model=glm,
             est.para=control.errorest(k=24))
    # Call:
    # errorest.data.frame(formula = count ~ admit + department + gender,
    #     data = ucb2, model = glm, est.para = control.errorest(k =
    #     24))
    #
    #      24-fold cross-validation estimator of root mean squared error
    #
    # Root mean squared error:  180.5741
So the first one worked with the data as shown. For the same study, I had to rearrange the data and perform a logistic regression:
    ucb1 <- read.table('monday.late.txt', header=TRUE)
    attach(ucb1)
    # The following object is masked _by_ .GlobalEnv:
    #
    #     admit
    #
    # The following objects are masked from ucb2:
    #
    #     admit, department, gender

    y <- cbind(ucb1[,1], ucb1[,2])
    glm1 <- glm(y~gender+department, family=binomial)
The data is as follows:
    admit notadmit gender department
      512      313 female          a
      353      207 female          b
      120      205 female          c
      138      279 female          d
       53      138 female          e
       22      351 female          f
       89       19   male          a
       17        8   male          b
      202      391   male          c
      131      244   male          d
       94      299   male          e
       24      317   male          f
Setting up the new data for leave-one-out:
    vif(glm1)
    #                GVIF Df GVIF^(1/(2*Df))
    # gender     1.384903  1        1.176819
    # department 1.384903  5        1.033099

    step(glm1)
    # Start:  AIC=103.14
    # y ~ gender + department
    #
    #              Df Deviance    AIC
    # - gender      1    21.74 102.68
    # <none>             20.20 103.14
    # - department  5   783.61 856.55
    #
    # Step:  AIC=102.68
    # y ~ department
    #
    #              Df Deviance    AIC
    # <none>             21.74 102.68
    # - department  5   877.06 948.00
    #
    # Call:  glm(formula = y ~ department, family = binomial)
    #
    # Coefficients:
    # (Intercept)  departmentb  departmentc  departmentd
    #     0.59346     -0.05059     -1.20915     -1.25833
    # departmente  departmentf
    #    -1.68296     -3.26911
    #
    # Degrees of Freedom: 11 Total (i.e. Null);  6 Residual
    # Null Deviance:      877.1
    # Residual Deviance: 21.74     AIC: 102.7
So far, so good, but then the problem arises:
    errorest(y~gender+department, data=ucb1, model=glm,
             est.para=control.errorest(k=12))
    Error in xj[i, , drop = FALSE] : (subscript) logical subscript too long
So why does this happen? I tried other values of k, since I am not sure what value k is meant to take. I assume it is meant to be the number of rows.
I then tried the same data, arranged in a different way:
    ucb1a <- read.table('berkeley.rearranged.txt', header=TRUE)
    attach(ucb1a)
    ucb1a
This is a rearrangement of the data from before:
       admitted not_admit depart genders
    1       512       313      a  female
    2        89        19      a    male
    3       353       207      b  female
    4        17         8      b    male
    5       120       205      c  female
    6       202       391      c    male
    7       138       279      d  female
    8       131       244      d    male
    9        53       138      e  female
    10       94       299      e    male
    11       22       351      f  female
    12       24       317      f    male
and
    y <- cbind(ucb1[,1], ucb1[,2])
    glm1a <- glm(y~genders+depart, family=binomial)
    vif(glm1a)
    #                GVIF Df GVIF^(1/(2*Df))
    # gender     1.384903  1        1.176819
    # department 1.384903  5        1.033099

    step(glm1a)
    # Start:  AIC=103.14
    # y ~ gender + department
    #
    #              Df Deviance    AIC
    # - gender      1    21.74 102.68
    # <none>             20.20 103.14
    # - department  5   783.61 856.55
    #
    # Step:  AIC=102.68
    # y ~ department
    #
    #              Df Deviance    AIC
    # <none>             21.74 102.68
    # - department  5   877.06 948.00
    #
    # Call:  glm(formula = y ~ department, family = binomial)
    #
    # Coefficients:
    # (Intercept)  departmentb  departmentc  departmentd
    #     0.59346     -0.05059     -1.20915     -1.25833
    # departmente  departmentf
    #    -1.68296     -3.26911
    #
    # Degrees of Freedom: 11 Total (i.e. Null);  6 Residual
    # Null Deviance:      877.1
    # Residual Deviance: 21.74     AIC: 102.7
Again, so far so good, but once more, this occurs:
    errorest(y~gender+department, data=ucb1a, model=glm,
             est.para=control.errorest(k=12))
    Error in xj[i, , drop = FALSE] : (subscript) logical subscript too long
And believe me, I tried other numbers for k again, and I cannot understand why this one is going wrong. If anyone has any ideas about this specific example of "(subscript) logical subscript too long", please reply.
This issue can arise when objects are of different sizes. I think the problem comes from attach(), but I'm not certain. Try your code without it, or try with(). You should also check why you have to use attach() in the first place, as nicola pointed out. Also, I'm not sure what you are trying to achieve with it.
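To illustrate the "different sizes" point, here is a minimal base R sketch (not the asker's actual call) showing that a matrix indexed by a logical vector longer than its number of rows raises the same message. The small matrix and the `keep` vector are made up for illustration only.

```r
# Hypothetical 3-row response matrix, in the spirit of cbind(admit, notadmit)
y <- cbind(admit = c(512, 353, 120), notadmit = c(313, 207, 205))

# A logical selector with one element too many (length 4 for 3 rows)
keep <- c(TRUE, FALSE, TRUE, TRUE)

# Capture the error message instead of stopping the script
msg <- tryCatch(
  {
    y[keep, , drop = FALSE]
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# With a selector of matching length, the subset works
ok <- y[keep[seq_len(nrow(y))], , drop = FALSE]
print(ok)
```

If something similar is happening inside errorest(), a plausible culprit is an object such as `y` that lives outside the data frame passed as `data`, so its size does not follow the resampled rows. That is an assumption, not something verified against ipred here.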
You can see the following in the "Good practice" section of the documentation for the function:
attach has the side effect of altering the search path and this can easily lead to the wrong object of a particular name being found. People do often forget to detach databases.
In interactive use, with is usually preferable to the use of attach/detach, unless what is a save()-produced file in which case attach() is a (safety) wrapper for load().
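As a sketch of that advice, the logistic model from the question can be fitted without attach() by keeping the response inside the data frame and passing `data=`. The data frame below is retyped from the second table in the question (with department "a" assumed for the first row of each gender), and the hand-written leave-one-out loop is only one assumed reading of what the cross-validation was meant to do. It does not reproduce errorest().

```r
# Data retyped from the question's second table (department "a" is assumed)
ucb1 <- data.frame(
  admit      = c(512, 353, 120, 138, 53, 22, 89, 17, 202, 131, 94, 24),
  notadmit   = c(313, 207, 205, 279, 138, 351, 19, 8, 391, 244, 299, 317),
  gender     = factor(rep(c("female", "male"), each = 6)),
  department = factor(rep(c("a", "b", "c", "d", "e", "f"), times = 2))
)

# The response is built from columns of ucb1, so nothing depends on
# attach() or on a separate y object in the global environment
fit <- glm(cbind(admit, notadmit) ~ gender + department,
           family = binomial, data = ucb1)

# with() evaluates an expression inside the data frame, again without attach()
total_admitted <- with(ucb1, sum(admit))

# Hand-written leave-one-out loop: refit without row i, then predict row i
loo_pred <- vapply(seq_len(nrow(ucb1)), function(i) {
  fit_i <- glm(cbind(admit, notadmit) ~ gender + department,
               family = binomial, data = ucb1[-i, ])
  unname(predict(fit_i, newdata = ucb1[i, ], type = "response"))
}, numeric(1))

# Compare predicted admission probabilities with observed proportions
observed <- with(ucb1, admit / (admit + notadmit))
loo_rmse <- sqrt(mean((observed - loo_pred)^2))
print(loo_rmse)
```

Because every variable in the formula is a column of `ucb1`, subsetting the rows subsets the response as well, which is the property a global `y` matrix lacks.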