r - How to select specific columns containing certain strings/characters?
I have a data frame:
df1 <- data.frame(a = c("correct", "wrong", "wrong", "correct"), b = c(1, 2, 3, 4), c = c("wrong", "wrong", "wrong", "wrong"), d = c(2, 2, 3, 4))

        a b     c d
1 correct 1 wrong 2
2   wrong 2 wrong 2
3   wrong 3 wrong 3
4 correct 4 wrong 4
I want to select only the columns that contain either of the strings 'correct' or 'wrong' (i.e., columns a and c in df1), to get a data frame like this:
df2 <- data.frame(a = c("correct", "wrong", "wrong", "correct"), c = c("wrong", "wrong", "wrong", "wrong"))

        a     c
1 correct wrong
2   wrong wrong
3   wrong wrong
4 correct wrong
Can I use dplyr to do this? If not, what function(s) can I use? The example I've given is straightforward, in that I can just do this (dplyr):
select(df1, a, c)
However, in my actual data frame I have 700 variables/columns, a few hundred of which contain the strings 'correct' or 'wrong', and I don't know the variable/column names.
Any suggestions on how to do this quickly? Thanks a lot!
You can use base R's Filter, which operates on each of df1's columns and keeps the ones satisfying the logical test in the function:
Filter(function(u) any(c('wrong','correct') %in% u), df1)
#         a     c
#1 correct wrong
#2   wrong wrong
#3   wrong wrong
#4 correct wrong
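To make the per-column test explicit, here is a minimal sketch of the same idea written out with sapply. It is an illustration, not a different method: the names keep and res are made up for this example, and the test function is the one from the answer.

```r
# Same example data as in the question
df1 <- data.frame(a = c("correct", "wrong", "wrong", "correct"),
                  b = c(1, 2, 3, 4),
                  c = c("wrong", "wrong", "wrong", "wrong"),
                  d = c(2, 2, 3, 4))

# One logical value per column: TRUE if the column holds
# 'wrong' or 'correct' as a whole value somewhere
keep <- sapply(df1, function(u) any(c("wrong", "correct") %in% u))
keep  # a = TRUE, b = FALSE, c = TRUE, d = FALSE

# Keep only the flagged columns; drop = FALSE guarantees a data frame
# even when a single column matches
res <- df1[, keep, drop = FALSE]
names(res)
```

This never refers to a column by name, so it should carry over to a data frame with hundreds of unknown columns.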
You can also use grepl:
Filter(function(u) any(grepl('wrong|correct',u)), df1)
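Since the question asks about dplyr, here is a hedged sketch of how the same column test could be passed to select() through the where() selection helper. It assumes dplyr 1.0.0 or later, and the name res_dplyr is only illustrative.

```r
library(dplyr)

# Same example data as in the question
df1 <- data.frame(a = c("correct", "wrong", "wrong", "correct"),
                  b = c(1, 2, 3, 4),
                  c = c("wrong", "wrong", "wrong", "wrong"),
                  d = c(2, 2, 3, 4))

# where() applies the predicate to every column and selects
# those for which it returns TRUE
res_dplyr <- select(df1, where(function(u) any(u %in% c("wrong", "correct"))))
names(res_dplyr)
```

Note that %in% compares whole values, while the pattern 'wrong|correct' in grepl also matches substrings, so a column containing a value like "incorrect" would be kept by the grepl version but not by the %in% version.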