R - Selecting specific rows from a large dataset using column values

April 15, 2014

I have a large data set (about 2000 rows, 38 columns) that looks like this (there is missing data in some columns):

        species crab  cmass gill gmass    treatment months  avglw avgils
    222      cm   65 34.273    p 0.198 newtons cove      0 68.108 93.181
    223      cm   57 33.506    p 0.166 newtons cove      0 37.908 39.683
    225      cm   65 34.273    p 0.198 newtons cove      0 68.108 93.181
    231      cm   62 30.852    p 0.147 newtons cove      0 37.285 89.823
    239      cm   65 34.273    p 0.198 newtons cove      0 68.108 93.181
    240      cm   57 33.506    p 0.166 newtons cove      0 37.908 39.683
    241      cm   62 30.852    p 0.147 newtons cove      0 37.285 89.823
    242      cm   63 22.456    p 0.093 newtons cove      0 70.005 67.687
    243      cm   59 22.422    p 0.113 newtons cove      0 21.834 39.481

There are multiple rows for each crab number, and I would like to be able to either average the rows for each crab number, or select the first unique row for each crab number and exclude the subsequent rows.

For example: average rows 222, 225, and 239 for crab '65'; or: select row 222 and exclude 225 and 239 because that crab has already been selected.

I have tried using unique() and sqldf(), but neither has worked for me.

Any advice is appreciated. Thanks!

For the average, you might want to try putting your data in a data.table and then applying a function:

    library(data.table)
    mydata <- data.table(mydata)
    mydata[, lapply(.SD, mean), .SDcols = c("cmass", "gmass"), by = "crab"]

This assumes you want to obtain the average of cmass and gmass.
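As a small illustration of the grouped mean, here is a sketch on a few rows copied from the question. The name `crabs` and the `na.rm = TRUE` argument are assumptions and not part of the original answer; `na.rm = TRUE` is included only because the question mentions missing data in some columns.

```r
library(data.table)

# A few rows from the question; `crabs` is a hypothetical name for the data
crabs <- data.table(
  crab  = c(65, 57, 65, 62, 65),
  cmass = c(34.273, 33.506, 34.273, 30.852, 34.273),
  gmass = c(0.198, 0.166, 0.198, 0.147, 0.198)
)

# Mean of cmass and gmass for each crab number, skipping NA values
crab_means <- crabs[, lapply(.SD, mean, na.rm = TRUE),
                    .SDcols = c("cmass", "gmass"), by = "crab"]
print(crab_means)
```

The result has one row per crab number, so the three rows for crab 65 collapse into a single row of averages.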

For the other part of your question, I'm not sure. Try setting a key on the column you are interested in and then call unique:

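    setkey(mydata, crab)
    # by = "crab" compares rows on crab only; data.table 1.9.8 and later compare all columns by default
    unique(mydata, by = "crab")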

This will sort by crab, and unique will remove rows with duplicate values of crab. Is that what you want?
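If you only need the first row for each crab number, a base R sketch with duplicated() is another option. This is an alternative to the keyed unique() call, not the method from the answer, and the data frame `crabs` is a hypothetical stand-in built from rows in the question.

```r
# A few rows from the question, with the original row names kept
crabs <- data.frame(
  crab  = c(65, 57, 65, 62, 65),
  cmass = c(34.273, 33.506, 34.273, 30.852, 34.273),
  row.names = c("222", "223", "225", "231", "239")
)

# duplicated() flags every repeat of a crab number after its first occurrence,
# so negating it keeps only the first row per crab
first_rows <- crabs[!duplicated(crabs$crab), ]
print(first_rows)
```

Unlike the keyed approach, this keeps the rows in their original order and does not sort by crab.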
