I have a large data set and need the row numbers of the rows that fulfill certain conditions. I am using the data.table package.
    library(data.table)

    # Two timestamps one month apart, expanded into 1e6 datetime values
    days <- strptime(c("2013-01-01 8:00:00", "2013-02-01 8:00:00"), format="%Y-%m-%d %H:%M:%S")
    datetime <- rep(seq(days[1], days[2], length.out=1e6/5), 5)
    update <- rep(letters[3:1], length.out=1e6)
    group <- rep(c("aaa", "bbb", "ccc"), length.out=1e6)
    weight <- trunc(rnorm(1e6, 110, 3))
    weight2 <- rnorm(1e6, 100, 1.5)
    dt <- data.table(datetime, update, group, weight, weight2)
    setkey(dt, datetime, update, group, weight, weight2)
    # Reference row whose values are searched for below
    exp <- dt[1e6/2]
I cannot create a data.table subset without the datetime column, since that column is used in the key. Creating a new key on the subset would change the row order, and I need to be certain that the original order is preserved.
It is possible to get the row numbers I need using either of the following two commands.
    system.time(dt[, which(dt$update==exp$update & dt$group==exp$group & dt$weight==exp$weight & dt$weight2==exp$weight2)])
    system.time(which(dt$update==exp$update & dt$group==exp$group & dt$weight==exp$weight & dt$weight2==exp$weight2))
However, I need a faster way to do that.
Thank you for any suggestions.
It is possible to get the row numbers in the following way.
    which(is.na(dt[list(dt$datetime, dt$update, dt$group, dt$weight, exp$weight2), which=TRUE]) == FALSE)
However, this is about 4 times slower than the vector search examples in the question.
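Another option that may be worth timing is an ad hoc join with the `on=` argument, which matches on the four non-datetime columns without touching the existing key or the row order. The following is only a minimal sketch on a small made-up table (`small` and `target` are hypothetical names, not from the question), and it assumes a data.table version that supports `on=`. Whether it beats the plain vector scan on the 1e6-row table would have to be measured.

```r
library(data.table)

# Small hypothetical table, keyed the same way as dt in the question
small <- data.table(
  datetime = as.POSIXct("2013-01-01 08:00:00", tz = "UTC") + 0:5,
  update   = rep(c("c", "b", "a"), 2),
  group    = rep(c("aaa", "bbb", "ccc"), 2),
  weight   = c(110, 111, 112, 110, 111, 112),
  weight2  = c(100.5, 99.5, 101, 100.5, 98, 101)
)
setkey(small, datetime, update, group, weight, weight2)

# Reference row, playing the role of exp in the question
target <- small[4]

# Ad hoc join on the four non-datetime columns; which = TRUE returns the
# matching row numbers of `small` instead of the joined rows, and the key
# and physical row order of `small` are left as they are.
rows_join <- small[target, on = c("update", "group", "weight", "weight2"), which = TRUE]

# Same search written as the vector scan from the question, for comparison
rows_scan <- small[, which(update == target$update & group == target$group &
                           weight == target$weight & weight2 == target$weight2)]
```

On the full table the two approaches can be compared by wrapping each call in system.time(), as done in the question.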