I have a data frame with the following structure:
    id <- c(1,2,3,4,4,5,6,7,7,8)
    var1 <- c(1,2,1,2,4,1,2,3,5,4)
    var2 <- c(6,8,7,4,7,8,9,5,7,5)
    df <- data.frame(cbind(id,var1,var2))
The data frame now looks like this:
       id var1 var2
    1   1    1    6
    2   2    2    8
    3   3    1    7
    4   4    2    4
    5   4    4    7
    6   5    1    8
    7   6    2    9
    8   7    3    5
    9   7    5    7
    10  8    4    5
I want to replace the var2 value of the first duplicated id with the var2 value of the second duplicated id (see rows 4:5 and 8:9), and then delete the entire row of the second duplicated id. The final df should look like this:
      id var1 var2
    1  1    1    6
    2  2    2    8
    3  3    1    7
    4  4    2    7
    5  5    1    8
    6  6    2    9
    7  7    3    7
    8  8    4    5
This should work (note that the OP was not specific about more than 2 duplicates, so this takes the first var1 and the last var2 for each id):
    library(data.table)
    dt = data.table(df)

    # per id: keep the first var1 and the last var2 (.N is the group size)
    dt[, list(var1 = var1[1], var2 = var2[.N]), by = id]
    #   id var1 var2
    #1:  1    1    6
    #2:  2    2    8
    #3:  3    1    7
    #4:  4    2    7
    #5:  5    1    8
    #6:  6    2    9
    #7:  7    3    7
    #8:  8    4    5
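If you would rather not load data.table, the same idea (first var1, last var2 per id) can probably be done in base R with duplicated(). This is only a sketch under the same assumption as above about ids with more than 2 duplicates; the names first_rows, last_rows and result are made up for illustration and are not from the question.

```r
# Rebuild the example data from the question.
id   <- c(1, 2, 3, 4, 4, 5, 6, 7, 7, 8)
var1 <- c(1, 2, 1, 2, 4, 1, 2, 3, 5, 4)
var2 <- c(6, 8, 7, 4, 7, 8, 9, 5, 7, 5)
df <- data.frame(id, var1, var2)

# First occurrence of each id: this fixes var1 to its first value.
first_rows <- df[!duplicated(df$id), ]

# Last occurrence of each id: this is where the wanted var2 lives.
last_rows <- df[!duplicated(df$id, fromLast = TRUE), ]

# Overwrite var2 in the kept rows with the var2 of the last row for the same id.
first_rows$var2 <- last_rows$var2[match(first_rows$id, last_rows$id)]

# Reset row names so they run 1..n like in the desired output.
rownames(first_rows) <- NULL
result <- first_rows
print(result)
```

For ids that appear only once, the first and last rows are the same row, so their var2 is left unchanged.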