I am getting a weird error while writing data to an S3 bucket. I don't get the error regularly, so I am not able to figure out what the problem is. FYI, I am keeping the configuration of EMR the same every time. Also, the folder in the S3 bucket is not write protected.
insert overwrite directory 's3://logs/apr'
select f.cookie, sum(f.pgvw) pageview, count(distinct(f.cookie)) visits
from (
  select a.cookie, a.session, count(distinct(a.date_time)) pgvw
  from (
    select extcookie(cs_cookie) cookie, extsession(cs_cookie) session, concat(logdate,' ',time) date_time
    from apr_1
    where (uri like '%.aspx%' or uri like '%.html%')
      and (not(uri like '/lts%'))
      and (extcookie(cs_cookie)!='-' and extcookie(cs_cookie)!=' ')
      and (extsession(cs_cookie)!='-' and extsession(cs_cookie)!=' ')
    group by extcookie(cs_cookie), extsession(cs_cookie), logdate, time
  ) a
  group by a.cookie, a.session
) f
where f.pgvw>1
group by f.cookie;
Logs of the failed job:
finish_time="1373886754825" hostname="10.144.95.241" error="java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {\"_col0\":\"cwc=4ld8uploib7rd5x3uinvawd7h\",\"_col1\":7,\"_col2\":1}
  at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:166)
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:441)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:377)
  at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:396)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
  at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {\"_col0\":\"cwc=4ld8uploib7rd5x3uinvawd7h\",\"_col1\":7,\"_col2\":1}
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
  at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:148)
  ... 8 more
Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
  at java.util.ArrayList.RangeCheck(ArrayList.java:547)
  at java.util.ArrayList.get(ArrayList.java:322)
  at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118)
  at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.<init>(StandardStructObjectInspector.java:106)
  at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:274)
  at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:259)
  at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.initEvaluatorsAndReturnStruct(ReduceSinkOperator.java:188)
  at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:197)
  at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
  at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
  at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:531)
  ... 9 more
When the SELECT or GROUP BY clauses contain two or more fields or aggregates that differ only in case (e.g., both extsession(cs_cookie) and extsession(CS_COOKIE)), the optimizer tries to combine the fields even though ExecMapper does not. This causes the error you're seeing.
You can confirm or rule out this cause by converting all instances of "CS_COOKIE" to "cs_cookie" (or vice versa) and trying the same query again. If the error no longer occurs, it was due to this optimization problem.
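Before rerunning the job, it may help to scan the query text for identifiers that differ only in case. The following is a rough sketch in Python, not part of Hive or EMR; the function name case_variants and the sample fragment are hypothetical, chosen only for illustration. It treats every word-like token as an identifier, so keywords written as both SELECT and select would also be reported.

```python
import re
from collections import defaultdict


def case_variants(query):
    """Return identifiers that appear in the query with more than one casing.

    The result maps the lowercase form to the set of spellings found.
    """
    # Drop quoted string literals so their contents are not counted as identifiers.
    stripped = re.sub(r"'[^']*'|\"[^\"]*\"", " ", query)
    seen = defaultdict(set)
    for token in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", stripped):
        seen[token.lower()].add(token)
    return {name: forms for name, forms in seen.items() if len(forms) > 1}


# Hypothetical fragment with mixed casing, for illustration only.
sample = "select EXTSESSION(cs_cookie) from apr_1 group by extsession(CS_COOKIE)"

for name, forms in sorted(case_variants(sample).items()):
    print(name, sorted(forms))
```

Any name the script reports is a candidate for normalizing to a single casing before trying the query again.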