java - Mongo Hadoop Connector Issue
I am trying to run a MapReduce job that pulls data from Mongo and writes it to HDFS, but I cannot seem to get the job to run. I could not find an example of this, and the issue I'm having is that if I set the input path to Mongo, it also looks for the output path in Mongo. I am also getting an authentication error, even though my MongoDB instance does not have authentication enabled.
final Configuration conf = getConf();
final Job job = new Job(conf, "sort");

MongoConfig config = new MongoConfig(conf);
MongoConfigUtil.setInputFormat(getConf(), MongoInputFormat.class);
FileOutputFormat.setOutputPath(job, new Path("/trythisdir"));
MongoConfigUtil.setInputURI(conf, "mongodb://localhost:27017/fake_data.file");
//conf.set("mongo.output.uri", "mongodb://localhost:27017/fake_data.file");

job.setJarByClass(ImageExtractor.class);
job.setMapperClass(ImageExtractorMapper.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
job.setInputFormatClass(MongoInputFormat.class);

// Execute job and return status
return job.waitForCompletion(true) ? 0 : 1;
Edit: this is the current error I'm having:
Exception in thread "main" java.lang.IllegalArgumentException: Couldn't connect and authenticate to get collection
    at com.mongodb.hadoop.util.MongoConfigUtil.getCollection(MongoConfigUtil.java:353)
    at com.mongodb.hadoop.splitter.MongoSplitterFactory.getSplitterByStats(MongoSplitterFactory.java:71)
    at com.mongodb.hadoop.splitter.MongoSplitterFactory.getSplitter(MongoSplitterFactory.java:107)
    at com.mongodb.hadoop.MongoInputFormat.getSplits(MongoInputFormat.java:56)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1079)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1096)
    at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:177)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:995)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:948)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:948)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:566)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596)
    at com.orbis.image.extractor.mongo.ImageExtractor.run(ImageExtractor.java:103)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at com.orbis.image.extractor.mongo.ImageExtractor.main(ImageExtractor.java:78)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.lang.NullPointerException
    at com.mongodb.MongoURI.<init>(MongoURI.java:148)
    at com.mongodb.MongoClient.<init>(MongoClient.java:268)
    at com.mongodb.hadoop.util.MongoConfigUtil.getCollection(MongoConfigUtil.java:351)
    ... 22 more
Late answer, but it may be helpful to people. I encountered the same problem while playing with Apache Spark.
I think you should correctly set mongo.input.uri and mongo.output.uri, which are used by Hadoop, as well as the input and output formats.
/* Correct input and output URI setting on Spark (Hadoop) */
conf.set("mongo.input.uri", "mongodb://localhost:27017/dbname.inputColName");
conf.set("mongo.output.uri", "mongodb://localhost:27017/dbname.outputColName");

/* Set input and output formats */
job.setInputFormatClass(MongoInputFormat.class);
job.setOutputFormatClass(MongoOutputFormat.class);
By the way, if the "mongo.input.uri" or "mongo.output.uri" strings have typos, it causes the same error.
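For completeness, here is a minimal sketch of how the pieces can fit together for the question's original goal (read from MongoDB, write plain text to HDFS). It assumes the mongo-hadoop connector's MongoInputFormat emits (Object, BSONObject) pairs; the class names ExampleExtractor and ExportMapper, the output path /output/mongo-export, and the map-only setup are placeholders of mine, not from the original post. Note that mongo.input.uri is set before the Job is constructed, since the Job copies its Configuration and may not see values applied afterwards.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

import org.bson.BSONObject;
import com.mongodb.hadoop.MongoInputFormat;

public class ExampleExtractor extends Configured implements Tool {

    // Simple mapper: emit each document's _id and its JSON representation.
    public static class ExportMapper extends Mapper<Object, BSONObject, Text, Text> {
        @Override
        protected void map(Object key, BSONObject value, Context context)
                throws IOException, InterruptedException {
            context.write(new Text(String.valueOf(key)), new Text(value.toString()));
        }
    }

    @Override
    public int run(String[] args) throws Exception {
        Configuration conf = getConf();

        // Set the Mongo input URI *before* creating the Job.
        conf.set("mongo.input.uri", "mongodb://localhost:27017/fake_data.file");

        Job job = new Job(conf, "mongo-to-hdfs");
        job.setJarByClass(ExampleExtractor.class);
        job.setMapperClass(ExportMapper.class);
        job.setNumReduceTasks(0);                         // map-only export

        job.setInputFormatClass(MongoInputFormat.class);  // read from MongoDB
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        // Write the mapper output to HDFS instead of back to MongoDB.
        FileOutputFormat.setOutputPath(job, new Path("/output/mongo-export"));

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new ExampleExtractor(), args));
    }
}

Leaving the default TextOutputFormat in place (rather than MongoOutputFormat) is what makes the results land in HDFS; if you want to write back to MongoDB instead, set mongo.output.uri and MongoOutputFormat as shown in the snippet above.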