Hadoop Mapper Context object
How is the run() method of the Mapper or Reducer class called by the Hadoop framework? The framework calls run(), but that method requires a Context object. How does Hadoop pass that object, and what information resides in it?
The run() method is called using Java run-time polymorphism (i.e. method overriding). As you can see on line# 569 of the link below, the extended mapper/reducer is instantiated using the Java reflection APIs. The MapTask class gets the name of the extended mapper/reducer from the job configuration object, because the client program has configured the extended mapper/reducer class there using job.setMapperClass().
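To make the reflection step concrete, here is a minimal sketch in plain Java. It does not use Hadoop's classes: BaseMapper, WordMapper, JobConf and instantiateMapper are hypothetical stand-ins that only mimic the idea of storing a mapper class in the job configuration and instantiating it reflectively later.

```java
import java.util.HashMap;
import java.util.Map;

class ReflectionSketch {
    // Hypothetical stand-in for the framework's Mapper base class.
    public static class BaseMapper {
        public String describe() { return "BaseMapper"; }
    }

    // Hypothetical user-defined mapper that extends the base class.
    public static class WordMapper extends BaseMapper {
        @Override
        public String describe() { return "WordMapper"; }
    }

    // Hypothetical stand-in for the job configuration object.
    public static class JobConf {
        private final Map<String, Class<? extends BaseMapper>> classes = new HashMap<>();

        // The client program registers its mapper class here.
        public void setMapperClass(Class<? extends BaseMapper> cls) {
            classes.put("mapper.class", cls);
        }

        // Falls back to the base class when nothing was configured.
        public Class<? extends BaseMapper> getMapperClass() {
            return classes.getOrDefault("mapper.class", BaseMapper.class);
        }
    }

    // The "task" side: look up the configured class and instantiate it reflectively.
    public static BaseMapper instantiateMapper(JobConf conf) {
        try {
            return conf.getMapperClass().getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate mapper", e);
        }
    }

    public static void main(String[] args) {
        JobConf conf = new JobConf();
        conf.setMapperClass(WordMapper.class);
        BaseMapper mapper = instantiateMapper(conf);
        System.out.println(mapper.describe()); // prints "WordMapper"
    }
}
```

The task code only ever sees the base type; which subclass it actually gets is decided by what the client stored in the configuration.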
The following code is taken from the Hadoop source, MapTask.java:
    mapperContext = contextConstructor.newInstance(mapper, job, getTaskID(), input, output, committer, reporter, split);
    input.initialize(split, mapperContext);
    mapper.run(mapperContext);
    input.close();
Line# 621 is an example of run-time polymorphism. On this line, MapTask calls the run() method of the configured mapper with the 'mapper context' as its parameter. If run() is not overridden, this calls the run() method of org.apache.hadoop.mapreduce.Mapper, which in turn calls the map() method on the configured mapper for each input key/value pair.
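The dispatch described here can be sketched in plain Java. This is a simplified illustration and not Hadoop's code: Context, BaseMapper, UpperCaseMapper and runTask are hypothetical names, and keys are left out so that only values are passed around. The shape of run() (setup, one map() call per record, cleanup) follows the default behaviour of the framework's Mapper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

class RunDispatchSketch {
    // Hypothetical, much-reduced stand-in for the mapper context.
    public static class Context {
        private final Iterator<String> records;
        private final List<String> output = new ArrayList<>();
        private String current;

        public Context(List<String> input) { this.records = input.iterator(); }

        // Advances to the next record; returns false when input is exhausted.
        public boolean nextKeyValue() {
            if (!records.hasNext()) return false;
            current = records.next();
            return true;
        }

        public String getCurrentValue() { return current; }
        public void write(String value) { output.add(value); }
        public List<String> getOutput() { return output; }
    }

    // Hypothetical stand-in for the framework's Mapper base class.
    public static class BaseMapper {
        protected void setup(Context context) { }
        protected void cleanup(Context context) { }

        // Default map() is the identity function.
        protected void map(String value, Context context) { context.write(value); }

        // Not overridden by the subclass below, yet it ends up calling the subclass's map().
        public void run(Context context) {
            setup(context);
            while (context.nextKeyValue()) {
                map(context.getCurrentValue(), context);
            }
            cleanup(context);
        }
    }

    // User-defined mapper: overrides only map().
    public static class UpperCaseMapper extends BaseMapper {
        @Override
        protected void map(String value, Context context) {
            context.write(value.toUpperCase(Locale.ROOT));
        }
    }

    // The "task" side: build a context and hand it to run().
    public static List<String> runTask(BaseMapper mapper, List<String> input) {
        Context context = new Context(input);
        mapper.run(context);
        return context.getOutput();
    }

    public static void main(String[] args) {
        System.out.println(runTask(new UpperCaseMapper(), Arrays.asList("hello", "world")));
        // prints [HELLO, WORLD]
    }
}
```

A subclass that also overrides run() would take over the whole loop, which is why the base-class run() is only used when the configured mapper does not supply its own.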
On line# 616 of the above link, MapTask creates the context object with all the details of the job configuration, etc., as mentioned by @harpun, and then passes it on to the run() method on line# 621.
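As a rough illustration of how a context object can be built reflectively and what it might carry, here is a plain-Java sketch. TaskContext, its three fields and buildContext are hypothetical and far smaller than Hadoop's real context, which is constructed from more arguments (job configuration, task ID, input, output, committer, reporter and split).

```java
import java.lang.reflect.Constructor;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ContextConstructionSketch {
    // Hypothetical context: bundles what the task knows so the mapper can read it.
    public static class TaskContext {
        private final Map<String, String> configuration;
        private final String taskId;
        private final List<String> split;

        public TaskContext(Map<String, String> configuration, String taskId, List<String> split) {
            this.configuration = configuration;
            this.taskId = taskId;
            this.split = split;
        }

        public Map<String, String> getConfiguration() { return configuration; }
        public String getTaskId() { return taskId; }
        public List<String> getSplit() { return split; }
    }

    // Looks up the constructor by parameter types and invokes it reflectively.
    public static TaskContext buildContext(Map<String, String> conf, String taskId, List<String> split) {
        try {
            Constructor<TaskContext> ctor =
                TaskContext.class.getConstructor(Map.class, String.class, List.class);
            return ctor.newInstance(conf, taskId, split);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not build context", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("job.name", "demo"); // illustrative key, not a real Hadoop property
        TaskContext context = buildContext(conf, "task_0001_m_000000", Arrays.asList("line 1", "line 2"));
        System.out.println(context.getTaskId() + " " + context.getConfiguration().get("job.name"));
    }
}
```

Inside map(), the real context is what gives access to the job configuration (for example through context.getConfiguration()) and to the output (through context.write()).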
The above explanation also holds for the reduce task, with the appropriate ReduceTask class being the main entry class.