hadoop - Set reducer capacity for a specific M/R job
I want to change the cluster's capacity of reduce slots on a per-job basis. Say I have 8 reduce slots configured per TaskTracker, so for a job with 100 reduce tasks there will be (8 * number of DataNodes) reduce tasks running at the same time. For one specific job I want to cut this number in half, so I did:
conf.set("mapred.tasktracker.reduce.tasks.maximum", "4");
...
Job job = new Job(conf, ...);
And in the web UI I can see that the job's max reduce tasks is exactly 4, as I set. But Hadoop still launches 8 reducers per DataNode for the job... It seems I can't alter the reduce capacity this way.
I asked on the Hadoop mailing list, and some suggested I could make it happen with the capacity scheduler. How can I do that?
I'm using Hadoop 1.0.2.
Thanks.
The capacity scheduler allows you to specify resource limits for MapReduce jobs. You have to define queues to which jobs are scheduled, and each queue can have a different configuration.
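For example, on Hadoop 1.x the scheduler is switched on and the queues are declared in mapred-site.xml. A minimal sketch, assuming a hypothetical second queue named "limited":

```xml
<!-- mapred-site.xml: use the capacity scheduler on the JobTracker
     and declare the queues (the "limited" name is only illustrative) -->
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
</property>
<property>
  <name>mapred.queue.names</name>
  <value>default,limited</value>
</property>
```

A job is then submitted to a particular queue by setting `mapred.job.queue.name` in its configuration, e.g. `conf.set("mapred.job.queue.name", "limited")`.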
As far as your issue is concerned, when using the capacity scheduler one can specify RAM-per-task limits in order to limit how many slots a given task takes. According to the documentation, memory-based scheduling is only supported on the Linux platform.
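Concretely, the cluster declares how much memory one slot is worth, and a job that requests more memory per task occupies several slots, which lowers the number of reducers running concurrently. A sketch for Hadoop 1.x, with purely illustrative numbers and the same hypothetical "limited" queue:

```xml
<!-- mapred-site.xml: one reduce slot is worth 1024 MB,
     and a single task may claim up to 4096 MB -->
<property>
  <name>mapred.cluster.reduce.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <name>mapred.cluster.max.reduce.memory.mb</name>
  <value>4096</value>
</property>

<!-- capacity-scheduler.xml: cap the illustrative "limited" queue
     at a share of the cluster's slot capacity -->
<property>
  <name>mapred.capacity-scheduler.queue.limited.capacity</name>
  <value>50</value>
</property>
```

With this in place, a job that sets `mapred.job.reduce.memory.mb` to 2048 would consume two reduce slots per task, halving the number of concurrent reducers per TaskTracker.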
For further information on the topic, see: http://wiki.apache.org/hadoop/LimitingTaskSlotUsage and http://hadoop.apache.org/docs/stable/capacity_scheduler.html.