Hadoop: Do not re-schedule a failed reducer

Tag: hadoop Author: szbozhu Date: 2010-08-01

This is how Hadoop currently works: If a reducer fails (throws a NullPointerException for example), Hadoop will reschedule another reducer to do the task of the reducer that failed.

Is it possible to configure Hadoop not to reschedule failed reducers, i.e. if any reducer fails, Hadoop merely reports the failure and does nothing else?

Of course, the reducers that did not fail will continue to completion.

Other Answer1

You can set the mapred.reduce.max.attempts property, either through the Configuration class or in the job.xml.

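Setting it to 1 should solve your problem: the property counts the total number of attempts per reduce task, so 1 means a failed reducer is not retried.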
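
As a rough sketch of what that property could look like in the job's configuration file: the fragment below uses the standard Hadoop property layout. The value 1 is an assumption, based on the property counting total attempts per reduce task (so 1 would mean a single attempt and no retry); check the behaviour against your Hadoop version.

```xml
<!-- Assumed placement: the job's configuration (job.xml) or mapred-site.xml -->
<property>
  <name>mapred.reduce.max.attempts</name>
  <!-- Assumption: 1 = a single attempt per reduce task, no retry after a failure -->
  <value>1</value>
</property>
```

The same key can also be set programmatically on the job's Configuration object before the job is submitted.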

Other Answer2

If you set the configuration to not reschedule failed tasks, then as soon as the first one fails, your jobtracker will fail the job and kill the currently running tasks. So what you want to do is pretty much impossible.