java - mapreduce.reduce.shuffle.memory.limit.percent, mapreduce.reduce.shuffle.input.buffer.percent and mapreduce.reduce.shuffle.merge.percent -
I want to verify my understanding of these parameters and their relationship; please notify me if I am wrong.
mapreduce.reduce.shuffle.input.buffer.percent
tells the total amount of memory allocated to the entire shuffle phase of the reducer.

mapreduce.reduce.shuffle.memory.limit.percent
tells the maximum percentage of that in-memory limit (mapreduce.reduce.shuffle.input.buffer.percent) that a single shuffle can consume.

mapreduce.reduce.shuffle.merge.percent
is the usage threshold at which an in-memory merge is initiated, expressed as a percentage of the total memory (mapreduce.reduce.shuffle.input.buffer.percent) allocated to storing in-memory map outputs.

But Hadoop 2.6 has a restriction that mapreduce.reduce.shuffle.merge.percent should be greater than mapreduce.reduce.shuffle.memory.limit.percent. Does that mean a single shuffle holds keys of the same type? Otherwise, what is the purpose of this restriction, and what is the relation between the three?
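For concreteness, here is a minimal sketch of how these three properties might be set on a job through the standard Hadoop Configuration/Job API; the values are only illustrative, not recommendations:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class ShuffleTuningExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();

            // Fraction of the reducer's heap used to buffer map outputs during the shuffle.
            conf.setFloat("mapreduce.reduce.shuffle.input.buffer.percent", 0.70f);

            // Fraction of that buffer a single map output may occupy before it is
            // written directly to disk instead of being kept in memory.
            conf.setFloat("mapreduce.reduce.shuffle.memory.limit.percent", 0.25f);

            // Buffer-usage fraction at which the in-memory merge is triggered.
            conf.setFloat("mapreduce.reduce.shuffle.merge.percent", 0.66f);

            Job job = Job.getInstance(conf, "shuffle-tuning-example");
            // ... set mapper, reducer, input/output paths as usual, then job.waitForCompletion(true)
        }
    }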
I will share my understanding of these properties; I hope it helps. Please advise me if I am wrong.
mapreduce.reduce.shuffle.input.buffer.percent
tells the percentage of the reducer's heap memory allocated to the buffer that stores the intermediate outputs copied from the various mappers.
mapreduce.reduce.shuffle.memory.limit.percent
tells the maximum percentage of the above memory buffer that a single shuffle (the output copied from a single map task) may take. A shuffle larger than this size is not copied into the memory buffer; instead it is written directly to the reducer's disk.
mapreduce.reduce.shuffle.merge.percent
tells the threshold percentage at which the in-memory merger thread runs, merging the shuffle contents available in the memory buffer into a single file and spilling that merged file to disk.
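To make the relationship between the three concrete, the following sketch works out the resulting sizes for an assumed 1 GB reducer heap and the illustrative values used above; it is plain arithmetic, not Hadoop's exact internal accounting:

    public class ShuffleMath {
        public static void main(String[] args) {
            long reducerHeapBytes = 1024L * 1024 * 1024;   // assumed 1 GB reducer heap (-Xmx1g)
            double inputBufferPercent = 0.70;  // mapreduce.reduce.shuffle.input.buffer.percent
            double memoryLimitPercent = 0.25;  // mapreduce.reduce.shuffle.memory.limit.percent
            double mergePercent       = 0.66;  // mapreduce.reduce.shuffle.merge.percent

            long shuffleBuffer    = (long) (reducerHeapBytes * inputBufferPercent); // ~716 MB
            long singleShuffleMax = (long) (shuffleBuffer * memoryLimitPercent);    // ~179 MB
            long mergeThreshold   = (long) (shuffleBuffer * mergePercent);          // ~473 MB

            // A map output larger than singleShuffleMax goes straight to disk;
            // once the in-memory outputs together exceed mergeThreshold, the merge starts.
            System.out.printf("shuffle buffer:       ~%d MB%n", shuffleBuffer >> 20);
            System.out.printf("single shuffle limit: ~%d MB%n", singleShuffleMax >> 20);
            System.out.printf("merge threshold:      ~%d MB%n", mergeThreshold >> 20);
        }
    }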
It is obvious that the in-memory merger thread needs at least 2 shuffle files present in the memory buffer to initiate a merge. At the same time, mapreduce.reduce.shuffle.merge.percent should be higher than the size of a single in-memory shuffle file, which is controlled by the mapreduce.reduce.shuffle.memory.limit.percent property; this mandates that more than 1 shuffle file is present in the buffer when the merge process starts.
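For example, with the illustrative values above, a single in-memory map output is capped at 25% of the shuffle buffer while the merge only triggers at 66% usage, so at least three in-memory outputs must accumulate before a merge can start; if the limit were instead higher than the merge threshold, a single output could by itself trigger a "merge" of just one file.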