java - mapreduce.reduce.shuffle.memory.limit.percent, mapreduce.reduce.shuffle.input.buffer.percent and mapreduce.reduce.shuffle.merge.percent
I want to verify my understanding of these parameters and their relationship. Please notify me if I am wrong.
mapreduce.reduce.shuffle.input.buffer.percent tells the total amount of memory allocated to the entire shuffle phase of the reducer. mapreduce.reduce.shuffle.memory.limit.percent tells the maximum percentage of that in-memory limit that a single shuffle can consume, as a fraction of mapreduce.reduce.shuffle.input.buffer.percent. mapreduce.reduce.shuffle.merge.percent is the usage threshold at which an in-memory merge is initiated, expressed as a percentage of the total memory (mapreduce.reduce.shuffle.input.buffer.percent) allocated to storing in-memory map outputs.
But Hadoop 2.6 has a restriction that mapreduce.reduce.shuffle.merge.percent should be greater than mapreduce.reduce.shuffle.memory.limit.percent. Does that mean a single shuffle holds keys of the same type? Otherwise, what is the purpose of this restriction, and what is the relation between the three?
I will share my understanding of these properties; I hope it helps. Please advise me if I am wrong.
mapreduce.reduce.shuffle.input.buffer.percent tells the percentage of the reducer's heap memory that is allocated to the circular buffer which stores the intermediate outputs copied from multiple mappers.
mapreduce.reduce.shuffle.memory.limit.percent tells the maximum percentage of the above memory buffer that a single shuffle (the output copied from a single map task) may take. A shuffle whose size is above this limit is not copied into the memory buffer; instead it is written directly to the reducer's disk.
mapreduce.reduce.shuffle.merge.percent tells the threshold percentage at which the in-memory merger thread will run, merging the available shuffle contents in the memory buffer into a single file and spilling the merged file to disk.
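To make the three properties concrete, here is a minimal mapred-site.xml sketch. The values shown are, to my understanding, the Hadoop 2.x defaults (0.70 / 0.25 / 0.66); treat the exact numbers as an assumption and check your distribution's mapred-default.xml.

```xml
<!-- mapred-site.xml fragment; values assumed to be the Hadoop 2.x defaults -->
<property>
  <name>mapreduce.reduce.shuffle.input.buffer.percent</name>
  <value>0.70</value> <!-- fraction of reducer heap used as the shuffle buffer -->
</property>
<property>
  <name>mapreduce.reduce.shuffle.memory.limit.percent</name>
  <value>0.25</value> <!-- max fraction of that buffer a single map output may take -->
</property>
<property>
  <name>mapreduce.reduce.shuffle.merge.percent</name>
  <value>0.66</value> <!-- buffer-usage fraction that triggers the in-memory merge -->
</property>
```

Note that with these defaults the merge threshold (0.66) is well above the single-shuffle cap (0.25), which matches the restriction discussed above.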
It is obvious that the in-memory merger thread requires at least 2 shuffle files to be present in the memory buffer before it initiates a merge. So at any time mapreduce.reduce.shuffle.merge.percent should be higher than the size of a single shuffle file in memory, which is controlled by the mapreduce.reduce.shuffle.memory.limit.percent property; this mandates that there be more than 1 shuffle file present in the buffer for the merge process.
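The reasoning above can be checked with simple arithmetic. The sketch below assumes a hypothetical 1024 MB reducer heap and the Hadoop 2.x default values for the three properties (an assumption, not taken from the post); it derives the buffer sizes and shows why the merge threshold can only be reached by more than one in-memory shuffle.

```java
// Arithmetic sketch of the three shuffle properties.
// HEAP_MB is a hypothetical reducer heap size; the three percentages
// are assumed Hadoop 2.x defaults.
public class ShuffleMemoryMath {
    static final double HEAP_MB = 1024.0;
    static final double INPUT_BUFFER_PERCENT = 0.70; // mapreduce.reduce.shuffle.input.buffer.percent
    static final double MEMORY_LIMIT_PERCENT = 0.25; // mapreduce.reduce.shuffle.memory.limit.percent
    static final double MERGE_PERCENT = 0.66;        // mapreduce.reduce.shuffle.merge.percent

    // Total in-memory shuffle buffer carved out of the reducer heap.
    static double shuffleBufferMb() {
        return HEAP_MB * INPUT_BUFFER_PERCENT;
    }

    // Largest single map output that is allowed into the buffer;
    // anything bigger goes straight to disk.
    static double singleShuffleLimitMb() {
        return shuffleBufferMb() * MEMORY_LIMIT_PERCENT;
    }

    // Buffer usage at which the in-memory merge thread starts spilling.
    static double mergeThresholdMb() {
        return shuffleBufferMb() * MERGE_PERCENT;
    }

    public static void main(String[] args) {
        System.out.printf("buffer=%.1f MB, single-shuffle limit=%.1f MB, merge threshold=%.1f MB%n",
                shuffleBufferMb(), singleShuffleLimitMb(), mergeThresholdMb());
        // Because MERGE_PERCENT > MEMORY_LIMIT_PERCENT, no single in-memory
        // shuffle can reach the merge threshold on its own: at least two
        // must be resident before a merge can trigger.
        assert MERGE_PERCENT > MEMORY_LIMIT_PERCENT;
    }
}
```

With these numbers the buffer is about 717 MB, a single in-memory shuffle is capped near 179 MB, and the merge fires around 473 MB of buffer usage, so at least three default-capped outputs must accumulate before a merge starts.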