Level 7: Runtime and Shuffle

The Tez shuffle layer is the most performance-critical and most bug-prone part of the runtime. Understanding it is required for diagnosing slow queries, data-skew issues, and shuffle fetch failures.

How shuffle works in Tez

Map task → OrderedPartitionedKVOutput → TezIndexRecord (index + data files)
                                         ↓
                              ShuffleHandler (HTTP server in NM)
                                         ↓
Reduce task → OrderedGroupedKVInput ← shuffle fetcher threads
                                         ↓
                                    merge + sort → processor

Key insight: unlike Hadoop MapReduce's ShuffleConsumerPlugin, Tez's shuffle is split into framework code (tez-runtime-library) and user code (Processor). The processor never sees unsorted records — sorting happens in the runtime layer.

What this level covers

TopicLab
Trace shuffle fetch failure from AM logs to root causeLab 7.1
Add or modify an OrderedPartitionedKVOutput processorLab 7.2

Key classes

ClassWhereWhat it does
OrderedPartitionedKVOutputtez-runtime-libraryMap output: partition + sort + spill
OrderedGroupedKVInputtez-runtime-libraryReduce input: fetch + merge + sort
ShuffleFetch / Fetchertez-runtime-libraryHTTP fetch from ShuffleHandler
MergeManagertez-runtime-libraryIn-memory and on-disk merge
ShuffleHandlertez-shuffleNetty HTTP server serving map output
TezIndexRecordtez-runtime-libraryPer-partition offset+length in output file