Level 7: Runtime and Shuffle
The Tez shuffle layer is the most performance-critical and most bug-prone part of the runtime. Understanding it is required for diagnosing slow queries, data-skew issues, and shuffle fetch failures.
How shuffle works in Tez
Map task → OrderedPartitionedKVOutput → TezIndexRecord (index + data files)
↓
ShuffleHandler (HTTP server running as an auxiliary service in the YARN NodeManager)
↓
Reduce task → OrderedGroupedKVInput ← shuffle fetcher threads
↓
merge + sort → processor
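Key insight: unlike Hadoop MapReduce, where shuffle sits behind the
ShuffleConsumerPlugin interface inside the reduce task, Tez's shuffle is
split into framework code (tez-runtime-library) and user code (Processor).
A processor reading from an OrderedGroupedKVInput never sees unsorted
records: sorting and merging happen in the runtime layer before any record
reaches the processor.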
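
The ordering contract described above can be illustrated with a small JDK-only model. This is a sketch, not Tez code: `ShuffleSketch`, `mapSide`, and `reduceSide` are hypothetical names, and in-memory lists stand in for spill files and HTTP fetches. It assumes simple hash partitioning and string keys. Each map task partitions and sorts its own output, and the reduce side merges the already-sorted runs, so the consumer at the end only ever receives keys in order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Toy model of an ordered shuffle. All names are hypothetical, not Tez APIs. */
final class ShuffleSketch {

    /** Map side: assign each key to a partition, then sort within each partition. */
    static List<List<String>> mapSide(List<String> keys, int numPartitions) {
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (String key : keys) {
            // Hash partitioning (an assumption); the mask keeps the hash non-negative.
            int p = (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
            partitions.get(p).add(key);
        }
        for (List<String> partition : partitions) {
            Collections.sort(partition);
        }
        return partitions;
    }

    /** Reduce side: k-way merge of sorted runs fetched from several map tasks. */
    static List<String> reduceSide(List<List<String>> sortedRuns) {
        // Each heap entry is {run index, position in run}, ordered by the key it points at.
        PriorityQueue<int[]> heap = new PriorityQueue<>(
            (a, b) -> sortedRuns.get(a[0]).get(a[1])
                .compareTo(sortedRuns.get(b[0]).get(b[1])));
        for (int r = 0; r < sortedRuns.size(); r++) {
            if (!sortedRuns.get(r).isEmpty()) {
                heap.add(new int[] {r, 0});
            }
        }
        List<String> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] head = heap.poll();
            List<String> run = sortedRuns.get(head[0]);
            merged.add(run.get(head[1]));
            if (head[1] + 1 < run.size()) {
                heap.add(new int[] {head[0], head[1] + 1});
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        int numPartitions = 2;
        List<List<String>> map0 = mapSide(List.of("pear", "apple", "fig"), numPartitions);
        List<List<String>> map1 = mapSide(List.of("kiwi", "banana", "apple"), numPartitions);
        for (int p = 0; p < numPartitions; p++) {
            // Reducer p takes partition p from every map task, then merges the runs.
            List<String> input = reduceSide(List.of(map0.get(p), map1.get(p)));
            System.out.println("reducer " + p + " sees " + input);
        }
    }
}
```

Which keys land in which partition depends on the hash function, but within each reducer the printed list is always sorted. That is the property a processor can rely on when its input is ordered.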
What this level covers
| Topic | Lab |
|---|---|
| Trace shuffle fetch failure from AM logs to root cause | Lab 7.1 |
| Add or modify an OrderedPartitionedKVOutput processor | Lab 7.2 |
Key classes
| Class | Where | What it does |
|---|---|---|
| OrderedPartitionedKVOutput | tez-runtime-library | Map output: partition + sort + spill |
| OrderedGroupedKVInput | tez-runtime-library | Reduce input: fetch + merge + sort |
| ShuffleFetch / Fetcher | tez-runtime-library | HTTP fetch from ShuffleHandler |
| MergeManager | tez-runtime-library | In-memory and on-disk merge |
| ShuffleHandler | tez-shuffle | Netty HTTP server serving map output |
| TezIndexRecord | tez-runtime-library | Per-partition offset+length in output file |
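
The "per-partition offset+length" idea in the last row is easier to see in miniature. The sketch below is a simplified model, not the Tez implementation: `IndexSketch` and `IndexEntry` are hypothetical names, a byte array stands in for the on-disk data file, and each entry keeps a single length field. It shows how a server holding one concatenated output file can hand a reducer only its own byte range by consulting the index.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a partitioned output file plus its index. Names are hypothetical. */
final class IndexSketch {

    /** One index entry: where a partition starts in the data file and how long it is. */
    static final class IndexEntry {
        final long startOffset;
        final long length;

        IndexEntry(long startOffset, long length) {
            this.startOffset = startOffset;
            this.length = length;
        }
    }

    final byte[] data;             // stands in for the data file
    final List<IndexEntry> index;  // stands in for the index file

    private IndexSketch(byte[] data, List<IndexEntry> index) {
        this.data = data;
        this.index = index;
    }

    /** Concatenate partition payloads into one buffer, recording offset and length of each. */
    static IndexSketch write(List<String> partitionPayloads) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        List<IndexEntry> index = new ArrayList<>();
        for (String payload : partitionPayloads) {
            byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);
            index.add(new IndexEntry(out.size(), bytes.length));
            out.write(bytes, 0, bytes.length);
        }
        return new IndexSketch(out.toByteArray(), index);
    }

    /** Serve one partition: look up its entry, then copy only that byte range. */
    byte[] read(int partition) {
        IndexEntry entry = index.get(partition);
        int from = (int) entry.startOffset;
        return Arrays.copyOfRange(data, from, from + (int) entry.length);
    }

    public static void main(String[] args) {
        // Partition 1 is empty on purpose: it still gets an index entry with length 0.
        IndexSketch file = IndexSketch.write(List.of("a=1;c=3;", "", "b=2;"));
        String forReducer2 = new String(file.read(2), StandardCharsets.UTF_8);
        System.out.println(forReducer2); // prints "b=2;"
    }
}
```

An empty partition still occupies an index slot, so a reducer can tell "this map task produced nothing for me" apart from "the index is missing". That distinction is often useful when reasoning about fetch failures.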