16-Week Plan: From Curious Reader to Tez Committer Candidate
This is a 16-week, ~10-hour-per-week plan that maps the curriculum (Levels 1–9 plus a 2-week capstone) onto a calendar. Each week states:
- Reading — concrete Tez source files. Open them; do not just skim diagrams.
- Hands-on — what you must build/run on your machine.
- JIRA practice queries — searches that surface real, beginner-appropriate issues.
- Labs — the curriculum labs you must complete.
- Exit checkpoint — concrete deliverables. If you cannot produce them, repeat the week.
The plan assumes you have ~/tez-src checked out, tez-tests/ building with
mvn -DskipTests install, and a working Java 8+/Maven 3.6+ environment.
Weeks 1–2: Level 1 — Orientation and First DAG
Week 1 — The DAG model and the client API
Reading
- tez-api/src/main/java/org/apache/tez/dag/api/DAG.java (entire file; ~600 lines)
- tez-api/src/main/java/org/apache/tez/dag/api/Vertex.java
- tez-api/src/main/java/org/apache/tez/dag/api/Edge.java
- tez-api/src/main/java/org/apache/tez/dag/api/EdgeProperty.java
- tez-api/src/main/proto/DAGApiRecords.proto — focus on DAGPlan, VertexPlan, EdgePlan, EdgeProperty.
Hands-on
- Build Tez from source: mvn clean install -DskipTests -Phadoop28
- Run OrderedWordCount against a local file using MiniTezCluster (see tez-tests/src/test/java/org/apache/tez/test/TestTezJobs.java).
- Inspect the generated DAGPlan: print it with dag.createDag(...).toString()
JIRA practice queries
project = TEZ AND status in (Open, "In Progress") AND labels = newbie
project = TEZ AND component = tez-api AND fixVersion is empty AND priority in (Trivial, Minor)
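To keep a practice query as a shareable link, the JQL text has to be URL-encoded first. The sketch below assumes the ASF JIRA issue navigator accepts the query in a jql parameter at the base URL shown; the class and method names are made up for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class JqlLink {
    // Assumed base URL of the issue navigator on the ASF JIRA instance.
    static final String BASE = "https://issues.apache.org/jira/issues/?jql=";

    /** Returns a browser link for the given JQL text. */
    static String searchUrl(String jql) {
        try {
            // Form encoding: spaces become '+', '=' becomes %3D, quotes become %22.
            return BASE + URLEncoder.encode(jql, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to exist on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(searchUrl(
            "project = TEZ AND status in (Open, \"In Progress\") AND labels = newbie"));
    }
}
```

Saving the generated links in your notes makes it easy to re-run the same searches each week and see what has changed.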
Labs
- Lab 1.1 — Trace a WordCount end-to-end.
- Lab 1.2 — Modify the DAG: add a second mapper vertex.
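Before changing the real DAG in Lab 1.2, it can help to model its shape in plain Java. The following is a minimal sketch that does not use the Tez API: ToyDag and the vertex names (Tokenizer, Tokenizer2, Summation) are hypothetical, and the ordering uses Kahn's algorithm, which may differ from how Tez validates a DAG.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ToyDag {
    // Vertex name -> names of downstream vertices, in insertion order.
    private final Map<String, List<String>> edges = new LinkedHashMap<>();

    ToyDag vertex(String name) {
        edges.putIfAbsent(name, new ArrayList<>());
        return this;
    }

    ToyDag edge(String from, String to) {
        if (!edges.containsKey(from) || !edges.containsKey(to)) {
            throw new IllegalArgumentException("unknown vertex in edge " + from + " -> " + to);
        }
        edges.get(from).add(to);
        return this;
    }

    /** Kahn's algorithm: a valid execution order, or an exception if there is a cycle. */
    List<String> topologicalOrder() {
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String v : edges.keySet()) inDegree.put(v, 0);
        for (List<String> outs : edges.values()) {
            for (String t : outs) inDegree.merge(t, 1, Integer::sum);
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : inDegree.entrySet()) {
            if (e.getValue() == 0) ready.add(e.getKey());
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String v = ready.remove();
            order.add(v);
            // A downstream vertex becomes ready once all of its inputs are ordered.
            for (String t : edges.get(v)) {
                if (inDegree.merge(t, -1, Integer::sum) == 0) ready.add(t);
            }
        }
        if (order.size() != edges.size()) throw new IllegalStateException("cycle detected");
        return order;
    }

    public static void main(String[] args) {
        ToyDag dag = new ToyDag()
            .vertex("Tokenizer").vertex("Tokenizer2").vertex("Summation")
            .edge("Tokenizer", "Summation")
            .edge("Tokenizer2", "Summation");
        System.out.println(dag.topologicalOrder()); // [Tokenizer, Tokenizer2, Summation]
    }
}
```

Sketching the two-mapper shape this way first gives you a picture to compare against the DAGPlan you print from the real job.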
Exit checkpoint
- You can name every required argument to DAG.create(), Vertex.create(), Edge.create(), and EdgeProperty.create().
- You can diagram the WordCount DAG without looking.
- You have one JIRA ticket open in a browser tab that you've read end-to-end (description + every comment).
Week 2 — Edges in depth
Reading
- tez-api/src/main/java/org/apache/tez/dag/api/EdgeProperty.java — all three enums (DataMovementType, DataSourceType, SchedulingType).
- tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/EdgeManager*.java — five built-in edge managers.
- tez-api/src/main/java/org/apache/tez/dag/api/InputDescriptor.java, OutputDescriptor.java, ProcessorDescriptor.java.
Hands-on
- Build the same WordCount with BROADCAST instead of SCATTER_GATHER for the edge. Observe the failure mode and explain it.
- Write a 3-vertex DAG (A -> B -> C) where A->B is ONE_TO_ONE and B->C is SCATTER_GATHER. Run it; confirm parallelism rules from the source.
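A simplified mental model of the three data movement types can help you form a prediction before running these experiments. The sketch below is plain Java and is not taken from the Tez edge managers: the class, method names, and routing rules are assumptions to check against the source during your reading.

```java
import java.util.ArrayList;
import java.util.List;

class EdgeRoutingSketch {
    enum Movement { ONE_TO_ONE, BROADCAST, SCATTER_GATHER }

    /** How many distinct outputs one source task produces in this model. */
    static int outputsPerSourceTask(Movement m, int destTasks) {
        // Scatter-gather partitions the output, one partition per destination task.
        return m == Movement.SCATTER_GATHER ? destTasks : 1;
    }

    /** Destination task indices that read output number `output` of source task `src`. */
    static List<Integer> destinations(Movement m, int src, int output,
                                      int srcTasks, int destTasks) {
        List<Integer> dests = new ArrayList<>();
        switch (m) {
            case ONE_TO_ONE:
                // Task i feeds task i, so both sides need the same parallelism.
                if (srcTasks != destTasks) {
                    throw new IllegalArgumentException("ONE_TO_ONE needs equal parallelism");
                }
                dests.add(src);
                break;
            case BROADCAST:
                // Every destination task sees the whole output of every source task.
                for (int d = 0; d < destTasks; d++) dests.add(d);
                break;
            case SCATTER_GATHER:
                // Partition p of every source task goes to destination task p.
                dests.add(output);
                break;
        }
        return dests;
    }

    public static void main(String[] args) {
        System.out.println(destinations(Movement.ONE_TO_ONE, 2, 0, 3, 3));     // [2]
        System.out.println(destinations(Movement.BROADCAST, 0, 0, 2, 3));      // [0, 1, 2]
        System.out.println(destinations(Movement.SCATTER_GATHER, 1, 2, 2, 3)); // [2]
    }
}
```

If this model holds, a BROADCAST edge hands every consumer task the full output of every producer task, which is a useful starting hypothesis for the WordCount experiment.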
JIRA practice queries
project = TEZ AND text ~ "EdgeManager" AND resolution = Unresolved
project = TEZ AND text ~ "broadcast" AND status = Resolved ORDER BY created DESC
Labs
- Lab 1.3 — Edge type matrix experiment.
Exit checkpoint
- Edge type matrix (movement × scheduling × source) drawn from memory.
- You can predict, given edge properties, which EdgeManager impl will be picked.
- One short forum/dev-list email you drafted (do not send) summarizing your reading of an EdgeManager file.
Weeks 3–4: Level 2 — Build, run, and read tests
Week 3 — Tez build system and module layout
Reading
- pom.xml (root), tez-api/pom.xml, tez-dag/pom.xml.
- BUILDING.txt.
- tez-tests/src/test/java/org/apache/tez/test/MiniTezCluster.java — entry-point for nearly every integration test.
Hands-on
- Run mvn -pl tez-dag test -Dtest=TestVertexImpl#testBasicVertexCompletion
- Run mvn -pl tez-tests test -Dtest=TestTezJobs#testWordCount
- Profile a build: mvn -DskipTests install -X 2>&1 | grep "Building\|BUILD"
JIRA practice queries
project = TEZ AND component = build AND status = Open
project = TEZ AND text ~ "MiniTezCluster" AND resolution = Unresolved
Labs
- Lab 2.1 — Build Tez and run all tez-api tests.
- Lab 2.2 — Add a no-op test to tez-dag and run it via Maven.
Exit checkpoint
- You can explain why tez-dag depends on tez-api but not vice versa.
- You know the difference between tez-runtime-internals and tez-runtime-library.
- You can run a single test via Maven without consulting any docs.
Week 4 — Tests as documentation
Reading
- tez-dag/src/test/java/org/apache/tez/dag/app/dag/impl/TestVertexImpl.java (~5000 lines; pick the top 10 test methods).
- tez-dag/src/test/java/org/apache/tez/dag/app/dag/impl/TestDAGImpl.java.
- tez-dag/src/test/java/org/apache/tez/dag/app/dag/impl/TestTaskImpl.java.
Hands-on
- Pick one test method in TestVertexImpl; rewrite it from scratch in your notebook, then diff against the original.
- Add an assertion that fails; observe the message; fix it.
JIRA practice queries
project = TEZ AND text ~ "flaky" AND status in (Open, "In Progress")
project = TEZ AND text ~ "TestVertexImpl" AND resolution = Unresolved
Labs
- Lab 2.3 — Read TestVertexImpl#testKilledTasksHandling and explain every line.
Exit checkpoint
- You can write a test that constructs a VertexImpl directly (without MiniTezCluster).
- You understand the DrainDispatcher pattern (see state-machines.md).
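The idea behind a draining dispatcher is that a test can hand events to an asynchronous dispatcher and then block until every queued event has been handled, so assertions run against a settled state. The toy below is a minimal sketch of that idea in plain Java. It is not the Hadoop or Tez DrainDispatcher class; ToyDrainDispatcher and the event strings are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

class ToyDrainDispatcher<E> {
    private final BlockingQueue<E> queue = new LinkedBlockingQueue<>();
    private final Consumer<E> handler;
    private final Object lock = new Object();
    private int pending = 0; // events dispatched but not yet fully handled

    ToyDrainDispatcher(Consumer<E> handler) {
        this.handler = handler;
        Thread worker = new Thread(this::loop, "toy-dispatcher");
        worker.setDaemon(true); // do not keep the JVM alive after the test ends
        worker.start();
    }

    /** Queues an event; returns immediately, like an asynchronous dispatcher. */
    void dispatch(E event) {
        synchronized (lock) { pending++; }
        queue.add(event);
    }

    /** Blocks until every dispatched event has been handled. */
    void await() {
        synchronized (lock) {
            try {
                while (pending > 0) lock.wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while draining", e);
            }
        }
    }

    private void loop() {
        try {
            while (true) {
                E event = queue.take();
                try {
                    handler.accept(event);
                } finally {
                    // Count down only after the handler returns, so events that the
                    // handler itself dispatches are already included in `pending`.
                    synchronized (lock) {
                        pending--;
                        lock.notifyAll();
                    }
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        List<String> seen = Collections.synchronizedList(new ArrayList<>());
        ToyDrainDispatcher<String> dispatcher = new ToyDrainDispatcher<>(seen::add);
        dispatcher.dispatch("V_INIT");
        dispatcher.dispatch("V_START");
        dispatcher.await(); // without this, the assertion below would race the worker
        System.out.println(seen); // [V_INIT, V_START]
    }
}
```

When you read tests that build a VertexImpl directly, look for the equivalent of the await() call between sending an event and asserting on the resulting state.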
Weeks 5–6: Level 3 — Submission and AM lifecycle
Week 5 — TezClient and submission
Reading
- tez-api/src/main/java/org/apache/tez/client/TezClient.java.
- tez-api/src/main/java/org/apache/tez/client/TezClientUtils.java.
- tez-api/src/main/java/org/apache/tez/client/TezSessionImpl.java.
Hands-on
- Write a small Java program that uses TezClient directly (no MR shim) to submit a DAG to MiniTezCluster.
- Use both session and non-session modes; measure the second-DAG latency difference.
JIRA practice queries
project = TEZ AND component = "tez-api" AND text ~ "TezClient" AND status = Open
Labs
- Lab 3.1 — Build a custom client that submits two DAGs in one session.
Exit checkpoint
- You can list every method that talks to the AM over RPC (grep for dagAMProtocol in TezClient.java).
- You can name the three local resources that TezClientUtils uploads.
Week 6 — DAGAppMaster bring-up
Reading
- tez-dag/src/main/java/org/apache/tez/dag/app/DAGAppMaster.java — focus on serviceInit, serviceStart, dispatcher registration.
- tez-dag/src/main/java/org/apache/tez/dag/app/TaskCommunicatorManager.java.
- tez-dag/src/main/java/org/apache/tez/dag/app/launcher/ContainerLauncher*.java.
Hands-on
- Run a DAG against MiniTezCluster with AM logs at DEBUG. Identify the line in DAGAppMaster.java that emits the first "Created DAG" log line.
Labs
- Lab 3.2 — Map an AM log line to source code.
Exit checkpoint
- You can list the AsyncDispatcher event-handler registrations in DAGAppMaster in order.
- You can walk the path from TezClient.submitDAG() to DAGImpl being instantiated inside the AM.
Weeks 7–9: Level 4 — Vertex internals and state machines
Week 7 — State machine library
Reading
- hadoop-yarn-common StateMachineFactory source (you'll need to fetch Hadoop source separately).
- tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java — read only the stateMachineFactory block first (~200 lines near the top).
Hands-on
- Write a toy StateMachineFactory for a Light (OFF, ON, BROKEN) in a scratch project.
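As a warm-up for that exercise, here is a table-driven version of the Light in plain Java, without the Hadoop library. It is a sketch of the concept only, not of the StateMachineFactory API: the FLIP and SURGE events, the surge counter, and every class and method name are invented for the example. A single-arc entry has one fixed target state, while a multiple-arc entry declares a set of possible targets and lets a hook pick one at runtime.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

class LightStateMachineSketch {
    enum State { OFF, ON, BROKEN }
    enum Event { FLIP, SURGE }

    /** The object whose lifecycle the transition table describes. */
    static final class Light {
        State state = State.OFF;
        int surges = 0;
    }

    /** One table entry: the states this arc may end in, plus a hook that picks one. */
    static final class Arc {
        final Set<State> targets;
        final Function<Light, State> hook;

        Arc(Set<State> targets, Function<Light, State> hook) {
            this.targets = targets;
            this.hook = hook;
        }
    }

    private static final Map<State, Map<Event, Arc>> TABLE = new HashMap<>();

    /** Single arc: the target state is fixed when the table is built. */
    static void single(State from, Event on, State to) {
        TABLE.computeIfAbsent(from, k -> new HashMap<>())
             .put(on, new Arc(EnumSet.of(to), light -> to));
    }

    /** Multiple arc: the hook chooses among the declared targets at runtime. */
    static void multiple(State from, Event on, Set<State> targets,
                         Function<Light, State> hook) {
        TABLE.computeIfAbsent(from, k -> new HashMap<>()).put(on, new Arc(targets, hook));
    }

    static {
        single(State.OFF, Event.FLIP, State.ON);
        single(State.ON, Event.FLIP, State.OFF);
        // A light that is on survives two surges; the third one breaks it.
        multiple(State.ON, Event.SURGE, EnumSet.of(State.ON, State.BROKEN),
                 light -> ++light.surges > 2 ? State.BROKEN : State.ON);
    }

    /** Applies one event, rejecting (state, event) pairs missing from the table. */
    static State handle(Light light, Event event) {
        Map<Event, Arc> row = TABLE.get(light.state);
        Arc arc = (row == null) ? null : row.get(event);
        if (arc == null) {
            throw new IllegalStateException("invalid event " + event + " in state " + light.state);
        }
        State next = arc.hook.apply(light);
        if (!arc.targets.contains(next)) {
            throw new IllegalStateException("hook returned undeclared state " + next);
        }
        light.state = next;
        return next;
    }

    public static void main(String[] args) {
        Light light = new Light();
        System.out.println(handle(light, Event.FLIP));  // ON
        System.out.println(handle(light, Event.SURGE)); // ON
        System.out.println(handle(light, Event.SURGE)); // ON
        System.out.println(handle(light, Event.SURGE)); // BROKEN
    }
}
```

Once this version makes sense, redo it with the real StateMachineFactory and compare how the library expresses the same two kinds of arc.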
Labs
- Lab 4.1 — State-machine introduction.
Exit checkpoint
- You can explain SingleArcTransition vs MultipleArcTransition without notes.
Week 8 — VertexManager plugins
Reading
- tez-api/src/main/java/org/apache/tez/dag/api/VertexManagerPlugin.java, VertexManagerPluginContext.java.
- tez-dag/src/main/java/org/apache/tez/dag/library/vertexmanager/ShuffleVertexManager.java.
Labs
- Lab 4.2 — VertexManager deep dive (the depth-bar lab).
Exit checkpoint
- A working CountingVertexManager with passing unit test, as specified in Lab 4.2.
Week 9 — Task and TaskAttempt
Reading
- tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/TaskImpl.java.
- tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/TaskAttemptImpl.java.
Labs
- Lab 4.3 — Task lifecycle walk.
- Lab 4.4 — TaskAttempt termination causes.
Exit checkpoint
- You can draw the TaskAttempt state machine from memory.
- You can list every TaskAttemptTerminationCause and what produces it.
Weeks 10–11: Level 5 — Runtime, IPO, and shuffle
Week 10 — Runtime task execution
Reading
- tez-runtime-internals/src/main/java/org/apache/tez/runtime/task/TezTaskRunner2.java.
- tez-runtime-internals/src/main/java/org/apache/tez/runtime/LogicalIOProcessorRuntimeTask.java.
Labs
- Lab 5.1 — Trace a task from container start to processor exit.
Exit checkpoint
- You can list every umbilical call a task makes during its lifetime (grep umbilical in tez-runtime-internals).
Week 11 — Shuffle and merge
Reading
- tez-runtime-library/src/main/java/org/apache/tez/runtime/library/common/shuffle/orderedgrouped/ShuffleManager.java.
- tez-runtime-library/src/main/java/org/apache/tez/runtime/library/common/shuffle/orderedgrouped/Fetcher.java.
- tez-runtime-library/src/main/java/org/apache/tez/runtime/library/common/sort/impl/PipelinedSorter.java.
Labs
- Lab 5.2 — Spilled output inspection on MiniTezCluster.
- Lab 5.3 — Force a fetch failure.
Exit checkpoint
- You can explain IFile framing in two paragraphs.
- You can name the three sorter implementations and when each is used.
Week 12: Level 6 — Scheduling and container reuse
Reading
- tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java.
- tez-dag/src/main/java/org/apache/tez/dag/app/rm/TaskSchedulerManager.java.
- tez-dag/src/main/java/org/apache/tez/dag/app/rm/container/AMContainerImpl.java.
JIRA practice queries
project = TEZ AND text ~ "container reuse" AND status in (Open, "In Progress")
Labs
- Lab 6.1 — Disable container reuse; measure latency cost.
- Lab 6.2 — Read and explain tez.am.container.reuse.* configs.
Exit checkpoint
- You can list the four conditions under which a container is not reused.
Week 13: Level 7 — MapReduce compatibility and integrations
Reading
- tez-mapreduce/src/main/java/org/apache/tez/mapreduce/input/MRInput.java.
- tez-mapreduce/src/main/java/org/apache/tez/mapreduce/output/MROutput.java.
- tez-mapreduce/src/main/java/org/apache/tez/mapreduce/processor/map/MapProcessor.java.
Labs
- Lab 7.1 — Submit a vanilla MR job via Tez (tez.lib.uris mode).
Exit checkpoint
- You can write a one-page essay on "what MRInput does that a plain LogicalInput does not."
Week 14: Level 8 — Production diagnostics
Reading
- tez-api/src/main/java/org/apache/tez/common/counters/TezCounters.java.
- tez-dag/src/main/java/org/apache/tez/dag/history/HistoryEventHandler.java.
- tez-plugins/tez-yarn-timeline-history/.
Labs
- Lab 8.1 — Read a real ATS event dump.
- Lab 8.2 — Trace a failure through the AM log + ATS + counters.
Exit checkpoint
- You can answer: "Why did vertex X fail?" given only an AM log and ATS dump.
Weeks 15–16: Capstone
Follow capstone/index.md start-to-finish:
- Issue selection (week 15, day 1–2).
- Reproduction → root cause (week 15, day 3–7).
- Implementation + tests (week 16, day 1–4).
- Patch submission + write-up (week 16, day 5–7).
Exit checkpoint
- A real patch attached to a real JIRA, with passing tests and a clear summary.
- A 1500–3000 word public write-up of the experience.
How to use this plan when you fall behind
- If you finish a week's reading but cannot pass the exit checkpoint, repeat the week. Do not advance.
- If a JIRA query returns no results, change the query. The dev community moves; labels and components shift.
- Skip a Level only if you can pass all exit checkpoints from previous Levels in one sitting.