16-Week Plan: From Curious Reader to Tez Committer Candidate

This is a 16-week, ~10-hour-per-week plan that maps the curriculum (Levels 1–9 plus a 2-week capstone) onto a calendar. Each week states:

  • Reading — concrete Tez source files. Open them; do not just skim diagrams.
  • Hands-on — what you must build/run on your machine.
  • JIRA practice queries — searches that surface real, beginner-appropriate issues.
  • Labs — the curriculum labs you must complete.
  • Exit checkpoint — concrete deliverables. If you cannot produce them, repeat the week.

The plan assumes you have ~/tez-src checked out, tez-tests/ building with mvn -DskipTests install, and a working Java 8+/Maven 3.6+ environment.


Weeks 1–2: Level 1 — Orientation and First DAG

Week 1 — The DAG model and the client API

Reading

  • tez-api/src/main/java/org/apache/tez/dag/api/DAG.java (entire file; ~600 lines)
  • tez-api/src/main/java/org/apache/tez/dag/api/Vertex.java
  • tez-api/src/main/java/org/apache/tez/dag/api/Edge.java
  • tez-api/src/main/java/org/apache/tez/dag/api/EdgeProperty.java
  • tez-api/src/main/proto/DAGApiRecords.proto — focus on DAGPlan, VertexPlan, EdgePlan, EdgeProperty.

Hands-on

  • Build Tez from source: mvn clean install -DskipTests -Phadoop28.
  • Run OrderedWordCount against a local file using MiniTezCluster (see tez-tests/src/test/java/org/apache/tez/test/TestTezJobs.java).
  • Inspect the generated DAGPlan: print it with dag.createDag(...).toString().

JIRA practice queries

project = TEZ AND status in (Open, "In Progress") AND labels = newbie
project = TEZ AND component = tez-api AND fixVersion is empty AND priority in (Trivial, Minor)
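
Either query can be pasted into the JIRA search box, or turned into a link you keep in your notes. The sketch below URL-encodes a JQL string into such a link. It assumes the Apache JIRA issue-search URL form shown in SEARCH_BASE; JqlLink and searchUrl are illustrative names, not part of any Tez or JIRA library.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Illustrative helper: builds a browser link for a JQL query.
class JqlLink {
    // Assumed search endpoint of the Apache JIRA instance.
    static final String SEARCH_BASE = "https://issues.apache.org/jira/issues/?jql=";

    static String searchUrl(String jql) {
        try {
            // Form-style encoding: spaces become '+', '=' becomes %3D, quotes become %22.
            return SEARCH_BASE + URLEncoder.encode(jql, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every JVM is required to support UTF-8, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(searchUrl(
            "project = TEZ AND status in (Open, \"In Progress\") AND labels = newbie"));
    }
}
```

If a label or component has been renamed, edit the JQL string rather than the encoded URL; it is much easier to read.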

Labs

  • Lab 1.1 — Trace a WordCount end-to-end.
  • Lab 1.2 — Modify the DAG: add a second mapper vertex.

Exit checkpoint

  • You can name every required argument to DAG.create(), Vertex.create(), Edge.create(), and EdgeProperty.create().
  • You can diagram the WordCount DAG without looking.
  • You have one JIRA ticket open in a browser tab that you've read end-to-end (description + every comment).
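
When you practice diagramming the DAG, it can help to reduce the model to vertices, directed edges, and the rule that the graph has no cycles. The sketch below is a toy in plain Java, not the Tez client API: ToyDag and its methods are made up for illustration, and the vertex names are assumed from the OrderedWordCount example, so check them against your checkout.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model for note-taking. These are NOT the org.apache.tez.dag.api classes.
class ToyDag {
    // Vertex name -> names of the vertices that consume its output.
    private final Map<String, List<String>> consumers = new LinkedHashMap<>();

    ToyDag addVertex(String name) {
        if (consumers.putIfAbsent(name, new ArrayList<>()) != null) {
            throw new IllegalArgumentException("duplicate vertex: " + name);
        }
        return this;
    }

    ToyDag addEdge(String from, String to) {
        if (!consumers.containsKey(from) || !consumers.containsKey(to)) {
            throw new IllegalArgumentException("unknown vertex in " + from + " -> " + to);
        }
        consumers.get(from).add(to);
        return this;
    }

    // Kahn's algorithm: producers come before consumers; a leftover vertex means a cycle.
    List<String> topologicalOrder() {
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String v : consumers.keySet()) {
            inDegree.put(v, 0);
        }
        for (List<String> targets : consumers.values()) {
            for (String t : targets) {
                inDegree.merge(t, 1, Integer::sum);
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : inDegree.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String v = ready.remove();
            order.add(v);
            for (String t : consumers.get(v)) {
                if (inDegree.merge(t, -1, Integer::sum) == 0) {
                    ready.add(t);
                }
            }
        }
        if (order.size() != consumers.size()) {
            throw new IllegalStateException("cycle detected");
        }
        return order;
    }

    // Vertex names assumed from the OrderedWordCount example; check your checkout.
    static ToyDag orderedWordCount() {
        return new ToyDag()
            .addVertex("Tokenizer").addVertex("Summation").addVertex("Sorter")
            .addEdge("Tokenizer", "Summation")
            .addEdge("Summation", "Sorter");
    }

    public static void main(String[] args) {
        System.out.println(orderedWordCount().topologicalOrder());
    }
}
```

The toy deliberately leaves out everything the real classes carry per vertex and per edge (processor, parallelism, edge properties); filling those in from memory is the actual checkpoint.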

Week 2 — Edges in depth

Reading

  • tez-api/src/main/java/org/apache/tez/dag/api/EdgeProperty.java — all three enums (DataMovementType, DataSourceType, SchedulingType).
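  • tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/*EdgeManager*.java — five built-in edge managers. (The class names end in EdgeManager, for example BroadcastEdgeManager, so the glob needs a leading wildcard.)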
  • tez-api/src/main/java/org/apache/tez/dag/api/InputDescriptor.java, OutputDescriptor.java, ProcessorDescriptor.java.

Hands-on

  • Build the same WordCount with BROADCAST instead of SCATTER_GATHER for the edge. Observe the failure mode and explain it.
  • Write a 3-vertex DAG (A -> B -> C) where A->B is ONE_TO_ONE and B->C is SCATTER_GATHER. Run it; confirm parallelism rules from the source.
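
Before running the experiments, write down what you expect each data movement type to do. The sketch below encodes one reading of the three types: ONE_TO_ONE pairs task i with task i and needs equal parallelism, BROADCAST hands one whole output to every consumer task, and SCATTER_GATHER splits each producer's output into one partition per consumer task. ToyEdgeRouting is a plain-Java toy with invented method names, not an EdgeManager implementation; confirm each rule against the source.

```java
import java.util.ArrayList;
import java.util.List;

// Toy routing rules. Hypothetical names; this is not the Tez EdgeManager plugin API.
class ToyEdgeRouting {
    enum Movement { ONE_TO_ONE, BROADCAST, SCATTER_GATHER }

    // Destination task indices that read something from the given source task.
    static List<Integer> destinationsFor(Movement movement, int srcTask,
                                         int srcParallelism, int destParallelism) {
        if (srcTask < 0 || srcTask >= srcParallelism) {
            throw new IllegalArgumentException("source task out of range: " + srcTask);
        }
        List<Integer> dests = new ArrayList<>();
        switch (movement) {
            case ONE_TO_ONE:
                // Task i feeds task i, so both vertices need the same task count.
                if (srcParallelism != destParallelism) {
                    throw new IllegalStateException("ONE_TO_ONE needs equal parallelism");
                }
                dests.add(srcTask);
                break;
            case BROADCAST:
            case SCATTER_GATHER:
                // Every destination task reads from every source task.
                for (int d = 0; d < destParallelism; d++) {
                    dests.add(d);
                }
                break;
        }
        return dests;
    }

    // How many separate outputs one source task produces for this edge.
    static int outputsPerSourceTask(Movement movement, int destParallelism) {
        // SCATTER_GATHER writes one partition per destination task;
        // the other two types write a single output.
        return movement == Movement.SCATTER_GATHER ? destParallelism : 1;
    }

    public static void main(String[] args) {
        System.out.println(destinationsFor(Movement.ONE_TO_ONE, 2, 4, 4));
        System.out.println(destinationsFor(Movement.BROADCAST, 0, 2, 3));
        System.out.println(outputsPerSourceTask(Movement.SCATTER_GATHER, 3));
    }
}
```

Note that BROADCAST and SCATTER_GATHER have the same fan-out here; what differs is whether each consumer sees the whole output or only its own partition. That difference is a good starting point for explaining the WordCount failure mode.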

JIRA practice queries

project = TEZ AND text ~ "EdgeManager" AND resolution = Unresolved
project = TEZ AND text ~ "broadcast" AND status = Resolved ORDER BY created DESC

Labs

  • Lab 1.3 — Edge type matrix experiment.

Exit checkpoint

  • Edge type matrix (movement × scheduling × source) drawn from memory.
  • You can predict, given edge properties, which EdgeManager impl will be picked.
  • One short forum/dev-list email you drafted (do not send) summarizing your reading of an EdgeManager file.

Weeks 3–4: Level 2 — Build, run, and read tests

Week 3 — Tez build system and module layout

Reading

  • pom.xml (root), tez-api/pom.xml, tez-dag/pom.xml.
  • BUILDING.txt.
  • tez-tests/src/test/java/org/apache/tez/test/MiniTezCluster.java — entry-point for nearly every integration test.

Hands-on

  • Run mvn -pl tez-dag test -Dtest=TestVertexImpl#testBasicVertexCompletion.
  • Run mvn -pl tez-tests test -Dtest=TestTezJobs#testWordCount.
  • Profile a build: mvn -DskipTests install -X 2>&1 | grep "Building\|BUILD".

JIRA practice queries

project = TEZ AND component = build AND status = Open
project = TEZ AND text ~ "MiniTezCluster" AND resolution = Unresolved

Labs

  • Lab 2.1 — Build Tez and run all tez-api tests.
  • Lab 2.2 — Add a no-op test to tez-dag and run it via Maven.

Exit checkpoint

  • You can explain why tez-dag depends on tez-api but not vice versa.
  • You know the difference between tez-runtime-internals and tez-runtime-library.
  • You can run a single test via Maven without consulting any docs.

Week 4 — Tests as documentation

Reading

  • tez-dag/src/test/java/org/apache/tez/dag/app/dag/impl/TestVertexImpl.java (~5000 lines; pick the top 10 test methods).
  • tez-dag/src/test/java/org/apache/tez/dag/app/dag/impl/TestDAGImpl.java.
  • tez-dag/src/test/java/org/apache/tez/dag/app/dag/impl/TestTaskImpl.java.

Hands-on

  • Pick one test method in TestVertexImpl; rewrite it from scratch in your notebook, then diff against the original.
  • Add an assertion that fails; observe the message; fix it.

JIRA practice queries

project = TEZ AND text ~ "flaky" AND status in (Open, "In Progress")
project = TEZ AND text ~ "TestVertexImpl" AND resolution = Unresolved

Labs

  • Lab 2.3 — Read TestVertexImpl#testKilledTasksHandling and explain every line.

Exit checkpoint

  • You can write a test that constructs a VertexImpl directly (without MiniTezCluster).
  • You understand the DrainDispatcher pattern (see state-machines.md).
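
The idea behind the DrainDispatcher pattern is that a test hands events to an asynchronous dispatcher and then blocks until the queue is empty before asserting on state. The sketch below shows that idea in plain Java, assuming a single worker thread and a pending-event counter; ToyDrainDispatcher is an invented class and does not reproduce the real dispatcher's API.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Toy dispatcher: events are handled on one worker thread, in submission order.
class ToyDrainDispatcher<E> {
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "toy-dispatcher");
        t.setDaemon(true); // do not keep the JVM alive if stop() is forgotten
        return t;
    });
    private final Consumer<E> handler;
    private int pending = 0; // guarded by 'this'

    ToyDrainDispatcher(Consumer<E> handler) {
        this.handler = handler;
    }

    // Queue an event. Handlers may call this too, to dispatch follow-up events.
    void dispatch(E event) {
        synchronized (this) {
            pending++;
        }
        worker.execute(() -> {
            try {
                handler.accept(event);
            } finally {
                synchronized (this) {
                    pending--;
                    notifyAll();
                }
            }
        });
    }

    // Block until every queued event, including follow-ups, has been handled.
    synchronized void await() {
        try {
            while (pending > 0) {
                wait();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining", e);
        }
    }

    void stop() {
        worker.shutdown();
    }

    public static void main(String[] args) {
        StringBuffer log = new StringBuffer();
        ToyDrainDispatcher<String> dispatcher = new ToyDrainDispatcher<>(log::append);
        dispatcher.dispatch("a");
        dispatcher.dispatch("b");
        dispatcher.await(); // without this, the next line could race the worker
        System.out.println(log);
        dispatcher.stop();
    }
}
```

A follow-up event is counted before the event that triggered it is marked done, so await() cannot return in the gap between the two. That is the property to look for when you read the real class.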

Weeks 5–6: Level 3 — Submission and AM lifecycle

Week 5 — TezClient and submission

Reading

  • tez-api/src/main/java/org/apache/tez/client/TezClient.java.
  • tez-api/src/main/java/org/apache/tez/client/TezClientUtils.java.
  • tez-api/src/main/java/org/apache/tez/client/TezSessionImpl.java.

Hands-on

  • Write a small Java program that uses TezClient directly (no MR shim) to submit a DAG to MiniTezCluster.
  • Use both session and non-session modes; measure the second-DAG latency difference.

JIRA practice queries

project = TEZ AND component = "tez-api" AND text ~ "TezClient" AND status = Open

Labs

  • Lab 3.1 — Build a custom client that submits two DAGs in one session.

Exit checkpoint

  • You can list every method that talks to the AM over RPC (grep for dagAMProtocol in TezClient.java).
  • You can name the three local resources that TezClientUtils uploads.

Week 6 — DAGAppMaster bring-up

Reading

  • tez-dag/src/main/java/org/apache/tez/dag/app/DAGAppMaster.java — focus on serviceInit, serviceStart, dispatcher registration.
  • tez-dag/src/main/java/org/apache/tez/dag/app/TaskCommunicatorManager.java.
  • tez-dag/src/main/java/org/apache/tez/dag/app/launcher/ContainerLauncher*.java.

Hands-on

  • Run a DAG against MiniTezCluster with AM logs at DEBUG. Identify the line in DAGAppMaster.java that emits the first "Created DAG" log line.

Labs

  • Lab 3.2 — Map an AM log line to source code.

Exit checkpoint

  • You can list the AsyncDispatcher event-handler registrations in DAGAppMaster in order.
  • You can walk the path from TezClient.submitDAG() to DAGImpl being instantiated inside the AM.

Weeks 7–9: Level 4 — Vertex internals and state machines

Week 7 — State machine library

Reading

  • hadoop-yarn-common StateMachineFactory source (you'll need to fetch Hadoop source separately).
  • tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java — read only the stateMachineFactory block first (~200 lines near the top).

Hands-on

  • Write a toy StateMachineFactory for a Light (OFF, ON, BROKEN) in a scratch project.

Labs

  • Lab 4.1 — State-machine introduction.

Exit checkpoint

  • You can explain SingleArcTransition vs MultipleArcTransition without notes.
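
To check your explanation, it may help to see the distinction without any Hadoop dependency. The sketch below models the Light as a transition table in plain Java: a single-arc transition has exactly one post-state, while a multiple-arc transition declares a set of post-states and a hook picks one at runtime. ToyLight, its events, and the surgeProtected flag are invented for illustration; this mimics the idea, not the StateMachineFactory API, so the hands-on exercise still stands.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

// Plain-Java toy. It mimics the idea, not the API, of Hadoop's StateMachineFactory.
class ToyLight {
    enum State { OFF, ON, BROKEN }
    enum Event { SWITCH_ON, SWITCH_OFF, SURGE }

    // An arc declares its legal post-states and a hook that picks one of them.
    private static final class Arc {
        final Set<State> postStates;
        final Supplier<State> hook;

        Arc(Set<State> postStates, Supplier<State> hook) {
            this.postStates = postStates;
            this.hook = hook;
        }
    }

    private final Map<State, Map<Event, Arc>> table = new EnumMap<>(State.class);
    private State current = State.OFF;

    ToyLight(boolean surgeProtected) {
        // Single-arc: exactly one possible post-state.
        singleArc(State.OFF, Event.SWITCH_ON, State.ON);
        singleArc(State.ON, Event.SWITCH_OFF, State.OFF);
        // Multiple-arc: the hook decides at runtime among the declared post-states.
        multipleArc(State.ON, Event.SURGE, EnumSet.of(State.ON, State.BROKEN),
            () -> surgeProtected ? State.ON : State.BROKEN);
        // BROKEN has no outgoing arcs, so every event there is invalid.
    }

    private void singleArc(State pre, Event event, State post) {
        multipleArc(pre, event, EnumSet.of(post), () -> post);
    }

    private void multipleArc(State pre, Event event, Set<State> posts, Supplier<State> hook) {
        table.computeIfAbsent(pre, s -> new EnumMap<Event, Arc>(Event.class))
            .put(event, new Arc(posts, hook));
    }

    // Apply one event; reject events with no arc and hooks that leave the declared set.
    State handle(Event event) {
        Map<Event, Arc> arcs = table.get(current);
        Arc arc = (arcs == null) ? null : arcs.get(event);
        if (arc == null) {
            throw new IllegalStateException("invalid event " + event + " in state " + current);
        }
        State next = arc.hook.get();
        if (!arc.postStates.contains(next)) {
            throw new IllegalStateException("hook returned undeclared state " + next);
        }
        current = next;
        return current;
    }

    public static void main(String[] args) {
        ToyLight light = new ToyLight(false);
        System.out.println(light.handle(Event.SWITCH_ON));
        System.out.println(light.handle(Event.SURGE));
    }
}
```

When you port this to the real factory, compare how it declares the post-state set for a multiple-arc transition and how it reports an event with no registered arc.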

Week 8 — VertexManager plugins

Reading

  • tez-api/src/main/java/org/apache/tez/dag/api/VertexManagerPlugin.java, VertexManagerPluginContext.java.
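  • tez-runtime-library/src/main/java/org/apache/tez/dag/library/vertexmanager/ShuffleVertexManager.java (the package name starts with org.apache.tez.dag, but the file lives in the tez-runtime-library module).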

Labs

  • Lab 4.2 — VertexManager deep dive (the depth-bar lab).

Exit checkpoint

  • A working CountingVertexManager with passing unit test, as specified in Lab 4.2.

Week 9 — Task and TaskAttempt

Reading

  • tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/TaskImpl.java.
  • tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/TaskAttemptImpl.java.

Labs

  • Lab 4.3 — Task lifecycle walk.
  • Lab 4.4 — TaskAttempt termination causes.

Exit checkpoint

  • You can draw the TaskAttempt state machine from memory.
  • You can list every TaskAttemptTerminationCause and what produces it.

Weeks 10–11: Level 5 — Runtime, IPO, and shuffle

Week 10 — Runtime task execution

Reading

  • tez-runtime-internals/src/main/java/org/apache/tez/runtime/task/TezTaskRunner2.java.
  • tez-runtime-internals/src/main/java/org/apache/tez/runtime/LogicalIOProcessorRuntimeTask.java.

Labs

  • Lab 5.1 — Trace a task from container start to processor exit.

Exit checkpoint

  • You can list every umbilical call a task makes during its lifetime (grep umbilical in tez-runtime-internals).

Week 11 — Shuffle and merge

Reading

  • tez-runtime-library/src/main/java/org/apache/tez/runtime/library/common/shuffle/orderedgrouped/ShuffleManager.java.
  • tez-runtime-library/src/main/java/org/apache/tez/runtime/library/common/shuffle/orderedgrouped/Fetcher.java.
  • tez-runtime-library/src/main/java/org/apache/tez/runtime/library/common/sort/impl/PipelinedSorter.java.

Labs

  • Lab 5.2 — Spilled output inspection on MiniTezCluster.
  • Lab 5.3 — Force a fetch failure.

Exit checkpoint

  • You can explain IFile framing in two paragraphs.
  • You can name the three sorter implementations and when each is used.
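
If "framing" feels abstract, start from the general technique: each record is written as a key length, a value length, then the key and value bytes, and a reserved marker ends the stream. The sketch below is a deliberately simplified toy, assuming fixed-width 4-byte lengths and an end marker of -1 for both lengths. It is not the Tez on-disk format: the real IFile differs in details this toy skips (for example variable-length integers, a header, checksums, and optional compression), so treat IFile.java as the authority. ToyFraming and its methods are invented names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Toy length-prefixed key/value framing. NOT the real IFile format.
class ToyFraming {
    static final int EOF_MARKER = -1; // assumed end-of-stream length value

    // Each record is a {key, value} pair; duplicate keys are allowed.
    static byte[] write(List<String[]> records) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            for (String[] record : records) {
                byte[] key = record[0].getBytes(StandardCharsets.UTF_8);
                byte[] value = record[1].getBytes(StandardCharsets.UTF_8);
                out.writeInt(key.length);   // frame header: key length
                out.writeInt(value.length); // frame header: value length
                out.write(key);
                out.write(value);
            }
            out.writeInt(EOF_MARKER);
            out.writeInt(EOF_MARKER);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static List<String[]> read(byte[] framed) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(framed));
            List<String[]> records = new ArrayList<>();
            while (true) {
                int keyLength = in.readInt();
                int valueLength = in.readInt();
                if (keyLength == EOF_MARKER && valueLength == EOF_MARKER) {
                    return records;
                }
                byte[] key = new byte[keyLength];
                byte[] value = new byte[valueLength];
                in.readFully(key);
                in.readFully(value);
                records.add(new String[] {
                    new String(key, StandardCharsets.UTF_8),
                    new String(value, StandardCharsets.UTF_8)
                });
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String[]> records = new ArrayList<>();
        records.add(new String[] {"apple", "1"});
        records.add(new String[] {"apple", "2"});
        byte[] framed = write(records);
        System.out.println(framed.length + " bytes, " + read(framed).size() + " records");
    }
}
```

A useful exercise for your two paragraphs: list each way the real reader and writer depart from this toy, and say why each departure exists.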

Week 12: Level 6 — Scheduling and container reuse

Reading

  • tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java.
  • tez-dag/src/main/java/org/apache/tez/dag/app/rm/TaskSchedulerManager.java.
  • tez-dag/src/main/java/org/apache/tez/dag/app/rm/container/AMContainerImpl.java.

JIRA practice queries

project = TEZ AND text ~ "container reuse" AND status in (Open, "In Progress")

Labs

  • Lab 6.1 — Disable container reuse; measure latency cost.
  • Lab 6.2 — Read and explain tez.am.container.reuse.* configs.

Exit checkpoint

  • You can list the four conditions under which a container is not reused.

Week 13: Level 7 — MapReduce compatibility and integrations

Reading

  • tez-mapreduce/src/main/java/org/apache/tez/mapreduce/input/MRInput.java.
  • tez-mapreduce/src/main/java/org/apache/tez/mapreduce/output/MROutput.java.
  • tez-mapreduce/src/main/java/org/apache/tez/mapreduce/processor/map/MapProcessor.java.

Labs

  • Lab 7.1 — Submit a vanilla MR job via Tez (tez.lib.uris mode).

Exit checkpoint

  • You can write a one-page essay on "what MRInput does that a plain LogicalInput does not."

Week 14: Level 8 — Production diagnostics

Reading

  • tez-api/src/main/java/org/apache/tez/common/counters/TezCounters.java.
  • tez-dag/src/main/java/org/apache/tez/dag/history/HistoryEventHandler.java.
  • tez-plugins/tez-yarn-timeline-history/.

Labs

  • Lab 8.1 — Read a real ATS event dump.
  • Lab 8.2 — Trace a failure through the AM log + ATS + counters.

Exit checkpoint

  • You can answer: "Why did vertex X fail?" given only an AM log and ATS dump.

Weeks 15–16: Capstone

Follow capstone/index.md start-to-finish:

  1. Issue selection (week 15, days 1–2).
  2. Reproduction → root cause (week 15, days 3–7).
  3. Implementation + tests (week 16, days 1–4).
  4. Patch submission + write-up (week 16, days 5–7).

Exit checkpoint

  • A real patch attached to a real JIRA, with passing tests and a clear summary.
  • A 1500–3000 word public write-up of the experience.

How to use this plan when you fall behind

  • If you finish a week's reading but cannot pass the exit checkpoint, repeat the week. Do not advance.
  • If a JIRA query returns no results, change the query. The dev community moves; labels and components shift.
  • Skip a Level only if you can pass all of its exit checkpoints, and those of every earlier Level, in one sitting.