Lab 3.1: Trace a DAG Submission End-to-End
Background
A DAG goes from a Java object constructed with the API to running tasks in containers through a sequence of method calls, IPC calls, and event posts that spans six class boundaries and three JVMs. This lab asks you to trace that path precisely — class name, method name, and the data that crosses each boundary.
Being able to reconstruct this trace from code (not from documentation) is the skill. That
means reading DAGAppMaster.java, DAGImpl.java, VertexImpl.java, and
TezChild.java and following the chain yourself.
The Six Class Boundaries
[1] TezClient.submitDAG(dag)
│
│ DAGClientAMProtocol (IPC) — carries: SubmitDAGRequest{DAGPlan}
▼
[2] DAGClientHandler.submitDAG(request) [in DAGAppMaster]
│
│ posts: DAGAppMasterEvent(NEW_DAG_SUBMITTED)
▼
[3] DAGAppMaster.handle(event)
│
│ calls createDag(dagPlan) → new DAGImpl(...)
│ posts: DAGEvent(DAG_INIT)
▼
[4] DAGImpl.handle(DAGEvent{DAG_INIT})
│
│ InitTransition: initializes all VertexImpl objects
│ posts: VertexEvent(V_INIT) for each vertex
▼
[5] VertexImpl.handle(VertexEvent{V_INIT})
│
│ InitTransition: sets up tasks, calls VertexManager
│ posts: VertexEvent(V_START) when ready
│ posts: TaskEvent(T_SCHEDULE) for each task
▼
[6] TaskImpl → TaskAttemptImpl → ContainerLauncher → NM
│
│ NM starts container JVM: TezChild.main()
▼
[Container JVM] TezChild receives task assignment via TezTaskUmbilicalProtocol
│
▼
LogicalIOProcessorRuntimeTask.run() — Processor.run() called
Step-by-Step Tasks
Step 1: Find the Entry Point in TezClient
Open tez-api/src/main/java/org/apache/tez/dag/api/TezClient.java.
Find the submitDAG(DAG dag) method. Answer:
- What is the name of the IPC protocol interface used to communicate with the AM?
- What does
TezClientdo if it does not yet have an AM to talk to (session not started)? - What method on the DAG object serializes it to a
DAGPlanprotobuf? - What request object wraps the
DAGPlanbefore it is sent over IPC?
# Find the IPC protocol interface
grep -n "Protocol" tez-api/src/main/java/org/apache/tez/dag/api/TezClient.java | head -10
# Find DAGPlan construction
grep -n "DAGPlan\|createDag\|getPlan" tez-api/src/main/java/org/apache/tez/dag/api/TezClient.java
Step 2: Find the AM-side IPC Handler
The AM exposes the DAGClientAMProtocol interface. The implementation is in DAGAppMaster.
# Find the implementation of submitDAG on the AM side
grep -rn "submitDAG" tez-dag/src/main/java/org/apache/tez/dag/app/ | grep -v test
Open the handler class. Answer:
- What is the exact class name that implements
DAGClientAMProtocol? - What event type does it post to the
AsyncDispatcherafter receiving theDAGPlan? - Does the
submitDAGcall on the AM side block until the DAG completes, or does it return immediately?
Step 3: Trace DAGAppMaster Initialization
Open tez-dag/src/main/java/org/apache/tez/dag/app/DAGAppMaster.java.
Find the serviceStart() method. Read the component initialization order:
- List the components initialized in
serviceStart()in order - Find where
AsyncDispatcheris created and started - Find where the
DAGEventDispatcher(the component that routesDAGEvents toDAGImpl) is registered
# Find component initialization
grep -n "addService\|serviceStart\|startService" \
tez-dag/src/main/java/org/apache/tez/dag/app/DAGAppMaster.java | head -20
Step 4: Read the DAGImpl Init Transition
Open tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/DAGImpl.java.
Find the StateMachineFactory definition. Locate the transition for DAGEventType.DAG_INIT.
The transition handler class is InitTransition. Find it in the same file.
Answer:
- What does
InitTransition.transition()do with each vertex in the DAG? - After initializing vertices, what event does
DAGImplpost? - Under what condition does the init transition immediately move to
RUNNINGvs waiting?
# Find the init transition
grep -n "InitTransition\|DAG_INIT" \
tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/DAGImpl.java | head -20
Step 5: Read the VertexImpl Init Transition
Open tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java.
Find the transition from INITIALIZING on event V_INIT. The handler is InitTransition
(a different class from the one in DAGImpl).
Answer:
- What is the
VertexManagerand when is it invoked during initialization? - How does
VertexImplknow how many tasks to create (the parallelism)? - What event does
VertexImplsend toDAGImplwhen initialization completes?
# Find vertex init transition
grep -n "V_INIT\|InitTransition" \
tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java | head -20
Step 6: Trace the Container Launch
After tasks are scheduled, TaskAttemptImpl requests a container from the TaskScheduler.
When a container is assigned, ContainerLauncher builds the launch context.
# Find the container launch command construction
grep -rn "containerLaunchContext\|getContainerLaunchContext\|vargs" \
tez-dag/src/main/java/org/apache/tez/dag/app/launcher/ | grep -v test | head -10
Answer:
- What is the main class of the container JVM? (The class with
main()that YARN launches) - What information is passed to
TezChildvia system properties vs environment variables? - How does
TezChildknow which task to run when it starts?
Step 7: Read TezChild.main()
Open tez-dag/src/main/java/org/apache/tez/dag/app/TezChild.java.
Find the main() method and the run() loop.
Answer:
- What IPC interface does
TezChilduse to communicate with the AM? - What does
TezChilddo when it receives aTaskSpecfrom the AM? - What class is instantiated to actually run the processor?
# Find TezChild
find tez-dag/src/main/java -name "TezChild.java"
wc -l $(find tez-dag/src/main/java -name "TezChild.java")
Complete the Trace Table
Fill in this table by reading the code (not from this page or any other documentation):
| Step | Class | Method | Data / Event |
|---|---|---|---|
| 1 | TezClient | submitDAG() | Sends SubmitDAGRequest{DAGPlan} via IPC |
| 2 | ? | submitDAG() | Posts event ??? |
| 3 | DAGAppMaster | handle() | Creates DAGImpl, posts DAGEvent{DAG_INIT} |
| 4 | DAGImpl | InitTransition.transition() | Posts VertexEvent{V_INIT} for each vertex |
| 5 | VertexImpl | InitTransition.transition() | Posts TaskEvent{T_SCHEDULE} for each task |
| 6 | TaskAttemptImpl | ? | Requests container from RM via TaskScheduler |
| 7 | ContainerLauncher | ? | Launches container JVM with TezChild as main class |
| 8 | TezChild | run() | Receives task spec, starts processor |
| 9 | LogicalIOProcessorRuntimeTask | run() | Calls Processor.run() |
Fill in the ? cells from the actual code. Each cell should contain the real method name.
Expected Output
A completed trace table with all cells filled from code, not from documentation. Each answer should be verifiable by pointing to a specific line in a specific file.
Example format for your notes:
Step 2: DAGClientHandler.submitDAG()
in: tez-dag/src/main/java/org/apache/tez/dag/app/DAGAppMaster.java
line: ~1234
posts: DAGAppMasterEvent(NEW_DAG_SUBMITTED)
Stretch Goals
-
Find the
AsyncDispatcherqueue size configuration. What happens if the queue fills up?grep -rn "AsyncDispatcher\|dispatcher.queue" \ tez-dag/src/main/java/org/apache/tez/dag/app/DAGAppMaster.java | head -10 -
Find where the AM is told to exit when the DAG completes:
grep -n "stop\|shutdown\|exit" \ tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/DAGImpl.java | grep -i "succeeded\|complete" -
Trace what happens to a
TA_DONEevent fromTezChildback toDAGImpl:TezChildcalls a method on the umbilical- The AM receives it and posts a
TaskAttemptEvent TaskAttemptImpltransitions toSUCCEEDED- The chain continues up to
DAGImplIdentify every class and event in this reverse chain.