Lab 5.2 — Add a Missing TestVertexImpl Transition Test

Lab type: Fix-It (test improvement)
Estimated time: 90 min
Tez module: tez-dag
Key class: org.apache.tez.dag.app.dag.impl.TestVertexImpl


Overview

TestVertexImpl covers the VertexImpl state machine, but no test suite is ever complete. In this lab you will:

  1. Read the state machine definition
  2. Identify an untested transition
  3. Write a JUnit test that exercises that transition
  4. Verify that the test passes as written and fails when you deliberately assert the wrong outcome

This is the canonical entry point for new Tez contributors — many accepted patches are "add test coverage for transition X".


Step 1 — Locate the State Machine Definition

find ~/tez-src -name "VertexImpl.java" | head -3
grep -n "StateMachineFactory\|addTransition" \
  ~/tez-src/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java \
  | head -50

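The state machine is built with StateMachineFactory<VertexImpl, VertexState, VertexEventType, VertexEvent>. Each addTransition() call defines four things, passed in this order:

  • current state
  • next state (or a set of possible next states)
  • event type (or a set of event types)
  • transition action (optional)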
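
To make the four parts concrete, here is a minimal, self-contained sketch of a transition table in plain Java. It does not use Tez or the YARN StateMachineFactory; ToyStateMachine, ToyState, and ToyEvent are hypothetical names invented for illustration, and the real VertexImpl table is considerably richer.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Toy model only: hypothetical classes, not the Tez/YARN API.
public class ToyStateMachine {
    public enum ToyState { NEW, INITED, RUNNING, FAILED }
    public enum ToyEvent { INIT, START, FAIL }

    // One table entry: the next state and the action to run on the way there.
    private static final class Arc {
        final ToyState next;
        final Consumer<StringBuilder> action;
        Arc(ToyState next, Consumer<StringBuilder> action) {
            this.next = next;
            this.action = action;
        }
    }

    private final Map<String, Arc> table = new HashMap<>();
    private final StringBuilder trace = new StringBuilder();
    private ToyState current = ToyState.NEW;

    // Registers one transition: current state, next state, event type, action.
    public ToyStateMachine addTransition(ToyState from, ToyState to, ToyEvent event,
                                         Consumer<StringBuilder> action) {
        table.put(from + "/" + event, new Arc(to, action));
        return this;
    }

    // Looks up (current state, event); runs the action, then moves to the next state.
    public void handle(ToyEvent event) {
        Arc arc = table.get(current + "/" + event);
        if (arc == null) {
            throw new IllegalStateException("No transition for " + event + " in " + current);
        }
        arc.action.accept(trace);
        current = arc.next;
    }

    public ToyState getState() { return current; }
    public String getTrace() { return trace.toString(); }

    public static ToyStateMachine build() {
        return new ToyStateMachine()
            .addTransition(ToyState.NEW, ToyState.INITED, ToyEvent.INIT, t -> t.append("init;"))
            .addTransition(ToyState.INITED, ToyState.RUNNING, ToyEvent.START, t -> t.append("start;"))
            .addTransition(ToyState.INITED, ToyState.FAILED, ToyEvent.FAIL, t -> t.append("cleanup;"));
    }

    public static void main(String[] args) {
        ToyStateMachine sm = build();
        sm.handle(ToyEvent.INIT);
        sm.handle(ToyEvent.FAIL);
        System.out.println(sm.getState() + " " + sm.getTrace()); // prints "FAILED init;cleanup;"
    }
}
```

A transition test has the same shape as main(): drive the machine into the source state, fire one event, then check both the resulting state and the side effect of the action.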

Step 2 — Read TestVertexImpl.java

wc -l ~/tez-src/tez-dag/src/test/java/org/apache/tez/dag/app/dag/impl/TestVertexImpl.java

It is large (~5,000 lines). You do not need to read it all. Instead, list the first 60 test method names:

grep -n "public void test" \
  ~/tez-src/tez-dag/src/test/java/org/apache/tez/dag/app/dag/impl/TestVertexImpl.java \
  | head -60


Step 3 — Find an Untested Transition

Compare the transitions in VertexImpl.java to the tests in TestVertexImpl.java.

Strategy:

  1. List all addTransition calls with grep -n "addTransition" VertexImpl.java
  2. For each transition, search TestVertexImpl.java for a test that covers the (fromState, eventType) pair
  3. Find one that is missing

Hint: look at transitions out of the INITED state. Transitions from INITED that are triggered by rare events (e.g. VERTEX_FAILED before a task is scheduled) are often not explicitly tested. Check the VertexEventType enum for the exact constant name before you search.
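
The first step of the strategy can be partly automated. The sketch below is a heuristic helper, not part of Tez; TransitionLister and listPairs are hypothetical names. It assumes that states and events appear in the source as VertexState.X and VertexEventType.Y, and that the first state named in each addTransition call is the source state. It prints each (fromState, eventType) pair once, giving you a checklist to work through by hand.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Heuristic sketch: text matching only, no real Java parsing.
public class TransitionLister {
    private static final Pattern STATE = Pattern.compile("VertexState\\.(\\w+)");
    private static final Pattern EVENT = Pattern.compile("VertexEventType\\.(\\w+)");

    // Returns "FROM_STATE + EVENT_TYPE" strings in order of first appearance.
    public static Set<String> listPairs(String source) {
        Set<String> pairs = new LinkedHashSet<>();
        String[] chunks = source.split("addTransition\\s*\\(");
        // chunks[0] is the text before the first call, so start at 1.
        for (int i = 1; i < chunks.length; i++) {
            String chunk = chunks[i];
            // Stop at the first ';' so the final call does not swallow the rest of the file.
            int end = chunk.indexOf(';');
            if (end >= 0) {
                chunk = chunk.substring(0, end);
            }
            Matcher state = STATE.matcher(chunk);
            if (!state.find()) {
                continue; // no recognisable source state in this call
            }
            String from = state.group(1);
            Matcher event = EVENT.matcher(chunk);
            while (event.find()) {
                pairs.add(from + " + " + event.group(1));
            }
        }
        return pairs;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: TransitionLister <path to VertexImpl.java>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        String source = new String(bytes, StandardCharsets.UTF_8);
        for (String pair : listPairs(source)) {
            System.out.println(pair);
        }
    }
}
```

Compile it with javac and pass the path to VertexImpl.java as the only argument. Because it matches text rather than parsing Java, comments that mention a state or event can produce spurious pairs, so treat the output as a starting point and confirm each candidate against the source.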


Step 4 — Write the Test

Add a new test to TestVertexImpl.java. Follow the exact style of the surrounding tests:

@Test(timeout = 5000)
public void testVertexFailed_FromInitedState() {
    // TODO: initialize a vertex to INITED state using the existing test helpers
    //       then send a VERTEX_FAILED event
    //       assert the vertex transitions to ERROR or FAILED state
    //       assert any cleanup callbacks were invoked
}

Pattern to follow:

  • Look for an existing test that puts the vertex in the state you need (e.g. testVertexWithInitializer reaches RUNNING; look for a simpler path)
  • Use dispatcher.getEventHandler().handle(new VertexEventXxx(...)) to fire events
  • Use vertex.getState() to assert the resulting state
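
One step the bullets leave implicit is timing: if the dispatcher queues events, the state only changes once the queued event has been processed, so the assertion must come after the queue has drained. The sketch below illustrates this with toy classes in plain Java. ToyVertex and ToyDispatcher are hypothetical stand-ins, not Tez or YARN classes, and the await() name is an assumption modelled on what a draining dispatcher might offer. Check how the existing tests in TestVertexImpl wait on the dispatcher and copy that.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Toy illustration of "fire event, drain, then assert". Not the Tez API.
public class DrainPatternSketch {
    public enum ToyState { INITED, FAILED }

    // Hypothetical stand-in for the vertex under test.
    public static final class ToyVertex {
        private ToyState state = ToyState.INITED;
        private boolean cleanedUp = false;

        public void fail() {
            cleanedUp = true;          // stands in for a cleanup callback
            state = ToyState.FAILED;
        }
        public ToyState getState() { return state; }
        public boolean isCleanedUp() { return cleanedUp; }
    }

    // Hypothetical queueing dispatcher: handle() only enqueues, await() runs what is queued.
    public static final class ToyDispatcher {
        private final Queue<Runnable> pending = new ArrayDeque<>();

        public void handle(Runnable event) { pending.add(event); }

        public void await() {
            Runnable next;
            while ((next = pending.poll()) != null) {
                next.run();
            }
        }
    }

    public static void main(String[] args) {
        // Arrange: the vertex starts in the source state of the transition.
        ToyVertex vertex = new ToyVertex();
        ToyDispatcher dispatcher = new ToyDispatcher();

        // Act: fire the event. It is queued, so nothing has changed yet.
        dispatcher.handle(vertex::fail);
        System.out.println("before drain: " + vertex.getState()); // prints "before drain: INITED"

        // Drain, then assert on both the state and the side effect.
        dispatcher.await();
        System.out.println("after drain: " + vertex.getState()
            + ", cleanedUp=" + vertex.isCleanedUp()); // prints "after drain: FAILED, cleanedUp=true"
    }
}
```

An assertion placed before the drain would see the old state, which is a common cause of flaky or misleading state machine tests.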

Step 5 — Run the New Test

cd ~/tez-src
mvn test -pl tez-dag \
  -Dtest=TestVertexImpl#testVertexFailed_FromInitedState -q 2>&1 | tail -20

Step 6 — Run the Full Test Class

mvn test -pl tez-dag -Dtest=TestVertexImpl -q 2>&1 | tail -10

All existing tests must still pass.


Step 7 — Write the Patch and JIRA Description

cd ~/tez-src
git diff > /tmp/TEZ-VERTEXTEST.001.patch
cat /tmp/TEZ-VERTEXTEST.001.patch

Draft JIRA:

Summary: TestVertexImpl is missing coverage for VERTEX_FAILED from INITED state

Description:
  The VertexImpl state machine defines a transition (INITED, VERTEX_FAILED)
  but TestVertexImpl has no test that fires this event path.  This patch adds
  TestVertexImpl#testVertexFailed_FromInitedState to cover the gap.

Priority: Minor
Component: tez-dag

Deeper Understanding

  1. What is the difference between VertexState.FAILED and VertexState.ERROR? When does the AM choose each?
  2. TestVertexImpl uses a mock AppContext. What methods on AppContext does VertexImpl call most frequently? (grep for "appContext.")
  3. What is DrainDispatcher and why is it used in tests instead of AsyncDispatcher?
  4. Some tests set a Clock mock. Why would a state machine test need to control time?