Level 5: Testing Infrastructure
Apache Tez has one of the most complete test suites in the Hadoop ecosystem:
thousands of unit tests, a MiniTezCluster integration harness, and a
TestOrderedWordCount end-to-end reference. At this level you will move from
reading tests to writing them — adding missing coverage to TestVertexImpl,
submitting a real DAG against MiniTezCluster, and finding and fixing a flaky
test.
Why testing matters for contributors
Every Tez patch must include either (a) a new test that fails without the patch and passes with it, or (b) a clear justification in the JIRA for why a test is not needed. Committers will block patches that regress existing tests or that add unverified logic.
What this level covers
| Topic | Where |
|---|---|
| MiniTezCluster setup/teardown lifecycle | Lab 5.1 |
| TestOrderedWordCount as the canonical integration test template | Lab 5.1 |
| Adding a missing TestVertexImpl transition test | Lab 5.2 |
| Writing a full mini-cluster integration test for your own DAG | Lab 5.3 |
| Identifying, reproducing, and fixing a flaky test | Lab 5.4 |
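
Flaky tests, the last topic in the table above, often come from a test that sleeps for a fixed interval and then asserts on state that another thread updates. The sketch below shows one common remedy: poll for the condition with a deadline instead of sleeping once. It is a minimal illustration in plain Java, not code from the Tez test suite; the class name WaitUtil and its waitFor method are hypothetical names chosen for this example.

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration only; not part of Tez or Hadoop.
class WaitUtil {

    // Polls `condition` every `pollMillis` until it returns true,
    // or throws TimeoutException once `timeoutMillis` has elapsed.
    static void waitFor(BooleanSupplier condition, long pollMillis, long timeoutMillis)
            throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException(
                        "Condition not met within " + timeoutMillis + " ms");
            }
            Thread.sleep(pollMillis);
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicBoolean done = new AtomicBoolean(false);

        // Simulates asynchronous work whose duration the test cannot predict.
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            done.set(true);
        });
        worker.start();

        // Instead of a single fixed Thread.sleep(...) followed by an assert,
        // wait until the flag flips, with a generous upper bound.
        waitFor(done::get, 10, 5_000);
        worker.join();
        System.out.println("worker finished: " + done.get());
    }
}
```

The timeout only bounds the failure case, so it can be generous without slowing down passing runs. A fixed sleep has to be tuned to the slowest machine and slows every run.
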
Prerequisites
- Level 4 complete (you understand the VertexImpl state machine and VertexManagerPlugin)
- Tez source checked out and mvn install -DskipTests succeeded
Test categories and Maven commands
| Category | What it tests | Command |
|---|---|---|
| Unit | Single class in isolation with mocks | mvn test -pl tez-dag -Dtest=TestVertexImpl |
| Mini-cluster integration | Full AM + YARN + HDFS in-process | mvn test -pl tez-tests -Dtest=TestOrderedWordCount |
| System | Real cluster (CI only) | Not run locally |
Key test classes
| Class | Module | What it covers |
|---|---|---|
| TestVertexImpl | tez-dag | VertexImpl state machine, transitions, vertex recovery |
| TestDAGImpl | tez-dag | DAGImpl state machine, DAG-level events |
| TestTaskImpl | tez-dag | TaskImpl scheduling, speculation, counters |
| TestTaskAttemptImpl | tez-dag | TaskAttemptImpl state transitions |
| TestOrderedWordCount | tez-tests | End-to-end DAG submission against MiniTezCluster |
| TestMiniTezClusterWithTez | tez-tests | Multi-DAG runs, recovery, kill scenarios |
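
Several of the classes above test state machines, and those tests tend to follow one pattern: drive the object into a known state, deliver one event, then assert on the resulting state. The sketch below shows that pattern against a toy state machine. ToyVertex, its states, and its events are hypothetical stand-ins invented for this example and do not mirror the real VertexImpl transition table. A real test would use JUnit and the Tez event types rather than a main method.

```java
import java.util.EnumMap;
import java.util.Map;

// Illustrative only: ToyVertex is a hypothetical stand-in, not VertexImpl.
class TransitionTestSketch {

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // Arrange: reach RUNNING. Act: deliver KILL. Assert: state is KILLED.
    static void testKillWhileRunning() {
        ToyVertex v = new ToyVertex();
        v.handle(ToyVertex.Event.INIT);
        v.handle(ToyVertex.Event.START);
        check(v.getState() == ToyVertex.State.RUNNING, "precondition: RUNNING");

        v.handle(ToyVertex.Event.KILL);
        check(v.getState() == ToyVertex.State.KILLED, "KILL in RUNNING -> KILLED");
    }

    // Negative case: an event with no registered transition must be rejected
    // and must leave the state unchanged.
    static void testStartBeforeInitIsRejected() {
        ToyVertex v = new ToyVertex();
        boolean rejected = false;
        try {
            v.handle(ToyVertex.Event.START);
        } catch (IllegalStateException expected) {
            rejected = true;
        }
        check(rejected, "START in NEW should be rejected");
        check(v.getState() == ToyVertex.State.NEW, "state unchanged after rejection");
    }

    public static void main(String[] args) {
        testKillWhileRunning();
        testStartBeforeInitIsRejected();
        System.out.println("all transition checks passed");
    }
}

class ToyVertex {
    enum State { NEW, INITED, RUNNING, SUCCEEDED, KILLED }
    enum Event { INIT, START, COMPLETE, KILL }

    // Transition table: current state -> (event -> next state).
    private final Map<State, Map<Event, State>> table = new EnumMap<>(State.class);
    private State state = State.NEW;

    ToyVertex() {
        add(State.NEW, Event.INIT, State.INITED);
        add(State.INITED, Event.START, State.RUNNING);
        add(State.RUNNING, Event.COMPLETE, State.SUCCEEDED);
        add(State.NEW, Event.KILL, State.KILLED);
        add(State.INITED, Event.KILL, State.KILLED);
        add(State.RUNNING, Event.KILL, State.KILLED);
    }

    private void add(State from, Event on, State to) {
        table.computeIfAbsent(from, k -> new EnumMap<>(Event.class)).put(on, to);
    }

    State getState() {
        return state;
    }

    void handle(Event event) {
        Map<Event, State> row = table.get(state);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new IllegalStateException(
                    "Invalid event " + event + " in state " + state);
        }
        state = next;
    }
}
```

When looking for missing coverage, one approach is to list every (state, event) pair in the transition table and check which pairs no existing test exercises, including pairs that should be rejected.
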
Expected outcome
By the end of this level you will have:
- Run a DAG against MiniTezCluster inside a JUnit test
- Added a missing state-machine transition test to TestVertexImpl
- Identified and fixed a flaky test (or documented why it flakes)