Lab 2.4 — Review It: Spot the Flaws in TEZ-FAKE001.001.patch
Lab type: Review-It — read a synthetic patch, find every flaw, explain the impact, propose fixes
Estimated time: 60–90 min
Tez component: tez-dag → org.apache.tez.dag.app.dag.impl.TaskImpl
Context
You are a Tez committer reviewing a patch uploaded to JIRA. The contributor claims
the patch fixes a race condition where TaskImpl.getCounters() returns null when
called before any task attempt has completed.
Your job is to review the patch before it merges. There are exactly 5 intentional flaws hidden in the diff below. Find them all.
The Synthetic Patch
diff --git a/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/TaskImpl.java b/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/TaskImpl.java
index a1b2c3d..e4f5a6b 100644
--- a/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/TaskImpl.java
+++ b/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/TaskImpl.java
@@ -214,6 +214,8 @@ public class TaskImpl implements Task, EventHandler<TaskEvent> {
+ import org.apache.tez.common.counters.TezCounters;
+
public synchronized TezCounters getCounters() {
TezCounters counters = null;
if (successfulAttempt != null) {
@@ -221,7 +223,7 @@ public class TaskImpl implements Task, EventHandler<TaskEvent> {
counters = successfulAttempt.getCounters();
} else {
counters = attemptList.stream()
- .filter(a -> a.getState() == TaskAttemptState.SUCCEEDED)
+ .filter(a -> a.getState() == TaskAttemptState.RUNNING)
.findFirst()
.map(TaskAttemptImpl::getCounters)
.orElse(null);
@@ -231,6 +233,14 @@ public class TaskImpl implements Task, EventHandler<TaskEvent> {
return counters;
}
+ /**
+ * Returns the counter for this task, or a new empty TezCounters object
+ * if no counters are available yet.
+ *
+ * @return counters, never null
+ */
+ public synchronized TezCounters getCountersOrEmpty() {
+ TezCounters c = getCounters();
+ return c == null ? new TezCounters() : c;
+ }
+
diff --git a/tez-dag/src/test/java/org/apache/tez/dag/app/dag/impl/TestTaskImpl.java b/tez-dag/src/test/java/org/apache/tez/dag/app/dag/impl/TestTaskImpl.java
index b7c8d9e..f0a1b2c 100644
--- a/tez-dag/src/test/java/org/apache/tez/dag/app/dag/impl/TestTaskImpl.java
+++ b/tez-dag/src/test/java/org/apache/tez/dag/app/dag/impl/TestTaskImpl.java
@@ -891,6 +891,18 @@ public class TestTaskImpl {
+ @Test
+ public void testGetCountersBeforeAnyAttempt() {
+ // No attempts started; counters should not be null
+ initTask();
+ TezCounters result = task.getCounters();
+ assertNotNull("getCounters() must not return null", result);
+ }
+
+ @Test
+ public void testGetCountersOrEmptyReturnsSameObjectEachTime() {
+ initTask();
+ TezCounters first = task.getCountersOrEmpty();
+ TezCounters second = task.getCountersOrEmpty();
+ assertSame("Must return same instance", first, second);
+ }
+
Your Task
For each flaw you find, fill in the table:
| # | File | Line / hunk | Flaw description | Why it matters | Suggested fix |
|---|---|---|---|---|---|
| 1 | |||||
| 2 | |||||
| 3 | |||||
| 4 | |||||
| 5 |
Guided Questions
Work through these questions one by one. Each one points at a different flaw.
Question 1 — Import placement
Look at where the import statement was added:
+ import org.apache.tez.common.counters.TezCounters;
+
public synchronized TezCounters getCounters() {
- Is this a valid location for a Java
importdeclaration? - What would happen at compile time if this diff were applied as-is?
- Where should imports go in a Java file?
- Lookup: does
TaskImpl.javaalready importTezCountersat the top?
(grep "import.*TezCounters" tez-dag/src/main/java/.../TaskImpl.java)
What is the flaw?
Question 2 — The filter predicate
The patch changes the fallback stream filter from:
.filter(a -> a.getState() == TaskAttemptState.SUCCEEDED)
to:
.filter(a -> a.getState() == TaskAttemptState.RUNNING)
- Re-read the JIRA description: the reporter says
getCounters()returns null when called before any attempt has completed. - Does filtering for
RUNNINGattempts fix that? - What does it mean to read counters from a
RUNNINGattempt vs aSUCCEEDEDone? - Are the counters of a still-running attempt considered final/reliable?
What is the flaw? What should the filter be?
Question 3 — The new test testGetCountersBeforeAnyAttempt
Read the test body carefully:
TezCounters result = task.getCounters();
assertNotNull("getCounters() must not return null", result);
- The test asserts that
getCounters()is not null when no attempt has started. - But the patch does not change
getCounters()to return an empty object — it adds a separate methodgetCountersOrEmpty()for that. - When
successfulAttemptis null andattemptListis empty, what doesgetCounters()actually return? - Will this test pass or fail against the patched code?
What is the flaw?
Question 4 — The new test testGetCountersOrEmptyReturnsSameObjectEachTime
assertSame("Must return same instance", first, second);
getCountersOrEmpty()is implemented as:
return c == null ? new TezCounters() : c;- Each call creates a new
TezCounters()whencis null. - Does the
assertSameassertion match the implementation? - Is
assertSametesting a documented contract, or is it over-specifying an implementation detail? - What assertion would actually verify the intended contract ("not null")?
What is the flaw?
Question 5 — The JIRA description says the fix is needed, but…
Re-read the patch one final time. The root cause (as stated in the JIRA) is that
getCounters() can return null. The correct caller-safe fix for most Tez callers
would be to make getCounters() itself never return null (return empty TezCounters as
the contract).
Instead the patch adds getCountersOrEmpty() as a new method — but leaves the old
getCounters() method returning null.
- Every existing caller of
getCounters()still gets null. - The Tez codebase uses
getCounters()in aggregation loops that iterate counters:counters.incrAllCounters(taskCounters)— passing null there throws NPE. - How many callers of
getCounters()exist intez-dag?
grep -rn "\.getCounters()" tez-dag/src/main/ | grep -v "//.*getCounters" | wc -l
- Does the patch actually fix the original bug?
What is the flaw?
Answer Key (Read After You've Filled the Table)
Reveal answers
| # | Flaw | Impact | Fix |
|---|---|---|---|
| 1 | import statement placed inside the class body (after the opening {) | Compile error — Java imports must precede the class declaration | Remove the import; TezCounters is already imported at the top of TaskImpl.java |
| 2 | Filter changed to RUNNING instead of keeping SUCCEEDED; a running attempt's counters are partial and unstable | Returns wrong/partial data; counters values change as the attempt progresses | Revert to SUCCEEDED filter; the real fix is to handle the "no succeeded attempt yet" case separately (return null or empty) |
| 3 | testGetCountersBeforeAnyAttempt asserts assertNotNull on getCounters() which still returns null when no attempt has completed | Test will fail on the patched code — the patch doesn't make getCounters() non-null | Test should call getCountersOrEmpty() or the assertion should accept null and document the contract |
| 4 | assertSame requires the same object reference but getCountersOrEmpty() creates a new TezCounters() each time null is returned | Test fails on every call where no successful attempt exists | Use assertNotNull to verify the non-null contract; don't assert reference identity |
| 5 | The patch adds getCountersOrEmpty() but doesn't fix the root cause — getCounters() still returns null; all existing callers are still broken | Downstream NPEs in counter aggregation loops are not fixed | Change getCounters() itself to return new TezCounters() instead of null, or add a null-guard in every caller; document the chosen contract |
Reflection
-
A patch that adds a new method instead of fixing the old one is sometimes called an "additive workaround." When is that acceptable? When is it wrong?
-
The Apache Tez review process requires that every patch include a test that would have failed before the fix and passes after. Does this patch satisfy that requirement? Why or why not?
-
If you were the committer, what feedback would you leave on JIRA? Write two or three sentences in the style of a real review comment (constructive, specific, pointing at the line).
-
Look up a real Tez JIRA review thread (search
issues.apache.org/jiraforproject = TEZ AND labels = patch-available AND resolution = Fixed). Find one comment where a committer asked for a test change. What did they say?