JIRA & Code Review — Inside a Tez Review

This chapter is the committer view of code review. Read it as a contributor, and your patches will become reviewable. Read it as a new committer, and you'll have a workflow.

Tez is RTC

Apache projects choose between two commit philosophies:

ModelMeaningUsed by
RTC (Review Then Commit)Patch must be reviewed and +1'd before commitTez, Hive, Hadoop (for most code)
CTR (Commit Then Review)Committer may commit and discuss afterSome smaller projects, certain Hadoop subsystems

Tez is RTC. The implication: every commit went through at least one review round. Patches sitting at "Patch Available" with no review block on attention, not on velocity — the committer pool is finite.

The RTC exception: trivial fixes (typos, log message edits, javadoc improvements) may be committed by a committer without an explicit +1, but the commit message references the JIRA and the patch is still attached for the record.

How a Committer Reads a Patch

When a committer opens your patch (in JIRA, GitHub PR, or git apply locally), the sequence is roughly:

1. Read the JIRA description.              30s
2. git apply --check on a fresh clone.     30s
3. Look at git diff --stat.                30s
4. Read the test changes.                  2-5 min
5. Read the implementation changes.        5-15 min
6. Run mvn checkstyle:check.               30s
7. Run mvn test in changed modules.        2-15 min
8. Optionally: run an integration test.    5-30 min
9. Comment.                                Variable

The first three steps determine whether the patch gets the full read or a bounce. If the JIRA is unclear or the diff doesn't apply or includes unrelated changes, the patch goes back without step 5.

The Skim Phase

A committer skimming git diff --stat is looking for:

  • File count and module spread. A patch touching one module is easy; one touching five is suspicious.
  • Tests in the diff. No tests in a behavior-changing patch is a red flag.
  • Generated files in the diff. target/, *.iml, .idea/ — never committed.
  • Whitespace-only churn. git diff -w should not be vastly smaller than git diff.

If any of these are off, expect a comment before the implementation is read.

The Test Phase

Committers read tests before implementation because the test reveals intent. A good test named testRecoverNoInputs tells the reviewer:

  • The bug is in recovery.
  • The trigger is "no inputs."
  • The fix should not break recovery in any other case.

If the test is missing, weak (no assertions, or assertions that would pass without the fix), or named generically (testMethod1), the reviewer assumes the implementation is also weak.

The Implementation Phase

By the time the reviewer reads the code, they have a mental model from the JIRA, the test name, and the diff stat. The implementation read is checking:

  • Does the code match the intent of the JIRA and test?
  • Is the change minimal — does it touch what it must, and only what it must?
  • Are exceptions handled appropriately for the file's conventions?
  • Is logging at the right level (DEBUG for hot paths, INFO for state transitions, WARN for recoverable, ERROR for unrecoverable)?
  • Are there obvious thread-safety issues (state visible across threads, shared mutable collections)?
  • Are there back-compat concerns? (See Compatibility)

Comment Phrasing

Committer comments follow soft conventions that contributors should recognise — they encode meaning beyond the literal text.

CommentMeans
"Nit: ..."Stylistic preference; you may take it or push back without controversy.
"Suggestion: ..."Reviewer thinks there's a better way but isn't blocking.
"Concern: ..."Reviewer wants this addressed before commit.
"I don't think this is right."Block; must be resolved.
"Have you considered X?"Genuine question; respond with your reasoning.
"Let's discuss on dev@."Issue is bigger than the patch; design discussion needed.
"+1 LGTM"Approval (informal).
"+1 pending checkstyle"Conditional approval.
"-1, see ..."Veto; must be resolved before commit.

Reciprocal etiquette on responses, see Responding to Feedback: acknowledge every comment explicitly, fix what's fixable, push back with evidence on what's not.

Patch Available → Reviewed Lifecycle

The JIRA state transitions for a typical patch:

Open
 |  (contributor starts)
 v
In Progress
 |  (contributor attaches .001)
 v
Patch Available  ← reviewer reads here
 |  (review comments)
 v
In Progress  ← contributor revises
 |  (attaches .002)
 v
Patch Available
 |  (LGTM)
 v
Resolved (committer commits to trunk)
 |  (release ships)
 v
Closed

The patch attachments accumulate: .001, .002, .003. They are never deleted. Future readers can reconstruct the review by walking through them.

GitHub-PR-based reviews follow the same lifecycle, but the iteration happens in the PR's commit history rather than separate .NNN.patch files. The JIRA still moves through the states above.

Backport Patches

A patch may need to land on multiple branches (e.g. master and branch-0.10). The contributor attaches both:

TEZ-4321.001.patch                 (for master)
TEZ-4321.branch-0.10.001.patch     (for the maintenance branch)

The committer reviews and commits each. The JIRA comment notes the commits:

Committed to master: <sha>
Committed to branch-0.10: <sha>

The Committer's Pre-Commit Checklist

A committer about to commit runs:

cd ~/tez-src
git fetch origin
git checkout master
git merge --ff-only origin/master
git apply --check /tmp/TEZ-4321.003.patch
git apply /tmp/TEZ-4321.003.patch

mvn install -DskipTests
mvn checkstyle:check
mvn test -pl tez-dag,tez-api      # changed modules
mvn test -pl tez-tests -Dtest=TestOrderedWordCount

git add -A
git commit -s -m "TEZ-4321: Fix NPE in VertexImpl.recover when no inputs. (Jane Doe via gunther)"
git push origin master

Notes on the commit step:

  • -s adds a Signed-off-by: trailer. Tez doesn't currently require DCO, but it's Apache-idiomatic.
  • The (Jane Doe via gunther) suffix is added by the committer, not the contributor.
  • The push goes to apache/tez (committer karma required).

After push:

1. Update JIRA: status → Resolved, set Fix Version (e.g. "0.10.5").
2. Comment on JIRA with the commit SHA.
3. Thank the contributor.

Holding the "No" Muscle

A subtle and underappreciated committer skill is declining patches that shouldn't go in. A patch can be technically correct and still not belong in trunk — too narrow a use case, too much added complexity, the wrong layer.

Wording for a respectful decline:

Thanks for the patch. After reading, I'm not comfortable taking this in trunk because REASON. I appreciate the work, and I'd encourage ALTERNATIVE-PATH. Closing the JIRA as Won't Fix; if there's broader consensus on dev@ for a different approach, happy to reopen.

The "no" muscle is not natural. Committers learn it because the alternative — accepting every patch — accumulates technical debt that the committer pool will pay forever. See Committer Mindset.

When to Refactor Unsolicited Code in a Patch

A contributor's patch sometimes lands in a corner of the code the committer would like to clean up. The temptation is to do the cleanup at commit time. Don't.

The rules:

  • Never modify the contributor's diff at commit. The patch attached to JIRA must match what was reviewed.
  • File a follow-up JIRA for the cleanup. Reference the contributor in CC.
  • If the patch creates a refactoring opportunity, take it later. Not in this commit.

The exception: trivial cleanups the contributor agreed to in review may be applied at commit. The JIRA comment notes them. Example:

Committed with a small change: extracted the new logic into a private
helper method as discussed in review. Attaching the committed patch
as .004 for the record.

What Goes On the GitHub PR vs. JIRA

Tez accepts patches as JIRA attachments and as GitHub PRs (linked from the JIRA). The mapping:

Lives onWhat
JIRADescription, design discussion, root cause, attachments (.NNN.patch), final commit reference
GitHub PR (if used)Line-by-line comments, CI run results, iterative push history

A PR without a linked JIRA is incomplete; the JIRA is the system of record. A JIRA without a PR is fine — many Tez patches are still JIRA-attachment-only.

If you open a PR, link it on the JIRA in the first comment and set the JIRA to "Patch Available."

Worked Example — A Full Review Cycle

JIRA: TEZ-4321. "Fix NPE in VertexImpl.recover when no inputs."

Day 0   You: file JIRA with description, repro, root cause.
        Set yourself as assignee, status In Progress.
        Attach .001 patch; status → Patch Available.

Day 3   Committer @gopalv: applies patch locally, runs tests, reviews.
        Comments on JIRA:
          - L88: prefer Collections.emptyList().
          - L92: add a test for the no-inputs case.
          - L94: should we handle no-outputs symmetrically? Concern: see
            VertexImpl.recover at L142, looks like the same shape.

Day 4   You: reply on JIRA.
          - L88: agreed.
          - L92: agreed; adding testRecoverNoInputs.
          - L94: I see the parallel but think it's a separate JIRA.
            Filing TEZ-4329 to track.
        Attach .002.

Day 7   @gopalv: re-reviews. "+1 LGTM."

Day 8   @gopalv: commits.
        "TEZ-4321: Fix NPE in VertexImpl.recover when no inputs. (Jane Doe via gopalv)"
        Sets JIRA Resolved, Fix Version 0.10.5.

Day 8   You: comment "Thanks @gopalv. Working on TEZ-4329 next."

That is a healthy review — 2 patch rounds, 1 follow-up JIRA filed, no friction.

Validation Artifacts

After this chapter:

  1. A ~/tez-notes/reviewer-vocab.md cheatsheet from the comment-phrasing table.
  2. The four checklist steps committers run pre-commit, saved for when you are one.
  3. The discipline to never modify a contributor's diff at commit (with an exception only for explicit reviewer-author agreement).
  4. The reflex to comment "Thanks @COMMITTER" after a merge of your patch.

The next chapter — Committer Mindset — takes the perspective further: the judgement model committers use across many patches and many years.