Lab 2.2: Prepare a Patch Using Apache Practices
Background
A "patch" in Apache open-source culture means a unified diff file attached to a JIRA issue. This lab walks you through the complete workflow: finding a safe change to make, preparing the patch, verifying it, and writing the JIRA description.
This lab uses a real but trivial change as the vehicle — a Javadoc improvement in
tez-api. Trivial changes are intentional: the goal is to master the workflow, not to write
impressive code.
The Apache Git Patch Workflow
Apache Tez development keeps a linear history on master (some Apache projects name this
branch trunk; Tez uses master). The standard contributor workflow:
origin/master (read-only for non-committers)
|
↓ checkout
local/master
|
↓ branch
local/TEZ-NNNN
|
↓ make changes
↓ mvn test (pass)
↓ mvn checkstyle:check (pass)
↓ git diff origin/master > TEZ-NNNN.001.patch
|
→ Attach to JIRA
You never push your branch to Apache. You generate a diff and attach it.
Step-by-Step Tasks
Step 1: Set Up Your Working Branch
cd /path/to/tez
# Always start from a clean, up-to-date master
git fetch origin
git checkout master
git merge --ff-only origin/master
# Create a branch named after the JIRA issue you are working on
# Use TEZ-0000 as a placeholder for this lab
git checkout -b TEZ-0000-javadoc-tezvertex
Verify you are on the new branch:
git branch
# * TEZ-0000-javadoc-tezvertex
# master
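The starting state for a new branch can be checked mechanically before you create it. The sketch below assumes it runs inside your Tez clone after git fetch origin. The name require_clean_master is a hypothetical helper, not part of Git or Tez.

```shell
# Hypothetical helper (not part of Git or Tez): succeeds only when the
# working tree is clean and local master matches origin/master.
require_clean_master() {
  # --porcelain prints one line per modified or untracked file, nothing if clean
  if [ -n "$(git status --porcelain)" ]; then
    echo "working tree has uncommitted or untracked files" >&2
    return 1
  fi
  # Compare the commit ids that the two refs point at
  if [ "$(git rev-parse master)" != "$(git rev-parse origin/master)" ]; then
    echo "local master differs from origin/master" >&2
    return 1
  fi
}
```

One possible use is require_clean_master && git checkout -b TEZ-0000-javadoc-tezvertex, so the branch is only created from a known-good state.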
Step 2: Find a Target for Your Change
Open tez-api/src/main/java/org/apache/tez/dag/api/Vertex.java.
Look for public methods that:
- Have no Javadoc, or
- Have a @param tag with a non-descriptive description like // TODO, or
- Are missing a @return tag on a non-void method
A useful starting point:
# Find methods with empty or missing Javadoc in tez-api
javadoc -private -sourcepath tez-api/src/main/java \
org.apache.tez.dag.api 2>&1 | grep "no comment"
Or manually: open Vertex.java in IntelliJ, look at the addDataSink() method. If it lacks
a @param description for dataSink, that is your target.
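If the javadoc command is too noisy, a rough text scan can shortlist candidates. The sketch below is a heuristic, not a Java parser: it flags any line that starts with public and contains an opening parenthesis when the previous non-blank, non-annotation line does not close a comment. The function name find_undocumented_public is hypothetical, and you should expect some false positives (for example, public fields initialised by a method call).

```shell
# Heuristic scan: print "file:line: signature" for each public method or
# constructor that is not directly preceded by a closed comment block.
# This does not parse Java; treat the output as a list of candidates.
find_undocumented_public() {
  awk '
    FNR == 1             { prev = "" }   # reset state at the start of each file
    /^[ \t]*$/           { next }        # ignore blank lines
    /^[ \t]*@/           { next }        # ignore annotations such as @Override
    /^[ \t]*public .*\(/ {
      if (prev !~ /\*\/[ \t]*$/) {       # previous line does not end a comment
        sub(/^[ \t]+/, "")
        printf "%s:%d: %s\n", FILENAME, FNR, $0
      }
    }
    { prev = $0 }
  ' "$@"
}
```

For this lab you might run find_undocumented_public tez-api/src/main/java/org/apache/tez/dag/api/Vertex.java and then confirm each hit by reading the source.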
Step 3: Make the Change
Add or improve the Javadoc for the method you identified. Follow this format exactly:
/**
 * Adds a data sink to this vertex. The sink will receive the output
 * of this vertex after all tasks complete.
 *
 * @param outputName
 *          the name used to identify this sink in the DAG; must be unique
 *          within this vertex
 * @param dataSink
 *          the {@link DataSinkDescriptor} defining the sink type and
 *          configuration
 * @return this {@link Vertex} instance (for method chaining)
 * @throws IllegalStateException if the vertex has already been added to a {@link DAG}
 */
public Vertex addDataSink(String outputName, DataSinkDescriptor dataSink) {
Rules for Apache Javadoc style:
- First sentence is a brief third-person description with no subject ("Adds a…" not "This method adds a…")
- Multi-line @param descriptions indent the continuation by 10 spaces (2 more than @param)
- Use {@link ClassName} for all class references
- Use {@code value} for code literals and parameter names in prose
Step 4: Verify Compilation
mvn compile -pl tez-api -q
Expected: no output and exit status 0. The -q flag suppresses the BUILD SUCCESS banner, so only errors are printed.
Step 5: Run Checkstyle
mvn checkstyle:check -pl tez-api
Expected: BUILD SUCCESS. If there are violations, fix them before continuing.
Common Javadoc-specific violations:
- JavadocStyle: Javadoc comment does not end with a period
- JavadocMethod: @param or @return tag is missing
- JavadocVariable: public field missing Javadoc
Step 6: Run the Relevant Tests
mvn test -pl tez-api -q
Expected: no output and exit status 0, because -q prints only errors. Run the tests even for
a pure Javadoc change: checkstyle runs as part of the test phase in some configurations.
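Steps 4 to 6 can be chained so that a failure stops the sequence early. The sketch below wraps the same three Maven commands used above. The function name verify_module is hypothetical, and it assumes mvn is on your PATH and that you run it from the repository root.

```shell
# Hypothetical wrapper around the three commands from Steps 4 to 6.
# Stops at the first failing check and reports which one failed.
verify_module() {
  module="$1"
  if ! mvn compile -pl "$module" -q; then
    echo "compile failed for $module" >&2
    return 1
  fi
  if ! mvn checkstyle:check -pl "$module"; then
    echo "checkstyle failed for $module" >&2
    return 1
  fi
  if ! mvn test -pl "$module" -q; then
    echo "tests failed for $module" >&2
    return 1
  fi
  echo "all checks passed for $module"
}
```

For this lab the call would be verify_module tez-api. Running the checks in this order keeps the slowest step (the tests) last.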
Step 7: Generate the Patch
# Verify what you changed
git diff
# The diff should show only the lines you intentionally changed
# No whitespace changes, no unrelated files
# Generate the patch file
git diff origin/master > /tmp/TEZ-0000.001.patch
# Inspect it
cat /tmp/TEZ-0000.001.patch
The patch file should:
- Start with diff --git a/tez-api/...
- Show exactly the lines you added/removed (prefixed with +/-)
- Contain no changes to files you did not intend to modify
If the patch is longer than expected, run git status to find unexpected changes and
use git checkout -- <file> to revert them.
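The "no unintended files" check can also be done on the patch file itself. The sketch below reads the diff --git headers to list the touched paths and fails if any path falls outside an expected directory prefix. Both function names are hypothetical, and the sketch assumes paths without spaces.

```shell
# List the paths touched by a git-format patch. Each file section starts
# with a header of the form: diff --git a/<path> b/<path>
patch_files() {
  sed -n 's|^diff --git a/.* b/\(.*\)$|\1|p' "$1"
}

# Hypothetical check: report every touched path that does not start with
# the given prefix, and fail if there is at least one.
patch_stays_in() {
  prefix="$1"
  patch="$2"
  outside=$(patch_files "$patch" | while IFS= read -r path; do
    case "$path" in
      ("$prefix"*) ;;                  # inside the expected module
      (*) printf '%s\n' "$path" ;;     # unexpected file
    esac
  done)
  if [ -n "$outside" ]; then
    printf 'unexpected files in patch:\n%s\n' "$outside" >&2
    return 1
  fi
}
```

For this lab, patch_stays_in tez-api/ /tmp/TEZ-0000.001.patch should succeed silently when the patch touches only the tez-api module.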
Step 8: Write the JIRA Description
For the JIRA issue you would create for this patch, write:
Summary line format:
TEZ-0000. Improve Javadoc for Vertex.addDataSink()
Description format:
Problem:
The addDataSink() method in Vertex.java has no @param documentation for the
'dataSink' parameter. This makes it harder for new users to understand the
expected input without reading the implementation.
Fix:
Add complete @param, @return, and @throws Javadoc for addDataSink().
Testing:
mvn test -pl tez-api (all existing tests pass)
mvn checkstyle:check -pl tez-api (no violations)
Step 9: Review the Patch as a Committer Would
Before attaching a patch, ask yourself:
- Does the patch contain only the changes described in the JIRA description?
- Does it pass mvn test -pl <module> locally?
- Does it pass mvn checkstyle:check -pl <module>?
- Is the commit message format correct? (TEZ-NNNN. Short description.)
- Is there a clear explanation in the JIRA description of what was wrong and what was fixed?
If any answer is "no", fix it before uploading.
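The commit message format in the checklist is easy to test with a pattern. The sketch below assumes the format means the literal prefix TEZ-, one or more digits, a period, a space, and then a non-empty description. The function name valid_subject is hypothetical.

```shell
# Hypothetical check for the "TEZ-NNNN. Short description." format:
# the literal prefix TEZ-, one or more digits, a period, a space, then text.
valid_subject() {
  printf '%s\n' "$1" | grep -Eq '^TEZ-[0-9]+\. [^ ]'
}
```

If you have committed your change locally, one way to use it is valid_subject "$(git log -1 --format=%s)" || echo "fix the commit subject".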
Common Mistakes
| Mistake | How to detect | Fix |
|---|---|---|
| Patch includes unrelated formatting changes | git diff shows hundreds of lines | git checkout -- <unintended-file> |
| Patch modifies generated code | Proto-generated files in the diff | Revert generated files; only change source |
| Patch applies only to a non-master branch | git apply --check <patch> fails in a clean checkout of origin/master | Rebase your branch onto current master |
| Checkstyle violation in unchanged line | mvn checkstyle:check fails in a line you did not write | You must fix it anyway — it is in your patch |
| Test fails on unrelated module | Running all tests surfaces a pre-existing failure | Confirm by running on a clean checkout; note the existing failure in JIRA |
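Several rows in this table come down to one question: does the patch apply to a pristine copy of master? The sketch below answers it without disturbing your working tree by checking out the base ref into a temporary worktree and running git apply --check there. The function name patch_applies_to is hypothetical. It assumes a Git version that supports git worktree remove, and the patch path must be absolute because the check runs in another directory.

```shell
# Hypothetical helper: check that a patch applies cleanly to a pristine
# copy of a base ref (for example origin/master) in a temporary worktree.
# The patch path must be absolute.
patch_applies_to() {
  base="$1"
  patch="$2"
  scratch=$(mktemp -d) || return 1
  if ! git worktree add --detach "$scratch/wt" "$base" >/dev/null 2>&1; then
    rm -rf "$scratch"
    return 1
  fi
  # --check only tests whether the patch would apply; it changes nothing
  if git -C "$scratch/wt" apply --check "$patch"; then
    status=0
  else
    status=1
  fi
  git worktree remove --force "$scratch/wt"
  rm -rf "$scratch"
  return "$status"
}
```

For this lab the call would be patch_applies_to origin/master /tmp/TEZ-0000.001.patch. A failure here usually means the branch needs rebasing and the patch needs regenerating.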
JIRA Status Workflow
After attaching your patch:
- Set the JIRA status to "Patch Available"
- Add a comment: "Patch attached. Tested with mvn test -pl tez-api and mvn checkstyle:check -pl tez-api, both pass."
- Wait for a committer to review; do not ping on the mailing list immediately
- If no response in 2 weeks, it is acceptable to send one polite reminder to dev@tez.apache.org:
  Subject: [REMINDER] TEZ-NNNN patch available for review
  Hi dev@,
  Friendly reminder that TEZ-NNNN has a patch attached. Any feedback welcome.
  https://issues.apache.org/jira/browse/TEZ-NNNN
  Thanks
Expected Output
At the end of this lab you have:
- A local branch TEZ-0000-javadoc-tezvertex with a Javadoc change
- A passing test run: mvn test -pl tez-api
- A passing checkstyle run: mvn checkstyle:check -pl tez-api
- A patch file at /tmp/TEZ-0000.001.patch with only the intended diff
- A written JIRA description (even if not submitted) in the format above
Stretch Goals
- Find a real Minor or Trivial open issue in Apache Tez JIRA that has been open for more than 6 months with no patch. Leave a JIRA comment expressing interest.
- Attempt the same patch workflow with a real issue:
  - Use git checkout -b TEZ-<real-number>-<short-description> for the branch name
  - Use the real JIRA number in the patch filename: TEZ-NNNN.001.patch
- Read three recently committed Tez patches by browsing JIRA issues with status "Resolved". For each, read the complete comment thread to understand the feedback cycle and how many patch revisions were required.
- Generate a git log view that shows only your branch's commits: git log origin/master..HEAD --oneline. The list is empty until you commit your change on the branch. This is what a committer sees when reviewing your work.