I wrote a guide to using GitHub in a workflow as part of a course last semester. Here’s a quick dump of it:
A GitHub Overview
This document covers my understanding of best practices for using GitHub and git as version control and issue tracking tools. We may not need to implement all of it, but most of it will be helpful to use consistently.
First Steps: Setting up a development environment for a new project
To follow the workflow I describe here, a couple of prerequisites have to be satisfied.
First, clone the central repository
This assumes basic git knowledge, so I won't cover the details. If you don't want to enter your GitHub credentials every time you push or pull, you should set up an SSH key locally and register it with GitHub.
Then, fork the central repository
Because our work will eventually be merged into a central code repository, which represents the latest official version of the project, we need a way to store our own work on GitHub without affecting the central repository. The easiest way to do this is to fork the repo:
- Navigate to the main repository on GitHub and click "Fork", then your account name.
Finally, set up your local git remotes for this project.
Git remotes are references to non-local copies of a git repository. For a useful workflow, at least two are required:
origin – Points to your fork of a project's repository
upstream – Points to the main repository for a project
- Rename the remote named origin to upstream. By default, git will set the origin remote to point to the repository you cloned from. In this case, assuming you've followed these instructions, that will be the main repository rather than your fork:
git remote rename origin upstream
- Add your fork of the repo as the origin remote:
git remote add origin GIT_URL
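The remote setup above can be tried without touching GitHub. This is a minimal sketch, assuming local repositories in a scratch directory stand in for the main repository and your fork; the paths and the `demo` variable are hypothetical, and a real setup would use the GitHub URLs in their place.

```shell
set -eu

# Scratch area; these local repositories are stand-ins, not real GitHub repos.
demo=$(mktemp -d)

# Stand-in for the main repository, seeded with one commit on master.
git init -q "$demo/central"
git -C "$demo/central" symbolic-ref HEAD refs/heads/master
git -C "$demo/central" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Initial commit"

# Stand-in for your fork: a bare copy, like a hosted repository.
git clone -q --bare "$demo/central" "$demo/fork.git"

# Clone the main repository; git names that remote "origin" by default.
git clone -q "$demo/central" "$demo/project"
cd "$demo/project"

# Rename it to upstream, then add the fork as origin.
git remote rename origin upstream
git remote add origin "$demo/fork.git"

# List the remotes to confirm where each one points.
git remote -v
```

With this layout, fetches from upstream bring in the official code, and pushes to origin land in your fork.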
Working with issues
An issue or bug describes a specific problem that needs to be solved on a project. Some examples are fixing a crash, updating documentation or implementing feature A. On occasion, an issue will be so big that it is better described as a series, or collection, of smaller issues. Issues of this type are called meta issues, and are best avoided unless absolutely necessary.
Issues serve a number of non-trivial purposes:
- They scope a piece of work, allowing someone to take responsibility for it.
- They provide a place for discussion of the work, and a record of those conversations.
- If well scoped, they provide a high-level view of what needs to be accomplished to hit release goals.
GitHub provides a way to mark issues with labels, providing an extra layer of metadata. These are useful in cases that are common on multi-developer projects:
- Prioritization of issues, marking them as critical, bug, crash or feature (among others)
- Identification of blockers, by marking connected issues as blocked or blocking
- Calls to action, such as needs review or needs revision
Creating labels is fairly easy.
A blocker, with respect to issues, is an issue whose completion is required before another issue can be completed. With good planning, blockers can mostly be avoided, but this isn't always true.
If an issue is blocking another issue, label it as blocker and, in the issue description, mark which issue it blocks.
Likewise, if an issue is blocked, label it as blocked and mark which issue blocks it.
Creating an issue
The line between over- and under-documenting work with issues is thin. Ideally, every piece of work should have an issue, but this relies on skillful identification of pieces of work. "Implement a feature" is a good candidate, while "add a forgotten semi-colon" probably isn't.
The key point to remember is that collaboration relies on communication, and issues provide a centralized location for discussion and review of work that is important to a project.
For this reason, as soon as you can identify an important piece of work that logically stands on its own, you should file an issue for it. Issues can always be closed if they are duplicates, or badly scoped.
After identifying a good candidate, follow these guidelines when creating an issue:
- Name the issue with a useful summary of the work to be done. If you can't summarize it, it's probably a bad candidate.
- Describe the issue properly. If it's a crash or bizarre behaviour, include steps to reproduce (STR)!
Milestones, project planning and triage
Just like issues represent a logical unit of work, milestones represent logical moments where development hits a larger target. They can be useful for prioritizing issues, and can even have a due date attached to them. They aren't always necessary, but can be very helpful when skillfully determined.
In a project you are a key member of, they should be discussed. The act of triaging is prioritizing issues and making sure that the most important ones are addressed first. Milestones can be useful in this pursuit.
While creating an issue, you can add it to a milestone easily.
A workflow is all the work, other than writing code, that goes into fixing a bug or solving an issue. The actual writing of code fits into the workflow, but it is useful to separate the ideas at first.
The steps in a workflow will logically flow from the contribution guidelines of a particular project, but a good framework can be established and applied in most cases:
- Claim an issue, usually by assigning yourself to it (if you have permissions) or by commenting on the issue saying you want to solve it.
- Create a local branch based on master, whose name indicates which issue you've selected and what the issue covers. E.g.
git checkout -b issue3-contributorGuidelines
- Develop your patch, and commit as needed.
- When ready for a review, push the branch to your fork.
- Open a pull request against the main repository.
- Flag a reviewer so they can get to work reviewing your code.
- Follow the review process.
- When the review is finished, condense your commits into their most logical form (see below) and force push your changes. NOTE: This will overwrite all the commits on your remote branch, so be sure you won't lose work:
git push origin -f BRANCH_NAME
- Merge your code in if you have permissions, either on GitHub itself or through the command line.
- Delete your local and remote branches for the issue. You've done it!
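The git side of these steps can be sketched end to end. This is an illustration under assumptions rather than a fixed recipe: local repositories in a scratch directory stand in for the main repository and your fork, the file `CONTRIBUTING.md` and issue number 3 are hypothetical, and the pull request, review and merge steps happen on GitHub and appear only as comments.

```shell
set -eu

# Scratch setup: local stand-ins for the main repository and your fork.
demo=$(mktemp -d)
git init -q "$demo/central"
git -C "$demo/central" symbolic-ref HEAD refs/heads/master
git -C "$demo/central" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Initial commit"
git clone -q --bare "$demo/central" "$demo/fork.git"
git clone -q "$demo/central" "$demo/project"
cd "$demo/project"
git remote rename origin upstream
git remote add origin "$demo/fork.git"
git config user.name "Demo"
git config user.email "demo@example.com"

# Create a local branch based on master, named after the issue.
git checkout -q -b issue3-contributorGuidelines master

# Develop the patch and commit it (hypothetical file and issue number).
echo "Contributor guidelines" > CONTRIBUTING.md
git add CONTRIBUTING.md
git commit -q -m "Fixed #3 - Add contributor guidelines"

# When ready for review, push the branch to your fork.
git push -q origin issue3-contributorGuidelines

# ... open a pull request, go through review, and merge on GitHub ...

# Afterwards, delete the local and remote branches.
git checkout -q master
# -D is used because this sketch never merges the branch locally;
# after a real merge has been pulled into master, -d is the safer choice.
git branch -q -D issue3-contributorGuidelines
git push -q origin --delete issue3-contributorGuidelines
```

Keeping the issue number in the branch name makes it easy to see which branches are safe to delete once their issues are closed.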
Like issues, commits must be well scoped. You should have one commit per logical unit of work. If issues are well scoped, this usually means one commit per issue. The purpose is to make it easy to undo logically separate pieces of work without affecting other code, so an issue that covers several such pieces might end up with more than one commit. Aim for one as you start, and it will keep your work focused.
As a final note, a good format for your commit messages is: "Fixed #XXX – Issue summary", where XXX is the issue number. When done this way, the issue you reference will be automatically closed when the commit is merged into the repository's default branch.
Opening a pull request
A pull request is a summary of the changes that will occur when a patch is merged into a branch (like master) on another repository. Opening one is easy with GitHub.
After pushing a branch, open the pull request from your fork's page on GitHub.
As always, make sure to communicate the pull request's purpose well, along with any important details the reviewer should know. This is a good place to flag a reviewer down.
The review process – having your code reviewed
During review, you and a number of reviewers will go over your patch and discuss it. When you need to make changes to the code based on a review, commit them separately from the main commits of your work for the issue. This helps preserve the comments on the pull request.
When your code has reached a point where it is ready for merging, you can combine your commits into their final form with the interactive rebase command. Interactive rebasing is a key git skill, but has serious destructive potential. Make sure to read the link in this paragraph in full before attempting it.
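As a rough illustration of condensing commits, the sketch below folds a review fix into the main commit and then force pushes. It makes several assumptions: a local bare repository stands in for your fork, the file and commit messages are hypothetical, and `GIT_SEQUENCE_EDITOR` with `sed` edits the rebase todo list so the example runs unattended. In everyday use you would run `git rebase -i master` and change `pick` to `fixup` or `squash` by hand in your editor.

```shell
set -eu

# Scratch setup: a working repository plus a bare stand-in for your fork.
demo=$(mktemp -d)
git init -q --bare "$demo/fork.git"
git init -q "$demo/project"
cd "$demo/project"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
git remote add origin "$demo/fork.git"
git commit -q --allow-empty -m "Initial commit"

# Main commit for the issue, pushed for review.
git checkout -q -b issue3-contributorGuidelines
echo "Draft guidelines" > CONTRIBUTING.md
git add CONTRIBUTING.md
git commit -q -m "Fixed #3 - Add contributor guidelines"
git push -q origin issue3-contributorGuidelines

# A separate commit made in response to review comments.
echo "Revised after review" >> CONTRIBUTING.md
git commit -q -a -m "Address review comments"
git push -q origin issue3-contributorGuidelines

# Interactive rebase: keep the first commit, fold every later one into it.
# sed rewrites "pick" to "fixup" from line 2 of the todo list onward.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,\$s/^pick/fixup/'" \
    git rebase -q -i master

# History was rewritten, so a plain push would be rejected; force it.
git push -q origin -f issue3-contributorGuidelines

# Show what remains on the branch relative to master.
git log --oneline master..HEAD
```

Because `fixup` discards the later commit messages, the surviving commit keeps the "Fixed #3" summary. Use `squash` instead if you want to merge the messages as well.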
The review process – reviewing someone's patch
A reviewer has two important jobs, sometimes split amongst two or more reviewers:
- Test the code
- Walk through the code thoroughly, commenting on changes that should be made.
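One way a reviewer might get a patch onto their own machine for testing is to add the contributor's fork as a remote and check out the branch under review. This is a sketch under assumptions: local repositories stand in for the main repository and the contributor's fork, and the remote name `contributor`, the branch name `review-issue3` and the file are all hypothetical.

```shell
set -eu

# Scratch setup: stand-ins for the main repository and a contributor's fork.
demo=$(mktemp -d)
git init -q "$demo/central"
git -C "$demo/central" symbolic-ref HEAD refs/heads/master
git -C "$demo/central" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Initial commit"

# The contributor's fork carries a branch with the patch to review.
git clone -q "$demo/central" "$demo/contributor-fork"
cd "$demo/contributor-fork"
git config user.name "Contributor"
git config user.email "contributor@example.com"
git checkout -q -b issue3-contributorGuidelines
echo "Contributor guidelines" > CONTRIBUTING.md
git add CONTRIBUTING.md
git commit -q -m "Fixed #3 - Add contributor guidelines"

# The reviewer works from their own clone of the main repository.
git clone -q "$demo/central" "$demo/reviewer"
cd "$demo/reviewer"

# Add the contributor's fork as a remote and fetch its branches.
git remote add contributor "$demo/contributor-fork"
git fetch -q contributor

# Check the patch out on a local branch so it can be built and tested.
git checkout -q -b review-issue3 contributor/issue3-contributorGuidelines

# Summarize what the patch changes relative to master.
git diff --stat master...review-issue3
```

Testing on a separate local branch keeps the reviewer's own work in progress untouched.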
Be polite, and explain your comments if necessary. If you aren't sure about something, invite discussion. The code's quality is the point.
A major difficulty for reviewers is finding time to review when writing patches of their own. This can be mitigated somewhat by discussing it with contributors ahead of time, so you can both be working on the code at once without interrupting development of your own patches.
Comments can be made directly on code in a pull request.
Proper communication on Github
Issue tracking's main appeal is providing a place to solve problems through discussion, and have that conversation available to reference from that point on. Pull requests and issues usually require some conversation. Key guidelines are mostly common-sense (respect each other, etc.) but some specific ones are:
- Check your GitHub notifications at regular intervals, so people get the feedback they need.
- Learn the GitHub markup language (a variant of Markdown) to help communicate with code examples, links and emphasis.
- Control expectations by being explicit about what you can and cannot handle.