One of the greatest things about GitHub is the ease with which you can fork a project and make your own changes.
One of the worst things about GitHub is the ease with which you can fork a project and then lose track of the changes upstream is making.
This is extra-true when you want to make occasional PRs for a project with regular commits; it's easy for your fork to get way behind in a way that makes it really annoying to write patches that'll merge cleanly later. This guide collects my personal experience in handling this situation with minimal pain.
One: Set an "upstream" remote
If you run
git remote -v in your repo, you'll see the remote repositories that your local copy knows about. These are things that are easy to push to. By default, it'll only contain the GitHub URL for your fork, labeled "origin". You're gonna be interacting with the original repo you forked from a lot, so you want to give it an easy name, too:
git remote add upstream https://github.com/ORIGINAL_OWNER/ORIGINAL_REPOSITORY.git
Now you can easily refer to the original repo as
Two: Regularly re-sync with upstream
Whenever you're about to start some new work in your fork, re-sync it with the upstream first, so you have all the new commits and are less likely to conflict badly:
git fetch upstream git checkout master git merge upstream/master git push
This should cleanly merge and just do a "fast-forward" on your repo, rather than actually making a merge commit, as long as you faithfully do what I suggest in the next step.
Three: NEVER COMMIT TO YOUR MASTER BRANCH
If you want a clean, easy workflow, you want to NEVER, EVER commit to your local master branch. The master branch's sole purpose is to track upstream exactly, so you can always cleanly work against the current version and write easy-to-merge PRs.
Instead, always make your changes in branches. When you push the branch to your remote, you can do a PR; when the PR is accepted, you can pull the newly-updated upstream and then delete your branch.
master always remains a source of upstream truth, unpolluted by your personal code unless blessed by the upstream maintainers.
(This is good advice in general; always make changes in branches and only commit to master when you're done, but I'm usually lazy and just always commit to master for personal projects. But it's super-important to get this right for forks, or else you're in for a lot of pain.)
Four: Rebase your branches regularly
Say you're in the middle of writing a new proposed feature (in a branch, of course) and you notice that upstream has some new commits that touch code you're going to be modifying soon. You'd like to get that code into your branch now, before you start modifying things, so you don't have merge conflicts later. But how?
First, resync your master as stated in step 2. Then, rebase your topic branch on top of the new upstream code:
git checkout MY_COOL_BRANCH git rebase master
This'll undo your commits, pull in the new stuff from upstream, then replay your commits on top of it, so when you eventually make a PR, the upstream maintainers will have a perfect clean merge, with your commits sitting on top of their latest code.
Five: Force-push commits that just fix PR nitpicks
When you eventually do submit your PR, the review will probably catch some small mistakes you need to fix. You can just make those fixes and push them to the PR branch, but then when it's eventually merged there will be a lot of silly little "fix typo" commits scattered around.
Instead, just fix the commit itself, using the
git commit --amend option, then force-push that to your remote with
git push --force. Force-pushing is normally a very bad idea, because git is usually rightfully trying to stop you from making a mistake when it rejects a push, but in this case changing the history of a short-lived PR branch is unimportant and makes things look cleaner overall.