One of the greatest things about GitHub is the ease with which you can fork a project and make your own changes.
One of the worst things about GitHub is the ease with which you can fork a project and then lose track of the changes upstream is making.
This is extra-true when you want to make occasional PRs for a project with regular commits; it's easy for your fork to get way behind in a way that makes it really annoying to write patches that'll merge cleanly later. This guide collects my personal experience in handling this situation with minimal pain.
One: Set an "upstream" remote
If you run git remote -v in your repo, you'll see the remote repositories that your local copy knows about. These are the repos you can fetch from and push to by name. By default, the list only contains the GitHub URL for your fork, labeled "origin". You're gonna be interacting with the original repo you forked from a lot, so you want to give it an easy name, too:
git remote add upstream https://github.com/ORIGINAL_OWNER/ORIGINAL_REPOSITORY.git
Now you can easily refer to the original repo as upstream.
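If you'd like to see what this looks like without touching a real project, here's a minimal sandbox sketch. The two local bare repositories and the sandbox path are hypothetical stand-ins for the GitHub URLs of the original repo and your fork; nothing here talks to GitHub.

```shell
set -e
sandbox=$(mktemp -d)

# Hypothetical stand-ins for the two GitHub URLs: local bare repositories.
git init -q --bare "$sandbox/original.git"   # plays ORIGINAL_OWNER/ORIGINAL_REPOSITORY
git init -q --bare "$sandbox/fork.git"       # plays your fork

# Cloning the fork gives you "origin" for free
# (the "cloned an empty repository" warning is silenced).
git clone -q "$sandbox/fork.git" "$sandbox/local" 2>/dev/null
cd "$sandbox/local"

# The step from this section: give the original repo an easy name.
git remote add upstream "$sandbox/original.git"

# Lists fetch and push URLs for both origin and upstream.
git remote -v
```

In a real fork, the only line you need is the git remote add one, with the actual GitHub URL in place of the sandbox path.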
Two: Regularly re-sync with upstream
Whenever you're about to start some new work in your fork, re-sync it with the upstream first, so you have all the new commits and are less likely to conflict badly:
git fetch upstream
git checkout master
git merge upstream/master
git push
This should cleanly merge and just do a "fast-forward" on your repo, rather than actually making a merge commit, as long as you faithfully do what I suggest in the next step.
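Here's a sketch of that re-sync running end to end in a throwaway sandbox. The repositories original.git, fork.git, and the "maintainer" clone are hypothetical stand-ins I made up for the demo. I've also swapped in git merge --ff-only, which is my own addition: it makes git stop with an error instead of quietly creating a merge commit if your master has somehow diverged.

```shell
set -e
# Throwaway identity so commits work in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
sandbox=$(mktemp -d)

# Hypothetical stand-ins: original.git plays upstream, fork.git plays your GitHub fork.
git init -q --bare "$sandbox/original.git"
git -C "$sandbox/original.git" symbolic-ref HEAD refs/heads/master
git clone -q "$sandbox/original.git" "$sandbox/maintainer" 2>/dev/null
git -C "$sandbox/maintainer" symbolic-ref HEAD refs/heads/master
git -C "$sandbox/maintainer" commit -q --allow-empty -m "upstream commit 1"
git -C "$sandbox/maintainer" push -q origin master

# "Fork" the original, clone the fork, and name the original "upstream".
git clone -q --bare "$sandbox/original.git" "$sandbox/fork.git"
git clone -q "$sandbox/fork.git" "$sandbox/local"
cd "$sandbox/local"
git remote add upstream "$sandbox/original.git"

# Upstream moves ahead while you are busy elsewhere.
git -C "$sandbox/maintainer" commit -q --allow-empty -m "upstream commit 2"
git -C "$sandbox/maintainer" push -q origin master

# The re-sync itself.
git fetch -q upstream
git checkout -q master
git merge -q --ff-only upstream/master   # fast-forward or fail, never a merge commit
git push -q origin master
```

After this, your local master, your fork's master, and upstream's master all point at the same commit.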
Three: NEVER COMMIT TO YOUR MASTER BRANCH
If you want a clean, easy workflow, you want to NEVER, EVER commit to your local master branch. The master branch's sole purpose is to track upstream exactly, so you can always cleanly work against the current version and write easy-to-merge PRs.
Instead, always make your changes in branches. When you push the branch to your remote, you can do a PR; when the PR is accepted, you can pull the newly-updated upstream and then delete your branch.
master always remains a source of upstream truth, unpolluted by your personal code unless blessed by the upstream maintainers.
(This is good advice in general: always make changes in branches and only merge into master when you're done. I'm usually lazy and just commit straight to master for personal projects, but it's super-important to get this right for forks, or else you're in for a lot of pain.)
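If you don't trust yourself to remember the rule, one option is to let git enforce it. This is a sketch of a pre-commit hook that refuses commits made directly on master; the hook body, the error message, and the branch name my-cool-branch are my own illustration, not something GitHub or git sets up for you.

```shell
set -e
# Throwaway identity so commits work in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "upstream state"   # made before the hook exists

# Install the guard. Hooks live in .git/hooks and are never pushed anywhere.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# Refuse commits made directly on master; topic branches pass through.
branch=$(git symbolic-ref --short -q HEAD)
if [ "$branch" = "master" ]; then
  echo "Refusing to commit on master; create a topic branch first." >&2
  exit 1
fi
EOF
chmod +x .git/hooks/pre-commit

# A commit on master is now rejected...
if git commit -q --allow-empty -m "oops" 2>/dev/null; then
  master_result=allowed
else
  master_result=blocked
fi

# ...while the same commit on a topic branch goes through.
git checkout -q -b my-cool-branch
git commit -q --allow-empty -m "work on a branch"
```

A fast-forward merge creates no commit, so the hook doesn't get in the way of re-syncing master with upstream.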
Four: Rebase your branches regularly
Say you're in the middle of writing a new proposed feature (in a branch, of course) and you notice that upstream has some new commits that touch code you're going to be modifying soon. You'd like to get that code into your branch now, before you start modifying things, so you don't have merge conflicts later. But how?
First, resync your master as stated in step 2. Then, rebase your topic branch on top of the new upstream code:
git checkout MY_COOL_BRANCH
git rebase master
This'll undo your commits, pull in the new stuff from upstream, then replay your commits on top of it, so when you eventually make a PR, the upstream maintainers will have a perfect clean merge, with your commits sitting on top of their latest code.
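To watch the replay happen, here's a small sandbox sketch. The file names are made up, and the commit made directly on master only simulates what a re-sync with upstream would bring in (in a real fork you'd never commit there yourself).

```shell
set -e
# Throwaway identity so commits work in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master

echo base > upstream.txt
git add upstream.txt
git commit -q -m "upstream: base"

# Start a feature in a branch, of course.
git checkout -q -b MY_COOL_BRANCH
echo feature > feature.txt
git add feature.txt
git commit -q -m "my feature"

# Simulated re-sync: master picks up a new upstream commit.
git checkout -q master
echo newer >> upstream.txt
git commit -q -am "upstream: newer"

# The step from this section: replay the feature on top of the new master.
git checkout -q MY_COOL_BRANCH
git rebase -q master

# Shows "my feature" sitting directly on top of "upstream: newer".
git log --oneline
```

One thing to keep in mind: the replayed commits get new hashes, so a branch you've already pushed will no longer match its remote copy after a rebase.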
Five: (Usually) Don't Force-Push Commits, Just Push
When you eventually do submit your PR, the review will probably catch some small mistakes you need to fix. Go ahead and just fix them, and push the new commits with the changes normally. They'll show up in the PR history, and older comments and conversations will stick around and give context to reviewers later.
If you do a significant change, where the diff isn't actually meaningful any longer, then it might make sense to do a force push:
git checkout your PR branch,
git log to find the first commit of your PR (or the one that you want to reset your progress onto, at least),
git reset --soft COMMIT-HASH-HERE to move your "HEAD" pointer back to that spot while keeping all of your later changes staged (a plain git reset would leave them unstaged, and the amend would then pick up nothing), then
git commit --amend to rewrite that commit to include all your new changes. Finally, force-push it with
git push --force, so GitHub won't reject your change as rewriting history.
(Force-pushing is normally a very bad idea, because git is usually rightfully trying to stop you from making a mistake when it rejects a push, but in this case changing the history of a short-lived PR branch is unimportant and makes things look cleaner overall.)
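Here's the whole collapse-and-force-push sequence as a sandbox sketch. The fork.git repository, the branch name my-pr-branch, and notes.txt are hypothetical. Two choices here are mine: git reset --soft, so the later changes stay staged and the amend actually picks them up, and git push --force-with-lease in place of a plain --force, which refuses to overwrite the remote branch if it has moved since you last fetched it.

```shell
set -e
# Throwaway identity so commits work in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
sandbox=$(mktemp -d)

# Hypothetical stand-in for your GitHub fork.
git init -q --bare "$sandbox/fork.git"
git clone -q "$sandbox/fork.git" "$sandbox/local" 2>/dev/null
cd "$sandbox/local"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "upstream state"
git push -q origin master

# A PR branch with one real commit and two review fixups, already pushed.
git checkout -q -b my-pr-branch
echo one > notes.txt
git add notes.txt
git commit -q -m "Add notes"
first=$(git rev-parse HEAD)            # the first commit of the PR
echo two >> notes.txt
git commit -q -am "fixup: typo"
echo three >> notes.txt
git commit -q -am "fixup: review feedback"
git push -q -u origin my-pr-branch

# Collapse everything back into the first commit.
git reset -q --soft "$first"           # HEAD moves back, later changes stay staged
git commit -q --amend -m "Add notes"   # the first commit now contains all the work
git push -q --force-with-lease origin my-pr-branch
```

The PR branch ends up with a single commit on top of master, and the fork's copy of the branch matches it.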
Note: This is NOT NECESSARY NORMALLY. Even if the repository prefers a "clean, linear history", without a lot of fixup commits showing up when they merge in the PR, they have that option now! The "Squash And Merge" option on a PR does this fixup for you - it squishes all of the PR's changes into a single commit, then commits it to the tip of the master branch.
This gives the best of both worlds - while you're working on a PR, you keep track of all your commits and can review past work or comments, but when you're done, all that messy detail gets eliminated and the project gains one nice commit.
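GitHub does the squash on its side when a maintainer clicks the button, so there's nothing for you to run. If you're curious what the result looks like, the closest local analogue I know of is git merge --squash. This sketch plays the maintainer's role in a sandbox; treat it as an approximation of the effect, not a description of how GitHub implements it. The branch and file names are made up.

```shell
set -e
# Throwaway identity so commits work in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "upstream state"

# A PR branch with a real commit plus a messy fixup.
git checkout -q -b my-pr-branch
echo one > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo two >> notes.txt
git commit -q -am "fixup"

# Maintainer side: stage the combined changes of the branch, then commit them once.
git checkout -q master
git merge -q --squash my-pr-branch
git commit -q -m "Add notes (squashed PR)"

# master shows one new commit; the fixup history stays behind on the branch.
git log --oneline master
```

The branch keeps its detailed history for as long as you keep the branch around, while master gains exactly one commit and no merge commit.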