wiki:GitDemoAndTutorial
Last modified on 03/06/14 13:47:02

Table of Contents

  1. Code Repositories
    1. The central repository on git.lsstcorp.org
    2. Anonymous read-only access
    3. Getting developer/write access
    4. Installing git
    5. Cloning (checking out) LSST packages
    6. Repository browser
    7. Permissions
      1. LSST/ repositories
      2. private/ repositories
      3. contrib/ and personal/ repositories
      4. Full gitolite config file
  2. LSST git workflow and branch management policy
  3. LSST git tag policy
  4. git Tutorials to Read and/or Watch
  5. Understanding git (by mjuric)
  6. All you need to know to get started (by RHL)
  7. git Crash Course
    1. Starting up
    2. Cloning an existing project
    3. Adding a new file
    4. Editing a file and committing the changes
    5. Viewing the commit tree
    6. Amending a commit
    7. Pushing upstream
    8. Branches
      1. Creating
      2. Adding a file
      3. Switching branches
      4. Listing which branches are available
      5. Making your branches available to others
      6. Checking out and tracking an existing branch from an upstream repository
      7. Listing commits on a branch
      8. Merging
    9. Tagging
    10. Oops, I should have done that on a ticket branch
    11. Some useful command-line prompt hacking
  8. F.A.Q.
    1. Why switch away from SVN (for LSST)
    2. Why git and not hg?
    3. How do I migrate from Mercurial? Can Mercurial and git interoperate?
    4. What about security? What prevents someone from deleting the master …
    5. What is the difference between git commit and git commit -a ?
    6. How do I restore a file to unmodified state ?
    7. What are best practices for developing on a branch?
    8. What are 'cache', 'index', and 'staging area'?
    9. I'm having problems with 'git push' and 'non-fast-forward merge' errors
    10. How can I avoid stepping on people's toes when making changes?
  9. gitolite remote commands
    1. Which repositories am I allowed to access?
    2. Creating a repository
    3. Deleting a repository
    4. Forking a repository
    5. Renaming or moving a repository
    6. Creating a repository in LSST/ hierarchy (admins only)
    7. Deleting an LSST/ repository (admins only)
    8. SSH-ing into the git account (admins only)
  10. Managing LSST gitolite users and permissions (admins only)
    1. Adding or removing users
    2. Removing big files that were accidentally added to a repository
    3. Backup
    4. gitolite setup notes for NCSA admins
    5. Notes

Code Repositories

The central repository on git.lsstcorp.org

LSST code repositories currently live at:

git@git.lsstcorp.org:LSST/...

To see a list of available repositories, do:

alias gitolite="ssh git@git.lsstcorp.org"
gitolite expand .

See the "gitolite remote commands" section for useful utilities to navigate and manage the LSST-hosted code repositories.

Anonymous read-only access

To use the http protocol, do something like:

git clone http://dev.lsstcorp.org/git/LSST/DMS/afw

The above will clone the repository for afw into the afw subdirectory of your current working directory.

Anonymous access using this method requires git client 1.6.6 or later.

Or to use the git protocol, use something like:

git clone git://dev.lsstcorp.org/LSST/DMS/afw

The above will clone the repository for afw into the afw subdirectory of your current working directory.

Use the cgit repository browser to see the list of available repositories. Once Trac is upgraded, we'll return to its browser to browse git repositories.

Getting developer/write access

You will need a gitolite access account to access LSST code repositories. Everyone who was a member of the svn Unix group on svn.lsstcorp.org should already have an account.

If you do not have an account, send an e-mail to lsst-admin@… with your desired username and SSH public key. There are no passwords; all authentication is via SSH public keys.

Installing git

The minimum recommended version of git is 1.6.4.3. With older versions some (not often used) functionality such as auto-creation of gitolite repositories may not work.

If you're using Linux, your distribution's package repository probably already has git packaged. If you're using macports on OS X, they have it. Otherwise, look here for getting the binary for your OS.

GUI: there are many GUIs for git (e.g., gitk comes standard with git). A particularly nice one for OS X is SourceTree.

Cloning (checking out) LSST packages

Do something like:

git clone git@git.lsstcorp.org:LSST/DMS/afw.git LSST/DMS/afw

The above will clone the repository for afw into the LSST/DMS/afw subdirectory of your current working directory.

Use the cgit repository browser (below) to see the list of available repositories, or the gitolite expand command (see "gitolite remote commands"). Once Trac is upgraded, we'll return to its browser to browse git repositories.

Repository browser

A cgit browser has been set up at http://dev.lsstcorp.org/cgit/ for browsing LSST repositories.

Note how ticket numbers link back to Trac tickets in commit messages (example).

It is also a convenient way to see which repositories are available.

Permissions

LSST-hosted repositories are managed using gitolite. There are two main groups of users, @admins and @devs, with varying degrees of permissions over the repositories. There are also two kinds of repositories: the official repositories (in LSST/), and the user repositories (in contrib/ and personal/).

LSST/ repositories

@devs are allowed to add commits to the master branch, to create branches named tickets/..., and create tags beginning with a digit. They are not allowed to rewind master or ticket branches (rewrite history), or delete tags.

The @devs, however, are allowed to both create and delete both branches and tags in u/USER/ namespace, where USER is their username. For example, mjuric has full control over creating, pushing into and deleting a branch named 'u/mjuric/mybranch' in any repository in the LSST/ hierarchy.

Admins can do anything, including history rewriting and branch/tag deletion.

private/ repositories

These repositories mirror the structure of LSST/ repositories, but are inaccessible to unauthenticated users (i.e., through cgit or using git-archive). They're intended for data and code containing proprietary information that cannot be shared with the public. Note that, in the spirit of openness, all LSST code should default to being public unless required otherwise.

At the moment, only private/LSST/Camera is in use.

contrib/ and personal/ repositories

Contributed and personal repositories are located in contrib/ and personal/USER/ namespaces, where USER is the user's username. Any developer can create new repositories in those directories. They can also delete repositories they've created. The difference between repos in contrib/ and personal/ is that all devs have write access to those in contrib/, while write access is (by default) limited only to the repository's creator for repos in personal/USER.

Rules of thumb: if you're contributing a piece of code on which you expect others to work as well, create it in contrib/. If you're creating (or forking) a repository for your personal hacking, create it in personal/USER (where USER should be replaced by your username). Note that you can give additional users permission to read and write to your personal/ repositories using gitolite's setperms command.

ssh git@git.lsstcorp.org setperms personal/<USER>/<REPO>.git
WRITERS @all
# <CTL-D> to end input 

Full gitolite config file

For those who'd like to know more, here is our current gitolite config (as of Nov 17th, 2011):

@admins = <snipped>

repo    gitolite-admin
        RW+     =   @admins

repo    testing
        RW+     =   @all

@devs = root
include "devs.conf"
include "readonly.conf"

repo    LSST/..*
        C                               = root          # A dummy user who can create repos. Use 'sudo' command together with 'mv' to move repos into the official LSST namespa
        R                               = @readonly     # Allow reading only to 'readonly' group (buildbot, etc...)
        RW+C                            = @admins       # Allow full control to admins
        RW                              = @devs         # Allow push to any existing branch
        RWC     tickets/[0-9]+$         = @devs         # Allow creating and pushing to tickets
        RWC     refs/tags/[0-9]         = @devs         # Allow creation of tags of the form X...., where X is a number
        RW+C    u/USER/                 = @devs         # Allow full control over personal branches
        RW+C    refs/tags/u/USER/       = @devs         # Allow full control over personal tags

                                                        # Contributed repositories, where all devs can write by default
repo    contrib/..*                                     # Allow user repositories in contrib
        C                               = @devs         # Any developer can create a contributed repo
        R                               = @readonly     # Allow reading only to 'readonly' group (buildbot, etc...)
        RW+                             = CREATOR       # Creator can do whatever they wish
        RW+                             = @admins       # Admins can do whatever they wish
        RW                              = @devs         # Others can read and write and create new branches/tags

                                                        # Personal repositories, where only the creator can write by default
repo    personal/CREATOR/..*                            # Allow personal repositories
        C                               = @devs         # Any developer can create a personal repo
        R                               = @readonly     # Allow reading only to 'readonly' group (buildbot, etc...)
        RW+                             = CREATOR       # Creator can do whatever they wish
        RW+                             = @admins       # Admins can do whatever they wish
        RW                              = WRITERS       # Creator-designated writers can write and create new branches/tags
        R                               = READERS @devs # Creator-designated readers and all developers can read

Read gitolite configuration docs to understand the details of the above.
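To make the LSST/ rules above a bit more concrete, here is a minimal sketch that approximates which refs a member of @devs may create. It is not how gitolite itself is implemented. It assumes (as I understand gitolite's rules) that a refex without a refs/ prefix is taken relative to refs/heads/, that it is matched from the start of the ref name, and that USER is replaced by the pushing user's name. The function name devs_may_create_ref is made up for this example.

```shell
# Hypothetical helper: approximates the "create" (C) rules for @devs in LSST/ repos.
# Usage: devs_may_create_ref USERNAME FULL_REF_NAME
# Assumes USERNAME contains no regular-expression metacharacters.
devs_may_create_ref() {
    printf '%s\n' "$2" | grep -Eq \
        -e '^refs/heads/tickets/[0-9]+$' \
        -e '^refs/tags/[0-9]' \
        -e "^refs/heads/u/$1/" \
        -e "^refs/tags/u/$1/"
}

for ref in refs/heads/tickets/1234 refs/heads/tickets/foo \
           refs/tags/4.7.0.0 refs/heads/u/mjuric/mybranch \
           refs/heads/u/someoneelse/mybranch; do
    if devs_may_create_ref mjuric "$ref"; then
        echo "mjuric may create   $ref"
    else
        echo "mjuric may NOT create $ref"
    fi
done
```

Note that this only covers creation. Pushing new commits to an existing branch is governed by the plain RW rule, and rewinds and deletions need the '+' permission.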

LSST git workflow and branch management policy

Note: this policy has been amended by DM/Policy/BranchingPolicies; where the two are in conflict, DM/Policy/BranchingPolicies prevails.

This workflow is based on github-flow.

It is (intentionally) quite similar to the current flow in svn, and shouldn't require a steep learning curve.

  1. Anything in the 'master' branch is deployable (== should always run). Developing directly on the master branch is forbidden.
  2. Feature (and bugfix) development always happens in branches (equivalent to current trac tickets). It is advisable to commit early and often to your branch. However, you should not merge the master into your feature branch unless you absolutely need some new feature that has been developed in the master in the meantime. This prevents complex-looking commit histories.
  3. When your feature is ready, ask for it to be reviewed. Some minor features may not need a review.
  4. A feature that passes code review is permitted to be merged into master. Merge it and GOTO 1.

For maintaining "stable" releases, create a separate branch (e.g., "stable-R4.0"), and apply exactly the same flow to that branch (except that only bugfixes would be allowed).

As an example, here are the keystrokes for developing a new feature:

# 1. check which branch you are on;
# most likely you want to be on the master branch
git branch

# 2. create new branch for ticket X, both locally and remotely
git checkout -b tickets/9999
git push -u origin tickets/9999

# 3. do work, commit often, push periodically
... do work ...
git commit -a
git push

# 4. when ready for review, merge it with master
# to pick up any changes made there and fix
# potential merge conflicts, and run the unit tests.
# Then ask for a review.
git pull
git merge master
... fix any merge conflicts ...
... run unit tests, make sure things still work ...
git push

# 5. when your feature passes review, merge it into master
git checkout master
git merge --no-ff tickets/9999
git push
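If you'd like to try this flow without touching a real LSST repository, the following sketch replays it against a throwaway bare repository standing in for 'origin'. Everything here is an assumption made for the demo: the temporary directory, the "Demo" identity, and the file names. The review step is of course skipped.

```shell
set -e
# Throwaway identity and sandbox, so nothing depends on your own git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# A local bare repository plays the role of the central 'origin'.
git init -q --bare "$work/origin.git"
git init -q "$work/clone"
cd "$work/clone"
git symbolic-ref HEAD refs/heads/master    # name the initial branch 'master' on any git version
git remote add origin "$work/origin.git"
echo base > README
git add README
git commit -q -m "Initial commit"
git push -q -u origin master

# Step 2: create the ticket branch locally and remotely.
git checkout -q -b tickets/9999
git push -q -u origin tickets/9999

# Step 3: do work, commit, push.
echo feature > feature.txt
git add feature.txt
git commit -q -m "Add feature (ticket 9999)"
git push -q

# Step 4: pick up anything new from master (nothing here) before review.
git merge -q master

# Step 5: after review, merge into master with an explicit merge commit.
git checkout -q master
git merge -q --no-ff -m "Merge tickets/9999" tickets/9999
git push -q
git log --graph --oneline
```

The --no-ff is what produces the visible merge commit in the graph; without it git would simply fast-forward master to the tip of the ticket branch.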

LSST git tag policy

When creating tags meant for the public (e.g., release tags), always use annotated tags (that is, 'git tag -a'). These record who created the tag and when, can be cryptographically signed, and can have a message attached to them. Example:

git tag -a -m "Version 4.7.0.0"     4.7.0.0

(or, with signing):

git tag -a -m "Version 4.7.0.0" -s 4.7.0.0 

git Tutorials to Read and/or Watch

Understanding git (by mjuric)

An explanation of how to think about git and how it internally does things. Also discusses merging and fast-forwards. Note that most of this is covered in the tutorials above, but if you're still confused, try reading it.

All you need to know to get started (by RHL)

DVCSes (git, hg, bzr, darcs, ...)

        working files
             |
             | add (not hg; explicit step in git (or commit -a))
            \|/
             v
       Files that I want to commit
             ^
            /|\
             |
             | commit/update
             |
            \|/
             v
         Local repo  <-- pull/push -->  Remote repo

CVS/SVN/Perforce/etc.:

        working files _
                     |\
                       \
                        \
                         \
                          \
                           \________________
                                            |
                                            | commit/update
                                           \|/
                                            v
                                         Remote repo

git Crash Course

Starting up

git config --global user.name "Firstname Lastname"		# Configure your name; this will appear in commits
git config --global user.email "your_email@youremail.com"	# Configure your e-mail; this will appear in commits
git config --global color.ui true					# Use colors if terminal is capable
git config --global push.default tracking					# Make 'git push' push only the current branch, and not all of them (see the FAQ); git 1.7.4.2 and later call this setting 'upstream'

Find more defaults to play with here. You may also be interested in the bash completion script (note: this comes packaged with git in some distributions).

Cloning an existing project

git clone git@git.lsstcorp.org:LSST/DMS/afw.git LSST/DMS/afw		# Clone the LSST/DMS/afw project into the LSST/DMS/afw directory
cd LSST/DMS/afw

Adding a new file

echo '# New build system!' > CMakeList.txt
git status						# See the status of files in the working directory
git status -s						# The same in format familiar to SVN users
git add CMakeList.txt					# Add a new file to be tracked by git
git status
git commit						# Commit the changes (the file addition)
git log

Editing a file and committing the changes

echo '#more stuff' >> CMakeList.txt			# Change the file
git status						# See that the file is now "dirty"
git diff						# See the changes
git commit -a						# Commit all changes (don't forget the -a!)

Viewing the commit tree

git log			# see the new state
git log --stat		# also see what has changed
gitk			# graphical tool
gitx			# another graphical tool (OS X)

Amending a commit

git commit --amend	# Use it to change the most recent commit message (and more)

Pushing upstream

git status              # Note that it says the branch is ahead of origin/master by two commits
git push                # This makes the changes available to everyone (they become a part of official LSST code history)

Branches

Creating

git branch						# View what branch we're on
git branch tickets/9999	# Create a new branch named 'tickets/9999'
git branch						# Note that the current branch has not changed
git checkout tickets/9999	# Check out the new branch (like 'svn switch')
git branch

or, you can do it in one line:

git checkout -b tickets/9999 HEAD  # Check out HEAD into a newly created branch 'tickets/....' and switch to it

Adding a file

echo "// still empty" > src/image/ExtendedSources.cc
git add src/image/ExtendedSources.cc
git commit
git log

Switching branches

ls -lrt src/image/			# Note the file is there
git checkout master			# Switch to branch 'master'
ls -lrt src/image/			# Note the file is gone

Listing which branches are available

git branch			# List local branches
git branch -r			# List remote branches
git branch -a			# List all branches (both local and remote)

Making your branches available to others

git push -u origin tickets/9999	# Push branch tickets/..... to remote repository 'origin', and set it up so we can pull from the remote branch in the future (-u)

Use "git pull --rebase" instead of just "git pull" when working on a branch with someone else; this will avoid unnecessary merge commits without rewriting any history that has already been pushed.

Checking out and tracking an existing branch from an upstream repository

git fetch 				# make sure we're in-sync with remote repositories
git checkout -t origin/tickets/8888	# Checks out the branch 'tickets/8888' from remote repository 'origin' into a local tracking branch of the same name
					# Note: newer versions of git allow just 'git checkout tickets/8888'

or

git fetch
git checkout -t -b multifit origin/tickets/8888	# Checks out 'tickets/8888' from remote 'origin' into a local tracking branch named 'multifit'

Listing commits on a branch

git log origin/master..origin/tickets/8888 # Lists commits reachable from 'tickets/8888'
                                           # that are not reachable from master
                                           # (i.e. excludes any commits merged to the
                                           # ticket from master)
git diff origin/master...origin/tickets/8888 # Displays differences caused by the above
                                             # commits.  ***NOTE*** that there are *three*
                                             # dots in this syntax; with "git diff" they mean
                                             # "changes on the ticket since it forked from master".
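The two-dot versus three-dot distinction is easy to get wrong, so here is a small self-contained sketch that shows the difference on a scratch repository. The directory, identity, and file names are invented for the demo, and local branches are used instead of origin/... to keep it short.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo 0 > base.txt;   git add base.txt;   git commit -q -m "base"

# One commit on the ticket branch...
git checkout -q -b tickets/8888
echo t > ticket.txt; git add ticket.txt; git commit -q -m "ticket work"

# ...and, meanwhile, one unrelated commit on master.
git checkout -q master
echo m > master.txt; git add master.txt; git commit -q -m "master work"

# Commits on the ticket that are not on master.
log_only=$(git log --format=%s master..tickets/8888)

# Three dots: what the ticket changed since it forked from master.
three_dot=$(git diff --name-only master...tickets/8888)

# Two dots: plain tip-to-tip comparison, which also drags in master's changes.
two_dot_count=$(git diff --name-only master..tickets/8888 | grep -c .)

echo "log master..ticket   : $log_only"
echo "diff master...ticket : $three_dot"
echo "diff master..ticket  : $two_dot_count files differ"
```

For reviewing a ticket, the three-dot diff is usually the one you want, because it is not polluted by whatever happened on master after the branch was created.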

Merging

git checkout master						# ensure we're on master
git pull							# ensure we're up-to-date
git merge --no-ff tickets/9999	# Merge the tickets/9999 branch
ls -lrt src/image/						# Note the new file is here
git log								# Show the merge commit
git log --graph							# This is better
git push							# Upload changes to main LSST repo

Tagging

git tag -a 5.0.0.0		# Create an annotated tag (a tag with a message)

or

git tag -s 5.0.0.0		# Create a gpg-signed tag

You can use -m MSG with -a to save starting an editor. N.B.: you must use -a or -s; otherwise git describe will (by default) ignore your tag.

Then

git log --graph --decorate	# See the tag you just made
git push --tags			# Push all your tags upstream
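To see why annotated tags matter for git describe, here is a minimal sketch on a scratch repository. The tag name 'light', the file name, and the "Demo" identity are made up for the illustration.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q .
echo 1 > f
git add f
git commit -q -m "first"
git tag -a -m "Version 5.0.0.0" 5.0.0.0    # annotated tag on the first commit

echo 2 >> f
git commit -q -a -m "second"
git tag light                               # lightweight tag on the second commit

default_desc=$(git describe)                # skips 'light', counts from 5.0.0.0
with_tags=$(git describe --tags)            # --tags also considers lightweight tags
echo "git describe        -> $default_desc"
echo "git describe --tags -> $with_tags"
```

The first form reports the annotated tag plus the number of commits since it and an abbreviated commit id; only the second form notices the lightweight tag.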

Oops, I should have done that on a ticket branch

Adapted from http://schacon.github.com/git/git-reset.html

I thought it was going to be a tiny bug-fix that I could commit straight to master, but it grew into something that should be done on a ticket. (This is for when you have already committed the changes with git commit, but not yet pushed them with git push.)

$ git checkout master
Already on 'master'
Your branch is ahead of 'origin/master' by 5 commits.

(Remember that number, 5.)

If you're making a new ticket for this fix,

$ git branch tickets/9999
$ git reset --hard HEAD~5
$ git checkout tickets/9999

and keep working as usual.

If you want to apply your commits to an existing ticket branch,

$ git branch temp
$ git reset --hard HEAD~5
$ git checkout tickets/2019
$ git merge temp
$ git branch -d temp

(but this will also merge all other commits to master into your ticket branch).
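If you'd rather not drag the other master commits into your ticket branch, one alternative (not part of the recipe above, just a sketch) is to replay only the misplaced commits with git cherry-pick. The example below builds a scratch repository in which two commits, rather than five, were made on master by mistake. It assumes a git new enough to accept a commit range in cherry-pick (1.7.2 or later, as far as I know); all names are invented.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo base > base.txt;    git add base.txt;     git commit -q -m "base"
git branch tickets/2019                        # existing ticket branch, forked at 'base'
echo up > upstream.txt;  git add upstream.txt; git commit -q -m "someone else's work on master"

# Two commits that should have gone onto the ticket branch.
echo 1 > fix1.txt;       git add fix1.txt;     git commit -q -m "fix, part 1"
echo 2 > fix2.txt;       git add fix2.txt;     git commit -q -m "fix, part 2"

git branch temp                  # remember where the misplaced commits end
git reset -q --hard HEAD~2       # rewind master to where it should be
git checkout -q tickets/2019
git cherry-pick master..temp     # replay only the misplaced commits onto the ticket
git branch -D temp               # -D, because temp was never merged anywhere
ls
```

The cherry-picked commits get new SHA1s, which is fine here precisely because they were never pushed.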

Some useful command-line prompt hacking

git is distributed with a shell script, contrib/completion/git-prompt.sh, that defines a function, __git_ps1, that's useful for displaying the branch and status of a git repository in your command-line prompt. This file seems to be generally installed by distributors, and so you probably already have __git_ps1 defined in your environment. If not, grab that shell script and source it. If you can't get it, or don't want it, then a poor-man's version is supplied below.

The behaviour of __git_ps1 is configurable with the following environment variables: GIT_PS1_SHOWDIRTYSTATE (define to non-empty value; then * indicates unstaged changes and + indicates staged changes), GIT_PS1_SHOWSTASHSTATE (define to non-empty value; then $ indicates non-empty stash), GIT_PS1_SHOWUNTRACKEDFILES (define to non-empty value; then % indicates the presence of untracked files) and GIT_PS1_SHOWUPSTREAM (define as auto; then < indicates you're behind the upstream and can merge, > indicates you're ahead of the upstream and can push, <> indicates you've diverged, and = indicates there's no difference).

The result is something like:

user@machine:~/LSST/afw (tickets/1234>) $ 

(i.e., I'm on branch tickets/1234 with committed changes that I can push). All you have to do is add $(__git_ps1) at the desired location in your current PS1 definition.

So, here's what I use:

# The following two functions provide a basic alternative for git.git/contrib/completion/git-prompt.sh
# in case it's not available
function prompt_git_dirty {
    local gitstat=`git status 2> /dev/null`
    local charstat=""
    [[ -z $(echo $gitstat | grep "nothing to commit") ]] && charstat="\%"
    [[ -n $(echo $gitstat | grep "Your branch and '.*' have diverged") ]] && echo "${charstat}\<\>" && return
    [[ -n $(echo $gitstat | grep 'Your branch is ahead of') ]] && echo "${charstat}\>" && return
    [[ -n $(echo $gitstat | grep 'Your branch is behind') ]] && echo "${charstat}\<" && return
    echo $charstat
}
function prompt_git_branch {
  git branch --no-color 2> /dev/null | sed -e '/^[^*]/d' -e "s/* \(.*\)/[\1$(prompt_git_dirty)]/"
}

# Setup for git.git/contrib/completion/git-prompt.sh
export GIT_PS1_SHOWDIRTYSTATE=1
export GIT_PS1_SHOWSTASHSTATE=1
export GIT_PS1_SHOWUNTRACKEDFILES=1
export GIT_PS1_SHOWUPSTREAM="auto"
type __git_ps1 1>/dev/null 2>&1 || alias __git_ps1=prompt_git_branch

PS1='\[\e[1;32m\]\u@\h\[\e[0;39m\]:\[\e[1;34m\]\w\[\e[1;31m\]$(__git_ps1)\[\e[0;1m\] \$ \[\e[0;39m\]'

F.A.Q.

Why switch away from SVN (for LSST)

The current proposed workflow is intentionally similar to svn, to ease and speed up the transition. One may wonder why then change at all? Here are some reasons:

Work offline
With git (and other DVCS-es), you can work (make commits) even when offline, and sync them up when you connect. That way your 3-days-of-work-while-stuck-in-an-airport-with-no-internet will not appear as one giant and convoluted commit.
Speed and efficiency
git is fast and storage efficient. It was built for large projects. It handles large binary files well. Common operations like committing, updating, tagging, branching, merging, viewing the commit log, take fractions of a second with git, even on large projects (think Linux kernel). SVN may take seconds to tens of seconds to do the same. This completely changes how one thinks of branching and merging; these now become everyday tools. git is also good at conserving disk space: the entire LSST SVN repository, including all history, is just 11GB in git (dominated by the size of test data).
Ease of branching and merging
git can create and merge branches in fractions of a second, and merges are usually pain-free unless there's an obvious conflict (i.e., the same line has been changed in both branches). If there is a conflict, git helps you resolve it. git was built with branches in mind; there are no kludges like 'svn cp' to emulate true branches with what are essentially copies in a separate directory.
Ease of tagging
Tagging in git is instant; unlike SVN, there are no copies involved. Tags can be cryptographically signed.
Safety and security
Every object in git's repository (its "database") is protected by a SHA1 checksum. Repository consistency is verifiable; any disk corruption can be detected. The repository cannot be tampered with; any tampering will result in a detectable change of SHA1 hashes. In combination with cryptographically signed tags, a given source release derived from a git repository can be verified and guaranteed not to have been tampered with.
Distributed development
Finally, git allows for distributed development. When you check out a project with git, you not only get the latest revision, but you get 'all' the revisions ever made of that project (the entire commit history, the copy of the entire repository). Your local copy becomes an identical clone. Therefore, every developer has a full clone of the project, and all git clones (repositories) are equal. The "central" repository, if any, is "central" only by convention; changes made on its clones can be shared not only with the "central" repository, but directly between the clones themselves (i.e., you can push some experimental code from your copy to your colleague's copy, without checking code that isn't yet ready into the central repository). Also, all of these clones serve as backup; if the central repository goes down, it can be easily restored from any of the clones.

As mentioned above, the proposed workflow intentionally mirrors what we used with SVN, to ease the transition. Once the team becomes more comfortable with git, and starts using some of its more advanced features, additional benefits will become obvious.

Why git and not hg?

I've talked about this at length in my git proposal, but, to summarize here, I prefer git because of a few tools that make a developer's everyday life easier:

'git stash'
The ability to quickly stash-away changes in the working directory to (for example) quickly fix a bug that has popped up elsewhere
'git cherry-pick'
The ability to cherry pick commits from one branch to another (e.g., to port only select fixes from development to release branch)
git staging area (git add and git add -p)
git can be told to commit only a subset of all modified files, and even subsets of changes in any given file. This lets you organize your changes in logically self-contained commits, instead of lumping together large unrelated modifications.
history rewriting (git rebase, git commit --amend, ...)
git's "history rewriting" tools are unparalleled in the DVCS world. While seeming initially odd (and against what you've always been taught to do with SVN), once you start using them they become an irreplaceable tool in your tool belt. They allow you to correct mistakes, commit more often while developing without fear that you'll "pollute the history", maintain patch queues, easily develop and maintain forked code built on top of other codebases (think HSC as built on top of LSST, or lsstpkg as a derivative of EUPS, etc.)
ubiquity and tool support
git is supported by virtually every hosted platform out there. Even Google Code and bitbucket (!!) have started supporting it this summer. It is well supported by development tools (XCode, Eclipse, KDevelop, emacs). Long-term it has a secure future, and the adoption rate among the general public seems higher than for mercurial. This will be an advantage when the community starts developing their own extensions on top of our codebase.
well designed branching
git's design and handling of branching is, IMNSHO, fundamentally better. Mercurial's original design of branching required multiple working copies (literally, multiple directories on the disk; closer to git clones than branches). They've since introduced named branches, anonymous branches, and bookmarks, to get to approximate feature equivalence with git. Git has only one, conceptually clear, type of branch. This makes it simpler to understand. Another big difference is that Mercurial stores the name of each branch within every commit made on that branch; therefore, if you name your temporary branch 'my-crazy-experiment', that name will stick to the code forever.

Mercurial has recently begun adding support for some of the above features, but most of them are currently implemented as optional/experimental extensions. Git comes with all those features built-in.

You may also want to peruse the Why is git better than X website.

How do I migrate from Mercurial? Can Mercurial and git interoperate?

Migration from Mercurial should be straightforward using the fast-export script (see a description here).

If you're not yet ready to migrate, you can use hg-git to access git repositories using a Mercurial client. This allows you to push/pull changes from/to your legacy Mercurial repository.

What about security? What prevents someone from deleting the master branch?

By default, nothing. There are configuration settings to disallow deleting (or rewinding) of branches. But for a fine-grained solution, we should set up and use gitolite.

What is the difference between git commit and git commit -a ?

See this great explanation here, that I didn't find until I wrote the text below (sigh)...

Committing changes to a git repository is a two-step process:

  • Step 1: specify which of the (possibly many) modified files you would like to commit as a part of this change set
  • Step 2: execute the commit.

The two above steps equate to the following commands:

  • Step 1: git add modifiedFile1.cc modifiedFile2.cc modifiedFile3.cc ...
  • Step 2: git commit

Most version control systems (including SVN and hg) omit Step 1. and always assume you want to commit all files that have been modified. Git is not as presumptuous, because there are sometimes good reasons why you'd want to split the modifications into two different commits (e.g., if you've modified 10 files while developing a new feature, while the one-line modification in the 11th file was an unrelated bug that you stumbled upon and fixed in the process). Now, what if you do want to commit changes to all modified files (or if you're used to SVN behavior and see no point in extra typing)? Then use:

  • git commit -a

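The '-a' switch tells git to run an implicit git add for all modified (and deleted) tracked files in the working directory before performing the commit. Brand-new, untracked files are not included; those still need an explicit git add.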
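Here is a minimal sketch of the difference on a scratch repository; the file names mirror the ones used above, while the directory and the "Demo" identity are assumptions made for the demo.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q .
echo one > modifiedFile1.cc
echo one > modifiedFile2.cc
git add modifiedFile1.cc modifiedFile2.cc
git commit -q -m "Add two files"

# Modify both tracked files, and create a brand-new (untracked) one.
echo two >> modifiedFile1.cc
echo two >> modifiedFile2.cc
echo new > newFile.cc

# Two-step commit: only what was 'git add'-ed goes in.
git add modifiedFile1.cc
git commit -q -m "Change modifiedFile1.cc only"
first=$(git diff --name-only HEAD~1 HEAD)

# 'commit -a': every modified tracked file goes in, untracked files do not.
git commit -q -a -m "Commit the rest with -a"
second=$(git diff --name-only HEAD~1 HEAD)
untracked=$(git ls-files --others --exclude-standard)

echo "plain commit contained : $first"
echo "commit -a contained    : $second"
echo "still untracked        : $untracked"
```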

How do I restore a file to unmodified state ?

git checkout HEAD myfile.txt

The way to read this command is: 'Dear git, please check out from HEAD the file myfile.txt'. In git, HEAD refers to the commit you currently have checked out (normally the tip of the current branch). You can probably already tell that if I wrote git checkout otherbranch myfile.txt git would check out the file from otherbranch. It's even more general than that: instead of a branch name, you can give it any tree-ish out of which to extract the file.
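A small sketch of both forms, run on a scratch repository (directory, identity, and file contents are invented for the demo):

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "master version" > myfile.txt
git add myfile.txt
git commit -q -m "Add myfile.txt"

git checkout -q -b otherbranch
echo "otherbranch version" > myfile.txt
git commit -q -a -m "Change myfile.txt on otherbranch"
git checkout -q master

# Make an edit we regret, then throw it away.
echo "local edit I regret" >> myfile.txt
git checkout HEAD myfile.txt
restored=$(cat myfile.txt)

# Take the file as it is on otherbranch; we stay on master.
git checkout otherbranch myfile.txt
from_other=$(cat myfile.txt)
current=$(git rev-parse --abbrev-ref HEAD)

echo "after checkout HEAD        : $restored"
echo "after checkout otherbranch : $from_other (still on $current)"
```

Be aware that the first command discards your uncommitted edits to that file for good; git keeps no copy of them.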

What are best practices for developing on a branch?

Often the features you're developing take a long time to mature. Therefore your feature branch (also sometimes called a "topic branch") may lag behind master quite a lot by the time you're done. What should you do? Should you "sync up" often by merging 'master' into your feature branch, or should you wait and fix any conflicts until the very end?

Junio Hamano has an excellent post on this that is a MUST to read. To summarize:

  • Merge your feature branch into the master only when it's complete (up to bugfixes).
  • Merge the master into your feature branch only when there's a new feature in master that the code in your branch needs to use

This strategy minimizes the number of merges in the history of the project, which helps with tools like 'git bisect' (automated finding of commits that caused bugs/regressions). And if you're nervous about doing all the conflict resolution at the very end, look into git rerere.
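If you want to try git rerere, it has to be switched on first. The sketch below enables it for a single scratch repository (the temporary directory is just for the demo); adding --global to the git config call would enable it for all your repositories instead.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git config rerere.enabled true     # record conflict resolutions and reuse them next time
git config rerere.enabled          # show the current setting
```

Once enabled, git remembers how you resolved a given conflict and replays that resolution when the same conflict shows up again, e.g. in the final merge of a long-lived ticket branch.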

What are 'cache', 'index', and 'staging area'?

To first (and second, and probably third) order, they're the same: the staging area where you place the files (using 'git add') that are to be a part of the next commit. That there are three terms for one and the same thing is a historical artefact.

I'm having problems with 'git push' and 'non-fast-forward merge' errors

See if this answers your question.

How can I avoid stepping on people's toes when making changes?

See this page on interacting with gits.

LSST git repository hosting

We use gitolite to manage LSST repositories. If you're a new user, read here to get started.

gitolite remote commands

A number of useful commands can be triggered by both admins and devs, by ssh-ing to git@git.lsstcorp.org. These are gitolite's admin-defined commands (ADCs). The following bash alias is also useful (and used in the examples below):

alias gitolite="ssh git@git.lsstcorp.org"

Which repositories am I allowed to access?

Do:

gitolite info

Note that the patterns shown in the output are regular expressions, not shell globs (i.e., '.' character is a stand-in for 'any character', and not a dot as it is with shell globs).
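To make the regex-versus-glob distinction concrete, the sketch below tests one pattern against two names, once as a regular expression and once as a shell glob. The pattern and the repository names are hypothetical and are not taken from the actual gitolite configuration.

```shell
# Hypothetical pattern and repository names, for illustration only.
pattern='contrib/a.b'
regex_hits=0
glob_hits=0
for name in contrib/a.b contrib/axb; do
    # As a regular expression, '.' matches any single character.
    if printf '%s\n' "$name" | grep -Eqx "$pattern"; then
        regex_hits=$((regex_hits + 1))
    fi
    # As a shell glob, '.' matches only a literal dot.
    case "$name" in
        $pattern) glob_hits=$((glob_hits + 1)) ;;
    esac
done
echo "regex matched $regex_hits, glob matched $glob_hits"
```

The regular expression accepts both names, while the glob accepts only the one with a literal dot.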

To see all repositories that these patterns expand to:

gitolite expand

Creating a repository

Repositories are not explicitly created; they're auto-created when needed. When you try to clone, push, or perform any operation on a nonexistent repository, one will be auto-created for you (assuming you have the permissions to create it there). For example:

mkdir newthing
cd newthing
git init
. . .  write code, make a new commit . . .
git remote add origin git@git.lsstcorp.org:contrib/newthing.git
git push --all -u
git push --tags

Warning: creating a repository in the LSST hierarchy requires a special procedure.

Deleting a repository

To delete a repository (assuming you are its owner), use the 'trash' command:

gitolite trash contrib/newthing.git

This command doesn't actually delete the repository; it only moves it to the trash. Use 'list-trash' to see what's in the trash, and 'restore' to restore a deleted repository from the trash.

To permanently delete a repository, an admin needs to ssh to the git@… account and rm -rf it from ~/repositories/deleted. This is a safety feature (i.e., it should be very hard to permanently delete anything).

Forking a repository

Forking makes a server-side copy (a clone) of an existing repository, which you can then clone and change freely (e.g., for experimentation). Example:

gitolite fork LSST/DMS/afw personal/mjuric/afw
git clone git@git.lsstcorp.org:personal/mjuric/afw

This now gives me a full clone of afw, with full rights to all of its tags/branches/etc.

Note: the server-side clone is very space-efficient (hardlinking is used where possible).

If you want to fork the repository to a remote site (e.g., for control of the stability of the stack while retaining the ability to exchange commits), see this.

Renaming or moving a repository

Use:

gitolite mv <fromname> <toname>

Unlike UNIX 'mv', toname cannot be a directory; it must be a fully-qualified repository name. Example:

gitolite mv contrib/mything.git contrib/bettername.git

Note that the .git extension is optional.

Creating a repository in LSST/ hierarchy (admins only)

One cannot directly create (or delete, but you can trash) repositories in LSST/, but if you're a member of @admins, you can move an existing user repository there using:

gitolite sudo root mv <fromrepo> <torepo>

Why: Since repositories are auto-created if they don't exist (see above), allowing even a small subset of admins to create them in the LSST/ hierarchy may lead to lots of spurious repos due to typos. Because of that, only the user 'root' is allowed to create (or delete) repositories in LSST/, and user 'root' can only be accessed using the 'sudo' command. This adds an additional layer of safety.

Deleting an LSST/ repository (admins only)

gitolite sudo root trash LSST/therepotodelete.git

SSH-ing into the git account (admins only)

Use a password to get shell access (as keys will redirect to gitolite). To force SSH to skip public key auth, do:

ssh -o PubkeyAuthentication=no git@git.lsstcorp.org

Managing LSST gitolite users and permissions (admins only)

User and permission management is done via configuration files in the standard gitolite-admin repository. You have to be a member of the @admins gitolite group to access this repository. Membership consists of roughly one person per DM site. For the exact list, contact lsst-admin@...

For user management activities, first clone gitolite-admin to your local machine:

git clone git@git.lsstcorp.org:gitolite-admin

Adding or removing users

Add the user's SSH public key to a file named keydir/username@0.pub in the gitolite-admin repository. For example, to add user mjuric, add mjuric's public key into

keydir/mjuric@0.pub

If the user has more than one key, add as many as necessary by changing the numeric suffix following the @ sign (e.g., keydir/mjuric@1.pub, etc.).
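The naming scheme can be sketched as a small loop. Everything here is a stand-in: the temporary directory plays the role of a gitolite-admin clone, and laptop.pub and desktop.pub are placeholder files rather than real keys.

```shell
# Temporary directory standing in for a gitolite-admin clone (assumption).
admin=$(mktemp -d)
mkdir -p "$admin/keydir"
user=mjuric

# Placeholder public key files; real ones would come from the user.
printf 'ssh-rsa PLACEHOLDER-KEY-ONE demo@laptop\n' > "$admin/laptop.pub"
printf 'ssh-rsa PLACEHOLDER-KEY-TWO demo@desktop\n' > "$admin/desktop.pub"

# Copy each key to keydir/<user>@<n>.pub, numbering from 0.
n=0
for key in "$admin/laptop.pub" "$admin/desktop.pub"; do
    cp "$key" "$admin/keydir/${user}@${n}.pub"
    n=$((n + 1))
done
ls "$admin/keydir"
```

In a real gitolite-admin clone the new files would then be added with git add before committing.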

Next, add the user to conf/devs.conf, to the @devs group. For example:

@devs = mjuric

Commit the changes, and push them upstream to make the changes effective:

git commit -a
git push

To remove users, simply remove their public keys and their entries from the conf/devs.conf file, then commit and push.

Removing big files that were accidentally added to a repository

NOTE: This is history rewriting and should be done only after consulting mjuric.

-------- Original Message --------
Subject: permanently removing a file from git - working recipe
Date: Fri, 29 Mar 2013 13:54:41 -0700
From: Jacek Becla <becla@slac.stanford.edu>

Some background for Robert and Mike: our collaborators pushed 0.5 GB worth of 
test data into our qserv repo, which I had to clean up.

Here is the recipe:
1. git clone, plus don't forget to have another clone
    for safety, comparing etc

2. git filter-branch -f --index-filter \
    'git rm --force --cached --ignore-unmatch <pathToFile>' \
     -- --all

in our case the <pathToFile> looked like
"tests/case02/data/Object.txt"
"tests/case02/data/Source.txt"
etc
Repeat for every file you are removing.

3) rm -Rf .git/refs/original && \
    git reflog expire --expire=now --all && \
    git gc --aggressive && \
    git prune

4) git push --force

5) checkout the branch where the offending files were
   committed and git push --force that branch

That seemed to work in our case.

Jacek

Backup

To back up all of the repositories, simply back up everything in ~git. Note that ~git/repositories is a symlink to /lsst_ibrix/gitolite/repositories, so the symlink target must be backed up as well.
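One possible way to capture both in a single archive is to have tar follow symlinks with -h. This is only a sketch on a mock directory layout, not the actual backup procedure in use; the paths below are temporary stand-ins for ~git and /lsst_ibrix/gitolite/repositories.

```shell
# Mock layout: $home stands in for ~git, $store for /lsst_ibrix/gitolite (assumptions).
home=$(mktemp -d)
store=$(mktemp -d)
mkdir -p "$store/repositories/demo.git"
echo "ref: refs/heads/master" > "$store/repositories/demo.git/HEAD"
ln -s "$store/repositories" "$home/repositories"

# -h makes tar archive the files a symlink points to, not the link itself.
backup=$(mktemp)
tar -chf "$backup" -C "$home" .
archived=$(tar -tf "$backup")
```

Without -h the archive would contain only the symlink, and the repositories themselves would be missing from the backup.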

gitolite setup notes for NCSA admins

  • We use gitolite to control and manage access to LSST repositories. Please familiarize yourself with gitolite before fiddling with the configuration.
  • gitolite has been set up on the account 'git' at ds33.ncsa.uiuc.edu (a.k.a. git.lsstcorp.org)
  • because ds33 is a slow machine, $GL_BIG_CONFIG=1 and $GL_NO_DAEMON_NO_GITWEB=1 have been set in ~/.gitolite.rc. This speeds up authentication/push/pulls quite a bit
  • set $REPO_UMASK = 0027 (in ~/.gitolite.rc), to allow Trac to read git repositories (once it's added to the git group)
  • because of space constraints, the repos are actually stored at /lsst_ibrix/gitolite/repositories, and symlinked to ~git/repositories. Take note of this when setting up backup.
  • $GL_WILDREPOS = 1 has been set, enabling wildcard repos
  • $REPO_UMASK = 0022 has been set, making repos world-readable (so that apache/cgit can see them)
  • $BIG_INFO_CAP = 2000 has been set, to allow the users to obtain the full list of repositories using the info 'ADC'
  • $GL_ADC_PATH = "/home/git/adc-bin" has been set, to enable ADCs. See that directory for a list of ADCs
  • I (mjuric) have written a 'mv' ADC, to allow the repositories to be easily moved. The syntax follows the UNIX mv command: mv FROM TO

Other

mjuric's random notes (don't read this)

Notes

  • Compiled new git in ~mjuric/lfs on svn.lsstcorp.org
  • Had to install manpages using the trick from http://www.jukie.net/bart/blog/git-man-install
  • Compiled new SVN on svn.lsst.org in ~mjuric/lfs, to get around Ticket 1501
    • required recompiling apr, apr-util, berkeley-db and sqlite, swig
    • separate install of SVN perl bindings (needed by git-svn)
    • wasn't fun
  • Import failed on large repos because of insufficient RAM (2G).
    • Had to set 'git config pack.windowMemory 250m' on repos to mitigate