wiki:RHLSetupComments
Last modified on 06/04/2008 03:56:20 PM

Here's a marked up copy of wiki:KtlSetupUseCases

I'm sure that I missed some. My major comment is that it still is not clear to me just how much of this is needed, and how much is already present in ways that people don't understand (note: if this is the case, we clearly need to simplify the system or docs).

Production Execution

1. Setup for Application Execution

I really don't see the point of "activate"; however I added a configure option to support it:

configure --with-setup-alias=activate:deactivate

For typical application execution, the installed packages should be tagged versions retrieved from the repository. In my ideal world, such versions are retrieved as binary archives for the user's platform where possible (for efficiency) and built from source otherwise. If they are built from source, the known-good snapshot of the dependent packages' versions must be used while building, not anything dependent on the installing user's configuration. The command used for accessing the repository should be "eups fetch".

There's already an "eups distrib" command that seems to do all of this; the eups fetch seems like syntactic sugar to me and I'd resist it. If it were implemented, it'd be via a plugin to eups (which would be OK).

"eups distrib" is acceptable.

It's not the name, it's the choice of default options. I'm willing to consider aliases for eups distrib:

eups fetch == eups distrib --install [other options?]
eups publish == eups distrib --create [other options?]

but I'd do them as a user-provided extension module to neups (details to be settled, but probably via an environment variable that names a file to import + hooks). fetch seems a fine mnemonic, but I'm not so sure of publish --- details, details.

All such installed packages should be identical for a given platform, no matter which user performs the installation. Accordingly, there seems to be no reason to have per-user installation directories for this case. Note that this does not mean that all users should have the same set of "current" (or default/preferred) versions; those may vary on a per-user or even per-shell basis.

current is used to mean what you're calling "stable". I'd rather keep it as current, and add another state (preferred?) if needed.

Today's "current" is really "preferred". I believe it is a combination of a per-machine state (or even per-user with LSST_DEVEL?) with a global default. Because it can be overwritten locally, it is not the same as "stable", which is a global state in the repository (root).

OK, so you're looking for centrally controlled package states. That seems reasonable.

Building a package from source must check that dependent packages are activated, and, ideally, that sufficient files are available from those packages to complete the build. This can be accomplished by having each package specify its "interface", which consists of zero or more header files, zero or more Python packages, and zero or more shared libraries.

More details please. A package currently provides its dependencies; I think that you're asking for another file with other info --- which sounds doable.

The package provides its package dependencies, but it does not specify which header files, Python package files, or shared library files must exist. Instead, these are today specified in the SConstruct of any and every package using another package, which is the wrong place. A separate file with this "interface" information that can be interpreted by the using/depending package's SConstruct is acceptable.
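A minimal sketch of how such an "interface" file might be checked, assuming the interface has already been read into a dictionary. The category names (`headers`, `python`, `libs`), the dictionary layout and the function name `missing_interface_files` are hypothetical; nothing like this exists in eups or scons today.

```python
import os


def missing_interface_files(prefix, interface):
    """Return the interface files that do not exist under prefix.

    `interface` maps a category name to a list of paths relative to
    `prefix`. The categories used here are assumptions for illustration.
    """
    missing = []
    for category in ("headers", "python", "libs"):
        for rel_path in interface.get(category, []):
            if not os.path.exists(os.path.join(prefix, rel_path)):
                missing.append(rel_path)
    return missing
```

A depending package's SConstruct could call something like this once per dependency and stop the build if the returned list is not empty.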

Activating a package of course needs to automatically activate compatible versions of its dependent packages, starting with the current/preferred version but selecting another if required. Activating can automatically retrieve packages if they have not been installed to further simplify the end-user's task.

No, automatic retrieval is magic and I don't like it. Setting up a package sets it up; fetching it is something entirely different. Note that the former doesn't require write access to the shared space.

I can live without automatic retrieval; Russell also seems leery of this. The main goal here was to have the command below be an extremely simple one.

Proposed DC3 command: activate dc3pipe --stable

See above; I'd rather keep this as --current and the default.

So the revised proposal is now to use "eups distrib -i -C dc3pipe; setup dc3pipe", with the first command defaulting to the global "current" version (not just the latest version), and the "-C" option installing it as preferred.

Or eups distrib --install --type=preferred dc3pipe && setup dc3pipe

2. Setup for executing a particular version of an application

If the installed package is under rapid development, in the current system it may have been installed from a checked-in svn working copy rather than retrieved from the package repository, hence the "1.2+svn4455" designation. There are two problems with this: automatically generating the version number is difficult, and ensuring a reproducible installation is also difficult, because the "setup -r ." command typically used for defining the build configuration uses whichever packages are labeled "current". In DC2, this was not an issue because all developers working in this style used the same physical installation of the package, but this is not workable during normal distributed development. Besides solving the above two problems, a mechanism is needed for developers on other machines to get access to the same version.

My preferred solution is to allow eups fetch for svn versions. I would disallow "scons install" and replace it with "eups install", making eups the sole command used for maintenance of the installation tree. "eups install" of a checked-in svn working copy would install a version automatically numbered using a "base tag" version number. This command would not be allowed if files existed that had not been checked in.

I don't like this. eups is a version manager, not an installer (the exception is eups distrib, which uses the information in the eups DB to figure out what you need --- maybe that was a mistake). scons is a build system and understands what's up-to-date. We should use it. If K-T really wants to stop people typing scons install, we should provide some other wrapper but I personally think that this is unnecessary.

The original goal of "eups install" was to have this include "eups publish" below. With these separated, it is perhaps acceptable to go back to "scons install current" to install a checked-in svn working copy locally with a properly-generated version number and make it preferred. Note that this is different from today's "scons install" in several ways; it would make sure that all files are checked in and would also retain knowledge of the svn URL of the working copy. I don't see why one would ever install without performing a declare, so those should be merged, and defaulting to "preferred" (not "stable") also makes sense. If run in a working copy taken unchanged from a tag directory, the result must be indistinguishable from "eups distrib -i -C PACKAGE BASETAG".

I think that you're mixing two things here. install means put a copy of certain files onto the local file system. Your "publish" concept means tell the central source of packages that this version

The "base tag" is OK, but needs to be fleshed out

The "base tag" is just a version number that is used as the basis for svn working copy versions. It could be stored in a file or an svn property. It would be set just before making a tag copy.

Importantly, an additional command, "eups publish" would also upload a lightweight package description to the package repository. This description would consist of a snapshot of the dependent package versions activated for the build and the svn revision of the trunk in use. This should be sufficient information for "eups fetch" to re-create the developer's build of the package, producing the same situation that the original developer produced with the "eups install" command. The package description is also sufficient information for an automated build system to create a binary version of the package for download by "eups fetch" for efficiency. Finally, with a tag version instead of an svn revision, this description could even be used for making tagged versions available to the package repository.

This basically exists. The

eups distrib --create

command (if properly invoked...) writes enough information to reinstall a version. It looks as if K-T wants a wrapper. Note that for eups distrib --install (which K-T calls eups fetch) there has to be enough information somewhere for the packages to be found, configured, built and installed. This is currently being done via build files. As e.g. cfitsio isn't in our tree you need a bit more infrastructure (which is why the full command's like

eups distrib --create --build ~/LSST/BuildFiles: afw svn4828 -r ~lsst/products/packages/Linux

where the non-LSST build files are in ~/LSST/BuildFiles).

It looks like the requirement is then for a wrapper to hide the "--build" option plus the ability for any developer with svn access to be able to write a new version to the package repository using "eups distrib -c PACKAGE VERSION+svnNNNN" (but not overwrite an existing version, and only special developers can write tagged versions without the "+svnNNNN" unless we really ensure they are identical).

We (or at least I) were confused by two separate issues here:

  • Writing enough information to install a given version (e.g. its dependencies)
  • Declaring that version stable (or beta, or whatever)

Is the former what you're calling "publish" and the latter "mark"? If so, they are the global equivalent of "install" and "declare --type=foo" and I'd rather see a unified API.

If you just want a list of dependencies, try

setup -n -v afw

(or a simple command to call the python API, possibly as an option to eups)

The list of dependent package versions is intended to be for internal use, not for end users.

The "base tag" version number allows us to have revisions from release maintenance branches or even directly from ticket branches. Such svn versions will always correspond with their preceding tagged versions, allowing humans or software to easily determine the suitability of such versions as dependencies. The "base tag" for a given version could be specified in the table file or another location, such as an svn property. Right now, a similar version number is in the .pacman.m4 file, but I believe this is the version of the next release, not the version of the last release, making this much more difficult to maintain. Having this base tag version would allow versions of the form "{base tag}+svn{revision number}" to be auto-generated, simplifying comparisons and improving human understandability. The base tag version could be updated automatically by a tool whenever a tag copy is made.
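A minimal sketch of the "{base tag}+svn{revision number}" convention, assuming the base tag has already been read from wherever it ends up being stored. The function names are hypothetical and this is not existing eups code.

```python
import re

# "BASETAG" alone, or "BASETAG+svnNNNN" for a working-copy version.
_VERSION_RE = re.compile(r"^(?P<base>[^+]+)(?:\+svn(?P<rev>\d+))?$")


def make_version(base_tag, svn_revision=None):
    """Build "BASETAG" for a tag copy or "BASETAG+svnREV" otherwise."""
    if svn_revision is None:
        return base_tag
    return "%s+svn%d" % (base_tag, svn_revision)


def split_version(version):
    """Split a version string into (base_tag, svn_revision or None)."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError("unrecognised version: %r" % version)
    rev = match.group("rev")
    return match.group("base"), (int(rev) if rev is not None else None)
```

Splitting a version this way recovers the tagged release that an svn version follows, which is what makes the comparison with tagged versions straightforward. Devel-install names are deliberately not handled here.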

Once again, there is no need for per-user installation directories in this case, as all installations are identical and installing a version cannot negatively impact another user's configuration.

Proposed DC3 command: activate dc3pipe 3.4

This now becomes eups distrib -i -C dc3pipe 3.4; setup dc3pipe.

3. Setup for executing an application using a particular version of a dependent package

The proposal here is to add a --keep option to setup/activate in order to allow prior specifically-chosen versions to be kept, instead of having them replaced. Alternatively, the existing method of overriding versions in dependency order can still be used. In either case, activating a specific version of package X must check that that version satisfies the requirements of all currently-activated packages that depend on package X.

Proposed DC3 command: activate daf_persistence 3.12; activate dc3pipe --keep

The --keep option is supported in svn eups (well, neups/nsetup actually as I haven't been brave enough to move the names over --- but I shall before release)

Installation

4. Installing a tagged release into the stack

setup -r . does not ensure a reproducible build environment, as described above in use case 2. My preference here is again to mandate the use of eups fetch for installing tagged (or untagged) versions.

Proposed DC3 command: eups fetch daf_persistence (for stable version)

This is now eups distrib -i -C daf_persistence.

5. Installing a beta version of a package

See use case 2, including the caveat about branch versions. The "beta version" would typically be marked as preferred for the user, so that manual version specification would not be necessary for every shell/login, but this marking should not be shared by other users on the same machine. It may be desirable for a developer to mark a particular "beta version" as the preferred one for all other developers, but not for application users.

Proposed DC3 command: eups fetch daf_persistence --beta

I don't think that we want to add lots of magic --beta/--stable/--alpha/--rhl flags. If we need the functionality, I'd prefer --type=beta

Note that the "beta" state is global, in the package repository, not per-machine. I don't see there being a large number of states as a result. But "--type=beta" is acceptable if it can be abbreviated "-t=beta" or "-t beta". The new proposed command is then "eups distrib -i -C daf_persistence -t=beta".

The argument parser accepts:

--type beta
--type=beta
-t beta
-t=beta
--type=beta:gamma
--type=beta -t gamma

(where the first 4 and last 2 are equivalent)
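As an illustration only, the accepted forms could be handled along the following lines. This is a hand-written sketch, not the actual eups argument parser, and `parse_types` is a hypothetical name.

```python
def parse_types(argv):
    """Collect the type names given via -t/--type in argv, in order."""
    types = []
    args = list(argv)
    i = 0
    while i < len(args):
        arg = args[i]
        value = None
        if arg in ("-t", "--type"):
            # Value is the next token: "--type beta" or "-t beta".
            i += 1
            if i >= len(args):
                raise ValueError("%s requires a value" % arg)
            value = args[i]
        elif arg.startswith("--type=") or arg.startswith("-t="):
            # Value is attached: "--type=beta" or "-t=beta".
            value = arg.split("=", 1)[1]
        if value is not None:
            # A colon-separated value names several types at once.
            types.extend(name for name in value.split(":") if name)
        i += 1
    return types
```

Arguments other than the type option are ignored by this sketch, so it could run over a full command line before the real option handling.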

Proposed DC3 commands for developers: cd daf_persistence; eups install; eups publish daf_persistence --current; eups mark daf_persistence --current --beta

See comments above. What does "--current" mean in this brave new world?

"--current" means to select the currently preferred version instead of providing an explicit version number; I don't think any eups commands today have this convenience feature.

In fact, it may make sense to remove the distinction between "preferred"/"current" and "setup"/"activated"; aside from environment ugliness, is there any reason not to have a version of every package setup at all times?

There is no way to do the "publish" and "mark" steps that affect the global repository today.

6. Reinstalling a version of a package

I don't think it should ever be necessary to replace an installed version. This is dangerous in many respects: it may break running programs, it may lose history, it may break provenance.

Proposed DC3 command: None

Can't we just remove and readd it? Not that that's a good thing to be doing routinely.

With the way that versions are supposed to be created, a re-added version ought to be identical to the original, so there's no reason to perform the operation.

7. Removing a package

Removing obsolete packages should be supported. "eups install --remove" seems like a reasonable way to do this.

Proposed DC3 command: eups install --remove daf_persistence 3.12+svn6235

Actually, the command's

eups remove daf_persistence 3.12

OK

It's trickier than you might think -- should the removal be recursive? Should it be allowed to remove products that are in use by other products? I think that the eups remove command handles all of this correctly, and even has an interactive mode for cowards such as me.

See also eups uses daf_persistence 3.12

Recursive as in removing the removed package's dependencies? That doesn't make sense to me. Recursive the other way as in removing packages dependent on the removed package? That would make a little more sense, but is unnecessary. Removing products in use (setup/activated) should of course be banned. The point is that this operation should be low cost because any version so removed can be restored easily.

If you remove a version it can in general "orphan" other products, in the sense that they are never setup. eups remove --recursive removes a product and all of its dependencies, checking that they aren't setup by some other product. This is different from not removing products that are already setup by the user.
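A rough sketch of the recursive case, assuming the dependency information is available as a simple mapping. The data layout and the name `removable_set` are assumptions for illustration; this is not how eups remove is implemented.

```python
def removable_set(target, depends_on):
    """Return target plus those of its dependencies no other product uses.

    `depends_on` maps each product to the set of products it requires
    directly. This data layout is an assumption for illustration.
    """
    # Gather every direct and indirect dependency of the target.
    candidates = set()
    stack = [target]
    while stack:
        for dep in depends_on.get(stack.pop(), ()):
            if dep not in candidates:
                candidates.add(dep)
                stack.append(dep)

    removable = {target}
    changed = True
    while changed:
        changed = False
        for dep in sorted(candidates - removable):
            users = {p for p, deps in depends_on.items() if dep in deps}
            # Safe only if everything that uses dep is itself going away.
            if users <= removable:
                removable.add(dep)
                changed = True
    return removable
```

The loop repeats until nothing changes because removing one dependency can free up another that only it was using. Checking whether a product is currently setup in a user's shell is a separate test and is not covered here.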

This said, eups remove is a little scary to allow as an option to the general user. We could put an "ACL" file in the ups_db file I suppose. Not a bad idea...

"eups uses" must be a "neups" feature.

It's also in late eups versions --- I added it to support eups remove

Development

8. Adding and testing new features into a package

The main thing here is the need to activate the current working directory as the source for files for the package being built and remember that this directory should be the source for this user ("install devel" in Russell's terminology). This requires activating a package directory outside the installation tree that may not ever have been built. Note that the directory needs to be activated, not the files in it, because those files may not exist and the set of files may even change during development. (This makes activation of a devel-installed package more complicated for a link-forest implementation, for example.) The devel-installed package version will typically also be marked as the preferred version for the developer user, but it should not be marked as preferred, or even visible at all, to any other user. The devel-installed package must have at a minimum a table file specifying its dependencies (and perhaps its "base tag" version); it may also have an interface defined. Activating the devel-installed package, like activating any package, should automatically activate the appropriate dependent packages.

I don't understand what you're getting at here. You'll always do a build before a declaration; currently

scons
scons install
scons declare

(and you can give it a name, I think with

scons install version=rhl)

setup product rhl

I don't see that anything fundamentally changed. Please explain why this is hard.

The point of devel-installing is to avoid the install/declare/setup steps and manual version number generation when dealing with working copies that have not been checked in and are undergoing rapid development. Russell uses "eups declare -r . -c" in a per-user LSST_DEVEL environment, which is usefully persistent over sessions but requires the presence of LSST_DEVEL and requires the user to manually generate the version number.

It's not LSST_DEVEL, it's a private directory first in $EUPS_PATH.

Since we are calling this a form of installation, we should use eups to perform this action for consistency. "eups install --devel" might be acceptable.

As before, installation is not an eups function. We can have wrappers if we must, but they'll be very thin layers around scons and eups.

If "scons install" automatically does a declare, then it might be adequate to have "eups install" be performed by "scons install" and "eups install --devel" performed by "eups declare".

Now that I realise that you're referring to eups declare -r . -c, I see that this isn't an installation at all. Why do we need to change this?

A devel-installed package need not have all of its files checked in to be used.

It may be necessary to have more than one devel-installed version of a given package for a given user if the developer is working on multiple tickets, perhaps with different base tags, at the same time. Only one would be preferred, but each could be activated in its own shell. Thus the devel-install command must take a user-specified name to distinguish between these versions.

Proposed DC3 command: eups install --devel

9. Adding and testing new features into several packages simultaneously

Multiple devel-installed packages may need to be activated and marked as preferred at the same time.

Proposed DC3 commands: eups install --devel; cd ../../daf/base; eups install --devel

Concepts

The above use cases lead to the following concepts:

Installation status:

  • Tagged package versions installed on the machine.
  • Svn revision package versions installed on the machine.
  • Tagged or svn revision package versions published to the project-wide package repository.
  • Devel-installed, per-user package versions.
  • No version can be activated without being installed or devel-installed.

Preferred status:

  • Versions preferred for application users: "stable".
  • Versions preferred for developers: "beta".
  • Versions preferred by user: "current".
  • Versions activated by user in this shell: "setup"/"activated".

Package contents:

  • Dependencies: table file with ranges.
  • "Base tag" revision: table file.
  • Interface: interface file.
  • Repository description: dependency snapshot plus tag version or svn revision and svn URL.

Proposed Command Syntax

Inquiry Commands

eups list [PACKAGE [{VERSION, --current, --active}]]

Lists packages installed on the machine with versions. If a package is specified, lists only versions of that package. If a version or option is specified, lists only that version of the package (if any). A glob (like for shell pathnames) may be used instead of a package name to list all packages matching the glob.

No need for a glob:

eups list | grep ...

is the Unix way. A glob would be easy enough, but it seems like gilding the lily to me.

Do you do "ls | grep"?

But ls doesn't know anything about globbing, that's the shell. So you'd have to quote the glob characters for eups list to see them.

eups list {--current, --active}

Lists all current or activated package versions on the machine.

eups list {-r, --repository} [PACKAGE [{VERSION, --stable, --beta, --all}]]

Lists all available packages from the project-wide package repository with available versions. Lists tagged versions only unless --devel or --all are specified.

I think that your -r option is either the current -z or -Z; no need to change. See note above about --beta

No, "-z" selects the product paths which contain this directory and "-Z" uses the path from a product. I don't know what either of these is exactly, but neither is the same as listing whatever is in EUPS_PKGROOT, which is what this command is intended to do.

Fine. This isn't done via eups list at present (and I don't think it should be), but via eups distrib --list. This doesn't work with the current pacman backend due to the use of a cgi script to generate lists of manifest files at ncsa, but that's a detail.

Installation Commands

eups fetch [PACKAGE [{VERSION, --beta}]] [--nocurrent]

Retrieves a package from the project-wide package repository, building it from source if necessary, and installing it into the machine's standard installation tree. The version defaults to the one marked stable, if present, then the one marked beta, then the latest tagged version, and finally the latest version if none of the previous ones exist. Automatically marks the fetched version as current unless --nocurrent is specified.
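The default order described above can be sketched as follows, assuming the repository can supply an ordered list of versions and the current marks. The function name, the input shapes and the use of a "+svn" suffix to recognise untagged versions are assumptions for illustration.

```python
def default_version(versions, marks):
    """Pick the version that fetch would default to.

    `versions` lists the published versions from oldest to newest and
    `marks` maps "stable"/"beta" to a version. Both are assumed inputs.
    """
    for mark in ("stable", "beta"):
        if marks.get(mark) in versions:
            return marks[mark]
    # Tagged versions carry no "+svn" suffix.
    tagged = [v for v in versions if "+svn" not in v]
    if tagged:
        return tagged[-1]
    return versions[-1] if versions else None
```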

I don't think that --nocurrent is a good idea --- it's magically doing what usually takes a command. This is currently achieved by eups distrib. Also, you need some way to identify the root (maybe an environmental variable? If so, it should be set by an eups metapackage rather than a file you have to source for at least three reasons. One: it can be unsetup; Two: it's consistent with all other setups; Three: you don't need a version for every shell)

Almost every time you install a package, you will want to use it. Therefore it should be marked preferred as the default, rather than requiring a separate step to do so.

Because fetching a package and setting it up are two different things, and we shouldn't confuse them. After all, every time that the user needs that package after the initial install she'll need to say setup. I don't like magic.

I am not sure what you are referring to by identifying the root. If you mean the package repository, it's not clear to me why this is a highly dynamic thing.

Well, unless we use a magic environment variable (or rcfile) eups distrib doesn't know where to go for packages. The magic's currently $EUPS_PKGROOT (I think).

Building from source uses the exact versions of the dependent packages from the installation snapshot.

lsst-tag VERSION [DIRECTORY]

Tags a working copy of a package as a release version. Checks to make sure that all files are checked in. Modifies the "base tag" (whether in a file or a property) and checks that in. Makes the tag copy in the svn repository. Defaults to the current directory.

This command would be used by the package owner as part of a release process; it performs the necessary svn functions for release. Before tagging, the release process would build and test the package. After tagging, the process would install and publish the package and possibly mark it as stable (see below).

eups install [DIRECTORY] [--nocurrent]

Installs a package working copy as a tagged or svn version into the machine's standard installation tree. Defaults to the current directory.

First, ensures that all files are checked in and from the same svn branch. Rebuilds the package if the activated version of any dependent package has changed since the last build. Takes a snapshot of the activated versions of all dependent packages. Also records the svn URL of the working copy.
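A small sketch of the snapshot and rebuild check, assuming activated versions are available as a mapping from package name to version. The function names and data shapes are hypothetical.

```python
def take_snapshot(active_versions, dependencies):
    """Record the activated version of each dependent package."""
    return {name: active_versions[name] for name in dependencies}


def changed_dependencies(last_build_snapshot, active_versions, dependencies):
    """Return the dependencies whose activated version differs from the
    one recorded at the last build; a non-empty result means rebuild."""
    changed = []
    for name in dependencies:
        if last_build_snapshot.get(name) != active_versions.get(name):
            changed.append(name)
    return changed
```

The snapshot taken at install time is the same information that would later go into the package description sent to the repository.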

If the directory is an unchanged tag directory, installs with version BASETAG. If in another type of working copy, installs with version BASETAG+svnREVISION. Automatically marks the installed version as current unless --nocurrent is specified.

eups install {-d, --devel} [NAME] [--nocurrent]

Devel-installs the current directory for this user. The package name is auto-detected. This devel-install replaces any other directory devel-installed for this package with the same name. The version is BASETAG+svnd, or BASETAG+svnd+NAME if a name is specified. This command does not build the package. No snapshot of dependent package versions is made. The svn URL of the working copy is not recorded. Automatically marks the devel-installed version as current unless --nocurrent is specified.

eups install --remove PACKAGE VERSION

Removes a version installed on the machine (rarely used). For devel-installed versions, removes the knowledge of the installation but does not touch any files in the package directory.

eups publish PACKAGE {VERSION, --current}

Makes a tagged or svn version installed on the machine available to the project-wide package repository. Devel-installed versions cannot be published. Requires a specific version or --current to use the version marked current. Copies the package description, including the dependent package snapshot, to the package repository.

Note: There could be a potential conflict if two developers install versions of a package on their own machines, but using different versions of dependent packages, and then both try to publish their versions to the project-wide package repository. Presumably either one could be used, but this is still a case where different developers can have different ideas of what a single package version means.

This is eups distrib --create. Note that there's a concept of declaring the package current to eups distrib as opposed to locally -- this may be what you mean by published.

See above. Declaring the package current in the repository is similar to marking "stable" below except that it can be overridden locally.

eups publish --remove PACKAGE VERSION

Removes a published version from the project-wide package repository (rarely used).

eups mark {stable, beta} PACKAGE {VERSION, --current}

Marks a published version in the project-wide package repository as "beta" or, if the version is a tagged version, allows it to be marked "stable". Any version previously marked with the same mark has its mark removed. Requires a specific version or --current to use the version marked current.
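One way to picture the marking rules, as a sketch: keeping a single version per package and mark means a new mark replaces the old one automatically. The storage layout, the function name and the "+svn" test for untagged versions are assumptions.

```python
def mark_version(marks, package, mark, version):
    """Record `version` as the marked version of `package`.

    `marks` maps package -> {mark: version}. Storing one version per
    mark removes any earlier version carrying the same mark.
    """
    if mark not in ("stable", "beta"):
        raise ValueError("unknown mark: %r" % mark)
    if mark == "stable" and "+svn" in version:
        raise ValueError("only tagged versions may be marked stable")
    marks.setdefault(package, {})[mark] = version
    return marks
```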

eups mark {stable, beta} --remove PACKAGE VERSION

Removes a mark (rarely used).

eups current PACKAGE VERSION

Marks a locally installed version as "current" for this user. Given the automatic marking as current for fetch, install, and install --devel, this command is not expected to be used frequently. It is distinct from eups mark since it operates on a per-user basis, rather than on the project-wide package repository.

How does this differ from eups declare --current package version?

The difference is that "eups current" is always per-user. Today, "eups declare --current" is only per-user if LSST_DEVEL is used, I believe.

It's current in a particular DB; if that DB's private, so's the setup. I do not think that having eups current mean one thing, and eups declare --current another is a wise choice.

Activation Commands

activate PACKAGE [VERSION] [{--stable, --beta}] [{--keep, --exact}]

Activates a package, checking all dependencies. If no version is given, uses the current version if possible; otherwise uses the latest version that works. If a specific version is given or --stable or --beta is specified, tries to use that version if possible; fails if not. If a specific version is given and --stable or --beta is specified, tries to use the specific version while using --stable or --beta for dependencies (see next paragraph).

This command also activates dependent packages if needed. If --exact is specified, the exact versions used when the package was installed are used. Tries to use the already-activated version of dependent packages if --keep is specified; fails if this is not possible. Otherwise, if --stable or --beta is specified, tries to use those versions of dependent packages first. Next tries the current version. Finally, uses the latest version that works if none of the previous versions are possible and fails if no versions work.
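The selection order for a single dependent package might look like the sketch below. It assumes a caller-supplied `works(version)` test that says whether a version satisfies the requirements; all names are hypothetical, and this is not how setup resolves versions today.

```python
def choose_dependency(versions, works, exact=None, kept=None,
                      marked=None, current=None):
    """Choose a dependency version following the order in the text.

    `versions` is ordered oldest to newest; `works(v)` reports whether
    version v satisfies the requirements. All names are hypothetical.
    """
    if exact is not None:
        # --exact: use the version recorded at install time.
        return exact
    if kept is not None:
        # --keep: the already-activated version must work, or we fail.
        if works(kept):
            return kept
        raise LookupError("kept version %s does not satisfy requirements" % kept)
    # --stable/--beta version first, then the current version.
    for preferred in (marked, current):
        if preferred is not None and works(preferred):
            return preferred
    # Finally, the latest version that works.
    for version in reversed(versions):
        if works(version):
            return version
    raise LookupError("no version satisfies requirements")
```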

I haven't implemented --exact; --keep is in. The generalisations of --current (e.g. --beta == --type=beta) are easily doable.

Calls eups fetch if any package is not installed to retrieve the package from the project-wide package repository.

No; this is not acceptable (see above)

Fine.

deactivate PACKAGE

Deactivates activated version of package and any packages depending on given package.

Isn't this just unsetup package with K-T's preferred name? It really would have been easier, at least for me, to use "setup"/"unsetup" throughout this document to isolate changes from syntactic sugar.

Yes, deactivate == unsetup.