Last modified on 08/18/2008 02:54:40 PM

Archive Operations System (AOS)


Every LSST Site (apart from the summit) includes an Archive Operations System. The purpose of this system is to host and manage the ongoing operational processes of the site. This includes:

  • managing the long-term storage subsystem (if present),
  • managing the data storage cache, caching onto spinning disk new datasets needed by the site, and removing older copies of datasets no longer needed,
  • providing data access services in the form of database queries and dataset requests from other systems,
  • replicating data from other sites locally, and
  • running routine data management control processes, including the triggering of pipeline processing (if supported at that site).
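The cache-management duty in the list above (caching new datasets and removing older copies that are no longer needed) could be sketched as follows. This is only an illustration: the class name `DataCache`, the byte-based capacity limit, and the least-recently-used eviction policy are all assumptions, since the plan does not say how "no longer needed" is decided.

```python
from collections import OrderedDict
from typing import Optional


class DataCache:
    """Hypothetical size-bounded cache; LRU eviction is an assumption."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        # Ordered from least recently used to most recently used.
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, dataset_id: str) -> Optional[bytes]:
        data = self._entries.get(dataset_id)
        if data is not None:
            self._entries.move_to_end(dataset_id)  # mark as recently used
        return data

    def put(self, dataset_id: str, data: bytes) -> None:
        # Replace any existing copy so the size accounting stays correct.
        if dataset_id in self._entries:
            self.used_bytes -= len(self._entries.pop(dataset_id))
        self._entries[dataset_id] = data
        self.used_bytes += len(data)
        # Remove the oldest copies until the cache fits again,
        # but never the dataset that was just added.
        while self.used_bytes > self.capacity_bytes and len(self._entries) > 1:
            _, evicted = self._entries.popitem(last=False)
            self.used_bytes -= len(evicted)
```

Because anything held in the cache is assumed to be reproducible from another source, evicting an entry in this sketch loses no permanent data.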

This System hosts the most important assets of a site. It directly manages the data that is served to outside users as well as to internal processing within the Distributed Processing System. The long-term storage subsystem holds permanent copies of released data products that would be accessed not only for data reprocessing, but also for disaster recovery. Consequently, this System will feature stronger protections against unauthorized access than other systems.


In addition to the site-wide security personnel (see section 2.1), the AOS is expected to have between 1 and 3 people associated with it, depending on the site, who will be responsible for operating and monitoring the day-to-day processes of the site. These personnel will usually handle the typical failures of the system and may be the ones to discover evidence of a security problem. They must be aware of the procedures for reporting and reacting to possible security issues.

Physical Operating Environment

See section 2.2.

System Descriptions

Long-term Storage Subsystem
Not all sites will have this subsystem. This subsystem provides permanent storage for core data products, including raw images, calibration data, release catalogs, and co-added sky images. It will most likely consist of a massive, automated tape library fronted by a modestly sized filesystem (200-400 TB).
Data Cache subsystem
This subsystem provides temporary storage for the datasets of greatest interest at any given time. It is assumed that any data stored here can be automatically reproduced from another source, either by replicating a copy from another site or recalculating the dataset from original data in long-term storage.
Database servers
These provide access to the data catalogs, both the previously released catalogs (which are static) and the current unreleased catalog (if available at the site). The AOS provides a read-only interface to the databases for use by the Community Services System (CSS) and the Distributed Processing System (DPS). If the site features a DPS for internal product production, the AOS will provide a read-write interface exclusively for updating the current unreleased catalog.
Dataset Access servers
These provide access to file-based data products. Data access services will first look in the Data Cache for requested datasets; if they are not found there, the service may request them from external sites or trigger their recreation (via the DPS) and recaching into the Data Cache. The AOS provides a read-only interface to the CSS and the DPS and a read-write interface to the DPS.
Data Replicator service
This service pulls, receives, or delivers datasets from/to other sites. This will be done via secure, authenticated connections to ports at other LSST sites.
Event Broker
The Event Broker routes Event messages in and out of the AOS. Events are exchanged with other systems, either on site or off, via secure, authenticated connections to other Event Brokers on defined ports. This subsystem will also maintain its own historical database of events.
Miscellaneous Operational Processes
The AOS will host all of the processes that carry out automated operations of data management. This includes the routine execution of pipelines on the DPS, packaging and distributing science Alerts, etc. Pipelines will be launched on a DPS by accessing its Pipeline Manager interface.
Administration Portal
It is expected that operational processes will be launched, controlled, and monitored from outside the system (possibly from off-site) via an Administration Portal.
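The lookup order described for the Dataset Access servers (Data Cache first, then an external site or recreation via the DPS, followed by recaching) can be illustrated with a minimal sketch. The names `DatasetAccessService`, `fetch_remote`, and `recreate` are hypothetical, and trying the external site before recreation is an assumption; the description above does not fix that order.

```python
from typing import Callable, Dict, Optional


class DatasetAccessService:
    """Hypothetical sketch of a cache-first dataset lookup."""

    def __init__(
        self,
        fetch_remote: Callable[[str], Optional[bytes]],
        recreate: Callable[[str], Optional[bytes]],
    ) -> None:
        self.cache: Dict[str, bytes] = {}  # stands in for the Data Cache
        self.fetch_remote = fetch_remote   # stands in for a request to another site
        self.recreate = recreate           # stands in for recreation via the DPS

    def get(self, dataset_id: str) -> bytes:
        # 1. Look in the Data Cache first.
        if dataset_id in self.cache:
            return self.cache[dataset_id]
        # 2. Fall back to an external site, then to recreation (assumed order).
        for source in (self.fetch_remote, self.recreate):
            data = source(dataset_id)
            if data is not None:
                self.cache[dataset_id] = data  # recache for later requests
                return data
        raise KeyError(f"dataset {dataset_id!r} is not available from any source")
```

In this sketch a second request for the same dataset is served from the cache without contacting either fallback source.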

Data Products

The AOS provides the primary storage for the following products described in section 2.4:

  • Raw Science Data
  • Temporary Processed Data Products
  • Released Data Products
  • Unreleased Science Data
  • User Science Data

Any of these data products may be received via the Data Replicator service. Some of these data will be marked to be archived in long-term storage; in this case, a response message must be returned to indicate the successful copy into long-term storage.
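The archive acknowledgement described above might look like the following sketch. Everything named here is hypothetical (`ArchiveAck`, `handle_received`, the dictionary standing in for long-term storage), and verifying the copy by reading it back and reporting a SHA-256 digest is an assumption; the plan only requires that a response indicate the successful copy.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ArchiveAck:
    """Hypothetical response message confirming a long-term copy."""
    dataset_id: str
    sha256: str


def handle_received(
    dataset_id: str,
    payload: bytes,
    marked_for_archive: bool,
    long_term: Dict[str, bytes],  # stands in for the long-term storage subsystem
) -> Optional[ArchiveAck]:
    # Datasets not marked for archiving need no acknowledgement.
    if not marked_for_archive:
        return None
    long_term[dataset_id] = payload
    # Read the copy back and confirm it matches before acknowledging (assumed check).
    stored = long_term[dataset_id]
    if stored != payload:
        raise IOError(f"copy of {dataset_id!r} into long-term storage failed")
    return ArchiveAck(dataset_id, hashlib.sha256(stored).hexdigest())
```

Returning the digest would let the sending site compare it with its own copy, though whether the real response carries such a field is not stated.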

The AOS Event Broker maintains its own database containing the history of events originating from the AOS, which would be viewable via the Administration Portal. The AOS is expected to manage other internal data, such as logs, which may also be viewable via the Administration Portal. The confidentiality concern for this data would be low. The concern for integrity would be moderate, as logs would be a target for corruption by an intruder wishing to hide their tracks. The desired availability would be moderate, as it is expected that operations would not cease if access to this data were interrupted.

Management, Operational, and Technical Controls Descriptions

Access Control