wiki:SecurityMountain
Last modified on 07/23/2008 10:19:52 AM

Summit and Base Facility

The observatory at the summit and the base facility in La Serena house LSST's real-time science operation.

Overview

Physical Safety The telescope is resistant to attack, even in the event of a security breach.

  • Camera: Hardware is responsible for its own safety (prevent electrical damage, for example), regardless of settings
  • Telescope and Observatory: Operator is always present and responsible for safe movement and operation, even during automated cadence

Optical Link to Summit The summit and base share a single protected network, linked by a 10Gb optical fiber connection.

NOAO Facility The base facility is in NOAO's operations center, physically collocated with the Chilean DAC, although the Base Facility has its own private, firewalled network and is an entirely separate security domain.

Dedicated Link to Archive Center The base facility streams raw observation data straight to the Archive Center, through a specially provisioned link from Chile to Florida to Illinois.

Computer Security at the Summit and Base Facility

  1. Physical Security

1.1 - Power

1.1.1 - Feed from Chilean grid, UPS, generator(s)

1.1.2 - All LSST computer equipment will be kept running by UPS until the generators start. The Chilean power grid has serious problems such as blackouts, sags, and surges, which have been known to last for prolonged periods.

1.2 - HVAC

1.2.1 - Sizing, type & routing

1.2.2 - Water alarms under raised floor

1.2.3 - Emergency Power shut-off buttons

1.3 - Fire suppression

1.3.1 - Need to select the type (Inergen or similar; if water, dry or wet pipe)

1.3.2 - Sensors located under floor and in room

1.4 - Physical Access Controls

1.4.1 - La Serena uses a card-based RFID system, with logging, for access to the computer room

1.5 - Physical Detective Controls

1.5.1 - Video cameras around/in computer centers

1.6 - Physical Preventive Controls

1.6.1 - May use interior glass walls into computer centers.

1.7 - Physical Seismic Protection

1.7.1 - Racks should be as seismically isolated as possible (for example, hung from the ceiling).

  2. Network Security

2.1 - Connections between security domains and subdomains

2.1.1 - Other observatories within Chile (LSST - CTIO - SOAR and others)

2.1.2 - Camera groups (e.g. - SLAC)

2.1.3 - Telescope control group

2.1.4 - DACs (e.g. - NCSA, SDSC, Chilean DAC)

2.2 - Internal network structure

2.2.1 - Border Routers

2.2.2 - Border Firewall at Base

Classical stateful inspection and ACL packet filtering firewall rules.

2.2.3 - Internal Layer 3 switches

2.2.3.1 - Functional VLAN design (trunking of VLANs between La Serena & Pachon)

VLANs are a critical component of IT safety at LSST: they provide logical isolation of IT components and enforce system-specific security policy, with separate access control rules per VLAN as needed.

Ideally, the network appliances that create the VLANs would support features such as dynamic VLAN assignment via SNMP traps. This would allow (if desired) "captive portal" technology to authenticate, and hence make accountable, computers that are casually connected to the LSST La Serena and Summit network. Alternatives include preventive administrative and technical controls that require users to formally request access and supply a MAC address, for example.

The first web pages we saw at the All Hands conference at NCSA, which showed the "acceptable use" policy and requested a password, are an example.
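The administrative alternative described above, in which users register a MAC address before being granted access, can be sketched as a simple lookup that places unknown machines on a quarantine VLAN. This is only an illustration: the VLAN numbers, the MAC addresses, and the names `REGISTERED_MACS`, `QUARANTINE_VLAN`, `normalize_mac`, and `assign_vlan` are all hypothetical and not part of any LSST design.

```python
# Hypothetical values for illustration only; not taken from the LSST network plan.
QUARANTINE_VLAN = 999

# MAC address -> VLAN ID, as would be collected from formal access requests.
REGISTERED_MACS = {
    "00:16:3e:aa:bb:01": 110,
    "00:16:3e:aa:bb:02": 120,
}


def normalize_mac(mac):
    """Return a MAC address as lowercase, colon-separated hex pairs."""
    digits = "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def assign_vlan(mac, registry=REGISTERED_MACS):
    """Return the VLAN for a registered MAC, or the quarantine VLAN otherwise."""
    return registry.get(normalize_mac(mac), QUARANTINE_VLAN)
```

A captive portal would typically sit on the quarantine VLAN, so a casually connected machine sees only the acceptable-use page until it is registered.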

2.2.3.2 - "packet filtering" and other switch-based firewall capabilities at the VLAN level

Presumably, separate ACLs will need to be developed for each VLAN, ranging from VLANs with *NO* outside access, to VLANs permitting access from only some IP ranges, to VLANs open to the public (the DMZs on the DACs, for example).
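The three kinds of VLAN policy listed above (no outside access, access from selected IP ranges, open to the public) can be sketched as a default-deny lookup. The VLAN names and address ranges below are made up for illustration (the ranges are reserved documentation prefixes), and `VLAN_ACLS` and `is_permitted` are hypothetical names; real ACLs would live on the switches and firewalls.

```python
import ipaddress

# Hypothetical per-VLAN allow lists; names and ranges are illustrative only.
VLAN_ACLS = {
    "camera-control": [],                                        # no outside access
    "partner-access": [ipaddress.ip_network("192.0.2.0/24")],    # selected ranges only
    "dmz": [ipaddress.ip_network("0.0.0.0/0")],                  # open to the public
}


def is_permitted(vlan, source_ip):
    """Return True if source_ip may reach the given VLAN (default deny)."""
    source = ipaddress.ip_address(source_ip)
    allowed = VLAN_ACLS.get(vlan, [])  # unknown VLANs get an empty allow list
    return any(source in network for network in allowed)
```

Keeping the policy as data makes it easy to review each VLAN's exposure in one place before translating it into device-specific ACL syntax.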

2.2.4 - Intrusion Detection/Prevention System

2.2.4.1 - Subnet Positioning

Need to identify which subnets or VLANs will be of "most interest" in terms of where to place IDS/IPS sensors. These would be "mission critical" subnets where it behooves us to detect and stop suspicious network activity.

2.2.4.2 - Event scenarios (NAGIOS alarms, corrective actions (ACL modifications, RESETs?))

Need to identify what sort of scenarios might crop up (e.g. script kiddies trying to slew the telescope, Bulgarian SPAM king trying to set up shop in the DAC, etc.).

2.2.5 - "Host Based Intrusion Detection"

2.2.5.1 - Tripwire, AIDE or other hash-based system consistency packages/HIDS

Would include running periodic checks to look for anomalies and offloading the results, probably to SGUIL or a similar security console.
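The idea behind hash-based consistency checkers such as Tripwire or AIDE can be sketched in a few lines: record a digest for every file, then compare a later scan against that baseline. This is not how either tool is implemented or configured; the function names `file_digest`, `build_baseline`, and `compare` are hypothetical, and SHA-256 is an assumed choice of hash.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def build_baseline(root):
    """Map each file under root (by relative path) to its digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare(baseline, current):
    """Report files added, removed, or changed since the baseline."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "changed": sorted(
            name for name in set(baseline) & set(current)
            if baseline[name] != current[name]
        ),
    }
```

In practice the baseline itself has to be stored where an intruder cannot rewrite it, which is one reason for sending results to a separate security console.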

2.2.6 - System-wide log server

2.2.6.1 - syslog-ng or similar

Gathers up the syslog files from all servers and network devices of interest (e.g. servers, firewalls, routers and smart switches)

2.2.6.2 - "splunk", "swatch" and other log record dataminers.

These tools work on the syslog files sent to the network syslog-ng server.
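A log record dataminer in the spirit of swatch boils down to matching collected syslog lines against a set of watch patterns. The sketch below is not swatch or Splunk configuration; the rule names, regular expressions, sample log lines, and the names `WATCH_RULES`, `SAMPLE_LOG`, and `scan` are all invented for illustration.

```python
import re

# Hypothetical watch rules: (rule name, pattern to search for in each line).
WATCH_RULES = [
    ("ssh-auth-failure",
     re.compile(r"sshd\[\d+\]: Failed password for (?:invalid user )?(\S+) from (\S+)")),
    ("sudo-failure",
     re.compile(r"sudo: .*authentication failure")),
]

# Invented sample lines in classic syslog format.
SAMPLE_LOG = [
    "Jul 23 10:19:52 base-gw sshd[2211]: Failed password for invalid user admin from 203.0.113.9 port 4022 ssh2",
    "Jul 23 10:20:01 base-gw cron[300]: (root) CMD (run-parts /etc/cron.hourly)",
]


def scan(lines):
    """Return (rule name, line) for every line matching a watch rule."""
    hits = []
    for line in lines:
        for name, pattern in WATCH_RULES:
            if pattern.search(line):
                hits.append((name, line.rstrip("\n")))
                break  # report each line once, under the first matching rule
    return hits
```

The hits would then be forwarded to whatever console or notification path is chosen for security events.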

2.2.7 - system-wide security correlation server for event notification

2.2.7.1 - "SGUIL", "ACID/BASE" or similar combined security event display

Has real-time annunciators and alarms, similar to NAGIOS; allows for alarm email and automatic text messaging to CISS personal cell phones. Allows data mining and drill-down into security events of interest.

2.2.7.2 - Integrated with NAGIOS (or similar) performance monitoring ("event notification"/"call tree")

2.2.7.3 - Network and performance monitoring (also integrated NAGIOS)

Typically gathers data via SNMP to display performance, with the ability to set alarm and incident thresholds. Can also check whether critical applications are running or critical resources are available, using ad-hoc scripting.
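An ad-hoc check of the kind mentioned above is usually a small script that reports its result through the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) plus one line of status text. The sketch below checks free disk space; the thresholds and the names `evaluate` and `check_disk` are assumptions chosen for illustration.

```python
import shutil

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(free_fraction, warn=0.20, crit=0.10):
    """Map a free-space fraction to (exit code, status line); thresholds are illustrative."""
    percent = free_fraction * 100
    if free_fraction < crit:
        return CRITICAL, f"DISK CRITICAL - {percent:.1f}% free"
    if free_fraction < warn:
        return WARNING, f"DISK WARNING - {percent:.1f}% free"
    return OK, f"DISK OK - {percent:.1f}% free"


def check_disk(path="/"):
    """Check free space on the filesystem holding path."""
    try:
        usage = shutil.disk_usage(path)
        return evaluate(usage.free / usage.total)
    except (OSError, ZeroDivisionError) as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"

# A plugin wrapper would print the status line and exit with the code, e.g.:
#     code, message = check_disk("/")
#     print(message)
#     sys.exit(code)
```

Separating `evaluate` from the measurement keeps the threshold logic testable without touching a real filesystem.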

  3. Data Products
  • What products are consumed, produced, and stored
  • Their sources and destinations
  • Ensuring integrity
  4. Authentication/Authorization requirements

4.1 - PKI (?)

4.1.1 - Advantages/disadvantages

4.1.2 - Where is the Certificate Authority

4.1.3 - What is the vetting process

4.2 - Single sign-on using PKI or other means(?)

A Kerberos- or SESAME-type system, which provides mutual authentication between client and server, could go a long way towards providing non-repudiation and integrity of traffic up to the summit. Fermilab and others use Kerberos and could serve as a model of what is possible.

4.3 - Determine roles - subjects (who) and objects (what they are allowed to access)

4.4 - "Remote Access" to objects

4.4.1 - Access based on role

4.4.2 - Fixed VPN tunnels between LSST and partners

A permanent VPN tunnel could connect up LSST partners in a manner similar to the way SOAR partners (e.g. UNC's remote observation facilities) hook into the SOAR infrastructure.

Since "integrity" at LSST most certainly trumps confidentiality, it would seem that the ESP mode of VPN, which encrypts VPN traffic, is probably of less importance than the AH or "Authentication Header" mode of VPN. In particular, AH is able to provide non-repudiation for traffic that passes through it and would provide additional assurance against man-in-the-middle session hijacking, for example.
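The distinction drawn above, authenticating traffic without necessarily encrypting it, can be illustrated with a keyed hash (HMAC) appended to a cleartext payload. This is only an analogy for the integrity check, not an implementation of IPsec AH; the key, the message format, and the names `protect` and `verify` are hypothetical. Note also that a shared-key HMAC proves origin only between the two key holders.

```python
import hashlib
import hmac

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def protect(key, payload):
    """Return the cleartext payload followed by its HMAC-SHA256 tag."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return payload + tag


def verify(key, message):
    """Return the payload if its tag is valid; raise ValueError otherwise."""
    if len(message) < TAG_LEN:
        raise ValueError("message too short to carry a tag")
    payload, tag = message[:-TAG_LEN], message[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed")
    return payload
```

The payload stays readable on the wire, but any modification in transit, or any message from a sender without the key, fails verification.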

4.4.3 - Transport VPN to infrastructure from other sites

Transient programs which need to connect into LSST facilities could be given general VPN accounts and help in setting up VPN clients.

4.4.4 - Two-factor authentication using smart token (NIST 800-63 level 4)

Token devices are getting pretty cheap, and it should be quite feasible to require these for access into LSST infrastructure. Given Moore's law and the development of cracked password databases (e.g. "rainbow tables" for non-salted passwords), depending on just passwords for LSST security is not a very safe bet.
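To see why rainbow tables apply to non-salted passwords specifically, compare with salted hashing: each stored record gets its own random salt, so one precomputed table no longer covers every account. The sketch below uses PBKDF2 from Python's standard library; the iteration count and the names `hash_password` and `verify_password` are illustrative assumptions, and this is a complement to tokens, not a substitute for them.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def hash_password(password, salt=None):
    """Return (salt, digest) for a password, generating a random salt if needed."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

Because the salt differs per record, two users with the same password end up with different stored digests.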

4.4.5 - Authentication provided by RADIUS or DIAMETER or similar.

These servers could provide a central repository of strong authentication data records.

4.5 - "hairpinned VPN" imposes Firewall security policy on remote users.

This "fully integrates" VPN clients into the server network and is a VPN "best practice" that helps mitigate the threat of pwned client machines.

  5. Recovery plans

5.1 - "Business Recovery Plan"

5.1.1 - Identify the "family jewels"

5.1.2 - Determine the risk

5.1.3 - Develop recovery sequence

5.2 - Disaster Recovery Plans

5.2.1 - Immediate recovery from the event

5.2.2 - Segue into the recovery

5.3 - Contingency Recovery Plans

5.3.1 - Individual procedures for recovery

5.3.2 - Procedures for incident handling (typically security incidents)

5.4 - Backup

5.4.1 - Determine objects to be backed up

5.4.2 - On-site backup procedures

5.4.3 - Off-site backup procedures

  6. Maintainability provisions (maybe outside of scope for now)
  7. Variability among instances

Differences between Summit and Base Facility

  1. Physical Security

1.1 - Power

1.2 - HVAC

1.3 - Fire suppression

1.4 - Physical Access Controls

1.4.1 - La Serena uses a card-based RFID system, with logging, for access to the computer room

1.5 - Physical Detective Controls

1.6 - Physical Preventive Controls

1.6.1 - La Serena may use interior glass walls into computer centers.

  2. Network Security

2.1 - Connections to other security domains

2.2 - Internal network structure

  3. Data Products
  • What products are consumed, produced, and stored
  • Their sources and destinations
  • Ensuring integrity
  4. Authentication/Authorization requirements
  5. Base & Summit Recovery plans
  6. Maintainability provisions (maybe outside of scope for now)
  7. Variability among instances (for example, Chilean DAC vs. NCSA DAC)