Last modified on 10/08/2009 09:00:31 AM

This page is part of the Security topic.

Notes from NSF's Cyb09, Sept 14-15 2009

About the Summit

NSF Cybersecurity Summit 2009 for Large Research Facilities

This was the 5th annual summit. The series was started in response to a major security incident in 2003 that affected several large U.S. research facilities.

From the website (emphasis added):

The research and education community faces a pressing need for effective and efficient strategies for securing its IT infrastructure. This is particularly important in large, federally sponsored research facilities that operate very substantial, interconnected networks of diverse resources including computers, information stores, and special instrumentation. ... This forum will bring together stakeholders to establish and maintain collaborative efforts to advance cybersecurity among the university and government research communities.

Lee LeClair's Comments

(Lee is a security consultant with Ephipian)

I'm glad you guys were able to attend. From your notes, it sounds like it was definitely worthwhile. From what we've seen at major data centers, the key points are the same. Clear incident response processes are critical, as are TESTED backup and recovery exercises. Automated build processes are a little tougher up front but pay off big during recovery. And of course backups are easy; it's the RECOVERY where problems arise.

Other lessons we've seen in general:

  • For anything not super compute-intensive, seriously consider virtualization. You can store whole builds as VMs, which makes recovery much faster and simpler
  • Prioritize systems and data so that in the event of problems, it's clear what has to come back first, then second, etc.
  • Establish Recovery Time Objectives (how long can something be down), get management buy-in, and then build and practice to achieve it
  • Establish Recovery Point Objectives (how much data loss is acceptable, measured in time), get management buy-in, then set up schedules and exercises to achieve it
  • Document (so staff doesn't have to think too hard in stressed circumstances) and practice (practice makes perfect and instills confidence)
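
The two objectives above can be turned into simple pass/fail checks for a recovery drill. The following is only a minimal sketch: the function names, targets, and timestamps are hypothetical, and real targets would come from the management buy-in step.

```python
from datetime import datetime, timedelta

def meets_rpo(last_backup: datetime, now: datetime, rpo: timedelta) -> bool:
    """True if the data that would be lost right now fits within the RPO."""
    return now - last_backup <= rpo

def meets_rto(outage_start: datetime, service_restored: datetime, rto: timedelta) -> bool:
    """True if a (real or rehearsed) recovery finished within the RTO."""
    return service_restored - outage_start <= rto

# Hypothetical targets and timestamps, for illustration only.
rpo = timedelta(hours=24)
rto = timedelta(hours=4)
now = datetime(2009, 9, 15, 12, 0)
last_backup = datetime(2009, 9, 14, 23, 0)   # 13 hours before "now"
drill_start = datetime(2009, 9, 15, 8, 0)
drill_end = datetime(2009, 9, 15, 13, 30)    # 5.5 hours to restore

print("RPO met:", meets_rpo(last_backup, now, rpo))
print("RTO met:", meets_rto(drill_start, drill_end, rto))
```

In this made-up drill the backup schedule satisfies the RPO but the rebuild overruns the RTO, which is the kind of gap that only shows up when recovery is actually practiced.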

Bill Baker's Notes

I attended for LSST and the U.S. National Virtual Observatory. It was my first time at the summit -- it really helped me gain perspective on LSST's cybersecurity preparations.


With regard to LSST's cybersecurity plan:

  • We have the right ideas, but we can trim down our policy
    • NSF wants an executive summary, really (8-10 pages total, readable)
      • But keep nitty-gritty for internal use
    • Concentrate more on incident response and user education
  • NSF mainly wants to know that we:
    • know our security risks with regard to our science mission
    • are capable of protecting ourselves
    • will keep in touch with NSF
  • Misc things that stood out
    • Concentrate on recovery, in particular rebuild times
      • Automate machine installation as much as possible
      • Imagine our nightmare scenario and prepare for it
    • Documents sent to NSF are subject to audit -- be sure you mean it!
    • We value data and instruments intrinsically, but not computation clusters
      • Consider making a clean separation between data and computation

More detailed notes

Overall (LSST)

  • Your mission is science
    • Security should serve it
    • Know risks & manage them (and what an incident would truly cost)
    • Protect your physical control systems
  • Be ready to respond to incidents
    • Balance prevention & planning response; respond proportionately
    • Have contacts & share knowledge
      • Internal cohesion & human resources -- practice!
      • Peer organizations (to share current news, threats, info)
      • Law enforcement & external security organizations
    • Robustness in depth: Keep running despite attacks & compromises
      • If a portal is compromised, keep control systems protected & running
  • Knowledge protects
    • Know your needs and how to safeguard them
    • Think holistically -- watch for all threats and all solutions, not just the ones you are comfortable with
    • Educate your users
  • It's hard to clean up a network after an intrusion
    • Best to make it easy to reinstall
    • Separate data storage from software
  • Respond to vulnerabilities quickly
    • Custom kernel patching
  • Reality
    • How many rules does a large firewall have? 15,000?
      • What are the chances those rules are all correct? 0.
    • Assume that your users' accounts are bad (especially scientists)

Overall (NVO)

  • Practitioners
    • Kantara Initiative
  • Federated ID technologies are better-understood than policies
    • tech relatively well established & available, standardized
    • policies are a matter of research
  • Long-term
    • Trend is toward federated identity
      • Users are learning
      • Portals are starting to expect it
    • Has advantages for Relying Parties
      • Externalizes user & group management
      • Enforces good practices
  • Migrating from local auth to federated:
    • users: provide migration path or allow legacy
    • portals
      • catch transitions (new development & apps; rewrites)
      • highlight inherent incentives
      • make it easy
      • pick broad & shallow audiences first, to pull in institutions
  • InCommon is an example of a federated identity coalition
    • Software published
    • University of Texas copied it -- see slides
      • copied documents, with a few changes
    • Complete set of policy documents
    • Fee for each participant
    • Already doing all the things NVO wants to do
      • But much more formally

My Thoughts

  • What kinds of incidents can LSST have?
    • Compromised infrastructure
    • Compromised science accounts
    • Leaked personal research data or metadata
    • Integrity loss of data
      • during transmission
      • in storage
      • tampering vs. decay
  • LSST Policy changes
    • user & admin training: emphasize & elaborate
      • response plan: elaborate & mandate practice/simulation/testing
    • make recovery easy
      • segregate data & software (& checksums)
        • our priority is data, not computer systems
        • compromised software installation shouldn't threaten data
        • reduce incentive to alter data systems
        • also keep integrity checks remote from data
      • rebuild time requirement
        • automate machine installation where possible
    • systematize learning
      • track system changes (Mike's wiki page)
  • Questions
    • How do you know what *real* risks are? You can only guess.

Von Welch

  • People over-emphasize prevention and need to put more effort into response
    • you can get hit despite prevention
    • can go for years on luck, then get hit twice in a row

James Marsteller, Teragrid (PSC)

  • Educate your users
    • Incident in 2008
      • NFS compromise
      • PIs sharing SSH keys against the rules
      • Attack pattern: slow start, sleep, then own machines via a vulnerability & spread
    • Responded quickly
      • team already in place
      • secure communication channels
    • Educated users are more cooperative
      • Know the policies
      • Know the consequences
      • Personal machines: especially vulnerable
  • Turn off unused accounts
    • Disabling large numbers of accounts is hard to do during a crisis
  • Patching
    • downtime: balance availability vs. security
    • custom kernels delay more (Lustre ...)

Acting Deputy Director Cora Marrett

  • Ann Radcliffe (The Mysteries of Udolpho, 1794): "A well-informed mind is the best security against the contagion of folly and of vice."

Doug Pearson, REN-ISAC, Indiana U

  • Relationships & communication
  • What do you do during vulnerable window?
  • SES, Security Event System (similar to Argonne's Fed Mod?)
    • cross-site cooperation system
    • detect events, aggregate, correlate, determine bad actors, watch & block


  • Weaving in OpenID
    • support by Shib -- better library :)
    • viewed as inherently LOA 1 (wtf? protocol vs. provider)
  • Support LOA 2 with Silver
  • Accustom users to federated login flow
  • Maturation
    • Privacy
      • need to implement opaque identifiers
      • end-user control of attribute disclosure
    • Lots of potential in attributes
      • tons of potential use cases
      • rich and complex space that needs exploring
      • GSA workshop: "The Tao of Attributes" (goal = blueprints)

Stefan Lueders, CERN

  • Huge ecosystem
    • control systems -- putting biggest effort into protecting
    • visitors -- bringing their infected machines
    • computing (users running their own software)
    • [something else too -- missed it]
  • Detect infected machines and block them
  • Provide courses on secure programming (C, C++, Java)
  • Multi-national organization
    • deal with multiple countries' laws (What if they conflict?)
  • We should put more trust in users
    • in a large, free environment, I am not directly in charge of anything
    • academic freedom, privacy, etc.
    • train them (from childhood, even)
    • like teaching people to cross the street safely
  • Keep running while under attack, even compromised
    • Keep control systems up even if other systems are owned

Walter Dykas, DOE

  • Steven Chu says "Science is priority"
    • need to measure impact of rule changes on the *mission*
    • enforcers want "perfect security" but it doesn't exist
    • know risks: always have vulnerabilities, threats
      • may even have undetected compromises

Michael Corn, U of I

  • Outsourcing & looser coupling
    • storage, cloud CPUs
    • sliced-up services like
  • Breakdown of network boundary as a security perimeter
    • We bought an IDS
      • started turning off rules
      • we were left with traffic to Libya, Iran, etc.
      • but we have hundreds of students from those countries
  • Clever vulnerabilities
    • Triggering all air conditioners in NYC, collapse electric grid

Shawn Henry, FBI, Asst. Director, Cyber Division

  • Interactive, regular exchange of info & intel
  • Mission: mitigate threats from cyber attacks & crimes
  • What's the threat? Is it real? Many people don't know what the threat is, or what an incident looks like.
  • Every single infrastructure is being attacked and having losses, every day
    • Energy, Transportation, Banking & Finance, IT, CDCs
  • Who is the adversary?
    • Individuals & virtual hacker groups
      • small groups of specialists
      • organization similar to classical criminal groups
    • Terrorists, Jihadis (?wtf?)
    • Foreign Intelligence
    • Insiders
  • CNE = Computer Network Exploitation
    • confidentiality, integrity, availability
  • Avenues
    • Insiders
    • Vendors & Infiltrators
    • Expanded & Remote access (trojan wireless provider or USB device)
  • Challenges
    • hard to clean a network thoroughly after an intrusion
      • best to make it easy to reinstall
  • Preparing for an intrusion
    • Understand what it means to respond
    • Teams work together -- IT, engineering, line, admin
    • Have a plan, know it, practice it
    • Keep it to yourself -- at least the response plan
    • Defense in depth
    • Law enforcement relationship
  • Sharing information
    • Coordination & Cooperation
      • public & private sector

Prof. Gene Spafford

  • National Security has changed as threats have changed
    • distant past: Physical borders -- check
    • 20th c.: Asymmetric threats -- check (mostly)
    • now: Crime
      • counterfeiting
      • human trafficking
      • drugs (e.g. Mexico being destabilized)
      • weapons trafficking
      • piracy (Caribbean)
      • terrorism (often over-emphasized, but basically criminal)
  • What was our response to 9/11?
    • attack Afghanistan, invade Iraq (inappropriate)
    • cost estimated at $2 Trillion
      • $675M per victim
      • 334 civilian casualties per victim
      • Radicalization / creation of more terrorists
      • exactly what al-Qaeda wanted -- see Osama bin Laden's strategy statement
    • compare to death from heart disease, car accidents, lack of seatbelts
  • What are our priorities?
    • DOD budget: $700 billion
    • Justice+Science+Foreign Affairs+Education: $100 billion
  • Conclusions
    • We don't react until there's a disaster
    • Then we overreact and remember the name of the disaster
    • Ongoing low-level problems get less attention
    • If you have responsibility for security but don't have concomitant authority and funding, you become the person who gets blamed when something goes wrong
  • The state of cyber-security
    • watch sites say "green"
    • but situation is grim
      • botnets, malware, incidents -- all on continuous rise
      • most perps are never arrested
  • What can we do differently?
    • revisit engineering decisions that were made during previous technology generations
    • physical resources are now cheaper than programming time, security breaches, etc.


  • motivation to join InCommon?
    • grants administration
    • transparency
    • public access to information
  • Ardoth Hassler
  • Clair Goldsmith - smart old Texas guy
    • the kind of app you want to start with has one or two users from each ID Provider, to get it to start working
    • Inter-federation (InCommon / UT Fed / etc.)
      • Technology exists, works now
      • Policy does not exist
  • Ken Klingenstein
    • Federation of authentication = externalization of identity
    • Federation of authorization = externalization of attributes
  • "I'm twelve years old, but I have a one-time password token."

NSF Security Planning (mainly Abe Singer of LIGO)

  • Also see Ardoth's slides

  • Main Point
    • It's about risk mitigation
    • Programs and plans will improve over time
    • A journey, not a destination
  • Models out there, best practices
    • NEES good example of consortium
    • Internet2, universities
  • Should be:
    • Sufficient to meet needs
    • Appropriate to identified risks
  • Don't reinvent wheels
  • What NSF wants
    • Short overview (NSF doesn't want all details)
    • Know what you're doing
    • Under control
    • Will communicate with NSF
  • Reporting
    • Note: documents received by the NSF are subject to audit
    • Phone NSF instead of email
  • keep it short
  • the nature of the attack
  • what you're doing to keep things under control
  • immediate vs. annual report?
    • major vs. minor incidents
    • LIGO: major incidents interfere with operations
    • Teragrid: multi-site vs. single-site
    • Does it make the news?
  • Concepts (Abe)
    • Don't rely on user compliance -- not all will comply
    • User accounts *will* get compromised
    • Endpoints *will* get compromised
    • Usability: a requirement for security practice
    • Work from your infrastructure out
    • The insider threat is the outsider threat
    • Think "containment"
    • Technologies & mechanisms: understand what they do
      • have more than just a hammer
    • Network traffic isn't the problem; it's what you do with it
    • Best block is to not be there (Mr. Miyagi)
    • Security requires control
    • Sweat the small stuff & details
      • failures are often in little mistakes, despite broad correctness
      • you can be FISMA A+, but have terrible security
      • easy to produce a mountain of paperwork, but not implement
  • Assessment (not saying "audit", which is a verification of correctness)
    • Abe doesn't like "best practices" because in IT security it's too often herd behavior, too much fear and ignorance.
    • We don't have actuarial tables -- my life would be easier if we did.
  1. What are you protecting? What are your fundamental assets?
    • Understand risks and threats
    • Nightmare scenarios?
      • LIGO: postdoc injects false gravitational-wave data
      • LIGO: or data gets corrupted and is worthless
      • Something happens that makes your funding go away
    • Requirements and Process
    • Trust relationships: identify them
      • trust = believe something another system says
      • trust = one system is allowed to compromise another
    1. identify risk
    2. outline process to mitigate
    3. does the process really address the risk?
    4. is it worth doing?
    5. identify residual risk -- is it acceptable?
  2. What services do you have to provide?
  3. Who's getting served?
  • To Decide
    • Risk acceptance
      • identify residual risk & accept it at the highest levels
    • Security officer
      • someone with practical experience
      • enforce the plan
      • communicate with NSF
    • Management
      • either direct involvement or report to upper mgmt
  • Beware
    • threats vs risks (not the same)
    • attackers -- who are they?
    • don't under-estimate your attackers
      • as skilled as you, or more skilled
      • Stakkato read system documentation, learned as he went
  • To Do -- balance these
    • Protection
    • Monitoring
    • Response
    • Policy -- should serve what you want to do
  • Incident Response
    • start with your nightmare scenario
    • what would you like to be able to do?
  • intelligence (information & communication)
  • assessment
  • recovery
  • how did they get in?
  • what's the damage?
  • how do I keep them from coming back?
  • Monitor
    • network traffic
    • user & system activity
    • system logs
    • application logs
    • historical data
    • user & system data
  • Protection - really about system administration
    • configuration mgmt
    • account mgmt
    • access control
    • network, system, application separation
    • patch mgmt
  • Policy
    • institutional policies -- watch for conflicts
  • behavior
  • enabling vs. prohibiting
    • highlight each
  • enforceable
  • understandable & accessible
  • short -- so that people will read them
  • why, who, what, where, when
  • scope
  • summary
  • penalties
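
The five-step risk process in this section (identify risk, outline mitigation, check that it addresses the risk, ask whether it is worth doing, identify residual risk) can be kept as a small risk register. This is a hypothetical sketch: the 1-5 scores are subjective guesses, since, as noted above, there are no actuarial tables, and the entries and threshold are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Risk:
    name: str
    likelihood: int   # 1 (rare) .. 5 (expected); a subjective guess
    impact: int       # 1 (minor) .. 5 (mission-ending); a subjective guess
    mitigation: str
    reduction: int    # likelihood points the mitigation is believed to remove

    def residual(self) -> int:
        """Residual risk score after mitigation (likelihood floor of 1)."""
        return max(1, self.likelihood - self.reduction) * self.impact

def needs_signoff(risks, threshold):
    """Names of risks whose residual score exceeds what has been accepted."""
    return [r.name for r in risks if r.residual() > threshold]

# Hypothetical register entries and scores.
register = [
    Risk("False data injected by an insider", 2, 5,
         "integrity checks kept remote from data", 1),
    Risk("Portal compromise", 4, 2,
         "separate portal from control systems", 2),
]

for risk in register:
    print(f"{risk.name}: residual score {risk.residual()}")
print("escalate for acceptance:", needs_signoff(register, threshold=4))
```

The numbers matter less than the record: each entry documents the mitigation and the residual risk that someone at the highest level has to accept.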