Ticket #959 (closed defect: fixed)

Opened 10 years ago

Last modified 10 years ago

orca PipelineManager needs to handle defaultDomain=None

Reported by: Tim Axelrod
Owned by: srp
Priority: normal
Milestone:
Component: ctrl_orca
Keywords:
Cc: rhl, fergal
Blocked By:
Blocking:
Project: LSST
Version Number:
How to repeat: not applicable

Description

I'm running a pipeline with all nodes on my local box, and find that mpiexec fails if nodelist.scr contains a fully qualified domain name. mpdboot ends up assigning the node to the IP for localhost, and that means the nodename needs to be "newfield" instead of "newfield.as.arizona.edu". But PipelineManager.py crashes if the platform policy file does not set defaultDomain.

Here's how I fixed it in my local copy:

--- python/lsst/ctrl/orca/pipelines/PipelineManager.py (revision 10750)
+++ python/lsst/ctrl/orca/pipelines/PipelineManager.py (working copy)
@@ -51,7 +51,8 @@
         self.prodPolicyOverrides = prodPolicyOverrides
         self.defaultDomain = policy.get("platform.deploy.defaultDomain")
-        self.logger.log(Log.DEBUG, "defaultDomain = "+self.defaultDomain)
+        if self.defaultDomain is not None:
+            self.logger.log(Log.DEBUG, "defaultDomain = "+self.defaultDomain)
         self.rootDir = policy.get("defRootDir")
         self.createDirectories()
@@ -95,10 +96,14 @@
             self.logger.log(Log.DEBUG, "Suspiciously short node name: " + node)
             self.logger.log(Log.DEBUG, "-> nodeentry =" + nodeentry)
             self.logger.log(Log.DEBUG, "-> node =" + node)
-            node += "."+self.defaultDomain
+            if self.defaultDomain is not None:
+                node += "."+self.defaultDomain
             nodeentry = "%s:%s" % (node, nodeentry[colon+1:])
         else:
-            nodeentry = "%s%s:1" % (node, self.defaultDomain)
+            if self.defaultDomain is not None:
+                nodeentry = "%s%s:1" % (node, self.defaultDomain)
+            else:
+                nodeentry = "%s:1" % node
         self.logger.log(Log.DEBUG, "returning nodeentry = " + nodeentry)
         return nodeentry
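The None handling in the patch can also be sketched in isolation, outside of PipelineManager. This is only an illustration, not the code that went into trunk: the function name `qualify_node_entry` and its parameters are hypothetical, and the rule "append the domain only when the host name contains no dot" is an assumption standing in for the original short-name check, which is not shown in the diff.

```python
def qualify_node_entry(nodeentry, default_domain=None):
    """Return a 'host:count' entry for a nodelist line.

    Hypothetical helper, not part of ctrl_orca. The domain is appended
    only when default_domain is set and the host has no dot in it
    (an assumed stand-in for the original short-name check).
    """
    # Split "host:count"; sep is "" when no colon is present.
    node, sep, count = nodeentry.partition(":")
    if not sep:
        count = "1"  # same default slot count as the patch uses

    # Never concatenate with None: skip qualification when no domain is set.
    if default_domain is not None and "." not in node:
        node = node + "." + default_domain

    return "%s:%s" % (node, count)


if __name__ == "__main__":
    print(qualify_node_entry("newfield"))                  # newfield:1
    print(qualify_node_entry("lsst8:2", "ncsa.uiuc.edu"))  # lsst8.ncsa.uiuc.edu:2
```

Note that this sketch joins host and domain with a dot in both cases, whereas the else branch in the patch formats them with "%s%s" and no separator.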

Change History

comment:1 Changed 10 years ago by DefaultCC Plugin

  • Cc rhl, fergal added

comment:2 Changed 10 years ago by srp

  • Status changed from new to assigned
  • Owner changed from daues to srp
  • Component changed from meas_astrom to ctrl_orca

Thanks for the fix. I'll get that added into the tree.

Just curious... can you use a terminal on your local machine and then ssh into it using a fully qualified name?

Reassigning this to me, and putting it under ctrl_orca

comment:3 Changed 10 years ago by daues

I was curious about the comment "mpiexec fails if nodelist.scr contains a fully qualified domain name". Using fully qualified names is the standard way we run pipelines: all of the pex_harness examples use FQDNs. An example nodelist.scr:

lsst8.ncsa.uiuc.edu:2

So it would be interesting to identify the configuration/context where this is failing.

comment:4 Changed 10 years ago by srp

  • Status changed from assigned to closed
  • Resolution set to fixed
  • reviewstatus changed from notReady to selfReviewed

Fixed and tested in trunk.
