Ticket #2967 (closed action item: fixed)

Opened 5 years ago

Last modified 5 years ago

Use python2 instead of python when invoking the interpreter in code

Reported by: rowen Owned by: robyn
Priority: normal Milestone:
Component: TCT Keywords:
Cc: robyn, mjuric, ktl, rhl, smm, rowen, RayPlante, gpdf, mfreemon Blocked By:
Blocking: Project: LSST
Version Number:
How to repeat:

not applicable

Description

PEP 394 has recommendations on the Python 2 to 3 transition that affect us, even though we are staying with Python 2. Based on this PEP and discussion on the python-general mailing list, it appears that some time in 2015 the python command will invoke the Python 3 interpreter.

I propose that we future-proof our code by changing from python to python2 in hash-bang lines (#!/usr/bin/env python2) and any other situation where we are invoking a python interpreter. This should be for all new code now, but we should also plan to back-port existing code by the end of 2014.
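For the back-port of existing scripts, the hash-bang rewrite could be scripted. The following is only a minimal sketch: the helper name fix_shebang and its regular expression are hypothetical, not existing project tooling, and it assumes that only a line starting with #! and naming an unversioned python should change.

```python
import re

# Hypothetical pattern (an assumption for illustration): a hash-bang line whose
# interpreter is exactly "python", with no version suffix, optionally followed
# by whitespace and arguments.
_UNVERSIONED = re.compile(r"^(#!.*?\bpython)(?![0-9])(\s.*)?$")


def fix_shebang(line):
    """Return line with an unversioned python hash-bang changed to python2.

    Lines that are not hash-bang lines, or that already name a versioned
    interpreter (python2, python2.7, python3), are returned unchanged.
    """
    body = line.rstrip("\r\n")
    ending = line[len(body):]  # preserve the original line ending
    match = _UNVERSIONED.match(body)
    if match is None:
        return line
    return match.group(1) + "2" + (match.group(2) or "") + ending


if __name__ == "__main__":
    for example in ("#!/usr/bin/env python\n",
                    "#!/usr/bin/env python2\n",
                    "import os\n"):
        print(repr(fix_shebang(example)))
```

Only the first line of a file is a hash-bang line, so a real conversion would apply this to line one of each script and leave the rest untouched.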

Change History

comment:1 Changed 5 years ago by robyn

  • Cc mfreemon added

On 22 Jan 2014 at 6:20 PM

Russell pointed out in email that this ticket lacked most of the content from: DM/Policy/Python3Prep. For the sake of minimizing multi page references and maintaining the Ticket for this TCT issue, that page is reproduced below:


  • Use from __future__ import division so that / is floating-point division and // is floor (integer) division, regardless of the type of numbers being divided. This gives more predictable behavior than the old operators, avoiding a common source of obscure bugs. It also makes the intent of the code more obvious.

  • Use from __future__ import absolute_import and import local modules using relative imports (e.g. from . import foo or from .foo import bar). This results in clearer code and avoids shadowing global modules with local modules. It also makes 2to3 conversion more reliable.
  • Avoid dict.keys() and dict.iterkeys(). For iterating over keys, iterate over the dictionary itself, e.g.: for x in myDict:. To test for inclusion use in, e.g. if key in myDict:. This is preferred over keys() and iterkeys() and avoids the issues mentioned in the previous item.

  • Replace file with open. This is preferred and file is gone in Python 3.

  • Use as when catching an exception, e.g. except Exception as e or except (LookupError, TypeError) as e. The new syntax is clearer, especially when catching multiple exception classes, and the old syntax does not work in Python 3.
  • Use from __future__ import print_function. Minor, but provides forward compatibility. This will affect very little code since we rarely use print.

  • Use next(myIter) instead of myIter.next(). This is preferred, and the special method next has been renamed to __next__ in Python 3.
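
The items above can be illustrated together in one short script. This is a sketch rather than policy text: the variable names and the temporary file are made up for illustration, and relative imports are left out because they only work inside a package. It is written to run unchanged under Python 2.7 and Python 3.

```python
from __future__ import absolute_import, division, print_function

import os
import shutil
import tempfile

# Division: / is always true division, // always floors.
half = 7 / 2          # 3.5, even though both operands are integers
floor_half = 7 // 2   # 3

# Dictionaries: iterate and test membership on the dict itself,
# rather than calling keys() or iterkeys().
counts = {"a": 1, "b": 2}
keys_seen = sorted(key for key in counts)
has_a = "a" in counts

# Exceptions: use "as", including when catching several classes at once.
try:
    counts["missing"]
except (LookupError, TypeError) as e:
    caught = type(e).__name__

# Iterators: use the next() builtin, not the .next() method.
my_iter = iter([10, 20])
first = next(my_iter)

# Files: use open(), not file().
tmp_dir = tempfile.mkdtemp()
try:
    path = os.path.join(tmp_dir, "example.txt")
    with open(path, "w") as f:
        f.write("hello\n")
    with open(path) as f:
        contents = f.read()
finally:
    shutil.rmtree(tmp_dir)

print(half, floor_half, keys_seen, has_a, caught, first, contents.strip())
```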

For more information see http://python3porting.com/toc.html, among several useful references.

RHL's Comment

If we go this route, then I think we should follow the advice of e.g. http://python3porting.com/noconv.html and run 2to3 and fix the results to run with python 2.7 and python 3; the swig interface is claimed to build with either 2 or 3.

I guess that the one I hate most is the treatment of iterdict. Using iterdict in python2 is almost always a micro-optimisation, and one that goes away in python3! The argument is:

People should change idiomatic python2 and python3 to non-idiomatic python2 (with iterdict) so that 2to3 can convert it back.

Fortunately you can disable this when you run 2to3:

2to3 -x dict foo.py

and I think we should, followed by a hand conversion as needed (or removal of iterdict before running 2to3)

MJ's Comment

To RHL: To offer a different viewpoint on iter*() methods (and xrange()): one's view of these strongly depends on when they began learning python. To those who've come to the party late (e.g. >= Python 2.2, circa ~2001), this *is* idiomatic python2 (and .values(), .keys(), etc... are "methods to be used when I explicitly want a list back"). Same for xrange(). They're warts, for sure, but idiomatic warts.

N.b., I see these as being about robustness more than optimization: though the initial assumption that some dict is small and .keys() is safe to use may be true, that can change in the future without anybody noticing (causing difficult-to-track-down performance regressions).

Last edited 5 years ago by robyn

comment:2 follow-up: ↓ 4 Changed 5 years ago by robyn

This proposal was discussed at the TCT Meeting of 30 January 2014 attended by Russell Owen, K-T Lim, Robert Lupton, Mario Juric, and Jim Bosch. Mike Freemon sent his regrets since he had no opinion to offer on any of the proposals being reviewed. Gregory Dubois-Felsmann sent his regrets and gave his proxy to K-T.

This proposal provoked much discussion.

  • Robert expressed his strong preference for an initial implementation to use the simplest appropriate constructs until the implementation has been profiled and data shows that more efficient constructs should be used. This was in reference to the rule preferring "iterating over dictionary values and items use itervalues() and iteritems() instead of values() and items()".
    • Mario's rebuttal, in the comments in the proposal body above, indicated he felt that argument was inaccurate.
    • Russell strongly disagreed with Robert's position since he prefers using 'iteritems()' and does not want to be handed a full list. Russell was also willing to just drop this item from the proposal to move the proposal along.
  • Robert suggested that we should wait until we are ready to move to Python 3 and then use the converter, 2to3, to convert from Python 2 to Python 3 syntax. His goal is to minimize developer time on the process and let the automated translator handle the majority of the work.
    • Mario was concerned that certain python 2 features changing in Python 3 could be translated into 'bad' code. He cited the new division idiom as an example where each change should be reviewed to ensure it does the desired action.
  • Taking a different approach, it was suggested that all new development conform to python 3 idioms.
    • Robert doesn't want to use: from __future__ import
  • Mario countered by suggesting we separate the discussion into two questions: 1) what is a reasonable set of recommendations; and 2) how should we implement them.
  • K-T proposed 3 possibilities: 1) These proposed changes should be used in all new code; or 2) Existing code should be modified if the code is being modified for another purpose; or 3) do all the updates now.

Vote 1: All voting participants agreed to support K-T's option 1: all new code going forward would use the python 3 idioms presented in the Proposal.

Vote 2: All voting participants agreed that all existing code should be updated to use 'from __future__ import division' in order to expose potential hidden problems. This effort will be registered in a Ticket and scheduled as appropriate by the DM system scientist into the Release efforts.

  • The Ticket has been created: #3129 "Update existing python code to use from __future__ import division"
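
As an illustration of how the import can expose a hidden problem, consider code that silently relied on Python 2 integer division. This is a hedged sketch with hypothetical function names, not code from the LSST stack:

```python
from __future__ import division, print_function


def midpoint_index(n):
    # Written with Python 2 integer division in mind; the author meant n // 2.
    # With true division this returns a float such as 2.5.
    return n / 2


def middle(items):
    # Fails once / is true division, because a float is not a valid index.
    return items[midpoint_index(len(items))]


def middle_fixed(items):
    # Stating the intent with // behaves the same with or without the import.
    return items[len(items) // 2]


values = [10, 20, 30, 40, 50]
try:
    middle(values)
    exposed = False
except TypeError:
    # The float index is rejected, so the hidden assumption surfaces at once.
    exposed = True

print(exposed, middle_fixed(values))
```

Not every case fails this loudly: where the quotient feeds arithmetic rather than indexing, the result simply changes value, which is why each use of / still deserves review.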

In an off-topic comment, K-T said that he would prefer the package 'examples/' to be considered part of the package's documentation; and further, the examples/ should work if invoked as specified.

  • Robert and Russell said that the examples/ contents are currently built but not executed, since some of them expect input, or the data they require isn't available, etc.
    • This is a topic for a different forum.

comment:3 Changed 5 years ago by rhl

Robert suggested that we should wait until we are ready to move to Python 3 and then use the converter, 2to3, to convert from Python 2 to Python 3 syntax. His goal is to minimize developer time on the process and let the automated translator handle the majority of the work.

Actually that's not my proposal. I propose that when we have a breathing space we run 2to3 but continue to use python2.7. The from __future__ import division statement is a way to check for potential problems.

comment:4 in reply to: ↑ 2 Changed 5 years ago by mjuric

Replying to robyn:

  • Mario was concerned that certain python 2 features changing in Python 3 could be translated into 'bad' code. He cited the new division idiom as an example where each change should be reviewed to ensure it does the desired action.

Just to clarify: my comment about the division idiom was about how this is an opportunity to make the code more robust and find real bugs even while we're still using Python 2.7

PS: I like where we ended up -- the adopted resolution doesn't ask for lots of work that will ultimately be better handled by 2to3 as RHL pointed out, but it encourages style that will make it easier for 2to3 to get it right when we decide to use it (though the whole str/unicode -> bytes/str thing makes me cringe).

comment:5 Changed 5 years ago by robyn

  • Status changed from new to closed
  • Resolution set to fixed

The Policy statement is available at: DM/Policy/Python3Prep and is cited in CodeStandards. It was summarized for the user in email.

The scheduling of the conversion to 'from __future__ import division' has been recorded in Ticket #3129.

comment:6 Changed 5 years ago by rowen

Note that future imports can be chained, so I recommended this future import (since we're not bothering with print_function):

from __future__ import absolute_import, division
