
Developer Guide to Key Differences between Python 2 and 3

Python 3 was released in December 2008 as the successor to Python 2. It introduced many syntactic revisions and a reorganized standard library to improve Python’s usability and the programming experience. Because of these changes, Python 3 is not backward compatible with Python 2. This is unlike most other mainstream programming languages, such as Java, which place a strong emphasis on backward compatibility. The break in backward compatibility has allowed Python to grow and accommodate modern programming practices. However, it has also left behind millions of lines of legacy Python 2 code and modules that may still need to be maintained. To work with these codebases, and possibly migrate them to Python 3, we need to understand the differences between Python 2 and 3.

Back to the __future__

The Python language developers backported many of the features and packages introduced in Python 3 to Python 2.6 and Python 2.7. However, Python 3 features that would change the core Python 2 syntax or semantics are not enabled by default. They must instead be explicitly enabled by importing them from the __future__ pseudo-module. __future__ exists as an importable module, but a from __future__ import statement is not an ordinary import: it is a directive to the Python bytecode compiler, asking it to compile the file using the syntax and semantics of a future version of Python (in this case Python 3). This is why all __future__ imports must appear at the top of the file, preceded only by the module docstring, comments, and blank lines, and before any other code, including regular import statements. __future__ is meant to ease the transition to the newer Python syntax. In addition, it allows us to maintain a single codebase that runs on both versions of Python.
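
The placement rule described above can be checked directly with the built-in compile function. The following is a minimal sketch for Python 3; the names GOOD, BAD, compiles, and mandatory_in are made up for this illustration.

```python
import __future__

# A __future__ import is accepted after the module docstring...
GOOD = '"""Docstring."""\nfrom __future__ import division\nimport os\n'
# ...but rejected once any other statement precedes it.
BAD = 'import os\nfrom __future__ import division\n'


def compiles(source):
    """Return True if the source compiles, False on a SyntaxError."""
    try:
        compile(source, "<demo>", "exec")
        return True
    except SyntaxError:
        return False


def mandatory_in(feature_name):
    """Return the (major, minor) release in which a feature became the default."""
    feature = getattr(__future__, feature_name)
    return feature.getMandatoryRelease()[:2]


print(compiles(GOOD), compiles(BAD))
print(mandatory_in("division"), mandatory_in("print_function"))
```

The second print shows that the __future__ module also records the release in which each feature became part of the language by default.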

Key Differences

The key differences between Python 2 and Python 3 are outlined below:

Print Statement

One of the most basic differences between Python 2 and 3 is the print statement. In Python 2, print is a special statement used to print values to the console, so no parentheses are needed when it is used. In contrast, print is a function in Python 3 and requires parentheses when it is invoked.

The newer print function can be used in Python 2 by importing print_function from the __future__ pseudo-module.

$ python2
>>> print 1, 2, 3
1 2 3

>>> print(1, 2, 3)  # Notice how (1, 2, 3) is treated as a tuple
(1, 2, 3)

>>> type(print)  # This raises a syntax error since print is a statement, not a function
  File "<stdin>", line 1
    type(print)
             ^
SyntaxError: invalid syntax
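
For comparison, here is a small Python 3 sketch of print used as a function. The render helper is a made-up name for this illustration; it captures the output in a string so that the result can be inspected.

```python
import io


def render(*values, **kwargs):
    """Return what print() would write for the given arguments."""
    buffer = io.StringIO()
    print(*values, file=buffer, **kwargs)
    return buffer.getvalue()


print(1, 2, 3)                 # arguments are separated by a space
print(1, 2, 3, sep=", ")       # the separator is a keyword argument
print(repr(render(1, 2, 3, end="")))
print(callable(print))         # print is an ordinary object in Python 3
```

Because print is a regular function, it can be passed around, wrapped, or redirected with the file argument, none of which is possible with the Python 2 statement.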


Division

By default, Python 2 performs integer (floor) division when the “/” operator is applied to two integers. Python 3 instead performs true (floating-point) division with the “/” operator, regardless of the operand types. Integer division is still available in Python 3 through the “//” operator (two forward slashes). The Python 3 behaviour can be obtained in Python 2 by importing division from the __future__ pseudo-module.

$ python3
>>> 1 / 2
0.5
>>> 1 // 2
0

$ python2
>>> 1 / 2
0
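
The two operators can be compared side by side in Python 3. This is a minimal sketch; split_evenly is a made-up helper name.

```python
def split_evenly(total, parts):
    """Return the true quotient, the floor quotient, and the remainder."""
    return total / parts, total // parts, total % parts


print(split_evenly(7, 2))   # (3.5, 3, 1)
print(-7 // 2)              # -4: floor division rounds toward negative infinity
print(7.0 // 2)             # 3.0: with a float operand the result is a float
```

Note that // floors the result rather than truncating it, which matters for negative operands.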

 

Package __init__ files

Python treats individual source files (.py files) as modules, and modules may be grouped in a hierarchical folder structure to create a package (where the folder name defines the package name). Packages can be nested by placing sub-folders inside the parent package folder. The folder structure of an example package is shown below:

package_a
|-- __init__.py
|-- module1.py
|-- module2.py

 


The initialization code for a package can be placed inside an __init__.py file. A distinction between Python 2 and Python 3 is that Python 2 treats a folder as a package only if it contains an __init__.py file. The file can be empty, but it must exist. In contrast, Python 3 (version 3.3 and later) imposes no such requirement: a folder without an __init__.py file can still be imported, and is treated as a namespace package. Adding an __init__.py file remains common practice in Python 3, since it makes the folder a regular package and gives the initialization code a place to live.
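
The Python 3 behaviour can be observed with a throwaway folder. The sketch below assumes Python 3.3 or later; the package name demo_pkg_no_init and the helper import_without_init are invented for the demo.

```python
import importlib
import os
import shutil
import sys
import tempfile


def import_without_init():
    """Import a module from a folder that has no __init__.py file."""
    root = tempfile.mkdtemp()
    package_dir = os.path.join(root, "demo_pkg_no_init")
    os.makedirs(package_dir)
    with open(os.path.join(package_dir, "module1.py"), "w") as handle:
        handle.write("VALUE = 42\n")

    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        module = importlib.import_module("demo_pkg_no_init.module1")
        return module.VALUE
    finally:
        # Clean up the search path and the temporary folder.
        sys.path.remove(root)
        shutil.rmtree(root, ignore_errors=True)


print(import_without_init())
```

Under Python 2 the same import would fail with an ImportError, because the folder is not recognized as a package.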

Absolute Imports

Import statements such as import mod within a package are ambiguous in Python 2: they may refer to another module contained within that package or to a top-level module. This can cause standard library modules to be shadowed by package-internal modules that have the same name. To avoid this ambiguity, Python 3 does not allow implicit relative imports. Instead, import mod is always an absolute import, which loads the first mod module or package found in the sys.path directories. This behavior can be obtained in Python 2 by importing absolute_import from the __future__ pseudo-module.

To import other modules from within a package, Python 3 requires the dot notation, which explicitly marks an import as relative (the same explicit syntax is also accepted by Python 2.5 and later). A single dot refers to the current package, and two dots refer to the parent package. The dot notation can be used to import relative modules as shown below:

from . import mod                      # mod.py exists in the same directory
from .subpackage import another_mod    # subpackage is a sub-directory
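
To see the two import lines above run, the following Python 3 sketch writes a small package to a temporary folder and imports it. The package name relpkg_demo, the file user.py, and the helper load_names are hypothetical and exist only for this demo.

```python
import importlib
import os
import shutil
import sys
import tempfile

# Hypothetical package layout, keyed by path relative to relpkg_demo/.
FILES = {
    "__init__.py": "",
    "mod.py": "NAME = 'mod'\n",
    "subpackage/__init__.py": "",
    "subpackage/another_mod.py": "NAME = 'another_mod'\n",
    "user.py": (
        "from . import mod\n"
        "from .subpackage import another_mod\n"
        "NAMES = [mod.NAME, another_mod.NAME]\n"
    ),
}


def load_names():
    """Create the package on disk, import user.py, and return its NAMES."""
    root = tempfile.mkdtemp()
    try:
        for relative_path, source in FILES.items():
            path = os.path.join(root, "relpkg_demo", *relative_path.split("/"))
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "w") as handle:
                handle.write(source)
        sys.path.insert(0, root)
        importlib.invalidate_caches()
        return importlib.import_module("relpkg_demo.user").NAMES
    finally:
        if root in sys.path:
            sys.path.remove(root)
        shutil.rmtree(root, ignore_errors=True)


print(load_names())
```

The relative imports resolve against the package that contains user.py, so they keep working even if a top-level module named mod exists elsewhere on sys.path.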

 

Unicode Literals

In Python 2, string literals are byte strings (type str) by default. They can be converted to unicode by using the ‘u’ prefix on a literal or by calling the unicode() function. In Python 3, string literals are unicode strings by default. A bytes literal is written with the ‘b’ prefix, and a string can be converted to bytes using the bytes() function.

$ python3
>>> s = 'hello world'  # unicode string
>>> s_b = b'hello bytes'  # bytes string
>>> type(s_b)
<class 'bytes'>
>>> bytes(s, 'utf-8')  # convert unicode to bytes
b'hello world'

$ python2
>>> s = 'hello world'  # bytes string
>>> unicode(s)
u'hello world'
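
In Python 3 the conversion between text and bytes is usually written with the encode and decode methods. The sketch below uses a non-ASCII character to show that the two types are distinct; the variable names are arbitrary.

```python
text = "café"                     # unicode string: 4 characters
data = text.encode("utf-8")       # bytes: the é takes two bytes in UTF-8

print(len(text), len(data))       # 4 5
print(data)                       # b'caf\xc3\xa9'
print(data.decode("utf-8") == text)

# Python 3 refuses to mix the two types implicitly.
try:
    text + data
except TypeError as ex:
    print("TypeError:", ex)
```

Python 2 would silently attempt an ASCII conversion when mixing the two types, which is a common source of errors when porting code.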


FuncTools

Python 2 provides the reduce function as a built-in in the global namespace. Python 3 removes it from the built-ins and moves it into the functools module. Other functional utilities, such as map and filter, remain built-ins in Python 3.

$ python3
>>> import functools
>>> functools.reduce(lambda x, y: x + y, range(5))
10

$ python2
>>> reduce(lambda x, y: x + y, range(5))
10
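
Since functools.reduce also exists in Python 2.6 and later, importing it explicitly gives code that runs on both versions. A minimal sketch, with product as a made-up helper name:

```python
from functools import reduce
import operator


def product(values):
    """Multiply all values together; an empty input yields 1."""
    return reduce(operator.mul, values, 1)


print(reduce(lambda x, y: x + y, range(5)))   # 10
print(product([1, 2, 3, 4]))                  # 24
print(product([]))                            # 1, the initial value
```

The third argument to reduce is the initial value, which also serves as the result when the input is empty.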

 


Generators

In Python 3, the map, zip, and filter functions return lazy iterators, and range returns a lazy range object; in each case the values are produced on demand as the object is iterated. This is in contrast with Python 2, where range, map, zip, and filter return a list containing all the elements.

Note: Python 2 provides an xrange function, which returns a lazy xrange object for the range specified in its parameters instead of building a list.

$ python3
>>> range(10)
range(0, 10)


$ python2
>>> range(5)
[0, 1, 2, 3, 4]
>>> xrange(5)
xrange(5)
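
A practical consequence in Python 3 is that the result of map, zip, or filter can be consumed only once, while a range object can be reused and indexed. A short sketch:

```python
squares = map(lambda n: n * n, range(5))
first_pass = list(squares)    # consumes the iterator
second_pass = list(squares)   # nothing left to produce

print(first_pass)             # [0, 1, 4, 9, 16]
print(second_pass)            # []

numbers = range(10)           # lazy, but a reusable sequence
print(len(numbers), numbers[3], 7 in numbers)
print(list(zip("ab", numbers)))
```

When ported code relies on the Python 2 list behaviour, for example by indexing the result of map, wrapping the call in list() restores it.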

 

Exceptions

The syntax for raising exceptions in Python 3 differs slightly from Python 2. The form raise Exception, 'message' is no longer accepted; the exception must be created with a call, with the message passed as an argument within parentheses. The exception handling syntax has also changed: the as keyword must now be used to capture the exception into a variable. Both Python 3 forms are also accepted by Python 2.6 and later.

$ python2
>>> try:
...    raise Exception, 'Something went wrong!'
... except Exception, ex:
...    print ex
...
Something went wrong!

$ python3
>>> try:
...    raise Exception('Something went wrong!')
... except Exception as ex:
...    print(ex)
...
Something went wrong!
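
The same syntax inside a function, as a sketch that runs on Python 3 and on Python 2.6 or later. The function parse_port and its range check are invented for this example.

```python
def parse_port(value):
    """Convert value to a port number, or describe why it is invalid."""
    try:
        port = int(value)
        if not 0 < port < 65536:
            # The message is passed as a constructor argument.
            raise ValueError("port out of range: %d" % port)
        return port
    except ValueError as ex:
        # 'as' binds the exception instance to the name ex.
        return "invalid: %s" % ex


print(parse_port("8080"))
print(parse_port("70000"))
print(parse_port("abc"))
```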

 


List Comprehension Variable Namespace

Python 2 leaks the iteration variable of a list comprehension into the enclosing scope. A variable in that scope with the same name as the iteration variable is therefore reassigned to the last element iterated by the list comprehension. In Python 3, a list comprehension has its own scope, so the iteration variable does not leak into the enclosing scope.

$ python2
>>> s = 1000
>>> arr = [s*s for s in range(5)]  # global s has been reassigned to 4
>>> print(s)
4

$ python3
>>> s = 1000
>>> arr = [s*s for s in range(5)]  # s is not leaked into global
>>> print(s)
1000
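
The scoping change applies to comprehensions only. An ordinary for loop still rebinds its loop variable in the enclosing scope, as this Python 3 sketch shows:

```python
s = 1000
squares = [s * s for s in range(5)]
after_comprehension = s       # unchanged: the comprehension has its own scope

for s in range(5):
    pass
after_loop = s                # rebound by the for loop

print(squares)                # [0, 1, 4, 9, 16]
print(after_comprehension)    # 1000
print(after_loop)             # 4
```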

 


Rounding

In Python 3, the rounding strategy used by the round function has been changed for halfway cases (a fractional part of exactly 0.5). In halfway cases, the value is rounded to the nearest even number, a scheme known as banker’s rounding. This is unlike Python 2, which rounds all halfway cases away from zero. In addition, round called with a single argument returns an int in Python 3, whereas it returns a float in Python 2.

$ python3
>>> round(10.2), round(10.5), round(10.9)
(10, 10, 11)

$ python2
>>> round(10.2), round(10.5), round(10.9)
(10.0, 11.0, 11.0)
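
Code that depends on the Python 2 behaviour can reproduce it in Python 3 with the decimal module. The following is one possible sketch, not the only approach; round_half_away is a made-up helper name.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_away(value):
    """Round to an integer, with ties going away from zero as in Python 2."""
    quantized = Decimal(str(value)).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return int(quantized)


halfway = [0.5, 1.5, 2.5, -0.5, -1.5]
print([round(v) for v in halfway])             # [0, 2, 2, 0, -2]
print([round_half_away(v) for v in halfway])   # [1, 2, 3, -1, -2]
```

Unlike Python 2's round, this helper returns an int rather than a float.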

 

Version Independence with six

The six package helps us write code that runs on both versions of Python without modification. six provides interfaces that wrap the differences between Python 2 and Python 3. It lets us check the version of the Python interpreter through two boolean flags, six.PY2 and six.PY3, which are set according to the version of Python that is executing the script.

six wraps basic differences using constants such as six.text_type, which refers to the unicode text type (str in Python 3 and unicode in Python 2).

six also wraps standard library differences through the six.moves sub-module. six.moves provides a consistent interface, which works on both versions of Python, to standard library functions and modules that were reorganized in Python 3. For example, we can use the reduce function as shown in the code snippet below, which works on both Python 2 and Python 3:

>>> import six
>>> six.moves.reduce(lambda x, y: x + y, [1, 2, 3])
6
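
To make the idea behind these wrappers concrete, the sketch below rebuilds a few of them using only the standard library. It is an illustration of the approach and not six's actual source code; the names mirror six.PY2, six.PY3, and six.text_type, and to_text is a made-up helper.

```python
import sys
from functools import reduce  # same import works on Python 2.6+ and 3

# Version flags derived from the running interpreter.
PY2 = sys.version_info[0] == 2
PY3 = sys.version_info[0] == 3

# The conditional expression only evaluates `unicode` on Python 2,
# so this line does not raise a NameError on Python 3.
text_type = str if PY3 else unicode  # noqa: F821


def to_text(value, encoding="utf-8"):
    """Return value as the unicode text type of the running interpreter."""
    if isinstance(value, bytes) and not isinstance(value, text_type):
        return value.decode(encoding)
    return text_type(value)


print(PY2, PY3)
print(to_text(b"hello"), to_text(42))
print(reduce(lambda x, y: x + y, [1, 2, 3]))
```

In real projects it is usually simpler to depend on six itself, which covers many more cases than this sketch.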

 

Ankit Sachan: