    Mastering Python 3 I/O

    Copyright (C) 2010
    David M. Beazley
    http://www.dabeaz.com

    Presented at PyCon'10, February 17, 2010, Atlanta, Georgia.

    Introduction

    As most Python programmers know, Python 3 breaks backwards compatibility with Python 2 in both syntax and the semantics of built-in operations. One of the most radical changes is the ground-up redesign of the I/O system. This tutorial takes a tour of the new I/O stack. Topics include text processing, binary data handling, system interfaces, the io library module, memory views, and porting advice.

    Support Files

    The support download contains data files used by some of the code samples, a few code fragments to experiment with, and all of the code samples listed below.

    Code Samples

    Here are various code samples that you can use to try things out during the course. They're presented in the same order as the presentation slides.

    Preliminaries:

    • timethis.py. A utility function for making performance measurements. Used in many of the code samples that follow.
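
    The actual timethis.py ships with the download and is not reproduced here. To give a rough idea of what a timing helper of this kind might look like, here is a minimal sketch. The name timethis is borrowed from the file name, but the context-manager interface and the results list are assumptions made for illustration.

```python
import time
from contextlib import contextmanager

# Hypothetical timing helper; NOT the timethis.py shipped with the tutorial.
results = []  # (label, elapsed_seconds) pairs, kept so timings can be inspected


@contextmanager
def timethis(label):
    """Time the enclosed block and report the elapsed wall-clock time."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        results.append((label, elapsed))
        print("%s : %0.3f seconds" % (label, elapsed))


# Usage: wrap the operation being measured in a with-block.
with timethis("join 100000 strings"):
    s = "".join(str(n) for n in range(100000))
```

    Recording the timings in a list as well as printing them makes it easy to compare runs under different Python versions.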

    Part 1 : Introducing Python 3

    • printlinks.py. A Python 2 program that simply prints all of the links on a specified HTML page fetched with urlopen(). Try converting this program to Python 3 using 2to3.
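
    For comparison after running 2to3, a Python 3 counterpart might look roughly like the sketch below. It is not the author's printlinks.py: the LinkCollector class, the find_links() and print_links() helpers, and the choice of UTF-8 decoding are all assumptions made for illustration. The point it shows is that urlopen() moved to urllib.request and returns bytes that must be decoded before being treated as text.

```python
from html.parser import HTMLParser
from urllib.request import urlopen  # in Python 2 this lived in urllib / urllib2


class LinkCollector(HTMLParser):
    """Hypothetical helper: collect the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def find_links(html_text):
    """Return all link targets found in a string of HTML."""
    parser = LinkCollector()
    parser.feed(html_text)
    return parser.links


def print_links(url):
    """Fetch a page and print its links (requires network access)."""
    with urlopen(url) as u:
        # read() returns bytes in Python 3. UTF-8 is assumed here; a real
        # program would consult the charset in the response headers.
        text = u.read().decode("utf-8", errors="replace")
    for link in find_links(text):
        print(link)


# Offline check on a small, made-up HTML fragment.
sample = '<p><a href="http://www.dabeaz.com">home</a> <a name="x">anchor</a></p>'
print(find_links(sample))
```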

    Part 2 : Working with Text

    • textop.py. Performance timings of various text operations. Try it with different versions of Python.
    • textformat.py. Examples of new-style formatting applied to a list of tuples in order to make a formatted table.
    • textformat2.py. Examples of new-style formatting applied to a list of dictionaries in order to make a formatted table.
    • textformat3.py. Examples of new-style formatting applied to a list of instances in order to make a formatted table.
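
    The three variations above differ only in how fields are pulled out of each row. A minimal sketch of the idea follows. The sample data, the Stock class, and the function names are made up for illustration and are not taken from the tutorial files.

```python
# Made-up sample data, used only for illustration.
rows_as_tuples = [
    ("ACME", 100, 490.10),
    ("SPAM", 50, 45.75),
]
rows_as_dicts = [
    {"name": n, "shares": s, "price": p} for n, s, p in rows_as_tuples
]


class Stock:
    """Hypothetical record type with the same three fields."""

    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price


rows_as_instances = [Stock(*row) for row in rows_as_tuples]


def format_tuples(rows):
    # Positional fields: each tuple is unpacked into format().
    return ["{:>10s} {:>10d} {:>10.2f}".format(*r) for r in rows]


def format_dicts(rows):
    # Key lookup inside the format string: {0[name]}.
    return ["{0[name]:>10s} {0[shares]:>10d} {0[price]:>10.2f}".format(r)
            for r in rows]


def format_instances(rows):
    # Attribute lookup inside the format string: {0.name}.
    return ["{0.name:>10s} {0.shares:>10d} {0.price:>10.2f}".format(r)
            for r in rows]


print("{:>10s} {:>10s} {:>10s}".format("name", "shares", "price"))
for line in format_tuples(rows_as_tuples):
    print(line)
```

    All three functions produce identical lines; only the field-access syntax in the format string changes.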

    Part 3 : Binary Data Handling

    • msgfrag.py. A comparison of joining byte fragments together using concatenation, join, and bytearray extension.
    • structwrite.py. Two techniques of writing binary data structures are compared.
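
    The three ways of assembling a message from byte fragments can be sketched as follows. This is an illustration under assumed names (join_concat, join_join, join_bytearray) and made-up fragment data, not the contents of msgfrag.py, and it reports no particular timing result.

```python
import timeit

# Made-up fragments; a real test would use data resembling network messages.
frags = [("chunk%d;" % n).encode("ascii") for n in range(1000)]


def join_concat(frags):
    msg = b""
    for f in frags:
        msg += f  # bytes are immutable, so this typically builds a new object
    return msg


def join_join(frags):
    return b"".join(frags)  # single pass over all fragments


def join_bytearray(frags):
    msg = bytearray()
    for f in frags:
        msg.extend(f)  # grows one mutable buffer in place
    return bytes(msg)


for func in (join_concat, join_join, join_bytearray):
    elapsed = timeit.timeit(lambda: func(frags), number=20)
    print("%-15s %0.4f seconds" % (func.__name__, elapsed))
```

    The relative timings depend on the Python version and the number and size of the fragments, so it is worth varying both.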

    Part 4 : System Interfaces

    No files.

    Part 5 : The io module

    These files have a few simple performance tests for comparing different file modes, encodings, etc. You should try these under both Python 2 and 3.

    • iterlines.py. Iterate over lines of a text file using native open().
    • itercodecs.py. Iterate over lines of a text file using codecs.open().
    • iterbin.py. Iterate over lines of a text file using binary file mode.
    • iterenc.py. Iterate over lines of a text file using different text encodings. (Python 3 only).
    • readall.py. Read the entire contents of a file all at once.
    • find404.py. Find all 404 errors in a web server log using text and binary file modes.
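
    The text-versus-binary comparison in the last item can be sketched as below. The log lines, the function names, and the latin-1 encoding are assumptions made for illustration. The sketch only shows the two approaches side by side and does not reproduce find404.py.

```python
import os
import tempfile

# Made-up log lines in a common web server log layout, for illustration only.
SAMPLE = (
    '127.0.0.1 - - [17/Feb/2010:10:00:00 -0500] "GET /index.html HTTP/1.1" 200 1043\n'
    '127.0.0.1 - - [17/Feb/2010:10:00:01 -0500] "GET /missing.html HTTP/1.1" 404 210\n'
    '127.0.0.1 - - [17/Feb/2010:10:00:02 -0500] "GET /gone.html HTTP/1.1" 404 210\n'
)


def find404_text(filename, encoding="latin-1"):
    # Text mode: every line is decoded to str before it is examined.
    with open(filename, "r", encoding=encoding) as f:
        return [line.split()[6] for line in f if " 404 " in line]


def find404_binary(filename, encoding="latin-1"):
    # Binary mode: lines stay as bytes; only the matching paths are decoded.
    with open(filename, "rb") as f:
        return [line.split()[6].decode(encoding) for line in f if b" 404 " in line]


# Write the sample to a temporary file so both functions read real file data.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False,
                                 encoding="latin-1") as tmp:
    tmp.write(SAMPLE)
    logname = tmp.name

text_hits = find404_text(logname)
binary_hits = find404_binary(logname)
os.remove(logname)

print(text_hits)
print(binary_hits)
```

    In binary mode the search pattern must itself be a bytes literal, since bytes and str do not mix in Python 3.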

    Part 6 : Standard Library Issues

    No files.

    Part 7 : Memory Views and I/O

    • pipearray.py and getarray.py. An example of directly sending a binary array through a pipe created with the subprocess module.
    • structpack.py. Packing a bytearray in-place versus incremental extension.
    • receive.py and send.py. An example of sending a large buffer over a socket using memoryviews.
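
    The idea behind sending a large buffer with memoryviews can be sketched as follows. The helper names send_all() and recv_exactly(), the chunk size, and the use of socket.socketpair() with a thread are assumptions chosen to keep the example self-contained. The tutorial's receive.py and send.py are separate programs and are not reproduced here.

```python
import socket
import threading


def send_all(sock, data, chunk=65536):
    """Send a large buffer piece by piece without copying it."""
    view = memoryview(data)
    while view:
        # Slicing a memoryview yields another view, not a copy of the bytes.
        nsent = sock.send(view[:chunk])
        view = view[nsent:]


def recv_exactly(sock, nbytes):
    """Receive exactly nbytes directly into a preallocated bytearray."""
    buf = bytearray(nbytes)
    view = memoryview(buf)
    while view:
        nrecv = sock.recv_into(view)
        if nrecv == 0:
            raise EOFError("connection closed early")
        view = view[nrecv:]
    return buf


# Demonstration over a connected socket pair; the sender runs in a thread
# so that sending and receiving can proceed at the same time.
payload = bytearray(b"x" * 1000000)
a, b = socket.socketpair()
sender = threading.Thread(target=send_all, args=(a, payload))
sender.start()
received = recv_exactly(b, len(payload))
sender.join()
a.close()
b.close()
print(len(received), received == payload)
```

    Slicing the bytearray itself instead of the memoryview would copy the remaining data on every iteration, which is the cost this technique is meant to avoid.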

    Feedback

    I'm always looking for ways to improve presentation materials and examples. Send your ideas to dave@dabeaz.com.