CptS 481 - Python Software Construction

Unit 8: I/O

In [1]:
from IPython.display import HTML
HTML(open("notes.css", "r").read())
Out[1]:

The print() Function

  • keywords:
    • sep
    • end
    • file
    • flush
In [2]:
print(1,2,3)
print(4, 5, 6)
1 2 3
4 5 6
In [3]:
print(1, "spam", (1,2), set(), { 'z':42 }, sep=", ", end="; ")
print("this is on the same line")
1, spam, (1, 2), set(), {'z': 42}; this is on the same line
In [4]:
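import time
# With flush=False (the default), "waiting ... " may stay in the output buffer
# and appear only when "done" is printed; flush=True would show it right away.
print("waiting ... ", end="", flush=False)
time.sleep(10)
print("done")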
waiting ... done
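
The four keywords listed above can be exercised together in one small sketch. The in-memory `io.StringIO` stream and the name `buffer` are assumptions made for illustration; any object with a `write()` method can be passed as `file`.

```python
import io

# An in-memory text stream stands in for a real file (assumption for this sketch).
buffer = io.StringIO()

# sep goes between the arguments; end replaces the default trailing "\n";
# file redirects the output away from sys.stdout.
print("a", "b", "c", sep="-", end="!\n", file=buffer)

# flush=True asks print() to push the text out immediately rather than
# leaving it in the stream's buffer.
print("waiting ... ", end="", flush=True)
print("done")

captured = buffer.getvalue()  # everything written to the in-memory stream
```

Here `captured` holds only the first line, because the other two `print()` calls went to `sys.stdout`.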

Streams

  • basic I/O abstraction

  • predefined streams (a la POSIX)

    You can import these from sys:

    • stdin (or cin in C++)
    • stdout (or cout in C++)
    • stderr (or cerr in C++)
  • useful builtin stream function

    • next(stream)
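
As a minimal sketch of the predefined streams, the following writes to `sys.stdout` and `sys.stderr` directly. The message text is made up for illustration.

```python
import sys

# write() returns the number of characters written and, unlike print(),
# does not add a newline for you.
n = sys.stdout.write("normal output\n")

# Diagnostics conventionally go to stderr so they are not mixed into
# output that may be redirected to a file or a pipe.
sys.stderr.write("diagnostic output\n")

# print() can target either stream through its file keyword.
print("also a diagnostic", file=sys.stderr)
```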
In [5]:
%%sh
wc -l words.txt
102401 words.txt
In [6]:
f = open("words.txt")
for i in range(4):
    print(next(f), end="")
while True:
    word = next(f)
A
A's
AMD
AMD's
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-6-bb764aa4f108> in <module>
      3     print(next(f), end="")
      4 while True:
----> 5     word = next(f)

StopIteration: 
In [ ]:
word
In [7]:
ctLines = 0
for line in open("words.txt"):
    ctLines += 1
ctLines
Out[7]:
102401
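
The `StopIteration` traceback above can be avoided by giving `next()` a second argument, which it returns once the stream is exhausted. This sketch uses a small in-memory stream in place of words.txt (an assumption, so that it runs anywhere).

```python
import io

# A small in-memory stream stands in for words.txt (assumption for this sketch).
stream = io.StringIO("A\nA's\nAMD\n")

first = next(stream)  # the first line, including its trailing "\n"

words = []
while True:
    line = next(stream, None)  # returns None instead of raising StopIteration
    if line is None:
        break
    words.append(line.rstrip("\n"))
```

In most code a plain `for line in stream:` loop does the same job, since the `for` statement handles `StopIteration` itself.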
  • useful stream methods:
    • read()
    • readline()
    • readlines()
    • write()
    • writelines() (there is no writeline(), and writelines() does not add newlines for you)
    • seek()
    • tell()
In [8]:
wholeFile = open("Makefile").read()
In [9]:
wholeFile
Out[9]:
'EXPORTS = \\\n\tnotes.css \\\n\tnotes.html \\\n\tnotes.ipynb \\\n\twords.txt\n\n# We include the intro\nCHAPTER_LATEX_DIRECTORIES = \\\n\t$(BOOK_ROOT_DIR)/pt1-ch09-basicIO\n\ninclude ../include/Makefile.unit\n'
In [10]:
open("copiedFile", "w").write(wholeFile)
Out[10]:
188
In [11]:
f = open("Makefile")
lines = f.readlines()
lines[4]
Out[11]:
'\twords.txt\n'
In [12]:
lines[1]
Out[12]:
'\tnotes.css \\\n'
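
The stream methods list above includes `seek()` and `tell()`, which the cells above do not exercise. Here is a minimal sketch using a throwaway file; the file name `demo.txt` and the variable names are hypothetical.

```python
import os
import tempfile

# Work in a temporary directory so no real file is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name

    with open(path, "w") as f:
        # writelines() writes each string as is; newlines must be supplied.
        f.writelines(["first\n", "second\n", "third\n"])

    with open(path) as f:
        line1 = f.readline()   # reads one line
        position = f.tell()    # a "bookmark" just past the first line
        rest = f.readlines()   # the remaining lines, as a list
        f.seek(position)       # jump back to the bookmark
        again = f.readline()   # re-reads the second line
        f.seek(0)              # rewind to the beginning
        whole = f.read()       # the entire file as one string
```

For text files, treat the value returned by `tell()` as opaque: pass it back to `seek()` rather than doing arithmetic on it.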

The io Module

  • built in; you almost never need to import it directly
In [13]:
import io
io.DEFAULT_BUFFER_SIZE
Out[13]:
8192
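
One case where importing `io` is worthwhile is its in-memory streams, `io.StringIO` and `io.BytesIO`, which behave like files without touching the disk. The following is a small sketch with made-up contents.

```python
import io

# A text stream held in memory: it supports write() and works with print().
text_stream = io.StringIO()
text_stream.write("spam\n")
print("eggs", file=text_stream)
text = text_stream.getvalue()  # everything written so far

# A binary stream held in memory, initialized with some bytes.
byte_stream = io.BytesIO(b"\x00\x01\x02\x03")
first_two = byte_stream.read(2)  # read() takes an optional byte count
remaining = byte_stream.read()   # the rest of the stream
```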

The open() Function

  • most important I/O function
  • arguments:
    • path to file
    • most important keyword:
      • mode (similar to stdio)
        • "r" (default): readable text file
        • "r+": readable/writable text file
        • "w": writable text file, created from scratch (an existing file is truncated)
        • "a": writable text file opened for appending
        • append a "b" to any of these to read or write binary (unencoded) data
    • other keyword arguments include:
      • buffering
      • encoding
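        • encoding used if text file (default: platform dependent, UTF-8 on most modern systems)
  • in Python 2 this returned a file object, but that class no longer exists
    • now it's a factory function: the type of stream it returns depends on the mode
  • there's a close() method, but you rarely call it yourself: a with statement closes the file for you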
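
A common way to avoid calling `close()` by hand is the `with` statement, which closes the file when the block ends, even if an exception is raised. This is a minimal sketch; the temporary directory and the variable names are assumptions made so that it runs anywhere.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "foo.txt")

    # "w" creates the file (or truncates an existing one).
    with open(path, "w", encoding="utf-8") as f:
        f.write("abcdef\n")
    closed_after_with = f.closed  # the file was closed when the block ended

    # "a" appends to what is already there.
    with open(path, "a", encoding="utf-8") as f:
        f.write("ghi\n")

    # "r" is the default mode.
    with open(path, encoding="utf-8") as f:
        contents = f.read()
```

Passing `encoding` explicitly keeps the result independent of the platform's default.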
In [14]:
s = "abcdef\n"
f = open("foo.txt", "w")
print(f)
f.write(s)
f.close()
<_io.TextIOWrapper name='foo.txt' mode='w' encoding='UTF-8'>
In [15]:
s = b"abcdef\n"
f = open("foo.txt", "wb")
print(f)
f.write(s)
f.close()
<_io.BufferedWriter name='foo.txt'>
In [16]:
s = "abcdef\n"
f = open("foo.txt", "w")
print(s, file=f, end="")
f.close()

We can copy a file with a single statement:

In [17]:
open("bar.txt", "wb").write(open("foo.txt", "rb").read())
Out[17]:
7

Here's the effect of the "b" flag:

In [18]:
rResult = open("foo.txt", "r").read()
print("      For file opened with 'r', read() returns:", type(rResult))
      For file opened with 'r', read() returns: <class 'str'>
In [19]:
rbResult = open("foo.txt", "rb").read()
print("The same file opened with 'rb', read() returns:", type(rbResult))
The same file opened with 'rb', read() returns: <class 'bytes'>
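
The difference matters as soon as the text contains non-ASCII characters: a `str` counts characters, while `bytes` counts encoded bytes. The sketch below assumes UTF-8 and uses a made-up string and file name.

```python
import os
import tempfile

text = "na\u00efve caf\u00e9\n"  # "naïve café" plus a newline: 11 characters

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "accents.txt")  # hypothetical file name

    # newline="" turns off newline translation so the byte count is the
    # same on every platform.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(text)

    with open(path, "rb") as f:
        raw = f.read()      # bytes: each accented letter takes two bytes

    with open(path, "r", encoding="utf-8", newline="") as f:
        decoded = f.read()  # str: decoded back to the original text

char_count = len(text)
byte_count = len(raw)
```

Calling `raw.decode("utf-8")` performs the same conversion that text mode does on reading.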

Example: hexwords

The file "words.txt" contains a list of words, one per line. The goal is to find all pairs of words in the file that, when concatenated, form a legal 32-bit (8 hex digit) hexadecimal number, like "added fee". This means the words can only use letters "a" through "f".

In [20]:
# Build the list "hexWords[]" containing only words made from hex digits a-f 
hexWords = []
for line in open('words.txt'):
    # remove trailing `\n` from 'line'
    line = line.rstrip()
    if set(line) <= set('abcdef'): # We're intentionally ignoring upper case A-F
        hexWords.append(line)
# no-prize challenge:
# Can you write the above as a single statement using a comprehension?

# Compare each pair of hexWords. If their combined length is 8, print them.
# (Can you think of a faster way to do this?)

count = 0
for hexWord0 in hexWords:
    for hexWord1 in hexWords:
        if len(hexWord0) + len(hexWord1) == 8:
            print(hexWord0, hexWord1, end='    ')
            count += 1
            if count % 5 == 0:
                print()
a acceded    a defaced    a effaced    abed abed    abed aced    
abed babe    abed bade    abed bead    abed beef    abed cede    
abed dead    abed deaf    abed deed    abed face    abed fade    
abed feed    accede ad    accede be    accede fa    acceded a    
acceded b    acceded c    acceded d    acceded e    acceded f    
ace added    ace baaed    ace ceded    ace decaf    ace ebbed    
ace faced    ace faded    aced abed    aced aced    aced babe    
aced bade    aced bead    aced beef    aced cede    aced dead    
aced deaf    aced deed    aced face    aced fade    aced feed    
ad accede    ad beaded    ad bedded    ad beefed    ad cabbed    
ad dabbed    ad decade    ad deeded    ad deface    ad efface    
ad facade    add added    add baaed    add ceded    add decaf    
add ebbed    add faced    add faded    added ace    added add    
added baa    added bad    added bed    added bee    added cab    
added cad    added dab    added dad    added deb    added ebb    
added fad    added fed    added fee    b acceded    b defaced    
b effaced    baa added    baa baaed    baa ceded    baa decaf    
baa ebbed    baa faced    baa faded    baaed ace    baaed add    
baaed baa    baaed bad    baaed bed    baaed bee    baaed cab    
baaed cad    baaed dab    baaed dad    baaed deb    baaed ebb    
baaed fad    baaed fed    baaed fee    babe abed    babe aced    
babe babe    babe bade    babe bead    babe beef    babe cede    
babe dead    babe deaf    babe deed    babe face    babe fade    
babe feed    bad added    bad baaed    bad ceded    bad decaf    
bad ebbed    bad faced    bad faded    bade abed    bade aced    
bade babe    bade bade    bade bead    bade beef    bade cede    
bade dead    bade deaf    bade deed    bade face    bade fade    
bade feed    be accede    be beaded    be bedded    be beefed    
be cabbed    be dabbed    be decade    be deeded    be deface    
be efface    be facade    bead abed    bead aced    bead babe    
bead bade    bead bead    bead beef    bead cede    bead dead    
bead deaf    bead deed    bead face    bead fade    bead feed    
beaded ad    beaded be    beaded fa    bed added    bed baaed    
bed ceded    bed decaf    bed ebbed    bed faced    bed faded    
bedded ad    bedded be    bedded fa    bee added    bee baaed    
bee ceded    bee decaf    bee ebbed    bee faced    bee faded    
beef abed    beef aced    beef babe    beef bade    beef bead    
beef beef    beef cede    beef dead    beef deaf    beef deed    
beef face    beef fade    beef feed    beefed ad    beefed be    
beefed fa    c acceded    c defaced    c effaced    cab added    
cab baaed    cab ceded    cab decaf    cab ebbed    cab faced    
cab faded    cabbed ad    cabbed be    cabbed fa    cad added    
cad baaed    cad ceded    cad decaf    cad ebbed    cad faced    
cad faded    cede abed    cede aced    cede babe    cede bade    
cede bead    cede beef    cede cede    cede dead    cede deaf    
cede deed    cede face    cede fade    cede feed    ceded ace    
ceded add    ceded baa    ceded bad    ceded bed    ceded bee    
ceded cab    ceded cad    ceded dab    ceded dad    ceded deb    
ceded ebb    ceded fad    ceded fed    ceded fee    d acceded    
d defaced    d effaced    dab added    dab baaed    dab ceded    
dab decaf    dab ebbed    dab faced    dab faded    dabbed ad    
dabbed be    dabbed fa    dad added    dad baaed    dad ceded    
dad decaf    dad ebbed    dad faced    dad faded    dead abed    
dead aced    dead babe    dead bade    dead bead    dead beef    
dead cede    dead dead    dead deaf    dead deed    dead face    
dead fade    dead feed    deaf abed    deaf aced    deaf babe    
deaf bade    deaf bead    deaf beef    deaf cede    deaf dead    
deaf deaf    deaf deed    deaf face    deaf fade    deaf feed    
deb added    deb baaed    deb ceded    deb decaf    deb ebbed    
deb faced    deb faded    decade ad    decade be    decade fa    
decaf ace    decaf add    decaf baa    decaf bad    decaf bed    
decaf bee    decaf cab    decaf cad    decaf dab    decaf dad    
decaf deb    decaf ebb    decaf fad    decaf fed    decaf fee    
deed abed    deed aced    deed babe    deed bade    deed bead    
deed beef    deed cede    deed dead    deed deaf    deed deed    
deed face    deed fade    deed feed    deeded ad    deeded be    
deeded fa    deface ad    deface be    deface fa    defaced a    
defaced b    defaced c    defaced d    defaced e    defaced f    
e acceded    e defaced    e effaced    ebb added    ebb baaed    
ebb ceded    ebb decaf    ebb ebbed    ebb faced    ebb faded    
ebbed ace    ebbed add    ebbed baa    ebbed bad    ebbed bed    
ebbed bee    ebbed cab    ebbed cad    ebbed dab    ebbed dad    
ebbed deb    ebbed ebb    ebbed fad    ebbed fed    ebbed fee    
efface ad    efface be    efface fa    effaced a    effaced b    
effaced c    effaced d    effaced e    effaced f    f acceded    
f defaced    f effaced    fa accede    fa beaded    fa bedded    
fa beefed    fa cabbed    fa dabbed    fa decade    fa deeded    
fa deface    fa efface    fa facade    facade ad    facade be    
facade fa    face abed    face aced    face babe    face bade    
face bead    face beef    face cede    face dead    face deaf    
face deed    face face    face fade    face feed    faced ace    
faced add    faced baa    faced bad    faced bed    faced bee    
faced cab    faced cad    faced dab    faced dad    faced deb    
faced ebb    faced fad    faced fed    faced fee    fad added    
fad baaed    fad ceded    fad decaf    fad ebbed    fad faced    
fad faded    fade abed    fade aced    fade babe    fade bade    
fade bead    fade beef    fade cede    fade dead    fade deaf    
fade deed    fade face    fade fade    fade feed    faded ace    
faded add    faded baa    faded bad    faded bed    faded bee    
faded cab    faded cad    faded dab    faded dad    faded deb    
faded ebb    faded fad    faded fed    faded fee    fed added    
fed baaed    fed ceded    fed decaf    fed ebbed    fed faced    
fed faded    fee added    fee baaed    fee ceded    fee decaf    
fee ebbed    fee faced    fee faded    feed abed    feed aced    
feed babe    feed bade    feed bead    feed beef    feed cede    
feed dead    feed deaf    feed deed    feed face    feed fade    
feed feed    
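
One possible answer to the two questions in the comments above, sketched on a tiny stand-in word list rather than words.txt (the list `lines` and the name `byLength` are assumptions for this example). The comprehension builds `hexWords` in one statement, and grouping the words by length means each word is paired only with words of the complementary length instead of being compared against every other word.

```python
from collections import defaultdict

# A tiny stand-in for the lines of words.txt (assumption for this sketch).
lines = ["a\n", "abed\n", "Ada\n", "added\n", "babe\n", "fee\n",
         "zebra\n", "defaced\n"]

HEX_LETTERS = set("abcdef")

# The filtering loop as a single comprehension.
hexWords = [word for word in (line.rstrip() for line in lines)
            if set(word) <= HEX_LETTERS]

# Group the words by length so partners can be looked up directly.
byLength = defaultdict(list)
for word in hexWords:
    byLength[len(word)].append(word)

# Pair each word only with words whose length brings the total to 8.
pairs = [(hexWord0, hexWord1)
         for hexWord0 in hexWords
         for hexWord1 in byLength.get(8 - len(hexWord0), [])]

for hexWord0, hexWord1 in pairs:
    print(hexWord0, hexWord1)
```

The pairs come out in the same order as the nested loops would produce them.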

The input() Function

  • where does input come from these days?
  • (optional) argument is a prompt
  • In general, prompting is not Pythonic.
In [22]:
text = input("What is the air speed velocity of an unladen swallow?")
print(f"user entered: \"{text}\"")
What is the air speed velocity of an unladen swallow?20 miles per hour
user entered: "20 miles per hour"
In [23]:
text
Out[23]:
'20 miles per hour'
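
One answer to "where does input come from": when not attached to an interactive terminal, `input()` reads a line from `sys.stdin`, whatever stream that happens to be. This sketch temporarily swaps in an in-memory stream; the reply text is made up.

```python
import io
import sys

original_stdin = sys.stdin
# Replace stdin with an in-memory stream holding a canned reply.
sys.stdin = io.StringIO("African or European?\n")
try:
    # input() writes the prompt to stdout, reads one line from sys.stdin,
    # and strips the trailing newline.
    answer = input("Your reply: ")
finally:
    sys.stdin = original_stdin  # always restore the real stream
```

The same mechanism is why a script that uses `input()` also works when its input is piped or redirected from a file.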

I/O with the json Module

JSON is "JavaScript Object Notation", borrowed from the JavaScript language. It's a popular alternative to XML for exchanging data and very well suited to Python.

In [24]:
import json
from pprint import pprint

struct = ( { 
    "red":   0,
    "green": 1,
    "blue":  2,
    1:       "a",
    -2:      "stuff",
    3.1:     (1, 2, 3)
    },
    [ ],
    ( -1, 0, "a string" )
)
# json.dumps() converts an object to a JSON string.
result = json.dumps(struct)
result
Out[24]:
'[{"red": 0, "green": 1, "blue": 2, "1": "a", "-2": "stuff", "3.1": [1, 2, 3]}, [], [-1, 0, "a string"]]'
In [25]:
pprint(json.loads(result))
[{'-2': 'stuff', '1': 'a', '3.1': [1, 2, 3], 'blue': 2, 'green': 1, 'red': 0},
 [],
 [-1, 0, 'a string']]

Note that all of the dictionary keys have been converted to strings. Although Python allows non-string keys, JSON does not. In fact, these are "objects", not dictionaries, in JSON. This means that you can't always count on loads(dumps(x)) == x. Oh well.
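
A short sketch of the round trip failing, using a made-up dictionary with one integer key and one tuple value:

```python
import json

original = {1: "a", "point": (1, 2)}

restored = json.loads(json.dumps(original))

# The integer key came back as a string and the tuple as a list,
# so the round trip does not reproduce the original object.
same = (restored == original)
```

Here `restored` is `{"1": "a", "point": [1, 2]}`, so `same` is `False`.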

json can do prettyprinting. Unlike HTML, this does not change the semantics.

In [26]:
print(json.dumps(struct,
                  indent=4 # indent 4 spaces per level
))
[
    {
        "red": 0,
        "green": 1,
        "blue": 2,
        "1": "a",
        "-2": "stuff",
        "3.1": [
            1,
            2,
            3
        ]
    },
    [],
    [
        -1,
        0,
        "a string"
    ]
]

Another few things about JSON:

  • Tuples are converted to lists, which JSON calls "arrays".

  • Sets are not allowed (so convert them to lists before dumping).
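
A minimal sketch of the set restriction and two ways around it. The `default` keyword of `json.dumps()` is part of the standard library; the helper name `encode_set` and the sample data are hypothetical.

```python
import json

tags = {"spam", "eggs"}

# Dumping a set directly raises TypeError.
try:
    json.dumps(tags)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Option 1: convert to a list first (sorted, so the order is predictable).
as_list = json.dumps(sorted(tags))

# Option 2: give json.dumps() a fallback for types it cannot handle.
def encode_set(obj):
    """Hypothetical helper: turn sets into sorted lists."""
    if isinstance(obj, (set, frozenset)):
        return sorted(obj)
    raise TypeError(f"cannot serialize {type(obj).__name__}")

via_default = json.dumps({"tags": tags}, default=encode_set)
```

Either way, loading the result gives back a list, not a set.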

json.dump()

  • json.dump() writes an object to a file in JSON format.
In [27]:
outputFile = open("saved.json", "w")
json.dump(struct, outputFile)
outputFile.close()
In [ ]:
# %load saved.json
[{"red": 0, "green": 1, "blue": 2, "1": "a", "-2": "stuff", "3.1": [1, 2, 3]}, [], [-1, 0, "a string"]]
In [29]:
inputFile = open("saved.json")
object_ = json.load(inputFile)
object_
Out[29]:
[{'red': 0, 'green': 1, 'blue': 2, '1': 'a', '-2': 'stuff', '3.1': [1, 2, 3]},
 [],
 [-1, 0, 'a string']]
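
The same dump and load can be written with `with` blocks so that both files are closed automatically. This is a sketch only: the temporary directory and the sample `record` are assumptions, and `indent` is optional.

```python
import json
import os
import tempfile

record = {"red": 0, "green": 1, "blue": 2}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "saved.json")

    # json.dump() writes straight to an open text stream.
    with open(path, "w", encoding="utf-8") as outputFile:
        json.dump(record, outputFile, indent=4)

    # json.load() reads from an open text stream.
    with open(path, encoding="utf-8") as inputFile:
        loaded = json.load(inputFile)
```

Because every key here is a string and every value an integer, this particular object survives the round trip unchanged.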