Look at this program:

main = do
  contents <- readFile "foo.txt"
  writeFile "foo.txt" ('a':contents)

What does it do?

The tale of a contrived example

A little over a year ago, I was working on internationalizing the Trello web client, and I wrote a little Haskell program which went through and parsed several hundred Mustache templates, extracted all the English-looking strings from them, and spat out Teacup templates that had those strings replaced with lookups in a table of translations.

The program was interactive, prompting you to enter a reasonable “key name” for each English string, which it saved across runs in a little text file. If you’d already identified a string before, it wouldn’t bother to ask you again, even across multiple files.

But sometimes – just sometimes! – when I ran this program, it would spit out the following error message:

openFile: resource busy (file is locked)

This was mysterious, especially given that it was not happening consistently.

The program in question looked something like this:

1. construct a Bimap of keys-to-strings by reading a file
2. do some stuff, consulting the bimap and adding new entries to it over time
3. serialize the bimap and overwrite the file it came from with the new mappings

After a bit of squinting at this error, trying out similar but simpler things, I happened upon a three-line program, the same one you’ve already seen:

main = do
  contents <- readFile "foo.txt"
  writeFile "foo.txt" ('a':contents)

And what does that program do?

It crashes with the same error – consistently.

Now we’re talkin'

I suspected that this had something to do with lazy I/O, that bogeyman of which I had heard whispers in the past. I figured that Haskell’s readFile had decided not to actually, you know, read the file contents until someone asked for them. As such, readFile would have to keep the file handle open until the contents were requested, which wouldn’t happen until after writeFile attempted to open the same file. Which would fail, naturally.

So, to fix this, all we need to do is force evaluation before writeFile, and we’ll be golden. Right? Right.

Up to this point, I am on the right track. That will not last long.
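
For what it's worth, the part of the diagnosis that is right can be checked in isolation. Here is a minimal sketch that catches the failure instead of crashing. The names demoFile and tryPrepend are made up for this illustration, and the exact wording of the error may differ between GHC versions and platforms:

```haskell
import Control.Exception (IOException, try)

-- Hypothetical scratch file, so the demo doesn't clobber a real foo.txt.
demoFile :: FilePath
demoFile = "lazy-io-demo.txt"

-- Attempt the read-then-overwrite pattern and report whether the
-- second open was refused.
tryPrepend :: FilePath -> IO (Either IOException ())
tryPrepend path = do
  contents <- readFile path              -- opens the file; reads nothing yet
  try (writeFile path ('a' : contents))  -- second open of the same file

main :: IO ()
main = do
  writeFile demoFile "b\n"
  result <- tryPrepend demoFile
  case result of
    Left err -> putStrLn ("write refused: " ++ show err)
    Right () -> putStrLn "write went through"
```

With GHC on a Unix-like system I would expect the Left branch here, since the handle opened by readFile is still holding its read lock when writeFile tries to open the file.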

Crazy little thing called `seq`

I vaguely recalled something called seq, which could be used to force evaluation of thunks. As I understood at the time, it was generally used to improve memory behavior of programs that would allocate a bunch of intermediate thunks. But why not use it to control evaluation order as well?

main = do
  contents <- readFile "foo.txt"
  seq contents (writeFile "foo.txt" ('a':contents))

Hmm. Still doesn’t work. Why not?

If you have any significant Haskell experience, the answer is probably obvious. Stay with me! It’s going to get a lot worse before it gets any better.

We have a mystery on our hands

The first thing I tried, naturally, was sprinkling some printfs over the code, just to make sure it was crashing where I thought it was:

main = do
  putStrLn "about to open for reading"
  contents <- readFile "foo.txt"
  putStrLn "that statement is over but we don't know what was read"

  seq contents (return ())

  putStrLn "about to open for writing"
  writeFile "foo.txt" ('a':contents)
  putStrLn "done writing"

And running this, I got the following output:

about to open for reading
that statement is over but we don't know what was read
about to open for writing
test: foo.txt: openFile: resource busy (file is locked)

Which confirmed my suspicion: somehow, the file was still open for reading when we tried to open it for writing.

“Hmm,” past me thought, “Either seq isn’t actually forcing evaluation (I vaguely remember something about it not being intuitive…) or something deeper and weirder is happening here.” Maybe the seq call is being optimized away, since its second argument isn’t actually used? That’s a thing that can happen, right?

Who knows?

Something deeper and weirder

Since I was on a Mac, I fired up dtruss to try to decide whether or not the seq call was actually doing anything:

$ echo "b" > foo.txt
$ ghc Prepend.hs -o prepend
$ sudo dtruss -f ./prepend

One root password later, I got some interesting output. Actually a ton of output, condensed to the relevant parts here:¹

62260/0x3d67c0:  write(0x1, "about to open for reading\n\0", 0x1A)       = 26 0
62260/0x3d67c0:  open("foo.txt\0", 0x20004, 0x1B6)       = 3 0
62260/0x3d67c0:  fstat64(0x3, 0x10F508070, 0x1B6)        = 0 0
62260/0x3d67c0:  write(0x1, "that statement is over but we don't know what was read\n\0", 0x37)      = 55 0
62260/0x3d67c0:  read(0x3, "b\n(\0", 0x1FA0)         = 2 0
62260/0x3d67c0:  write(0x1, "about to open for writing\n(\0", 0x1A)      = 26 0
62260/0x3d67c0:  open("foo.txt\0", 0x20205, 0x1B6)       = 4 0
62260/0x3d67c0:  fstat64(0x4, 0x10F508170, 0x1B6)        = 0 0
62260/0x3d67c0:  close(0x4)      = 0 0
62260/0x3d67c0:  write_nocancel(0x2, "prepend: \0", 0x9)         = 9 0
62260/0x3d67c0:  write_nocancel(0x2, "foo.txt: openFile: resource busy (file is locked)\0", 0x31)        = 49 0
62260/0x3d67c0:  write_nocancel(0x2, "\n\0", 0x1)        = 1 0

Raw dtruss output doesn’t make for great skimming, so I’ll prettify it a bit:

write(stdout, "about to open for reading\n(\0", 26 bytes) =
  26 bytes written

open("foo.txt\0", for reading, 0666) =
  opened as FD 3

fstat64(FD 3, struct address, 0666) =
  information put into the provided struct

write(stdout, "that statement is over but we don't know what was read\n\0", 55 bytes) =
  55 bytes written

read(FD 3, string address, no more than 8096 bytes please) =
  2 bytes read: "b\n"

write(stdout, "about to open for writing\n\0", 26) =
  26 bytes written

open("foo.txt\0", for writing, 0666) =
  opened as FD 4

fstat64(FD 4, struct address, 0666) =
  information put into the provided struct

close(FD 4) =
  closed successfully

Presumably whatever it saw in the second fstat64 call was not to its liking, so it decided to close the file descriptor and begin printing the error messages (which I omitted from the prettified output).

But look at that: it actually did read the file! When I called seq, it read the whole thing – we can see b\n right there. Whatever that strange thing was that I didn’t quite remember about seq clearly wasn’t all that important. This code is fine.

Give past me some time; I’ll get there eventually.

Sanity check

At this point I was quite confused, so I tried something radical:

import System.IO

main :: IO ()
main = do
  readHandle <- openFile "foo.txt" ReadMode
  contents <- hGetContents readHandle
  seq contents (return ())
  hClose readHandle

  writeHandle <- openFile "foo.txt" WriteMode
  hPutStr writeHandle ('a':contents)
  hClose writeHandle

And that appeared to work perfectly.² So it is possible to do this in Haskell, and I have again verified that I totally get seq. Sanity check complete.

Encouraged by these results, I tried another, slightly less intense sanity check, expecting this one to work too (for some reason):

import System.IO

main :: IO ()
main = do
  putStrLn "about to do reading"
  contents <- withFile "foo.txt" ReadMode hGetContents
  putStrLn "about to seq"
  seq contents (return ())
  putStrLn "about to do writing"
  withFile "foo.txt" WriteMode (flip hPutStr ('a':contents))
  putStrLn "done writing"

Which actually does not work at all. The (prettified) dtruss output reveals why this is:

write(stdout, "about to do reading\n\200\004(\0", 20 bytes) =
    20 bytes written

open("foo.txt\0", for reading, 0666) =
    opened as FD 3

fstat64(0x3, struct address, 0666) =
    information put into the provided struct

close(FD 3) =
    closed successfully

write(stdout, "about to do writing\n@\004\0", 20 bytes) =
    20 bytes written

open("foo.txt\0", 0x20205, 0666) =
    opened as FD 3

fstat64(FD 3, struct address, 0666) =
    information put into the provided struct

ftruncate(FD 3, 0x0, 0666) =
    file truncated successfully

write(FD 3, "a\004\0", 1 byte) =
    1 byte written

close(FD 3) =
    closed successfully

write(stdout, "done writing\n\004\b\0", 13 bytes) =
    13 bytes written

Of course: withFile closed the handle we wanted to read from before we forced it to read, so this doesn’t work.

This just confirmed what I already knew: hGetContents doesn’t do any reading. Only the seq call causes the actual read to happen, and that happens after the handle has already been closed by withFile.

Now I don’t think it’s completely unreasonable, at this point, to expect some kind of error. Am I not trying to read from a file handle that’s already been closed? Isn’t that bad?

I would have liked to see a big red “Hey! You already closed that file handle!” message to pop up on my screen and for my monitor to go dark and start flashing a skull and crossbones and for a calm woman’s voice to chant “ACCESS DENIED” over the intercom, but I would have settled for a non-zero exit code.

What did I get instead? Nothing.

Actually worse than nothing, because the end result of running this program is that foo.txt gets truncated and replaced with the single character a. Silently, without complaint. Insult to injury!

But, unfortunately for my sense of indignation, this is very much the documented behavior:

Once a semi-closed handle becomes closed, the contents of the associated list becomes fixed. The contents of this final list is only partially specified: it will contain at least all the items of the stream that were evaluated prior to the handle becoming closed.

So what’s happening here is this:

1. withFile gets a file handle and hands it to hGetContents.
2. hGetContents says “Alright, I’ll create this empty list of characters, and if anyone asks what’s in it, I’ll load some up. But as soon as that handle is closed, I’m freezing the list.”
3. withFile immediately closes the file handle.
4. hGetContents says “Oh, well, the list is set in stone now. It shall be forever empty.”

I have no one to blame but myself, I suppose, bringing my own preconceived notions of what hGetContents “should” do to the table. But the principle of least surprise might have something to say about this.

Once I understood the behavior, the fix was clear: I just need to force evaluation of the list-of-characters inside the function passed to withFile:

import System.IO

main :: IO ()
main = do
  contents <- withFile "foo.txt" ReadMode strictRead
  withFile "foo.txt" WriteMode (flip hPutStr ('a':contents))
  where
    strictRead handle = do
      str <- hGetContents handle
      seq str (return str)

And now everything’s fine. Or so I thought.

We’ll come back to the subtle (or glaring) bug in this code soon. But first…

Why didn’t the other thing work

Even though I thought I had it working, I still wanted to understand where my simpler approach went wrong:

main = do
  contents <- readFile "foo.txt"
  seq contents (return ())
  writeFile "foo.txt" ('a':contents)

Because the documentation for hGetContents is rather clear on one point:

A semi-closed handle becomes closed […] once the entire contents of the handle has been read.

Now, it doesn’t exactly say that it becomes closed immediately. But I was assuming that’s what it meant. And – as you can see in the dtruss output up above – it certainly read the entire contents of the handle. All two bytes of it!

The historical record is a little fuzzy on what happened next. I believe I was talking to the friend and colleague who had gotten me interested in Haskell in the first place, and he handed me this very similar code for consideration:

main = do
  contents <- readFile "foo.txt"
  putStrLn contents
  writeFile "foo.txt" ('a':contents)

A minor variation on my attempt above, replacing the seq expression with putStrLn.

And, surprisingly to me, this worked.

And that cracked the case wide open. Because dtruss revealed a key difference between this implementation and the one that used seq:

open("foo.txt\0", for reading, 0666) =
    opened as FD 3

read(FD 3, string address, up to 8096 bytes) =
    2 bytes read: "b\n"

write(stdout, "b\n\024\b\0", 2 bytes) =
    2 bytes written

read(FD 3, string address, up to 8096 bytes) =    <-- I say!
    0 bytes read

close(FD 3) =
    closed successfully

write(stdout, "\n\004\0", 1 byte) =
    1 byte written

open("foo.txt\0", for writing, 0666) =
    opened as FD 3

ftruncate(FD 3, 0x0, 0666) =
    truncated successfully

write(FD 3, "ab\n\0", 3 bytes) =
    3 bytes written

close(FD 3) =
    closed successfully

Aha! A fascinating mistake.

Even though, in the seq case, we were reading the entire contents of the file, hGetContents has no way of knowing that. It asked for 8096 bytes, and it only got 2 back, but that doesn’t necessarily mean that there aren’t any more out there. From man 2 read:

The system guarantees to read the number of bytes requested if the descriptor references a normal file that has that many bytes left before the end-of-file, but in no other case.

hGetContents has no way of knowing that we’re talking to a normal file here, so it needs to do the second read in order to know that it is finished reading from the handle:

If successful, the number of bytes actually read is returned. Upon reading end-of-file, zero is returned. Otherwise, a -1 is returned and the global variable errno is set to indicate the error.

And that’s exactly what we see in the dtruss output above. read has no way of saying “I read two bytes and there aren’t any more.” The second read call is required to know that we’ve reached the end of the file.

And why didn’t seq cause two reads?

Because it didn’t need to.

`seq`ing the truth

See, as you may already know, I was using seq completely wrong.

In order to explain why, I would like to present the only definition of seq you’ll ever need:

seq ⊥ x = ⊥
seq _ x = x

Or in English: if the first argument to seq is bottom, then seq returns bottom. Otherwise, seq returns its second argument.

Which to me is far more intuitive than talking about “weak head normal form” or “evaluate until you find a data constructor or lambda abstraction or primitive value” or however I had seen seq presented at the time.³

And seq is, of course, lazy. It’s not going to do any more evaluation than it needs to in order to determine whether its first argument is undefined or not. Returning to our example, even if contents turns out to be 'b' : '\n' : undefined, that’s still distinct from undefined, so it won’t bother to check if there’s any more to the file. No second read, no handle closing, no joy.
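
That claim is easy to check without touching any files. The sketch below builds the same kind of partially defined list by hand and asks whether forcing it hits bottom. The names partial and isBottom are invented for this example, and evaluate is used because it forces its argument exactly as far as seq would:

```haskell
import Control.Exception (SomeException, evaluate, try)

-- A list whose first two cells exist but whose tail is bottom,
-- like a file of which only the first chunk has been read.
partial :: String
partial = 'b' : '\n' : undefined

-- True if forcing the value to its outermost constructor hits bottom.
isBottom :: a -> IO Bool
isBottom x = do
  r <- try (evaluate x)
  return $ case r of
    Left e  -> const True (e :: SomeException)
    Right _ -> False

main :: IO ()
main = do
  isBottom partial >>= print                -- False: stops at the first (:)
  isBottom (length partial) >>= print       -- True: length walks into undefined
  isBottom (undefined :: String) >>= print  -- True
```

The first result is the whole story of the failed attempts: the list is "not bottom" as soon as its first cell exists, so nothing past that cell ever gets looked at.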

A working solution

At last, I understood what was happening well enough to write a working solution:

main = do
  contents <- readFile "foo.txt"
  seq (length contents) (return ())
  writeFile "foo.txt" ('a':contents)

The only way seq can determine if length returns bottom is to evaluate it, and the only way length can determine how many characters are in the file is to read the whole thing.

I don’t feel great about that assertion, though: even though it’s true, it requires a little bit of indirect reasoning that shouldn’t really be in our application code. If I return to this code later on, will I still remember why that seq is there?

So we could add a comment, or we could just require the strict package and write:

import Prelude hiding (readFile)
import System.IO.Strict (readFile)

main = do
  contents <- readFile "foo.txt"
  writeFile "foo.txt" ('a':contents)

Which uses the same seq + length trick to fully evaluate the file contents, but keeps that detail out of our application code.
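
If pulling in a package feels heavy for a script, the same idea fits in a small helper. This is only a sketch: readFileStrict and prepend are names made up here, not anything from the strict package, and evaluate (length str) is doing the job that seq (length contents) did above:

```haskell
import Control.Exception (evaluate)
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Hypothetical helper: read the whole file before the handle is closed.
readFileStrict :: FilePath -> IO String
readFileStrict path = withFile path ReadMode $ \h -> do
  str <- hGetContents h
  _ <- evaluate (length str)  -- walks the list, so every chunk gets read
  return str

-- Prepend one character to a file, reading it strictly first.
prepend :: Char -> FilePath -> IO ()
prepend c path = do
  contents <- readFileStrict path
  writeFile path (c : contents)

main :: IO ()
main = do
  let path = "strict-read-demo.txt"       -- scratch file for the demo
  writeFile path (replicate 20000 'x')    -- larger than one 8096-byte read
  prepend 'a' path
  result <- readFileStrict path
  print (length result, take 3 result)    -- (20001,"axx")
```

The demo file is deliberately bigger than a single read, so a helper that only forced the first cell of the list would lose most of it.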

Or we could stop using the String type to read from files at all, you monster, and use a library like text or bytestring, which ship strict equivalents of the file-interacting functions in the Prelude. This is the correct choice in real life, but was not the first thing I reached for in a little script that just reads and shows a data structure.
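
For comparison, here is roughly what the three-line program looks like with strict Text, assuming the text package is available (it ships with GHC). prependText and the scratch file name are made up for this sketch:

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- Data.Text.IO.readFile is strict: the file has been read in full
-- and its handle closed by the time the action returns.
prependText :: Char -> FilePath -> IO ()
prependText c path = do
  contents <- TIO.readFile path
  TIO.writeFile path (T.cons c contents)

main :: IO ()
main = do
  let path = "text-demo.txt"  -- scratch file for the demo
  TIO.writeFile path (T.pack "b\n")
  prependText 'a' path
  TIO.readFile path >>= TIO.putStr  -- prints: ab
```

No seq anywhere, because there is no lazily produced list to force.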

And, while we’re at it, we could also not overwrite files like this, for a thousand reasons, and instead write to a temporary file and mv it over the old one once it has been successfully written. This is also the correct choice in real life, but if we’d done that then we never would have embarked on this fun journey!
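
A sketch of that write-then-rename approach, assuming the directory and filepath packages (both ship with GHC). prependViaTemp is a made-up name, and a real version would also want to remove the temporary file if something fails halfway:

```haskell
import System.Directory (renameFile)
import System.FilePath (takeDirectory)
import System.IO (hClose, hPutStr, openTempFile)

-- Write the new contents to a temporary file, then rename it over the original.
prependViaTemp :: Char -> FilePath -> IO ()
prependViaTemp c path = do
  contents <- readFile path
  -- Create the temp file next to the target so the rename stays on one filesystem.
  (tmpPath, h) <- openTempFile (takeDirectory path) "prepend.tmp"
  hPutStr h (c : contents)  -- writing everything forces the lazy read to the end
  hClose h
  renameFile tmpPath path

main :: IO ()
main = do
  let path = "rename-demo.txt"  -- scratch file for the demo
  writeFile path "b\n"
  prependViaTemp 'a' path
  readFile path >>= putStr      -- prints: ab
```

Lazy readFile happens to be harmless here, because the original file is never opened for writing.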

Tying up loose ends

But we’re not totally in the clear yet.

Remember my first sanity check, where I thought I got it working by manually hCloseing the file handle?

That only appeared to work because foo.txt had fewer than 8096 characters in it. If it had been longer, I would have seen the same truncation behavior as in the withFile example, just truncated to 8096 bytes instead of 0. Subtle! At least, subtle enough to fool me.

Now look back at my second attempt:

main = do
  contents <- readFile "foo.txt"
  seq contents (writeFile "foo.txt" ('a':contents))

Even if I replaced seq contents (writeFile ...) with seq (length contents) (writeFile ...), this would still be wrong, because seq does not guarantee that it will evaluate its first argument before its second argument. seq isn’t really about evaluation or order, despite the unfortunate name. It just provides a way to distinguish bottom from not-bottom.⁴

But would this work in practice? Sure! Sometimes.

Mendel Feygelson pointed out in the comments that there is a flaw in my reasoning here: it doesn’t matter whether the writeFile expression is evaluated before the length contents expression; what matters is that it’s not executed before length contents is evaluated. Evaluation ≠ execution in Haskell… except when lazy I/O is involved and execution happens as a side effect of evaluation, which is the point of this whole post, and which makes this confusing to reason about.

The takeaway is that, in this case, seq (length contents) (writeFile ...) is totally fine, regardless of the order in which those expressions are evaluated, because the actual write operation won’t occur until the writeFile expression gets threaded into main’s IO operation, which is guaranteed to be after length contents is evaluated.
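
The distinction is easier to believe after watching it happen. In this sketch an IO action is evaluated with seq well before it is executed, and a log records when each thing actually occurs. logEvent and demo are names invented for the example:

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Record events in order so we can see when things actually happen.
logEvent :: IORef [String] -> String -> IO ()
logEvent ref msg = modifyIORef ref (++ [msg])

demo :: IO [String]
demo = do
  ref <- newIORef []
  let write = logEvent ref "write executed"    -- an IO action; just a value so far
  write `seq` logEvent ref "write evaluated"   -- evaluates write, executes only the second action
  write                                        -- sequenced into the do block: now it runs
  readIORef ref

main :: IO ()
main = demo >>= mapM_ putStrLn
-- prints:
--   write evaluated
--   write executed
```

Evaluating write produced an action and nothing more. It only logged its message once it was sequenced into the do block, which is the same reason the writeFile in the example above cannot run early.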

Bring it back; bring it home

Returning to the Trello internationalization problem: I suspect I was seeing the error whenever I hit a template file that had no English strings in it, so it never had to consult or add to the bimap it was maintaining, which meant that it never had to evaluate the contents and never had to close the file handle it was reading from.

I don’t think I noticed this at the time (I don’t remember; it was over a year ago) because I was running it from a shell for loop on a few hundred files at once, and didn’t bother to check which specific files it was failing on. Or maybe I did, and that’s why I thought it had to do with lazy evaluation… I don’t know. I shouldn’t have waited so long to write this up.

How do we feel about all of this

I’m glad I hit this bug. The experience really made me think deeply about evaluation and seq and I/O and all sorts of things in Haskell. I had fun the entire time I was debugging it, which is one of the reasons why I wanted to share it with the rest of the world.

But.

From a pedagogical standpoint, I can recognize that this isn’t great.

I’m not the first person to encounter this problem, and I won’t be the last.

My experience was, I hope, much worse than average: I had just enough misinformation to be dangerous, just enough false hypothesis confirmation to keep me looking in the wrong direction…

But still, how many people give up on Haskell because of things like this? Not because they somehow have a completely wrong model of how seq behaves, but just because the three-line example at the beginning of this post fails in the first place. Even the jump from that example to “use the strict package” requires figuring out how to use cabal, and by then the Ruby equivalent is already finished running.

It’s not exactly novel to say that maybe lazy I/O isn’t all that great. And there are certainly arguments for keeping the Prelude’s file interaction functions lazy by default. I don’t really want to get into that.

All I want is to present the unedited mistakes of one beginner.

¹ Try it yourself! There’s a long prologue of non-app code that I assume is the Haskell runtime initialization. Also, each write is paired with a select, which I elided; there are some fun ioctl calls that fail because they’re not talking to a teletype; and more.
² More on that later.
³ Writing this post, I learned that this is actually how it’s presented in the documentation, but I didn’t encounter this definition until I read Haskell, A History. I should really read more documentation.
⁴ There’s some “can’t tell your head normal form from your bottom” joke hiding in here, but I decided it wasn’t worth looking for.