Look at this program:
```haskell
main = do
  contents <- readFile "foo.txt"
  writeFile "foo.txt" ('a':contents)
```
What does it do?
The tale of a contrived example
A little over a year ago, I was working on internationalizing the Trello web client, and I wrote a little Haskell program which went through and parsed several hundred Mustache templates, extracted all the English-looking strings from them, and spat out Teacup templates that had those strings replaced with lookups in a table of translations.
The program was interactive, prompting you to enter a reasonable “key name” for each English string, which it saved across runs in a little text file. If you’d already identified a string before, it wouldn’t bother to ask you again, even across multiple files.
But sometimes – just sometimes! – when I ran this program, it would spit out the following error message:
```
openFile: resource busy (file is locked)
```
This was mysterious, especially given that it was not happening consistently.
The program in question looked something like this:
- construct a `Bimap` of keys-to-strings by reading a file
- do some stuff, consulting the bimap and adding new entries to it over time
- serialize the bimap and overwrite the file it came from with the new mappings
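In sketch form, that shape looks something like this – a reconstruction, not the real tool: the file name, the `doStuffWith` stand-in, and the serialization format are all invented, and it assumes the `bimap` package:

```haskell
import qualified Data.Bimap as Bimap

main :: IO ()
main = do
    -- Lazy read: the handle stays open until the contents are fully demanded.
    contents <- readFile "keys.txt"
    let bimap = Bimap.fromList (read contents :: [(String, String)])
    bimap' <- doStuffWith bimap
    -- If nothing above ever forced `contents`, the read handle is still open
    -- here, so opening the same file for writing fails: resource busy.
    writeFile "keys.txt" (show (Bimap.toList bimap'))
  where
    -- Stand-in for the interactive key-naming loop.
    doStuffWith = return
```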
After a bit of squinting at this error, trying out similar but simpler things, I happened upon a three-line program, the same one you’ve already seen:
```haskell
main = do
  contents <- readFile "foo.txt"
  writeFile "foo.txt" ('a':contents)
```
And what does that program do?
It crashes with the same error – consistently.
Now we’re talkin'
I suspected that this had something to do with lazy I/O, that bogeyman of which I had heard whispers in the past. I figured that Haskell’s `readFile` had decided not to actually, you know, read the file contents until someone asked for them. As such, `readFile` would have to keep the file handle open until the contents were requested, which wouldn’t happen until after `writeFile` attempted to open the same file. Which would fail, naturally.

So, to fix this, all we need to do is force evaluation before `writeFile`, and we’ll be golden. Right? Right.
Up to this point, I am on the right track. That will not last long.
Crazy little thing called seq
I vaguely recalled something called `seq`, which could be used to force evaluation of thunks. As I understood it at the time, it was generally used to improve the memory behavior of programs that would otherwise allocate a bunch of intermediate thunks. But why not use it to control evaluation order as well?
```haskell
main = do
  contents <- readFile "foo.txt"
  seq contents (writeFile "foo.txt" ('a':contents))
```
Hmm. Still doesn’t work. Why not?
If you have any significant Haskell experience, the answer is probably obvious. Stay with me! It’s going to get a lot worse before it gets any better.
We have a mystery on our hands
The first thing I tried, naturally, was sprinkling some `printf`s over the code, just to make sure it was crashing where I thought it was:
```haskell
main = do
  putStrLn "about to open for reading"
  contents <- readFile "foo.txt"
  putStrLn "that statement is over but we don't know what was read"
  seq contents (return ())
  putStrLn "about to open for writing"
  writeFile "foo.txt" ('a':contents)
  putStrLn "done writing"
```
And running this, I got the following output:
```
about to open for reading
that statement is over but we don't know what was read
about to open for writing
test: foo.txt: openFile: resource busy (file is locked)
```
Which confirmed my suspicion: somehow, the file was still open for reading when we tried to open it for writing.
“Hmm,” past me thought, “Either `seq` isn’t actually forcing evaluation (I vaguely remember something about it not being intuitive…) or something deeper and weirder is happening here.” Maybe the `seq` call is being optimized away, since its second argument isn’t actually used? That’s a thing that can happen, right?

Who knows?
Something deeper and weirder
Since I was on a Mac, I fired up `dtruss` to try to decide whether or not the `seq` call was actually doing anything:
$ echo "b" > foo.txt
$ ghc Prepend.hs -o prepend
$ sudo dtruss -f ./prepend
One root password later, and I got some interesting output. Actually a ton of output, condensed to the relevant parts here:¹
```
62260/0x3d67c0: write(0x1, "about to open for reading\n\0", 0x1A) = 26 0
62260/0x3d67c0: open("foo.txt\0", 0x20004, 0x1B6) = 3 0
62260/0x3d67c0: fstat64(0x3, 0x10F508070, 0x1B6) = 0 0
62260/0x3d67c0: write(0x1, "that statement is over but we don't know what was read\n\0", 0x37) = 55 0
62260/0x3d67c0: read(0x3, "b\n(\0", 0x1FA0) = 2 0
62260/0x3d67c0: write(0x1, "about to open for writing\n(\0", 0x1A) = 26 0
62260/0x3d67c0: open("foo.txt\0", 0x20205, 0x1B6) = 4 0
62260/0x3d67c0: fstat64(0x4, 0x10F508170, 0x1B6) = 0 0
62260/0x3d67c0: close(0x4) = 0 0
62260/0x3d67c0: write_nocancel(0x2, "prepend: \0", 0x9) = 9 0
62260/0x3d67c0: write_nocancel(0x2, "foo.txt: openFile: resource busy (file is locked)\0", 0x31) = 49 0
62260/0x3d67c0: write_nocancel(0x2, "\n\0", 0x1) = 1 0
```
Raw `dtruss` output doesn’t make for great skimming, so I’ll prettify it a bit:
write(stdout, "about to open for reading\n(\0", 26 bytes) =
26 bytes written
open("foo.txt\0", for reading, 0666) =
opened as FD 3
fstat64(FD 3, struct address, 0666) =
information put into the provided struct
write(stdout, "that statement is over but we don't know what was read\n\0", 55 bytes) =
55 bytes written
read(FD 3, string address, no more than 8096 bytes please) =
2 bytes read: "b\n"
write(stdout, "about to open for writing\n\0", 26) =
26 bytes written
open("foo.txt\0", for writing, 0666) =
opened as FD 4
fstat64(FD 4, struct address, 0666) =
information put into the provided struct
close(FD 4) =
closed successfully
Presumably whatever it saw in the second `fstat64` call was not to its liking, so it decided to close the file descriptor and begin printing the error messages (which I omitted from the prettified output).

But look at that: it actually did read the file! When I called `seq`, it read the whole thing – we can see `b\n` right there. Whatever that strange thing was that I didn’t quite remember about `seq` clearly wasn’t all that important. This code is fine.
Give past me some time; I’ll get there eventually.
Sanity check
At this point I was quite confused, so I tried something radical:
```haskell
import System.IO

main :: IO ()
main = do
  readHandle <- openFile "foo.txt" ReadMode
  contents <- hGetContents readHandle
  seq contents (return ())
  hClose readHandle
  writeHandle <- openFile "foo.txt" WriteMode
  hPutStr writeHandle ('a':contents)
  hClose writeHandle
```
And that appeared to work perfectly.² So it is possible to do this in Haskell, and I have again verified that I totally get `seq`. Sanity check complete.

Encouraged by these results, I tried another, slightly less intense sanity check, expecting this one to work too (for some reason):
```haskell
import System.IO

main :: IO ()
main = do
  putStrLn "about to do reading"
  contents <- withFile "foo.txt" ReadMode hGetContents
  putStrLn "about to seq"
  seq contents (return ())
  putStrLn "about to do writing"
  withFile "foo.txt" WriteMode (flip hPutStr ('a':contents))
  putStrLn "done writing"
```
Which actually does not work at all. The (prettified) `dtruss` output reveals why:
write(stdout, "about to do reading\n\200\004(\0", 20 bytes) =
20 bytes written
open("foo.txt\0", for reading, 0666) =
opened as FD 3
fstat64(0x3, struct address, 0666) =
information put into the provided struct
close(FD 3) =
closed successfully
write(stdout, "about to do writing\n@\004\0", 20 bytes) =
20 bytes written
open("foo.txt\0", 0x20205, 0666) =
opened as FD 3
fstat64(FD 3, struct address, 0666) =
information put into the provided struct
ftruncate(FD 3, 0x0, 0666) =
file truncated successfully
write(FD 3, "a\004\0", 1 byte) =
1 byte written
close(FD 3) =
closed successfully
write(stdout, "done writing\n\004\b\0", 13 bytes) =
13 bytes written
Of course the `withFile` command closed the handle we wanted to read from before we forced it to read, so this doesn’t work.

This just confirmed what I already knew: `hGetContents` doesn’t do any reading. Only the `seq` call causes the actual read to happen, and that happens after the handle has already been closed by `withFile`.
Now I don’t think it’s completely unreasonable, at this point, to expect some kind of error. Am I not trying to read from a file handle that’s already been closed? Isn’t that bad?
I would have liked to see a big red “Hey! You already closed that file handle!” message to pop up on my screen and for my monitor to go dark and start flashing a skull and crossbones and for a calm woman’s voice to chant “ACCESS DENIED” over the intercom, but I would have settled for a non-zero exit code.
What did I get instead? Nothing.
Actually worse than nothing, because the end result of running this program is that `foo.txt` gets truncated and replaced with the single character `a`. Silently, without complaint. Insult to injury!
But, unfortunately for my sense of indignation, this is very much the documented behavior:
> Once a semi-closed handle becomes closed, the contents of the associated list becomes fixed. The contents of this final list is only partially specified: it will contain at least all the items of the stream that were evaluated prior to the handle becoming closed.
So what’s happening here is this:
1. `withFile` gets a file handle and hands it to `hGetContents`.
2. `hGetContents` says “Alright, I’ll create this empty list of characters, and if anyone asks what’s in it, I’ll load some up. But as soon as that handle is closed, I’m freezing the list.”
3. `withFile` immediately closes the file handle.
4. `hGetContents` says “Oh, well, the list is set in stone now. It shall be forever empty.”
I have no one to blame but myself, I suppose, for bringing my own preconceived notions of what `hGetContents` “should” do to the table. But the principle of least surprise might have something to say about this.
Once I understood the behavior, the fix was clear: I just needed to force evaluation of the list-of-characters inside the function passed to `withFile`:
```haskell
import System.IO

main :: IO ()
main = do
    contents <- withFile "foo.txt" ReadMode strictRead
    withFile "foo.txt" WriteMode (flip hPutStr ('a':contents))
  where
    strictRead handle = do
      str <- hGetContents handle
      seq str (return str)
```
And now everything’s fine. Or so I thought.
We’ll come back to the subtle (or glaring) bug in this code soon. But first…
Why didn’t the other thing work
Even though I thought I had it working, I still wanted to understand where my simpler approach went wrong:
```haskell
main = do
  contents <- readFile "foo.txt"
  seq contents (return ())
  writeFile "foo.txt" ('a':contents)
```
Because the documentation for `hGetContents` is rather clear on one point:

> A semi-closed handle becomes closed […] once the entire contents of the handle has been read.

Now, it doesn’t exactly say that it becomes closed immediately. But I was assuming that’s what it meant. And – as you can see in the `dtruss` output up above – it certainly read the entire contents of the handle. All two bytes of it!
The historical record is a little fuzzy on what happened next. I believe I was talking to the friend and colleague who had gotten me interested in Haskell in the first place, and he handed me this very similar code for consideration:
```haskell
main = do
  contents <- readFile "foo.txt"
  putStrLn contents
  writeFile "foo.txt" ('a':contents)
```
A minor variation on my attempt above, replacing the `seq` expression with `putStrLn`.

And, surprisingly to me, this worked.

And that cracked the case wide open. Because `dtruss` revealed a key difference between this implementation and the one that used `seq`:
open("foo.txt\0", for reading, 0666) =
opened as FD 3
read(FD 3, string address, up to 8096 bytes) =
2 bytes read: "b\n"
write(stdout, "b\n\024\b\0", 2 bytes) =
2 bytes written
read(FD 3, string address, up to 8096 bytes) = <-- I say!
0 bytes read
close(FD 3) =
closed successfully
write(stdout, "\n\004\0", 1 byte) =
1 byte written
open("foo.txt\0", for writing, 0666) =
opened as FD 3
ftruncate(FD 3, 0x0, 0666) =
truncated successfully
write(FD 3, "ab\n\0", 3 bytes) =
3 bytes written
close(FD 3) =
closed successfully
Aha! A fascinating mistake.
Even though, in the `seq` case, we were reading the entire contents of the file, `hGetContents` has no way of knowing that. It asked for 8096 bytes, and it only got 2 back, but that doesn’t necessarily mean that there aren’t any more out there. From `man 2 read`:

> The system guarantees to read the number of bytes requested if the descriptor references a normal file that has that many bytes left before the end-of-file, but in no other case.

`hGetContents` has no way of knowing that we’re talking to a normal file here, so it needs to do the second `read` in order to know that it is finished reading from the handle:
> If successful, the number of bytes actually read is returned. Upon reading end-of-file, zero is returned. Otherwise, a `-1` is returned and the global variable `errno` is set to indicate the error.
And that’s exactly what we see in the `dtruss` output above. `read` has no way of saying “I read two bytes and there aren’t any more.” The second `read` call is required to know that we’ve reached the end of the file.

And why didn’t `seq` cause two `read`s?

Because it didn’t need to.
`seq`ing the truth
See, as you may already know, I was using `seq` completely wrong.

In order to explain why, I would like to present the only definition of `seq` you’ll ever need:

```haskell
seq ⊥ x = ⊥
seq _ x = x
```

Or in English: if the first argument to `seq` is bottom, then `seq` returns bottom. Otherwise, `seq` returns its second argument.

Which to me is far more intuitive than talking about “weak head normal form” or “evaluate until you find a data constructor or lambda abstraction or primitive value” or however I had seen `seq` presented at the time.³
And `seq` is, of course, lazy. It’s not going to do any more evaluation than it needs to in order to determine whether its first argument is `undefined` or not. Returning to our example, even if `contents` turns out to be `'b' : '\n' : undefined`, that’s still distinct from `undefined`, so it won’t bother to check if there’s any more to the file. No second `read`, no handle closing, no joy.
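A quick GHCi session (my reconstruction, not something from the original debugging) makes the distinction concrete:

```haskell
ghci> seq undefined ()          -- bottom in, bottom out
*** Exception: Prelude.undefined
ghci> seq ('b' : undefined) ()  -- a cons cell is not bottom; the tail is never inspected
()
```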
A working solution
At last, I understood what was happening well enough to write a working solution:
```haskell
main = do
  contents <- readFile "foo.txt"
  seq (length contents) (return ())
  writeFile "foo.txt" ('a':contents)
```
The only way `seq` can determine if `length` returns bottom is to evaluate it, and the only way `length` can determine how many characters are in the file is to read the whole thing.

I don’t feel great about that assertion, though: even though it’s true, it requires a little bit of indirect reasoning that shouldn’t really be in our application code. If I return to this code later on, will I still remember why that `seq` is there?

So we could add a comment, or we could just require the `strict` package and write:
```haskell
import Prelude hiding (readFile)
import System.IO.Strict (readFile)

main = do
  contents <- readFile "foo.txt"
  writeFile "foo.txt" ('a':contents)
```
Which uses the same `seq` + `length` trick to fully evaluate the file contents, but keeps that detail out of our application code.
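Another option in the same spirit, if you’d rather say “fully evaluate this” outright than go through `length`, is the `deepseq` package – a sketch, assuming that package:

```haskell
import Control.DeepSeq (force)
import Control.Exception (evaluate)

main :: IO ()
main = do
  contents <- readFile "foo.txt"
  -- force fully evaluates the string; evaluate makes that happen now, not later
  _ <- evaluate (force contents)
  writeFile "foo.txt" ('a':contents)
```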
Or we could stop using the `String` type to read from files at all, you monster, and use a library like `text` or `bytestring`, which ship strict equivalents of the file-interacting functions in the `Prelude`. This is the correct choice in real life, but was not the first thing I reached for in a little script that just `read`s and `show`s a data structure.
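For what it’s worth, the `text` version is barely longer than the original – a minimal sketch, relying on the fact that `Data.Text.IO.readFile` reads the whole file strictly and closes the handle before returning:

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as T

main :: IO ()
main = do
  contents <- T.readFile "foo.txt"  -- strict: whole file read, handle closed
  T.writeFile "foo.txt" (T.cons 'a' contents)
```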
And, while we’re at it, we could also not overwrite files like this, for a thousand reasons, and instead write to a temporary file and `mv` it over the old one once it has been successfully written. This is also the correct choice in real life, but if we’d done that then we never would have embarked on this fun journey!
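That approach is easy enough to sketch, too – minimal, with no attempt at exception safety or at preserving permissions:

```haskell
import System.Directory (renameFile)
import System.IO

main :: IO ()
main = do
  contents <- readFile "foo.txt"
  -- Write the new contents somewhere else entirely...
  (tmpPath, tmpHandle) <- openTempFile "." "foo.txt"
  hPutStr tmpHandle ('a':contents)  -- writing forces the full lazy read
  hClose tmpHandle
  -- ...then swap it into place over the original.
  renameFile tmpPath "foo.txt"
```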
Tying up loose ends
But we’re not totally in the clear yet.
Remember my first sanity check, where I thought I got it working by manually `hClose`ing the file handle?

That only appeared to work because `foo.txt` had fewer than 8096 characters in it. If it had been longer, I would have seen the same truncation behavior as in the `withFile` example, just truncated to 8096 bytes instead of 0. Subtle! At least, subtle enough to fool me.
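If you want to watch it fool you, here’s a repro sketch – the exact cutoff depends on the runtime’s buffer size, but the idea is to seed `foo.txt` with more than one buffer’s worth of data and run the `hClose` version from above:

```haskell
import System.IO

main :: IO ()
main = do
  writeFile "foo.txt" (replicate 20000 'x')  -- well over one buffer
  readHandle <- openFile "foo.txt" ReadMode
  contents <- hGetContents readHandle
  seq contents (return ())   -- forces only the first chunk
  hClose readHandle          -- the lazy list freezes at that chunk
  writeFile "foo.txt" ('a':contents)
  readFile "foo.txt" >>= print . length  -- far fewer than 20001
```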
Now look back at my second attempt:
```haskell
main = do
  contents <- readFile "foo.txt"
  seq contents (writeFile "foo.txt" ('a':contents))
```
Even if I replaced `seq contents (writeFile ...)` with `seq (length contents) (writeFile ...)`, this would still be wrong, because `seq` does not guarantee that it will evaluate its first argument before its second argument. `seq` isn’t really about evaluation or order, despite the unfortunate name. It just provides a way to distinguish bottom from not-bottom.⁴

But would this work in practice? Sure! Sometimes.
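As an aside: if you genuinely want an ordering guarantee, GHC’s `parallel` package ships `pseq`, which does promise to evaluate its first argument first. A sketch, assuming that package – not something I knew to reach for at the time:

```haskell
import Control.Parallel (pseq)

main :: IO ()
main = do
  contents <- readFile "foo.txt"
  -- pseq, unlike seq, guarantees its first argument is evaluated first
  length contents `pseq` writeFile "foo.txt" ('a':contents)
```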
Bring it back; bring it home
Returning to the Trello internationalization problem: I suspect I was seeing the error whenever I hit a template file that had no English strings in it. In that case the program never had to consult or add to the bimap it was maintaining, which meant it never evaluated the contents – and so it never closed the file handle it was reading from.
I don’t think I noticed this at the time (I don’t remember; it was over a year ago) because I was running it from a shell `for` loop on a few hundred files at once, and didn’t bother to check which specific files it was failing on. Or maybe I did, and that’s why I thought it had to do with lazy evaluation… I don’t know. I shouldn’t have waited so long to write this up.
How do we feel about all of this
I’m glad I hit this bug. The experience really made me think deeply about evaluation and `seq` and I/O and all sorts of things in Haskell. I had fun the entire time I was debugging it, which is one of the reasons why I wanted to share it with the rest of the world.
But.
From a pedagogical standpoint, I can recognize that this isn’t great.
I’m not the first person to encounter this problem, and I won’t be the last.
My experience was, I hope, much worse than average: I had just enough misinformation to be dangerous, just enough false hypothesis confirmation to keep me looking in the wrong direction…
But still, how many people give up on Haskell because of things like this? Not because they somehow have a completely wrong model of how `seq` behaves, but just because the three-line example at the beginning of this post fails in the first place. Even the jump from that example to “use the `strict` package” requires figuring out how to use `cabal`, and by then the Ruby equivalent is already finished running.

It’s not exactly novel to say that maybe lazy I/O isn’t all that great. And there are certainly arguments for keeping the `Prelude`’s file interaction functions lazy by default. I don’t really want to get into that.
All I want is to present the unedited mistakes of one beginner.
1. Try it yourself! There’s a long prologue of non-app code that I assume is the Haskell runtime initialization. Also each `write` is paired with a `select`, which I elided, there are some fun `ioctl` calls that fail because they’re not talking to a teletype, and more. ↩︎
2. More on that later. ↩︎
3. Writing this post, I learned that this is actually how it’s presented in the documentation, but I didn’t encounter this definition until I read Haskell, A History. I should really read more documentation. ↩︎
4. There’s some “can’t tell your head normal form from your bottom” joke hiding in here, but I decided it wasn’t worth looking for. ↩︎