We got to Part IV! We have learned everything we need to know about Package Management and now get to learn about Writing Nix Expressions.

I have to point out that this “part” basically starts out with a link to a whole other manual. My goodness. I don’t know if I can type up every thought that goes through my head as I read two manuals. I’ve gotta sleep at some point.

I’m not sure if I will actually need to look at that, though. I guess we’ll find out as we go.

Chapter 14. A Simple Nix Expression

The manual says there are three things I have to do to make a Nix package:

1. Write a file describing all of the sources and dependencies of the package.
2. Write a shell script that can actually build the package.
3. Add it to all-packages.nix.

Well I don’t wanna do that third thing. I wanna make, like, a local package. Maybe I’ll need to make my own channel? I don’t know. I’m also sort of surprised that (1) and (2) are separate steps? The Nix expression doesn’t describe how to build it? I am surprised, based on the definition of “derivation” as “build action” that I vaguely remember.

Anyway, the manual gives an example of our old friend hello:

{ stdenv, fetchurl, perl }:

stdenv.mkDerivation {
  name = "hello-2.1.1";
  builder = ./builder.sh;
  src = fetchurl {
    url = ftp://ftp.nluug.nl/pub/gnu/hello/hello-2.1.1.tar.gz;
    sha256 = "1md7jsfd8pa45z73bz1kszpp01yw6x5ljkjk2hx7wl800any6465";
  };
  inherit perl;
}

And the manual provides a play by play of every line!

I love that. Great way to present this: every line here is a mystery, so I like that it walks me through each one independently.

Let’s hope I can understand the descriptions.

{ stdenv, fetchurl, perl }:

This apparently means that this is a function definition? This is very unfamiliar syntax. I assume that that means instead of (say) the JavaScript syntax param => body, the Nix syntax would be param: body. And that the one parameter is a record that’s being unpacked? I assume. So we take a record or an associative array or something with three elements: stdenv, fetchurl, and perl. That’s sort of a weird set of arguments for a general package declaration – though I guess that will be sorted out by step (3) in the list up there. I guess that all packages are functions with different signatures, and you have to know how to “invoke” these functions to get your derivation.

Okay. I’ll accept this for now.

Next up, we have:

stdenv.mkDerivation {

The manual doesn’t say this explicitly, but I’m gonna say this: this is the syntax for function invocation. This is just a function call. stdenv.mkDerivation is the function, and the thing delimited by the curly braces is the argument. This is totally normal syntax in Haskell or OCaml or whatever, but I suspect that is totally not obvious if you’re coming from a “normal” language. In JavaScript, this would be stdenv.mkDerivation({ ... }). It’s just that in (this particular family of) “functional programming languages,” people don’t use the parens. There is a very good reason that it is this way, and that reason is called currying. I don’t know if Nix has curried functions or not, or if functions can only have one argument for other reasons, or what. I’m new here too. Just saying that as background, because it seems like the manual assumes you understand that this is a function call.

Anyway, back to what the manual does say:

Building something from other stuff is called a derivation in Nix (as opposed to sources, which are built by humans instead of computers).

Okay. This is the closest we have come to a definition of “derivation” so far. I will meditate on this. I feel like this language is very imprecise, though. Surely the act of building something is not called a “derivation.” That would be crazy. Maybe building something is called “deriving” it? The derivation is like… the description of the thing to build, right?

I do not know.

We perform a derivation by calling stdenv.mkDerivation.

“Perform a derivation”? What does that mean??

mkDerivation is a function provided by stdenv that builds a package from a set of attributes.

“Builds a package”?? Are we building it, or describing it? I thought we needed to write a shell script with the actual build instructions? I’m very confused here.

A set is just a list of key/value pairs where each key is a string and each value is an arbitrary Nix expression. They take the general form { name1 = expr1; ... nameN = exprN; }.

Okay. Not sure why this is called a “set.” This definitely goes against my experience with the term “set” in other contexts. I would call this a “map” or an “associative array” or a “dictionary” or something.

Moving on to the “attributes:”

name = "hello-2.1.1";

It’s interesting that name, which is described as the “human-readable” string, has a version number. I would think the human-readable package name would be hello and the computer-readable package identifier would be hello-2.1.1. But apparently these are reversed? Weird.

builder = ./builder.sh;

It’s weird to me that ./builder.sh isn’t quoted – I am so used to paths just being strings. I guess in Nix they are a first-class type with their own syntax? That’s neat, if true.

src = fetchurl {
  url = ftp://ftp.nluug.nl/pub/gnu/hello/hello-2.1.1.tar.gz;
  sha256 = "1md7jsfd8pa45z73bz1kszpp01yw6x5ljkjk2hx7wl800any6465";
};

fetchurl is one of the arguments to our top-level function. I don’t know why mkDerivation is in stdenv but fetchurl isn’t. Both seem… pretty standard. But okay.

I am curious whether fetchurl just downloads a file or whether it knows how to tar xz it as well. I suppose we shall see once we get to builder.sh.

The manual notes that the src attribute isn’t required – I could call this whatever I want, and have as many different “sources” as I want. Interesting. I suppose we’ll see how to refer to this in builder.sh as well.

inherit perl;

inherit?? What is this, an object-oriented package manager? I demand to see your supervisor.

No; apparently this is sugar for perl = perl;. In the languages I’m familiar with, { perl } would be the standard way to say { perl = perl }. Haskell calls this “punning,” which is a more whimsical term.

So, okay. Once again, I assume we’ll see how to refer to perl – whatever the type of that thing is. The path to a perl binary? The “derivation” for perl? No idea. Let’s find out.

source $stdenv/setup

PATH=$perl/bin:$PATH

tar xvfz $src
cd hello-*
./configure --prefix=$out
make
make install

Huh! Okay. Pretty cavalier attitude towards variable quoting here, eh? Running around with the safety off? I’ll assume that all of these are paths in the Nix store and thus somehow unable to contain spaces (?). So I’ll allow it, tentatively.
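To make the quoting worry concrete, here is a minimal sketch in plain bash. Nothing in it comes from the manual: count_args and the path with a space in it are made up for illustration, and I'm only assuming that real store paths never look like this.

```shell
#!/usr/bin/env bash
# Hypothetical demo: count_args and the path below are invented.
# The path deliberately contains a space, which real store paths
# presumably never do.
count_args() { echo "$#"; }

out="/tmp/fake store/hello"

unquoted=$(count_args $out)     # word splitting turns this into two arguments
quoted=$(count_args "$out")     # one argument, as intended

echo "unquoted: $unquoted, quoted: $quoted"
```

If store paths really can't contain whitespace, the unquoted version in the manual behaves the same as a quoted one, which is presumably how it gets away with it.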

More interesting to note: there is no shebang here. Which implies that builder must be a shell script – but I do not know which shell. I get that you probably want a shell script due to the apparent environment variable argument-passing here, but is it possible to write something else? I do not know. Oh, yes I do: a footnote explains that this can be “written in any language, but typically it’s a bash shell script.” Okay. Just a default.

Anyway, let’s go through it line by line:

source $stdenv/setup

See? Isn’t source nicer to read? POSIX be damned.

I don’t know what $stdenv is. I assume that this is always there / automatically provided by whatever calls the builder, since it doesn’t appear in the “set” (do we really have to call it that?) that we passed to mkDerivation, while some of the other “attributes” appear as variable names here. Obviously it’s a path to some directory, but I don’t know what directory, so I don’t know how to inspect the setup script to see what it needs to do.

I turn to the annotation in the manual, but it doesn’t explain this either. It really adds no information – it is pretty much a recap of the above things that I assumed.

The manual does say that the environment is completely cleared except for the variables set by whatever-calls-this-script – no PATH, for example. Makes sense. I assume you could use an absolute path in here, on purpose, if you wanted. But you don’t want to. Nothing about the working directory that this is run under. Mysterious.
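A rough way to get a feel for what a cleared environment means, using nothing but env -i. This is a simulation under my own assumptions, not how Nix actually launches a builder: LEAKY_SETTING and FAKE_OUT are invented names.

```shell
#!/usr/bin/env bash
# Simulation only: `env -i` runs a command with an empty environment,
# plus whatever variables are listed explicitly. LEAKY_SETTING and
# FAKE_OUT are invented names, and this is not how Nix invokes a builder.
export LEAKY_SETTING="from my login shell"

seen=$(env -i FAKE_OUT=/tmp/fake-out /bin/sh -c \
    'echo "LEAKY_SETTING=[${LEAKY_SETTING:-}] FAKE_OUT=[${FAKE_OUT:-}]"')
echo "$seen"
```

The child shell sees the variable that was handed to it explicitly and nothing from my own session, which matches how the manual describes the builder's environment.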

I’d like to read $stdenv/setup but I’m not sure how. Annoying. I turn to the man pages, hoping to find something…

But what? I’m not really sure what page that would even be on. nix-build? No. nix-instantiate? I’m grasping at straws here. No idea.

That script is probably part of the nix package, though – so I should be able to find it in nixpkgs, right?

$ find ~/src/nixpkgs -name 'setup'

Nope. Ah, right, because it would be an external source…

$ git clone git@github.com:NixOS/nix.git ~/src/nix
$ find ~/src/nix -name 'setup'

Boooo.

Okay, well, it probably exists somewhere on my file system, right?

$ ls /nix/store | grep stdenv
7nw4bgvwl4w03s6g15279hq52gbc30iy-stdenv-darwin.drv
9879lyvqbj1qifbsm6i7hk4p75jz7afa-bootstrap-stage0-stdenv-darwin.drv
9ack3qyjc26za4flq6awgnp775dldnij-stdenv-darwin.drv
chb0aw1yl97q1p642j58fb0l3jbifrah-bootstrap-stage0-stdenv-darwin.drv
d7apm4hi9bg2fi16x8mzk86chgvb9k8s-bootstrap-stage3-stdenv-darwin.drv
hfqapkz752p7szpavxld8j5vpwi6wc5j-bootstrap-stage2-stdenv-darwin.drv
kcbzm20vy8myjw521wjb2grbchx6h50s-bootstrap-stage3-stdenv-darwin.drv
liacvbiqhxk4505g88b7s27zmqrmpfa4-bootstrap-stage4-stdenv-darwin.drv
llgrpiin4wr33lkpxia3qn5iddf2a5yh-bootstrap-stage4-stdenv-darwin.drv
mdz2bv285gl5q3z5qkivil70lpg03qv5-bootstrap-stage1-stdenv-darwin.drv
qwvpg693ap04clwg0r1wdiprsah4yqfq-bootstrap-stage1-stdenv-darwin.drv
sp860v02a8j8s7vqhjrc94hgsklqdvdw-stdenv-darwin/
v85b3qis3fyi553rb0w30iballqcs8hb-bootstrap-stage2-stdenv-darwin.drv
xi1sgg2rabsz0hkysgw6ld60d9kybv9n-bootstrap-stage2-stdenv-darwin.drv

Okay. sp860v02a8j8s7vqhjrc94hgsklqdvdw-stdenv-darwin/ sounds promising.

$ tree -F /nix/store/sp860v02a8j8s7vqhjrc94hgsklqdvdw-stdenv-darwin
/nix/store/sp860v02a8j8s7vqhjrc94hgsklqdvdw-stdenv-darwin
├── nix-support/
└── setup

Okay. Just the script, and a weird haunted empty directory.

Okay! Wow. The script is 1338 lines long. That’s a lotta shell – I will not reproduce it here. But also: come on, Nix maintainers. Trim one line. You’re so close.

What does it do? I don’t know. Sets some environment variables – sets SHELL to be the Nix version of bash, and lots of NIX_ things that I don’t know what they do. set -euo pipefail, good good.

The script has quite a few comments, and is broken up into parts: section headers include “hook handling,” “logging,” “error handling” – which sets up an EXIT trap to report failure or build duration. Nice. Then there are the “helper functions” – apparently functions to help the rest of this massive script, not to help in custom builders – then “initialisation” [Europe]. Lots of “hooks” get run throughout this script – I have no idea what a hook is at this stage of my life. I see some shellcheck pragmas, which make me feel warm and fuzzy.
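For anyone who hasn't met an EXIT trap before, the general pattern looks something like this. It's a sketch of the shell feature, not an excerpt from setup; report_exit and run_build are names I made up.

```shell
#!/usr/bin/env bash
# Generic EXIT-trap pattern; not copied from $stdenv/setup.
# report_exit and run_build are hypothetical names.
report_exit() {
    local status=$1
    if [ "$status" -eq 0 ]; then
        echo "build succeeded"
    else
        echo "build failed with exit code $status"
    fi
}

# The parentheses make the function body a subshell, so the trap fires
# when that subshell exits and does not leak into the calling shell.
run_build() (
    trap 'report_exit $?' EXIT
    "$@"
)

ok_msg=$(run_build true)
fail_msg=$(run_build false) || true

echo "$ok_msg"
echo "$fail_msg"
```

The nice property is that the report runs however the script ends, so individual build steps don't need their own error messages.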

Shoutout to this function:

# Mutually-recursively find all build inputs. See the dependency section of the
# stdenv chapter of the Nixpkgs manual for the specification this algorithm
# implements.
findInputs() {
    ...
}

Have you ever written a shell script with mutually recursive functions? Are you going to die before you’ve ever truly lived?

I zone out around line 600. Ah, but then we get to “textual substitution functions.” Exciting. And lastly a section called “What follows is the generic builder,” which is a terrible section name. And wow: lotta stuff in the generic builder. Like, over 500 lines of stuff. I do not attempt to understand it.

So, okay. I was mostly curious to know “what Nix-specific functions do I have when writing my builder script,” and I don’t really have an answer to that. There were a lot of functions declared in there, but I don’t know that any of them are meant for me. They seem like… I don’t know what they seem like.
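One generic bash trick that might help with the question of what a sourced script makes available is to compare the function list before and after sourcing it. A sketch, with a tiny stand-in file in place of the real setup; helperOne and helperTwo are hypothetical names, not functions I found in there.

```shell
#!/usr/bin/env bash
# Stand-in for $stdenv/setup: a temp file defining two made-up functions.
fake_setup=$(mktemp)
cat > "$fake_setup" <<'EOF'
helperOne() { :; }
helperTwo() { :; }
EOF

# `declare -F` prints lines like "declare -f name"; keep just the name.
before=$(declare -F | awk '{ print $3 }' | sort)
source "$fake_setup"
after=$(declare -F | awk '{ print $3 }' | sort)

# Names that exist after sourcing but did not exist before.
new_functions=$(comm -13 <(printf '%s\n' "$before") <(printf '%s\n' "$after"))
echo "$new_functions"

rm -f "$fake_setup"
```

I haven't tried this against the real script, which presumably expects to run inside a build, and even a full list wouldn't say which of those functions are meant for me.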

So I’m still a little bit in the dark about why I’d want to source setup. And sure, the exit trap thing is nice, but is it necessary? Is any of this necessary, or is it supposed to be done for my pleasure? Why isn’t it just added automatically by whatever invokes the script? I come away with no answers.

Let’s return to the builder script.

source $stdenv/setup

PATH=$perl/bin:$PATH

tar xvfz $src
cd hello-*
./configure --prefix=$out
make
make install

The whole rest of it is, you know, pretty straightforward. In addition to $stdenv, we have $perl, $src, and $out. $perl and $src were attributes in our input to mkDerivation. But presumably those weren’t strings in our Nix expression – I have no idea what the type of the fetchurl function call is, or really how to think about it right now, but I assume it didn’t download a file and return the file name, the way it would work in, you know, a “normal” language. Nix is a lazily evaluated functional programming language. Expressions can’t have side effects. Right? Right??

So I assume the function call’s result is some sort of value with instructions that Nix knows how to realize when it actually comes time to build, and that the string path that it downloads to is then passed to the builder script in an environment variable. I assume the same thing about perl – that the argument to our hello derivation-making function is not a string, but some sort of abstract representation of the package – perhaps this is what “derivation” means; I still don’t really have any intuition for that term.

The annotations for these lines don’t really seem to back these theories up. They say “The perl environment variable points to the location of the Perl package (since it was passed in as an attribute to the derivation).” But I don’t know if “it” in the parenthetical refers to “the location” or “the Perl package.” Ambiguous.

Similarly: “The src attribute was bound to the result of fetching the Hello source tarball from the network, so the src environment variable points to the location in the Nix store to which the tarball was downloaded.” When was the src attribute bound to the result of fetching the tarball? I am used to separating the determination of what to do from the place that actually does it – there must be some pithy term for that – as seen most obviously in Haskell’s IO type. You can have an IO Int, and you can pass it around, put it in a list, duplicate it, whatever you want: it’s just a recipe. Nothing happens until you ask the runtime to perform the effect.

But maybe the answer is that it actually is performing the side effect as soon as the src field is evaluated – which doesn’t happen until build time. Lazy evaluation means we don’t need an intermediate “type” to pass around. We can delay the network effect until we actually need to turn the result of fetchurl into a string, to pass it to the builder script.

This seems plausible. But I sort of expect, at some point, to be able to load expressions like this up in something like a repl, to inspect its attributes, and get a feel for what it really is. I would be a little surprised if evaluating a single field caused a file to be downloaded, unpacked, put in my /nix/store. That doesn’t mean that’s not how it works! It’s just not what I would expect, coming from general purpose programming languages. I am used to laziness and purity going hand in hand, because compiler optimizations. But maybe Nix has a deterministic evaluation model that makes writing side-effectful code safe? I don’t know.

Anyway. I think this would all be a lot easier to understand if the example had type annotations. Not like you need to change the Nix language or something to add them, but just, like, in a comment, you know? Types make everything so much easier to understand. So much less magical.

Right now I don’t even know what types exist in the Nix language. Strings, presumably. Apparently paths, if the builder = ./builder.sh; line is to be believed. Or are you allowed to have unquoted strings? I have no idea. It’s the first Nix I’ve seen in like five years.

Anyway. I realize I could google this, and find an answer in a few seconds, and then I’d know. But I’m not going to do that yet – I’m going to wait until the manual explains either how it works or teaches me enough so that I can look it up myself in Nix itself.

One interesting thing here, which I didn’t notice but which is called out in the manual, is that $out is apparently the final path in the Nix store where this package will live. It’s not some temporary output directory that gets copied into the store. It’s the actual ultimate destination. The manual explains that the “hash” of the package – the prefix of the basename of /nix/store/pakmb65sf3g2hkbm1fdgk2fh6hiij720-hello-2.10 – is produced as a hash of the inputs to the derivation, not the hash of the build artifacts, as I would have assumed.

This makes sense once I think about it: in a perfect world, both of these would produce stable hashes, and it wouldn’t matter if you looked at the inputs or the outputs. But the inputs are known immediately, whereas knowing the outputs requires you to actually build the thing, so picking the inputs allows you to request that exact package from a binary cache – you know in advance what the fully hashed name will be, without needing to specify it as a sort of checksum thing, so it makes the perfect ID.

Of course, in the real world, I also imagine you would have a lot of insignificant variations in the output that would make the hashes very unstable across multiple builds and multiple machines. Trying to package pre-existing software in Nix that does not care deeply about having perfectly reproducible builds is probably going to result in a lot of packages with not-perfectly-reproducible builds.
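The “you know the name before you build” idea can be shown with a toy. This is emphatically not Nix's real hashing scheme, which I haven't read about yet: toy_id is a made-up function, the perl version strings are invented, and cksum is just a conveniently available checksum.

```shell
#!/usr/bin/env bash
# Toy illustration only. Real store hashes are computed differently;
# toy_id and the input strings below are hypothetical.
toy_id() {
    # Checksum of a textual description of the inputs; nothing is built.
    printf '%s' "$*" | cksum | cut -d' ' -f1
}

id_a=$(toy_id "name=hello-2.1.1" "builder=builder.sh" "perl=5.30")
id_b=$(toy_id "name=hello-2.1.1" "builder=builder.sh" "perl=5.30")
id_c=$(toy_id "name=hello-2.1.1" "builder=builder.sh" "perl=5.32")

echo "same inputs:       $id_a $id_b"
echo "changed one input: $id_c"
```

The same inputs always give the same identifier, and changing any input gives a different one, all without running a build. That is the property that would let me ask a cache for a package by name alone.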

Tangent: it’s annoying that hashes appear at the beginning of the paths in the Nix store, instead of at the end. You can’t just type cd /nix/store/hello <TAB> and autocomplete the full path. I mean, I can do that, because my shell autocomplete looks at the whole filename if it can’t find a prefix match. I don’t know if this is some option I enabled at some point or if it’s just a Neat Thing that zsh does out of the box, but this is definitely not the default behavior in bash, and I assume that like 80% of developers just use default bash. So I imagine it would be annoying for them.

I guess there are tradeoffs either way, but this makes the store feel less like a directory that I can just dive into and explore and more like a database that I have to interact with using special tools. Maybe that’s good? I dunno. I don’t really know what those tools are yet.
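On the tab-completion gripe: for shells that only complete on prefixes, a glob gets most of the way there. A sketch against a throwaway directory standing in for /nix/store, so it runs anywhere; the two directory names are the ones that appear earlier in this post.

```shell
#!/usr/bin/env bash
# A temp directory stands in for /nix/store.
store=$(mktemp -d)
mkdir "$store/pakmb65sf3g2hkbm1fdgk2fh6hiij720-hello-2.10"
mkdir "$store/sp860v02a8j8s7vqhjrc94hgsklqdvdw-stdenv-darwin"

# Match on the part after the hash instead of typing the hash.
matches=( "$store"/*-hello-* )
found=${matches[0]##*/}
echo "$found"

rm -r "$store"
```

Against the real store the same pattern would be something like ls -d /nix/store/*-hello-*, with the caveat that it can match more than one path.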

Anyway. Tangent over.

It’s interesting that we are writing directly to the final $out path. I assume this is because, you know, a lot of software is more than just some binary: it might be some binary that contains an absolute path to some library that is also built by the same makefile, so you couldn’t just build it in a temporary location and then mv it into its final resting place unless you somehow rewrote the paths in the binary – and that would be crazy. I mean, as someone who is pretty used to writing simple statically-linked executables, it seems sort of icky to have software that “must” live in one place. But it is all around us, and Nix must deal with it.

But it makes me think about what happens when a build fails: do I end up with bogus temporary build artifacts in my store? Presumably the $out path is deleted, but is this what the “valid paths” thing from the glossary was talking about? What if I’m doing a rebuild of a package I already have installed? Probably the $out directory is removed before the build starts, and re-created from scratch each time. Seems like the safest way, so I’ll assume that’s what happens. But it makes sense now how there might be paths in the Nix store that are either temporarily (while building) or permanently (in the case that cleanup failed) “invalid.” And you can’t track validity in the /nix/store/whatever directory itself – how would you ensure no collisions with build artifacts? – so it makes sense that the “database” remembers that.

Okay. Section over.

We learned a bit, but we did not learn how to actually use any of this. I would like to follow along at home and actually try building this derivation – but I have no idea how. I saw some Nix and some shell, in isolation, with no idea where they go or how to use them.

Sad. Hopefully we’ll get to that soon – but I think this is as good a time as any for a break.

What does the function mkDerivation actually do? What does it “return”? What side effects does it have?
Why does Nix call maps “sets”?
Are paths a first-class type in Nix?
Can I put a space in a Nix package name?
How do “special” paths like stdenv-darwin wind up in the store / stay in the store? Are those part of the nix package?
What is a hook?
Do I have to source $stdenv/setup? If so, why?
When are the side effects of fetchurl actually executed?
Why do hashes appear at the beginning of store paths instead of the end?

How to Learn Nix, Part 10: My first derivation
