Okay. I remember stdenv. I remember it adding a lot of nice things to my PATH in a builder script, although I couldn’t tell you what exactly it included. I remember that it absolutely insists that you have an output named out or dev, and if you don’t (because, say, the example in the manual didn’t) you’ll get a horrible error message from deep in the bowels of some shell script.

Let’s see if we can get more specific.

Chapter 6. The Standard Environment

Okay. So stdenv.mkDerivation instead of the built-in derivation, yes. I remember this doing something that I would consider sort of broken with the args value.

The minimum we need to pass is src and name, but the manual teaches us that pname and version are preferred, which is something I can heartily agree with.

Many packages have dependencies that are not provided in the standard environment. It’s usually sufficient to specify those dependencies in the buildInputs attribute:

stdenv.mkDerivation {
  name = "libfoo-1.2.3";
  ...
  buildInputs = [libbar perl ncurses];
}

This attribute ensures that the bin subdirectories of these packages appear in the PATH environment variable during the build, that their include subdirectories are searched by the C compiler, and so on.

Okay – the first time we saw this, way back in the Nix manual, the example passed perl as an attribute, referred to it as $perl in the builder script, and actually manually consed it onto the PATH. This seems like a strictly nicer way to do that.

Next we learn a little bit about how the “default builder” works. It seems that it’s a giant shell script, but you’re able to insert code into the shell script at a few key points – so you can customize the way that code is built (by default make) without customizing how it is installed (by default make install). Alright. This section is only an introduction, and tells us that these things can be done, but leaves the details up to future sections of the manual. I think the example is pretty illustrative:

stdenv.mkDerivation {
  name = "fnord-4.5";
  ...
  buildPhase = ''
    gcc foo.c -o foo
  '';
  installPhase = ''
    mkdir -p $out/bin
    cp foo $out/bin
  '';
}

I like this structure a lot: give me a high-level, breadth-first overview, but make sure to tell me that we’ll come back to it in more detail so that I don’t go off on my own trying to figure out all the bits and bobs myself.

Then we see a better way that we can customize the defaults – which it calls the “generic builder:”

source $stdenv/setup

buildPhase() {
  echo "... this is my custom build phase ..."
  gcc foo.c -o foo
}

installPhase() {
  mkdir -p $out/bin
  cp foo $out/bin
}

genericBuild

That seems much nicer in the case that you’re doing something nontrivial – at the very least I could run shellcheck on that file, which would be much more tricky to do with embedded string literals in Nix. But it’s probably good to have the string option too, in case you’re just doing something trivial.

Anyway, the manual tells us one very important thing that source "${stdenv}/setup" does for us: it processes buildInputs and puts them on our PATH. So that’s nice.
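Here’s a minimal sketch of what I imagine that PATH step looks like. This is not the real code from setup: the function name add_inputs_to_path and the fake inputs are made up for illustration, and the only behavior I’m borrowing from the manual is “the bin subdirectories of these packages appear in PATH.”

```shell
# Hypothetical sketch: prepend the bin/ directory of each build input to PATH.
# The real logic lives in $stdenv/setup; this only imitates the visible effect.
add_inputs_to_path() {
    local dep
    for dep in "$@"; do
        # Skip inputs that ship no executables (for example, a headers-only package).
        if [ -d "$dep/bin" ]; then
            PATH="$dep/bin${PATH:+:$PATH}"
        fi
    done
    export PATH
}

# Demo with two fake "store paths": one with a bin/ directory, one without.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/perl/bin" "$demo_root/libbar/include"
add_inputs_to_path "$demo_root/perl" "$demo_root/libbar"
echo "$PATH"
```

The fake perl ends up on PATH and the fake libbar does not, since it has no bin directory to add.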

Aha! The next section tells us everything that is available in the standard environment. It’s roughly what I would expect: we’ve got coreutils, findutils, diffutils. We have bash, but no sh – thank goodness. So we have a baseline usable system. We have sed and awk and grep, because we aren’t animals. Then tar, gzip, bzip2, and xz – we probably need to unpack a source archive. Then we have gcc and make, so that we can do something with the sources. Minor outlier is patch, which I just assumed was a part of diffutils, but I guess it isn’t.

We also have patchelf, but only on Linux. Is that really… is that really common enough to be included in the standard environment? As stated many times, I do not use Linux on a daily basis. Is that… something that is a regular part of Linux users' workflows?

Okay, now let’s check if all of this is true.

nixpkgs has a file called common-path.nix:

{pkgs}: [
  pkgs.coreutils
  pkgs.findutils
  pkgs.diffutils
  pkgs.gnused
  pkgs.gnugrep
  pkgs.gawk
  pkgs.gnutar
  pkgs.gzip
  pkgs.bzip2.bin
  pkgs.gnumake
  pkgs.bash
  pkgs.patch
  pkgs.xz.bin
]

Yep. They ain’t lyin'. Cool. I imagine this doesn’t change too much.

As described in the Nix manual, almost any *.drv store path in a derivation’s attribute set will induce a dependency on that derivation. mkDerivation, however, takes a few attributes intended to, between them, include all the dependencies of a package.

I most certainly do not remember the Nix manual describing this. I don’t really remember it saying anything about .drv files. But it has been a while. It’s possible that I’ve forgotten. I remember it describing that, like, you need to explicitly specify build inputs. But it never described the details or the mechanism behind that, that I can recall.

But gosh, it looks like all that’s about to change.

Quite a bit to unpack here:

Dependencies can be broken down along three axes: their host and target platforms relative to the new derivation’s, and whether they are propagated. The platform distinctions are motivated by cross compilation … [but] even if one is not cross compiling, the platforms imply whether or not the dependency is needed at run-time or build-time, a concept that makes perfect sense outside of cross compilation.

Okay. There’s a footnote, also:

The build platform is ignored because it is a mere implementation detail of the package satisfying the dependency: As a general programming principle, dependencies are always specified as interfaces, not concrete implementation.

Uhhhh okay. I don’t know what that means. I mean, I know what interfaces/implementation separation means in general. But I don’t know why this implies the build platform is an implementation detail. I don’t know what “build platform” means, I guess.

I thought the platform thing made sense, on my first read, but after reading the next few sentences it clearly doesn’t. And I have no idea what “propagated” means. The paragraph continues:

By default, the run-time/build-time distinction is just a hint for mental clarity, but with strictDeps set it is mostly enforced even in the native case.

That’s surprising to me? I would think, you know, the runtime dependencies are the ones that get installed when I say “install this thing please.” Nix will download the thing I asked for from cache, and also download all of its runtime dependencies. Right? It doesn’t seem like a hint for mental clarity. It seems like a pretty fundamentally different thing, in the face of a binary cache.

I’m not sure what it means to “enforce” this. It doesn’t describe strictDeps immediately, but I’ll keep going.

The extension of PATH with dependencies, alluded to above, proceeds according to the relative platforms alone. The process is carried out only for dependencies whose host platform matches the new derivation’s build platform i.e. dependencies which run on the platform where the new derivation will be built. For each dependency dep of those dependencies, dep/bin, if present, is added to the PATH environment variable.

Okay. That’s a lot of words, but I think it’s saying something extremely simple: the stdenv builder thingy makes build-time dependencies available to the build script, not runtime dependencies.

The terminology is very confusing, though: here we see “host platform” and “build platform,” though the previous paragraph used the terms “host platform” and “target platform.” I assume “host platform” meant “build platform” and “target platform” meant “runtime platform,” but then this sentence doesn’t make sense, so I guess that’s wrong. Since obviously it should only be providing build dependencies that can actually run on the build platform.

The paragraph links to a later chapter that says it will define these terms. I think I need to read that and come back; I am really not following this description.

In Chapter 9, I find these definitions:

The “build platform” is the platform on which a package is built. Once someone has a built package, or pre-built binary package, the build platform should not matter and can be ignored.

Okay. I’m happy with that.

The “host platform” is the platform on which a package will be run. This is the simplest platform to understand, but also the one with the worst name.

Haha. Okay.

The “target platform” attribute is, unlike the other two attributes, not actually fundamental to the process of building software. Instead, it is only relevant for compatibility with building certain specific compilers and build tools. It can be safely ignored for all other packages.

Then there are three more paragraphs describing it in detail. But basically, you know how gcc can’t actually compile for a different architecture unless you recompile gcc itself? That’s always seemed crazy to me, that compilers aren’t just functions that take code and produce whatever target you want. I mean, some compilers are, obviously. It seems like a weird architectural mistake, though, right? Like, how is that a compile-time– anyway, we can stop talking about this. I have never written a compiler. I can’t really sass it in good conscience.

So it seems like Nix uses the term “target platform” to mean, like, “the platform that a given compiler can produce.” So I expect that there are versions of gcc for like:

  • host = darwin, target = darwin
  • host = darwin, target = linux
  • host = darwin, target = arm or something i dunno

So yeah, okay. Returning to that confusing paragraph…

The process is carried out only for dependencies whose host platform matches the new derivation’s build platform i.e. dependencies which run on the platform where the new derivation will be built. For each dependency dep of those dependencies, dep/bin, if present, is added to the PATH environment variable.

Okay. So it isn’t really making a runtime/build time distinction at all. It just says that dependencies that can run on the build platform will be in PATH. Simple. Let’s keep going.

The dependency is propagated when it forces some of its other-transitive (non-immediate) downstream dependencies to also take it on as an immediate dependency.

Is “other-transitive” a typo? Is that a term I should be familiar with? The parenthetical implies it means, like, “strictly transitive” (by analogy with “strict subset”)? Google doesn’t seem to know this term, but I think that’s what the manual is getting at.

But why does this… matter? What’s the difference between an immediate and a transitive dependency?

The next sentence is:

Nix itself already takes a package’s transitive dependencies into account, but this propagation ensures nixpkgs-specific infrastructure like setup hooks (mentioned above) also are run as if the propagated dependency.

I didn’t cut that off. That’s the whole sentence. And the end of the paragraph. It seems to just end in the middle of a sentence. Oh boy.

Okay, I don’t understand this. The term “setup hooks” did appear earlier in this section, but it was really only accompanied by a “we’ll talk about this soon” link to a later section. It seems like this is a way for a package to modify the build environments of things that depend on it (at build time)? I can maybe imagine this being useful but I am having trouble coming up with a plausible example right now.

Next paragraph:

It is important to note that dependencies are not necessarily propagated as the same sort of dependency that they were before, but rather as the corresponding sort so that the platform rules still line up.

What? By “sort of dependency,” does it mean as runtime or build time dependencies? How… I’m just confused at this point. This is not making sense to me.

The exact rules for dependency propagation can be given by assigning to each dependency two integers based one how its host and target platforms are offset from the depending derivation’s platforms. Those offsets are given below in the descriptions of each dependency list attribute. Algorithmically, we traverse propagated inputs, accumulating every propagated dependency’s propagated dependencies and adjusting them to account for the “shift in perspective” described by the current dependency’s platform offsets. This results in sort a transitive closure of the dependency relation, with the offsets being approximately summed when two dependency links are combined. We also prune transitive dependencies whose combined offsets go out-of-bounds, which can be viewed as a filter over that transitive closure removing dependencies that are blatantly absurd.

Uhhh uhhh help what is happening here. Even without the typo in the first sentence, this is mystifying. What are these offsets? I think of platforms as, like, distinct things. There’s no… hierarchy. There’s no ordering of different platforms.

We can define the process precisely with Natural Deduction using the inference rules. This probably seems a bit obtuse, but so is the bash code that actually implements it! They’re confusing in very different ways so… hopefully if something doesn’t make sense in one presentation, it will in the other!

Okay that paragraph took this from arcane and mystifying into a little bit fun and whimsical. I appreciate the human touch there. It even links to a Wikipedia article. I like that the author is acknowledging that this is hard, and that in fact it’s even harder than it sounds, because cross-compilation is complicated. A footnote points me to the code that implements this:

# Mutually-recursively find all build inputs. See the dependency section of the
# stdenv chapter of the Nixpkgs manual for the specification this algorithm
# implements.
findInputs() {

I remember that! I remember making a joke about this function all the way back in part 10. We’ve come full circle. At least for this one particular small nested circle that is part of a larger Kandinsky-esque kaleidoscope of shapes.

Anyway, I cannot imagine that I, in this moment, need to understand this, as I am neither cross-compiling anything nor even trying to author a Nix package. But. I’ll read the pseudocode that describes what’s happening.

Okay. I read the pseudocode. It’s, umm, yeah.
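For the record, here is my attempt at transcribing the arithmetic into bash, assuming I am reading the inference rule correctly. Every name here (map_offset, in_bounds, propagate) is mine, not the manual’s, and I’m assuming offsets of -1, 0 and 1 stand for the build, host and target platforms. It shows the mechanics, not the motivation.

```shell
# Hypothetical transcription of the propagation rule; all names are made up.
# Assumed encoding: -1 = build platform, 0 = host platform, 1 = target platform.

# map_offset H T I: shift offset I into the perspective of a dependency
# whose own offsets are (H, T).
map_offset() {
    local h=$1 t=$2 i=$3
    if (( i <= 0 )); then
        printf '%s\n' $(( i + h ))
    else
        printf '%s\n' $(( i + t - 1 ))
    fi
}

in_bounds() {
    (( $1 >= -1 && $1 <= 1 ))
}

# propagate H0 T0 H1 T1: A depends on B with offsets (H0, T0), and B
# propagates C with offsets (H1, T1). Prints the offsets C gets relative
# to A, or "pruned" if the combination goes out of bounds.
propagate() {
    local h0=$1 t0=$2 h1=$3 t1=$4
    if in_bounds $(( h0 + h1 )) && in_bounds $(( h0 + t1 )); then
        printf '%s %s\n' "$(map_offset "$h0" "$t0" "$h1")" "$(map_offset "$h0" "$t0" "$t1")"
    else
        printf 'pruned\n'
    fi
}

propagate -1 0 0 1     # prints "-1 0"
propagate -1 0 -1 0    # prints "pruned"
```

So if A pulls in B with offsets (-1, 0), and B propagates C with offsets (0, 1), then C shows up for A as (-1, 0). If B instead propagates C with offsets (-1, 0), the sum lands at -2 and the whole thing is dropped, which I take to be the “blatantly absurd” case.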

I really feel like I don’t understand the motivation here for “adjusting platforms” or how we’re treating platforms, like, as numbers. Why can I “add” platforms together? What does that even mean? I think my brain really needs a concrete example in order to make any sense of this: this section is describing the solution to a problem that I don’t really understand in the first place. How often are we doing platform arithmetic? In what situations do we have dependencies that propagate through a weird chain of multiple different platforms? I don’t know.

I suppose this comes up a lot if you’re, like, building a CI server intended to produce binaries for multiple platforms. But I’m thinking that this probably isn’t going to be a large part of my Nix experience. So I don’t feel too bad that I don’t understand it.

Mostly, though, the presentation is– I don’t think this section of the manual does a good job of explaining this to someone who does not already understand how it works. We just dove right into “yeah there are relative offsets and you have to follow these inference rules and they’re just integers.” At no point did we, like, get a concrete example to motivate it. The manual hasn’t even technically defined the relevant terms here, unless you jumped ahead to Chapter 9 as I did.

Anyway. Moving on.

Next up we get different variables that we can use to describe dependencies for the standard environment.

We’ve already seen buildInputs; it was straightforward. Here’s the formal treatment:

A list of dependencies whose host platform and target platform match the new derivation’s. This means a 0 host offset and a 1 target offset from the new derivation’s host platform. This would be called depsHostTarget but for historical continuity. If the dependency doesn’t care about the target platform (i.e. isn’t a compiler or similar tool), put it here, rather than in depsBuildBuild.

Okaaaay. I am very glad that this is not called depsHostTarget, regardless of historical continuity.

These are often programs and libraries used by the new derivation at run-time, but that isn’t always the case. For example, the machine code in a statically-linked library is only used at run-time, but the derivation containing the library is only needed at build-time. Even in the dynamic case, the library may also be needed at build-time to appease the linker.

The library case is an interesting example; I sort of think of “build-time dependencies” as things like gcc, and I would think of pcre as a runtime dependency, but of course, yeah, in the case of static linking pcre is sort of both. I guess that’s maybe why the waters are muddied a little here between runtime and build-time dependencies. Hmm. Okay. That makes this seem… less gratuitously confusing, I guess.

Let’s see… there are a bunch of other variables. Most of them have crazy names in the pattern of depsHostTarget that I can’t begin to parse.

But nativeBuildInputs sounds interesting.

If the dependency doesn’t care about the target platform (i.e. isn’t a compiler or similar tool), put it here, rather than in depsBuildBuild or depsBuildTarget. This could be called depsBuildHost but nativeBuildInputs is used for historical continuity.

And then there are a bunch of these weirdly constructed variables: depsBuildBuild, depsBuildTarget, depsHostHost, depsTargetTarget. And then also propagated versions of each of those. Every one has at least one paragraph of description.
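The names do seem to follow a scheme, at least: deps, then the dependency’s host platform, then its target platform. Here’s a small sketch that decodes a name into its two offsets. The helper names are made up, and the Build = -1, Host = 0, Target = 1 encoding is my assumption, extrapolated from the manual saying that buildInputs (which “would be called depsHostTarget”) has a 0 host offset and a 1 target offset.

```shell
# Hypothetical helpers for decoding dependency attribute names into offsets.
offset_of() {
    case "$1" in
        Build)  printf '%s\n' -1 ;;
        Host)   printf '%s\n' 0 ;;
        Target) printf '%s\n' 1 ;;
        *)      return 1 ;;
    esac
}

# dep_offsets NAME: prints "<host offset> <target offset>" for an attribute name.
dep_offsets() {
    local name=$1 h t
    # The two historical names stand in for the systematic ones.
    case "$name" in
        nativeBuildInputs) name=depsBuildHost ;;
        buildInputs)       name=depsHostTarget ;;
    esac
    [[ $name =~ ^deps(Build|Host|Target)(Build|Host|Target)$ ]] || return 1
    h=$(offset_of "${BASH_REMATCH[1]}")
    t=$(offset_of "${BASH_REMATCH[2]}")
    # Assumption: only combinations with host offset <= target offset exist.
    (( h <= t )) || return 1
    printf '%s %s\n' "$h" "$t"
}

for attr in depsBuildBuild nativeBuildInputs depsBuildTarget \
            depsHostHost buildInputs depsTargetTarget; do
    printf '%-18s %s\n' "$attr" "$(dep_offsets "$attr")"
done
```

Under that encoding, depsBuildTarget comes out as -1 and 1, the one pair that skips a number.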

Here’s an excerpt from depsBuildTarget:

This is a somewhat confusing concept to wrap one’s head around, and for good reason. As the only dependency type where the platform offsets are not adjacent integers, it requires thinking of a bootstrapping stage two away from the current one. It and its use-case go hand in hand and are both considered poor form: try to not need this sort of dependency, and try to avoid building standard libraries and runtimes in the same derivation as the compiler produces code using them. Instead strive to build those like a normal library, using the newly-built compiler just as a normal library would. In short, do not use this attribute unless you are packaging a compiler and are sure it is needed.

Okay, I don’t– I’m definitely not going to understand all of these. I will come back to this as soon as I start doing cross-compilation, I guess. In the meantime… I guess I’ll stick to nativeBuildInputs? Most of the time?

Next up we learn some attributes that will affect the stdenv builder. It calls them attributes, but I assume these are actually environment variables that affect the stdenv builder, but we rely on the fact that attributes turn into environment variables.

NIX_DEBUG – does roughly what you’d expect. Set it to an integer verbosity level from 1 to 7, it seems. enableParallelBuilding, which may default to true or false depending on the tool (?). passthru:

This is an attribute set which can be filled with arbitrary values. For example:

passthru = {
  foo = "bar";
  baz = {
    value1 = 4;
    value2 = 5;
  };
}

Values inside it are not passed to the builder, so you can change them without triggering a rebuild. However, they can be accessed outside of a derivation directly, as if they were set inside a derivation itself, e.g. hello.baz.value1.

Okay, neat. One particular value is considered special by Nixpkgs convention: passthru.updateScript, which is used to automatically update Nixpkgs by maintainers/scripts/update.nix. Okay.

Thank goodness we don’t have to type those extra three characters.

Next up we learn about customizing the different “phases” of the default build – configure, build, install, etc. It turns out you can actually set your own phases, by specifying, well, the phases attribute/environment variable. It defaults to:

$prePhases unpackPhase patchPhase $preConfigurePhases configurePhase
$preBuildPhases buildPhase checkPhase $preInstallPhases installPhase
fixupPhase installCheckPhase $preDistPhases distPhase $postPhases

Okay, so some of those are lists, I guess, and some are single phases. So you can, like, append some preBuildPhases. Okay, sure. I have seen at least one higher-level package-making combinator that appeared to make it easy to define Python packages, so I can see this being useful. These all seem pretty straightforward.
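To convince myself that I understand the shape of this, here’s a toy version of a phase loop. It’s a sketch under my own assumptions, not the real genericBuild: run_phases is a made-up name, the phases only echo, and I’m guessing that a phase passed as a string attribute wins over the shell function of the same name.

```shell
# Hypothetical, heavily simplified imitation of a phase loop.
unpackPhase()  { echo "unpacking"; }
buildPhase()   { echo "building with make"; }
installPhase() { echo "installing with make install"; }

run_phases() {
    local phase
    for phase in $phases; do
        # If a variable named after the phase is set (as it would be when the
        # derivation passes buildPhase as a string), run its contents;
        # otherwise fall back to the shell function of the same name.
        eval "${!phase:-$phase}"
    done
}

phases="unpackPhase buildPhase installPhase"
# Override just one phase with a string, leaving the other defaults alone.
buildPhase='echo "custom build: gcc foo.c -o foo"'
run_phases
```

Only the build step changes; unpacking and installing still use the default functions, which is the “customize how it is built without customizing how it is installed” idea from earlier.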

It describes the default behavior for each of the singular phases (I assume the default behavior for the pre/post phases is nothing).

unpackPhase does what you’d expect with the src argument, depending on its extension (or with the srcs argument, if you’re so inclined). There are various attributes you can also pass to customize this, as well as preUnpack and postUnpack which are not phases, apparently, but “hooks.”

patchPhase reads patches from an attribute called – you guessed it – patches. I assume this is supposed to be a list of files, not literal patches as strings, but it doesn’t actually give an example. Maybe both work?

I look for concrete examples. A lot of them use fetchpatch, and take a path to an individual commit on GitHub? Alright; that’s useful to know I guess, for contributing to Nixpkgs. But if I just have a problem, I’ll probably add a packageOverrides to just reference a local patch. There are also some packages that check a patch file directly into Nixpkgs. I guess it’s personal preference. I’m also seeing a function called substituteAll? I find the definition, but it’s unhelpfully documented as:

# see the substituteAll in the nixpkgs documentation for usage and constaints

Okay. ⌘F tells me I’ll be getting to that soon.

I don’t see any examples of inlining a literal patch into the package config. So that’s good.

Next up is configure – I can set dontConfigure = true; to skip this, in case I am not writing a C project. Which I almost certainly am not. Lotsa flags for this one.

Then buildPhase, checkPhase, installPhase – all what you’d expect. Then fixupPhase, which is not obvious, but explained as:

  • It moves the man/, doc/ and info/ subdirectories of $out to share/.

I guess so that packages are more likely to follow a consistent directory structure?

  • It strips libraries and executables of debug information.

Okay – but there are flags to ask it not to, please. I assume this is the default to get more deterministic artifacts. It also teaches me that I can separateDebugInfo = true; and then set debug-file-directory ~/.nix-profile/lib/debug in my ~/.gdbinit in order to be able to debug things I install with nix-env. Neat.

  • On Linux, it applies the patchelf command to ELF executables and libraries to remove unused directories from the RPATH in order to prevent unnecessary runtime dependencies.

Aha. So that’s why we have patchelf.

  • It rewrites the interpreter paths of shell scripts to paths found in PATH. E.g., /usr/bin/perl will be rewritten to /nix/store/some-perl/bin/perl found in PATH.

Neat. Okay. Does it do that only for shebangs, or for any absolute path? Presumably only shebangs: the relevant attribute to not do this is called dontPatchShebangs. Also doing the other thing would be… hard. I feel like parsing arbitrary bash is a bit out of scope for Nix.
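A shebang-only rewrite is easy enough to sketch. This is my own toy version, assuming the rule is “take the interpreter’s base name and look it up in PATH”; patch_shebang_sketch and fakeperl are made-up names, and the real hook is surely more careful (I’m ignoring things like /usr/bin/env entirely).

```shell
# Hypothetical sketch of shebang patching; not the real implementation.
patch_shebang_sketch() {
    local file=$1 first_line rest interpreter args resolved
    IFS= read -r first_line < "$file" || true
    # Only the first line is inspected, and only if it is an absolute shebang.
    [[ $first_line == '#!/'* ]] || return 0
    rest=${first_line#'#!'}                 # e.g. "/usr/bin/perl -w"
    interpreter=${rest%%[[:space:]]*}       # e.g. "/usr/bin/perl"
    args=${rest#"$interpreter"}             # e.g. " -w"
    # Look the interpreter's base name up in PATH; leave the file alone if absent.
    resolved=$(type -P "${interpreter##*/}") || return 0
    {
        printf '#!%s%s\n' "$resolved" "$args"
        tail -n +2 "$file"
    } > "$file.patched"
    cat "$file.patched" > "$file"           # overwrite in place, keeping permissions
    rm -f "$file.patched"
}

# Demo: a fake interpreter that exists only in a temporary directory.
demo=$(mktemp -d)
mkdir -p "$demo/bin"
printf '#!/bin/sh\n' > "$demo/bin/fakeperl"
chmod +x "$demo/bin/fakeperl"
printf '#!/usr/bin/fakeperl -w\nprint "hi";\n' > "$demo/script"
( PATH="$demo/bin:$PATH"; patch_shebang_sketch "$demo/script" )
head -n 1 "$demo/script"
```

Nothing past the first line is ever parsed, which is why this stays tractable.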

Okay, that was an interesting phase. Next is installCheck, which is another boring one.

Then there’s the distPhase:

The distribution phase is intended to produce a source distribution of the package. The default distPhase first calls make dist, then it copies the resulting source tarballs to $out/tarballs/. This phase is only executed if the attribute doDist is set.

Okay! That’s it for phases. Next up, a description of some friendly shell functions we have available for our use in builder scripts:

makeWrapper allows me to create, well, a wrapper executable that runs the wrapped executable with different environment variables, different PATH, different argv[0], different command-line args… nothing about removing command-line args, though, so I don’t know if I could use this to make a zsh that is compatible with nix-shell. Maybe, though?
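To make “wrapper executable” concrete, here’s roughly what I picture one looking like. This is a hand-rolled sketch, not makeWrapper itself: make_wrapper_sketch is a made-up function that only handles setting a single environment variable, and the generated script’s layout is my guess.

```shell
# Hypothetical wrapper generator: sets one environment variable, then execs
# the original program with whatever arguments the wrapper was given.
make_wrapper_sketch() {
    local original=$1 wrapper=$2 var=$3 value=$4
    {
        printf '#!/usr/bin/env bash\n'
        printf 'export %s=%q\n' "$var" "$value"
        printf 'exec %q "$@"\n' "$original"
    } > "$wrapper"
    chmod +x "$wrapper"
}

# Demo: wrap a tiny script that greets using $GREETING.
demo=$(mktemp -d)
printf '#!/usr/bin/env bash\necho "$GREETING, $*"\n' > "$demo/hello"
chmod +x "$demo/hello"
make_wrapper_sketch "$demo/hello" "$demo/hello-wrapped" GREETING "hi there"
"$demo/hello-wrapped" world
```

A wrapper shaped like this passes "$@" straight through, so dropping an argument would mean inspecting the argument list before the exec.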

substitute is… it’s basically just sed, but with nice syntax for expanding variables, and seems to only do literal replacements.

substitute ./foo.in ./foo.out \
    --replace /usr/bin/bar $bar/bin/bar \
    --replace "a string containing spaces" "some other text" \
    --subst-var someVar

substitute is implemented using the replace command. Unlike with the sed command, you don’t have to worry about escaping special characters.

I’ve never heard of replace before. It’s not included in macOS. It’s not a part of coreutils, either. How does it call replace, then?

It doesn’t. That’s a lie – it seems to entirely use shell replacement patterns. There is a nixpkgs.replace, but it doesn’t appear to have man pages, so I can’t really tell you anything.

This makes me think this is stale, and that the next line is also stale:

It supports performing substitutions on binary files (such as executables), though there you’ll probably want to make sure that the replacement string is as long as the replaced string.

I very much doubt that, if it’s using shell substitution commands. But I’m not gonna try it.
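For what it’s worth, the shell-replacement approach is easy to imitate. Here’s a miniature version, assuming the implementation boils down to bash’s ${var//pattern/replacement} with the pattern quoted so that it matches literally. substitute_sketch is my own name and only understands the two flags from the example above.

```shell
# Hypothetical miniature of substitute, built on bash parameter expansion.
substitute_sketch() {
    local input=$1 output=$2 content pattern replacement var
    shift 2
    content=$(< "$input")    # note: $(...) drops trailing newlines
    while (( $# > 0 )); do
        case "$1" in
            --replace)
                pattern=$2
                replacement=$3
                # Quoting the pattern makes glob characters match literally.
                content=${content//"$pattern"/"$replacement"}
                shift 3 ;;
            --subst-var)
                var=$2
                # Replace @var@ with the current value of the variable named var.
                content=${content//"@$var@"/"${!var}"}
                shift 2 ;;
            *)
                echo "unknown option: $1" >&2
                return 1 ;;
        esac
    done
    printf '%s\n' "$content" > "$output"
}

demo=$(mktemp -d)
printf 'call /usr/bin/bar * @someVar@\n' > "$demo/foo.in"
someVar=42
substitute_sketch "$demo/foo.in" "$demo/foo.out" \
    --replace /usr/bin/bar /opt/bar \
    --replace '*' star \
    --subst-var someVar
cat "$demo/foo.out"    # prints "call /opt/bar star 42"
```

The whole file gets read into a shell variable here, which is part of why I’m skeptical about the binary-file claim.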

Next up we have substituteAll – which we saw in the patches thing.

It seems that the Nix function substituteAll produces a derivation whose builder script calls the shell function substituteAll. And then also does eval "$preInstall" and eval "$postInstall"? I don’t know what that’s about. Why is that a part of this?

Oh, right, because this is the preInstall/postInstall for this little anonymous derivation, not for the derivation using substituteAll to patch its sources. Right. Right. Okay. But it doesn’t define a preInstall/postInstall attribute? So why… is this because of that weird “propagated” thing that I don’t really understand? Why is this necessary?

I don’t know. Let’s move on.

substituteAllInPlace; easy. stripHash:

# prints coreutils-8.24
stripHash "/nix/store/9s9r019176g7cvn2nvcw41gsp862y6b4-coreutils-8.24"

Useful.
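I’d guess this is not much more than some string slicing. A sketch, assuming store paths always look like a 32-character hash, a dash, and then the name (strip_hash_sketch is a made-up stand-in, not the real function):

```shell
# Hypothetical stand-in for stripHash.
strip_hash_sketch() {
    local name=${1##*/}            # drop the directory part
    # Assumes the file name looks like <32-character hash>-<name>.
    if [[ $name =~ ^[a-z0-9]{32}- ]]; then
        name=${name:33}            # skip the hash and the dash after it
    fi
    printf '%s\n' "$name"
}

# prints coreutils-8.24
strip_hash_sketch "/nix/store/9s9r019176g7cvn2nvcw41gsp862y6b4-coreutils-8.24"
```

Anything that doesn’t start with a hash-looking prefix is passed through unchanged.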

And lastly wrapProgram:

Convenience function for makeWrapper that automatically creates a sane wrapper file. It takes all the same arguments as makeWrapper, except for --argv0.

Okay! That’s what our builders look like.

6.7. Package setup hooks

Now we arrive at a very long section. There are many paragraphs of English prose uninterrupted by code snippets or examples.

But they’re very well-written paragraphs, and they clearly explain something that was confusing to me. And they do so humbly, describing the downsides of this approach and how it’s kind of gross and hacky but still extremely useful.

So basically: when you depend on something, you aren’t just depending on it; you aren’t just putting it in your /nix/store. You’re also giving that package the chance to run arbitrary shell code during your build phase – basically, you’re giving every dependency a chance to configure itself before you turn around and use it.
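In shell terms, I picture something like the sketch below. The function name source_setup_hooks is mine, the dependency is fake, and I’m assuming each dependency keeps its hook at nix-support/setup-hook inside its own output.

```shell
# Hypothetical sketch: give every dependency a chance to run its own shell code.
# Assumes a dependency's hook, if it has one, lives at nix-support/setup-hook.
source_setup_hooks() {
    local dep
    for dep in "$@"; do
        if [ -f "$dep/nix-support/setup-hook" ]; then
            source "$dep/nix-support/setup-hook"
        fi
    done
}

# Demo: a fake dependency whose hook exports a variable into "our" build.
demo=$(mktemp -d)
mkdir -p "$demo/faketool/nix-support" "$demo/hookless"
printf 'export FAKETOOL_HOME=%q\n' "$demo/faketool" > "$demo/faketool/nix-support/setup-hook"
source_setup_hooks "$demo/faketool" "$demo/hookless"
echo "$FAKETOOL_HOME"
```

The hook runs inside the depending package’s build shell, so whatever it exports or defines is simply there by the time the build phases start.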

So I’m on board with that. Then it goes back to describing “platforms” as integers and how you need to use $hostOffset or $targetOffset in order to make sure you’re adding hooks to the right downstream packages or… I don’t know. Then it loses me. My takeaway from this is that I should add these hooks by saying:

addEnvHooks "$hostOffset" myBashFunction

And try not to worry about what “$hostOffset” means.

Then it describes specific hooks that “are run for every package built using stdenv.mkDerivation,” but describes things it’s already talked about. move-docs.sh, which we remember from fixupPhase:

This setup hook moves any installed documentation to the /share subdirectory directory. This includes the man, doc and info directories. This is needed for legacy programs that do not know how to use the share subdirectory.

Wha? When it says “the /share subdirectory directory,” I assume it means “the share subdirectory?” Like /nix/store/xxx-out/share? Is that a typo or some strange convention with which I am unfamiliar?

Then strip.sh, and patch-shebangs.sh… I guess we’re just seeing how these behaviors are implemented? But I really don’t see how this relates to hooks – how this relates to dependencies configuring themselves. It seems like it’s just describing the stdenv build details.

It’s weird that it’s describing these as .sh files, when the manual presented these as “bash functions.” Maybe they define the bash functions? Or maybe there’s some dynamism here so you can give it a function or a script? I don’t know.

This describes the “CC Wrapper” a bit – I sort of encountered that, unwillingly, when I was trying to build Nix from source. It seems to exist to find dependencies. I dunno. I’m sort of zoning out; this is a really long section that I really don’t expect to matter for a long time. I skim…

breakpointHook:

This hook will make a build pause instead of stopping when a failure happens. It prevents nix from cleaning up the build environment immediately and allows the user to attach to a build environment using the cntr command. Upon build error it will print instructions on how to use cntr, which can be used to enter the environment for debugging. Installing cntr and running the command will provide shell access to the build sandbox of failed build.

Never heard of cntr. It seems to be “A container debugging tool based on FUSE.” Neat. I can use it pretty easily:

nativeBuildInputs = [ breakpointHook ];

But, unfortunately, it’s Linux-specific. Are builds happening in containers? I haven’t really given much thought to the build environment isolation here.

Anyway, there are millions of built-in hooks. Millions. I skim to the end, without finding any others that excite me.

I don’t really get– these descriptions follow a section that says “hooks are weirdly like inherited transparently.” But then it describes a bunch that… aren’t? They’re just built into every package?

Oh, I missed this separator in my skimming:

Here are some more packages that provide a setup hook. Since the list of hooks is extensible, this is not an exhaustive list. The mechanism is only to be used as a last resort, so it might cover most uses.

Okay. So the examples here are illustrative:

Python: Adds the lib/${python.libPrefix}/site-packages subdirectory of each build input to the PYTHONPATH environment variable.

Okay. I’m not really a Python person, but this seems like a straightforward one. Let’s see if we can follow the definition.

$ cat ~/src/nixpkgs/pkgs/development/interpreters/python/setup-hook.sh
addPythonPath() {
    addToSearchPathWithCustomDelimiter : PYTHONPATH $1/@sitePackages@
}

toPythonPath() {
    local paths="$1"
    local result=
    for i in $paths; do
        p="$i/@sitePackages@"
        result="${result}${result:+:}$p"
    done
    echo $result
}

if [ -z "${dontAddPythonPath:-}" ]; then
    addEnvHooks "$hostOffset" addPythonPath
fi

# Determinism: The interpreter is patched to write null timestamps when compiling python files.
# This way python doesn't try to update them when we freeze timestamps in nix store.
export DETERMINISTIC_BUILD=1;
# Determinism: We fix the hashes of str, bytes and datetime objects.
export PYTHONHASHSEED=0;
# Determinism. Whenever Python is included, it should not check user site-packages.
# This option is only relevant when the sandbox is disabled.
export PYTHONNOUSERSITE=1;

Also note the @sitePackages@ placeholder, which will presumably be replaced by a substituteAll call somewhere.
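The ${result}${result:+:}$p line in toPythonPath is a handy idiom worth isolating. Here’s a standalone sketch of it; join_site_packages is a made-up name, and lib/site-packages is just a stand-in for whatever @sitePackages@ turns into.

```shell
# Hypothetical standalone version of the colon-joining idiom.
join_site_packages() {
    local result= dir
    for dir in "$@"; do
        # ${result:+:} expands to ":" only when result is already non-empty,
        # so there is no leading colon before the first entry.
        result="${result}${result:+:}$dir/lib/site-packages"
    done
    printf '%s\n' "$result"
}

# prints /a/lib/site-packages:/b/lib/site-packages
join_site_packages /a /b
```

With a single argument there is no colon at all, which is the whole point of the :+ expansion.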

But, okay… where is this used? I see that it calls addEnvHooks. So presumably this script is going to be eval’d by any script that depends on python (or any of the other packages defined here). But how does that happen?

Is there a setupHook = line, somewhere? I can’t find it, but there is a lot of Nix code here. This might be easier to observe in motion:

nix-repl> pkgs.python // { type = "divination"; }
{
  C_INCLUDE_PATH = "/nix/store/v5vd9k28hsmj826nr0sx48ajyh7wfqf1-Libsystem-1238.60.2/include:/nix/store/czpiyz53jz1lznag71s5q25ajjqls3ad-bzip2-1.0.6.0.1-dev/include:/nix/store/l8gj8w7ma7ykv4wjdcrm27cxlrz8jfpj-openssl-1.1.1j-dev/include:/nix/store/78w7q3bfqk7v1m6rmyqjvz9mcxmqnrqf-zlib-1.2.11-dev/include:/nix/store/2wqdl2a5wx0jaqf1jrizpkncw91g55p3-db-5.3.28-dev/include:/nix/store/6i66pc23pcp8vvi5aqx2grwmg6505jds-gdbm-1.19/include:/nix/store/vgma7r066pldsla4fi1jwzl3rz88b8f1-ncurses-6.2-dev/include:/nix/store/wsgwk0dmqgw5dabaq05j9kspwc53v2j2-sqlite-3.34.1-dev/include:/nix/store/l9i9j3kjhnrzy2b0wpzw0hjkkm3lklv9-readline-6.3p08-dev/include:/nix/store/h0x6wir3pz5kr27saj5s0axfmffwadv2-configd-453.19/include";
  DETERMINISTIC_BUILD = 1;
  LDFLAGS = "";
  LIBRARY_PATH = "/nix/store/v5vd9k28hsmj826nr0sx48ajyh7wfqf1-Libsystem-1238.60.2/lib:/nix/store/yagfwm4fdnb7izby3qwbdbi3klrhc6cf-bzip2-1.0.6.0.1/lib:/nix/store/xxmdw2z5cm1rlcl8n9n91q7356542byw-openssl-1.1.1j/lib:/nix/store/rnxxqbgijxxnjfyqpcgmcajmi6hq19b2-zlib-1.2.11/lib:/nix/store/95lcb5clawyk93mad8snn99p5xrryk6w-db-5.3.28/lib:/nix/store/6i66pc23pcp8vvi5aqx2grwmg6505jds-gdbm-1.19/lib:/nix/store/gkmkh3nj9qck3kyqcrqd7lcqsv1ipn2c-ncurses-6.2/lib:/nix/store/98581sab5cs4q4x9a0cs6jqyyypbsx2f-sqlite-3.34.1/lib:/nix/store/yk2ry7hp5hzn9vxld87m4gx4b3j9x0bj-readline-6.3p08/lib:/nix/store/h0x6wir3pz5kr27saj5s0axfmffwadv2-configd-453.19/lib";
  NIX_CFLAGS_COMPILE = "-msse2";
  __darwinAllowLocalNetworking = false;
  __ignoreNulls = true;
  __impureHostDeps = [ ... ];
  __propagatedImpureHostDeps = [ ... ];
  __propagatedSandboxProfile = [ ... ];
  __sandboxProfile = "";
  all = [ ... ];
  args = [ ... ];
  buildEnv = «derivation /nix/store/gnk5p753k9iy9phr2jna02828k6cn5xh-python-2.7.18-env.drv»;
  buildInputs = [ ... ];
  builder = "/nix/store/l25gl3siwmq6gws4lqlyd1040xignvqw-bash-4.4-p23/bin/bash";
  configureFlags = [ ... ];
  depsBuildBuild = [ ... ];
  depsBuildBuildPropagated = [ ... ];
  depsBuildTarget = [ ... ];
  depsBuildTargetPropagated = [ ... ];
  depsHostHost = [ ... ];
  depsHostHostPropagated = [ ... ];
  depsTargetTarget = [ ... ];
  depsTargetTargetPropagated = [ ... ];
  doCheck = false;
  doInstallCheck = false;
  drvAttrs = { ... };
  drvPath = "/nix/store/y8nr9xn2k2k7710mj09ywsd2yb3fw66g-python-2.7.18.drv";
  enableParallelBuilding = true;
  enableParallelChecking = true;
  executable = "python2.7";
  hasDistutilsCxxPatch = true;
  implementation = "cpython";
  inputDerivation = «derivation /nix/store/gapzmqlazdh9aiqb4x1bbdrm5hcvs1gz-python-2.7.18.drv»;
  interpreter = "/nix/store/nbjiq2cvbjx8wjdn859ynma92s194mcx-python-2.7.18/bin/python2.7";
  isPy2 = true;
  isPy27 = true;
  isPy3 = false;
  isPy310 = false;
  isPy35 = false;
  isPy36 = false;
  isPy37 = false;
  isPy38 = false;
  isPy39 = false;
  isPy3k = false;
  isPyPy = false;
  libPrefix = "python2.7";
  meta = { ... };
  name = "python-2.7.18";
  nativeBuildInputs = [ ... ];
  out = «derivation /nix/store/y8nr9xn2k2k7710mj09ywsd2yb3fw66g-python-2.7.18.drv»;
  outPath = "/nix/store/nbjiq2cvbjx8wjdn859ynma92s194mcx-python-2.7.18";
  outputName = "out";
  outputUnspecified = true;
  outputs = [ ... ];
  override = { ... };
  overrideAttrs = «lambda @ /nix/store/mi0xpwzl81c7dgpr09qd67knbc24xab5-nixpkgs-21.05pre274251.f5f6dc053b1/nixpkgs/lib/customisation.nix:85:73»;
  overrideDerivation = «lambda @ /nix/store/mi0xpwzl81c7dgpr09qd67knbc24xab5-nixpkgs-21.05pre274251.f5f6dc053b1/nixpkgs/lib/customisation.nix:84:32»;
  passthru = { ... };
  patches = [ ... ];
  pkgs = { ... };
  pname = "python";
  postFixup = "# Include a sitecustomize.py file. Note it causes an error when it's in postInstall with 2.7.\ncp /nix/store/kclys2xfrg0zjmpa37gyp33nyg1c7j0q-sitecustomize.py $out/lib/python2.7/site-packages/sitecustomize.py\n";
  postInstall = "# needed for some packages, especially packages that backport\n# functionality to 2.x from 3.x\nfor item in $out/lib/python2.7/test/*; do\n  if [[ \"$item\" != */test_support.py*\n     && \"$item\" != */test/support\n     && \"$item\" != */test/regrtest.py* ]]; then\n    rm -rf \"$item\"\n  else\n    echo $item\n  fi\ndone\ntouch $out/lib/python2.7/test/__init__.py\nln -s $out/lib/python2.7/pdb.py $out/bin/pdb\nln -s $out/lib/python2.7/pdb.py $out/bin/pdb2.7\nln -s $out/share/man/man1/{python2.7.1.gz,python.1.gz}\n\nrm \"$out\"/lib/python*/plat-*/regen # refers to glibc.dev\n\n# Determinism: Windows installers were not deterministic.\n# We're also not interested in building Windows installers.\nfind \"$out\" -name 'wininst*.exe' | xargs -r rm -f\n# Determinism: rebuild all bytecode\n# We exclude lib2to3 because that's Python 2 code which fails\n# We rebuild three times, once for each optimization level\nfind $out -name \"*.py\" | $out/bin/python -m compileall -q -f -x \"lib2to3\" -i -\nfind $out -name \"*.py\" | $out/bin/python -O -m compileall -q -f -x \"lib2to3\" -i -\nfind $out -name \"*.py\" | $out/bin/python -OO -m compileall -q -f -x \"lib2to3\" -i -\n";
  postPatch = "";
  preConfigure = "# Purity.\nfor i in /usr /sw /opt /pkg; do\n  substituteInPlace ./setup.py --replace $i /no-such-path\ndone\nfor i in Lib/plat-*/regen; do\n  substituteInPlace $i --replace /usr/include/ /nix/store/v5vd9k28hsmj826nr0sx48ajyh7wfqf1-Libsystem-1238.60.2/include/\ndone\nsubstituteInPlace configure --replace '`/usr/bin/arch`' '\"i386\"'\nsubstituteInPlace Lib/multiprocessing/__init__.py \\\n  --replace 'os.popen(comm)' 'os.popen(\"/nix/store/cpvjym13fdglv6zmdr5xav20g5rbafbx-coreutils-8.32/bin/nproc\")'\n";
  propagatedBuildInputs = [ ... ];
  propagatedNativeBuildInputs = [ ... ];
  pythonAtLeast = «lambda @ /nix/store/mi0xpwzl81c7dgpr09qd67knbc24xab5-nixpkgs-21.05pre274251.f5f6dc053b1/nixpkgs/lib/strings.nix:503:24»;
  pythonForBuild = «derivation /nix/store/y8nr9xn2k2k7710mj09ywsd2yb3fw66g-python-2.7.18.drv»;
  pythonOlder = «lambda @ /nix/store/mi0xpwzl81c7dgpr09qd67knbc24xab5-nixpkgs-21.05pre274251.f5f6dc053b1/nixpkgs/lib/strings.nix:491:22»;
  pythonVersion = "2.7";
  setupHook = «derivation /nix/store/n2l76vrvpa2fv89kflg5s1fk6x713x1g-python-setup-hook.sh.drv»;
  sitePackages = "lib/python2.7/site-packages";
  sourceVersion = { ... };
  src = «derivation /nix/store/jbv74pqxq95hcwif4mf19rapmq69a1bi-Python-2.7.18.tar.xz.drv»;
  stdenv = «derivation /nix/store/v1fmdbwdgqds6b4icqzyin0anag03dz3-stdenv-darwin.drv»;
  strictDeps = false;
  system = "x86_64-darwin";
  tests = { ... };
  type = "divination";
  ucsEncoding = 4;
  userHook = null;
  version = "2.7.18";
  withPackages = «lambda @ /nix/store/mi0xpwzl81c7dgpr09qd67knbc24xab5-nixpkgs-21.05pre274251.f5f6dc053b1/nixpkgs/pkgs/development/interpreters/python/with-packages.nix:3:1»;
}

Okay – yes:

setupHook = «derivation /nix/store/n2l76vrvpa2fv89kflg5s1fk6x713x1g-python-setup-hook.sh.drv»;

So somewhere in this directory, I should be able to find a like substituteAll setup-hook.sh python-setup-hook.sh line, right?

Not quite, but close:

$ cat setup-hook.nix
{ runCommand }:

sitePackages:

let
  hook = ./setup-hook.sh;
in runCommand "python-setup-hook.sh" {
  inherit sitePackages;
} ''
  cp ${hook} hook.sh
  substituteAllInPlace hook.sh
  mv hook.sh $out
''

An import ./setup-hook.nix, then?

Nope. Hmm. It’s kind of a tangly mess.

cpython/default.nix (as well as pypy/default.nix and cpython/2.7/default.nix) each contain the line setupHook = python-setup-hook sitePackages;. So those are the actual leaf derivations, I assume. But they each take python-setup-hook as an argument, and I’m not sure… where that comes from, who calls them with it, or why it’s kebab-case.

All I can see is a callPackage, but that would imply python-setup-hook exists at the top-level, which–

nix-repl> pkgs.python-setup-hook
{ __functionArgs = { ... }; __functor = «lambda @ /nix/store/mi0xpwzl81c7dgpr09qd67knbc24xab5-nixpkgs-21.05pre274251.f5f6dc053b1/nixpkgs/lib/trivial.nix:324:19»; override = { ... }; }

Oh. Which it does. Okay. Weird:

$ grep -B 1 'python-setup-hook' ~/src/nixpkgs/pkgs/top-level/all-packages.nix
  # Should eventually be moved inside Python interpreters.
  python-setup-hook = callPackage ../development/interpreters/python/setup-hook.nix { };

Huh. Okay.

Well, anyway, we saw an example. How did I know I was looking for setupHook? Did I just guess that that would be the name of the attribute? I think I did. It is documented, all the way back in the fixupPhase documentation:

setupHook – A package can export a setup hook by setting this variable. The setup hook, if defined, is copied to $out/nix-support/setup-hook. Environment variables are then substituted in it using substituteAll.

Huh, okay. We didn’t see that, though – we saw a derivation that explicitly called substituteAllInPlace. Oh, but right, duh, environment variables. So the script says $hostOffset, but the result says…

$ cat /nix/store/3dc6fidfpk2z2n92vqnagswnzrw512ax-python-2.7.17/nix-support/setup-hook
addPythonPath() {
    addToSearchPathWithCustomDelimiter : PYTHONPATH $1/lib/python2.7/site-packages
}

toPythonPath() {
    local paths="$1"
    local result=
    for i in $paths; do
        p="$i/lib/python2.7/site-packages"
        result="${result}${result:+:}$p"
    done
    echo $result
}

addEnvHooks "$hostOffset" addPythonPath

# Determinism: The interpreter is patched to write null timestamps when compiling python files.
# This way python doesn't try to update them when we freeze timestamps in nix store.
export DETERMINISTIC_BUILD=1;
# Determinism: We fix the hashes of str, bytes and datetime objects.
export PYTHONHASHSEED=0;
# Determinism. Whenever Python is included, it should not check user site-packages.
# This option is only relevant when the sandbox is disabled.
export PYTHONNOUSERSITE=1;

Umm, still says $hostOffset. Huh. Only the @sitePackages@ placeholders are different. And presumably they are different because of this:

let
  hook = ./setup-hook.sh;
in runCommand "python-setup-hook.sh" {
  inherit sitePackages;
} ''
  cp ${hook} hook.sh
  substituteAllInPlace hook.sh
  mv hook.sh $out
''

And not because of any, you know, anything magical happening. Hmm.
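A rough way to see why: as far as I can tell, this style of substitution only touches @name@ placeholders and leaves shell-style $variables alone. The helper below is a hypothetical sed-based stand-in that handles a single name, not the real substituteAllInPlace from stdenv:

```shell
# Hypothetical stand-in: replace every @NAME@ on stdin with VALUE.
# Usage: substituteOne NAME VALUE
substituteOne() {
    sed "s|@$1@|$2|g"
}

# Two lines borrowed from setup-hook.sh, kept literal with single quotes.
template='addToSearchPathWithCustomDelimiter : PYTHONPATH $1/@sitePackages@
addEnvHooks "$hostOffset" addPythonPath'

printf '%s\n' "$template" | substituteOne sitePackages "lib/python2.7/site-packages"
```

The first line comes out with the real path in it, and the second line still says "$hostOffset", which matches what ended up in nix-support/setup-hook.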

6.8. Purity in Nixpkgs

[measures taken to prevent dependencies on packages outside the store, and what you can do to prevent them]

GCC doesn’t search in locations such as /usr/include. In fact, attempts to add such directories through the -I flag are filtered out. Likewise, the linker (from GNU binutils) doesn’t search in standard locations such as /usr/lib. Programs built on Linux are linked against a GNU C Library that likewise doesn’t search in the default system locations.

That’s the whole section! Thank goodness for the tl;dr at the top.
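The -I filtering is easy enough to picture, at least. This is a toy sketch of the idea and not the actual compiler wrapper: the function name and the hardcoded store prefix are my own assumptions, and it only looks at the combined -I/path form:

```shell
# Assumed store prefix for this sketch.
storeDir="/nix/store"

# Print the arguments one per line, dropping -I flags that point at
# absolute paths outside the store.
filterIncludeFlags() {
    local arg
    for arg in "$@"; do
        case "$arg" in
            -I"$storeDir"/*) printf '%s\n' "$arg" ;;  # inside the store: keep
            -I/*) ;;                                  # other absolute path: drop
            *) printf '%s\n' "$arg" ;;                # anything else: pass through
        esac
    done
}

# The store path here is made up.
filterIncludeFlags -I/usr/include -I/nix/store/abc-zlib-dev/include -O2
```

Only the store include path and -O2 survive; -I/usr/include quietly disappears.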

6.9. Hardening in Nixpkgs

This section seems to begin with the assumption that I know what hardening means. I do not. There are lots of things that that could mean to me. I think of, you know, cybersecurity. Then I think of Metapod. But… I assume this is some sort of reproducibility thing?

No! It actually is about security hardening, if you’re building C or C++ packages. It seems to default to enabling a bunch of nice warnings in gcc? And some actual protections. Neat! It has little examples of the types of compiler errors you should expect to see if these hardening flags break your build and need to be disabled. That’s nice! Thanks, Nix.

what did we learn

Okay. That’s the end of the chapter.

This was a long one, and sort of a… mixture of extremely useful practical information about how to actually write your own derivation in real life, and then a bunch of completely inscrutable cross-compilation and propagated dependency stuff.

Why was that confusing? Cross-compilation isn’t, like, that crazy, is it? We’re just compiling things for other things? Yeah, the gcc “target” thing is weird, I guess, but… it seems like if I think about how I would model this, maybe I will come up with something resembling Nix’s solution, and then it will make sense to me.

So something I recently did with cross-compilation was build QMK for a weird new keyboard I bought. I’ll try to use that as a concrete example, and see how far I get.

So the “package” – the ultimate thing I built – was a “binary” for… well, I don’t know what Nix would call the platform. I’ll call it avr, I guess.

Package      Host platform
keymap.hex   avr

But it has dependencies. For one thing, in order to make that, I needed a compiler that targeted avr:

Package      Host platform   Target platform
keymap.hex   avr
gcc          x86_64-darwin   avr

Wow, in doing that, I realized that if you ever ask nix repl to print out the top-level nixpkgs expression, it just… does. And there’s no way to quit. It doesn’t respond to SIGINT – you can’t ctrl-c out of it. I had to kill -9 it to get it to stop spewing errors about unfree packages (???). Weird. Unpleasant.

So okay, obviously there are more than just those things. gcc itself has lots of dependencies. But they sort of are all trivial now, right? There’s only one “level” of cross-compilation happening. Let’s assume gcc depends on make:

Package      Host platform   Target platform
keymap.hex   avr
gcc          x86_64-darwin   avr
make         x86_64-darwin   x86_64-darwin

I guess… hmm. This exercise isn’t useful. I don’t know how to print out an actual visualization of the real life qmk package. Or even if the real-life scenario is interesting or representative: after all, I’m just making an executable – it itself has no Nix-specified dependencies (as far as I know). And there is no Nix expression to define the final output – only a Nix expression to define the environment that I need in order to build the final output. There might not actually be any actual platform arithmetic here.

Darn.

Yeah, I guess I’m having trouble understanding why this isn’t just a simple recursive thing. Line up the target platform with the host platform. I feel like I really need an example of… why you’d ever need multiple levels of indirection here.

So say I’m trying to build a Linux executable. And I’m on my Mac. And the Linux executable has build-time and run-time dependencies.

Some of the build-time dependencies are, like, make. But some of the build-time dependencies are libraries that need to be statically linked. And then some of the runtime dependencies are libraries that are going to be dynamically linked. Let’s say:

  • Build-time dependencies: gcc, make, pcre
  • Runtime dependencies: libsqlite3

So those build-time dependencies are different types of dependencies. make is easy; it’s doing its own thing. host = darwin, target = darwin. gcc is host = darwin, target = linux. But pcre is host = linux (with no target). And libsqlite3 is also host = linux (no target).

So how do I say that to Nix? How do I express the difference between those “types” of dependencies? Presumably these are different combinations of depsBuildTarget and depsHostHost or whatever?

I don’t know. I don’t know how. I would think that I would, like, say explicitly what I need and when I need it. But it seems that Nix is going to great effort to make sure that I don’t need to be explicit, and it will just figure it out?

And apparently it involves arithmetic. Yeah, I don’t get it. I don’t know. I really need to work through a concrete example before I can hope to understand what’s happening here.

Which I will not do right now. If anyone out there is reading this and does understand all this, I would love to hear how you got there. Otherwise, I will have to wait for Chapter 9, which seems to dive more deeply into all of this (oh boy).
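In the meantime, here is a tentative sketch of how I read the offsets, applied to the Mac-to-Linux example above. It assumes the manual's table means -1 is the build platform, 0 is the host platform, and 1 is the target platform of the package being built, and that each dependency attribute is a (host offset, target offset) pair. The helper names platformAt and describe are made up:

```shell
# Platforms of the package being built: built on my Mac, runs on Linux.
build=darwin
host=linux
target=linux

# Map a platform offset (-1, 0 or 1) to a platform name.
platformAt() {
    case "$1" in
        -1) echo "$build" ;;
        0)  echo "$host" ;;
        1)  echo "$target" ;;
    esac
}

# Usage: describe ATTRIBUTE HOST_OFFSET TARGET_OFFSET
# Prints where a dependency listed under ATTRIBUTE runs and what it targets.
describe() {
    echo "$1: host=$(platformAt "$2") target=$(platformAt "$3")"
}

describe depsBuildBuild    -1 -1   # host=darwin, target=darwin
describe nativeBuildInputs -1  0   # host=darwin, target=linux
describe buildInputs        0  1   # host=linux,  target=linux
```

If that reading is right, gcc (host = darwin, target = linux) lines up with nativeBuildInputs, while pcre and libsqlite3 (host = linux) line up with buildInputs. make as I described it looks like depsBuildBuild, though since it has no meaningful target I suspect it normally just goes in nativeBuildInputs too.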


  • Does Nix install all build-time dependencies of a package?
  • What… on earth… is all of this platform arithmetic? What do these numbers mean? Why– what is happening here?
  • Why does the fixupPhase move doc directories around?
  • Why does the substituteAll builder need to call preInstall/postInstall?
  • Is it true that “Environment variables are then substituted in it using substituteAll”? Is Python doing that explicitly unnecessarily, or are the docs wrong?
  • Why does nix repl keep printing errors about unfree packages when I just ask it to evaluate the top-level pkgs expression?