Merge commit 'cfc75eec4603c06503ae750f88cf397e00796ea8'

Conflicts: nixpkgs/pkgs/build-support/rust/build-rust-package/default.nix
author: Alyssa Ross <hi@alyssa.is> 2023-10-20 22:09:03 +0000
committer: Alyssa Ross <hi@alyssa.is> 2023-10-20 22:09:03 +0000
commit: 50c21d167f7114fa1dbd95e5c4fb30eeb1a2d02e (patch)
tree: f2556b911180125ccbb7ed0e78a54e92da89adce /nixpkgs/lib/fileset
parent: 4c16d4548a98563c9d9ad76f4e5b2202864ccd54 (diff)
parent: cfc75eec4603c06503ae750f88cf397e00796ea8 (diff)
download: nixlib-50c21d167f7114fa1dbd95e5c4fb30eeb1a2d02e.tar
nixlib-50c21d167f7114fa1dbd95e5c4fb30eeb1a2d02e.tar.gz
nixlib-50c21d167f7114fa1dbd95e5c4fb30eeb1a2d02e.tar.bz2
nixlib-50c21d167f7114fa1dbd95e5c4fb30eeb1a2d02e.tar.lz
nixlib-50c21d167f7114fa1dbd95e5c4fb30eeb1a2d02e.tar.xz
nixlib-50c21d167f7114fa1dbd95e5c4fb30eeb1a2d02e.tar.zst
nixlib-50c21d167f7114fa1dbd95e5c4fb30eeb1a2d02e.zip
4 files changed, 910 insertions, 114 deletions
diff --git a/nixpkgs/lib/fileset/README.md b/nixpkgs/lib/fileset/README.md
index 6e57f1f8f2b4..ebe13f08fdef 100644
--- a/nixpkgs/lib/fileset/README.md
+++ b/nixpkgs/lib/fileset/README.md
@@ -1,5 +1,10 @@
 # File set library
 
+This is the internal contributor documentation.
+The user documentation is [in the Nixpkgs manual](https://nixos.org/manual/nixpkgs/unstable/#sec-fileset).
+
+## Goals
+
 The main goal of the file set library is to be able to select local files that should be added to the Nix store.
 It should have the following properties:
 - Easy:
@@ -41,12 +46,20 @@ An attribute set with these values:
 - `_type` (constant string `"fileset"`):
   Tag to indicate this value is a file set.
 
-- `_internalVersion` (constant `2`, the current version):
+- `_internalVersion` (constant `3`, the current version):
   Version of the representation.
 
+- `_internalIsEmptyWithoutBase` (bool):
+  Whether this file set is the empty file set without a base path.
+  If `true`, `_internalBase*` and `_internalTree` are not set.
+  This is the only way to represent an empty file set without needing a base path.
+
+  Such a value can be used as the identity element for `union` and the return value of `unions []` and co.
+
 - `_internalBase` (path):
   Any files outside of this path cannot influence the set of files.
-  This is always a directory.
+  This is always a directory and should be as long as possible.
+  This is used by `lib.fileset.toSource` to check that all files are under the `root` argument
 
 - `_internalBaseRoot` (path):
   The filesystem root of `_internalBase`, same as `(lib.path.splitRoot _internalBase).root`.
@@ -111,9 +124,57 @@ Arguments:
 - (+) This can be removed later, if we discover it's too restrictive
 - (-) It leads to errors when a sensible result could sometimes be returned, such as in the above example.
 
+### Empty file set without a base
+
+There is a special representation for an empty file set without a base path.
+This is used for return values that should be empty but when there's no base path that would makes sense.
+
+Arguments:
+- Alternative: This could also be represented using `_internalBase = /.` and `_internalTree = null`.
+  - (+) Removes the need for a special representation.
+  - (-) Due to [influence tracking](#influence-tracking),
+    `union empty ./.` would have `/.` as the base path,
+    which would then prevent `toSource { root = ./.; fileset = union empty ./.; }` from working,
+    which is not as one would expect.
+  - (-) With the assumption that there can be multiple filesystem roots (as established with the [path library](../path/README.md)),
+    this would have to cause an error with `union empty pathWithAnotherFilesystemRoot`,
+    which is not as one would expect.
+- Alternative: Do not have such a value and error when it would be needed as a return value
+  - (+) Removes the need for a special representation.
+  - (-) Leaves us with no identity element for `union` and no reasonable return value for `unions []`.
+    From a set theory perspective, which has a well-known notion of empty sets, this is unintuitive.
+
+### No intersection for lists
+
+While there is `intersection a b`, there is no function `intersections [ a b c ]`.
+
+Arguments:
+- (+) There is no known use case for such a function, it can be added later if a use case arises
+- (+) There is no suitable return value for `intersections [ ]`, see also "Nullary intersections" [here](https://en.wikipedia.org/w/index.php?title=List_of_set_identities_and_relations&oldid=1177174035#Definitions)
+  - (-) Could throw an error for that case
+  - (-) Create a special value to represent "all the files" and return that
+    - (+) Such a value could then not be used with `fileFilter` unless the internal representation is changed considerably
+  - (-) Could return the empty file set
+    - (+) This would be wrong in set theory
+- (-) Inconsistent with `union` and `unions`
+
+### Intersection base path
+
+The base path of the result of an `intersection` is the longest base path of the arguments.
+E.g. the base path of `intersection ./foo ./foo/bar` is `./foo/bar`.
+Meanwhile `intersection ./foo ./bar` returns the empty file set without a base path.
+
+Arguments:
+- Alternative: Use the common prefix of all base paths as the resulting base path
+  - (-) This is unnecessarily strict, because the purpose of the base path is to track the directory under which files _could_ be in the file set. It should be as long as possible.
+    All files contained in `intersection ./foo ./foo/bar` will be under `./foo/bar` (never just under `./foo`), and `intersection ./foo ./bar` will never contain any files (never under `./.`).
+    This would lead to `toSource` having to unexpectedly throw errors for cases such as `toSource { root = ./foo; fileset = intersect ./foo base; }`, where `base` may be `./bar` or `./.`.
+  - (-) There is no benefit to the user, since base path is not directly exposed in the interface
+
 ### Empty directories
 
-File sets can only represent a _set_ of local files, directories on their own are not representable.
+File sets can only represent a _set_ of local files.
+Directories on their own are not representable.
 
 Arguments:
 - (+) There does not seem to be a sensible set of combinators when directories can be represented on their own.
@@ -129,7 +190,7 @@ Arguments:
 
   - `./.` represents all files in `./.` _and_ the directory itself, but not its subdirectories, meaning that at least `./.` will be preserved even if it's empty.
 
-    In that case, `intersect ./. ./foo` should only include files and no directories themselves, since `./.` includes only `./.` as a directory, and same for `./foo`, so there's no overlap in directories.
+    In that case, `intersection ./. ./foo` should only include files and no directories themselves, since `./.` includes only `./.` as a directory, and same for `./foo`, so there's no overlap in directories.
     But intuitively this operation should result in the same as `./foo` – everything else is just confusing.
 - (+) This matches how Git only supports files, so developers should already be used to it.
 - (-) Empty directories (even if they contain nested directories) are neither representable nor preserved when coercing from paths.
@@ -144,7 +205,7 @@ File sets do not support Nix store paths in strings such as `"/nix/store/...-sou
 
 Arguments:
 - (+) Such paths are usually produced by derivations, which means `toSource` would either:
-  - Require IFD if `builtins.path` is used as the underlying primitive
+  - Require [Import From Derivation](https://nixos.org/manual/nix/unstable/language/import-from-derivation) (IFD) if `builtins.path` is used as the underlying primitive
   - Require importing the entire `root` into the store such that derivations can be used to do the filtering
 - (+) The convenient path coercion like `union ./foo ./bar` wouldn't work for absolute paths, requiring more verbose alternate interfaces:
   - `let root = "/nix/store/...-source"; in union "${root}/foo" "${root}/bar"`
@@ -180,6 +241,5 @@ Here's a list of places in the library that need to be updated in the future:
 - > The file set library is currently somewhat limited but is being expanded to include more functions over time.
 
   in [the manual](../../doc/functions/fileset.section.md)
-- Once a tracing function exists, `__noEval` in [internal.nix](./internal.nix) should mention it
 - If/Once a function to convert `lib.sources` values into file sets exists, the `_coerce` and `toSource` functions should be updated to mention that function in the error when such a value is passed
 - If/Once a function exists that can optionally include a path depending on whether it exists, the error message for the path not existing in `_coerce` should mention the new function
diff --git a/nixpkgs/lib/fileset/default.nix b/nixpkgs/lib/fileset/default.nix
index 88c8dcd1a70b..7bd701670386 100644
--- a/nixpkgs/lib/fileset/default.nix
+++ b/nixpkgs/lib/fileset/default.nix
@@ -6,16 +6,20 @@ let
     _coerceMany
     _toSourceFilter
     _unionMany
+    _printFileset
+    _intersection
     ;
 
   inherit (builtins)
     isList
     isPath
     pathExists
+    seq
     typeOf
     ;
 
   inherit (lib.lists)
+    elemAt
     imap0
     ;
 
@@ -156,7 +160,7 @@ If a directory does not recursively contain any file, it is omitted from the sto
           lib.fileset.toSource: `root` is of type ${typeOf root}, but it should be a path instead.''
     # Currently all Nix paths have the same filesystem root, but this could change in the future.
     # See also ../path/README.md
-    else if rootFilesystemRoot != filesetFilesystemRoot then
+    else if ! fileset._internalIsEmptyWithoutBase && rootFilesystemRoot != filesetFilesystemRoot then
       throw ''
         lib.fileset.toSource: Filesystem roots are not the same for `fileset` and `root` ("${toString root}"):
             `root`: root "${toString rootFilesystemRoot}"
@@ -170,7 +174,7 @@ If a directory does not recursively contain any file, it is omitted from the sto
         lib.fileset.toSource: `root` (${toString root}) is a file, but it should be a directory instead. Potential solutions:
             - If you want to import the file into the store _without_ a containing directory, use string interpolation or `builtins.path` instead of this function.
             - If you want to import the file into the store _with_ a containing directory, set `root` to the containing directory, such as ${toString (dirOf root)}, and set `fileset` to the file path.''
-    else if ! hasPrefix root fileset._internalBase then
+    else if ! fileset._internalIsEmptyWithoutBase && ! hasPrefix root fileset._internalBase then
       throw ''
         lib.fileset.toSource: `fileset` could contain files in ${toString fileset._internalBase}, which is not under the `root` (${toString root}). Potential solutions:
             - Set `root` to ${toString fileset._internalBase} or any directory higher up. This changes the layout of the resulting store path.
@@ -258,15 +262,11 @@ If a directory does not recursively contain any file, it is omitted from the sto
   */
   unions =
     # A list of file sets.
-    # Must contain at least 1 element.
     # The elements can also be paths,
     # which get [implicitly coerced to file sets](#sec-fileset-path-coercion).
     filesets:
     if ! isList filesets then
       throw "lib.fileset.unions: Expected argument to be a list, but got a ${typeOf filesets}."
-    else if filesets == [ ] then
-      # TODO: This could be supported, but requires an extra internal representation for the empty file set, which would be special for not having a base path.
-      throw "lib.fileset.unions: Expected argument to be a list with at least one element, but it contains no elements."
     else
       pipe filesets [
         # Annotate the elements with context, used by _coerceMany for better errors
@@ -278,4 +278,132 @@ If a directory does not recursively contain any file, it is omitted from the sto
         _unionMany
       ];
 
+  /*
+    The file set containing all files that are in both of two given file sets.
+    See also [Intersection (set theory)](https://en.wikipedia.org/wiki/Intersection_(set_theory)).
+
+    The given file sets are evaluated as lazily as possible,
+    with the first argument being evaluated first if needed.
+
+    Type:
+      intersection :: FileSet -> FileSet -> FileSet
+
+    Example:
+      # Limit the selected files to the ones in ./., so only ./src and ./Makefile
+      intersection ./. (unions [ ../LICENSE ./src ./Makefile ])
+  */
+  intersection =
+    # The first file set.
+    # This argument can also be a path,
+    # which gets [implicitly coerced to a file set](#sec-fileset-path-coercion).
+    fileset1:
+    # The second file set.
+    # This argument can also be a path,
+    # which gets [implicitly coerced to a file set](#sec-fileset-path-coercion).
+    fileset2:
+    let
+      filesets = _coerceMany "lib.fileset.intersection" [
+        {
+          context = "first argument";
+          value = fileset1;
+        }
+        {
+          context = "second argument";
+          value = fileset2;
+        }
+      ];
+    in
+    _intersection
+      (elemAt filesets 0)
+      (elemAt filesets 1);
+
+  /*
+    Incrementally evaluate and trace a file set in a pretty way.
+    This function is only intended for debugging purposes.
+    The exact tracing format is unspecified and may change.
+
+    This function takes a final argument to return.
+    In comparison, [`traceVal`](#function-library-lib.fileset.traceVal) returns
+    the given file set argument.
+
+    This variant is useful for tracing file sets in the Nix repl.
+
+    Type:
+      trace :: FileSet -> Any -> Any
+
+    Example:
+      trace (unions [ ./Makefile ./src ./tests/run.sh ]) null
+      =>
+      trace: /home/user/src/myProject
+      trace: - Makefile (regular)
+      trace: - src (all files in directory)
+      trace: - tests
+      trace:   - run.sh (regular)
+      null
+  */
+  trace =
+    /*
+    The file set to trace.
+
+    This argument can also be a path,
+    which gets [implicitly coerced to a file set](#sec-fileset-path-coercion).
+    */
+    fileset:
+    let
+      # "fileset" would be a better name, but that would clash with the argument name,
+      # and we cannot change that because of https://github.com/nix-community/nixdoc/issues/76
+      actualFileset = _coerce "lib.fileset.trace: argument" fileset;
+    in
+    seq
+      (_printFileset actualFileset)
+      (x: x);
+
+  /*
+    Incrementally evaluate and trace a file set in a pretty way.
+    This function is only intended for debugging purposes.
+    The exact tracing format is unspecified and may change.
+
+    This function returns the given file set.
+    In comparison, [`trace`](#function-library-lib.fileset.trace) takes another argument to return.
+
+    This variant is useful for tracing file sets passed as arguments to other functions.
+
+    Type:
+      traceVal :: FileSet -> FileSet
+
+    Example:
+      toSource {
+        root = ./.;
+        fileset = traceVal (unions [
+          ./Makefile
+          ./src
+          ./tests/run.sh
+        ]);
+      }
+      =>
+      trace: /home/user/src/myProject
+      trace: - Makefile (regular)
+      trace: - src (all files in directory)
+      trace: - tests
+      trace:   - run.sh (regular)
+      "/nix/store/...-source"
+  */
+  traceVal =
+    /*
+    The file set to trace and return.
+
+    This argument can also be a path,
+    which gets [implicitly coerced to a file set](#sec-fileset-path-coercion).
+    */
+    fileset:
+    let
+      # "fileset" would be a better name, but that would clash with the argument name,
+      # and we cannot change that because of https://github.com/nix-community/nixdoc/issues/76
+      actualFileset = _coerce "lib.fileset.traceVal: argument" fileset;
+    in
+    seq
+      (_printFileset actualFileset)
+      # We could also return the original fileset argument here,
+      # but that would then duplicate work for consumers of the fileset, because then they have to coerce it again
+      actualFileset;
 }
diff --git a/nixpkgs/lib/fileset/internal.nix b/nixpkgs/lib/fileset/internal.nix
index 2c329edb390d..9892172955c3 100644
--- a/nixpkgs/lib/fileset/internal.nix
+++ b/nixpkgs/lib/fileset/internal.nix
@@ -7,11 +7,14 @@ let
     isString
     pathExists
     readDir
-    typeOf
+    seq
     split
+    trace
+    typeOf
     ;
 
   inherit (lib.attrsets)
+    attrNames
     attrValues
     mapAttrs
     setAttrByPath
@@ -28,6 +31,7 @@ let
     drop
     elemAt
     filter
+    findFirst
     findFirstIndex
     foldl'
     head
@@ -64,7 +68,7 @@ rec {
   # - Increment this version
   # - Add an additional migration function below
   # - Update the description of the internal representation in ./README.md
-  _currentVersion = 2;
+  _currentVersion = 3;
 
   # Migrations between versions. The 0th element converts from v0 to v1, and so on
   migrations = [
@@ -89,8 +93,38 @@ rec {
         _internalVersion = 2;
       }
     )
+
+    # Convert v2 into v3: filesetTree's now have a representation for an empty file set without a base path
+    (
+      filesetV2:
+      filesetV2 // {
+        # All v1 file sets are not the new empty file set
+        _internalIsEmptyWithoutBase = false;
+        _internalVersion = 3;
+      }
+    )
   ];
 
+  _noEvalMessage = ''
+    lib.fileset: Directly evaluating a file set is not supported.
+      To turn it into a usable source, use `lib.fileset.toSource`.
+      To pretty-print the contents, use `lib.fileset.trace` or `lib.fileset.traceVal`.'';
+
+  # The empty file set without a base path
+  _emptyWithoutBase = {
+    _type = "fileset";
+
+    _internalVersion = _currentVersion;
+
+    # The one and only!
+    _internalIsEmptyWithoutBase = true;
+
+    # Due to alphabetical ordering, this is evaluated last,
+    # which makes the nix repl output nicer than if it would be ordered first.
+    # It also allows evaluating it strictly up to this error, which could be useful
+    _noEval = throw _noEvalMessage;
+  };
+
   # Create a fileset, see ./README.md#fileset
   # Type: path -> filesetTree -> fileset
   _create = base: tree:
@@ -103,14 +137,17 @@ rec {
       _type = "fileset";
 
       _internalVersion = _currentVersion;
+
+      _internalIsEmptyWithoutBase = false;
       _internalBase = base;
       _internalBaseRoot = parts.root;
       _internalBaseComponents = components parts.subpath;
       _internalTree = tree;
 
-      # Double __ to make it be evaluated and ordered first
-      __noEval = throw ''
-        lib.fileset: Directly evaluating a file set is not supported. Use `lib.fileset.toSource` to turn it into a usable source instead.'';
+      # Due to alphabetical ordering, this is evaluated last,
+      # which makes the nix repl output nicer than if it would be ordered first.
+      # It also allows evaluating it strictly up to this error, which could be useful
+      _noEval = throw _noEvalMessage;
     };
 
   # Coerce a value to a fileset, erroring when the value cannot be coerced.
@@ -135,11 +172,11 @@ rec {
     else if ! isPath value then
       if isStringLike value then
         throw ''
-          ${context} ("${toString value}") is a string-like value, but it should be a path instead.
+          ${context} ("${toString value}") is a string-like value, but it should be a file set or a path instead.
               Paths represented as strings are not supported by `lib.fileset`, use `lib.sources` or derivations instead.''
       else
         throw ''
-          ${context} is of type ${typeOf value}, but it should be a path instead.''
+          ${context} is of type ${typeOf value}, but it should be a file set or a path instead.''
     else if ! pathExists value then
       throw ''
         ${context} (${toString value}) does not exist.''
@@ -155,14 +192,20 @@ rec {
         _coerce "${functionContext}: ${context}" value
       ) list;
 
-      firstBaseRoot = (head filesets)._internalBaseRoot;
+      # Find the first value with a base, there may be none!
+      firstWithBase = findFirst (fileset: ! fileset._internalIsEmptyWithoutBase) null filesets;
+      # This value is only accessed if first != null
+      firstBaseRoot = firstWithBase._internalBaseRoot;
 
       # Finds the first element with a filesystem root different than the first element, if any
       differentIndex = findFirstIndex (fileset:
-        firstBaseRoot != fileset._internalBaseRoot
+        # The empty value without a base doesn't have a base path
+        ! fileset._internalIsEmptyWithoutBase
+        && firstBaseRoot != fileset._internalBaseRoot
       ) null filesets;
     in
-    if differentIndex != null then
+    # Only evaluates `differentIndex` if there are any elements with a base
+    if firstWithBase != null && differentIndex != null then
       throw ''
         ${functionContext}: Filesystem roots are not the same:
             ${(head list).context}: root "${toString firstBaseRoot}"
@@ -203,22 +246,22 @@ rec {
       // value;
 
   /*
-    Simplify a filesetTree recursively:
-    - Replace all directories that have no files with `null`
+    A normalisation of a filesetTree suitable filtering with `builtins.path`:
+    - Replace all directories that have no files with `null`.
       This removes directories that would be empty
-    - Replace all directories with all files with `"directory"`
+    - Replace all directories with all files with `"directory"`.
       This speeds up the source filter function
 
     Note that this function is strict, it evaluates the entire tree
 
     Type: Path -> filesetTree -> filesetTree
   */
-  _simplifyTree = path: tree:
+  _normaliseTreeFilter = path: tree:
     if tree == "directory" || isAttrs tree then
       let
         entries = _directoryEntries path tree;
-        simpleSubtrees = mapAttrs (name: _simplifyTree (path + "/${name}")) entries;
-        subtreeValues = attrValues simpleSubtrees;
+        normalisedSubtrees = mapAttrs (name: _normaliseTreeFilter (path + "/${name}")) entries;
+        subtreeValues = attrValues normalisedSubtrees;
       in
       # This triggers either when all files in a directory are filtered out
       # Or when the directory doesn't contain any files at all
@@ -228,10 +271,112 @@ rec {
       else if all isString subtreeValues then
         "directory"
       else
-        simpleSubtrees
+        normalisedSubtrees
     else
       tree;
 
+  /*
+    A minimal normalisation of a filesetTree, intended for pretty-printing:
+    - If all children of a path are recursively included or empty directories, the path itself is also recursively included
+    - If all children of a path are fully excluded or empty directories, the path itself is an empty directory
+    - Other empty directories are represented with the special "emptyDir" string
+      While these could be replaced with `null`, that would take another mapAttrs
+
+    Note that this function is partially lazy.
+
+    Type: Path -> filesetTree -> filesetTree (with "emptyDir"'s)
+  */
+  _normaliseTreeMinimal = path: tree:
+    if tree == "directory" || isAttrs tree then
+      let
+        entries = _directoryEntries path tree;
+        normalisedSubtrees = mapAttrs (name: _normaliseTreeMinimal (path + "/${name}")) entries;
+        subtreeValues = attrValues normalisedSubtrees;
+      in
+      # If there are no entries, or all entries are empty directories, return "emptyDir".
+      # After this branch we know that there's at least one file
+      if all (value: value == "emptyDir") subtreeValues then
+        "emptyDir"
+
+      # If all subtrees are fully included or empty directories
+      # (both of which are coincidentally represented as strings), return "directory".
+      # This takes advantage of the fact that empty directories can be represented as included directories.
+      # Note that the tree == "directory" check allows avoiding recursion
+      else if tree == "directory" || all (value: isString value) subtreeValues then
+        "directory"
+
+      # If all subtrees are fully excluded or empty directories, return null.
+      # This takes advantage of the fact that empty directories can be represented as excluded directories
+      else if all (value: isNull value || value == "emptyDir") subtreeValues then
+        null
+
+      # Mix of included and excluded entries
+      else
+        normalisedSubtrees
+    else
+      tree;
+
+  # Trace a filesetTree in a pretty way when the resulting value is evaluated.
+  # This can handle both normal filesetTree's, and ones returned from _normaliseTreeMinimal
+  # Type: Path -> filesetTree (with "emptyDir"'s) -> Null
+  _printMinimalTree = base: tree:
+    let
+      treeSuffix = tree:
+        if isAttrs tree then
+          ""
+        else if tree == "directory" then
+          " (all files in directory)"
+        else
+          # This does "leak" the file type strings of the internal representation,
+          # but this is the main reason these file type strings even are in the representation!
+          # TODO: Consider removing that information from the internal representation for performance.
+          # The file types can still be printed by querying them only during tracing
+          " (${tree})";
+
+      # Only for attribute set trees
+      traceTreeAttrs = prevLine: indent: tree:
+        foldl' (prevLine: name:
+          let
+            subtree = tree.${name};
+
+            # Evaluating this prints the line for this subtree
+            thisLine =
+              trace "${indent}- ${name}${treeSuffix subtree}" prevLine;
+          in
+          if subtree == null || subtree == "emptyDir" then
+            # Don't print anything at all if this subtree is empty
+            prevLine
+          else if isAttrs subtree then
+            # A directory with explicit entries
+            # Do print this node, but also recurse
+            traceTreeAttrs thisLine "${indent}  " subtree
+          else
+            # Either a file, or a recursively included directory
+            # Do print this node but no further recursion needed
+            thisLine
+        ) prevLine (attrNames tree);
+
+      # Evaluating this will print the first line
+      firstLine =
+        if tree == null || tree == "emptyDir" then
+          trace "(empty)" null
+        else
+          trace "${toString base}${treeSuffix tree}" null;
+    in
+    if isAttrs tree then
+      traceTreeAttrs firstLine "" tree
+    else
+      firstLine;
+
+  # Pretty-print a file set in a pretty way when the resulting value is evaluated
+  # Type: fileset -> Null
+  _printFileset = fileset:
+    if fileset._internalIsEmptyWithoutBase then
+      trace "(empty)" null
+    else
+      _printMinimalTree fileset._internalBase
+        (_normaliseTreeMinimal fileset._internalBase fileset._internalTree);
+
   # Turn a fileset into a source filter function suitable for `builtins.path`
   # Only directories recursively containing at least one files are recursed into
   # Type: Path -> fileset -> (String -> String -> Bool)
@@ -239,7 +384,7 @@ rec {
     let
       # Simplify the tree, necessary to make sure all empty directories are null
       # which has the effect that they aren't included in the result
-      tree = _simplifyTree fileset._internalBase fileset._internalTree;
+      tree = _normaliseTreeFilter fileset._internalBase fileset._internalTree;
 
       # The base path as a string with a single trailing slash
       baseString =
@@ -311,17 +456,59 @@ rec {
     # Special case because the code below assumes that the _internalBase is always included in the result
     # which shouldn't be done when we have no files at all in the base
     # This also forces the tree before returning the filter, leads to earlier error messages
-    if tree == null then
+    if fileset._internalIsEmptyWithoutBase || tree == null then
       empty
     else
       nonEmpty;
 
+  # Transforms the filesetTree of a file set to a shorter base path, e.g.
+  # _shortenTreeBase [ "foo" ] (_create /foo/bar null)
+  # => { bar = null; }
+  _shortenTreeBase = targetBaseComponents: fileset:
+    let
+      recurse = index:
+        # If we haven't reached the required depth yet
+        if index < length fileset._internalBaseComponents then
+          # Create an attribute set and recurse as the value, this can be lazily evaluated this way
+          { ${elemAt fileset._internalBaseComponents index} = recurse (index + 1); }
+        else
+          # Otherwise we reached the appropriate depth, here's the original tree
+          fileset._internalTree;
+    in
+    recurse (length targetBaseComponents);
+
+  # Transforms the filesetTree of a file set to a longer base path, e.g.
+  # _lengthenTreeBase [ "foo" "bar" ] (_create /foo { bar.baz = "regular"; })
+  # => { baz = "regular"; }
+  _lengthenTreeBase = targetBaseComponents: fileset:
+    let
+      recurse = index: tree:
+        # If the filesetTree is an attribute set and we haven't reached the required depth yet
+        if isAttrs tree && index < length targetBaseComponents then
+          # Recurse with the tree under the right component (which might not exist)
+          recurse (index + 1) (tree.${elemAt targetBaseComponents index} or null)
+        else
+          # For all values here we can just return the tree itself:
+          # tree == null -> the result is also null, everything is excluded
+          # tree == "directory" -> the result is also "directory",
+          #   because the base path is always a directory and everything is included
+          # isAttrs tree -> the result is `tree`
+          #   because we don't need to recurse any more since `index == length longestBaseComponents`
+          tree;
+    in
+    recurse (length fileset._internalBaseComponents) fileset._internalTree;
+
   # Computes the union of a list of filesets.
   # The filesets must already be coerced and validated to be in the same filesystem root
   # Type: [ Fileset ] -> Fileset
   _unionMany = filesets:
     let
-      first = head filesets;
+      # All filesets that have a base, aka not the ones that are the empty value without a base
+      filesetsWithBase = filter (fileset: ! fileset._internalIsEmptyWithoutBase) filesets;
+
+      # The first fileset that has a base.
+      # This value is only accessed if there are at all.
+      firstWithBase = head filesetsWithBase;
 
       # To be able to union filesetTree's together, they need to have the same base path.
       # Base paths can be unioned by taking their common prefix,
@@ -332,14 +519,14 @@ rec {
       # so this cannot cause a stack overflow due to a build-up of unevaluated thunks.
       commonBaseComponents = foldl'
         (components: el: commonPrefix components el._internalBaseComponents)
-        first._internalBaseComponents
+        firstWithBase._internalBaseComponents
         # We could also not do the `tail` here to avoid a list allocation,
         # but then we'd have to pay for a potentially expensive
         # but unnecessary `commonPrefix` call
-        (tail filesets);
+        (tail filesetsWithBase);
 
       # The common base path assembled from a filesystem root and the common components
-      commonBase = append first._internalBaseRoot (join commonBaseComponents);
+      commonBase = append firstWithBase._internalBaseRoot (join commonBaseComponents);
 
       # A list of filesetTree's that all have the same base path
       # This is achieved by nesting the trees into the components they have over the common base path
@@ -347,18 +534,18 @@ rec {
       # So the tree under `/foo/bar` gets nested under `{ bar = ...; ... }`,
       # while the tree under `/foo/baz` gets nested under `{ baz = ...; ... }`
       # Therefore allowing combined operations over them.
-      trees = map (fileset:
-        setAttrByPath
-          (drop (length commonBaseComponents) fileset._internalBaseComponents)
-          fileset._internalTree
-        ) filesets;
+      trees = map (_shortenTreeBase commonBaseComponents) filesetsWithBase;
 
       # Folds all trees together into a single one using _unionTree
       # We do not use a fold here because it would cause a thunk build-up
       # which could cause a stack overflow for a large number of trees
       resultTree = _unionTrees trees;
     in
-    _create commonBase resultTree;
+    # If there's no values with a base, we have no files
+    if filesetsWithBase == [ ] then
+      _emptyWithoutBase
+    else
+      _create commonBase resultTree;
 
   # The union of multiple filesetTree's with the same base path.
   # Later elements are only evaluated if necessary.
@@ -379,4 +566,76 @@ rec {
       # The non-null elements have to be attribute sets representing partial trees
       # We need to recurse into those
       zipAttrsWith (name: _unionTrees) withoutNull;
+
+  # Computes the intersection of a list of filesets.
+  # The filesets must already be coerced and validated to be in the same filesystem root
+  # Type: Fileset -> Fileset -> Fileset
+  _intersection = fileset1: fileset2:
+    let
+      # The common base components prefix, e.g.
+      # (/foo/bar, /foo/bar/baz) -> /foo/bar
+      # (/foo/bar, /foo/baz) -> /foo
+      commonBaseComponentsLength =
+        # TODO: Have a `lib.lists.commonPrefixLength` function such that we don't need the list allocation from commonPrefix here
+        length (
+          commonPrefix
+            fileset1._internalBaseComponents
+            fileset2._internalBaseComponents
+        );
+
+      # To be able to intersect filesetTree's together, they need to have the same base path.
+      # Base paths can be intersected by taking the longest one (if any)
+
+      # The fileset with the longest base, if any, e.g.
+      # (/foo/bar, /foo/bar/baz) -> /foo/bar/baz
+      # (/foo/bar, /foo/baz) -> null
+      longestBaseFileset =
+        if commonBaseComponentsLength == length fileset1._internalBaseComponents then
+          # The common prefix is the same as the first path, so the second path is equal or longer
+          fileset2
+        else if commonBaseComponentsLength == length fileset2._internalBaseComponents then
+          # The common prefix is the same as the second path, so the first path is longer
+          fileset1
+        else
+          # The common prefix is neither the first nor the second path
+          # This means there's no overlap between the two sets
+          null;
+
+      # Whether the result should be the empty value without a base
+      resultIsEmptyWithoutBase =
+        # If either fileset is the empty fileset without a base, the intersection is too
+        fileset1._internalIsEmptyWithoutBase
+        || fileset2._internalIsEmptyWithoutBase
+        # If there is no overlap between the base paths
+        || longestBaseFileset == null;
+
+      # Lengthen each fileset's tree to the longest base prefix
+      tree1 = _lengthenTreeBase longestBaseFileset._internalBaseComponents fileset1;
+      tree2 = _lengthenTreeBase longestBaseFileset._internalBaseComponents fileset2;
+
+      # With two filesetTree's with the same base, we can compute their intersection
+      resultTree = _intersectTree tree1 tree2;
+    in
+    if resultIsEmptyWithoutBase then
+      _emptyWithoutBase
+    else
+      _create longestBaseFileset._internalBase resultTree;
+
+  # The intersection of two filesetTree's with the same base path
+  # The second element is only evaluated as much as necessary.
+  # Type: filesetTree -> filesetTree -> filesetTree
+  _intersectTree = lhs: rhs:
+    if isAttrs lhs && isAttrs rhs then
+      # Both sides are attribute sets, we can recurse for the attributes existing on both sides
+      mapAttrs
+        (name: _intersectTree lhs.${name})
+        (builtins.intersectAttrs lhs rhs)
+    else if lhs == null || isString rhs then
+      # If the lhs is null, the result should also be null
+      # And if the rhs is the identity element
+      # (a string, aka it includes everything), then it's also the lhs
+      lhs
+    else
+      # In all other cases it's the rhs
+      rhs;
 }
diff --git a/nixpkgs/lib/fileset/tests.sh b/nixpkgs/lib/fileset/tests.sh
index 0ea96859e7a3..529f23ae8871 100755
--- a/nixpkgs/lib/fileset/tests.sh
+++ b/nixpkgs/lib/fileset/tests.sh
@@ -57,18 +57,35 @@ with lib.fileset;'
 expectEqual() {
     local actualExpr=$1
     local expectedExpr=$2
-    if ! actualResult=$(nix-instantiate --eval --strict --show-trace \
+    if actualResult=$(nix-instantiate --eval --strict --show-trace 2>"$tmp"/actualStderr \
         --expr "$prefixExpression ($actualExpr)"); then
-        die "$actualExpr failed to evaluate, but it was expected to succeed"
+        actualExitCode=$?
+    else
+        actualExitCode=$?
     fi
-    if ! expectedResult=$(nix-instantiate --eval --strict --show-trace \
+    actualStderr=$(< "$tmp"/actualStderr)
+
+    if expectedResult=$(nix-instantiate --eval --strict --show-trace 2>"$tmp"/expectedStderr \
         --expr "$prefixExpression ($expectedExpr)"); then
-        die "$expectedExpr failed to evaluate, but it was expected to succeed"
+        expectedExitCode=$?
+    else
+        expectedExitCode=$?
+    fi
+    expectedStderr=$(< "$tmp"/expectedStderr)
+
+    if [[ "$actualExitCode" != "$expectedExitCode" ]]; then
+        echo "$actualStderr" >&2
+        echo "$actualResult" >&2
+        die "$actualExpr should have exited with $expectedExitCode, but it exited with $actualExitCode"
     fi
 
     if [[ "$actualResult" != "$expectedResult" ]]; then
         die "$actualExpr should have evaluated to $expectedExpr:\n$expectedResult\n\nbut it evaluated to\n$actualResult"
     fi
+
+    if [[ "$actualStderr" != "$expectedStderr" ]]; then
+        die "$actualExpr should have had this on stderr:\n$expectedStderr\n\nbut it was\n$actualStderr"
+    fi
 }
 
 # Check that a nix expression evaluates successfully to a store path and returns it (without quotes).
@@ -84,14 +101,14 @@ expectStorePath() {
     crudeUnquoteJSON <<< "$result"
 }
 
-# Check that a nix expression fails to evaluate (strictly, coercing to json, read-write-mode).
+# Check that a nix expression fails to evaluate (strictly, read-write-mode).
 # And check the received stderr against a regex
 # The expression has `lib.fileset` in scope.
 # Usage: expectFailure NIX REGEX
 expectFailure() {
     local expr=$1
     local expectedErrorRegex=$2
-    if result=$(nix-instantiate --eval --strict --json --read-write-mode --show-trace 2>"$tmp/stderr" \
+    if result=$(nix-instantiate --eval --strict --read-write-mode --show-trace 2>"$tmp/stderr" \
         --expr "$prefixExpression $expr"); then
         die "$expr evaluated successfully to $result, but it was expected to fail"
     fi
@@ -101,16 +118,112 @@ expectFailure() {
     fi
 }
 
-# We conditionally use inotifywait in checkFileset.
+# Check that the traces of a Nix expression are as expected when evaluated.
+# The expression has `lib.fileset` in scope.
+# Usage: expectTrace NIX STR
+expectTrace() {
+    local expr=$1
+    local expectedTrace=$2
+
+    nix-instantiate --eval --show-trace >/dev/null 2>"$tmp"/stderrTrace \
+        --expr "$prefixExpression trace ($expr)" || true
+
+    actualTrace=$(sed -n 's/^trace: //p' "$tmp/stderrTrace")
+
+    nix-instantiate --eval --show-trace >/dev/null 2>"$tmp"/stderrTraceVal \
+        --expr "$prefixExpression traceVal ($expr)" || true
+
+    actualTraceVal=$(sed -n 's/^trace: //p' "$tmp/stderrTraceVal")
+
+    # Test that traceVal returns the same trace as trace
+    if [[ "$actualTrace" != "$actualTraceVal" ]]; then
+        cat "$tmp"/stderrTrace >&2
+        die "$expr traced this for lib.fileset.trace:\n\n$actualTrace\n\nand something different for lib.fileset.traceVal:\n\n$actualTraceVal"
+    fi
+
+    if [[ "$actualTrace" != "$expectedTrace" ]]; then
+        cat "$tmp"/stderrTrace >&2
+        die "$expr should have traced this:\n\n$expectedTrace\n\nbut this was actually traced:\n\n$actualTrace"
+    fi
+}
+
+# We conditionally use inotifywait in withFileMonitor.
 # Check early whether it's available
 # TODO: Darwin support, though not crucial since we have Linux CI
 if type inotifywait 2>/dev/null >/dev/null; then
-    canMonitorFiles=1
+    canMonitor=1
 else
-    echo "Warning: Not checking that excluded files don't get accessed since inotifywait is not available" >&2
-    canMonitorFiles=
+    echo "Warning: Cannot check for paths not getting read since the inotifywait command (from the inotify-tools package) is not available" >&2
+    canMonitor=
 fi
 
+# Run a function while monitoring that it doesn't read certain paths
+# Usage: withFileMonitor FUNNAME PATH...
+# - FUNNAME should be a bash function that:
+#   - Performs some operation that should not read some paths
+#   - Delete the paths it shouldn't read without triggering any open events
+# - PATH... are the paths that should not get read
+#
+# This function outputs the same as FUNNAME
+withFileMonitor() {
+    local funName=$1
+    shift
+
+    # If we can't monitor files or have none to monitor, just run the function directly
+    if [[ -z "$canMonitor" ]] || (( "$#" == 0 )); then
+        "$funName"
+    else
+
+        # Use a subshell to start the coprocess in and use a trap to kill it when exiting the subshell
+        (
+            # Assigned by coproc, makes shellcheck happy
+            local watcher watcher_PID
+
+            # Start inotifywait in the background to monitor all excluded paths
+            coproc watcher {
+                # inotifywait outputs a string on stderr when ready
+                # Redirect it to stdout so we can access it from the coproc's stdout fd
+                # exec so that the coprocess is inotify itself, making the kill below work correctly
+                # See below why we listen to both open and delete_self events
+                exec inotifywait --format='%e %w' --event open,delete_self --monitor "$@" 2>&1
+            }
+
+            # This will trigger when this subshell exits, no matter if successful or not
+            # After exiting the subshell, the parent shell will continue executing
+            trap 'kill "${watcher_PID}"' exit
+
+            # Synchronously wait until inotifywait is ready
+            while read -r -u "${watcher[0]}" line && [[ "$line" != "Watches established." ]]; do
+                :
+            done
+
+            # Call the function that should not read the given paths and delete them afterwards
+            "$funName"
+
+            # Get the first event
+            read -r -u "${watcher[0]}" event file
+
+            # With funName potentially reading files first before deleting them,
+            # there's only these two possible event timelines:
+            # - open*, ..., open*, delete_self, ..., delete_self: If some excluded paths were read
+            # - delete_self, ..., delete_self: If no excluded paths were read
+            # So by looking at the first event we can figure out which one it is!
+            # This also means we don't have to wait to collect all events.
+            case "$event" in
+                OPEN*)
+                    die "$funName opened excluded file $file when it shouldn't have"
+                    ;;
+                DELETE_SELF)
+                    # Expected events
+                    ;;
+                *)
+                    die "During $funName, Unexpected event type '$event' on file $file that should be excluded"
+                    ;;
+            esac
+        )
+    fi
+}
+
 # Check whether a file set includes/excludes declared paths as expected, usage:
 #
 # tree=(
@@ -120,7 +233,7 @@ fi
 # )
 # checkFileset './a' # Pass the fileset as the argument
 declare -A tree
-checkFileset() (
+checkFileset() {
     # New subshell so that we can have a separate trap handler, see `trap` below
     local fileset=$1
 
@@ -168,54 +281,21 @@ checkFileset() (
         touch "${filesToCreate[@]}"
     fi
 
-    # Start inotifywait in the background to monitor all excluded files (if any)
-    if [[ -n "$canMonitorFiles" ]] && (( "${#excludedFiles[@]}" != 0 )); then
-        coproc watcher {
-            # inotifywait outputs a string on stderr when ready
-            # Redirect it to stdout so we can access it from the coproc's stdout fd
-            # exec so that the coprocess is inotify itself, making the kill below work correctly
-            # See below why we listen to both open and delete_self events
-            exec inotifywait --format='%e %w' --event open,delete_self --monitor "${excludedFiles[@]}" 2>&1
-        }
-        # This will trigger when this subshell exits, no matter if successful or not
-        # After exiting the subshell, the parent shell will continue executing
-        # shellcheck disable=SC2154
-        trap 'kill "${watcher_PID}"' exit
-
-        # Synchronously wait until inotifywait is ready
-        while read -r -u "${watcher[0]}" line && [[ "$line" != "Watches established." ]]; do
-            :
-        done
-    fi
-
-    # Call toSource with the fileset, triggering open events for all files that are added to the store
     expression="toSource { root = ./.; fileset = $fileset; }"
-    storePath=$(expectStorePath "$expression")
 
-    # Remove all files immediately after, triggering delete_self events for all of them
-    rm -rf -- *
+    # We don't have lambda's in bash unfortunately,
+    # so we just define a function instead and then pass its name
+    # shellcheck disable=SC2317
+    run() {
+        # Call toSource with the fileset, triggering open events for all files that are added to the store
+        expectStorePath "$expression"
+        if (( ${#excludedFiles[@]} != 0 )); then
+            rm "${excludedFiles[@]}"
+        fi
+    }
 
-    # Only check for the inotify events if we actually started inotify earlier
-    if [[ -v watcher ]]; then
-        # Get the first event
-        read -r -u "${watcher[0]}" event file
-
-        # There's only these two possible event timelines:
-        # - open, ..., open, delete_self, ..., delete_self: If some excluded files were read
-        # - delete_self, ..., delete_self: If no excluded files were read
-        # So by looking at the first event we can figure out which one it is!
-        case "$event" in
-            OPEN)
-                die "$expression opened excluded file $file when it shouldn't have"
-                ;;
-            DELETE_SELF)
-                # Expected events
-                ;;
-            *)
-                die "Unexpected event type '$event' on file $file that should be excluded"
-                ;;
-        esac
-    fi
+    # Runs the function while checking that the given excluded files aren't read
+    storePath=$(withFileMonitor run "${excludedFiles[@]}")
 
     # For each path that should be included, make sure it does occur in the resulting store path
     for p in "${included[@]}"; do
@@ -230,7 +310,9 @@ checkFileset() (
             die "$expression included path $p when it shouldn't have"
         fi
     done
-)
+
+    rm -rf -- *
+}
 
 
 #### Error messages #####
@@ -273,33 +355,40 @@ expectFailure 'toSource { root = ./a; fileset = ./.; }' 'lib.fileset.toSource: `
 rm -rf *
 
 # Path coercion only works for paths
-expectFailure 'toSource { root = ./.; fileset = 10; }' 'lib.fileset.toSource: `fileset` is of type int, but it should be a path instead.'
-expectFailure 'toSource { root = ./.; fileset = "/some/path"; }' 'lib.fileset.toSource: `fileset` \("/some/path"\) is a string-like value, but it should be a path instead.
+expectFailure 'toSource { root = ./.; fileset = 10; }' 'lib.fileset.toSource: `fileset` is of type int, but it should be a file set or a path instead.'
+expectFailure 'toSource { root = ./.; fileset = "/some/path"; }' 'lib.fileset.toSource: `fileset` \("/some/path"\) is a string-like value, but it should be a file set or a path instead.
 \s*Paths represented as strings are not supported by `lib.fileset`, use `lib.sources` or derivations instead.'
 
 # Path coercion errors for non-existent paths
 expectFailure 'toSource { root = ./.; fileset = ./a; }' 'lib.fileset.toSource: `fileset` \('"$work"'/a\) does not exist.'
 
 # File sets cannot be evaluated directly
-expectFailure 'union ./. ./.' 'lib.fileset: Directly evaluating a file set is not supported. Use `lib.fileset.toSource` to turn it into a usable source instead.'
+expectFailure 'union ./. ./.' 'lib.fileset: Directly evaluating a file set is not supported.
+\s*To turn it into a usable source, use `lib.fileset.toSource`.
+\s*To pretty-print the contents, use `lib.fileset.trace` or `lib.fileset.traceVal`.'
+expectFailure '_emptyWithoutBase' 'lib.fileset: Directly evaluating a file set is not supported.
+\s*To turn it into a usable source, use `lib.fileset.toSource`.
+\s*To pretty-print the contents, use `lib.fileset.trace` or `lib.fileset.traceVal`.'
 
 # Past versions of the internal representation are supported
 expectEqual '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 0; _internalBase = ./.; }' \
-    '{ _internalBase = ./.; _internalBaseComponents = path.subpath.components (path.splitRoot ./.).subpath; _internalBaseRoot = /.; _internalVersion = 2; _type = "fileset"; }'
+    '{ _internalBase = ./.; _internalBaseComponents = path.subpath.components (path.splitRoot ./.).subpath; _internalBaseRoot = /.; _internalIsEmptyWithoutBase = false; _internalVersion = 3; _type = "fileset"; }'
 expectEqual '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 1; }' \
-    '{ _type = "fileset"; _internalVersion = 2; }'
+    '{ _type = "fileset"; _internalIsEmptyWithoutBase = false; _internalVersion = 3; }'
+expectEqual '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 2; }' \
+    '{ _type = "fileset"; _internalIsEmptyWithoutBase = false; _internalVersion = 3; }'
 
 # Future versions of the internal representation are unsupported
-expectFailure '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 3; }' '<tests>: value is a file set created from a future version of the file set library with a different internal representation:
-\s*- Internal version of the file set: 3
-\s*- Internal version of the library: 2
+expectFailure '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 4; }' '<tests>: value is a file set created from a future version of the file set library with a different internal representation:
+\s*- Internal version of the file set: 4
+\s*- Internal version of the library: 3
 \s*Make sure to update your Nixpkgs to have a newer version of `lib.fileset`.'
 
 # _create followed by _coerce should give the inputs back without any validation
 expectEqual '{
   inherit (_coerce "<test>" (_create ./. "directory"))
     _internalVersion _internalBase _internalTree;
-}' '{ _internalBase = ./.; _internalTree = "directory"; _internalVersion = 2; }'
+}' '{ _internalBase = ./.; _internalTree = "directory"; _internalVersion = 3; }'
 
 #### Resulting store path ####
 
@@ -311,6 +400,12 @@ tree=(
 )
 checkFileset './.'
 
+# The empty value without a base should also result in an empty result
+tree=(
+    [a]=0
+)
+checkFileset '_emptyWithoutBase'
+
 # Directories recursively containing no files are not included
 tree=(
     [e/]=0
@@ -406,15 +501,32 @@ expectFailure 'toSource { root = ./.; fileset = union ./. ./b; }' 'lib.fileset.u
 expectFailure 'toSource { root = ./.; fileset = unions [ ./a ./. ]; }' 'lib.fileset.unions: element 0 \('"$work"'/a\) does not exist.'
 expectFailure 'toSource { root = ./.; fileset = unions [ ./. ./b ]; }' 'lib.fileset.unions: element 1 \('"$work"'/b\) does not exist.'
 
-# unions needs a list with at least 1 element
+# unions needs a list
 expectFailure 'toSource { root = ./.; fileset = unions null; }' 'lib.fileset.unions: Expected argument to be a list, but got a null.'
-expectFailure 'toSource { root = ./.; fileset = unions [ ]; }' 'lib.fileset.unions: Expected argument to be a list with at least one element, but it contains no elements.'
 
 # The tree of later arguments should not be evaluated if a former argument already includes all files
 tree=()
 checkFileset 'union ./. (_create ./. (abort "This should not be used!"))'
 checkFileset 'unions [ ./. (_create ./. (abort "This should not be used!")) ]'
 
+# unions doesn't include any files for an empty list or only empty values without a base
+tree=(
+    [x]=0
+    [y/z]=0
+)
+checkFileset 'unions [ ]'
+checkFileset 'unions [ _emptyWithoutBase ]'
+checkFileset 'unions [ _emptyWithoutBase _emptyWithoutBase ]'
+checkFileset 'union _emptyWithoutBase _emptyWithoutBase'
+
+# The empty value without a base is the left and right identity of union
+tree=(
+    [x]=1
+    [y/z]=0
+)
+checkFileset 'union ./x _emptyWithoutBase'
+checkFileset 'union _emptyWithoutBase ./x'
+
 # union doesn't include files that weren't specified
 tree=(
     [x]=1
@@ -467,12 +579,249 @@ for i in $(seq 1000); do
     tree[$i/a]=1
     tree[$i/b]=0
 done
-(
-    # Locally limit the maximum stack size to 100 * 1024 bytes
-    # If unions was implemented recursively, this would stack overflow
-    ulimit -s 100
-    checkFileset 'unions (mapAttrsToList (name: _: ./. + "/${name}/a") (builtins.readDir ./.))'
+# This is actually really hard to test:
+# A lot of files would be needed to cause a stack overflow.
+# And while we could limit the maximum stack size using `ulimit -s`,
+# that turns out to not be very deterministic: https://github.com/NixOS/nixpkgs/pull/256417#discussion_r1339396686.
+# Meanwhile, the test infra here is not the fastest, creating 10000 would be too slow.
+# So, just using 1000 files for now.
+checkFileset 'unions (mapAttrsToList (name: _: ./. + "/${name}/a") (builtins.readDir ./.))'
+
+
+## lib.fileset.intersection
+
+
+# Different filesystem roots in root and fileset are not supported
+mkdir -p {foo,bar}/mock-root
+expectFailure 'with ((import <nixpkgs/lib>).extend (import <nixpkgs/lib/fileset/mock-splitRoot.nix>)).fileset;
+  toSource { root = ./.; fileset = intersection ./foo/mock-root ./bar/mock-root; }
+' 'lib.fileset.intersection: Filesystem roots are not the same:
+\s*first argument: root "'"$work"'/foo/mock-root"
+\s*second argument: root "'"$work"'/bar/mock-root"
+\s*Different roots are not supported.'
+rm -rf -- *
+
+# Coercion errors show the correct context
+expectFailure 'toSource { root = ./.; fileset = intersection ./a ./.; }' 'lib.fileset.intersection: first argument \('"$work"'/a\) does not exist.'
+expectFailure 'toSource { root = ./.; fileset = intersection ./. ./b; }' 'lib.fileset.intersection: second argument \('"$work"'/b\) does not exist.'
+
+# The tree of later arguments should not be evaluated if a former argument already excludes all files
+tree=(
+    [a]=0
+)
+checkFileset 'intersection _emptyWithoutBase (_create ./. (abort "This should not be used!"))'
+# We don't have any combinators that can explicitly remove files yet, so we need to rely on internal functions to test this for now
+checkFileset 'intersection (_create ./. { a = null; }) (_create ./. { a = abort "This should not be used!"; })'
+
+# If either side is empty, the result is empty
+tree=(
+    [a]=0
+)
+checkFileset 'intersection _emptyWithoutBase _emptyWithoutBase'
+checkFileset 'intersection _emptyWithoutBase (_create ./. null)'
+checkFileset 'intersection (_create ./. null) _emptyWithoutBase'
+checkFileset 'intersection (_create ./. null) (_create ./. null)'
+
+# If the intersection base paths are not overlapping, the result is empty and has no base path
+mkdir a b c
+touch {a,b,c}/x
+expectEqual 'toSource { root = ./c; fileset = intersection ./a ./b; }' 'toSource { root = ./c; fileset = _emptyWithoutBase; }'
+rm -rf -- *
+
+# If the intersection exists, the resulting base path is the longest of them
+mkdir a
+touch x a/b
+expectEqual 'toSource { root = ./a; fileset = intersection ./a ./.; }' 'toSource { root = ./a; fileset = ./a; }'
+expectEqual 'toSource { root = ./a; fileset = intersection ./. ./a; }' 'toSource { root = ./a; fileset = ./a; }'
+rm -rf -- *
+
+# Also finds the intersection with null'd filesetTree's
+tree=(
+    [a]=0
+    [b]=1
+    [c]=0
 )
+checkFileset 'intersection (_create ./. { a = "regular"; b = "regular"; c = null; }) (_create ./. { a = null; b = "regular"; c = "regular"; })'
+
+# Actually computes the intersection between files
+tree=(
+    [a]=0
+    [b]=0
+    [c]=1
+    [d]=1
+    [e]=0
+    [f]=0
+)
+checkFileset 'intersection (unions [ ./a ./b ./c ./d ]) (unions [ ./c ./d ./e ./f ])'
+
+tree=(
+    [a/x]=0
+    [a/y]=0
+    [b/x]=1
+    [b/y]=1
+    [c/x]=0
+    [c/y]=0
+)
+checkFileset 'intersection ./b ./.'
+checkFileset 'intersection ./b (unions [ ./a/x ./a/y ./b/x ./b/y ./c/x ./c/y ])'
+
+# Complicated case
+tree=(
+    [a/x]=0
+    [a/b/i]=1
+    [c/d/x]=0
+    [c/d/f]=1
+    [c/x]=0
+    [c/e/i]=1
+    [c/e/j]=1
+)
+checkFileset 'intersection (unions [ ./a/b ./c/d ./c/e ]) (unions [ ./a ./c/d/f ./c/e ])'
+
+
+## Tracing
+
+# The second trace argument is returned
+expectEqual 'trace ./. "some value"' 'builtins.trace "(empty)" "some value"'
+
+# The fileset traceVal argument is returned
+expectEqual 'traceVal ./.' 'builtins.trace "(empty)" (_create ./. "directory")'
+
+# The tracing happens before the final argument is needed
+expectEqual 'trace ./.' 'builtins.trace "(empty)" (x: x)'
+
+# Tracing an empty directory shows it as such
+expectTrace './.' '(empty)'
+
+# This also works if there are directories, but all recursively without files
+mkdir -p a/b/c
+expectTrace './.' '(empty)'
+rm -rf -- *
+
+# The empty file set without a base also prints as empty
+expectTrace '_emptyWithoutBase' '(empty)'
+expectTrace 'unions [ ]' '(empty)'
+mkdir foo bar
+touch {foo,bar}/x
+expectTrace 'intersection ./foo ./bar' '(empty)'
+rm -rf -- *
+
+# If a directory is fully included, print it as such
+touch a
+expectTrace './.' "$work"' (all files in directory)'
+rm -rf -- *
+
+# If a directory is not fully included, recurse
+mkdir a b
+touch a/{x,y} b/{x,y}
+expectTrace 'union ./a/x ./b' "$work"'
+- a
+  - x (regular)
+- b (all files in directory)'
+rm -rf -- *
+
+# If an included path is a file, print its type
+touch a x
+ln -s a b
+mkfifo c
+expectTrace 'unions [ ./a ./b ./c ]' "$work"'
+- a (regular)
+- b (symlink)
+- c (unknown)'
+rm -rf -- *
+
+# Do not print directories without any files recursively
+mkdir -p a/b/c
+touch b x
+expectTrace 'unions [ ./a ./b ]' "$work"'
+- b (regular)'
+rm -rf -- *
+
+# If all children are either fully included or empty directories,
+# the parent should be printed as fully included
+touch a
+mkdir b
+expectTrace 'union ./a ./b' "$work"' (all files in directory)'
+rm -rf -- *
+
+mkdir -p x/b x/c
+touch x/a
+touch a
+# If all children are either fully excluded or empty directories,
+# the parent should be shown (or rather not shown) as fully excluded
+expectTrace 'unions [ ./a ./x/b ./x/c ]' "$work"'
+- a (regular)'
+rm -rf -- *
+
+# Completely filtered out directories also print as empty
+touch a
+expectTrace '_create ./. {}' '(empty)'
+rm -rf -- *
+
+# A general test to make sure the resulting format makes sense
+# Such as indentation and ordering
+mkdir -p bar/{qux,someDir}
+touch bar/{baz,qux,someDir/a} foo
+touch bar/qux/x
+ln -s x bar/qux/a
+mkfifo bar/qux/b
+expectTrace 'unions [
+  ./bar/baz
+  ./bar/qux/a
+  ./bar/qux/b
+  ./bar/someDir/a
+  ./foo
+]' "$work"'
+- bar
+  - baz (regular)
+  - qux
+    - a (symlink)
+    - b (unknown)
+  - someDir (all files in directory)
+- foo (regular)'
+rm -rf -- *
+
+# For recursively included directories,
+# `(all files in directory)` should only be used if there's at least one file (otherwise it would be `(empty)`)
+# and this should be determined without doing a full search
+#
+# a is intentionally ordered first here in order to allow triggering the short-circuit behavior
+# We then check that b is not read
+# In a more realistic scenario, some directories might need to be recursed into,
+# but a file would be quickly found to trigger the short-circuit.
+touch a
+mkdir b
+# We don't have lambda's in bash unfortunately,
+# so we just define a function instead and then pass its name
+# shellcheck disable=SC2317
+run() {
+    # This shouldn't read b/
+    expectTrace './.' "$work"' (all files in directory)'
+    # Remove all files immediately after, triggering delete_self events for all of them
+    rmdir b
+}
+# Runs the function while checking that b isn't read
+withFileMonitor run b
+rm -rf -- *
+
+# Partially included directories trace entries as they are evaluated
+touch a b c
+expectTrace '_create ./. { a = null; b = "regular"; c = throw "b"; }' "$work"'
+- b (regular)'
+
+# Except entries that need to be evaluated to even figure out if it's only partially included:
+# Here the directory could be fully excluded or included just from seeing a and b,
+# so c needs to be evaluated before anything can be traced
+expectTrace '_create ./. { a = null; b = null; c = throw "c"; }' ''
+expectTrace '_create ./. { a = "regular"; b = "regular"; c = throw "c"; }' ''
+rm -rf -- *
+
+# We can trace large directories (10000 here) without any problems
+filesToCreate=({0..9}{0..9}{0..9}{0..9})
+expectedTrace=$work$'\n'$(printf -- '- %s (regular)\n' "${filesToCreate[@]}")
+# We need an excluded file so it doesn't print as `(all files in directory)`
+touch 0 "${filesToCreate[@]}"
+expectTrace 'unions (mapAttrsToList (n: _: ./. + "/${n}") (removeAttrs (builtins.readDir ./.) [ "0" ]))' "$expectedTrace"
+rm -rf -- *
 
 # TODO: Once we have combinators and a property testing library, derive property tests from https://en.wikipedia.org/wiki/Algebra_of_sets
author	Alyssa Ross <hi@alyssa.is>	2023-10-20 22:09:03 +0000
committer	Alyssa Ross <hi@alyssa.is>	2023-10-20 22:09:03 +0000
commit	50c21d167f7114fa1dbd95e5c4fb30eeb1a2d02e (patch)
tree	f2556b911180125ccbb7ed0e78a54e92da89adce /nixpkgs/lib/fileset
parent	4c16d4548a98563c9d9ad76f4e5b2202864ccd54 (diff)
parent	cfc75eec4603c06503ae750f88cf397e00796ea8 (diff)
download	nixlib-50c21d167f7114fa1dbd95e5c4fb30eeb1a2d02e.tar nixlib-50c21d167f7114fa1dbd95e5c4fb30eeb1a2d02e.tar.gz nixlib-50c21d167f7114fa1dbd95e5c4fb30eeb1a2d02e.tar.bz2 nixlib-50c21d167f7114fa1dbd95e5c4fb30eeb1a2d02e.tar.lz nixlib-50c21d167f7114fa1dbd95e5c4fb30eeb1a2d02e.tar.xz nixlib-50c21d167f7114fa1dbd95e5c4fb30eeb1a2d02e.tar.zst nixlib-50c21d167f7114fa1dbd95e5c4fb30eeb1a2d02e.zip