author | Alyssa Ross <hi@alyssa.is> | 2023-10-20 22:09:03 +0000 |
---|---|---|
committer | Alyssa Ross <hi@alyssa.is> | 2023-10-20 22:09:03 +0000 |
commit | 50c21d167f7114fa1dbd95e5c4fb30eeb1a2d02e (patch) | |
tree | f2556b911180125ccbb7ed0e78a54e92da89adce /nixpkgs/lib/fileset | |
parent | 4c16d4548a98563c9d9ad76f4e5b2202864ccd54 (diff) | |
parent | cfc75eec4603c06503ae750f88cf397e00796ea8 (diff) | |
Merge commit 'cfc75eec4603c06503ae750f88cf397e00796ea8'
Conflicts: nixpkgs/pkgs/build-support/rust/build-rust-package/default.nix
Diffstat (limited to 'nixpkgs/lib/fileset')
-rw-r--r-- | nixpkgs/lib/fileset/README.md | 72 |
-rw-r--r-- | nixpkgs/lib/fileset/default.nix | 140 |
-rw-r--r-- | nixpkgs/lib/fileset/internal.nix | 317 |
-rwxr-xr-x | nixpkgs/lib/fileset/tests.sh | 495 |
4 files changed, 910 insertions, 114 deletions
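Note (not part of the commit): the core of the new `lib.fileset.intersection` is the `_intersectTree` helper added to `internal.nix` in this diff. The sketch below copies that helper's logic into a standalone expression so its behaviour can be checked without a nixpkgs checkout. The name `intersectTree` and both sample trees are illustrative assumptions; they are not taken from the library or its test suite.

```nix
# Standalone sketch of the tree intersection logic from this diff.
# Assumption: trees use the filesetTree shape described in README.md,
# i.e. null (excluded), a string such as "directory" or "regular"
# (fully included), or an attribute set of subtrees.
let
  inherit (builtins) isAttrs isString mapAttrs intersectAttrs;

  intersectTree = lhs: rhs:
    if isAttrs lhs && isAttrs rhs then
      # Recurse only into names present on both sides
      mapAttrs
        (name: intersectTree lhs.${name})
        (intersectAttrs lhs rhs)
    else if lhs == null || isString rhs then
      # Nothing on the left, or everything on the right: keep the left side
      lhs
    else
      # Otherwise the right side is the more restrictive one
      rhs;
in
intersectTree
  # Hypothetical left tree: all of src, the Makefile, nothing from docs
  { src = "directory"; Makefile = "regular"; docs = null; }
  # Hypothetical right tree: one file in src, the Makefile, and LICENSE
  { src = { "main.c" = "regular"; }; Makefile = "regular"; LICENSE = "regular"; }
# Evaluates to: { Makefile = "regular"; src = { "main.c" = "regular"; }; }
```

Saved to a file (any name works), this can be evaluated with `nix-instantiate --eval --strict` on that file. Names that appear on only one side (`docs`, `LICENSE`) drop out of the result, which matches how the diff's `_lengthenTreeBase` treats a missing attribute as `null`.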
diff --git a/nixpkgs/lib/fileset/README.md b/nixpkgs/lib/fileset/README.md index 6e57f1f8f2b4..ebe13f08fdef 100644 --- a/nixpkgs/lib/fileset/README.md +++ b/nixpkgs/lib/fileset/README.md @@ -1,5 +1,10 @@ # File set library +This is the internal contributor documentation. +The user documentation is [in the Nixpkgs manual](https://nixos.org/manual/nixpkgs/unstable/#sec-fileset). + +## Goals + The main goal of the file set library is to be able to select local files that should be added to the Nix store. It should have the following properties: - Easy: @@ -41,12 +46,20 @@ An attribute set with these values: - `_type` (constant string `"fileset"`): Tag to indicate this value is a file set. -- `_internalVersion` (constant `2`, the current version): +- `_internalVersion` (constant `3`, the current version): Version of the representation. +- `_internalIsEmptyWithoutBase` (bool): + Whether this file set is the empty file set without a base path. + If `true`, `_internalBase*` and `_internalTree` are not set. + This is the only way to represent an empty file set without needing a base path. + + Such a value can be used as the identity element for `union` and the return value of `unions []` and co. + - `_internalBase` (path): Any files outside of this path cannot influence the set of files. - This is always a directory. + This is always a directory and should be as long as possible. + This is used by `lib.fileset.toSource` to check that all files are under the `root` argument - `_internalBaseRoot` (path): The filesystem root of `_internalBase`, same as `(lib.path.splitRoot _internalBase).root`. @@ -111,9 +124,57 @@ Arguments: - (+) This can be removed later, if we discover it's too restrictive - (-) It leads to errors when a sensible result could sometimes be returned, such as in the above example. +### Empty file set without a base + +There is a special representation for an empty file set without a base path. 
+This is used for return values that should be empty but when there's no base path that would makes sense. + +Arguments: +- Alternative: This could also be represented using `_internalBase = /.` and `_internalTree = null`. + - (+) Removes the need for a special representation. + - (-) Due to [influence tracking](#influence-tracking), + `union empty ./.` would have `/.` as the base path, + which would then prevent `toSource { root = ./.; fileset = union empty ./.; }` from working, + which is not as one would expect. + - (-) With the assumption that there can be multiple filesystem roots (as established with the [path library](../path/README.md)), + this would have to cause an error with `union empty pathWithAnotherFilesystemRoot`, + which is not as one would expect. +- Alternative: Do not have such a value and error when it would be needed as a return value + - (+) Removes the need for a special representation. + - (-) Leaves us with no identity element for `union` and no reasonable return value for `unions []`. + From a set theory perspective, which has a well-known notion of empty sets, this is unintuitive. + +### No intersection for lists + +While there is `intersection a b`, there is no function `intersections [ a b c ]`. 
+ +Arguments: +- (+) There is no known use case for such a function, it can be added later if a use case arises +- (+) There is no suitable return value for `intersections [ ]`, see also "Nullary intersections" [here](https://en.wikipedia.org/w/index.php?title=List_of_set_identities_and_relations&oldid=1177174035#Definitions) + - (-) Could throw an error for that case + - (-) Create a special value to represent "all the files" and return that + - (+) Such a value could then not be used with `fileFilter` unless the internal representation is changed considerably + - (-) Could return the empty file set + - (+) This would be wrong in set theory +- (-) Inconsistent with `union` and `unions` + +### Intersection base path + +The base path of the result of an `intersection` is the longest base path of the arguments. +E.g. the base path of `intersection ./foo ./foo/bar` is `./foo/bar`. +Meanwhile `intersection ./foo ./bar` returns the empty file set without a base path. + +Arguments: +- Alternative: Use the common prefix of all base paths as the resulting base path + - (-) This is unnecessarily strict, because the purpose of the base path is to track the directory under which files _could_ be in the file set. It should be as long as possible. + All files contained in `intersection ./foo ./foo/bar` will be under `./foo/bar` (never just under `./foo`), and `intersection ./foo ./bar` will never contain any files (never under `./.`). + This would lead to `toSource` having to unexpectedly throw errors for cases such as `toSource { root = ./foo; fileset = intersect ./foo base; }`, where `base` may be `./bar` or `./.`. + - (-) There is no benefit to the user, since base path is not directly exposed in the interface + ### Empty directories -File sets can only represent a _set_ of local files, directories on their own are not representable. +File sets can only represent a _set_ of local files. +Directories on their own are not representable. 
Arguments: - (+) There does not seem to be a sensible set of combinators when directories can be represented on their own. @@ -129,7 +190,7 @@ Arguments: - `./.` represents all files in `./.` _and_ the directory itself, but not its subdirectories, meaning that at least `./.` will be preserved even if it's empty. - In that case, `intersect ./. ./foo` should only include files and no directories themselves, since `./.` includes only `./.` as a directory, and same for `./foo`, so there's no overlap in directories. + In that case, `intersection ./. ./foo` should only include files and no directories themselves, since `./.` includes only `./.` as a directory, and same for `./foo`, so there's no overlap in directories. But intuitively this operation should result in the same as `./foo` – everything else is just confusing. - (+) This matches how Git only supports files, so developers should already be used to it. - (-) Empty directories (even if they contain nested directories) are neither representable nor preserved when coercing from paths. 
@@ -144,7 +205,7 @@ File sets do not support Nix store paths in strings such as `"/nix/store/...-sou Arguments: - (+) Such paths are usually produced by derivations, which means `toSource` would either: - - Require IFD if `builtins.path` is used as the underlying primitive + - Require [Import From Derivation](https://nixos.org/manual/nix/unstable/language/import-from-derivation) (IFD) if `builtins.path` is used as the underlying primitive - Require importing the entire `root` into the store such that derivations can be used to do the filtering - (+) The convenient path coercion like `union ./foo ./bar` wouldn't work for absolute paths, requiring more verbose alternate interfaces: - `let root = "/nix/store/...-source"; in union "${root}/foo" "${root}/bar"` @@ -180,6 +241,5 @@ Here's a list of places in the library that need to be updated in the future: - > The file set library is currently somewhat limited but is being expanded to include more functions over time. in [the manual](../../doc/functions/fileset.section.md) -- Once a tracing function exists, `__noEval` in [internal.nix](./internal.nix) should mention it - If/Once a function to convert `lib.sources` values into file sets exists, the `_coerce` and `toSource` functions should be updated to mention that function in the error when such a value is passed - If/Once a function exists that can optionally include a path depending on whether it exists, the error message for the path not existing in `_coerce` should mention the new function diff --git a/nixpkgs/lib/fileset/default.nix b/nixpkgs/lib/fileset/default.nix index 88c8dcd1a70b..7bd701670386 100644 --- a/nixpkgs/lib/fileset/default.nix +++ b/nixpkgs/lib/fileset/default.nix @@ -6,16 +6,20 @@ let _coerceMany _toSourceFilter _unionMany + _printFileset + _intersection ; inherit (builtins) isList isPath pathExists + seq typeOf ; inherit (lib.lists) + elemAt imap0 ; @@ -156,7 +160,7 @@ If a directory does not recursively contain any file, it is omitted from the 
sto lib.fileset.toSource: `root` is of type ${typeOf root}, but it should be a path instead.'' # Currently all Nix paths have the same filesystem root, but this could change in the future. # See also ../path/README.md - else if rootFilesystemRoot != filesetFilesystemRoot then + else if ! fileset._internalIsEmptyWithoutBase && rootFilesystemRoot != filesetFilesystemRoot then throw '' lib.fileset.toSource: Filesystem roots are not the same for `fileset` and `root` ("${toString root}"): `root`: root "${toString rootFilesystemRoot}" @@ -170,7 +174,7 @@ If a directory does not recursively contain any file, it is omitted from the sto lib.fileset.toSource: `root` (${toString root}) is a file, but it should be a directory instead. Potential solutions: - If you want to import the file into the store _without_ a containing directory, use string interpolation or `builtins.path` instead of this function. - If you want to import the file into the store _with_ a containing directory, set `root` to the containing directory, such as ${toString (dirOf root)}, and set `fileset` to the file path.'' - else if ! hasPrefix root fileset._internalBase then + else if ! fileset._internalIsEmptyWithoutBase && ! hasPrefix root fileset._internalBase then throw '' lib.fileset.toSource: `fileset` could contain files in ${toString fileset._internalBase}, which is not under the `root` (${toString root}). Potential solutions: - Set `root` to ${toString fileset._internalBase} or any directory higher up. This changes the layout of the resulting store path. @@ -258,15 +262,11 @@ If a directory does not recursively contain any file, it is omitted from the sto */ unions = # A list of file sets. - # Must contain at least 1 element. # The elements can also be paths, # which get [implicitly coerced to file sets](#sec-fileset-path-coercion). filesets: if ! isList filesets then throw "lib.fileset.unions: Expected argument to be a list, but got a ${typeOf filesets}." 
- else if filesets == [ ] then - # TODO: This could be supported, but requires an extra internal representation for the empty file set, which would be special for not having a base path. - throw "lib.fileset.unions: Expected argument to be a list with at least one element, but it contains no elements." else pipe filesets [ # Annotate the elements with context, used by _coerceMany for better errors @@ -278,4 +278,132 @@ If a directory does not recursively contain any file, it is omitted from the sto _unionMany ]; + /* + The file set containing all files that are in both of two given file sets. + See also [Intersection (set theory)](https://en.wikipedia.org/wiki/Intersection_(set_theory)). + + The given file sets are evaluated as lazily as possible, + with the first argument being evaluated first if needed. + + Type: + intersection :: FileSet -> FileSet -> FileSet + + Example: + # Limit the selected files to the ones in ./., so only ./src and ./Makefile + intersection ./. (unions [ ../LICENSE ./src ./Makefile ]) + */ + intersection = + # The first file set. + # This argument can also be a path, + # which gets [implicitly coerced to a file set](#sec-fileset-path-coercion). + fileset1: + # The second file set. + # This argument can also be a path, + # which gets [implicitly coerced to a file set](#sec-fileset-path-coercion). + fileset2: + let + filesets = _coerceMany "lib.fileset.intersection" [ + { + context = "first argument"; + value = fileset1; + } + { + context = "second argument"; + value = fileset2; + } + ]; + in + _intersection + (elemAt filesets 0) + (elemAt filesets 1); + + /* + Incrementally evaluate and trace a file set in a pretty way. + This function is only intended for debugging purposes. + The exact tracing format is unspecified and may change. + + This function takes a final argument to return. + In comparison, [`traceVal`](#function-library-lib.fileset.traceVal) returns + the given file set argument. 
+ + This variant is useful for tracing file sets in the Nix repl. + + Type: + trace :: FileSet -> Any -> Any + + Example: + trace (unions [ ./Makefile ./src ./tests/run.sh ]) null + => + trace: /home/user/src/myProject + trace: - Makefile (regular) + trace: - src (all files in directory) + trace: - tests + trace: - run.sh (regular) + null + */ + trace = + /* + The file set to trace. + + This argument can also be a path, + which gets [implicitly coerced to a file set](#sec-fileset-path-coercion). + */ + fileset: + let + # "fileset" would be a better name, but that would clash with the argument name, + # and we cannot change that because of https://github.com/nix-community/nixdoc/issues/76 + actualFileset = _coerce "lib.fileset.trace: argument" fileset; + in + seq + (_printFileset actualFileset) + (x: x); + + /* + Incrementally evaluate and trace a file set in a pretty way. + This function is only intended for debugging purposes. + The exact tracing format is unspecified and may change. + + This function returns the given file set. + In comparison, [`trace`](#function-library-lib.fileset.trace) takes another argument to return. + + This variant is useful for tracing file sets passed as arguments to other functions. + + Type: + traceVal :: FileSet -> FileSet + + Example: + toSource { + root = ./.; + fileset = traceVal (unions [ + ./Makefile + ./src + ./tests/run.sh + ]); + } + => + trace: /home/user/src/myProject + trace: - Makefile (regular) + trace: - src (all files in directory) + trace: - tests + trace: - run.sh (regular) + "/nix/store/...-source" + */ + traceVal = + /* + The file set to trace and return. + + This argument can also be a path, + which gets [implicitly coerced to a file set](#sec-fileset-path-coercion). 
+ */ + fileset: + let + # "fileset" would be a better name, but that would clash with the argument name, + # and we cannot change that because of https://github.com/nix-community/nixdoc/issues/76 + actualFileset = _coerce "lib.fileset.traceVal: argument" fileset; + in + seq + (_printFileset actualFileset) + # We could also return the original fileset argument here, + # but that would then duplicate work for consumers of the fileset, because then they have to coerce it again + actualFileset; } diff --git a/nixpkgs/lib/fileset/internal.nix b/nixpkgs/lib/fileset/internal.nix index 2c329edb390d..9892172955c3 100644 --- a/nixpkgs/lib/fileset/internal.nix +++ b/nixpkgs/lib/fileset/internal.nix @@ -7,11 +7,14 @@ let isString pathExists readDir - typeOf + seq split + trace + typeOf ; inherit (lib.attrsets) + attrNames attrValues mapAttrs setAttrByPath @@ -28,6 +31,7 @@ let drop elemAt filter + findFirst findFirstIndex foldl' head @@ -64,7 +68,7 @@ rec { # - Increment this version # - Add an additional migration function below # - Update the description of the internal representation in ./README.md - _currentVersion = 2; + _currentVersion = 3; # Migrations between versions. The 0th element converts from v0 to v1, and so on migrations = [ @@ -89,8 +93,38 @@ rec { _internalVersion = 2; } ) + + # Convert v2 into v3: filesetTree's now have a representation for an empty file set without a base path + ( + filesetV2: + filesetV2 // { + # All v1 file sets are not the new empty file set + _internalIsEmptyWithoutBase = false; + _internalVersion = 3; + } + ) ]; + _noEvalMessage = '' + lib.fileset: Directly evaluating a file set is not supported. + To turn it into a usable source, use `lib.fileset.toSource`. + To pretty-print the contents, use `lib.fileset.trace` or `lib.fileset.traceVal`.''; + + # The empty file set without a base path + _emptyWithoutBase = { + _type = "fileset"; + + _internalVersion = _currentVersion; + + # The one and only! 
+ _internalIsEmptyWithoutBase = true; + + # Due to alphabetical ordering, this is evaluated last, + # which makes the nix repl output nicer than if it would be ordered first. + # It also allows evaluating it strictly up to this error, which could be useful + _noEval = throw _noEvalMessage; + }; + # Create a fileset, see ./README.md#fileset # Type: path -> filesetTree -> fileset _create = base: tree: @@ -103,14 +137,17 @@ rec { _type = "fileset"; _internalVersion = _currentVersion; + + _internalIsEmptyWithoutBase = false; _internalBase = base; _internalBaseRoot = parts.root; _internalBaseComponents = components parts.subpath; _internalTree = tree; - # Double __ to make it be evaluated and ordered first - __noEval = throw '' - lib.fileset: Directly evaluating a file set is not supported. Use `lib.fileset.toSource` to turn it into a usable source instead.''; + # Due to alphabetical ordering, this is evaluated last, + # which makes the nix repl output nicer than if it would be ordered first. + # It also allows evaluating it strictly up to this error, which could be useful + _noEval = throw _noEvalMessage; }; # Coerce a value to a fileset, erroring when the value cannot be coerced. @@ -135,11 +172,11 @@ rec { else if ! isPath value then if isStringLike value then throw '' - ${context} ("${toString value}") is a string-like value, but it should be a path instead. + ${context} ("${toString value}") is a string-like value, but it should be a file set or a path instead. Paths represented as strings are not supported by `lib.fileset`, use `lib.sources` or derivations instead.'' else throw '' - ${context} is of type ${typeOf value}, but it should be a path instead.'' + ${context} is of type ${typeOf value}, but it should be a file set or a path instead.'' else if ! 
pathExists value then throw '' ${context} (${toString value}) does not exist.'' @@ -155,14 +192,20 @@ rec { _coerce "${functionContext}: ${context}" value ) list; - firstBaseRoot = (head filesets)._internalBaseRoot; + # Find the first value with a base, there may be none! + firstWithBase = findFirst (fileset: ! fileset._internalIsEmptyWithoutBase) null filesets; + # This value is only accessed if first != null + firstBaseRoot = firstWithBase._internalBaseRoot; # Finds the first element with a filesystem root different than the first element, if any differentIndex = findFirstIndex (fileset: - firstBaseRoot != fileset._internalBaseRoot + # The empty value without a base doesn't have a base path + ! fileset._internalIsEmptyWithoutBase + && firstBaseRoot != fileset._internalBaseRoot ) null filesets; in - if differentIndex != null then + # Only evaluates `differentIndex` if there are any elements with a base + if firstWithBase != null && differentIndex != null then throw '' ${functionContext}: Filesystem roots are not the same: ${(head list).context}: root "${toString firstBaseRoot}" @@ -203,22 +246,22 @@ rec { // value; /* - Simplify a filesetTree recursively: - - Replace all directories that have no files with `null` + A normalisation of a filesetTree suitable filtering with `builtins.path`: + - Replace all directories that have no files with `null`. This removes directories that would be empty - - Replace all directories with all files with `"directory"` + - Replace all directories with all files with `"directory"`. 
This speeds up the source filter function Note that this function is strict, it evaluates the entire tree Type: Path -> filesetTree -> filesetTree */ - _simplifyTree = path: tree: + _normaliseTreeFilter = path: tree: if tree == "directory" || isAttrs tree then let entries = _directoryEntries path tree; - simpleSubtrees = mapAttrs (name: _simplifyTree (path + "/${name}")) entries; - subtreeValues = attrValues simpleSubtrees; + normalisedSubtrees = mapAttrs (name: _normaliseTreeFilter (path + "/${name}")) entries; + subtreeValues = attrValues normalisedSubtrees; in # This triggers either when all files in a directory are filtered out # Or when the directory doesn't contain any files at all @@ -228,10 +271,112 @@ rec { else if all isString subtreeValues then "directory" else - simpleSubtrees + normalisedSubtrees else tree; + /* + A minimal normalisation of a filesetTree, intended for pretty-printing: + - If all children of a path are recursively included or empty directories, the path itself is also recursively included + - If all children of a path are fully excluded or empty directories, the path itself is an empty directory + - Other empty directories are represented with the special "emptyDir" string + While these could be replaced with `null`, that would take another mapAttrs + + Note that this function is partially lazy. + + Type: Path -> filesetTree -> filesetTree (with "emptyDir"'s) + */ + _normaliseTreeMinimal = path: tree: + if tree == "directory" || isAttrs tree then + let + entries = _directoryEntries path tree; + normalisedSubtrees = mapAttrs (name: _normaliseTreeMinimal (path + "/${name}")) entries; + subtreeValues = attrValues normalisedSubtrees; + in + # If there are no entries, or all entries are empty directories, return "emptyDir". 
+ # After this branch we know that there's at least one file + if all (value: value == "emptyDir") subtreeValues then + "emptyDir" + + # If all subtrees are fully included or empty directories + # (both of which are coincidentally represented as strings), return "directory". + # This takes advantage of the fact that empty directories can be represented as included directories. + # Note that the tree == "directory" check allows avoiding recursion + else if tree == "directory" || all (value: isString value) subtreeValues then + "directory" + + # If all subtrees are fully excluded or empty directories, return null. + # This takes advantage of the fact that empty directories can be represented as excluded directories + else if all (value: isNull value || value == "emptyDir") subtreeValues then + null + + # Mix of included and excluded entries + else + normalisedSubtrees + else + tree; + + # Trace a filesetTree in a pretty way when the resulting value is evaluated. + # This can handle both normal filesetTree's, and ones returned from _normaliseTreeMinimal + # Type: Path -> filesetTree (with "emptyDir"'s) -> Null + _printMinimalTree = base: tree: + let + treeSuffix = tree: + if isAttrs tree then + "" + else if tree == "directory" then + " (all files in directory)" + else + # This does "leak" the file type strings of the internal representation, + # but this is the main reason these file type strings even are in the representation! + # TODO: Consider removing that information from the internal representation for performance. 
+ # The file types can still be printed by querying them only during tracing + " (${tree})"; + + # Only for attribute set trees + traceTreeAttrs = prevLine: indent: tree: + foldl' (prevLine: name: + let + subtree = tree.${name}; + + # Evaluating this prints the line for this subtree + thisLine = + trace "${indent}- ${name}${treeSuffix subtree}" prevLine; + in + if subtree == null || subtree == "emptyDir" then + # Don't print anything at all if this subtree is empty + prevLine + else if isAttrs subtree then + # A directory with explicit entries + # Do print this node, but also recurse + traceTreeAttrs thisLine "${indent} " subtree + else + # Either a file, or a recursively included directory + # Do print this node but no further recursion needed + thisLine + ) prevLine (attrNames tree); + + # Evaluating this will print the first line + firstLine = + if tree == null || tree == "emptyDir" then + trace "(empty)" null + else + trace "${toString base}${treeSuffix tree}" null; + in + if isAttrs tree then + traceTreeAttrs firstLine "" tree + else + firstLine; + + # Pretty-print a file set in a pretty way when the resulting value is evaluated + # Type: fileset -> Null + _printFileset = fileset: + if fileset._internalIsEmptyWithoutBase then + trace "(empty)" null + else + _printMinimalTree fileset._internalBase + (_normaliseTreeMinimal fileset._internalBase fileset._internalTree); + # Turn a fileset into a source filter function suitable for `builtins.path` # Only directories recursively containing at least one files are recursed into # Type: Path -> fileset -> (String -> String -> Bool) @@ -239,7 +384,7 @@ rec { let # Simplify the tree, necessary to make sure all empty directories are null # which has the effect that they aren't included in the result - tree = _simplifyTree fileset._internalBase fileset._internalTree; + tree = _normaliseTreeFilter fileset._internalBase fileset._internalTree; # The base path as a string with a single trailing slash baseString = @@ -311,17 
+456,59 @@ rec { # Special case because the code below assumes that the _internalBase is always included in the result # which shouldn't be done when we have no files at all in the base # This also forces the tree before returning the filter, leads to earlier error messages - if tree == null then + if fileset._internalIsEmptyWithoutBase || tree == null then empty else nonEmpty; + # Transforms the filesetTree of a file set to a shorter base path, e.g. + # _shortenTreeBase [ "foo" ] (_create /foo/bar null) + # => { bar = null; } + _shortenTreeBase = targetBaseComponents: fileset: + let + recurse = index: + # If we haven't reached the required depth yet + if index < length fileset._internalBaseComponents then + # Create an attribute set and recurse as the value, this can be lazily evaluated this way + { ${elemAt fileset._internalBaseComponents index} = recurse (index + 1); } + else + # Otherwise we reached the appropriate depth, here's the original tree + fileset._internalTree; + in + recurse (length targetBaseComponents); + + # Transforms the filesetTree of a file set to a longer base path, e.g. 
+ # _lengthenTreeBase [ "foo" "bar" ] (_create /foo { bar.baz = "regular"; }) + # => { baz = "regular"; } + _lengthenTreeBase = targetBaseComponents: fileset: + let + recurse = index: tree: + # If the filesetTree is an attribute set and we haven't reached the required depth yet + if isAttrs tree && index < length targetBaseComponents then + # Recurse with the tree under the right component (which might not exist) + recurse (index + 1) (tree.${elemAt targetBaseComponents index} or null) + else + # For all values here we can just return the tree itself: + # tree == null -> the result is also null, everything is excluded + # tree == "directory" -> the result is also "directory", + # because the base path is always a directory and everything is included + # isAttrs tree -> the result is `tree` + # because we don't need to recurse any more since `index == length longestBaseComponents` + tree; + in + recurse (length fileset._internalBaseComponents) fileset._internalTree; + # Computes the union of a list of filesets. # The filesets must already be coerced and validated to be in the same filesystem root # Type: [ Fileset ] -> Fileset _unionMany = filesets: let - first = head filesets; + # All filesets that have a base, aka not the ones that are the empty value without a base + filesetsWithBase = filter (fileset: ! fileset._internalIsEmptyWithoutBase) filesets; + + # The first fileset that has a base. + # This value is only accessed if there are at all. + firstWithBase = head filesetsWithBase; # To be able to union filesetTree's together, they need to have the same base path. # Base paths can be unioned by taking their common prefix, @@ -332,14 +519,14 @@ rec { # so this cannot cause a stack overflow due to a build-up of unevaluated thunks. 
commonBaseComponents = foldl' (components: el: commonPrefix components el._internalBaseComponents) - first._internalBaseComponents + firstWithBase._internalBaseComponents # We could also not do the `tail` here to avoid a list allocation, # but then we'd have to pay for a potentially expensive # but unnecessary `commonPrefix` call - (tail filesets); + (tail filesetsWithBase); # The common base path assembled from a filesystem root and the common components - commonBase = append first._internalBaseRoot (join commonBaseComponents); + commonBase = append firstWithBase._internalBaseRoot (join commonBaseComponents); # A list of filesetTree's that all have the same base path # This is achieved by nesting the trees into the components they have over the common base path @@ -347,18 +534,18 @@ rec { # So the tree under `/foo/bar` gets nested under `{ bar = ...; ... }`, # while the tree under `/foo/baz` gets nested under `{ baz = ...; ... }` # Therefore allowing combined operations over them. - trees = map (fileset: - setAttrByPath - (drop (length commonBaseComponents) fileset._internalBaseComponents) - fileset._internalTree - ) filesets; + trees = map (_shortenTreeBase commonBaseComponents) filesetsWithBase; # Folds all trees together into a single one using _unionTree # We do not use a fold here because it would cause a thunk build-up # which could cause a stack overflow for a large number of trees resultTree = _unionTrees trees; in - _create commonBase resultTree; + # If there's no values with a base, we have no files + if filesetsWithBase == [ ] then + _emptyWithoutBase + else + _create commonBase resultTree; # The union of multiple filesetTree's with the same base path. # Later elements are only evaluated if necessary. @@ -379,4 +566,76 @@ rec { # The non-null elements have to be attribute sets representing partial trees # We need to recurse into those zipAttrsWith (name: _unionTrees) withoutNull; + + # Computes the intersection of a list of filesets. 
+  # The filesets must already be coerced and validated to be in the same filesystem root
+  # Type: Fileset -> Fileset -> Fileset
+  _intersection = fileset1: fileset2:
+    let
+      # The common base components prefix, e.g.
+      #    (/foo/bar, /foo/bar/baz) -> /foo/bar
+      #    (/foo/bar, /foo/baz) -> /foo
+      commonBaseComponentsLength =
+        # TODO: Have a `lib.lists.commonPrefixLength` function such that we don't need the list allocation from commonPrefix here
+        length (
+          commonPrefix
+            fileset1._internalBaseComponents
+            fileset2._internalBaseComponents
+        );
+
+      # To be able to intersect filesetTree's together, they need to have the same base path.
+      # Base paths can be intersected by taking the longest one (if any)
+
+      # The fileset with the longest base, if any, e.g.
+      #    (/foo/bar, /foo/bar/baz) -> /foo/bar/baz
+      #    (/foo/bar, /foo/baz) -> null
+      longestBaseFileset =
+        if commonBaseComponentsLength == length fileset1._internalBaseComponents then
+          # The common prefix is the same as the first path, so the second path is equal or longer
+          fileset2
+        else if commonBaseComponentsLength == length fileset2._internalBaseComponents then
+          # The common prefix is the same as the second path, so the first path is longer
+          fileset1
+        else
+          # The common prefix is neither the first nor the second path
+          # This means there's no overlap between the two sets
+          null;
+
+      # Whether the result should be the empty value without a base
+      resultIsEmptyWithoutBase =
+        # If either fileset is the empty fileset without a base, the intersection is too
+        fileset1._internalIsEmptyWithoutBase
+        || fileset2._internalIsEmptyWithoutBase
+        # If there is no overlap between the base paths
+        || longestBaseFileset == null;
+
+      # Lengthen each fileset's tree to the longest base prefix
+      tree1 = _lengthenTreeBase longestBaseFileset._internalBaseComponents fileset1;
+      tree2 = _lengthenTreeBase longestBaseFileset._internalBaseComponents fileset2;
+
+      # With two filesetTree's with the same base, we can compute their intersection
+      resultTree = _intersectTree tree1 tree2;
+    in
+    if resultIsEmptyWithoutBase then
+      _emptyWithoutBase
+    else
+      _create longestBaseFileset._internalBase resultTree;
+
+  # The intersection of two filesetTree's with the same base path
+  # The second element is only evaluated as much as necessary.
+  # Type: filesetTree -> filesetTree -> filesetTree
+  _intersectTree = lhs: rhs:
+    if isAttrs lhs && isAttrs rhs then
+      # Both sides are attribute sets, we can recurse for the attributes existing on both sides
+      mapAttrs
+        (name: _intersectTree lhs.${name})
+        (builtins.intersectAttrs lhs rhs)
+    else if lhs == null || isString rhs then
+      # If the lhs is null, the result should also be null
+      # And if the rhs is the identity element
+      # (a string, aka it includes everything), then it's also the lhs
+      lhs
+    else
+      # In all other cases it's the rhs
+      rhs;
 }
diff --git a/nixpkgs/lib/fileset/tests.sh b/nixpkgs/lib/fileset/tests.sh
index 0ea96859e7a3..529f23ae8871 100755
--- a/nixpkgs/lib/fileset/tests.sh
+++ b/nixpkgs/lib/fileset/tests.sh
@@ -57,18 +57,35 @@ with lib.fileset;'
 expectEqual() {
     local actualExpr=$1
     local expectedExpr=$2
-    if ! actualResult=$(nix-instantiate --eval --strict --show-trace \
+    if actualResult=$(nix-instantiate --eval --strict --show-trace 2>"$tmp"/actualStderr \
         --expr "$prefixExpression ($actualExpr)"); then
-        die "$actualExpr failed to evaluate, but it was expected to succeed"
+        actualExitCode=$?
+    else
+        actualExitCode=$?
     fi
-    if ! expectedResult=$(nix-instantiate --eval --strict --show-trace \
+    actualStderr=$(< "$tmp"/actualStderr)
+
+    if expectedResult=$(nix-instantiate --eval --strict --show-trace 2>"$tmp"/expectedStderr \
         --expr "$prefixExpression ($expectedExpr)"); then
-        die "$expectedExpr failed to evaluate, but it was expected to succeed"
+        expectedExitCode=$?
+    else
+        expectedExitCode=$?
+    fi
+    expectedStderr=$(< "$tmp"/expectedStderr)
+
+    if [[ "$actualExitCode" != "$expectedExitCode" ]]; then
+        echo "$actualStderr" >&2
+        echo "$actualResult" >&2
+        die "$actualExpr should have exited with $expectedExitCode, but it exited with $actualExitCode"
     fi
 
     if [[ "$actualResult" != "$expectedResult" ]]; then
         die "$actualExpr should have evaluated to $expectedExpr:\n$expectedResult\n\nbut it evaluated to\n$actualResult"
     fi
+
+    if [[ "$actualStderr" != "$expectedStderr" ]]; then
+        die "$actualExpr should have had this on stderr:\n$expectedStderr\n\nbut it was\n$actualStderr"
+    fi
 }
 
 # Check that a nix expression evaluates successfully to a store path and returns it (without quotes).
@@ -84,14 +101,14 @@ expectStorePath() {
     crudeUnquoteJSON <<< "$result"
 }
 
-# Check that a nix expression fails to evaluate (strictly, coercing to json, read-write-mode).
+# Check that a nix expression fails to evaluate (strictly, read-write-mode).
 # And check the received stderr against a regex
 # The expression has `lib.fileset` in scope.
 # Usage: expectFailure NIX REGEX
 expectFailure() {
     local expr=$1
     local expectedErrorRegex=$2
-    if result=$(nix-instantiate --eval --strict --json --read-write-mode --show-trace 2>"$tmp/stderr" \
+    if result=$(nix-instantiate --eval --strict --read-write-mode --show-trace 2>"$tmp/stderr" \
         --expr "$prefixExpression $expr"); then
         die "$expr evaluated successfully to $result, but it was expected to fail"
     fi
@@ -101,16 +118,112 @@ expectFailure() {
     fi
 }
 
-# We conditionally use inotifywait in checkFileset.
+# Check that the traces of a Nix expression are as expected when evaluated.
+# The expression has `lib.fileset` in scope.
+# Usage: expectTrace NIX STR
+expectTrace() {
+    local expr=$1
+    local expectedTrace=$2
+
+    nix-instantiate --eval --show-trace >/dev/null 2>"$tmp"/stderrTrace \
+        --expr "$prefixExpression trace ($expr)" || true
+
+    actualTrace=$(sed -n 's/^trace: //p' "$tmp/stderrTrace")
+
+    nix-instantiate --eval --show-trace >/dev/null 2>"$tmp"/stderrTraceVal \
+        --expr "$prefixExpression traceVal ($expr)" || true
+
+    actualTraceVal=$(sed -n 's/^trace: //p' "$tmp/stderrTraceVal")
+
+    # Test that traceVal returns the same trace as trace
+    if [[ "$actualTrace" != "$actualTraceVal" ]]; then
+        cat "$tmp"/stderrTrace >&2
+        die "$expr traced this for lib.fileset.trace:\n\n$actualTrace\n\nand something different for lib.fileset.traceVal:\n\n$actualTraceVal"
+    fi
+
+    if [[ "$actualTrace" != "$expectedTrace" ]]; then
+        cat "$tmp"/stderrTrace >&2
+        die "$expr should have traced this:\n\n$expectedTrace\n\nbut this was actually traced:\n\n$actualTrace"
+    fi
+}
+
+# We conditionally use inotifywait in withFileMonitor.
 # Check early whether it's available
 # TODO: Darwin support, though not crucial since we have Linux CI
 if type inotifywait 2>/dev/null >/dev/null; then
-    canMonitorFiles=1
+    canMonitor=1
 else
-    echo "Warning: Not checking that excluded files don't get accessed since inotifywait is not available" >&2
-    canMonitorFiles=
+    echo "Warning: Cannot check for paths not getting read since the inotifywait command (from the inotify-tools package) is not available" >&2
+    canMonitor=
 fi
 
+# Run a function while monitoring that it doesn't read certain paths
+# Usage: withFileMonitor FUNNAME PATH...
+# - FUNNAME should be a bash function that:
+#   - Performs some operation that should not read some paths
+#   - Delete the paths it shouldn't read without triggering any open events
+# - PATH... are the paths that should not get read
+#
+# This function outputs the same as FUNNAME
+withFileMonitor() {
+    local funName=$1
+    shift
+
+    # If we can't monitor files or have none to monitor, just run the function directly
+    if [[ -z "$canMonitor" ]] || (( "$#" == 0 )); then
+        "$funName"
+    else
+
+        # Use a subshell to start the coprocess in and use a trap to kill it when exiting the subshell
+        (
+            # Assigned by coproc, makes shellcheck happy
+            local watcher watcher_PID
+
+            # Start inotifywait in the background to monitor all excluded paths
+            coproc watcher {
+                # inotifywait outputs a string on stderr when ready
+                # Redirect it to stdout so we can access it from the coproc's stdout fd
+                # exec so that the coprocess is inotify itself, making the kill below work correctly
+                # See below why we listen to both open and delete_self events
+                exec inotifywait --format='%e %w' --event open,delete_self --monitor "$@" 2>&1
+            }
+
+            # This will trigger when this subshell exits, no matter if successful or not
+            # After exiting the subshell, the parent shell will continue executing
+            trap 'kill "${watcher_PID}"' exit
+
+            # Synchronously wait until inotifywait is ready
+            while read -r -u "${watcher[0]}" line && [[ "$line" != "Watches established." ]]; do
+                :
+            done
+
+            # Call the function that should not read the given paths and delete them afterwards
+            "$funName"
+
+            # Get the first event
+            read -r -u "${watcher[0]}" event file
+
+            # With funName potentially reading files first before deleting them,
+            # there's only these two possible event timelines:
+            # - open*, ..., open*, delete_self, ..., delete_self: If some excluded paths were read
+            # - delete_self, ..., delete_self: If no excluded paths were read
+            # So by looking at the first event we can figure out which one it is!
+            # This also means we don't have to wait to collect all events.
+            case "$event" in
+                OPEN*)
+                    die "$funName opened excluded file $file when it shouldn't have"
+                    ;;
+                DELETE_SELF)
+                    # Expected events
+                    ;;
+                *)
+                    die "During $funName, Unexpected event type '$event' on file $file that should be excluded"
+                    ;;
+            esac
+        )
+    fi
+}
+
 # Check whether a file set includes/excludes declared paths as expected, usage:
 #
 #     tree=(
@@ -120,7 +233,7 @@ fi
 #     )
 #     checkFileset './a' # Pass the fileset as the argument
 declare -A tree
-checkFileset() (
+checkFileset() {
     # New subshell so that we can have a separate trap handler, see `trap` below
     local fileset=$1
 
@@ -168,54 +281,21 @@ checkFileset() (
         touch "${filesToCreate[@]}"
     fi
 
-    # Start inotifywait in the background to monitor all excluded files (if any)
-    if [[ -n "$canMonitorFiles" ]] && (( "${#excludedFiles[@]}" != 0 )); then
-        coproc watcher {
-            # inotifywait outputs a string on stderr when ready
-            # Redirect it to stdout so we can access it from the coproc's stdout fd
-            # exec so that the coprocess is inotify itself, making the kill below work correctly
-            # See below why we listen to both open and delete_self events
-            exec inotifywait --format='%e %w' --event open,delete_self --monitor "${excludedFiles[@]}" 2>&1
-        }
-
-        # This will trigger when this subshell exits, no matter if successful or not
-        # After exiting the subshell, the parent shell will continue executing
-        # shellcheck disable=SC2154
-        trap 'kill "${watcher_PID}"' exit
-
-        # Synchronously wait until inotifywait is ready
-        while read -r -u "${watcher[0]}" line && [[ "$line" != "Watches established." ]]; do
-            :
-        done
-    fi
-
-    # Call toSource with the fileset, triggering open events for all files that are added to the store
     expression="toSource { root = ./.; fileset = $fileset; }"
-    storePath=$(expectStorePath "$expression")
 
-    # Remove all files immediately after, triggering delete_self events for all of them
-    rm -rf -- *
+    # We don't have lambda's in bash unfortunately,
+    # so we just define a function instead and then pass its name
+    # shellcheck disable=SC2317
+    run() {
+        # Call toSource with the fileset, triggering open events for all files that are added to the store
+        expectStorePath "$expression"
+        if (( ${#excludedFiles[@]} != 0 )); then
+            rm "${excludedFiles[@]}"
+        fi
+    }
 
-    # Only check for the inotify events if we actually started inotify earlier
-    if [[ -v watcher ]]; then
-        # Get the first event
-        read -r -u "${watcher[0]}" event file
-
-        # There's only these two possible event timelines:
-        # - open, ..., open, delete_self, ..., delete_self: If some excluded files were read
-        # - delete_self, ..., delete_self: If no excluded files were read
-        # So by looking at the first event we can figure out which one it is!
-        case "$event" in
-            OPEN)
-                die "$expression opened excluded file $file when it shouldn't have"
-                ;;
-            DELETE_SELF)
-                # Expected events
-                ;;
-            *)
-                die "Unexpected event type '$event' on file $file that should be excluded"
-                ;;
-        esac
-    fi
+    # Runs the function while checking that the given excluded files aren't read
+    storePath=$(withFileMonitor run "${excludedFiles[@]}")
 
     # For each path that should be included, make sure it does occur in the resulting store path
     for p in "${included[@]}"; do
@@ -230,7 +310,9 @@ checkFileset() (
             die "$expression included path $p when it shouldn't have"
         fi
     done
-)
+
+    rm -rf -- *
+}
 
 
 #### Error messages #####
@@ -273,33 +355,40 @@ expectFailure 'toSource { root = ./a; fileset = ./.; }' 'lib.fileset.toSource: `
 rm -rf *
 
 # Path coercion only works for paths
-expectFailure 'toSource { root = ./.; fileset = 10; }' 'lib.fileset.toSource: `fileset` is of type int, but it should be a path instead.'
-expectFailure 'toSource { root = ./.; fileset = "/some/path"; }' 'lib.fileset.toSource: `fileset` \("/some/path"\) is a string-like value, but it should be a path instead.
+expectFailure 'toSource { root = ./.; fileset = 10; }' 'lib.fileset.toSource: `fileset` is of type int, but it should be a file set or a path instead.'
+expectFailure 'toSource { root = ./.; fileset = "/some/path"; }' 'lib.fileset.toSource: `fileset` \("/some/path"\) is a string-like value, but it should be a file set or a path instead.
 \s*Paths represented as strings are not supported by `lib.fileset`, use `lib.sources` or derivations instead.'
 
 # Path coercion errors for non-existent paths
 expectFailure 'toSource { root = ./.; fileset = ./a; }' 'lib.fileset.toSource: `fileset` \('"$work"'/a\) does not exist.'
 
 # File sets cannot be evaluated directly
-expectFailure 'union ./. ./.' 'lib.fileset: Directly evaluating a file set is not supported. Use `lib.fileset.toSource` to turn it into a usable source instead.'
+expectFailure 'union ./. ./.' 'lib.fileset: Directly evaluating a file set is not supported.
+\s*To turn it into a usable source, use `lib.fileset.toSource`.
+\s*To pretty-print the contents, use `lib.fileset.trace` or `lib.fileset.traceVal`.'
+expectFailure '_emptyWithoutBase' 'lib.fileset: Directly evaluating a file set is not supported.
+\s*To turn it into a usable source, use `lib.fileset.toSource`.
+\s*To pretty-print the contents, use `lib.fileset.trace` or `lib.fileset.traceVal`.'
 
 # Past versions of the internal representation are supported
 expectEqual '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 0; _internalBase = ./.; }' \
-    '{ _internalBase = ./.; _internalBaseComponents = path.subpath.components (path.splitRoot ./.).subpath; _internalBaseRoot = /.; _internalVersion = 2; _type = "fileset"; }'
+    '{ _internalBase = ./.; _internalBaseComponents = path.subpath.components (path.splitRoot ./.).subpath; _internalBaseRoot = /.; _internalIsEmptyWithoutBase = false; _internalVersion = 3; _type = "fileset"; }'
 expectEqual '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 1; }' \
-    '{ _type = "fileset"; _internalVersion = 2; }'
+    '{ _type = "fileset"; _internalIsEmptyWithoutBase = false; _internalVersion = 3; }'
+expectEqual '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 2; }' \
+    '{ _type = "fileset"; _internalIsEmptyWithoutBase = false; _internalVersion = 3; }'
 
 # Future versions of the internal representation are unsupported
-expectFailure '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 3; }' '<tests>: value is a file set created from a future version of the file set library with a different internal representation:
-\s*- Internal version of the file set: 3
-\s*- Internal version of the library: 2
+expectFailure '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 4; }' '<tests>: value is a file set created from a future version of the file set library with a different internal representation:
+\s*- Internal version of the file set: 4
+\s*- Internal version of the library: 3
 \s*Make sure to update your Nixpkgs to have a newer version of `lib.fileset`.'
 
 # _create followed by _coerce should give the inputs back without any validation
 expectEqual '{
   inherit (_coerce "<test>" (_create ./. "directory"))
     _internalVersion _internalBase _internalTree;
-}' '{ _internalBase = ./.; _internalTree = "directory"; _internalVersion = 2; }'
+}' '{ _internalBase = ./.; _internalTree = "directory"; _internalVersion = 3; }'
 
 #### Resulting store path ####
 
@@ -311,6 +400,12 @@ tree=(
 )
 checkFileset './.'
 
+# The empty value without a base should also result in an empty result
+tree=(
+    [a]=0
+)
+checkFileset '_emptyWithoutBase'
+
 # Directories recursively containing no files are not included
 tree=(
     [e/]=0
@@ -406,15 +501,32 @@ expectFailure 'toSource { root = ./.; fileset = union ./. ./b; }' 'lib.fileset.u
 expectFailure 'toSource { root = ./.; fileset = unions [ ./a ./. ]; }' 'lib.fileset.unions: element 0 \('"$work"'/a\) does not exist.'
 expectFailure 'toSource { root = ./.; fileset = unions [ ./. ./b ]; }' 'lib.fileset.unions: element 1 \('"$work"'/b\) does not exist.'
 
-# unions needs a list with at least 1 element
+# unions needs a list
 expectFailure 'toSource { root = ./.; fileset = unions null; }' 'lib.fileset.unions: Expected argument to be a list, but got a null.'
-expectFailure 'toSource { root = ./.; fileset = unions [ ]; }' 'lib.fileset.unions: Expected argument to be a list with at least one element, but it contains no elements.'
 
 # The tree of later arguments should not be evaluated if a former argument already includes all files
 tree=()
 checkFileset 'union ./. (_create ./. (abort "This should not be used!"))'
 checkFileset 'unions [ ./. (_create ./. (abort "This should not be used!")) ]'
 
+# unions doesn't include any files for an empty list or only empty values without a base
+tree=(
+    [x]=0
+    [y/z]=0
+)
+checkFileset 'unions [ ]'
+checkFileset 'unions [ _emptyWithoutBase ]'
+checkFileset 'unions [ _emptyWithoutBase _emptyWithoutBase ]'
+checkFileset 'union _emptyWithoutBase _emptyWithoutBase'
+
+# The empty value without a base is the left and right identity of union
+tree=(
+    [x]=1
+    [y/z]=0
+)
+checkFileset 'union ./x _emptyWithoutBase'
+checkFileset 'union _emptyWithoutBase ./x'
+
 # union doesn't include files that weren't specified
 tree=(
     [x]=1
@@ -467,12 +579,249 @@ for i in $(seq 1000); do
     tree[$i/a]=1
     tree[$i/b]=0
 done
-(
-    # Locally limit the maximum stack size to 100 * 1024 bytes
-    # If unions was implemented recursively, this would stack overflow
-    ulimit -s 100
-    checkFileset 'unions (mapAttrsToList (name: _: ./. + "/${name}/a") (builtins.readDir ./.))'
+# This is actually really hard to test:
+# A lot of files would be needed to cause a stack overflow.
+# And while we could limit the maximum stack size using `ulimit -s`,
+# that turns out to not be very deterministic: https://github.com/NixOS/nixpkgs/pull/256417#discussion_r1339396686.
+# Meanwhile, the test infra here is not the fastest, creating 10000 would be too slow.
+# So, just using 1000 files for now.
+checkFileset 'unions (mapAttrsToList (name: _: ./. + "/${name}/a") (builtins.readDir ./.))'
+
+
+## lib.fileset.intersection
+
+
+# Different filesystem roots in root and fileset are not supported
+mkdir -p {foo,bar}/mock-root
+expectFailure 'with ((import <nixpkgs/lib>).extend (import <nixpkgs/lib/fileset/mock-splitRoot.nix>)).fileset;
+  toSource { root = ./.; fileset = intersection ./foo/mock-root ./bar/mock-root; }
+' 'lib.fileset.intersection: Filesystem roots are not the same:
+\s*first argument: root "'"$work"'/foo/mock-root"
+\s*second argument: root "'"$work"'/bar/mock-root"
+\s*Different roots are not supported.'
+rm -rf -- *
+
+# Coercion errors show the correct context
+expectFailure 'toSource { root = ./.; fileset = intersection ./a ./.; }' 'lib.fileset.intersection: first argument \('"$work"'/a\) does not exist.'
+expectFailure 'toSource { root = ./.; fileset = intersection ./. ./b; }' 'lib.fileset.intersection: second argument \('"$work"'/b\) does not exist.'
+
+# The tree of later arguments should not be evaluated if a former argument already excludes all files
+tree=(
+    [a]=0
+)
+checkFileset 'intersection _emptyWithoutBase (_create ./. (abort "This should not be used!"))'
+# We don't have any combinators that can explicitly remove files yet, so we need to rely on internal functions to test this for now
+checkFileset 'intersection (_create ./. { a = null; }) (_create ./. { a = abort "This should not be used!"; })'
+
+# If either side is empty, the result is empty
+tree=(
+    [a]=0
+)
+checkFileset 'intersection _emptyWithoutBase _emptyWithoutBase'
+checkFileset 'intersection _emptyWithoutBase (_create ./. null)'
+checkFileset 'intersection (_create ./. null) _emptyWithoutBase'
+checkFileset 'intersection (_create ./. null) (_create ./. null)'
+
+# If the intersection base paths are not overlapping, the result is empty and has no base path
+mkdir a b c
+touch {a,b,c}/x
+expectEqual 'toSource { root = ./c; fileset = intersection ./a ./b; }' 'toSource { root = ./c; fileset = _emptyWithoutBase; }'
+rm -rf -- *
+
+# If the intersection exists, the resulting base path is the longest of them
+mkdir a
+touch x a/b
+expectEqual 'toSource { root = ./a; fileset = intersection ./a ./.; }' 'toSource { root = ./a; fileset = ./a; }'
+expectEqual 'toSource { root = ./a; fileset = intersection ./. ./a; }' 'toSource { root = ./a; fileset = ./a; }'
+rm -rf -- *
+
+# Also finds the intersection with null'd filesetTree's
+tree=(
+    [a]=0
+    [b]=1
+    [c]=0
 )
+checkFileset 'intersection (_create ./. { a = "regular"; b = "regular"; c = null; }) (_create ./. { a = null; b = "regular"; c = "regular"; })'
+
+# Actually computes the intersection between files
+tree=(
+    [a]=0
+    [b]=0
+    [c]=1
+    [d]=1
+    [e]=0
+    [f]=0
+)
+checkFileset 'intersection (unions [ ./a ./b ./c ./d ]) (unions [ ./c ./d ./e ./f ])'
+
+tree=(
+    [a/x]=0
+    [a/y]=0
+    [b/x]=1
+    [b/y]=1
+    [c/x]=0
+    [c/y]=0
+)
+checkFileset 'intersection ./b ./.'
+checkFileset 'intersection ./b (unions [ ./a/x ./a/y ./b/x ./b/y ./c/x ./c/y ])'
+
+# Complicated case
+tree=(
+    [a/x]=0
+    [a/b/i]=1
+    [c/d/x]=0
+    [c/d/f]=1
+    [c/x]=0
+    [c/e/i]=1
+    [c/e/j]=1
+)
+checkFileset 'intersection (unions [ ./a/b ./c/d ./c/e ]) (unions [ ./a ./c/d/f ./c/e ])'
+
+
+## Tracing
+
+# The second trace argument is returned
+expectEqual 'trace ./. "some value"' 'builtins.trace "(empty)" "some value"'
+
+# The fileset traceVal argument is returned
+expectEqual 'traceVal ./.' 'builtins.trace "(empty)" (_create ./. "directory")'
+
+# The tracing happens before the final argument is needed
+expectEqual 'trace ./.' 'builtins.trace "(empty)" (x: x)'
+
+# Tracing an empty directory shows it as such
+expectTrace './.' '(empty)'
+
+# This also works if there are directories, but all recursively without files
+mkdir -p a/b/c
+expectTrace './.' '(empty)'
+rm -rf -- *
+
+# The empty file set without a base also prints as empty
+expectTrace '_emptyWithoutBase' '(empty)'
+expectTrace 'unions [ ]' '(empty)'
+mkdir foo bar
+touch {foo,bar}/x
+expectTrace 'intersection ./foo ./bar' '(empty)'
+rm -rf -- *
+
+# If a directory is fully included, print it as such
+touch a
+expectTrace './.' "$work"' (all files in directory)'
+rm -rf -- *
+
+# If a directory is not fully included, recurse
+mkdir a b
+touch a/{x,y} b/{x,y}
+expectTrace 'union ./a/x ./b' "$work"'
+- a
+  - x (regular)
+- b (all files in directory)'
+rm -rf -- *
+
+# If an included path is a file, print its type
+touch a x
+ln -s a b
+mkfifo c
+expectTrace 'unions [ ./a ./b ./c ]' "$work"'
+- a (regular)
+- b (symlink)
+- c (unknown)'
+rm -rf -- *
+
+# Do not print directories without any files recursively
+mkdir -p a/b/c
+touch b x
+expectTrace 'unions [ ./a ./b ]' "$work"'
+- b (regular)'
+rm -rf -- *
+
+# If all children are either fully included or empty directories,
+# the parent should be printed as fully included
+touch a
+mkdir b
+expectTrace 'union ./a ./b' "$work"' (all files in directory)'
+rm -rf -- *
+
+mkdir -p x/b x/c
+touch x/a
+touch a
+# If all children are either fully excluded or empty directories,
+# the parent should be shown (or rather not shown) as fully excluded
+expectTrace 'unions [ ./a ./x/b ./x/c ]' "$work"'
+- a (regular)'
+rm -rf -- *
+
+# Completely filtered out directories also print as empty
+touch a
+expectTrace '_create ./. {}' '(empty)'
+rm -rf -- *
+
+# A general test to make sure the resulting format makes sense
+# Such as indentation and ordering
+mkdir -p bar/{qux,someDir}
+touch bar/{baz,qux,someDir/a} foo
+touch bar/qux/x
+ln -s x bar/qux/a
+mkfifo bar/qux/b
+expectTrace 'unions [
+  ./bar/baz
+  ./bar/qux/a
+  ./bar/qux/b
+  ./bar/someDir/a
+  ./foo
+]' "$work"'
+- bar
+  - baz (regular)
+  - qux
+    - a (symlink)
+    - b (unknown)
+  - someDir (all files in directory)
+- foo (regular)'
+rm -rf -- *
+
+# For recursively included directories,
+# `(all files in directory)` should only be used if there's at least one file (otherwise it would be `(empty)`)
+# and this should be determined without doing a full search
+#
+# a is intentionally ordered first here in order to allow triggering the short-circuit behavior
+# We then check that b is not read
+# In a more realistic scenario, some directories might need to be recursed into,
+# but a file would be quickly found to trigger the short-circuit.
+touch a
+mkdir b
+# We don't have lambda's in bash unfortunately,
+# so we just define a function instead and then pass its name
+# shellcheck disable=SC2317
+run() {
+    # This shouldn't read b/
+    expectTrace './.' "$work"' (all files in directory)'
+    # Remove all files immediately after, triggering delete_self events for all of them
+    rmdir b
+}
+# Runs the function while checking that b isn't read
+withFileMonitor run b
+rm -rf -- *
+
+# Partially included directories trace entries as they are evaluated
+touch a b c
+expectTrace '_create ./. { a = null; b = "regular"; c = throw "b"; }' "$work"'
+- b (regular)'
+
+# Except entries that need to be evaluated to even figure out if it's only partially included:
+# Here the directory could be fully excluded or included just from seeing a and b,
+# so c needs to be evaluated before anything can be traced
+expectTrace '_create ./. { a = null; b = null; c = throw "c"; }' ''
+expectTrace '_create ./. { a = "regular"; b = "regular"; c = throw "c"; }' ''
+rm -rf -- *
+
+# We can trace large directories (10000 here) without any problems
+filesToCreate=({0..9}{0..9}{0..9}{0..9})
+expectedTrace=$work$'\n'$(printf -- '- %s (regular)\n' "${filesToCreate[@]}")
+# We need an excluded file so it doesn't print as `(all files in directory)`
+touch 0 "${filesToCreate[@]}"
+expectTrace 'unions (mapAttrsToList (n: _: ./. + "/${n}") (removeAttrs (builtins.readDir ./.) [ "0" ]))' "$expectedTrace"
+rm -rf -- *
 
 # TODO: Once we have combinators and a property testing library, derive property tests from https://en.wikipedia.org/wiki/Algebra_of_sets
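The reworked `expectEqual` in this diff runs each `nix-instantiate` call as an `if` condition and records `$?` in both branches, so a failing evaluation no longer aborts the check and stdout, stderr and exit code can all be compared. The following is a minimal sketch of that capture pattern in isolation, not code from the commit: the `capture` helper, the `captured*` variables and the two demo functions are hypothetical names introduced only for illustration, and plain POSIX `sh` is assumed instead of the bash used by `tests.sh`.

```shell
#!/bin/sh
set -eu

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Hypothetical helper (not in tests.sh): run a command and record its
# stdout, stderr and exit code in capturedStdout, capturedStderr and capturedExitCode.
capture() {
    # A command used as an `if` condition does not trigger `set -e` on failure,
    # and `$?` holds its exit code at the start of either branch.
    if capturedStdout=$("$@" 2>"$tmp/stderr"); then
        capturedExitCode=$?
    else
        capturedExitCode=$?
    fi
    capturedStderr=$(cat "$tmp/stderr")
}

# Hypothetical stand-ins for the two nix-instantiate invocations
succeeds() { echo "value"; echo "trace: note" >&2; }
fails() { echo "error: boom" >&2; return 3; }

capture succeeds
echo "exit=$capturedExitCode stdout=$capturedStdout stderr=$capturedStderr"

capture fails
echo "exit=$capturedExitCode stdout=$capturedStdout stderr=$capturedStderr"
```

Recording all three values before comparing anything is what lets a test assert that two expressions fail in the same way, rather than only that both succeed.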