Is there a file that will always not exist?

Question

Along the lines of /dev/null (path to an empty source/sink file), is there a path that will never point to a valid file on at least Linux? This is mostly for testing purposes of some scripts I'm writing, and I don't want to just delete or move a file that doesn't belong to the script if it exists.

@WGroleau /.. is guaranteed to exist and to be the same as /. — Kusalananda, Apr 05 '21 at 08:43
What is the test? If it's a test for a regular file, use a pathname that is guaranteed to not be a regular file, like / — Kusalananda, Apr 05 '21 at 11:19
FWIW Debian uses /nonexistent as the home directory of some system users to make sure it's set to something that doesn't exist. But I've seen some poorly written commercial software installer creating those directories as a result (to place a customisation file in the home directory of all users even system ones!) — Stéphane Chazelas, Apr 05 '21 at 12:49
If you don't obey the Linux filename length limit (see Filename length limits on linux?), then you could do your test by checking a case where the filename (or filename+path) exceeds this limit. — αғsнιη, Apr 06 '21 at 05:13
@StéphaneChazelas: I'm tempted to respond to that sort of nonsense with sudo chattr +i /, but I imagine it might just crash if it can't create the directory that it's not supposed to create. — Kevin, Apr 07 '21 at 02:21
@StéphaneChazelas: I've seen /dev/null as home directory elsewhere. With udev it's hard for that to stay broken even if a bad script breaks it. — Joshua, Apr 07 '21 at 20:21

terdon · Accepted Answer · 2021-04-06T14:22:46.420

107

As an alternative, I would suggest that your script create a temporary directory, and then look for a file name in there. That way, you are 100% certain that the file doesn't exist, and you have full control and can easily clean up after yourself. Something like:

dir=$(mktemp -d)
if [ -e "$dir"/somefile ]; then
    echo "Something is seriously wrong here, '$dir/somefile' exists!"
fi
rmdir "$dir"

You can write the equivalent code in any language; the vast majority (all?) of higher-level languages will have some dedicated tool to handle creating and deleting temporary directories. This seems like a far safer and cleaner approach than trying to guess a file name that should not exist.
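A slightly more defensive variant of the snippet above, as a sketch only: it stops if mktemp fails and uses a trap so that the directory is removed even when the script exits early. The names `missing` and `somefile` are placeholders chosen for illustration.

```shell
# Create a private, empty temporary directory; stop if that fails.
dir=$(mktemp -d) || exit 1
# Remove the directory when the script exits, including early exits.
trap 'rm -rf -- "$dir"' EXIT

# Any name inside the fresh directory is a path that does not exist.
missing="$dir/somefile"

if [ -e "$missing" ]; then
    echo "unexpected: '$missing' exists" >&2
    exit 1
fi
echo "ok: '$missing' does not exist"
```

Using rm -rf in the trap rather than rmdir means cleanup still works if the tests later create files inside the directory.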

edited Apr 06 '21 at 14:22

answered Apr 04 '21 at 23:31

terdon

242,166

Comments are not for extended discussion; this conversation has been moved to chat. – terdon Apr 06 '21 at 14:23
4

(Note: if you're paranoid about malicious other processes creating a file in your new temp directory before you check it: 1. you probably have larger problems if something with your own UID or root is attacking you, 2. see revision history and chat for crazy corner-case ideas to make that less likely.) – Peter Cordes Apr 06 '21 at 18:45
@AlexeiKhlebnikov please take any further comments on edge cases or anything else not actually part of the question to the chat room created for that purpose. – terdon Apr 08 '21 at 08:06
In general, using a temporary directory to hold temporary test files is a good approach. All the temporary files can be reliably removed by recursively deleting the content of the temporary directory. – Raedwald May 01 '21 at 19:29

score 88 · Answer 2 · answered Apr 04 '21 at 23:43

/dev/null/foo cannot exist, unless /dev/null is a directory.

POSIX requires /dev/null to be "an empty data source and infinite data sink". I'm not sure if it's totally impossible to have a directory with these characteristics. Nevertheless I think it's pretty safe to assume /dev/null is not a directory in your *nix.

Note that if you try to open /dev/null/foo you will get ENOTDIR (not a directory), not ENOENT (no such file or directory). This may or may not be acceptable for your testing purposes.
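For a plain shell existence test the difference between ENOTDIR and ENOENT is not visible, since the test is simply false either way. A minimal sketch, assuming /dev/null is the usual character device (the name foo and the variable names are arbitrary):

```shell
path=/dev/null/foo

# Sanity check: the trick only holds while /dev/null is not a directory.
if [ -d /dev/null ]; then
    echo "warning: /dev/null is a directory on this system" >&2
fi

# The test is false whether the underlying error is ENOTDIR or ENOENT.
if [ -e "$path" ]; then
    result=exists
else
    result=missing
fi
echo "$path: $result"
```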

answered Apr 04 '21 at 23:43

Kamil Maciorowski

21,864

4

If you are root it used to be possible to remove the /dev/null special device and then recreate it as a directory. Did something similar once by accident - it took a while to figure out what had happened. – Thorbjørn Ravn Andersen Apr 05 '21 at 08:52
11

@ThorbjørnRavnAndersen True. But if you are root then you can break mktemp as well and thus render this other answer unreliable. To be clear: I'm not trying to discredit the other answer. In fact mktemp -d was in my (now deleted) comment. I commented at the time, not answered, because IMO mktemp -d is not "along the lines of /dev/null". At about the same time terdon made it an answer, fair and square. OTOH I think /dev/null/foo is "along the lines", hence my answer. Well, I guess my nitpicking wasted my chance of writing the best answer. – Kamil Maciorowski Apr 05 '21 at 10:59
3

This is not to devalue your answer, but to let future readers know of mysterious culprits that may be at work when things are behaving strangely. Another one can be files hiding in a directory that is a mount point. That disk space is tricky to find. – Thorbjørn Ravn Andersen Apr 05 '21 at 11:53
In a similar fashion, /bin/sh/foo is quite unlikely to exist, but POSIX does not seem to say much about where the shell is on your file system. – Jens Apr 05 '21 at 14:09
6

/dev/null/foo cannot exist, unless /dev/null is a directory. It can happen. I've had to fix systems after botched shell scripts run as root have done unspeakable things to /dev/null. So yes, I've seen /dev/null both as a regular file and as a directory. – Andrew Henle Apr 05 '21 at 16:30
3

If you don't need ENOENT, you can also get ENAMETOOLONG pretty easily by simply using a really long name. But /dev/null/foo is shorter, and works even on systems with huge or no limits. – Peter Cordes Apr 05 '21 at 18:39
@AndrewHenle such a symptom would only be possible on a corrupted FS, right? Or was your experience with an intact FS? – Ruslan Apr 06 '21 at 22:39
2

@Ruslan No. It's always been a perfectly fine file system. The problem is someone writes a shell script to collect some output, and being clever they remove the output file and either make a directory with that name and put files in that directory or create a new output file. Then someone else runs that script as root and uses /dev/null for the output file. Say goodbye to /dev/null... – Andrew Henle Apr 06 '21 at 23:02
I've never seen a hosed /dev/null but I've seen /dev/tty be a directory. – Joshua Apr 07 '21 at 20:11
1

@AndrewHenle I've seen someone do something similar, too (turn /dev/null into a regular file). And it was a damned hard bug to find. I remember having to use the Sherlock Holmes debugging technique to find it: "When you have eliminated the impossible, whatever remains, however improbable, must be the truth". – jrw32982 Apr 08 '21 at 03:15

ilkkachu · Answer 3 · 2021-04-06T19:32:14.793

Taking a page from any cryptography handbook, if you generate a large enough random string, it will only exist on any given system with such an insignificant probability that the possibility of a hit can be ignored.

E.g. assuming you have a working /dev/urandom (and you should), something like this would generate in f a valid filename based on a 128-bit random number:

f=/$(head -c 16 /dev/urandom |base64 |tr / ,)

The output is something like /B90sYd,aNrcw7d7Itcb8fQ==. (The leading slash is fixed and on purpose to make it an absolute path. The trailing == are also fixed and due to Base64 padding. They can be ignored.)

A system generating random file names at the rate of 1 trillion per second would need on the order of 10^19 years to work through the 2^128 (about 3.4 × 10^38) possible names and hit the one generated by your script. Note that not just any collision will do: it has to match this one specific name, so the birthday attack doesn't apply. This is basically the same as brute forcing a 128-bit symmetric key.

See also e.g.: How long would it take to brute force an AES-128 key?

Note that this requires a working /dev/urandom. It won't work if someone has replaced that with a static file containing some known string, like e.g. the three bytes \x86\x89\x9e, which when Base64-encoded, produce the string home; or if you're in e.g. a chrooted context where /dev/urandom isn't available. Also, e.g. embedded systems with no real means to initialize the system's RNG may face issues. Don't use this in situations like that, but also don't generate any cryptographic keys in situations like that.

As an alternative, you could possibly use the empty string. At least on Linux, trying to use it as a filename just gives an error:

$ cat ""
cat: '': No such file or directory
$ touch ""
touch: cannot touch '': No such file or directory

(As an aside, I find the error it gives (ENOENT) somewhat amusing. One might think it'd say the name is invalid, instead of that it doesn't exist.)

Note that if you put that in a variable, you really need to remember the quotes when expanding it! E.g. f=; cat $f would just read from stdin.

$ f=
$ cat "$f"
cat: '': No such file or directory
$ touch "$f"
touch: cannot touch '': No such file or directory

However, if you do cd "" in the shell, it just changes to the current directory. POSIX says that "If [the given path] is an empty string, the results are unspecified.". All shells I tried explicitly use the current path in the chdir() call, e.g.:

/tmp$ strace -etrace=chdir zsh -c 'cd ""'
chdir("/tmp")                           = 0
+++ exited with 0 +++
/tmp$

I got this idea based on an earlier (now deleted) answer. This may or may not be system-specific, I only tried on Linux. Caveat emptor.

Also, as mentioned in comments, if you don't care which exact error you get, or use something that doesn't even tell you, e.g. [ -f ... ] or [ -e ... ] in the shell, you could just create an over-long filename.

On pretty much all filesystems, the maximum length of a single file name is 255 bytes or less (see the table in Comparison of file systems on Wikipedia). A full path can be longer, but a single file name of 256 bytes is impossible, and gives ENAMETOOLONG:

$ f=$(printf %256s x | tr ' ' x)
$ touch "$f"
touch: cannot touch 'xxx...xxx': File name too long

But doing if [ -e "$f" ]; then ... works without error (and the test fails) in all shells I tried.
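As a sketch of the same idea without hard-coding 256, the limit can be queried with getconf NAME_MAX. Using /tmp as the directory to query and falling back to 255 are assumptions made for illustration, as are the variable names:

```shell
# Ask the system for the longest allowed file name in /tmp.
max=$(getconf NAME_MAX /tmp 2>/dev/null) || max=255
# getconf may print something non-numeric (e.g. "undefined"); fall back then too.
case $max in
    ''|*[!0-9]*) max=255 ;;
esac

# Build a name one byte longer than the limit.
f=$(printf "%$((max + 1))s" '' | tr ' ' x)

# The existence test is false; no file is created or touched.
if [ -e "/tmp/$f" ]; then
    result=exists
else
    result=missing
fi
echo "name length ${#f}: $result"
```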

(The POSIX definition says that -e is "False if pathname cannot be resolved", but doesn't explicitly mention diagnostics. So perhaps some implementation could give an error in some situation, I'm not sure. Do tell if you find such a case.)

(The table in Wikipedia does mention two Linux filesystems with a higher per-file length, but I doubt they're used much nowadays, and I also understand Linux has a 255-byte limit in general, regardless of the filesystem.)

I consider it good practice to always include quotes around expansions anyway. — Brian Drake, Apr 05 '21 at 11:35
I don't like relying too much on "this will most likely never happen". You might just be very unlucky some day. — Thorbjørn Ravn Andersen, Apr 05 '21 at 11:55
I do not like relying on this either, but we rely on it anyway. For example, as I understand it, Linux relies on a UUID to determine which partition it was started from. A collision here would presumably be catastrophic, unless Linux detects the collision and halts. — Brian Drake, Apr 05 '21 at 12:06
@ThorbjørnRavnAndersen There's also a greater-than-zero probability that your computer will spontaneously grow an arm and start slapping people in the face. I'm guessing you don't consider that to be realistically possible, though. Likewise, it's not realistically possible for a randomly generated filename with 128 bits of entropy to coincidentally match a filename that already exists on the system. — Tanner Swett, Apr 05 '21 at 12:07
@ThorbjørnRavnAndersen, yeah... well, if you ever use e.g. git, you rely on not having two objects generate the same SHA-1 hash. (SHA-1 is 160 bits, but since it can be any pair, the birthday attack applies, and you need only around 2^80 objects to have a 50 % chance of a collision.) Similar for almost all modern cryptography. And you might, in theory, be so unlucky that someone would guess your password on the very first try. (Or your GPG key, or whatever.) — ilkkachu, Apr 05 '21 at 12:07
@ThorbjørnRavnAndersen, but really, it's not about just "being unlucky one day" in the ordinary sense; it's more like on the order of getting hit by lightning, being bitten by a shark, surviving a car accident, and winning the national lottery, all on the same day. And then some. The numbers are big. I mean, mind-bogglingly big. That crypto.SE post I linked to compares it to the age of the universe, and still needs to add two more zeroes. — ilkkachu, Apr 05 '21 at 12:11
@BrianDrake, and oh yes indeed, you should always use quotes in POSIX-like shells. Just that often you can get away without them, but here, no chance of that working. — ilkkachu, Apr 05 '21 at 12:26
@ThorbjørnRavnAndersen You rely on "most likely never happen" more than you think. Cryptography (pretty much all of cryptography) is based on random numbers never being guessable. So any banking you've ever done has been secured on "most likely never happen" — Philip Couling, Apr 06 '21 at 16:03
Thanks to all those telling me how unlikely this is. Apparently you have never been bitten by something that should not happen ever. You might find https://docs.microsoft.com/en-us/archive/blogs/larryosterman/one-in-a-million-is-next-tuesday interesting. — Thorbjørn Ravn Andersen, Apr 07 '21 at 05:59
@ThorbjørnRavnAndersen: "one in a million" is hugely, enormously, exceptionally, mind-bogglingly bigger than "one in a septillion" (about 2^80). That's the point being made. — abeboparebop, Apr 07 '21 at 07:50
@ThorbjørnRavnAndersen, one million (1 000 000) is 10^6, about 2^20. Then, 2^128 is about 10^38. There's a difference of a full 32 orders of magnitude. In comparison, that's the same scale difference as between the size of a red blood cell (~10^-6 m) and the diameter of the visible universe (~10^26 m). So yes, once in a million is next Tuesday, or if you try a bit harder, next second. Once in 10^38 is never. — ilkkachu, Apr 07 '21 at 08:05
Now, if anyone wants to continue arguing statistics and scales of numbers, please take it somewhere else. Downvote on the way if you have to, but I'm a bit tired of getting pings for that subject. — ilkkachu, Apr 07 '21 at 08:06

DrSheldon · Answer 4 · 2021-04-06T23:05:14.893

16

Since process IDs are never negative, /proc/-1 will never exist.

This also works even if your variant of Unix doesn't support the procfs (or if it's not mounted)!

A similar method using file descriptors: /dev/fd/-1

I suppose some variant of Unix might theoretically allow a file descriptor of -1, but that is going to break a lot of existing code.

There are probably many other such possibilities with auto-generated filesystems.
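A quick way to confirm this on a given system before relying on it, as a sketch (the loop and its messages are only illustrative):

```shell
# Check both candidate paths; neither should exist on a sane system.
for p in /proc/-1 /dev/fd/-1; do
    if [ -e "$p" ]; then
        echo "$p: exists (unexpected)"
    else
        echo "$p: missing"
    fi
done
```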

edited Apr 06 '21 at 23:05

answered Apr 06 '21 at 04:22

DrSheldon

261

Good idea. Little anecdote, I have worked on more than one big applications in the past where programmers used this kind of thinking; i.e. if a DB auto-generated their IDs starting from 1, and thus never created negative IDs, they would use "negativeness" as a flag to go along with some ID, or use negative numbers as pseudo IDs or similar. So, yeah. I surely hope never to see /proc/-1. :) – AnoE Apr 06 '21 at 13:29
5

This probably assumes that a Linux system is in use. Other Unices with a /proc file hierarchy may possibly have a /proc/-1 name. – Kusalananda Apr 06 '21 at 18:29
2

I'd go for something like /proc/-123 or even /proc/xyz-non-existent; as @Kusalananda says, -1 doesn't sound that crazy for a possible Unix to have in an auto-generated FS. A weird filename that could only get created manually, in a directory that on most systems makes manual-creation impossible, is an even safer bet. – Peter Cordes Apr 06 '21 at 18:55
Is that guaranteed? If someone creates 2^31 processes, can the IDs overflow and wrap around into negative numbers? – user1024 Apr 06 '21 at 22:07
1

@user1024 you can't create so many processes, see Maximum number of processes in linux. Other systems with /proc filesystem will most likely have similar limits. – Ruslan Apr 06 '21 at 22:46
1

@user1024, pid_t has to be signed, because -1 is used as an error indicator, which also means that it can't be used as a real process id. PIDs also can't be negative, since kill() and waitpid() treat negative values specially (as process group ids). Also POSIX says a PID is "unique positive integer identifier". (Which still doesn't mean /proc/-1 couldn't exist for some weird purpose, but I can't see why anyone would do it.) Same with file descriptors, they're defined as non-negative, and -1 is used as an error value, so it can't be given as a real fd. – ilkkachu Apr 06 '21 at 23:49
@ilkkachu: Killing -1 is really destructive, and a handle of -1 is bonkers for lots of reasons. I suppose a system could allow -1 to be created deliberately by dup2() but so much stuff breaks I would advise against ever implementing that. – Joshua Apr 07 '21 at 20:16
@Joshua, dup2() also returns the new fd, so it would have the same problem where that return value could either mean an error or not. Though of course you could just force the user to rely on resetting errno beforehand and checking it, but then again, if you make an insane system, you could also have dup2() be the call to write bytes to a file. In any case, pretty much all of the answers here on the site assume a somewhat sensible POSIX-ish environment. Guarding against any possible insanity, or systems that actively try trick you, is both impossible and out of scope. – ilkkachu Apr 07 '21 at 20:24

Brian Drake · Answer 5 · 2021-04-05T13:56:53.997

terdon has already given a good answer: use mktemp -d. This post looks at the question in a more fundamental way, and explains why this command is usually the best answer.

There is no constant path that will always not exist. But based on your sentence about testing scripts, it sounds like a variable path is just as good, provided the script can generate it itself.

The only way to generate such a path is to keep generating paths, until the generated path does not exist. Unfortunately, this can lead to a race condition: some other program might create the file after you check whether it exists, but before you perform whichever operation depends on the file not existing.

The best way to avoid this race condition is to use a path that other programs are supposed to avoid, like a path under /tmp. Unfortunately, /tmp tends to be world-writable, so now you have an even bigger problem: some other user might create the file.

What you really want is a temporary directory that other programs are supposed to avoid and other users do not have access to. Even better if the directory is empty, so you can just use any path under that directory.

mktemp -d creates a directory meeting all the criteria in the previous paragraph, while defending itself against its own race conditions. When you have finished, you can use rmdir to remove the directory:

dir="$(mktemp -d)"
# Use "$dir"/foo as a nonexistent path.
rmdir "$dir"

score 8 · Answer 6 · answered Apr 05 '21 at 22:24

I like to keep things simple. Instead of using a path that cannot exist, I use one that I would never create, nor can I imagine anyone creating.

My tests that need a non-existent path use something like /thisfiledoesnotexist. I consider this reasonable, because:

The filename is self-documenting.
It takes root permissions to create this file, so it's unlikely to get created by accident.
I would never create it.
I can't imagine why anyone else ever would. Linux sysadmins like to keep the root directory clean.

If you are truly paranoid, then your test harness can first test that the file does not exist, failing if it does. Only if the file does not exist would it proceed with the test.
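The precondition check described above might look like the following sketch. The helper name `require_missing` is hypothetical, and how a skipped test is reported depends on your harness:

```shell
# Hypothetical helper: succeed only if nothing at all lives at the given path.
require_missing() {
    # -L also catches a dangling symlink, which -e alone reports as missing.
    if [ -e "$1" ] || [ -L "$1" ]; then
        echo "precondition failed: '$1' exists" >&2
        return 1
    fi
}

path=/thisfiledoesnotexist
if require_missing "$path"; then
    echo "precondition ok, running test against '$path'"
    # ... the actual test that needs a nonexistent path goes here ...
else
    echo "skipping test" >&2
fi
```

Keeping the check in a separate function makes it clear that a failure here is a broken assumption about the environment, not a failure of the code under test.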

I like the simplicity of this approach. Additionally, some test libraries make a distinction between assertions (things that must be true for the test to pass) and assumptions (things that must be true for the test to run.) That way you can assume that the directory does not exist and avoid false negative test results (presumably with warnings about the tests that did not run). — Chris Bouchard, Apr 07 '21 at 02:48

score 3 · Answer 7 · edited Apr 07 '21 at 12:00

The empty path, "", cannot exist in Linux or POSIXy systems. On Linux, an empty path always fails with ENOENT (see man 7 path_resolution); on other systems the error might be different. POSIX only says that it must not be resolved successfully.

Empty pathname

In the original UNIX, the empty pathname referred to the current directory. Nowadays POSIX decrees that an empty pathname must not be resolved successfully. Linux returns ENOENT in this case.

(source)

A null pathname shall not be successfully resolved.

(source)

edited Apr 07 '21 at 12:00

Kamil Maciorowski

21,864

answered Apr 07 '21 at 11:36

Glärbo

39

Already mentioned in this other answer as "an alternative", "possibly" and "at least on Linux". Your answer is more firm in this matter. I added the relevant citations. +1. – Kamil Maciorowski Apr 07 '21 at 12:01
@KamilMaciorowski: Thanks! Also, POSIX Std, Section 2.3: Error numbers explicitly describes ENOENT as "No such file or directory. A component of a specified pathname does not exist, or the pathname is an empty string." However, there is a bit of contention whether that means all POSIXy systems must return ENOENT error for empty pathnames, or if ENOENT is just a recommended error for that case, and other error codes are also allowed for that case (since empty pathname resolution is only required to not succeed). – Glärbo Apr 07 '21 at 13:02
In some cases, for example if the path is used as a prefix, the empty string can be interpreted like ".", so I would recommend against this. – Raedwald May 01 '21 at 19:32

Dewi Morgan · Answer 8 · 2021-04-07T14:53:07.450

Let's assume the criteria are:

test should work without write access to the device (so no temporary folder);
must return ENOENT (so no using a known-existent filename as a parent folder, as that gives ENOTDIR).

For this case, the deprecated C library functions char *tmpnam(char *str) and char *tempnam(const char *dir, const char *pfx); both generate and return a short, valid temporary filename which is guaranteed not to exist at that point in time.

It's possible that this may be sufficient for your use case, but of course, the methods are deprecated for a reason. There is a race condition between running tmpnam and checking for the file's existence, during which the file could be created. A malicious attacker can also deliberately create filenames with far higher chance of collision than 1 in TMP_MAX. Even with multiple non-malicious processes, the birthday problem arises.

Plus, these library calls require C, which your tests are unlikely to be written in. We could implement similar functionality in the test code: generate a filename, see if it exists, repeat until one doesn't exist.

However, a naive implementation would not only be vulnerable to the race condition above, but also have a very lengthy fail state: if ENOENT is never correctly returned, so every file appears to exist, it'll keep trying indefinitely. That's exactly the wrong kind of fail state, when writing a test for ENOENT: it means detecting failure takes forever!

However, as @ilkkachu writes, with suitable creation of longer filenames (e.g. a UUID, or the approach ilkkachu suggests), it should be cryptographically infeasible for our random file to already exist, so we only need to try once. It's tempting to try twice, just in case. But that's only equivalent to adding a single extra bit of entropy to our random filename.

But really, I do think it's worth emphasising what I wrote under another answer, that in general, relying on winning races does not work. That's not a likely problem here, but still, mktemp() and tmpnam() are deprecated/removed from POSIX, and for a reason. mkstemp() and mkdtemp() should be safer. I did enjoy the man pages too: "The tmpnam() function returns a pointer to a string that is a valid filename, and such that a file with this name did not exist at some point in time, so that naive programmers may think it a suitable name for a temporary file.", and "Never use mktemp()" — ilkkachu, Apr 06 '21 at 18:55
Updated: apologies for inadvertently associating your name with the daft "keep retrying" approach, that should be fixed now! Unfortunately, mkstemp() and mkdtemp() require write access, so are unsuitable here (if we create them and then delete them before checking for nonexistence, we're still racing, and we're semaphoring to an attacker what name we're using, and we need write access which I assumed we didn't have). I can't find a non-write higher-entropy replacement for mktemp(), but that doesn't mean it doesn't exist. — Dewi Morgan, Apr 06 '21 at 19:21
Yes, I know you said about not wanting to assume write access here. And that's totally fine. What I said about the functions was purely about the general case, where one actually wants to create a file. I'm just a bit allergic to the idea of accidentally propagating bad practices, that's all. :) — ilkkachu, Apr 06 '21 at 19:43
I strongly agree. I've edited for tone, which I hope addresses that. — Dewi Morgan, Apr 07 '21 at 14:41

alwayslearning · Answer 9 · 2021-04-05T11:45:43.553

1

If the requirement is strictly that no regular file exists at the path, you can read the absolute path of the current working directory and use that: it names a directory, so a check for a regular file fails.

To work around the possibility of the script deleting the current working directory and creating a file with the same path/name before doing the check, perhaps we can consider the parent directory of the script itself.

A trivial case may be to check for the path "/" which is always guaranteed to be a directory, never a regular file.
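A minimal sketch of that trivial case. Note that it only helps when the test asks "is this a regular file?", not "does this path exist?":

```shell
# / always resolves, but to a directory, so a regular-file test is false.
if [ -f / ]; then
    echo "/ is a regular file (unexpected)"
else
    echo "/ is not a regular file"
fi
```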

edited Apr 05 '21 at 11:45

answered Apr 05 '21 at 11:32

alwayslearning

144
4

1

The current working directory is a regular file if you run rmdir "$(pwd)"; touch "$(pwd)" first. I tested it; stat "$(pwd)" reported a “regular empty file”. – Brian Drake Apr 05 '21 at 11:45
1

@BrianDrake, maybe, but it is difficult (though not impossible) to imagine a script which deletes the current working directory. Perhaps the trivial case of "/" still holds good then. – alwayslearning Apr 05 '21 at 11:48
5

/ should work if you only look for a regular file. But if you try to do something else with it, it might not work. I'm not sure about current systems, but at least there have been some where you can open a directory for reading and get something out (the directory listing, in some format). – ilkkachu Apr 05 '21 at 12:55

score -1 · Answer 10 · answered Dec 11 '23 at 05:47

Since it is impossible to create a file whose name contains a null byte, you can be absolutely certain there is no file named \0 in any directory.

By the same reasoning, however, it might prove tricky to test for this file's existence.

(The same is true for a file name containing the directory separator)

answered Dec 11 '23 at 05:47

ardnew

159
1
5

Since in practice the APIs involved use C strings, this is the same as the "empty path" suggestion from a couple of years ago. – muru Dec 11 '23 at 06:11
