The GNU find man page says the following (this appears to be specific to GNU find; other implementations may differ, see below):
The + and - prefixes signify greater than and less than, as usual; i.e., an exact size of n units does not match. Bear in mind that the size is rounded up to the next unit. Therefore `-size -1M` is not equivalent to `-size -1048576c`. The former only matches empty files, the latter matches files from 0 to 1,048,575 bytes.
Question:
Ok, so I guess -1M is rounded to 0M, -2M to -1M and so on... ?
No. It's not the limit in the `-size` condition that's rounded, but the file size itself.
Take a file of 1234 bytes and a `-size -1M` directive. The file size is rounded up to the next whole unit named in the directive, here MB: 1234 bytes -> 1 MB. That doesn't match the condition, since `-size -1M` demands less than 1 MB (after this rounding). So, indeed, `-size -1x`, for any unit `x`, returns only empty files.
Similarly, `-size 1M` would match the above file, since after rounding, it's exactly 1 MB in size. On the other hand, `-size 1k` would not, since it rounds to 2 kB.
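To make the rounding concrete, here is a minimal sketch of the arithmetic in shell. It only models the behaviour described above and is not taken from GNU find's source; `units_rounded_up` is a made-up helper name.

```shell
# Sketch of the rounding described above (not GNU find's actual code).
# units_rounded_up is a hypothetical helper:
#   $1 = file size in bytes, $2 = unit size in bytes
units_rounded_up() {
    size=$1 unit=$2
    # Ceiling division: any partial unit counts as a whole one.
    echo $(( (size + unit - 1) / unit ))
}

units_rounded_up 1234 $((1024*1024))   # prints 1, so -size 1M matches, -size -1M does not
units_rounded_up 1234 1024             # prints 2, so -size 1k does not match
units_rounded_up 0 $((1024*1024))      # prints 0, the only value that is "less than 1"
```

Only a size of 0 bytes rounds to 0 units, which is why `-size -1M` ends up matching empty files only.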
Note that the `-` or `+` in front of the number in the condition is irrelevant for the rounding behaviour.
It may be useful to just always specify the sizes in bytes, since that way there's no rounding to stumble on. `-size -$((1024*1024))c` will reliably find files that are strictly less than 1 MB (or 1 MiB, if you will) in size. If you want a range, you can use e.g. `\( -size +$((512*1024-1))c -size -$((1024*1024+1))c \)` (with the parentheses escaped for the shell) for files within [512 kB, 1024 kB].
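The off-by-one adjustments in such a range are easy to get wrong, so one option is to wrap them in a small function. This is only a sketch: `size_range_args` is a hypothetical helper, not something find provides.

```shell
# size_range_args is a hypothetical helper, not part of find.
# Prints -size arguments matching files of $1..$2 bytes, both ends inclusive.
size_range_args() {
    lo=$1 hi=$2
    if [ "$lo" -gt 0 ]; then
        # "+N" means strictly greater than N, so step down by one.
        printf '%s ' "-size" "+$((lo - 1))c"
    fi
    # "-N" means strictly less than N, so step up by one.
    printf '%s %s\n' "-size" "-$((hi + 1))c"
}

size_range_args $((512*1024)) $((1024*1024))
# prints: -size +524287c -size -1048577c
```

It could then be used as `find . -type f $(size_range_args $((512*1024)) $((1024*1024)))`, relying on the shell's word splitting of the unquoted command substitution (which zsh does not do by default).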
Another question on this: Why does `find -size -1G` not find any files?
Gilles mentions in that linked question the fact that POSIX only specifies `-size N` as meaning size in 512-byte blocks (rounded as above: "the file size in bytes, divided by 512 and rounded up to the next integer"), and `-size Nc` as meaning the size in bytes, both with an optional plus or minus. The other suffixes are left unspecified, and not all `find` implementations recognize them, or round like GNU find does.
I tested with Busybox and the *BSD find on my Mac, and it seems they treat conditions with size specifiers in a way that feels more sensible, i.e. `-size -1k` matches files from 0 to 1023 bytes, the same as `-size -1024c`, and similarly for `-size -1M` == `-size -1024k` (Busybox only has `c`, `b` and `k`). Then again, Busybox doesn't seem to do the rounding even for sizes specified in blocks, against what the POSIX text seems to say it should.
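The difference between the two interpretations can be sketched as two shell predicates. Both are models of the behaviour described in this answer, not code from any of the implementations, and the function names are made up for illustration.

```shell
# Both functions answer: would a file of $1 bytes match "-size -$2<unit>",
# where one unit is $3 bytes? Names are hypothetical.

# GNU find as described above: round the file size up to whole units, then compare.
gnu_size_less_than() {
    [ $(( ($1 + $3 - 1) / $3 )) -lt "$2" ]
}

# Busybox and the Mac's find as observed above: compare the byte count with n * unit.
plain_size_less_than() {
    [ "$1" -lt $(( $2 * $3 )) ]
}

# Compare the two for -size -1k on a few file sizes.
for size in 0 1 1023 1024; do
    gnu=no; plain=no
    gnu_size_less_than "$size" 1 1024 && gnu=yes
    plain_size_less_than "$size" 1 1024 && plain=yes
    echo "$size bytes, -size -1k: GNU=$gnu other=$plain"
done
```

Under this model the two agree for 0 and 1024 bytes and disagree for everything in between.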
So, YMMV and again, maybe better to stick with sizes in bytes.
Note that there's a similar issue with the `-atime`, `-mtime` and `-ctime` conditions:
-atime n
File was last accessed n*24 hours ago. When find figures out how many 24-hour periods ago the file was last accessed, any fractional part is ignored, so to match `-atime +1`, a file has to have been accessed at least two days ago.
And similarly, it may be easier to just use `-amin +$((24*60-1))` to find files that have been last accessed at least a full 24 h ago. (Up to rounding to a minute, which you can't get rid of.)
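The truncation in the man page text can be sketched the same way. Again, this models the quoted description only; `whole_days` and `matches_atime_plus1` are hypothetical names.

```shell
# Sketch of the truncation described in the man page text above.
# whole_days is a hypothetical helper: age in seconds -> complete 24-hour periods.
whole_days() {
    # Integer division drops the fractional part.
    echo $(( $1 / 86400 ))
}

# -atime +1 requires the truncated age to be strictly greater than 1.
matches_atime_plus1() {
    [ "$(whole_days "$1")" -gt 1 ]
}

whole_days $((36*3600))   # prints 1: a 36-hour-old access counts as 1 day
whole_days $((48*3600))   # prints 2: only now does -atime +1 match
```

So a file accessed 47 hours ago still has a truncated age of 1 and does not satisfy `-atime +1`.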
See also: Why does find -mtime +1 only return files older than 2 days?
Is this all normal or am I doing something wrong and what's the exact behavior of the -size parameter?
It's "normal" as far as the behaviour of GNU find is concerned, but I wouldn't call it exactly sensible. You're not wrong to be confused, it's `find` that is confusing.
[...] `find` that differs in this behavior (how they differ)? – Kusalananda Mar 09 '21 at 11:18

[...] `find` has a `-size` that only follows the POSIX spec. No other suffixes allowed than `c`. I haven't looked exactly how `-ctime` etc. works, but I know there is a difference there too. – Kusalananda Mar 09 '21 at 11:54

[...] `-mtime` (and other time options). `-mtime -2` means up to 1 day old, and doesn't include 1 day + 1 second old. – Barmar Mar 10 '21 at 17:17