Goal
I was looking for a simple way to check for new files.
As the target system is a minimal embedded Linux platform, I cannot just install more packages.
Current Solution
A nice solution seemed to be using find ... -newer reference.file and then repeatedly running that, touching the reference.file in each run, as suggested here:
https://unix.stackexchange.com/a/249238/562136
In my case, the code looks like this:
NEW_FILES=()
WORK_FILE_DIR="/some/folder/path"
REFERENCE_FILE="${WORK_FILE_DIR}/.last_checked_reference"

function find_new_files() {
    mapfile -t NEW_FILES < <(find "$WORK_FILE_DIR" -type f -newerBa "$REFERENCE_FILE" -name "*.txt")
    touch "$REFERENCE_FILE"
}

mkdir -p "$WORK_FILE_DIR"
touch "$REFERENCE_FILE"

while true; do
    find_new_files
    for file in "${NEW_FILES[@]}"; do
        echo "New file $file"
        # ... handle file content in multiple steps
    done
    echo "--"
    sleep 5
done
Note that I actually used -newerBa and not just -newer.
Expected behaviour
By using -newerBa, only files created after the last access to REFERENCE_FILE should be listed, with the access time being updated on each touch.
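To make that expectation concrete, here is a minimal sketch of the comparison with fixed timestamps. It is only an illustration under assumptions: it uses the plain -newer test (modification time against modification time) rather than -newerBa, because birth-time support differs between platforms, and the scratch directory and file names are made up.

```shell
#!/bin/sh
# Hypothetical scratch directory; none of these names come from the real setup.
demo_dir=$(mktemp -d)
ref="$demo_dir/.reference"

# Give each file a fixed, well-separated modification time (CCYYMMDDhhmm).
touch -t 202301010000 "$demo_dir/old.txt"   # older than the reference
touch -t 202301020000 "$ref"                # the reference itself
touch -t 202301030000 "$demo_dir/new.txt"   # newer than the reference

# -newer matches only files whose mtime is strictly later than the reference's mtime.
found=$(find "$demo_dir" -type f -name '*.txt' -newer "$ref")
echo "$found"

rm -rf "$demo_dir"
```

With these timestamps only new.txt should be reported; old.txt predates the reference and the reference itself does not match *.txt.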
I expected the output to look like
--
--
New file <file1>
--
--
--
New file <file2>
...
Actual behaviour
The output looks like this:
--
--
New file <file1>
--
New file <file1>
--
New file <file1>
--
New file <file1>
New file <file2>
--
New file <file1>
New file <file2>
...
BUT when I touch the REFERENCE_FILE externally, meaning from my CLI while the script runs, the touch has the expected effect:
--
--
New file <file1>
--
New file <file1>
-- <-- at this point, touch REFERENCE_FILE from my CLI
--
New file <file2>
--
New file <file2>
...
What I tried
- I added stat $REFERENCE_FILE to each iteration and can see that 3 of the 4 times (all but creation date) are updated properly while the script is running. I checked stat when updating the REFERENCE_FILE from my CLI, and I cannot see any difference.
16777221 96762726 -rw------- 1 user staff 0 0 "Feb 25 11:53:24 2023" "Feb 25 11:53:24 2023" "Feb 25 11:53:24 2023" "Feb 25 01:10:39 2023" 4096 0 0 <path>/.last_checked_reference
--
16777221 96762726 -rw------- 1 user staff 0 0 "Feb 25 11:53:29 2023" "Feb 25 11:53:29 2023" "Feb 25 11:53:29 2023" "Feb 25 01:10:39 2023" 4096 0 0 <path>/.last_checked_reference
- Used touch -a $REFERENCE_FILE and touch -m $REFERENCE_FILE.
- Adjusted file permissions to 666 or 777 to make sure that 600 is not a problem.
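For comparing the four timestamps side by side, a labelled stat call may be easier to read than the raw output. The helper below is a rough sketch: show_times is a made-up name, the GNU branch assumes a stat that accepts -c with %x/%y/%z/%w, and the BSD branch assumes -f with %Sa/%Sm/%Sc/%SB. A stripped-down stat on an embedded system may support neither, and the birth time may print as "-" where the filesystem does not record it.

```shell
#!/bin/sh
# show_times is a hypothetical helper, not part of the original script.
show_times() {
    if stat -c '%n' "$1" >/dev/null 2>&1; then
        # GNU-style stat: access, modify, change, birth
        stat -c 'atime=%x mtime=%y ctime=%z birth=%w' "$1"
    else
        # BSD-style stat: same four timestamps
        stat -f 'atime=%Sa mtime=%Sm ctime=%Sc birth=%SB' "$1"
    fi
}

# Demo on a throwaway file.
tmp_file=$(mktemp)
out=$(show_times "$tmp_file")
echo "$out"
rm -f "$tmp_file"
```

Calling show_times "$REFERENCE_FILE" before and after each touch would show which of the four fields actually moves.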
What works
I can completely remove and recreate the REFERENCE_FILE.
rm "$REFERENCE_FILE"
touch "$REFERENCE_FILE"
Question
I do not understand why stat shows updated times and the script does not work as intended, but then reacts as intended to each touch from my CLI.
Why does it behave like this?
noatime option? – Paul_Pedant Feb 25 '23 at 13:42
stat command then show a "-" instead of an actual time for the last access? – L. Heinrichs Feb 25 '23 at 14:08
-newerBa not being supported and having to use -newerca or -newerma instead, it works fine... AFAIU noatime should only inhibit atime changes from reads and writes, not change the fact that the atime field still exists, nor manual changes to it through touch – ilkkachu Feb 25 '23 at 14:30
find finishes and the script updates the reference file timestamp. It'd not be listed in that iteration and would be older than the reference on the next, and hence missed completely. You could prevent that by creating a new reference with another name before running find, and moving it into place after it, though that might just invert the problem and make some files appear twice. – ilkkachu Feb 25 '23 at 15:30
stat output format suggests the OP is using some sort of BSD system. – Stéphane Chazelas Feb 25 '23 at 16:52
I thought about that. However, the files are created by another process, which may get duplicate input data. It writes out the data to a file if the file does not exist. So when moving the file, there is a chance it would be recreated and again treated as new.
Yup I also checked that with stat and no changes to the timestamps occur. – L. Heinrichs Feb 25 '23 at 16:53
sed -i '' sed-code file for instance doesn't edit the file in place, it replaces it with a modified copy so the birth time would be new. – Stéphane Chazelas Feb 25 '23 at 16:55
atime is still updated for creates and writes, so all files and directories still have it: the noatime just gives file systems the option to suppress updating it on every read, either per-file or whole fs. Linux Documentation Project has more detail. – Paul_Pedant Feb 26 '23 at 09:19
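The idea from the comments of stamping a new reference under another name before running find, and moving it into place afterwards, could be sketched roughly as follows. This is an illustration under assumptions, not a tested fix for the original problem: scan_once, next_ref and the scratch directory are made-up names, and the comparison uses plain -newer (modification times) instead of -newerBa.

```shell
#!/bin/bash
# All names below are hypothetical and only serve the illustration.
work_dir=$(mktemp -d)
ref="$work_dir/.last_checked_reference"
next_ref="$work_dir/.next_reference"

scan_once() {
    # Stamp the next reference BEFORE scanning, so a file that appears
    # while find is running is still newer than it on the following pass.
    touch "$next_ref"
    find "$work_dir" -type f -name '*.txt' -newer "$ref"
    # Only now replace the old reference.
    mv "$next_ref" "$ref"
}

# Demo with fixed timestamps: the reference is older than the data file.
touch -t 202301010000 "$ref"
touch -t 202301020000 "$work_dir/a.txt"

first=$(scan_once)    # a.txt is newer than the old reference
second=$(scan_once)   # the reference now carries the time of the first scan
echo "first pass:  $first"
echo "second pass: $second"

rm -rf "$work_dir"
```

As the same comment points out, this trades missed files for possible duplicates: a file written while the scan is in progress can be listed in two consecutive passes, so the handling step would need to tolerate seeing a file twice.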