Following http://superuser.com/questions/1780479 and http://superuser.com/questions/1777606, I use the following script to compare the timestamps of symlinks that have the same relative path under the directories $1 and $2:

#!/bin/bash
cd $1
find . -type l -exec bash -c "if [[ -h \"{}\" && -h \"$2/{}\" ]]; then if (test $(readlink \"{}\") = $(readlink \"$2/{}\")) then if (find \"$2/{}\" -prune -newer \"{}\" -printf 'a\n' | grep -q a) then echo \"{} is older than $2/{}\"; else if (find \"{}\" -prune -newer \"$2/{}\" -printf 'a\n' | grep -q a) then echo \"$2/{} is older than {}\"; fi; fi; fi; fi" \;

The usage is compare_times.sh directory_1 directory_2 (where compare_times.sh is the name of the script). We use it as demonstrated in the following example:

user@machine:/tmp/D1$ ls -lt --full-time /tmp/linked_file /tmp/D*
-rw-r--r-- 1 user user  0 2023-04-25 00:12:09.289942358 +0200 /tmp/linked_file

/tmp/D2:
total 0
lrwxrwxrwx 1 user user 14 2023-04-25 00:07:00.265830604 +0200 lnk -> ../linked_file

/tmp/D1:
total 0
lrwxrwxrwx 1 user user 14 2023-04-25 00:06:40.922078186 +0200 lnk -> ../linked_file
user@machine:/tmp/D1$ compare_times.sh . ../D2
./lnk is older than ../D2/./lnk
user@machine:/tmp/D1$

As you see, find calls bash, which itself calls find. (Probably, this might be written more elegantly, but this is not the point now.) Are there any issues with calling find from under find this way?

The man page of the find command doesn't say whether find is reentrant or not. If find is not reentrant, we could hypothetically silently miss some output, i.e., some symlinks that have the same name and the same position inside the two directories and that point to equal filenames but that have different timestamps.

  • Never ever do this. Moreover, this is not valid shell – Gilles Quénot Apr 24 '23 at 15:23
  • What is the issue you are facing? What other commands are not reentrant? find is just a utility, and if it calls itself, it just calls itself. I have much, much more to say about the obvious other issues in your code though, so if you update your question with whatever it is that you are having a problem with, we may be able to help you sort it out. – Kusalananda Apr 24 '23 at 15:24
  • Rephrasing my previous comment: Yes, find can be called from find with no issues. However, your code has code injection vulnerabilities and quoting issues that may cause it to fail or behave in unexpected ways. Since you explicitly discourage us from helping you with your code, I will say nothing further about that. – Kusalananda Apr 24 '23 at 15:31
  • It seems this question is the OP's main task. This one followed; and now the one we're commenting on. – Kamil Maciorowski Apr 24 '23 at 16:08
  • @GillesQuénot Thx! You can assume that all filenames I run the code on have been created by me or trustworthy folks. So, if code injection occurs, then only unintentionally. Which part do you refer to by “not valid shell”? –  Apr 24 '23 at 20:29
  • @Kusalananda Thx! You can assume that all filenames I run the code on have been created by me or trustworthy folks. So, if code injection occurs, then only unintentionally. As for the higher-level goal I wish to achieve, cf. https://superuser.com/questions/1777606 . Any help is welcome, be it with the code itself or the high-level goal. –  Apr 24 '23 at 20:30
  • @Kusalananda Concerning “What is the issue you are facing”: I was afraid that due to find calling itself, some output is missed, i.e., some symlinks that have the same name and the same position inside the two directories and that point to equal filenames but that have different timestamps could be missed. –  Apr 24 '23 at 20:35
  • @AlMa0: can you edit your post to include a copy-paste of a use case where you have 2 symlinks with 2 different timestamps? It will help everyone to understand, and to help you further. Your goal is to find symlinks that have different timestamps, right? – Gilles Quénot Apr 24 '23 at 21:59
  • @GillesQuénot Done. Also other typos corrected. Yes, your understanding of the goal is correct. Presuming the same name and the same path after $1/ and $2/. –  Apr 24 '23 at 22:27
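
The quoting and injection concerns raised in the comments can be sketched as follows. This is only one possible rewrite under stated assumptions, not the asker's script: the function name compare_symlink_times is made up for this example, and GNU find and bash are assumed. The main change is that file names are passed to the inner bash as positional parameters instead of being pasted into the code string via {}, so a name containing quotes or $(...) is never interpreted as shell code. It also avoids a subtle problem in the original: inside a double-quoted string, $(readlink ...) is expanded by the calling shell before find even starts, so the comparison there never sees the real link targets.

```shell
#!/bin/bash
# Sketch: report same-path symlinks under two directories whose own
# mtimes differ. compare_symlink_times is a hypothetical name.
compare_symlink_times() (
    # Runs in a subshell (note the parentheses), so cd does not
    # affect the caller's working directory.
    dir2=$(cd -- "$2" && pwd) || exit 1   # absolute path survives the cd below
    cd -- "$1" || exit 1
    find . -type l -exec bash -c '
        other_root=$1; shift
        for link in "$@"; do
            other=$other_root/${link#./}
            [[ -h $other ]] || continue
            # Only compare links that point to the same target string.
            [[ $(readlink -- "$link") == "$(readlink -- "$other")" ]] || continue
            # The inner find compares the mtimes of the symlinks themselves.
            if [[ -n $(find "$other" -prune -newer "$link" -print) ]]; then
                printf "%s is older than %s\n" "$link" "$other"
            elif [[ -n $(find "$link" -prune -newer "$other" -print) ]]; then
                printf "%s is older than %s\n" "$other" "$link"
            fi
        done
    ' bash "$dir2" {} +
)

# When run as a script with two directory arguments, do the comparison.
if [[ $# -eq 2 ]]; then
    compare_symlink_times "$1" "$2"
fi
```

The inner find is kept on purpose: bash's own -nt test follows symlinks, whereas find without -L or -H looks at the links themselves. Using {} + also starts far fewer bash processes than one per symlink.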

1 Answer

“Not all commands are known to be reentrant.”

Which ones are you thinking of?

Usually, reentrancy comes into question with library functions that implicitly use some global state, where calling the function twice at the same time, e.g. from within itself, would mess up that state. (See e.g. Why are malloc() and printf() said as non-reentrant? on SO.)

But there's no global state shared between individual processes: each has its own memory space, and the OS and the hardware make sure that processes can't mess with each other's memory. Of course, the filesystem, for example, is global state, but I'm not sure find needs to use it for saving state, and anyway respectable programs that do need temporary files are usually pretty good at using unique ones.

If state in the filesystem were an issue, it's hard to see how it would be specific to find processes that are each other's children; it would already come up when running any find processes simultaneously.
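
As a rough illustration of the point about separate memory spaces (a sketch only; the variable names are made up for this example and it says nothing about find's internals), a child shell can change its own copy of a variable without the parent ever noticing:

```shell
#!/bin/bash
# The parent sets a variable and exports it, so the child gets a copy.
state="parent value"
export state

# The child process modifies its own copy and prints it.
child_output=$(bash -c 'state="child value"; printf "%s" "$state"')

echo "child saw:  $child_output"   # child saw:  child value
echo "parent has: $state"          # parent has: parent value
```

The same isolation applies to a find started by another find: the inner process gets its own memory and cannot disturb the outer one's traversal state.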


How useful it is to run one find from another is a whole other question, but I guess you could do this to find all regular files under directories called x...

$ mkdir -p {a/x,b,x}; touch {a/x,b,x}/foo.txt
$ find . -type d -name x -exec find {} -type f \;
./a/x/foo.txt
./x/foo.txt
ilkkachu
  • Answering “Which ones are you thinking?”: To be precise, I know of no GNU commands which have been explicitly documented as being reentrant or non-reentrant. Functions are a different matter; cf. https://stackoverflow.com/questions/20732882. Knowing how folks (including myself) actually write code, it is usual that if the author of a piece of code thinks about any feature (here, reentrancy), he/she documents this feature in some way or at least talks about it. –  Apr 24 '23 at 20:18
  • The global state, as you said, could be the file system in our case, say, temporary files (whether find uses them itself or calls library functions or other tools is a different matter). Also some tools which search for find among the process ids of the ancestors/descendants could hypothetically stop at the first found instance of find, while they need another one. Hypothetically, of course. –  Apr 24 '23 at 20:22
  • I suppose that find -type d -exec mv {} ... could be construed as non-reentrant particularly without the -depth option – Chris Davies Apr 24 '23 at 22:26
  • @AlMa0, well, you wrote that you know some commands are known not to be... it doesn't have to be a GNU command, any implementation would work as an example. But perhaps the distinction between a function and a command is rather important here, and due to how things work, reentrancy of commands just isn't a thing to consider or document... – ilkkachu Apr 25 '23 at 05:36
  • @AlMa0, why would some command search for find in its ancestors? Is there any plausible reason to do that? I mean, if we accept implausible issues, we could hypothesize a very solitary cat that looks for other cats in the list of all processes and refuses to run if it finds one. Neither of those would be an issue with find itself, though, unless it's the one doing the looking. – ilkkachu Apr 25 '23 at 05:40
  • In my example, the outer find is used to crawl through the directory structure, and the inner find is used only to check whether the timestamps of two symlinks differ (and has nothing to do with finding files itself). –  Apr 25 '23 at 06:39
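
The -depth remark in the comments above can be sketched like this. It is a hypothetical example (the directory layout and the .renamed suffix are made up here), and the issue it shows comes from changing the tree while find walks it, not from nesting find. With -depth, find handles a directory's contents before the directory itself, so a nested x is renamed before its parent's path changes. Without -depth, the outer x would be renamed first and find would then fail to descend into the old path.

```shell
#!/bin/bash
# Hypothetical layout: a directory named x nested inside another x.
workdir=$(mktemp -d)
mkdir -p "$workdir/top/x/x"

# Rename every directory called x, deepest first thanks to -depth.
# The path is passed as a positional parameter, not pasted into the code.
find "$workdir/top" -depth -type d -name x \
    -exec sh -c 'mv -- "$1" "$1.renamed"' sh {} \;

# Show the resulting tree.
find "$workdir/top" -type d
```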