
I am writing a script that searches the system for files and then for each file does some sanity checks, and if they pass those, I want to display them in fzf. When the item is clicked on in fzf, I want to run the file with a program.

So far I have:

dir="/path/to/dir"

fd . $dir --size +1MB | while read -r line; do
    file_type=$(file -b "$line")
    echo "$line" | fzf
    if [[ "$file_type" == "data" ]]; then
        echo "$file_type"
    fi
done

Basically, I search for files bigger than 1MB in the specified dir. For each file I run the file command and check if the command output returns "data". If it does, I want to add it to the fzf list. Then, when I click on the item, I want to run the file with the full path with an app: e.g. myapp /path/to/file/selected/in/fzf.

The problem so far is that fzf is blocking, and I can't populate the list in the loop. Do I really have to add everything to an array first and then pipe this into fzf? Ideally that should happen in parallel, i.e. I want to add new items on the fly while the search is still going, instead of waiting for it to complete.

Also, I don't know how to run the selected file afterwards. Can someone help me with this?

Kyu96

1 Answer


fzf does not block the way you think. It's totally capable of adding to the list on the fly. You can see this in the following example (install pv first if not installed):

find / | pv -qlL 1 | fzf

Here pv passes one line per second. The lines appear in fzf as they arrive, you can move the selection up and down, and nothing blocks.

The problem with your code is you run fzf inside the while read … loop. For each line you run a separate fzf that gets this single line only. The loop can continue only after fzf finishes. So it's not that fzf refuses to read more; it's about fzf being inside the loop.
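The difference can be seen without a terminal UI in the following minimal sketch. It uses wc -l as a non-interactive stand-in for fzf (an assumption made purely for illustration). The function names consumer, inside_loop and after_loop are hypothetical.

```shell
# Stand-in for fzf: reports how many lines one invocation received.
consumer() { wc -l | tr -d ' '; }

# Consumer inside the loop: started once per input line, sees 1 line each time.
inside_loop() {
    while IFS= read -r line; do
        printf '%s\n' "$line" | consumer
    done
}

# Consumer after the loop: started once, sees every line the loop prints.
after_loop() {
    while IFS= read -r line; do
        printf '%s\n' "$line"
    done | consumer
}

printf 'a\nb\nc\n' | inside_loop   # three separate counts of 1
printf 'a\nb\nc\n' | after_loop    # a single count of 3
```

With the real fzf in place of consumer, the first form opens a new one-item list for every file, while the second feeds one list that grows as lines arrive.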

Basically you want something like this:

find … | data_filter | fzf

where data_filter is a piece of code that filters lines. It can be a shell loop:

find … | while IFS= read -r line; do …; done | fzf

As a filter the loop should print to its stdout all the lines you don't want to filter out (and only these lines). It can be a shell function named data_filter. In your case this is the right function:

data_filter() {
   while IFS= read -r line; do
      [ "$(file -b "$line" 2>/dev/null)" = data ] && printf '%s\n' "$line"
   done
}

And then you use it as a filter in a pipe.
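Applied to the fd command from the question, this could look like the sketch below. The wrapper name list_data_files is hypothetical, and the fd arguments are copied from the question (with the directory quoted) rather than verified here.

```shell
# The filter from this answer: pass through only pathnames that
# file(1) describes as exactly "data".
data_filter() {
    while IFS= read -r line; do
        [ "$(file -b "$line" 2>/dev/null)" = data ] && printf '%s\n' "$line"
    done
}

# Hypothetical wrapper: list candidates under the directory given as $1.
# The fd arguments are taken from the question; "$1" is quoted.
list_data_files() {
    fd . "$1" --size +1MB | data_filter
}

# Interactive use (not run here):
#   list_data_files /path/to/dir | fzf
```

Because fzf sits after the loop, it starts immediately and shows each pathname as soon as the filter prints it.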

Now we have the pipeline find … | data_filter | fzf which should work well and print whatever pathname you choose. To do something with the file use one of these:

  • find … | data_filter | fzf | xargs …, where xargs is configured to read full lines as they are. With GNU xargs and if the tool you want to run is ls -l, this would be like:

    find … | data_filter | fzf | xargs --no-run-if-empty -d '\n' -I{} ls -l {}
    

    But because xargs by default interprets quotes and does other things (like splitting), you need to know its options well to use it in a case like this. I admit I have never mastered xargs and I'm not sure I used the best set of options here. My point is: a simple invocation like … | fzf | xargs ls -l will break in many cases.

  • f="$(find … | data_filter | fzf)", then use "$f" wherever you want. An advantage is that you can check the exit status of fzf. A theoretical disadvantage is that $() strips trailing newlines. In practice, in our case a pathname with a trailing (or any) newline doesn't pass intact through data_filter | fzf anyway.

    Example:

    f="$(find . | data_filter | fzf)"
    status="$?"
    [ "$status" = 0 ] && ls -l "$f"
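To run a program on the selection, as the question asks, the command-substitution variant can be wrapped in a function. This is only a sketch: open_data_file is a hypothetical name, myapp is the placeholder from the question, and find "$dir" -type f stands in for whatever search command you use.

```shell
# The filter from this answer, repeated so the sketch is self-contained.
data_filter() {
    while IFS= read -r line; do
        [ "$(file -b "$line" 2>/dev/null)" = data ] && printf '%s\n' "$line"
    done
}

# Hypothetical helper.
#   $1      directory to search
#   $2...   command (plus leading arguments) to run on the selected pathname
open_data_file() {
    dir=$1
    shift
    # The status of the pipeline is the status of fzf, its last command.
    # A non-zero status means nothing was selected, so stop and pass it on.
    f="$(find "$dir" -type f | data_filter | fzf)" || return
    "$@" "$f"
}

# Interactive use (not run here):
#   open_data_file /path/to/dir myapp
```

The program runs outside the pipeline, so it keeps the terminal as its standard input.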
    

Since all the tools in the pipeline use newlines to separate entries, pathnames with newlines will break the code.


To handle pathnames with newlines you need to make the whole pipeline work with null-terminated (as opposed to newline-terminated) entries.

  • First of all, you need find … -print0 (or equivalent fd command). Note some implementations of find don't support -print0 (-exec printf '%s\0' {} + can be a replacement).

  • Then the new filtering function should be:

    data_filter0() {
       while IFS= read -r -d '' line; do
          [ "$(file -b "$line" 2>/dev/null)" = data ] && printf '%s\0' "$line"
       done
    }
    
  • Next use fzf --read0 --print0.

  • Finally xargs -r0 …. (The alternative with $() is troublesome and I won't elaborate; in this case prefer xargs -r0.) Note these options are not portable and your xargs may or may not support them. As a bonus xargs -r0 will work well with multiple selection from fzf -m.

An example pipeline:

find . -print0 | data_filter0 | fzf -m --read0 --print0 | xargs -r0 ls -l
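To see why the terminator matters, the following non-interactive sketch creates a single file whose name contains a newline and counts how many entries each output format appears to contain. The scratch directory and variable names are assumptions for illustration. Note that -print0 is not POSIX, although GNU and BSD find support it.

```shell
# Create one file whose name contains a newline, in a scratch directory.
tmp="$(mktemp -d)"
nl='
'
: > "$tmp/one${nl}two"

# Newline-terminated output: the single file looks like two entries.
lines="$(find "$tmp" -type f -print | wc -l | tr -d ' ')"

# Null-terminated output: count the NUL bytes, one per entry.
entries="$(find "$tmp" -type f -print0 | tr -cd '\0' | wc -c | tr -d ' ')"

printf 'newline-delimited: %s, null-delimited: %s\n' "$lines" "$entries"
rm -rf "$tmp"
```

Any tool that reads the first form line by line would treat the two halves of the name as separate, nonexistent pathnames.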

Notes and useful links:

  • In general the filter can be built into find … (thanks to find -exec). I didn't do this because it seems you want to use fd and I don't know this tool well enough.

  • If you pick an item early (i.e. before find and the filter finish), then fzf and everything after it in the pipe may exit before find exits. The filter will notice that fzf is gone only when it next tries to write; similarly, find will notice only when it tries to write (compare this answer). This means find may run longer than it needs to, preventing the shell from moving on to the next command in the script (or from displaying the prompt, if interactive). You can run some parts in the background:

    ( find . -print0 | data_filter0 & ) | fzf -m --read0 --print0 | xargs -r0 ls -l
    

    This will not stop find or the filter from running after fzf exits, but the whole line won't stall. The shell will move on immediately after ls does its job. The processes in the background will terminate sooner or later.

  • Quote right. In your question there is unquoted $dir.

  • Why is printf better than echo?

  • Understanding IFS= read -r line.

  • How do I use null bytes in Bash?
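The first note above mentions building the filter into find with -exec. A minimal sketch of that idea follows. The function name find_data0 is hypothetical, and -print0 is assumed to be available (GNU and BSD find have it).

```shell
# Hypothetical helper: print, null-terminated, every regular file under "$1"
# that file(1) describes as exactly "data".
# The inner sh -c acts as a test: -exec ... \; is true only when the command
# exits with status 0, and -print0 runs only for files that pass.
find_data0() {
    find "$1" -type f \
        -exec sh -c '[ "$(file -b "$1" 2>/dev/null)" = data ]' sh {} \; \
        -print0
}

# Interactive use (not run here):
#   find_data0 . | fzf -m --read0 --print0 | xargs -r0 ls -l
```

This starts one extra shell per file, so it is not necessarily faster than the loop; the gain is that no separate filter function is needed and newlines in pathnames are handled by find itself.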

  • Kamil, thanks for your post. Question: How can one clear the results being sent to FZF? E.g., I can use a while loop to poll an endpoint and then pipe that to FZF (e.g., while curl... | fzf), but then it keeps stacking the results up in FZF. How can I just pipe the new results and drop/clear the old ones? – PeterM Feb 24 '23 at 15:28
  • @PeterM This looks like a separate question. It's not how the site works. Instead of asking just me, ask a new question so everyone can see it on the main page. In the body of your question link to this answer if you think it provides context. – Kamil Maciorowski Feb 24 '23 at 16:40
  • Kamil, thanks, I posted it separately. – PeterM Feb 24 '23 at 23:54