I run the following command and get a list of files, many of which are duplicate backup files that I would like to delete:

find . -type f -name '._*' 

I would like to find those files that have a corresponding file without the ._ prefix in the same directory:

  • /home/masi/._test.tex matches /home/masi/test.tex
  • /home/masi/math/lorem.png matches /home/masi/math/._lorem.png

Pseudocode for the files I want to keep: every filename that has a corresponding ._filename, and also every filename that has no ._filename

find . -type f -name '._*' -exec \ 
   find filenameWithoutDotUnderscore, if yes, print the filename

Pseudocode 2, a clarification about the files I want to remove: ._filename, if there is a corresponding filename

  • If there are both filename and ._filename in the same directory, print ._filename so that I can remove the duplicate (the ._filename).
  • Exclude names such as filenamePart1_.filenamePart2, bok_3A.pdf, ... from the ._filename matches.
  • Do not remove ._filename if there is no corresponding filename in the same directory.

Reviewing Wildcard's command

I ran Wildcard's dry-run command

find . -type f -name '._*' -exec sh -c 'for a; do f="${a%/*}/${a##*/._}"; [ -e "$f" ] && printf "rm -- %s\n" "$a"; done' find-sh {} +

but it returns too many files. I think I need more && conditions besides the existence check ([ -e "$f" ]). It would be great to add a content comparison here and, as a last step, a diff when a large difference is suspected.

Systems: Ubuntu 16.04 and Debian 8.25
Bash: 4.3.42
Find: 4.7.0

1 Answer

You can do this with find, but to do it robustly you will need to embed a shell one-liner as well. The proper way to do this is one of the following:

Stuff the looping into the spawned shell:

find . -type f -name '._*' -exec sh -c 'for a in "$@"; do f="${a%/*}/${a##*/._}"; [ -e "$f" ] && printf %s\\n "$f"; done' find-sh {} +

Or, spawn a separate shell for each file to be tested (less efficient, potentially more readable):

find . -type f -name '._*' -exec sh -c 'f="${1%/*}/${1##*/._}"; [ -e "$f" ] && printf %s\\n "$f"' find-sh {} \;
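Both commands rely on two parameter expansions to build the name of the file without the ._ prefix. As a minimal sketch of what they do, assuming a hypothetical path of the form find would pass in (the variable names dir and base are only for illustration):

```shell
a='./math/._lorem.png'   # hypothetical path, as find would hand it to the shell

dir=${a%/*}              # strip the shortest suffix matching "/*"    -> ./math
base=${a##*/._}          # strip the longest prefix matching "*/._"   -> lorem.png
f="$dir/$base"           # rejoin with a literal slash

printf '%s\n' "$f"       # prints ./math/lorem.png
```

Because find is started from ., every path it passes contains at least one slash, so ${a%/*} always has something to strip.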

To remove the ._ backup files directly, first change the command to the following for a dry run, which only prints the rm commands that would be executed:

find . -type f -name '._*' -exec sh -c 'for a; do f="${a%/*}/${a##*/._}"; [ -e "$f" ] && printf "rm -- %s\n" "$a"; done' find-sh {} +

Then once you're satisfied with the list of commands that gets printed, use:

find . -type f -name '._*' -exec sh -c 'for a; do f="${a%/*}/${a##*/._}"; [ -e "$f" ] && rm -- "$a"; done' find-sh {} +
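If you want to try the dry run somewhere harmless first, here is a rough sketch using a throwaway directory. The tree is made up for the test (the file names are borrowed from the question, ._orphan.png is hypothetical), and it assumes mktemp -d is available, as it is on Ubuntu and Debian:

```shell
# Build a scratch tree: one ._ file with a partner, one without, one unrelated file.
tmp=$(mktemp -d)
mkdir "$tmp/math"
touch "$tmp/test.tex" "$tmp/._test.tex" "$tmp/math/._orphan.png" "$tmp/math/bok_3A.pdf"

# Run the dry-run command from inside the scratch tree and capture what it prints.
# "|| true" is there because find exits non-zero when the spawned shell does,
# which happens if the last file tested in a batch has no partner.
out=$(cd "$tmp" && find . -type f -name '._*' -exec sh -c 'for a; do f="${a%/*}/${a##*/._}"; [ -e "$f" ] && printf "rm -- %s\n" "$a"; done' find-sh {} + || true)

printf '%s\n' "$out"     # prints: rm -- ./._test.tex

rm -rf "$tmp"            # clean up the scratch tree
```

Only ._test.tex is listed, since ._orphan.png has no orphan.png next to it and bok_3A.pdf does not match '._*' at all.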

Notes:

In all of these, the find-sh argument is an arbitrary string; you could put anything there. It gets set as $0 within the spawned shell and is used for error reporting.
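A quick way to see where that string ends up, as a small sketch outside of find (the argument ./._test.tex is just a stand-in for what find would substitute for {}):

```shell
# The first word after the script becomes $0, the next one becomes $1.
out=$(sh -c 'printf "name=%s arg=%s\n" "$0" "$1"' find-sh ./._test.tex)
printf '%s\n' "$out"     # prints: name=find-sh arg=./._test.tex
```

Without the placeholder, the first file name would be consumed as $0 and silently skipped by the loop.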

for a in "$@"; do is exactly equivalent to for a; do.
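To convince yourself of the equivalence, something like the following can be run in any POSIX-style shell; the three arguments are arbitrary test values:

```shell
# Set the positional parameters, including one that contains a space.
set -- 'one' 'two words' 'three'

explicit=$(for a in "$@"; do printf '[%s]' "$a"; done)
implicit=$(for a; do printf '[%s]' "$a"; done)

printf '%s\n%s\n' "$explicit" "$implicit"
# both lines read: [one][two words][three]
```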

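printf is better than echo: it prints its arguments literally under %s, whereas echo may treat a name such as -n as an option or interpret backslashes, depending on the shell.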
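As a small illustration, with two made-up names of the kind that tend to trip up echo:

```shell
# Hypothetical file names: one looks like an option, one contains a backslash.
out=$(for name in '-n' 'a\tb'; do
    printf '%s\n' "$name"    # %s prints the argument byte for byte
done)

printf '%s\n' "$out"
# prints -n on the first line and a\tb (with a literal backslash) on the second
```

With echo "$name" instead, many shells would swallow the -n as an option, and some would turn \t into a tab.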

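Quoting is important: an unquoted expansion undergoes word splitting and pathname expansion, so file names containing spaces or glob characters would be mangled.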
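A rough demonstration of what goes wrong without the quotes, using a hypothetical path with spaces and assuming the default IFS:

```shell
a='./my dir/._notes 1.txt'       # hypothetical path containing spaces
f="${a%/*}/${a##*/._}"           # ./my dir/notes 1.txt

set -- $f                        # unquoted: the shell splits the value on whitespace
unquoted=$#

set -- "$f"                      # quoted: the value stays a single word
quoted=$#

printf 'unquoted: %s words, quoted: %s word\n' "$unquoted" "$quoted"
# prints: unquoted: 3 words, quoted: 1 word
```

With the unquoted form, [ -e $f ] or rm -- $a would see three separate arguments instead of one file name.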

Wildcard
  • @Masi, the first two commands find the files whose names start in ._ and if there is a file in the same directory without the ._, print that file's name. They are system independent. – Wildcard Jun 10 '16 at 17:33
  • ${a%/*} expands to the value of the variable a with the last slash and everything after it removed; ${a##*/._} expands to the value of the variable a with everything up to the last occurrence of /._ removed. The / in between is a literal slash. See LESS='+/Parameter Expansion' man bash. – Wildcard Jun 10 '16 at 17:37
  • @Masi, do you want to remove the ._filename files or the filename files? I've bolded the relevant lines of my answer; I already included a "dry run" version.... – Wildcard Jun 10 '16 at 21:09
  • @Masi, dry run definition. I already provided that. – Wildcard Jun 10 '16 at 21:13
  • @Masi, [ -e "$f" ] is an existence check. Make a backup first. (You should have backups anyway.) Use the command in subdirectories first. Delete them manually, if you like. But your original exact question has been exactly answered. Certainly you should study up enough to understand what the command is doing before you blindly run it, but have you even attempted to understand what the command I posted does? – Wildcard Jun 21 '16 at 17:31
  • Yes, I have. I think the existence check verifies that filename exists. However, I think it is not enough. I think other && conditions are required. What do you think? – Léo Léopold Hertz 준영 Jun 21 '16 at 18:39
  • @Masi, I don't know your requirements. If you want to find files with matching names (your question), I've answered it. If you want to find files by duplicate contents, there are already answers for that. Please reread your question, and then see if my answer answers your question; if so you should accept it. If you have other peculiarities not described in your question, you should ask a new question rather than editing this one in a way that invalidates the answers given. – Wildcard Jun 21 '16 at 19:37