I have two folders with subdirectories: one with training images and one with training labels.

  • One image belongs to exactly one label
  • The image and the label have different content
  • The image and the label can be found in similar paths. e.g.:
    • images/18/1334/image1.webp
    • labels/18/1334/image1.png
  • The filename (not the extension) is the same
  • There can be multiple files in one subdirectory

How can I remove every label which has no corresponding image (and the other way round)? For example:

images:

.
|---18
     |---a1
     |    |---a1.webp
     |    |---a11.webp
     |---a2
     |    |---a2.webp
     |---a3

labels:

.
|---18
     |---a1
     |    |---a1.png
     |    |---a11.png
     |---a2
     |    |---a2.png
     |---a3
          |---a3.png  

Okayish solution (remove files if there are no corresponding labels or images):

.
|---18
     |---a1
     |    |---a1.*
     |    |---a11.*
     |---a2
     |    |---a2.*
     |---a3

Best solution (remove the folders which are empty now, too):

.
|---18
     |---a1
     |    |---a1.*
     |    |---a11.*
     |---a2
          |---a2.*

The asterisk (*) stands for webp or png.

AG_exp

1 Answer

With find and bash:

cd to the parent directory of images and labels and run:

find . \( -name "*.webp" -o -name "*.png" \) -type f -exec bash -c '
# $1 is the file found; build the path of its counterpart in $file
if [ "${1##*.}" = "webp" ]; then
  file=${1/\/images\//\/labels\/}
  file=${file%webp}png
else
  file=${1/\/labels\//\/images\/}
  file=${file%png}webp
fi
# counterpart missing: the file found is an orphan
[ ! -f "$file" ] && echo rm "$1"
' bash {} \;

As written, this only prints the rm commands. Remove the echo to actually delete the files.
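The path mapping in the script relies on bash's ${var/pattern/replacement} expansion. As a rough sketch only, the same mapping can be written with plain prefix and suffix stripping, which also works in POSIX sh. The function name counterpart is made up for illustration, and it assumes paths that start with ./images/ or ./labels/, as printed by find . from the parent directory:

```shell
# Hypothetical helper: print the path of the file that should pair with "$1".
# Assumes paths start with ./images/ or ./labels/, as printed by `find .`.
counterpart() {
  case $1 in
    ./images/*.webp)
      rest=${1#./images/}                        # strip the leading ./images/
      printf '%s\n' "./labels/${rest%.webp}.png" # swap the extension
      ;;
    ./labels/*.png)
      rest=${1#./labels/}                        # strip the leading ./labels/
      printf '%s\n' "./images/${rest%.png}.webp" # swap the extension
      ;;
    *)
      return 1                                   # not a path this sketch handles
      ;;
  esac
}

counterpart ./images/18/1334/image1.webp   # prints ./labels/18/1334/image1.png
counterpart ./labels/18/1334/image1.png    # prints ./images/18/1334/image1.webp
```

Anchoring the match at the start of the path means a deeper subdirectory that happens to be called images or labels is left alone.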

To delete the directories that are now empty, see "How to remove all empty directories in a subtree?".
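For the directory cleanup, one possible sketch uses the -empty and -delete tests of find. Both are GNU/BSD extensions rather than POSIX, so this assumes such a find is installed. The function name prune_empty_dirs is hypothetical:

```shell
# Hypothetical helper: remove every empty directory below the given roots.
# Assumes GNU or BSD find (-mindepth, -empty and -delete are not POSIX).
prune_empty_dirs() {
  # -mindepth 1 keeps the roots themselves even if they end up empty.
  # -delete implies a depth-first walk, so a directory that only contained
  # empty directories is removed as well.
  find "$@" -mindepth 1 -type d -empty -delete
}
```

From the parent directory it would be called as prune_empty_dirs images labels, after the orphaned files have been removed.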

Freddy
  • No need to remove the echo; just pipe the output into a shell: ./my-script | sh. Almost all scripts should be written this way if you want to look at what is going to happen before you actually commit to doing it. – Jim L. Jul 01 '19 at 23:08
  • Awesome answer! Thank you very much. I still need a bit of experience to build something like that. There is only one easy-to-fix error: at the moment we delete the file which does not exist. We want to delete the file which exists but has no corresponding file. – AG_exp Jul 02 '19 at 07:56
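On the dry-run idea from the first comment: piping printed rm commands into sh makes the shell parse the file names again, so names containing spaces or shell metacharacters can break. A minimal alternative sketch is a wrapper that either prints or executes its arguments. The name maybe_run and the DRY_RUN variable are assumptions for illustration, not part of the answer:

```shell
# Hypothetical helper: print the command while DRY_RUN is 1 (the default
# in this sketch), otherwise execute it with its arguments unchanged.
maybe_run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    printf 'would run:'
    printf ' %s' "$@"
    printf '\n'
  else
    "$@"
  fi
}
```

Inside a script, maybe_run rm -- "$1" would then take the place of echo rm "$1", and setting DRY_RUN=0 switches from previewing to deleting.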