3

I am trying to loop through a directory tree and run a command on each file, redirecting the output to a text file named after that file. I tried using 'find':

cd /filepath/orig/v1

for dir in $(find . -type d); do
    cd $dir
    for subdir in $(find . -type d); do
        cd $subdir
        for file in $(ls); do
            echo $file
            touch $file.txt
            cdo info $file > $file.txt
        done
    done
done

But this does not work. The directory structure looks like /filepath/orig/v1/level1/level2/file.nc, but subdirectories can be more than two levels deep.

5 Answers

2

Loops are unnecessary for this. Find will do it all.

find . -type f ! -name '*.txt' -print -exec sh -c 'cdo info "$1" > "$1.txt"' sh {} \;

Note that this will clobber existing .txt files, and you might want a more specific filename filter than "not *.txt", such as -name '*.nc'.

user10489
  • 6,740
1

If you have a fixed directory structure of two levels:

shopt -s dotglob nullglob

for pathname in /filepath/orig/v1/*/*; do
    [[ $pathname == *.txt ]] && continue

    printf 'Processing "%s"\n' "$pathname" >&2
    cdo info "$pathname" >"$pathname.txt"
done

This first enables the dotglob and nullglob shell options. These options allow globbing patterns to match hidden names (dotglob) and ensure that patterns that match nothing are removed completely (nullglob; this means the loop would not run a single iteration if /filepath/orig/v1/*/* does not match any names).

Any name in our loop that already ends with .txt is skipped, and the rest is processed with cdo info to generate a .txt file (note that I don't know what cdo info actually does). Note that there is no need to touch the filename first as the file would be created by virtue of redirecting into it.
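To make the effect of the two options concrete, here is a minimal bash sketch. It builds a throwaway tree with mktemp (the layout and file names are hypothetical, not the asker's data) and counts loop iterations with and without nullglob, then counts matches with dotglob enabled.

```shell
#!/bin/bash
# Scratch tree standing in for /filepath/orig/v1 (hypothetical layout).
top=$(mktemp -d)
mkdir -p "$top/level1"
touch "$top/level1/a.nc" "$top/level1/.hidden.nc"

# Default options: an unmatched pattern is left as literal text,
# so the loop runs once with a name that does not exist.
shopt -u nullglob dotglob
default_count=0
for pathname in "$top"/*/*.missing; do
    default_count=$((default_count + 1))
done

# nullglob: the unmatched pattern expands to nothing, so zero iterations.
shopt -s nullglob
nullglob_count=0
for pathname in "$top"/*/*.missing; do
    nullglob_count=$((nullglob_count + 1))
done

# dotglob: * also matches names that start with a dot.
shopt -s dotglob
matches=( "$top"/*/* )
dotglob_count=${#matches[@]}

printf 'default=%s nullglob=%s dotglob=%s\n' \
    "$default_count" "$nullglob_count" "$dotglob_count"
rm -rf "$top"
```

Both options are bash-specific, so a script relying on them needs to run under bash rather than a plain POSIX sh.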


If you know you will only process files with names ending in .nc:

shopt -s dotglob nullglob

for pathname in /filepath/orig/v1/*/*.nc; do
    printf 'Processing "%s"\n' "$pathname" >&2
    cdo info "$pathname" >"$pathname.txt"
done


If you want to process all files with names ending in .nc anywhere beneath /filepath/orig/v1:

find /filepath/orig/v1 -type f -name '*.nc' -exec sh -c '
    for pathname do
        printf "Processing \"%s\"\n" "$pathname" >&2
        cdo info "$pathname" >"$pathname.txt"
    done' sh {} +

This calls a short in-line script for batches of found regular files with names ending in .nc.

You could also use /filepath/orig/v1/*/ as the search path with find to only search the subdirectories of /filepath/orig/v1 and not /filepath/orig/v1 itself.
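The batching behaviour of -exec ... {} + can be tried out safely on a scratch tree. In this sketch the directory layout and file names are made up, and wc -c is only a stand-in for cdo info so that it runs without CDO installed. The in-line script reports how many pathnames it received in one invocation.

```shell
# Scratch tree standing in for /filepath/orig/v1 (hypothetical layout).
top=$(mktemp -d)
mkdir -p "$top/level1/level2" "$top/other dir"
printf 'abc' > "$top/level1/level2/file.nc"
printf 'defgh' > "$top/other dir/my file.nc"
printf 'x' > "$top/level1/notes.log"

# wc -c is a stand-in for "cdo info"; swap in the real command as needed.
find "$top" -type f -name '*.nc' -exec sh -c '
    printf "one sh invocation, %s file(s)\n" "$#" >&2
    for pathname do
        wc -c < "$pathname" > "$pathname.txt"
    done' sh {} +

# One .txt should now sit next to each .nc file, and none next to notes.log.
created=$(find "$top" -type f -name '*.txt' | wc -l)
echo "created $created .txt files"
rm -rf "$top"
```

Because each pathname arrives as a separate positional argument and is always expanded inside double quotes, names containing spaces (like "my file.nc" here) are handled without extra work.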

Kusalananda
  • 333,661
0

I dropped 'find' because I had trouble understanding how it works, but this seems to have worked:

orig_dir='/filepath/orig/v1'

for entry in "$orig_dir"/*/*; do
    cd "$entry"
    x=$(ls *.nc)
    echo "$x"
    name=$(basename "$x" .nc)
    cdo info "$x" > new_path/"$name".txt
done

0

With GNU Parallel:

doit() {
  dir="$1"
  file="$2"
  cd "$dir"
  echo "$file"
  touch "$file".txt
  cdo info "$file" > "$file".txt
}
export -f doit
# 2 level only
printf "%s\0" */*/* | parallel -0 doit {//} {/}
# any level
find . -type f -print0 | parallel -0 doit {//} {/}

If you do not need the echo, touch, and if cdo can work on full path it can be shorter:

# 2 level only
printf "%s\0" */*/* | parallel -0 'cdo info {} > {}.txt'
# any level
find . -type f -print0 | parallel -0 'cdo info {} > {}.txt'

Contrary to xargs' shell code, {} is safe here.

If you want foo.nc to result in foo.txt:

# 2 level only
printf "%s\0" */*/*.nc | parallel -0 'cdo info {} > {.}.txt'
# any level
find . -type f -name '*.nc' -print0 | parallel -0 'cdo info {} > {.}.txt'
Ole Tange
  • 35,514
0

If you're using GNU or BSD find, you can use the -execdir option. It's the same as -exec except that it changes into the directory containing the file(s) first (and if you're using + instead of ; to terminate the -execdir, it batches up the files in the same directory to minimise the amount of forking per directory). e.g.

find . -type f -execdir \
  sh -c 'for f; do printf "%s\n" "$f" ; cdo info "$f" > "$f.txt"; done' sh {} +

Notes:

  1. for f; do is the same as for f in "$@"; do

  2. The first arg to the sh -c '...' command is sh. That's the name that will be used in the process table for the sh -c being executed by -exec or -execdir, i.e. $0. You can use any arbitrary name you like there; sh or find-sh are commonly used. If it's not there, the shell script will not see the first filename found by find, because that filename becomes $0 instead. This is specific to sh -c (and some other commands, usually script interpreters, like bash -c); it is not required for most commands that you might want to run with find -exec or -execdir (e.g. grep and sed don't need it).

  3. This uses -type f because, even though we want find to cd into the directory containing files, we only want to process regular files, not directories (or sockets, named pipes, symlinks, etc). If you want to process regular files and symlinks, use either find's -L option or \( -type f -o -type l \). Note that -L will follow symlinks to directories outside of your search tree, which is not usually what you want.

    If using \( -type f -o -type l \), the embedded sh -c script should check each argument to be sure that it (e.g. "$f" in my examples) is either a regular file or a symlink pointing to a regular file (test -f will do this for both because, as documented in help test and man test, "Except for -h and -L, all FILE-related tests dereference symbolic links.").

    find . \( -type f -o -type l \) -execdir \
      sh -c 'for f; do
               printf "%s\n" "$f"
               [ -f "$f" ] && cdo info "$f" > "$f.txt"
             done' sh {} +
    
  4. All variable expansions in the sh -c script are double-quoted, as they should be (see "Why does my shell script choke on whitespace or other special characters?" for why).
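Notes 1 and 2 can be checked without find at all. This small sketch calls sh -c directly with three made-up file names, once without the placeholder argument and once with it, and also compares the short and explicit loop forms.

```shell
# Without a placeholder, the first argument becomes $0 and the loop skips it.
without=$(sh -c 'for f; do printf "%s " "$f"; done' a.nc b.nc c.nc)

# With "sh" as the placeholder, all three names end up in "$@".
with=$(sh -c 'for f; do printf "%s " "$f"; done' sh a.nc b.nc c.nc)

# "for f" and "for f in "$@"" iterate over the same list.
explicit=$(sh -c 'for f in "$@"; do printf "%s " "$f"; done' sh a.nc b.nc c.nc)

printf 'without placeholder: %s\n' "$without"
printf 'with placeholder:    %s\n' "$with"
printf 'explicit "$@":       %s\n' "$explicit"
```

The first line of output lists only b.nc and c.nc, which is the "missing first filename" symptom described in note 2.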


If you need to limit the search depth, you can use the -maxdepth option. e.g.

find . -maxdepth 2 -type f -execdir \
  sh -c 'for f; do printf "%s\n" "$f" ; cdo info "$f" > "$f.txt"; done' sh {} +

find also has related options like -d or -depth, and -mindepth for controlling how it traverses a directory tree.
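As a rough illustration of how the depth options count levels, the following sketch runs find on a scratch tree (the names are hypothetical). The starting point itself is depth 0, so files directly inside it are at depth 1. Note that -maxdepth and -mindepth are supported by GNU and BSD find but may be missing from other implementations.

```shell
# Scratch tree: one file at each of three depths below the starting point.
top=$(mktemp -d)
mkdir -p "$top/level1/level2"
touch "$top/top.nc" "$top/level1/one.nc" "$top/level1/level2/two.nc"

# -maxdepth 2 reaches ./top.nc and ./level1/one.nc, but not level2/two.nc.
shallow=$(cd "$top" && find . -maxdepth 2 -type f | LC_ALL=C sort)

# -mindepth 2 skips anything directly inside the starting directory.
deep=$(cd "$top" && find . -mindepth 2 -type f | LC_ALL=C sort)

printf 'maxdepth 2:\n%s\n' "$shallow"
printf 'mindepth 2:\n%s\n' "$deep"
rm -rf "$top"
```

Combining both options with the same value (for example -mindepth 2 -maxdepth 2) selects exactly one level, which is one way to mimic the */* glob with find.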


PS: I don't know what the cdo command does or what arguments it takes but if it supports using -- to mark the end of options and the start of filename args, you should include it in the command, otherwise filenames beginning with - may be treated as options to cdo. e.g.

find . -type f -execdir \
  sh -c 'for f; do printf "%s\n" "$f" ; cdo info -- "$f" > "$f.txt"; done' sh {} +

This is (part of) the reason why I used printf instead of echo. See Why is printf better than echo?
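A short sketch of both points, using a file literally named -n. Here cat is a stand-in for any command that honours --; whether cdo does is not something this thread establishes.

```shell
f='-n'

# echo may treat the name as an option; in many shells this prints nothing.
echo_out=$(echo "$f")

# printf with a fixed format string prints the name literally.
printf_out=$(printf '%s\n' "$f")

# A scratch directory holding a file whose name starts with a dash.
dir=$(mktemp -d)
printf 'data\n' > "$dir/-n"

# "--" ends option parsing, so cat reads the file instead of parsing -n.
content=$(cd "$dir" && cat -- "$f")

printf 'echo gave:   [%s]\n' "$echo_out"
printf 'printf gave: [%s]\n' "$printf_out"
printf 'cat -- gave: [%s]\n' "$content"
rm -rf "$dir"
```

If a command does not support --, prefixing the name with ./ (as in "./$f") is a common alternative that also keeps it from looking like an option.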

cas
  • 78,579