
I have created a simple script that I want to use to rename files based on a pattern.

The script uses find and sed to do its work.

Everything basically works, except that when I use $ to tell sed to match only the last occurrence, it fails to match anything, so no renaming is performed. If I remove the $, renaming works, but all occurrences are replaced, which is not what I want: I specifically want to target the last occurrence.

I've followed the docs and tutorials on YouTube, and searched Stack Overflow, Stack Exchange, etc. None of the things I have found work or are even that relevant to my issue, which is specifically that the $ address character is not working for me.

I've tested the script below. It demonstrates using sed with and without the $ character so you can see the issue I'm having. The logic is organized into 2 functions, searchReplace() and searchReplaceLastMatchOnly(). The searchReplaceLastMatchOnly() function is the one I need working, but it fails to match anything.

My test dir is structured as follows:

./bar
./bar/baz
./bar/foobar.txt 
./bar/baz/foobar.txt

Running script in test dir as follows:

./script.sh -d "." -s "bar" -r "Bar" -p "*.txt" -e "txt"

Should change:

./bar/foobar.txt to ./bar/fooBar.txt

and:

./bar/baz/foobar.txt to ./bar/baz/fooBar.txt

--- Actual Results ---

Using searchReplace:

Renaming ./bar/baz/foobar.txt to ./Bar/baz/fooBar.txt

Renaming ./bar/foobar.txt to ./Bar/fooBar.txt

Using searchReplaceLastMatchOnly:

Renaming ./bar/baz/foobar.txt to ./bar/baz/foobar.txt

Renaming ./bar/foobar.txt to ./bar/foobar.txt

Here is the complete script:

# Search and replace using sed.
# $1 target The string to process
# $2 The string to search for
# $3 The string that will replace the search string
# Usage Example:
# result="$(searchReplace "targetToSearch" "Search" "Replace")"
# printf "%s" "$result" # should output "targetToReplace"
function searchReplace() {
  # spaces are set as defaults
  target=${1:- }
  search=${2:- }
  replace=${3:- }
  result="$(printf "%s" "$target" | sed "s/${search}/${replace}/g")"
  printf "%s" "$result"
}

# Prints via printf with wrapping newline(\n) chars.
# Note: If no arguments are supplied, prettyPrint
#       will simply print a default string followed by
#       a newline character (\n).
# Usage Example:
# txt="Text"
# prettyPrint "$txt"
function prettyPrint() {
  # Set default text to print in case no arguments were passed
  # (at the moment this is an empty string)
  text=${1:-}
  [[ -z $text ]] && printf "\n" || printf "\n%s\n" "$text"
  #
}

# Get option values
while getopts "d:p:s:r:e:" opt; do
  case $opt in
  d) dirPath="$OPTARG" ;;
  p) pattern="$OPTARG" ;;
  s) search="$OPTARG" ;;
  r) replace="$OPTARG" ;;
  e) fileExt="$OPTARG" ;;
  *)
    prettyPrint "Error: Invalid flag $opt"
    exit 1
    ;;
  esac
done

# Defaults #
dirPath=${dirPath:-.}
pattern=${pattern:-*}
search=${search:- }
replace=${replace:- }
fileExt=${fileExt:-txt}

prettyPrint "Using searchReplace:"
find "$dirPath" -type f -name "$pattern" | while IFS= read -r original; do
  modified="$(searchReplace "$original" "$search" "$replace")"
  prettyPrint "Renaming $original to $modified"
  #mv "$original" "$modified" | This is the goal...not using it until renaming is working.
done

# The dev directory is structured as follows:
# .
# ./bar/foobar.txt
# ./bar/baz/foobar.txt
#
# This script when run as follows:
#
# ./script.sh -d "." -s "bar" -r "Bar" -p "*.txt" -e "txt"
#
# Should rename:
# ./bar/foobar.txt to ./bar/fooBar.txt
#
# and also rename:
# ./bar/baz/foobar.txt to ./bar/baz/fooBar.txt
#
# However when I run this script it renames as follows:
#
# ./bar/baz/foobar.txt to ./Bar/baz/fooBar.txt
#
# ./bar/foobar.txt to ./Bar/fooBar.txt
#
# As you can see, the ./bar directory is also renamed to Bar; I just want the last
# occurrence of bar to be renamed to Bar.
#
# I tried modifying the sed command in the searchReplace() function
# from sed "s/${search}/${replace}/g" to sed "s/${search}$/${replace}/g"
# as I read that the $ should tell sed to match only the last occurrence,
# but it doesn't work.
# Below is the modified searchReplace() function that uses the $
#
function searchReplaceLastMatchOnly() {
  # spaces are set as defaults
  target=${1:- }
  search=${2:- }
  replace=${3:- }
  result="$(printf "%s" "$target" | sed "s/${search}$/${replace}/g")"
  printf "%s" "$result"
}

prettyPrint "Using searchReplaceLastMatchOnly:"
# here it is running
find "$dirPath" -type f -name "$pattern" | while IFS= read -r original; do
  modified="$(searchReplaceLastMatchOnly "$original" "$search" "$replace")"
  prettyPrint "Renaming $original to $modified"
  #mv "$original" "$modified" | This is the goal...not using it until renaming is working.
done
Freddy
  • Your question would benefit from some formatting. You have formatting tools available when you [edit] the question. See also https://unix.meta.stackexchange.com/questions/5308 for example. – Kusalananda Oct 10 '19 at 20:53
  • Your question could also be reduced to two or three lines. It's kind of unreasonable to expect anyone to read all that without paying them some fee first. – jesse_b Oct 10 '19 at 21:11
  • this script seems like a convoluted way to avoid learning regular expressions. and the perl rename command. – cas Oct 11 '19 at 00:10

2 Answers

2

To replace only the last match, replace

result="$(printf "%s" "$target" | sed "s/${search}$/${replace}/g")"

with

result="$(printf "%s" "$target" | sed "s/\(.*\)${search}/\1${replace}/")"

The $ anchors the match to the end of the line; it does not mean "the last occurrence of the search pattern". The g modifier is also unnecessary, since there is (at most) one replacement.

The \(.*\) is greedy: sed lets it consume as much of the line as possible while still leaving a match for $search, so the $search that follows it is the last occurrence. Since we don't want to delete the part captured by \(.*\), we have to put it back as \1 in the replacement.
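To see the difference side by side, here is a small sketch that runs both substitutions on one of the paths from the question. The search and replacement strings are hard-coded for illustration only.

```shell
#!/bin/bash
# Sample path taken from the question's test directory.
path="./bar/baz/foobar.txt"

# Global replace: every "bar" changes, including the directory name.
all=$(printf '%s' "$path" | sed 's/bar/Bar/g')

# Greedy \(.*\) swallows everything up to the last "bar"; \1 puts it back.
last=$(printf '%s' "$path" | sed 's/\(.*\)bar/\1Bar/')

printf '%s\n' "$all" "$last"
# prints:
# ./Bar/baz/fooBar.txt
# ./bar/baz/fooBar.txt
```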

Notes:

  • The script is missing a shebang #!/bin/bash in the first line.
  • -e is not (yet) implemented.
  • There are a lot of variables which are only used once. Removing them would make your code much cleaner, e.g. the searchReplaceLastMatchOnly() function could be reduced to

    function searchReplaceLastMatchOnly() {
      printf "%s" "${1:- }" | sed "s/\(.*\)${2:- }/\1${3:- }/"
    }
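
One caveat with replacing the last match in the whole path: if the file name itself does not contain the search string, the last match may be a directory name. A possible way around that, sketched below, is to apply the substitution to the file name part only. The function name replaceLastInBasename is made up for this example, and the sketch assumes that the path contains at least one slash (as find output under "." does) and that the search and replacement strings contain no characters special to sed.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): replace the last
# occurrence of $2 with $3, but only within the file name part of path $1.
# Assumes $1 contains at least one "/" and that $2 and $3 contain no
# characters special to sed (such as /, &, . or *).
function replaceLastInBasename() {
  local dir="${1%/*}"    # everything before the final slash
  local base="${1##*/}"  # everything after the final slash
  base="$(printf '%s' "$base" | sed "s/\(.*\)${2}/\1${3}/")"
  printf '%s' "$dir/$base"
}

replaceLastInBasename "./bar/foobar.txt" "bar" "Bar"; printf '\n'
replaceLastInBasename "./bar/notes.txt" "bar" "Bar"; printf '\n'
# prints:
# ./bar/fooBar.txt
# ./bar/notes.txt
```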
    
Freddy
  • Thank you for your notes. Any advice/feedback is always helpful. Can you explain why I would want to use the -e option? I saw it mentioned in examples and the docs, but I'm not clear on why I would need/want it. The docs say -e will "add the script to the commands to be executed" but I don't really understand what that implies. – Sevi Foreman Oct 11 '19 at 01:06
  • @SeviForeman: He was talking about the -e option in your script's getopts section that sets the fileExt variable. That is never used in your code. – jesse_b Oct 11 '19 at 13:08
1

The $ anchor:

Matches the end of a string without consuming any characters. If multiline mode is used, this will also match immediately before a newline character.

This is different from "last occurrence": your strings end with txt, not bar, so the pattern bar$ does not match.

One way to match only the last occurrence would be to use rev to reverse your string and replace only the first occurrence (of course, the search and replacement strings also need to be reversed!).
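
A quick sketch of what the anchor does and does not match, using a path from the question. The second command is only an illustration of a case where $ is appropriate: it anchors the search string together with the extension, which works only when the match sits directly before the extension. The variable names mirror the question's script, but the values are hard-coded here.

```shell
#!/bin/bash
path="./bar/baz/foobar.txt"
search="bar"
replace="Bar"
fileExt="txt"

# "bar$" requires the line to END with "bar"; this one ends with "txt",
# so nothing is replaced.
noMatch=$(printf '%s' "$path" | sed "s/${search}\$/${replace}/")

# Anchoring the search string plus ".txt" at the end of the line does match.
withExt=$(printf '%s' "$path" | sed "s/${search}\.${fileExt}\$/${replace}.${fileExt}/")

printf '%s\n' "$noMatch" "$withExt"
# prints:
# ./bar/baz/foobar.txt
# ./bar/baz/fooBar.txt
```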

function searchReplaceLastMatchOnly() {
  # spaces are set as defaults
  local target=$(rev <<<"${1:- }")
  local search=$(rev <<<"${2:- }")
  local replace=$(rev <<<"${3:- }")
  local esearch=$(printf '%s\n' "$search" | sed 's:[][\/.^$*]:\\&:g')
  local ereplace=$(printf '%s\n' "$replace" | sed 's:[\/&]:\\&:g;$!s/$/\\/')
  local result="$(printf "%s" "$target" | sed "s/${esearch}/${ereplace}/" | rev)"
  printf "%s" "$result"
}

Note: the g flag has been removed from the sed command, so only the first occurrence (in the reversed string) is replaced. Additionally, the search and replacement strings are escaped before being passed to sed, so that characters such as . or / are treated literally rather than as regex or delimiter characters.
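
If the search string is always meant literally, the escaping can be avoided altogether by not using sed. The following is a different technique from the rev approach above: a sketch using bash parameter expansion, where the function name replaceLastOccurrence is made up for this example. Quoting "$search" inside the expansion makes bash treat it as a literal string rather than a glob pattern.

```shell
#!/bin/bash
# Hypothetical helper (not from the original script): replace the last
# literal occurrence of $2 in $1 with $3 using only parameter expansion.
function replaceLastOccurrence() {
  local target="$1" search="$2" replace="$3"
  # Return the input unchanged if the search string is empty or absent.
  if [[ -z $search || $target != *"$search"* ]]; then
    printf '%s' "$target"
    return
  fi
  local head="${target%"$search"*}"   # text before the last occurrence
  local tail="${target##*"$search"}"  # text after the last occurrence
  printf '%s' "${head}${replace}${tail}"
}

replaceLastOccurrence "./bar/baz/foobar.txt" "bar" "Bar"; printf '\n'
# prints: ./bar/baz/fooBar.txt
```

Because nothing is interpreted as a regular expression here, a search string such as "a.b" only matches a literal "a.b".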

jesse_b