
In a zsh Terminal window under macOS, I'm trying to replace all instances of (Y, d') with \opair{Y, d'} recursively on all .tex files starting in the current directory.

The following seems to do nothing:

find . -type f -name "*.tex" -print0 | xargs -0 sed -i '.bak' -e "s/(Y, d')/\\opair{Y, d'}/g"

How do I fix this?

I did try escaping the ' (which, according to the sed docs I've seen, does not actually require escaping), but that did not fix it.

murray
  • So, seeing that you've accepted my answer, what was the problem in the end if it was not the X vs Y typo? – Stéphane Chazelas Jul 16 '22 at 17:23
  • @StéphaneChazelas: The problem was escaping the \ in \opair.... And I still do not understand why it does not suffice to have \\opair? Why is \\\\ needed? – murray Jul 17 '22 at 18:59
  • Because \ is special to the shell (one level), it should be written \\, and because it is also special to sed (a second level), it becomes \\\\. There is a difficult-to-explain issue that allows \\\ in some cases. @murray – QuartzCristal Jul 18 '22 at 14:18
  • So if I wanted to replace \this by \that with sed, it should be sed -i.bak -e "s/\\\\this/\\\\that/g" (with 4 backslashes in both strings)? – murray Jul 18 '22 at 14:44
  • Yes, exactly, test with echo '\this' | sed "s/\\\\this/\\\\that/g" .... That is because you used double quotes, exposing the \ to shell interpretation. Single quotes work like this: echo '\this' | sed 's/\\this/\\that/g' – QuartzCristal Jul 19 '22 at 01:34
  • @QuartzCristal: That greatly clarifies matters for me; thank you! – murray Jul 19 '22 at 15:39
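
To see the two layers from the comments above in action, here is a minimal sketch. It uses printf rather than echo, since some echo implementations (zsh's builtin among them) expand \t by themselves; the variable names are only for illustration.

```shell
# What the shell hands to sed: double quotes halve the backslashes,
# single quotes pass them through untouched.
arg_dq=$(printf '%s' "s/\\\\this/\\\\that/g")
arg_sq=$(printf '%s' 's/\\this/\\that/g')
printf '%s\n' "$arg_dq" "$arg_sq"   # the two arguments should be identical

# sed then reads \\ as one literal backslash, in the pattern and in the replacement.
out_dq=$(printf '%s\n' '\this' | sed "s/\\\\this/\\\\that/g")
out_sq=$(printf '%s\n' '\this' | sed 's/\\this/\\that/g')
printf '%s\n' "$out_dq" "$out_sq"

# Three backslashes also work inside double quotes: \\ becomes \, and \o is
# not an escape the shell recognises there, so that backslash is kept as well.
arg_three=$(printf '%s' "s/x/\\\opair/")
printf '%s\n' "$arg_three"
```

Printing the argument first is a cheap way to check how many backslashes survive the shell before sed ever sees them.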

1 Answer

find . -name '*.tex' -type f -exec \
  sed -i.bak -e "s/(Y, d')/\\\\opair{Y, d'}/g" {} +
  • You had X instead of Y
  • no need for xargs, when you can use the standard -exec cmd {} + syntax instead.
  • \ needs to be escaped for the shell (it is still special inside double quotes) and again for sed. Alternatively, you could use 's/(Y, d'\'')/\\opair{Y, d'\''}/g' or, in rc or in zsh after set -o rcquotes, 's/(Y, d'')/\\opair{Y, d''}/g', as \ is not special within single quotes (though the problem then shifts to how to pass the ' characters to sed).
  • for find, -name is often less expensive a test than -type so it's better to put it first (some find implementations do the reordering by themselves as an optimisation though).
  • with sed implementations other than FreeBSD's (which is also the one found on macOS), the backup suffix must be affixed to the -i option. On FreeBSD and macOS, both -i .bak and -i.bak will work, but the latter is more portable and more future-proof, as FreeBSD/macOS might choose to align with other implementations in the future.
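
A quick way to check the command before running it on real files is to try it in a scratch directory. This is a minimal sketch, assuming mktemp -d is available (it is on macOS and GNU systems); the directory layout and the sample sentence are made up for illustration.

```shell
# Build a throwaway tree with one .tex file in a subdirectory.
tmp=$(mktemp -d)
mkdir -p "$tmp/sub"
printf '%s\n' "Let (Y, d') be a metric space." > "$tmp/sub/sample.tex"

# Same command as above, pointed at the scratch directory instead of ".".
find "$tmp" -name '*.tex' -type f -exec \
  sed -i.bak -e "s/(Y, d')/\\\\opair{Y, d'}/g" {} +

# Capture the edited file and the backup sed left behind, then clean up.
result=$(cat "$tmp/sub/sample.tex")
backup=$(cat "$tmp/sub/sample.tex.bak")
printf '%s\n%s\n' "$result" "$backup"
rm -rf "$tmp"
```

If the first printed line still shows (Y, d'), the pattern did not match, which points at the quoting or at look-alike characters in the file.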

Also beware that there are a lot of characters that look the same and some that are invisible (including some control ones). For instance, are you sure the space between Y, and d' is the ASCII space (U+0020) and not, for instance, the non-breaking space (U+00A0)? Or that ' is the ASCII apostrophe (U+0027) and not U+2019, the right single quotation mark?

In vim, ga gives you information about the character under the cursor. uconv -x name < file gives you the name of each character in the input.

reveal() {
  perl -Mcharnames=full -Mopen=locale -pe 's{[^\t\n -~]}{
    sprintf "<U+%04X %s>", ord($&), charnames::viacode(ord($&))}ge' "$@"
}

This can be used to reveal characters other than space, tab, newline and ASCII printable ones (as things like <U+3000 IDEOGRAPHIC SPACE>St<U+00E9 LATIN SMALL LETTER E WITH ACUTE>phane for  Stéphane, for instance).
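
As a cheaper first check before reaching for reveal, you can simply count the bytes in a file that are not tab, newline or printable ASCII. A rough sketch, where count_odd_bytes is a hypothetical helper name and the two sample files are made up:

```shell
# Delete tab (octal 11), newline (12) and printable ASCII (40-176),
# then count whatever bytes are left over.
count_odd_bytes() {
  LC_ALL=C tr -d '\11\12\40-\176' < "$1" | wc -c | tr -d ' '
}

tmp=$(mktemp -d)
printf '%s\n' "(Y, d')" > "$tmp/plain.tex"
# Same text, but with U+00A0 (octal 302 240) and U+2019 (octal 342 200 231) in UTF-8.
printf '(Y,\302\240d\342\200\231)\n' > "$tmp/lookalike.tex"

plain_count=$(count_odd_bytes "$tmp/plain.tex")
lookalike_count=$(count_odd_bytes "$tmp/lookalike.tex")
printf '%s %s\n' "$plain_count" "$lookalike_count"
rm -rf "$tmp"
```

A non-zero count only says that something outside plain ASCII is present; it is reveal (or ga, or uconv) that tells you which characters they are.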

Also beware that with some find implementations, including GNU find on GNU systems, -name '*.tex' may fail to match file names that end in .tex but where the rest can't be decoded into characters in the current locale. For instance, it would skip a file called $'St\xe9phane.tex' in a locale that uses UTF-8 as its character encoding, as that 0xe9 byte alone can't be decoded into a character. Prefixing the command with LC_ALL=C would work around that.
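
The LC_ALL=C workaround can be tried out as follows. This is only a sketch: it assumes a filesystem that accepts arbitrary bytes in file names (some, apparently including the macOS ones, refuse names that are not valid UTF-8), and the file names are made up.

```shell
tmp=$(mktemp -d)
: > "$tmp/plain.tex"
# A name containing a lone 0xE9 byte (octal 351), which is not valid UTF-8.
# Assumption: the filesystem lets us create such a name.
bad_name=$(printf 'St\351phane.tex')
: > "$tmp/$bad_name"

# In the C locale every byte counts as a character, so * can match the whole name.
found=$(LC_ALL=C find "$tmp" -name '*.tex' -type f | wc -l | tr -d ' ')
printf '%s\n' "$found"
rm -rf "$tmp"
```

Running the same find without the LC_ALL=C prefix in a UTF-8 locale and comparing the counts shows whether your find implementation is affected.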