
I have a sed replacement command that I would like to be compatible with both BSD sed and GNU sed. Extended regular expressions are not an issue, as I do not need them in this case. My primary problem is the difference in how the two seds interpret character escape sequences in replacement strings. My replacement string contains tabs and newlines, and I'd like them to be visible in the command string for ease of maintenance. However, GNU sed interprets the escape sequences and BSD sed doesn't. What is the appropriate way to instruct sed to interpret these escape sequences on BSD? The following two snippets epitomize my problem:

GNU sed

echo ABC | sed 's/B/\n\tB\n/'

yields

A
    B
C

BSD sed

echo ABC | sed 's/B/\n\tB\n/'

yields

AntBnC

Clearly, \n and \t aren't interpreted as escape sequences by BSD sed.

Now, to my question. According to the BSD sed manpage:

To specify a newline character in the replacement string, precede it with a backslash.

Does this imply that I'd need to precede a literal newline by a backslash? What is the appropriate way to instruct sed to interpret escape sequences like \n in the replacement text?

ephsmith
  • BSD sed is not GNU sed, and I don't think it supports such escapes in the output. You'll have to either insert literal characters, install GNU sed, or switch to something that does support such escapes like awk. – jw013 Jul 04 '12 at 17:11
  • @jw013, I'm clear on the differences between the two. Installing GNU sed is not an option. I was hoping to find enough common ground between the two to accomplish what I'm after with sed. In the end it probably will make sense to use awk. So what do you think about the interpretation of the BSD sed manpage I quoted? – ephsmith Jul 04 '12 at 17:16
  • Yes, you will need to use literal tabs and newlines, and with newlines you also need to precede them with a backslash, which is basically just a line-continuation mechanism. – jw013 Jul 04 '12 at 17:18
  • @jw013, thanks for your great replies. At this point, for the sake of maintenance, I'll take your advice and rework my solution in awk. – ephsmith Jul 04 '12 at 17:57
  • Good choice - awk is a much better plan than the currently accepted answer :) – jw013 Jul 05 '12 at 02:39

3 Answers


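You can use bash's $'...' quoting to have the shell interpret the escapes before the string is passed to sed. A newline in the replacement must still be preceded by a backslash for BSD sed, so write it as \\\n inside $'...'.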

From the bash man page:

   Words  of  the  form  $'string'  are  treated specially.  The word
   expands to string, with backslash-escaped characters  replaced  as
   specified  by the ANSI C standard.  Backslash escape sequences, if
   present, are decoded as follows:
          \a     alert (bell)
          \b     backspace
          \e     an escape character
          \f     form feed
          \n     new line
          \r     carriage return
          \t     horizontal tab
          \v     vertical tab
          \\     backslash
          \'     single quote
          \nnn   the eight-bit character whose  value  is  the  octal
                 value nnn (one to three digits)
          \xHH   the eight-bit character whose value is the hexadeci-
                 mal value HH (one or two hex digits)
          \cx    a control-x character

   The expanded result is single-quoted, as if the  dollar  sign  had
   not been present.

   A  double-quoted  string  preceded by a dollar sign ($) will cause
   the string to be translated according to the current  locale.   If
   the  current locale is C or POSIX, the dollar sign is ignored.  If
   the string is translated and replaced, the replacement is  double-
   quoted.
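As a minimal sketch of this approach applied to the command from the question: it assumes a shell that supports $'...' (bash, ksh93, zsh), so it will not work in a strictly POSIX sh such as dash. The variable name result is only for illustration.

```shell
#!/usr/bin/env bash
# Inside $'...': \\ becomes a backslash, \n a newline, \t a tab.
# sed therefore receives: s/B/\<newline><tab>B\<newline>/
# The backslash before each newline is what BSD sed requires in a replacement.
result=$(echo ABC | sed $'s/B/\\\n\tB\\\n/')
printf '%s\n' "$result"
```

The escapes stay visible in the command string, and the same script text should be accepted by both GNU sed and BSD sed, because sed itself only ever sees literal characters.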
Kevin

If you need to write portable scripts, you should stick to features in the POSIX standard (a.k.a. the Single UNIX Specification, a.k.a. the Open Group Base Specifications). Issue 7, a.k.a. POSIX.1-2008, is the latest at the time of writing, but many systems haven't finished adopting it yet. Issue 6, a.k.a. POSIX.1-2001, is by and large provided by all modern unices.

In sed, the meaning of escape sequences like \t and \n is not portable, except that in a regex, \n stands for a newline. In the replacement text for an s command, \n is not portable, but you can use the sequence backslash-newline to stand for a newline.
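A small sketch of the backslash-newline form in a replacement text, using only plain sh syntax. The input ABC comes from the question; the variable name result is an illustrative choice.

```shell
# Each line break in the replacement is written as a backslash
# followed by a real newline inside the single-quoted script.
result=$(echo ABC | sed 's/B/\
B\
/')
printf '%s\n' "$result"
```

This prints A, B and C on separate lines. The script is less compact than the \n form, but it does not depend on any sed extension.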

A portable way to generate a tab character (or any other character expressed in octal) is with tr. Store the character in a shell variable and substitute this variable in the sed snippet.

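# Turn the newline printed by echo into the wanted character.
tab=$(echo | tr '\n' '\t')
escape=$(echo | tr '\n' '\033')
# Wrap each input line in the ANSI bold-on and bold-off sequences.
embolden () {
  sed -e 's/^/'"$escape"'[1m/' -e 's/$/'"$escape"'[0m/'
}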

Note again that newlines need to be expressed differently in regexes and in s replacement texts.
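To make the contrast concrete, here is a hedged sketch that puts both sides next to each other. It reuses the tr trick for the tab and reproduces the output the question asks for; the variable names indented and joined are made up for this example.

```shell
# A literal tab, generated portably with tr.
tab=$(echo | tr '\n' '\t')

# Replacement side: backslash-newline for each line break, "$tab" for the tab.
indented=$(echo ABC | sed 's/B/\
'"$tab"'B\
/')

# Regex side: N appends the next input line to the pattern space,
# and \n in the regex matches the embedded newline.
joined=$(printf 'A\nB\n' | sed 'N;s/\n/,/')

printf '%s\n' "$indented" "$joined"
```

The first command yields A, a tab-indented B, and C on three lines; the second joins the two input lines into A,B.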

You might want to use awk instead. It allows backslash escapes, including octal escapes \ooo, in every string literal.
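For comparison, a minimal awk version of the substitution from the question might look like this. It is a sketch, not a drop-in replacement for a longer sed script; the variable name result is illustrative.

```shell
# awk expands \n and \t in its own string literals,
# so the escapes can stay visible in the program text.
result=$(echo ABC | awk '{ sub(/B/, "\n\tB\n"); print }')
printf '%s\n' "$result"
```

Note that sub replaces only the first match on each line, like s without the g flag; gsub would be the counterpart of s///g.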


This has been answered over on Stack Overflow:

https://stackoverflow.com/questions/1421478/how-do-i-use-a-new-line-replacement-in-a-bsd-sed

It's pretty much exactly what jw013 said.

To insert a literal tab, type Ctrl+V followed by Tab.

bahamat