5

This came out of one of my comments to this question regarding the use of bc in shell scripting. bc puts line breaks in large numbers, e.g.:

> num=$(echo 6^6^3 | bc)
> echo $num
12041208676482351082020900568572834033367326934574532243581212211450\ 20555710636789704085475234591191603986789604949502079328192358826561\ 895781636115334656050057189523456

But notice that they aren't really line breaks in the variable -- or at least they are not if it is used unquoted. For example, while experimenting with a longer pipeline in the assignment:

num=$(echo 6^6^3 | bc | perl -pne 's/\\\n//g')

I realized that while there really is an \n in the bc output, checking echo $num > tmp.txt with hexdump shows the \n (ASCII 10) has definitely become a space (ASCII 32) in the variable assignment.

Or at least, it has in the output of an unquoted echo $num. Why is that?

As fedorqui points out, if you use quotes: echo "$num", you get newlines again. This is evident from comparing the output of echo $num > tmp.1 and echo "$num" > tmp.2 with hexdump; the former contains \ (backslash space) whereas the latter contains \\n (backslash newline).
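
The difference described above can be reproduced without bc. This is only a minimal sketch: the value of num is a made-up stand-in for bc's wrapped output (two short chunks joined by backslash-newline), and od -c is used in place of hexdump because it is available on most systems.

```shell
# Stand-in for bc's wrapped output: two chunks joined by backslash-newline.
# (The digits are made up; real bc output would be much longer.)
num=$(printf '123\\\n456')

unquoted=$(echo $num)      # word splitting: echo receives two arguments
quoted=$(echo "$num")      # no splitting: echo receives one argument

printf '%s\n' "$unquoted" | od -c   # the backslash is followed by a space
printf '%s\n' "$quoted"   | od -c   # the backslash is followed by \n
```

The variable itself is the same in both cases; only the way it is expanded on the echo command line differs.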

goldilocks
  • If you quote when echoing, it shows new lines but with a trailing backslash: echo "$num". – fedorqui Feb 28 '14 at 14:42
  • I find cat -A a good intermediate for looking at this kind of thing before going to hexdump or the like. E.g. echo 6^6^3 | bc | cat -A. – Graeme Feb 28 '14 at 14:45
  • @fedorqui : Interesting, since that seems to be a further interpolation -- I've added a last paragraph about this. – goldilocks Feb 28 '14 at 14:50
  • In general, quoting while echoing is important to keep the original format. Hence, you have to trust the quoted echo when working with some text. – fedorqui Feb 28 '14 at 14:52
  • @fedorqui : Right, my "last paragraph" was actually a little confused on that -- of course it isn't really a "further interpolation". Did not know about this aspect of quotes vs. non-quotes, tho. Do you know of anywhere that stipulates all the transformations that occur? – goldilocks Feb 28 '14 at 14:56
  • OK I found a reference. See Shell command language - 2.2.3 Double-Quotes --> "Enclosing characters in double-quotes ( "" ) shall preserve the literal value of all characters within the double-quotes, with the exception of the characters backquote, <dollar-sign>, and <backslash>". – fedorqui Feb 28 '14 at 15:03
  • @fedorqui : Okay, but why would backslash-newline be transformed to backslash-space when unquoted? 2.2.1 from there actually states that "A <backslash> that is not quoted shall preserve the literal value of the following character, with the exception of a <newline>. If a <newline> follows the <backslash>, the shell shall interpret this as line continuation. The <backslash> and <newline> shall be removed...the escaped <newline> is removed entirely" but clearly that is not at all what is happening. The escaped newline is being replaced by an "escaped" space character. – goldilocks Feb 28 '14 at 15:12
  • @Graeme, sed -n l is better than cat -A as it's non-ambiguous (and is standard/portable). With cat -A, if you see ^M, you don't know if it's a CR character or the two characters ^ and M or ^ followed by a meta character. – Stéphane Chazelas Feb 28 '14 at 15:12
  • @Stephane, good tip, thanks. Thought you would have the definitive answer for this... – Graeme Feb 28 '14 at 15:20
  • You're quoting a section about shell parsing, that's different from what the shell does upon variable expansion (the split+glob operator) – Stéphane Chazelas Feb 28 '14 at 15:36
  • @StephaneChazelas : Okay, so parsing of input vs. expansion of output? In that case I would still expect it to apply when the variable is assigned to. Is the "split+glob" operator real or just something you have used for explication? I can't find any reference to that anywhere else (see my answer here). – goldilocks Feb 28 '14 at 16:07
  • See http://unix.stackexchange.com/search?q=user%3A22565+split%2Bglob, or more specifically this answer to Why do I need to quote variable for if, but not for echo? (of which your question is almost a duplicate) – Stéphane Chazelas Feb 28 '14 at 16:53
  • @StephaneChazelas : I know -- I refer to one of those in my answer here. What I meant was, is that your moniker for the operation? Evidently so, which is fine by me, I just wanted to make sure there is no point looking for other explications of it (by name) in shell docs, etc. – goldilocks Feb 28 '14 at 17:12

5 Answers

7

No idea why it's there, but here's how to disable it with the GNU implementation of bc:

echo '6^6^3' | BC_LINE_LENGTH=0 bc

BC_LINE_LENGTH

This should be an integer specifying the number of characters in an output line for numbers. This includes the backslash and newline characters for long numbers. As an extension, the value of zero disables the multi-line feature. Any other value of this variable that is less than 3 sets the line length to 70.

Update:

I was confused about this question; I thought it was about the origins of the multi-line feature, which does seem like an odd one. Anyway, the real answer is that if you do not quote the variable, the shell performs word splitting on it before the result is passed to echo. Word splitting is the process where an expansion is split into 'words' according to the contents of IFS; these 'words' then become separate arguments. In the question's example, this creates three arguments to echo (one per line of bc output), which echo then separates with spaces (I knew this before Stephane commented, honest...).

To prevent this happening, just double quote the variable:

num=$(echo '6^6^3' | bc)
echo "$num"
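
One way to see the word splitting directly is to count the arguments a command receives. The sketch below assumes the default IFS; count_args is a hypothetical helper written for this illustration, and the three-line value is a made-up stand-in for bc's output.

```shell
# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

# Stand-in for three-line bc output (made-up digits).
num=$(printf '111\\\n222\\\n333')

unquoted_count=$(count_args $num)    # split on the two newlines
quoted_count=$(count_args "$num")    # passed as a single word

echo "unquoted: $unquoted_count, quoted: $quoted_count"
```

The backslashes survive in both cases; only the newlines act as separators when the variable is unquoted.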

Sometimes this is actually useful as a way to remove IFS characters from a variable (although printf %s is safer for arbitrary strings). Eg (in bash):

$ var=$'spaces:    newlines:\n\n\ntabs:\t\t\t end'

$ echo "$var"
spaces:    newlines:


tabs:            end
$ newvar="$(printf '%s ' $var)"
$ echo "$newvar"
spaces: newlines: tabs: end
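
A caveat when splitting on purpose like this: an unquoted expansion is also subject to pathname expansion (the "glob" half of the "split+glob" operator mentioned in the comments), so a value containing * or ? could expand to file names. A hedged sketch of one way to guard against that, using set -f to switch globbing off around the unquoted expansion (the sample value is made up):

```shell
# A value containing a glob character; unquoted, "*" could expand to file names.
var='a   *
b'

set -f                          # turn off pathname expansion (globbing)
newvar=$(printf '%s ' $var)     # deliberate unquoted expansion: split only
set +f                          # turn globbing back on

echo "[$newvar]"
```

Note that printf '%s ' leaves a trailing space after the last word, as it does in the example above.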
Graeme
3

echo puts a space between each pair of adjacent arguments. The shell treats the newline in $num as just a word separator (like a space).

lines="a
b
c"
set -x
echo $lines   # several arguments to echo
echo "$lines" # one argument to echo

See this answer (by the OP himself) for a more detailed explanation.
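
Which characters count as word separators is governed by IFS. As a rough sketch (count_args is a hypothetical helper, and a shell started with the default IFS is assumed), emptying IFS disables the splitting even for an unquoted variable:

```shell
lines="a
b
c"

# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

default_count=$(count_args $lines)     # default IFS: split on the newlines

old_ifs=$IFS
IFS=                                   # empty IFS: no field splitting at all
empty_ifs_count=$(count_args $lines)
IFS=$old_ifs                           # restore the previous value

echo "default IFS: $default_count, empty IFS: $empty_ifs_count"
```

Quoting the variable remains the simpler fix; changing IFS affects every unquoted expansion until it is restored.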

Hauke Laging
  • I'm going to give you the tick but for posterity refer to my answer which provides some more detail for (other) people who are caught off guard by this. – goldilocks Feb 28 '14 at 16:03
2

Use tr to delete the line continuations and the newlines:

$ num=$(echo 6^6^3 | bc)
$ echo "$num"
12041208676482351082020900568572834033367326934574532243581212211450\
20555710636789704085475234591191603986789604949502079328192358826561\
895781636115334656050057189523456
$ num=$(echo "$num" | tr -d '\n\\')
$ echo "$num"
1204120867648235108202090056857283403336732693457453224358121221145020555710636789704085475234591191603986789604949502079328192358826561895781636115334656050057189523456
glenn jackman
  • I'm actually not looking for a way to remove the newlines (there are no doubt dozens). I'm looking for an explanation as to how/why a newline ends up interpreted as a space. – goldilocks Feb 28 '14 at 14:58
  • @goldilocks, it's not interpreted as a space, it's just that you're using the split+glob operator by leaving your variables unquoted (and echo outputs its arguments separated by a space). – Stéphane Chazelas Feb 28 '14 at 15:07
  • @StephaneChazelas : Sounds like an answer. – goldilocks Feb 28 '14 at 15:15
2

In the bc man page, under expressions, it does explain the limit.

Since numbers are of arbitrary precision, some numbers may not be printable on a single output line. These long numbers will be split across lines using the "\" as the last character on a line. The maximum number of characters printed per line is 70.

X Tian
0

As per Stephane Chazelas's comment, the issue is not exactly that the newline is interpreted as a space. He's explained this elsewhere in relation to a (conceptual) "split+glob" operator, and although I don't totally follow the semantics of that, here's what's happening:

  • The unquoted variable is split on whitespace by the shell and the resulting words are passed as arguments to echo. The splitting removes the whitespace. This may be a bit confusing since "backslash-newline" seems like an escape sequence, but it is NOT equivalent to x=\\\n; it is equivalent to x=$'\\\012', \012 being octal for an actual newline -- as opposed to the escape sequence \\n, which results in the string containing a literal "\n" (backslash-n), which would not be split.

  • echo then outputs its arguments separated by a space.

To reiterate:

str1=hello\\nworld
str2=$'hello\012world'

The shell will not split the first one unquoted since it doesn't actually contain any whitespace -- it contains the escape sequence \n, whereas in the second case it will be split unquoted on the actual newline character (which is whitespace).
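
The distinction can be checked by counting the words each variable produces when left unquoted. This is a minimal sketch assuming the default IFS; count_args is a hypothetical helper, and a literal newline inside single quotes is used as a portable equivalent of the bash-specific $'hello\012world'.

```shell
# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

# Backslash followed by the letter n: the value contains no whitespace.
str1=hello\\nworld

# A real newline character inside the quotes.
str2='hello
world'

str1_words=$(count_args $str1)   # not split
str2_words=$(count_args $str2)   # split on the newline

echo "str1: $str1_words word(s), str2: $str2_words word(s)"
```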

goldilocks