
I'm on a Linux system and I'm trying to use 'column' on a file that I've spiked with an extended character to use as the separator. The reason is that any normal printing character is liable to show up where I don't want separation to occur. So I use 'sed' to find only those places where I do want the columns to separate and add an extended character there; using that same extended character as the 'column' separator, I should be OK.
Alas, trying to use hex character AE:

$ column -ts\xAE junk1  
column: Invalid or incomplete multibyte or wide character

... or any other extended character doesn't work, and I've tried every combination of quotes and other tricks I can think of. But it sounds like 'column' is open to using extended characters, so I just have to enter the thing properly.

terdon
Ray Andrews

1 Answer


If you want to print the Unicode lowercase æ, which is U+00E6, you can use this:

$ printf '\u00E6\n' 
æ

So, if your file looks like this:

$ printf 'foobarbaz\u00E6bar\u00E6baz bar something else whohooo!\n' 
foobarbazæbaræbaz bar something else whohooo!

You could use column like so:

$ printf 'foobarbaz\u00E6bar\u00E6baz bar something else whohooo!\n' | 
    column -ts$'\u00E6' -o "::::::::"
foobarbaz::::::::bar::::::::baz bar something else whohooo!

Note the ANSI-C quoting format ($'characterCode'), which makes the shell expand the escape before column ever sees it; see What does it mean to have a $"dollarsign-prefixed string" in a script? I used -o "::::::::" so you can easily see the columns.
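To see why the quoting matters, it can help to look at the bytes the shell actually hands to the command. The following is only a sketch, assuming bash (for the $'...' syntax) and the standard od and tr tools; hex_bytes is a hypothetical helper name, not part of any tool. It also suggests a likely reason for the error in the question: in a UTF-8 locale, a lone 0xAE byte is not a complete character, whereas U+00AE is encoded as the two bytes C2 AE.

```bash
#!/usr/bin/env bash
# hex_bytes (hypothetical helper): print the bytes of its argument as plain hex.
hex_bytes() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

# Unquoted, the shell simply drops the backslash, so the command
# receives the three ASCII letters "xAE".
plain=$(hex_bytes \xAE)          # 784145

# ANSI-C quoting with \x yields the single byte 0xAE, which on its own
# is not a valid UTF-8 sequence.
lone=$(hex_bytes $'\xAE')        # ae

# The two-byte UTF-8 encoding of U+00AE, spelled out byte by byte.
utf8=$(hex_bytes $'\xC2\xAE')    # c2ae

printf 'unquoted:  %s\nlone byte: %s\nUTF-8:     %s\n' "$plain" "$lone" "$utf8"
```

In a UTF-8 locale, $'\u00AE' should expand to the same two bytes as the last case, which is presumably why the \u form works as a column separator where a bare \xAE does not.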

Stephen Kitt
terdon
  • @RayAndrews what is \x7f supposed to be? If you expect it to be a newline, you again need to use ANSI escaping: -ts $'\x7f'. – terdon Jan 15 '21 at 16:37
  • Yeah those were my failed efforts. Sheesh, you'd think the man page would be more helpful. – Ray Andrews Jan 15 '21 at 19:11