6

Remove ^M character from log files.

In my script I redirect the output of my program to a log file. The log file ends up containing some ^M (carriage return) characters. I need to remove them while the program is running, as the output is being written.

My command:

$ java -jar test.jar >> test.log 

test.log has:

Starting script ... ^M Starting script ...Initializing

Ram
  • 1,041

4 Answers

18

Converting a standalone file

If you run the following command:

$ dos2unix <file>

The <file> will have the ^M characters at its line endings stripped. If you want to leave <file> intact, then simply run dos2unix like this:

$ dos2unix -n <file> <newfile>

Parsing output from a command

If you need to do this as part of a chain of commands via a pipe, you can use any number of tools such as tr, sed, awk, or perl.

tr

$ java -jar test.jar | tr -d '^M' >> test.log

sed

$ java -jar test.jar | sed 's/^M//g' >> test.log

awk

$ java -jar test.jar | awk '{ gsub(/^M/,""); print }' >> test.log

perl

$ java -jar test.jar | perl -p -e 's/^M//g' >> test.log

Typing ^M

When entering the ^M be sure to enter it in one of the following ways:

  1. As Control + v followed by Control + m, and not as the two characters ^ (Shift + 6) and M.
  2. As a backslash r, i.e. (\r).
  3. As an octal number (\015).
  4. As a hexadecimal number (\x0D).
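
As a quick check of the escaped forms, here is a minimal sketch that counts carriage returns before and after stripping them with tr. The count_cr helper is a made-up name for illustration, and the input is generated with printf rather than taken from a real log. The hexadecimal form is left out because not every tool accepts it.

```shell
# Hypothetical helper: count the carriage returns arriving on stdin.
count_cr() {
    tr -cd '\r' | wc -c | tr -d ' '
}

# Two CRLF-terminated lines, so two carriage returns to begin with.
before=$(printf 'hi there\r\nbye there\r\n' | count_cr)

# Strip them using the backslash-r form and the octal form.
after_r=$(printf 'hi there\r\nbye there\r\n' | tr -d '\r' | count_cr)
after_octal=$(printf 'hi there\r\nbye there\r\n' | tr -d '\015' | count_cr)

echo "before=$before after_r=$after_r after_octal=$after_octal"
```

Both forms should leave no carriage returns behind, without having to type a literal control character.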

Why is this necessary?

The ^M is how a carriage return character is displayed, and it is part of how lines are terminated on the Windows platform. Each line is terminated with a carriage return character followed by a linefeed character.

On Unix systems the end of line is terminated by just a linefeed character.

  • linefeed character = 0x0A in hex, also written as \n.
  • carriage return character = 0x0D in hex, also written as \r.

Examples

You can see these if you pipe the output to a tool such as od or hexdump. Here's a sample file with the line terminating carriage returns + linefeed characters.

$ cat sample.txt
hi there
bye there

You can see them with hexdump as \r + \n:

$ hexdump -c sample.txt 
0000000   h   i       t   h   e   r   e  \r  \n   b   y   e       t   h
0000010   e   r   e  \r  \n                                            
0000015

Or as their hexadecimal 0d + 0a:

$ hexdump -C sample.txt 
00000000  68 69 20 74 68 65 72 65  0d 0a 62 79 65 20 74 68  |hi there..bye th|
00000010  65 72 65 0d 0a                                    |ere..|
00000015

Running this through sed 's/\r//g':

$ sed 's/\r//g' sample.txt |hexdump -C
00000000  68 69 20 74 68 65 72 65  0a 62 79 65 20 74 68 65  |hi there.bye the|
00000010  72 65 0a                                          |re.|
00000013

You can see that sed has removed the 0d character.
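
The sample file only has carriage returns at the ends of lines. The log line in the question has one in the middle of a line, and the different approaches behave differently there. The following is a rough sketch on a made-up line shaped like the one in the question. The variable names are illustrative, and the carriage return is held in a shell variable so the sed calls do not depend on sed understanding \r.

```shell
# A literal carriage return, so sed does not need to understand \r.
cr=$(printf '\r')

# One line shaped like the question's log: text, a bare CR, more text.
line=$(printf 'Starting script ...\rStarting script ...Initializing')

# 1. Strip a CR only at the end of a line: nothing changes for this input.
eol_only=$(printf '%s\n' "$line" | sed "s/$cr\$//")

# 2. Delete every CR: the text before and after it is joined on one line.
all_removed=$(printf '%s\n' "$line" | tr -d '\r')

# 3. Keep only what follows the last CR on each line.
last_segment=$(printf '%s\n' "$line" | sed "s/.*$cr//")

printf '%s\n' "$all_removed" "$last_segment"
```

The third form is only an approximation of what a terminal would show: if the text before the carriage return is longer than the text after it, a terminal would still display the leftover tail.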

Viewing files with ^M without converting?

Yes, you can use vim to do this. You can either set the fileformat option in vim, which converts the file when you write it, like we were doing above, or you can change the fileformat used for the view only.

Changing a file's format

:set fileformat=dos
:set fileformat=unix

You can use the shorthand notation too:

:set ff=dos
:set ff=unix

Alternatively you can just change the fileformat of the view. This approach is nondestructive:

:e ++ff=dos
:e ++ff=unix

Here you can see me opening our ^M file, sample.txt in vim:

           [screenshot: vim dos #1]

Now I'm converting the fileformat in the view:

           [screenshot: vim dos #2]

Here's what it looks like when converted to the unix fileformat:

           [screenshot: vim dos #3]

slm
  • 369,824
  • Thanks... I need to do an extra step after my command (java -jar test.jar >> test.log ) is it possible to ignore (^M) character while redirecting output itself ... ?? – Ram Aug 30 '13 at 06:05
  • I used java -jar test.jar | sed 's/\r//g' >> test.log --- working great – Ram Aug 30 '13 at 06:30
  • @Ram - glad it solved your problem. – slm Aug 30 '13 at 06:41
  • @sim The user tries to see the log file using vi test.log and cat -v test.log .. They are treating it as error so i am trying to hide that char . – Ram Aug 30 '13 at 06:50
  • @sim User is using only vi editor . They are not ready to set any settings in editor. They want code itself to handle it . – Ram Aug 30 '13 at 06:58
  • @Ram - understood, just letting you know, I've added how to the answer if you or they are curious. – slm Aug 30 '13 at 07:08
  • This does NOT answer the question because the OP does not have a file with DOS line endings. He has terminal output that uses the ^M to go back and modify the previous output on the line. – psusi Aug 30 '13 at 13:32
  • @psusi - did you even read this entire Q&A thread? I've talked with the OP and he's marked this as the accepted answer b/c it DOES fix his issue. He's redirecting the terminal output to a file that others are then viewing in vi. Please don't post comments on things you haven't read! – slm Aug 30 '13 at 13:36
  • He said that sed worked, not dos2unix. dos2unix looks for CR + NL and replaces them with just NL. He doesn't have CR + NL. Using sed to remove the CR also leaves you with both the before and after text concatenated on one line, so it won't look right compared to what you see on a terminal. – psusi Aug 30 '13 at 13:41
  • @psusi - He never said that dos2unix didn't work. He wanted a solution that would convert the CR by removing them from the log file as they were being written. Other sections of my answer provide methods for doing this using sed, tr, etc. If you strip the CR from the file and then later run more on the resulting file it will look just fine, it still has the NL's. His output has CR + NL's. I believe you're misunderstanding what the OP's Q is. – slm Aug 30 '13 at 13:51
  • The OP does not have a CR+NL, he has a CR followed by more text. dos2unix won't remove such a CR because it isn't immediately followed by a NL. – psusi Aug 30 '13 at 17:49
2

Shove the file through dos2unix to fix the line endings.

Or, use one of these:

sed 's,\r$,,'
tr -d '\r'
ash
  • 7,260
  • Thanks ...
    I used java -jar test.jar | sed 's/\r//g' >> test.log --- working great
    – Ram Aug 30 '13 at 06:32
1

You need to fix your program to call isatty() and if stdout is not a tty, then do not output the ^M.
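
The same check can be illustrated from a shell script, where `[ -t 1 ]` tests whether stdout is a terminal. This is only a sketch of the idea, not the program from the question. The progress function is a hypothetical name.

```shell
# Hypothetical progress printer: rewrite the current line with \r only
# when stdout is a terminal; otherwise write plain newline-terminated lines.
progress() {
    if [ -t 1 ]; then
        printf '\r%s' "$1"
    else
        printf '%s\n' "$1"
    fi
}

# Inside a command substitution stdout is not a tty, so no carriage
# return is written, just as when redirecting to a log file.
captured=$(progress 'Starting script ...'; progress 'Initializing')
printf '%s\n' "$captured"
```

Run directly in a terminal, the function overwrites the line in place. When redirected to a file or captured, each message lands on its own line with no ^M.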

psusi
  • 17,303
  • This is your answer? Please re-read the question. The title even says "remove ... from log files". – slm Aug 30 '13 at 13:38
  • @slm, yes... they shouldn't be in the log file in the first place. Well written programs check that they are actually using a terminal before spitting out terminal control codes. – psusi Aug 30 '13 at 13:42
  • Let's not debate "well written" programs. Sometimes people don't have access to the software that they support and have to do things like this to get the job done. Perhaps instead of this answer, you could explain how one would do this from Java instead. Seems like a gap on this Q&A that you could direct your energies to rather than coming in, downvoting the accepted answer, and saying that it DOESN'T work when clearly it DOES. – slm Aug 30 '13 at 13:55
  • Also in larger environments I've worked in I've seen exactly this type of issue come up. It would be too costly to go back and "fix" a program that is doing this; rather, it's more cost-effective to do something like this. I'd bet money that this software was developed on a PC and is now deployed on a Unix box, and the original developers would be the ones that don't understand testing the type of stdout, and then adjusting accordingly. – slm Aug 30 '13 at 13:59
  • @psusi Thank you. In java code how to detect whether terminal or not ??? – Ram Sep 03 '13 at 10:11
0

Removing ^M without typing any special characters:

$ tr -d '\015' <file1 >file2 

$ mv file2 file1
42n4
  • 126