1

I sent a Windows file to a Unix system using FTP, and it now has ^M appended wherever a line ends. I just want to remove them.

One option is to run the dos2unix command.

Can anyone suggest another method, such as a sed command, to remove these characters?

clk
  • 4
    If you can run a sed script, you can run dos2unix. Unless you want other edits as well (removing some lines, say), dos2unix is "perfect" for the job. – Archemar May 13 '15 at 10:48
  • 1
    Kids these days. ;) FTP has an "ascii" mode (as opposed to "binary") which is supposed to take care of converting newlines in text files. With the good old ftp(1) program this is activated with the ascii command (run it before transferring the file, then run binary to get back). With other clients there might be other ways to achieve that, e.g. with lftp(1) you have to run get -a file. – lcd047 May 13 '15 at 14:56
  • Just to be clear: The ^M were there all along. It's part of how Windows ends lines, so on Windows you just see a newline. You can ask ftp to remove them during the transfer by activating ascii mode. – alexis May 13 '15 at 22:23

4 Answers

2

Windows line endings consist of the two-character sequence CR, LF. CR is the carriage return character, sometimes represented as \r, \015, ^M, etc. A Unix line ending is just the LF character.

A way to convert Windows line endings to Unix line endings using only standard utilities present on all Unix variants is to use the tr utility.

tr -d '\r' <thefile >thefile.new && mv thefile.new thefile

If the file already has Unix line endings, its content won't be changed.

If you have many files to transform in the current directory, you can use a loop. Assuming that you don't have any files whose name ends with .new:

for x in *; do
  tr -d '\r' <"$x" >"$x.new" && mv "$x.new" "$x"
done
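
If you'd rather not depend on the assumption about .new names, a temporary file can take its place. This is only a rough sketch: strip_cr_in_dir is a made-up helper name, and mktemp is assumed to be installed (it is very common but not part of POSIX).

```shell
# Hypothetical helper: strip CR characters from every regular file in a directory.
# Assumes mktemp is available.
strip_cr_in_dir() {
  dir=${1:-.}
  for x in "$dir"/*; do
    [ -f "$x" ] || continue          # skip directories and an unmatched glob
    tmp=$(mktemp) || return 1
    # Convert into the temporary file first; overwrite the original only if
    # tr succeeded. Copying back with cat keeps the original's permissions.
    tr -d '\r' <"$x" >"$tmp" && cat "$tmp" >"$x"
    status=$?
    rm -f "$tmp"
    [ "$status" -eq 0 ] || return "$status"
  done
}
```

Call it as strip_cr_in_dir /some/directory, or with no argument for the current directory. Unlike mv, the copy back is not atomic, so don't interrupt it halfway.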

Under Linux (excluding some embedded Linux systems) or Cygwin, you can use sed. The -i option to edit a file in place is specific to these systems. The notation \r for a CR character is more widespread but not universal.

sed -i -e 's/\r//g' thefile
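
On systems where sed has neither -i nor \r, something along these lines should still work. Treat it as a sketch: sed_strip_cr is a hypothetical name, and it assumes that a file called thefile.new does not already exist.

```shell
# Hypothetical helper: same effect as sed -i -e 's/\r//g', without -i or \r.
sed_strip_cr() {
  # printf produces a literal CR; command substitution only strips trailing
  # newlines, so the CR survives in the variable.
  cr=$(printf '\r')
  sed "s/${cr}//g" "$1" >"$1.new" && mv "$1.new" "$1"
}
```

Usage would be sed_strip_cr thefile.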
  • 1
    I would suggest the sed body 's/\r$//' -- the anchor will only replace carriage returns when they occur immediately before a newline. This presumes that if a carriage return occurs anywhere else, it's supposed to be there. – glenn jackman May 14 '15 at 10:46
  • Also, using tr you can't fix just line endings, it will change \r everywhere, if it happens to occur anywhere else apart from line endings. – spuk May 14 '15 at 13:07
  • @spuk, glenn: What's the point? A Windows text file doesn't contain CR in the middle of lines anyway. A more robust solution would replace CR by LF except for condensing CRLF to a single LF, to account for Mac line endings. I can't see any reason to preserve CR anywhere. – Gilles 'SO- stop being evil' May 14 '15 at 13:32
  • For me it is a matter of correctness: I don't know, and don't want to worry, if there are valid uses for \r out of line endings, so I prefer to fix only what needs fixing. One case I can think of right now would be output of rsync -P ..., which uses \r to rewrite the progress line (wget and other tools probably do that too). – spuk May 14 '15 at 14:02
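
For what it's worth, the "more robust" conversion described in the comments (CRLF condensed to a single LF, any remaining lone CR turned into LF) can be sketched as a filter. I've used awk here rather than sed or tr, which is my own choice and not something proposed in the thread; to_unix_eol is a made-up name.

```shell
# Hypothetical filter: normalise CRLF and lone CR line endings to LF.
to_unix_eol() {
  awk '{
    sub(/\r$/, "")      # CRLF: drop the CR that sits just before the newline
    gsub(/\r/, "\n")    # any CR left over is a classic Mac line ending
    print               # print ends every record with a single LF
  }'
}

# Mixed input: one CRLF line, one lone CR, one CRLF line.
printf 'a\r\nb\rc\r\n' | to_unix_eol | od -c
```

Note that this deliberately does not preserve a CR in the middle of a line, so it is the wrong tool for something like captured rsync -P output.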
1

dos2unix:

sed -i -r -e 's/\r$//' file

unix2dos:

sed -i -r -e 's/$/\r/' file
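
One thing to watch with the unix2dos direction: run it twice and every line ends in two CRs. A variant that drops an existing trailing CR before adding one avoids that. This is a sketch written as a filter, with to_dos_eol as a hypothetical name; the CR comes from printf so it does not depend on sed understanding \r.

```shell
# Hypothetical filter: LF -> CRLF, safe to run on input that is already CRLF.
to_dos_eol() {
  cr=$(printf '\r')
  # First expression removes a trailing CR if present,
  # second expression appends exactly one CR to every line.
  sed -e "s/${cr}\$//" -e "s/\$/${cr}/"
}

# Applying it twice still yields a single CR per line.
printf 'a\nb\n' | to_dos_eol | to_dos_eol | od -c
```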
cuonglm
spuk
1

sed -i -r -e 's/\r$//' file is preferable to sed -i -e 's/\r//g' file as a "dos2unix" replacement.

With the latter, if I run it on a classic Mac-style file (which uses '\r' alone as its line ending), the new file will not only fail to be Unix-style, it will have no line breaks at all. Everything will be on one line.

Edit: as has been mentioned in the comments, it's also preferable to sed 's/^M//g' file typed with a literal caret and M, because the ^ symbol is sed syntax for the beginning of a line, so that command removes an M at the start of each line. I created a text file with nothing but a leading M on several lines, and with that sed command I got nothing but newlines as output.
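
A quick way to see this for yourself, with the caret and M typed as two ordinary characters (the sample words are just made up for the demonstration):

```shell
# ^ anchors to the start of the line, so only a leading letter M is removed.
printf 'Monday\nMay\nnoM\n' | sed 's/^M//g'
# prints:
# onday
# ay
# noM

# A real carriage return is left alone; od -c still shows \r before \n.
printf 'Mx\r\n' | sed 's/^M//g' | od -c
```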

0

Using sed:

sed 's/^M//g' filename > newfilename
jcbermu
  • Does that require ^M to be entered as the control sequence Ctrl-v Ctrl-m rather than as literal ^M? – steeldriver May 13 '15 at 11:48
  • It's the recommended way. However, I copied and pasted the command into my bash and it works. – jcbermu May 13 '15 at 12:05
  • 2
    Literal ^M shouldn't work, since it's supposed to match M at the beginning of line. \r is probably better: sed 's/\r//g' file >newfile. Or just open the file in Vim, run :set ff=unix, and write it. – lcd047 May 13 '15 at 15:13
  • @steeldriver Yes, it does. The command in this answer removes the letter M when it appears at the beginning of a line. – Gilles 'SO- stop being evil' May 13 '15 at 22:09