4

Git Bash is a nice bash shell you get in Windows as part of the installation of Git. It comes with other typical unix tools bundled inside, such as grep, sed, awk, perl. It doesn't have the file command.

In this shell, I want to detect files that have DOS-style line endings. I thought this command would work but it doesn't:

grep -l ^M$ *

It doesn't work: even files that don't have CR line endings match. For example, if I create two sample files, hello.unix and hello.dos, I can confirm with wc that hello.unix has 6 characters and hello.dos has 7 because of the extra CR, yet both files match with grep. That is:

$ cat hello.*
hello
hello

$ wc hello.*
      1       1       7 hello.dos
      1       1       6 hello.unix
      2       2      13 total

$ grep -l ^M hello.*
hello.dos
hello.unix
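
For anyone reproducing this, the two sample files can be created with printf. This is only a sketch: the scratch directory from mktemp -d is an assumption for illustration, not part of the original setup.

```shell
# Create the two sample files from the question in a scratch directory.
dir=$(mktemp -d)
printf 'hello\n'   > "$dir/hello.unix"   # LF only: 6 bytes
printf 'hello\r\n' > "$dir/hello.dos"    # CR + LF: 7 bytes

# Byte counts should differ by exactly one (the extra CR).
wc -c "$dir"/hello.*
```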

Is this a bug in the implementation of grep in Git Bash? Is there another way to find all files with DOS-style line endings?

janos
  • 11,341

5 Answers

4

EDIT: Silly me. Of course ^M is CR; and your command should work (works on my system). However, you need to type Ctrl-V Ctrl-M to get the literal '\r'/CR (and not two characters, ^ and M).
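
In bash, the literal CR can also be produced with ANSI-C quoting instead of typing Ctrl-V Ctrl-M. The sketch below assumes a bash and grep that pass the CR byte through unchanged; according to the comments in this thread, that did not hold in the asker's Git Bash. The file names are made up for illustration.

```shell
# $'\r' is bash ANSI-C quoting for a single carriage-return byte.
cr=$'\r'

# Two illustrative files in a scratch directory (hypothetical names).
dir=$(mktemp -d)
printf 'hello\n'   > "$dir/unix.txt"
printf 'hello\r\n' > "$dir/dos.txt"

# List the files that have a CR immediately before the end of a line.
matches=$(grep -l "${cr}\$" "$dir"/*)
printf '%s\n' "$matches"
```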

Alternatives:

Do this:

find dir -type f -print0 | xargs -0 grep -l `printf '\r\n'`

Or this:

find dir -type f -print0 | xargs -0 grep -lP '\r$'

You can also use the file utility (not sure if it comes with Git Bash):

find dir -type f -print0 | xargs -0 file | grep CRLF
January
  • 1,937
  • Yeah it should normally work, but it does not work in Git Bash. Unfortunately none of your tips work: the printf doesn't work, grep doesn't have -P flag, and there is no file command in Git Bash... Any other ideas? – janos Oct 11 '12 at 10:01
  • How about `echo -e '\r\n'`? Should do the same as printf. – January Oct 11 '12 at 10:13
  • Same as printf, none of the files match that way. – janos Oct 11 '12 at 12:07
  • Um, so most likely a bug. Do you have a C compiler installed? I can write you a little C program that should work. – January Oct 11 '12 at 12:41
  • No gcc or cc comes with Git Bash. I don't want to install cygwin. So, thanks, but never mind. My workaround is to convert all files which I suspect of having CR line endings. Files that don't have them will not be changed, so it should be OK; it's just an ugly solution. – janos Oct 11 '12 at 13:00
  • The C program does not have to be compiled by Cygwin, it can be compiled for Windows using any other C compiler (like VS Express). – Didi Kohen Oct 11 '12 at 19:41
  • Thanks @DavidKohen but I'd rather install cygwin than VS* ;-) – janos Oct 12 '12 at 08:01
  • I just thought of another thing. Do you have sed? Could you try sed -n '/\r$/p'? Does it show a different result for the CRLF terminated files? – January Oct 12 '12 at 08:11
2

I don't know about git bash, but maybe

if [ "$(tr -cd '\r' < file | wc -c)" -gt 0 ]; then
  echo there are CR characters in there
fi

would work. The idea being not to use text utilities that may treat the CR and LF characters specially.
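
The same test can be wrapped in a function and applied to several files. This is a minimal sketch; the name has_cr is made up for illustration, and the function reports a CR anywhere in the file, not only at line endings.

```shell
# has_cr FILE: succeed if FILE contains at least one CR byte anywhere.
has_cr() {
  # tr -cd '\r' deletes every byte except CR; wc -c counts what is left.
  [ "$(tr -cd '\r' < "$1" | wc -c)" -gt 0 ]
}

# Print the name of every file argument that contains a CR.
for f in "$@"; do
  if has_cr "$f"; then
    printf '%s\n' "$f"
  fi
done
```

Saved as a script, it would be called with the files to check as arguments and print only the names of those containing a CR.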

If that doesn't work, then maybe

if od -An -tx1 < file | grep -q 0d; then
  echo there are CR characters in there
fi

To hook into find:

find . -type f -exec sh -c 'od -An -tx1 < "$1" | grep -q 0d' sh {} \; -print
  • No od in Git Bash. But the tr trick is pretty clever and it works! There is just one relatively minor issue with it: it detects CR anywhere in the file, but I need to detect CR at line endings. You have led me to the solution, using sed instead of tr. See my answer for more details, and thanks again! – janos Oct 12 '12 at 08:44
  • When is CR used outside of line endings? – Didi Kohen Oct 12 '12 at 11:46
2

@sch has led me to this solution:

sed -bne '/\r$/ {p;q}' < /path/to/file | grep -q .

This exits with status 0 (success) if the file has any line ending with CR, and non-zero otherwise. To hook this into find:

find /path/to/ -type f -exec sh -c 'sed -bne "/\r$/ {p;q}" < "$1" | grep -q .' sh {} \; -print

And I think I know why grep -l ^M hello.* doesn't work in this shell: it seems that in Git Bash, ^M characters are removed from all command-line arguments, so grep never actually receives the character and therefore all files match. This happens not only on the command line but in shell scripts too.

So the key is to express the ^M character with other symbols, such as \r, instead of literally.
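
The sed test above can be wrapped in a small function. This is a sketch that assumes GNU sed (for the -b option and the \r escape); the name has_crlf is made up for illustration.

```shell
# has_crlf FILE: succeed if at least one line of FILE ends in CR.
has_crlf() {
  # Print the first line ending in CR, then quit;
  # grep -q . succeeds only if sed printed something.
  sed -bne '/\r$/ {p;q}' < "$1" | grep -q .
}

# Report each file argument that has DOS-style line endings.
for f in "$@"; do
  if has_crlf "$f"; then
    printf '%s\n' "$f"
  fi
done
```

Unlike a check for CR anywhere in the file, this ignores a CR that sits in the middle of a line.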

janos
  • 11,341
0

Use the file command on Linux/Ubuntu. If the file is in DOS format, the output will include the words "with CRLF line terminators". If the file is in UNIX format, those words will not appear in the output. In the example below, del.txt is in DOS format and del is in UNIX format.

$ file del.txt
del.txt: C source, ASCII text, with CRLF line terminators
$ echo "hello" > del
$ file del
del: ASCII text
Vivek
  • 101
0

You can solve it using Python:

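import fileinput

# Read in binary mode so Python's newline translation cannot hide the CR.
for line in fileinput.input(mode="rb"):
    if b"\r" in line:
        # Report the file once, then skip the rest of it.
        print(fileinput.filename())
        fileinput.nextfile()

This small Python 3 script behaves the way you would expect grep -l to: it takes a list of filenames and prints the names of those that contain a CR. The files are read in binary mode so that Python does not translate the line endings before the check.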

Didi Kohen
  • 1,841