remove most but not all lines containing carriage return (\r)

Question

I have a process that outputs too many status lines ending in a carriage return (\r). I can filter out all of those status lines by piping the output through

sed '/\r/d'

I would instead like to filter out all of these lines except, e.g., every third one. Is this possible with standard Unix tools (awk?), or do I need a script for that? Lines without CR should be left untouched.

Given Output:

$ (printf '%s\n' {1..10};   printf  '%s\r\n' {1..10}; printf '%s\n' {1..10};)  | cat -v
1
2
3
4
5
6
7
8
9
10
1^M
2^M
3^M
4^M
5^M
6^M
7^M
8^M
9^M
10^M
1
2
3
4
5
6
7
8
9
10

Wanted output (or any other pattern):

Do these lines have CR+LF and you only want to remove the CR or are the lines only CR-separated (classic MacOS style) and you want to remove most of those CRs? — Philippos, Jun 03 '22 at 13:12
Does this answer your question? Printing every Nth line out of a large file into a new file — muru, Jun 03 '22 at 14:32
I clarified my question. Lines without CR should be left untouched. — David Weber, Jun 03 '22 at 14:37

Ed Morton · Accepted Answer · 2022-06-03T14:42:12.943

1

$ awk '!(/\r$/ && ((++c)%3 != 1))' file | cat -v
1
2
3
4
5
6
7
8
9
10
1^M
4^M
7^M
10^M
1
2
3
4
5
6
7
8
9
10
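
If the interval needs to be adjustable, the same test can take it from a variable. A minimal sketch, assuming an awk variable n and a wrapper function keep_every_nth; both names are illustrative and not part of the answer above:

```shell
# keep_every_nth N: pass lines without a trailing CR through unchanged and
# keep only the 1st, (N+1)th, (2N+1)th, ... of the lines that end in a CR.
# The function name and the awk variable n are illustrative assumptions.
keep_every_nth() {
    # c counts CR lines only; c++ % n is 0 (false) for every nth CR line,
    # so the negated condition is true for kept lines and for non-CR lines.
    awk -v n="$1" '!(/\r$/ && (c++ % n))'
}

# Small demo: 3 plain lines, 7 CR lines, 2 plain lines.
{ printf '%s\n' 1 2 3; printf '%s\r\n' 1 2 3 4 5 6 7; printf '%s\n' 8 9; } |
    keep_every_nth 3 | cat -v
```

With n set to 3 this should behave like the one-liner above; with n set to 1 every line is kept.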

Original answer:

Sounds like all you need is this, using any awk:

awk -v RS='\r' '{ORS=(NR%10000 ? "" : RS)} 1'

e.g. using this as input:

$ printf '%s\r\n' {1..10} | cat -v
1^M
2^M
3^M
4^M
5^M
6^M
7^M
8^M
9^M
10^M

Removing all but every 3rd \r:

$ printf '%s\r\n' {1..10} | awk -v RS='\r' '{ORS=(NR%3 ? "" : RS)} 1' | cat -v
1
2
3^M
4
5
6^M
7
8
9^M
10

edited Jun 03 '22 at 14:42

answered Jun 03 '22 at 12:10

Thank you for your answer. Unfortunately, my question wasn't very clear. I want to get rid of the lines completely, not just the \r. I've edited my question to be clearer. – David Weber Jun 03 '22 at 14:19

Thank you very much! – David Weber Jun 03 '22 at 14:56

score 0 · Answer 2 · answered Jun 05 '22 at 04:44

With GNU sed, the hold space can serve as the counter:

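sed -E '
  /\r$/{
    # append the hold space (the counter); print the line only if it is empty
    G;/\n$/P
    # replace the line with one dot plus the old counter, i.e. counter + 1
    s/.*\n/./
    # at three dots reset the counter; save it in the hold space, drop the line
    /.{3}/z;x;d
  }
' file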

Using awk, we use the variable c as a circular counter that is reset whenever it reaches 3.

awk '
!/\r$/ || !c++
c==3{c=0}
' file

Both commands assume that carriage returns (\r), whenever they occur, sit at the end of a line-feed (\n) delimited record.
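
If a carriage return can also appear in the middle of a line (the sed '/\r/d' filter in the question matches it anywhere), dropping the $ anchor is one way to cover that case. A minimal sketch under that assumption, using a modulo counter instead of the explicit reset; the function name thin_cr_lines is hypothetical:

```shell
# thin_cr_lines: print every line that contains no CR at all, and only
# every 3rd line (1st, 4th, 7th, ...) among those that contain a CR anywhere.
# The function name is an illustrative assumption.
thin_cr_lines() {
    # Short-circuit: c is incremented only for lines that contain a CR.
    awk '!/\r/ || !(c++ % 3)'
}

# Demo input: CRs sit in the middle of four of the six lines.
printf 'a\n1\rx\n2\rx\n3\rx\n4\rx\nb\n' | thin_cr_lines | cat -v
```

Because of the short-circuit in ||, lines without a CR never advance the counter, so the spacing between kept status lines does not depend on how many ordinary lines are interleaved.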

RARE Kpop Manifesto · Answer 3 · 2022-06-11T04:36:32.337

-1

Here's a fringe way to do it in awk:

{m,g}awk '((+$_ % 3) % NF)~(!_<NF)' FS='\r$'  # yes that's a 
                                              # tilde ~ not a minus -
1
2
3
4
5
6
7
8
9
10
1^M
4^M
7^M
10^M
1
2
3
4
5
6
7
8
9
10

Other ways of saying the same thing

mawk 'NF-!_== (+$+_   %    3    ) % NF' FS='\r$'
gawk 'NF-!_== ( $(_++)%(_+_+_--)) % NF' FS='\r$'

edited Jun 11 '22 at 04:36

answered Jun 11 '22 at 04:17
