7

I have files like this

$ cat trapetz
x = 0:0.0001:7pi
plot(x, sin(x).*cos(x))
Area = trapz(x, sin(x).*cos(x))
$ cat simpson 
f = inline(sin(x).*cos(x));
Area2 = quad(f, 0, 7pi, 1e-16)

I want something like this

$ cat -b -t MISSING? trapetz simpson 
     trapetz
     1  x = 0:0.0001:7pi
     2  plot(x, sin(x).*cos(x))
     3  Area = trapz(x, sin(x).*cos(x))
     simpson
     1  f = inline(sin(x).*cos(x));
     2  Area2 = quad(f, 0, 7pi, 1e-16)

or, even better, with some easy way to add wc output there:

$ find |tee |...|wc... I feel I am reinventing the wheel; there must be something ready-made...
     trapetz: xyz chars
     1  x = 0:0.0001:7pi
     2  plot(x, sin(x).*cos(x))
     3  Area = trapz(x, sin(x).*cos(x))
     simpson: zyx chars
     1  f = inline(sin(x).*cos(x));
     2  Area2 = quad(f, 0, 7pi, 1e-16)

but I get

$ cat -b -t trapetz simpson 
     1  x = 0:0.0001:7pi
     2  plot(x, sin(x).*cos(x))
     3  Area = trapz(x, sin(x).*cos(x))
     4  f = inline(sin(x).*cos(x));
     5  Area2 = quad(f, 0, 7pi, 1e-16)

I don't necessarily need cat, just some easy tool to share and show code snippets like the above, not pastebin. I want some standard command-line thing. I am trying to make it easy to paste puzzles for codegolf.se so people can reproduce things easily...

5 Answers

5

Quick shell script:

#!/bin/sh
# usage: scriptname file1 file2 ...

for file in "$@"
do
    [ -f "$file" ] || continue
    set -- $(wc < "$file")
    echo "${file}: lines $1 words $2 bytes $3"
    cat -b -t "$file"
done

It produces output like your sample. Arguments that are not regular files, including missing files, are silently skipped.

3

A very rough awk implementation:

BEGIN{
    OLDFILENAME="";
}
FNR==1{
    if (OLDFILENAME != "") {
        printf("#### Processed (chars: %s - lines: %s)\n", FWC, FLC);
    }
    printf("#### Processing: %s\n", FILENAME);
    OLDFILENAME=FILENAME;
    FWC=0;
    FLC=0;
}
{
    printf("%04d - %s\n", FNR, $0);
    FWC = FWC + length($0);
    FLC = FLC + 1;
}
END{
    if (OLDFILENAME != "") {
        printf("#### Processed (chars: %s - lines: %s)\n", FWC, FLC);
    }
}

Execute awk -f AWKFILE trapetz simpson to get:

#### Processing: trapetz
0001 - x = 0:0.0001:7pi
0002 - plot(x, sin(x).*cos(x))
0003 - Area = trapz(x, sin(x).*cos(x))
#### Processed (chars: 70 - lines: 3)
#### Processing: simpson
0001 - f = inline(sin(x).*cos(x));
0002 - Area2 = quad(f, 0, 7pi, 1e-16)
#### Processed (chars: 57 - lines: 2)
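
If the count should sit in the header line, as in the question's second example, the lines have to be held back until the whole file has been read. A minimal sketch, assuming the files are small enough to buffer in memory; `show_with_counts` and `emit` are names made up for illustration, and the `+ 1` assumes every line ends in a newline:

```shell
# Sketch: buffer each file so the character count can go in the header.
# show_with_counts and emit are hypothetical names, not standard tools.
show_with_counts() {
    awk '
        # Print the buffered file: header first, then numbered lines.
        function emit(   i) {
            if (n == 0) return
            printf "%s: %d chars\n", name, chars
            for (i = 1; i <= n; i++) printf "%6d  %s\n", i, line[i]
            n = 0; chars = 0
        }
        FNR == 1 { emit(); name = FILENAME }          # a new file starts
        { line[++n] = $0; chars += length($0) + 1 }   # +1 for the newline
        END { emit() }
    ' "$@"
}
```

Call it as `show_with_counts trapetz simpson`. Whether `length` counts characters or bytes depends on the awk implementation and the locale.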
andcoz
  • 17,130
2

tail -n +1 trapetz simpson will print each file with a leading header giving the file name. nl trapetz simpson prints line numbers but no file names. You'll need to use either awk or some shell glue to combine the two.

for x in trapetz simpson; do
  echo "$x: $(wc -c <"$x") bytes"
  nl "$x"
done

Here's an awk solution which prints the character count (equal to the byte count for ASCII text) at the bottom:

awk '
    FNR == 1 && NR != 1 {print "end", fn, chars, "characters"; chars=0}
    END {print "end", fn, chars, "characters"}
    FNR == 1 {print "begin", FILENAME; fn=FILENAME}
    1 {chars += 1 + length; printf "%7d ", FNR; print}
' trapetz simpson
  • Line FNR == 1 && NR != 1... prints the wrong filename... and it counts characters, not bytes (re non-ASCII text) – Peter.O Oct 29 '11 at 16:48
  • BTW, tail -n +1 * will print each file with a leading header (formatted as “==> %s <==”) if there is more than one file.  (If there is only one file, it basically acts like cat.)  You can request the headers with -v. Also, it prints blank lines between the files, which increases readability but may be undesirable if the user specifically wants the output formatted as shown. Also, isn’t “leading header” redundant? – G-Man Says 'Reinstate Monica' May 17 '23 at 22:51
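
On the characters-versus-bytes point raised in the comment above: the two counts only differ for non-ASCII text. A small sketch to see the difference; the sample string is made up for illustration, and the `wc -m` result depends on the current locale:

```shell
# Sketch: byte count vs. character count for the same text.
# \303\251 is the UTF-8 encoding of "é" (two bytes, one character).
sample=$(printf 'caf\303\251')

printf '%s' "$sample" | wc -c    # bytes: 5, whatever the locale
printf '%s' "$sample" | wc -m    # characters: 4 in a UTF-8 locale
```

Some `wc` implementations pad the number with leading spaces, so strip them (for example with `tr -d ' '`) before comparing or embedding the value in a header.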
1

Here is sed + wc + nl + cat (and a bash loop)

set trapetz simpson
for file in "$@" ;do
  { wc -l -m "$file"; cat "$file"; } | 
    sed -nr '1{N;s/(.*)\n(.*)/\2\x01\1/};p' | nl |
    sed -r '1{s/(.*)\x01 *([0-9]+) +([0-9]+) (.*)$/\4  (\2 lines, \3 chars)\n\1/};$s/.*/&\n/' 
done; echo "(${#@} files)"

Output

trapetz  (3 lines, 73 chars)
     1  x = 0:0.0001:7pi
     2  plot(x, sin(x).*cos(x))
     3  Area = trapz(x, sin(x).*cos(x))

simpson  (2 lines, 59 chars)
     1  f = inline(sin(x).*cos(x));
     2  Area2 = quad(f, 0, 7pi, 1e-16)
  
(2 files)

Here is sed + grep (no wc though).

Using sed for situations such as this is good for regex and sed juggling practice, but when the juggling gets to be too much, the ability to use a single tool (e.g. awk) is usually the better option.

grep -nH '.' trapetz simpson | sed -nre 'G; s/^([^:]+):.*\n\1/&/; tp; h; s/^([^:]+).*/\1/p; g; :p; s/^[^:]+:([^:]+):(.*)\n.*/0000\1  \2/; s/^[^ ]*([^ ]{4})(.*)/\1\2/p; g; s/^([^:]+).*/\1/; h'

Or a more readable representation :)

grep -nH '.' trapetz simpson |
  sed -nre '
  G                      # pattern+=nl+hold
  s/^([^:]+):.*\n\1/&/   # test: current filename == held filename?
      t printline        # when prev==curr branch to printline   
  : new_file             # when prev!=curr print header
      h                  # hold the pattern  
      s/^([^:]+).*/\1/p  # print header (filename)
      g                  # get the held pattern 
  : printline            # print current line (with line number) 
      s/^[^:]+:([^:]+):(.*)\n.*/0000\1  \2/   # zero pad number  
      s/^[^ ]*([^ ]{4})(.*)/\1\2/p            # number width = 4
      g                  # get the held pattern 
      s/^([^:]+).*/\1/   # extract filename
      h                  # hold the filename
'   

Output

trapetz
0001  x = 0:0.0001:7pi
0002  plot(x, sin(x).*cos(x))
0003  Area = trapz(x, sin(x).*cos(x))
simpson
0001  f = inline(sin(x).*cos(x));
0002  Area2 = quad(f, 0, 7pi, 1e-16)
Peter.O
  • 32,916
0

This command list:

echo -e "trapetz\nsimpson" | xargs -I fn sh -c "wc -c fn | sed 's/\(.*\) \(.*\)/\2: \1 chars/';cat -b -t fn"

produces this output:

trapetz: 73 chars
     1  x = 0:0.0001:7pi
     2  plot(x, sin(x).*cos(x))
     3  Area = trapz(x, sin(x).*cos(x))
simpson: 59 chars
     1  f = inline(sin(x).*cos(x));
     2  Area2 = quad(f, 0, 7pi, 1e-16)
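
A possible variant, as a sketch only: hand each name to `sh` as `$1` instead of splicing it into the command string, so that names containing spaces, `$` or `;` are not re-parsed by the shell. `show_via_xargs` is a made-up name, and the header says "bytes" because `wc -c` counts bytes. `xargs` itself still interprets quotes and backslashes in its input, so this is not fully general.

```shell
# Sketch: pass each name to sh as "$1" rather than substituting it into the script.
# show_via_xargs is a hypothetical helper; names are fed to xargs one per line.
show_via_xargs() {
    printf '%s\n' "$@" |
        xargs -I{} sh -c '
            # $1 is the file name supplied by xargs
            printf "%s: %s bytes\n" "$1" "$(wc -c < "$1" | tr -d " ")"
            cat -b -t "$1"
        ' sh {}
}
```

Usage would be `show_via_xargs trapetz simpson`.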

Software versions are:

  • bash 4.2.53
  • GNU sed 4.2.2
  • xargs 4.5.11
  • cat 8.21
Tai Paul
  • 1,341