2

I would like to ask how to solve this problem: I need to prepend 0s to every line where the word has fewer than 4 characters.

Example input file:

30
1
508
A0EA
A0EB
A0EC
A0ED

Desired output should be:

0030
0001
0508
A0EA
A0EB
A0EC
A0ED

Many thanks in advance for your support.

Yilmaz
Mac
  • you can use awk+printf, see https://www.gnu.org/software/gawk/manual/html_node/Format-Modifiers.html#Format-Modifiers and https://www.gnu.org/software/gawk/manual/html_node/Printf-Examples.html – Sundeep Aug 05 '19 at 14:59
  • @Sundeep, not the easiest approach when the input is in hexadecimal like here. – Stéphane Chazelas Aug 05 '19 at 15:06
  • Are those values stored in a file or in a shell array or already being read in a shell loop for some other reason or something else? Are they hex numbers or something else? Do you ever have input strings already longer than 4 chars and, if so, should they be truncated some way or left as-is? – Ed Morton Aug 05 '19 at 15:34
  • @StéphaneChazelas oh, I didn't realize awk didn't have a way to zero fill a string – Sundeep Aug 05 '19 at 15:48

12 Answers

5

You could add 4 leading zeros to all lines, and then get the 4 last characters of each:

sed 's/^/0000/; s/^.*\(.\{4\}\)/\1/' < file

Or to avoid truncating numbers that were more than 4 digits wide in the first place:

sed 's/^/0000/; s/^.\{1,4\}\(.\{4\}\)/\1/' < file
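
To see the difference between the two commands, here is a small sketch that wraps each one in a function and feeds it a five-character value. The function names pad_truncate and pad_keep are made up for illustration, and the value 12345 is an assumed test input that is not in the question.

```shell
# Hypothetical wrappers around the two sed commands above; both read stdin.
pad_truncate() { sed 's/^/0000/; s/^.*\(.\{4\}\)/\1/'; }
pad_keep()     { sed 's/^/0000/; s/^.\{1,4\}\(.\{4\}\)/\1/'; }

# First variant: always keeps only the last 4 characters of each line.
printf '%s\n' 30 A0EA 12345 | pad_truncate

# Second variant: strips at most the 4 zeros that were just added.
printf '%s\n' 30 A0EA 12345 | pad_keep
```

The first pipeline prints 0030, A0EA and 2345; the second prints 0030, A0EA and 12345.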
3

One more for the mix:

$ numfmt --format='%04.0f' --invalid=ignore < file
0030
0001
0508
A0EA
A0EB
A0EC
A0ED

numfmt is provided by the GNU Coreutils package.

steeldriver
2

With GNU awk for strtonum() if your input is hex numbers:

$ awk '{printf "%04X\n", strtonum("0x"$0)}' file
0030
0001
0508
A0EA
A0EB
A0EC
A0ED

With GNU awk whether your input is hex or not:

$ awk '{print gensub(/ /,0,"g",sprintf("%4s",$0))}' file
0030
0001
0508
A0EA
A0EB
A0EC
A0ED

With any awk whether your input is hex or not:

$ awk '{v=sprintf("%4s",$0); gsub(/ /,0,v); print v}' file
0030
0001
0508
A0EA
A0EB
A0EC
A0ED

or even:

$ awk '{$0=sprintf("%4s",$0); gsub(/ /,0)} 1' file
0030
0001
0508
A0EA
A0EB
A0EC
A0ED
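
As a sketch of how the sprintf/gsub approach generalizes to another width, the following wraps it in a function. The name pad_awk and the awk variable w are assumptions for illustration. Because gsub replaces every space, a line that itself contains spaces would have those turned into zeros too, so this is only suitable for single-word lines like the ones in the question.

```shell
# pad_awk WIDTH: left-pad each line read from stdin with zeros to WIDTH characters.
# Hypothetical helper; the format string "%<w>s" is built by concatenation.
pad_awk() {
    awk -v w="$1" '{ v = sprintf("%" w "s", $0); gsub(/ /, "0", v); print v }'
}

printf '%s\n' 30 1 A0EA 12345 | pad_awk 4
printf '%s\n' 30 1 A0EA 12345 | pad_awk 6
```

Lines that are already at least as long as the requested width are printed unchanged.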
Ed Morton
1

Less elegant way:

cat file | sed 's/^\(...\)$/0\1/' | sed 's/^\(..\)$/00\1/' | sed 's/^\(.\)$/000\1/'
  • You don't need to pipe sed's output into another sed, at least not when you're just doing a series of simple transformations. You can use multiple -e arguments, or you can just use a semi-colon to separate sed commands. e.g. sed -e 's/foo/bar/' -e 's/abc/xyz/' or sed -e 's/foo/bar/; s/abc/xyz/'. You don't need the cat, either - sed can read file(s) by itself. – cas Aug 06 '19 at 02:19
0

If the values are inside the shell already and are all hex numbers:

$ set -- 30 1 508 A0EA A0EB A0EC A0ED
$ for var; do printf '%04X\n' "0x$var"; done
0030
0001
0508
A0EA
A0EB
A0EC
A0ED

If a value could be any string (even longer than 4 characters) and the variable line holds one such value, the solution becomes more complex:

[ "${#line}" -lt 4 ] && 
    printf '%0*d%s\n' "$((4-${#line}))" 0 "$line" || 
        printf '%s\n' "${line}"

will print the value with as many zeros as needed to make the string 4 characters long.

Then, for an external file (sed and awk solutions are faster for external files), make a loop and expand the code to make it more legible as:

while read -r line; do
    if [ "${#line}" -lt 4 ]; then 
        printf '%0*d%s\n' "$((4-${#line}))" 0 "$line"
    else
        printf '%s\n' "${line}"
    fi
done <file
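
For values that are already in shell variables, the same padding can also be sketched as a small function that needs only POSIX shell features. The name pad4 is hypothetical, and prepending one zero per loop iteration is a swapped-in alternative to the printf '%0*d' trick above, not the answer's own method.

```shell
# pad4 VALUE: print VALUE left-padded with zeros to at least 4 characters.
# Hypothetical helper; values of 4 or more characters are printed as-is.
pad4() {
    s=$1
    while [ "${#s}" -lt 4 ]; do
        s=0$s            # prepend one zero per missing character
    done
    printf '%s\n' "$s"
}

for v in 30 1 508 A0EA TOOLONG; do
    pad4 "$v"
done
```

This prints 0030, 0001, 0508, A0EA and TOOLONG, one per line, without starting any external process.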
  • Nice work with mixing couple of functions. – Mac Aug 05 '19 at 15:15
  • That will be extremely slow, see why-is-using-a-shell-loop-to-process-text-considered-bad-practice. Probably at least an order of magnitude slower than the sed or awk solutions. – Ed Morton Aug 05 '19 at 15:23
  • Yes, Ed, that is true. Well, but only for an external file. If the values are in the shell already in a variable like line, this is faster. –  Aug 05 '19 at 15:28
  • Good point, the OP didn't say either way so for all we know he has a shell array of values and is looping through them already for other reasons in which case you would not want to call sed or awk for each value separately. We should have asked some questions before jumping to posting answers - I've added a comment below the question now. – Ed Morton Aug 05 '19 at 15:38
0

With the zsh shell, you can use the l:length::string: left-padding parameter expansion flag.

$ var=FF
$ echo ${(l:4::0:)var}
00FF

To apply it on each word in a file:

printf '%s\n' ${(l:4::0:)$(<file)}

Note that this operator also truncates words longer than 4 characters.

0
$ perl -ne 'printf "%05s", $_' ip.txt
0030
0001
0508
A0EA
A0EB
A0EC
A0ED

Using 5 instead of 4 here, as there's also a newline character in each line. Lines with more than 5 characters will be printed as is.

Sundeep
0

Using Posix sed:

  sed -e :a -e 's/^.\{0,3\}$/0&/;ta'
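
The command is a loop: :a defines a label, the substitution prepends one zero to any line of at most three characters, and ta jumps back to the label as long as the substitution succeeded. A minimal sketch reading from standard input follows; the wrapper name pad_sed and the sample value 12345 are assumptions for illustration.

```shell
# pad_sed: hypothetical wrapper around the sed loop above; reads stdin.
pad_sed() { sed -e :a -e 's/^.\{0,3\}$/0&/;ta'; }

# Short lines gain zeros one at a time; lines of 4 or more characters never match.
printf '%s\n' 30 1 508 A0EA 12345 | pad_sed
```

This prints 0030, 0001, 0508, A0EA and 12345. Note that an empty line also matches the pattern and comes out as 0000.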
0

This is kinda lame, (files must be shorter than getconf ARG_MAX bytes), but it works:

printf '%4s\n' $(<file) | tr ' ' 0
agc
0
awk -v len=4 '{
  add=""          # empty prefix to be added

  # build one "0" for each character missing from the word
  for(i=length($1);i<len;i++)
      add=add"0"

  print add""$1   # prefix goes before the word, not after it

 }' file.txt

Feel free to change the variable len to your liking.

zabidima
0

Use xargs on macOS or FreeBSD:

xargs printf '%04s\n' < file

This runs printf with arguments from file to the limit of command line arguments per call to printf.

Use sed on Linux and the *BSD:

sed -e 's/^\(.\)$/000\1/' -e 's/^\(..\)$/00\1/' -e 's/^\(...\)$/0\1/'

Short and efficient.

James Risner
  • What printf implementation does 0-padding with %04s? With the printf implementations I tried I get either the same as %4s (space padding) or an error. – Stéphane Chazelas Dec 25 '22 at 08:54
  • @StéphaneChazelas thanks for the catch, printf functionality differs between *BSD and Linux. So I clarified that and added a sed version a one command (no pipes). – James Risner Dec 25 '22 at 13:31
-1

I did this using an if condition and awk:

count_line=`awk '{print NR}' p.txt| sed -n '$p'`

for ((i=1;i<=$count_line;i++)); do
    j=`awk -v i="$i" -F "" 'NR==i{print NF}' p.txt`
    if [[ $j == "1" ]]; then
        awk -v i="$i" -F "" 'NR==i{print "000"$0}' p.txt
    elif [[ $j == "2" ]]; then
        awk -v i="$i" -F "" 'NR==i{print "00"$0}' p.txt
    elif [[ $j == "3" ]]; then
        awk -v i="$i" -F "" 'NR==i{print "0"$0}' p.txt
    else
        awk -v i="$i" 'NR==i{print $0}' p.txt
    fi
done

output

0030
0001
0508
A0EA
A0EB
A0EC
A0ED