23

I am looking for a way to make awk do high-precision arithmetic in a substitution operation. This involves reading a field from a file and replacing it with that value incremented by 1%. However, I am losing precision in the process. Here is a simplified reproduction of the problem:

$ echo 0.4970436865354813 | awk '{gsub($1, $1*1.1)}; {print}'
0.546748

Here the input has 16 digits after the decimal point, but awk outputs only six. Using printf gives the same result:

$ echo 0.4970436865354813 | awk '{gsub($1, $1*1.1)}; {printf("%.16G\n", $1)}'
0.546748

Any suggestions on how to get the desired precision?

Ketan
  • Perhaps awk has higher resolution but it's just your output formatting is truncating. Use printf. – dubiousjim Nov 28 '12 at 15:25
  • No changes in result value after using printf. Question edited accordingly. – Ketan Nov 28 '12 at 15:32
  • As @manatwork has pointed out, that gsub is unnecessary. The problem is gsub works on strings, not numbers, so a conversion is done first using CONVFMT, and the default value for that is %.6g. – jw013 Nov 28 '12 at 15:46
  • @jw013, As I mentioned in the question, my original problem requires gsub since I need to substitute a number with a 1% increment. Agreed, in the simplified example, it is not required. – Ketan Nov 28 '12 at 15:48

3 Answers

22
$ echo 0.4970436865354813 | awk -v CONVFMT=%.17g '{gsub($1, $1*1.1)}; {print}'
0.54674805518902947

Or rather here:

$ echo 0.4970436865354813 | awk '{printf "%.17g\n", $1*1.1}'
0.54674805518902947

is probably the best you can achieve. Use bc instead for arbitrary precision.

$ echo '0.4970436865354813 * 1.1' | bc -l
.54674805518902943
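
For the original use case (replacing one field of each line while keeping the rest), a minimal sketch is to assign the formatted product back to the field instead of going through gsub. The function name bump_field is a hypothetical helper introduced here for illustration, and the sketch assumes the value is in the first whitespace-separated field:

```shell
# bump_field FACTOR: multiply the first field of each input line by FACTOR
# and print the line with that field replaced at 17 significant digits.
# (bump_field is a hypothetical helper name, not something awk provides.)
bump_field() {
  awk -v f="$1" '{ $1 = sprintf("%.17g", $1 * f); print }'
}

echo '0.4970436865354813 north' | bump_field 1.1
# prints: 0.54674805518902947 north
```

Assigning to $1 makes awk rebuild the line with OFS (a single space by default), so runs of whitespace between fields are normalized.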
  • If you want arbitrary precision in AWK you can use the -M flag and set the PREC value to a large number – Robert Benson Apr 03 '18 at 20:09
  • @RobertBenson, only with GNU awk and only with recent versions (4.1 or above, so not at the time that answer was written) and only when MPFR was enabled at compile time though. – Stéphane Chazelas Apr 03 '18 at 20:37
6

For higher precision with (GNU) awk (with bignum compiled in) use:

$ echo '0.4970436865354813' | awk -M -v PREC=100 '{printf("%.18f\n", $1)}'
0.497043686535481300

The PREC=100 means 100 bits of precision instead of the default 53 bits.
If that awk is not available, use bc:

$ echo '0.4970436865354813*1.1' | bc -l
.54674805518902943

Or you will need to learn to live with the inherent imprecision of floats.


In your original lines there are several issues:

  • A factor of 1.1 is 10% increase, not 1% (should be a 1.01 multiplier). I'll use 10%.
  • The conversion from a (floating-point) number to a string is controlled by CONVFMT. Its default value is %.6g, which keeps only 6 significant digits (not 6 digits after the dot). Since gsub works on strings, the result of $1*1.1 is converted with CONVFMT before it replaces $1.

    $ a='0.4970436865354813'
    $ echo "$a" | awk '{printf("%.16f\n", $1*1.1)}'
    0.5467480551890295
    
    $ echo "$a" | awk '{gsub($1, $1*1.1)}; {printf("%.16f\n", $1)}'
    0.5467480000000000
    
  • The printf format g removes trailing zeros:

    $ echo "$a" | awk '{gsub($1, $1*1.1)}; {printf("%.16g\n", $1)}'
    0.546748
    
    $ echo "$a" | awk '{gsub($1, $1*1.1)}; {printf("%.17g\n", $1)}'
    0.54674800000000001
    

    Both issues could be solved with:

    $ echo "$a" | awk '{printf("%.17g\n", $1*1.1)}'
    0.54674805518902947
    

    Or

    $ echo "$a" | awk -v CONVFMT=%.30g '{gsub($1, $1*1.1)}; {printf("%.17f\n", $1)}'
    0.54674805518902947 
    

But don't get the idea that this means higher precision. Internally, the number is still a double-precision float. That means 53 bits of precision, so only about 15 decimal digits are guaranteed to be correct, even though up to 17 digits often look right. That's a mirage.

$ echo "$a" | awk -v CONVFMT=%.30g '{gsub($1, $1*1.1)}; {printf("%.30f\n", $1)}'
0.546748055189029469325134868996

The correct value is:

$ echo "scale=18; 0.4970436865354813 * 1.1" | bc
.54674805518902943

GNU awk can compute at this higher precision too, if the bignum library has been compiled in. The example below prints the input value itself to 30 decimal places:

$ echo "$a" | awk -M -v PREC=100 -v CONVFMT=%.30g '{printf("%.30f\n", $1)}'
0.497043686535481300000000000000
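
The 15-digit rule mentioned above can be checked with any awk. This is a small sketch, where show_digits is a hypothetical helper name: it prints the same double-precision product at 15 and at 17 significant digits, so the two can be compared against the exact bc result .54674805518902943.

```shell
# show_digits VALUE FACTOR: print VALUE*FACTOR at 15 and at 17 significant digits.
# (show_digits is a hypothetical helper name used only for this sketch.)
show_digits() {
  awk -v a="$1" -v f="$2" 'BEGIN {
    printf "%.15g\n", a * f   # digits a double can be trusted for
    printf "%.17g\n", a * f   # enough to round-trip a double; the tail is not exact
  }'
}

show_digits 0.4970436865354813 1.1
# prints:
# 0.546748055189029
# 0.54674805518902947
```

The 15-digit line agrees with the exact product, while the last digits of the 17-digit line (...47 instead of ...43) come from the binary representation.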
0

My awk script is bigger than just a one liner, so I used the combination of Stéphane Chazelas's and Isaac's answers:

  1. I set the CONVFMT variable, which takes care of number-to-string conversion globally
  2. I also use the bignum parameter -M along with the PREC variable

Example snippet:

#!/usr/bin/awk -M -f
BEGIN {
  FS="<|>"
  CONVFMT="%.18g"
  PREC=100
}
{
  if ($2 == "LatitudeDegrees") {
    CORR = $3  # redacted specific corrections
    print("     <LatitudeDegrees>" CORR "</LatitudeDegrees>");
  } else if ($2 == "LongitudeDegrees") {
    CORR = $3  # redacted specific corrections
    print("     <LongitudeDegrees>" CORR "</LongitudeDegrees>");
  } else {
    print($0);
  }
}
END {
}

The OP simplified the example, but if the awk script is not a one-liner you don't want to pollute it with printf calls; instead, set the format once in the variable as shown. Likewise, set the precision inside the script so it doesn't get lost in the actual command-line invocation.
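
A minimal sketch of why CONVFMT is the variable that matters in a script like this one, assuming a POSIX-style awk: CONVFMT is applied when a number is concatenated into a string, while a number passed directly to print is formatted with OFMT (default %.6g). The function name demo_formats and the <v> tag are made up for illustration.

```shell
# demo_formats: contrast CONVFMT (number-to-string conversion) with OFMT (print).
# (demo_formats and the <v> element are hypothetical, for illustration only.)
demo_formats() {
  awk 'BEGIN {
    CONVFMT = "%.17g"
    x = 0.4970436865354813 * 1.1
    print "<v>" x "</v>"   # concatenation: x is converted with CONVFMT
    print x                # bare number: formatted with OFMT, still %.6g
  }'
}

demo_formats
# prints:
# <v>0.54674805518902947</v>
# 0.546748
```

If a script also prints bare numbers, OFMT would need the same treatment as CONVFMT.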