TL;DR
Just copy and use the function sigf from the section A reasonably good "significant numbers" function. It is written (as all code in this answer) to work with dash.
It will give the printf approximation to the integer part of N with $sig digits.
About the decimal separator.
The first problem to solve with printf is the "decimal mark", which is a point in a US locale and a comma in a DE locale (for example). It is a problem because what works in one locale (or shell) will fail in another. Example:
$ dash -c 'printf "%2.3f\n" 12.3045'
12.305
$ ksh -c 'printf "%2.3f\n" 12.3045'
ksh: printf: 12.3045: arithmetic syntax error
ksh: printf: 12.3045: arithmetic syntax error
ksh: printf: warning: invalid argument of type f
12,000
$ ksh -c 'printf "%2.3f\n" 12,3045'
12,304
One common (and incorrect) solution is to set LC_ALL=C for the printf command. But that fixes the decimal mark to a point, which is a problem for locales where a comma (or some other character) is the commonly used one.
The solution is to find out, inside the script, which decimal separator the running shell and locale use. That is quite simple:
$ printf '%1.1f' 0
0,0 # for a comma locale (or shell).
Removing zeros:
$ dec="$(IFS=0; printf '%s' $(printf '%.1f'))"; echo "$dec"
, # for a comma locale (or shell).
That value is used to change the file with the list of tests:
sed -i 's/[,.]/'"$dec"'/g' infile
That makes the runs on any shell or locale automatically valid.
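The same detection can also be written without relying on IFS splitting. This is only a minimal sketch: decsep is a hypothetical helper name (not part of the original answer), and it assumes the shell's printf accepts %f and prints exactly one digit on each side of the mark for '%.1f'.

```shell
# Hypothetical helper: print the decimal mark used by this shell's printf.
# Assumption: printf '%.1f' 0 yields "0<mark>0" (e.g. "0.0" or "0,0").
decsep() {
    d=$(printf '%.1f' 0)   # "0.0" or "0,0", depending on locale/shell
    d=${d#0}               # drop the leading zero
    d=${d%0}               # drop the trailing zero
    printf '%s\n' "$d"     # what is left is the decimal mark
}
```

Calling dec=$(decsep) then gives the value that the sed command above substitutes into the test list.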
Some basics.
It should be intuitive to cut the number to be formatted with the format %.*e or even %.*g of printf. The main difference between %.*e and %.*g is how they count digits: %.*g takes the full count of significant digits, while %.*e counts only the digits after the decimal mark, so it needs the count less 1:
$ printf '%.*e %.*g' $((4-1)) 1,23456e0 4 1,23456e0
1,235e+00 1,235
That worked well for 4 significant digits.
After the number has been cut to the requested digits, we need an additional step to format numbers whose exponent is not 0 (unlike the example above).
$ N=$(printf '%.*e' $((4-1)) 1,23456e3); echo "$N"
1,235e+03
$ printf '%4.0f' "$N"
1235
This works correctly. The count of the integer part (at the left of the decimal mark) is just the value of the exponent ($exp). The count of decimals needed is the number of significant digits ($sig) less the amount of digits already used on the left part of the decimal separator:
a=$((exp<0?0:exp)) ### count of integer characters.
b=$((exp<sig?sig-exp:0)) ### count of decimal characters.
printf '%*.*f' "$a" "$b" "$N"
As the integral part for the f format has no limit, there is in fact no need to explicitly declare it and this (simpler) code works:
a=$((exp<sig?sig-exp:0)) ### count of decimal characters.
printf '%0.*f' "$a" "$N"
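The count of decimals described above can be derived from the %e output itself. What follows is a sketch under assumptions, not the answer's own code: decimals is a hypothetical helper name, the exponent is parsed with parameter expansion instead of bc, and the printf in use must support %e.

```shell
# Hypothetical helper: print how many decimals are needed to show
# $2 significant digits of the number $1.
# Assumption: printf supports %e and writes the exponent as e+NN / e-NN.
decimals() {
    sci=$(printf '%.*e' "$(( $2 - 1 ))" "$1")   # e.g. 1.235e+03
    exp=${sci##*[eE]}                            # exponent field, e.g. +03
    sgn=${exp%%[0-9]*}                           # its sign: "+" or "-"
    exp=${exp#[+-]}                              # digits only, e.g. 03
    exp=${exp#"${exp%%[!0]*}"}                   # strip leading zeros (avoid octal)
    exp=$(( ${sgn}${exp:-0} + 1 ))               # digits left of the decimal mark
    printf '%s\n' "$(( exp < $2 ? $2 - exp : 0 ))"
}
```

For example, decimals 12 4 prints 2, so printf '%.*f' "$(decimals 12 4)" 12 shows 12 with two decimals, while decimals 1234 4 prints 0.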
First trial.
A first function that could do this in a more automated way:
# Function significant (number, precision)
sig1(){
sig=$(($2>0?$2:1)) ### significant digits (>0)
N=$(printf "%0.*e" "$(($sig-1))" "$1") ### N in sci (cut to $sig digits).
exp=$(echo "${N##*[eE+]}+1"|bc) ### get the exponent.
a="$((exp<sig?sig-exp:0))" ### calc number of decimals.
printf "%0.*f" "$a" "$N" ### re-format number.
}
This first attempt works with many numbers but fails when the number has fewer digits available than the significant count requested (most visibly when the exponent is less than -4):
Number sig Result Correct?
123456789 --> 4< 123500000 >--| yes
23455 --> 4< 23460 >--| yes
23465 --> 4< 23460 >--| yes
1,2e-5 --> 6< 0,0000120000 >--| no
1,2e-15 -->15< 0,00000000000000120000000000000 >--| no
12 --> 6< 12,0000 >--| no
It will add many zeros which are not needed.
Second trial.
To solve that we need to clean N of the exponent and any trailing zeros. Then we can get the effective length of digits available and work with that:
# Function significant (number, precision)
sig2(){ local sig N exp n len a
sig=$(($2>0?$2:1)) ### significant digits (>0)
N=$(printf "%+0.*e" "$(($sig-1))" "$1") ### N in sci (cut to $sig digits).
exp=$(echo "${N##*[eE+]}+1"|bc) ### get the exponent.
n=${N%%[Ee]*} ### remove the exponent (keeps sign and decimal mark).
n=${n%"${n##*[!0]}"} ### remove all trailing zeros
len=$(( ${#n}-2 )) ### len of N (less sign and dec).
len=$((len<sig?len:sig)) ### select the minimum.
a="$((exp<len?len-exp:0))" ### use $len to count decimals.
printf "%0.*f" "$a" "$N" ### re-format the number.
}
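The trailing-zero removal used in sig2 is compact enough to be worth unpacking. A small sketch of the same parameter-expansion idiom in two steps; rtrim0 is a hypothetical name introduced only for illustration.

```shell
# Hypothetical helper: strip trailing zeros from the string $1,
# using the same idiom as n=${n%"${n##*[!0]}"} in sig2.
rtrim0() {
    n=$1
    z=${n##*[!0]}             # remove everything up to the last non-zero
                              # character: what remains is the zero run
    printf '%s\n' "${n%"$z"}" # remove that run from the end of the string
}
```

rtrim0 1.20000 prints 1.2. Note that rtrim0 1200 prints 12, which is why sig2 applies the idiom to the mantissa only and tracks the exponent separately, and that a string made only of zeros is reduced to an empty string.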
However, sig2 still relies on floating point math, and "nothing is simple in floating point" (see: Why don’t my numbers add up?).
printf "%.2g " 76500,00001 76500
7,7e+04 7,6e+04
However:
printf "%.2g " 75500,00001 75500
7,6e+04 7,6e+04
Why?:
printf "%.32g\n" 76500,00001e30 76500e30
7,6500000010000000001207515928855e+34
7,6499999999999999997831226199114e+34
Also, the printf command is a builtin in many shells, and what printf prints may change with the shell:
$ dash -c 'printf "%.*f" 4 123456e+25'
1234560000000000020450486779904.0000
$ ksh -c 'printf "%.*f" 4 123456e+25'
1234559999999999999886313162278,3840
With dash, the second function gives the expected results for the earlier cases but fails for a very large number:
$ dash ./script.sh
123456789 --> 4< 123500000 >--| yes
23455 --> 4< 23460 >--| yes
23465 --> 4< 23460 >--| yes
1.2e-5 --> 6< 0.000012 >--| yes
1.2e-15 -->15< 0.0000000000000012 >--| yes
12 --> 6< 12 >--| yes
123456e+25 --> 4< 1234999999999999958410892148736 >--| no
A reasonably good "significant numbers" function:
dec=$(IFS=0; printf '%s' $(printf '%.1f')) ### What is the decimal separator?.
sed -i 's/[,.]/'"$dec"'/g' infile
zeros(){ # create a string of $1 zeros (for $1 positive or zero).
printf '%.*d' $(( $1>0?$1:0 )) 0
}
# Function significant (number, precision)
sigf(){ local sig sci exp N sgn len z1 z2 b c
sig=$(($2>0?$2:1)) ### significant digits (>0)
N=$(printf '%+e\n' $1) ### use scientific format.
exp=$(echo "${N##*[eE+]}+1"|bc) ### find ceiling{log(N)}.
N=${N%%[eE]*} ### cut after `e` or `E`.
sgn=${N%%"${N#-}"} ### keep the sign (if any).
N=${N#[+-]} ### remove the sign
N=${N%[!0-9]*}${N#??} ### remove the $dec
N=${N#"${N%%[!0]*}"} ### remove all leading zeros
N=${N%"${N##*[!0]}"} ### remove all trailing zeros
len=$((${#N}<sig?${#N}:sig)) ### count of selected characters.
N=$(printf '%0.*s' "$len" "$N") ### use the first $len characters.
result="$N"
# add the decimal separator or lead zeros or trail zeros.
if [ "$exp" -gt 0 ] && [ "$exp" -lt "$len" ]; then
b=$(printf '%0.*s' "$exp" "$result")
c=${result#"$b"}
result="$b$dec$c"
elif [ "$exp" -le 0 ]; then
# fill front with leading zeros ($exp length).
z1="$(zeros "$((-exp))")"
result="0$dec$z1$result"
elif [ "$exp" -ge "$len" ]; then
# fill back with trailing zeros.
z2=$(zeros "$((exp-len))")
result="$result$z2"
fi
# place the sign back.
printf '%s' "$sgn$result"
}
And the results are:
$ dash ./script.sh
123456789 --> 4< 123400000 >--| yes
23455 --> 4< 23450 >--| yes
23465 --> 4< 23460 >--| yes
1.2e-5 --> 6< 0.000012 >--| yes
1.2e-15 -->15< 0.0000000000000012 >--| yes
12 --> 6< 12 >--| yes
123456e+25 --> 4< 1234000000000000000000000000000 >--| yes
123456e-25 --> 4< 0.00000000000000000001234 >--| yes
-12345.61234e-3 --> 4< -12.34 >--| yes
-1.234561234e-3 --> 4< -0.001234 >--| yes
76543 --> 2< 76000 >--| yes
-76543 --> 2< -76000 >--| yes
123456 --> 4< 123400 >--| yes
12345 --> 4< 12340 >--| yes
1234 --> 4< 1234 >--| yes
123.4 --> 4< 123.4 >--| yes
12.345678 --> 4< 12.34 >--| yes
1.23456789 --> 4< 1.234 >--| yes
0.1234555646 --> 4< 0.1234 >--| yes
0.0076543 --> 2< 0.0076 >--| yes
.000000123400 --> 2< 0.00000012 >--| yes
.000001234000 --> 2< 0.0000012 >--| yes
.000012340000 --> 2< 0.000012 >--| yes
.000123400000 --> 2< 0.00012 >--| yes
.001234000000 --> 2< 0.0012 >--| yes
.012340000000 --> 2< 0.012 >--| yes
.123400000000 --> 2< 0.12 >--| yes
1.234 --> 2< 1.2 >--| yes
12.340 --> 2< 12 >--| yes
123.400 --> 2< 120 >--| yes
1234.000 --> 2< 1200 >--| yes
12340.000 --> 2< 12000 >--| yes
123400.000 --> 2< 120000 >--| yes
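The script.sh and infile that produced these tables are not shown in the answer. A driver along the following lines could generate the same layout. Everything here is an assumption made for illustration: the check function, the three-column input format (number, sig, expected), and the stand-in formatter first_chars, which should be replaced by sigf in real use.

```shell
# Hypothetical test driver: reads lines of "number sig expected" on stdin
# and prints one table row per line, using the formatter named in $1.
check() {
    while read -r num sig want; do
        got=$("$1" "$num" "$sig")
        if [ "$got" = "$want" ]; then ok=yes; else ok=no; fi
        printf '%16s -->%2s< %s >--| %s\n' "$num" "$sig" "$got" "$ok"
    done
}

# Stand-in formatter so the sketch runs on its own: it just keeps the
# first $2 characters of $1. Replace it with sigf for real tests.
first_chars() { printf '%.*s' "$2" "$1"; }
```

For instance, printf '123456 4 1234\n' | check first_chars prints a row ending in "< 1234 >--| yes". Keeping the expected value in the input file lets the last column be computed rather than typed by hand.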
%f/%g, but that's the printf argument, and one doesn't need a POSIX printf to have a POSIX shell. I think you should have commented instead of editing there. – Toby Speight Mar 09 '18 at 11:02
printf %g cannot be used in a POSIX script. It's true it's down to the printf utility, but that utility is builtin in most shells. The OP tagged as bash, so using a bash shebang is one easy way to get a printf that supports %g. Otherwise, you'd need to add a assuming your printf (or the printf builtin of your sh if printf is builtin there) supports the non-standard (but quite common) %g ... – Stéphane Chazelas Mar 09 '18 at 11:30
dash has a builtin printf (which supports %g). On GNU systems, mksh is probably the only shell these days that won't have a builtin printf. – Stéphane Chazelas Mar 09 '18 at 11:41
bash) and relegate some of this to notes - does it look correct now? – Toby Speight Mar 09 '18 at 11:48
printf "%.3g\n" 0.400 gives 0.4 not 0.400 – phiresky Jan 13 '20 at 16:36