How can I check whether a specific string is a floating-point number? These are examples of possible floating-point numbers:

12.245
+.0009
3.11e33
43.1E11
2e-14

This is what I tried:

grep "^[+\-\.0-9]" 
grep "^[+-]*[0-9]"
grep "^[+\-\.0-9]" 

I also tried lots of other related patterns, but none of them filtered anything at all: almost every string got through. How should I tackle this problem?

Jeff Schaller
O'Niel

3 Answers

grep -xE '[-+]?[0123456789]*\.?[0123456789]+([eE][-+]?[0123456789]+)?'

With -x, we're anchoring the regexp at the start and end of the line, so each line has to match the pattern as a whole, as opposed to the pattern being found anywhere in the line.
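To see why whole-line anchoring matters, here is a minimal Python sketch. It assumes Python's re module as a stand-in for grep -E (the two dialects agree for this particular pattern), and the name FLOAT_RE is only illustrative. re.fullmatch plays the role of -x, while re.search behaves like grep without it.

```python
import re

# Same pattern as the grep -xE command above; FLOAT_RE is a hypothetical name.
FLOAT_RE = re.compile(
    r'[-+]?[0123456789]*\.?[0123456789]+([eE][-+]?[0123456789]+)?'
)

samples = ['12.245', '+.0009', '3.11e33', '43.1E11', '2e-14', 't12', '---0', 'aaa']

for s in samples:
    whole = FLOAT_RE.fullmatch(s) is not None   # like grep -x: entire string must match
    anywhere = FLOAT_RE.search(s) is not None   # like plain grep: a match anywhere suffices
    print(f'{s!r}: whole={whole} anywhere={anywhere}')
```

Strings such as t12 and ---0 contain a number-like substring, so they pass the unanchored test but fail the whole-string one.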

If you want to match all the forms supported by POSIX/C strtod(), as recognised by many implementations of the printf utility for instance:

r=[$(locale decimal_point)]
d=[0123456789]
h=[0123456789abcdefABCDEF]
grep -xE "[[:space:]]*[-+]?($d*$r?$d+([eE][-+]?$d+)?|\
0[xX]$h*$r?$h*([pP][-+]?$d+)?|\
[iI][nN][fF]([iI][nN][iI][tT][yY])?|\
[nN][aA][nN](\([abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_]*\))?)"

That also matches things like 0x3f, 0xFP-4, -Infinity and NAN(whatever):

$ printf '%g\n' 0x3f 0xFp-4 -Infinity 'NAN(whatever)'
63
0.9375
-inf
nan
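For comparison, Python's built-in parsers accept a similar but not identical set of forms. The following is a rough sketch, not an equivalent of strtod(): float() handles decimal notation, inf/infinity and nan, float.fromhex() handles the hexadecimal forms, and NAN(whatever) is accepted by neither. The helper name parse_like_strtod is made up for this illustration, and the locale's decimal point is ignored.

```python
def parse_like_strtod(text):
    """Return a float for decimal, hex, inf or nan input, else None.

    Hypothetical helper that only approximates strtod() with built-ins.
    """
    try:
        return float(text)
    except ValueError:
        pass
    # float.fromhex() also accepts hex digits without a 0x prefix,
    # so require the prefix explicitly before trying it.
    body = text.strip().lstrip('+-').lower()
    if body.startswith('0x'):
        try:
            return float.fromhex(text)
        except (ValueError, OverflowError):
            pass
    return None

for item in ['0x3f', '0xFp-4', '-Infinity', 'NAN(whatever)', '12.245']:
    print(item, '->', parse_like_strtod(item))
```

Note that float() is more lenient than strtod() in some respects (for example, it accepts underscores between digits), so this should be treated as an approximation only.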

Alternative Python solution (for less sophisticated input items):

Sample input.txt file:

11
12.245
+.0009
---0
3.11e33
43.1E11
2e-14
t12
aaa
10.001

check_float.py script:

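import sys

with open(sys.argv[1]) as inp:
    for line in inp.read().splitlines():
        try:
            float(line)  # raises ValueError if the string is not a number
            # Plain integers such as "11" do not count as floats here:
            # require a decimal point or an exponent.
            is_float = '.' in line or 'e' in line.lower()
        except ValueError:
            is_float = False
        # Result is computed afresh for every line, so nothing carries over.
        print('{0} - {1}'.format(line, 'Yes' if is_float else 'No'))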

Usage:

python check_float.py input.txt

The output:

11 - No
12.245 - Yes
+.0009 - Yes
---0 - No
3.11e33 - Yes
43.1E11 - Yes
2e-14 - Yes
t12 - No
aaa - No
10.001 - Yes

Disclaimer: This is an imperfect solution. Perl's Scalar::Util::looks_like_number() function may not be the best choice of routine for doing this. See Stéphane Chazelas's comments below.

I'm leaving it here if anyone wants to look at it and pick the coproc bits out of it.


Rather than trying to craft your own regular expression to match the many possible floating-point number formats, use a library that already implements it:

perl -MScalar::Util -ne 'exit !Scalar::Util::looks_like_number($_)'

As a bash shell function:

is_number () {
    perl -MScalar::Util -ne 'exit !Scalar::Util::looks_like_number($_)' <<<"$1"
}

is_number hello && echo 'hello is a number'
is_number 1.234 && echo '1.234 is a number'

As a bash co-process (to avoid starting a new Perl process every time you want to test a number):

coproc PERLIO=:raw perl -MScalar::Util -ne \
    'print Scalar::Util::looks_like_number($_) ? "Yes" : "No", "\n"'

while IFS= read -r -p 'Number please: ' possnum; do
    printf '%s\n' "$possnum" >&${COPROC[1]}
    read -u ${COPROC[0]}

    case "$REPLY" in
        Yes)    printf '%s is a number\n' "$possnum"       ;;
        No)     printf '%s is _not_ a number\n' "$possnum" ;;
    esac
done

kill "$COPROC_PID"

Or combining the two:

coproc PERLIO=:raw perl -MScalar::Util -ne \
    'print Scalar::Util::looks_like_number($_) ? "Yes" : "No", "\n"'

is_number () {
    printf '%s\n' "$1" >&${COPROC[1]}

    local REPLY
    read -u ${COPROC[0]}

    [ "$REPLY" = 'Yes' ] && return 0

    return 1
}

while IFS= read -r -p 'Number please: ' possnum; do
    if is_number "$possnum"; then
        printf '%s is a number\n' "$possnum"
    else
        printf '%s is _not_ a number\n' "$possnum"
    fi
done

kill "$COPROC_PID"
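
The same keep-one-helper-running idea can be sketched outside bash. The following is a minimal Python illustration, assuming a child Python process in place of the Perl one-liner and float() in place of looks_like_number(); the names CHILD, helper, is_number and results are all hypothetical.

```python
import subprocess
import sys

# Child program: reads one candidate per line and answers Yes or No.
# It stands in for the Perl one-liner; float() is the test used here.
CHILD = r'''
import sys
for line in sys.stdin:
    try:
        float(line)
        answer = "Yes"
    except ValueError:
        answer = "No"
    print(answer, flush=True)  # flush so the parent is not left waiting
'''

# Start the helper once and keep its pipes open, much like coproc does.
helper = subprocess.Popen(
    [sys.executable, '-c', CHILD],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE,
    text=True, bufsize=1,
)

def is_number(candidate):
    """Send one line to the helper and read back its one-line verdict."""
    helper.stdin.write(candidate + '\n')
    helper.stdin.flush()
    return helper.stdout.readline().strip() == 'Yes'

results = {value: is_number(value) for value in ['hello', '1.234', '2e-14']}
for value, verdict in results.items():
    print(value, verdict)

helper.stdin.close()   # end of input lets the child exit
helper.wait()
```

As with the bash version, the important detail is that the helper flushes each answer immediately; otherwise the parent would block waiting for buffered output.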
Kusalananda
  • But then, you don't know what it matches nor in what context those numbers may be used for. Does it match on 1,23 on locales where , is the radix? Does it match on nan, inf, 0x12, " 123 \t "? – Stéphane Chazelas Oct 12 '17 at 20:45
  • My version of perl (5.22.1) says that "- " (minus space) is a number. – Stéphane Chazelas Oct 12 '17 at 20:52
  • @StéphaneChazelas Ah, well. It was too fun to write for me to delete it. I'll put a disclaimer on top. Hopefully someone will find its use of coproc interesting. – Kusalananda Oct 12 '17 at 20:56
  • It would be nice to know exactly what it matches on actually. Not easy to tell from the source. For instance, it matches on "0 but true" as well. Can we make it honour the locale's decimal radix? – Stéphane Chazelas Oct 12 '17 at 21:04
  • @StéphaneChazelas It doesn't seem to do normal Perl parsing of numbers: hexadecimals and numbers containing _ are "not numbers". Tabs and spaces seem to be allowed, but I'm mystified by - being a number while + is not. The locale does not seem to play a part. – Kusalananda Oct 12 '17 at 21:18
  • Locale plays if you use use locale (or -Mlocale). I've just raised a bug about "- " ("\s*-\s+") – Stéphane Chazelas Oct 12 '17 at 21:39
  • @StéphaneChazelas Then locales on my system are broken. I tried 2,2 with LC_ALL=sv_SE.UTF-8 and got "not a number". The space may be dropped from the minus btw, and it's still a number. – Kusalananda Oct 12 '17 at 21:42
  • LC_ALL=sv_SE.UTF-8 perl -Mlocale -MScalar::Util -e 'print 0+Scalar::Util::looks_like_number($_) for "1,2", "1.2", "-", "- "' prints 1101 for me. – Stéphane Chazelas Oct 12 '17 at 21:48
  • @StéphaneChazelas 0101 here. Perl v5.24.2 on OpenBSD 6.2. But with my shell script it says - (no space) is a number. Strange. – Kusalananda Oct 12 '17 at 21:51
  • https://rt.perl.org/Public/Bug/Display.html?id=132278 – Stéphane Chazelas Oct 12 '17 at 21:55
  • In your script, you're testing "-\n" (so it's another "\s*-\s+"). Add the "-l" option to strip the "\n" – Stéphane Chazelas Oct 12 '17 at 21:56
  • @StéphaneChazelas Ah. Good catch! Thanks. And I saw your bug report already. Let's see what comes out of it. This comment section is too long now. I suggest we let it rest for the time being. I might pick out one or two things and put in my answer. Thanks. – Kusalananda Oct 12 '17 at 21:58
  • Regexp::Common may be a better alternative here (perl -MRegexp::Common=number -lne 'print if /^$RE{num}{real}$/') – Stéphane Chazelas Oct 13 '17 at 11:39
  • @StéphaneChazelas Thanks! When I have time... – Kusalananda Oct 13 '17 at 11:40