-4

I'd like to extract the extension out of a filename with multiple dots. For instance:

gbamidi-v1.0.tar.gz

I should get "tar.gz", not "gz" or "0.tar.gz".

I'd prefer a solution that does not rely on bash-exclusive features (POSIX commands like sed or cut are OK).

EDIT: A reasonable solution could be: "get everything after the second-last dot, but exclude numbers or substrings with a length <= 1".
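One possible reading of that rule, as a rough sketch in plain POSIX sh: take the part after the last dot, and prepend the part before it only if that part is longer than one character and not purely numeric. The function name get_ext is made up for illustration, and the variables it sets are global because POSIX sh has no local.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): print the extension of
# a filename, keeping the second-last dot-separated part only when it is
# longer than one character and not purely numeric.
get_ext() {
    name=$1
    case $name in
        *.*) ;;                  # at least one dot: carry on
        *)   return 0 ;;         # no dot: no extension, print nothing
    esac
    last=${name##*.}             # text after the last dot
    rest=${name%.*}              # everything before the last dot
    case $rest in
        *.*) prev=${rest##*.} ;; # text between the second-last and last dot
        *)   prev= ;;
    esac
    case $prev in
        ''|?)     ext=$last ;;        # missing or a single character
        *[!0-9]*) ext=$prev.$last ;;  # contains a non-digit: keep it
        *)        ext=$last ;;        # purely numeric
    esac
    printf '%s\n' "$ext"
}

get_ext gbamidi-v1.0.tar.gz   # prints: tar.gz
```

This is only a heuristic: a name like photo.final.jpg would yield final.jpg, so it probably needs tuning for real data.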

eadmaster
  • 1,643

2 Answers

0

This may not hold in every case, but .gz is the extension. foo.tar.gz first has to be decompressed to foo.tar, which is then unarchived. The fact that you can do both in one command is just a convenience.

You need to get the extension? It's .gz.

If you need something else, you'll have to target specific patterns using a regex, awk, cut, or the like.
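As a sketch of what "targeting specific things" might look like in POSIX sh, a case statement can special-case a short list of compound suffixes and fall back to the text after the last dot. The function name known_ext and the list of suffixes are assumptions for illustration, not an exhaustive set.

```shell
#!/bin/sh
# Hypothetical helper: recognise a few well-known compound suffixes
# explicitly, otherwise print whatever follows the last dot.
known_ext() {
    case $1 in
        *.tar.gz)  printf '%s\n' tar.gz ;;
        *.tar.bz2) printf '%s\n' tar.bz2 ;;
        *.tar.xz)  printf '%s\n' tar.xz ;;
        *.*)       printf '%s\n' "${1##*.}" ;;  # plain case: last component
        *)         : ;;                         # no dot: print nothing
    esac
}

known_ext gbamidi-v1.0.tar.gz   # prints: tar.gz
known_ext notes.gz              # prints: gz
```

The explicit list trades generality for predictability: nothing is guessed, but every compound suffix you care about has to be added by hand.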

n.st
  • 8,128
coteyr
  • 4,310
0

Assuming that each part of an extension starts with a letter after the period, the following command prints .tar.gz:

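echo gbamidi-v1.0.tar.gz | awk '
    BEGIN { FS = "." }    # a single-character FS is taken literally, so no escape is needed
    {
        extension = ""
        i = NF
        # Walk backwards from the last field while the field starts with a letter.
        while ((i > 1) && (substr($i, 1, 1) ~ /[A-Za-z]/)) {
            extension = "." $i extension
            i--
        }
        print extension
    }'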
  • In zsh I'd do this:
    var=`echo version12.0.foo.tar.gz | sed 's,[[:digit:]]\+\.[[:digit:]]\+,,'` ; echo ${var#*.}

    Just strip out any decimal numbers, assign the result to the variable 'var', then strip everything up to and including the first dot from 'var'.

    – Ray Andrews Jan 28 '15 at 00:56
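Since the question asks for something that isn't shell-specific, here is a hedged variant of the comment's idea that sticks to POSIX constructs. In basic regular expressions \+ is a GNU extension, so the sketch uses the interval \{1,\} instead. The function name strip_version_ext is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: remove the first "digits.digits" run from the name,
# then drop everything up to and including the first remaining dot.
strip_version_ext() {
    var=$(printf '%s\n' "$1" | sed 's/[[:digit:]]\{1,\}\.[[:digit:]]\{1,\}//')
    printf '%s\n' "${var#*.}"
}

strip_version_ext gbamidi-v1.0.tar.gz      # prints: tar.gz
strip_version_ext version12.0.foo.tar.gz   # prints: foo.tar.gz
```

As the second call shows, anything after the version number counts as extension here, so this only fits names where the version is the last non-extension part.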