With the GNU find command (GNU findutils 4.4.2), a regular expression can be used to search for files. For example:

$ find pool -regextype posix-extended -regex ".*/mypackage-([a-zA-Z0-9.]+-[0-9]{1,2})-x86_64.pkg.tar.xz+"

Is it possible to extract the capture group defined by that expression and use it in a -printf argument?

So, given a found file called pool/mypackage-1.4.9-1-x86_64.pkg.tar.xz, I would like to include the 1.4.9-1 part in a printf expression.

Is this possible?

starfry

2 Answers

5

If you use

find pool -regextype posix-extended \
    -regex ".*/mypackage-([a-zA-Z0-9.]+-[0-9]{1,2})-x86_64\.pkg\.tar\.xz" \
    -printf '%f\n' |
  grep -Eo '[a-zA-Z0-9.]+-[0-9]{1,2}'

(assuming GNU grep as well), it should work for any path. The -printf '%f\n' prints only the base name of each match, and the character classes in the regex cannot match a newline, so there is no way for something like a directory with a similar name to slip into the output.

l0b0
  • Note that in this particular case, you could replace the grep with a cut -d- -f2,3 to avoid the overhead of another regexp matching. – Stéphane Chazelas Jun 26 '14 at 12:09
  • When I used this, I got newlines embedded in the output because the grep expression matches two parts of the file name. I guess the expression could be fixed, but using cut sorts it. However, cut is brittle: if the file were called mypackage-other, the field numbers would be off. The other answer may be less efficient (not really an issue here) but it does work. – starfry Jun 26 '14 at 13:19
  • Yep, just thought it was nice to reuse the regex exactly – l0b0 Jun 26 '14 at 13:19
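
For what it's worth, the embedded newlines mentioned above appear because grep -o prints every non-overlapping match: on the sample name the pattern matches mypackage-1 first and then .4.9-1. A minimal sketch of one way around that without counting fields, assuming the prefix and suffix are the fixed strings from the question (strip_affixes is a hypothetical helper name, not something from either answer):

```shell
# Hypothetical helper: strip the fixed prefix and suffix from a base name
# using POSIX parameter expansion, leaving the version-release part.
# Assumes the name has already been filtered by find's -regex.
strip_affixes() {
    version=${1#mypackage-}                  # drop the leading "mypackage-"
    version=${version%-x86_64.pkg.tar.xz}    # drop the trailing arch and extension
    printf '%s\n' "$version"
}

strip_affixes 'mypackage-1.4.9-1-x86_64.pkg.tar.xz'   # prints 1.4.9-1
```

In practice this could be fed from the find ... -printf '%f\n' command above through a while IFS= read -r name loop, which again assumes file names without newlines.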
2

An alternative to l0b0's fine answer (shorter, but potentially slightly less efficient):

Assuming a (recent) GNU sed:

find pool -print0 |
  sed -znE 's|.*/mypackage-([[:alnum:].]+-[0-9]{1,2})-x86_64\.pkg\.tar\.xz$|\1|p' |
  tr '\0' '\n'

Note that the expensive part of find is walking the directory tree, which it has to do whether or not you use -regex. So here, both the matching and the reporting are done in sed instead.

  • This works great. I changed the sed field separator to a colon so more complex expressions that I have which include pipes also work. Thanks for taking the time to answer. – starfry Jun 26 '14 at 13:21
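
A rough sketch of the delimiter change described in that comment, in case it helps someone else. The (x86_64|any) alternation, the sample paths and the extract_versions name are assumptions made up for illustration; the character class is spelled out as in the question so that the expression itself contains no colon, and the input is newline-delimited rather than NUL-delimited to keep the example short (so it assumes no newlines in file names):

```shell
# Hypothetical helper: read paths on stdin and print the version-release part.
# ':' is used as the s command delimiter, so '|' is free for alternation.
extract_versions() {
    sed -nE 's:.*/mypackage-([a-zA-Z0-9.]+-[0-9]{1,2})-(x86_64|any)\.pkg\.tar\.xz$:\1:p'
}

# Made-up sample paths; the third one is expected not to match.
printf '%s\n' \
    'pool/mypackage-1.4.9-1-x86_64.pkg.tar.xz' \
    'pool/mypackage-2.0-3-any.pkg.tar.xz' \
    'pool/otherpackage-1.0-1-x86_64.pkg.tar.xz' |
    extract_versions
# prints:
# 1.4.9-1
# 2.0-3
```

With real data the printf would be replaced by find pool -print, or by -print0 together with sed -z and tr as in the answer.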