As part of a migration, I need to do some replacements in my bash script.

My .txt files contain references such as:

[[File:My Image.png|400px|thumb|center|My Image]]
[[File:My Image.png|400px|thumb|center]]
[[File:My Image.png|400px|thumb]]
[[File:My Image.png|400px]]
[[File:My Image.png]]

What I need is to replace all of these occurrences with just this line (no size, description, or other information):

[[File:My Image.png]]

So I tried to build a PCRE regex to extract all image names:

/File:(.*\..{3})/g

I then built this command to find all occurrences in my .txt files and extract the image name with that regex:

find . -name "*.txt" | xargs perl -i -p -e 's/File:(.*\..{3})/$1/g'

But I've run into some problems:

  • An error:

xargs: unterminated quote

  • And I don't know how to use the extracted image name to replace each complete occurrence

PS: I'm on macOS and use bash v4.

2 Answers

1

I've written a new regex which matches the whole [[...]] and replaces it with only the parts you want to keep. It assumes that the filenames don't contain pipe | characters or the terminator ]]. I can't reproduce your issue with xargs, but I replaced it with find's -exec option anyway; the following works for me on Linux.

find . -name "*.txt" -exec perl -i -pe 's/(\[\[File:[^|]*).*?(\]\])/$1$2/g' '{}' +
haukex
  • 283
0

Try

find . -name '*.txt' -exec perl -i -pe 's/File:[^|]+\K\|[^]]+//g' {} \;
  • File:[^|]+ match File: followed by non | characters
  • \K so that we don't have to capture the preceding string and put it back in replacement section
  • \|[^]]+ match | followed by non ] characters to be deleted
  • Can also use sed -i '' 's/\(File:[^|]*\)|[^]]*/\1/g' instead of perl (the empty '' after -i is the BSD/macOS sed form; GNU sed takes -i without a separate argument)

Further reading:

Sundeep
  • 12,008
  • 1
    Hi, thank you for your help and your great explanation. It's working as expected and it's very simple. Nice to see some further reading too; I will take a look as soon as possible ;) Have a nice day – Sébastien Robert Apr 04 '18 at 10:53
  • 1
    That regex will also match for example File:x|y (without the preceding [[). – haukex Apr 04 '18 at 11:00
  • @haukex true, and it is a good point :) I was modifying OP's attempt which didn't include the starting [[ or ending ]], so probably they do not make a difference for given input... – Sundeep Apr 04 '18 at 11:12