13

A friend of mine points out that if you do:

perl -pi.bak -e 's/foo/bar/' somefile

when "somefile" is actually a symlink, perl does just what the docs say it will do:

It does this by renaming the input file, opening the output file by the original name, and selecting that output file as the default for print() statements. The extension, if supplied, is used to modify the name of the old file to make a backup copy [...]

This results in a new symlink "somefile.bak" pointing to the unchanged real file, and a new regular file "somefile" containing the changes.

In many cases, following the symlink would be the desired behavior (even if it leaves the correct location of the .bak file ambiguous). Is there a simple way to do this other than testing for symlinks in a wrapper and handling the case appropriately?

(sed does the same thing, for what that's worth.)

mattdm
  • Call vim or emacs (I think both do follow symlinks)? Seriously, I fear the answer is to reimplement -p -i in your script. – Gilles 'SO- stop being evil' Mar 15 '11 at 21:20
  • Sed does not in fact do the same thing, at least not since version 4.2.1, released in June of 2009. You must include the --follow-symlinks option for sed to edit the linked-to file rather than clobber the symlink; I assume this was done to avoid breaking existing scripts that may depend on the old behavior. – Garrett Sep 07 '14 at 16:49

3 Answers

6

I wonder whether sponge, the small general-purpose utility from moreutils ("soak up standard input and write to a file"), would help in this case, and whether it follows the symlink.

The author describes sponge like this:

It addresses the problem of editing files in-place with Unix tools, namely that if you just redirect output to the file you're trying to edit then the redirection takes effect (clobbering the contents of the file) before the first command in the pipeline gets round to reading from the file. Switches like sed -i and perl -i work around this, but not every command you might want to use in a pipeline has such an option, and you can't use that approach with multiple-command pipelines anyway.

I normally use sponge a bit like this:

sed '...' file | grep '...' | sponge file
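If sponge is not installed, the same idea can be approximated by hand: stage the output in a temporary file, then write it back through a plain redirection, which opens the path and therefore follows the symlink. This is a hand-rolled stand-in, not a description of how sponge is implemented; the file names are illustrative.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is touched.
dir=$(mktemp -d) || exit 1
trap 'rm -rf "$dir"' EXIT
cd "$dir" || exit 1

# A real file and a symlink pointing at it (illustrative names).
printf 'foo\n' > realfile
ln -s realfile somefile

# Stage the edited text first, so the input is read completely
# before the target is opened (and truncated) for writing.
tmp=$(mktemp) || exit 1
sed 's/foo/bar/' somefile > "$tmp" &&
    cat "$tmp" > somefile   # the redirection follows the symlink
rm -f "$tmp"

ls -l
cat realfile
```

Unlike perl -i.bak, this keeps no backup, and the final write is not atomic, so it is a convenience rather than a safe replacement.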
  • Hey cool, that works. I suspect Gilles's comment above is the "real" answer, but since he didn't make it as an answer, and since I learned a new utility, I'll take this one. :) – mattdm Mar 16 '11 at 00:26
  • But have you tested sponge for such a usage in practice? As for me: not yet. Could you please leave a comment stating whether it behaved in a test the way wanted here? Oh, I see the comment. Thanks for the confirmation! – imz -- Ivan Zakharyaschev Mar 16 '11 at 00:28
  • Yes, I tried it before replying. When file in your example above is a symlink, the link is left alone and the real file is changed. – mattdm Mar 16 '11 at 00:38
  • @mattdm: Yes, thanks for the confirmation! (I noticed your words "that works" a bit later than I wrote my comment.) – imz -- Ivan Zakharyaschev Mar 16 '11 at 00:42
3

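If you know for a fact that somefile is a symlink, you can pass the link's target to perl explicitly:

perl -pi.bak -e 's/foo/bar/' "$(readlink somefile)"

That way, the original symlink stays intact, since the replacement is done directly on the file it points to. Note that plain readlink prints the target exactly as stored in the link, so a relative target only resolves correctly when the command is run from the symlink's own directory.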

IDDQD
3

If you're dealing with a single file, the command would look as follows:

perl -i -pe'...' -- "$qfn"

The solution then is the following:

perl -i -pe'...' -- "$( readlink -e -- "$qfn" )"

This handles both symlinks and non-symlinks.


If you're dealing with an arbitrarily large number of files, the command would look as follows:

... | xargs -r perl -i -pe'...' --

The solution then is the following:

... | xargs -r readlink -ze -- | xargs -r0 perl -i -pe'...' --

This handles both symlinks and non-symlinks.


Caveat: readlink is not a standard command, so its presence, its arguments, and its functionality vary by system; be sure to check your documentation. For example, some implementations have -f (which you could also use here) but not -e, and some have neither. busybox's readlink -f behaves like GNU readlink -e. Some systems have a realpath command instead of readlink, which generally behaves like GNU readlink -f.

Beware that with GNU readlink -e, symlinks that can't be resolved to an existing file are silently dropped from the output, so perl never sees (and never edits) them.

ikegami
  • @Stéphane Chazelas, $var:P doesn't work for me. (X='tmp' zsh -c 'perl -E"say for @ARGV" $X:P' outputs tmp:P) – ikegami Nov 04 '18 at 09:59
  • you need zsh 5.3 (2016) or above for $var:P. With older versions, you can always use $var:A instead. See the manual for the differences and https://www.zsh.org/mla/users/2016/msg00593.html for rationale for introducing :P (the real realpath() interface). – Stéphane Chazelas Nov 04 '18 at 10:50
  • (:A was added in 4.3.10 (2009), GNU readlink -z in 2012; I'm not aware of any other readlink implementation that supports -z.) Note that with GNU xargs, you generally want the -r option unless you can guarantee the input won't be empty. Here, it's no big deal (unless the perl code contains a BEGIN/END statement): you'd just get a "-i used with no filenames on the command line, reading from STDIN." warning if that happens. – Stéphane Chazelas Nov 04 '18 at 11:07
  • Oops, I misread -r's documentation. I thought it did the opposite of what it does. Readded. – ikegami Nov 04 '18 at 11:14