Here are three versions of a perl script to do this. All three require that the first argument is the directory to be searched (e.g. ./app or ./). The remaining arguments are the name(s) of any markdown file(s) to be modified (e.g. ./app/file1.md or ./app/*.md).
They're all written to search only for .png files, but that can easily be changed by editing the regular expressions and globs used.
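As a rough illustration of that kind of change, here is a minimal sketch of a match that accepts several image types, plus a glob that does the same. The extension list and the sample lines are made up for the example; the overall shape of the regular expression follows the one used in the scripts.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical extension list; adjust to taste.
my $ext_re = qr/\.(?:png|jpe?g|gif)/i;

# Same shape as the match used in the scripts, with the extension swapped out.
my $link_re = qr/!\[\[([^]]*$ext_re)\]\]/;

for my $line ('![[img1.png]]', '![[photos/cat.JPEG]]', '![[notes.txt]]') {
    if ($line =~ $link_re) {
        print "match:    $1\n";
    } else {
        print "no match: $line\n";
    }
}

# perl's glob() understands csh-style braces, so one call can cover
# several extensions.
my @images = glob("./*.{png,jpg,jpeg,gif}");
print scalar(@images), " image file(s) in the current directory\n";
```

Note that glob() is case-sensitive, so if you have files like IMG3.PNG you'd need to list those variants too.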
Note: for all three scripts, delete the first #! line if you want the script to modify the markdown file(s) instead of just printing to stdout. The first #! line is for testing, to verify that the script does what you want. The second, with -i.bak, actually modifies the markdown file(s) in place and copies each original to a .bak file (just change it to -i if you don't want a backup copy made). See man perlrun and search for -i for details on how this option works.
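If you'd like to see what -i.bak does without risking real files, here is a small sketch that sets perl's $^I variable (the in-script equivalent of the -i switch) on a throwaway temp file. The file name demo.md and its contents are invented for the demonstration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical throwaway file, created only for this demonstration.
my $tmp  = tempdir(CLEANUP => 1);
my $file = "$tmp/demo.md";
open(my $out, '>', $file) or die "Cannot write $file: $!";
print $out "![[img1.png]]\n";
close($out);

{
    # With $^I set, <> renames each input file to NAME.bak and
    # print writes to a new file called NAME.
    local $^I   = '.bak';
    local @ARGV = ($file);
    while (<>) {
        s{img1\.png}{attachments/img1.png};
        print;
    }
}

# Read a whole file into a string.
sub slurp {
    my ($path) = @_;
    open(my $in, '<', $path) or die "Cannot read $path: $!";
    local $/;
    return <$in>;
}

print "new:  ", slurp($file);
print "orig: ", slurp("$file.bak");
```

The edited text ends up in demo.md and the untouched original in demo.md.bak, which is the same thing the second #! line does for the markdown files.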
Also note that the File::Basename and File::Find modules used are perl core library modules, included with perl. File::Basename does essentially what the basename command does, and File::Find recursively searches directories, like the find command.
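For anyone who hasn't used these two modules, here is a short sketch of each on its own. The directory tree is a hypothetical one built in a temp directory, loosely modelled on the test layout below.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# basename() strips the directory part, like the basename command.
my $bn = basename('attachments/more/img5.png');
print "$bn\n";

# Build a small hypothetical tree to search.
my $top = tempdir(CLEANUP => 1);
make_path("$top/attachments/more");
for my $name ('img4.png', 'attachments/img1.png', 'attachments/more/img5.png') {
    open(my $fh, '>', "$top/$name") or die "Cannot create $top/$name: $!";
    close($fh);
}

# find() calls the given sub once per file and directory, at any depth.
# Inside it, $_ is the bare name and $File::Find::name is the full path.
my @found;
find(sub { push @found, $File::Find::name if -f $_ && /\.png$/ }, $top);

print "$_\n" for sort @found;
```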
Why perl and not sh or bash? Because shell is a terrible language for text or data processing. See Why is using a shell loop to process text considered bad practice? for some of the reasons why. Shell's job is to orchestrate other programs to do data processing work, not to do the data processing itself. Using shell to do data processing is like using a shovel when you need a set of screw-drivers, or a fork when you need a ladle.
All three versions were tested with the following files & directory structure:
app/attachments/img1.png
app/attachments/img2.png
app/attachments/more/img5.png
app/file1.md
app/file2.md
app/file3.md
app/img4.png
app/other/img3.png
First version
The first version is useful if the attachment files will only be found in ./app OR ./app/attachments.
If an attachment is found in the location specified by the ![[filename]] markup, it is left as is. If not found, the script looks for it first in the top-level dir, then in the attachments/ subdirectory.
$ cat fix-paths1.pl
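#!/usr/bin/perl -p
#!/usr/bin/perl -p -i.bak
BEGIN { $dir = shift };          # first argument: the directory to search
use File::Basename;
# Match ![[...png]] and capture the path between the brackets.
if (/!\[\[([^]]*\.png)\]\]/i) {
  $file = $1;
  next if -f "$dir/$file";       # already correct, leave the line as is
  $bn = fileparse($file);        # filename without the directory part
  if (-f "$dir/$bn") {
    s/\Q$file\E/$bn/
  } elsif (-f "$dir/attachments/$bn") {
    s/\Q$file\E/attachments\/$bn/
  } else {
    print STDERR "WARNING: Attachment '$file' does not exist. $ARGV:$.\n"
  };
}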
Sample run - file1.md is the same as in your question:
$ ./fix-paths1.pl ./app/ app/file1.md
Here is an image:
![[attachments/img1.png]]
Here is another image:
![[attachments/img2.png]]
Second version
The second version is useful if files may be found in any immediate sub-directory of ./app/ - i.e. app/attachments/ but not app/attachments/more/.
It uses perl's glob function to build an array of .png files in the specified directory (./app/) and all immediate sub-directories. The array is used as a cache of all matching files because searching directories is a moderately "expensive" operation - definitely something you don't want to do repeatedly in a loop.
$ cat fix-paths2.pl
#!/usr/bin/perl -p
#!/usr/bin/perl -p -i.bak
use File::Basename;
BEGIN {
  $dir = shift;
  $dir =~ s:/+$::;                   # strip trailing slashes
  @png = glob("$dir/*.png");         # top-level directory
  push @png, glob("$dir/*/*.png");   # immediate sub-directories
  s:^\Q$dir\E/:: for @png;           # make the paths relative to $dir
};
if (/!\[\[([^]]*\.png)\]\]/i) {
  $file = $1;
  next if -f "$dir/$file";           # already correct, leave the line as is
  $bn = fileparse($file);
  # first cached path whose filename is exactly $bn
  ($found) = grep { m:(^|/)\Q$bn\E$: } @png;
  if ($found) {
    s/\Q$file\E/$found/;
  } else {
    print STDERR "WARNING: Attachment '$file' does not exist. $ARGV:$.\n"
  };
}
Sample run:
$ cat app/file2.md
Here is an image:
![[img1.png]]
Here is another image:
![[attachments/img2.png]]
and another:
![[img3.png]]
$ ./fix-paths2.pl ./app/ ./app/file2.md
Here is an image:
![[attachments/img1.png]]
Here is another image:
![[attachments/img2.png]]
and another:
![[other/img3.png]]
This version found and corrected the paths for img1.png and img3.png.
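A possible variation on the @png cache, not what the script above does: index the cached paths in a hash keyed on filename, so each lookup is a single hash access instead of a grep over the whole array. The cache contents and the %by_name name below are hypothetical, chosen to mirror the test files.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;

# Hypothetical cache contents, as paths relative to the search directory.
my @png = ('img4.png', 'attachments/img1.png', 'attachments/img2.png', 'other/img3.png');

# Index the cache once: filename => first path seen with that name.
my %by_name;
for my $path (@png) {
    my $bn = basename($path);
    $by_name{$bn} //= $path;
}

# Each lookup is now a single hash access instead of a scan of @png.
for my $wanted ('img1.png', 'img3.png', 'missing.png') {
    my $found = $by_name{$wanted};
    print "$wanted => ", ($found // 'NOT FOUND'), "\n";
}
```

For a handful of attachments the grep is perfectly fine; this only starts to matter with thousands of files or very large markdown files.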
Third version
The third version is useful if attachment files can be found in any sub-directory of ./app/, no matter how many levels deep they are in the directory tree. The only difference between this and the second version is how it populates the @png array. The second version uses the glob() function, while the third uses File::Find.
Caching the search results in the @png array really shows its value here - a recursive directory search is an even more expensive operation than "simple" glob searches.
$ cat fix-paths3.pl
#!/usr/bin/perl -p
#!/usr/bin/perl -p -i.bak
use File::Basename;
use File::Find;
BEGIN {
  $dir = shift;
  $dir =~ s:/+$::;                   # strip trailing slashes
  # Called by find() for every file and directory under $dir.
  sub wanted {
    if (m/\.png$/) {
      # store the path relative to $dir
      ($f = $File::Find::name) =~ s:^\Q$dir\E/::;
      push @png, $f;
    };
  };
  find(\&wanted, $dir);
};
if (/!\[\[([^]]*\.png)\]\]/i) {
  $file = $1;
  next if -f "$dir/$file";           # already correct, leave the line as is
  $bn = fileparse($file);
  # first cached path whose filename is exactly $bn
  ($found) = grep { m:(^|/)\Q$bn\E$: } @png;
  if ($found) {
    s/\Q$file\E/$found/;
  } else {
    print STDERR "WARNING: Attachment '$file' does not exist. $ARGV:$.\n"
  };
}
Sample run:
$ cat app/file3.md
Here is an image:
![[img1.png]]
Here is another image:
![[attachments/img2.png]]
and another:
![[img3.png]]
and another:
![[attachments/img4.png]]
and another:
![[other/img5.png]]
$ ./fix-paths3.pl ./app/ ./app/file3.md
Here is an image:
![[attachments/img1.png]]
Here is another image:
![[attachments/img2.png]]
and another:
![[other/img3.png]]
and another:
![[img4.png]]
and another:
![[attachments/more/img5.png]]
This version also found that img5.png was actually in attachments/more/ even though file3.md said it was in other/.
BUGS
It may be worthwhile stripping leading and trailing spaces from $file, depending on whether or not you have excess spaces in the attachment filename and on how strictly your markdown interpreter deals with excess spaces. Add the following line after the $file = $1; line:
$file =~ s/^\s*|\s*$//g;
If the .png file isn't found where the markdown file says it is, the second and third versions will return the first matching file, even if there's more than one file of the same name (yes, this is more of a design decision than an actual bug - I chose to write it this way). Sometimes this may not be the file you expect it to be - this is a natural result of the GIGO rule.
This could be "fixed" by counting the number of matches and either printing an error message if there's more than one, or having some kind of heuristic to prefer attachment files in some directories over others (or to prefer more recent files over older ones, or older files over newer, or ...). Hint: perl's built-in grep function returns a list, and the scripts above throw away all but the first result. The $found variable could be replaced with an @found array variable. The real fix is to edit the input markdown file to avoid ambiguity.
See perldoc -f grep for details on perl's grep function.
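Here's a minimal sketch of the @found idea, run against a made-up cache that contains a duplicate filename. The matches_for helper and the sample paths are hypothetical; in the real scripts the same test would go where ($found) = grep ... is now.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical cache with a duplicated filename.
my @png = ('attachments/img1.png', 'more-attachments/img1.png', 'other/img3.png');

# Return every cached path whose filename is exactly $bn.
sub matches_for {
    my ($bn, @paths) = @_;
    return grep { m{(?:^|/)\Q$bn\E$} } @paths;
}

for my $bn ('img1.png', 'img3.png', 'img9.png') {
    my @found = matches_for($bn, @png);
    if (@found == 1) {
        print "$bn => $found[0]\n";
    } elsif (@found > 1) {
        # Ambiguous: refuse to guess and list the candidates instead.
        print STDERR "WARNING: '$bn' is ambiguous: @found\n";
    } else {
        print STDERR "WARNING: '$bn' not found\n";
    }
}
```

This version only rewrites a link when there is exactly one candidate. A preference heuristic would replace the elsif branch with whatever rule you settle on.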
/full/path/to/the/image.png or a relative path from your home directory to/the/image.png? – Chris Davies Dec 01 '21 at 12:43
~/app/ so ![[img1.png]] should be changed to ![[attachments/img1.png]] (vs. ![[~/app/attachments/img1.png]]) – Spencer Maroukis Dec 02 '21 at 13:05
![[img1.png]] and img1.png is not in ~/app/ but there are two different files of that name, one in ~/app/attachments/ and another in, say, ~/app/more-attachments/. Are they all .png files? what should happen if file1.md says img3.png but there's only an img3.jpeg? or if there's an IMG3.PNG? – cas Dec 03 '21 at 04:34