I have a bash script that scrapes a list of URLs for links to various kinds of documents. At the end, the script uses wget to download the files. However, I'm having trouble with filenames that contain whitespace: wget ends the URL at the space. Is there some way to use sed or something to change the whitespace to %20 here? Or some other solution?
This is my code:
# Collect document links from every page listed in download.md
for url in $(cat download.md)
do
    lynx --listonly --dump $url | \
    awk '/\.(pdf|doc|docx|odt)$/{ sub("^[ ]*[0-9]+.[ ]*","",$0); print}'
done > ~/links.txt

# Download every collected link
for i in $( cat ~/links.txt ); do wget $i; done
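What I have in mind is something along these lines, run on the link list before the download loop. This is an untested sketch; ~/links-encoded.txt is just an illustrative name, and I'm assuming plain spaces are the only characters causing trouble:

# Untested idea: rewrite spaces as %20 so wget sees one whole URL per item
sed 's/ /%20/g' ~/links.txt > ~/links-encoded.txt
for i in $( cat ~/links-encoded.txt ); do wget "$i"; done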
Comments:

wget "${i// /%20}" ? – DopeGhoti Aug 29 '18 at 18:52

Besides losing the for i in $( cat ... ) habit, I'd also recommend reading: https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice/169765#169765 – Jeff Schaller Aug 29 '18 at 19:06
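Putting the two comment suggestions together (reading the link list line by line instead of for i in $( cat ... ), and using bash's ${i// /%20} expansion to encode the spaces), the download step could look roughly like this. It is only a sketch, assuming ~/links.txt holds one URL per line and that the server accepts %20-encoded spaces:

# Sketch only: read each link whole, encode the spaces, then download
while IFS= read -r url; do
    wget "${url// /%20}"
done < ~/links.txt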