I am trying to replace all the image source URLs in an HTML file with URLs taken, in order, from a list in a text file.
File1.html
<td class="MetadataRes" width="380px" colspan="2" style="border-top: 1px #336699 solid;">
<a olv_link="/Default/Scripting/ArticleWin.asp?From=Search&Key=Orange/2011/03/27/129/Ad12911.xml&CollName=Orange_APA3&DOCID=2485870&PageLabelPrint=H2&Skin=%4f%72%61%6e%67%65%43%6f%75%6e%74%79%52%65%67%69%73%74%65%72&AW=%31%34%31%32%36%32%38%32%31%34%35%30%32&sPublication=%4f%72%61%6e%67%65&sScopeID=%44%52&SECTION=%43%6c%61%73%73%69%66%69%65%64&sSorting=%53%63%6f%72%65%2c%64%65%73%63&sQuery=%72%65%67%69%73%74%65%72%65%64%20%6e%75%72%73%65%20%3c%4f%52%3e%20%52%4e&rEntityType=&sSearchInAll=%66%61%6c%73%65&sDateFrom=%25%33%30%25%33%35%25%32%66%25%33%30%25%33%31%25%32%66%25%33%32%25%33%30%25%33%31%25%33%30&sDateTo=%25%33%30%25%33%35%25%32%66%25%33%33%25%33%31%25%32%66%25%33%32%25%33%30%25%33%31%25%33%31&dc:creator=&PageLabel=&dc:publisher=&RefineQueryView=&StartFrom=%30" href="javascript:void(0);" onclick="window.top.sys.openArtWin(this.getAttribute('Olv_link'))">
<img src="/Repository/GetImage.dll?baseHref=Orange/2011/03/27&EntityID=Ad12911&imgExtension=">
</a>
</td>...
* See full file here: http://pastebin.com/XbwtZJPa
File2.txt
/getimage.dll?path=Orange/2011/03/27/129/Img/Ad1291103.gif
/getimage.dll?path=Orange/2011/03/20/133/Img/Ad1330402.gif
/getimage.dll?path=Orange/2010/08/29/137/Img/Ad1372408.gif
I want to replace the image URL in the HTML above with the first URL listed in File2.txt (and each subsequent image URL with the next one in the list) to get the following:
Result.html
<td class="MetadataRes" width="380px" colspan="2" style="border-top: 1px #336699 solid;">
<a olv_link="/Default/Scripting/ArticleWin.asp?From=Search&Key=Orange/2011/03/27/129/Ad12911.xml&CollName=Orange_APA3&DOCID=2485870&PageLabelPrint=H2&Skin=%4f%72%61%6e%67%65%43%6f%75%6e%74%79%52%65%67%69%73%74%65%72&AW=%31%34%31%32%36%32%38%32%31%34%35%30%32&sPublication=%4f%72%61%6e%67%65&sScopeID=%44%52&SECTION=%43%6c%61%73%73%69%66%69%65%64&sSorting=%53%63%6f%72%65%2c%64%65%73%63&sQuery=%72%65%67%69%73%74%65%72%65%64%20%6e%75%72%73%65%20%3c%4f%52%3e%20%52%4e&rEntityType=&sSearchInAll=%66%61%6c%73%65&sDateFrom=%25%33%30%25%33%35%25%32%66%25%33%30%25%33%31%25%32%66%25%33%32%25%33%30%25%33%31%25%33%30&sDateTo=%25%33%30%25%33%35%25%32%66%25%33%33%25%33%31%25%32%66%25%33%32%25%33%30%25%33%31%25%33%31&dc:creator=&PageLabel=&dc:publisher=&RefineQueryView=&StartFrom=%30" href="javascript:void(0);" onclick="window.top.sys.openArtWin(this.getAttribute('Olv_link'))">
<img src="/Repository/getimage.dll?path=Orange/2011/03/27/129/Img/Ad1291103.gif">
</a>
</td>...
Is there a recommended shell command to do this? I tried the following sed command (GNU sed, installed as gsed) on my Mac running OS X 10.9, but ran into errors.
$ gsed -e 's/.*SRC="\/Repository\([^"]*\)".*/\1/p{r File1.html' -e 'd}' File2.txt
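For comparison, here is a minimal sketch of how the same replacement might be done with awk instead of sed. It is only an illustration under several assumptions: the function name replace_img_src is made up for this example, each line of the HTML holds at most one <img> tag, the src attribute is lowercase and double-quoted, and the /Repository prefix shown in Result.html is meant to be kept in front of each URL from File2.txt.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): prints the HTML file with the
# Nth <img src="..."> replaced by "/Repository" + the Nth line of the URL file.
# Usage: replace_img_src URL_LIST HTML_FILE
replace_img_src() {
    awk '
        # First file (the URL list): remember each URL by its line number.
        NR == FNR { url[NR] = $0; next }

        # Second file (the HTML): on lines with an <img> tag, locate the
        # src attribute and splice in the next URL from the list.
        # Plain string concatenation is used instead of sub() so that any
        # "&" in a URL is copied literally.
        /<img / && match($0, /src="[^"]*"/) {
            n++
            if (n in url) {
                $0 = substr($0, 1, RSTART - 1) \
                     "src=\"/Repository" url[n] "\"" \
                     substr($0, RSTART + RLENGTH)
            }
        }

        # Print every line, changed or not.
        { print }
    ' "$1" "$2"
}
```

With the files from the question this would be called as replace_img_src File2.txt File1.html > Result.html. Images beyond the end of the URL list are left unchanged. Because this matches lines with a regular expression rather than parsing the HTML, it would need adjusting if an <img> tag spans several lines or if several tags share one line.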
/Repository/([\w-]+(?:(?:.[\w-]+)+))([\w-.,@?^=%&:/~+#]*[\w-@?^=%&/~+#])?
– rbroadus Oct 15 '14 at 10:51