
I am trying to pipe a stream from wget to tar and extract it to a specific location.
wget downloads the file, but tar does not extract it as intended:

war="/var/www/html"
domain="example.com"
downloaded_file="https://releases.wikimedia.org/mediawiki/1.33/mediawiki-1.33.0.tar.gz"
wget -P "${war}" "${downloaded_file}" | tar -xzvf ${downloaded_file} --transform="s,^${downloaded_file},${domain},"

Error shown when running with set -x:

tar: unrecognized option:
`--transform=s,^https://releases.wikimedia.org/mediawiki/1.33/mediawiki-1.33.0.tar.gz,example.com,'

Why did piping the stream from wget to tar and extracting it to a specific location fail?

  • You ask tar to extract a file whose name is a URL. You want "$(basename "$downloaded_file")" with tar, or something similar. Also, tar would have definitely given you an error message. – Kusalananda Aug 19 '19 at 15:56

3 Answers


You can combine both commands and skip writing a file by instructing wget to write to its standard output:

wget https://releases.wikimedia.org/mediawiki/1.33/mediawiki-1.33.0.tar.gz -O - |
tar -xzvf -

This will cause tar’s output to be mixed with wget’s progress indicator, because it will start extracting the tarball while wget is still downloading it, so you may well want to adjust the output options.

You can use tar’s -C option to control where the files are extracted:

wget https://releases.wikimedia.org/mediawiki/1.33/mediawiki-1.33.0.tar.gz -O - |
tar -xzvf - -C /var/www/html

The target directory needs to exist before the command is run, so mkdir it if necessary first.
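As a rough offline sketch of the two points above (extracting from standard input and using -C with an existing directory), the following builds a tiny sample archive and extracts it from a pipe. Here cat is only a stand-in for `wget -qO - URL`, and the temporary paths and sample file are hypothetical, not part of the original question:

```shell
#!/bin/sh
# Offline sketch: `cat` plays the role of `wget -qO - "$url"`.
# All paths below live in a throwaway temporary directory (assumption).
set -eu
work=$(mktemp -d)

# Build a small sample tarball with one top-level directory.
mkdir -p "$work/src/mediawiki-1.33.0"
echo "hello" > "$work/src/mediawiki-1.33.0/index.php"
tar -czf "$work/sample.tar.gz" -C "$work/src" mediawiki-1.33.0

# The extraction target must exist before tar runs, so create it first.
target="$work/var/www/html"
mkdir -p "$target"

# Stream the archive into tar; "-f -" reads it from standard input.
cat "$work/sample.tar.gz" | tar -xzf - -C "$target"

ls "$target/mediawiki-1.33.0"
```

With a real download, the cat line would become the wget command shown above; everything after the pipe stays the same.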

Stephen Kitt

You're writing the downloaded data to a file, so you're not actually piping anything to tar. Pipes are only useful if you want the standard output of one program to become the standard input of another. Here, you are downloading a file and then want to open it with another tool, so pipes aren't useful.

The next issue is that your $downloaded_file is actually a URL. So when you tar -xzvf ${downloaded_file} you're actually running tar -xzvf https://releases.wikimedia.org/mediawiki/1.33/mediawiki-1.33.0.tar.gz and that will fail since that file doesn't exist (it's not a file, it's an internet address).

What you want to do is something like this:

war="/var/www/html"
targetUrl="https://releases.wikimedia.org/mediawiki/1.33/mediawiki-1.33.0.tar.gz"
fileName="${targetUrl##*/}"
wget "$targetUrl" -O "$war/$fileName" &&
    tar -xzvf "$war/$fileName" -C "$war"
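The `${targetUrl##*/}` expansion used above is standard shell parameter expansion, not a regex. A small sketch of how it and its suffix-stripping counterpart behave on the same URL (the variable names dirUrl and baseName are just illustrative):

```shell
#!/bin/sh
targetUrl="https://releases.wikimedia.org/mediawiki/1.33/mediawiki-1.33.0.tar.gz"

# ## removes the longest prefix matching the glob */ (everything up to the last slash).
fileName="${targetUrl##*/}"

# % removes the shortest suffix matching the glob /* (the last slash and what follows).
dirUrl="${targetUrl%/*}"

# % with a literal pattern strips a known extension.
baseName="${fileName%.tar.gz}"

printf '%s\n' "$fileName" "$dirUrl" "$baseName"
```

This prints the file name, the URL without the file name, and the file name without its .tar.gz extension, one per line.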

I don't see why the -P option of wget would be relevant here, nor why you would need the --transform from tar, but if you must use it, you can do:

war="/var/www/html"
domain="example.com"
targetUrl="https://releases.wikimedia.org/mediawiki/1.33/mediawiki-1.33.0.tar.gz"
fileName="${targetUrl##*/}"  # file name part of the URL
wget "$targetUrl" -O "$war/$fileName" &&
    tar -xzvf "$war/$fileName" -C "$war" --transform="s,^${targetUrl},${domain},"

I really doubt you do want these though. Why would https://releases.wikimedia.org/mediawiki/1.33/mediawiki-1.33.0.tar.gz be part of the paths in the mediawiki-1.33.0.tar.gz archive?
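To illustrate the point: --transform applies its sed expression to the member names stored in the archive, so a pattern only has an effect if it matches those names. The sketch below assumes GNU tar (--transform is a GNU extension, so other tar implementations may reject it) and assumes the archive has a single top-level directory named mediawiki-1.33.0; the sample archive and temporary paths are made up for the demonstration:

```shell
#!/bin/sh
# Assumes GNU tar. The archive here is a local stand-in, not the real download.
set -eu
work=$(mktemp -d)
mkdir -p "$work/src/mediawiki-1.33.0" "$work/out"
echo "<?php" > "$work/src/mediawiki-1.33.0/index.php"
tar -czf "$work/sample.tar.gz" -C "$work/src" mediawiki-1.33.0

domain="example.com"

# Member names look like mediawiki-1.33.0/index.php, so match on that prefix,
# not on the download URL.
tar -xzf "$work/sample.tar.gz" -C "$work/out" \
    --transform="s,^mediawiki-1.33.0,${domain},"

ls "$work/out/$domain"
```

If the goal was to have the files land in a directory named after the domain, a pattern along these lines is probably closer to what was intended than one based on the URL.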

terdon
  • Thank you; my mistake was not creating two variables as you suggest. The https://releases.wikimedia.org/mediawiki/1.33 part is indeed redundant in most cases, and in others I need only mediawiki-1.33.0.tar.gz. –  Aug 19 '19 at 16:07
  • What's ##*/ in the regex there? A # comment would help, I believe. –  Aug 19 '19 at 16:07
  • @JohnDoea see https://www.tldp.org/LDP/abs/html/string-manipulation.html. The ${var##foo} syntax means "remove the longest match for the glob foo from the front of the variable var". – terdon Aug 19 '19 at 16:08
  • Thank you, I upvoted; it seems I made an even more basic mistake of not first creating, with mkdir, the directory I want to extract to, as Stephen Kitt explained in chat. I was thinking that maybe I should accept no answer and just add a note in the question body that I accept both, because I also had the problem you point out: I was writing the download to a file instead of piping the stream itself. –  Aug 20 '19 at 09:16
  • @JohnDoea I don't mind at all if you accept Stephen's answer! It's a great answer. Don't worry about it. – terdon Aug 20 '19 at 09:18

wget -qO - "https://releases.wikimedia.org/mediawiki/1.33/mediawiki-1.33.0.tar.gz" | tar -xzvf - -C /var/www/html

markgraf