I'd like to download and extract an archive under a given directory. Here is how I've been doing it so far:

wget http://downloads.mysql.com/source/dbt2-0.37.50.3.tar.gz
tar zxf dbt2-0.37.50.3.tar.gz
mv dbt2-0.37.50.3 dbt2

I'd like instead to download and extract the archive on the fly, without having the tar.gz written to the disk. I think this is possible by piping the output of wget to tar, and giving tar a target, but in practice I don't know how to put the pieces together.

BenMorel

6 Answers

You can do it by telling wget to output its payload to stdout (with the flag -O-) and suppress its own output (with the flag -q):

wget -qO- your_link_here | tar xvz

To specify a target directory:

wget -qO- your_link_here | tar xvz -C /target/directory

If you happen to have GNU tar, you can also rename the output dir:

wget -qO- your_link_here | tar --transform 's/^dbt2-0.37.50.3/dbt2/' -xvz
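To end up with the contents in a directory named dbt2, as in the question, without relying on --transform, one option is to create the directory first and drop the archive's top-level directory with --strip-components=1 (supported by GNU tar and bsdtar). The sketch below runs offline: it builds a small stand-in archive and uses cat where wget -qO- your_link_here would go. The file name README and the scratch directory are illustrative assumptions, not part of the real dbt2 tarball.

```shell
#!/bin/sh
set -eu

# Scratch area for the demo; removed when the script exits.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Build a stand-in for the remote archive (assumption: like the real
# tarball, it contains a single top-level directory).
mkdir -p "$work/src/dbt2-0.37.50.3"
echo "hello" > "$work/src/dbt2-0.37.50.3/README"
tar -czf "$work/archive.tar.gz" -C "$work/src" dbt2-0.37.50.3

# tar -C needs the target directory to exist already.
mkdir -p "$work/dbt2"

# cat stands in for: wget -qO- your_link_here
# --strip-components=1 drops the leading dbt2-0.37.50.3/ from every member.
cat "$work/archive.tar.gz" | tar -xzf - -C "$work/dbt2" --strip-components=1

ls "$work/dbt2"
```

With a real download, only the cat line changes; the mkdir -p step still matters, because tar does not create the -C directory for you.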
Stephen Kitt
Joseph R.

Another option is to use curl which writes to stdout by default:

curl -s -L https://example.com/archive.tar.gz | tar xvzf - -C /tmp

This one-liner does the trick:

tar xvzf - -C /tmp/ < <(wget -q -O - http://foo.com/myfile.tar.gz)

Short explanation: Bash's process substitution operator <(...) runs wget with its output connected to a pipe (-q tells wget to be quiet, -O - writes the download to stdout) and replaces itself with a file name for that pipe, typically /dev/fd/N, or a named pipe on systems without /dev/fd.

The < redirection operator then connects that file to tar's standard input, which tar reads because of f -. Note that f takes the next argument as the archive name, so -C /tmp/ must not come directly after it.
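To see what process substitution hands to the command, the sketch below prints the path that <(...) expands to and then lets tar open such a path as its archive. It assumes bash is installed (process substitution is not available in plain sh, so those parts run through bash -c); cat on a locally built archive stands in for wget -q -O -, and file.txt is an illustrative name.

```shell
#!/bin/sh
set -eu

# Scratch area for the demo; removed when the script exits.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Build a small stand-in archive.
mkdir -p "$work/src" "$work/out"
echo "hello" > "$work/src/file.txt"
tar -czf "$work/archive.tar.gz" -C "$work/src" file.txt

# <(...) expands to a file name; print it to see what tar would receive.
fd_path=$(bash -c 'echo <(true)')
echo "process substitution expanded to: $fd_path"

# tar opens that path like any other archive file.
# cat "$1" stands in for: wget -q -O - URL
bash -c 'tar -xzf <(cat "$1") -C "$2"' _ "$work/archive.tar.gz" "$work/out"

ls "$work/out"
```

The exact path printed depends on the system, which is why the script only displays it rather than relying on a particular value.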

Daniel Serodio
ItsMe

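A solution that passes a process substitution (a named pipe or /dev/fd entry) to tar as the archive file. Mind the order of tar's flags: -f must come last, immediately before the substitution, so that neither -xvz nor -C /tmp/ is taken as the archive name.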

tar -xvz -C /tmp/ -f <(wget -q -O - https://github.com/user/repo/release/download/v/v.tar.gz)

A one-liner that handles redirects and extracts tar.bz2 files. Use tar xz instead of tar xj to extract gzip files.

curl -L https://downloads.getmonero.org/cli/linux64 | tar xj
Elijah

The extraction part should take its input from stdin. For that, tar needs - as the argument to -f: tar -xzvf - -C <output_dir>

Example:


# This does not work: -f takes the next argument as the archive name,
# so tar tries to open a file called -C and complains:
# tar (child): -C: Cannot open: No such file or directory
wget -qO - https://dlcdn.apache.org/spark/spark-3.3.0/spark-3.3.0-bin-hadoop3-scala2.13.tgz | tar -xzvf -C /opt/spark --strip-components 1

This should work, provided /opt/spark already exists:

wget -qO - https://dlcdn.apache.org/spark/spark-3.3.0/spark-3.3.0-bin-hadoop3-scala2.13.tgz | tar -xzvf - -C /opt/spark --strip-components 1
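The difference between the two forms can be checked without a network connection. In this sketch a locally built archive piped through cat stands in for the wget download; pkg-1.0 and file.txt are made-up names, and the script assumes no file called -C exists in the current directory.

```shell
#!/bin/sh
set -eu

# Scratch area for the demo; removed when the script exits.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in archive with one top-level directory.
mkdir -p "$work/src/pkg-1.0" "$work/out"
echo "data" > "$work/src/pkg-1.0/file.txt"
tar -czf "$work/archive.tar.gz" -C "$work/src" pkg-1.0

# Without "-": -f swallows -C as the archive name, so tar cannot open it.
if { cat "$work/archive.tar.gz" | tar -xzf -C "$work/out" --strip-components 1; } 2>/dev/null; then
    broken=succeeded
else
    broken=failed
fi
echo "without '-': $broken"

# With "-": tar reads the archive from stdin and -C is parsed as an option.
cat "$work/archive.tar.gz" | tar -xzf - -C "$work/out" --strip-components 1

ls "$work/out"
```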

  • How would one go about this using the wget -N flag? So only do this if the downloaded file has changed? I would imagine I would need to save the existing file for that? – fred Feb 13 '23 at 21:43