249

How can we extract specific files from a large tar.gz file? I found a way to extract files from a tar archive in this question, but when I tried the command mentioned there, I got this error:

$ tar --extract --file={test.tar.gz} {extract11}
tar: {test.tar.gz}: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now

How do I then extract a file from tar.gz?

  • 1
    Note that any solution must first decompress the entire archive: https://superuser.com/questions/655739/extract-single-file-from-huge-tgz-file If you want to avoid that, you will need to use another format, such as zip. ZIP gives less compression, however, as it compresses each file separately: https://superuser.com/questions/1013309/why-is-zip-able-to-compress-single-file-smaller-than-multiple-files-with-the-sam I wonder if there is a format that supports both speed and the ability to extract a single file. – Ciro Santilli OurBigBook.com May 07 '19 at 16:22
  • 1
    @CiroSantilliOurBigBook.com you could use the parallel version of bzip2 and gzip to speed things up: https://askubuntu.com/questions/62607/whats-the-best-way-to-use-parallel-bzip2-and-gzip-by-default – zs11 Feb 17 '24 at 21:11

6 Answers

279

You can also use tar -zxvf <tar filename> <file you want to extract>

You must write the file name exactly as tar ztf test.tar.gz shows it. If it says e.g. ./extract11, or some/bunch/of/dirs/extract11, that's what you have to give (the file will show up under exactly that name, and any needed directories are created automatically).

  • -x: instructs tar to extract files.
  • -f: specifies filename / tarball name.
  • -v: Verbose (show progress while extracting files).
  • -z: filter archive through gzip, use to decompress .gz files.
  • -t: List the contents of an archive
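
As a minimal sketch of the list-then-extract workflow described above: the directory tar_demo/, the archive test.tar.gz inside it, and the member some/dirs/extract11 are hypothetical names created only for illustration. It assumes a tar that supports -z and -C (GNU tar and bsdtar both do).

```shell
# Build a small throwaway archive to experiment with (all names are hypothetical).
rm -rf tar_demo
mkdir -p tar_demo/src/some/dirs tar_demo/out
echo "hello" > tar_demo/src/some/dirs/extract11
echo "other" > tar_demo/src/some/dirs/unrelated
tar -czf tar_demo/test.tar.gz -C tar_demo/src some

# List the member names exactly as tar stores them.
tar -tzf tar_demo/test.tar.gz

# Extract a single member, spelled exactly as listed, into tar_demo/out.
tar -xzvf tar_demo/test.tar.gz -C tar_demo/out some/dirs/extract11
```

Only some/dirs/extract11 should appear under tar_demo/out; the other member stays in the archive.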
mgutt
  • 467
  • $ tar -zxvf test.tar.gz extract11 tar: extract11: Not found in archive tar: Error exit delayed from previous errors – Ankit Vashistha Jan 16 '13 at 07:52
  • 1
    Then try this: tar tf archive.tar.gz | grep extract11. Please check whether extract11 is in the archive or not. – harish.venkat Jan 16 '13 at 08:06
  • I would like to add a point here, the -z option doesn't work in some versions of unix like HP-UX. – Ankit Vashistha Aug 26 '15 at 06:54
  • The "f" parameter must come last. https://superuser.com/a/150782/78899 Otherwise you get an error like "tar (child): v: Cannot open: No such file or directory" – PJ Brunet Jul 06 '18 at 21:36
  • 2
    @pj-brunet: No, f must not be at the end. But it must be directly followed by the filename. This example with all the other options after the tar name works fine: tar -f mytar.tar.gz -zxv dir/somefile. It extracts "dir/somefile" from "mytar.tar.gz". – mivk Jan 13 '19 at 22:06
  • 6
    Thanks a lot, that was great. For viewing the filenames, I used vim <tar filename>, and for copying the file into a specific directory, I had to use -C before the filename, that is: tar -zxvf <tar filename> -C <your custom dir> <file you want to extract>. Hope this is helpful to somebody out there – Hossein Nov 10 '20 at 12:24
74

Let's assume you have a tarball called lotsofdata.tar.gz and you just know there is one file in there you want but all you can remember is that its name contains the word contract. You have two options:

Either use tar and grep to list the contents of your tarball, find the full path and name of any files that match the part you know, and then use tar to extract that one file by its exact name; or use two little-known switches to extract all files that match what little you do know of the file name. The second option does not require the full name or any part of the path. The details are:

Option 1

$ tar -tzf lotsofdata.tar.gz | grep contract

This will list the details of all files whose names contain your known part. Then you extract what you want using:

$ tar -xzf lotsofdata.tar.gz <full path and filename from your list above>

If the listing shows a leading ./ on the path, you need to include it for the extraction to work.

Option 2

$ tar -xzf lotsofdata.tar.gz --wildcards --no-anchored '*contract*'

Up to you which you find easier or most useful.
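
A rough sketch of Option 2 on a throwaway archive, assuming GNU tar (--wildcards and --no-anchored are GNU extensions, and BSD tar may reject them). The directory wild_demo/ and the file names inside it are made up for the example.

```shell
# Create a hypothetical archive with one matching and one non-matching file.
rm -rf wild_demo
mkdir -p wild_demo/src/docs/2020 wild_demo/out
echo "a" > wild_demo/src/docs/2020/contract_final.txt
echo "b" > wild_demo/src/docs/2020/notes.txt
tar -czf wild_demo/lotsofdata.tar.gz -C wild_demo/src docs

# GNU tar only: match the pattern against member names without
# anchoring it to the start of the path.
tar -xzf wild_demo/lotsofdata.tar.gz -C wild_demo/out \
    --wildcards --no-anchored '*contract*'

# Show what was extracted.
find wild_demo/out -type f
```

Quoting the pattern matters: it keeps the shell from expanding *contract* against files in the current directory before tar sees it.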

HalosGhost
  • 4,790
  • tar -xzf lotsofdata.tar.gz <full path and filename from your list above> worked for me without giving ./ at the start of the full path – user13107 Apr 06 '17 at 07:26
  • You might have to be patient with these, took a while... Also maybe it's obvious, but it's going to extract the full directory structure in the current directory, even if those directories don't exist yet. – PJ Brunet May 28 '18 at 03:20
  • 1
    BSD tar (at least on macOS) doesn't support the --wildcards option. I was able to still get similar behavior with -x -O *foo. *foo was the filename glob in my case, since I knew the file ended with foo. – Eric Hu Jul 31 '18 at 14:16
23

I was trying to extract a couple hundred files from a tarball with thousands of files the other day. The files I need cannot be referenced by a single wildcard. So I googled and found this page.

However, none of the tricks above seemed good for my task. I ended up reading the man page and found the --files-from option, so my final solution is

gunzip < thousands.tar.gz | tar -x -v --files-from hundreds.list -f -

and it works like a charm.

Update: The list file should contain one member name per line, spelled exactly as tar -tf prints it (the plain listing, not the verbose -tvf output with permissions and dates); otherwise you will not be able to extract any files.
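
To illustrate the list-file format, here is a small sketch on a throwaway archive. The directory list_demo/ and the file names f1.txt to f4.txt are hypothetical; thousands.tar.gz and hundreds.list simply reuse the names from the command above.

```shell
# Build a hypothetical archive with four members.
rm -rf list_demo
mkdir -p list_demo/src/data list_demo/out
for n in 1 2 3 4; do
    echo "file $n" > "list_demo/src/data/f$n.txt"
done
tar -czf list_demo/thousands.tar.gz -C list_demo/src data

# The list file: one member name per line, as printed by `tar -tzf`.
printf '%s\n' data/f1.txt data/f3.txt > list_demo/hundreds.list

# Decompress once and extract only the listed members into list_demo/out.
(
    cd list_demo/out || exit 1
    gunzip < ../thousands.tar.gz | tar -x -v --files-from ../hundreds.list -f -
)
```

A convenient way to build such a list for a real archive is to save the output of tar -tzf to a file and delete the lines you do not want.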

lgeorget
  • 13,914
JJ Tang
  • 331
10

Here are examples of extracting specific files from a tar.gz file.

From local file:

$ tar xvf file.tgz path/README.txt 2nd_file.txt

From remote URL:

$ curl -s http://example.com/file.tgz | tar xzvf - path/README.txt 2nd_file.txt
kenorb
  • 20,988
4

Your example works for me if you omit the braces

$ tar --extract --file=test.tar.gz extract11

If your file extract11 is in a subfolder, you should specify the path within the tarball.

$ tar --extract --file=test.tar.gz subfolder/extract11
Bernhard
  • 12,272
  • $ tar --extract --file=test.tar.gz extract11 tar: extract11: Not found in archive tar: Error exit delayed from previous errors – Ankit Vashistha Jan 16 '13 at 07:51
  • Then obviously extract11 is not in the tar file. Note that you need to provide the relative path. See edit – Bernhard Jan 16 '13 at 07:59
1

To extract only files matching a certain pattern:

for i in $(tar ztf test.tar.gz | grep 2021-01); do tar -xzvf test.tar.gz "$i"; done

For multiple patterns:

for i in $(tar ztf test.tar.gz | grep -E '2021-01|2021-02|2021-03'); do tar -xzvf test.tar.gz "$i"; done
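
The loops above decompress the archive once per matching file, which can be slow on a large tarball. A possible alternative, sketched here on a hypothetical archive (loop_demo/ and the app-2021-*.log names are invented for the example), is to pass all the filtered names to a single extraction with -T -, which reads member names from standard input.

```shell
# Build a hypothetical archive with one log file per month.
rm -rf loop_demo
mkdir -p loop_demo/src loop_demo/out
for m in 01 02 03 04; do
    echo "log $m" > "loop_demo/src/app-2021-$m.log"
done
tar -czf loop_demo/test.tar.gz -C loop_demo/src .

# List once, filter the names, then extract all matches in one pass.
(
    cd loop_demo/out || exit 1
    tar -tzf ../test.tar.gz \
        | grep -E '2021-01|2021-02' \
        | tar -xzvf ../test.tar.gz -T -
)
```

Because the names arrive one per line rather than through shell word splitting, this also copes with member names that contain spaces.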