97

I'd like to extract a file from a Docker image without having to run the image.

The docker save option is not currently viable for me, as it writes a huge file just so that I can un-tar one specific file from it.

terdon
  • 242,166
BlakBat
  • 1,061

5 Answers

169

You can extract files from an image with the following commands:

container_id=$(docker create "$image")
docker cp "$container_id:$source_path" "$destination_path"
docker rm "$container_id"

According to the docker create documentation, this doesn't run the container:

The docker create command creates a writeable container layer over the specified image and prepares it for running the specified command. The container ID is then printed to STDOUT. This is similar to docker run -d except the container is never started. You can then use the docker start <container_id> command to start the container at any point.
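As a minimal sketch, the three commands above could be wrapped in a small helper so that the temporary container is removed even when the copy fails. The function name extract_from_image is made up for illustration. The trailing "true" is a placeholder command that is never executed; it is only there because, as a comment on this answer notes, docker create can refuse images that define no command.

```shell
# extract_from_image IMAGE SOURCE_PATH DESTINATION_PATH
# Hypothetical helper: copies one path out of an image without starting a container.
extract_from_image() {
    image=$1
    source_path=$2
    destination_path=$3

    # "true" is a dummy command; the container is never started, so it never runs.
    container_id=$(docker create "$image" true) || return 1

    # Copy the file out of the stopped container and remember whether it worked.
    docker cp "$container_id:$source_path" "$destination_path"
    rc=$?

    # Always remove the temporary container, then report the copy's exit status.
    docker rm "$container_id" >/dev/null
    return "$rc"
}
```

Usage would look like: extract_from_image my_image /etc/hostname ./hostname (the image name and paths here are placeholders).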


For reference (my previous answer), a less efficient way of extracting a file from an image is the following:

docker run some_image cat "$file_path" > "$output_path"
bbc
  • 1,871
  • 6
    You might want to over-ride the entrypoint. docker run --entrypoint /bin/sh my_image -c /bin/cat some_file – Andrew Jun 06 '18 at 23:52
  • 4
    This runs the image, which is specifically what I didn't want to do as stated in my question. – BlakBat Mar 26 '19 at 10:49
  • 1
    Ah, that's a good point. I agree my current answer isn't satisfactory then. – bbc Mar 26 '19 at 17:19
  • I was just thinking you might do something like: docker create image, docker cp container:/file_path /output_path but that doesn't seem to actually work unless you also do docker start container before the cp, which likely runs the container. – bbc Mar 26 '19 at 17:37
  • Never mind what I said about docker start, it's probably not needed. I'll update my answer accordingly. – bbc Apr 16 '19 at 11:51
  • 2
    @BlakBat Does this updated answer work for you? I guess I should have created a new answer but it's done now. – bbc Apr 16 '19 at 12:33
  • 3
    @bbc This updated answer does in fact not start a container (the crux of the question), and does not require root. – BlakBat Apr 20 '19 at 08:17
  • Used your answer to extract jvm.options from the elasticsearch-docker. This small script will get the latest version: CONTAINER_ID=$(docker create $(docker image ls | awk '$1 ~ /\/elasticsearch-docker/ { print $2, $3 }' | sort -V | tail -1 | awk '{ print $2 }')) ; docker cp ${CONTAINER_ID}:/usr/share/elasticsearch/config/jvm.options /tmp/jvm.options ; docker rm ${CONTAINER_ID}. – sastorsl Mar 11 '20 at 08:42
  • 3
    I would like to add: if you built an image that does not have a runnable command, the docker create command may fail with the error Error response from daemon: No command specified. In this case, add a dummy command after the image name. It can be dummycommand or any made-up gibberish, since it won't actually be run. – PolyTekPatrick Dec 08 '21 at 05:24
9

None of the above worked for me. The complete working command is:

docker run --rm --entrypoint /bin/sh image_name -c "cat /path/filename" > output_filename

Without the quotes, cat is run without the filename, so it does not know what to show. It is also a good idea to delete the container after the command has finished.
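The quoting point can be checked locally without Docker, since it comes from how sh -c treats its arguments: only the first argument after -c is the script, and the next one becomes $0. The snippet below is just an illustration using a temporary file; the variable names are made up.

```shell
# Create a throwaway file so the result is predictable.
tmp_file=$(mktemp)
echo "from file" > "$tmp_file"

# Quoted: the script is "cat <path>", so cat reads the file.
quoted=$(echo "from stdin" | sh -c "cat $tmp_file")

# Unquoted: the script is just "cat"; the path becomes $0 and cat reads stdin.
unquoted=$(echo "from stdin" | sh -c cat "$tmp_file")

echo "quoted:   $quoted"    # prints "quoted:   from file"
echo "unquoted: $unquoted"  # prints "unquoted: from stdin"

rm -f "$tmp_file"
```

With docker run there is usually no stdin attached, so the unquoted form would typically print nothing at all.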

galoget
  • 349
sekrett
  • 199
  • The command you're referring to will only work depending on the docker and how correctly ENTRYPOINT / CMD was set in the Dockerfile; this has nothing to do with quoting. You also say to delete the container, yet you specify --rm. Lastly, when I had posted my question I specified "without having to run the image" and no answers were a solution taking this into account. – BlakBat Oct 21 '18 at 15:41
  • 1
    Regardless of which CMD and ENTRYPOINT were set in the Dockerfile, I override both, so it will always work (on Linux, of course). What do you mean by "depending on the docker"? Settings, version, env, what? Your question is not correct, because images cannot be executed, only containers can. I think there is no correct answer: you have to deal with many files or create a temporary container. --rm removes the temporary container; other answers leave some junk on your disk. – sekrett Oct 22 '18 at 10:26
  • Can simplify quoting by skipping the shell: docker run --rm --entrypoint /bin/cat image_name /path/filename > output_filename – Beni Cherniavsky-Paskin Jun 25 '20 at 21:21
  • 1
    This answer works, but unfortunately, I came here after figuring out another way, might help someone it's a bit shorter too... docker run --entrypoint /bin/cat image_name /path/filename > filename – lauksas Sep 09 '21 at 17:03
  • @lauksas Yes, it is shorter, but a little harder to read, because cat and /path/filename are in different places even though they work together. Also, without --rm you leave some junk on disk until you prune docker explicitly. – sekrett Oct 20 '22 at 10:26
9

If storing the full output of docker save isn't an option, you could use pipelines to extract just the needed file from it.

Unfortunately, because the output is a "tar of tars", it can be a slightly iterative process.

What I did when I needed to extract a file just now was:

  1. Determine which version of the image most recently changed the file you are interested in (how you do this probably depends on your image), and the date that version was created / saved

  2. Get the full table of contents from the output of the docker save command with:

    docker save IMAGE_NAME | tar -tvf -

  3. Find the layer.tar file(s) in the output of that command that match the date of the image that you determined in step 1. (you can add | grep layer.tar to just show those files)

  4. Extract that layer.tar file to standard out, and get the table of contents of it:

    docker save IMAGE_NAME | tar -xf - -O CHECKSUM_FROM_LIST/layer.tar | tar -tvf -

  5. Verify the file you want is listed, and extract it once you find the name:

    docker save IMAGE_NAME | tar -xf - -O CHECKSUM_FROM_LIST/layer.tar | tar -xf - PATH/TO/YOUR/FILE

If more than one layer.tar file matches the date you are looking for in steps 2 and 3, you may need to repeat step 4 for each of them until you find the right one.

Replace the text in capitals in the commands above with the correct image names, checksums and filenames for your case.
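If checking the layers by hand gets tedious, steps 2 to 4 could be looped over every layer. The following is only a sketch: the function name find_file_in_image is made up, it assumes the saved archive uses the CHECKSUM/layer.tar layout described above (other Docker versions may lay the archive out differently), and it expects the file path exactly as tar lists it, usually without a leading slash. It also re-runs docker save once per layer, so it trades time for disk space.

```shell
# find_file_in_image IMAGE PATH/TO/YOUR/FILE
# Hypothetical helper: prints every layer.tar member that contains the given path.
find_file_in_image() {
    image=$1
    wanted=$2

    # Steps 2-3: list the members of the outer archive, keep only the layer tarballs.
    docker save "$image" | tar -tf - | grep 'layer\.tar$' |
    while read -r layer; do
        # Step 4: stream this one layer to stdout and search its table of contents.
        # </dev/null keeps the inner command from touching the loop's input.
        if docker save "$image" </dev/null | tar -xf - -O "$layer" |
            tar -tf - | grep -qxF "$wanted"; then
            echo "$layer"
        fi
    done
}
```

Each printed name can then be used in place of CHECKSUM_FROM_LIST/layer.tar in step 5. If several layers are printed, the one that is latest in the image's layer order holds the version of the file that the image actually uses.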

0

I believe that, on Ubuntu, Docker containers store their cached files in the following directory:

/var/lib/docker/aufs/diff/<container_id>

From there you should be able to access the file system and retrieve your file(s).

ryekayo
  • 4,763
  • Nope. That directory only contains layersize and json, and is also not user readable (even if user is in docker group). /var/lib/docker/aufs/diff will contain the file that I look for (but is categorized not by container id) and is also not readable. – BlakBat Dec 20 '16 at 16:24
  • Give me a few and I will look it up. I know for a fact there is a way to retrieve the files without entering the container or running it. – ryekayo Dec 20 '16 at 18:18
  • By not readable, how does it show? I have found an example where you can pull text files from the containers by going to the /var/lib/docker/aufs/diff/* directory – ryekayo Dec 20 '16 at 18:49
  • My mistake. User can access /var/lib/docker/aufs (but not all other directories in /var/lib/docker/) – BlakBat Dec 21 '16 at 09:46
  • Can you access as root? – ryekayo Dec 21 '16 at 14:17
  • This approach highly depends on which storage driver (aka "graph driver") your docker is using (run docker info to check). – Beni Cherniavsky-Paskin Jun 25 '20 at 21:20
0

Here is another option which does not even create the container and avoids fiddling with the layers manually, but may act funny if multiple layers contain some of the files you're interested in:

$ docker image save IMAGE_NAME | tar --extract --wildcards --to-stdout '*/layer.tar' | tar --extract --ignore-zeros --verbose FILE_OR_DIRECTORY