68

I have a minimal headless *nix which does not have any command line utilities for downloading files (e.g. no curl, wget, etc). I only have bash.

How can I download a file?

Ideally, I would like a solution that would work across a wide range of *nix.

Chris Snow
  • 4,106

8 Answers

79

If you have bash 2.04 or above with the /dev/tcp pseudo-device enabled, you can download a file from bash itself.

Paste the following code directly into a bash shell (you don't need to save the code into a file for executing):

function __wget() {
    : ${DEBUG:=0}
    local URL=$1
    local tag="Connection: close"
    local mark=0

    if [ -z "${URL}" ]; then
        printf "Usage: %s \"URL\" [e.g.: %s http://www.google.com/]" \
               "${FUNCNAME[0]}" "${FUNCNAME[0]}"
        return 1;
    fi
    read proto server path <<<$(echo ${URL//// })
    DOC=/${path// //}
    HOST=${server//:*}
    PORT=${server//*:}
    [[ x"${HOST}" == x"${PORT}" ]] && PORT=80
    [[ $DEBUG -eq 1 ]] && echo "HOST=$HOST"
    [[ $DEBUG -eq 1 ]] && echo "PORT=$PORT"
    [[ $DEBUG -eq 1 ]] && echo "DOC =$DOC"

    exec 3<>/dev/tcp/${HOST}/$PORT
    echo -en "GET ${DOC} HTTP/1.1\r\nHost: ${HOST}\r\n${tag}\r\n\r\n" >&3
    while read line; do
        [[ $mark -eq 1 ]] && echo $line
        if [[ "${line}" =~ "${tag}" ]]; then
            mark=1
        fi
    done <&3
    exec 3>&-
}

Then you can execute it from the shell as follows:

__wget http://example.iana.org/

Source: Moreaki's answer to "upgrading and installing packages through the cygwin command line?"

Update: as mentioned in the comments, the approach outlined above is simplistic:

  • read trashes backslashes and leading whitespace.
  • Bash can't deal with NUL bytes very nicely, so binary files are out.
  • unquoted $line will glob.
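The first caveat can be avoided without any external tools. The following is a sketch only (parse_url is a hypothetical helper name, not part of the answer above) that splits the URL with prefix/suffix parameter expansions instead of read, so backslashes and whitespace survive intact:

```shell
# Sketch of a safer URL parser using only parameter expansion.
parse_url() {
    local rest=${1#http://}            # drop the scheme
    local hostport=${rest%%/*}         # everything up to the first slash
    DOC=${rest#"$hostport"}            # the path, leading slash included
    [ -z "$DOC" ] && DOC=/             # bare host: request the root
    HOST=${hostport%%:*}               # host part before any colon
    PORT=${hostport##*:}               # port part after the colon
    [ "$HOST" = "$PORT" ] && PORT=80   # no explicit port: default to 80
}

parse_url http://example.com:8080/some/file
printf 'HOST=%s PORT=%s DOC=%s\n' "$HOST" "$PORT" "$DOC"
# → HOST=example.com PORT=8080 DOC=/some/file
```

It only understands http:// URLs, matching the scope of the function above.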
Chris Snow
  • 4,106
23

Use lynx.

It is pretty commonly available on Unix/Linux systems.

lynx -dump http://www.google.com

-dump: dump the first file to stdout and exit

man lynx

Or netcat:

/usr/bin/printf 'GET / \n' | nc www.google.com 80

Or telnet:

(echo 'GET /'; echo ""; sleep 1; ) | telnet www.google.com 80
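Raw nc or telnet output includes the response status line and headers before the body. One portable way to cut them off, shown here on a canned response so no network is needed (strip_headers is just an illustrative name), is to delete every line up to and including the first blank one:

```shell
# Delete the HTTP response headers: everything up to and including the
# first empty line (which may be a lone carriage return).
strip_headers() { sed '1,/^\r\{0,1\}$/d'; }

printf 'HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\nbody here\n' \
    | strip_headers
# → body here
```

Since sed is line-oriented, this is only reliable for text responses, not binary files.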
woodstack
  • 462
  • 2
  • 4
  • 5
    The OP has "*nix which does not have any command line utilities for downloading files", so no lynx for sure. – Celada Jul 25 '14 at 14:06
  • 4
    Note lynx -source is closer to wget – Zombo Dec 20 '14 at 08:34
  • Hey, so this is a really late comment but how do you save the output of the telnet command to a file? Redirecting with ">" outputs both the file's contents and telnet output such as "Trying 93.184.216.34... Connected to www.example.com.". I'm in a situation where I can only use telnet, I'm trying to make a chroot jail with the least frameworks possible. –  Sep 11 '18 at 09:39
19

Adapted from Chris Snow's answer.  This can also handle binary files.

function __curl() {
  read -r proto server path <<<"$(printf '%s' "${1//// }")"
  if [ "$proto" != "http:" ]; then
    printf >&2 "sorry, %s supports only http\n" "${FUNCNAME[0]}"
    return 1
  fi
  DOC=/${path// //}
  HOST=${server//:*}
  PORT=${server//*:}
  [ "${HOST}" = "${PORT}" ] && PORT=80

  exec 3<>"/dev/tcp/${HOST}/$PORT"
  printf 'GET %s HTTP/1.0\r\nHost: %s\r\n\r\n' "${DOC}" "${HOST}" >&3
  (while read -r line; do
    [ "$line" = $'\r' ] && break
  done && cat) <&3
  exec 3>&-
}

  • I use break && cat to get out of the read loop once the header ends; cat then passes the body through unmodified.
  • I use HTTP/1.0 so there's no need to wait for/send a Connection: close header.

You can test binary files like this:

$ __curl http://www.google.com/favicon.ico >   mine.ico
$ curl   http://www.google.com/favicon.ico > theirs.ico
$ md5sum mine.ico theirs.ico
f3418a443e7d841097c714d69ec4bcb8  mine.ico
f3418a443e7d841097c714d69ec4bcb8  theirs.ico
131
  • 291
  • This won't handle binary transfer files—it will fail on null bytes. – Wildcard Feb 02 '18 at 02:40
  • @Wildcard, i do not understand , i've edited with a binary file transfer example (containing null bytes), can you point me what i'm missing ? – 131 Feb 02 '18 at 07:58
  • 2
    @Wildcard, heheh, yeah that looks like it should work, since it reads the actual file data with cat. I'm not sure if that's cheating (since it's not purely the shell), or a nice solution (since cat is a standard tool, after all). But @131, you might want to add a note about why it works better than the other solutions here. – ilkkachu Feb 02 '18 at 08:54
  • @Wildcard, I added the pure bash solution too as an answer below. And yes, cheating or not, this is a valid solution and worth an upvote :) – ilkkachu Feb 02 '18 at 10:41
17

Taking the "just Bash and nothing else" strictly, here's one adaptation of earlier answers (@Chris's, @131's) that does not call any external utilities (not even standard ones) but also works with binary files:

#!/bin/bash
download() {
  read proto server path <<< "${1//"/"/ }"
  DOC=/${path// //}
  HOST=${server//:*}
  PORT=${server//*:}
  [[ x"${HOST}" == x"${PORT}" ]] && PORT=80

  exec 3<>/dev/tcp/${HOST}/$PORT

  # send request
  echo -en "GET ${DOC} HTTP/1.0\r\nHost: ${HOST}\r\n\r\n" >&3

  # read the header; it ends in an empty line (just CRLF)
  while IFS= read -r line ; do 
      [[ "$line" == $'\r' ]] && break
  done <&3

  # read the data
  nul='\0'
  while IFS= read -d '' -r x || { nul=""; [ -n "$x" ]; }; do 
      printf "%s$nul" "$x"
  done <&3
  exec 3>&-
}

Use with download http://path/to/file > file.

We deal with NUL bytes with read -d ''. It reads until a NUL byte, and returns true if it found one, false if it didn't. Bash can't handle NUL bytes in strings, so when read returns with true, we add the NUL byte manually when printing, and when it returns false, we know there are no NUL bytes any more, and this should be the last piece of data.
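The loop's NUL handling can be exercised locally, without any network; in this sketch, copy_with_nuls is a hypothetical name wrapping just the data-reading loop from the function above:

```shell
# Copy stdin to stdout through the read -d '' / printf loop; NUL bytes
# inside the stream are re-emitted, and trailing data without a NUL
# terminator is printed as-is.
copy_with_nuls() {
    local x nul='\0'
    while IFS= read -d '' -r x || { nul=""; [ -n "$x" ]; }; do
        printf "%s$nul" "$x"
    done
}

# Show the copied bytes: a NUL-separated 'a', 'b', 'c' survive intact.
printf 'a\0b\0c' | copy_with_nuls | od -An -c
```

(od is used here only to display the bytes; the copy itself is pure bash.)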

Tested with Bash 4.4 on files with NULs in the middle, and ending in zero, one or two NULs, and also with the wget and curl binaries from Debian. The 373 kB wget binary took about 5.7 seconds to download. A speed of about 65 kB/s or a bit more than 512 kb/s.

In comparison, @131's cat-solution finishes in less than 0.1 s, or almost a hundred times faster. Not very surprising, really.

This is obviously silly, since without using external utilities, there's not much we can do with the downloaded file, not even make it executable.

ilkkachu
  • 138,973
  • Isn't echo a standalone -non shell- binary ? (:p) – 131 Feb 02 '18 at 10:56
  • 2
    @131, no! Bash has echo and printf as builtins (it needs a builtin printf to implement printf -v) – ilkkachu Feb 02 '18 at 11:00
  • Worked great for me. Used it to download a statically compiled curl to do the further requests with that. – stempler Jan 16 '23 at 13:00
  • @stempler, thanks for reminding me I've created this monster. You really should have used one of the other ones. :P – ilkkachu Jan 16 '23 at 13:49
  • @DanielLe, re. the edit, I'm not exactly sure why you'd like to remove the quotes from the parameter expansion there and the linked post also doesn't state any reason. I've tested this in Bash, and it works with the quotes (and also without). In other shells, the results vary, but the "${a//// }" variant only works in Bash and Busybox, while "${a//\// }" works in all shells I tried. The one with quotes doesn't work in zsh or Busybox though, but it doesn't matter that much since the whole exercise only works in Bash anyway. – ilkkachu Jun 26 '23 at 10:07
  • (A similar function could likely be built for zsh, and more easily since it does support NULs in variables directly unlike Bash and most other shells.) – ilkkachu Jun 26 '23 at 10:08
  • now, if we were to change that, it'd likely be better to just replace it with IFS=/ read ... <<< "$1" as that wouldn't require substituting slashes back for DOC on the very next line... – ilkkachu Jun 26 '23 at 10:09
  • @ikkachu I suggested edits to remove the quotes because I tested this answer in Bash and the current version didn't work: https://pastebin.com/CQw1PUnh – Daniel Le Jun 29 '23 at 03:28
  • @DanielLe, hmm, I can't replicate that with Bash 5.x. I just pulled 5.2.15(1) from the source tarball and tested it. The only difference I can see is that with Bash 3.2 (which Macs have by default), the version with quotes indeed doesn't work. See what you get if you put echo $BASH_VERSION inside the function? – ilkkachu Jun 29 '23 at 07:36
  • @ikkachu $ echo $BASH_VERSION 3.2.57(1)-release. You're right, the current version works in Bash v5.x but not in Bash v3.2.x, which is the default in Ventura v13.2 :) I had Bash v5.x installed via Homebrew recently, but the shell I ran in https://pastebin.com/CQw1PUnh was v3.2.x :) – Daniel Le Jun 29 '23 at 09:46
9

Use uploading instead, via SSH from your local machine

A "minimal headless *nix" box means you probably SSH into it. So you can also use SSH to upload to it, which is functionally equivalent to downloading (of software packages etc.), except of course when you want a download command to include in a script on your headless server.

As shown in this answer, you would execute the following on your local machine to place a file on your remote headless server:

wget -O - http://example.com/file.zip | ssh user@host 'cat >/path/to/file.zip'

Faster uploading via SSH from a third machine

The disadvantage of the above solution compared to downloading is lower transfer speed, since the connection with your local machine usually has much less bandwidth than the connection between your headless server and other servers.

To solve that, you can of course execute the above command on another server with decent bandwidth. To make that more comfortable (avoiding a manual login on the third machine), here is a command to execute on your local machine.

To be secure, copy & paste that command including the leading space character ' '. See the explanations below for the reason.

 ssh user@intermediate-host "sshpass -f <(printf '%s\n' yourpassword) \
   ssh -T -e none \
     -o StrictHostKeyChecking=no \
     < <(wget -O - http://example.com/input-file.zip) \
     user@target-host \
     'cat >/path/to/output-file.zip' \
"

Explanations:

  • The command will ssh to your third machine intermediate-host, start downloading a file to there via wget, and start uploading it to target-host via SSH. Downloading and uploading use the bandwidth of your intermediate-host and happen at the same time (due to Bash pipe equivalents), so progress will be fast.

  • When using this, you have to replace the two server logins (user@*-host), the target host password (yourpassword), the download URL (http://example.com/…) and the output path on your target host (/path/to/output-file.zip) with appropriate own values.

  • For the -T -e none SSH options when using it to transfer files, see these detailed explanations.

  • This command is meant for cases where you can't use SSH's public key authentication mechanism – it still happens with some shared hosting providers, notably Host Europe. To still automate the process, we rely on sshpass to be able to supply the password in the command. It requires sshpass to be installed on your intermediate host (sudo apt-get install sshpass under Ubuntu).

  • We try to use sshpass in a secure way, but it will still not be as secure as the SSH pubkey mechanism (says man sshpass). In particular, we supply the SSH password not as a command line argument but via a file, which is replaced by bash process substitution to make sure it never exists on disk. The printf is a bash built-in, making sure this part of the code does not pop up as a separate command in ps output as that would expose the password [source]. I think that this use of sshpass is just as secure as the sshpass -d<file-descriptor> variant recommended in man sshpass, because bash maps it internally to such a /dev/fd/* file descriptor anyway. And that without using a temp file [source]. But no guarantees, maybe I overlooked something.

  • Again to make the sshpass usage safe, we need to prevent the command from being recorded to the bash history on your local machine. For that, the whole command is prepended with one space character, which has this effect.

  • The -o StrictHostKeyChecking=no part prevents the command from failing in case it never connected to the target host. (Normally, SSH would then wait for user input to confirm the connection attempt. We make it proceed anyway, to not have an indefinitely hanging command on the intermediate host.)

  • sshpass expects a ssh or scp command as its last argument. So we have to rewrite the typical wget -O - … | ssh … command into a form without a bash pipe, as explained here.
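That rewrite can be illustrated with purely local commands (no SSH or sshpass involved): the familiar pipe form and the input-redirect form with process substitution feed the consumer the same bytes.

```shell
# Pipe form: the consumer sits at the end of a pipeline.
printf 'hello\n' | cat

# Redirect form: the consumer command comes last (as sshpass requires),
# and the producer is attached via process substitution.
cat < <(printf 'hello\n')
# both print: hello
```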

tanius
  • 874
7

Based on Chris Snow's script.  I made some improvements:

  • http scheme check (it only supports http)
  • http response validation (checks the response status line, and splits header and body at the empty '\r\n' line rather than at 'Connection: close', which is not always present)
  • fails on non-200 status codes (important when downloading files from the internet)

Here is the code:

function __wget() {
    : "${DEBUG:=0}"
    local URL=$1
    local tag="Connection: close"
    if [ -z "${URL}" ]; then
        printf "Usage: %s \"URL\" [e.g., %s http://www.google.com/]" \
               "${FUNCNAME[0]}" "${FUNCNAME[0]}"
        return 1;
    fi
    read -r proto server path <<< "$(printf '%s' "${URL//// }")"
    local SCHEME=${proto//:*}
    local PATH=/${path// //}
    local HOST=${server//:*}
    local PORT=${server//*:}
    if [[ "$SCHEME" != "http" ]]; then
        printf "sorry, %s only supports http\n" "${FUNCNAME[0]}"
        return 1
    fi
    [[ "${HOST}" == "${PORT}" ]] && PORT=80
    [[ "$DEBUG" -eq 1 ]] && echo "SCHEME=$SCHEME" >&2
    [[ "$DEBUG" -eq 1 ]] && echo "HOST=$HOST"     >&2
    [[ "$DEBUG" -eq 1 ]] && echo "PORT=$PORT"     >&2
    [[ "$DEBUG" -eq 1 ]] && echo "PATH=$PATH"     >&2

    if ! exec 3<>"/dev/tcp/${HOST}/$PORT"; then
        return "$?"
    fi

    if ! echo -en "GET ${PATH} HTTP/1.1\r\nHost: ${HOST}\r\n${tag}\r\n\r\n" >&3 ; then
        return "$?"
    fi
    # 0: at begin, before reading http response
    # 1: reading header
    # 2: reading body
    local state=0
    local num=0
    local code=0
    while read -r line; do
        num=$((num + 1))
        # check http code
        if [ "$state" -eq 0 ]; then
            if [ "$num" -eq 1 ]; then
                if [[ $line =~ ^HTTP/1\.[01][[:space:]]([0-9]{3}).*$ ]]; then
                    code="${BASH_REMATCH[1]}"
                    if [[ "$code" != "200" ]]; then
                        printf "failed to wget '%s', code is not 200 (%s)\n" \
                               "$URL" "$code"
                        exec 3>&-
                        return 1
                    fi
                    state=1
                else
                    printf "invalid http response from '%s'" "$URL"
                    exec 3>&-
                    return 1
                fi
            fi
        elif [ "$state" -eq 1 ]; then
            if [[ "$line" == $'\r' ]]; then
                # found "\r\n"
                state=2
            fi
        elif [ "$state" -eq 2 ]; then
            # redirect body to stdout
            # TODO: any way to pipe data directly to stdout?
            echo "$line"
        fi
    done <&3
    exec 3>&-
}

  • Nice enhancements +1 – Chris Snow May 16 '17 at 18:43
  • It worked, but I found a concern: when I use this script, it keeps waiting several seconds after all the data has been read; that doesn't happen with @Chris Snow's answer. Can anyone explain this? – zw963 May 19 '17 at 14:45
  • Also, in this answer, echo -en "GET ${PATH} HTTP/1.1\r\nHost: ${HOST}\r\n${tag}\r\n\r\n" >&3, ${tag} was not specified. – zw963 May 19 '17 at 15:17
  • I edited this answer so the tag variable is set correctly; it works well now. – zw963 May 19 '17 at 15:28
  • not working with zsh , __wget http://www.google.com sorry, only support http /usr/bin/env: bash: No such file or directory – vrkansagara Dec 14 '17 at 14:27
5

If you have this package libwww-perl

You can simply use:

/usr/bin/GET
  • 1
    Considering that other answers don't respect the question requirement (bash only), I think this is actually better than the lynx solution, as Perl is surely more likely to be preinstalled than Lynx. – Marcus Nov 05 '19 at 11:27
0

If you have python2:

/usr/bin/python2.7 -c "import sys; import urllib2; exec('try: response = urllib2.urlopen(\'http://localhost:8080/ping\');\nexcept Exception as e: sys.exit(1)')"
gowayward
  • 101