
I need to download a large file (1GB). I also have access to multiple computers running Linux, but each is limited to a 50kB/s download speed by an admin policy.

How do I distribute downloading this file on several computers and merge them after all segments have been downloaded, so that I can receive it faster?

B Faley
  • Download it at home, and sneakernet it in via usb thumbdrive? – WernerCD Aug 31 '14 at 16:47
  • I remember doing things like this on old black-and-white Sun workstations at university. Just check that you have enough space to store all the parts; a friend of mine was expelled after filling /tmp and blocking every computer in the lab. – Kartoch Sep 01 '14 at 18:58
  • If there is no download restriction, how fast can the other end send the file? Are there transfer restrictions between computers on the LAN? – Sun Sep 04 '14 at 03:10
  • @SunWKim No. There is no specific restriction there. – B Faley Sep 04 '14 at 04:24

2 Answers


The common protocols HTTP, FTP and SFTP support range requests, so you can ask for just part of a file. Note that range support must also be implemented on the server side, so it may or may not work in practice.

You can use curl with the -r (--range) option to request each byte range, then concatenate the pieces afterwards. Ranges are inclusive on both ends. Example:

curl -r 0-104857599         -o distro1.iso 'http://files.cdn/distro.iso'
curl -r 104857600-209715199 -o distro2.iso 'http://files.cdn/distro.iso'
[…]

Once you have gathered all the individual parts, concatenate them:

cat distro* > distro.iso
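Rather than typing the boundaries by hand, you can compute them once you know the total size. A minimal /bin/sh sketch; SIZE and PARTS are example values (in a real run SIZE would come from the server, see below), and the printf is a stand-in for the actual curl invocation:

```shell
#!/bin/sh
# Compute inclusive byte ranges for PARTS pieces of a SIZE-byte file.
SIZE=1073741824   # example: 1 GiB; get the real value from `curl --head`
PARTS=10
CHUNK=$((SIZE / PARTS))
i=0
while [ "$i" -lt "$PARTS" ]; do
  START=$((i * CHUNK))
  if [ "$i" -eq $((PARTS - 1)) ]; then
    END=$((SIZE - 1))            # last piece absorbs the division remainder
  else
    END=$((START + CHUNK - 1))
  fi
  # A real run would do: curl -r "$START-$END" -o "distro$i.iso" "$URL"
  printf 'part %02d: bytes %d-%d\n' "$i" "$START" "$END"
  i=$((i + 1))
done
```

A while loop is used instead of seq, since seq is not available everywhere (see the comments below).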

You can get further information about the file, including its size, with the --head (-I) option:

curl --head 'http://files.cdn/distro.iso'
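The size can then be parsed out of the Content-Length response header. A sketch; the HEADERS text below is a stand-in for real `curl --head` output from an actual server:

```shell
# Extract the file size from HEAD-response headers (example text, not live output).
HEADERS='HTTP/1.1 200 OK
Content-Length: 1073741824
Content-Type: application/octet-stream'
SIZE=$(printf '%s\n' "$HEADERS" |
  awk 'tolower($1) == "content-length:" { print $2 }' | tr -d '\r')
echo "size: $SIZE bytes"
```

The tr -d '\r' strips the carriage return that HTTP header lines carry when curl's raw output is piped through.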

You can retrieve the last chunk with an open-ended range, which runs to the end of the file:

curl -r 943718400- -o distro10.iso 'http://files.cdn/distro.iso'

Read the curl man page for more options and explanations.

You can also use ssh and tmux to start the downloads on the other machines and keep track of them.
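One way to fan the work out is to start one download per host inside a detached tmux session, so it survives logout. A sketch; the host names, session name, URL and piece size are made up, and the leading `echo` makes this a dry run that only prints the commands (remove it to actually launch them):

```shell
#!/bin/sh
# Dry run: print one ssh+tmux download command per host (remove `echo` to run).
URL='http://files.cdn/distro.iso'
i=1
for HOST in lab01 lab02 lab03; do
  START=$(( (i - 1) * 104857600 ))   # 100 MiB pieces, inclusive ranges
  END=$(( i * 104857600 - 1 ))
  echo ssh "$HOST" tmux new-session -d -s "dl$i" \
      "curl -r $START-$END -o distro$i.iso $URL"
  i=$((i + 1))
done
```

tmux new-session -d detaches immediately, so each ssh invocation returns as soon as the download has been started on the remote machine.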

Marco
    Note: careful, when using cat distro* > ... check the sorting of the files as the * expanded by your shell could sort it like this: distro1.iso distro10.iso distro11.iso ... and thus concatenating in the wrong order. – Sebastian Sep 01 '14 at 07:55
  • a fix for @Sebastian's note would be: cat distro{1..10}.iso – nonchip Sep 01 '14 at 10:46
  • That solution is shell-specific and not portable. cat $(seq -f 'distro%g.iso' 1 10) should be more predictable, but it fails in csh. Replacing $(…) with backticks seems to work in most shells. – Marco Sep 01 '14 at 13:30
  • @Marco, seq is not a portable command either. You can use distro001.iso, distro002.iso... distro010.iso – Stéphane Chazelas Sep 01 '14 at 15:45
  • Is the admin policy 50 kB/s per connection, or the total bandwidth allowed on the computer? If it is the former, the answer can be used on a single computer rather than having to log into different workstations. – Sun Sep 04 '14 at 16:45
  • @Sun The restriction only applies to the total internet bandwidth allowed on the computer. – B Faley Jun 25 '16 at 06:10

It would take about 5.5 hours to download a 1 GB file at 50 kB/s (1,000,000,000 B ÷ 50,000 B/s = 20,000 s ≈ 5.6 h).

So the effort of coordinating several computers, each fetching part of the file, can save a substantial amount of time.

You can look at BitTorrent and use web seeding along with transfers via peer exchange. Each client can receive pieces and share completed pieces within the local area network (LAN). You end up with the same 1 GB file on each computer, but the merging of pieces is automated for you.

Sun
  • how can I create a torrent without access to the file with only a single web seed added? mktorrent requires files/directories for calculating checksums etc. – Phani Rithvij Mar 13 '24 at 05:24
  • Seems like this idea isn't practical, creating a torrent from webseed without having access to the full file. I like the idea but sadly this is not possible. – Phani Rithvij Mar 13 '24 at 05:46