I'm trying to download a directory with a recursive wget command:

wget -m -nH --cut-dirs=5 https://data.darts.isas.jaxa.jp/pub/pds3/sln-l-spice-6-v1.0/slnsp_1000/   

This works for some of the files, but it also outputs a flurry of 403 Forbidden errors such as:

--2023-06-13 08:43:51--  https://data.darts.isas.jaxa.jp/pub/pds3/sln-l-spice-6-v1.0/slnsp_1000/data/ck/SEL_M_200710_S_V03.lbl
Reusing existing connection to data.darts.isas.jaxa.jp:443.
HTTP request sent, awaiting response... 403 Forbidden
2023-06-13 08:43:51 ERROR 403: Forbidden.
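To see exactly which URLs are rejected, the log can be filtered. This is a minimal sketch, assuming the recursive run is repeated with `-o wget.log` so the output lands in a file. The log file name and the `list_403` helper are hypothetical and not part of the original command.

```shell
# Hypothetical helper: print every URL in a wget log whose response was 403.
# Assumes the log format shown above: each request starts with a
# "--DATE TIME--  URL" line, followed later by the HTTP status line.
list_403() {
    awk '
        /^--[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] / { url = $NF }  # remember the latest URL
        /awaiting response\.\.\. 403/ { print url }                      # report it when the status is 403
    ' "$1"
}
```

After a run such as `wget -m -nH --cut-dirs=5 -o wget.log <URL>`, calling `list_403 wget.log` prints one failing URL per line, which makes it easier to check whether the 403s share a pattern (same directory, same file type, or only requests made on a reused connection).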

However, if I try to download these files individually, it works:

wget -m -nH --cut-dirs=5 https://data.darts.isas.jaxa.jp/pub/pds3/sln-l-spice-6-v1.0/slnsp_1000/data/ck/SEL_M_200710_S_V03.lbl

--2023-06-13 09:06:44--  https://data.darts.isas.jaxa.jp/pub/pds3/sln-l-spice-6-v1.0/slnsp_1000/data/ck/SEL_M_200710_S_V03.lbl
Resolving data.darts.isas.jaxa.jp (data.darts.isas.jaxa.jp)... 133.74.198.108
Connecting to data.darts.isas.jaxa.jp (data.darts.isas.jaxa.jp)|133.74.198.108|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1382 (1.3K)
Saving to: ‘ck/SEL_M_200710_S_V03.lbl’

ck/SEL_M_200710_S_V03.lb 100%[================================>] 1.35K --.-KB/s in 0s

2023-06-13 09:06:44 (18.3 MB/s) - ‘ck/SEL_M_200710_S_V03.lbl’ saved [1382/1382]

FINISHED --2023-06-13 09:06:44--
Total wall clock time: 0.7s
Downloaded: 1 files, 1.3K in 0s (18.3 MB/s)

I have tried:

  • -e robots=off
  • --user-agent=Mozilla/5.0
  • --trust-server-names
  • Looking at the request headers for a single file in Chrome developer tools. I cannot identify any cookie or referer in the request:
GET /pub/pds3/sln-l-spice-6-v1.0/slnsp_1000/data/ck/SEL_M_200711_D_V03.BC HTTP/1.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Connection: keep-alive
Host: data.darts.isas.jaxa.jp
Sec-Fetch-Dest: document
Sec-Fetch-Mode: navigate
Sec-Fetch-Site: none
Sec-Fetch-User: ?1
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36
sec-ch-ua: "Not.A/Brand";v="8", "Chromium";v="114", "Google Chrome";v="114"
sec-ch-ua-mobile: ?0
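For a like-for-like comparison with the browser request above, wget's own outgoing headers can be captured with `--debug`, which brackets each request between `---request begin---` and `---request end---` markers (this needs a wget build with debug support). This is a minimal sketch, assuming the debug output was saved to a log file with `-o`. The `show_requests` helper name is hypothetical.

```shell
# Hypothetical helper: print only the request blocks from a wget --debug log,
# i.e. everything between the "---request begin---" and "---request end---"
# marker lines (markers included).
show_requests() {
    sed -n '/---request begin---/,/---request end---/p' "$1"
}
```

Running the recursive command once with `--debug -o debug.log` and then `show_requests debug.log` shows the headers wget sends for each file, so a failing recursive request can be compared line by line against a successful single-file request and against the Chrome headers.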

By the way, these URLs are from the Data ARchives and Transmission System (DARTS), which archives high-level data products obtained by JAXA's (Japan Aerospace Exploration Agency) space science missions. It is meant for public download of these data products and should not require any authentication.

2cents