115

For debugging purposes I want to monitor the HTTP requests on a network interface.

With a naive tcpdump command line I get too much low-level information, and the information I need is not clearly represented.

Dumping the traffic via tcpdump to a file and then using Wireshark has the disadvantage that it is not on-the-fly.

I imagine a tool usage like this:

$ monitorhttp -ieth0 --only-get --just-urls
2011-01-23 20:00:01 GET http://foo.example.org/blah.js
2011-01-23 20:03:01 GET http://foo.example.org/bar.html
...

I am using Linux.

maxschlepzig
  • 57,532
  • The same question is answered at http://superuser.com/questions/67428/possible-to-catch-urls-in-linux – AlexD Jan 23 '11 at 09:32

6 Answers

151

Try tcpflow:

tcpflow -p -c -i eth0 port 80 | grep -oE '(GET|POST|HEAD) .* HTTP/1\.[01]|Host: .*'

Output is like this:

GET /search?q=stack+exchange&btnI=I%27m+Feeling+Lucky HTTP/1.1
Host: www.google.com

You can obviously add additional HTTP methods to the grep statement, and use sed to combine the two lines into a full URL.
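As a rough sketch of the "combine the two lines" step, the following uses awk instead of sed (a swapped-in tool, since awk copes more easily with state across lines). The function name combine_urls is hypothetical, and it assumes each request line is followed by its Host line in the grep output, which may not hold when several connections are interleaved.

```shell
# combine_urls: hypothetical helper, not part of tcpflow.
# Reads the grep output shown above on stdin and prints
# "METHOD http://host/path" for each request/Host pair.
combine_urls() {
  awk '
    # Remember the method and path of the most recent request line.
    /^(GET|POST|HEAD) / { method = $1; path = $2; next }
    # On the following Host line, emit the combined URL.
    /^Host: / && method != "" {
      host = $2
      sub(/\r$/, "", host)   # drop the trailing CR of the HTTP header line
      print method " http://" host path
      method = ""
    }
  '
}
```

It would be used by appending `| combine_urls` to the pipeline above. With GNU grep, adding `--line-buffered` to the grep call may be needed so that lines arrive without delay.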

bahamat
  • 39,666
  • An advantage of tcpflow is that it is already available in the default repositories in Ubuntu 10.04 (justniffer and httpry are not). The package info states that IP fragments are not recorded properly. I don't know whether this matters for this use case; perhaps justniffer handles them better. – maxschlepzig Jan 22 '11 at 23:11
  • Since you're just grabbing the URL, it doesn't seem like it will matter. tcpflow displays packets in the order they were received on the interface. Thus, if you were trying to capture file contents, packets that arrive out of order would produce a corrupt file. But for the use case listed in the question, I think this will work for you. You can also widen your grep (or remove the -o) to see more of the packet data for sorting or whatnot later. – bahamat Jan 23 '11 at 00:01
  • 1
    @bahamat Can "tcpflow" work with https URL? – Maulik patel Oct 23 '18 at 09:39
  • Increasingly, the answer is no. In the past SSL was weak enough that if you had access to the key used for the flow you could decrypt any traffic used with that key. Today, most sites will use perfect-forward-secrecy and negotiate ephemeral keys. The best option today is a so-called "bump in the wire" transparent proxy. – bahamat Oct 28 '18 at 17:02
  • 4
    Got nothing while browsing over Wi-Fi with: sudo tcpflow -p -c -i wlo2 port 80 | grep -oE '(GET|POST|HEAD) .* HTTP/1.[01]|Host: .*' – ses Jan 17 '19 at 00:48
28

You can use httpry or Justniffer to do that.

httpry is available e.g. via the Fedora package repository.

Example call:

# httpry -i em1

(where em1 denotes a network interface name)

Example output:

2013-09-30 21:35:20    192.168.0.1     198.252.206.16    >    POST    unix.stackexchange.com    /posts/6281/editor-heartbeat/edit    HTTP/1.1
2013-09-30 21:35:20    198.252.206.16  192.168.0.1       < HTTP/1.1   200    OK
2013-09-30 21:35:49    192.168.0.1     198.252.206.16    >    POST    unix.stackexchange.com    /posts/validate-body                 HTTP/1.1
2013-09-30 21:35:49    198.252.206.16  192.168.0.1       < HTTP/1.1   200    OK
2013-09-30 21:33:33    192.168.0.1      92.197.129.26    >    GET     cdn4.spiegel.de    /images/image-551203-breitwandaufmacher-fgoe.jpg    HTTP/1.1

(output is a little bit shortened)
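To get closer to the format sketched in the question (GET requests only, full URLs), the output can be post-processed. This is a minimal sketch that assumes the column layout of the sample above (date, time, source, destination, direction, method, host, path, version); the function name httpry_get_urls is hypothetical, and other httpry output formats would need different field numbers.

```shell
# httpry_get_urls: hypothetical helper, not part of httpry.
# Keeps only client GET requests (direction ">") and prints
# "date time GET http://host/path".
httpry_get_urls() {
  awk '$5 == ">" && $6 == "GET" { print $1, $2, $6, "http://" $7 $8 }'
}
```

Usage would be along the lines of `httpry -i em1 | httpry_get_urls`; whether lines show up immediately depends on how httpry buffers its output when writing to a pipe.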

maxschlepzig
  • 57,532
X4lldux
  • 381
  • How can I show the header or the body of the request or response? – Mohammed Noureldin Jan 07 '18 at 10:42
  • 1
    Got nothing with sudo httpry -i wlo2 (where wlo2 is my Wi-Fi device name) – ses Jan 17 '19 at 00:50
  • There is an important limitation with httpry. From the README: "It is worth noting that httpry is rather naive when it comes to parsing HTTP packets. It does not perform any reordering or reassembly of packets and simply searches the start of each packet for HTTP data and ignores the packet if it does not find valid data." – tigrou Dec 12 '23 at 14:47
15

I was looking for something similar, with the added requirement that it should work for https too.

pcap-based tools like tcpflow, httpry, urlsnarf and other tcpdump kung fu work well for HTTP, but for secure requests you're out of luck.

I came up with urldump, which is a small wrapper around mitmproxy.
iptables is used to redirect traffic to the proxy, so it works transparently.

$ sudo urldump   
http://docs.mitmproxy.org/en/stable/certinstall.html
http://docs.mitmproxy.org/en/stable/_static/js/modernizr.min.js
https://media.readthedocs.org/css/sphinx_rtd_theme.css
https://media.readthedocs.org/css/readthedocs-doc-embed.css
https://media.readthedocs.org/javascript/readthedocs-doc-embed.js
...

See README for more info.

lemonsqueeze
  • 1,525
3

I think Wireshark is capable of doing what you want.

On the plus side, it's very powerful, you can install it via apt-get, and it comes with a GUI.

However, the filter system is complicated - but there are good tutorials built in, and it will give you a live or start/stop overview of the traffic.

Typing the word 'http' into the filter will probably give you what you are looking for (i.e. the main traffic generated by users).

  • 1
    Would like to know why this was downvoted. Wireshark can read the interface on the fly and filter to just http traffic. – Kevin M Jan 22 '11 at 17:10
  • 1
    @Kevin M, Well, I did not downvote your answer. But to be fair, your answer is a bit incomplete and off-topic. 1) It misses details on how exactly Wireshark should be used, i.e. that a filter should be used, the exact filter expression, etc. 2) It does not allow for command line usage as sketched in the question. Even if I am OK with the GUI approach, the default view displays GET requests without the domain name side by side, which is not that convenient for the sketched use case. – maxschlepzig Jan 22 '11 at 18:18
  • I mean: s/your answer/Phobia's answer/ – maxschlepzig Jan 22 '11 at 18:27
2

There is also the command line program urlsnarf which is part of the dsniff package (which is also packaged with e.g. Fedora 19).

Example:

# urlsnarf -i em1
urlsnarf: listening on em1 [tcp port 80 or port 8080 or port 3128]
jhost - - [29/May/2014:10:25:09 +0200] "GET http://unix.stackexchange.com/questions HTTP/1.1" - - "-" "Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0"
jhost - - [29/May/2014:10:25:36 +0200] "GET http://www.spiegel.de/ HTTP/1.1" - - "-" "Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0"
jhost - - [29/May/2014:10:25:36 +0200] "GET http://www.spiegel.de/layout/css/style-V5-2-2.css HTTP/1.1" - - "-" "Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0"
jhost - - [29/May/2014:10:25:36 +0200] "GET http://www.spiegel.de/layout/jscfg/http/global-V5-2-2.js HTTP/1.1" - - "-" "Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0"
jhost - - [29/May/2014:10:25:36 +0200] "GET http://www.spiegel.de/layout/js/http/javascript-V5-2-2.js HTTP/1.1" - - "-" "Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0"
jhost - - [29/May/2014:10:25:36 +0200] "GET http://www.spiegel.de/layout/js/http/interface-V5-2-2.js HTTP/1.1" - - "-" "Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0"
jhost - - [29/May/2014:10:25:36 +0200] "GET http://www.spiegel.de/layout/js/http/netmind-V5-2-2.js HTTP/1.1" - - "-" "Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0"
jhost - - [29/May/2014:10:25:36 +0200] "GET http://www.spiegel.de/favicon.ico HTTP/1.1" - - "-" "Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0"
jhost - - [29/May/2014:10:25:36 +0200] "POST http://ocsp.thawte.com/ HTTP/1.1" - - "-" "Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0"
jhost - - [29/May/2014:10:25:36 +0200] "POST http://ocsp.thawte.com/ HTTP/1.1" - - "-" "Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0"
[..]

(when browsing first to SE and then to spiegel.de)
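If only the method and URL are of interest, the log-style lines can be trimmed down. A minimal sketch, assuming the request is always the first double-quoted field as in the sample above; the function name urlsnarf_urls is hypothetical:

```shell
# urlsnarf_urls: hypothetical helper, not part of dsniff.
# Splits each line on double quotes; the first quoted field holds
# "METHOD URL HTTP/x.y", from which method and URL are printed.
urlsnarf_urls() {
  awk -F'"' '{ n = split($2, req, " "); if (n >= 2) print req[1], req[2] }'
}
```

Usage would be something like `urlsnarf -i em1 | urlsnarf_urls`. Lines without a quoted request, such as the "listening on" banner, are skipped.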

Limitations: urlsnarf does not support IPv6. I can reproduce this bug report with 0.17 on Fedora 19. It also seems to be broken under Ubuntu trusty at the moment (works fine under lucid).

lemonsqueeze
  • 1,525
maxschlepzig
  • 57,532
2

Another good option might be nethogs.

On Fedora it is available among the core packages, and on CentOS you can get it through the EPEL repo.

adriano72
  • 121