One major shortcoming of curl is that more and more webpages have their main content painted by a JavaScript AJAX response that arrives after the initial HTTP response. curl never picks up this post-painted content.
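To make the problem concrete, here is a minimal Python sketch of what a non-JavaScript client such as curl ends up with. The page markup, the `content` id, and the `/api/article` URL are all made up for illustration; the point is only that the container is empty in the initial response because the script that fills it never runs.

```python
from html.parser import HTMLParser

# Hypothetical initial HTTP response: the content container is empty and
# a script would fill it in later via an AJAX call (the URL is made up).
INITIAL_HTML = """
<html>
  <body>
    <div id="content"></div>
    <script>
      fetch("/api/article").then(r => r.text()).then(t => {
        document.getElementById("content").innerHTML = t;
      });
    </script>
  </body>
</html>
"""


class ContentExtractor(HTMLParser):
    """Collects the text found inside <div id="content">."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # div nesting depth inside the target div
        self.chunks = []  # text fragments seen inside the target div

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == "content":
            self.depth = 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def static_content(html):
    """Return the text of div#content as it appears in the raw HTML,
    without executing any JavaScript."""
    parser = ContentExtractor()
    parser.feed(html)
    return "".join(parser.chunks).strip()


if __name__ == "__main__":
    # The script is never executed, so the container stays empty.
    print(repr(static_content(INITIAL_HTML)))
```

A real browser would run the script and fill the container, which is why a browser-driving tool is needed to get the final HTML.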
So to fetch these types of webpages from the command line, I've been reduced to writing Ruby scripts that drive Selenium RC to fire up a Firefox instance and then return the source HTML after these AJAX calls have completed.
It would be much better to have a leaner command-line solution for this type of problem. Does anyone know of one?