I use Firefox and I don't have any issues viewing and reading English text on the loaded websites.
If I click Save in Firefox and save the web page in question as a text file, I can read everything in it; all characters are readable.
However, when I use DownThemAll (DTA) to save these same web pages as .html, which appears to be the only format DTA offers, some characters in the saved HTML files are unreadable, and the kicker is that they are on exactly the lines I'm interested in reading and extracting. Firefox's View Source shows the same unreadable output.
Basically I'm trying to scrape a site (yunfile.com) to gather file names and download links. Everything would be fine except that I CANNOT read the file names.
Here's an example link: http://page3.dfpan.com/file/syg65488/0141cd27
The problem I'm having is with the file name line, where it says "Downloading:".
HTML file text reads: ¡£¢¢£¥£¢½ãòá碽áòá
In Firefox the same text reads: 20110601.part1.rar
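Comparing the two strings above character by character, every garbled character looks like the readable one XORed with the byte 0x93, if the garbled text is treated as Latin-1 bytes. Here is a minimal Python sketch of that idea. The key 0x93 and the Latin-1 assumption are inferred from this single example only, and `decode_name` is just a name I made up, so it may not hold for other pages:

```python
# Sketch only: the XOR key 0x93 is inferred from one sample string,
# and the garbled text is ASSUMED to be Latin-1 (one byte per character).

def decode_name(garbled: str, key: int = 0x93) -> str:
    """XOR every byte of the garbled text with `key` and return the result."""
    raw = garbled.encode("latin-1")          # characters -> original bytes
    decoded = bytes(b ^ key for b in raw)    # undo the (assumed) XOR masking
    return decoded.decode("latin-1")


# The garbled "Downloading:" text copied from the saved HTML file.
SAMPLE = "¡£¢¢£¥£¢½ãòá碽áòá"

if __name__ == "__main__":
    print(decode_name(SAMPLE))  # prints 20110601.part1.rar
```

Since XOR is its own inverse, running `decode_name` on a readable name gives back the garbled form, which is a quick way to check the guess against other pages.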
Is there a program and a command I can run to convert these HTML files?
Any suggestions would be greatly appreciated.