I agree with julien that HTTrack is probably the best fit for what you want. I have used it a number of times when writing my own spider was not warranted.
As for your external links problem, you would set up rules (filters) to allow only files from the domain you are copying.
For example:
Code:
-*
+www.example.com/*.html
+www.example.com/*.php
+www.example.com/*.asp
+www.example.com/*.gif
+www.example.com/*.jpg
+www.example.com/*.png
-mime:*/*
+mime:text/html
+mime:image/*
Notice that this excludes everything first (-*), then allows only the files from your server. Add any other allowances as needed.
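If you would rather drive HTTrack from a script than paste the rules into the GUI, here is a rough sketch of how the same filters could be assembled into a command line. The function name build_httrack_command, the ./mirror output directory, and the http:// scheme are my own assumptions for illustration; the filters are the ones listed in this post. It only builds and prints the command, it does not run HTTrack.

```python
import shlex


def build_httrack_command(domain, output_dir, extensions):
    """Build an argv list that mirrors only files served from `domain`.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Start URL and output directory (-O). The http:// scheme is an assumption.
    args = ["httrack", f"http://{domain}/", "-O", output_dir]

    # Exclude everything first, then allow only this domain's file types.
    args.append("-*")
    args.extend(f"+{domain}/*.{ext}" for ext in extensions)

    # Same idea by MIME type: drop all types, then allow HTML and images.
    args.extend(["-mime:*/*", "+mime:text/html", "+mime:image/*"])
    return args


if __name__ == "__main__":
    cmd = build_httrack_command(
        "www.example.com",
        "./mirror",
        ["html", "php", "asp", "gif", "jpg", "png"],
    )
    # shlex.join quotes each filter so the shell does not expand the '*'.
    print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be handed to subprocess.run if HTTrack is installed. Keeping the filters as separate list items avoids the shell expanding the wildcards before HTTrack sees them.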
