Xavier has a useful page about this on the official website:
http://www.httrack.com/html/abuse.html
This is sometimes a point of contention. robots.txt is used to guide (restrict) automated crawling tools, e.g. search engine spiders. HTTrack, however, is designed to be an offline browser: to mirror a website intact it needs to access the site in the same way a browser would. This is why HTTrack provides the option to ignore robots.txt directives.
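To illustrate what a polite crawler does with robots.txt, here is a minimal sketch using Python's standard-library `urllib.robotparser`; the robots.txt content and URLs are hypothetical examples, not taken from any real site:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt illustrating a typical disallow rule.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A well-behaved crawler consults robots.txt before fetching each URL.
print(parser.can_fetch("*", "http://example.com/private/data.html"))  # False
print(parser.can_fetch("*", "http://example.com/public/page.html"))   # True
```

A crawler that honors these rules would skip everything under /private/, which is exactly what an offline browser cannot always do if the goal is a complete, browsable mirror. HTTrack exposes this choice through its `-sN` option (with `-s0` meaning "never follow robots.txt rules"), which is why the abuse page above asks users to apply it responsibly.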