CatchURL tutorial

How to log into the Blitz Research website

For this step-by-step guide I am going to make HTTrack log into the Blitz Research website and download selected pages: the homepage and the Blitz3D manual.

The steps used here can be applied to many websites with login forms!

New project

Create a new project in WinHTTrack:

http://httrack.kauler.com/screenshots/form_auth_blitz_00.png

Click the Next button.

Click the Add URL button. A window will open.

http://httrack.kauler.com/screenshots/form_auth_blitz_02.png

Click the Capture URL button. A window will open with instructions for changing your browser's proxy settings (the proxy address and port shown will be different for you).

http://httrack.kauler.com/screenshots/form_auth_blitz_03.png

Find login page to start from

Leave HTTrack alone for now and open your web browser. Go to the website you wish to mirror, and find a page where you would normally log in. If you are already logged in then you will need to log out.

For the Blitz Research website, there is a "Login" link from the homepage to another page where a login form is presented. This will be the starting point for the HTTrack project.

http://httrack.kauler.com/screenshots/form_auth_blitz_04.png

Proxy settings

Now you need to temporarily change your browser's proxy settings to those shown in HTTrack. If you already have proxy settings defined you will want to write them down so that you can restore them later.

Firefox

In Firefox go to the Tools menu and choose Options, then click the General icon and click the Connection settings button. Choose Manual proxy configuration, copy the proxy's address from the HTTrack window to the HTTP Proxy box in Firefox, and copy the proxy's port from HTTrack to the Port box in Firefox.

http://httrack.kauler.com/screenshots/form_auth_blitz_05_firefox.png

Click OK. Click OK again.

Internet Explorer

In Internet Explorer (IE) go to the Tools menu and choose Internet Options. Click the Connections tab and click the LAN Settings button. In the window that opens, tick the box for Use a proxy server for your LAN. Copy the proxy's address from the HTTrack window to the Address box in IE, and copy the proxy's port from HTTrack to the Port box in IE.

http://httrack.kauler.com/screenshots/form_auth_blitz_05_ie.png

Click OK. Click OK again.
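
For readers who want to see what these two values amount to, here is a minimal Python sketch of the same idea: an HTTP client that sends its requests through a proxy identified by an address and a port. The address 127.0.0.1 and port 8080 are placeholders I made up for illustration; use the values shown in your own HTTrack window. This is not part of HTTrack itself.

```python
from urllib.request import ProxyHandler, build_opener

def make_proxy_opener(address, port):
    """Build a urllib opener that routes plain HTTP requests through a proxy."""
    proxy_url = f"http://{address}:{port}"
    # Only "http" is mapped here, mirroring the HTTP Proxy box in the browser.
    handler = ProxyHandler({"http": proxy_url})
    return build_opener(handler), proxy_url

# Placeholder values: substitute the address and port HTTrack displays.
opener, proxy_url = make_proxy_opener("127.0.0.1", 8080)
print("Requests made with this opener would go via", proxy_url)
```

The browser settings above do the same job: every page request is handed to the address and port HTTrack is listening on.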

Capture the login details

With the proxy settings now in place HTTrack is ready to capture the details for the login form. Type in your username and password.

http://httrack.kauler.com/screenshots/form_auth_blitz_06_enterlogin.png

Submit the form (in this case I click the Login button) and you should now see a page telling you that HTTrack has caught the link.

http://httrack.kauler.com/screenshots/form_auth_blitz_07_linkcaught.png

Return to HTTrack and you will notice that the URL field is now populated with a URL (do not edit this). Click the OK button and the new URL will appear in the Web Addresses box.
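
To give a rough idea of what is being captured, a login form is normally submitted by the browser as an HTTP POST request carrying the form fields. The Python sketch below builds such a request without sending it. The field names `username` and `password` and the sample values are hypothetical, and I am assuming the form posts to the login page address; the real form may use different names and a different target.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_login_request(action_url, fields):
    """Build (but do not send) a form-encoded POST request for a login form."""
    body = urlencode(fields).encode("ascii")
    return Request(
        action_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )

# Hypothetical field names and values, for illustration only.
req = build_login_request(
    "http://www.blitzbasic.com/Account/_login.php",
    {"username": "myname", "password": "s3cret"},
)
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

Because the submitted details travel with the request, capturing them once through the proxy lets HTTrack repeat the same submission when the project runs.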

http://httrack.kauler.com/screenshots/form_auth_blitz_08_b.png

Defining filters

At this point you could click Next and run the project. However, because the starting URL is within the "Account/" directory (www.blitzbasic.com/Account/_login.php), the project would be scoped to download only what is in Account and below.
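
The scoping behaviour described here can be pictured with a small Python sketch. It is only an approximation of "same site, same directory and below", written for illustration; the function name `in_default_scope` is my own, and HTTrack's actual scoping logic has more options than this. The `profile.php` address in the example is made up.

```python
from urllib.parse import urlsplit

def in_default_scope(start_url, candidate_url):
    """Return True if candidate_url is on the same host and at or below
    the directory that contains start_url."""
    start = urlsplit(start_url)
    cand = urlsplit(candidate_url)
    if start.hostname != cand.hostname:
        return False
    # "/Account/_login.php" -> "/Account/"
    base_dir = start.path.rsplit("/", 1)[0] + "/"
    return (cand.path or "/").startswith(base_dir)

start = "http://www.blitzbasic.com/Account/_login.php"
for url in (
    "http://www.blitzbasic.com/Account/profile.php",   # hypothetical page
    "http://www.blitzbasic.com/Home/_index_.php",
    "http://www.blitzbasic.com/b3ddocs/index.php",     # hypothetical page
):
    print(url, in_default_scope(start, url))
```

Under this approximation the homepage and the manual pages fall outside the starting directory, which is why extra rules are needed to reach them.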

Because my purpose is to mirror the homepage and the Blitz3D manual, I will add some Filters to control where HTTrack crawls.

Click the Set options button and select the Scan Rules tab. Set the filters to:

-*
+www.blitzbasic.com/Home/_index_.php
+www.blitzbasic.com/Manuals/_index_.php
+www.blitzbasic.com/b3ddocs/*
+*.png +*.gif +*.jpg +*.css +*.js

Line-by-line this means:

  1. Exclude all files and links
  2. Allow the homepage
  3. Allow the Manuals index page
  4. Allow all pages in the b3ddocs directory
  5. Allow these file types (from any server)
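
The way these rules combine can be sketched in Python. This is a simplified model, assuming that rules are checked in order and that the last rule matching a URL decides whether it is allowed; it uses Python's `fnmatch` wildcards as a stand-in for HTTrack's own pattern syntax, which is richer. The function names and the test URLs other than the two index pages are my own.

```python
from fnmatch import fnmatchcase

# The rules from above; the last line is split into one pattern per entry.
RULES = [
    "-*",
    "+www.blitzbasic.com/Home/_index_.php",
    "+www.blitzbasic.com/Manuals/_index_.php",
    "+www.blitzbasic.com/b3ddocs/*",
    "+*.png", "+*.gif", "+*.jpg", "+*.css", "+*.js",
]

def strip_scheme(url):
    """Drop a leading 'http://' or similar, since the rules omit it."""
    return url.split("://", 1)[-1]

def is_allowed(url, rules=RULES, default=False):
    """Apply +/- rules in order; the last matching rule wins."""
    target = strip_scheme(url)
    allowed = default
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if fnmatchcase(target, pattern):
            allowed = (sign == "+")
    return allowed

for url in (
    "http://www.blitzbasic.com/Home/_index_.php",
    "http://www.blitzbasic.com/Account/_login.php",
    "http://www.blitzbasic.com/b3ddocs/command.php",  # hypothetical page
    "http://example.com/img/logo.png",                # hypothetical image
):
    print(url, is_allowed(url))
```

In this model the opening `-*` blocks everything, and each later `+` rule re-admits only the pages or file types it names, which matches the line-by-line reading above.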

Click OK to accept the options.

Click the Next button.

Click Finish to begin the HTTrack mirror.

Success

Assuming everything was fine with the username/password and proxy settings, HTTrack should successfully log in and mirror the selected pages. I browse my project and see success!

http://httrack.kauler.com/screenshots/form_auth_blitz_09_success.png