Commit 47925f3d authored by Daniel Stenberg

Added a new "13. Web Login" chapter

parent 82c5950c
+48 −10
Online:  http://curl.haxx.se/docs/httpscripting.html
Date:    December 9, 2004
Date:    May 28, 2008

                The Art Of Scripting HTTP Requests Using Curl
                =============================================
@@ -137,6 +137,10 @@ Date: December 9, 2004
  you need to replace that space with %20 etc. Failing to comply with this
  will most likely cause your data to be received wrongly and messed up.

  Recent curl versions can in fact url-encode POST data for you, like this:

        curl --data-urlencode "name=I am Daniel" www.example.com
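
  To illustrate what that option does to the data, here is a small sketch of
  the same kind of percent-encoding written as a plain shell function. The
  name urlencode is made up for this example and is not part of curl, and the
  function assumes ASCII-only input.

```shell
# Hypothetical helper, not part of curl: percent-encode an ASCII string.
# Letters, digits and - . _ ~ are kept, every other character becomes %XX.
# Note that POSIX sh has no local variables, so s, out, rest and c are global.
urlencode() {
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}         # everything after the first character
        c=${s%"$rest"}      # the first character itself
        case $c in
            [a-zA-Z0-9.~_-]) out=$out$c ;;
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
}

urlencode "I am Daniel"     # prints I%20am%20Daniel
```

  With that in mind, the --data-urlencode command line above should send the
  same POST body as this one, where the encoding was done by hand:

        curl -d "name=I%20am%20Daniel" www.example.com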

 4.3 File Upload POST

  Back in late 1995 they defined an additional way to post data over HTTP. It
@@ -202,14 +206,14 @@ Date: December 9, 2004

        curl -T uploadfile www.uploadhttp.com/receive.cgi

6. Authentication
6. HTTP Authentication

 Authentication is the ability to tell the server your username and password
 so that it can verify that you're allowed to do the request you're doing. The
 Basic authentication used in HTTP (which is the type curl uses by default) is
 *plain* *text* based, which means it sends username and password only
 slightly obfuscated, but still fully readable by anyone that sniffs on the
 network between you and the remote server.
 HTTP Authentication is the ability to tell the server your username and
 password so that it can verify that you're allowed to do the request you're
 doing. The Basic authentication used in HTTP (which is the type curl uses by
 default) is *plain* *text* based, which means it sends username and password
 only slightly obfuscated, but still fully readable by anyone that sniffs on
 the network between you and the remote server.

 To tell curl to use a user and password for authentication:

@@ -237,6 +241,10 @@ Date: December 9, 2004
 able to watch your passwords if you pass them as plain command line
 options. There are ways to circumvent this.

 It is worth noting that while this is how HTTP Authentication works, very
 many web sites will not use this concept when they provide logins etc. See
 the Web Login chapter further below for more details on that.

7. Referer

 An HTTP request may include a 'referer' field (yes it is misspelled), which
@@ -407,7 +415,37 @@ Date: December 9, 2004

        curl -H "Destination: http://moo.com/nowhere" http://url.com

13. Debug
13. Web Login

 While not strictly just HTTP related, web login still causes a lot of people
 problems, so here's the executive run-down of how the vast majority of all
 login forms work and how to log in to them using curl.

 It can also be noted that to do this properly in an automated fashion, you
 will most certainly need to script things and do multiple curl invocations.

 First, servers mostly use cookies to track the logged-in status of the
 client, so you will need to capture the cookies you receive in the
 responses. Then, many sites also set a special cookie on the login page (to
 make sure you got there through their login page) so you should make a habit
 of first getting the login-form page to capture the cookies set there.

 Some web-based login systems feature various amounts of javascript, and
 sometimes they use such code to set or modify cookie contents. Possibly they
 do that to prevent programmed logins, like the ones this manual describes.
 Anyway, if reading the code isn't enough to let you repeat the behavior
 manually, capturing the HTTP requests done by your browser and analyzing the
 sent cookies is usually a working method to work out how to bypass the need
 for javascript.

 In the actual <form> tag for the login, lots of sites fill in random/session
 or otherwise secretly generated hidden fields, and you may need to first
 capture the HTML code for the login form and extract all the hidden fields to
 be able to do a proper login POST. Remember that the contents need to be URL
 encoded when sent in a normal POST.
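
 The steps above could be scripted roughly like this. It is only a sketch
 under several assumptions: the two URLs, the cookies.txt and login.html file
 names and the "user" and "password" field names are all made up, and the
 hidden_fields helper only copes with simple HTML that has one <input> tag per
 line, double-quoted attributes and the attribute order type, name, value. A
 real site will most likely need adjustments.

```shell
#!/bin/sh
# Sketch only: the URLs, file names and form field names are hypothetical.
LOGIN_PAGE="http://www.example.com/login.html"
LOGIN_ACTION="http://www.example.com/login.cgi"
COOKIES="cookies.txt"

# Read HTML on stdin and print name=value for each hidden input field.
# Assumes one <input> per line, double-quoted attributes and the
# attribute order type, name, value. HTML entities are not decoded.
hidden_fields() {
    sed -n 's/.*<input[^>]*type="hidden"[^>]*name="\([^"]*\)"[^>]*value="\([^"]*\)".*/\1=\2/p'
}

# Usage: login USER PASSWORD
login() {
    user=$1
    pass=$2

    # 1. Get the login form first and save the cookies it sets.
    curl -s -c "$COOKIES" -o login.html "$LOGIN_PAGE" || return 1

    # 2. Turn every hidden field into a --data-urlencode option.
    set --
    while IFS= read -r field; do
        if [ -n "$field" ]; then
            set -- "$@" --data-urlencode "$field"
        fi
    done <<EOF
$(hidden_fields < login.html)
EOF

    # 3. POST the form, sending the saved cookies and storing new ones.
    curl -s -b "$COOKIES" -c "$COOKIES" "$@" \
         --data-urlencode "user=$user" \
         --data-urlencode "password=$pass" \
         "$LOGIN_ACTION"
}
```

 Nothing is sent until you call the function, for example with
 "login daniel secret". Later requests can then pass -b cookies.txt to be
 treated as logged in, provided the site tracks the session with cookies.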


14. Debug

 Many times when you run curl on a site, you'll notice that the site doesn't
 seem to respond the same way to your curl requests as it does to your
@@ -437,7 +475,7 @@ Date: December 9, 2004
 such as ethereal or tcpdump and check what headers were sent and
 received by the browser. (HTTPS makes this technique inefficient.)

14. References
15. References

 RFC 2616 is a must to read if you want in-depth understanding of the HTTP
 protocol.