Sunday, January 31, 2010

Curl tutorial: forms, POST, login and upload (from scratch), a practical case


If, like me, you get tired of opening a website, logging in, and so on, just to perform a simple and repetitive task, cURL may be your best option to get rid of the chore. cURL is a library (libcurl) that comes with a command-line tool (curl), and it can be used to emulate the behaviour of a browser.
OK, from here on I assume some basic networking and Linux knowledge, so be aware :)

Here is what we are going to do from the terminal:

  • Log in to zooomr
  • Upload a picture
  • Log out
  • Run it all through a beautiful bash script

As I like to do, I'm going to post the code first and explain it later:

#!/bin/bash

user=your_user_name
pass=your_pass
file="$1"
cjar="/tmp/cjar"


echo "Logging-in..."

#Log-in
curl \
-s \
--cookie $cjar \
--cookie-jar $cjar \
--user-agent "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.7) Gecko/20100105 Shiretoko/3.5.7" \
--data "username=$user" \
--data "password=$pass"  \
--data "gogogo=1" \
--data "redirect_to=http://zooomr.com" \
--data "processlogin=1" \
--location \
http://www.zooomr.com/login/ &> /dev/null
echo "Uploading..."

#Upload
curl -s --cookie $cjar --cookie-jar $cjar \
--header "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8" \
--header "Accept-Language: en-us,en;q=0.5" \
--header "Accept-Encoding: gzip,deflate" \
--header "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7" \
--header "Keep-Alive: 300" \
--header "Connection: keep-alive" \
--header "Expect: " \
--user-agent "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.7) Gecko/20100105 Shiretoko/3.5.7" \
--location \
--referer "http://www.zooomr.com/photos/upload/?noflash=okayiwill" \
-F "Filedata=@$file" -F "labels=test" -F "is_public=0" -F "is_friend=1" -F "is_family=1" -F "done=1"  \
http://upload.zooomr.com/photos/upload/process/ &> /dev/null

echo "Logging-out..."

#Logout
curl --cookie $cjar -s http://de.zooomr.com/logout/ &> /dev/null

rm $cjar &> /dev/null



So, this is simple: the first call to curl logs in, the second uploads the file, and the third logs out.
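The three calls repeat the same handful of options. One way to write them only once, sketched here under my own assumptions, is a bash array plus a tiny wrapper. The names `zcurl`, `common` and `agent` are invented for this example and are not part of the script above; sending `--location` and `--cookie-jar` on the logout call too is my simplification.

```bash
#!/bin/bash
cjar="/tmp/cjar"
agent="Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.7) Gecko/20100105 Shiretoko/3.5.7"

# Options shared by every request, kept in one array so they are written once.
common=(--silent --location --cookie "$cjar" --cookie-jar "$cjar" --user-agent "$agent")

# Hypothetical wrapper: run curl with the shared options first,
# followed by whatever the caller adds (form fields, URL, ...).
zcurl() {
  curl "${common[@]}" "$@"
}

# Example call (not executed here, it would contact the site):
#   zcurl --data "username=$user" --data "password=$pass" http://www.zooomr.com/login/ > /dev/null
```

Keeping the options in an array rather than in a plain string means values with spaces, such as the user agent, stay intact as single arguments.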

  • First things first: logging in.
Let's look at our website:
http://www.zooomr.com/login/

There are two fields on the login form, username and password, as usual.
Let's take a look at the code of this form:


So, we see the two fields mentioned above, whose names are "username" and "password". There are also three hidden fields: "processlogin", whose value is 1, "gogogo" and "redirect_to". We will just leave them as they are.
We can also see that the form's HTTP method is POST (great! curl supports that) and that the action is /login/. Since we are at http://www.zooomr.com/login/, the relative action resolves to that same URL, http://www.zooomr.com/login/.
This all means that we will send our user name in the username field, our password in the password field and, finally, the hidden fields with their original values.
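To double-check which fields a form expects without reading the whole page source, the saved HTML can be filtered from the terminal. What follows is only a rough sketch: `list_input_names` is a made-up helper, and the sample markup is a hypothetical stand-in built from the field names mentioned above, not zooomr's real page. It assumes each `<input>` tag sits on one line and uses a double-quoted `name` attribute.

```bash
#!/bin/bash
# Print the name attribute of every <input> tag read from stdin.
list_input_names() {
  grep -o -i '<input[^>]*>' \
    | grep -o -i 'name="[^"]*"' \
    | sed -e 's/^[^"]*"//' -e 's/"$//'
}

# Hypothetical stand-in for a saved login page, NOT the site's real markup.
sample_form='<form method="post" action="/login/">
<input type="text" name="username">
<input type="password" name="password">
<input type="hidden" name="processlogin" value="1">
<input type="hidden" name="gogogo">
<input type="hidden" name="redirect_to">
</form>'

# Prints the five field names, one per line.
printf '%s\n' "$sample_form" | list_input_names
```

On a live page the downloaded HTML would be piped in instead, for example `curl -s http://www.zooomr.com/login/ | list_input_names`.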

Let's now take a look at the curl command:

echo "Logging-in..."

#Log-in
curl \
-s \
--cookie $cjar \
--cookie-jar $cjar \
--user-agent "Mozilla/5.0 (X11; U; Linux x86_64; en-US; ..." \
--data "username=$user" \
--data "password=$pass"  \
--data "gogogo=1" \
--data "redirect_to=http://zooomr.com" \
--data "processlogin=1" \
--location \
http://www.zooomr.com/login/ &> /dev/null

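Here is the option-by-option explanation:
-s
Silent mode: hides the progress meter and error messages. The response body is still printed, which is why the output is redirected to /dev/null

--cookie $cjar
Read cookies from this file and send them with the request (not needed here, since nothing has been collected yet)

--cookie-jar $cjar
Write the cookies received from the server to this file when curl finishes

--user-agent "Moz..."
You guessed it: the User-Agent string to send, here copied from a real browser

--data / -F ...
Are used to specify the values of the POST fields (input from forms). --data sends the text exactly as typed, so a value containing &, = or spaces should go through --data-urlencode instead; -F sends a multipart form, which is what file uploads need

--location
Follow the redirect if the server answers that the page has moved

URL
The address the form is sent to; the POST method is implied by --data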
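Throwing all the output away also hides failures. As a hedged sketch (the function names `http_status` and `is_success` are mine, not part of the script), curl's `--write-out` option can print just the final status code so the script can react to it. Keep in mind that many sites answer 200 even when a login is rejected, so this check is necessary rather than sufficient.

```bash
#!/bin/bash
# Print only the final HTTP status code of a request.
# All arguments are passed straight to curl.
http_status() {
  curl --silent --output /dev/null --write-out '%{http_code}' "$@"
}

# Succeed only when the status code given as $1 is in the 2xx range.
is_success() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *)           return 1 ;;
  esac
}

# Example use (not executed here, it would contact the site):
#   status=$(http_status --location --cookie-jar "$cjar" \
#            --data "username=$user" --data "password=$pass" \
#            http://www.zooomr.com/login/)
#   is_success "$status" || echo "Login request failed with HTTP $status" >&2
```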


That was easy! Now we are logged in on their servers, and the session information is kept in a cookie file wherever we chose to put it, in our case /tmp/cjar. We will use this file later on so that the server knows we are logged in.
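If you are curious about what ended up in the cookie file, it is plain text: one cookie per line, seven tab-separated fields, with the name in the sixth and the value in the seventh. The sketch below is my own variation, not the script's method: it swaps the fixed /tmp/cjar for a private file from `mktemp` (assuming that command is available), and `list_cookies` is an invented helper. The jar content is made up so the example runs offline.

```bash
#!/bin/bash
# Create a private, uniquely named cookie jar instead of a fixed /tmp/cjar,
# and delete it automatically when the script exits.
cjar=$(mktemp) || exit 1
trap 'rm -f "$cjar"' EXIT

# Print "domain<TAB>cookie-name" for every cookie stored in a jar file.
# Comment lines have no tabs, so the 7-field test skips them; curl marks
# HttpOnly cookies with a "#HttpOnly_" prefix, which is stripped here.
list_cookies() {
  awk -F '\t' 'NF == 7 { sub(/^#HttpOnly_/, "", $1); print $1 "\t" $6 }' "$1"
}

# Hypothetical jar content, written only so the sketch runs without a server.
printf '# Netscape HTTP Cookie File\n' > "$cjar"
printf 'www.example.com\tFALSE\t/\tFALSE\t0\tsession\tabc123\n' >> "$cjar"

# Prints: www.example.com<TAB>session
list_cookies "$cjar"
```

Since the jar holds your session, a file that only your user can read and that disappears at exit is a little safer than a well-known path in /tmp.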


  • Uploading the file
It is time to upload the file now. Since zooomr has a fancy (and useless) Flash uploader, we will use the non-Flash version they provide, with a regular "Browse..." button and stuff. Let's go: http://www.zooomr.com/photos/upload/?noflash=okayiwill
Note that we do all this exploring in a normal Firefox browser, but only once, to be able to study the page. Once that is done, we will not need to access the site through the browser anymore :)


Once again, let's find this form in the code:

...
...

OK, a bunch of code here, but do not panic, it is the same thing once again:
"Filedata" is the field where you have to put the path to your file.
"is_private"... are the privacy preferences; we will send the default values.

The "action" page is http://upload.zooomr.com/photos/upload/process/, but since the URL of the current page starts with www.zooomr..., http://www.zooomr.com/photos/upload/process/ will work as well.

Now the code, which is a bit longer this time:


echo "Uploading..."

#Upload
curl -s --cookie $cjar --cookie-jar $cjar \
--header "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8" \
--header "Accept-Language: en-us,en;q=0.5" \
--header "Accept-Encoding: gzip,deflate" \
--header "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7" \
--header "Keep-Alive: 300" \
--header "Connection: keep-alive" \
--header "Expect: " \
--user-agent "Mozilla/5.0 ..." \
--location \
--referer "http://www.zooomr.com/photos/upload/?noflash=okayiwill" \
-F "Filedata=@$file" -F "labels=test" -F "is_public=0" -F "is_friend=1" -F "is_family=1" -F "done=1"  \
http://upload.zooomr.com/photos/upload/process/ &> /dev/null


First, the explanation:
--header
Sets the given HTTP header to the specified value; if the value is blank, the header is removed.
None of these headers are actually needed here, but I leave them in for "educational" purposes. Here is a typical HTTP request:

GET / HTTP/1.1
Host: www.google.es
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.7) Gecko/20100105 Shiretoko/3.5.7
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: PREF=ID=5d58514db6...blablabla


As you can see, I tweaked it all so that curl would send the same headers as Firefox. This is really not needed, but I wanted to do it "for fun".

There is one last header: "Expect". It is given with no value, so it is removed from the request.
It happens that curl adds this header on its own (as "Expect: 100-continue", which asks the server to confirm that it will accept the request body before it is sent), but the server didn't like it and replied with a "417: Expectation Failed" error. A simple look through the headers Firefox sends helped me find this out; you can do the same with an add-on called Live HTTP Headers or with Wireshark.
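curl can also show what it sends by itself: in verbose mode, every line that starts with "> " is part of the outgoing request. The helper below is a small sketch of that idea, with `show_request_headers` being a name I made up. Whether a given curl version adds the Expect header depends on the version and on the size of the request, so treat the result as something to check rather than a given.

```bash
#!/bin/bash
# Print the request line and headers curl sends, one per line.
# Verbose output goes to stderr, so it is merged into stdout and filtered:
# "> " marks what the client sends, "< " what the server answers.
show_request_headers() {
  curl --silent --verbose --output /dev/null "$@" 2>&1 \
    | grep '^> ' \
    | tr -d '\r'
}

# Example use (not executed here, it would really send the request):
#   show_request_headers -F "Filedata=@$file" \
#       http://upload.zooomr.com/photos/upload/process/ | grep -i 'expect'
```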

Also, to specify a file to upload, you cannot put all the bytes of the file after the = sign, so @/path/to/file will do the trick and curl will send the file for you.
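The @ syntax accepts a couple of extras after the path, such as `;type=` to state the content type of the file part explicitly. As an illustration only (I have not checked whether zooomr cares about the type, and `form_file_arg` is a hypothetical helper with a deliberately short extension list), the value for `-F` could be built like this:

```bash
#!/bin/bash
# Build the value for curl's -F option in the form: field=@path;type=mime
# The MIME type is guessed from the file extension.
form_file_arg() {
  local field=$1 path=$2 mime
  case "${path##*.}" in
    jpg|jpeg|JPG|JPEG) mime=image/jpeg ;;
    png|PNG)           mime=image/png ;;
    gif|GIF)           mime=image/gif ;;
    *)                 mime=application/octet-stream ;;
  esac
  printf '%s=@%s;type=%s\n' "$field" "$path" "$mime"
}

# Prints: Filedata=@/tmp/holiday.jpg;type=image/jpeg
form_file_arg Filedata /tmp/holiday.jpg
```

It would be used as `curl -F "$(form_file_arg Filedata "$file")" ...`. Paths that contain a semicolon or a comma need extra quoting inside the `-F` value, which this sketch does not handle.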


  • Logging-out: newbie stuff ;)
I'm sure by now you can figure it out by yourself.


Now, try it out: copy the code into a file, set your username and password, run "chmod +x the_file.sh" and execute it with the path to the picture as its parameter.
If you have read it all carefully, you should be able to do the same for lots of online services. Remember that curl works with SSL (https) just as well as without it.
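The script trusts its parameter blindly. A small guard at the top avoids logging in only to fail at the upload step. This is a sketch of my own, with `check_upload_file` as an invented name and the usage text as a placeholder:

```bash
#!/bin/bash
# Return 0 when $1 names a readable regular file; otherwise explain on stderr.
check_upload_file() {
  if [ -z "$1" ]; then
    echo "usage: the_file.sh /path/to/picture" >&2
    return 2
  fi
  if [ ! -f "$1" ] || [ ! -r "$1" ]; then
    echo "cannot read file: $1" >&2
    return 1
  fi
  return 0
}

# In the upload script this would sit right after file="$1":
#   check_upload_file "$file" || exit 1
```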

Comments ? Suggestions ? Go and leave me a comment :)

-btw, credit for the logout line goes to: up2zooomr-


UPDATE:
I just made a little GUI version, so you can call it from a file manager or from a console with a parameter. It also lets you add a title and tags to your picture:

./zooomrUpload /path/to/file

Get it here.
Please be aware that the whole thing is not very user-friendly: there is no error handling and little work has gone into usability. Anyway, I hope you enjoy it!
