5 November 2008

Use Python to track friends’ status on xiaonei

Posted by ideal

I am not very interested in xiaonei.com, but I wanted a program that tells me what my friends and classmates are posting. With this program, you receive your friends' status information in gtalk. The post_status function (which logs in to a gtalk account and sends a message over XMPP) is adapted from solrex (the exact post is here).

To use this program, you need to install the python-xmpp library; on Debian or Ubuntu, you can install it with sudo apt-get install python-xmpp. You also need two gtalk accounts: the one you rarely use sends the message to the one you normally use. Note that the two accounts need to be on each other's friend list. Maybe there are better ways.

#!/usr/bin/python
# -*- coding: UTF-8 -*-
#
# Author: ideal (idealities AT gmail.com)
# Date:   2008-11-04
# Homepage: http://qdhedu.com/ideal

from sgmllib import SGMLParser
import urllib
import urllib2
import xmpp

def post_status(sendto = 'username@gmail.com', message = ''): # sendto: the gtalk account you normally use
    login = 'bjtulinux' # the gtalk account you rarely use (the sender)
    passwd = '' # the sender's password
    talk = xmpp.Client('gmail.com', debug=[])
    talk.connect(server = ('talk.google.com', 5223))
    talk.auth(login, passwd, 'python')
    talk.send(xmpp.Message(sendto, message))

def get_status(email = 'idealities@gmail.com', passwd = ''): # your xiaonei.com account and password
    url = 'http://xiaonei.com/Login.do'
    vals = {'email': email, 'password': passwd}
    useragent = 'Mozilla/5.0 (Windows; U; Windows NT 5.2; zh-CN; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3'

    import cookielib
    cook = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cook))
    opener.addheaders = [('User-Agent', useragent)]
    urllib2.install_opener(opener)
    sock = urllib2.urlopen(url, urllib.urlencode(vals))
    text = sock.read()
    sock.close()
    return text

class Status(SGMLParser):
    def reset(self):
        self.message = []
        self.isInBlogDiv = False
        self.isInStatusDiv = False
        self.isInH4 = False
        self.isInH4A = False
        self.isInDate = False
        self.count = 0
        SGMLParser.reset(self)

    def start_div(self, attrs):
        if self.isInBlogDiv or self.isInStatusDiv:
            self.count += 1
        for key, value in attrs:
            if key == "class":
                if value == "feed feed-blog text-story expand bold-feed ":
                    self.isInBlogDiv = True
                elif value == "feed feed-status text-story expand ":
                    self.isInStatusDiv = True

    def end_div(self):
        if self.isInBlogDiv or self.isInStatusDiv:
            self.count -= 1
        if self.count == -1:
            if self.isInBlogDiv:
                self.isInBlogDiv = False
            elif self.isInStatusDiv:
                self.isInStatusDiv = False
            self.count = 0

    def start_h4(self, attrs):
        if self.isInBlogDiv or self.isInStatusDiv:
            self.isInH4 = True

    def end_h4(self):
        if self.isInH4:
            self.isInH4 = False

    def start_a(self, attrs):
        if self.isInH4:
            self.isInH4A = True

    def end_a(self):
        if self.isInH4A:
            self.isInH4A = False

    def start_span(self, attrs):
        if self.isInH4:
            for key, value in attrs:
                if key == "class" and value == "date":
                    self.isInDate = True

    def end_span(self):
        if self.isInDate:
            self.isInDate = False

    def handle_data(self, text):
        if self.isInH4A:
            # skip the text of the "回复" ("reply") link
            if text != "回复":
                self.message.append(text)
        elif self.isInDate:
            # the date closes an entry, so end the line here
            self.message.append(text + "\n")
        elif self.isInH4:
            self.message.append(text)

    def output(self):
        """Return processed HTML as a single string"""
        return "".join(self.message)

if __name__ == '__main__':
    html = get_status()
    msg = Status()
    msg.feed(html)
    msg.close() # flush any buffered data before reading the result
    post_status(message = msg.output())

You can put it in cron and let it run every n hours.
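
The script above is written for Python 2, and sgmllib was removed in Python 3. Below is a minimal sketch of the same depth-counting idea using the standard html.parser module. It is not a tested replacement: the StatusParser name and the SAMPLE markup are made up for illustration, and only the two class strings are taken from the script above.

```python
from html.parser import HTMLParser

# Class strings copied from the script above (note the trailing space).
FEED_CLASSES = (
    "feed feed-blog text-story expand bold-feed ",
    "feed feed-status text-story expand ",
)


class StatusParser(HTMLParser):
    """Collect the <h4> text of every feed <div>, one line per entry."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.message = []
        self.depth = 0            # 0 = outside a feed div, 1 = the feed div itself
        self.in_h4 = False
        self.in_h4_link = False
        self.in_date = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self.depth > 0:
                self.depth += 1   # nested div inside a feed entry
            elif attrs.get("class") in FEED_CLASSES:
                self.depth = 1    # entering a feed entry
        elif self.depth > 0:
            if tag == "h4":
                self.in_h4 = True
            elif tag == "a" and self.in_h4:
                self.in_h4_link = True
            elif tag == "span" and self.in_h4 and attrs.get("class") == "date":
                self.in_date = True

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1       # reaches 0 when the feed div itself closes
        elif tag == "h4":
            self.in_h4 = False
        elif tag == "a":
            self.in_h4_link = False
        elif tag == "span":
            self.in_date = False

    def handle_data(self, text):
        if self.in_h4_link:
            if text != "回复":    # skip the "reply" link text
                self.message.append(text)
        elif self.in_date:
            self.message.append(text + "\n")
        elif self.in_h4:
            self.message.append(text)

    def output(self):
        """Return the collected entries as a single string."""
        return "".join(self.message)


# Hypothetical markup, only loosely modelled on a feed page.
SAMPLE = (
    '<div class="feed feed-status text-story expand ">'
    '<div><h4><a href="#">Alice</a> is coding '
    '<span class="date">11-04 20:15</span></h4></div>'
    '</div>'
    '<div class="other"><h4>ignored</h4></div>'
)

parser = StatusParser()
parser.feed(SAMPLE)
parser.close()
print(parser.output())
```

For the sample markup this prints the single line "Alice is coding 11-04 20:15". A single depth counter replaces the separate isInBlogDiv/isInStatusDiv flags and count, since both kinds of entry are handled the same way.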
