Twitter recently made a streaming API available that lets developers receive real-time status updates over HTTP streaming. You can read about the specifics of the API here. There are a few different levels of streaming; most of the more heavyweight streams require explicit permission and some kind of agreement with Twitter. Streams are available in JSON and XML. You can also get a delimited stream, which tells you how many bytes each status uses to make I/O a little easier. I am going to show you how to use Apache HttpClient 4.0 and the XPP XML Pull Parser library to write a simple client that consumes the publicly available “spritzer” stream.
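
To give a feel for why a delimited stream makes I/O easier, here is a minimal sketch of reading length-prefixed messages. It assumes each message is preceded by its byte count written as decimal digits on a line of its own, and that blank lines between messages can be skipped. I have not taken that framing from the API documentation, so check it before relying on it. The class and method names are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class DelimitedStreamReader {

    /** Reads every length-prefixed message until the stream ends. */
    static List<String> readAll(InputStream in) {
        List<String> messages = new ArrayList<>();
        try {
            DataInputStream data = new DataInputStream(in);
            while (true) {
                String header = readLine(data);
                if (header == null) {
                    break; // end of stream
                }
                header = header.trim();
                if (header.isEmpty()) {
                    continue; // assumed: blank lines carry no message
                }
                // Assumed framing: the header line is the body size in bytes.
                int length = Integer.parseInt(header);
                byte[] body = new byte[length];
                data.readFully(body); // blocks until exactly `length` bytes have arrived
                messages.add(new String(body, StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return messages;
    }

    /** Reads bytes up to '\n' as ASCII; returns null if the stream has ended. */
    private static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                return sb.toString();
            }
            sb.append((char) b);
        }
        return sb.length() == 0 ? null : sb.toString();
    }

    public static void main(String[] args) {
        // Two fake messages standing in for a network stream.
        byte[] sample = "5\nhello\r\n3\nabc".getBytes(StandardCharsets.UTF_8);
        for (String message : readAll(new ByteArrayInputStream(sample))) {
            System.out.println(message);
        }
    }
}
```

Because the length is known up front, the reader never has to scan the payload for a terminator; it simply waits for that many bytes.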

Setting Up The Client
The first step is to set up the HTTP client. The API authenticates with HTTP basic authentication, using your Twitter username and password. We will configure our HttpClient with those credentials for the “stream.twitter.com” scope.


        // Protocol parameters shared by the connection manager and the client.
        HttpParams params = new BasicHttpParams();
        HttpProtocolParams.setVersion(params, HttpVersion.HTTP_1_1);
        HttpProtocolParams.setContentCharset(params, "utf-8");

        // The stream is served over plain HTTP on port 80.
        SchemeRegistry registry = new SchemeRegistry();
        registry.register(new Scheme("http", PlainSocketFactory.getSocketFactory(), 80));

        ThreadSafeClientConnManager manager = new ThreadSafeClientConnManager(params,
                registry);
        DefaultHttpClient client = new DefaultHttpClient(manager, params);

        // Replace "username" and "password" with your Twitter credentials.
        client.getCredentialsProvider().setCredentials(new AuthScope("stream.twitter.com", 80),
                new UsernamePasswordCredentials("username", "password"));
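
HttpClient builds the authentication header for us, but it helps to know what actually goes over the wire. The sketch below uses only the JDK to produce the Authorization header value that basic authentication defines: the word "Basic" followed by the Base64 encoding of "username:password". The class and method names are made up for the example, and the UTF-8 encoding of the credentials is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class BasicAuthHeader {

    /** Builds the value of the Authorization header for HTTP basic authentication. */
    static String headerValue(String username, String password) {
        String pair = username + ":" + password;
        // Assumption: credentials are encoded as UTF-8 before Base64 encoding.
        String encoded = Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Placeholder credentials, like the ones in the snippet above.
        System.out.println("Authorization: " + headerValue("username", "password"));
    }
}
```

Base64 is an encoding, not encryption, so over plain HTTP on port 80 anyone who can see the traffic can recover the password.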

Requesting And Parsing
Next we will create the GET request for the “spritzer” stream and execute it. After that we will open a pull parser on the network stream and “pull” the XML from the stream as it comes in.

Each status update comes in the format:


<?xml version="1.0" encoding="UTF-8"?>
<status>
....
</status>

It is important to note that each status begins with its own XML declaration.
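
In other words, the stream is a series of small XML documents back to back rather than one large document. As a rough illustration, the sketch below cuts an already-buffered string into one document per declaration. It is a simplification: it works on a complete string rather than a live stream, it assumes "<?xml" only ever appears at the start of a document, and the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

class XmlDocumentSplitter {

    private static final String DECLARATION_START = "<?xml";

    /** Splits a buffer of concatenated XML documents at each XML declaration. */
    static List<String> split(String buffer) {
        List<String> documents = new ArrayList<>();
        int start = buffer.indexOf(DECLARATION_START);
        while (start >= 0) {
            // Look for the next declaration after the current one.
            int next = buffer.indexOf(DECLARATION_START, start + DECLARATION_START.length());
            String document = (next >= 0)
                    ? buffer.substring(start, next)
                    : buffer.substring(start); // last document in the buffer
            documents.add(document.trim());
            start = next;
        }
        return documents;
    }

    public static void main(String[] args) {
        String declaration = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
        String buffer = declaration + "<status>a</status>\n"
                + declaration + "<status>b</status>\n";
        for (String document : split(buffer)) {
            System.out.println("--- document ---");
            System.out.println(document);
        }
    }
}
```

On a live stream the last piece may be an incomplete status, so a real client would need to hold it back until the next declaration arrives.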

Most of the parsing code is the result of a lot of trial and error, so it is not the prettiest. See the comments towards the bottom of the code for some important notes about the implementation.


        HttpGet get = new HttpGet("http://stream.twitter.com/spritzer.xml");
        try {
            HttpResponse resp = client.execute(get);
            int statusCode = resp.getStatusLine().getStatusCode();

            if (statusCode == 200) {
                InputStream stream = resp.getEntity().getContent();
                XmlPullParserFactory factory;
                factory = XmlPullParserFactory.newInstance();
                XmlPullParser parser = factory.newPullParser();

                parser.setInput(stream, "utf-8");
                while (true) {
                    int event = parser.next();
                    if (event == XmlPullParser.START_TAG) {
                        String name = parser.getName();
                        if (name.equals("status")) {
                            String text = null;
                            String screenName = null;
                            while (true) {
                                event = parser.next();
                                if (event == XmlPullParser.START_TAG) {
                                    name = parser.getName();
                                    if (name.equals("text")) {
                                        text = parser.nextText();
                                    } else if (name.equals("user")) {
                                        String userName;
                                        while (true) {
                                            int eventUser;
                                            eventUser = parser.next();
                                            userName = parser.getName();
                                            if (eventUser == XmlPullParser.START_TAG) {
                                                if (userName.equals("screen_name")) {
                                                    screenName = parser.nextText();
                                                }
                                            } else if (eventUser == XmlPullParser.END_TAG
                                                    && userName.equals("user")) {
                                                break;
                                            }
                                        }
                                    }
                                } else if (event == XmlPullParser.END_TAG
                                        && parser.getName().equals("status")){
                                    break;
                                }

                            }
                            // Output the username and status to the console. You will want to
                            // dispatch whatever information you need to something that does
                            // something a bit more substantial.

                            System.out.println(screenName + ": " + text); //insert money making method here

                            // IMPORTANT HACK
                            // Because each status starts with its own <?xml ... ?> declaration,
                            // we have to reset the parser's input to the stream before the next one.
                            parser.setInput(stream, "utf-8");
                        }
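                    } else if (event == XmlPullParser.END_DOCUMENT
                            || (event == XmlPullParser.END_TAG
                            && parser.getName().equals("statuses"))) {
                        // stop at the end of the input or at a closing </statuses> tag
                        break;
                    }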

                }

            }

        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (XmlPullParserException e) {
            e.printStackTrace();
        }
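
If you would rather not pull in XPP, the JDK's built-in StAX parser (javax.xml.stream) can extract the same two fields from a single status document. This is only a sketch of an alternative, not what the code above does: it assumes the stream has already been cut into one document per status, and the StatusParser class and summarize method are names I made up for the example. The element names (text, screen_name) are the ones used in the code above.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StatusParser {

    /** Returns "screenName: text" for one complete status document. */
    static String summarize(String statusXml) {
        String text = null;
        String screenName = null;
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(statusXml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                        continue; // only element starts are interesting here
                    }
                    String name = reader.getLocalName();
                    // Keep the first occurrence of each field.
                    if (name.equals("text") && text == null) {
                        text = reader.getElementText();
                    } else if (name.equals("screen_name") && screenName == null) {
                        screenName = reader.getElementText();
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("not a well-formed status document", e);
        }
        return screenName + ": " + text;
    }

    public static void main(String[] args) {
        String sample = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<status><text>hello</text>"
                + "<user><screen_name>bob</screen_name></user></status>";
        System.out.println(summarize(sample));
    }
}
```

Parsing each status as its own document means the parser never sees a second XML declaration, so no input reset is needed.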

Here is the entire source. You should be able to build it and run it in the console without many problems, but remember: never try to actually read the stream.

stream

Posted on June 22nd, 2009
