How-to: using the new Facebook stream API in a desktop app

Started by dhilipkumar, May 02, 2009, 08:55 PM


dhilipkumar

Facebook launched a new set of APIs on Monday that allow third-party software to interact with the Facebook activity stream. Developers can use these new APIs to build sophisticated Facebook client applications that give users direct access to the stream from their desktop.

Courtesy of these APIs, rich support for Facebook could soon arrive in your favorite Twitter client and other social networking programs. In this article, I'll give you an inside look at how I used the new APIs to add full support for the Facebook stream in Gwibber, my own open source microblogging client for Linux.

The activity stream includes several kinds of content, including status updates, images, links, videos, and content that is imported from other services, such as Delicious bookmarks and Google Reader shared items. Users can post comments on stream items and can also indicate that they "like" a specific stream item. Facebook now provides programmatic access to all of this data through several different mechanisms. Developers can use a conventional REST method, an FQL query, or an Atom-based feed.

Atom Activity Extensions
The Atom-based feed is, perhaps, the most intriguing aspect of the new open streams system. Atom is a standardized XML-based format for simple syndication that is similar to RSS but is more robust and extensible. Rather than completely inventing its own dialect, Facebook wisely chose to put its weight behind Atom Activity Extensions, an emerging effort to build a standardized set of activity tags that can be used in Atom feeds.

Atom Activity Extensions is still in the draft stage and is not yet a formal standard. The draft is authored by David Recordon and Martin Atkins of Six Apart under the aegis of the DiSo project, a collaborative effort to build open standards for data portability and social networking.

Facebook has become one of the first major adopters of Atom Activity Extensions, a move that will significantly boost the visibility of the nascent standard and help it gain traction. MySpace is also committed to the format and working on an implementation, so it now has the backing of two of the most popular social networking websites. This is a major win for interoperability and it could eventually facilitate development of universal activity stream clients that function in much the same way that desktop news feed readers work today.

Chris Messina, a leading figure in DiSo who is well-known for his work with OAuth and is closely involved with Atom Activity Streams, is enthusiastic about Facebook's adoption of the format. He commented on the implications in a message posted to DiSo's Activity Streams mailing list on Monday following Facebook's announcement.

"This is indeed good news for Facebook and for this community effort," he wrote. "At the very least, I'm excited to see how similar we can get the feeds coming out of MySpace and Facebook and I'm also eager to start looking at how we can replace the current activities API in OpenSocial with the Activity Streams format."

Facebook makes the stream available to third-party software through a URL. In order to access the feed, the program will need to be authenticated and will have to provide as parameters a session key and signature checksum. The following is the URL format:

http://www.facebook.com/activitystreams/feed.php?source_id=<user-id>&app_id=<yourApplicationId>&session_key=<session_key>&sig=<checksum-slash-signature>&v=0.7&read&updated_time=<UnixTime>
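
As a rough illustration of that URL format, the following Python sketch assembles the feed address from its parts. The function name and argument names are made up for this example, and the signature is taken as an already-computed input because the article does not describe how to derive it.

```python
from urllib.parse import urlencode

FEED_ENDPOINT = "http://www.facebook.com/activitystreams/feed.php"


def build_feed_url(source_id, app_id, session_key, sig,
                   updated_time=None, version="0.7"):
    """Assemble the Atom feed URL in the format shown above.

    `sig` must already be computed by the caller; this helper
    (a hypothetical name) only puts the pieces together.
    """
    params = [
        ("source_id", source_id),
        ("app_id", app_id),
        ("session_key", session_key),
        ("sig", sig),
        ("v", version),
    ]
    # "read" appears in the URL format as a bare flag without a value.
    query = urlencode(params) + "&read"
    if updated_time is not None:
        query += "&" + urlencode({"updated_time": int(updated_time)})
    return FEED_ENDPOINT + "?" + query


# Placeholder values only; real ones come from an authenticated session.
url = build_feed_url("1234", "5678", "abc", "deadbeef",
                     updated_time=1241300000)
print(url)
```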

The significance of each of those attributes is described in greater detail in the official documentation, but those values should all be relatively familiar to developers who have worked with the Facebook API.

Although the Atom-based feed will be very useful for developers who are building generalized stream clients, Facebook's native APIs are more practical for client applications that will integrate tightly with the service. In Gwibber, I chose to use the new activity stream REST API methods.

arstechnica

Implementing Facebook activity streams in Gwibber
I originally created Gwibber in 2007 with the goal of building a social networking application for the GNOME desktop environment. It brings together comprehensive support for several popular microblogging services in a single program with a unified message stream. It is written in the Python programming language and is distributed under the terms of the General Public License (GPL). The actual content stream is drawn with an embedded WebKit HTML renderer and the rest of the user interface is built with the GTK+ toolkit.

Gwibber's current Facebook functionality is built on top of PyFacebook, a lightweight open source Python library that wraps the Facebook APIs. PyFacebook mitigates a lot of the pain of Facebook client development because it handles all of the authentication, session, and signature hash stuff. It hides those idiosyncrasies under a simple object-oriented interface that is easy for developers to use.

PyFacebook is also very easy to extend when new Facebook API methods are introduced. Each Facebook API method is described in the PyFacebook library using a simple data structure that specifies the method's name and parameter types. PyFacebook does not appear to have been updated to work with the new stream API methods yet, but it was trivially easy for me to do it myself. You can see my simple PyFacebook modifications here. Note that I did not add all of the new methods, just the ones that I'm using in Gwibber.
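
To show the general idea of describing API methods as data, here is a toy sketch. It is not PyFacebook's actual format: the table layout, the build_call helper, and the parameter names (uid, post_id, comment) are all invented for illustration.

```python
# Toy method table: each method maps to a list of (parameter, type) pairs.
# This mimics the idea only; it is NOT PyFacebook's real data structure.
METHODS = {
    "stream": {
        "get": [("uid", int), ("start_time", int),
                ("end_time", int), ("limit", int)],
        "addComment": [("post_id", str), ("comment", str)],
    },
}


def build_call(namespace, method, **kwargs):
    """Check kwargs against the table and return (method_name, params)."""
    spec = dict(METHODS[namespace][method])
    params = {}
    for name, value in kwargs.items():
        if name not in spec:
            raise TypeError("%s.%s has no parameter %r"
                            % (namespace, method, name))
        # Coerce each value to the type declared in the table.
        params[name] = spec[name](value)
    return "%s.%s" % (namespace, method), params


name, params = build_call("stream", "get", uid="42", limit="80")
print(name, params)
```

With a table like this, supporting a new API method means adding one entry rather than writing a new function.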

Each service that is supported in Gwibber is implemented in its own module which exposes its functionality through a set of methods and properties that is consistent across all of the service modules. This makes it possible to wrap the services with a generalized abstraction layer so that the rest of the client application doesn't have to understand the differences between the various services. This abstraction layer is what makes it possible for Gwibber to display a combined stream of the messages from all of the services. To implement support for the activity stream, I rewrote most of Gwibber's Facebook service module.
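
The shape of such an abstraction layer can be sketched in a few lines. The class and method names below (Message, get_messages, combined_stream) are hypothetical and are not taken from Gwibber's source; the two service classes return canned data instead of calling a real API.

```python
class Message(object):
    """A service-neutral message, as the rest of the client would see it."""

    def __init__(self, service, sender, text, time):
        self.service = service
        self.sender = sender
        self.text = text
        self.time = time  # UNIX timestamp


class FakeFacebook(object):
    """Stand-in service module; a real one would call the Facebook API."""
    name = "facebook"

    def get_messages(self):
        return [Message(self.name, "Alice Example",
                        "Posted a photo", 1241300100)]


class FakeTwitter(object):
    """Stand-in service module exposing the same interface."""
    name = "twitter"

    def get_messages(self):
        return [Message(self.name, "bob_example", "Hello world", 1241300200),
                Message(self.name, "carol_example", "Lunch time", 1241300000)]


def combined_stream(services):
    """Merge messages from every service, newest first."""
    messages = []
    for service in services:
        messages.extend(service.get_messages())
    return sorted(messages, key=lambda m: m.time, reverse=True)


stream = combined_stream([FakeFacebook(), FakeTwitter()])
for m in stream:
    print("[%s] %s: %s" % (m.service, m.sender, m.text))
```

Because combined_stream only relies on get_messages, it never needs to know which service a message came from.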

Obtaining stream data
To add the full Facebook stream to Gwibber, I had to process the stream and extract the values into standard Gwibber message classes. I started by adding support for reading the stream. With my modified version of PyFacebook, this is very easy. I call the stream.get method on a PyFacebook instance. When called, the stream.get method will return the contents of the stream in either XML or JSON, depending on what you have requested.

The only parameter that is required by stream.get (besides the session key and others that are handled automatically by PyFacebook) is the UID of your application's user. There are several optional parameters that you can provide to customize the output. For example, you can provide start_time and end_time parameters, which limit the results to stream content that was published between the specified times. This is roughly Facebook's equivalent of Twitter's since_id and max_id values.

There is also a limit parameter (similar to Twitter's count) which allows you to specify how many messages you would like to download. The default if no limit value is explicitly specified is 30 posts. The maximum number that can be retrieved at once isn't documented, so I did some experimentation to see if I could figure it out.
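
As a small sketch of how these optional parameters might be prepared, the helper below builds a time window as UNIX timestamps together with a limit. The helper name stream_window is made up, and whether the values can be passed as keyword arguments depends on how the stream methods were added to PyFacebook.

```python
import time


def stream_window(hours=24, limit=30, now=None):
    """Return start_time, end_time, and limit values for a stream request.

    `now` can be supplied as a UNIX timestamp to make the result
    reproducible; by default the current time is used.
    """
    end_time = int(now if now is not None else time.time())
    start_time = end_time - hours * 3600
    return {"start_time": start_time, "end_time": end_time, "limit": limit}


# The last two hours of posts, at most 50 of them.
params = stream_window(hours=2, limit=50, now=1241300000)
print(params)
```

Assuming keyword arguments are accepted, the result could then be passed along the lines of fb.stream.get(UID, **params).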

I tried pulling down 400 but only got 312, with the oldest messages dating back seven days. This leads me to believe that there is probably not a numerical maximum but that it will only give you access to a week of messages. Unlike Twitter, Facebook's stream API doesn't support the concept of paging, so you can't go back any further than that or iteratively download your entire history.

Parsing the stream
The data structure returned by stream.get consists of three sections: albums, posts, and profiles. The posts section, as the name implies, includes the activity stream posts. Each post has a set of attributes which includes the ID of the user who created the post (actor_id), a timestamp in UNIX format which indicates the time that the post was created, the text content of the post (message), a URL with the post's permalink, and structures which contain media attachments and comments. You can find a basic overview of these values in the documentation.

In the posts section of the stream, individual users are referenced by ID and user information is not included directly. For example, the comment data does not include the name or profile URL of the person who posted the comment. Instead, the UID is stored in the comment's fromid attribute.

Basic information about all of the users who are referenced in the stream posts is included in the profiles section of the data returned by stream.get. This is very different from Twitter and FriendFeed, which repeat the basic user information in every place where it is used.

The advantage of Facebook's approach is that it reduces the total amount of XML or JSON content that is returned by the API call. The downside is that it increases the complexity of processing the data. In order to get information about the user who made a post, we have to get the UID and match it with one in the profile section.
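
To make the matching step concrete, here is a minimal sketch that runs against made-up data shaped like the structure described above. The sample values are invented, and the lookup is a plain linear search, which is the most direct way to express the idea.

```python
# Made-up data in the three-section layout: albums, posts, profiles.
sample = {
    "albums": [],
    "posts": [
        {"actor_id": 1001, "created_time": 1241300000,
         "message": "Hello from the stream", "comments": {"count": 2}},
        {"actor_id": 1002, "created_time": 1241300300,
         "message": "Another update", "comments": {"count": 0}},
    ],
    "profiles": [
        {"id": 1001, "name": "Alice Example"},
        {"id": 1002, "name": "Bob Example"},
    ],
}


def find_profile(data, uid):
    """Scan the profiles section for the entry whose ID matches uid."""
    for profile in data["profiles"]:
        if profile["id"] == uid:
            return profile
    return None  # the UID is not present in this response


for post in sample["posts"]:
    author = find_profile(sample, post["actor_id"])
    print("%s: %s" % (author["name"], post["message"]))
```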

In my first rough pass at implementing support for activity streams, I made a simple function that just iterates over the profiles until it finds one with the matching ID and then it returns it. A more efficient approach is to build a hash table that associates the ID with the profile structure and then query it every time you need a profile. This is nice and simple with Python:


profiles = dict((p["id"], p) for p in data["profiles"])

To give you a clearer idea of how this works, I made a simple example that shows how to use the stream data to display the posts on stdout:


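import datetime

import facebook  # PyFacebook, with the stream methods added

# APP_KEY, SECRET_KEY, SESSION_KEY, and UID are your own credentials.
fb = facebook.Facebook(APP_KEY, SECRET_KEY)
fb.session_key = SESSION_KEY
fb.uid = UID
fb.secret = SECRET_KEY

# Fetch the stream through the authenticated instance, not the module.
data = fb.stream.get(UID, limit=80)

# Index the profiles by ID so that each lookup is a single dict access.
profiles = dict((p["id"], p) for p in data["profiles"])

for post in data["posts"]:
  # Skip posts that have no text content.
  if "message" in post and post["message"]:
    sender_name = profiles[post["actor_id"]]["name"]
    text = post["message"]
    posted = datetime.datetime.fromtimestamp(post["created_time"])
    comments = post["comments"]["count"]

    print("(%s) %s: %s (%s comments)" % (
      posted, sender_name, text, comments))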



Attachments
Each post in the activity stream can have one attachment with multiple rich media items, such as images and video thumbnails. Attachments are also used in some cases to display content from Facebook applications and third-party Web services (like Digg, for example) that are imported into the activity stream.

This information is contained in the attachment attribute of each post item. The attachment attribute has a media value which holds a list of the individual media items. Attachments have a few common attributes, including a name, caption, description, and permalink URL.

Each media item has a permalink URL and a type attribute that specifies the nature of its content. So far, the only types I have encountered are designated as a "photo", "video", or "link". There may be others added in the future or others that I haven't run across in my own stream. The behavior of these content types seems to be pretty similar.

Most media items have an src attribute that provides a URL for an image that can be used to represent the attachment. For video items, this is typically a thumbnail frame of the video; for photo items, it's typically the image itself. Links also often provide thumbnails, especially in cases where the link is pointing to external media. For example, when somebody posts a tinypic image in the stream, it shows up as a link attachment that points to the image and its src attribute is a thumbnail that is hosted on Facebook.

In some cases, when the attachment thumbnail is hosted by Facebook, the value of the src attribute is the image's path relative to Facebook's root rather than a full URL. In those cases, you have to manually prepend "http://facebook.com" to make it a full URL. I handle this in Gwibber by checking to see if the first character in the src value is a forward-slash.

Gwibber uses HTML to describe the contents of an individual message body, so I use a bit of simple markup to display the attachment name and description. Gwibber also iterates through the media items and appends them to the message thumbnail array which is translated to HTML by Gwibber's theme engine. The manner in which thumbnails are displayed is dictated by a theme template, so I don't specify it in the service module. The following is a snippet of the attachment handling code in Gwibber:

if data["attachment"]:
  # Show the attachment's name and description as simple markup.
  if "name" in data["attachment"]:
    self.html_string += "<p>%s</p>" % data["attachment"]["name"]

  if "description" in data["attachment"]:
    self.html_string += '<p class="text">%s</p>' % data["attachment"]["description"]

  for m in data["attachment"]["media"]:
    if m["type"] in ["photo", "video", "link"]:
      # A leading slash means the path is relative to Facebook's root.
      if m["src"] and m["src"][0] == "/":
        m["src"] = "http://facebook.com" + m["src"]
      self.thumbnails.append(m)


As you can see, there isn't much to it. This seems to work reasonably well for most types of attachment content, but there might still be additional cases that it doesn't address.
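
For readers who want to experiment outside of Gwibber, the following standalone sketch covers the same thumbnail logic a little more defensively. It is not Gwibber's code: the function name is invented, and skipping media items that have no src value is a choice made for this example.

```python
FACEBOOK_ROOT = "http://facebook.com"
KNOWN_TYPES = ("photo", "video", "link")


def collect_thumbnails(attachment):
    """Return media items of a known type, each with an absolute src URL.

    Tolerates a missing attachment, a missing media list, and media
    items without a src value (those are skipped).
    """
    thumbnails = []
    for item in (attachment or {}).get("media") or []:
        if item.get("type") not in KNOWN_TYPES:
            continue
        src = item.get("src") or ""
        if not src:
            continue  # nothing to display for this item
        if src.startswith("/"):
            # Relative paths are resolved against Facebook's root.
            src = FACEBOOK_ROOT + src
        # Copy the item so the caller's data is left unmodified.
        thumbnails.append(dict(item, src=src))
    return thumbnails


attachment = {"media": [
    {"type": "photo", "src": "/images/a.jpg"},
    {"type": "link", "src": "http://example.com/t.png"},
    {"type": "video"},
    {"type": "unknown", "src": "/x"},
]}
result = collect_thumbnails(attachment)
print([m["src"] for m in result])
```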

FQL is Facebook's SQL-like database query syntax. It's less flexible and expressive than conventional SQL, but it does let you do some pretty fine-grained queries. I came up with the following FQL query to get the relevant profile info for users who have made comments on a particular post:

SELECT name, profile_url, pic_square, uid
  FROM user WHERE uid in (
    SELECT fromid FROM comment WHERE post_id = '%s')

When I used that query and the stream.getComments method, I was able to get enough information to build a reasonable comment thread interface that is on par with Gwibber's support for the equivalent FriendFeed feature. There were, however, a few other little issues that I ran into when implementing that functionality.
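
The sketch below shows one way the query and the comment records could be combined into a thread. It runs on made-up data; the helper names are invented, and the "text" key on each comment is an assumption about the comment record layout.

```python
COMMENT_AUTHORS_FQL = (
    "SELECT name, profile_url, pic_square, uid "
    "FROM user WHERE uid IN ("
    "SELECT fromid FROM comment WHERE post_id = '%s')"
)


def comment_authors_query(post_id):
    """Fill the post ID into the FQL query, rejecting unsafe characters."""
    post_id = str(post_id)
    if "'" in post_id or "\\" in post_id:
        raise ValueError("unexpected character in post_id")
    return COMMENT_AUTHORS_FQL % post_id


def build_thread(comments, users):
    """Pair each comment with the profile info returned by the query."""
    by_uid = dict((u["uid"], u) for u in users)
    thread = []
    for comment in comments:
        user = by_uid.get(comment.get("fromid"), {})
        thread.append({
            "name": user.get("name"),
            "pic": user.get("pic_square"),
            "text": comment.get("text"),  # assumed field name
        })
    return thread


query = comment_authors_query("123_456")
users = [{"uid": 7, "name": "Alice Example",
          "profile_url": "http://example.com/alice",
          "pic_square": "http://example.com/alice.jpg"}]
thread = build_thread([{"fromid": 7, "text": "Nice photo"}], users)
print(query)
print(thread)
```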

Another oddity of Facebook is that it's possible for users to post anonymous comments on stream items. The output of stream.getComments will omit the sender ID value for comments that were posted anonymously. Instead, it provides a username attribute that is populated with an arbitrary value supplied by the user. You have to account for this when processing comment data.
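
A minimal way to account for this, assuming the comment records are dictionaries with the fromid and username keys described above, is to fall back to the supplied name when no sender ID is present. The helper name and the final "Anonymous" label are choices made for this sketch.

```python
def comment_author_name(comment, users_by_uid):
    """Pick a display name for a comment.

    Uses the profile matched by fromid when there is one; otherwise falls
    back to the free-form username supplied with an anonymous comment.
    """
    uid = comment.get("fromid")
    if uid and uid in users_by_uid:
        return users_by_uid[uid]["name"]
    return comment.get("username") or "Anonymous"


users_by_uid = {7: {"uid": 7, "name": "Alice Example"}}

print(comment_author_name({"fromid": 7}, users_by_uid))
print(comment_author_name({"username": "passer-by"}, users_by_uid))
print(comment_author_name({}, users_by_uid))
```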

Ever since stream comments were first introduced, Gwibber users have been asking me to add support for replying to Facebook messages through the client. This was not previously possible, but the new stream API makes it pretty easy. You just use the stream.addComment method, which requires the target message ID and the comment text as parameters.

Permissions
Facebook's extended permission system requires third-party applications to receive authorization from users before using certain API features. This is intended to prevent applications from behaving in ways that are not anticipated by the user. Facebook has introduced two new permission flags for the stream APIs: publish_stream and read_stream.

The read_stream permission indicates that the application is permitted to read the contents of the stream. The publish_stream permission indicates that an application can interact with the stream by posting content, "likes," and comments.
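
A client could keep track of which permission each action needs with a small lookup table. The sketch below is illustrative only: the action names and the can_perform helper are invented, and how an application learns which permissions the user has granted is outside its scope.

```python
# Which extended permission each client action depends on.
REQUIRED_PERMISSION = {
    "read": "read_stream",
    "post": "publish_stream",
    "like": "publish_stream",
    "comment": "publish_stream",
}


def can_perform(action, granted):
    """Return True if the set of granted permissions covers the action."""
    return REQUIRED_PERMISSION[action] in granted


granted = {"read_stream"}
print(can_perform("read", granted))
print(can_perform("comment", granted))
```

A check like this lets the interface hide the comment and "like" controls when only read access has been granted.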

It is extremely important to note that these permissions are not generally accessible yet. Facebook says that the stream features are still under development and that during the beta phase of the stream API rollout, only application developers will be able to grant these permissions for their applications. What that means is that regular users will not yet be able to use software that takes advantage of the new stream API—it's currently available for testing and development purposes only.

Conclusion
In this article, I primarily intended to illuminate the value of the open activity stream API and provide some insight into how it can be used in client applications. There are, however, a few other more subtle takeaway points that I think are worth exploring briefly.

One of my goals when I was implementing support for the Facebook stream API in Gwibber was to see if I could do it without having to make significant changes to the program outside of the Facebook service module.

I extended Gwibber's template system and service abstraction layer to add support for "like" and inline comments as part of my effort to make Gwibber handle FriendFeed a couple of weeks ago. When I added the full Facebook stream on Monday, I wanted to be able to introduce support for those same capabilities in the Facebook service module without changing the way that they are implemented in the abstraction layer or template system. The point was to see if Gwibber could be designed to do most of the heavy lifting for those features with generalized code that isn't specific to either service and is used the same way across both.

The fact that I was able to do so illuminates remarkable idiomatic similarities between Facebook and FriendFeed. There is a very noticeable shift towards the microblogging paradigm in the social networking space and the degree of differentiation between the competing social networking websites is beginning to fade. The growing support for Atom Activity Extensions is also very significant and represents a major step forward for interoperability between services.

The full source code of the Gwibber microblogging client is available from the project's version control system, which is hosted by Launchpad. I've committed the new Facebook stream features to the template-facebook-stream branch, and the contents of the branch are roughly at a beta level of quality, but the limits imposed by Facebook on who can apply the new extended permissions for streams generally preclude widespread testing at this point.


by Ryan Paul
arstechnica