Recipe: Parsing RSS and Atom Feeds

Sometimes it’s desirable to be able to ingest a remote RSS or Atom feed in order to make content available within a web application. Clearly, the easiest way to expand the content offerings of a web site is to incorporate content from other sources. Standards like RSS and Atom were designed precisely to support the syndication of content in this fashion.

The first thing that pops into developers’ heads when this kind of requirement comes up is the dawning realization that they may have to write some really ugly XML-parsing code. It just sounds like one of those dreary, painful programming tasks that occasionally come down the pike.

The Problem

Ingest RSS or Atom feeds and parse the content so that it can be repurposed for the needs of a Rails web application.

The Solution

The HTTParty gem makes it almost trivial to parse both RSS and Atom feeds. Listing 1 shows the Ruby code for the Feed class.

Listing 1: The Feed Class

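  require 'uri'
  require 'httparty'

  class Feed
    include HTTParty
    format :xml

    attr_reader :feed_url

    def initialize(feed_url)
      @feed_url = feed_url
    end

    # Site URL of the feed, without the .atom/.rss extension.
    def url
      uri = URI.parse(@feed_url)
      strip_feed_extension(uri.scheme + '://' + uri.host + uri.path)
    end

    # Fetches and parses the feed, returning the contents of the root
    # "feed" element (the Atom root; an RSS document is rooted at "rss").
    def latest(params={})
      response = {}
      begin
        response = Feed.get(@feed_url)
      rescue REXML::ParseException => e
        Rails.logger.warn("forum feed parse error: " + e.message)
        response["feed"] = ""
      end

      response["feed"]
    end

    private

      # Escape the dot and anchor the pattern at the end of the string,
      # so that only a trailing .atom or .rss extension is removed.
      def strip_feed_extension(uri)
        uri.sub(/\.(atom|rss)\z/, '')
      end
  end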

Save the class as feed.rb in the lib directory of your Rails application. Then run script/console to bring up a console.

> f = Feed.new('https://www.keenertech.com/articles.atom')
> feed = f.latest

That’s all there is to it. The feed has been parsed already. So, let’s view some summary information about the feed.

> feed['title']
KeenerTech.com
> feed['link']['href']
https://www.keenertech.com/articles.atom

Well, that’s great, but what about the entries?

> entries = feed['entry']
[ {}, {}, …]
> e = entries[0]
> e['title']
Leveraging Rails to Build Facebook Apps
> e['author']['name']
David Keener
> e['link']['href']
https://www.keenertech.com/articles/2010/09/29/leveraging-rails-to-build-facebook-apps
> e['summary']
My presentation on "Leveraging Rails to Build Facebook Apps," which I just gave at SunnyConf, is now available online. This presentation is a distillation of some of the practical tactics that my development team at MetroStar Systems has used to create highly successful…
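
The console session above assumes an Atom feed with several entries. As a rough sketch, the helper below shows one way to get a uniform list of titles and links out of either an Atom or an RSS 2.0 document. The name feed_items is hypothetical and not part of HTTParty. It also assumes that a feed with exactly one entry parses to a single Hash rather than an Array, which is worth verifying against the parser you actually use.

```ruby
# Hypothetical helper, not part of HTTParty or the Feed class in Listing 1.
# Assumes `parsed` is the Hash produced by parsing a whole Atom or RSS 2.0
# document, keyed by its root element ("feed" or "rss").
def feed_items(parsed)
  if parsed['feed']
    raw = parsed['feed']['entry']               # Atom: <feed><entry>
  elsif parsed['rss'] && parsed['rss']['channel']
    raw = parsed['rss']['channel']['item']      # RSS 2.0: <rss><channel><item>
  else
    raw = nil
  end

  # Assumption: a feed with exactly one entry parses to a Hash, not an Array.
  raw = [raw] if raw.is_a?(Hash)

  Array(raw).map do |item|
    link = item['link']
    link = link.first if link.is_a?(Array)      # several <link> elements
    # Atom links carry the URL in an href attribute; RSS links are plain text.
    href = link.is_a?(Hash) ? link['href'] : link
    { :title => item['title'], :link => href }
  end
end
```

Since Feed#latest in Listing 1 returns only the contents of the "feed" element, something like feed_items('feed' => f.latest) would be needed to apply this to its result.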

Now, to quote Spider-Man, “with great power comes great responsibility.” HTTParty is just using REXML to do the parsing, which isn’t the speediest parser around, but it’s more than good enough for most processing tasks.

Still, for performance reasons, you wouldn’t want to parse a remote XML feed every time a particular web page was requested. So, this is the type of task that demands some form of data caching, whether through memcached or by simply storing feed data in the database for later use.
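
To make the caching idea concrete, here is a minimal sketch of a time-based cache in plain Ruby. The FeedCache class is hypothetical and only keeps data in the memory of one process. In a real Rails application, Rails.cache.fetch with an :expires_in option backed by memcached would likely be the more natural choice.

```ruby
# Hypothetical in-process cache: a stand-in for memcached or a database table.
class FeedCache
  def initialize(ttl_seconds, clock = lambda { Time.now })
    @ttl   = ttl_seconds
    @clock = clock   # injectable, so expiry can be exercised without sleeping
    @store = {}
  end

  # Returns the cached value for key if it is younger than the TTL;
  # otherwise runs the block, caches its result, and returns it.
  def fetch(key)
    now   = @clock.call
    entry = @store[key]
    return entry[:value] if entry && (now - entry[:stored_at]) < @ttl

    value = yield
    @store[key] = { :value => value, :stored_at => now }
    value
  end
end
```

With a cache created once at startup, for example FEED_CACHE = FeedCache.new(15 * 60), a call such as FEED_CACHE.fetch(url) { Feed.new(url).latest } would hit the remote feed at most once every fifteen minutes per URL. The constant name and the fifteen-minute interval are illustrative choices only.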

Recipe: Default Ordering in Rails

Sometimes it’s convenient to ensure that all rows in a database table are always retrieved in a specified order for consistency. It turns out that this is trivial to do in Rails (both Rails 2.3.x and Rails 3.x).

I had a SETTINGS table in the database that defined the settings that were available in order to customize a product. Anywhere that those settings were listed, I wanted them to be listed in alphabetical order.

In standard Rails fashion, I had a Setting model that corresponded to the database table. As shown below, a default scope can be defined that will automatically be applied whenever find is called for the model (unless overridden by the caller).

Listing 1: The Setting Model (Rails 2.3.x)

   class Setting < ActiveRecord::Base
       default_scope :order => 'name'
   end

Listing 2: The Setting Model (Rails 3.x)

   class Setting < ActiveRecord::Base
       default_scope order(:name)
   end

That’s it. One line of code in Rails 2.3.x or Rails 3.x, and my objective was achieved throughout my entire application.

On another note, Arel can be used in Rails 3.x to chain conditions:

   default_scope where(:deleted_at => nil).order(:name)

Should you need to override a default scope in Rails 3.x, here’s how:

   Setting.unscoped.order(:created_at).all

Default scopes can be useful and convenient; just don’t overuse them.

I Have No Klout

Apparently, I have no clout, at least according to Klout, a new measurement of social media influence. Check out the Klout web site for more information about this new statistic.

Twitter Problems

I just got “rate-limited” doing a search on Twitter for #RubyNation. This was the first time I’ve used Twitter in the past week because I’ve been busy with multiple projects. It’s also, by extension, the first and only search I’ve done in the past week. I have to say that either Twitter’s search is broken, or their algorithm for identifying people who need to be rate-limited is totally FUBAR.

Note: It turned out to be a widespread Twitter problem.

Social Media Stats

HubSpot is a great site for collecting statistics about social media. Here’s a link to more stats than you would believe possible.

New Toy: Final Cut Express

My new toy just arrived, compliments of Amazon.com: Final Cut Express. So, I’m now armed and dangerous when it comes to video editing. OK, ok, so it’s not Final Cut Pro, but it’s got a solid subset of features for video editing (and it was MUCH cheaper), and I’m working with two other editors who have Final Cut Pro (so they can do the extra heavy lifting if needed). Onwards to the editing of RubyNation conference content!

Two External Drives and Final Cut Express

I just spent $641.16 on equipment to accommodate the video editing needs of RubyNation 2010. Specifically, I ordered two 2TB Western Digital external drives with FireWire 800 support to handle our video storage needs. The FireWire 800 support also ensures that the drives can be used as scratch drives during video editing. Additionally, I ordered Final Cut Express for myself (my other editors both have Final Cut Pro, so we’re covered).

Ruby is 10th Most Popular Programming Language

TIOBE Software publishes a monthly index of the top programming languages. Ruby is listed as the 10th most popular language. Java, as you might expect, is ranked first (although its market share has been steadily declining over time). It’s an interesting overview. Check it out.

Note: (2018/01/13) It’s years later, and Ruby is currently in 11th place.