
An Alternate View of the ipodder.org Directory

The ipodder.org directory is a very elegant solution to a difficult problem: how do you maintain a directory that will grow in size and be maintained by many people without applying a ton of technology? Simple: use OPML. (Click here to go to the alternate view or click "read more" to view the complete article.)

OPML is a very simple XML format that is used to form an outline: a nested set of links or a directory listing. An outline entry, or node, in OPML can point to another file that contains other nodes, so a skeleton outline can be formed that allows many users to each maintain their own nodes. The links in the nodes can point to richer resources, which means that the only real payload the OPML carries is pointers. OPML is a great way to make a web-based table of contents. Okay, that's the 15 second description. Does it work? Well, sort of.
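To make the idea of nodes and pointers concrete, here is a minimal Python sketch that reads a small OPML outline and flattens it into indented rows. The sample outline, its node names and its example.com URLs are made up for illustration, and the attribute conventions (xmlUrl for a feed, url for a pointer to another OPML file) are assumptions about how a directory like this is typically formed.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; node names and URLs are invented for illustration.
SAMPLE_OPML = """<opml version="1.1">
  <head><title>Sample directory</title></head>
  <body>
    <outline text="Technology">
      <outline text="Example Tech Show" type="rss"
               xmlUrl="http://example.com/tech/rss.xml"/>
    </outline>
    <outline text="Music" type="link"
             url="http://example.com/music.opml"/>
  </body>
</opml>"""


def walk(node, depth=0):
    """Yield (depth, text, pointer) for every outline node under `node`."""
    for child in node.findall("outline"):
        # A node carries either a feed link, a link to another file, or nothing.
        pointer = child.get("xmlUrl") or child.get("url")
        yield depth, child.get("text", ""), pointer
        yield from walk(child, depth + 1)


def outline_rows(opml_text):
    """Parse OPML text and return its nodes as a flat list of rows."""
    body = ET.fromstring(opml_text).find("body")
    return list(walk(body))


if __name__ == "__main__":
    for depth, text, pointer in outline_rows(SAMPLE_OPML):
        print("  " * depth + text + (f" -> {pointer}" if pointer else ""))
```

The "Music" node here holds nothing but a pointer to another file, which is what lets a different maintainer own that branch of the directory.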

The ipodder.org directory is one of the few large-scale projects using OPML to create a directory. With many people contributing and not many tools available, there are some inconsistencies in the way node maintainers form their OPML lists. As a result, when you click on the OPML link at ipodder.org to get the complete raw OPML directory (http://www.ipodder.org/discuss/reader$4.opml), gaps appear in a few places. For instance, the Technology node does not show any links, although the pointer to the technology OPML file is there. Clicking through to the home of the technology site reveals that there is an OPML file there, but it does not seem to be a pure text file. My browser (Firefox 1.0) tried to render it (something it will not do with a pure text OPML file), and only when I right-clicked and chose "view source" did I see some OPML mark-up. With inconsistencies like that, a developer has to crawl the entire OPML tree and visit every node in code in order to build a complete directory listing. I hope ipodder.org fixes this and maintains an OPML file on their site that contains all node information, not just most node information. In addition, the "New Podcast" node seems to be reproduced at least three times (as of Nov. 20, 2004), which really adds to the processing overhead when trying to crawl the outline.
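The crawl described above could be sketched in Python roughly as follows. This is not the code behind the alternate view. It is a minimal illustration that assumes included files are marked by a url attribute ending in ".opml" and that feeds are marked by xmlUrl. The fetch argument, the PAGES table and fake_fetch are hypothetical stand-ins for real HTTP requests.

```python
import xml.etree.ElementTree as ET


def crawl(url, fetch, seen=None):
    """Return the feed URLs reachable from the OPML file at `url`.

    `fetch` is a caller-supplied function that maps a URL to OPML text
    (hypothetical; in practice it might wrap urllib.request.urlopen).
    """
    seen = set() if seen is None else seen
    if url in seen:                 # node already visited: skip the duplicate
        return []
    seen.add(url)
    try:
        root = ET.fromstring(fetch(url))
    except (ET.ParseError, OSError):
        return []                   # unreachable or malformed node leaves a gap
    feeds = []
    for node in root.iter("outline"):
        feed = node.get("xmlUrl")
        link = node.get("url", "")
        if feed and feed not in seen:
            seen.add(feed)
            feeds.append(feed)
        elif link.lower().endswith(".opml"):
            feeds.extend(crawl(link, fetch, seen))   # follow the pointer
    return feeds


# Hypothetical in-memory "site" so the sketch runs without a network.
PAGES = {
    "root.opml": '<opml><body>'
                 '<outline text="Tech" type="link" url="tech.opml"/>'
                 '<outline text="New" type="link" url="new.opml"/>'
                 '<outline text="New again" type="link" url="new.opml"/>'
                 '<outline text="Broken" type="link" url="broken.opml"/>'
                 '</body></opml>',
    "tech.opml": '<opml><body>'
                 '<outline text="A" type="rss" xmlUrl="http://example.com/a.xml"/>'
                 '</body></opml>',
    "new.opml": '<opml><body>'
                '<outline text="B" type="rss" xmlUrl="http://example.com/b.xml"/>'
                '<outline text="A" type="rss" xmlUrl="http://example.com/a.xml"/>'
                '</body></opml>',
    "broken.opml": "<html><body>not OPML<br></body></html>",
}


def fake_fetch(url):
    if url not in PAGES:
        raise OSError(f"cannot fetch {url}")
    return PAGES[url]


if __name__ == "__main__":
    print(crawl("root.opml", fake_fetch))
```

Keeping a set of visited URLs means a node that is listed several times, like the repeated "New Podcast" node, is only fetched and processed once.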

So what, you say; the directory has already been rendered visually by ipodder.org. By clicking up and down the directory on the little file icons at the site, anybody can view any of the nodes they like. While that is true, the actual layout and execution of the rendered OPML mark-up is not very useful. For one thing, the listing does not behave like an outline. Clicking on a lower-level node opens a new page; it does not open a new level or layer on the same page. That makes for an incredibly inefficient way to find a podcast. It's not user friendly and it requires dedicated hunting and pecking with a mouse. The directory should be in one big file whose nodes can be opened and closed with simple mouse clicks, just like a real outliner. And when we get to a podcast listing we need to see more than just the name of the podcast -- some podcast names are so obscure they don't convey any real information about the podcast. Where and how do we get richer data for the listing? Is this a problem with OPML? Absolutely not.

Using the links in the OPML and an XSL transformation, one can build a file that can be rendered to HTML, allows outline-like behaviour and displays more podcast-related detail. The document() function of XSLT allows a podcast's RSS 2.0 feed to be opened and parsed. The <description> tag is a good place to get supplemental information about a podcast, and this can be rendered along with the podcast's feed link on a display page. I've adapted a transformation that does this. It takes about 10 minutes on a high-speed DSL line to visit all of the podcast sites, validate the feeds, pull the description tags and render an HTML page that has node expansion and compression features. Clearly that kind of processing should not be run in real time for individual users, but scanning the directory at a regular interval and updating the file will be fine. Click here to look at the results.
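The actual work here is done in XSLT, but the lookup it performs on each feed is easy to illustrate in Python. The sketch below is only an analogue of opening a feed with document() and reading the channel-level description. The sample feed text is invented, and channel_description is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed, invented for illustration.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Tech Show</title>
    <link>http://example.com/tech/</link>
    <description>Weekly talk about gadgets and code.</description>
    <item>
      <title>Episode 1</title>
      <description>Item-level text, not the one we want.</description>
    </item>
  </channel>
</rss>"""


def channel_description(rss_text):
    """Return the channel-level <description> of an RSS 2.0 feed, or None."""
    root = ET.fromstring(rss_text)
    # The path is relative to <rss>, so item descriptions are never matched.
    text = root.findtext("channel/description")
    return text.strip() if text and text.strip() else None


if __name__ == "__main__":
    print(channel_description(SAMPLE_RSS))
```

Reading only channel/description keeps the summary about the podcast as a whole rather than about any single episode.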

One problem with this approach is that XML and XSLT can be somewhat finicky, depending on the parser that is used. Many podcasters are not properly posting their feeds. Sometimes the character data is invalid, the tagging is not compliant, the feed is not there, or the link incorrectly points to an HTML file rather than an RSS 2.0 file. When a parser hits an invalid feed it will stop and throw an error, so every possible error must be handled in order for the crawl of the OPML file and all remaining feeds to run to completion. This requires a pretty flexible XSLT processor as well as some conditional coding in the XSL. Currently this transformation posts a note in place of a description if the <description> tag cannot be found in the feed or if the feed cannot be read for some reason.
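The same defensive pattern can be shown in Python, again as a sketch rather than the XSL used here: every failure mode listed above is turned into a placeholder note so that one bad feed never stops the crawl. The placeholder wording, the FEEDS table and the fake_fetch helper are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

NO_DESCRIPTION = "(no description available)"  # placeholder wording is an assumption


def safe_description(url, fetch):
    """Return the feed's channel description, or a note if anything goes wrong."""
    try:
        root = ET.fromstring(fetch(url))
    except (ET.ParseError, OSError, ValueError):
        return NO_DESCRIPTION      # missing feed, bad characters, broken tagging
    if root.tag != "rss":
        return NO_DESCRIPTION      # e.g. the link points at an HTML page
    text = (root.findtext("channel/description") or "").strip()
    return text or NO_DESCRIPTION


def describe_all(urls, fetch):
    """Visit every feed; a bad feed yields a note instead of stopping the crawl."""
    return {url: safe_description(url, fetch) for url in urls}


# Hypothetical feeds covering each failure mode, so the sketch runs offline.
FEEDS = {
    "good": '<rss version="2.0"><channel>'
            '<description> A show. </description></channel></rss>',
    "bad-xml": '<rss><channel><description>Tom & Jerry</description></channel></rss>',
    "html": '<html><head><title>My podcast</title></head><body/></html>',
    "empty": '<rss version="2.0"><channel><title>t</title></channel></rss>',
}


def fake_fetch(url):
    if url not in FEEDS:
        raise OSError(f"cannot fetch {url}")
    return FEEDS[url]


if __name__ == "__main__":
    for url, text in describe_all(list(FEEDS) + ["missing"], fake_fetch).items():
        print(url, "->", text)
```

Only the "good" feed produces a real description in this run; the unescaped ampersand, the HTML page, the feed without a description and the missing feed all fall back to the note.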

This is only a start. The JavaScript that opens the nodes is adapted from the Buzz outliner (the only cross-platform OPML outliner that I am aware of). The layout is still pretty ugly, but it is more readable to my eye than the ipodder.org rendering and, best of all, it is quicker and contains more information. At the top of the page are two links that run either a JavaScript "expand all" or "compress all" function. Click on "expand all" and run the mouse up and down the listing. Wherever the cursor changes to a diagonal double-headed arrow, there is a clickable node in the outline. Clicking the node toggles between expanding and compressing all sub-nodes below that level. There is weird behaviour from time to time that causes some nodes to lock up and become "un-clickable"; this can be remedied by using the "expand all" link -- this is version 1.0. I hope this layout makes it easier for subscribers to view and try more podcast feeds. If you have any ideas on how things can be improved, hack it up and let me know.
