An older post from Seb Paquet, "The algebra of feeds, or the amateurization of RSS bricolage", is very pertinent to the subject of Pipelines for XML.

Recent talk about RSS feed splicing and the ineluctable need for filtering open feeds got me thinking about the variety of operations one might want to perform on feeds. Taking a cue from the operations of set theory we could for instance define the following:

- Splicing (union): I want feed C to be the result of merging feeds A and B.
- Intersecting: Given primary feeds A and B, I want feed C to consist of all items that appear in both primary feeds.
- Subtracting (difference): I want to remove from feed A all of the items that also appear in feed B. Put the result in feed C.
- Splitting (subset selection): I want to split feed D into feeds D1 and D2, according to some binary selection criterion on items.
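
As a rough illustration (not part of Seb's post), the four operations can be sketched in a few lines of Python. Everything here is an assumption made for the example: the `Item` type, the function names, and above all the idea that an item's `link` is a stable, unique identifier that can be used to decide whether two feeds contain "the same" item. Real feeds are messier than that.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Item:
    """Hypothetical, already-parsed feed item; `link` is assumed unique."""
    link: str
    title: str
    author: str = ""


def splice(a: Iterable[Item], b: Iterable[Item]) -> List[Item]:
    """Union: all items of A, then the items of B not already present."""
    seen = set()
    merged = []
    for item in list(a) + list(b):
        if item.link not in seen:
            seen.add(item.link)
            merged.append(item)
    return merged


def intersect(a: Iterable[Item], b: Iterable[Item]) -> List[Item]:
    """Intersection: items of A whose link also appears in B."""
    links_in_b = {item.link for item in b}
    return [item for item in a if item.link in links_in_b]


def subtract(a: Iterable[Item], b: Iterable[Item]) -> List[Item]:
    """Difference: items of A whose link does not appear in B."""
    links_in_b = {item.link for item in b}
    return [item for item in a if item.link not in links_in_b]


def split(d: Iterable[Item],
          criterion: Callable[[Item], bool]) -> Tuple[List[Item], List[Item]]:
    """Subset selection: (items matching the criterion, all the others)."""
    items = list(d)
    d1 = [item for item in items if criterion(item)]
    d2 = [item for item in items if not criterion(item)]
    return d1, d2
```

Because each function takes lists of items and returns a list of items, the operations compose: `subtract(splice(a, b), c)` is itself a feed that can be fed into another operation, which is what makes it an algebra.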
The ultimate RSS bricolage tool would give users an interface to derive feeds from other feeds using the above operations, and spit out a working URL for the resulting feed. I'm not sure how all of it would work, or even if all of it can work in practice. I'm completely abstracting out technical considerations here.

While I'm not sure how large the space of useful applications of this could be, here are a couple example uses:

- Splicing: All of the posts on the Many-to-many blog have to do with social software, so it would make sense to send its posts over to the social software channel. Now, since the blogging tool we use for that blog doesn't support TrackBack, it can't automatically ping the Topic Exchange. A workaround would be to merge both channels into a new one. In general, this would enable any combination of category feeds from various sources to be constructed very simply. A feed splicer can also serve as a poor man's aggregator.
- Intersecting: Say I want to subscribe to all of Mark's posts that make the Blogdex Top 40; I'd just have to intersect the feeds. Or I could filter a Waypath keyword search feed in the same manner.
- Subtracting: I'm interested in some topic that has an open channel, but find the items by one particular author uninteresting. (This is equivalent to the killfile idea from good ol' USENET.) Subtraction could also be used if you don't want to see your own contributions to a feed.
- Splitting: One might want to manually split a feed into "good" and "bad" subfeeds according to a subjective assessment of quality or relevance, or automatically split according to language, author, etc. Note that this one doesn't qualify as an example of pure feed algebra, as it involves inputs beyond feeds.
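
The killfile example is the easiest one to try against actual XML. Below is a minimal sketch, assuming a plain RSS 2.0 document in which every item carries an `<author>` element; many real feeds use `dc:creator` or omit the author entirely, so this would need adjusting. The sample feed, the addresses in it, and the function name `drop_authors` are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 sample; real feeds may use dc:creator instead of <author>.
SAMPLE_RSS = """<rss version="2.0"><channel>
<title>Open channel</title>
<item><title>Post A</title><link>http://example.org/a</link><author>alice@example.org</author></item>
<item><title>Post B</title><link>http://example.org/b</link><author>bob@example.org</author></item>
</channel></rss>"""


def drop_authors(rss_text, killfile):
    """Return the feed as a string, minus items whose author is in `killfile`."""
    root = ET.fromstring(rss_text)
    channel = root.find("channel")
    # findall() returns a list, so removing items while looping over it is safe.
    for item in channel.findall("item"):
        if item.findtext("author", default="") in killfile:
            channel.remove(item)
    return ET.tostring(root, encoding="unicode")


filtered = drop_authors(SAMPLE_RSS, {"bob@example.org"})
```

The result is still an RSS document, so it could be served at a new URL or passed on to another operation.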
Yes, cool, I'd certainly like to do all of those things. I'd like to take newsfeeds or similar data and combine, filter, or sort them, and I'd also like to be able to do it based on metadata from different sources. The trouble is that those feeds come in a number of different formats. Even if some of the metadata, such as what appears in some top-40 list or other ways of rating or categorizing posts, is available in some XML format, it is not likely to be consistent in any instantly useful way. In a few hours I'd be able to do something with it, but I'd like to be able to do it in a few minutes or a few seconds. Hm, this is certainly worth playing with a bit.

[ Programming | 2004-01-07 17:21 ]