Newsfeeds provide an invaluable service: direct access to content of specific interest. NetNewsWire has long been my preferred newsreader, and the recent addition of synchronized online access makes it, for me, the clear best-in-class news client. Yet although NetNewsWire does a fine job of aggregating newsfeeds, it’s a poor solution for long-term information management.

Enter DEVONthink Pro, a highly reviewed research tool/information manager.

The NetNewsWire > DEVONthink bridge is a logical one, so it’s no surprise that DEVONthink Pro comes with preloaded scripts to archive information directly from NetNewsWire. (Sorry, DEVONthink Personal doesn’t include scripting support.) But the feature I really needed – the ability to archive entire RSS feeds into DEVONthink – is not included in these scripts.

To fill this gap, I wrote a script to archive entire newsfeeds (or subsets thereof) to DEVONthink Pro.

This script saves items from your selected feed (or folder of feeds) to DEVONthink Pro as web archives, with the following options:

  • From the selected feed, folder, or smart folder, save:

    • All items

    • Flagged items

    • Unread items

    • Read items

    • Date range (all items dated within a range you specify)

  • Archive options:

    • “Feed” saves just the content of the news item (typically the best option if the site provides full-text feeds).

    • “Site” saves the page the item links to (this should be the whole article, which is particularly useful if the newsfeed is truncated).

    • “Site/Print” goes to the page the item links to and looks for a link that includes the string “Print”. If one exists, it saves the print-ready page to DEVONthink. If none exists, it falls back to “Site” behavior.

  • Perform post-archive actions: mark read, unflag, or do nothing. The action runs only after DEVONthink has archived the article, so you can watch the archive progress and avoid archiving the same article twice.

  • Optional: introduce a random time delay between archiving articles if using “Site” or “Site/Print”. (Useful if the site you’re reading from has countermeasures to prevent content scraping. Read: if they’re trying to keep you from saving their content for later use.)
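The “Site/Print” fallback and the random delay are easier to follow in code. What follows is a minimal Python sketch of the idea, not the script from the download. The names `choose_archive_url` and `random_delay` are hypothetical, and matching “print” case-insensitively against either the link's URL or its text is my assumption about what “includes the string” means.

```python
import random
from html.parser import HTMLParser
from urllib.parse import urljoin


class _LinkCollector(HTMLParser):
    """Collect (href, link text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only gather text while inside a link that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text)))
            self._href = None


def choose_archive_url(page_url, html):
    """Return the first print-style link found in html, else page_url.

    Assumption: a link counts if "print" appears in its href or its text.
    """
    parser = _LinkCollector()
    parser.feed(html)
    for href, text in parser.links:
        if "print" in href.lower() or "print" in text.lower():
            return urljoin(page_url, href)  # resolve relative links
    return page_url  # no print link: fall back to "Site" behavior


def random_delay(max_seconds, rng=random):
    """Pick a pause length, in seconds, between 0 and max_seconds."""
    return rng.uniform(0, max_seconds)
```

In a loop over articles, you would call `time.sleep(random_delay(10))` between fetches. A plain substring match is crude (it would also match a link containing “sprint”), so treat the matching rule as something to tune per site.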

My personal workflows for this script:

  • Flag items of interest from all newsfeeds. Then run this script periodically to import all flagged items into DEVONthink. Archive target: Flagged items (Feed); Post-archive action: Unflag.

  • Save full-text content of a magazine I subscribe to but don’t want to keep hard copies of. Run this script every week, when the magazine’s RSS feed is updated. Archive target: Unread items (Site/Print); Post-archive action: Mark as read.
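Both workflows follow the same pattern: pick a subset of items, archive each one, then apply the post-archive action. Here is a rough Python sketch of that pattern, assuming a simplified `Item` record; `Item`, `select_items`, and `archive_items` are hypothetical names for illustration and do not come from the script or from either application.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Item:
    """Simplified stand-in for a news item (hypothetical fields)."""
    title: str
    dated: date
    read: bool = False
    flagged: bool = False


def select_items(items, target, start=None, end=None):
    """Return the items matching an archive target.

    start and end are only consulted for the "date range" target.
    """
    tests = {
        "all": lambda i: True,
        "flagged": lambda i: i.flagged,
        "unread": lambda i: not i.read,
        "read": lambda i: i.read,
        "date range": lambda i: start <= i.dated <= end,
    }
    keep = tests[target]
    return [i for i in items if keep(i)]


def archive_items(items, archive, post_action="nothing"):
    """Archive each item, then apply the post-archive action to it."""
    for item in items:
        archive(item)  # the post-action runs only after this call returns
        if post_action == "mark read":
            item.read = True
        elif post_action == "unflag":
            item.flagged = False
```

The first workflow then corresponds to `archive_items(select_items(items, "flagged"), save, "unflag")`, where `save` is whatever function does the archiving. Because each item is unflagged only after it has been saved, running the same call again picks up nothing that was already archived.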

Get the script here.

Update 2008-08-12: Added a Growl-free version of the script to the download for users who don’t have Growl installed. The only difference is that the lines referencing GrowlHelperApp are commented out. The download link is unchanged.

Update 2010-05-26: Fixed a problem encountered when news items don’t have a description. Added an option to archive items from a specific date range. The download link has been updated.