Archive newsfeeds in DEVONthink Pro via NetNewsWire

Newsfeeds provide an invaluable service: direct access to content of specific interest. NetNewsWire has long been my preferred newsreader, and the recent addition of synchronized online access makes it, for me, the clear best-in-class news client. Yet although NetNewsWire does a fine job aggregating newsfeeds, it’s a poor long-term information management solution.

Enter DEVONthink Pro, a highly reviewed research tool/information manager.

The NetNewsWire > DEVONthink bridge is a logical one, so it’s no surprise that DEVONthink Pro comes with preloaded scripts to archive information directly from NetNewsWire. (Sorry, DEVONthink Personal doesn’t include scripting support.) But the feature I really needed – the ability to archive entire RSS feeds into DEVONthink – is not included in these scripts.

To fill this gap, I wrote a script to archive entire newsfeeds (or subsets thereof) to DEVONthink Pro.

This script saves items from your selected feed (or folder of feeds) to DEVONthink Pro as web archives, with the following options:

  • From the selected feed, folder, or smart folder, save:

    • All items

    • Flagged items

    • Unread items

    • Read items

    • Date range (archive all items within a specified date range)

  • Archive options:

    • Feed saves just the content of the news item (typically the best option if the site provides full-text feeds)

    • Site saves the content of the item’s target website (this should be the whole article – particularly useful if the newsfeed is truncated)

    • Site/Print goes to the item’s target website and looks for a link that includes the string “Print”. If one exists, it saves the print-ready page to DEVONthink. If none exists, it reverts to “Site” behavior.

  • Perform post-archive actions: mark read, unflag, or do nothing. This action occurs after DEVONthink has archived the article, so you can see archive progress in action and ensure you don’t double-archive an article.

  • Optional: introduce a random time delay between archiving articles if using “Site” or “Site/Print”. (Useful if the site you’re reading from has countermeasures to prevent content scraping. Read: if they’re trying to keep you from saving their content for later use.)
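
For the curious, here is a rough Python sketch of the “Site/Print” fallback and the random delay described above. It is not the script itself, only an illustration of the idea. The names `choose_archive_url` and `random_delay_seconds` are made up for this example, and I am assuming the match is case-insensitive and checks both the link text and the link address; the delay bounds are arbitrary defaults.

```python
import random
from html.parser import HTMLParser
from urllib.parse import urljoin


class _PrintLinkFinder(HTMLParser):
    """Remembers the href of the first <a> whose href or text contains "print"."""

    def __init__(self):
        super().__init__()
        self.print_href = None      # first matching link, once found
        self._current_href = None   # href of the <a> currently being read
        self._current_text = []     # text collected inside that <a>

    def handle_starttag(self, tag, attrs):
        # Start tracking a link only while no match has been found yet.
        if tag == "a" and self.print_href is None:
            self._current_href = dict(attrs).get("href")
            self._current_text = []

    def handle_data(self, data):
        if self._current_href is not None:
            self._current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            text = "".join(self._current_text)
            # Assumption: case-insensitive match on link address or link text.
            if "print" in (self._current_href + " " + text).lower():
                self.print_href = self._current_href
            self._current_href = None


def choose_archive_url(page_url, page_html):
    """Return the print-ready URL if the page links to one, else the page URL."""
    finder = _PrintLinkFinder()
    finder.feed(page_html)
    if finder.print_href:
        # Resolve relative links against the article's own address.
        return urljoin(page_url, finder.print_href)
    return page_url  # no print link found: fall back to "Site" behavior


def random_delay_seconds(min_s=2.0, max_s=10.0):
    """Pick a random pause (in seconds) to wait between archiving two articles."""
    return random.uniform(min_s, max_s)
```

Calling `choose_archive_url` with an article’s address and its HTML gives the address to archive; pausing for `random_delay_seconds()` between articles spaces out the requests.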

My personal workflows for this script:

  • Flag items of interest from all newsfeeds. Then run this script periodically to import all flagged items into DEVONthink. Archive target: Flagged items (feed); Post-archive action: Unflag.

  • Save full-text content of a magazine I subscribe to but don’t want to keep hard copies of. Run this script every week, when the magazine’s RSS feed is updated. Archive target: unread items (Site/Print); Post-archive action: Mark as read.

Get the script here.

Update 2008-08-12: Added an additional Growl-free script to the download for users who don’t have Growl installed. The only difference is that the lines referencing GrowlHelperApp are commented out. Same download link applies.

Update 2010-05-26: Fixed a problem encountered when news items don’t have a description. Added an option to archive items from a specific date range. The download link has been updated.

Fidget your way to productivity

Leave it to the OmniGroup to help you stay productive — even when you can’t choose what to tackle next.

The OmniFocus Dashboard widget (“OmniFidget”), released today, does just that. Tell it what contexts you’re in:

OmniFidget 1

And it tells you what to do:

OmniFidget 2

Clicking the task title takes you to the task’s project in OmniFocus. Clicking “No” skips to the next task (but who really wants to disappoint the OmniFidget)?

Get yours here.

Is Google Making Us Stupid? (Think, and think again)

In this month’s The Atlantic, Nicholas Carr explores human cognition in the light of the Internet. Rhetorical extremes notwithstanding(1), it is a worthwhile read. His premise: the way we use the Internet is fundamentally altering the way we think – and not for the better.

I read Carr’s article at a coffee shop. To get through it without distractions, though, I had to shut my laptop and read off the printed page. It was, I must say, a fitting way to experience the article. In fact, try this: Pause for a minute. Click this hyperlink. Read the original article. Go ahead; I’ll wait…

By habitually drinking from the firehose of the Internet, Carr argues, we’re losing our capacity to engage in protracted, concentrated reading or thought. Reading is an imprinted – not instinctive – feature of our brains, so consuming content in dramatically new ways may actually alter the neurological networks in our brains. In short, we are learning to think in “the way the Net distributes [information]: in a swiftly moving stream of particles”. (Are we hard-wiring ourselves to be hyperactive information whores?)

To lose the capacity for “deep thought” would be truly lamentable, and it’s a worthy concern. This is one of the reasons I enjoy the periodic discipline of powering off the computer, phone, and iPod in order to engage in concentrated reading.

Nevertheless, although we may engage in long-form reading less frequently, as long as we retain the capacity for such pursuits, this is no tragedy. What Carr decries as the breakdown in critical thought may be equally heralded as the next great evolution in thought culture.

In short, we’re gaining the newfound ability to navigate torrents of information in a controlled and intentional fashion. This is not a tragedy. This is a breakthrough.

What this trend really demonstrates is a priority shift in the light of our new attention economy. With a glut of information and a scarcity of time, one of the most critical “new” skills is the ability to effectively manage the information we encounter. Our tendency to surf from site to site simply reflects our perception of the decreasing marginal utility of spending time on a given page. The fact that we can do this faster than before means we’re growing more adept at processing information quickly, and – again, assuming we haven’t lost the ability to parse large systems of data – this is a good thing.

Carr’s second concern addresses the commoditization of information and, by extension, knowledge itself. Google’s widely stated purpose (and, arguably, the purpose of the Internet) is to capture and organize the world’s information. Carr compares this process to assembly line efficiency research performed in the early 20th century, and implies that knowledge workers will soon be the next moral casualty in the workforce: just as standardized workflow procedures replaced experience and intuition on the factory floor, so Google will replace the independent thought of today’s knowledge workers.

He’s right, on one level: knowledge acquisition in the traditional sense may become irrelevant as information becomes more quickly and accurately accessible.

But his commentary assumes that knowledge qua knowledge is the apex of our information-based society, and that offloading our responsibility for knowledge (to Google et al) is tantamount to abdicating our intelligence. Quite the contrary: democratizing knowledge will clear the way for true intelligence to stand out. Interestingly, Carr notes that research which once took him days to perform now takes hours. Why does he not recognize that this will enable the brightest minds to vastly increase their intellectual output? Old-world research is out; intelligent synthesis by capable minds is in.

To conclude, while Carr does not (to this reader) adequately establish that the Internet is stunting our intelligence, he is correct in highlighting some of its pitfalls. In the frenzy of “hyperlinks, blinking ads, and other digital gewgaws”, we can become distracted and lose our critical edge. (By the way, did you read his full article? If you couldn’t dream of reading something that long – or tried but were too distracted to get through it – you may have a real case of Internet-onset ADHD. In that case, heed Carr’s warning, set your autoresponders, retire to a cabin with some books, and read for your life!)

The key, as with all areas of knowledge, is to stay alert – avoid intellectual laziness – and engage with the information we encounter. The Internet is merely a tool. Use it as such, and it will never replace your mind.

(1) Carr never poses the question “Is Google making us stupid?”, though it makes an eye-catching title and probably sold more than a few copies of the magazine.

Import RSS feeds into Facebook without relinquishing content control

Facebook has added a feature to import blog posts as Facebook notes. On the face of it, this is a great thing: it provides visibility to people who are unlikely to subscribe to your blog in a newsreader.

It’s Facebook’s Terms of Use that concern me. Although you theoretically retain copyright of your content in some vague perfunctory sense, Facebook can and will use your content (photographs, notes, wall posts, etc — even your privacy-restricted content) for anything they please, thankyouverymuch.

Don’t believe me? Read the official terms page:

By posting User Content to any part of the Site, you automatically grant… to the Company an irrevocable, perpetual, non-exclusive, transferable, fully paid, worldwide license (with the right to sublicense) to use, copy, publicly perform, publicly display, reformat, translate, excerpt (in whole or in part) and distribute such User Content for any purpose, commercial, advertising, or otherwise, on or in connection with the Site or the promotion thereof, to prepare derivative works of, or incorporate into other works, such User Content, and to grant and authorize sublicenses of the foregoing.

If this makes you uncomfortable, the solution is simple: create your content elsewhere. (Posterous and Tumblr are great places to start.)

To get your content back into Facebook, simply create a custom, truncated newsfeed in Feedburner and add it to your Facebook notes.

Procedure using Feedburner:

1) In Feedburner, create a new feed for your site. (Add a descriptive name – e.g., “yoursite – facebook” – to differentiate from your main feed, if applicable.)

2) Under the Optimize tab, go to Summary Burner

3) Add a descriptive footer and choose Save. Mine says:

[Truncated due to Facebook’s acquisitive Terms of Use. Please click “View original post” below for the rest.]

4) Now, import the feed into Facebook here

That’s it. Your posts will now be imported to your Facebook mini-feed, but Facebook doesn’t get its hands on your content.
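
If you would rather see the idea in code, the truncation step amounts to something like the Python sketch below. This is only my illustration of the concept, not how Feedburner actually implements Summary Burner; the function name `summarize`, the 200-character default, and the choice to leave short posts untouched are all assumptions.

```python
FOOTER = ("[Truncated due to Facebook’s acquisitive Terms of Use. "
          "Please click “View original post” below for the rest.]")


def summarize(body, max_chars=200, footer=FOOTER):
    """Return a shortened version of a post with a footer pointing to the original."""
    # Collapse runs of whitespace so the character count is predictable.
    text = " ".join(body.split())
    if len(text) <= max_chars:
        # Assumption: posts that already fit are passed through unchanged.
        return text
    # Back up to the last space inside the limit so no word is cut in half.
    cut = text[:max_chars].rsplit(" ", 1)[0]
    return cut + "... " + footer
```

Only the shortened text and the footer ever reach Facebook; the full post stays on your own site.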

2/16/09 Update: Facebook now claims your data forever.

Managing acquired information in an information age

Success in the information age hinges on managing the explosion of available information in meaningful ways. To even approach this goal requires a successful information management strategy, which revolves around the questions

  • “How do I find relevant information?”

and its corollary:

  • “How do I manage the information I’ve found?”

On a personal note, these are two of the questions that drive my own technological explorations. Brainstorming and note-taking methods and tools provide another side to the issue. This post is intended to provide some background and framework for said exploration.

How do I find relevant information?

Online information is typically located through complementary methods of search and discovery.

Traditional search technologies will long remain the first resort for information-seekers. Desktop search clients are also available for advanced data mining and research. Yet the rising semantic web is the true future of the Internet, and will enable users to interact with information in more meaningful and relevant ways.

Relationship-based information discovery is rapidly adding an important layer over traditional search tools. Social microsharing platforms (e.g., Twitter) and more robust social platforms (e.g., Twine, in private beta) allow individuals to build a liminal space of like-minded individuals with similar interests.

Two points are worth noting here:

  1. Social networks are becoming a search sphere in their own right. For me, the Twitter ecosystem has become my trusted first source of user opinions; for many types of information, I search on Twitter before going to Google or DEVONagent.

  2. More and more information is shared and recommended through these relationship-based services. In other words, social networking platforms allow information to be discovered rather than explicitly sought.

Search once, not twice

The key to a useful information management strategy is this: You should only have to find a piece of information once.

Search tools should not be relied upon to find specific pieces of previously located information. If it takes more than fifteen seconds to locate online, it should be in your personal information system, not left to The Google.

If you spend a lot of time looking for information you’ve already encountered, your system is broken and you’re wasting your time. Or your employer’s time. Either way, that time should be spent turning information into knowledge, or putting it to use.

So: what to do with all this acquired information?

Tools of the trade

To be effective, an electronic document management system (EDM) should be:

  • Accessible — it’s available when and where you need it (for both archive and retrieval)

  • Flexible — able to accept input from any variety of sources

  • Scalable — can accept many thousands of documents without becoming unwieldy

  • Searchable — the system is worthless if you can’t find what you’re looking for

  • Extensible — it can be extended through scripting or other means

  • Open — It doesn’t hold your information hostage when you need to change systems

The most rudimentary means of storing information – file systems – fail where it matters most. Because file systems are not designed for this type of data management, they are not truly accessible (saving an excerpt from a website, for instance, is a many-step operation), or quickly searchable (your data are hidden amongst tens of thousands of irrelevant system and program files). In addition, file systems don’t provide end-to-end data functions, so viewing the contents of most file types requires launching another application. Add-on tools like Google Desktop mitigate some of these issues, but they’re no match for a real EDM system.

True EDMs are specifically designed for the task of archiving and retrieving information. They can store images, text clippings, and documents of all types; add content indexing to the mix (allowing users to search by any word contained in their files); and are streamlined to allow quick archiving of information. EDMs can be implemented as software-based solutions (see Yojimbo, EagleFiler, and the like), as well as online (see Google Notebook, for instance).

Second-generation information managers like DEVONthink and Twine take content management a step further, adding semantic intelligence and useful content analysis to the user’s database. DEVONthink, a tool that I’ve used for years, analyzes the contents of its articles to identify non-obvious semantic relationships and assist with automatic filing. Twine offers similar functionality in the context of a social network, in theory promising to integrate the most relevant search, discovery, and EDM tools.

Live in the cloud…

As computer usage becomes increasingly network-centric and social, individuals are becoming more and more willing to trade privacy for the convenience and utility of web-based services.

Put another way, we are becoming more willing to keep our information in “the cloud”. (I like the cloud metaphor because, for me, it conjures images of Benjamin Franklin flying his kite in the electric storm. There is energy and power and excitement in the cloud. There’s also risk.)

This trend will spell dramatic shifts in EDM solutions to come. Soon all our data will be accessible from any web-enabled smartphone or computer, anywhere in the world. (And with customs agents able to search the contents of any electronic device with impunity, business travelers may soon be required to keep sensitive data online, not on their machines.)

But online services are not a silver bullet—yet. As a general rule, the current generation of Web 2.0 apps:

  • Make it difficult to work offline (technologies like Google Gears may soon obviate this concern)

  • Don’t take full advantage of OS-level services, keyboard shortcuts, etc

  • Are not easily automated or scriptable

  • Make it difficult to back up files (FUSE applications may change this in the near future)

  • Put users at the mercy of others for data integrity (Granted, it’s vastly more likely that you’ll lose data from your own hard drive crashing – rather than Google’s servers going kaput – but either scenario is a possibility. Pick your poison)

…with your feet on the ground

Until these concerns can be fully mitigated, the most promising path forward lies in hybrid desktop/web platforms that allow users to maintain local and online control of information.

These may be end-to-end solutions (for example, the NewsGator family of products includes web- and software-based newsreaders that are fully synchronized) or more specific sync services (Plaxo, for instance, synchronizes desktop calendar and address book clients with online equivalents). When implemented correctly, these tools can be phenomenally useful.

I’ve been waiting for this same innovation to make its way to the world of EDM apps, and there are some promising options emerging. A limited example is DEVONthink Pro Office, which has a built-in web server that provides remote access to your database. (First impression: it’s slick, but you’re out of luck if you’re stuck behind a firewall or the database isn’t running.) Evernote is a new EDM tool with full desktop-to-web synchronization, as well as limited online editing.

The beginning

Ultimately, any EDM solution is only a tool — but it may be the most important tool in the arsenal of knowledge workers. It is therefore of critical importance that we take our EDM strategies seriously.

You may not yet have an EDM strategy. But creating one may be the most important step you can take in your development as a knowledge worker.

Take a moment to think about how you manage what you know. Start exploring technologies, asking how they can improve your knowledge set.

It may take months to work out a reasonable system of your own… but it’s a beginning, and one well worth making.

OmniFocus has a Mini-Me

The long-awaited OmniFocus iPhone app is officially announced. Features to include:

  • Location-aware (knows when you’re near the hardware store to pick up that drill bit)

  • Live automatic sync over the network (EDGE or WiFi via .Mac or WebDAV, according to the site, though they likely mean EDGE, 3G, or WiFi, via MobileMe or WebDAV)

OmniFocus for iPhone

Product page here.