Archives for posts with tag: information

WMATA, the D.C.-area public transit authority, has so far declined to provide schedule information for use by third-party providers, including Google Transit. For now, this means WMATA’s own website is the only online source of schedule information.

This is troubling on two counts:

a) WMATA is largely government-funded, so route information should be treated as an open, public good; and

b) It appears WMATA refuses to open the data for fear of lost advertising revenues from their website, not out of any alleged benefit to riders. (source)

From their FAQ:

We believe that if we are to partner with an outside entity that we should look at what the cost-benefit is to that third party…

During the past year Metro has invested significant money to upgrade its Web site… The site includes Google maps in the neighborhoods in which our rail stations are located.

Interpretation: WMATA finds it acceptable to freely use the high-quality, extensible maps provided by Google but is unwilling to reciprocate by sharing data. Why? Because they’re worried Google will make money from WMATA data and they won’t get a piece of it. (Rumor has it that WMATA is holding out for a revenue-sharing agreement.)

The critical fallacy here is treating information as a scarce good in a zero-sum economy, which harms both the content creator (WMATA) and the target audience (Metro riders).

On the contrary, when public data are opened for actual public use, everyone wins. Riders win by gaining easier access to route information. Google wins by gaining yet another data source. WMATA wins by increasing ridership.

I sent an email to WMATA’s chief administrative officer, Emeka Moneme, to tease out this last point. Excerpt:

If a traveler is planning to walk or drive to her destination, she will never even think to visit the WMATA website. However, there is a very strong chance she will use Google Maps. I assure you, there is no easier way to learn of a mass transit option than to pull up a route on maps.google.com and see “Also available: Public Transit”. Try it yourself on Google maps here: http://bit.ly/5khV

Each and every individual who discovers a WMATA route through such means is a potential new rider. He or she may use that Metro route for years to come. The value of a single new rider vastly outweighs that of a few pageviews on wmata.com.

Surely we can agree that there’s no zero-sum game in that. With open access to information, everyone wins.


(p.s. From the post title: I am not conflating Google with “the people”. Taxpayer-funded data should be open to any entity. In this case, however, providing data to Google is clearly in the people’s interest.)

Let’s talk about the elephant in the room.

I am, of course, referring to Evernote, a tool that’s designed to remember everything you throw at it – then provide access to your information from virtually anywhere.

If you’re unfamiliar with the tool, this video will give you a quick picture of what it does:

Despite the obvious awesomeness of its sync/access capabilities, earlier iterations of Evernote failed my acquired-information management criteria on numerous counts:

  • It couldn’t accept basic documents like PDFs (flexibility fail)
  • It wasn’t scriptable (extensibility fail)
  • You couldn’t import/export data en masse (openness fail)

But the Evernote team have been hard at work, and with the recent addition of scriptability and an API, it’s worth a serious second look.

Accessibility: 9/10

Accessibility is undoubtedly where Evernote shines: you can access your data on the web client, your Mac or PC, or your iPhone or Windows Mobile device – and it all stays synchronized. Ergo, you still have your data when the network goes down. (The mobile Evernote clients act more as search/input portals into your Evernote data, though the latest iPhone version now stores your “favorites” locally so you can access critical notes offline. Also, if you prefer to keep some data private, you can selectively opt out of synchronization.)

Besides being Evernote’s killer feature, accessibility is also the cornerstone of the product’s business model: Evernote itself is free, but if you need more than 40MB/month you’ll need to upgrade to the premium version ($5/month or $45/year). The most exciting Evernote use cases will probably be mashups that use its API, which means increased bandwidth needs – and, therefore, subscriptions. (N.B.: Evernote’s API documentation describes the bandwidth elements as the “accounting structure”.)

Flexibility: 4/10

Evernote accepts text/RTF files, images, web clippings, PDFs, and audio notes. To get files in, you can drag and drop files onto the desktop client, email them to a special Evernote email address, or use the bookmarklet to clip content directly from any web browser. Handily, if you have part of a web page selected, the bookmarklet just saves the selection. Read: Evernote is most in its element when used for web clippings.

In addition, images can be snapped from your webcam or iPhone camera. More on this later.

Unfortunately, Evernote can’t handle many common document types, including Word documents (though RTF documents work passably). Most other filetypes (mind maps, outliner documents, etc) are out of the picture as well.

In the Mac client, the built-in content editor is little more than a glorified text editor. Text formatting is limited to font/size, bold/italic/underline, and alignment, though you can also attach images. (Not to mention the insulting font selection, which includes Arial but not Helvetica – shame!) The Windows client also provides some drawing tools (useful with a tablet PC), a broader font selection, and outlining functions. Update 11/10/08: Version 1.1.6 for Mac introduces ordered/unordered lists and tables.

Scalability: 6/10 (est.)

I haven’t thrown a tremendous amount of data at Evernote yet, so it’s unclear how performance is affected by a large data set. (I was hoping to put it to the test via the Delicious bookmark import, but Evernote just imported the bookmarks as links, rather than scraping the bookmarks’ targets.)

From a user interface perspective, scalability may be a problem. Items are accessed by browsing (by tags and metadata) as well as search. Aside from the ability to use multiple “notebooks”, there is no standard hierarchical organization. Users with a large number of tags or documents might become frustrated by this.

Searchability: 8/10

Basic search functions are solid, though generally unremarkable (no regular expressions, no advanced operators). You can combine search terms with tag filters.

In addition, Evernote has a couple of search tricks up its sleeve:

  • Images pass through Evernote’s OCR engine when synchronized, turning image text into searchable data. This even works, to a large extent, on handwriting – slick!

  • Evernote metadata includes standard text tags as well as optional location data, so you should be able to search by location as well as content

Extensibility: 7/10

The Mac client now includes a basic AppleScript dictionary, which allows for integration with other apps. No content-level scripting, but the most important feature – note creation – is available.

One example of extensibility in action: Justin adapted my NetNewsWire » DEVONthink batch archiving script to work with Evernote, including tagging and notebook selection. Check it out here.

In addition, the Evernote API provides full access to Evernote data. I’m not aware of any Evernote-based applications yet, though Pelotonics is planning some level of integration. If useful integration emerges, this will be a big win for users.

Openness: 4/10

Disappointingly, despite claiming export options as a feature, Evernote maintains a tight grip on your data. Need to send a file to your colleague? Forget drag-and-drop: you’ll need to go through Evernote’s export or email functions.

On the Mac side, the only export option produces a proprietary Evernote-formatted XML file with document contents embedded. The Windows client can also export in HTML, web archive, and text formats.

When emailing files, Evernote wraps most notes in an ad-encrusted PDF document before sending. (Yes, it’s as bad as it sounds. Look for the “Plain text note” here. That PDF is what you get when you try to email a text file.) Mercifully, you can email PDFs in their original form – though whether this is by design or oversight is unclear. What is clear is that Evernote doesn’t make it easy for you to use your data as you wish.*

A third option is to share your documents in a public notebook (like this one). This is of course not the same as export, but it does provide a refreshing level of social openness uncommon in tools of this nature.
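
For a rough overall picture, the six ratings in this review can be tallied in a few lines of Python. This is only a sketch: the review itself does not compute an overall score, and the equal weighting, the `overall` helper, and its `weights` parameter are assumptions introduced here for illustration.

```python
# Ratings as given in this review (out of 10); scalability is an estimate.
scores = {
    "accessibility": 9,
    "flexibility": 4,
    "scalability": 6,
    "searchability": 8,
    "extensibility": 7,
    "openness": 4,
}


def overall(scores, weights=None):
    """Weighted mean of the ratings; equal weights unless others are supplied."""
    if weights is None:
        weights = {name: 1 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# The two lowest-rated criteria (ties keep the order listed above).
weakest = sorted(scores, key=scores.get)[:2]

print(round(overall(scores), 1))
print(weakest)
```

Passing a different `weights` dictionary (say, doubling openness if data portability matters most to you) changes the result, which is the point: the number depends on which criteria you care about.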

Final thoughts

Evernote deftly handles web clippings and snapshots, and ubiquitous access makes it a viable tool for web research and data management.

Perhaps paradoxically, Evernote’s impressive accessibility also limits how I use it. Synchronizing data through Evernote’s server means I won’t use it for sensitive data. So although it’s hard to imagine a situation in which I won’t have access to my iPhone, Mac, or a web browser, it’s also hard to imagine a situation in which said access is truly critical. So I typically use Evernote for less important (but nonetheless useful) tasks:

  • Clipping captioned images and business cards for OCR
  • Jotting beer-tasting notes
  • Snapping photos of the same
  • Misc. data capture when I’m away from my computer

Bottom line: despite its limitations, Evernote is a great tool. At the free price point, you’re unlikely to find a more robust tool… so give the elephant a whirl.


*Comparisons could be drawn to DRM-laden music purchased from iTunes: it’s quite likely that you’ll never want to use it outside the iTunes/iPod ecosystem. But if (or when) that day comes, you won’t want to deal with their restrictions on your data. Same principle.

In this month’s The Atlantic, Nicholas Carr explores human cognition in the light of the Internet. Rhetorical extremes notwithstanding(1), it is a worthwhile read. His premise: the way we use the Internet is fundamentally altering the way we think – and not for the better.


I read Carr’s article at a coffee shop. To get through it without distractions, though, I had to shut my laptop and read off the printed page. It was, I must say, a fitting way to experience the article. In fact, try this: Pause for a minute. Click this hyperlink. Read the original article. Go ahead; I’ll wait…


By habitually drinking from the firehose of the Internet, Carr argues, we’re losing our capacity to engage in protracted, concentrated reading or thought. Reading is an imprinted – not instinctive – feature of our brains, so consuming content in dramatically new ways may actually alter the neurological networks in our brains. In short, we are learning to think in “the way the Net distributes [information]: in a swiftly moving stream of particles”. (Are we hard-wiring ourselves to be hyperactive information whores?)

To lose the capacity for “deep thought” would be truly lamentable, and it’s a worthy concern. This is one of the reasons I enjoy the periodic discipline of powering off the computer, phone, and iPod in order to engage in concentrated reading.

Nevertheless, although we may engage in long-form reading less frequently, as long as we retain the capacity for such pursuits, this is no tragedy. What Carr decries as the breakdown in critical thought may be equally heralded as the next great evolution in thought culture.

In short, we’re gaining the newfound ability to navigate torrents of information in a controlled and intentional fashion. This is not a tragedy. This is a breakthrough.

What this trend really demonstrates is a priority shift in the light of our new attention economy. With a glut of information and a scarcity of time, one of the most critical “new” skills is the ability to effectively manage the information we encounter. Our tendency to surf from site to site simply reflects our perception of the decreasing marginal utility of spending time on a given page. The fact that we can do this faster than before means we’re growing more adept at processing information quickly, and – again, assuming we haven’t lost the ability to parse large systems of data – this is a good thing.

Carr’s second concern addresses the commoditization of information and, by extension, knowledge itself. Google’s widely stated purpose (and, arguably, the purpose of the Internet) is to capture and organize the world’s information. Carr compares this process to assembly line efficiency research performed in the early 20th century, and implies that knowledge workers will soon be the next moral casualty in the workforce: just as standardized workflow procedures replaced experience and intuition on the factory floor, so Google will replace the independent thought of today’s knowledge workers.

He’s right, on one level: knowledge acquisition in the traditional sense may become irrelevant as information becomes more quickly and accurately accessible.

But his commentary assumes that knowledge qua knowledge is the apex of our information-based society, and that offloading our responsibility for knowledge (to Google et al) is tantamount to abdicating our intelligence. Quite the contrary: democratizing knowledge will clear the way for true intelligence to stand out. Interestingly, Carr notes that research which once took him days to perform now takes hours. Why does he not recognize that this will enable the brightest minds to vastly increase their intellectual output? Old-world research is out; intelligent synthesis by capable minds is in.

To conclude, while Carr does not (to this reader) adequately establish that the Internet is stunting our intelligence, he is correct in highlighting some of its pitfalls. In the frenzy of “hyperlinks, blinking ads, and other digital gewgaws”, we can become distracted and lose our critical edge. (By the way, did you read his full article? If you couldn’t dream of reading something that long – or tried but were too distracted to get through it – you may have a real case of Internet-onset ADHD. In that case, heed Carr’s warning, set your autoresponders, retire to a cabin with some books, and read for your life!)

The key, as with all areas of knowledge, is to stay alert – avoid intellectual laziness – and engage with the information we encounter. The Internet is merely a tool. Use it as such, and it will never replace your mind.

(1) Carr never poses the question “Is Google making us stupid?”, though it makes an eye-catching title and probably sold more than a few copies of the magazine.

Success in the information age hinges on managing the explosion of available information in meaningful ways. To even approach this goal requires a successful information management strategy, which revolves around the question:

  • “How do I find relevant information?”

and its corollary:

  • “How do I manage the information I’ve found?”

On a personal note, these are two of the questions that drive my own technological explorations. Brainstorming and note-taking methods and tools provide another side to the issue. This post is intended to provide some background and framework for said exploration.

How do I find relevant information?

Online information is typically located through complementary methods of search and discovery.

Traditional search technologies will long remain the first resort for information-seekers. Desktop search clients are also available for advanced data mining and research. Yet the rising semantic web is the true future of the Internet, and will enable users to interact with information in more meaningful and relevant ways.

Relationship-based information discovery is rapidly adding an important layer over traditional search tools. Social microsharing platforms (e.g., Twitter) and more robust social platforms (e.g., Twine, in private beta) allow individuals to build a liminal space of like-minded individuals with similar interests.

Two points are worth noting here:

  1. Social networks are becoming a search sphere in their own right. For me, the Twitter ecosystem has become my trusted first source of user opinions; for many types of information, I search on Twitter before going to Google or DEVONagent.

  2. More and more information is shared and recommended through these relationship-based services. In other words, social networking platforms allow information to be discovered rather than explicitly sought.

Search once, not twice

The key to a useful information management strategy is this: You should only have to find a piece of information once.

Search tools should not be relied upon to find specific pieces of previously located information. If it takes more than fifteen seconds to locate online, it should be in your personal information system, not left to The Google.

If you spend a lot of time looking for information you’ve already encountered, your system is broken and you’re wasting your time. Or your employer’s time. Either way, that time should be spent turning information into knowledge, or putting it to use.

So: what to do with all this acquired information?

Tools of the trade

To be effective, an electronic document management system (EDM) should be:

  • Accessible — it’s available when and where you need it (for both archive and retrieval)

  • Flexible — able to accept input from any variety of sources

  • Scalable — can accept many thousands of documents without becoming unwieldy

  • Searchable — the system is worthless if you can’t find what you’re looking for

  • Extensible — it can be extended through scripting or other means

  • Open — It doesn’t hold your information hostage when you need to change systems

The most rudimentary means of storing information – file systems – fail where it matters most. Because file systems are not designed for this type of data management, they are not truly accessible (saving an excerpt from a website, for instance, is a many-step operation), or quickly searchable (your data are hidden amongst tens of thousands of irrelevant system and program files). In addition, file systems don’t provide end-to-end data functions, so viewing the contents of most file types requires launching another application. Add-on tools like Google Desktop mitigate some of these issues, but they’re no match for a real EDM system.

True EDMs are specifically designed for the task of archiving and retrieving information. They can store images, text clippings, and documents of all types; add content indexing to the mix (allowing users to search by any word contained in their files); and are streamlined to allow quick archiving of information. EDMs can be implemented as software-based solutions (see Yojimbo, EagleFiler, and the like), as well as online (see Google Notebook, for instance).
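
Content indexing is commonly built on an inverted index: a map from each word to the documents that contain it. The sketch below is a minimal illustration of that general idea in Python, not a description of how any of the tools named here work internally. The function names and the sample documents are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the names of documents containing every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical sample archive, invented for illustration.
documents = {
    "transit-note.txt": "WMATA schedule data and Google Transit",
    "beer-note.txt": "Tasting notes for a Belgian ale",
    "reading-note.txt": "Carr on Google and deep reading",
}
index = build_index(documents)

print(sorted(search(index, "google")))
print(sorted(search(index, "google transit")))
```

Because the index is built once at archive time, a lookup only touches the words in the query rather than rescanning every file, which is what makes "search by any word" practical as the archive grows.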

Second-generation information managers like DEVONthink and Twine take content management a step further, adding semantic intelligence and useful content analysis to the user’s database. DEVONthink, a tool that I’ve used for years, analyzes the contents of its articles to identify non-obvious semantic relationships and assist with automatic filing. Twine performs similar functionality in the context of a social network, in theory promising to integrate the most relevant search, discovery, and EDM tools.

Live in the cloud…

As computer usage becomes increasingly network-centric and social, individuals are becoming more and more willing to trade privacy for the convenience and utility of web-based services.

Put another way, we are becoming more willing to keep our information in “the cloud”. (I like the cloud metaphor because, for me, it conjures images of Benjamin Franklin flying his kite in the electric storm. There is energy and power and excitement in the cloud. There’s also risk.)

This trend will spell dramatic shifts in EDM solutions to come. Soon all our data will be accessible from any web-enabled smartphone or computer, anywhere in the world. (And with customs agents able to search the contents of any electronic device with impunity, business travelers may soon be required to keep sensitive data online, not on their machines.)

But online services are not a silver bullet—yet. As a general rule, the current generation of Web 2.0 apps:

  • Make it difficult to work offline (technologies like Google Gears may soon obviate this concern)

  • Don’t take full advantage of OS-level services, keyboard shortcuts, etc

  • Are not easily automated or scriptable

  • Make it difficult to back up files (FUSE applications may change this in the near future)

  • Put users at the mercy of others for data integrity (Granted, it’s vastly more likely that you’ll lose data from your own hard drive crashing – rather than Google’s servers going kaput – but either scenario is a possibility. Pick your poison)

…with your feet on the ground

Until these concerns can be fully mitigated, the most promising path forward lies in hybrid desktop/web platforms that allow users to maintain local and online control of information.

These may be end-to-end solutions (for example, the NewsGator family of products includes web- and software-based newsreaders that are fully synchronized) or more specific sync services (Plaxo, for instance, synchronizes desktop calendar and address book clients with online equivalents). When implemented correctly, these tools can be phenomenally useful.

I’ve been waiting for this same innovation to make its way to the world of EDM apps, and there are some promising options emerging. A limited example is DEVONthink Pro Office, which has a built-in web server that provides remote access to your database. (First impression: it’s slick, but you’re out of luck if you’re stuck behind a firewall or the database isn’t running.) Evernote is a new EDM tool with full desktop-to-web synchronization tools, as well as limited online editing.

The beginning

Ultimately, any EDM solution is only a tool — but it may be the most important tool in the arsenal of knowledge workers. It is therefore of critical importance that we take our EDM strategies seriously.

You may not yet have an EDM strategy. But creating one may be the most important step you can take in your development as a knowledge worker.

Take a moment to think about how you manage what you know. Start exploring technologies, asking how they can improve your knowledge set.

It may take months to work out a reasonable system of your own… but it’s a beginning, and one well worth making.
