Tuesday, November 3, 2009

Shortened URLs ruining the internet.

From Open the Future:

"The use of URL-shortening services is a classic example of short-term need trumping long-term resilience.

Shortened URLs:

• are not human-readable, and even the versions with user-generated mnemonics are little better than crude tags;
• don't provide contextual clues, which would offer a way to find the information later (if the article has expired, for example) by looking up relevant keywords or related concepts;
• rely on the continued presence of the particular shortener—any downtime or disappearance kills potentially millions of links.

That is, URL-shorteners violate three key principles of resilient design: they offer no transparency, no redundancy, and no decentralization. They're classic single points of failure.

As a result, shortened URLs have little or no reference or archival value. A dead short URL is far worse than a dead standard URL, in fact, because (a) you have no way of getting contextual meaning, and (b) you can't even go look up the address on the Internet Archive. This is a real problem for those of us who think of the Internet as a tool for building knowledge. For better or for worse, services such as Twitter have gone from being ephemeral conversation media to being used as tools of collaborative awareness about the world. We can no longer assume that a link in a short message is of only transient value.

Yet many of us (including me) rely heavily on shorteners when using URLs "conversationally," such as on Twitter or in an instant message chat. They take far fewer characters than a typical URL; in length-limited media such as Twitter, that's a critical advantage.

So, in the immortal phrase, what is to be done?

Given that the need for URL shortening will remain as long as we use character-limit media such as Twitter or SMS, I can think of a few steps that would help to return some of the information resilience to the system:

• Embed shortening "behind the scenes" in Twitter and the like, so that senders just enter a full URL, and recipients see the full URL whenever possible. The full URL should show up on the web version, so that the real address gets archived.
• Google, Bing, Yahoo, and the other search engines should auto-translate any shortened URLs they stumble upon when indexing pages, so that at the very least the cached version contains the full address. The Internet Archive should definitely be doing this.
• All URL-shortening services should agree to make the records of short URL -> full URL links available to search and archival sites, under appropriate privacy conditions (e.g., all names/IP addresses of users stripped out, data only available if the company goes under, data only available after five years, users can choose to allow the URL link to expire).

Any of these would be an enormous step forward, and the combination would make for a much more resilient system. Admittedly, all of these steps require a bit of coding work, and aren't going to be implemented overnight. However, nobody said resilience was easy—just necessary."
Jamais Cascio
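
The second and third steps Cascio proposes (translating shortened URLs at indexing time, using shared records of short URL -> full URL links) could look roughly like the Python sketch below. Everything named here is an assumption for illustration: the `short.example` domain, the `SHORT_TO_FULL` table, and the `expand_short_urls` function are hypothetical, and a real indexer would more likely follow the shortener's HTTP redirect or query an archive than read a hard-coded dictionary.

```python
import re

# Hypothetical archive of short URL -> full URL records, of the kind the
# third step imagines shorteners sharing. These entries are made up.
SHORT_TO_FULL = {
    "http://short.example/abc123":
        "http://www.example.org/2009/11/resilient-design-notes.html",
}

# Rough URL matcher: a scheme followed by anything that is not whitespace,
# an angle bracket, or a quote character.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def expand_short_urls(text, records):
    """Replace every URL found in `records` with its full form.

    URLs that are not in `records` are left exactly as written.
    """
    def replace(match):
        url = match.group(0)
        # Trailing punctuation is almost always part of the sentence,
        # not part of the URL, so set it aside before the lookup.
        stripped = url.rstrip(".,;:!?)")
        trailing = url[len(stripped):]
        return records.get(stripped, stripped) + trailing

    return URL_PATTERN.sub(replace, text)


message = "Worth reading: http://short.example/abc123."
archived = expand_short_urls(message, SHORT_TO_FULL)
```

Here `archived` holds the message with the full address in place of the short one, so the cached or archived copy keeps the human-readable URL and its contextual clues even if the shortener later disappears.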
