<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Sleight of handle</title>
	<atom:link href="http://bit-player.org/2008/sleight-of-handle/feed/" rel="self" type="application/rss+xml" />
	<link>http://bit-player.org/2008/sleight-of-handle</link>
	<description>An amateur's outlook on computation and mathematics.</description>
	<pubDate>Mon, 01 Dec 2008 20:03:29 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.3</generator>
	<item>
		<title>By: brian</title>
		<link>http://bit-player.org/2008/sleight-of-handle#comment-1752</link>
		<dc:creator>brian</dc:creator>
		<pubDate>Thu, 31 Jul 2008 13:50:00 +0000</pubDate>
		<guid isPermaLink="false">http://bit-player.org/?p=151#comment-1752</guid>
		<description>@Stuart Haber: I agree that the Phelps and Wilensky lexical signatures are a very cool idea. And I'm sure that if such a digest were included in every URL or link, people would find other clever things to do with the information. Google would love it.

But I have to add that it's not hard to think up scenarios where redirecting through a search engine would be unhelpful. Consider a phishing message with a link to a URL that looks legitimate but names a nonexistent document; the lexical signature is carefully crafted to send you off to a nefarious web site. (Phelps and Wilensky don't discuss at all how their scheme would work in a world with malicious agents.)

By the way, one of the ISPs I use has implemented an analogous idea at a different level of the network infrastructure -- not when a site returns a 404 error but when a DNS lookup fails. Instead of getting a proper error message, I get a "helpful" list of suggested alternatives, plus a gaggle of advertising. The only thing I can say in favor of this intervention is that it has made me a more careful typist when I enter a URL by hand.</description>
		<content:encoded><![CDATA[<p>@Stuart Haber: I agree that the Phelps and Wilensky lexical signatures are a very cool idea. And I&#8217;m sure that if such a digest were included in every URL or link, people would find other clever things to do with the information. Google would love it.</p>
<p>But I have to add that it&#8217;s not hard to think up scenarios where redirecting through a search engine would be unhelpful. Consider a phishing message with a link to a URL that looks legitimate but names a nonexistent document; the lexical signature is carefully crafted to send you off to a nefarious web site. (Phelps and Wilensky don&#8217;t discuss at all how their scheme would work in a world with malicious agents.)</p>
<p>By the way, one of the ISPs I use has implemented an analogous idea at a different level of the network infrastructure &#8212; not when a site returns a 404 error but when a DNS lookup fails. Instead of getting a proper error message, I get a &#8220;helpful&#8221; list of suggested alternatives, plus a gaggle of advertising. The only thing I can say in favor of this intervention is that it has made me a more careful typist when I enter a URL by hand.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Stuart Haber</title>
		<link>http://bit-player.org/2008/sleight-of-handle#comment-1751</link>
		<dc:creator>Stuart Haber</dc:creator>
		<pubDate>Wed, 30 Jul 2008 20:33:36 +0000</pubDate>
		<guid isPermaLink="false">http://bit-player.org/?p=151#comment-1751</guid>
		<description>Here's another clever solution:  In place of the ordinary URL for a web page, use instead a "robust hyperlink", extending the URL with a "lexical signature" consisting of (say) five well-chosen words in the page.  To find the page, given the robust pointer, first try the URL.  If that fails, then ask Google to find the page -- which it will, if the five words were indeed chosen well, and the page is in Google's index.

See "Robust hyperlinks cost just five words each", by Thomas A. Phelps and Robert Wilensky, at &lt;a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2000/CSD-00-1091.pdf" rel="nofollow"&gt;http://www.eecs.berkeley.edu/Pubs/TechRpts/2000/CSD-00-1091.pdf&lt;/a&gt;.  It is easy to automate both the choice of lexical signature and the dereferencing mechanism.</description>
		<content:encoded><![CDATA[<p>Here&#8217;s another clever solution:  In place of the ordinary URL for a web page, use instead a &#8220;robust hyperlink&#8221;, extending the URL with a &#8220;lexical signature&#8221; consisting of (say) five well-chosen words in the page.  To find the page, given the robust pointer, first try the URL.  If that fails, then ask Google to find the page &#8212; which it will, if the five words were indeed chosen well, and the page is in Google&#8217;s index.</p>
<p>See &#8220;Robust hyperlinks cost just five words each&#8221;, by Thomas A. Phelps and Robert Wilensky, at <a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2000/CSD-00-1091.pdf" rel="nofollow">http://www.eecs.berkeley.edu/Pubs/TechRpts/2000/CSD-00-1091.pdf</a>.  It is easy to automate both the choice of lexical signature and the dereferencing mechanism.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Christopher Lord</title>
		<link>http://bit-player.org/2008/sleight-of-handle#comment-1740</link>
		<dc:creator>Christopher Lord</dc:creator>
		<pubDate>Wed, 23 Jul 2008 18:30:25 +0000</pubDate>
		<guid isPermaLink="false">http://bit-player.org/?p=151#comment-1740</guid>
		<description>I saw one solution to this problem once that really blew me away: give data names, and network with the names instead of the addresses -- a.k.a. content-centric networking. Van Jacobson, a major contributor to TCP/IP, is a big proponent of this idea.

It's the mindset of functional programming applied to networking: instead of more indirection, less indirection (i.e., zero indirection).

Tim Berners-Lee is suggesting this sort of notion by having invariant URLs. The only problem is, the invariance is not enforced by the web, and like filthy C programmers, we change URLs constantly as a result. The web unnecessarily ties DATA to LOCATION in the URL scheme, resulting in mutable names for data.</description>
		<content:encoded><![CDATA[<p>I saw one solution to this problem once that really blew me away: give data names, and network with the names instead of the addresses &#8212; a.k.a. content-centric networking. Van Jacobson, a major contributor to TCP/IP, is a big proponent of this idea.</p>
<p>It&#8217;s the mindset of functional programming applied to networking: instead of more indirection, less indirection (i.e., zero indirection).</p>
<p>Tim Berners-Lee is suggesting this sort of notion by having invariant URLs. The only problem is, the invariance is not enforced by the web, and like filthy C programmers, we change URLs constantly as a result. The web unnecessarily ties DATA to LOCATION in the URL scheme, resulting in mutable names for data.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jim Ward</title>
		<link>http://bit-player.org/2008/sleight-of-handle#comment-1709</link>
		<dc:creator>Jim Ward</dc:creator>
		<pubDate>Thu, 26 Jun 2008 13:28:55 +0000</pubDate>
		<guid isPermaLink="false">http://bit-player.org/?p=151#comment-1709</guid>
		<description>Re: "The only real standard is volume" - I thought that was the way standards should work. First you get volume, THEN you write the standard. Everybody builds a better mousetrap, but it's pretty much a debate until you get other people using it.

Maybe it's the circle of (web) life that URLs die?</description>
		<content:encoded><![CDATA[<p>Re: &#8220;The only real standard is volume&#8221; - I thought that was the way standards should work. First you get volume, THEN you write the standard. Everybody builds a better mousetrap, but it&#8217;s pretty much a debate until you get other people using it.</p>
<p>Maybe it&#8217;s the circle of (web) life that URLs die?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nik</title>
		<link>http://bit-player.org/2008/sleight-of-handle#comment-1708</link>
		<dc:creator>Nik</dc:creator>
		<pubDate>Wed, 25 Jun 2008 12:33:18 +0000</pubDate>
		<guid isPermaLink="false">http://bit-player.org/?p=151#comment-1708</guid>
		<description>The problem of broken links in the web was, sadly, easily predictable, and many of the pitfalls could have been avoided had TBL et al. spent a bit more time looking at prior art in the hypertext community.

Contrary to popular belief, there was a well-established and thriving hypertext community long before the Web (see, for example, the musings of Ted Nelson). However, as Scott McNealy once quipped, the only real standard is volume, and now the web is the largest (and most broken :) hypertext collection in the world.

There was a fairly sensible proposal to introduce a DNS-style indirection using URNs, where a URN server would translate a URN to a URL transparently to the user. This would work if the target of a URN were always informed that they'd been targeted, and made sure they updated the URN server with the new address (e.g. as we do for DNS name resolution).

Still, for all its faults, perhaps we should just be grateful for how well the web works as a whole rather than dwell on the occasional grief a 404 error causes.</description>
		<content:encoded><![CDATA[<p>The problem of broken links in the web was, sadly, easily predictable, and many of the pitfalls could have been avoided had TBL et al. spent a bit more time looking at prior art in the hypertext community.</p>
<p>Contrary to popular belief, there was a well-established and thriving hypertext community long before the Web (see, for example, the musings of Ted Nelson). However, as Scott McNealy once quipped, the only real standard is volume, and now the web is the largest (and most broken :) hypertext collection in the world.</p>
<p>There was a fairly sensible proposal to introduce a DNS-style indirection using URNs, where a URN server would translate a URN to a URL transparently to the user. This would work if the target of a URN were always informed that they&#8217;d been targeted, and made sure they updated the URN server with the new address (e.g. as we do for DNS name resolution).</p>
<p>Still, for all its faults, perhaps we should just be grateful for how well the web works as a whole rather than dwell on the occasional grief a 404 error causes.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: D. Eppstein</title>
		<link>http://bit-player.org/2008/sleight-of-handle#comment-1707</link>
		<dc:creator>D. Eppstein</dc:creator>
		<pubDate>Mon, 23 Jun 2008 18:17:31 +0000</pubDate>
		<guid isPermaLink="false">http://bit-player.org/?p=151#comment-1707</guid>
		<description>Stale DOIs are not the only problem; the other is that the same content may be hosted by multiple entities with different DOIs, and when one entity stops hosting its content the DOI gives you no way of finding the other copies. iPhylo has more &lt;a href="http://iphylo.blogspot.com/2007/05/duplicate-dois.html" rel="nofollow"&gt;here&lt;/a&gt;, &lt;a href="http://iphylo.blogspot.com/2008/05/bioone-andor-crossref-sucks.html" rel="nofollow"&gt;here&lt;/a&gt;, and &lt;a href="http://iphylo.blogspot.com/2008/05/when-dois-collide-and-then-disappear.html" rel="nofollow"&gt;here&lt;/a&gt;.</description>
		<content:encoded><![CDATA[<p>Stale DOIs are not the only problem; the other is that the same content may be hosted by multiple entities with different DOIs, and when one entity stops hosting its content the DOI gives you no way of finding the other copies. iPhylo has more <a href="http://iphylo.blogspot.com/2007/05/duplicate-dois.html" rel="nofollow">here</a>, <a href="http://iphylo.blogspot.com/2008/05/bioone-andor-crossref-sucks.html" rel="nofollow">here</a>, and <a href="http://iphylo.blogspot.com/2008/05/when-dois-collide-and-then-disappear.html" rel="nofollow">here</a>.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Barry Cipra</title>
		<link>http://bit-player.org/2008/sleight-of-handle#comment-1706</link>
		<dc:creator>Barry Cipra</dc:creator>
		<pubDate>Mon, 23 Jun 2008 18:13:36 +0000</pubDate>
		<guid isPermaLink="false">http://bit-player.org/?p=151#comment-1706</guid>
		<description>"If everyone counts on Google to know where everything is, Google will have no way of finding anything"

In a talk some years ago on the linear algebra lurking beneath PageRank, the speaker referred to this prospect as "eigendeath."  I don't remember who gave the talk or whether the term was original with him or her.  Does anyone know the etymology?  Google gives no hits for it!</description>
		<content:encoded><![CDATA[<p>&#8220;If everyone counts on Google to know where everything is, Google will have no way of finding anything&#8221;</p>
<p>In a talk some years ago on the linear algebra lurking beneath PageRank, the speaker referred to this prospect as &#8220;eigendeath.&#8221;  I don&#8217;t remember who gave the talk or whether the term was original with him or her.  Does anyone know the etymology?  Google gives no hits for it!</p>
]]></content:encoded>
	</item>
</channel>
</rss>
