Let's talk about emergent properties of information vs. explicit properties, shall we?
This came up recently when we were talking about metadata, and how it is terrifically useful as an emergent property developed by a metadata generator (Google or Technorati), and terrifically inconvenient as an explicit property developed by the content generator (RDF coded by humans).
Well, here's another example of the same thing in another domain: Artificial Intelligence. Marvin Minsky says "AI has been brain-dead since the 1970s". This is a terrific Freudian slip, as well as metadata confusion. { Marvin is one of the true pioneers of AI, by the way, and one of my heroes. } Read on...
First, perhaps a brief digression is in order. An emergent property of something is an attribute which "emerges" from the whole, a higher-level thing which summarizes lower-level things. For example, this post is philosophical. No one letter or word or sentence or even paragraph contains the attribute "philosophical"; that property emerges from the whole. If I enclosed the whole post with tags like this: <philosophy>...</philosophy>, that would be an explicit property, something I explicitly added (and the "<philosophy>" tags would be metadata).
Going back to AI, Marvin says AI can't deal with concepts like "water is wet". In this case, wet is an emergent property of water; no one water molecule has this property, but a bunch of molecules in liquid form together do. And in the physical world there is no way to add metadata - you can't "tag" liquid with a property like "wet". But here's where I respectfully part ways with Marvin. Emergent properties like "wet" can be determined, usually by analogy - and AI has made huge strides in this sort of processing. It is just like the way Google can determine that a website about water is "wet" by examining all the links to the site ("this site is wet"), even if the site itself does not mention wetness (or "know" that it is wet in any way).
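To make the link idea concrete, here is a toy sketch in Python. It is emphatically not how Google actually works; the URLs, the anchor texts, the stopword list, and the function name `emergent_terms` are all made up for illustration. The only point is that a label like "wet" can be computed from what other pages say about a site, without the site ever tagging itself.

```python
from collections import Counter

# Hypothetical inbound links as (target_url, anchor_text) pairs. All data is invented.
INBOUND_LINKS = [
    ("example.org/water", "this site is wet"),
    ("example.org/water", "wet and wonderful water facts"),
    ("example.org/water", "all about water"),
    ("example.org/sand", "dry desert sand"),
]

# A tiny, assumed stopword list so filler words don't dominate the counts.
STOPWORDS = {"this", "site", "is", "and", "all", "about", "the", "a"}


def emergent_terms(target, links, top_n=3):
    """Return the most common anchor-text words that point at `target`."""
    counts = Counter()
    for url, anchor in links:
        if url != target:
            continue  # only look at links to the page we care about
        for word in anchor.lower().split():
            if word not in STOPWORDS:
                counts[word] += 1
    return [word for word, _ in counts.most_common(top_n)]


if __name__ == "__main__":
    # The water page never declares itself "wet"; the label comes from its inbound links.
    print(emergent_terms("example.org/water", INBOUND_LINKS))
```

Notice that nothing in the water page's own content is consulted, and no list of allowed labels exists anywhere in the code.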
Marvin goes on to say expert systems based on rules and heuristics have 'reached a dead-end'. This is exactly right! But this doesn't mean AI has stopped - it has redirected... Using rules and heuristics is akin to using explicit metadata, an approach which is inherently limited. Using inference engines to determine emergent properties is more powerful and actually easier. In a way, this is a "brain dead" approach, because it doesn't require that a lot of effort be invested in creating metadata the way rules/heuristic approaches do. Instead, the effort is invested in analysis of the information to synthesize the meta-information. So Marvin is dead right - AI is "brain dead" - but not in the way he meant. It certainly doesn't mean progress has stopped, quite the contrary. Google is a shining example of AI in action.
And speaking of Google, consider The Semantic Web (note capital letters). Some people think labeling everything on every web page with metadata will make searching and managing the information on the web easier. Wrong. This is just like putting <philosophy> tags around this article. If this is philosophy, then that's because it is an emergent property of what I wrote, not because I explicitly labeled it as such. What if I labeled it <sports> or <art>? That wouldn't make it either one (well, you might consider it art, but that would be a matter of opinion, not a clearly labeled fact. And that's the point!) Instead of explicitly attaching metadata to everything, the web has evolved superior ways of implicitly computing emergent properties, exemplified by search engines like Google and Technorati. Not only is this much easier - nothing has to be categorized up front - but it works much better, because the emergent properties do not have to be determined in advance.
This same emergent vs. explicit distinction comes up in image processing, which my little company Aperio does all day long. When you're trying to recognize patterns in images, you can do it two ways. One way is to make a list of possible "features" images can have, things like shapes, colors, textures, relationships to one another, etc. Then you catalog all the features of a particular image explicitly by way of characterizing that image. The other way is to compute emergent properties of the image dynamically, and use them to characterize the image. This is conceptually simpler and actually easier, because an exhaustive list of potentially useful features does not have to be compiled up front. It does require clever algorithms for computing the emergent properties, which is where Aperio has some cool ideas...
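Here is a toy Python sketch of the second approach. To be clear, this is not Aperio's algorithm; it is a minimal illustration under my own assumptions, using a made-up 4x4 grayscale image and a tiny one-dimensional two-group clustering (the helper name `two_means` is hypothetical). Nobody tells the code where "dark" ends and "bright" begins; the split falls out of the pixel values themselves.

```python
def two_means(values, iterations=10):
    """Split numbers into two groups with a tiny 1-D k-means (k=2)."""
    lo, hi = float(min(values)), float(max(values))  # deterministic starting centers
    for _ in range(iterations):
        # Assign each value to whichever center is nearer (ties go to the low group).
        dark = [v for v in values if abs(v - lo) <= abs(v - hi)]
        bright = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not bright:  # all values identical, so there is nothing to split
            break
        lo = sum(dark) / len(dark)
        hi = sum(bright) / len(bright)
    return lo, hi


# Hypothetical grayscale image (0 = black, 255 = white); the data is invented.
IMAGE = [
    [10, 12, 200, 210],
    [11, 14, 205, 220],
    [9, 13, 198, 215],
    [12, 10, 202, 208],
]

pixels = [p for row in IMAGE for p in row]
dark_center, bright_center = two_means(pixels)

# The threshold is computed from the image, not chosen up front.
threshold = (dark_center + bright_center) / 2
mask = [[p > threshold for p in row] for row in IMAGE]

if __name__ == "__main__":
    for row in mask:
        print("".join("#" if bright else "." for bright in row))
```

A different image would yield a different threshold with no change to the code, which is the appeal: the categories come from the data rather than from a catalog compiled in advance.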
Whenever you read about people who want to add metadata to information to make properties explicit, be skeptical! There are generally better ways to accomplish the same thing by computing emergent properties from the information itself.