Dear Google, smarter plz

I say Google because it happens to be the only search engine I use.  I guess this applies to all of them, if for some weird reason you use Yahoo or Ask or MSN.

When I search for something like “java spring static method”, Google will often return results from mailing lists or blogs or similar pages that have multiple unrelated blurbs on one page.

For example, perhaps a single web page shows its 3 most recent blog posts.  Post 1 is about Java threading, post 2 is about spring coil mattresses and static cling, and post 3 is about the scientific method.  Apparently it’s terrbear.org’s blog, thus the non sequitur.

Google proudly returns this page, saying “I have matched all of your words, I’m a super genius”.  I follow up with “Son of a B, I’ll never find an example of how to use Java’s Spring framework to call a static setter method for dependency injection!”  Solution?  There are several.

  1. The owner of the web page could tell search engines not to index his home page and to index only the pages that show blog posts individually.  This puts a lot of responsibility in terrbear’s hands.  I don’t think he’ll do it, and even if he takes the time to figure out how to give the search engines these hints, no rule says they will all follow them.
  2. Search engines could develop more sophisticated semantic parsing to realize that although these words are on the same page, they are really discussing 3 different topics.  That would be great but I would prefer a solution that is more in the range of my lifetime.
  3. Semantic web.  Digg is doing it.  Yahoo is parsing it.  It basically uses a standard called RDF to give your HTML a semantic context, like subject, predicate, object.  From there you can turn a simple post into something more meaningful to search engines (any software, really) so that Google no longer thinks static cling has anything to do with dependency injection.
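
To make the complaint concrete, here is a toy sketch in Python of the difference between matching words anywhere on a page and matching them within a single post.  The post texts and the function names are made up for illustration, and no real search engine is anywhere near this simple; it only shows why knowing where one post ends and the next begins would help.

```python
# Hypothetical home page: three unrelated posts shown together.
posts = [
    "notes on java threading and thread pools",
    "spring coil mattresses and static cling",
    "a short history of the scientific method",
]

def terms(text):
    """Lowercase the text and split it into a set of words."""
    return set(text.lower().split())

def page_matches(query, posts):
    """Naive page-level match: every query word appears somewhere on the page."""
    page_words = set().union(*(terms(p) for p in posts))
    return terms(query) <= page_words

def post_matches(query, posts):
    """Post-level match: return the posts that contain every query word."""
    return [p for p in posts if terms(query) <= terms(p)]

query = "java spring static method"
print(page_matches(query, posts))  # True: the page as a whole has all four words
print(post_matches(query, posts))  # []: no single post has all four
```

The page-level check is the "super genius" behavior; the post-level check is roughly what any of the three solutions would make possible.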

Not that I think any of these things are around the corner, but I think #3 has the most potential.  Not only is it easier to maintain (automatically generate or annotate your posts with RDFa) but it has a wider range of uses.  Not only would search engines use it, your browser could use it to show relevant bookmarks, Google could use it to give more relevant ads, Skynet could teach itself from the internet, etc.
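
As a rough idea of what "automatically annotate your posts with RDFa" might look like, here is a minimal Python sketch that wraps a post in RDFa attributes.  The function name, the example URL, and the choice of the Dublin Core vocabulary (`dc:title`, `dc:subject`) are assumptions for illustration; a real blog engine would do this in its templates and might pick a different vocabulary.

```python
from html import escape

# Dublin Core element set namespace, used here as an example vocabulary.
DC_NS = "http://purl.org/dc/elements/1.1/"

def annotate_post(url, title, subject, body):
    """Wrap one post in a div carrying RDFa attributes.

    `about` names the subject of the statements; each `property`
    names a predicate whose object is the text of that element.
    """
    return (
        f'<div xmlns:dc="{DC_NS}" about="{escape(url, quote=True)}">\n'
        f'  <h2 property="dc:title">{escape(title)}</h2>\n'
        f'  <span property="dc:subject">{escape(subject)}</span>\n'
        f'  <div>{escape(body)}</div>\n'
        f'</div>'
    )

# Hypothetical post; the URL and text are invented for the example.
print(annotate_post(
    "/posts/static-setters",
    "Static setters in Spring",
    "dependency injection",
    "Calling a static setter from Spring.",
))
```

Each post on a multi-post page would get its own annotated block, so software reading the page can tell which words belong to which topic.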

2 Comments so far

  1. terry on May 5th, 2008

    i don’t think i’ve ever talked about java threading, spring coil mattresses, or static cling.

    that said, i changed the tag line according to your paragraph. nice.

  2. jay on June 3rd, 2008

    You ended your last paragraph with both sentences starting “Not only. . .”. I thought that was a nice touch.