I went to hear Mr Raghu Ramakrishnan of Yahoo! Research give a talk at ComLab, which was a real treat. Lovely people in a lovely building doing interesting things: this is the joy of living in Oxford and I am determined to take more advantage of it.
The talk was rather light on detail, if clearly and energetically given. The problem space was described very well, but the solution was mostly given by pointing to DBLife, so it can be done, but the how is left for you to discover.
If, like me, you think of "Yahoo!" as a failing company with an incorrect belief that punctuation can form part of a name, then you may not have paid much attention to them (Douglas Crockford's JSON excepted).
However, the point was well made that, as the number two in the market, "Yahoo!" must necessarily play a market-disrupting game, and the way they are doing this is to open their API. The current monetised search model is sewn up by big G. The next generation will be 'semantic' search:
- the search engine inferring the reason for the search and possibly tracking episodes in a search which may be engaged in over a number of sessions over a period of days or months.
- the search engine extracting more structure from the pages crawled. 
How to do this is unknown, so "Yahoo!" have opened their API, lowering the barrier to entry for academics and others who want to play with the results of spidering. No more having to set up the spiders, parsers or vast datastores previously required to start a web search project.
"Yahoo!" do not want anything in return; all they want is the incumbent dethroned, so that they can have another shot, possibly in partnership with the next big idea.