This is old news on a couple of dimensions. Read Write Web had a post on how Google uses human relevance studies to help judge/QA their search results. This resulted from an interview that Peter Norvig gave to MIT Technology Review and caused some commenting in the blogosphere (NewYorkTimes Tech Blog, Goolge Blogoscoped). Old news on old news.
We now know that both Yahoo and Microsoft are using (to some degree) human studies to evaluate computational advertising algorithms (see this and this). Evaluating the correlation of what informational item an algorithm predicts, vs what humans think, is relevant to a context is the performance metric of your algorithm.
Question: When will TREC have a computational advertising contest?