Beyond PageRank: Machine Learning for Static Ranking

  • Matthew Richardson, Microsoft Research, USA
  • Amit Prakash, MSN, USA
  • Eric Brill, Microsoft Research, USA

Track: Search

Slot: 16:00-17:30, Friday 26th May

Since the publication of Brin and Page's paper on PageRank, many in the Web community have depended on PageRank for the static (query-independent) ordering of Web pages. We show that we can significantly outperform PageRank using features that are independent of the link structure of the Web. We gain a further boost in accuracy by using data on how frequently users visit Web pages. We use RankNet, a machine learning algorithm for ranking, to combine these and other static features based on anchor text and domain characteristics. The resulting model achieves a static ranking pairwise accuracy of 67.3% (compared with 56.7% for PageRank and 50% for a random ordering).
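
The pairwise accuracy figures above can be read as the fraction of page pairs that a ranker orders the same way as a reference judgment. The abstract does not spell out the computation, so the following is only a minimal sketch under stated assumptions: the function name `pairwise_accuracy`, the numeric `labels` standing in for reference judgments, the choice to skip pairs with equal labels, and the choice to count tied scores as disagreements are all illustrative and not taken from the paper.

```python
from itertools import combinations


def pairwise_accuracy(scores, labels):
    """Fraction of differently-labeled pairs that `scores` orders as `labels` does.

    Assumptions (not from the abstract): `labels` are numeric reference
    judgments, pairs with equal labels are skipped, and a tie in `scores`
    counts as a disagreement.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    agree = 0
    total = 0
    for i, j in combinations(range(len(scores)), 2):
        if labels[i] == labels[j]:
            continue  # the reference expresses no preference for this pair
        total += 1
        # The product is positive only when both differences have the same sign.
        if (scores[i] - scores[j]) * (labels[i] - labels[j]) > 0:
            agree += 1

    if total == 0:
        raise ValueError("no pairs with differing labels")
    return agree / total


if __name__ == "__main__":
    # Hypothetical static scores and judgments for four pages.
    print(pairwise_accuracy([0.9, 0.2, 0.5, 0.4], [1, 0, 0, 1]))
```

Under this definition a ranker that orders pairs at random would be expected to score about 50%, which is why the abstract quotes that figure as the baseline.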

