Tuesday, October 7, 2008

Thinking Cap questions on PageRank / Authorities/Hubs

------
Digression:

Here is the link to the Montana history professor who took "sponsorship" from a burrito joint for his class:

http://www.missoulian.com/articles/2008/10/01/news/top/news01.txta

(Check out "Wait Wait Don't Tell me" on NPR on Sundays if you want to be kept abreast of this kind of important news ;-)
----------

Here is a thinking cap question:

 The following are several "comparisons" between page rank and Authorities/hubs methods for computing page importance. Comment on whether or not these comparisons make sense.

1. A/H analysis is too costly because it has to be done for each query, while page-rank analysis, which can be done off-line and once for all queries, is much better.

2. Page rank analysis is not as good at capturing importance in the context of the query, while A/H analysis is much better.

3. A/H analysis has major issues with stability (i.e., its ranking can change a lot with just small random changes to link structure), while page rank is much more stable.

4. It's all a bunch of ballyhoo, and underneath it all, A/H and pagerank both give the same ranking of importance to all the pages.


------------
Rao


4 comments:

Vijayakrishnan Nagarajan said...
This comment has been removed by the author.
Vijayakrishnan Nagarajan said...

1. The main cost factor is that the corpus is huge, and both PageRank and A/H have to process this link matrix. With A/H, however, small changes to the link structure can change the ranking considerably. PageRank, on the other hand, is not affected much, so once we compute it offline we can wait for a period of time before calculating it again.

3. If the link structure changes, PageRank will also be affected, but I don't think the change will be as big as with A/H. While computing PageRank, we use 1/N for sink pages, and for a forwarding page we add (1-c)(1/N). If the link structure of this sink node or forwarding page changes, there wouldn't be much change in the PageRank, as there is also a weight factor (c) associated with it.
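
For reference, the smoothing described above can be sketched as a small power iteration. This is only a minimal sketch under stated assumptions, not the exact formulation from class: the function name `pagerank`, the dict-of-lists graph format, the tolerance, and the value c = 0.85 are all chosen here for illustration, and every link target is assumed to also appear as a key in the graph.

```python
def pagerank(links, c=0.85, tol=1e-10, max_iter=1000):
    """Power-iteration PageRank sketch.

    links: dict mapping each page to the list of pages it links to.
           Assumption: every link target is itself a key of `links`.
    c:     weight on following links; (1 - c) is the reset weight.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Rank held by sink pages (no out-links) is spread uniformly: 1/N each.
        sink_mass = sum(rank[p] for p in pages if not links[p])
        # Every page also receives the uniform reset term (1 - c) * 1/N.
        new = {p: (1.0 - c) / n + c * sink_mass / n for p in pages}
        # Non-sink pages split their rank equally among their out-links.
        for p in pages:
            for q in links[p]:
                new[q] += c * rank[p] / len(links[p])
        delta = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical four-page graph; D is a sink with no in-links.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": []}
ranks = pagerank(graph)
```

Because every page always receives the (1-c)(1/N) term, no single link change can drive a page's score to zero, which is one intuition behind the stability claim.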

Anonymous said...

2. It is true that A/H is much better than PageRank analysis at capturing importance in the context of the query. However, its tendency to drift from a specific topic to a very general one, completely away from the query context, casts a shadow over this good feature.

Dmitry Voronov said...

1. A/H can also be done "off-line and once for all queries", just like PageRank.

3. A/H and PageRank both have problems with stability, and both have solutions.
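
To illustrate point 1, here is a minimal sketch of the A/H iteration run over an entire link graph rather than a query-specific base set, so it could be computed once, off-line. The function names `hits` and `normalize`, the dict-of-lists graph format, and the fixed iteration count are assumptions made for illustration, and every link target is assumed to appear as a key in the graph.

```python
import math


def normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {p: v / norm for p, v in scores.items()}


def hits(links, iterations=50):
    """Authorities/hubs sketch over a whole graph.

    links: dict mapping each page to the list of pages it links to.
           Assumption: every link target is itself a key of `links`.
    Returns (authority_scores, hub_scores).
    """
    pages = sorted(links)
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority of a page: sum of the hub scores of pages linking to it.
        new_auth = {p: 0.0 for p in pages}
        for p in pages:
            for q in links[p]:
                new_auth[q] += hub[p]
        # Hub score of a page: sum of the authorities of the pages it links to.
        new_hub = {p: sum(new_auth[q] for q in links[p]) for p in pages}
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return auth, hub


# Hypothetical four-page graph: A and B act as hubs, C as the main authority.
graph = {"A": ["C", "D"], "B": ["C", "D"], "C": [], "D": ["C"]}
authorities, hubs = hits(graph)
```

Nothing in this loop depends on a query; the query dependence of A/H comes from choosing the base set of pages, not from the iteration itself.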