CS 610 Sec 004 Spring 2020  
Instructor: Ravi Varadarajan  
Performance of Google Page Ranking Algorithm  
Final Project Option 2  
The goal of this project is to analyze the performance of Google PageRank algorithm which we will  
discuss in detail in the class when we discuss graphs. This algorithm assigns a rank to a web page based  
on how often a web user can arrive at a page clicking on links from other pages.  Higher the rank of a  
web page higher its relevance which the search engine can use in displaying the results of a search.  
PageRank is probability function defined as  ∶  →	[0, 1] where S is the set of web pages of interest  
and  ∑ ()!∈#	  = 1. Let every page in S be represented as a vertex in a directed graph (S, E) where  
edge (u, v) indicates there is a link from page u to page v. Then the rank function satisfies the relation: () = (1 − ) + 	 2 ()/()(,()*	+   
“d” is a damping factor usually around 0.85 which indicates that the user will stop surfing after clicking  
through some series of links. You will be using an iterative algorithm to find a solution to the above  
recurrence equations.  In this algorithm we start with some initial rank values for the vertices and then  
modify them iteratively using the above equation.   
In the first part of the project you will be implementing this algorithm for which the input will consist of  
(i) a directed graph, (ii) a damping factor, (c) iteration limit and (d) error limit. You will terminate the  
algorithm if either the new rank values differ from values in the previous iteration within the error limit  
or the number of iterations exceeds the iteration limit. In the latter case we will deem there is no  
solution. You can start with initial value of 1/n for the rank of each vertex.  You should have your own  
implementation of this algorithm (not borrowed from any web site or from anyone else).  
Second part of the project requires you to write a program to test the performance of this algorithm. For  
various values of the number of vertices “n” and maximum out-degree (as % of n), you need to generate  
random graphs as inputs.  These graphs are generated by flipping a coin to determine if there should be  
an edge from a vertex u to vertex v within the maximum out-degree constraint. For each of these  
graphs, run your algorithm and only if there is a solution, include it in the time statistics that you will  
use to assess the performance. For each value of n and maximum out-degree, generate many input  
cases to measure the average time.  
You should plot the average time as a function of number of vertices for each maximum out-degree  
value (% of n) that you have selected.  
You should implement it in either C++ or Java (preferable). For Java, submit a zip file containing (a)  
source code, (b) Jar file containing .class files and (c) a Windows command file that can be run to test  
the first part of the project by entering information such as the directed graph, damping factor etc.  
when prompted by the program, (d) a Windows command file that can be run to execute the test  
program of second part of the project and see the performance results in a tabular form and (e) one- 
page summary of your findings regarding how the algorithm performs asymptotically, including the  
performance plots. For C++, instead of (b), submit the executables that can be run on an AFS UNIX host  
and for (c) and (d) submit UNIX shell scripts to run the test programs.