Under construction; I will add links as we go along. Each citation has a public URL which may end in a dead link over time, so I have tried to add a link to a private copy as well; obviously if only this page is copied over somewhere, the private links will break.
The 3xFaloutsos paper was an early study, starting off many power-law papers.
The bow-tie paper characterized large scale structures in the Web graph.
Estimates of the diameter of the Web graph.
If you are (very) mathematically inclined here are two superhubs to check: a course page by Alan Frieze, and another one by David Aldous.
Barabasi and others wrote the initial preferential attachment/rich-gets-richer/power law degree distribution with power=3; this is kind of inflexible. Meanwhile Web data has power closer to 2.
Kumar and others proposed and analyzed the model of copying links from a refernce node, and offered the scope for smooth tuning of the power.
Pennock and others proposed a linear splice between preferential attachment and uniform attachment, also giving a way to tune the power to observed data.
Adler and Mitzenmacher give a simple algorithm for compressing graphs using the reference idea, and many negative (hardness) results for improving upon it.
Nevertheless, a group in Italy has written a great Java package called WebGraph to compress Web graphs down to only about 2.58 bits per link.
Henzinger and others discuss a practical way of sampling from the Web graph without using backlinks, although this means there can be some initial bias.
Bar-Yossef and others use a backlink and self-loop trick to turn the Web graph into an undirected, regular graph, so that we can sample directly from a walk on the transformed graph.
We applied a Bar-Yossef-type algorithm to study broad topic communities on the Web.
Kephart and White analyze some special cases of virus-spreading in Erdos-Renyi graphs.
Wang and others give a general survivability analyses for arbitrary graphs.
The unpublished "Maxwell's equation" paper from Page, Brin and others.
Tomlin suggested that Pagerank is only solution to a larger class of constrained flow problems.
A number of followup papers/projects biased the jump and/or walk steps to treat pages differently.
Richardson and Domingos proposed the "intelligent surfer" (as against the "random surfer").
Haveliwala avoids the overhead of a per-word pagerank computation by proposing a coarsening to a topic-specific pagerank.
Bar-Yossef et al. proposed an absorbing random walk to score a node for freshness/staleness of pages.
There's also been a proliferation of papers which apply Pagerank-style ideas for doing keyword searches in graphs derived from relational or XML data sources.
ObjectRank uses ideas very similar to Richardson and Domingos, but on (semi-)structured data graphs rather than the Web graph.
Faloutsos, McCurley