Crawling a Country: Better Strategies than Breadth-First for Web Page Ordering
Ricardo Baeza-Yates, Carlos Castillo

Document Date: 2005-04-05 09:16:25
File Size: 269,07 KB

City: Santa Clara / Dallas / Cancun / Santiago / Dunedin / Amsterdam / Cairo / Honolulu / Riberao Preto / San Jose / Geneva / Chiba / Brisbane / New York / Caesarea / Rome / Roanoke / Cambridge / Lisbon / London

Company: GPL Software / Kendall / Next Generation Information Technologies / ACM Press / Google / IEEE CS Press / Intel

Country: Switzerland / Netherlands / Egypt / Japan / Brazil / Australia / Portugal / United Kingdom / Israel / Italy / New Zealand / Chile / United States / Greece

Currency: USD / pence

IndustryTerm: power-law distribution / real Web search / Web structure / infinite Web / web frontier / internet portals / actual Web / Web Conference / shark-search algorithm / hidden web / web robots session / eigenvector-based reputation systems / 43th Web servers / Web site administrators / Web Research Universidad de Concepcion / real Web crawlers / Web site mapping / topic-specific web resource discovery / Internet Technology / given Web site / estimated search length / Web search engines / Web crawling / Internet Archive / citation algorithm / web load / Web graph / Web graphs / actual Web graphs / Web Worm / Web crawler / spam Web pages / given server / actual Web crawl / fixed standard server / Web crawler architectures / Web Research Universidad de Chile rbaeza@dcc.uchile.cl ccastillo@dcc.uchile.cl Mauricio Marin Andrea Rodriguez Center / web collections / active Web sites / large Web sites / large-scale hypertextual Web search engine / breadth-first search / large Web / on-line page importance computation / search engine spamming / search engine perspective / Web page heaps / Web graph representing / Web Techniques / re-crawl Web / Web Servers / Web characterization / web-based services / extensible web crawler / topic-driven web crawlers / search engines / search engine copy / distributed web crawler / personal search agent / actual Web crawls / topmost Web site / Web search / real Web / active Web site / Web page importance / incremental web crawler / site-specific web crawlers / search engine / search results / Web crawlers / Web server cooperation

OperatingSystem: Linux / GNU

Organization: Univ. of Washington / Universidad de Chile / International World Wide Web Conference Committee / Web Page Ordering Ricardo Baeza-Yates Carlos Castillo Center for Web Research Universidad / Universidad de Concepcion / Center for Web Research Universidad / UCLA / Population Division / United Nations

Person: W. Edward G. Coffman / Hist / Morgan Kaufmann / Mauricio Marin Andrea / Ricardo Baeza-Yates Carlos

Position: General / random surfer / Economist / surfer / scheduler

Product: Pagerank

ProvinceOrState: Virginia / Hawaii / New York / California / Texas / Massachusetts

PublishedMedium: Theoretical Computer Science / Lecture Notes in Computer Science

Technology: RAM / Linux / filtering processor / Internet Technology / shark-search algorithm / Scheduling algorithms / Dom / Pagerank citation algorithm / search engine / machine learning / ISDN / Pagerank algorithm / HTTP / Data Mining / simulation / crawling algorithms / ranking and crawling algorithms / Web server

URL: http

SocialTag: Computing / Search engine optimization / Web crawlers / Markov models / Link analysis / PageRank / Focused crawler / Backlink / Invisible Web / World Wide Web