World Wide Web / Uniform resource identifier / Uniform resource locator / Robots exclusion standard / Port / Sitemaps / Web crawlers / Information science / Computing


Mercator: A Scalable, Extensible Web Crawler
Allan Heydon and Marc Najork
Compaq Systems Research Center
130 Lytton Ave.
Palo Alto, CA 94301
{heydon,najork}@pa.dec.com

Document Date: 2006-09-08 22:19:53



File Size: 109,95 KB


Company

Google / AltaVista

Facility

Omitted port

IndustryTerm

Internet service providers / cross-site / appropriate network protocol / malicious web servers / Web conferences / document processing code / Courteous web crawlers / local name server / Internet Name Domain / basic algorithm / web documents / Internet Archive crawler / interpolated binary search / web servers / extensible web crawler / web crawling problems / disk search / binary search / search engines / web crawling / Internet Archive / web server / given web server / web crawler design / computing / actual world wide web crawls / extensible web crawlers / Web Crawler The / Internet Archive crawlers / search engine / authoritative server / malicious web server / internet crawls / web crawlers / web crawler / web services

OperatingSystem

Unix

Organization

Compaq Systems Research Center / Domain Name Service

Person

Rabin / Mark Manasse / Matthew Gray / Marc Najork

Position

Extractor / Walker / thread scheduler / head / random walker / Link Extractor

Product

URL / Broder / Mercator

ProgrammingLanguage

Java / HTML

ProvinceOrState

Manitoba

Technology

Gopher protocols / HTTP protocol / fingerprinting algorithm / User-supplied protocol / appropriate network protocol / Unix / 3.3 The HTTP Protocol / search engine / random access / operating system / 4.1 Protocol / HTML / network protocol / GIF / DNS / Java / abstract Protocol / ISP / HTTP / caching / web server / network protocols / basic algorithm

URL

http /

SocialTag