Dublin / Computer Vision / Hannover / New York / Marrakech / New York City / /
Company
K. Vieira A. S. / Google / Link Density (P/C/N) Dim Precision / B. Ribeiro-Neto A. S. / Sixth International Language Resources / /
Country
Germany / South Africa / United States / Canada / Australia / United Kingdom / Morocco / India / /
Currency
USD / /
Event
Product Issues / Product Recall / /
Facility
National Institute of Standards and Technology / In Building / Lucene IR library / /
IndustryTerm
Web-scale solution / opinion mining / web page structure / web-page segmentation / web-page cleaning tool / Web corpus / Web content / Web intrapage informative structure mining / power-law / web page templates / Web Corpora / tree-based algorithms / Web browser / data mining / news search engine / Web page segments / Web service / web-page content identification / WEB PAGE FEATURES Feature Levels / opinion mining pipeline / web template content / web page template detection / Web Document Modeling / Web search engines / search precision / Web-as-Corpus / data management / Web information / web search / 1R algorithm / mining / web page segmentation / form factor devices / actual content Web pages / large web-derived corpus / search engine / /
MarketIndex
set 675 / /
Organization
European Language Resources Association / National Science Foundation / Wolfgang Nejdl L3S Research Center / National Institute of Standards and Technology / T PT / Federal Government / Leibniz Universität Hannover Appelstr / /
Person
Web / Wolfgang Nejdl / /
Position
simple and plausible stochastic model for describing the boilerplate creation process / author / Shannon random writer / simple random writer / representative / /
Product
Lucene IR / Hannover / Leaves / NumWords/LinkDensity / /
ProgrammingLanguage
HTML / /
ProvinceOrState
New York / /
PublishedMedium
the Google News / /
TVShow
Shannon / /
Technology
same algorithms / BOILERPLATE Algorithm / decision-tree-based algorithms / search engine / machine learning / HTML / Terms Algorithms / Knowledge Management / Automatic identification / Pasternack algorithm / data mining / BTE algorithm / DOM / document object model / 1R algorithm / evaluated algorithms / /