Digital libraries / Data quality / Archival science / Web archiving / Data management / Robots exclusion standard / Link rot / Web ARChive / Heritrix / Computing / Information / World Wide Web


CLEAR: a credible method to evaluate website archivability
Vangelis Banos†, Yunhyong Kim‡, Seamus Ross‡, Yannis Manolopoulos†
†Aristotle University of Thessaloniki, Greece; ‡University of Glasgow, UK

File Size: 296.87 KB

City

Madrid / San Francisco / Wroclaw / Vienna / Lyon / Dublin / Banff / Rome / Edinburgh

Company

BBC / N. Press / Enterprise Information Systems / MySQL

Country

France / Canada / United Kingdom / Italy / New Zealand / Poland / Austria / Spain

Event

Product Release

Facility

Aristotle University of Thessaloniki / University of Glasgow / National Library of New Zealand / British Library

IndustryTerm

streaming media / server static web content / web archive content / web applications / predefined proprietary software / web server / blog specific technologies / web application / web archiving initiatives / web spider / web professionals / web archives / web archive quality / risk management / web aggregation practices / web application implementing / web site resources / web archivists / referenced media content / web crawling software / web harvester / Web Dynamics / target site / particular site / empirical web aggregation practices / web archiving process / Web archiving / particular web archiving research task / web spiders / web archiving workflow / web resources / web curator tool / web archiving tasks / proprietary software / web application execution model / selective web archiving / different services / web archive operators / web crawler / crawler technology / reference media content / different applications / Internet Archive / web content / web content identification / external systems / Internet Preservation Consortium / web browsers / archival quality web crawler / prototype web application / deep web entity pages / web service / web preservation / web archive / Internet Engineering Task Force RFC / influence web design professionals / web bots / format validation tool / Web-At-Risk project / web interface / web crawler technology / web developer / invalid web content / web servers / web archiving projects / Web-based services / web archiving standards / web archiving community / search engines / thorough tool / website archivability tool / web archiving systems / main web harvesting application / Web Archiving Workshop / validation tool / Web Search / initiative used open source web services / web archiving workflows / web technologies / erroneous web archives / open source software Web Curator Tool / web crawlers / technology stack

Organization

General Assembly / Digital Preservation Coalition / Aristotle University of Thessaloniki / European Commission / Harvard / University of Glasgow / Congress / Preserving government

Person

Yannis Manolopoulos / Seamus Ross / Priscilla Caplan / Lorna Campbell / Michael Day / D. Xin / S. Joe

Position

General / author / curator

Product

Credible Live Evaluation

ProgrammingLanguage

XML / JavaScript / FP / Python / HTML

ProvinceOrState

Washington / California

PublishedMedium

The VLDB Journal / D-Lib Magazine

Technology

XML / JSON / Robots.txt protocol / operating system / HTML / Sitemaps protocol / Sitemap.xml protocol / Resource Description Framework (RDF) / Flash / JPEG / blog specific technologies / streaming media / PDF / crawler technology / web technologies / web crawler technology / HTTP / Data Mining / web server

SocialTag