Dedoop: Efficient Deduplication with Hadoop
Lars Kolb, Andreas Thor, Erhard Rahm

Date: 2012-08-30 08:45:36
Keywords: Computing; Concurrent computing; Parallel computing; Workflow technology; Distributed computing architecture; Apache Software Foundation; Cloud infrastructure; MapReduce; Apache Hadoop; Record linkage; Workflow; Data cleansing
Source URL: dbs.uni-leipzig.de
File Size: 1,05 MB

	SparkTrails: A MapReduce Implementation of HypTrails for Comparing Hypotheses About Human Trails Martin Becker Hauke Mewes DocID: 1xUtT
	Scaling Up with MapReduce, Hadoop, and Amazon Term Frequency Kenneth Lay DocID: 1vprX
	Simulation of Large-Scale Distributed Systems with MapReduce 杉野好宏† 華井 DocID: 1uWY0
	Densest Subgraph in Streaming and MapReduce Bahman Bahmani Ravi Kumar Sergei Vassilvitskii DocID: 1uBLZ
	1/7/16 More Go, Lab 1 Hints, and MapReduce Tom Anderson DocID: 1utqA