
dc.contributor.author Daggupati, Bhanupreeti en
dc.date.accessioned 2013-03-12T18:43:21Z en
dc.date.available 2013-03-12T18:43:21Z en
dc.date.copyright 2011 en
dc.date.issued 2013-03-12 en
dc.identifier.uri http://hdl.handle.net/10139/6429 en
dc.description.abstract There has been exponential growth of data over the last decade in both the public and private domains. This thesis presents an unsupervised mechanism for identifying duplicates that refer to the same real-world entity. Because the algorithm is unsupervised, no manual labeling of training data is needed. The thesis builds on this idea by introducing an additional classifier, known as the blocking classifier. Experiments are conducted on a dataset to verify the effectiveness of the unsupervised algorithm in general and of the additional blocking classifier in particular. en
dc.language.iso en_US en
dc.rights All rights reserved to author and California State University Channel Islands en
dc.subject Unsupervised duplicate detection en
dc.subject Web databases en
dc.subject Blocking classifier en
dc.subject Query results en
dc.subject Search engine en
dc.subject Computer Science thesis en
dc.title Unsupervised Duplicate Detection (UDD) of Query Results from Multiple Web Databases en
dc.type Thesis en

