buchspektrum Online Bookshop

New Releases 2011

As of: 2020-01-07
Herderstraße 10
10625 Berlin
Tel.: 030 315 714 16
Fax 030 315 714 14
info@buchspektrum.de

Nidhi Tyagi

Information Retrieval System


A Domain Specific Parallel Crawler
2011. 92 pp.
Publisher/Year: VDM Verlag Dr. Müller 2011
ISBN: 3-639-37779-6 (3639377796)
New ISBN: 978-3-639-37779-8 (9783639377798)



The World Wide Web is an interlinked collection of billions of documents formatted using HTML. Because the web is large, growing, and dynamic, traversing and handling all the URLs found in web documents has become a challenge, which makes it imperative to parallelize the crawling process. The crawler is therefore parallelized as an ecology of crawler workers that download information from the web in parallel. This paper proposes a novel parallel crawler architecture based on domain-specific crawling. It makes the crawling task more effective and scalable, and shares the load among the different crawlers, which download in parallel the web pages related to URLs of different specific domains.
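The load-sharing idea in the description above can be illustrated with a minimal Python sketch. It is not the architecture from the book, which this listing does not detail. It assumes that "domain" means the host name of a URL, and every name in it (`partition_by_domain`, `crawl_domain`, `parallel_crawl`, `fake_fetch`) is hypothetical. The download step is replaced by a stub so the example runs without network access.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlparse


def partition_by_domain(urls):
    """Group URLs by host name so that each crawl task owns one domain."""
    groups = defaultdict(list)
    for url in urls:
        groups[urlparse(url).netloc.lower()].append(url)
    return dict(groups)


def crawl_domain(domain, urls, fetch):
    """Download every URL assigned to one domain, one after another."""
    return domain, {url: fetch(url) for url in urls}


def parallel_crawl(urls, fetch, max_workers=4):
    """Submit one crawl task per domain to a thread pool and collect results."""
    groups = partition_by_domain(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(crawl_domain, domain, domain_urls, fetch)
            for domain, domain_urls in groups.items()
        ]
        return dict(future.result() for future in futures)


def fake_fetch(url):
    # Assumption: stand-in for a real HTTP download, so no network is needed.
    return f"<html>{url}</html>"


seeds = [
    "http://example.com/a",
    "http://example.org/x",
    "http://example.com/b",
]
pages = parallel_crawl(seeds, fake_fetch)
```

Here `pages` maps each host name to the pages fetched for it. Because all URLs of one domain go to a single task, no two workers download from the same host at the same time, which is one simple way to share load across crawlers.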
Er. Nidhi Tyagi received her bachelor's degree (BE) in Computer Science & Engineering from Pune University. She holds a master's degree in Computer Engineering from Shobhit University, Meerut. She has more than ten years of teaching experience, and her areas of interest are Information Retrieval Systems, Data Mining, and Database Management Systems.