[CLB09] Block clustering for web pages categorization
International peer-reviewed conference: Intelligent Data Engineering and Automated Learning - IDEAL 2009, September 2009, Vol. 5788, pp. 260-267, Series Lecture Notes in Computer Science, Burgos, Spain (DOI: 10.1007/978-3-642-04394-9_32)
Keywords: Text Mining, Block Clustering, Categorization, Clustering, Machine Learning, Data Mining, Web Mining, Natural Language Processing
Abstract:
With the growth of web-based applications and the increased
popularity of the World Wide Web (WWW), the WWW has become the
largest source of information available in the world, which makes
extracting relevant information increasingly difficult. Moreover, the content
of web sites is constantly changing, leading to continual changes in
Web users' behaviours. Therefore, there is significant interest in analysing
web content data to better serve users. Our proposed approach, which
is grounded on automatic textual analysis of a web site independently
of its usage, attempts to define groups of documents dealing with the
same topic. Both document clustering and word clustering are well-studied
problems. However, most existing algorithms cluster documents and
words separately rather than simultaneously. In this paper, we propose to
apply a block clustering algorithm to categorize the pages of a web site
according to their content. We report results of our recent testing of the
CROKI2 algorithm on a tourist web site.
Team:
msdma