Using Symbolic Objects to Cluster Web Documents
Web clustering is useful for several activities on the WWW, from automatically building web directories to improving retrieval performance. Nevertheless, due to the huge size of the web, a linear-time mechanism must be employed to cluster web documents. k-means is a classic algorithm used for this problem. We present a variant of the vector model to be used with the k-means algorithm. Our representation uses symbolic objects for clustering web documents. Initial experiments gave positive results, and we are optimistic about future work.
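To make the baseline concrete, the following is a minimal sketch of plain k-means over term-frequency vectors, i.e. the classic vector model that a symbolic-object representation would replace. It is not the representation proposed here. The function names (`tf_vector`, `cosine_distance`, `kmeans`), the toy documents, the cosine distance, and the choice of the first k documents as initial centroids are all assumptions made for illustration. Each iteration compares every document with every centroid once, so its cost grows linearly with the number of documents.

```python
import math
from collections import Counter


def tf_vector(text, vocabulary):
    """Map a document to raw term frequencies over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in vocabulary]


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def kmeans(vectors, k, iterations=10):
    """Plain k-means; cost per iteration is O(n * k * d)."""
    # Assumption: the first k vectors seed the centroids (deterministic).
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each document to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: cosine_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy collection standing in for web documents.
docs = [
    "football match goal",
    "stock market shares",
    "football team goal goal",
    "market shares price",
]
vocabulary = sorted({word for d in docs for word in d.lower().split()})
vectors = [tf_vector(d, vocabulary) for d in docs]
labels, centroids = kmeans(vectors, k=2)
print(labels)
```

In this sketch the two sports documents and the two finance documents share no terms across groups, so they fall into separate clusters. A real web collection would need a much larger vocabulary and a less naive initialization.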