Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, 2007, pp. 213-222.
This paper addresses the problem of making text mining results more comprehensible to humanities scholars, journalists, intelligence analysts, and other researchers, in order to support the analysis of text collections. Our system, FeatureLens, visualizes a text collection at several levels of granularity and enables users to discover interesting text patterns. Text patterns are defined as frequent itemsets of n-grams, and they capture the repetition of exact or similar expressions in the collection. Users can find meaningful co-occurrences of text patterns by visualizing them within and across documents in the collection. This also lets users identify how usage evolves over time, such as the increasing or decreasing use, or the sudden appearance, of text patterns. Initial studies suggest that the proposed visualization helped a literary scholar and eight advanced-degree users generate new hypotheses and gain interesting insights about two analyzed text collections.
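To make the definition of a text pattern concrete, the following is a minimal Python sketch of mining frequent itemsets of n-grams. It is an illustration under stated assumptions, not the FeatureLens implementation: the abstract does not say which mining algorithm is used, so the sketch treats each text unit (for example a paragraph) as a transaction whose items are its word trigrams, and it stops at itemsets of size two. The function names `word_ngrams` and `frequent_ngram_pairs`, the whitespace tokenization, and the `min_support` threshold are all hypothetical choices made for this example.

```python
from collections import Counter
from itertools import combinations


def word_ngrams(text, n=3):
    """Return the set of word n-grams in one text unit.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def frequent_ngram_pairs(units, n=3, min_support=2):
    """Find n-grams, and pairs of n-grams, present in at least min_support units.

    Returns (frequent_singles, frequent_pairs), both mapping to unit counts.
    """
    # Each text unit becomes a transaction: the set of n-grams it contains.
    transactions = [word_ngrams(unit, n) for unit in units]

    # Pass 1: count how many units contain each n-gram.
    single_counts = Counter(g for t in transactions for g in t)
    frequent = {g for g, c in single_counts.items() if c >= min_support}

    # Pass 2: count co-occurring pairs built only from frequent n-grams.
    # Any frequent pair must consist of frequent items, so rare n-grams
    # can be pruned before pairing (the Apriori property).
    pair_counts = Counter()
    for t in transactions:
        kept = sorted(t & frequent)  # sorted so each pair has one canonical order
        pair_counts.update(combinations(kept, 2))

    frequent_singles = {g: single_counts[g] for g in frequent}
    frequent_pairs = {p: c for p, c in pair_counts.items() if c >= min_support}
    return frequent_singles, frequent_pairs


# Hypothetical toy collection of three text units.
units = [
    "I have a dream that one day",
    "I have a dream today",
    "we hold these truths",
]
singles, pairs = frequent_ngram_pairs(units, n=3, min_support=2)
for pair, support in pairs.items():
    print(pair, support)
```

In this toy collection the trigrams "i have a" and "have a dream" each occur in two units and co-occur in both, so together they form one frequent pair. Recording which unit each itemset appears in, rather than only the count, would give the positional data needed to plot patterns within and across documents.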