About us
The Heterogeneous Data and Multimedia Management group of the University of Turin works in close collaboration with Arizona State University (ASU), where Maria Luisa Sapino is also an Adjunct Professor. Some of our projects are based at ASU, while others are led by the Computer Science Department of the University of Turin.
Major topics of interest of the group:
- Taxonomy matching and integration
Ontologies and concept taxonomies help software systems organize data more effectively for particular application domains. Ontologies also enable the sharing and integration of data from different domains and data sources. However, ontologies from different domains are rarely identical; thus, there is a need for techniques that find alignments between concepts in different ontologies and taxonomies. We are developing concept-vector-based schemes that capture the structural information inherent in taxonomies to facilitate structure-based matching of concepts across taxonomies. The structure- and content-based relationships among concepts, within the same taxonomy or across multiple taxonomies, are leveraged to drive new navigation mechanisms within a multimedia document space, leading in turn to new information retrieval and recommendation mechanisms.
- Metadata-driven table summarization
Table summarization is needed in scenarios where a large data set is hard to display. For instance, a scientist who is exploring candidate databases for further analysis may want to see concise yet informative summaries of large tables. Similarly, small devices, such as PDAs, cannot effectively present a large table of results on their small screens. We are proposing novel approaches to table summarization that formulate the problem with the help of domain lattices. In addition, we are studying the impact, in terms of both time complexity and result quality, of combining table summarization with domain lattice summarization.
- Visualization techniques for structured and unstructured data
We are developing innovative techniques to visualize information units that can be mined from large collections of structured and unstructured data. One source of unstructured information is textual documents, e.g., newspapers. Tag clouds are commonly used to present the frequently occurring tags or keywords of a collection to users. Most tag cloud visualizations vary font sizes to distinguish important tags from the rest. This, however, is sufficient neither to help the user explore and discover relationships between tags in a collection, nor to help track how these relationships change across time frames in dynamic collections. We are proposing alternative "contextual-layout" methods for presenting tags or keywords associated with dynamically evolving textual content, such as news streams. We first map tags onto a latent semantic space and then analyze the relationships between tags in this space. The resulting tag cloud is condensed into a hierarchy that captures contextual relationships between tags: descendant terms in the hierarchy occur within the context defined by their ancestor terms. This provides a mechanism for navigating the tag space as well as for contextually organizing the text documents. We are also developing novel techniques to visualize relationships among concepts extracted from large relational databases. These rely on novel layout mechanisms that highlight the relationships inherent in the data.
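To give a flavor of how a tag cloud can be condensed into a contextual hierarchy, here is a minimal Python sketch. It is not our actual method: it skips the latent semantic mapping altogether and applies a simple co-occurrence subsumption rule directly to the documents, so that a tag becomes a descendant of another when it occurs almost exclusively in documents that also carry the more frequent tag. The function name `build_tag_hierarchy`, the `threshold` parameter, and the toy news items are all assumptions made for illustration.

```python
from collections import defaultdict


def build_tag_hierarchy(docs, threshold=0.8):
    """Map each tag to its parent tag (or None for a root).

    docs: list of sets of tags, one set per document.
    Assumed rule (a stand-in for the latent-space analysis): `p` may be
    a parent of `t` when at least `threshold` of the documents that
    contain `t` also contain `p`, and `p` occurs in strictly more
    documents than `t`.
    """
    # Index: tag -> ids of the documents in which it occurs.
    occurrences = defaultdict(set)
    for doc_id, tags in enumerate(docs):
        for tag in tags:
            occurrences[tag].add(doc_id)

    parents = {}
    for tag, tag_docs in occurrences.items():
        best_parent, best_size = None, None
        for other, other_docs in occurrences.items():
            # A parent must be a different, strictly more frequent tag.
            if other == tag or len(other_docs) <= len(tag_docs):
                continue
            overlap = len(tag_docs & other_docs) / len(tag_docs)
            if overlap < threshold:
                continue
            # Keep the most specific qualifying ancestor (fewest
            # documents); ties are broken alphabetically.
            candidate = (len(other_docs), other)
            if best_size is None or candidate < (best_size, best_parent):
                best_parent, best_size = other, len(other_docs)
        parents[tag] = best_parent
    return parents


# Toy news stream: each set holds the tags of one article.
news = [
    {"sports", "football", "transfer"},
    {"sports", "football"},
    {"sports", "tennis"},
    {"politics", "election"},
    {"politics", "election", "poll"},
    {"politics"},
    {"sports"},
]

hierarchy = build_tag_hierarchy(news)
for tag in sorted(hierarchy):
    print(tag, "<-", hierarchy[tag])
```

In this toy example "transfer" ends up under "football", which in turn sits under "sports", mirroring the idea that descendant terms occur within the context defined by their ancestors. Recomputing the hierarchy per time frame would be one simple way to observe how such relationships drift in a dynamic collection.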