Efficient adaptive retrieval and mining in large multimedia databases
Assent, Ira; Seidl, Thomas (Thesis advisor)
Aachen : Publikationsserver der RWTH Aachen University (2008)
Dissertation / PhD Thesis
Multimedia data, ranging from images to videos and time series, is created in numerous scientific, commercial and home applications. Accessing the increasingly large data volumes stored in multimedia databases, whether to retrieve similar objects or to generate an overview of the entire content, is a core task. Examples include retrieval of similar magnetic resonance images for diagnostic purposes, or automatic detection of customer segments for sales promotion. Meaningful retrieval and pattern detection require content-based methods that describe the relevant characteristics of multimedia objects. As opposed to manual keyword annotation techniques, which are typically infeasible for large data volumes, content-based approaches use similarity models to process multimedia data. Similarity models specify appropriate features and their relationship for effective content-based access. As most multimedia features require many different attributes, the high dimensionality of multimedia features and huge database sizes are major challenges for efficient and effective retrieval and mining. In this work, two very common feature types for multimedia data are studied: histogram and time series data. Histograms are used for a variety of features such as color, shape or texture. Time series data is prevalent in sensor measurements and stock data, and may even be used to represent shapes and other features. For these data types, effective adaptable similarity models are usually computationally far too complex for use in large, high-dimensional multimedia databases. Therefore, efficient algorithms for these effective models are proposed. In this work, indexing techniques are used that allow for efficient query processing and mining by restricting the search space to task-relevant data. Multistep filter-and-refine approaches using novel filter functions with quality guarantees ensure that fast response times are achieved without any loss of result accuracy.
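The filter-and-refine idea mentioned at the end of the abstract can be illustrated with a minimal sketch. It is not the thesis's method: the Euclidean distance, the filter that looks only at the first few dimensions, and the names `exact_distance`, `filter_distance` and `multistep_knn` are all assumptions chosen for illustration. The only property the sketch relies on is that the filter never overestimates the exact distance (a lower bound), which is what allows candidates to be discarded without losing any true result.

```python
import heapq
import math


def exact_distance(a, b):
    # Stand-in for an expensive similarity model (assumption: plain Euclidean).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def filter_distance(a, b, dims=2):
    # Cheap filter (assumption): Euclidean distance on the first `dims`
    # dimensions only. Dropping non-negative squared terms can only shrink
    # the sum, so this never exceeds exact_distance.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a[:dims], b[:dims])))


def multistep_knn(query, database, k):
    # Step 1 (filter): rank all objects by their cheap lower-bound distance.
    ranked = sorted(
        (filter_distance(query, obj), i) for i, obj in enumerate(database)
    )
    best = []     # max-heap of the k best so far, stored as (-distance, index)
    refined = 0   # number of expensive exact distance computations
    for lower_bound, i in ranked:
        # Once the lower bound exceeds the current k-th exact distance, no
        # remaining object can enter the result, so the search may stop.
        if len(best) == k and lower_bound > -best[0][0]:
            break
        # Step 2 (refine): compute the exact distance for this candidate.
        d = exact_distance(query, database[i])
        refined += 1
        if len(best) < k:
            heapq.heappush(best, (-d, i))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, i))
    return sorted((-neg_d, i) for neg_d, i in best), refined
```

Because the filter is a lower bound, the result matches a full scan with the exact distance, while `refined` reports how many expensive computations were actually needed. How much work is saved depends entirely on how tight the filter is for the data at hand.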