2016_02_24_approximate-_approach_to_order_preserving_matching_under_δγ_distances

Por: Juan Carlos Mendivelso

The problem of string matching consists of finding all the occurrences of a pattern string P=P[1..m] in a text string T[1..n]. It has important applications in several areas such as information retrieval and bioinformatics. However, exact string matching does not support all the applications. For instance, in some areas the alphabet is drawn from a set of integer values. These integer strings are normally found in cipher text, financial data, meteorology data, image data, and music data, to name some. In such numeric strings, it would be unrealistic and ineffective to search for exact occurrences of a pattern but rather ought to search for similar instances of it. For instance, order-preserving matching is a recently introduced variant of the string matching problem that searches

for substrings in the text whose natural representation matches the natural representation of the pattern. The natural representation of a string X is a string that contains the rankings of the characters occurring at each position of X. Then, order-preserving matching regards the internal structure of the strings rather than their absolute values. Consequently, it has important

applications in stock market analysis and music information retrieval. But such application areas require more flexibility: not only the substrings with exactly the same structure are of interest, but also the ones that are similar. In this project, we propose an approximate version of order-preserving matching based on the δγ- distances that permits an individual error between the ranking of corresponding symbols (bounded by δ) and a global error of all the positions (bounded by γ).

Publicado por Fundación Universitaria Konrad Lorenz El día 02/24/2017 Enlace permanente Comentarios (0)