Maximal pair
In computer science, a maximal pair within a string is a pair of matching substrings that are maximal, where "maximal" means that it is not possible to make a longer matching pair by extending the range of both substrings to the left or right.
Example
Index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
Character | x | a | b | c | y | a | b | c | w | a | b | c | y | z |
For example, in this table, the substrings at indices 2 to 4 (in red) and indices 6 to 8 (in blue) are a maximal pair, because they contain identical characters (abc
), and they have different characters to the left (x
at index 1 and y
at index 5) and different characters to the right (y
at index 5 and w
at index 9). Similarly, the substrings at indices 6 to 8 (in blue) and indices 10 to 12 (in green) are a maximal pair.
However, the substrings at indices 2 to 4 (in red) and indices 10 to 12 (in green) are not a maximal pair, as the character y
follows both substrings, and so they can be extended to the right to make a longer pair.
Formal definition
Formally, a maximal pair of substrings with starting positions and respectively, and both of length , is specified by a triple , such that, given a string of length , (meaning that the substrings have identical contents), but (they have different characters to their left) and (they also have different characters to their right; together, these two inequalities are the condition for being maximal). Thus, in the example above, the maximal pairs are (the red and blue substrings) and (the green and blue substrings), and is not a maximal pair.
Related concepts and time complexity
A maximal repeat is the string represented by a maximal pair. A supermaximal repeat is a maximal repeat never occurring as a proper substring of another maximal repeat. In the above example, abc
and abcy
are both maximal repeats, but only abcy
is a supermaximal repeat.
Maximal pairs, maximal repeats and supermaximal repeats can each be found in time using a suffix tree,[1] if there are such structures.
References
- Gusfield, Dan (1999) [1997]. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. USA: Cambridge University Press. p. 143. ISBN 0-521-58519-8.
External links
- Project for the computation of all maximal repeats in one ore more strings in Python, using suffix array.