coalib.parsing.StringProcessing package

Submodules

coalib.parsing.StringProcessing.Core module

coalib.parsing.StringProcessing.Core.escape(string, escape_chars, escape_with='\\')

Escapes every occurrence of the given characters inside the given string.

Parameters:
  • string – The string in which to escape characters.
  • escape_chars – The string or Iterable that contains the characters to escape. Each char inside this string will be escaped in the order given. Duplicate chars are allowed.
  • escape_with – The string that should be used as the escape sequence.
Returns:

The escaped string.
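
The behaviour described above can be illustrated with a minimal standalone sketch. It is not the coalib implementation; the name escape_sketch is hypothetical, and the sketch assumes that escaping means prefixing each occurrence of a character with escape_with.

```python
def escape_sketch(string, escape_chars, escape_with="\\"):
    """Prefix every occurrence of each char in escape_chars with escape_with."""
    for char in escape_chars:
        # Chars are processed in the order given, so a duplicated char is
        # escaped a second time on its second pass (assumed behaviour).
        string = string.replace(char, escape_with + char)
    return string


print(escape_sketch("a.b*c", ".*"))
print(escape_sketch("50%", "%", escape_with="%"))
```

Processing the characters one after another is what makes the order of escape_chars observable in the result.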

coalib.parsing.StringProcessing.Core.nested_search_in_between(begin, end, string, max_matches=0, remove_empty_matches=False, use_regex=False)

Searches for a string enclosed between a specified begin- and end-sequence. Enclosed newline characters (\n) are also included in the result. Doesn’t handle escape sequences, but supports nesting.

Nested sequences are ignored during the match, so only the first nesting level is returned. To acquire deeper levels, invoke this function again on the returned value.

Using the same begin- and end-sequence won’t match anything.

Parameters:
  • begin – A pattern that defines where to start matching.
  • end – A pattern that defines where to end matching.
  • string – The string to search in.
  • max_matches – Defines the maximum number of matches. If 0 or less is provided, the number of matches is not limited.
  • remove_empty_matches – Defines whether empty entries should be removed from the result. An entry is considered empty if no inner match was performed (regardless of matched start and end patterns).
  • use_regex – Specifies whether to treat the begin and end patterns as regexes or simple strings.
Returns:

An iterator returning the matched strings.
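
The first-level matching described above can be sketched with a simple depth counter. This is a hypothetical illustration for plain-string delimiters only (no regex, no max_matches); the name first_level_between_sketch is not part of coalib.

```python
def first_level_between_sketch(begin, end, string):
    """Yield the text of each outermost begin...end pair in string."""
    if not begin or not end:
        raise ValueError("begin and end must be non-empty")
    depth = 0
    start = None
    i = 0
    while i < len(string):
        if string.startswith(begin, i):
            if depth == 0:
                # Remember where the outermost match starts.
                start = i + len(begin)
            depth += 1
            i += len(begin)
        elif depth > 0 and string.startswith(end, i):
            depth -= 1
            if depth == 0:
                yield string[start:i]
            i += len(end)
        else:
            i += 1


outer = list(first_level_between_sketch("(", ")", "a(b(c)d)e(f)"))
print(outer)
# Re-invoking on a result descends one nesting level.
print(list(first_level_between_sketch("(", ")", outer[0])))
```

Because the begin check comes first, identical begin- and end-sequences only ever increase the depth, so nothing is yielded, which mirrors the note above.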

coalib.parsing.StringProcessing.Core.position_is_escaped(string, position=None)

Checks whether a char at a specific position of the string is preceded by an odd number of backslashes.

Parameters:
  • string – Arbitrary string
  • position – Position of character in string that should be checked
Returns:

True if the character is escaped, False otherwise
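
A minimal sketch of the odd-number-of-backslashes check, written independently of coalib. The name position_is_escaped_sketch is hypothetical, and treating position=None as "the end of the string" is an assumption.

```python
def position_is_escaped_sketch(string, position=None):
    """Return True if an odd number of backslashes precedes position."""
    # Slicing with None keeps the whole string, so the default inspects
    # the backslashes at the very end of the string (assumed behaviour).
    prefix = string[:position]
    backslashes = len(prefix) - len(prefix.rstrip("\\"))
    return backslashes % 2 == 1


print(position_is_escaped_sketch('a\\"b', 2))    # one backslash before the quote
print(position_is_escaped_sketch('a\\\\"b', 3))  # two backslashes before the quote
```

An even count means the backslashes escape each other, which leaves the character itself unescaped.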

coalib.parsing.StringProcessing.Core.search_for(pattern, string, flags=0, max_match=0, use_regex=False)

Searches for a given pattern in a string.

Parameters:
  • pattern – A pattern that defines what to match.
  • string – The string to search in.
  • flags – Additional flags to pass to the regex processor.
  • max_match – Defines the maximum number of matches to perform. If 0 or less is provided, the number of matches is not limited.
  • use_regex – Specifies whether to treat the pattern as a regex or simple string.
Returns:

An iterator returning MatchObjects.
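
How the parameters interact can be sketched with the standard re module. This is an assumption-laden illustration, not the coalib source; search_for_sketch is a hypothetical name.

```python
import itertools
import re


def search_for_sketch(pattern, string, flags=0, max_match=0, use_regex=False):
    """Return an iterator of match objects for pattern in string."""
    if not use_regex:
        # Treat the pattern as a plain string by escaping regex metachars.
        pattern = re.escape(pattern)
    matches = re.finditer(pattern, string, flags)
    if max_match > 0:
        # 0 or less means "no limit", so only slice for positive values.
        matches = itertools.islice(matches, max_match)
    return matches


text = "a.b a.c axd"
print([m.group() for m in search_for_sketch("a.", text)])
print([m.group() for m in search_for_sketch("a.", text, use_regex=True)])
```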

coalib.parsing.StringProcessing.Core.search_in_between(begin, end, string, max_matches=0, remove_empty_matches=False, use_regex=False)

Searches for a string enclosed between a specified begin- and end-sequence. Enclosed newline characters (\n) are also included in the result. Doesn’t handle escape sequences.

Parameters:
  • begin – A pattern that defines where to start matching.
  • end – A pattern that defines where to end matching.
  • string – The string to search in.
  • max_matches – Defines the maximum number of matches. If 0 or less is provided, the number of matches is not limited.
  • remove_empty_matches – Defines whether empty entries should be removed from the result. An entry is considered empty if no inner match was performed (regardless of matched start and end patterns).
  • use_regex – Specifies whether to treat the begin and end patterns as regexes or simple strings.
Returns:

An iterator returning InBetweenMatch objects that hold information about the matched begin, inside and end strings.
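
A rough sketch of non-nested in-between matching, assuming plain-string delimiters and a non-greedy regex. It yields plain (begin, inside, end) tuples instead of InBetweenMatch objects; the name search_in_between_sketch is hypothetical.

```python
import re


def search_in_between_sketch(begin, end, string, remove_empty_matches=False):
    """Yield (begin, inside, end) tuples for each begin...end occurrence."""
    regex = re.escape(begin) + "(.*?)" + re.escape(end)
    # re.DOTALL lets "." match newlines, so enclosed "\n" chars are kept.
    for match in re.finditer(regex, string, re.DOTALL):
        inside = match.group(1)
        if remove_empty_matches and not inside:
            continue
        yield (begin, inside, end)


print(list(search_in_between_sketch("(", ")", "f(a\nb) g()")))
print(list(search_in_between_sketch("(", ")", "f(a\nb) g()", True)))
```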

coalib.parsing.StringProcessing.Core.split(pattern, string, max_split=0, remove_empty_matches=False, use_regex=False)

Splits the given string by the specified pattern. The newline character (\n) is not a natural split pattern (unless you specify it yourself). This function ignores escape sequences.

Parameters:
  • pattern – A pattern that defines where to split.
  • string – The string to split by the defined pattern.
  • max_split – Defines the maximum number of splits. If 0 or less is provided, the number of splits is not limited.
  • remove_empty_matches – Defines whether empty entries should be removed from the result.
  • use_regex – Specifies whether to treat the split pattern as a regex or simple string.
Returns:

An iterator returning the split up strings.
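
The parameters above can be illustrated with a small sketch built on re.split. It is a simplification under stated assumptions: empty entries are filtered after splitting, which may differ from how coalib combines max_split with remove_empty_matches. The name split_sketch is hypothetical.

```python
import re


def split_sketch(pattern, string, max_split=0, remove_empty_matches=False,
                 use_regex=False):
    """Yield the pieces of string separated by pattern."""
    if not use_regex:
        pattern = re.escape(pattern)
    # re.split treats maxsplit=0 as "unlimited"; clamp negatives to 0.
    parts = re.split(pattern, string, maxsplit=max(max_split, 0))
    for part in parts:
        if remove_empty_matches and not part:
            continue
        yield part


print(list(split_sketch(",", "a,,b\nc")))
print(list(split_sketch(",", "a,,b\nc", remove_empty_matches=True)))
print(list(split_sketch(",", "a,,b\nc", max_split=1)))
```

Note that the newline inside "b\nc" does not cause a split, since only the given pattern separates entries.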

coalib.parsing.StringProcessing.Core.unescape(string)

Trims off all escape characters from the given string.

Parameters:
  • string – The string to unescape.

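One plausible reading of "trimming escape characters" is sketched below: each backslash is dropped and the character it escapes is kept, while a dangling trailing backslash is removed. This is an assumption, not the confirmed coalib behaviour; unescape_sketch is a hypothetical name.

```python
import re


def unescape_sketch(string):
    """Remove escaping backslashes, keeping the chars they escape."""
    # "\\(.)" keeps the char after a backslash; "\\$" drops a trailing one.
    return re.sub(r"\\(.)|\\$", lambda m: m.group(1) or "", string,
                  flags=re.DOTALL)


print(unescape_sketch("a\\.b"))
print(unescape_sketch("a\\\\b"))
```
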
coalib.parsing.StringProcessing.Core.unescaped_rstrip(string)

Strips unescaped whitespace from the right side of the given string.

Parameters:
  • string – The string to strip whitespace from.
Returns:

The right-stripped string.

coalib.parsing.StringProcessing.Core.unescaped_search_for(pattern, string, flags=0, max_match=0, use_regex=False)

Searches a string for occurrences of a given pattern that are not escaped.

Parameters:
  • pattern – A pattern that defines what to match unescaped.
  • string – The string to search in.
  • flags – Additional flags to pass to the regex processor.
  • max_match – Defines the maximum number of matches to perform. If 0 or less is provided, the number of matches is not limited.
  • use_regex – Specifies whether to treat the pattern as a regex or simple string.
Returns:

An iterator returning MatchObjects.
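
Conceptually, an unescaped search is a normal search whose hits are filtered by an escape check. The sketch below shows that idea with standalone helpers; both function names are hypothetical and the code is not taken from coalib.

```python
import re


def _is_escaped(string, position):
    """True if an odd number of backslashes precedes position."""
    prefix = string[:position]
    return (len(prefix) - len(prefix.rstrip("\\"))) % 2 == 1


def unescaped_search_for_sketch(pattern, string, flags=0, max_match=0,
                                use_regex=False):
    """Yield match objects whose start position is not escaped."""
    if not use_regex:
        pattern = re.escape(pattern)
    found = 0
    for match in re.finditer(pattern, string, flags):
        if _is_escaped(string, match.start()):
            continue  # skip escaped occurrences
        yield match
        found += 1
        if max_match > 0 and found >= max_match:
            return


# The first ";" is preceded by a backslash and is therefore skipped.
print([m.start() for m in unescaped_search_for_sketch(";", "a\\;b;c;d")])
```

In this sketch the limit counts only unescaped matches, so escaped hits do not use up max_match.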

coalib.parsing.StringProcessing.Core.unescaped_search_in_between(begin, end, string, max_matches=0, remove_empty_matches=False, use_regex=False)

Searches for a string enclosed between a specified begin- and end-sequence. Enclosed newline characters (\n) are also included in the result. Handles escaped begin- and end-sequences, so only unescaped patterns are matched.

Warning

Using the escape character ‘\’ in the begin- or end-sequences can cause the function to return strange results. The backslash can interfere with the escaping regex-sequence used internally to match the enclosed string.

Parameters:
  • begin – A regex pattern that defines where to start matching.
  • end – A regex pattern that defines where to end matching.
  • string – The string to search in.
  • max_matches – Defines the maximum number of matches. If 0 or less is provided, the number of matches is not limited.
  • remove_empty_matches – Defines whether empty entries should be removed from the result. An entry is considered empty if no inner match was performed (regardless of matched start and end patterns).
  • use_regex – Specifies whether to treat the begin and end patterns as regexes or simple strings.
Returns:

An iterator returning the matched strings.
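
The escape-aware matching can be sketched without regexes by scanning for the next unescaped delimiter. This is an illustrative simplification for plain-string delimiters only; all names below are hypothetical and not part of coalib.

```python
def _is_escaped(string, position):
    """True if an odd number of backslashes precedes position."""
    prefix = string[:position]
    return (len(prefix) - len(prefix.rstrip("\\"))) % 2 == 1


def _find_unescaped(needle, string, start):
    """Return the index of the next unescaped needle, or -1."""
    pos = string.find(needle, start)
    while pos != -1 and _is_escaped(string, pos):
        pos = string.find(needle, pos + 1)
    return pos


def unescaped_between_sketch(begin, end, string):
    """Yield the text between unescaped begin and end delimiters."""
    if not begin or not end:
        raise ValueError("begin and end must be non-empty")
    pos = 0
    while True:
        b = _find_unescaped(begin, string, pos)
        if b == -1:
            return
        inner = b + len(begin)
        e = _find_unescaped(end, string, inner)
        if e == -1:
            return
        yield string[inner:e]
        pos = e + len(end)


# The escaped quote inside the first literal does not end the match.
print(list(unescaped_between_sketch('"', '"', 'say "a \\" b" and "c"')))
```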

coalib.parsing.StringProcessing.Core.unescaped_split(pattern, string, max_split=0, remove_empty_matches=False, use_regex=False)

Splits the given string by the specified pattern. The newline character (\n) is not a natural split pattern (unless you specify it yourself). This function handles escaped split-patterns, so it splits only at occurrences that are unescaped.

Parameters:
  • pattern – A pattern that defines where to split.
  • string – The string to split by the defined pattern.
  • max_split – Defines the maximum number of splits. If 0 or less is provided, the number of splits is not limited.
  • remove_empty_matches – Defines whether empty entries should be removed from the result.
  • use_regex – Specifies whether to treat the split pattern as a regex or simple string.
Returns:

An iterator returning the split up strings.
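
A minimal sketch of splitting only at unescaped occurrences, assuming a plain-string pattern and ignoring max_split and remove_empty_matches. The names are hypothetical and the code is not the coalib implementation.

```python
import re


def _is_escaped(string, position):
    """True if an odd number of backslashes precedes position."""
    prefix = string[:position]
    return (len(prefix) - len(prefix.rstrip("\\"))) % 2 == 1


def unescaped_split_sketch(pattern, string):
    """Yield pieces of string separated by unescaped pattern occurrences."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    last = 0
    for match in re.finditer(re.escape(pattern), string):
        if _is_escaped(string, match.start()):
            continue  # an escaped separator stays part of the piece
        yield string[last:match.start()]
        last = match.end()
    yield string[last:]


print(list(unescaped_split_sketch(",", "a\\,b,c")))
```

The escape characters themselves are left in the pieces; removing them is a separate step.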

coalib.parsing.StringProcessing.Core.unescaped_strip(string)

Strips whitespace from the given string, taking escape characters into account.

Parameters:
  • string – The string to strip whitespace from.
Returns:

The stripped string.
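
The escape-aware stripping (for both unescaped_rstrip and unescaped_strip) can be sketched as follows. The assumption is that a whitespace character directly preceded by an odd number of backslashes is kept; the function names are hypothetical.

```python
def _is_escaped(string, position):
    """True if an odd number of backslashes precedes position."""
    prefix = string[:position]
    return (len(prefix) - len(prefix.rstrip("\\"))) % 2 == 1


def unescaped_rstrip_sketch(string):
    """Right-strip whitespace, but keep one escaped whitespace char."""
    stripped = string.rstrip()
    # If the first removed whitespace char was escaped, put it back.
    if len(stripped) < len(string) and _is_escaped(string, len(stripped)):
        stripped += string[len(stripped)]
    return stripped


def unescaped_strip_sketch(string):
    """Strip both sides; leading whitespace can never be escaped."""
    return unescaped_rstrip_sketch(string).lstrip()


print(repr(unescaped_strip_sketch("  foo\\   ")))
print(repr(unescaped_strip_sketch("  foo  ")))
```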

coalib.parsing.StringProcessing.Filters module

coalib.parsing.StringProcessing.Filters.limit(iterator, count)

A filter that removes all elements after the set limit.

Parameters:
  • iterator – The iterator to be filtered.
  • count – The iterator limit. All elements at positions beyond this limit are trimmed off. Exception: a value of 0 or below does not limit at all, so the passed iterator is yielded completely.

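The limit semantics, including the "0 or below means unlimited" exception, can be sketched with itertools.islice. The name limit_sketch is hypothetical; this is not the coalib source.

```python
import itertools


def limit_sketch(iterator, count):
    """Yield at most count elements; count <= 0 yields everything."""
    if count <= 0:
        yield from iterator
    else:
        yield from itertools.islice(iterator, count)


print(list(limit_sketch(iter(range(5)), 2)))
print(list(limit_sketch(iter(range(5)), 0)))
```
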
coalib.parsing.StringProcessing.Filters.trim_empty_matches(iterator, groups=(0, ))

A filter that removes empty match strings. It can only operate on iterators whose elements are of type MatchObject.

Parameters:
  • iterator – The iterator to be filtered.
  • groups – An iterable defining the groups to check for blankness. A match is dropped only if all of these groups are blank. Besides numbers you can also pass strings if your MatchObject contains named groups.
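
A short sketch of this filter applied to matches from the standard re module. The name trim_empty_matches_sketch is hypothetical, and treating a non-participating group (None) as blank is an assumption.

```python
import re


def trim_empty_matches_sketch(iterator, groups=(0,)):
    """Yield only matches where at least one listed group is non-empty."""
    for match in iterator:
        # A group that is "" (or None, assumed blank) does not count.
        if any(match.group(group) for group in groups):
            yield match


matches = re.finditer(r"\[(?P<name>\w*)\]", "[x][][y]")
kept = trim_empty_matches_sketch(matches, groups=("name",))
print([m.group("name") for m in kept])
```

With the default groups=(0,) nothing would be dropped here, because group 0 (the whole match, including the brackets) is never empty.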

coalib.parsing.StringProcessing.InBetweenMatch module

class coalib.parsing.StringProcessing.InBetweenMatch.InBetweenMatch(begin, inside, end)

Bases: object

Holds information about a match enclosed by two matches.

begin
end
classmethod from_values(begin, begin_pos, inside, inside_pos, end, end_pos)

Instantiates a new InBetweenMatch from Match values.

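This function bypasses the explicit instantiation of Match objects:

>>> from coalib.parsing.StringProcessing.InBetweenMatch import InBetweenMatch
>>> from coalib.parsing.StringProcessing.Match import Match
>>> a = InBetweenMatch(Match("A", 0), Match("B", 1), Match("C", 2))
>>> b = InBetweenMatch.from_values("A", 0, "B", 1, "C", 2)
>>> assert a == b

Parameters: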
  • begin – The matched string from start pattern.
  • begin_pos – The position of the matched begin string.
  • inside – The matched string from inside/in-between pattern.
  • inside_pos – The position of the matched inside/in-between string.
  • end – The matched string from end pattern.
  • end_pos – The position of the matched end string.
Returns:

An InBetweenMatch from the given values.

inside

coalib.parsing.StringProcessing.Match module

class coalib.parsing.StringProcessing.Match.Match(match, position)

Bases: object

Stores information about a single textual match.

end_position

Marks the end position of the matched text (zero-based).

Returns:

The end-position.

match

Returns the text matched.

Returns:

The text matched.

position

Returns the position where the text was matched (zero-based).

Returns:

The position.

range

Returns the position range where the text was matched.

Returns:

A pair indicating the position range. The first element is the start position, the second one the end position.
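
How the four properties relate can be shown with a small stand-in class. MatchSketch is hypothetical and not the coalib class; in particular, it assumes the end position is exclusive, that is, position plus the length of the matched text.

```python
class MatchSketch:
    """Stand-in holding a matched text and its zero-based position."""

    def __init__(self, match, position):
        self._match = match
        self._position = position

    @property
    def match(self):
        return self._match

    @property
    def position(self):
        return self._position

    @property
    def end_position(self):
        # Assumption: exclusive end, like the stop index of a slice.
        return self._position + len(self._match)

    @property
    def range(self):
        return (self.position, self.end_position)


m = MatchSketch("abc", 2)
print(m.match, m.position, m.end_position, m.range)
```

Under that assumption, slicing the searched string with the range returns the matched text again.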

Module contents