Interface | Description |
---|---|
Env.Binder | Interface for performing custom binding of values to the environment. |
MultiPatternMatcher.NodePatternTrigger<T> | A function which returns a collection of patterns that may match when given a single node from a larger sequence. |
MultiPatternMatcher.SequencePatternTrigger<T> | A function which returns a collection of patterns that may match when given a sequence of nodes. |
NodePatternTransformer<T1,T2> | Interface to transform a node pattern from a NodePattern<T1> into a NodePattern<T2>. |
PhraseTable.WordList | |
SequenceMatchAction<T> | Performs an action on a sequence. |
SequenceMatcher.MatchReplacement<T> | Interface that specifies what to replace a matched pattern with. |
SequenceMatchResult<T> | The result of a match against a sequence. |
SequenceMatchRules.ExtractRule<I,O> | Interface for a rule that extracts a list of matched items from an input. |
SequenceMatchRules.Rule | A sequence match rule. |
SequencePattern.NodesMatchChecker<T> | |
SequencePattern.Parser<T> | |

Enum | Description |
---|---|
MultiWordStringMatcher.MatchType | EXCT: match the exact string. EXCTWS: match the exact string, except that whitespace can match multiple whitespace characters. LWS: case-insensitive match, except that whitespace can match multiple whitespace characters. LNRM: case-insensitive match that disregards punctuation. REGEX: the string is already a regex and is interpreted as one. |
SequenceMatcher.FindType | Type of search to perform. FIND_NONOVERLAPPING: find nonoverlapping matches (default). FIND_ALL: find all potential matches; greedy/reluctant quantifiers are not enforced (perhaps syntax should be added where some of them are enforced). |

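To make the MatchType descriptions concrete, here is a minimal sketch that approximates four of the modes with plain java.util.regex. It is not the library's implementation: the `Mode` enum, `toPattern`, and `flexibleWhitespace` are hypothetical names introduced for illustration, and LNRM is left out because the exact normalization it applies is not described here.

```java
import java.util.regex.Pattern;

class MatchTypeSketch {
    // Hypothetical mirror of the documented match types; not the library's enum.
    enum Mode { EXCT, EXCTWS, LWS, REGEX }

    // Builds a java.util.regex.Pattern that approximates each documented mode.
    static Pattern toPattern(String target, Mode mode) {
        switch (mode) {
            case EXCT:   // exact string, nothing is flexible
                return Pattern.compile(Pattern.quote(target));
            case EXCTWS: // exact words, any run of whitespace between them
                return Pattern.compile(flexibleWhitespace(target));
            case LWS:    // like EXCTWS, but case insensitive
                return Pattern.compile(flexibleWhitespace(target), Pattern.CASE_INSENSITIVE);
            case REGEX:  // the target is already a regular expression
                return Pattern.compile(target);
            default:
                throw new IllegalArgumentException("Unsupported mode: " + mode);
        }
    }

    // Quotes each whitespace-separated word and joins the words with \s+.
    static String flexibleWhitespace(String target) {
        String[] words = target.trim().split("\\s+");
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                sb.append("\\s+");
            }
            sb.append(Pattern.quote(words[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "Visit New   York soon"; // three spaces between the words
        System.out.println(toPattern("New York", Mode.EXCT).matcher(text).find());   // false
        System.out.println(toPattern("New York", Mode.EXCTWS).matcher(text).find()); // true
        System.out.println(toPattern("new york", Mode.LWS).matcher(text).find());    // true
    }
}
```

The sketch only shows how the modes differ in strictness; the real matcher may build its patterns differently.
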
TokensRegex is used by the edu.stanford.nlp.pipeline.TokensRegexAnnotator, the edu.stanford.nlp.pipeline.TokensRegexNERAnnotator, and the SUTime functionality in edu.stanford.nlp.pipeline.NERCombinerAnnotator.
TokensRegex provides a language for specifying rules to extract expressions over a token sequence. CoreMapExpressionExtractor and SequenceMatchRules describe the language and how the extraction rules are created.
At the core of TokensRegex are the TokenSequenceMatcher and TokenSequencePattern classes, which can be used to match patterns over sequences of tokens. Their usage follows the paradigm of the Java regular expression library java.util.regex, except that matches are done over List<CoreMap> instead of over String.
Example:
```java
List<CoreLabel> tokens = ...;
TokenSequencePattern pattern = TokenSequencePattern.compile(...);
TokenSequenceMatcher matcher = pattern.getMatcher(tokens);
```
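For comparison, the java.util.regex workflow that this example mirrors looks as follows. This is a stdlib-only sketch over a String, not TokensRegex code; the class name `RegexParadigmSketch` and the helper `findAll` are hypothetical. The compile step corresponds to `TokenSequencePattern.compile(...)`, and `pattern.matcher(text)` plays the role that `pattern.getMatcher(tokens)` plays above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexParadigmSketch {
    // Collects every match of the regex in the text, in left-to-right order.
    static List<String> findAll(String regex, String text) {
        Pattern pattern = Pattern.compile(regex);  // compile the pattern once
        Matcher matcher = pattern.matcher(text);   // bind the pattern to one input
        List<String> matches = new ArrayList<>();
        while (matcher.find()) {                   // advance to the next match
            matches.add(matcher.group());          // text of the whole match
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "She left Stanford University for Rice University.";
        // prints [Stanford University, Rice University]
        System.out.println(findAll("[A-Z][a-z]+ University", text));
    }
}
```

The difference in TokensRegex is the unit of matching: each pattern element describes a token rather than a character.
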
The classes SequenceMatcher and SequencePattern can be used to build classes for recognizing regular expressions over sequences of arbitrary types.
TokensRegex also offers a group of utility classes. MultiPatternMatcher provides utility functions for finding expressions with multiple patterns. For instance, using MultiPatternMatcher.findNonOverlapping(java.util.List<? extends T>) you can find all nonoverlapping subsequences for a given set of patterns.
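The idea of picking nonoverlapping matches from several patterns can be sketched with plain java.util.regex over a String. This is an illustration only, not the MultiPatternMatcher implementation: the `Span` class and this `findNonOverlapping` helper are hypothetical, and the preference for longer matches, with ties broken by earlier start, is an assumption; the library's actual ordering may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonOverlappingSketch {
    // A matched span [start, end) together with the matched text.
    static final class Span {
        final int start;
        final int end;
        final String text;

        Span(int start, int end, String text) {
            this.start = start;
            this.end = end;
            this.text = text;
        }

        @Override
        public String toString() {
            return text + "@" + start + "-" + end;
        }
    }

    static List<Span> findNonOverlapping(List<Pattern> patterns, String input) {
        // 1. Collect every match of every pattern.
        List<Span> candidates = new ArrayList<>();
        for (Pattern p : patterns) {
            Matcher m = p.matcher(input);
            while (m.find()) {
                candidates.add(new Span(m.start(), m.end(), m.group()));
            }
        }
        // 2. Assumed preference: longer spans first, ties broken by earlier start.
        candidates.sort(Comparator.comparingInt((Span s) -> s.start - s.end)
                .thenComparingInt(s -> s.start));
        // 3. Greedily keep each span that does not overlap a span already kept.
        List<Span> kept = new ArrayList<>();
        for (Span c : candidates) {
            boolean overlaps = false;
            for (Span k : kept) {
                if (c.start < k.end && k.start < c.end) {
                    overlaps = true;
                    break;
                }
            }
            if (!overlaps) {
                kept.add(c);
            }
        }
        // 4. Return the surviving spans in document order.
        kept.sort(Comparator.comparingInt((Span s) -> s.start));
        return kept;
    }

    public static void main(String[] args) {
        List<Pattern> patterns = Arrays.asList(
                Pattern.compile("New York"),
                Pattern.compile("New York City"),
                Pattern.compile("City Hall"),
                Pattern.compile("open"));
        // prints [New York City@0-13, open@22-26]
        System.out.println(findNonOverlapping(patterns, "New York City Hall is open"));
    }
}
```

Here "New York" and "City Hall" are dropped because each overlaps the longer "New York City" match.
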
To find character offsets of multi-word expressions in a String, you can also use MultiWordStringMatcher.findTargetStringOffsets(java.lang.String, java.lang.String).