Package API Documentation for mlconjug3
API Reference for the classes in mlconjug3.mlconjug.py
MLConjug Main module.
-
mlconjug3.mlconjug.
extract_verb_features
(verb, lang, ngram_range) - Custom vectorizer optimized for extracting verb features. The vectorizer subclasses sklearn.feature_extraction.text.CountVectorizer. Because verbs in Indo-European languages are inflected by adding a morphological suffix, the vectorizer extracts verb endings and produces a vector representation of the verb with binary features. To improve the results of the feature extraction, several other features are included. The full feature set is the verb’s ending n-grams, starting n-grams, length of the verb, number of vowels, number of consonants and the ratio of vowels to consonants.
- Parameters
verb – string. Verb to vectorize.
lang – string. Language to analyze.
ngram_range – tuple. The range of the ngram sliding window.
- Returns
list. List of the most salient features of the verb for the task of finding its conjugation class.
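The feature set described above can be sketched in plain Python. This is only an illustration under stated assumptions: the function name sketch_verb_features, the VOWELS set and the string labels (END=, START=, and so on) are hypothetical and are not taken from mlconjug3, whose actual implementation may differ.

```python
# Assumption: a small Latin-script vowel set, chosen for illustration only.
VOWELS = set("aeiouyàâäéèêëîïôöùûü")


def sketch_verb_features(verb, ngram_range=(2, 7)):
    """Return a list of string features for one verb (hypothetical helper)."""
    verb = verb.lower()
    min_n, max_n = ngram_range
    max_n = min(max_n, len(verb))  # never ask for n-grams longer than the verb
    # Ending n-grams: the suffix carries most of the conjugation-class signal.
    endings = [f"END={verb[-n:]}" for n in range(min_n, max_n + 1)]
    # Starting n-grams: useful for prefixed verbs.
    starts = [f"START={verb[:n]}" for n in range(min_n, max_n + 1)]
    n_vowels = sum(ch in VOWELS for ch in verb)
    n_consonants = sum(ch.isalpha() and ch not in VOWELS for ch in verb)
    ratio = n_vowels / n_consonants if n_consonants else 0.0
    counts = [
        f"LEN={len(verb)}",
        f"VOW={n_vowels}",
        f"CONS={n_consonants}",
        f"RATIO={ratio:.2f}",
    ]
    return endings + starts + counts


features = sketch_verb_features("aimer", ngram_range=(2, 3))
```

Each returned string can be treated as one binary feature (present or absent), which is how a count vectorizer with binary output would represent it.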
-
class
mlconjug3.mlconjug.
Conjugator
(language='fr', model=None) Bases:
object
This is the main class of the project. The class manages the Verbiste data set and provides an interface with the scikit-learn pipeline. If no parameters are provided, the default language is set to French and the pre-trained French conjugation pipeline is used. The class defines the method conjugate(verb, subject), which is the main method of the module.
- Parameters
language – string. Language of the conjugator. The default language is ‘fr’ for French.
model – mlconjug3.Model or scikit-learn Pipeline or Classifier implementing the fit() and predict() methods. A user-provided pipeline, for users who have trained their own.
-
conjugate
(verb, subject='abbrev') - This is the main method of this class. It first checks whether the verb is in Verbiste. If it is not, and a pre-trained scikit-learn pipeline has been supplied, the method calls the pipeline to predict the conjugation class of the provided verb. Returns a Verb object or None.
- Parameters
verb – string. Verb to conjugate.
subject – string. Toggles abbreviated or full pronouns. The default value is ‘abbrev’. Select ‘pronoun’ for full pronouns.
- Returns
Verb object or None.
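The lookup-then-predict flow described for conjugate can be sketched as follows. SketchConjugator is a hypothetical stand-in written for this illustration, not the mlconjug3 class: it uses a plain dict in place of Verbiste, an optional callable in place of the trained pipeline, and returns a dict rather than a Verb object.

```python
class SketchConjugator:
    """Hypothetical stand-in for the lookup-then-predict flow."""

    def __init__(self, known_verbs, predict_template=None):
        # known_verbs: infinitive -> template name (stands in for the data set).
        self.known_verbs = known_verbs
        # predict_template: optional callable verb -> template name
        # (stands in for the trained pipeline).
        self.predict_template = predict_template

    def conjugate(self, verb):
        # 1. Prefer the curated data set.
        if verb in self.known_verbs:
            return {
                "infinitive": verb,
                "template": self.known_verbs[verb],
                "predicted": False,
            }
        # 2. Otherwise fall back to the model, if one was supplied.
        if self.predict_template is not None:
            return {
                "infinitive": verb,
                "template": self.predict_template(verb),
                "predicted": True,
            }
        # 3. Unknown verb and no model: nothing to return.
        return None


conjugator = SketchConjugator({"aimer": "aim:er"}, predict_template=lambda v: "aim:er")
known = conjugator.conjugate("aimer")
guessed = conjugator.conjugate("googler")
```

The "predicted" flag in this sketch mirrors the predicted argument of the Verb class documented further down, which records whether the conjugation came from the model or from the data set.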
-
class
mlconjug3.mlconjug.
DataSet
(verbs_dict) Bases:
object
This class holds and manages the data set. It defines helper methods for managing machine learning tasks, such as constructing a training and testing set.
- Parameters
verbs_dict – A dictionary of verbs and their corresponding conjugation class.
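As a rough illustration of building a training and testing set from such a dictionary, the sketch below shuffles the (verb, class) pairs with a fixed seed and holds out a fraction for testing. The function split_verbs, its test_ratio and seed parameters, and the sample data are assumptions made for this example; the DataSet class may split the data differently.

```python
import random


def split_verbs(verbs_dict, test_ratio=0.2, seed=42):
    """Split a {verb: conjugation_class} mapping into (train, test) pair lists."""
    pairs = sorted(verbs_dict.items())  # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(pairs)
    n_test = max(1, int(len(pairs) * test_ratio))
    return pairs[n_test:], pairs[:n_test]


# Hypothetical sample data: ten verbs that all share one conjugation class.
sample_verbs = {f"verbe{i}er": "aim:er" for i in range(10)}
train, test = split_verbs(sample_verbs)
```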
-
class
mlconjug3.mlconjug.
Model
(vectorizer=None, feature_selector=None, classifier=None, language=None) Bases:
object
This class manages the scikit-learn pipeline. The pipeline includes a feature vectorizer, a feature selector and a classifier. If any of the vectorizer, feature selector or classifier is not supplied at instance declaration, the __init__ method will provide good default values that get more than 92% prediction accuracy.
- Parameters
vectorizer – scikit-learn Vectorizer.
feature_selector – scikit-learn feature selector with a fit_transform() method.
classifier – scikit-learn Classifier with a predict() method.
language – language of the corpus of verbs to be analyzed.
API Reference for the classes in mlconjug3.PyVerbiste.py
PyVerbiste.
-
class
mlconjug3.PyVerbiste.
ConjugManager
(language='default') Bases:
object
This is the class handling the mlconjug3 JSON files.
- Parameters
language – string. The language of the conjugator. The default value is fr for French. The allowed values are: fr, en, es, it, pt, ro.
-
_load_verbs
(verbs_file) Loads and parses the verbs from the JSON file.
- Parameters
verbs_file – string or path object. Path to the verbs JSON file.
-
_load_conjugations
(conjugations_file) Loads and parses the conjugations from the JSON file.
- Parameters
conjugations_file – string or path object. Path to the conjugation JSON file.
-
_detect_allowed_endings
() - Detects the allowed endings for verbs in the supported languages. All the supported languages except for English restrict the form a verb can take. As English is much more productive and varied in the morphology of its verbs, any word is allowed as a verb.
- Returns
set. A set containing the allowed endings of verbs in the target language.
-
is_valid_verb
(verb) - Checks if the verb is a valid verb in the given language. English words are always treated as possible verbs. Verbs in other languages are filtered by their endings.
- Parameters
verb – string. The verb to conjugate.
- Returns
bool. True if the verb is a valid verb in the language. False otherwise.
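The ending-based filter described for _detect_allowed_endings and is_valid_verb can be sketched like this. The helper names, the fixed two-character ending length and the use of an empty set to mean "no restriction" for English are all assumptions made for this illustration, not details confirmed by the library.

```python
def sketch_allowed_endings(language, known_verbs, ending_length=2):
    """Collect the final characters of known infinitives (hypothetical helper)."""
    if language == "en":
        # English places no restriction on verb forms in this sketch.
        return set()
    return {
        verb[-ending_length:]
        for verb in known_verbs
        if len(verb) >= ending_length
    }


def sketch_is_valid_verb(verb, language, allowed_endings):
    """Return True if the word could be a verb in the given language."""
    if language == "en":
        return True  # English words are always treated as possible verbs
    return any(verb.endswith(ending) for ending in allowed_endings)


french_endings = sketch_allowed_endings("fr", ["aimer", "finir", "prendre"])
```

With these sample infinitives, a word such as "manger" passes the French filter because it ends in one of the collected endings, while "table" does not.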
-
class
mlconjug3.PyVerbiste.
Verbiste
(language='default') Bases:
mlconjug3.PyVerbiste.ConjugManager
This is the class handling the Verbiste XML files.
- Parameters
language – string. The language of the conjugator. The default value is fr for French. The allowed values are: fr, en, es, it, pt, ro.
-
_load_verbs
(verbs_file) Loads and parses the verbs from the XML file.
- Parameters
verbs_file – string or path object. Path to the verbs XML file.
-
static
_parse_verbs
(file) Parses the XML file.
- Parameters
file – FileObject. XML file containing the verbs.
- Returns
OrderedDict. An OrderedDict containing the verb and its template for all verbs in the file.
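A minimal sketch of this kind of parsing, using only the standard library, is shown below. The XML layout (one v element per verb, with the infinitive in an i child and the template name in a t child), the sample data and the function name parse_verbs_sketch are assumptions for this example and should be checked against the actual Verbiste files.

```python
import io
import xml.etree.ElementTree as ET
from collections import OrderedDict

# Assumed layout: <v> per verb, infinitive in <i>, template name in <t>.
SAMPLE_XML = """<verbs>
  <v><i>aimer</i><t>aim:er</t></v>
  <v><i>finir</i><t>fin:ir</t></v>
</verbs>"""


def parse_verbs_sketch(file):
    """Map each infinitive to its template name, preserving file order."""
    verbs = OrderedDict()
    for node in ET.parse(file).getroot().iter("v"):
        verbs[node.findtext("i")] = node.findtext("t")
    return verbs


parsed = parse_verbs_sketch(io.StringIO(SAMPLE_XML))
```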
-
_load_conjugations
(conjugations_file) Loads and parses the conjugations from the XML file.
- Parameters
conjugations_file – string or path object. Path to the conjugation XML file.
-
_parse_conjugations
(file) Parses the XML file.
- Parameters
file – FileObject. XML file containing the conjugation templates.
- Returns
OrderedDict. An OrderedDict containing all the conjugation templates in the file.
-
static
_load_tense
(tense) Loads and parses the inflected forms of the tense from the XML file.
- Parameters
tense – list of XML tags containing inflected forms. The list of inflected forms for the current tense being processed.
- Returns
list. List of inflected forms.
-
_detect_allowed_endings
() - Detects the allowed endings for verbs in the supported languages. All the supported languages except for English restrict the form a verb can take. As English is much more productive and varied in the morphology of its verbs, any word is allowed as a verb.
- Returns
set. A set containing the allowed endings of verbs in the target language.
-
get_conjug_info
(template) Gets conjugation information corresponding to the given template.
- Parameters
template – string. Name of the verb ending pattern.
- Returns
OrderedDict or None. OrderedDict containing the conjugated suffixes of the template.
-
get_verb_info
(verb) Gets verb information and returns a VerbInfo instance.
- Parameters
verb – string. Verb to conjugate.
- Returns
VerbInfo object or None.
-
is_valid_verb
(verb) - Checks if the verb is a valid verb in the given language. English words are always treated as possible verbs. Verbs in other languages are filtered by their endings.
- Parameters
verb – string. The verb to conjugate.
- Returns
bool. True if the verb is a valid verb in the language. False otherwise.
-
class
mlconjug3.PyVerbiste.
VerbInfo
(infinitive, root, template) Bases:
object
This class defines the Verbiste verb information structure.
- Parameters
infinitive – string. Infinitive form of the verb.
root – string. Lexical root of the verb.
template – string. Name of the verb ending pattern.
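To show how the three fields might relate, the sketch below derives a root from an infinitive and a template name. It assumes the template follows a "prefix:ending" convention in which the text after the colon is the ending to strip from the infinitive. That convention, the SketchVerbInfo tuple and the make_verb_info helper are assumptions for this example, not confirmed by this reference.

```python
from collections import namedtuple

# Hypothetical stand-in for the VerbInfo structure.
SketchVerbInfo = namedtuple("SketchVerbInfo", ["infinitive", "root", "template"])


def make_verb_info(infinitive, template):
    """Build a SketchVerbInfo, deriving the root from the template's ending."""
    # Assumed convention: everything after ':' is the ending to remove.
    _, _, ending = template.partition(":")
    if ending:
        root = infinitive[: len(infinitive) - len(ending)]
    else:
        root = infinitive
    return SketchVerbInfo(infinitive, root, template)


info = make_verb_info("manger", "aim:er")
```

Under this assumption, a verb conjugated like "aimer" keeps its own root ("mang" for "manger") and borrows only the endings from the template.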
-
class
mlconjug3.PyVerbiste.
Verb
(verb_info, conjug_info, subject='abbrev', predicted=False) Bases:
object
This class defines the Verb Object.
- Parameters
verb_info – VerbInfo Object.
conjug_info – OrderedDict.
subject – string. Toggles abbreviated or full pronouns. The default value is ‘abbrev’. Select ‘pronoun’ for full pronouns.
predicted – bool. Indicates if the conjugation information was predicted by the model or retrieved from the dataset.
-
iterate
() Iterates over all conjugated forms and returns a list of tuples of those conjugated forms.
- Returns
list. List of conjugated forms.
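One way to picture the list of tuples is as a flattened view of nested conjugation data. The sketch below assumes a mood, tense, person nesting, with impersonal forms stored directly as strings. The structure, the sample data and the function name iterate_forms are illustrative assumptions, not the library's confirmed internals.

```python
from collections import OrderedDict


def iterate_forms(conjug_info):
    """Flatten mood -> tense -> person -> form into a list of tuples."""
    forms = []
    for mood, tenses in conjug_info.items():
        for tense, persons in tenses.items():
            if isinstance(persons, dict):
                # Personal forms: one tuple per person.
                forms.extend(
                    (mood, tense, person, form) for person, form in persons.items()
                )
            else:
                # Impersonal forms (for example the infinitive) have no person level.
                forms.append((mood, tense, persons))
    return forms


# Hypothetical sample data for the French verb "aimer".
sample = OrderedDict([
    ("Indicatif", OrderedDict([
        ("Présent", OrderedDict([("je", "aime"), ("tu", "aimes")])),
    ])),
    ("Infinitif", OrderedDict([("Infinitif Présent", "aimer")])),
])
flat = iterate_forms(sample)
```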
-
class
mlconjug3.PyVerbiste.
VerbFr
(verb_info, conjug_info, subject='abbrev', predicted=False) Bases:
mlconjug3.PyVerbiste.Verb
This class defines the French Verb Object.
-
_load_conjug
() - Populates the inflected forms of the verb. Adds personal pronouns to the inflected verbs.
-
conjugate_person
(key, persons_dict, term) Creates the conjugated form of the person specified by the key argument.
- Parameters
key – string.
persons_dict – OrderedDict.
term – string.
- Returns
None.
-
iterate
() Iterates over all conjugated forms and returns a list of tuples of those conjugated forms.
- Returns
list. List of conjugated forms.
-
-
class
mlconjug3.PyVerbiste.
VerbEn
(verb_info, conjug_info, subject='abbrev', predicted=False) Bases:
mlconjug3.PyVerbiste.Verb
This class defines the English Verb Object.
-
_load_conjug
() - Populates the inflected forms of the verb. Adds personal pronouns to the inflected verbs.
-
conjugate_person
(key, persons_dict, term) Creates the conjugated form of the person specified by the key argument.
- Parameters
key – string.
persons_dict – OrderedDict.
term – string.
- Returns
None.
-
iterate
() Iterates over all conjugated forms and returns a list of tuples of those conjugated forms.
- Returns
list. List of conjugated forms.
-
-
class
mlconjug3.PyVerbiste.
VerbEs
(verb_info, conjug_info, subject='abbrev', predicted=False) Bases:
mlconjug3.PyVerbiste.Verb
This class defines the Spanish Verb Object.
-
_load_conjug
() - Populates the inflected forms of the verb. Adds personal pronouns to the inflected verbs.
-
conjugate_person
(key, persons_dict, term) Creates the conjugated form of the person specified by the key argument.
- Parameters
key – string.
persons_dict – OrderedDict.
term – string.
- Returns
None.
-
iterate
() Iterates over all conjugated forms and returns a list of tuples of those conjugated forms.
- Returns
list. List of conjugated forms.
-
-
class
mlconjug3.PyVerbiste.
VerbIt
(verb_info, conjug_info, subject='abbrev', predicted=False) Bases:
mlconjug3.PyVerbiste.Verb
This class defines the Italian Verb Object.
-
_load_conjug
() - Populates the inflected forms of the verb. Adds personal pronouns to the inflected verbs.
-
conjugate_person
(key, persons_dict, term) Creates the conjugated form of the person specified by the key argument.
- Parameters
key – string.
persons_dict – OrderedDict.
term – string.
- Returns
None.
-
iterate
() Iterates over all conjugated forms and returns a list of tuples of those conjugated forms.
- Returns
list. List of conjugated forms.
-
-
class
mlconjug3.PyVerbiste.
VerbPt
(verb_info, conjug_info, subject='abbrev', predicted=False) Bases:
mlconjug3.PyVerbiste.Verb
This class defines the Portuguese Verb Object.
-
_load_conjug
() - Populates the inflected forms of the verb. Adds personal pronouns to the inflected verbs.
-
conjugate_person
(key, persons_dict, term) Creates the conjugated form of the person specified by the key argument.
- Parameters
key – string.
persons_dict – OrderedDict.
term – string.
- Returns
None.
-
iterate
() Iterates over all conjugated forms and returns a list of tuples of those conjugated forms.
- Returns
list. List of conjugated forms.
-
-
class
mlconjug3.PyVerbiste.
VerbRo
(verb_info, conjug_info, subject='abbrev', predicted=False) Bases:
mlconjug3.PyVerbiste.Verb
This class defines the Romanian Verb Object.
-
_load_conjug
() - Populates the inflected forms of the verb. Adds personal pronouns to the inflected verbs.
-
conjugate_person
(key, persons_dict, term) Creates the conjugated form of the person specified by the key argument.
- Parameters
key – string.
persons_dict – OrderedDict.
term – string.
- Returns
None.
-
iterate
() Iterates over all conjugated forms and returns a list of tuples of those conjugated forms.
- Returns
list. List of conjugated forms.
-