abacusai.api_class
Submodules
Package Contents
Classes
- class abacusai.api_class.ApiClass
Bases:
abc.ABC
Helper class that provides a standard way to create an ABC using inheritance.
- __post_init__()
- to_dict()
Standardizes converting an ApiClass to a dictionary. Keys of the resulting dictionary are converted to camel case. This also validates the fields (type, value, etc.) received in the dictionary.
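The camel-case key conversion described for to_dict() can be pictured with a small stand-in. The sketch below does not use the library: `to_camel_case` and `ExampleConfig` are hypothetical names, and leaving out unset (None) fields is an assumption made for illustration, not documented behavior.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


def to_camel_case(name: str) -> str:
    """Convert a snake_case field name to camelCase."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


@dataclass
class ExampleConfig:
    # Hypothetical stand-in for an ApiClass subclass; not part of abacusai.
    sampling_method: str
    sample_count: int
    key_columns: Optional[List[str]] = None

    def to_dict(self) -> dict:
        # Assumption: unset (None) fields are left out of the result.
        return {
            to_camel_case(f.name): getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) is not None
        }


example_dict = ExampleConfig("N_SAMPLING", 100, ["user_id"]).to_dict()
```

Here `example_dict` carries the keys `samplingMethod`, `sampleCount` and `keyColumns`.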
- class abacusai.api_class.ParsingConfig
Bases:
abacusai.api_class.abstract.ApiClass
Helper class that provides a standard way to create an ABC using inheritance.
- class abacusai.api_class.ApiEnum
Bases:
enum.Enum
Generic enumeration.
Derive from this class to define new enumerations.
- __eq__(other)
Return self==value.
- __hash__()
Return hash(self).
- class abacusai.api_class.ProblemType
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- AI_AGENT = 'ai_agent'
- ANOMALY_DETECTION = 'anomaly_new'
- ANOMALY_OUTLIERS = 'anomaly'
- CLUSTERING = 'clustering'
- CLUSTERING_TIMESERIES = 'clustering_timeseries'
- CUMULATIVE_FORECASTING = 'cumulative_forecasting'
- NAMED_ENTITY_EXTRACTION = 'nlp_ner'
- NATURAL_LANGUAGE_SEARCH = 'nlp_search'
- SENTENCE_BOUNDARY_DETECTION = 'nlp_sentence_boundary_detection'
- SENTIMENT_DETECTION = 'nlp_sentiment'
- DOCUMENT_CLASSIFICATION = 'nlp_classification'
- DOCUMENT_SUMMARIZATION = 'nlp_summarization'
- DOCUMENT_VISUALIZATION = 'nlp_document_visualization'
- PERSONALIZATION = 'personalization'
- PREDICTIVE_MODELING = 'regression'
- FORECASTING = 'forecasting'
- CUSTOM_TRAINED_MODEL = 'plug_and_play'
- CUSTOM_ALGORITHM = 'trainable_plug_and_play'
- FEATURE_STORE = 'feature_store'
- IMAGE_CLASSIFICATION = 'vision_classification'
- OBJECT_DETECTION = 'vision_object_detection'
- IMAGE_VALUE_PREDICTION = 'vision_regression'
- MODEL_MONITORING = 'model_monitoring'
- LANGUAGE_DETECTION = 'language_detection'
- OPTIMIZATION = 'optimization'
- PRETRAINED_MODELS = 'pretrained'
- THEME_ANALYSIS = 'theme_analysis'
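Several ProblemType member names differ from their string values (for example, PREDICTIVE_MODELING maps to 'regression'). The sketch below uses a plain `enum.Enum` stand-in, `ProblemTypeSketch` (a hypothetical name), with a few members copied from the listing above to show lookup by value versus lookup by name. ApiEnum defines its own `__eq__` and `__hash__`, so comparison behavior in the real class may differ.

```python
from enum import Enum


class ProblemTypeSketch(Enum):
    # Stand-in holding a few members copied from the listing above.
    ANOMALY_DETECTION = "anomaly_new"
    ANOMALY_OUTLIERS = "anomaly"
    PREDICTIVE_MODELING = "regression"
    CUSTOM_TRAINED_MODEL = "plug_and_play"
    FORECASTING = "forecasting"


# Lookup by value returns the member whose value matches exactly.
by_value = ProblemTypeSketch("regression")
# Lookup by name uses subscript syntax.
by_name = ProblemTypeSketch["PREDICTIVE_MODELING"]
```

Both lookups return the same member, so code that receives the string value can recover the symbolic name via `by_value.name`.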
- class abacusai.api_class.RegressionObjective
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- AUC = 'auc'
- ACCURACY = 'acc'
- LOG_LOSS = 'log_loss'
- PRECISION = 'precision'
- RECALL = 'recall'
- F1_SCORE = 'fscore'
- MAE = 'mae'
- MAPE = 'mape'
- WAPE = 'wape'
- RMSE = 'rmse'
- R_SQUARED_COEFFICIENT_OF_DETERMINATION = 'r^2'
- class abacusai.api_class.RegressionTreeHPOMode
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- RAPID = ('rapid',)
- THOROUGH = 'thorough'
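As listed, RAPID is assigned `('rapid',)`, a one-element tuple, while THOROUGH is a plain string. The sketch below reproduces that assignment on a plain `enum.Enum` stand-in (`TreeHPOModeSketch` is a hypothetical name) to show what the value looks like in that case. Because ApiEnum overrides `__eq__` and `__hash__`, the real class may handle comparisons differently.

```python
from enum import Enum


class TreeHPOModeSketch(Enum):
    # Values copied as listed; note the trailing comma on RAPID.
    RAPID = ("rapid",)
    THOROUGH = "thorough"


rapid_value = TreeHPOModeSketch.RAPID.value
thorough_value = TreeHPOModeSketch.THOROUGH.value
```

With a plain Enum, `rapid_value` is a tuple rather than the string 'rapid', so code that reads `.value` directly may need to account for both shapes.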
- class abacusai.api_class.RegressionAugmentationStrategy
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- SMOTE = 'smote'
- RESAMPLE = 'resample'
- class abacusai.api_class.RegressionTargetTransform
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- LOG = 'log'
- QUANTILE = 'quantile'
- YEO_JOHNSON = 'yeo-johnson'
- BOX_COX = 'box-cox'
- class abacusai.api_class.RegressionTypeOfSplit
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- RANDOM = 'Random Sampling'
- TIMESTAMP_BASED = 'Timestamp Based'
- ROW_INDICATOR_BASED = 'Row Indicator Based'
- class abacusai.api_class.RegressionTimeSplitMethod
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- TEST_SPLIT_PERCENTAGE_BASED = 'Test Split Percentage Based'
- TEST_START_TIMESTAMP_BASED = 'Test Start Timestamp Based'
- class abacusai.api_class.RegressionLossFunction
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- HUBER = 'Huber'
- MSE = 'Mean Squared Error'
- MAE = 'Mean Absolute Error'
- MAPE = 'Mean Absolute Percentage Error'
- MSLE = 'Mean Squared Logarithmic Error'
- TWEEDIE = 'Tweedie'
- CROSS_ENTROPY = 'Cross Entropy'
- FOCAL_CROSS_ENTROPY = 'Focal Cross Entropy'
- AUTOMATIC = 'Automatic'
- CUSTOM = 'Custom'
- class abacusai.api_class.SamplingMethodType
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- N_SAMPLING = 'N_SAMPLING'
- PERCENT_SAMPLING = 'PERCENT_SAMPLING'
- class abacusai.api_class.FillLogic
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- AVERAGE = 'average'
- MAX = 'max'
- MEDIAN = 'median'
- MIN = 'min'
- CUSTOM = 'custom'
- BACKFILL = 'bfill'
- FORWARDFILL = 'ffill'
- LINEAR = 'linear'
- NEAREST = 'nearest'
- class abacusai.api_class.BatchSize
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- BATCH_8 = 8
- BATCH_16 = 16
- BATCH_32 = 32
- BATCH_64 = 64
- BATCH_128 = 128
- BATCH_256 = 256
- BATCH_384 = 384
- BATCH_512 = 512
- BATCH_740 = 740
- BATCH_1024 = 1024
- class abacusai.api_class.HolidayCalendars
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- AU = 'AU'
- UK = 'UK'
- US = 'US'
- class abacusai.api_class.ExperimentationMode
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- RAPID = 'rapid'
- THOROUGH = 'thorough'
- class abacusai.api_class.PersonalizationTrainingMode
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- EXPERIMENTAL = 'EXP'
- PRODUCTION = 'PROD'
- class abacusai.api_class.PersonalizationObjective
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- NDCG = 'ndcg'
- NDCG_5 = 'ndcg@5'
- NDCG_10 = 'ndcg@10'
- MAP = 'map'
- MAP_5 = 'map@5'
- MAP_10 = 'map@10'
- MRR = 'mrr'
- PERSONALIZATION = 'personalization@10'
- COVERAGE = 'coverage'
- class abacusai.api_class.ForecastingObjective
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- ACCURACY = 'w_c_accuracy'
- WAPE = 'wape'
- MAPE = 'mape'
- CMAPE = 'cmape'
- RMSE = 'rmse'
- CV = 'coefficient_of_variation'
- BIAS = 'bias'
- SRMSE = 'srmse'
- class abacusai.api_class.ForecastingFrequency
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- HOURLY = '1H'
- DAILY = '1D'
- WEEKLY_SUNDAY_START = '1W'
- WEEKLY_MONDAY_START = 'W-MON'
- WEEKLY_SATURDAY_START = 'W-SAT'
- MONTH_START = 'MS'
- MONTH_END = '1M'
- QUARTER_START = 'QS'
- QUARTER_END = '1Q'
- YEARLY = '1Y'
- class abacusai.api_class.ForecastingDataSplitType
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- AUTO = 'Automatic Time Based'
- TIMESTAMP = 'Timestamp Based'
- ITEM = 'Item Based'
- PREDICTION_LENGTH = 'Force Prediction Length'
- class abacusai.api_class.ForecastingLossFunction
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- CUSTOM = 'Custom'
- MEAN_ABSOLUTE_ERROR = 'mae'
- NORMALIZED_MEAN_ABSOLUTE_ERROR = 'nmae'
- PEAKS_MEAN_ABSOLUTE_ERROR = 'peaks_mae'
- MEAN_ABSOLUTE_PERCENTAGE_ERROR = 'stable_mape'
- POINTWISE_ACCURACY = 'accuracy'
- ROOT_MEAN_SQUARE_ERROR = 'rmse'
- NORMALIZED_ROOT_MEAN_SQUARE_ERROR = 'nrmse'
- ASYMMETRIC_MEAN_ABSOLUTE_PERCENTAGE_ERROR = 'asymmetric_mape'
- STABLE_STANDARDIZED_MEAN_ABSOLUTE_PERCENTAGE_ERROR = 'stable_standardized_mape_with_cmape'
- GAUSSIAN = 'mle_gaussian_local'
- GAUSSIAN_FULL_COVARIANCE = 'mle_gaussfullcov'
- GUASSIAN_EXPONENTIAL = 'mle_gaussexp'
- MIX_GAUSSIANS = 'mle_gaussmix'
- WEIBULL = 'mle_weibull'
- NEGATIVE_BINOMIAL = 'mle_negbinom'
- LOG_ROOT_MEAN_SQUARE_ERROR = 'log_rmse'
- class abacusai.api_class.ForecastingLocalScaling
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- ZSCORE = 'zscore'
- SLIDING_ZSCORE = 'sliding_zscore'
- LAST_POINT = 'lastpoint'
- MIN_MAX = 'minmax'
- MIN_STD = 'minstd'
- ROBUST = 'robust'
- ITEM = 'item'
- class abacusai.api_class.ForecastingFillMethod
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- BACK = 'BACK'
- MIDDLE = 'MIDDLE'
- FUTURE = 'FUTURE'
- class abacusai.api_class.ForecastingQuanitlesExtensionMethod
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- DIRECT = 'direct'
- QUADRATIC = 'quadratic'
- ANCESTRAL_SIMULATION = 'simulation'
- class abacusai.api_class.NERObjective
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- LOG_LOSS = 'log_loss'
- AUC = 'auc'
- PRECISION = 'precision'
- RECALL = 'recall'
- ANNOTATIONS_PRECISION = 'annotations_precision'
- ANNOTATIONS_RECALL = 'annotations_recall'
- class abacusai.api_class.NERModelType
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- PRETRAINED_BERT = 'pretrained_bert'
- PRETRAINED_ROBERTA_27 = 'pretrained_roberta_27'
- PRETRAINED_ROBERTA_43 = 'pretrained_roberta_43'
- PRETRAINED_MULTILINGUAL = 'pretrained_multilingual'
- LEARNED = 'learned'
- class abacusai.api_class.NLPDocumentFormat
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- AUTO = 'auto'
- TEXT = 'text'
- DOC = 'doc'
- TOKENS = 'tokens'
- class abacusai.api_class.SentimentType
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- VALENCE = 'valence'
- EMOTION = 'emotion'
- class abacusai.api_class.ClusteringImputationMethod
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- AUTOMATIC = 'Automatic'
- ZEROS = 'Zeros'
- INTERPOLATE = 'Interpolate'
- class abacusai.api_class.ConnectorType
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- FILE = 'FILE'
- DATABASE = 'DATABASE'
- STREAMING = 'STREAMING'
- APPLICATION = 'APPLICATION'
- class abacusai.api_class.PythonFunctionArgumentType
Bases:
ApiEnum
Generic enumeration.
Derive from this class to define new enumerations.
- FEATURE_GROUP = 'FEATURE_GROUP'
- INTEGER = 'INTEGER'
- STRING = 'STRING'
- BOOLEAN = 'BOOLEAN'
- FLOAT = 'FLOAT'
- JSON = 'JSON'
- LIST = 'LIST'
- DATASET_ID = 'DATASET_ID'
- MODEL_ID = 'MODEL_ID'
- FEATURE_GROUP_ID = 'FEATURE_GROUP_ID'
- MONITOR_ID = 'MONITOR_ID'
- BATCH_PREDICTION_ID = 'BATCH_PREDICTION_ID'
- DEPLOYMENT_ID = 'DEPLOYMENT_ID'
- class abacusai.api_class._ApiClassFactory
Bases:
abc.ABC
Helper class that provides a standard way to create an ABC using inheritance.
- config_abstract_class
- config_class_key
- config_class_map
- class abacusai.api_class.SamplingConfig
Bases:
abacusai.api_class.abstract.ApiClass
An abstract class for the sampling config of a feature group
- __post_init__()
- class abacusai.api_class.NSamplingConfig
Bases:
SamplingConfig
The number of distinct values of the key columns to include in the sample, or number of rows if key columns not specified.
- Parameters:
sampling_method (SamplingMethodType) – N_SAMPLING
sample_count (int) – The number of rows to include in the sample
key_columns (list[str]) – The feature(s) to use as the key(s) when sampling
- sampling_method: abacusai.api_class.enums.SamplingMethodType
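One reading of the NSamplingConfig description is sketched below on plain Python data: with key columns, `sample_count` counts distinct key values and every row carrying a chosen key is kept; without key columns, it counts rows. The function `n_sample`, its `seed` argument and the row layout are assumptions for illustration and are not the library's implementation.

```python
import random
from typing import Dict, List, Optional, Sequence


def n_sample(
    rows: Sequence[Dict[str, object]],
    sample_count: int,
    key_columns: Optional[List[str]] = None,
    seed: int = 0,
) -> List[Dict[str, object]]:
    """Illustrative N-sampling over a list of row dictionaries."""
    rng = random.Random(seed)
    if not key_columns:
        # No key columns: sample_count is a number of rows.
        return rng.sample(list(rows), min(sample_count, len(rows)))
    # With key columns: sample_count is a number of distinct key values,
    # and every row carrying a chosen key is kept.
    distinct_keys = sorted({tuple(row[c] for c in key_columns) for row in rows})
    chosen = set(rng.sample(distinct_keys, min(sample_count, len(distinct_keys))))
    return [row for row in rows if tuple(row[c] for c in key_columns) in chosen]


rows = [{"user": u, "item": i} for u in ("a", "b", "c") for i in range(3)]
by_key = n_sample(rows, sample_count=2, key_columns=["user"])
by_row = n_sample(rows, sample_count=2)
```

In this toy data, `by_key` keeps all rows for two of the three users, while `by_row` keeps two individual rows.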
- class abacusai.api_class.PercentSamplingConfig
Bases:
SamplingConfig
The fraction of distinct values of the feature group to include in the sample.
- Parameters:
sampling_method (SamplingMethodType) – PERCENT_SAMPLING
sample_percent (float) – The percentage of the rows to sample
key_columns (list[str]) – The feature(s) to use as the key(s) when sampling
- sampling_method: abacusai.api_class.enums.SamplingMethodType
- class abacusai.api_class._SamplingConfigFactory
Bases:
abacusai.api_class.abstract._ApiClassFactory
Helper class that provides a standard way to create an ABC using inheritance.
- config_class_key = 'sampling_method'
- config_class_map
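The pairing of `config_class_key = 'sampling_method'` with a `config_class_map` suggests dispatch on a discriminator field. The sketch below shows that pattern with stand-in dataclasses. The class names, the map contents and the `from_dict` helper are all hypothetical; the real factory's methods are not shown in this listing.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Type


@dataclass
class NSamplingSketch:
    sample_count: int
    key_columns: Optional[List[str]] = None
    sampling_method: str = "N_SAMPLING"


@dataclass
class PercentSamplingSketch:
    sample_percent: float
    key_columns: Optional[List[str]] = None
    sampling_method: str = "PERCENT_SAMPLING"


class SamplingFactorySketch:
    # Mirrors the two attributes listed above; the map contents are assumed.
    config_class_key = "sampling_method"
    config_class_map: Dict[str, Type] = {
        "N_SAMPLING": NSamplingSketch,
        "PERCENT_SAMPLING": PercentSamplingSketch,
    }

    @classmethod
    def from_dict(cls, config: dict):
        # Hypothetical helper: pick the class named by the discriminator key.
        method = config[cls.config_class_key]
        config_class = cls.config_class_map.get(method)
        if config_class is None:
            raise ValueError(f"Unknown {cls.config_class_key}: {method!r}")
        return config_class(**config)


built = SamplingFactorySketch.from_dict(
    {"sampling_method": "PERCENT_SAMPLING", "sample_percent": 0.1}
)
```

Keeping the discriminator key and the map on the factory means a new config type only needs one more map entry.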
- class abacusai.api_class.TrainingConfig
Bases:
abacusai.api_class.abstract.ApiClass
Helper class that provides a standard way to create an ABC using inheritance.
- problem_type: abacusai.api_class.enums.ProblemType
- class abacusai.api_class.PersonalizationTrainingConfig
Bases:
TrainingConfig
Training config for the PERSONALIZATION problem type
- Parameters:
problem_type (ProblemType) – PERSONALIZATION
objective (PersonalizationObjective) – Ranking scheme used to select final best model.
sort_objective (PersonalizationObjective) – Ranking scheme used to sort models on the metrics page.
training_mode (PersonalizationTrainingMode) – Whether to train in production or experimental mode.
target_action_types (List[str]) – List of action types to use as targets for training.
target_action_weights (Dict[str, float]) – Dictionary of action types to weights for training.
session_event_types (List[str]) – List of event types to treat as occurrences of sessions.
test_split (int) – Percent of dataset to use for test data. The supported range is 6% to 20% of the dataset.
recent_days_for_training (int) – Limit training data to a certain latest number of days.
training_start_date (datetime) – Only consider training interaction data after this date. Specified in the timezone of the dataset.
test_on_user_split (bool) – Use user splits instead of time splits when validating and testing the model.
test_split_on_last_k_items (bool) – Use last k items instead of global timestamp splits when validating and testing the model.
test_last_items_length (int) – Number of items to leave out for each user when using leave k out folds.
test_window_length_hours (int) – Duration (in hours) of most recent time window to use when validating and testing the model.
explicit_time_split (bool) – Sets an explicit time-based test boundary.
test_row_indicator (str) – Column indicating which rows to use for training (TRAIN), validation (VAL) and testing (TEST).
full_data_retraining (bool) – Train models separately with all the data.
sequential_training (bool) – Train a model sequentially through time.
data_split_feature_group_table_name (str) – Specify the table name of the feature group to export training data with the fold column.
dropout_rate (int) – Dropout rate for neural network.
batch_size (BatchSize) – Batch size for neural network.
disable_transformer (bool) – Disable training the transformer algorithm.
disable_gpu (bool) – Disable training on GPU.
filter_history (bool) – Do not recommend items the user has already interacted with.
explore_lookback_hours (int) – Number of hours since creation time that an item is eligible for explore fraction.
max_history_length (int) – Maximum length of user-item history to include user in training examples.
compute_rerank_metrics (bool) – Compute metrics based on rerank results.
item_id_dropout (float) – Fraction of item_id values to randomly drop out during training.
add_time_features (bool) – Include interaction time as a feature.
disable_timestamp_scalar_features (bool) – Exclude timestamp scalar features.
compute_session_metrics (bool) – Evaluate models based on how well they are able to predict the next session of interactions.
max_user_history_len_percentile (int) – Filter out users with history length above this percentile.
downsample_item_popularity_percentile (float) – Downsample items more popular than this percentile.
- problem_type: abacusai.api_class.enums.ProblemType
- sort_objective: abacusai.api_class.enums.PersonalizationObjective
- training_mode: abacusai.api_class.enums.PersonalizationTrainingMode
- training_start_date: datetime.datetime
- batch_size: abacusai.api_class.enums.BatchSize
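The `test_split_on_last_k_items` and `test_last_items_length` parameters refer to leave-k-out folds. The sketch below shows the generic technique on plain tuples: each user's k most recent interactions are held out for testing. The `Interaction` layout and the `leave_last_k_out` helper are assumptions for illustration; how the service builds its folds is not specified here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Interaction = Tuple[str, str, int]  # (user_id, item_id, timestamp)


def leave_last_k_out(
    interactions: List[Interaction], k: int
) -> Tuple[List[Interaction], List[Interaction]]:
    """Hold out each user's k most recent interactions for testing."""
    per_user: Dict[str, List[Interaction]] = defaultdict(list)
    for interaction in interactions:
        per_user[interaction[0]].append(interaction)

    train: List[Interaction] = []
    test: List[Interaction] = []
    for history in per_user.values():
        history.sort(key=lambda row: row[2])  # oldest first
        split_at = max(len(history) - k, 0)
        train.extend(history[:split_at])
        test.extend(history[split_at:])
    return train, test


events = [("u1", "a", 1), ("u1", "b", 2), ("u1", "c", 3), ("u2", "a", 5), ("u2", "d", 4)]
train_rows, test_rows = leave_last_k_out(events, k=1)
```

Unlike a single global timestamp cut, this split gives every user at least one test interaction, which is the usual motivation for per-user holdouts.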
- class abacusai.api_class.RegressionTrainingConfig
Bases:
TrainingConfig
Training config for the PREDICTIVE_MODELING problem type
- Parameters:
problem_type (ProblemType) – PREDICTIVE_MODELING
objective (RegressionObjective) – Ranking scheme used to select final best model.
sort_objective (RegressionObjective) – Ranking scheme used to sort models on the metrics page.
tree_hpo_mode (RegressionTreeHPOMode) – Turning off Rapid Experimentation will take longer to train.
type_of_split (RegressionTypeOfSplit) – Type of data splitting into train/test (validation also).
test_split (int) – Percent of dataset to use for test data. The supported range is 5% to 20% of the dataset.
disable_test_val_fold (bool) – Do not create a TEST_VAL set. All records that would otherwise be part of the TEST_VAL fold remain in the TEST fold.
k_fold_cross_validation (bool) – Use this to force k-fold cross validation bagging on or off.
num_cv_folds (int) – Specify the value of k in k-fold cross validation.
timestamp_based_splitting_column (str) – Timestamp column selected for splitting into test and train.
timestamp_based_splitting_method (RegressionTimeSplitMethod) – Method of selecting TEST set, top percentile wise or after a given timestamp.
test_splitting_timestamp (str) – Rows with timestamp greater than this will be considered to be in the test set.
sampling_unit_keys (List[str]) – Constrain train/test separation to partition a column.
test_row_indicator (str) – Column indicating which rows to use for training (TRAIN) and testing (TEST). Validation (VAL) can also be specified.
rebalance_classes – Class weights are computed as the inverse of the class frequency from the training dataset when this option is selected as “Yes”. It is useful when the classes in the dataset are unbalanced. Re-balancing classes generally boosts recall at the cost of precision on rare classes.
rare_class_augmentation_threshold (float) – Augments any rare class whose relative frequency with respect to the most frequent class is less than this threshold.
augmentation_strategy (RegressionAugmentationStrategy) – Strategy to deal with class imbalance and data augmentation.
training_rows_downsample_ratio (float) – Uses this ratio to train on a sample of the dataset provided.
active_labels_column (str) – Specify a column to use as the active columns in a multi label setting.
min_categorical_count (int) – Minimum threshold to consider a value different from the unknown placeholder.
sample_weight (str) – Specify a column to use as the weight of a sample for training and eval.
numeric_clipping_percentile (float) – Uses this option to clip the top and bottom x percentile of numeric feature columns where x is the value of this option.
target_transform (RegressionTargetTransform) – Specify a transform (e.g. log, quantile) to apply to the target variable.
ignore_datetime_features (bool) – Remove all datetime features from the model. Useful while generalizing to different time periods.
max_text_words (int) – Maximum number of words to use from text fields.
perform_feature_selection (bool) – If enabled, additional algorithms which support feature selection as a pretraining step will be trained separately with the selected subset of features. The details about their selected features can be found in their respective logs.
feature_selection_intensity (int) – This determines the strictness with which features will be filtered out. 1 being very lenient (more features kept), 100 being very strict.
batch_size (BatchSize) – Batch size.
dropout_rate (int) – Dropout percentage rate.
pretrained_model_name (str) – Enable algorithms which process text using pretrained multilingual NLP models.
is_multilingual (bool) – Enable algorithms which process text using pretrained multilingual NLP models.
loss_function (RegressionLossFunction) – Loss function to be used as objective for model training.
loss_parameters (str) – Loss function params in format <key>=<value>;<key>=<value>;…
target_encode_categoricals (bool) – Use this to turn target encoding on categorical features on or off.
drop_original_categoricals (bool) – This option chooses whether to also feed the original label encoded categorical columns to the models along with their target encoded versions.
data_split_feature_group_table_name (str) – Specify the table name of the feature group to export training data with the fold column.
custom_loss_functions (list[str]) – Registered custom losses available for selection.
custom_metrics (list[str]) – Registered custom metrics available for selection.
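The rebalance_classes description for this config states that class weights are the inverse of the class frequency in the training data. A minimal sketch of that computation follows. The helper name `inverse_frequency_weights` is hypothetical, and whether the service applies any further normalization to the weights is not stated in this reference.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def inverse_frequency_weights(labels: Iterable[Hashable]) -> Dict[Hashable, float]:
    """Weight each class by the inverse of its relative frequency."""
    counts = Counter(labels)
    total = sum(counts.values())
    # A class seen in a fraction p of the rows gets weight 1 / p.
    return {label: total / count for label, count in counts.items()}


weights = inverse_frequency_weights(["no"] * 8 + ["yes"] * 2)
```

With 8 "no" rows and 2 "yes" rows, the rare class receives four times the weight of the common one, which is why re-balancing tends to raise recall on rare classes at some cost in precision.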
- problem_type: abacusai.api_class.enums.ProblemType
- sort_objective: abacusai.api_class.enums.RegressionObjective
- tree_hpo_mode: abacusai.api_class.enums.RegressionTreeHPOMode
- type_of_split: abacusai.api_class.enums.RegressionTypeOfSplit
- timestamp_based_splitting_method: abacusai.api_class.enums.RegressionTimeSplitMethod
- augmentation_strategy: abacusai.api_class.enums.RegressionAugmentationStrategy
- target_transform: abacusai.api_class.enums.RegressionTargetTransform
- batch_size: abacusai.api_class.enums.BatchSize
- loss_function: abacusai.api_class.enums.RegressionLossFunction
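The `loss_parameters` string uses a `<key>=<value>;<key>=<value>` layout. The helpers below sketch building and reading such a string. The function names and the parameter names `alpha` and `gamma` are hypothetical placeholders, not parameters confirmed by this reference, and any escaping rules for `;` or `=` inside values are not covered.

```python
from typing import Dict


def format_loss_parameters(params: Dict[str, object]) -> str:
    """Join key/value pairs as <key>=<value>;<key>=<value>."""
    return ";".join(f"{key}={value}" for key, value in params.items())


def parse_loss_parameters(text: str) -> Dict[str, str]:
    """Split a <key>=<value>;... string back into a dictionary of strings."""
    pairs = (item.split("=", 1) for item in text.split(";") if item.strip())
    return {key.strip(): value.strip() for key, value in pairs}


# "alpha" and "gamma" are placeholder names used only for this example.
encoded = format_loss_parameters({"alpha": 0.25, "gamma": 2})
decoded = parse_loss_parameters(encoded)
```

Parsed values come back as strings, so callers would convert them to numbers where needed.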
- class abacusai.api_class.ForecastingTrainingConfig
Bases:
TrainingConfig
Training config for the FORECASTING problem type
- Parameters:
problem_type (ProblemType) – FORECASTING
prediction_length (int) – How many timesteps in the future to predict.
objective (ForecastingObjective) – Ranking scheme used to select final best model.
sort_objective (ForecastingObjective) – Ranking scheme used to sort models on the metrics page.
forecast_frequency (ForecastingFrequency) – Forecast frequency.
probability_quantiles (list[float]) – Prediction quantiles.
force_prediction_length (int) – Force length of test window to be the same as prediction length.
filter_items (bool) – Filter items with small history and volume.
enable_feature_selection (bool) – Enable feature selection.
enable_cold_start (bool) – Enable cold start forecasting by training/predicting for zero history items.
enable_multiple_backtests (bool) – Whether to enable multiple backtesting or not.
num_backtesting_windows (int) – Total backtesting windows to use for the training.
backtesting_window_step_size (int) – Use this step size to shift backtesting windows for model training.
full_data_retraining (bool) – Train models separately with all the data.
additional_forecast_keys (List[str]) – List of categoricals in timeseries that can act as multi-identifier.
experimentation_mode (ExperimentationMode) – Selecting Thorough Experimentation will take longer to train.
type_of_split (ForecastingDataSplitType) – Type of data splitting into train/test.
test_by_item (bool) – Partition train/test data by item rather than time if true.
test_start (datetime) – Limit training data to dates before the given test start.
test_split (int) – Percent of dataset to use for test data. The supported range is 5% to 20% of the dataset.
loss_function (ForecastingLossFunction) – Loss function for training neural network.
underprediction_weight (float) – Weight for underpredictions.
disable_networks_without_analytic_quantiles (bool) – Disable neural networks whose quantile functions do not have analytic expressions (e.g., mixture models).
initial_learning_rate (float) – Initial learning rate.
l2_regularization_factor (float) – L2 regularization factor.
dropout_rate (int) – Dropout percentage rate.
recurrent_layers (int) – Number of recurrent layers to stack in network.
recurrent_units (int) – Number of units in each recurrent layer.
convolutional_layers (int) – Number of convolutional layers to stack on top of recurrent layers in network.
convolution_filters (int) – Number of filters in each convolution.
local_scaling_mode (ForecastingLocalScaling) – Options to make NN inputs stationary in high dynamic range datasets.
zero_predictor (bool) – Include subnetwork to classify points where target equals zero.
skip_missing (bool) – Make the RNN ignore missing entries instead of processing them.
batch_size (BatchSize) – Batch size.
batch_renormalization (bool) – Enable batch renormalization between layers.
history_length (int) – While training, how much history to consider.
prediction_step_size (int) – Number of future periods to include in objective for each training sample.
training_point_overlap (float) – Amount of overlap to allow between training samples.
max_scale_context (int) – Maximum context to use for local scaling.
quantiles_extension_method (ForecastingQuanitlesExtensionMethod) – Quantile extension method.
number_of_samples (int) – Number of samples for ancestral simulation.
symmetrize_quantiles (bool) – Force symmetric quantiles (like in Gaussian distribution).
use_log_transforms (bool) – Apply logarithmic transformations to input data.
smooth_history (float) – Smooth (low pass filter) the timeseries.
skip_local_scale_target (bool) – Skip using per training/prediction window target scaling.
timeseries_weight_column (str) – If set, we use the values in this column from timeseries data to assign time dependent item weights during training and evaluation.
item_attributes_weight_column (str) – If set, we use the values in this column from item attributes data to assign weights to items during training and evaluation.
use_timeseries_weights_in_objective (bool) – If True, we include weights from column set as “TIMESERIES WEIGHT COLUMN” in objective functions.
use_item_weights_in_objective (bool) – If True, we include weights from column set as “ITEM ATTRIBUTES WEIGHT COLUMN” in objective functions.
skip_timeseries_weight_scaling (bool) – If True, we will avoid normalizing the weights.
timeseries_loss_weight_column (str) – Use value in this column to weight the loss while training.
use_item_id (bool) – Include a feature to indicate the item being forecast.
use_all_item_totals (bool) – Include as input total target across items.
handle_zeros_as_missing_values (bool) – If True, handle zero values in demand as missing data.
datetime_holiday_calendars (list[HolidayCalendars]) – Holiday calendars to augment training with.
fill_missing_values (list[dict]) – Strategy for filling in missing values.
enable_clustering (bool) – Enable clustering in forecasting.
data_split_feature_group_table_name (str) – Specify the table name of the feature group to export training data with the fold column.
custom_loss_functions (list[str]) – Registered custom losses available for selection.
custom_metrics (list[str]) – Registered custom metrics available for selection.
- problem_type: abacusai.api_class.enums.ProblemType
- sort_objective: abacusai.api_class.enums.ForecastingObjective
- forecast_frequency: abacusai.api_class.enums.ForecastingFrequency
- experimentation_mode: abacusai.api_class.enums.ExperimentationMode
- type_of_split: abacusai.api_class.enums.ForecastingDataSplitType
- test_start: datetime.datetime
- loss_function: abacusai.api_class.enums.ForecastingLossFunction
- local_scaling_mode: abacusai.api_class.enums.ForecastingLocalScaling
- batch_size: abacusai.api_class.enums.BatchSize
- quantiles_extension_method: abacusai.api_class.enums.ForecastingQuanitlesExtensionMethod
- datetime_holiday_calendars: List[abacusai.api_class.enums.HolidayCalendars]
- class abacusai.api_class.NamedEntityExtractionTrainingConfig
Bases:
TrainingConfig
Training config for the NAMED_ENTITY_EXTRACTION problem type
- Parameters:
problem_type (ProblemType) – NAMED_ENTITY_EXTRACTION
objective (NERObjective) – Ranking scheme used to select the final best model.
sort_objective (NERObjective) – Ranking scheme used to sort models on the metrics page.
ner_model_type (NERModelType) – Type of NER model to use.
test_split (int) – Percent of dataset to use for test data. The supported range is 5 (i.e. 5%) to 20 (i.e. 20%) of your dataset.
test_row_indicator (str) – Column indicating which rows to use for training (TRAIN) and testing (TEST).
dropout_rate (float) – Dropout rate for neural network.
batch_size (BatchSize) – Batch size for neural network.
active_labels_column (str) – Entities that have been marked in a particular text.
document_format (NLPDocumentFormat) – Format of the input documents.
include_longformer (bool) – Whether to include the longformer model.
- problem_type: abacusai.api_class.enums.ProblemType
- objective: abacusai.api_class.enums.NERObjective
- sort_objective: abacusai.api_class.enums.NERObjective
- ner_model_type: abacusai.api_class.enums.NERModelType
- batch_size: abacusai.api_class.enums.BatchSize
- document_format: abacusai.api_class.enums.NLPDocumentFormat
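The test_row_indicator parameter names a column whose values mark each row as TRAIN or TEST. The sketch below illustrates that idea on plain dictionaries. It is not abacusai code: the helper `split_by_indicator`, the column name `split`, and the sample rows are all hypothetical, and rejecting any other marker value is an assumption.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]


def split_by_indicator(rows: List[Row], indicator_column: str) -> Tuple[List[Row], List[Row]]:
    """Partition rows into (train, test) using a TRAIN/TEST marker column.

    Hypothetical helper for illustration; not part of abacusai.
    """
    train: List[Row] = []
    test: List[Row] = []
    for row in rows:
        marker = row[indicator_column]
        if marker == "TRAIN":
            train.append(row)
        elif marker == "TEST":
            test.append(row)
        else:
            # Assumption: any other marker value is treated as an error.
            raise ValueError(f"Unexpected marker {marker!r} in {indicator_column!r}")
    return train, test


# Made-up sample rows; "split" plays the role of the indicator column.
rows = [
    {"text": "first document", "split": "TRAIN"},
    {"text": "second document", "split": "TRAIN"},
    {"text": "third document", "split": "TEST"},
]
train_rows, test_rows = split_by_indicator(rows, "split")
print(len(train_rows), len(test_rows))
```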
- class abacusai.api_class.NaturalLanguageSearchTrainingConfig
Bases:
TrainingConfig
Training config for the NATURAL_LANGUAGE_SEARCH problem type
- Parameters:
problem_type (ProblemType) – NATURAL_LANGUAGE_SEARCH
abacus_internal_model (bool) – Use an Abacus.AI LLM to answer questions about your data without using any external APIs.
num_completion_tokens (int) – Default maximum number of tokens for chat answers. Reducing this gives faster, more succinct responses.
larger_embeddings (bool) – Use a higher dimension embedding model.
search_chunk_size (int) – Chunk size for indexing the documents.
chunk_overlap_fraction (float) – Overlap in chunks while indexing the documents.
test_split (int) – Percent of dataset to use for test data. The supported range is 5 (i.e. 5%) to 20 (i.e. 20%) of your dataset.
- problem_type: abacusai.api_class.enums.ProblemType
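To show how a chunk size and an overlap fraction can interact, here is a minimal sketch of overlapping chunking over a token list. It is only an illustration of the general technique: how abacusai actually indexes documents is not described in this reference, and the function `chunk_tokens` and its rounding rule are assumptions.

```python
from typing import List, Sequence


def chunk_tokens(tokens: Sequence[int], chunk_size: int, overlap_fraction: float) -> List[List[int]]:
    """Split tokens into windows of chunk_size that overlap by overlap_fraction.

    Hypothetical helper; the library's real indexing logic may differ.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap_fraction < 1:
        raise ValueError("overlap_fraction must be in [0, 1)")
    # Assumption: the stride is the non-overlapping part of a chunk, at least 1.
    step = max(1, round(chunk_size * (1 - overlap_fraction)))
    chunks: List[List[int]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # The last window already reaches the end of the input.
    return chunks


chunks = chunk_tokens(list(range(10)), chunk_size=4, overlap_fraction=0.5)
for chunk in chunks:
    print(chunk)
```

With a chunk size of 4 and an overlap fraction of 0.5, each window starts 2 tokens after the previous one, so consecutive chunks share half their content.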
- class abacusai.api_class.SentenceBoundaryDetectionTrainingConfig
Bases:
TrainingConfig
Training config for the SENTENCE_BOUNDARY_DETECTION problem type
- Parameters:
problem_type (ProblemType) – SENTENCE_BOUNDARY_DETECTION
test_split (int) – Percent of dataset to use for test data. The supported range is 5 (i.e. 5%) to 20 (i.e. 20%) of your dataset.
dropout_rate (float) – Dropout rate for neural network.
batch_size (BatchSize) – Batch size for neural network.
- problem_type: abacusai.api_class.enums.ProblemType
- batch_size: abacusai.api_class.enums.BatchSize
- class abacusai.api_class.SentimentDetectionTrainingConfig
Bases:
TrainingConfig
Training config for the SENTIMENT_DETECTION problem type
- Parameters:
problem_type (ProblemType) – SENTIMENT_DETECTION
sentiment_type (SentimentType) – Type of sentiment to detect.
test_split (int) – Percent of dataset to use for test data. The supported range is 5 (i.e. 5%) to 20 (i.e. 20%) of your dataset.
dropout_rate (float) – Dropout rate for neural network.
batch_size (BatchSize) – Batch size for neural network.
compute_metrics (bool) – Whether to compute metrics.
- problem_type: abacusai.api_class.enums.ProblemType
- sentiment_type: abacusai.api_class.enums.SentimentType
- batch_size: abacusai.api_class.enums.BatchSize
- class abacusai.api_class.DocumentClassificationTrainingConfig
Bases:
TrainingConfig
Training config for the DOCUMENT_CLASSIFICATION problem type
- Parameters:
problem_type (ProblemType) – DOCUMENT_CLASSIFICATION
zero_shot_hypotheses (List[str]) – Zero shot hypotheses. Example text: ‘This text is about pricing’.
test_split (int) – Percent of dataset to use for test data. The supported range is 5 (i.e. 5%) to 20 (i.e. 20%) of your dataset.
dropout_rate (float) – Dropout rate for neural network.
batch_size (BatchSize) – Batch size for neural network.
- problem_type: abacusai.api_class.enums.ProblemType
- batch_size: abacusai.api_class.enums.BatchSize
- class abacusai.api_class.DocumentSummarizationTrainingConfig
Bases:
TrainingConfig
Training config for the DOCUMENT_SUMMARIZATION problem type
- Parameters:
problem_type (ProblemType) – DOCUMENT_SUMMARIZATION
test_split (int) – Percent of dataset to use for test data. The supported range is 5 (i.e. 5%) to 20 (i.e. 20%) of your dataset.
dropout_rate (float) – Dropout rate for neural network.
batch_size (BatchSize) – Batch size for neural network.
- problem_type: abacusai.api_class.enums.ProblemType
- batch_size: abacusai.api_class.enums.BatchSize
- class abacusai.api_class.DocumentVisualizationTrainingConfig
Bases:
TrainingConfig
Training config for the DOCUMENT_VISUALIZATION problem type
- Parameters:
problem_type (ProblemType) – DOCUMENT_VISUALIZATION
test_split (int) – Percent of dataset to use for test data. The supported range is 5 (i.e. 5%) to 20 (i.e. 20%) of your dataset.
dropout_rate (float) – Dropout rate for neural network.
batch_size (BatchSize) – Batch size for neural network.
- problem_type: abacusai.api_class.enums.ProblemType
- batch_size: abacusai.api_class.enums.BatchSize
- class abacusai.api_class.ClusteringTrainingConfig
Bases:
TrainingConfig
Training config for the CLUSTERING problem type
- Parameters:
problem_type (ProblemType) – CLUSTERING
num_clusters_selection (int) – Number of clusters. If None, will be selected automatically.
- problem_type: abacusai.api_class.enums.ProblemType
- class abacusai.api_class.ClusteringTimeseriesTrainingConfig
Bases:
TrainingConfig
Training config for the CLUSTERING_TIMESERIES problem type
- Parameters:
problem_type (ProblemType) – CLUSTERING_TIMESERIES
num_clusters_selection (int) – Number of clusters. If None, will be selected automatically.
imputation (ClusteringImputationMethod) – Imputation method for missing values.
- problem_type: abacusai.api_class.enums.ProblemType
- class abacusai.api_class.CumulativeForecastingTrainingConfig
Bases:
TrainingConfig
Training config for the CUMULATIVE_FORECASTING problem type
- Parameters:
problem_type (ProblemType) – CUMULATIVE_FORECASTING
test_split (int) – Percent of dataset to use for test data. The supported range is 5 (i.e. 5%) to 20 (i.e. 20%) of your dataset.
historical_frequency (str) – Forecast frequency.
cumulative_prediction_lengths (List[int]) – List of cumulative prediction frequencies. Each prediction length must be between 1 and 365.
skip_input_transform (bool) – Avoid doing numeric scaling transformations on the input.
skip_target_transform (bool) – Avoid doing numeric scaling transformations on the target.
predict_residuals (bool) – Predict residuals instead of totals at each prediction step.
- problem_type: abacusai.api_class.enums.ProblemType
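The documented limits for test_split (5 to 20) and cumulative_prediction_lengths (each between 1 and 365) can be checked before a config is submitted. The following is a sketch of such a pre-check, not the library's own validation: `validate_cumulative_config` is a hypothetical helper, and treating both ranges as inclusive is an assumption.

```python
from typing import List


def validate_cumulative_config(test_split: int, cumulative_prediction_lengths: List[int]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue found.

    Hypothetical pre-check; bounds are assumed to be inclusive.
    """
    errors: List[str] = []
    if not 5 <= test_split <= 20:
        errors.append(f"test_split={test_split} is outside the documented range 5 to 20")
    for length in cumulative_prediction_lengths:
        if not 1 <= length <= 365:
            errors.append(f"prediction length {length} is outside the documented range 1 to 365")
    return errors


ok_errors = validate_cumulative_config(10, [7, 30, 365])
bad_errors = validate_cumulative_config(25, [0, 400])
print(ok_errors)
for message in bad_errors:
    print(message)
```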
- class abacusai.api_class.AnomalyDetectionTrainingConfig
Bases:
TrainingConfig
Training config for the ANOMALY_DETECTION problem type
- Parameters:
problem_type (ProblemType) – ANOMALY_DETECTION
test_split (int) – Percent of dataset to use for test data. The supported range is 5 (i.e. 5%) to 20 (i.e. 20%) of your dataset.
value_high (bool) – Detect unusually high values.
mixture_of_gaussians (bool) – Detect unusual combinations of values using mixture of Gaussians.
variational_autoencoder (bool) – Use variational autoencoder for anomaly detection.
spike_up (bool) – Detect outliers with a high value.
spike_down (bool) – Detect outliers with a low value.
trend_change (bool) – Detect changes to the trend.
- problem_type: abacusai.api_class.enums.ProblemType
- class abacusai.api_class.ThemeAnalysisTrainingConfig
Bases:
TrainingConfig
Training config for the THEME_ANALYSIS problem type
- Parameters:
problem_type (ProblemType) – THEME_ANALYSIS
- problem_type: abacusai.api_class.enums.ProblemType
- class abacusai.api_class.AIAgentTrainingConfig
Bases:
TrainingConfig
Training config for the AI_AGENT problem type
- Parameters:
problem_type (ProblemType) – AI_AGENT
description (str) – Description of the agent function.
- problem_type: abacusai.api_class.enums.ProblemType
- class abacusai.api_class.CustomTrainedModelTrainingConfig
Bases:
TrainingConfig
Training config for the CUSTOM_TRAINED_MODEL problem type
- Parameters:
problem_type (ProblemType) – CUSTOM_TRAINED_MODEL
max_catalog_size (int) – Maximum expected catalog size.
max_dimension (int) – Maximum expected dimension of the catalog.
index_output_path (str) – Fully qualified cloud location (GCS, S3, etc) to export snapshots of the embedding to.
docker_image_uri (str) – Docker image URI.
service_port (int) – Service port.
- problem_type: abacusai.api_class.enums.ProblemType
- class abacusai.api_class.CustomAlgorithmTrainingConfig
Bases:
TrainingConfig
Training config for the CUSTOM_ALGORITHM problem type
- Parameters:
problem_type (ProblemType) – CUSTOM_ALGORITHM
train_function_name (str) – The name of the train function.
predict_many_function_name (str) – The name of the predict many function.
training_input_tables (List[str]) – List of tables to use for training.
predict_function_name (str) – Optional name of the predict function if the predict many function is not given.
train_module_name (str) – The name of the train module. Only relevant if the model is being uploaded from a zip file or GitHub repository.
predict_module_name (str) – The name of the predict module. Only relevant if the model is being uploaded from a zip file or GitHub repository.
test_split (int) – Percent of dataset to use for test data. The supported range is 6% to 20% of your dataset.
- problem_type: abacusai.api_class.enums.ProblemType
- class abacusai.api_class.OptimizationTrainingConfig
Bases:
TrainingConfig
Training config for the OPTIMIZATION problem type
- Parameters:
problem_type (ProblemType) – OPTIMIZATION
- problem_type: abacusai.api_class.enums.ProblemType
- class abacusai.api_class._TrainingConfigFactory
Bases:
abacusai.api_class.abstract._ApiClassFactory
Helper class that provides a standard way to create an ABC using inheritance.
- config_abstract_class
- config_class_key = 'problem_type'
- config_class_map
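The three attributes above suggest a lookup pattern: read the value stored under config_class_key and use config_class_map to pick the concrete config class. The sketch below illustrates that pattern with stand-in classes. None of the `Sketch...` classes exist in abacusai, and the `from_dict` method and the string keys in the map are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchTrainingConfig:
    """Hypothetical base class standing in for TrainingConfig."""
    problem_type: str = ""


@dataclass
class SketchClusteringConfig(SketchTrainingConfig):
    problem_type: str = "CLUSTERING"
    num_clusters_selection: Optional[int] = None


@dataclass
class SketchOptimizationConfig(SketchTrainingConfig):
    problem_type: str = "OPTIMIZATION"


class SketchTrainingConfigFactory:
    """Hypothetical factory mirroring the three documented attributes."""
    config_abstract_class = SketchTrainingConfig
    config_class_key = "problem_type"
    config_class_map = {
        "CLUSTERING": SketchClusteringConfig,
        "OPTIMIZATION": SketchOptimizationConfig,
    }

    @classmethod
    def from_dict(cls, config: dict) -> SketchTrainingConfig:
        # Look up the concrete class from the value stored under the key.
        key_value = config.get(cls.config_class_key)
        config_class = cls.config_class_map.get(key_value)
        if config_class is None:
            raise ValueError(f"Unknown {cls.config_class_key}: {key_value!r}")
        return config_class(**config)


cfg = SketchTrainingConfigFactory.from_dict(
    {"problem_type": "CLUSTERING", "num_clusters_selection": 4}
)
print(type(cfg).__name__, cfg.num_clusters_selection)
```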
- class abacusai.api_class.ApiClass
Bases:
abc.ABC
Helper class that provides a standard way to create an ABC using inheritance.
- __post_init__()
- to_dict()
Standardizes converting an ApiClass to a dictionary. Keys of the response dictionary are converted to camel case. This also validates the fields (type, value, etc.) received in the dictionary.
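As an illustration of the camel-case conversion described for to_dict(), here is a minimal sketch built on a plain dataclass. It is not the abacusai implementation: `SketchConfig` and `snake_to_camel` are hypothetical names, dropping unset (None) fields is an assumption, and field validation is left out.

```python
from dataclasses import dataclass, fields
from typing import Optional


def snake_to_camel(name: str) -> str:
    """Convert a snake_case name to camelCase."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


@dataclass
class SketchConfig:
    """Hypothetical stand-in for an ApiClass subclass (not part of abacusai)."""
    test_split: Optional[int] = None
    dropout_rate: Optional[float] = None

    def to_dict(self) -> dict:
        # Assumption: only fields that were set are emitted, with camelCase keys.
        out = {}
        for f in fields(self):
            value = getattr(self, f.name)
            if value is not None:
                out[snake_to_camel(f.name)] = value
        return out


config = SketchConfig(test_split=10, dropout_rate=0.1)
print(config.to_dict())
```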
- class abacusai.api_class.PythonFunctionArgument
Bases:
abacusai.api_class.abstract.ApiClass
A config class for python function arguments
- Parameters:
variable_type (PythonFunctionArgumentType) – The type of the python function argument
name (str) – The name of the python function variable
is_required (bool) – Whether the argument is required
value (Any) – The value of the argument
pipeline_variable (str) – The name of the pipeline variable to use as the value
- variable_type: abacusai.api_class.enums.PythonFunctionArgumentType
- value: Any
- class abacusai.api_class.OutputVariableMapping
Bases:
abacusai.api_class.abstract.ApiClass
A config class for python function arguments
- Parameters:
variable_type (PythonFunctionArgumentType) – The type of the python function argument
name (str) – The name of the python function variable
- variable_type: abacusai.api_class.enums.PythonFunctionArgumentType
- class abacusai.api_class._ApiClassFactory
Bases:
abc.ABC
Helper class that provides a standard way to create an ABC using inheritance.
- config_abstract_class
- config_class_key
- config_class_map
- class abacusai.api_class.FeatureGroupExportConfig
Bases:
abacusai.api_class.abstract.ApiClass
Helper class that provides a standard way to create an ABC using inheritance.
- connector_type: abacusai.api_class.enums.ConnectorType
- class abacusai.api_class.FileConnectorExportConfig
Bases:
FeatureGroupExportConfig
Helper class that provides a standard way to create an ABC using inheritance.
- connector_type: abacusai.api_class.enums.ConnectorType
- to_dict()
Standardizes converting an ApiClass to a dictionary. Keys of the response dictionary are converted to camel case. This also validates the fields (type, value, etc.) received in the dictionary.
- class abacusai.api_class.DatabaseConnectorExportConfig
Bases:
FeatureGroupExportConfig
Helper class that provides a standard way to create an ABC using inheritance.
- connector_type: abacusai.api_class.enums.ConnectorType
- to_dict()
Standardizes converting an ApiClass to a dictionary. Keys of the response dictionary are converted to camel case. This also validates the fields (type, value, etc.) received in the dictionary.
- class abacusai.api_class._FeatureGroupExportConfigFactory
Bases:
abacusai.api_class.abstract._ApiClassFactory
Helper class that provides a standard way to create an ABC using inheritance.
- config_abstract_class
- config_class_key = 'connectorType'
- config_class_map