Generator
Multiple Aspect Trajectory Tools Framework
MAT-data: Data Preprocessing for Multiple Aspect Trajectory Data Mining
This application supports the classification of multiple aspect trajectories, specifically by extracting and visualizing movelets, the parts of a trajectory that best discriminate a class. It brings together the fragmented approaches available for multiple aspect trajectories, and for multidimensional sequence classification in general, in a single platform offered as both a web-based system and a Python library. It provides both movelet visualization and classification methods.
Created in December 2023. Copyright (C) 2023. License: GPL version 3 or later (see LICENSE file).
@author: Tarlis Portela
- matdata.generator.randomGenerator(N=10, M=50, L=10, C=10, random_seed=1, fileprefix='random', fileposfix='train', attr_desc=None, save_to=False, outformats=['csv'])
Generates trajectories based on random data.
Parameters:
- N : int, optional
Number of trajectories (default 10)
- M : int, optional
Size of trajectories (default 50)
- L : int, optional
Number of attributes (default 10)
- C : int, optional
Number of classes (default 10)
- random_seed : int, optional
Random seed (default 1)
- attr_desc : list of dict, optional
Data type intervals used to generate attributes, given as a list of descriptive dicts or as a list of AttributeGenerator instances. Default: None (uses the default types)
- save_to : str or bool, optional
Destination folder for the output files, or False to skip saving (default False)
- fileprefix : str, optional
Output filename prefix (default ‘random’)
- fileposfix : str, optional
Output filename postfix (default ‘train’)
- outformats : list, optional
Output file formats for saving (default [‘csv’])
Returns:
- pandas.DataFrame
The generated dataset.
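To make the roles of N, M, L and C concrete, the following standard-library sketch builds a dataset of the same shape: N trajectories, M points each, L attributes per point, and labels drawn from C classes. It is not matdata's implementation; the function name `random_trajectories`, the `tid` and `label` column names, and the uniform sampling are all assumptions made for illustration.

```python
import random


def random_trajectories(N=10, M=50, L=10, C=10, random_seed=1):
    """Build N trajectories of M points, each point with L numeric attributes.

    Illustrative only: column names and uniform sampling are assumptions,
    not the behaviour of matdata.generator.randomGenerator.
    """
    rng = random.Random(random_seed)  # local generator keeps runs reproducible
    rows = []
    for tid in range(N):
        label = f"C{tid % C + 1}"  # spread trajectories evenly over C classes
        for _ in range(M):
            row = {"tid": tid, "label": label}
            for a in range(L):
                row[f"a{a + 1}"] = rng.random()  # one value per attribute
            rows.append(row)
    return rows


rows = random_trajectories(N=4, M=5, L=3, C=2)
print(len(rows), sorted({r["label"] for r in rows}))  # 20 ['C1', 'C2']
```

With the library installed, the corresponding documented call would be `randomGenerator(N=4, M=5, L=3, C=2, random_seed=1)`, which returns a pandas.DataFrame rather than a list of dicts.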
- matdata.generator.samplerGenerator(N=10, M=50, C=1, random_seed=1, fileprefix='sample', fileposfix='train', cols_for_sampling=['space', 'time', 'day', 'rating', 'price', 'weather', 'root_type', 'type'], save_to=False, base_data=None, outformats=['csv'])
Generates trajectories based on real data.
Parameters:
- N : int, optional
Number of trajectories (default 10)
- M : int, optional
Size of trajectories, in number of points (default 50)
- C : int, optional
Number of classes (default 1)
- random_seed : int, optional
Random seed (default 1)
- cols_for_sampling : list, optional
Columns to include in the generated dataset. Default: [‘space’, ‘time’, ‘day’, ‘rating’, ‘price’, ‘weather’, ‘root_type’, ‘type’].
- save_to : str or bool, optional
Destination folder for the output files, or False to skip saving (default False)
- fileprefix : str, optional
Output filename prefix (default ‘sample’)
- fileposfix : str, optional
Output filename postfix (default ‘train’)
- base_data : DataFrame, optional
DataFrame of trajectories to use as a base for sampling data. Default: None (uses example data)
- outformats : list, optional
Output file formats for saving (default [‘csv’])
Returns:
- pandas.DataFrame
The generated dataset.
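The idea of sampling from base data can be sketched with the standard library alone: each generated point copies the selected columns from a randomly chosen base row. This is a rough illustration, not the library's sampling strategy; the function name `sample_trajectories`, the `tid` and `label` columns, and the toy base rows are hypothetical.

```python
import random


def sample_trajectories(base_rows, N=10, M=50, C=1, random_seed=1,
                        cols_for_sampling=("space", "time")):
    """Build N trajectories of M points by sampling rows from base_rows.

    Illustrative only: drawing rows uniformly with replacement is an
    assumption, not the behaviour of matdata.generator.samplerGenerator.
    """
    rng = random.Random(random_seed)
    rows = []
    for tid in range(N):
        label = f"C{tid % C + 1}"  # assign classes in round-robin order
        for _ in range(M):
            source = rng.choice(base_rows)  # one observed point, with replacement
            row = {"tid": tid, "label": label}
            row.update({col: source[col] for col in cols_for_sampling})
            rows.append(row)
    return rows


# Made-up base data standing in for a real trajectory table.
base = [
    {"space": "0 0", "time": 480, "weather": "Clear"},
    {"space": "1 2", "time": 615, "weather": "Rain"},
    {"space": "3 1", "time": 900, "weather": "Clouds"},
]
sampled = sample_trajectories(base, N=2, M=4, cols_for_sampling=("space", "weather"))
print(len(sampled))  # 8
```

Only the columns named in `cols_for_sampling` reach the output, which mirrors the documented role of that parameter.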
- matdata.generator.scalerRandomGenerator(Ns=[100, 10], Ms=[10, 10], Ls=[8, 10], Cs=[2, 10], random_seed=1, fileprefix='scalability', fileposfix='train', attr_desc=None, save_to=None, save_desc_files=True, outformats=['csv'])
Generates trajectory datasets based on random data.
Parameters:
- Ns : list of int, optional
Parameters to scale the number of trajectories. List of 2 values: starting number, number of elements (default [100, 10])
- Ms : list of int, optional
Parameters to scale the size of trajectories. List of 2 values: starting number, number of elements (default [10, 10])
- Ls : list of int, optional
Parameters to scale the number of attributes (* doubles the columns). List of 2 values: starting number, number of elements (default [8, 10])
- Cs : list of int, optional
Parameters to scale the number of classes. List of 2 values: starting number, number of elements (default [2, 10])
- random_seed : int, optional
Random seed (default 1)
- attr_desc : list, optional
Data type intervals used to generate attributes, given as a list of descriptive dicts. Default: None (uses the default types)
- save_to : str or bool, optional
Destination folder for the output files, or False to skip saving (default None)
- fileprefix : str, optional
Output filename prefix (default ‘scalability’)
- fileposfix : str, optional
Output filename postfix (default ‘train’)
- save_desc_files : bool, optional
True to save the .json description files, False otherwise (default True)
- outformats : list, optional
Output file formats for saving (default [‘csv’])
Returns:
None
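Each scale parameter above is a pair: a starting number and a number of elements. This page does not state how successive values are derived from the starting number, so the sketch below takes the growth rule as an explicit, hypothetical `factor` argument instead of guessing the library's rule. The helper name `expand_scale` is likewise an assumption.

```python
def expand_scale(start, count, factor):
    """Return `count` values beginning at `start`, each `factor` times the previous.

    Hypothetical helper: the progression actually used by the scaler
    generators is not documented on this page.
    """
    return [start * factor ** i for i in range(count)]


# Reading a pair such as Ns=[100, 3] with an assumed factor of 2.
print(expand_scale(100, 3, 2))  # [100, 200, 400]
```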
- matdata.generator.scalerSamplerGenerator(Ns=[100, 10], Ms=[10, 10], Ls=[8, 10], Cs=[2, 10], random_seed=1, fileprefix='scalability', fileposfix='train', cols_for_sampling=['space', 'time', 'day', 'rating', 'price', 'weather', 'root_type', 'type'], save_to=None, base_data=None, save_desc_files=True, outformats=['csv'])
Generates trajectory datasets based on real data.
Parameters:
- Ns : list of int, optional
Parameters to scale the number of trajectories. List of 2 values: starting number, number of elements (default [100, 10])
- Ms : list of int, optional
Parameters to scale the size of trajectories. List of 2 values: starting number, number of elements (default [10, 10])
- Ls : list of int, optional
Parameters to scale the number of attributes (* doubles the columns). List of 2 values: starting number, number of elements (default [8, 10])
- Cs : list of int, optional
Parameters to scale the number of classes. List of 2 values: starting number, number of elements (default [2, 10])
- random_seed : int, optional
Random seed (default 1)
- fileprefix : str, optional
Output filename prefix (default ‘scalability’)
- fileposfix : str, optional
Output filename postfix (default ‘train’)
- cols_for_sampling : list or dict, optional
Columns to include in the generated dataset. Default: [‘space’, ‘time’, ‘day’, ‘rating’, ‘price’, ‘weather’, ‘root_type’, ‘type’]. A dictionary in the format {‘aspectName’: ‘type’, ‘aspectName’: ‘type’} may be given instead; it is used when base_data is provided and the output is saved as .MAT.
- save_to : str or bool, optional
Destination folder for the output files, or False to skip saving (default None)
- base_data : DataFrame, optional
DataFrame of trajectories to use as a base for sampling data. Default: None (uses example data)
- save_desc_files : bool, optional
True to save the .json description files, False otherwise (default True)
- outformats : list, optional
Output file formats for saving (default [‘csv’])
Returns:
None