poets.io package
Submodules
poets.io.download module
Provides download functions for FTP/SFTP, HTTP and local data sources.
- poets.io.download.download_ftp(download_path, host, directory, filedate, port=21, username='', password='', dirstruct=None, ffilter='', begin=None, end=None)
Downloads data via FTP.
- download_path : str
- Path where to save the downloaded files.
- host : str
- Link to host.
- directory : str
- Path to data on host.
- filedate : dict
- Dict which points to the date fields in the filename.
- port : int, optional
- Port to host, defaults to 21.
- username : str, optional
- Username for source, defaults to empty str.
- password : str, optional
- Password for source, defaults to empty str.
- dirstruct : list of str, optional
- Folder structure on host, each list element represents a subdirectory.
- ffilter : str, optional
- Used for filtering files on a server, defaults to empty str.
- begin : datetime, optional
- Set either to first date of remote repository or date of last file in local repository.
- end : datetime, optional
- Date until which data should be downloaded.
- bool
- True if data is available, False if not.
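The filedate argument describes where the date fields sit in a file name. As a minimal sketch of how such a mapping could be turned into a date, assume the keys 'YYYY', 'MM' and 'DD' each map to a (start, end) pair of character positions. The key names, the example file name and the helper parse_filedate are illustrative assumptions, not part of poets.

```python
from datetime import datetime


def parse_filedate(filename, filedate):
    """Build a datetime from character positions in a file name.

    `filedate` is assumed to map field names to (start, end) positions,
    e.g. {'YYYY': (5, 9), 'MM': (9, 11), 'DD': (11, 13)}.
    """
    year = int(filename[slice(*filedate['YYYY'])])
    # Fields that are missing from the mapping fall back to 1.
    month = int(filename[slice(*filedate['MM'])]) if 'MM' in filedate else 1
    day = int(filename[slice(*filedate['DD'])]) if 'DD' in filedate else 1
    return datetime(year, month, day)


# Hypothetical file name and mapping, for illustration only.
example_name = "data_20141216.nc"
example_filedate = {'YYYY': (5, 9), 'MM': (9, 11), 'DD': (11, 13)}
parsed = parse_filedate(example_name, example_filedate)
```

Keeping the positions in a dict means a different naming convention only needs a different mapping, not different parsing code.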
- poets.io.download.download_http(download_path, host, directory, filename, filedate, dirstruct, ffilter=None, begin=None, end=datetime.datetime(2014, 12, 16, 11, 1, 9, 659138))
Downloads data via HTTP.
- download_path : str
- Path where to save the downloaded files.
- host : str
- Link to host.
- directory : str
- Path to data on host.
- filename : str
- Structure/convention of the file name.
- filedate : dict
- Dict which points to the date fields in the filename.
- dirstruct : list of str
- Folder structure on host, each list element represents a subdirectory.
- ffilter : str, optional
- Used for filtering files on a server, defaults to None.
- begin : datetime, optional
- Set either to first date of remote repository or date of last file in local repository.
- end : datetime, optional
- Date until which data should be downloaded.
- bool
- True if data is available, False if not.
- poets.io.download.download_local(download_path, directory, filedate, dirstruct=None, ffilter='', begin=datetime.datetime(1900, 1, 1, 0, 0), end=datetime.datetime(2014, 12, 16, 11, 1, 9, 659153))
Downloads data from a local path.
- download_path : str
- Path where to save the downloaded files.
- directory : str
- Path to locally stored data.
- filedate : dict
- Dict which points to the date fields in the filename.
- dirstruct : list of str, optional
- Folder structure in directory, each list element represents a subdirectory.
- ffilter : str, optional
- Used for filtering files, defaults to empty string.
- begin : datetime, optional
- Set either to first date of remote repository or date of last file in local repository, defaults to datetime(1900, 1, 1).
- end : datetime, optional
- Date until which data should be downloaded, defaults to datetime.now().
- bool
- True if data is available, False if not.
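The rendered signature of download_local shows a fixed timestamp as the default for end, while the parameter description says datetime.now(). One plausible reading (an inference from the rendered signature, not something this documentation states) is that the default is evaluated once, when the module is imported. The following sketch shows this general Python behaviour with two hypothetical functions, frozen_default and fresh_default, neither of which is part of poets.

```python
from datetime import datetime


def frozen_default(end=datetime.now()):
    # The default is evaluated once, when the function is defined.
    return end


def fresh_default(end=None):
    # Resolving None inside the body yields the current time on every call.
    if end is None:
        end = datetime.now()
    return end


# Both calls return the same timestamp captured at definition time.
same = frozen_default() == frozen_default()
# An explicit end date sidesteps the question entirely.
explicit = fresh_default(end=datetime(2014, 12, 31))
```

In a long-running process, passing end explicitly avoids depending on when the default was evaluated.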
- poets.io.download.download_sftp(download_path, host, directory, port, username, password, filedate, dirstruct=None, ffilter='', begin=None, end=None)
Downloads data via SFTP.
- download_path : str
- Path where to save the downloaded files.
- host : str
- Link to host.
- directory : str
- Path to data on host.
- port : int
- Port to host.
- username : str
- Username for source.
- password : str
- Password for source.
- filedate : dict
- Dict which points to the date fields in the filename.
- dirstruct : list of str, optional
- Folder structure on host, each list element represents a subdirectory.
- ffilter : str, optional
- Used for filtering files on a server, defaults to empty str.
- begin : datetime, optional
- Set either to first date of remote repository or date of last file in local repository.
- end : datetime, optional
- Date until which data should be downloaded.
- bool
- True if data is available, False if not.
- poets.io.download.filesInDir_ftp(path, ftp, filedate, begin, end, filelist)
List all files in directory and subdirectories on an FTP server.
- path : str
- Path to data on host.
- ftp : ftplib connection
- Connection to server.
- filedate : dict
- Dict which points to the date fields in the filename.
- begin : datetime
- Date from which to download data.
- end : datetime
- Date until which to download data.
- filelist : list of str
- List of filepaths or empty list.
- filelist : list
- List containing all files in the directory and its subdirectories.
- poets.io.download.filesInDir_sftp(path, sftp, filedate, begin, end, filelist)
List all files in directory and subdirectories on an SFTP server.
- path : str
- Path to data on host.
- sftp : paramiko Transport
- Connection to server.
- filedate : dict
- Dict which points to the date fields in the filename.
- begin : datetime
- Date from which to download data.
- end : datetime
- Date until which to download data.
- filelist : list of str
- List of filepaths or empty list.
- filelist : list
- List containing all files in the directory and its subdirectories.
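Both listing helpers walk a directory tree and keep the files whose date lies between begin and end. The same idea for a local directory tree can be sketched as follows. This is not the poets implementation: files_in_dir_local and its date_slice argument (a (start, stop) pair locating a YYYYMMDD stamp in the file name) are hypothetical stand-ins for the filedate dict used above.

```python
import os
from datetime import datetime


def files_in_dir_local(path, date_slice, begin, end):
    """Recursively collect files whose embedded date lies in [begin, end].

    `date_slice` is a hypothetical (start, stop) pair locating a YYYYMMDD
    stamp in the file name; poets itself uses the `filedate` dict instead.
    """
    selected = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            stamp = name[date_slice[0]:date_slice[1]]
            try:
                date = datetime.strptime(stamp, "%Y%m%d")
            except ValueError:
                continue  # skip files without a parsable date
            if begin <= date <= end:
                selected.append(os.path.join(root, name))
    return sorted(selected)
```

Files without a parsable date stamp are skipped rather than treated as errors, which keeps stray files such as READMEs out of the result.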
poets.io.fileformats module
poets.io.source_base module
- class poets.io.source_base.BasicSource(name, filename, filedate, temp_res, rootpath, host, protocol, username=None, password=None, port=22, directory=None, dirstruct=None, begin_date=None, ffilter=None, colorbar='jet', variables=None, nan_value=None, valid_range=None, unit=None, dest_nan_value=-99, dest_regions=None, dest_sp_res=0.25, dest_temp_res='dekad', dest_start_date=datetime.datetime(2000, 1, 1, 0, 0), data_range=None)
Bases: object
Base class for data sources.
- name : str
- Name of the data source.
- filename : str
- Structure/convention of the file name.
- filedate : dict
- Position of date fields in filename, given as tuple.
- temp_res : str
- Temporal resolution of the source.
- rootpath : str
- Root path where all data will be stored.
- host : str
- Link to data host.
- protocol : str
- Protocol for data transfer.
- username : str, optional
- Username for data access.
- password : str, optional
- Password for data access.
- port : int, optional
- Port to data host, defaults to 22.
- directory : str, optional
- Path to data on host.
- dirstruct : list of strings, optional
- Structure of source directory, each list item represents a subdirectory.
- begin_date : datetime, optional
- Date from which data is available.
- variables : string or list of strings, optional
- Variables used from data source, defaults to ['dataset'].
- nan_value : int, float, optional
- NaN value of the original data as given by the data provider.
- valid_range : tuple of int or float, optional
- Valid range of data, given as (minimum, maximum).
- data_range : tuple of int or float, optional
- Range of the values as given in the raw data, as (minimum, maximum). Will be scaled to valid_range.
- ffilter : str, optional
- Pattern that appears in filename. Can be used to filter out unneeded files if multiple files per date are provided.
- colorbar : str, optional
- Colorbar to use, use one from http://matplotlib.org/examples/color/colormaps_reference.html, defaults to jet.
- unit : str, optional
- Unit of dataset for displaying in legend. Does not have to be set if unit is specified in input file metadata. Defaults to None.
- dest_nan_value : int, float, optional
- NaN value in the final NetCDF file.
- dest_regions : list of str, optional
- Regions of interest where data should be resampled to.
- dest_sp_res : int, float, optional
- Spatial resolution of the destination NetCDF file, defaults to 0.25 degree.
- dest_temp_res : string, optional
- Temporal resolution of the destination NetCDF file, possible values: ('day', 'week', 'dekad', 'month'), defaults to 'dekad'.
- dest_start_date : datetime, optional
- Start date of the destination NetCDF file, defaults to 2000-01-01.
- name : str
- Name of the data source.
- filename : str
- Structure/convention of the file name.
- filedate : dict
- Position of date fields in filename, given as tuple.
- temp_res : str
- Temporal resolution of the source.
- host : str
- Link to data host.
- protocol : str
- Protocol for data transfer.
- username : str
- Username for data access.
- password : str
- Password for data access.
- port : int
- Port to data host.
- directory : str
- Path to data on host.
- dirstruct : list of strings
- Structure of source directory, each list item represents a subdirectory.
- begin_date : datetime
- Date from which data is available.
- ffilter : str
- Pattern that appears in filename.
- colorbar : str, optional
- Colorbar to use.
- unit : str
- Unit of dataset for displaying in legend.
- variables : list of strings
- Variables used from data source.
- nan_value : int, float
- Not a number value of the original data as given by the data provider.
- valid_range : tuple of int or float
- Valid range of data, given as (minimum, maximum).
- data_range : tuple of int or float
- Range of the values as given in the raw data, as (minimum, maximum).
- dest_nan_value : int, float, optional
- NaN value in the final NetCDF file.
- tmp_path : str
- Path where temporary files are stored.
- rawdata_path : str
- Path where original files are stored.
- data_path : str
- Path where resampled NetCDF file is stored.
- dest_regions : list of str
- Regions of interest where data is resampled to.
- dest_sp_res : int, float
- Spatial resolution of the destination NetCDF file.
- dest_temp_res : string
- Temporal resolution of the destination NetCDF file.
- dest_start_date : datetime.datetime
- First date of the dataset in the destination NetCDF file.
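The data_range and valid_range entries describe a rescaling from the raw value range to the valid range. A minimal sketch of such a rescaling follows, assuming a plain linear mapping; the scaling poets actually applies may differ, and the helper rescale is hypothetical.

```python
def rescale(value, data_range, valid_range):
    """Linearly map `value` from data_range onto valid_range.

    Assumes a simple linear relation between the two ranges; this is an
    illustration, not the poets implementation.
    """
    d_min, d_max = data_range
    v_min, v_max = valid_range
    return v_min + (value - d_min) * (v_max - v_min) / (d_max - d_min)


# Hypothetical example: raw byte values mapped onto a 0..100 range.
low = rescale(0, (0, 255), (0, 100))
high = rescale(255, (0, 255), (0, 100))
```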
- download(download_path=None, begin=None, end=None)
Downloads data.
- begin : datetime, optional
- Start date of download, defaults to None.
- end : datetime, optional
- End date of download, defaults to None.
- download_and_resample(download_path=None, begin=None, end=None, delete_rawdata=False, shapefile=None)
Downloads and resamples data.
- download_path : str
- Path where to save the downloaded files.
- begin : datetime.date, optional
- Set either to first date of remote repository or date of last file in local repository.
- end : datetime.date, optional
- Set to today if none given.
- delete_rawdata : bool, optional
- Original files will be deleted from rawdata_path if set to True.
- shapefile : str, optional
- Path to shape file, uses “world country admin boundary shapefile” by default.
- get_variables()
Gets all variables given in the NetCDF file.
- variables : list of str
- Variables given in the NetCDF file.
- read_img(date, region=None, variable=None, scaled=True)
Gets the image from the NetCDF file for a certain date.
- date : datetime
- Date of the image.
- region : str, optional
- Region of interest, set to first defined region if not set.
- variable : str, optional
- Variable to display, selects first available variable if None.
- scaled : bool, optional
- If true, data will be scaled to a predefined range; if false, data will be shown as given in rawdata file; defaults to True.
- img : numpy.ndarray
- Image of selected date.
- lon : numpy.array
- Array with longitudes.
- lat : numpy.array
- Array with latitudes.
- metadata : dict
- Dictionary containing metadata of the variable.
- read_ts(location, region=None, variable=None, shapefile=None, scaled=True)
Gets the time series from the NetCDF file for a grid point.
- location : int or tuple of floats
- Either the grid point index as an integer or longitude/latitude given as a tuple.
- region : str, optional
- Region of interest, set to first defined region if not set.
- variable : str, optional
- Variable to display, selects all available variables if None.
- shapefile : str, optional
- Path to custom shapefile.
- scaled : bool, optional
- If true, data will be scaled to a predefined range; if false, data will be shown as given in rawdata file; defaults to True.
- df : pd.DataFrame
- Timeseries for selected variables.
- resample(begin=None, end=None, delete_rawdata=False, shapefile=None, stepwise=True)
Resamples source data to given spatial and temporal resolution.
Writes resampled images into a netCDF data file. Deletes original files if flag delete_rawdata is set True.
- begin : datetime
- Start date of resampling.
- end : datetime
- End date of resampling.
- delete_rawdata : bool
- Original files will be deleted from rawdata_path if set to True.
- shapefile : str, optional
- Path to shape file, uses “world country admin boundary shapefile” by default.
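The default destination resolution, 'dekad', refers to roughly ten-day periods. Under the common definition, each month has three dekads: days 1 to 10, days 11 to 20, and day 21 to the end of the month. The sketch below finds the last day of the dekad that contains a given date. The helper dekad_end is hypothetical, and whether poets labels a dekad by its end date is an assumption.

```python
import calendar
from datetime import date


def dekad_end(d):
    """Return the last day of the dekad that contains `d`.

    Uses the common three-dekads-per-month definition; the third dekad
    runs to the end of the month and therefore varies in length.
    """
    if d.day <= 10:
        return d.replace(day=10)
    if d.day <= 20:
        return d.replace(day=20)
    last_day = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=last_day)


# Illustrative dates only.
mid_december = dekad_end(date(2014, 12, 16))
late_february = dekad_end(date(2014, 2, 25))
```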
poets.io.unpack module
Module for unpacking compressed archives. Based on pyunpack and patool.
- poets.io.unpack.check_compressed(filepath)
Checks if a file is compressed using the file extension.
- filepath : string
- Path to input file.
- boolean
- True if compressed, False if not.
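An extension-based check of this kind can be sketched as follows. The extension set and the helper looks_compressed are hypothetical; the set of extensions poets actually recognises may differ.

```python
import os

# Hypothetical extension list; the set poets actually checks may differ.
COMPRESSED_EXTENSIONS = {'.zip', '.gz', '.bz2', '.tar', '.tgz', '.7z', '.rar'}


def looks_compressed(filepath):
    """Return True if the file extension suggests a compressed archive."""
    # splitext returns only the final extension, so 'x.tar.gz' yields '.gz'.
    extension = os.path.splitext(filepath)[1].lower()
    return extension in COMPRESSED_EXTENSIONS
```

A check based on the extension alone never opens the file, so a renamed or mislabelled file will be misclassified.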
- poets.io.unpack.flatten(outpath)
Flattens directory structure.
- outpath : str
- Directory to flatten.
- OSError :
- If file cannot be moved.
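Flattening a directory means moving every file from its subdirectories into the top-level directory. A minimal sketch using only the standard library follows; flatten_dir is a hypothetical helper, not the poets implementation, and it does not handle files that share a name.

```python
import os
import shutil


def flatten_dir(outpath):
    """Move every file below `outpath` into `outpath`, then remove the
    emptied subdirectories. Name clashes are not handled in this sketch.
    """
    # Walk bottom-up so that nested directories are emptied before
    # their parents are removed.
    for root, _dirs, files in os.walk(outpath, topdown=False):
        if root == outpath:
            continue  # files already at the top level stay in place
        for name in files:
            shutil.move(os.path.join(root, name), os.path.join(outpath, name))
        os.rmdir(root)
```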
- poets.io.unpack.unpack(filepath, outpath=None)
Unpacks compressed archives and files recursively and flattens the output.
- filepath : str
- Path to zipped archive.
- outpath : str, optional
- Path where decompressed files will be stored.
- flatten : bool, optional
- If True, output dir will be flattened.
- IOError :
- If input file does not exist.