mdfreader module documentation
Measured Data Format file reader main module
Platform and python version
Runs on Unix and Windows with Python 2.6+ and 3.2+
Author: Aymeric Rateau
Created on Sun Oct 10 12:57:28 2010
Dependencies
- Python >2.6, >3.2 <http://www.python.org>
- Numpy >1.6 <http://numpy.scipy.org>
- Sympy to convert channels with formula
- bitarray to parse data that is not byte aligned
- Matplotlib >1.0 <http://matplotlib.sourceforge.net>
- NetCDF
- h5py for the HDF5 export
- xlwt for the excel export (not available for Python 3)
- openpyxl for the excel 2007 export
- scipy for the Matlab file conversion
- zlib to uncompress data blocks if needed
Attributes
- PythonVersion : float
Python version currently running, needed for compatibility with both Python 2.6+ and 3.2+
mdfreader module
- class mdfreader.mdf(fileName=None, channelList=None, convertAfterRead=True, filterChannelNames=False)
Bases: mdf3reader.mdf3, mdf4reader.mdf4
mdf class
Notes
The mdf class is a nested dict. The channel name is the primary dict key of the mdf class. One level below, each channel includes the following keys:
‘data’ : vector of data (numpy)
‘unit’ : unit (string)
‘master’ : master channel of the channel (time, crank angle, etc.)
‘description’ : description of the channel
‘conversion’ : mdfinfo nested dict for CCBlock. Exists only if the channel is not converted; used by the getChannelData method to convert the data
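As a rough illustration of the layout described above, the sketch below builds a plain dict with the same keys. The channel names, values and units are made up, and plain lists stand in for the numpy vectors stored under ‘data’, so this is not an actual mdf instance.

```python
# Hypothetical stand-in for an mdf instance: a plain nested dict.
# Real 'data' entries are numpy vectors; lists keep the sketch dependency-free.
channels = {
    'time': {
        'data': [0.0, 0.1, 0.2],
        'unit': 's',
        'master': 'time',
        'description': 'time master channel',
    },
    'engineSpeed': {
        'data': [800.0, 820.0, 790.0],
        'unit': 'rpm',
        'master': 'time',
        'description': 'made-up example channel',
    },
}

# The channel name is the primary key; the per-channel fields sit one level below.
for name, fields in channels.items():
    print(name, fields['unit'], 'master:', fields['master'])

# Group channel names by master channel, similar in spirit to masterChannelList.
by_master = {}
for name, fields in channels.items():
    by_master.setdefault(fields['master'], []).append(name)
print(by_master)
```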
Examples
>>> import mdfreader
>>> yop = mdfreader.mdf('NameOfFile')
>>> yop.keys()  # list channels names
>>> yop.masterChannelList  # dict of channels grouped by raster or master channel
>>> yop.plot('channelName')  # or yop.plot({'channel1','channel2'})
>>> yop.resample(0.1)  # or yop.resample(masterChannel='master3')
>>> yop.exportToCSV(sampling=0.01)
>>> yop.exportToNetCDF()
>>> yop.exportToHDF5()
>>> yop.exportToMatlab()
>>> yop.exportToExcel()
>>> yop.exportToXlsx()
>>> yop.convertToPandas()  # converts data groups into pandas dataframes
>>> yop.keepChannels({'channel1','channel2','channel3'})  # drops all the channels except the ones in argument
>>> yop.getChannelData('channelName')  # returns channel numpy array
Attributes
fileName (str): file name
MDFVersionNumber (int): mdf file version number
masterChannelList (dict): represents the data structure: one key per master channel, whose value is the list of its channels. Each key or master channel thus represents a data group sharing the same sampling interval.
multiProc (bool): flag to request multiprocessed channel conversion for better performance. One thread per data group.
file_metadata (dict): file metadata with minimum keys: author, organisation, project, subject, comment, time, date
Methods
read(fileName=None, multiProc=False, channelList=None, convertAfterRead=True, filterChannelNames=False): reads mdf file version 3.x and 4.x
write(fileName=None): writes simple mdf 3.3 file
getChannelData(channelName): returns channel numpy array
convertAllChannel(): converts all channel data according to CCBlock information
getChannelUnit(channelName): returns channel unit
plot(channels): plots channels with Matplotlib
resample(samplingTime=None, masterChannel=None): resamples all data groups
exportToCSV(filename=None, sampling=None): exports mdf data into CSV file
exportToNetCDF(filename=None, sampling=None): exports mdf data into netcdf file
exportToHDF5(filename=None, sampling=None): exports mdf class data structure into hdf5 file
exportToMatlab(filename=None): exports mdf class data structure into Matlab file
exportToExcel(filename=None): exports mdf data into excel 95 to 2003 file
exportToXlsx(filename=None): exports mdf data into excel 2007 and 2010 file
convertToPandas(sampling=None): converts mdf data structure into pandas dataframe(s)
keepChannels(channelList): keeps only the listed channels and removes the other channels
mergeMdf(mdfClass): merges data of 2 mdf classes
- allPlot()
- convertAllChannel()
Converts all channels from raw data to converted data according to CCBlock information. Converted data will take more memory.
- convertToPandas(sampling=None)
Converts mdf data structure into pandas dataframe(s)
Parameters: sampling : float, optional
resampling interval
Notes
One pandas dataframe is created per data group. Not yet adapted for mdf4, as it considers only time master channels.
- exportToCSV(filename=None, sampling=None)
Exports mdf data into CSV file
Parameters: filename : str, optional
file name. If no name defined, it will use original mdf name and path
sampling : float, optional
sampling interval. None by default
Notes
Data saved in a CSV file will be automatically resampled, as it is difficult to save data that do not share the same master channel in this format. Warning: this can be slow for big data, CSV is a text format after all.
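To illustrate why a shared master channel matters for CSV, here is a minimal sketch using only the standard csv module. The data and the helper name to_csv_text are made up for illustration and do not reflect how exportToCSV is implemented; the point is only that a row-per-sample layout needs columns of equal length.

```python
import csv
import io

# Hypothetical, already resampled data: every channel shares the same time base.
resampled = {
    'time': [0.0, 0.5, 1.0],
    'speed': [10.0, 12.0, 11.0],
    'temperature': [80.0, 80.5, 81.0],
}

def to_csv_text(columns):
    """Write equally long columns as CSV text, one row per sample."""
    names = list(columns)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator='\n')
    writer.writerow(names)
    # zip(*...) turns the column lists into rows; this only works because
    # all columns have the same length after resampling.
    writer.writerows(zip(*(columns[name] for name in names)))
    return buffer.getvalue()

csv_text = to_csv_text(resampled)
print(csv_text)
```

Channels with different lengths would be silently truncated by zip here, which is one reason to resample onto a common master channel first.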
- exportToExcel(filename=None)
Exports mdf data into excel 95 to 2003 file
Parameters: filename : str, optional
file name. If no name defined, it will use original mdf name and path
Notes
xlwt is not fast, even for small files; consider other binary formats like HDF5 or Matlab. If there are more than 256 channels, data will be saved over different worksheets. Also, Excel 2003 is becoming rare these days.
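The worksheet splitting mentioned in the notes can be pictured as simple chunking of the channel list. The sketch below is an assumption about the general idea, not the actual exportToExcel code; split_into_sheets and the channel names are hypothetical.

```python
def split_into_sheets(channel_names, max_columns=256):
    """Split a list of channel names into chunks of at most max_columns."""
    return [channel_names[start:start + max_columns]
            for start in range(0, len(channel_names), max_columns)]

names = ['channel%d' % i for i in range(600)]  # made-up channel names
sheets = split_into_sheets(names)
print([len(sheet) for sheet in sheets])  # number of channels per worksheet
```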
- exportToHDF5(filename=None, sampling=None)
Exports mdf class data structure into hdf5 file
Parameters: filename : str, optional
file name. If no name defined, it will use original mdf name and path
sampling : float, optional
sampling interval.
Notes
As many attributes as possible will be stored. The data structure will be similar to the one in the masterChannelList attribute.
- exportToMatlab(filename=None)
Exports mdf data into Matlab file format 5, compressed when possible
Parameters: filename : str, optional
file name. If no name defined, it will use original mdf name and path
Notes
This method will dump all data into a Matlab file, but you will lose the following information:
- unit and description of each channel
- data structure, i.e. which master channel corresponds to which channel. Channels might therefore have different lengths
- exportToNetCDF(filename=None, sampling=None)
Exports mdf data into netcdf file
Parameters: filename : str, optional
file name. If no name defined, it will use original mdf name and path
sampling : float, optional
sampling interval.
- exportToXlsx(filename=None)
Exports mdf data into excel 2007 and 2010 file
Parameters: filename : str, optional
file name. If no name defined, it will use original mdf name and path
Notes
It is recommended to export resampled data for better performance
- getChannelData(channelName)
Returns channel numpy array
Parameters: channelName : str
channel name
Notes
This method is the safest way to get channel data as a numpy array, since the ‘data’ dict key might contain raw data
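The idea of converting on access can be sketched as follows. This is not the library's implementation: the real ‘conversion’ entry is an mdfinfo CCBlock dict whose layout is not shown here, so the ‘factor’ and ‘offset’ keys, the linear rule and the helper get_channel_data are assumptions made purely for illustration.

```python
def get_channel_data(channels, name):
    """Return physical values for a channel, converting raw data if needed."""
    channel = channels[name]
    conversion = channel.get('conversion')
    if conversion is None:
        # No 'conversion' key left: the data was already converted.
        return channel['data']
    # Hypothetical linear rule: physical = raw * factor + offset
    return [raw * conversion['factor'] + conversion['offset']
            for raw in channel['data']]

channels = {
    'converted': {'data': [1.5, 2.5]},
    'rawTemp': {'data': [0, 10, 20],
                'conversion': {'factor': 0.5, 'offset': -40.0}},
}

print(get_channel_data(channels, 'converted'))
print(get_channel_data(channels, 'rawTemp'))
print(channels['rawTemp']['data'])  # the stored raw data is left untouched
```

Reading the ‘data’ key directly would return the raw integers for rawTemp, which is the pitfall the note above warns about.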
- keepChannels(channelList)
Keeps only the listed channels and removes all other channels
Parameters: channelList : list of str
list of channel names
- mergeMdf(mdfClass)
Merges data of 2 mdf classes
Parameters: mdfClass : mdf
mdf class instance to be merged with self
Notes
Both classes must have been resampled; otherwise it is impossible to know which master channels to match. The method creates the union of both channel lists and fills the unknown sections of channels with NaN
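A minimal sketch of the union and NaN filling described in the notes, using plain dicts of lists. It assumes that the data of the second object is appended after the data of the first, which the notes do not state explicitly; merge_channels and the channel names are hypothetical.

```python
import math

def merge_channels(first, second):
    """Concatenate two channel dicts, padding missing channels with NaN."""
    len_first = len(next(iter(first.values())))
    len_second = len(next(iter(second.values())))
    merged = {}
    # Union of both channel lists.
    for name in sorted(set(first) | set(second)):
        head = first.get(name, [math.nan] * len_first)
        tail = second.get(name, [math.nan] * len_second)
        merged[name] = list(head) + list(tail)
    return merged

# Made-up, already resampled data: within each dict all channels have equal length.
first = {'time': [0.0, 0.1], 'speed': [10.0, 11.0]}
second = {'time': [0.2, 0.3, 0.4], 'torque': [5.0, 6.0, 7.0]}
merged = merge_channels(first, second)
print(merged)
```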
- plot(channels)
Plots channels with Matplotlib
Parameters: channels : str or list of str
channel name or list of channel names
Notes
Channel description and unit are displayed in the axis labels when available
- read(fileName=None, multiProc=False, channelList=None, convertAfterRead=True, filterChannelNames=False)
Reads mdf file version 3.x and 4.x
Parameters: fileName : str, optional
file name
multiProc : bool
flag to activate multiprocessing of channel data conversion
channelList : list of str, optional
list of channel names to be read. Using channelList might make reading much slower, but it saves memory, so it can be used to read big files
convertAfterRead : bool, optional
flag to convert channels after reading, True by default. If set to False, all channel data is kept raw and no conversion is applied. If many floats are stored in the file, this can reduce the memory footprint by a factor of 3 to 4. To calculate the values of a channel, you can then use the getChannelData() method
filterChannelNames : bool, optional
flag to strip module names, separated by ‘.’, from long channel names
Notes
If you keep convertAfterRead set to True, you can set the attribute mdf.multiProc to activate channel conversion in multiprocessing. The gain in reading time can be around 30% if the file is big and uses a lot of float channels
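The memory argument behind convertAfterRead=False can be made concrete with the standard array module. The data types and the 0.1 scale factor below are assumptions chosen for illustration; actual files mix many raw types, so the real ratio will vary.

```python
from array import array

samples = 1000
raw = array('H', range(samples))                          # unsigned 16-bit raw values
converted = array('d', (value * 0.1 for value in raw))    # 64-bit floats after conversion

raw_bytes = raw.itemsize * len(raw)
converted_bytes = converted.itemsize * len(converted)
print(raw_bytes, converted_bytes, converted_bytes / raw_bytes)
```

Keeping such a channel raw and converting it only when it is requested avoids holding the larger float copy in memory.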
- resample(samplingTime=None, masterChannel=None)
Resamples all data groups into one data group having the defined sampling interval or sharing the same master channel
Parameters: samplingTime : float, optional
resampling interval, None by default. If None, all data groups are merged into a single data group having the highest sampling rate of all data groups
**or**
masterChannel : str, optional
master channel name to be used for all channels
Notes
1. Resampling is relatively safe for mdf3, as it contains only time series. However, mdf4 can also contain distance, angle, etc. It might not make sense to apply one resampling to several data groups that do not share the same kind of master channel (like time resampling applied to distance or angle data groups). If several kinds of data groups are used, you should rather use pandas to resample
2. Resampling will convert all your channels, so be careful with big files and memory consumption
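To show what bringing data groups onto one master channel involves, the sketch below interpolates a slow group onto the time base of a faster one. Linear interpolation is an assumption here, since the interpolation scheme used by resample is not described above, and the helper interpolate and the data are made up.

```python
def interpolate(times, values, new_times):
    """Linearly interpolate values onto new_times.

    Both times and new_times must be increasing; points outside the
    original range are linearly extrapolated from the nearest segment.
    """
    result = []
    index = 0
    for t in new_times:
        # Advance to the segment [times[index], times[index + 1]] containing t.
        while index < len(times) - 2 and times[index + 1] < t:
            index += 1
        t0, t1 = times[index], times[index + 1]
        v0, v1 = values[index], values[index + 1]
        result.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return result

# Two made-up data groups with different sampling intervals.
fast = {'time': [0.0, 0.5, 1.0, 1.5, 2.0], 'speed': [0.0, 1.0, 2.0, 3.0, 4.0]}
slow = {'time': [0.0, 1.0, 2.0], 'temperature': [20.0, 30.0, 40.0]}

# Use the time base of the fastest group as the common master channel.
common_time = fast['time']
resampled = {
    'time': common_time,
    'speed': fast['speed'],
    'temperature': interpolate(slow['time'], slow['temperature'], common_time),
}
print(resampled['temperature'])
```

The same arithmetic applied to a distance or angle master channel would run without error but produce meaningless values, which is the risk described in note 1.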
- write(fileName=None)
Writes simple mdf 3.3 file
Parameters: fileName : str, optional
Name of file. If no file name is given, the written file is named after the file that was read, with ‘_new’ appended before the extension
Notes
All channels will be converted, so the written file might be bigger than the original file
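The default naming rule can be sketched with os.path.splitext. The helper default_output_name is hypothetical and only mirrors the rule stated above; it is not taken from the library.

```python
import os

def default_output_name(file_name):
    """Append '_new' to the file name, before its extension."""
    root, extension = os.path.splitext(file_name)
    return root + '_new' + extension

print(default_output_name('toto.dat'))
```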
- class mdfreader.mdfinfo(fileName=None, filterChannelNames=False)
Bases: dict
- MDFINFO is a class gathering information from block headers in an MDF (Measured Data Format) file
- Structure is nested dicts. Primary key is the block type, then data group, channel group and channel number. Examples of dicts:
- mdfinfo[‘HDBlock’] header block
- mdfinfo[‘DGBlock’][dataGroup] Data Group block
- mdfinfo[‘CGBlock’][dataGroup][channelGroup] Channel Group block
- mdfinfo[‘CNBlock’][dataGroup][channelGroup][channel] Channel block including text blocks for comment and identifier
- mdfinfo[‘CCBlock’][dataGroup][channelGroup][channel] Channel conversion information
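The nesting listed above can be pictured with a heavily reduced stand-in dict. The block contents are left almost empty, and the ‘signalName’ field is an assumed key used only for illustration; real block dicts carry many more fields.

```python
# Hypothetical, heavily reduced stand-in for an mdfinfo instance.
# The 'signalName' field is an assumed key used only for illustration.
info = {
    'HDBlock': {},
    'DGBlock': {0: {}, 1: {}},
    'CGBlock': {0: {0: {}}, 1: {0: {}}},
    'CNBlock': {
        0: {0: {0: {'signalName': 'time'}, 1: {'signalName': 'speed'}}},
        1: {0: {0: {'signalName': 'time'}, 1: {'signalName': 'torque'}}},
    },
    'CCBlock': {
        0: {0: {0: {}, 1: {}}},
        1: {0: {0: {}, 1: {}}},
    },
}

def channel_names(info):
    """Walk data group -> channel group -> channel and collect the names."""
    names = []
    for data_group in sorted(info['CNBlock']):
        for channel_group in sorted(info['CNBlock'][data_group]):
            channels = info['CNBlock'][data_group][channel_group]
            for channel in sorted(channels):
                names.append(channels[channel]['signalName'])
    return names

print(channel_names(info))
```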
Examples
>>> import mdfreader
>>> FILENAME = 'toto.dat'
>>> yop = mdfreader.mdfinfo(FILENAME)
or, if you are only interested in the list of channels:
>>> yop = mdfreader.mdfinfo()  # creates new instance of mdfinfo class
>>> yop.listChannels(FILENAME)  # returns a simple list of channel names
Attributes
fileName (str): file name
mdfversion (int): mdf file version number
Methods
readinfo(fileName=None, filterChannelNames=False): reads MDF file and extracts its complete structure
listChannels(fileName=None): reads MDF file blocks and returns a list of contained channels
- listChannels(fileName=None)
Reads MDF file blocks and returns a list of contained channels
Parameters: fileName : string
file name
Returns: nameList : list of string
list of channel names
- readinfo(fileName=None, filterChannelNames=False)
Reads MDF file and extracts its complete structure
Parameters: fileName : str, optional
file name. If not input, uses fileName attribute
filterChannelNames : bool, optional
flag to filter long channel names including module names separated by a ‘.’
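As a rough sketch of what filtering module names could look like, the helper below keeps only the last part of a dotted name. Which part of the name is actually kept is not stated above, so this is an assumption, and strip_module_names and the example name are hypothetical.

```python
def strip_module_names(channel_name):
    """Keep only the part of the name after the last '.' separator."""
    return channel_name.split('.')[-1]

print(strip_module_names('ECU.engine.speed'))
print(strip_module_names('speed'))
```

Note that two long names ending in the same short name would collide after such filtering.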