mdfreader module documentation

Measured Data Format file reader main module

Platform and python version

Runs on Unix and Windows with Python 2.6+ and 3.2+

Author: Aymeric Rateau

Created on Sun Oct 10 12:57:28 2010

Dependencies

  • Python >2.6, >3.2 <http://www.python.org>
  • Numpy >1.6 <http://numpy.scipy.org>
  • Sympy to convert channels defined by a formula
  • bitarray to parse data that is not byte aligned
  • Matplotlib >1.0 <http://matplotlib.sourceforge.net>
  • NetCDF
  • h5py for the HDF5 export
  • xlwt for the excel export (not existing for python3)
  • openpyxl for the excel 2007 export
  • scipy for the Matlab file conversion
  • zlib to uncompress data block if needed

Attributes

PythonVersion : float
Python version currently running, needed for compatibility of both python 2.6+ and 3.2+

mdfreader module

class mdfreader.mdf(fileName=None, channelList=None, convertAfterRead=True, filterChannelNames=False)

Bases: mdf3reader.mdf3, mdf4reader.mdf4

mdf class

Notes

The mdf class is a nested dict. The channel name is the primary dict key. One level below, each channel holds the following keys:

  • ‘data’ : containing vector of data (numpy)

  • ‘unit’ : unit (string)

  • ‘master’ : master channel of channel (time, crank angle, etc.)

  • ‘description’ : Description of channel

  • ‘conversion’: mdfinfo nested dict for CCBlock.

    Exists only if the channel has not been converted yet; the getChannelData method uses it to convert the data
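
The layout described above can be pictured with a hand-built dictionary. This is only a sketch: mock_mdf and describe are illustrative names, not part of mdfreader, and plain lists stand in for the numpy vectors that a real mdf instance holds.

```python
# Hand-built stand-in for an mdf instance (assumption: a real one is
# created with mdfreader.mdf(fileName) and stores numpy arrays in 'data').
mock_mdf = {
    "time": {
        "data": [0.0, 0.1, 0.2],
        "unit": "s",
        "master": "time",
        "description": "master time channel",
    },
    "speed": {
        "data": [10.0, 12.5, 15.0],
        "unit": "km/h",
        "master": "time",
        "description": "vehicle speed",
    },
}


def describe(channels, name):
    """Summarise one channel of the nested dict in a single line."""
    channel = channels[name]
    return "{0} [{1}], master '{2}', {3} samples".format(
        name, channel["unit"], channel["master"], len(channel["data"])
    )


print(describe(mock_mdf, "speed"))
```

The channel name selects the inner dict, and the inner keys then give access to the samples and their metadata.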

Examples

>>> import mdfreader
>>> yop=mdfreader.mdf('NameOfFile')
>>> yop.keys() # list channels names
>>> yop.masterChannelList # dict of channels grouped by raster or master channel
>>> yop.plot('channelName') # or yop.plot({'channel1','channel2'})
>>> yop.resample(0.1) # or yop.resample(masterChannel='master3')
>>> yop.exportToCSV(sampling=0.01)
>>> yop.exportToNetCDF()
>>> yop.exportToHDF5()
>>> yop.exportToMatlab()
>>> yop.exportToExcel()
>>> yop.exportToXlsx()
>>> yop.convertToPandas() # converts data groups into pandas dataframes
>>> yop.keepChannels({'channel1','channel2','channel3'}) # drops all channels except those given as argument
>>> yop.getChannelData('channelName') # returns channel numpy array

Attributes

fileName (str) file name
MDFVersionNumber (int) mdf file version number
masterChannelList (dict) Represents the data structure: one key per master channel, whose value is the list of channels sharing that master. Each key therefore represents a data group with a common sampling interval.
multiProc (bool) Flag requesting multiprocessed channel conversion for better performance. One thread per data group.
file_metadata (dict) file metadata with at least the keys: author, organisation, project, subject, comment, time, date
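
The shape of masterChannelList can be sketched by grouping channels on their ‘master’ key. The helper group_by_master and the channel names below are hypothetical, and the sketch assumes that a master channel lists itself as its own master.

```python
def group_by_master(channels):
    """Return {master name: [channel names]} from a channel dict."""
    groups = {}
    for name in sorted(channels):
        master = channels[name]["master"]
        groups.setdefault(master, []).append(name)
    return groups


# Hypothetical channels recorded on two different rasters.
mock_channels = {
    "time_10ms": {"master": "time_10ms"},
    "rpm": {"master": "time_10ms"},
    "time_100ms": {"master": "time_100ms"},
    "coolant_temp": {"master": "time_100ms"},
}

groups = group_by_master(mock_channels)
for master in sorted(groups):
    print(master, groups[master])
```

Each key of the result corresponds to one data group, so the number of keys is the number of distinct sampling rasters in the file.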

Methods

read( fileName = None, multiProc = False, channelList=None, convertAfterRead=True, filterChannelNames=False ) reads mdf file version 3.x and 4.x
write( fileName=None ) writes simple mdf 3.3 file
getChannelData( channelName ) returns channel numpy array
convertAllChannel() converts all channel data according to CCBlock information
getChannelUnit( channelName ) returns channel unit
plot( channels ) Plot channels with Matplotlib
resample( samplingTime = None, masterChannel=None ) Resamples all data groups
exportToCSV( filename = None, sampling = None ) Exports mdf data into CSV file
exportToNetCDF( filename = None, sampling = None ) Exports mdf data into netcdf file
exportToHDF5( filename = None, sampling = None ) Exports mdf class data structure into hdf5 file
exportToMatlab( filename = None ) Exports mdf class data structure into Matlab file
exportToExcel( filename = None ) Exports mdf data into excel 95 to 2003 file
exportToXlsx( filename=None ) Exports mdf data into excel 2007 and 2010 file
convertToPandas( sampling=None ) converts mdf data structure into pandas dataframe(s)
keepChannels( channelList ) keeps only list of channels and removes the other channels
mergeMdf( mdfClass ) Merges data of 2 mdf classes

allPlot()

convertAllChannel()

Converts all channels from raw data to converted data according to CCBlock information. Converted data takes more memory.

convertToPandas(sampling=None)

converts mdf data structure into pandas dataframe(s)

Parameters:

sampling : float, optional

resampling interval

Notes

One pandas dataframe is created per data group. Not yet adapted for mdf4, as it considers only time master channels

exportToCSV(filename=None, sampling=None)

Exports mdf data into CSV file

Parameters:

filename : str, optional

file name. If no name defined, it will use original mdf name and path

sampling : float, optional

sampling interval. None by default

Notes

Data saved in the CSV file is automatically resampled, because it is difficult to store data that does not share the same master channel in this format. Warning: this can be slow for big data, since CSV is a text format.
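
The need for a common master can be seen in a small sketch that writes one column per channel. write_csv and the sample data are hypothetical and use only the standard csv module; this is not how exportToCSV is implemented, only an illustration of why all columns must have the same length.

```python
import csv
import io


def write_csv(channels, names, stream):
    """Write one column per channel; all channels must share one length."""
    lengths = set(len(channels[name]["data"]) for name in names)
    if len(lengths) != 1:
        raise ValueError("channels must be resampled to a common master first")
    writer = csv.writer(stream, lineterminator="\n")
    # Header row: channel name followed by its unit.
    writer.writerow(["{0} [{1}]".format(n, channels[n]["unit"]) for n in names])
    # One row per sample, taken at the same index in every channel.
    for row in zip(*(channels[n]["data"] for n in names)):
        writer.writerow(row)


# Hypothetical channels that already share the same master.
resampled = {
    "time": {"data": [0.0, 0.1, 0.2], "unit": "s"},
    "speed": {"data": [10.0, 12.5, 15.0], "unit": "km/h"},
}

buffer = io.StringIO()
write_csv(resampled, ["time", "speed"], buffer)
print(buffer.getvalue())
```

A row of a CSV file holds one sample of every channel, which only makes sense when all channels were sampled at the same instants.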

exportToExcel(filename=None)

Exports mdf data into excel 95 to 2003 file

Parameters:

filename : str, optional

file name. If no name defined, it will use original mdf name and path

Notes

xlwt is not fast, even for small files; consider binary formats such as HDF5 or Matlab. If there are more than 256 channels, data is saved across several worksheets. Excel 2003 is also becoming rare these days.
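
The split across worksheets can be sketched as simple chunking of the channel list. split_into_sheets is a hypothetical helper; the limit of 256 columns per sheet comes from the note above, while the actual assignment of channels to sheets in exportToExcel may differ.

```python
def split_into_sheets(channel_names, max_columns=256):
    """Cut the channel list into consecutive groups of at most max_columns."""
    return [
        channel_names[start:start + max_columns]
        for start in range(0, len(channel_names), max_columns)
    ]


# 600 hypothetical channels need three worksheets.
names = ["channel{0}".format(index) for index in range(600)]
sheets = split_into_sheets(names)
print([len(sheet) for sheet in sheets])  # [256, 256, 88]
```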

exportToHDF5(filename=None, sampling=None)

Exports mdf class data structure into hdf5 file

Parameters:

filename : str, optional

file name. If no name defined, it will use original mdf name and path

sampling : float, optional

sampling interval.

Notes

As many attributes as possible are stored. The data structure is similar to the one in the masterChannelList attribute

exportToMatlab(filename=None)

Exports mdf data into a Matlab format 5 file, compressed where possible

Parameters:

filename : str, optional

file name. If no name defined, it will use original mdf name and path

Notes

This method dumps all data into the Matlab file, but the following information is lost:

  • unit and description of each channel
  • data structure, i.e. which master channel corresponds to each channel. Channels might therefore have different lengths

exportToNetCDF(filename=None, sampling=None)

Exports mdf data into netcdf file

Parameters:

filename : str, optional

file name. If no name defined, it will use original mdf name and path

sampling : float, optional

sampling interval.

exportToXlsx(filename=None)

Exports mdf data into excel 2007 and 2010 file

Parameters:

filename : str, optional

file name. If no name defined, it will use original mdf name and path

Notes

It is recommended to export resampled data for better performance

getChannelData(channelName)

Return channel numpy array

Parameters:

channelName : str

channel name

Notes

This is the safest way to get channel data as a numpy array, because the ‘data’ dict key might contain raw data
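
The difference between reading ‘data’ directly and going through an accessor can be sketched as follows. This is a simplified stand-in, not the real getChannelData: the layout of the ‘conversion’ dict (factor and offset keys) and the linear rule are assumptions made for illustration, whereas the real CCBlock supports several conversion types.

```python
def get_channel_data(channels, name):
    """Return physical values, converting on the fly if data is still raw."""
    channel = channels[name]
    conversion = channel.get("conversion")
    if conversion is None:
        # No pending conversion: 'data' already holds physical values.
        return list(channel["data"])
    # Hypothetical linear rule: physical = factor * raw + offset
    return [
        conversion["factor"] * raw + conversion["offset"]
        for raw in channel["data"]
    ]


mock = {
    # Still raw: the 'conversion' key is present.
    "temperature": {
        "data": [0, 100, 200],
        "unit": "degC",
        "conversion": {"factor": 0.5, "offset": -40.0},
    },
    # Already converted: no 'conversion' key.
    "speed": {"data": [10.0, 12.5], "unit": "km/h"},
}

print(mock["temperature"]["data"])            # raw values
print(get_channel_data(mock, "temperature"))  # physical values
```

Code that reads mock["temperature"]["data"] directly would silently work with raw counts, which is the pitfall the note above warns about.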

keepChannels(channelList)

keeps only list of channels and removes the other channels

Parameters:

channelList : list of str

list of channel names

mergeMdf(mdfClass)

Merges data of 2 mdf classes

Parameters:

mdfClass : mdf

mdf class instance to be merged with self

Notes

Both classes must have been resampled beforehand; otherwise it is impossible to know which master channels to match. The merge creates the union of both channel lists and fills unknown sections of channels with NaN.
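
The union and NaN filling can be sketched on plain dictionaries of lists. merge_channels is a hypothetical helper, not the mergeMdf implementation; it assumes that every channel within one record has the same length (as after resampling) and that the second record is simply appended after the first.

```python
import math


def merge_channels(first, second):
    """Concatenate two {name: list} records, padding gaps with NaN."""
    nan = float("nan")
    len_first = len(next(iter(first.values()))) if first else 0
    len_second = len(next(iter(second.values()))) if second else 0
    merged = {}
    for name in sorted(set(first) | set(second)):
        head = list(first.get(name, [nan] * len_first))
        tail = list(second.get(name, [nan] * len_second))
        merged[name] = head + tail
    return merged


# Hypothetical records: 'speed' exists only in a, 'rpm' only in b.
a = {"time": [0.0, 0.1], "speed": [1.0, 2.0]}
b = {"time": [0.2, 0.3, 0.4], "rpm": [900.0, 950.0, 1000.0]}

merged = merge_channels(a, b)
print(sorted(merged))
print(merged["speed"])
```

After the merge every channel has the same length again, and NaN marks the sections where a channel was absent from one of the two sources.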

plot(channels)

Plot channels with Matplotlib

Parameters:

channels : str or list of str

channel name or list of channel names

Notes

Channel description and unit are displayed in the axis labels when available

read(fileName=None, multiProc=False, channelList=None, convertAfterRead=True, filterChannelNames=False)

reads mdf file version 3.x and 4.x

Parameters:

fileName : str, optional

file name

multiProc : bool

flag to activate multiprocessing of channel data conversion

channelList : list of str, optional

list of channel names to be read. Using channelList can make reading much slower, but it saves memory, so it can be used to read big files

convertAfterRead : bool, optional

flag to convert channels after reading, True by default. If convertAfterRead is set to False, all channel data is kept raw and no conversion is applied. If many floats are stored in the file, this can reduce the memory footprint by a factor of 3 to 4. To compute the converted values of a channel, use the .getChannelData() method

filterChannelNames : bool, optional

flag to strip module names, separated by ‘.’, from long channel names

Notes

If you keep convertAfterRead set to True, you can set the attribute mdf.multiProc to run channel conversion with multiprocessing. The gain in reading time can be around 30% if the file is big and contains many float channels
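
The memory saving from keeping data raw (convertAfterRead=False) can be illustrated with the standard array module. The sample values and the 0.01 scale factor are hypothetical; the sketch only compares the storage of 16-bit raw samples with the 64-bit floats they would become after conversion.

```python
from array import array

# Raw samples as stored in the file: unsigned 16-bit integers.
raw = array("H", [0, 1024, 65535])

# Converted samples: 64-bit floats (hypothetical scale factor of 0.01).
converted = array("d", (0.01 * value for value in raw))

raw_bytes = raw.itemsize * len(raw)
converted_bytes = converted.itemsize * len(converted)
ratio = converted_bytes / float(raw_bytes)
print(raw_bytes, converted_bytes, ratio)
```

With 2-byte raw samples and 8-byte floats the converted data is four times larger, in line with the factor of 3 to 4 quoted above.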

resample(samplingTime=None, masterChannel=None)

Resamples all data groups into one data group having defined sampling interval or sharing same master channel

Parameters:

samplingTime : float, optional

resampling interval, None by default. If None, will merge all datagroups into a unique datagroup having the highest sampling rate from all datagroups

**or**

masterChannel : str, optional

master channel name to be used for all channels

Notes

1. Resampling is relatively safe for mdf3, which contains only time series. However, mdf4 can also contain distance, angle, etc. It might not make sense to apply one resampling to several data groups that do not share the same kind of master channel (for example, time resampling applied to distance or angle data groups). If several kinds of data groups are used, it is better to resample with pandas.

2. Resampling converts all channels, so watch memory consumption with big files.
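
What resampling to a fixed interval means can be sketched with linear interpolation onto a regular grid. make_grid and interpolate are hypothetical helpers in pure Python; the interpolation method used by resample itself is not stated here, so linear interpolation is an assumption.

```python
def make_grid(start, stop, step):
    """Regular grid from start to stop (inclusive) with the given step."""
    count = int((stop - start) / step + 1e-9) + 1
    return [start + index * step for index in range(count)]


def interpolate(x_new, x, y):
    """Linearly interpolate y(x) at the increasing positions x_new.

    Positions outside the range of x take the first or last value of y.
    """
    result = []
    index = 0
    for value in x_new:
        if value <= x[0]:
            result.append(y[0])
        elif value >= x[-1]:
            result.append(y[-1])
        else:
            # Advance to the segment [x[index], x[index + 1]] holding value.
            while x[index + 1] < value:
                index += 1
            weight = (value - x[index]) / (x[index + 1] - x[index])
            result.append(y[index] + weight * (y[index + 1] - y[index]))
    return result


# Hypothetical slow channel sampled every second, resampled to 0.5 s.
slow_time = [0.0, 1.0, 2.0]
slow_data = [0.0, 10.0, 20.0]

grid = make_grid(slow_time[0], slow_time[-1], 0.5)
resampled = interpolate(grid, slow_time, slow_data)
print(grid)
print(resampled)
```

Applying the same grid to every data group yields channels of equal length, which is why a single master channel is enough afterwards.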

write(fileName=None)

Writes simple mdf 3.3 file

Parameters:

fileName : str, optional

Name of the file. If no file name is given, the written file takes the name of the file that was read, with the string ‘_new’ appended before the extension

Notes

All channels will be converted, so size might be bigger than original file
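
The default output name described for fileName can be sketched with os.path.splitext. default_output_name is a hypothetical helper written for illustration, not a function of mdfreader.

```python
import os.path


def default_output_name(file_name, suffix="_new"):
    """Insert the suffix between the base name and the extension."""
    root, extension = os.path.splitext(file_name)
    return root + suffix + extension


print(default_output_name("measurement.dat"))  # measurement_new.dat
```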

class mdfreader.mdfinfo(fileName=None, filterChannelNames=False)

Bases: dict

mdfinfo is a class gathering information from block headers in an MDF (Measured Data Format) file.
The structure is nested dicts. The primary key is the block type, followed by data group, channel group and channel number. Examples of dicts:
  • mdfinfo['HDBlock'] header block
  • mdfinfo['DGBlock'][dataGroup] Data Group block
  • mdfinfo['CGBlock'][dataGroup][channelGroup] Channel Group block
  • mdfinfo['CNBlock'][dataGroup][channelGroup][channel] Channel block including text blocks for comment and identifier
  • mdfinfo['CCBlock'][dataGroup][channelGroup][channel] Channel conversion information
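
The nesting order (block type, data group, channel group, channel) can be walked as in the sketch below. mock_info is hand-built, and the inner keys ‘author’ and ‘name’ are placeholders chosen for illustration; the fields actually stored in each block are not listed here.

```python
# Hand-built stand-in for an mdfinfo instance; inner field names are
# hypothetical placeholders.
mock_info = {
    "HDBlock": {"author": "unknown"},
    "CNBlock": {
        0: {0: {0: {"name": "time"}, 1: {"name": "speed"}}},
        1: {0: {0: {"name": "time"}, 1: {"name": "rpm"}}},
    },
}


def channel_names(info):
    """List (dataGroup, channelGroup, name) for every channel block."""
    names = []
    cn_block = info["CNBlock"]
    for data_group in sorted(cn_block):
        for channel_group in sorted(cn_block[data_group]):
            channels = cn_block[data_group][channel_group]
            for channel in sorted(channels):
                names.append(
                    (data_group, channel_group, channels[channel]["name"])
                )
    return names


for entry in channel_names(mock_info):
    print(entry)
```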

Examples

>>> import mdfreader
>>> FILENAME='toto.dat'
>>> yop=mdfreader.mdfinfo(FILENAME)
>>> # or, if you only need the list of channel names:
>>> yop=mdfreader.mdfinfo() # creates new instance of mdfinfo class
>>> channels=yop.listChannels(FILENAME) # returns a simple list of channel names

Attributes

fileName (str) file name
mdfversion (int) mdf file version number

Methods

readinfo( fileName = None, filterChannelNames=False ) Reads MDF file and extracts its complete structure
listChannels( fileName = None ) Read MDF file blocks and returns a list of contained channels

listChannels(fileName=None)

Read MDF file blocks and returns a list of contained channels

Parameters:

fileName : string

file name

Returns:

nameList : list of string

list of channel names

readinfo(fileName=None, filterChannelNames=False)

Reads MDF file and extracts its complete structure

Parameters:

fileName : str, optional

file name. If not input, uses fileName attribute

filterChannelNames : bool, optional

flag to filter long channel names including module names separated by a ‘.’
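
The effect of filterChannelNames can be sketched as follows. filter_channel_name is a hypothetical helper, and the sketch assumes that only the last dot-separated component of the name is kept; the exact rule applied by readinfo is not spelled out above.

```python
def filter_channel_name(name, separator="."):
    """Keep only the part of the name after the last separator."""
    return name.split(separator)[-1]


# Hypothetical long channel name with module prefixes.
print(filter_channel_name("ECU.Engine.rpm"))  # rpm
print(filter_channel_name("speed"))           # speed
```

Names without a separator are returned unchanged, so the filter is safe to apply to every channel.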