Metadata-Version: 2.4
Name: dewesoft-dxd-converter
Version: 0.1.2
Summary: Pure Python library to convert Dewesoft .dxd and .dxz files without vendor libraries.
Project-URL: Homepage, https://github.com/pavleb/dewesoft_dxd_converter
Project-URL: Issues, https://github.com/pavleb/dewesoft_dxd_converter/issues
Author: Pavle Boškoski
License-File: LICENCE
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.13
Requires-Dist: hatch>=1.16.5
Requires-Dist: numpy>=2.4.4
Requires-Dist: tqdm>=4.67.3
Description-Content-Type: text/markdown

# Dewesoft DXD Converter (Pure Python)

Convert **Dewesoft .dxd and .dxz** measurement files to **NumPy/CSV** using pure Python, with **no vendor libraries required**. The converter is open source, works cross-platform, and lets you script large DXD/DXZ exports for offline analysis.

### DXZReader Usage
You can easily parse `.dxz` files or extracted folders using the `DXZReader` class:
```python
import dewesoft_dxd_converter

# Point to an extracted DXZ folder or .dxz file
reader = dewesoft_dxd_converter.DXZReader('file.dxz')

# The measurement setup (sample rates, channels, scaling) is automatically parsed
print(f"Sample Rate: {reader.measurement_setup.sample_rate}")
print(f"Number of Channels: {reader.measurement_setup.num_channels}")

# The events are parsed into a list of DewesoftEventRecord
for event in reader.events:
    print(f"Event ID: {event.event_id}, Type: {event.event_type}, Offset: {event.sample_offset}")

# Extract sample data for a specific channel (e.g., channel 0)
# This will return a NumPy array scaled to engineering units
channel_0_data = reader.get_samples(0)
print(channel_0_data)
```
## Dewesoft EVENTS File Structure

The `EVENTS` file is a binary sidecar file used to synchronize the raw data buckets in `IBDATA` files with real-world timestamps and measurement boundaries (Start/Stop). It utilizes a self-describing property stream encapsulated in record envelopes.

### 1. Global Header
The file begins with a master index of events present in the measurement.

| Offset | Type  | Description |
| :--- | :--- | :--- |
| 0x00 | Int32 | **Total Event Count**: Number of events recorded (usually 2: Start and Stop). |
| 0x04 | Int32 | **First Event ID**: The type ID of the first event (1 = `etStart`). |
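
As a minimal sketch, the two header fields can be read with the standard `struct` module. It assumes both values are little-endian signed 32-bit integers (the byte example in the synchronization section below suggests little endian); the function name `read_events_header` is illustrative and not part of the library.

```python
import struct

def read_events_header(data: bytes) -> tuple[int, int]:
    """Return (total_event_count, first_event_id) from the start of an EVENTS file."""
    # "<ii" = two little-endian signed 32-bit integers at offsets 0x00 and 0x04
    event_count, first_event_id = struct.unpack_from("<ii", data, 0)
    return event_count, first_event_id

# Synthetic header: 2 events, the first one being etStart (1)
header = struct.pack("<ii", 2, 1)
print(read_events_header(header))  # (2, 1)
```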

### 2. Event Record Envelope
Each event is wrapped in a framed structure to ensure parser resiliency.

| Relative Offset | Type | Description |
| :--- | :--- | :--- |
| -4 | Int32 | **Event Type**: Matches `DWEventType` enum (1: Start, 2: Stop). |
| 0  | Byte  | **0x86**: Start-of-Record delimiter. |
| 1-6| String| **"EventS"**: Record signature. |
| 7+ | Body  | Data payload containing property blocks. |

### 3. Data Payload (Position Property)
Inside the record body, data is organized by Property IDs. The most critical for data alignment is Property ID `6` (**Sample Position**).

| Offset | Type | Description |
| :--- | :--- | :--- |
| 0 | Int32 | **Property ID (6)**: Identifies the following block as Position Data. |
| 4 | Int32 | **Bucket Index**: The 1-based index of the 1000-sample block where the event occurred. |
| 8 | Int32 | **Relative Offset**: A signed 32-bit integer (Two's Complement). |
| 12| 2xInt32| **Timestamp**: High-precision time data (often 0 for alignment events). |

### 4. Synchronization Logic
To calculate the exact sample where a measurement ends (the "True End"), the parser combines the Bucket Index and the Relative Offset. Because Dewesoft allocates data in full blocks (e.g., 1000 samples), the `etStop` event uses a negative offset to "trim" the unused padding at the end of the last bucket.

**Formula:**
`True Sample Index = (Bucket Index * Block Size) + Relative Offset`

**Example (Stop Event):**
- Bucket Index: `2`
- Relative Offset: `-952` (represented in binary as `72 252 255 255`)
- Block Size: `1000`
- **Result**: `(2 * 1000) - 952 = 1048` total valid samples.
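
The calculation above can be sketched in a few lines of Python, assuming the Relative Offset is stored as a little-endian signed 32-bit integer, which is what the byte sequence in the example implies. The function name `true_sample_index` is illustrative and not part of the library.

```python
import struct

def true_sample_index(bucket_index: int, offset_bytes: bytes, block_size: int = 1000) -> int:
    """Combine a bucket index with the raw 4-byte relative offset."""
    # "<i" = little-endian signed 32-bit integer (two's complement)
    (relative_offset,) = struct.unpack("<i", offset_bytes)
    return bucket_index * block_size + relative_offset

# Stop event from the example: bytes 72 252 255 255 decode to -952
stop_offset = bytes([72, 252, 255, 255])
print(true_sample_index(2, stop_offset))  # 1048
```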

### 5. Record Termination
| Type | Description |
| :--- | :--- |
| 4 Bytes | **0xFF000000**: Property list terminator. |
| Byte | **0x87**: End-of-Record delimiter. |
| String | **"EventS\0"**: Closing signature and null terminator. |

### DWEventType Enum Reference
| Value | Enum Name | Description |
| :--- | :--- | :--- |
| 1 | `etStart` | Recording start |
| 2 | `etStop` | Recording stop |
| 3 | `etTrigger` | Trigger event |
| 11 | `etVStart` | Video recording start |
| 12 | `etVStop` | Video recording stop |
| 20 | `etKeyboard` | Keyboard input |
| 21 | `etNotice` | System notice |
| 22 | `etVoice` | Voice annotation |
| 23 | `etPicture` | Picture capture |
| 24 | `etModule` | Module event |
| 25 | `etAlarm` | Alarm notification |
| 26 | `etCursorInfo` | Cursor information |
| 27 | `etAlarmLevel` | Alarm level change |
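
For use in scripts, the table can be mirrored as a Python `IntEnum`. This is only a sketch built from the values listed above; the class and the helper `event_name` are illustrative and may not match how the library represents event types internally.

```python
from enum import IntEnum

class DWEventType(IntEnum):
    """Event type IDs as listed in the reference table."""
    etStart = 1
    etStop = 2
    etTrigger = 3
    etVStart = 11
    etVStop = 12
    etKeyboard = 20
    etNotice = 21
    etVoice = 22
    etPicture = 23
    etModule = 24
    etAlarm = 25
    etCursorInfo = 26
    etAlarmLevel = 27

def event_name(value: int) -> str:
    """Return the enum name, or a placeholder for IDs not in the table."""
    try:
        return DWEventType(value).name
    except ValueError:
        return f"unknown({value})"

print(event_name(2))   # etStop
print(event_name(99))  # unknown(99)
```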


## DXZ file format
DXZ is a compressed format used by Dewesoft to store data.
The format is based on ZIP compression.
The files can be easily unzipped using standard unzip tools.
The result is the following set of files:

```bash
'BDATA  0'
'BDATA  1'
BinaryFiles
DATEINFO
DBASDAT2
DBDATA
EVENTS
IBDATA0
IBDATA1
IBDATA2
IBDATA3
IBDATA4
IBDATA5
INFO
INFO_
LASTPIC
LICINFO
MEASINFO
SETUP
SETUP_
SVDATA2
SVINFO
```
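
Since the container is ZIP-based, the same extraction can be scripted with Python's standard `zipfile` module. The sketch below assumes the `.dxz` file opens as a regular ZIP archive; the function name `extract_dxz` and the paths are illustrative.

```python
import zipfile

def extract_dxz(dxz_path: str, out_dir: str) -> list[str]:
    """Extract all members of a .dxz archive and return their names."""
    with zipfile.ZipFile(dxz_path) as archive:
        names = archive.namelist()
        archive.extractall(out_dir)
    return names
```

Calling `extract_dxz('file.dxz', 'extracted')` would then leave files such as `DBDATA` and `SETUP` in the `extracted` folder.
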
Binary data are stored in file ``DBDATA``.
It can be read simply with ``numpy`` as:
```python
import numpy as np
A = np.fromfile('DBDATA',dtype=np.uint8)
```
If the data is too big to fit in RAM, one can read it in parts:
```python
A = np.fromfile('DBDATA', dtype=np.uint8, count=N)
```
where ``N`` is the number of bytes to read.

The next step is to convert the data to the appropriate format.
```python
dt = np.dtype(np.int16)
dt = dt.newbyteorder('<')    
B = np.frombuffer(A,dtype=dt)
```

This gives raw access to the actual DAQ values from the ADC.
If the ADC is 24-bit, the data type should be ``np.int32``.

The next step is proper scaling of the data.
The scaling factors are stored in the ``SETUP`` file in XML format.
```python
import xml.etree.ElementTree as ET
with open('SETUP','r') as f:
    xml_data = f.read()
root = ET.fromstring(xml_data)

sampleRates = [float(sr.text) for sr in root.findall('.//SampleRate')]
print('Sample rates:', sampleRates)
ai_dev = root.findall('.//Device[@Type="AI"]')
for device in ai_dev:
    print('Device Name:', device.find('.//Name').text)
    slots = device.findall('.//Slot')
    for slot in slots:
        used = slot.find('.//Used')
        if used is None:
            continue
        if used.text == 'True':
            name = slot.findall('.//Name')[0].text
            bits = slot.findall('.//BitsLog')[0].text
            scale = slot.findall('.//Scale')[0].text
            rangeMin = slot.findall('.//RangeMin')[0].text
            rangeMax = slot.findall('.//RangeMax')[0].text
            print(' Slot Name:', name, 'Bits:', bits, 'Scale:', scale, 'Range:', rangeMin, '--', rangeMax)
```




## DXD file format
Dewesoft has one of the best data acquisition systems on the market. 
It is used in many industries and applications. The data is stored in a proprietary format called DXD. 
Dewesoft provides free but not open source libraries to read the data.
They are available for Windows and Linux platforms.
Additionally, there are several Python wrappers available to read the data.
It turns out that the format is not that complicated and can be read with a few lines of code.

### Usage
The complete parser is written in [convert.py](convert.py).
The simple usage is shown below:
```python
import dewesoft_dxd_converter
fname = 'sin_freq_9_500000_20190319_073502.dxd'

cc = dewesoft_dxd_converter.DXDReader(fname)
print(f'Number of channels {cc.number_of_channels}')
print(f'Sample rate {cc.sample_rate}')

channel_number = 2
eIn = cc.get_channel_data(channel_number)
print(cc.get_chanel_name(channel_number), cc.sample_rate)

cc.close()
```
The converter depends on ``numpy`` and ``tqdm``.

### Structure of the DXD file
The data are stored in pages.
The format is in little endian.

The first task is to locate the table of contents or, in DXD notation, the ``INDEX``.
It is found within the first few hundred bytes of the file.
An example of the ``INDEX`` is shown below:
```hexdump
00000000  4d 55 4c 54 49 5f 53 54  52 45 41 4d 5f 46 49 4c  |MULTI_STREAM_FIL|
00000010  45 5f 56 45 52 30 32 31  30 36 00 00 00 44 65 77  |E_VER02106...Dew|
00000020  65 73 6f 66 74 5f 44 61  74 61 5f 37 2e 78 5f 5f  |esoft_Data_7.x__|
00000030  5f 5f 64 79 6e 61 6d 69  63 56 45 52 58 33 20 53  |__dynamicVERX3 S|
00000040  50 34 20 28 52 45 4c 45  41 53 45 2d 31 38 30 39  |P4 (RELEASE-1809|
00000050  32 30 29 00 00 00 00 00  00 00 00 00 00 00 00 00  |20).............|
00000060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000070  00 00 00 00 00 00 00 00  00 00 00 00 00 00 7e 00  |..............~.|
00000080  00 00 00 00 00 00 5f 5f  5f 49 4e 44 45 58 00 02  |......___INDEX..|
00000090  00 00 00 00 00 00 00 02  00 00 00 00 00 00 f8 03  |................|
000000a0  00 00 00 00 00 00 00 e0  07 00 00 00 4e 4c 4b 57  |............NLKW|
000000b0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
```
In our case the ``___INDEX`` tag (three underscores in the dump) sits in the row at ``0x80``, starting at ``0x86``. The index itself is located at ``0x0200``: this value comes from the bytes ``00 02 00 ...`` that follow the tag, read as little endian.
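
A small sketch of this lookup, assuming the pointer is an 8-byte little-endian unsigned integer stored directly after the 8-byte tag (inferred from this single dump). The name `find_index_offset` is illustrative.

```python
import struct

INDEX_TAG = b"___INDEX"  # tag exactly as it appears in the dump above

def find_index_offset(header: bytes) -> int:
    """Return the file offset of the index page stored after the tag."""
    pos = header.find(INDEX_TAG)
    if pos < 0:
        raise ValueError("INDEX tag not found")
    # "<Q" = little-endian unsigned 64-bit integer right after the tag
    (offset,) = struct.unpack_from("<Q", header, pos + len(INDEX_TAG))
    return offset

# Rows 0x80 and 0x90 of the dump above
rows = bytes.fromhex(
    "0000000000005f5f5f494e4445580002"
    "0000000000000002000000000000f803"
)
print(hex(find_index_offset(rows)))  # 0x200
```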

### Page structure
Each page has a starting tag ``PAG1``.
An example of the first page and location ``0x0200`` obtained using ``hexdump -C`` is shown below:
```hexdump
00000200  50 41 47 31 00 00 00 00  ff ff ff ff ff ff ff ff  |PAG1............|
00000210  ff ff ff ff ff ff ff ff  ff ff ff ff 00 00 00 00  |................|
```

The data can be stored in chunks that span multiple pages.
One can analyse the page structure in chunks of 8 bytes.
After the tag ``PAG1`` there are 4 bytes that are ``0x00`` in this first page; in the ``SETUP`` page dumps further below they increase by one with every page of the chain (``01`` at ``0x2c00``, ``06`` at ``0xcc00``).
The next 8 bytes are the location of the previous page or ``0xffffffffffffffff`` if it is the first page.
The next 8 bytes are the location of the next page or ``0xffffffffffffffff`` if it is the last page.
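
The 32-byte page header can be unpacked in one step with `struct`. This is a sketch under assumptions: the 4 bytes after the tag are treated as a page counter (a guess based on the dumps in this document), and the last 8 bytes are split into a page type and a data size as described in the binary data section further below. The names `PageHeader` and `parse_page_header` are illustrative.

```python
import struct
from typing import NamedTuple

# tag (4s), page counter (I), previous page (q), next page (q), type (I), size (I)
PAGE_HEADER = struct.Struct("<4sIqqII")  # 32 bytes, little endian

class PageHeader(NamedTuple):
    tag: bytes
    number: int
    prev_page: int   # -1 (all 0xff bytes) when there is no previous page
    next_page: int   # -1 (all 0xff bytes) when there is no next page
    page_type: int
    data_size: int

def parse_page_header(buf: bytes, offset: int = 0) -> PageHeader:
    """Unpack one page header and check its PAG1 tag."""
    header = PageHeader(*PAGE_HEADER.unpack_from(buf, offset))
    if header.tag != b"PAG1":
        raise ValueError(f"no PAG1 tag at offset {offset:#x}")
    return header

# Header of the page at 0x2c00, copied from the SETUP dump further below
raw = bytes.fromhex(
    "5041473101000000000c000000000000"
    "004c0000000000000100000000000000"
)
h = parse_page_header(raw)
print(h.number, hex(h.prev_page), hex(h.next_page))  # 1 0xc00 0x4c00
```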


### Index page
The index page is the first page in the file in our case located at ``0x0200``.
Full dump is shown below:
```hexdump
00000200  50 41 47 31 00 00 00 00  ff ff ff ff ff ff ff ff  |PAG1............|
00000210  ff ff ff ff ff ff ff ff  ff ff ff ff 00 00 00 00  |................|
00000220  16 00 00 00 04 00 00 00  00 00 00 00 4d 45 41 53  |............MEAS|
00000230  49 4e 46 4f 00 0a 00 00  00 00 00 00 00 0a 00 00  |INFO............|
00000240  00 00 00 00 9a 00 00 00  00 00 00 00 00 e0 01 00  |................|
00000250  00 00 32 00 00 00 00 00  00 00 53 45 54 55 50 00  |..2.......SETUP.|
00000260  00 00 00 0c 00 00 00 00  00 00 00 cc 00 00 00 00  |................|
00000270  00 00 68 0f 00 00 06 00  00 00 00 e0 1f 00 00 00  |..h.............|
00000280  60 00 00 00 00 00 00 00  42 49 4e 46 4f 00 00 00  |`.......BINFO...|
00000290  00 ec 00 00 00 00 00 00  00 ec 00 00 00 00 00 00  |................|
000002a0  40 00 00 00 00 00 00 00  00 e0 03 00 00 00 8e 00  |@...............|
000002b0  00 00 00 00 00 00 42 44  41 54 41 00 00 00 00 f0  |......BDATA.....|
000002c0  00 00 00 00 00 00 00 00  01 00 00 00 00 00 40 00  |..............@.|
000002d0  00 00 04 00 00 00 00 e0  03 00 00 00 bc 00 00 00  |................|
000002e0  00 00 00 00 45 56 45 4e  54 53 00 00 00 08 01 00  |....EVENTS......|
000002f0  00 00 00 00 00 08 01 00  00 00 00 00 96 00 00 00  |................|
00000300  00 00 00 00 00 e0 0f 00  00 01 ea 00 00 00 00 00  |................|
00000310  00 00 52 45 47 49 4e 46  4f 00 00 04 01 00 00 00  |..REGINFO.......|
00000320  00 00 00 04 01 00 00 00  00 00 8f 00 00 00 00 00  |................|
00000330  00 00 00 e0 01 00 00 00  18 01 00 00 00 00 00 00  |................|
00000340  44 42 44 41 54 41 00 00  00 18 01 00 00 00 00 00  |DBDATA..........|
```
One can notice that there are text tags that are followed by the location of the data.
At present we are interested in ``DBDATA`` and ``SETUP``.
The tag ``DBDATA`` is the location of the binary data.
The tag ``SETUP`` is the location of the XML data describing the DAQ setup with all correction factors and calibration data.
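
In the dump, each tag appears to occupy a fixed 8-byte field padded with ``0x00`` and followed directly by an 8-byte little-endian address. Under that assumption, which is inferred from this single file, a lookup can be sketched as follows. The name `stream_offset` is illustrative, and the plain substring search is a shortcut rather than a proper entry-by-entry parse.

```python
import struct

def stream_offset(index_page: bytes, tag: bytes) -> int:
    """Return the page address stored after an 8-byte, null-padded tag."""
    field = tag.ljust(8, b"\x00")          # e.g. b"SETUP\x00\x00\x00"
    pos = index_page.find(field)           # naive search, first match wins
    if pos < 0:
        raise KeyError(tag.decode())
    (offset,) = struct.unpack_from("<Q", index_page, pos + 8)
    return offset

# Rows 0x250, 0x260 and 0x340 of the index dump (the last row is not
# adjacent to the first two in the file; it is appended for the example)
excerpt = bytes.fromhex(
    "00003200000000000000534554555000"
    "0000000c00000000000000cc00000000"
    "44424441544100000018010000000000"
)
print(hex(stream_offset(excerpt, b"SETUP")))   # 0xc00
print(hex(stream_offset(excerpt, b"DBDATA")))  # 0x11800
```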

### Setup page
The setup page is the page that contains the XML data.
From the index above, the ``SETUP`` page is located at ``0x0c00``.
The address is stored 3 bytes past the end of the tag text ``SETUP``; in the dump these 3 bytes are ``0x00``.
The ``hexdump`` of the beginning of the ``SETUP`` page is shown below:
```hexdump
00000c00  50 41 47 31 00 00 00 00  ff ff ff ff ff ff ff ff  |PAG1............|
00000c10  00 2c 00 00 00 00 00 00  01 00 00 00 00 00 00 00  |.,..............|
00000c20  3c 3f 78 6d 6c 20 76 65  72 73 69 6f 6e 3d 22 31  |<?xml version="1|
00000c30  2e 30 22 20 65 6e 63 6f  64 69 6e 67 3d 22 55 54  |.0" encoding="UT|
00000c40  46 2d 38 22 3f 3e 0d 0a  3c 44 61 74 61 46 69 6c  |F-8"?>..<DataFil|
00000c50  65 53 65 74 75 70 3e 0d  0a 09 3c 53 79 73 74 65  |eSetup>...<Syste|
```
From the page structure we can see that the next page is located at ``0x2c00`` and that there is no previous page, since the address is ``0xffffffffffffffff``, or -1 in decimal.
The actual data starts with offset of ``0x20``B from the beginning of the page.
In order to extract the complete XML one should parse all of the remaining pages.

So the header of page at ``0x2c00`` is shown below:
```hexdump
00002c00  50 41 47 31 01 00 00 00  00 0c 00 00 00 00 00 00  |PAG1............|
00002c10  00 4c 00 00 00 00 00 00  01 00 00 00 00 00 00 00  |.L..............|
00002c20  6e 67 6c 65 53 65 6e 73  6f 72 3e 54 72 75 65 3c  |ngleSensor>True<|
```
It is clear that the previous page is at ``0x0c00`` and the next page is at ``0x4c00``.
Following all the way to the last page its header is shown below:
```hexdump
0000cc00  50 41 47 31 06 00 00 00  00 ac 00 00 00 00 00 00  |PAG1............|
0000cc10  ff ff ff ff ff ff ff ff  01 00 00 00 00 00 00 00  |................|
0000cc20  61 70 74 69 6f 6e 3e 0d  0a 09 09 09 09 09 09 09  |aption>.........|
```
From the page structure we can see that the previous page is located at ``0xac00`` and that there is no next page, since the address is ``0xffffffffffffffff``, or -1 in decimal.
So starting with offset of ``0x20``B from the beginning of the page one can extract the XML data.
Concatenating all this together will give the complete XML data that can be easily parsed.
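
Following the chain can be sketched as below. The payload size per page is an assumption passed in by the caller: in the dumps above consecutive pages are ``0x2000`` bytes apart, so ``0x2000 - 0x20`` is a plausible value, but the valid length of the last page is not established here. The name `read_chain` is illustrative.

```python
import struct

HEADER_SIZE = 0x20  # payload starts 0x20 bytes into each page

def read_chain(buf: bytes, first_page: int, payload_size: int) -> bytes:
    """Concatenate the payloads of all pages linked from first_page."""
    chunks = []
    seen = set()
    pos = first_page
    while pos >= 0:                      # a next pointer of -1 ends the chain
        if pos in seen:
            raise ValueError("page chain loops back on itself")
        seen.add(pos)
        if buf[pos:pos + 4] != b"PAG1":
            raise ValueError(f"no PAG1 tag at offset {pos:#x}")
        # previous and next page addresses, signed little-endian 64-bit
        _prev_page, next_page = struct.unpack_from("<qq", buf, pos + 8)
        start = pos + HEADER_SIZE
        chunks.append(buf[start:start + payload_size])
        pos = next_page
    return b"".join(chunks)
```

With the whole file in `buf`, something like `read_chain(buf, 0x0c00, 0x1fe0)` would return the raw XML bytes, possibly followed by padding after the closing tag.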

### Structure of the binary data
From the index page tag ``DBDATA`` we can see that the binary data is located at ``0x011800``.
The header of the page is shown below:
```hexdump
00011800  50 41 47 31 00 00 00 00  fe ff ff ff ff ff ff ff  |PAG1............|
00011810  fe ff ff ff ff ff ff ff  06 00 00 00 e0 5f 21 00  |............._!.|
00011820  a8 67 e8 67 04 68 5f 68  73 67 47 68 0b 68 5b 68  |.g.g.h_hsgGh.h[h|
```
Unlike the previous pages, the values for the next and previous page are not ``0xffffffffffffffff`` but ``0xfeffffffffffffff`` (bytes in file order), which is -2 when read as a signed little-endian integer.
From the patterns in the files we deduce that after the next page segment there are 4B of what we call page type, in this case ``0x06``.
The last 4B are the size of the data in the page, in this case ``0x215fe0`` or 2187232B.
This means that the next page will start at ``0x011800 + 0x215fe0 + 0x20 = 0x227800``.
The data starts at offset of ``0x20``B from the beginning of the page.
The dump at ``0x227800`` is shown below:
```hexdump
00227800  50 41 47 31 00 00 00 00  fe ff ff ff ff ff ff ff  |PAG1............|
00227810  fe ff ff ff ff ff ff ff  08 00 00 00 e0 db 00 00  |................|
00227820  00 5e cd 46 00 76 d4 46  be 82 d0 46 4c 61 df 3e  |.^.F.v.F...FLa.>|
```
The same pattern is repeated.
For this page the page type is ``0x08`` and the size of the data is ``0xdbe0`` (bytes ``e0 db 00 00`` read as little endian), or 56288B.
Following the same pattern one can parse the complete file and locate the beginning of each page, its type, and its data length.
In such a way one can extract the table of contents of the file.
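
This sequential walk can be sketched as follows, assuming every binary page starts with ``PAG1`` and stores its type and data size in the last 8 bytes of the 32-byte header, as deduced above. The name `walk_data_pages` is illustrative.

```python
import struct
from typing import Iterator

HEADER_SIZE = 0x20

def walk_data_pages(buf: bytes, start: int) -> Iterator[tuple[int, int, int]]:
    """Yield (page_offset, page_type, data_size) for back-to-back pages."""
    pos = start
    while pos + HEADER_SIZE <= len(buf) and buf[pos:pos + 4] == b"PAG1":
        # page type and data size: two little-endian uint32 at +24 and +28
        page_type, size = struct.unpack_from("<II", buf, pos + 24)
        yield pos, page_type, size
        pos += HEADER_SIZE + size        # next page follows the payload

# Arithmetic from the text: first page start + header + data size
print(hex(0x011800 + HEADER_SIZE + 0x215FE0))  # 0x227800
```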

### Reading the binary data
In our case the ADC was 16-bit.
This is also visible from the setup XML.
The data is stored in little endian in ``uint16_t`` format.
**I assume that if you have a 24-bit ADC the data will be stored as ``uint32_t``.**
In our case we have observed 3 types of pages:
- ``0x06``: the page that contains the data,
- ``0x08``,
- ``0x0a``.

I have not observed any other types of pages and sadly I do not have information regarding types ``0x08`` and ``0x0a``.

For multichannel data the data is stored in interleaved format with chunks of 1000 samples.
The page sizes are not always a multiple of 1000 samples.
Therefore the remaining samples should be concatenated with the next page.

### Converting the data from ``uint`` to ``float``
From one binary page one can read the data with the following python code:
```python
# data_len is the size of the data in bytes
# data_start is the location of the data start in the file
# A is the array containing the data
dt = np.dtype(np.int16)
dt = dt.newbyteorder('<')    
B = np.frombuffer(A[data_start:data_start+data_len],dtype=dt)
```

Assuming that all data is read and properly concatenated, one can extract a single channel from the interleaved data with the following code:
```python
# number_of_channels is the number of channels
# wished_channel is the channel that you want to extract
assert wished_channel < number_of_channels
reshaped_array = B.reshape(-1, 1000)
eI = reshaped_array[wished_channel::number_of_channels,:].reshape(1,-1).squeeze()
```
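
The same de-interleaving can be done for all channels at once. The sketch below assumes the layout described above (1000-sample chunks, channels in rotating order) and simply drops an incomplete trailing frame; in practice those leftover samples would be carried over to the next page. The name `deinterleave` is illustrative.

```python
import numpy as np

def deinterleave(samples: np.ndarray, number_of_channels: int,
                 block_size: int = 1000) -> np.ndarray:
    """Return an array of shape (number_of_channels, samples_per_channel)."""
    frame = number_of_channels * block_size      # one chunk per channel
    usable = (samples.size // frame) * frame     # drop an incomplete frame
    blocks = samples[:usable].reshape(-1, number_of_channels, block_size)
    # Move the channel axis to the front, then join each channel's chunks
    return blocks.transpose(1, 0, 2).reshape(number_of_channels, -1)
```

Row ``wished_channel`` of the result holds the same samples as ``eI`` above.
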
The conversion to ``float`` can be made using the data from the XML setup.
Assuming that the data is stored in the variable ``eI`` and the XML data is stored in the variable ``xml_data``, one can convert the data with the following code:
```python
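import numpy as np
import xml.etree.ElementTree as ET

# Assumption: xml_data is a list of uint8 arrays, one per SETUP page payload
xml_text = np.concatenate(xml_data).tobytes().decode('utf-8').strip()
root = ET.fromstring(xml_text)
setup = list(root.iter('DewesoftSetup'))[0]
slots = setup.findall('.//Slot') # number of channels

# Per-channel amplifier scale and offset from the setup XML
scale = setup.findall(f".//Slot[@Index='{wished_channel}']/AmplScale")[0]
scale = float(scale.text)
intercept = setup.findall(f".//Slot[@Index='{wished_channel}']/AmplOffset")[0]
intercept = float(intercept.text)

# Scale per ADC count: a span of 10 divided over 2**16 codes
scale = scale*10/(np.iinfo(np.uint16).max+1)

converted = eI*scale - intercept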
```

### Open issues
- Since I don't have access to a 24-bit ADC, I cannot confirm that such data is stored as ``uint32_t``.
- The analysis does not extract the events
- There are +3B or +2B offsets when reading addresses from the index table