Metadata-Version: 2.4
Name: pyioutils
Version: 1.0.0
Summary: Python package for reading and writing files in HDF5 (serial and parallel), PETSc and raw binary
Author-email: Samar Khatiwala <samkat6@gmail.com>, Jamie Carr <jammehcarr@gmail.com>
Project-URL: Homepage, https://github.com/samarkhatiwala/pyioutils
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE.txt
Requires-Dist: numpy>=2.0.2
Requires-Dist: h5py
Requires-Dist: scipy
Dynamic: license-file

This repository contains Python utility functions for reading and writing files in several formats. They are high-level convenience wrappers around lower-level functions and libraries. Currently supported formats are direct-access binary, [PETSc](http://www.mcs.anl.gov/petsc/) and [HDF5](https://www.hdfgroup.org/solutions/hdf5/). The HDF5 functions support both serial and parallel I/O and can efficiently save and load a wide variety of native Python datatypes as well as NumPy arrays.

**IMPORTANT**: Please do NOT post this code on your own GitHub or any other website. See LICENSE.txt for licensing information.

Feel free to email if you have any questions: <samkat6@gmail.com>

------------------------------------------------------------------------------------------------

**Installation**

```bash
pip install pyioutils
```
For reading/writing HDF5 files in parallel, you will additionally need to install my `pympiutils` package:
```bash
pip install pympiutils
```
Note that parallel I/O requires an MPI-enabled HDF5 installation and an `h5py` module built against it (instructions coming soon ...).
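One way to check whether the installed `h5py` was built with MPI support is to query its build configuration. The sketch below relies only on `h5py.get_config().mpi`, which is part of `h5py` itself. The helper name `h5py_has_mpi` is made up for illustration and is not part of `pyioutils` or `pympiutils`.

```python
def h5py_has_mpi():
    """Return True/False for MPI support in h5py, or None if h5py is missing."""
    try:
        import h5py
    except ImportError:
        return None
    # get_config().mpi is True only when h5py was built against parallel HDF5
    return bool(h5py.get_config().mpi)

messages = {
    True: "h5py was built with MPI support",
    False: "h5py is a serial build",
    None: "h5py is not installed",
}
print(messages[h5py_has_mpi()])
```

A serial build is enough for everything except parallel HDF5 I/O.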

**Usage**

For more usage examples, see [here](https://github.com/samarkhatiwala/tmm/utils/pytmmutils/examples). For parallel HDF5 I/O see [here](https://github.com/samarkhatiwala/pympiutils/examples).

```python
# Example 1: raw binary I/O
import numpy as np
from pyioutils.binaryio import write_binary, read_binary

# Write array as 64 bit float:
x=np.linspace(0., 7., num=8)
write_binary('x.bin', x, prec='>f8')

# Read data back in (big endian 64 bit float, which is also the default)
xr=read_binary('x.bin', prec='f8', machineformat='>')

# Check
np.allclose(x,xr)

# Write multi-dim array to big endian binary file
# Data are written in Fortran (column major) order
y=np.reshape(x, (4,2), order='F')
write_binary('y.bin', y, prec='>f8')

# Read entire file and reshape using Fortran (column major) order
yr=read_binary('y.bin', dims=(4,2))

# Check
np.allclose(y,yr)

# Assume data are in blocks of size recSize; read the second record (iRec=1). This 
# assumes data are in the default big endian 64 bit float format.
y2=read_binary('y.bin', iRec=1, recSize=4)

# Write array to big endian binary file with header
z=np.array([280.,281.,282.,283.])
# Header as 32 bit int
write_binary('z.bin', len(z), prec='>i4')
# Append data as 64 bit float
write_binary('z.bin', z,prec='>f8', append=True)

# Read data back in, skipping the 4 byte header
zr=read_binary('z.bin', offsetBytes=4)
```
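To make the file layout used in Example 1 concrete, here is a minimal sketch in plain NumPy of the same ideas: big endian values, Fortran (column major) ordering, reading a block by byte offset, and skipping a header. It is not the `pyioutils` implementation. The helpers `write_be` and `read_be` and their parameters are hypothetical names introduced only for this illustration.

```python
import os
import tempfile
import numpy as np

def write_be(path, arr, dtype='>f8', append=False):
    """Write arr as raw big endian values in Fortran (column major) order."""
    data = np.asarray(arr, dtype=dtype).ravel(order='F')
    with open(path, 'ab' if append else 'wb') as f:
        f.write(data.tobytes())

def read_be(path, dims=None, dtype='>f8', offset_bytes=0, count=-1):
    """Read raw big endian values, optionally skipping bytes and reshaping."""
    data = np.fromfile(path, dtype=dtype, count=count, offset=offset_bytes)
    if dims is not None:
        data = data.reshape(dims, order='F')
    return data

tmpdir = tempfile.mkdtemp()

# Multi-dimensional array, stored column by column
y = np.reshape(np.linspace(0., 7., num=8), (4, 2), order='F')
y_path = os.path.join(tmpdir, 'y.bin')
write_be(y_path, y)
yr = read_be(y_path, dims=(4, 2))

# Second block of 4 values: skip 4 values of 8 bytes each
y2 = read_be(y_path, offset_bytes=4 * 8, count=4)

# 32 bit integer header followed by 64 bit float data
z = np.array([280., 281., 282., 283.])
z_path = os.path.join(tmpdir, 'z.bin')
write_be(z_path, len(z), dtype='>i4')
write_be(z_path, z, append=True)
zr = read_be(z_path, offset_bytes=4)

print(np.array_equal(y, yr), np.array_equal(z, zr))
```

Because the data are stored column by column, the second block of four values in `y.bin` is the second column of `y`.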

```python
# Example 2: PETSc I/O
import numpy as np
from pyioutils.petscio import readPetscBinVec, writePetscBinVec, getPetscBinVecFileStats

# Write array to a PETSc Vec file
x=np.linspace(0.,7.,num=8)
writePetscBinVec("xy.petsc", x)

# Append another Vec
y=np.linspace(5.,10.,num=8)
writePetscBinVec("xy.petsc", y, append=True)

# Read both back in (multiple Vecs are returned as a list)
xy=readPetscBinVec("xy.petsc",nRec=-1)

# Check
np.allclose(x,xy[0])
np.allclose(y,xy[1])

# Read back in second Vec in file
yr=readPetscBinVec("xy.petsc", nRec=1, startRec=1)

# Check
np.allclose(y,yr)

# Multiple arrays can be written by passing them as a list
writePetscBinVec("xy2.petsc", [x, y])

# Query contents of a PETSc Vec file
vecLength, numVecs, numBytes=getPetscBinVecFileStats("xy.petsc")
print(f"{vecLength}, {numVecs}, {numBytes}")
```
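For readers curious about what a PETSc Vec file contains, the sketch below reads and writes the layout with plain NumPy. It assumes a default PETSc build (32 bit integers and real double precision scalars), in which each Vec is stored as two big endian 32 bit integers (the Vec class id 1211214 and the vector length) followed by the values as big endian 64 bit floats. This is an illustration of the file format, not the `pyioutils` implementation, and the helper names `write_vecs` and `read_vecs` are hypothetical.

```python
import os
import tempfile
import numpy as np

VEC_FILE_CLASSID = 1211214  # class id PETSc writes at the start of each Vec

def write_vecs(path, vecs):
    """Write a sequence of 1-D arrays as consecutive PETSc binary Vecs."""
    with open(path, 'wb') as f:
        for v in vecs:
            v = np.asarray(v, dtype='>f8').ravel()
            header = np.array([VEC_FILE_CLASSID, v.size], dtype='>i4')
            f.write(header.tobytes())
            f.write(v.tobytes())

def read_vecs(path):
    """Read every Vec in the file into a list of native float64 arrays."""
    vecs = []
    with open(path, 'rb') as f:
        while True:
            header = f.read(8)
            if len(header) < 8:
                break  # end of file
            classid, n = (int(k) for k in np.frombuffer(header, dtype='>i4'))
            if classid != VEC_FILE_CLASSID:
                raise ValueError(f"unexpected class id {classid}")
            values = np.frombuffer(f.read(8 * n), dtype='>f8')
            vecs.append(values.astype(np.float64))
    return vecs

x = np.linspace(0., 7., num=8)
y = np.linspace(5., 10., num=8)
path = os.path.join(tempfile.mkdtemp(), 'xy.petsc')
write_vecs(path, [x, y])
xy = read_vecs(path)
print(len(xy), np.allclose(x, xy[0]), np.allclose(y, xy[1]))
```

Under these assumptions each Vec of length 8 occupies 8 header bytes plus 64 data bytes, so the file size alone is enough to work out how many equal-length Vecs a file holds.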

```python
# Example 3: HDF5 I/O
# For parallel I/O see the examples at: 
# https://github.com/samarkhatiwala/pympiutils/examples
import numpy as np
from pyioutils.hdfio import load, save

# Create some data
# NumPy array
x=np.arange(100,dtype=np.float64)
# List
y=[np.arange(n,dtype=np.float64) for n in range(1,5)]
# String
z='This is a test'
# Dictionary
d={'v1': [1,2,3], 'v2': 'some text', 'v3': np.ones(4)}

# Save the first three variables to HDF5 file passing them as a dictionary
save({'x': x, 'y': y, 'z': z}, 'data.h5')

# Add (append) the fourth one to the same file
save({'d': d}, "data.h5", append=True)

# Read it all back in
s=load("data.h5")

# Inspect
s.keys()
s.d.keys()
```
