Usage
Installation
To use MIMIM, first install it using pip:
$ pip install multi-med-image-ml
Dataloader
MIMIM needs imaging data to get started. It was designed and tested with brain MRI, PET, and CT data, though any 3D imaging data should work. The simplest application is two folders of NIfTI images.
from multi_med_image_ml.MedImageLoader import *
folder1 = '/path/to/data1'
folder2 = '/path/to/data2'
dataloader = MedImageLoader(folder1,folder2)
for image,label in dataloader:
    ...
Sample datasets of brain images may be downloaded from sources like OpenNeuro.
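As a rough illustration of the two-folder layout described above, the sketch below walks two directories, collects the NIfTI files in each, and tags every file with the index of the folder it came from. This is not MedImageLoader's internal code: the helper name `list_labeled_nifti` and the folder-index labeling are assumptions made for the example.

```python
import tempfile
from pathlib import Path


def list_labeled_nifti(*folders):
    """Return (path, label) pairs; the label is the index of the source folder.

    Hypothetical helper for illustration only, not part of multi_med_image_ml.
    """
    pairs = []
    for label, folder in enumerate(folders):
        for path in sorted(Path(folder).rglob("*")):
            # NIfTI files end in .nii or .nii.gz
            if path.is_file() and path.name.endswith((".nii", ".nii.gz")):
                pairs.append((path, label))
    return pairs


# Build two throwaway folders with empty placeholder files to exercise the helper.
with tempfile.TemporaryDirectory() as tmp:
    folder1 = Path(tmp) / "data1"
    folder2 = Path(tmp) / "data2"
    folder1.mkdir()
    folder2.mkdir()
    (folder1 / "scan_a.nii.gz").touch()
    (folder1 / "notes.txt").touch()  # ignored: not a NIfTI file
    (folder2 / "scan_b.nii").touch()
    labeled = [(p.name, y) for p, y in list_labeled_nifti(folder1, folder2)]

print(labeled)
```

The placeholder files are empty, so this only demonstrates file discovery and labeling, not image loading.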
Data may also be encapsulated in the BatchRecord class, which is recommended for very large datasets.
dataloader = MedImageLoader(folder1,folder2,return_obj=True)
for b in dataloader:
    print(b.get_X()) # image
    print(b.get_Y()) # label
Images may also be batched by patient:
dataloader = MedImageLoader(folder1,
                            folder2,
                            return_obj=True,
                            batch_by_pid=True)
for b in dataloader:
    ...
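Conceptually, batching by patient means that all images belonging to one patient are returned together. The sketch below shows that grouping step in plain Python. The helper `group_by_patient`, the patient IDs, and the file paths are hypothetical and only illustrate the idea; the library's own grouping logic may differ.

```python
from collections import defaultdict


def group_by_patient(records):
    """Group (patient_id, image_path) pairs into one list of paths per patient.

    Hypothetical helper for illustration; not the library's implementation.
    """
    groups = defaultdict(list)
    for patient_id, image_path in records:
        groups[patient_id].append(image_path)
    # Convert to a plain dict; insertion order keeps patients in first-seen order.
    return dict(groups)


# Made-up records: two scans for patient01, one for patient02.
records = [
    ("patient01", "/path/to/data1/p01_t1.nii.gz"),
    ("patient02", "/path/to/data1/p02_t1.nii.gz"),
    ("patient01", "/path/to/data1/p01_pet.nii.gz"),
]
batches = group_by_patient(records)
for patient_id, paths in batches.items():
    print(patient_id, len(paths))
```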
MedImageLoader may also take in a pandas dataframe containing references to each cached image with the associated metadata:
pandas_path = '/path/to/dataframe.pkl'
dataloader = MedImageLoader(pandas_path)
By default, MedImageLoader builds this dataframe the first time it reads through a folder. The dataframe is indexed by the paths to the image files, and its columns hold the associated metadata. To read a different variable from this dataframe, specify it in the label argument:
pandas_path = '/path/to/dataframe.pkl'
dataloader = MedImageLoader(pandas_path,
label=["MRAcquisitionType"],
return_obj=True)
for p in dataloader:
    p.get_X() # Image
    p.get_Y() # Encoding of MRAcquisitionType
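The exact encoding returned by get_Y() is not described here. As one plausible reading, the sketch below one-hot encodes a categorical metadata column. The dictionary stands in for the dataframe (image path mapped to metadata columns), and both the rows and the helper `one_hot_encode` are made up for the example.

```python
def one_hot_encode(values):
    """Map each categorical value to a one-hot list over the sorted unique values."""
    classes = sorted(set(values))
    index = {c: i for i, c in enumerate(classes)}
    encoded = [
        [1 if index[v] == i else 0 for i in range(len(classes))]
        for v in values
    ]
    return classes, encoded


# Stand-in for the dataframe: image path -> metadata columns (made-up rows).
table = {
    "/path/to/img1.npy": {"MRAcquisitionType": "3D"},
    "/path/to/img2.npy": {"MRAcquisitionType": "2D"},
    "/path/to/img3.npy": {"MRAcquisitionType": "3D"},
}
label = "MRAcquisitionType"
classes, encoded = one_hot_encode([row[label] for row in table.values()])
print(classes)
print(encoded)
```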
By default, MedImageLoader builds up a database of all images accessed, along with their metadata. This database may be accessed in the designated directory.
By default, images are resized to 96x96x96. This can be changed by specifying the X_dim parameter in the dataloader. Resized images are cached as .npy files.
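The resize-then-cache pattern can be sketched with NumPy alone. The example below assumes nearest-neighbor resampling, which is a stand-in chosen for simplicity; the interpolation the library actually uses is not stated here. The helper names `resize_nearest` and `load_resized` are hypothetical.

```python
import tempfile
from pathlib import Path

import numpy as np


def resize_nearest(volume, out_shape=(96, 96, 96)):
    """Resample a 3D array to out_shape with nearest-neighbor indexing."""
    idx = [
        np.minimum((np.arange(n_out) * n_in / n_out).astype(int), n_in - 1)
        for n_in, n_out in zip(volume.shape, out_shape)
    ]
    return volume[np.ix_(*idx)]


def load_resized(cache_path, make_volume, out_shape=(96, 96, 96)):
    """Return the cached array if it exists; otherwise resize, save, and return it."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        return np.load(cache_path)
    resized = resize_nearest(make_volume(), out_shape)
    np.save(cache_path, resized)
    return resized


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "scan_a.npy"
    # Random array standing in for a loaded scan of a different size.
    first = load_resized(cache_file, lambda: np.random.rand(64, 80, 48))
    # The second call reads the cache, so this loader function is never invoked.
    second = load_resized(cache_file, lambda: 1 / 0)
    shapes = (first.shape, second.shape)
    same = bool(np.array_equal(first, second))

print(shapes, same)
```

Caching the resized array means the original file only has to be read and resampled once per image.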
Model and Training
The multi-input module can be trained in the same way as any other PyTorch model:
from multi_med_image_ml.models import *
from multi_med_image_ml.MedImageLoader import *
import torch
dataloader = MedImageLoader(folder1,folder2)
model = MultiInputModule()
optimizer = torch.optim.Adam(
    model.classifier_parameters(),
    betas = (0.5,0.999),
    lr= 1e-5
)
loss_function = torch.nn.MSELoss()
for image,label in dataloader:
    optimizer.zero_grad()
    y_pred,_ = model(image)
    # MSELoss takes (input, target): prediction first, then label
    loss = loss_function(y_pred,label)
    loss.backward()
    optimizer.step()
The MultiInputTrainer module adds confound regression and abstracts the training loop shown above.
from multi_med_image_ml.models import *
from multi_med_image_ml.MedImageLoader import *
from multi_med_image_ml.MultiInputTrainer import *
model = MultiInputModule()
imfolder1 = '/path/to/data1'
imfolder2 = '/path/to/data2'
dataloader = MedImageLoader(imfolder1,imfolder2,
    cache=True,
    label=["MRAcquisitionType"],
    confounds=["Slice Thickness","Repetition Time"],
    return_obj = True,
    batch_by_pid = True
)
trainer = MultiInputTrainer(model)
# Train for three passes over the data
for i in range(3):
    for p in dataloader:
        trainer.loop(p,dataloader=dataloader)
Testing
MultiInputTester is a more complex module that runs a variety of tests on the model. One of these is model performance:
from multi_med_image_ml.models import *
from multi_med_image_ml.MedImageLoader import *
from multi_med_image_ml.MultiInputTester import *
model = MultiInputModule()
imfolder1 = '/path/to/data1'
imfolder2 = '/path/to/data2'
dataloader = MedImageLoader(imfolder1,imfolder2,
    cache=True,
    label=["MRAcquisitionType"],
    confounds=["Slice Thickness","Repetition Time"],
    return_obj = True,
    batch_by_pid = True
)
tester = MultiInputTester(model,dataloader.database)
tester.grad_cam()
for p in dataloader:
    tester.loop(p)
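The metrics that MultiInputTester reports are not listed here. As a generic illustration of what a model performance test measures, the sketch below computes classification accuracy from per-class scores. The function `accuracy` and the sample scores are made up for the example and are not part of the library.

```python
def accuracy(predictions, labels):
    """Fraction of samples whose highest-scoring class matches the true class index."""
    correct = 0
    for scores, true_index in zip(predictions, labels):
        predicted_index = max(range(len(scores)), key=scores.__getitem__)
        correct += int(predicted_index == true_index)
    return correct / len(labels)


# Made-up class scores for four samples and their true class indices.
predictions = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
labels = [0, 1, 1, 1]
acc = accuracy(predictions, labels)
print(acc)
```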