Chapter 1: Introduction
This thesis examines machine learning applications in diagnostic imaging.

Chapter 2: Literature Review
CNNs have achieved 94% accuracy in tumor detection (Smith et al., 2023).
Transfer learning approaches show promise for medical imaging (Jones, 2024).
Recent vision transformers (ViT) achieve 96% accuracy on chest X-rays (Chen, 2025).
Multimodal approaches combining images and clinical notes reach 98% (Lee, 2025).

Chapter 3: Methodology
We propose a hybrid CNN-Transformer architecture with 12 attention heads.
A grid search over [4, 8, 12, 16] heads showed that 12 gave the best validation
performance while keeping training time reasonable.
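
The selection rule described above (best validation performance, subject to
reasonable training time) can be sketched as follows. This is a minimal
illustration, not the thesis code: the callback `evaluate`, the budget
`max_train_hours`, and the numbers in `toy_evaluate` are all hypothetical
placeholders introduced here.

```python
from typing import Callable, Iterable, Optional, Tuple


def select_num_heads(
    candidates: Iterable[int],
    evaluate: Callable[[int], Tuple[float, float]],
    max_train_hours: float,
) -> Optional[int]:
    """Return the head count with the best validation score within the time budget.

    `evaluate(num_heads)` is a hypothetical callback that trains one model and
    returns (validation_score, training_hours). Returns None if no candidate
    fits the budget.
    """
    best_heads, best_score = None, float("-inf")
    for num_heads in candidates:
        score, hours = evaluate(num_heads)
        if hours > max_train_hours:
            continue  # over the time budget, skipped regardless of score
        if score > best_score:
            best_heads, best_score = num_heads, score
    return best_heads


def toy_evaluate(num_heads: int) -> Tuple[float, float]:
    # Placeholder numbers for illustration only; NOT results from the thesis.
    toy = {4: (0.90, 1.0), 8: (0.92, 1.5), 12: (0.95, 2.0), 16: (0.96, 5.0)}
    return toy[num_heads]


chosen = select_num_heads([4, 8, 12, 16], toy_evaluate, max_train_hours=3.0)
print(chosen)
```

Treating training time as a hard budget rather than folding it into the score
keeps the trade-off explicit; a weighted objective would be an equally valid
alternative.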
Training uses the MIMIC-CXR dataset with data augmentation.

Chapter 4: Results
The hybrid model achieved 97.2% accuracy (p < 0.01), outperforming pure CNN
(94.1%) and pure Transformer (95.8%) baselines on the test set.
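
The text does not state which test produced p < 0.01. One common choice for
comparing two classifiers evaluated on the same test set is McNemar's test,
which looks only at the cases where the two models disagree. The sketch below
shows the exact (binomial) form under that assumption; the function names are
illustrative and no thesis data is used.

```python
from math import comb
from typing import Sequence, Tuple


def discordant_counts(
    y_true: Sequence[int], pred_a: Sequence[int], pred_b: Sequence[int]
) -> Tuple[int, int]:
    """Count test cases where exactly one of the two models is correct."""
    only_a = only_b = 0
    for truth, a, b in zip(y_true, pred_a, pred_b):
        a_ok, b_ok = (a == truth), (b == truth)
        if a_ok and not b_ok:
            only_a += 1
        elif b_ok and not a_ok:
            only_b += 1
    return only_a, only_b


def mcnemar_exact_p(only_a_correct: int, only_b_correct: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant counts.

    Under the null hypothesis, each disagreement is equally likely to favor
    either model, so the smaller count follows Binomial(n, 0.5).
    """
    n = only_a_correct + only_b_correct
    if n == 0:
        return 1.0  # the models never disagree, so there is no evidence
    k = min(only_a_correct, only_b_correct)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)
```

The function would be called once per baseline, for example hybrid vs. pure CNN
and hybrid vs. pure Transformer, using per-image predictions on the test set.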

Chapter 5: Limitations and Future Work
This study is limited to chest X-rays; generalization to other modalities
requires further study. The dataset is also skewed toward adult patients.

Chapter 6: Conclusion
The hybrid CNN-Transformer architecture demonstrates statistically significant
accuracy improvements over pure CNN and pure Transformer baselines for chest
X-ray analysis.
