Combining CNN and K-Means for Automated Detection of Tuberculosis Bacilli in Bacilloscopy Images

Title: Combining CNN and K-Means for Automated Detection of Tuberculosis Bacilli in Bacilloscopy Images

Authors: Thales Francisco Mota Carvalho, Vívian Ludimila Aguiar Santos, Lida Jouca de Assis Figueredo, Silvana Spindola de Miranda, Ricardo de Oliveira Duarte & Frederico Gadelha Guimarães

Abstract: Tuberculosis (TB) remains a major global health challenge, especially in low-resource settings where access to accurate diagnosis is limited. This study investigates the application of Convolutional Neural Networks (CNNs) to the automatic identification of Mycobacterium tuberculosis bacilli in microscopy images. Four CNN architectures from the literature were implemented and evaluated on multiple annotated datasets. In image fragment classification tasks, all models performed well, with accuracy, sensitivity, and specificity exceeding 95%. However, further experiments simulating full-field bacillus detection, combining K-means segmentation with CNN classification, revealed a drop in performance, most notably in precision, indicating that controlled fragment-based evaluations may overestimate performance. To address this, an enhanced training strategy was proposed that augments the dataset with more representative negative samples. This approach significantly improved model precision and F-measure, at the cost of a slight reduction in sensitivity. The results suggest that, although CNNs are effective in fragment classification, their application in real-world detection scenarios requires careful evaluation, particularly regarding dataset construction and region-of-interest selection. The study emphasizes the need for robust, context-aware validation strategies when deploying AI tools for TB diagnosis.
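
The gap between fragment-level and full-field results described in the abstract can be illustrated with simple arithmetic. The sketch below is not the authors' code and uses no data from the paper: the function name `detection_metrics` and all confusion counts are hypothetical, chosen only to show how a classifier with the same sensitivity and specificity (95%) can yield very different precision once negative candidate regions far outnumber bacilli, as may happen when regions of interest come from a segmentation step.

```python
def detection_metrics(tp, fp, fn, tn):
    """Compute sensitivity, specificity, precision and F-measure
    from confusion-matrix counts (hypothetical helper)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
    }


# Hypothetical balanced fragment test set: 1000 bacillus fragments,
# 1000 background fragments, 95% sensitivity and 95% specificity.
balanced = detection_metrics(tp=950, fp=50, fn=50, tn=950)

# Hypothetical full-field scenario: the same per-fragment error rates,
# but only 100 true bacilli among 10000 negative candidate regions.
full_field = detection_metrics(tp=95, fp=500, fn=5, tn=9500)

for name, metrics in (("balanced", balanced), ("full_field", full_field)):
    summary = ", ".join(f"{key}={value:.3f}" for key, value in metrics.items())
    print(f"{name}: {summary}")
```

In this illustrative setting, sensitivity and specificity are identical in both scenarios, while precision and F-measure fall sharply in the imbalanced one. This is consistent with the abstract's observation that precision is the metric most affected when moving from fragment classification to full-field detection.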

Keywords: Tuberculosis; Deep Learning; CNN; Bacillus Detection; K-Means Segmentation; Bacilloscopy.

Pages: 8

DOI: 10.21528/CBIC2025-1175866

Paper (PDF): CBIC_2025_paper1175866.pdf

BibTeX file:
CBIC_2025_1175866.bib