Transformers-based Segmentation Model for 3D Medical Images

Friday 31 January 2025


Researchers have made significant strides in developing advanced algorithms for medical image segmentation, a crucial step in diagnosing and treating various diseases. Recently, a team of scientists has proposed a novel approach that leverages transformers, a type of neural network architecture popularized by language models like BERT, to improve the accuracy and efficiency of 3D medical image segmentation.


The traditional approach to segmenting medical images relies on convolutional neural networks (CNNs), which process data in a spatially local manner. These networks can be limited in their ability to capture long-range dependencies and contextual relationships within the data. Transformers, by contrast, were designed for sequential data like text, but have also been shown to excel at image-understanding tasks.
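The contrast above comes down to how each operation mixes information: a convolution only sees a small neighbourhood, while self-attention lets every position attend to every other. Below is a minimal NumPy sketch of single-head scaled dot-product self-attention; the shapes and random weights are purely illustrative, not part of the model described here.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a sequence
    of feature vectors x with shape (n_positions, d)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])           # (n, n): every position scores every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over positions
    return weights @ v                                # each output is a global mixture

rng = np.random.default_rng(0)
n, d = 6, 4                                           # e.g. 6 image patches, 4-dim features
x = rng.standard_normal((n, d))
w = [rng.standard_normal((d, d)) for _ in range(3)]
out = self_attention(x, *w)
print(out.shape)                                      # (6, 4)
```

Because the attention matrix is dense over all positions, distant voxels can influence each other in a single layer, whereas a CNN would need many stacked layers to achieve the same receptive field.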


The researchers modified the transformer architecture to create a novel model called TSUBF-Net, which stands for Trans-Spatial UNet-like Network with Bi-direction Fusion. This model is designed specifically for 3D medical image segmentation and combines the strengths of transformers with those of CNNs.


TSUBF-Net consists of two main components: a feature extractor and a segmentation network. The feature extractor uses a combination of convolutional layers and transformer encoder blocks to extract high-level features from the input images. These features are then fed into the segmentation network, which is responsible for generating the final segmentation mask.


The key innovation of TSUBF-Net lies in its use of bi-directional sample collaborated fusion (BSCF) to combine the outputs of multiple feature extraction layers. This process allows the model to leverage the strengths of each layer and generate a more accurate and robust segmentation result.
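To make the fusion idea concrete, here is a minimal two-way fusion sketch in NumPy: a coarse (deep) feature map is upsampled to the fine grid, the fine (shallow) map is downsampled to the coarse grid, and each pair is blended. This is an illustrative assumption about what "bi-directional fusion" can look like, not the authors' BSCF module.

```python
import numpy as np

def fuse_bidirectional(shallow, deep, alpha=0.5):
    """Toy two-way fusion of a fine-resolution map (shallow) and a
    half-resolution map (deep): exchange information in both directions."""
    up = deep.repeat(2, 0).repeat(2, 1).repeat(2, 2)  # coarse -> fine
    s = shallow.shape
    down = shallow.reshape(s[0] // 2, 2, s[1] // 2, 2,
                           s[2] // 2, 2).mean((1, 3, 5))  # fine -> coarse
    fused_fine = alpha * shallow + (1 - alpha) * up
    fused_coarse = alpha * down + (1 - alpha) * deep
    return fused_fine, fused_coarse

rng = np.random.default_rng(2)
fine, coarse = rng.random((16, 16, 16)), rng.random((8, 8, 8))
f_fine, f_coarse = fuse_bidirectional(fine, coarse)
print(f_fine.shape, f_coarse.shape)                   # (16, 16, 16) (8, 8, 8)
```

The design point is that information flows in both directions: fine-grained boundary detail refines the coarse map, while coarse semantic context stabilises the fine map.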


To evaluate the performance of TSUBF-Net, the researchers conducted extensive experiments on several publicly available datasets, including the Adenoid Hypertrophy Segmentation Dataset (AHSD) and the Automated Cardiac Diagnosis Challenge (ACDC). The results showed that TSUBF-Net outperformed state-of-the-art methods in terms of accuracy and smoothness of segmentation.
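Segmentation accuracy in studies like this is typically reported with the Dice coefficient, which measures the overlap between the predicted and ground-truth masks. A minimal implementation (the metric itself is standard; the toy masks below are illustrative):

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice coefficient for binary masks: 2*|P & T| / (|P| + |T|),
    with eps guarding against two empty masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

a = np.zeros((4, 4, 4), dtype=np.uint8); a[:2] = 1    # 32 voxels marked
b = np.zeros((4, 4, 4), dtype=np.uint8); b[1:3] = 1   # 32 voxels, 16 overlapping
print(round(dice_score(a, b), 3))                     # 0.5
```

A score of 1.0 means perfect overlap with the ground truth, and 0.0 means none, which makes the metric easy to compare across methods and datasets.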


One of the most significant advantages of TSUBF-Net is its ability to handle complex medical images. Unlike traditional CNN-based approaches, which can struggle with large and irregularly shaped structures, TSUBF-Net's transformer components allow it to capture long-range dependencies and contextual relationships within the data.


The researchers also demonstrated that TSUBF-Net can be easily adapted to segment different types of medical images, such as MRI and CT scans.


Cite this article: “Transformers-based Segmentation Model for 3D Medical Images”, The Science Archive, 2025.


Medical Image Segmentation, Transformers, Neural Networks, CNNs, 3D Images, TSUBF-Net, Bi-Directional Sample Collaborated Fusion, BSCF, Adenoid Hypertrophy Segmentation Dataset, AHSD, Automated Cardiac Diagnosis Challenge


Reference: Rulin Zhou, Yingjie Feng, Guankun Wang, Xiaopin Zhong, Zongze Wu, Qiang Wu, Xi Zhang, “TSUBF-Net: Trans-Spatial UNet-like Network with Bi-direction Fusion for Segmentation of Adenoid Hypertrophy in CT” (2024).

