VTF-Net: A Visual Temporal Feature Network for Robust Retinal OCT Image Segmentation

Figure 1: Detailed network structure of the proposed VTF-Net.

Methods

VTFE

Figure 2: Visual representation of the VTFE structure.

MSAF

Figure 3: Structure of the MSAF module.

Installation

Experiments ran on a desktop system equipped with two NVIDIA RTX 3080 GPUs, an Intel Xeon E5-2690 v4 CPU, and 256 GB of RAM. The software environment comprised Python 3.9, PyTorch 2.0.0, and CUDA 11.8, and training used PyTorch's DistributedDataParallel (DDP) implementation.
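As a minimal sketch, a two-GPU DDP run with this stack could be launched via PyTorch's `torchrun` utility; the entry-point name `train.py` is an assumption, not taken from the repository.

```shell
# Hypothetical launch command for the two-GPU setup described above;
# the script name train.py is an assumption. torchrun spawns one
# process per GPU and sets up the DDP process group automatically.
torchrun --standalone --nproc_per_node=2 train.py
```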

Experiment

Datasets

| Dataset  | Quantity | Training Set | Validation Set | Testing Set |
|----------|----------|--------------|----------------|-------------|
| CMED-18k | 10000    | 7200         | 800            | 2000        |
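The split above corresponds to a 72/8/20 partition of the 10000 samples. A minimal sketch of such a split (the function name and seed are illustrative, not from the repository):

```python
# Hypothetical sketch: reproducing the 72/8/20 train/val/test split
# of the 10000 CMED-18k samples listed in the table above.
import random

def split_dataset(items, train_frac=0.72, val_frac=0.08, seed=0):
    """Shuffle sample IDs and partition them into train/val/test lists."""
    rng = random.Random(seed)
    items = list(items)
    rng.shuffle(items)
    n_train = int(len(items) * train_frac)
    n_val = int(len(items) * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

train, val, test = split_dataset(range(10000))
print(len(train), len(val), len(test))  # 7200 800 2000
```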

Baselines

We provide GitHub links to PyTorch implementations of all networks compared in this experiment, so each baseline can be easily reproduced.

UNet; FCN8s; SegNet; PSPNet; ENet; ICNet; UNet+AttGate; DANet; LEDNet; DUNet; CENet; CGNet; OCNet; GCN.

Results

Table 1: Segmentation performance of the proposed method against 14 baseline models, evaluated on the CMED-18k dataset. Metrics include the Dice coefficient, HD, HD95, NCC, and the Kappa statistic. The highest value for each metric is highlighted in red, with the second highest marked in blue.

Figure 4: Qualitative comparison between VTF-Net and the 14 baseline models. The first row presents the original input images, followed by the corresponding segmentation results, including zoomed-in views of edema regions to highlight segmentation detail.

All experiments were executed under identical conditions, and the results are detailed in Table 1 and Figure 4. VTF-Net showed competitive results across all evaluation metrics.
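As an illustrative sketch (not the authors' evaluation code), the Dice coefficient reported in Table 1 can be computed for binary masks as follows:

```python
# Illustrative sketch: Dice coefficient for binary segmentation masks,
# one of the metrics reported in Table 1. Not the authors' code.
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2|A∩B| / (|A| + |B|) for binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
print(round(dice_coefficient(pred, target), 3))  # 0.667
```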

Ablation study

Key components of VTFE

Table 2: Ablation study results for the VTF-Net architecture, comparing the impact of the individual modules VTFE, MSAF, AFRP, and EFRE on segmentation performance across multiple metrics, including the Dice coefficient, HD, HD95, NCC, and Kappa. The highest values are highlighted in red and the second highest in blue, demonstrating the relative contribution of each module to overall network efficacy.

Table 3: Ablation study results for the attention fusion strategies within the MSAF module, illustrating their differential impact on segmentation performance across multiple quantitative metrics. CA and EMA denote Coordinate Attention and Efficient Multi-Scale Attention, respectively, while CB denotes the Convolutional Block Attention Module. FFT refers to the Fast Fourier Transform, LSK indicates a Large Selective Kernel Network, and CA EMA signifies the serial concatenation of CA and EMA outputs. The configuration labeled FFT + CA EMA denotes a parallel fusion of the FFT and CA EMA outputs, and FSA represents a frequency split attention strategy. Red text highlights the highest values, and the second-highest scores are marked in blue.
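To make the serial vs. parallel fusion patterns compared in Table 3 concrete, here is a heavily simplified sketch. The operations below are generic NumPy stand-ins, not the actual CA, EMA, or MSAF implementations: `gate` mimics a channel-attention gating step, and `fft_branch` mimics a frequency-domain branch with a low-pass filter.

```python
# Illustrative stand-ins for the fusion strategies in Table 3;
# these are NOT the paper's CA/EMA/FFT modules.
import numpy as np

def gate(x):
    """Channel-gating stand-in for an attention module (CA/EMA style)."""
    w = 1.0 / (1.0 + np.exp(-x.mean(axis=(1, 2), keepdims=True)))
    return x * w

def fft_branch(x):
    """Frequency-domain low-pass filter as a stand-in for the FFT branch."""
    c, h, w = x.shape
    f = np.fft.fftshift(np.fft.fft2(x), axes=(-2, -1))
    mask = np.zeros((h, w))
    mask[h // 4: 3 * h // 4, w // 4: 3 * w // 4] = 1.0  # keep low frequencies
    return np.real(np.fft.ifft2(np.fft.ifftshift(f * mask, axes=(-2, -1))))

def serial_ca_ema(x):
    """'CA EMA': feed one attention output into the next (serial)."""
    return gate(gate(x))

def parallel_fft_ca_ema(x):
    """'FFT + CA EMA': run both branches side by side and sum (parallel)."""
    return fft_branch(x) + serial_ca_ema(x)

feat = np.random.default_rng(0).standard_normal((8, 16, 16))
out = parallel_fft_ca_ema(feat)
print(out.shape)  # (8, 16, 16)
```

The point of the sketch is only the wiring: serial fusion composes attention operators, while parallel fusion evaluates branches independently and merges their outputs.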

Table 4: Detailed ablation results for the VTFE module, evaluating the impact of variations in architectural parameters (number of layers, kernel sizes, and convolutional blocks) on the segmentation metrics. Multiple configurations were tested to determine the optimal combination of these parameters, with the highest metric values marked in red and the second highest in blue. The configuration employing 4 layers, 5x5 kernels, and 2 convolutional blocks demonstrated the most favorable performance, indicating the importance of deeper feature hierarchies and larger receptive fields for capturing complex patterns in retinal OCT segmentation.

Questions

If you have any questions, please contact [email protected].
