Figure 1: Detailed network structure of the proposed VTF-Net.
Figure 2: Visual representation of the VTFE structure.
The hardware configuration consisted of a desktop system equipped with two NVIDIA RTX 3080 GPUs, an Intel Xeon E5-2690 v4 CPU, and 256 GB of RAM. The software environment comprised Python 3.9, PyTorch 2.0.0, and CUDA 11.8, and training was parallelized across the two GPUs using PyTorch's DistributedDataParallel (DDP) implementation.
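For reference, the following is a minimal sketch of a two-GPU DDP training loop consistent with the setup described above. The model, dataset, optimizer, batch size, and NCCL backend choice are placeholders and assumptions, not the actual VTF-Net training configuration.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler

def train_ddp(model, dataset, epochs=1, batch_size=8):
    # Launched via: torchrun --nproc_per_node=2 train.py
    dist.init_process_group(backend="nccl")         # NCCL: the standard backend for GPUs
    local_rank = int(os.environ["LOCAL_RANK"])      # set by torchrun for each process
    torch.cuda.set_device(local_rank)

    model = DDP(model.cuda(local_rank), device_ids=[local_rank])
    sampler = DistributedSampler(dataset)           # shards the dataset across the two GPUs
    loader = DataLoader(dataset, batch_size=batch_size, sampler=sampler)

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # placeholder hyperparameters
    for epoch in range(epochs):
        sampler.set_epoch(epoch)                    # reshuffle the shards each epoch
        for images, masks in loader:
            images, masks = images.cuda(local_rank), masks.cuda(local_rank)
            loss = torch.nn.functional.cross_entropy(model(images), masks)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    dist.destroy_process_group()
```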
| Dataset | Quantity | Training Set | Validation Set | Testing Set |
|---|---|---|---|---|
| CMED-18k | 10000 | 7200 | 800 | 2000 |
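A minimal sketch of how such a 7200/800/2000 split could be reproduced is shown below; the use of `random_split` and the fixed seed are assumptions, as the exact splitting procedure is not specified here.

```python
import torch
from torch.utils.data import Dataset, random_split

def split_cmed(dataset: Dataset, seed: int = 42):
    """Split the 10000 CMED-18k samples into 7200/800/2000 subsets."""
    generator = torch.Generator().manual_seed(seed)  # assumed seed for reproducibility
    return random_split(dataset, lengths=[7200, 800, 2000], generator=generator)
```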
We provide GitHub links to the PyTorch implementations of all networks compared in this experiment here, so that all baselines can be easily reproduced.
UNet; FCN8s; SegNet; PSPNet; ENet; ICNet; UNet+AttGate; DANet; LEDNet; DUNet; CENet; CGNet; OCNet; GCN.
Table 1: Segmentation performance of the proposed method compared with 14 baseline models on the CMED-18k dataset. Metrics include the Dice coefficient, HD, HD95, NCC, and the Kappa statistic. The highest value for each metric is highlighted in red, and the second highest in blue.
Figure 4: Qualitative comparison between VTF-Net and the 14 baseline models. The first row presents the original input images, followed by the corresponding segmentation results, including zoomed-in views of edema regions to highlight segmentation detail.
All experiments were executed under identical conditions, and the results are detailed in Table 1 and Figure 4. VTF-Net achieved competitive results across all evaluation metrics.
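For clarity, a minimal sketch of two of the reported metrics (the Dice coefficient and the Hausdorff distance) computed on binary masks is given below; the function names and the NumPy/SciPy formulation are illustrative, not the evaluation code used in this work.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    # Dice = 2|P ∩ T| / (|P| + |T|) on binary masks.
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def hausdorff_distance(pred: np.ndarray, target: np.ndarray) -> float:
    # Symmetric Hausdorff distance between the two foreground pixel sets.
    p_pts, t_pts = np.argwhere(pred), np.argwhere(target)
    return max(directed_hausdorff(p_pts, t_pts)[0],
               directed_hausdorff(t_pts, p_pts)[0])
```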
Table 2: Ablation study results for the VTF-Net architecture, comparing the impact of the individual modules (VTFE, MSAF, AFRP, and EFRE) on segmentation performance across multiple metrics, including the Dice coefficient, HD, HD95, NCC, and Kappa. The highest values are highlighted in red and the second highest in blue, demonstrating the relative contribution of each module to the overall network efficacy.
Table 3: Ablation study results for the attention fusion strategies within the MSAF module, illustrating their differential impact on segmentation performance across multiple quantitative metrics. CA and EMA denote Coordinate Attention and Efficient Multi-Scale Attention, respectively, while CB denotes the Convolutional Block Attention Module. FFT refers to the Fast Fourier Transform, LSK indicates a Large Selective Kernel network, and CA EMA signifies the serial concatenation of the CA and EMA outputs. The configuration labeled FFT + CA EMA denotes a parallel fusion of the FFT and CA EMA outputs, and FSA represents a frequency split attention strategy. The highest values are highlighted in red and the second highest in blue.
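To make the fusion configurations concrete, below is a minimal sketch of the FFT + CA EMA variant: a frequency branch built on `torch.fft` fused in parallel with a serial CA-then-EMA attention branch. The module structure, channel sizes, and the simplified attention blocks are assumptions for illustration; in particular, an SE-style channel gate stands in for the real EMA block, and this is not the actual MSAF implementation.

```python
import torch
import torch.nn as nn

class CoordAttention(nn.Module):
    """Simplified Coordinate Attention (CA): directional pooling along H and W."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        hidden = max(channels // reduction, 4)
        self.conv1 = nn.Conv2d(channels, hidden, 1)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(hidden, channels, 1)
        self.conv_w = nn.Conv2d(hidden, channels, 1)

    def forward(self, x):
        _, _, h, w = x.shape
        pool_h = x.mean(dim=3, keepdim=True)                   # (B, C, H, 1)
        pool_w = x.mean(dim=2, keepdim=True).transpose(2, 3)   # (B, C, W, 1)
        y = self.act(self.conv1(torch.cat([pool_h, pool_w], dim=2)))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                  # (B, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.transpose(2, 3)))  # (B, C, 1, W)
        return x * a_h * a_w

class ChannelGate(nn.Module):
    """SE-style channel gate used here as a simplified stand-in for EMA."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.fc(x)

class FFTCAEMAFusion(nn.Module):
    """Sketch of the 'FFT + CA EMA' variant: a frequency branch fused in
    parallel with a serial CA -> EMA attention branch."""
    def __init__(self, channels):
        super().__init__()
        self.ca = CoordAttention(channels)
        self.ema = ChannelGate(channels)          # simplified EMA stand-in
        self.freq_conv = nn.Conv2d(channels, channels, 1)
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, x):
        # Frequency branch: learnable filtering of the magnitude spectrum.
        freq = torch.fft.rfft2(x, norm="ortho")
        mag = self.freq_conv(freq.abs())
        freq_feat = torch.fft.irfft2(torch.polar(mag, freq.angle()),
                                     s=x.shape[-2:], norm="ortho")
        # Serial attention branch: CA followed by the EMA stand-in.
        attn_feat = self.ema(self.ca(x))
        # Parallel fusion by channel concatenation and a 1x1 convolution.
        return self.fuse(torch.cat([freq_feat, attn_feat], dim=1))
```

An instance can be smoke-tested with `FFTCAEMAFusion(64)(torch.randn(1, 64, 32, 32))`, which returns a tensor of the same shape as the input.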
Table 4: Detailed ablation results for the VTFE module, evaluating the impact of variations in architectural parameters (number of layers, kernel sizes, and convolutional blocks) on the segmentation metrics. Multiple configurations were tested to determine the optimal combination of these parameters, with the highest metric values marked in red and the second highest in blue. The configuration employing 4 layers, 5×5 kernels, and 2 convolutional blocks achieved the most favorable performance, indicating the importance of deeper feature hierarchies and larger receptive fields for capturing complex patterns in retinal OCT segmentation.
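As an illustration of the parameters swept in Table 4, here is a minimal, hypothetical sketch of a configurable convolutional stack exposing the number of layers, kernel size, and blocks per layer; it is not the actual VTFE implementation, and the channel width is an assumed placeholder.

```python
import torch.nn as nn

def conv_block(channels, kernel_size):
    # One convolutional block: conv -> batch norm -> ReLU; padding preserves spatial size.
    return nn.Sequential(
        nn.Conv2d(channels, channels, kernel_size, padding=kernel_size // 2),
        nn.BatchNorm2d(channels),
        nn.ReLU(inplace=True),
    )

def build_vtfe_stack(channels=64, num_layers=4, kernel_size=5, blocks_per_layer=2):
    """Defaults mirror the best Table 4 configuration: 4 layers, 5x5 kernels, 2 blocks."""
    layers = []
    for _ in range(num_layers):
        for _ in range(blocks_per_layer):
            layers.append(conv_block(channels, kernel_size))
    return nn.Sequential(*layers)
```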
If you have any questions, please contact [email protected].