This is an unofficial implementation of Deep Orthogonal Fusion of Local and Global Features (DOLG) in TensorFlow 2 (Keras). Paper.
DOLG is a single-stage approach that integrates the local and global information inside an image into a compact image representation. It first extracts representative local information attentively, using multi-atrous convolutions and self-attention. Components orthogonal to the global image representation are then extracted from the local information. Finally, the orthogonal components are concatenated with the global representation as a complement, and the result is aggregated to produce the final representation.
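The orthogonal step described above can be sketched in a few lines of NumPy. This is only an illustration of the idea, not the code of this repository's OrthogonalFusion layer: the function names, the (H, W, C) and (C,) shapes, the concatenation order, and the use of average pooling for the final aggregation are all assumptions made for the example.

```python
import numpy as np


def orthogonal_component(local_feat, global_feat):
    """Remove from each local descriptor its projection onto the global vector.

    local_feat: array of shape (H, W, C); global_feat: array of shape (C,).
    """
    # Dot product of every local descriptor with the global vector: (H, W).
    dots = local_feat @ global_feat
    # Projection of each local descriptor onto the global vector: (H, W, C).
    proj = (dots / np.dot(global_feat, global_feat))[..., None] * global_feat
    return local_feat - proj


def orthogonal_fusion(local_feat, global_feat):
    """Concatenate the global vector with the orthogonal components, then pool."""
    orth = orthogonal_component(local_feat, global_feat)
    # Repeat the global vector at every spatial position (assumed layout).
    tiled = np.broadcast_to(global_feat, orth.shape)
    fused = np.concatenate([tiled, orth], axis=-1)  # (H, W, 2C)
    # Average over the spatial positions to get one vector (assumed aggregation).
    return fused.mean(axis=(0, 1))  # (2C,)


rng = np.random.default_rng(0)
local_feat = rng.normal(size=(4, 4, 8))
global_feat = rng.normal(size=8)
descriptor = orthogonal_fusion(local_feat, global_feat)
```

Every vector returned by `orthogonal_component` has a (numerically) zero dot product with `global_feat`, which is what makes the local part complementary to the global one rather than redundant with it.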
Prerequisites: Check requirements.txt
pip install dolg-tensorflow
or
git clone https://github.com/innat/DOLG-TensorFlow.git
[Option 1]
First, create a base model with two outputs: one for the local branch and one for the global branch. The DOLG model requires both. See the demo below.
from tensorflow import keras
from tensorflow.keras import applications

base = applications.EfficientNetB0(...)
new_base = keras.Model(
    [base.inputs],
    [
        base.get_layer('block5c_add').output,  # for local branch
        base.get_layer('block7a_project_bn').output,  # for global branch
    ],
)
Second, pass the model created above to DOLGNet as follows.
from models.DOLG import DOLGNet
dolg_net = DOLGNet(new_base, num_classes=num_classes, activation='softmax')
dolg_net.build_graph().summary()
[Option 2]
Alternatively, the DOLG layers can be integrated into a custom model. Here is one example:
from tensorflow import keras
from tensorflow.keras.layers import Conv2D, Dense, MaxPooling2D

# components of the DOLG model
from layers.GeM import GeneralizedMeanPooling2D
from layers.LocalBranch import DOLGLocalBranch
from layers.OrtholFusion import OrthogonalFusion
vision_input = keras.Input(shape=(img_shape, img_shape, 1), name="img")
x = Conv2D(...)(vision_input)
x = Conv2D ...
y = x = DOLGLocalBranch(IMG_SIZE=img_shape)(x)
x = MaxPooling2D(...)(x)
x = Conv2D ...
gem_pool = GeneralizedMeanPooling2D()(x)
gem_dens = Dense(1024, activation=None)(gem_pool)
vision_output = OrthogonalFusion()([y, gem_dens])
vision = keras.Model(vision_input, vision_output, name="vision")
vision.summary(expand_nested=True, line_length=110)
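For readers unfamiliar with the pooling used in the global branch of the example above, here is a minimal NumPy sketch of generalized-mean (GeM) pooling. It shows the general formula only and is not the code of this repository's GeneralizedMeanPooling2D layer: the function name `gem_pool`, the default `p=3.0`, and the `eps` clamp are assumptions chosen for illustration.

```python
import numpy as np


def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized-mean pooling over the spatial axes.

    feature_map: array of shape (H, W, C). Returns an array of shape (C,).
    """
    # Clamp to a small positive value so the power is well defined.
    clipped = np.maximum(feature_map, eps)
    # Mean of x**p over the spatial positions, followed by the p-th root.
    return np.mean(clipped ** p, axis=(0, 1)) ** (1.0 / p)


rng = np.random.default_rng(0)
feature_map = rng.uniform(0.1, 1.0, size=(5, 5, 4))
pooled = gem_pool(feature_map)
```

With `p=1` this reduces to average pooling, and as `p` grows the result approaches max pooling, so `p` controls how strongly the pooled vector focuses on the largest activations.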
The DOLG concept can be integrated into any computer vision model, e.g. NFNet, ResNeSt, or EfficientNet. Here are some end-to-end code examples.