zea.models.deeplabv3
DeepLabV3+ architecture for multi-class segmentation. For more details see https://arxiv.org/abs/1802.02611.
Functions
| DeeplabV3Plus | Build DeepLabV3+ model for semantic segmentation. |
| DilatedSpatialPyramidPooling | Implement Atrous Spatial Pyramid Pooling (ASPP) module. |
| convolution_block | Create a convolution block with batch normalization and ReLU activation. |
- zea.models.deeplabv3.DeeplabV3Plus(image_shape, num_classes, pretrained_weights=None)
Build DeepLabV3+ model for semantic segmentation.
DeepLabV3+ combines the benefits of spatial pyramid pooling with those of an encoder-decoder architecture. It uses a ResNet50 backbone as the encoder, ASPP for multi-scale feature extraction, and a simple decoder to recover spatial detail.
Architecture:
1. Encoder: ResNet50 backbone with atrous convolutions
2. ASPP: Multi-scale feature extraction
3. Decoder: Simple decoder with skip connections
4. Output: Final segmentation prediction
Reference: https://arxiv.org/abs/1802.02611
- Parameters:
image_shape (tuple) – Input image shape as (height, width, channels)
num_classes (int) – Number of output classes for segmentation
pretrained_weights (str, optional) – Pretrained weights for ResNet50 backbone. Defaults to None.
- Returns:
Complete DeepLabV3+ model
- Return type:
keras.Model
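The encoder, ASPP, and decoder stages can be followed by tracking spatial sizes alone. The sketch below is not part of zea. `deeplab_feature_shapes` is a hypothetical helper, and the strides (16 for the encoder output, 4 for the low-level skip features) are the values used in the DeepLabV3+ paper, assumed here rather than confirmed for this implementation. It also assumes the output has one channel per class at the input resolution.

```python
def deeplab_feature_shapes(image_shape, num_classes,
                           output_stride=16, low_level_stride=4):
    """Spatial sizes at the main stages of a DeepLabV3+-style network.

    Hypothetical helper for illustration only. The stride defaults follow
    the DeepLabV3+ paper and are an assumption about the actual model.
    """
    height, width, _ = image_shape
    # For simplicity this sketch only handles sizes that divide evenly.
    if height % output_stride or width % output_stride:
        raise ValueError("height and width must be divisible by output_stride")
    return {
        # Encoder output that is fed to ASPP.
        "encoder_output": (height // output_stride, width // output_stride),
        # Higher-resolution features used by the decoder skip connection.
        "low_level_features": (height // low_level_stride,
                               width // low_level_stride),
        # Assumed final prediction: one channel per class at input resolution.
        "output": (height, width, num_classes),
    }


shapes = deeplab_feature_shapes((512, 512, 3), num_classes=4)
for stage, shape in shapes.items():
    print(stage, shape)
```

Under these assumptions, a 512 x 512 input is reduced to 32 x 32 before ASPP, and the decoder merges it with 128 x 128 skip features before restoring the full resolution.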
- zea.models.deeplabv3.DilatedSpatialPyramidPooling(dspp_input)
Implement Atrous Spatial Pyramid Pooling (ASPP) module.
ASPP captures multi-scale context by applying parallel atrous convolutions with different dilation rates. This helps the model understand objects at multiple scales.
The module consists of:
- Global average pooling branch
- 1x1 convolution branch
- 3x3 convolutions with dilation rates 6, 12, and 18
Reference: https://arxiv.org/abs/1706.05587
- Parameters:
dspp_input (Tensor) – Input feature tensor from encoder
- Returns:
Multi-scale feature representation
- Return type:
Tensor
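To see why the dilation rates 6, 12, and 18 give multi-scale context, it helps to compute how far a dilated 3x3 kernel reaches. A kernel of size k with dilation rate d spans k + (k - 1)(d - 1) pixels per side while keeping the same number of weights. The helper name `effective_kernel_size` below is hypothetical and used only for illustration.

```python
def effective_kernel_size(kernel_size, dilation_rate):
    """Side length in pixels covered by a dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation_rate - 1)


# Dilation rates of the 3x3 ASPP branches listed above, plus rate 1 for reference.
spans = {rate: effective_kernel_size(3, rate) for rate in (1, 6, 12, 18)}
for rate, span in spans.items():
    print(f"dilation rate {rate:2d}: covers {span} x {span} pixels")
```

The three dilated branches therefore cover 13, 25, and 37 pixels per side, each with only nine weights per input-output channel pair.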
- zea.models.deeplabv3.convolution_block(block_input, num_filters=256, kernel_size=3, dilation_rate=1, use_bias=False)
Create a convolution block with batch normalization and ReLU activation.
This is a standard building block used throughout the DeepLabV3+ architecture, consisting of Conv2D -> BatchNormalization -> ReLU.
- Parameters:
block_input (Tensor) – Input tensor to the convolution block
num_filters (int) – Number of output filters/channels. Defaults to 256.
kernel_size (int) – Size of the convolution kernel. Defaults to 3.
dilation_rate (int) – Dilation rate for dilated convolution. Defaults to 1.
use_bias (bool) – Whether to use bias in the convolution layer. Defaults to False.
- Returns:
Output tensor after convolution, batch normalization, and ReLU
- Return type:
Tensor
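A rough parameter count shows how the arguments of a Conv2D -> BatchNormalization -> ReLU block interact. The sketch below is not taken from zea. `conv_block_param_count` is a hypothetical helper that assumes a square 2-D kernel and a batch normalization layer with both scale and offset enabled, and the input channel count of 256 is an arbitrary example. A common reason to leave the convolution bias off in such blocks is that the batch normalization offset already plays that role. The source does not state whether that is the motivation here.

```python
def conv_block_param_count(in_channels, num_filters=256, kernel_size=3,
                           use_bias=False):
    """Parameter count of a Conv2D -> BatchNormalization -> ReLU block.

    Hypothetical helper. Assumes a square kernel and batch normalization
    with learnable scale and offset plus moving mean and variance.
    """
    conv = kernel_size * kernel_size * in_channels * num_filters
    if use_bias:
        conv += num_filters
    bn_trainable = 2 * num_filters      # scale (gamma) and offset (beta)
    bn_non_trainable = 2 * num_filters  # moving mean and moving variance
    return {
        "trainable": conv + bn_trainable,
        "non_trainable": bn_non_trainable,
    }


# The dilation rate does not appear: it changes the kernel's reach,
# not the number of weights.
default_block = conv_block_param_count(in_channels=256)
pointwise_block = conv_block_param_count(in_channels=256, kernel_size=1)
print(default_block)
print(pointwise_block)
```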