When using segmentation_models_pytorch.DPT with the encoder tu-vit_base_patch16_224.augreg_in21k and default parameters, I get a RuntimeError caused by a tensor shape mismatch.
Code to reproduce
import segmentation_models_pytorch as smp
import torch

model = smp.DPT(
    encoder_name='tu-vit_base_patch16_224.augreg_in21k',
    encoder_depth=4,
    encoder_weights='imagenet',
    encoder_output_indices=None,
    decoder_readout='cat',
    decoder_intermediate_channels=(256, 512, 1024, 1024),
    decoder_fusion_channels=256,
    in_channels=3,
    classes=1,
    activation=None,
    aux_params=None,
)

x = torch.rand(8, 3, 224, 224)
y = model(x)  # RuntimeError occurs here
Error traceback
RuntimeError: The expanded size of the tensor (196) must match the existing size (8) at non-singleton dimension 1. Target sizes: [8, 196, 768]. Tensor sizes: [8, 768]
Environment
segmentation-models-pytorch: latest version (0.4.1.dev0)
timm: 1.0.15
pytorch: 2.4.0
python: 3.10.14
OS: Windows 10
I also tried setting encoder_weights=None and explicitly specifying encoder_output_indices=(3, 6, 9, 11), but the same error occurs. It appears the encoder returns features of shape [B, C] (e.g., [8, 768]) instead of the expected token shape [B, N, C], which causes the reshape operations in the DPT decoder to fail. Please let me know if I'm missing something about using ViT encoders with DPT.
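For reference, here is a minimal diagnostic sketch to inspect the shapes the underlying timm model returns for intermediate features. It assumes timm's forward_intermediates API on ViT models (available in timm 1.0.x); the exact signature and defaults may differ in other versions.

import timm
import torch

# Assumption: timm >= 1.0 exposes forward_intermediates on ViT models.
vit = timm.create_model('vit_base_patch16_224.augreg_in21k', pretrained=False)
x = torch.rand(8, 3, 224, 224)

# intermediates_only=True returns only the list of intermediate features.
# With the default output_fmt='NCHW' each entry should be [B, C, H, W]
# (e.g., [8, 768, 14, 14]); output_fmt='NLC' should give tokens as [B, N, C].
feats = vit.forward_intermediates(
    x, indices=(3, 6, 9, 11), intermediates_only=True
)
for i, f in enumerate(feats):
    print(i, tuple(f.shape))

If any of these shapes comes back as [8, 768], that would match the tensor size reported in the traceback above.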