This is the official repository of the NeurIPS 2025 paper "λ-Orthogonality Regularization for Compatible Representation Learning" by Simone Ricci, Niccolò Biondi, Federico Pernici, Ioannis Patras, and Alberto Del Bimbo.
Retrieval systems rely on representations learned by increasingly powerful models. However, due to the high training cost and the inconsistencies among independently learned representations, there is significant interest in enabling communication between representations and ensuring compatibility across independently trained neural networks. In the literature, two primary approaches are commonly used to adapt different learned representations: affine transformations, which adapt well to specific distributions but can significantly alter the original representation, and orthogonal transformations, which preserve the original structure under strict geometric constraints but limit adaptability. A key challenge is adapting the latent spaces of updated models to align with those of previous models on downstream distributions while preserving the newly learned representation spaces. In this paper, we impose a relaxed orthogonality constraint, namely $\lambda$-orthogonality regularization, which aligns the updated representation space with the old one on downstream distributions while preserving the structure of the newly learned representations.
Overview of the proposed approach for achieving representation compatibility during retrieval system updates. A new, independently trained model is aligned to the old representation space via an orthogonal transformation.
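For reference, the regularizer can be summarized as follows. This sketch mirrors the PyTorch loss shown at the end of this README, assuming its `threshold` corresponds to the desired $\lambda$ (cf. `--orth_reg`) and `alpha` sets the sharpness of the sigmoid gate.

```latex
% Sketch of the $\lambda$-orthogonality regularizer on the adapter weight W,
% matching the PyTorch loss implementation at the end of this README.
\[
  d(W) = \left\lVert W W^\top - I \right\rVert_F ,
  \qquad
  \mathcal{R}_\lambda(W) = \sigma\!\left( \alpha \, \bigl( d(W) - \lambda \bigr) \right) \, d(W),
\]
% where $\sigma$ is the logistic sigmoid: the penalty is suppressed while
% $d(W) < \lambda$ and approaches the plain Frobenius penalty once $d(W) \gg \lambda$.
```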
```bibtex
@inproceedings{ricci2025orthogonality,
  title={$\lambda$-Orthogonality Regularization for Compatible Representation Learning},
  author={Simone Ricci and Niccolo Biondi and Federico Pernici and Ioannis Patras and Alberto Del Bimbo},
  booktitle={The Thirty-Ninth Annual Conference on Neural Information Processing Systems},
  year={2025},
  url={https://arxiv.org/abs/2509.16664}
}
```

- Create and Activate Conda Environment
```bash
conda create -y -n lambda_orthogonality python=3.12
conda activate lambda_orthogonality
```

- Ensure you have the correct version of PyTorch and torchvision
```bash
# CUDA 12.1
conda install pytorch==2.1.1 torchvision==0.16.1 pytorch-cuda=12.1 -c pytorch -c nvidia
```
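Optionally, a quick sanity check (not part of the repository) to confirm that the installed build sees your GPU:

```python
# Optional sanity check: verify the PyTorch version and CUDA availability.
import torch

print(torch.__version__)          # expected: 2.1.1
print(torch.cuda.is_available())  # True if the CUDA build and driver are set up correctly
```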
- Cloning Repository and Requirements

```bash
git clone https://github.com/miccunifi/lambda_orthogonality.git
cd lambda_orthogonality/
chmod +x install_requirements.sh
./install_requirements.sh
```

- Extract feature vectors for the CIFAR100 dataset with a ResNet-18 and a ViT_L_16
```bash
python extract_features_pretrained_models.py --dataset cifar100
```

- Extract feature vectors for the ImageNet1K dataset (the dataset path needs to be added in the code) with a ResNet-18 and a ViT_L_16
```bash
python extract_features_pretrained_models.py --dataset imagenet1k
```

Feature vectors and the corresponding labels will be saved to the `./extracted_features` folder.
If a new dataset or model is required, adapt the dataloader for the specific dataset or modify the model architecture as needed.
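The sketch below illustrates the general pattern of plugging a different torchvision dataset into a feature-extraction loop. It is not the repository's `extract_features_pretrained_models.py`; the STL10 dataset, transform, backbone, and output filename are hypothetical choices made only for illustration.

```python
# Illustrative only: a generic feature-extraction loop for a new dataset.
# Dataset, transform, backbone, and output path are placeholder choices.
import os

import torch
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

transform = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
dataset = datasets.STL10(root="./data", split="test", download=True, transform=transform)
loader = DataLoader(dataset, batch_size=256, num_workers=4)

# ResNet-18 backbone with its classification head replaced by an identity,
# so the forward pass returns 512-dimensional feature vectors.
backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
backbone.fc = torch.nn.Identity()
backbone.eval().to(device)

features, labels = [], []
with torch.no_grad():
    for images, targets in loader:
        features.append(backbone(images.to(device)).cpu())
        labels.append(targets)

os.makedirs("./extracted_features", exist_ok=True)
torch.save({"features": torch.cat(features), "labels": torch.cat(labels)},
           "./extracted_features/stl10_resnet18.pt")
```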
## Training and Evaluation
- Train both the forward and backward adapters on CIFAR100 with our $\lambda$-orthogonality regularization ($\lambda = 12$) and run the evaluation.

```bash
python main.py --method lambda_orth
```

`main.py` has multiple parameters that can be adjusted for training and evaluation, including:
- `--method`: Loss method to use (e.g., "fct", "fastfill", "lambda_orth").
- `--train_dataset`: Name of the training dataset (e.g., "cifar100", "imagenet1k").
- `--orth_reg`: Orthogonality regularization limit (the desired value of $\lambda$). If strict orthogonal regularization is needed, set this to -2.
- `--epochs`: Number of training epochs.
- `--temperature`: Temperature value for the contrastive loss function.
- `--init_orthogonal`: Flag to initialize the adapters with orthogonal weights.
- `--bias`: Whether to use bias in the adapters.
- `--batch_size_train`: Training batch size.
- `--partial_backfilling`: Enable partial backfilling.
- `--backfilling_list`: List of backfilling methods for partial backfilling (e.g., "mse", "ract", "random", "sigma").
These parameters allow for fine-tuning the training process and adapting the model to different datasets and tasks.
Below is the implementation of our λ-Orthogonality Regularization loss:
```python
def lambda_orthogonality_loss(self):
    """
    Computes the Frobenius norm of (W W^T - I) and applies a smooth thresholding.

    Returns:
        loss (torch.Tensor): Smoothly scaled orthogonality loss.
    """
    W = self.fc.weight  # Shape: [features_size, features_size]

    # Compute W W^T
    WWt = torch.matmul(W, W.t())

    # Create identity matrix on the same device as W
    I = torch.eye(WWt.size(0), device=W.device)

    # Compute Frobenius norm of (W W^T - I)
    fro_norm = torch.norm(WWt - I, p='fro')

    # Apply smooth scaling using a sigmoid function
    scaling_factor = torch.sigmoid((fro_norm - self.threshold) * self.alpha)

    # Final loss is scaled by the sigmoid factor
    loss = scaling_factor * fro_norm
    return loss
```
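To show how such a regularizer is typically combined with an alignment objective when training an adapter, here is a minimal, self-contained sketch. The `Adapter` class, the MSE alignment term, the random features, and all hyperparameter values are hypothetical placeholders; they do not reproduce the training loop in `main.py`.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Hypothetical linear adapter used only for this illustration."""
    def __init__(self, features_size, threshold=12.0, alpha=10.0, bias=False):
        super().__init__()
        self.fc = nn.Linear(features_size, features_size, bias=bias)
        self.threshold = threshold  # desired lambda (cf. --orth_reg)
        self.alpha = alpha          # sharpness of the sigmoid gate

    def forward(self, x):
        return self.fc(x)

    def lambda_orthogonality_loss(self):
        W = self.fc.weight
        fro_norm = torch.norm(W @ W.t() - torch.eye(W.size(0), device=W.device), p='fro')
        return torch.sigmoid((fro_norm - self.threshold) * self.alpha) * fro_norm

# Toy training step: align new features to old ones (MSE here, purely
# illustrative) while softly constraining the adapter towards orthogonality.
adapter = Adapter(features_size=512)
optimizer = torch.optim.Adam(adapter.parameters(), lr=1e-3)

new_feats = torch.randn(256, 512)   # features from the new model (dummy data)
old_feats = torch.randn(256, 512)   # features from the old model (dummy data)

alignment = nn.functional.mse_loss(adapter(new_feats), old_feats)
loss = alignment + adapter.lambda_orthogonality_loss()

optimizer.zero_grad()
loss.backward()
optimizer.step()
```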