This is the official repository of the NeurIPS 2025 paper "λ-Orthogonality Regularization for Compatible Representation Learning" by Simone Ricci, Niccolò Biondi, Federico Pernici, Ioannis Patras, and Alberto Del Bimbo.
Retrieval systems rely on representations learned by increasingly powerful models. However, due to the high training cost and the inconsistencies among independently learned representations, there is significant interest in enabling communication between representations and ensuring compatibility across independently trained neural networks. In the literature, two primary approaches are commonly used to adapt different learned representations: affine transformations, which adapt well to specific distributions but can significantly alter the original representation, and orthogonal transformations, which preserve the original structure under strict geometric constraints but limit adaptability. A key challenge is adapting the latent spaces of updated models to align with those of previous models on downstream distributions while preserving the newly learned representation spaces. In this paper, we impose a relaxed orthogonality constraint, namely $\lambda$-orthogonality regularization, which aligns the updated representation space with the old one on downstream distributions while preserving the structure of the newly learned representations.
Overview of the proposed approach for achieving representation compatibility during retrieval system updates. A new, independently trained model is aligned to the old representation space via an orthogonal transformation.
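For reference, the regularizer can be summarized as follows. This sketch mirrors the PyTorch loss shown at the end of this README, assuming its `threshold` corresponds to the desired $\lambda$ (cf. `--orth_reg`) and `alpha` sets the sharpness of the sigmoid gate.

```latex
% Sketch of the $\lambda$-orthogonality regularizer on the adapter weight W,
% matching the PyTorch loss implementation at the end of this README.
\[
  d(W) = \left\lVert W W^\top - I \right\rVert_F ,
  \qquad
  \mathcal{R}_\lambda(W) = \sigma\!\left( \alpha \, \bigl( d(W) - \lambda \bigr) \right) \, d(W),
\]
% where $\sigma$ is the logistic sigmoid: the penalty is suppressed while
% $d(W) < \lambda$ and approaches the plain Frobenius penalty once $d(W) \gg \lambda$.
```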
```bibtex
@inproceedings{ricci2025orthogonality,
  title={$\lambda$-Orthogonality Regularization for Compatible Representation Learning},
  author={Simone Ricci and Niccolo Biondi and Federico Pernici and Ioannis Patras and Alberto Del Bimbo},
  booktitle={The Thirty-Ninth Annual Conference on Neural Information Processing Systems},
  year={2025},
  url={https://arxiv.org/abs/2509.16664}
}
```

- Create and Activate Conda Environment
```bash
conda create -y -n lambda_orthogonality python=3.12
conda activate lambda_orthogonality
```

- Ensure you have the correct version of PyTorch and torchvision
```bash
# CUDA 12.1
conda install pytorch==2.1.1 torchvision==0.16.1 pytorch-cuda=12.1 -c pytorch -c nvidia
```
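Optionally, a quick sanity check (not part of the repository) to confirm that the installed build sees your GPU:

```python
# Optional sanity check: verify the PyTorch version and CUDA availability.
import torch

print(torch.__version__)          # expected: 2.1.1
print(torch.cuda.is_available())  # True if the CUDA build and driver are set up correctly
```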
- Cloning Repository and Requirements

```bash
git clone https://github.com/miccunifi/lambda_orthogonality.git
cd lambda_orthogonality/
chmod +x install_requirements.sh
./install_requirements.sh
```

- Extract feature vectors for the CIFAR100 dataset with a ResNet-18 and a ViT_L_16
```bash
python extract_features_pretrained_models.py --dataset cifar100
```

- Extract feature vectors for the ImageNet1K dataset (the dataset path needs to be added in the code) with a ResNet-18 and a ViT_L_16
```bash
python extract_features_pretrained_models.py --dataset imagenet1k
```

Feature vectors and the corresponding labels will be saved to the `./extracted_features` folder.
If a new dataset or model is required, adapt the dataloader for the specific dataset or modify the model architecture as needed.
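The sketch below illustrates the general pattern of plugging a different torchvision dataset into a feature-extraction loop. It is not the repository's `extract_features_pretrained_models.py`; the STL10 dataset, transform, backbone, and output filename are hypothetical choices made only for illustration.

```python
# Illustrative only: a generic feature-extraction loop for a new dataset.
# Dataset, transform, backbone, and output path are placeholder choices.
import os

import torch
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

transform = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
dataset = datasets.STL10(root="./data", split="test", download=True, transform=transform)
loader = DataLoader(dataset, batch_size=256, num_workers=4)

# ResNet-18 backbone with its classification head replaced by an identity,
# so the forward pass returns 512-dimensional feature vectors.
backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
backbone.fc = torch.nn.Identity()
backbone.eval().to(device)

features, labels = [], []
with torch.no_grad():
    for images, targets in loader:
        features.append(backbone(images.to(device)).cpu())
        labels.append(targets)

os.makedirs("./extracted_features", exist_ok=True)
torch.save({"features": torch.cat(features), "labels": torch.cat(labels)},
           "./extracted_features/stl10_resnet18.pt")
```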
## Training and Evaluation
- Train both the forward and backward adapters on CIFAR100 with our $\lambda$-orthogonality regularization ($\lambda = 12$) and run the evaluation.

```bash
python main.py --method lambda_orth
```

`main.py` has multiple parameters that can be adjusted for training and evaluation, including:
- `--method`: Loss method to use (e.g., "fct", "fastfill", "lambda_orth").
- `--train_dataset`: Name of the training dataset (e.g., "cifar100", "imagenet1k").
- `--orth_reg`: Orthogonality regularization limit (the desired value of $\lambda$). If strict orthogonal regularization is needed, set this to -2.
- `--epochs`: Number of training epochs.
- `--temperature`: Temperature value for the contrastive loss function.
- `--init_orthogonal`: Flag to initialize the adapters with orthogonal weights.
- `--bias`: Whether to use bias in the adapters.
- `--batch_size_train`: Training batch size.
- `--partial_backfilling`: Enable partial backfilling.
- `--backfilling_list`: List of backfilling methods for partial backfilling (e.g., "mse", "ract", "random", "sigma").
These parameters allow for fine-tuning the training process and adapting the model to different datasets and tasks.
Below is the implementation of our λ-Orthogonality Regularization loss:
```python
def lambda_orthogonality_loss(self):
    """
    Computes the Frobenius norm of (W W^T - I) and applies a smooth thresholding.

    Returns:
        loss (torch.Tensor): Smoothly scaled orthogonality loss.
    """
    W = self.fc.weight  # Shape: [features_size, features_size]

    # Compute W W^T
    WWt = torch.matmul(W, W.t())

    # Create identity matrix on the same device as W
    I = torch.eye(WWt.size(0), device=W.device)

    # Compute Frobenius norm of (W W^T - I)
    fro_norm = torch.norm(WWt - I, p='fro')

    # Apply smooth scaling using a sigmoid function
    scaling_factor = torch.sigmoid((fro_norm - self.threshold) * self.alpha)

    # Final loss is scaled by the sigmoid factor
    loss = scaling_factor * fro_norm
    return loss
```
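To show how such a regularizer is typically combined with an alignment objective when training an adapter, here is a minimal, self-contained sketch. The `Adapter` class, the MSE alignment term, the random features, and all hyperparameter values are hypothetical placeholders; they do not reproduce the training loop in `main.py`.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Hypothetical linear adapter used only for this illustration."""
    def __init__(self, features_size, threshold=12.0, alpha=10.0, bias=False):
        super().__init__()
        self.fc = nn.Linear(features_size, features_size, bias=bias)
        self.threshold = threshold  # desired lambda (cf. --orth_reg)
        self.alpha = alpha          # sharpness of the sigmoid gate

    def forward(self, x):
        return self.fc(x)

    def lambda_orthogonality_loss(self):
        W = self.fc.weight
        fro_norm = torch.norm(W @ W.t() - torch.eye(W.size(0), device=W.device), p='fro')
        return torch.sigmoid((fro_norm - self.threshold) * self.alpha) * fro_norm

# Toy training step: align new features to old ones (MSE here, purely
# illustrative) while softly constraining the adapter towards orthogonality.
adapter = Adapter(features_size=512)
optimizer = torch.optim.Adam(adapter.parameters(), lr=1e-3)

new_feats = torch.randn(256, 512)   # features from the new model (dummy data)
old_feats = torch.randn(256, 512)   # features from the old model (dummy data)

alignment = nn.functional.mse_loss(adapter(new_feats), old_feats)
loss = alignment + adapter.lambda_orthogonality_loss()

optimizer.zero_grad()
loss.backward()
optimizer.step()
```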