Publications

You can also find my articles on my Google Scholar profile.

With Limited Data for Multimodal Alignment, Let the STRUCTURE Guide You.

Published in NeurIPS, 2025

STRUCTURE is a framework for building multimodal models in low-data regimes by aligning frozen unimodal foundation models, enabling strong performance on zero-shot classification and retrieval tasks. It introduces a simple, plug-and-play regularization technique that preserves the geometric structure of each modality’s latent space, and it aligns the layers that have the highest representational similarity across modalities.

Download Paper

Cross-domain Open-world Discovery

Published in ICML, 2024

CROW is a cross-domain open-world discovery method that automatically assigns samples to seen classes and discovers novel classes under domain shift. It uses a cluster-then-match strategy enabled by the well-structured representation space of foundation models.

Download Paper