BioX-Bridge: Model Bridging for Unsupervised Cross-Modal Knowledge Transfer across Biosignals
Chenqi Li, Yu Liu, Timothy Denison, Tingting Zhu
Abstract
Biosignals offer valuable insights into the physiological states of the human body. Although biosignal modalities differ in functionality, signal fidelity, sensor comfort, and cost, they are often intercorrelated, reflecting the holistic and interconnected nature of human physiology. This opens up the possibility of performing the same tasks using alternative biosignal modalities, thereby improving the accessibility, usability, and adaptability of health monitoring systems. However, the limited availability of large labeled datasets presents challenges for training models tailored to specific tasks and modalities of interest. Unsupervised cross-modal knowledge transfer offers a promising solution by leveraging knowledge from an existing modality to support model training for a new modality. Existing methods are typically based on knowledge distillation, which requires running a teacher model alongside student model training, resulting in high computational and memory overhead. This challenge is further exacerbated by the recent development of foundation models that demonstrate superior performance and generalization across tasks at the cost of large model sizes. To this end, we explore a new framework for unsupervised cross-modal knowledge transfer of biosignals by training a lightweight bridge network to align the intermediate representations and enable information flow between foundation models and across modalities. Specifically, we introduce an efficient strategy for selecting alignment positions where the bridge should be constructed, along with a flexible prototype network as the bridge architecture. Extensive experiments across multiple biosignal modalities, tasks, and datasets show that BioX-Bridge reduces the number of trainable parameters by 88-99% while maintaining or even improving transfer performance compared to state-of-the-art methods.
BioX-Bridge enables parameter-efficient cross-modal knowledge transfer across biosignals using lightweight prototype-based bridge networks between foundation models.
- Efficient framework for unsupervised cross-modal biosignal transfer without knowledge distillation overhead
- Two-stage bridge position selection strategy identifying optimal connection points between representations
- Prototype network architecture reducing trainable parameters by 88-99% while maintaining performance
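The contributions above describe a lightweight prototype network that bridges intermediate representations between two frozen foundation models. As a rough illustration only (not the paper's actual architecture; the dimensions, names, and the softmax-over-prototypes mapping are all assumptions), such a bridge might re-express source features as soft assignments over learned prototypes and decode them into the target model's feature space:

```python
import numpy as np

rng = np.random.default_rng(0)

class PrototypeBridge:
    """Hypothetical sketch of a prototype-style bridge. Source-model features
    are compared against learned prototype keys, and the resulting soft
    assignment weights mix learned value vectors in the target feature space.
    All names and dimensions are illustrative, not taken from the paper."""

    def __init__(self, src_dim, tgt_dim, n_prototypes=16, temperature=0.1):
        # Learnable parameters (randomly initialized here for the sketch).
        self.prototypes = rng.normal(size=(n_prototypes, src_dim))  # keys in source space
        self.values = rng.normal(size=(n_prototypes, tgt_dim))      # decoded target vectors
        self.temperature = temperature

    def __call__(self, x):
        # x: (batch, src_dim) intermediate features tapped from the source model.
        logits = x @ self.prototypes.T / self.temperature
        logits -= logits.max(axis=1, keepdims=True)       # numerical stability
        weights = np.exp(logits)
        weights /= weights.sum(axis=1, keepdims=True)     # softmax over prototypes
        return weights @ self.values                      # (batch, tgt_dim)

bridge = PrototypeBridge(src_dim=64, tgt_dim=128)
src_feats = rng.normal(size=(4, 64))
tgt_feats = bridge(src_feats)
print(tgt_feats.shape)  # (4, 128)
```

Under these assumed dimensions, the bridge holds n_prototypes × (src_dim + tgt_dim) = 16 × 192 = 3,072 parameters versus 8,192 for a dense 64 × 128 projection, which hints at why a prototype-based bridge can stay parameter-light relative to the foundation models it connects.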
Keywords
- model bridging
- prototype networks
- cross-modal transfer
- foundation models
Datasets
- ISRUC
- WESAD
- FOG
Limitations
- Depends on availability of pre-trained models for each biosignal modality
- Inference time depends on bridge position within the model
Future work
- Explore task-agnostic methods for better generality in multi-task scenarios
- Investigate transfer using unpaired data for any modality combination
- Explore BioX-Bridge for datasets with more than two modalities
Author keywords
- biosignal
- ai for healthcare
- humans and ai
- unsupervised cross-modal knowledge transfer
Related orals
Multimodal Aligned Semantic Knowledge for Unpaired Image-text Matching
MASK aligns semantic knowledge between images and text using word embeddings as bridges to match out-of-distribution words in unpaired matching.
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data
ScaleCUA scales open-source computer use agents with a cross-platform dataset and a dual-loop data pipeline.
VibeVoice: Expressive Podcast Generation with Next-Token Diffusion
Presents VibeVoice for zero-shot expressive long-form multi-speaker podcast generation using next-token diffusion.
UALM: Unified Audio Language Model for Understanding, Generation and Reasoning
UALM is a unified audio language model that handles understanding, text-to-audio generation, and multimodal reasoning in a single model, with UALM-Reason for cross-modal generative reasoning.
MetaEmbed: Scaling Multimodal Retrieval at Test-Time with Flexible Late Interaction
MetaEmbed uses learnable meta tokens with matryoshka training to enable test-time scaling for multimodal retrieval, balancing quality and efficiency.