Abstract: Detecting small and information-scarce objects within complex 3-D backgrounds remains a critical yet challenging task in industrial scenarios. While existing multimodal approaches leverage ...