This paper systematically surveys current efforts on the evaluation, attack, and defense of MLLM safety across both image and text modalities.
Feb 1, 2024
Analysis across 12 state-of-the-art models reveals that MLLMs are susceptible to breaches instigated by the proposed approach, even when the equipped LLMs have been safety-aligned. This work further proposes a straightforward yet effective prompting strategy to enhance the resilience of MLLMs against such attacks.
Nov 29, 2023