| | |
| --- | --- |
| Authors | Zhen Xiang, Linzhi Zheng, Yanjie Li, Junyuan Hong, Qinbin Li, Han Xie, Jiawei Zhang, Zidi Xiong, Chulin Xie, Carl Yang, Dawn Song, Bo Li |
| Published | 2024-06-13 |
| Keywords | LLMs, AI Agents, GPT, Reinforcement Learning |
| Link | Read More |
GuardAgent is a novel approach to enhancing the safety and trustworthiness of LLM-powered agents. It provides a guardrail that moderates a target agent's inputs and outputs through knowledge-enabled reasoning, and it demonstrates strong generalization. The paper introduces two benchmarks and shows GuardAgent's adaptability to emergent LLM agents.
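To make the guardrail idea concrete, below is a minimal illustrative sketch of the general pattern: a guard wrapper that checks a target agent's inputs and outputs against a set of guard requests before releasing a result. The names (`GuardRequest`, `GuardedAgent`, `toy_web_agent`) and the rule-checking logic are hypothetical assumptions for illustration, not the paper's implementation, which builds on knowledge-enabled reasoning and generated guardrail code.

```python
# Illustrative sketch only: a guard wrapper that moderates a target agent's
# inputs/outputs against declared guard requests. Hypothetical names/logic,
# not the GuardAgent implementation from the paper.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class GuardRequest:
    """A guard requirement in natural language plus a programmatic check."""
    description: str
    check: Callable[[str, str], bool]  # (agent_input, agent_output) -> allowed?


class GuardedAgent:
    """Wraps a target agent and moderates its inputs and outputs."""

    def __init__(self, target_agent: Callable[[str], str],
                 guard_requests: List[GuardRequest]):
        self.target_agent = target_agent
        self.guard_requests = guard_requests

    def run(self, user_input: str) -> str:
        # Run the target agent, then check its output against every guard request.
        output = self.target_agent(user_input)
        for request in self.guard_requests:
            if not request.check(user_input, output):
                return f"[denied] violates guard request: {request.description}"
        return output


if __name__ == "__main__":
    # Toy target agent: proposes a purchase action when the query mentions buying.
    def toy_web_agent(query: str) -> str:
        return "ACTION: click 'Buy now'" if "buy" in query.lower() else "ACTION: search"

    no_purchases = GuardRequest(
        description="the agent must not perform purchase actions",
        check=lambda _inp, out: "buy" not in out.lower(),
    )

    guard = GuardedAgent(toy_web_agent, [no_purchases])
    print(guard.run("Please buy this laptop"))   # blocked by the guardrail
    print(guard.run("Find laptops under $500"))  # allowed through
```

The design choice illustrated here is that the guardrail sits outside the target agent, so safety rules can be added or changed without modifying the agent itself, which is what allows such a guard to generalize to new agents.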