Gentle Steering: Designing Personalized Privacy Guardrails for LLMs
MT/PT
| Status | open |
| Advisor | Katharina Barlage, Philipp Thalhammer |
| Professor | Prof. Dr. Florian Alt |
Task
Large Language Models (LLMs) are increasingly embedded in everyday tools, yet they often encourage users to disclose sensitive or personally identifiable information. This thesis explores the design and evaluation of personalized, local guardrails that help users maintain privacy while interacting with LLMs.
The proposed system acts as an intermediary layer between the user and the LLM. It analyzes outgoing prompts and incoming responses to detect potential privacy risks and adjusts outputs accordingly, while preserving the original user intent. The project will investigate:
- How personalized guardrails can be designed to adapt to individual privacy preferences
- Whether explanations (e.g., why content was modified) influence user trust, understanding, and acceptance
- How users react to repeated interventions when their risky disclosure behavior recurs
- Trade-offs between usability, transparency, and privacy protection
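To make the intermediary layer concrete, the sketch below shows one minimal way such a guardrail could work: scanning an outgoing prompt for PII categories the user has opted into, redacting matches, and producing human-readable explanations of what was changed. This is a hypothetical illustration, not the thesis system; the pattern names, `redact_prompt` function, and regex-based detection are assumptions, and a real implementation would rely on richer detectors and learned privacy preferences.

```python
import re

# Hypothetical sketch of a personalized privacy guardrail: it redacts only
# the PII categories the user has enabled and explains each intervention.
# The simple regexes stand in for more capable PII detectors.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"\+?\d[\d\s\-()]{7,}\d"),
}

def redact_prompt(prompt: str, enabled: set[str]) -> tuple[str, list[str]]:
    """Redact the PII categories in `enabled`; return the modified prompt
    plus explanations that could be shown to the user."""
    explanations = []
    for label, pattern in PII_PATTERNS.items():
        if label not in enabled:
            continue  # respect the user's personalized preferences
        prompt, n = pattern.subn(f"[{label}]", prompt)
        if n:
            explanations.append(f"Replaced {n} {label.lower()} value(s) with [{label}].")
    return prompt, explanations
```

For example, `redact_prompt("Contact me at alice@example.com or +49 89 1234567.", {"EMAIL", "PHONE"})` would return the redacted prompt together with one explanation per intervention, which is the kind of transparency whose effect on trust and acceptance the thesis will study.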
