Gentle Steering: Designing Personalized Privacy Guardrails for LLMs
MT/PT
| Status | open |
| Advisor | Katharina Barlage, Philipp Thalhammer |
| Professor | Prof. Dr. Florian Alt |
Task
Large Language Models (LLMs) are increasingly embedded in everyday tools, yet they often encourage users to disclose sensitive or personally identifiable information. This thesis explores the design and evaluation of personalized, local guardrails that help users maintain privacy while interacting with LLMs.
The proposed system acts as an intermediary layer between the user and the LLM. It analyzes outgoing prompts and incoming responses to detect potential privacy risks and adjusts outputs accordingly, while preserving the original user intent. The project will investigate:
- How personalized guardrails can be designed to adapt to individual privacy preferences
- Whether explanations (e.g., why content was modified) influence user trust, understanding, and acceptance
- How users react to repeated interventions when their risky disclosure behavior recurs
- Trade-offs between usability, transparency, and privacy protection
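To make the intermediary layer concrete, the sketch below shows one minimal way such a guardrail could work: scanning an outgoing prompt for PII categories the user has opted into, redacting matches, and producing human-readable explanations of what was changed. This is a hypothetical illustration, not the thesis system; the pattern names, `redact_prompt` function, and regex-based detection are assumptions, and a real implementation would rely on richer detectors and learned privacy preferences.

```python
import re

# Hypothetical sketch of a personalized privacy guardrail: it redacts only
# the PII categories the user has enabled and explains each intervention.
# The simple regexes stand in for more capable PII detectors.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"\+?\d[\d\s\-()]{7,}\d"),
}

def redact_prompt(prompt: str, enabled: set[str]) -> tuple[str, list[str]]:
    """Redact the PII categories in `enabled`; return the modified prompt
    plus explanations that could be shown to the user."""
    explanations = []
    for label, pattern in PII_PATTERNS.items():
        if label not in enabled:
            continue  # respect the user's personalized preferences
        prompt, n = pattern.subn(f"[{label}]", prompt)
        if n:
            explanations.append(f"Replaced {n} {label.lower()} value(s) with [{label}].")
    return prompt, explanations
```

For example, `redact_prompt("Contact me at alice@example.com or +49 89 1234567.", {"EMAIL", "PHONE"})` would return the redacted prompt together with one explanation per intervention, which is the kind of transparency whose effect on trust and acceptance the thesis will study.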
