Policy Maps: Tools for Guiding the Unbounded Space of LLM Behaviors

November 13, 2025

AI policy sets boundaries on acceptable behavior for AI models, but this is challenging in the context of large language models (LLMs): how do you ensure coverage over a vast behavior space? We introduce policy maps, an approach to AI policy design inspired by the practice of physical mapmaking. Instead of aiming for full coverage, policy maps aid effective navigation through intentional design choices about which aspects to capture and which to abstract away. With Policy Projector, an interactive tool for designing LLM policy maps, an AI practitioner can survey the landscape of model input-output pairs, define custom regions (e.g., “violence”), and navigate these regions with if-then policy rules that can act on LLM outputs (e.g., if output contains “violence” and “graphic details,” then rewrite without “graphic details”). Policy Projector supports interactive policy authoring using LLM classification and steering, along with a map visualization that reflects the AI practitioner’s work. In an evaluation with 12 AI safety experts, our system helps policy designers craft policies around problematic model behaviors such as incorrect gender assumptions and handling of immediate physical safety threats.
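The if-then rule from the abstract can be illustrated with a minimal sketch. This is not Policy Projector's implementation: the abstract says the tool uses LLM classification and steering, whereas the sketch below substitutes a keyword matcher for the classifier and sentence dropping for the rewrite step. Every name here (`CONCEPT_KEYWORDS`, `has_concept`, `rewrite_without`, `PolicyRule`) and the keyword lists are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

# ASSUMPTION: keyword lists stand in for an LLM concept classifier.
# A real system would ask a model whether the text matches the concept.
CONCEPT_KEYWORDS: Dict[str, List[str]] = {
    "violence": ["attack", "stab", "fight"],
    "graphic details": ["blood", "gore", "wound"],
}


def has_concept(text: str, concept: str) -> bool:
    """Return True if any keyword for the concept appears in the text."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in CONCEPT_KEYWORDS[concept])


def rewrite_without(text: str, concept: str) -> str:
    """Toy stand-in for LLM steering: drop sentences matching the concept."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    kept = [s for s in sentences if not has_concept(s, concept)]
    return ". ".join(kept) + "." if kept else ""


@dataclass
class PolicyRule:
    """If the output contains all `if_concepts`, rewrite it without `remove_concept`."""
    if_concepts: List[str]
    remove_concept: str

    def applies(self, output: str) -> bool:
        return all(has_concept(output, c) for c in self.if_concepts)

    def apply(self, output: str) -> str:
        if self.applies(output):
            return rewrite_without(output, self.remove_concept)
        return output  # rule does not fire; leave the output unchanged


# The example rule from the abstract: violence + graphic details -> strip the details.
rule = PolicyRule(
    if_concepts=["violence", "graphic details"],
    remove_concept="graphic details",
)
output = "The knight joined the fight. Blood covered the floor. He won."
print(rule.apply(output))
```

In this sketch the rule fires only when both concepts are present, so an output that mentions a fight without graphic detail passes through unchanged. Swapping the two helper functions for model calls would leave the rule structure itself untouched.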

† Stanford University
‡ Carnegie Mellon University
** Work done while at Apple
