Content Policy Enforcement

Foldspace can automatically analyze and flag user-generated content across conversations, inputs, and output streams.

When content violates or risks violating a policy, the system attaches a policy flag and masks the relevant content in the conversation to prevent unsafe or sensitive material from being displayed.

Enterprise feature: Content policy enforcement is available to enterprise customers. To enable or configure moderation and context-sensitive policy handling, contact [email protected].


Supported Policy Categories

  • Dangerous Content: Content that facilitates, promotes, or enables access to harmful goods, services, or activities.
  • Harassment: Content that is malicious, intimidating, bullying, or abusive toward others.
  • Sexually Explicit: Content that is sexually explicit in nature.
  • Hate Speech: Content that attacks or demeans individuals or groups on the basis of identity.
  • Medical Information: Content that promotes, facilitates, or enables access to harmful medical advice or guidance.
  • Violence & Gore: Content that includes gratuitous or realistic descriptions of violence and/or gore.
  • Obscenity & Profanity: Content that contains vulgar, profane, or offensive language.
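For reference, a policy flag attached to a message might carry metadata roughly like the following. These field names are illustrative assumptions, not a documented Foldspace schema; consult your enterprise onboarding materials for the actual format:

```python
# Hypothetical shape of a policy flag attached to a masked message.
# Every field name here is an assumption for illustration only.
policy_flag = {
    "policy": "Harassment",                      # one of the categories above
    "action": "mask",                            # content is masked, not deleted
    "span": {"start": 18, "end": 42},            # character range that was masked
    "surfaced_to": ["moderation", "observability"],
}
print(policy_flag["policy"])
```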

How It Works

Foldspace scans conversational and generated content in real time. If a message matches one or more of the policies above:

  • The system flags the violation for moderation and observability.
  • The affected text is masked in the conversation to maintain a safe, compliant user experience.
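The flag-and-mask flow above can be sketched as follows. This is a minimal illustration, not Foldspace's implementation: the real-time analysis is stubbed with a keyword matcher, and all names are hypothetical.

```python
# Minimal sketch of a flag-and-mask pass. The keyword rules below stand in
# for Foldspace's real-time analysis, which is not publicly documented;
# every name in this sketch is illustrative only.
from dataclasses import dataclass, field

# Stub rules: policy name -> trigger phrases (illustrative only).
POLICY_RULES = {
    "Obscenity & Profanity": ["damn"],
    "Harassment": ["you idiot"],
}

MASK_CHAR = "\u2588"  # full-block character used to redact matched spans


@dataclass
class ModerationResult:
    masked_text: str
    flags: list = field(default_factory=list)


def flag_and_mask(text: str) -> ModerationResult:
    """Flag policy matches and mask the offending spans in the text."""
    flags = []
    lowered = text.lower()
    masked = list(text)
    for policy, phrases in POLICY_RULES.items():
        for phrase in phrases:
            start = lowered.find(phrase)
            while start != -1:
                end = start + len(phrase)
                # Record the violation for moderation/observability...
                flags.append({"policy": policy, "start": start, "end": end})
                # ...and mask the affected span in the conversation.
                for i in range(start, end):
                    masked[i] = MASK_CHAR
                start = lowered.find(phrase, end)
    return ModerationResult("".join(masked), flags)


result = flag_and_mask("Well damn, that took a while.")
print(result.flags[0]["policy"])  # Obscenity & Profanity
print(result.masked_text)         # Well ████, that took a while.
```

A clean message produces no flags and passes through unmasked, which mirrors the behavior described above: only violating spans are masked, and each mask is paired with a flag.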

This ensures AI-driven interactions remain aligned with safety standards and organizational policies.