Guardrails for Amazon Bedrock

Implement safeguards customized to your application requirements and responsible AI policies

Build responsible AI applications with Guardrails for Amazon Bedrock

Guardrails for Amazon Bedrock provides additional customizable safeguards on top of the native protections of FMs, delivering safety protections that are among the best in the industry by:

  • Blocking as much as 85% more harmful content
  • Filtering over 75% of hallucinated responses for RAG and summarization workloads
  • Enabling customers to customize and apply safety, privacy, and truthfulness protections within a single solution

Bring a consistent level of AI safety across all your applications

Guardrails for Amazon Bedrock evaluates user inputs and FM responses based on use-case-specific policies, and provides an additional layer of safeguards regardless of the underlying FM. Guardrails for Amazon Bedrock is the only responsible AI capability offered by a major cloud provider that enables customers to build and customize safety, privacy, and truthfulness protections for their generative AI applications in a single solution, and it works with all large language models (LLMs) in Amazon Bedrock, as well as fine-tuned models. Customers can create multiple guardrails, each configured with a different combination of controls, and use these guardrails across different applications and use cases. Guardrails can also be integrated with Agents and Knowledge Bases for Amazon Bedrock to build generative AI applications aligned with your responsible AI policies. In addition, Guardrails offers an ApplyGuardrail API to evaluate user inputs and model responses generated by any custom or third-party FM outside of Bedrock.
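
As a minimal sketch of this standalone evaluation flow with boto3, assuming a guardrail has already been created in your account (the guardrail ID, version, and Region below are placeholder values):

```python
import boto3

# The ApplyGuardrail API is served by the bedrock-runtime client.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Placeholder identifiers: substitute the ID and version of your own guardrail.
response = client.apply_guardrail(
    guardrailIdentifier="gr-example123",
    guardrailVersion="1",
    source="OUTPUT",  # evaluate a model response; use "INPUT" for user input
    content=[{"text": {"text": "Response text from any custom or third-party FM."}}],
)

# "GUARDRAIL_INTERVENED" means a policy matched; "NONE" means the text passed.
print(response["action"])
if response["action"] == "GUARDRAIL_INTERVENED":
    # outputs holds the configured blocked/redacted replacement text.
    print(response["outputs"])
```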

Block undesirable topics in your generative AI applications

Organizations recognize the need to manage interactions within generative AI applications for a relevant and safe user experience. They want to further customize interactions to remain on topics relevant to their business and align with company policies. Using a short natural language description, Guardrails for Amazon Bedrock allows you to define a set of topics to avoid within the context of your application. Guardrails detects and blocks user inputs and FM responses that fall into the restricted topics. For example, a banking assistant can be designed to avoid topics related to investment advice.
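
A minimal sketch of the banking-assistant example using the boto3 CreateGuardrail API; the topic name, definition, and messaging strings are illustrative values, not prescribed ones:

```python
import boto3

# Guardrails are created through the bedrock (control-plane) client.
client = boto3.client("bedrock", region_name="us-east-1")

response = client.create_guardrail(
    name="banking-assistant-guardrail",
    topicPolicyConfig={
        "topicsConfig": [
            {
                # A denied topic is defined with a short natural language description.
                "name": "Investment Advice",
                "definition": (
                    "Guidance or recommendations on managing money or assets, "
                    "such as buying specific stocks, bonds, or funds."
                ),
                "examples": ["Which stocks should I buy this year?"],
                "type": "DENY",
            }
        ]
    },
    blockedInputMessaging="Sorry, I can't help with investment advice.",
    blockedOutputsMessaging="Sorry, I can't help with investment advice.",
)
print(response["guardrailId"], response["version"])
```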

Filter harmful content based on your responsible AI policies

Guardrails for Amazon Bedrock provides content filters with configurable thresholds to filter harmful content across the hate, insults, sexual, violence, and misconduct (including criminal activity) categories, and to safeguard against prompt attacks (prompt injection and jailbreaks). Most FMs already provide built-in protections to prevent the generation of harmful responses. In addition to these protections, Guardrails lets you configure thresholds across the different content categories to filter out harmful interactions; increasing a filter's strength increases the aggressiveness of the filtering. Guardrails automatically evaluates both user inputs and model responses to detect and help prevent content that falls into restricted categories. For example, an ecommerce site can design its online assistant to avoid using inappropriate language, such as hate speech or insults.
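
Configured through the same CreateGuardrail API, per-category thresholds might look like the following sketch; the strength choices shown are illustrative, not recommendations:

```python
import boto3

client = boto3.client("bedrock", region_name="us-east-1")

response = client.create_guardrail(
    name="ecommerce-assistant-guardrail",
    contentPolicyConfig={
        "filtersConfig": [
            # Higher strength = more aggressive filtering for that category.
            {"type": "HATE", "inputStrength": "HIGH", "outputStrength": "HIGH"},
            {"type": "INSULTS", "inputStrength": "HIGH", "outputStrength": "HIGH"},
            {"type": "SEXUAL", "inputStrength": "MEDIUM", "outputStrength": "MEDIUM"},
            {"type": "VIOLENCE", "inputStrength": "MEDIUM", "outputStrength": "MEDIUM"},
            {"type": "MISCONDUCT", "inputStrength": "MEDIUM", "outputStrength": "MEDIUM"},
            # Prompt attacks are detected on user input; output strength must be NONE.
            {"type": "PROMPT_ATTACK", "inputStrength": "HIGH", "outputStrength": "NONE"},
        ]
    },
    blockedInputMessaging="Sorry, I can't respond to that.",
    blockedOutputsMessaging="Sorry, I can't respond to that.",
)
```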

Redact sensitive information (PII) to protect privacy

Guardrails for Amazon Bedrock allows you to detect sensitive content such as personally identifiable information (PII) in user inputs and FM responses. You can select from a list of predefined PII types or define custom sensitive information types using regular expressions (RegEx). Based on the use case, you can selectively reject inputs containing sensitive information or redact it in FM responses. For example, you can redact users’ personal information while generating summaries from customer and agent conversation transcripts in a call center.
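
A sketch of the call-center example: predefined PII types are anonymized in responses, and a hypothetical custom account-number pattern is defined with RegEx (the pattern, names, and messaging are illustrative):

```python
import boto3

client = boto3.client("bedrock", region_name="us-east-1")

response = client.create_guardrail(
    name="call-center-summary-guardrail",
    sensitiveInformationPolicyConfig={
        # Predefined PII types: ANONYMIZE redacts matches in FM responses,
        # while BLOCK rejects content containing them.
        "piiEntitiesConfig": [
            {"type": "NAME", "action": "ANONYMIZE"},
            {"type": "EMAIL", "action": "ANONYMIZE"},
            {"type": "PHONE", "action": "ANONYMIZE"},
        ],
        # Hypothetical custom sensitive information type defined via RegEx.
        "regexesConfig": [
            {
                "name": "account_number",
                "description": "Internal customer account numbers",
                "pattern": r"\bACCT-\d{8}\b",
                "action": "ANONYMIZE",
            }
        ],
    },
    blockedInputMessaging="Sorry, I can't process that request.",
    blockedOutputsMessaging="Sorry, I can't share that information.",
)
```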

Block inappropriate content with a custom word filter

Guardrails for Amazon Bedrock allows you to configure a set of custom words or phrases that you want to detect and block in the interactions between your users and your generative AI applications. You can also detect and block profanity as well as specific custom words, such as competitor names or other offensive terms.
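
A sketch of a custom word filter combined with the managed profanity list; the specific words below are placeholders:

```python
import boto3

client = boto3.client("bedrock", region_name="us-east-1")

response = client.create_guardrail(
    name="word-filter-guardrail",
    wordPolicyConfig={
        # Placeholder custom words and phrases, e.g. competitor names.
        "wordsConfig": [
            {"text": "AcmeCompetitor"},
            {"text": "fictional product codename"},
        ],
        # Managed word list that detects and blocks profanity.
        "managedWordListsConfig": [{"type": "PROFANITY"}],
    },
    blockedInputMessaging="Sorry, I can't respond to that.",
    blockedOutputsMessaging="Sorry, I can't respond to that.",
)
```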

Detect hallucinations in model responses using contextual grounding checks

Organizations need to deploy truthful and trustworthy generative AI applications to maintain and grow users’ trust. However, applications built using FMs can generate incorrect information due to hallucinations. For example, FMs can generate responses that deviate from the source information, conflate multiple pieces of information, or invent new information. Guardrails for Amazon Bedrock supports contextual grounding checks to detect and filter hallucinations when responses are not grounded in the source information (for example, they are factually inaccurate or introduce new information) or are irrelevant to the user’s query or instruction. Contextual grounding checks can be used to detect hallucinations for RAG, summarization, and conversational applications, where source information serves as the reference to validate the model response.
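
A sketch combining both pieces: grounding and relevance thresholds set at guardrail creation, then checked at runtime via ApplyGuardrail, with the source document and query passed as qualified content (the thresholds, names, and text are illustrative):

```python
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")
runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Responses scoring below a threshold are filtered as ungrounded or irrelevant.
guardrail = bedrock.create_guardrail(
    name="rag-grounding-guardrail",
    contextualGroundingPolicyConfig={
        "filtersConfig": [
            {"type": "GROUNDING", "threshold": 0.75},  # grounded in the source?
            {"type": "RELEVANCE", "threshold": 0.75},  # relevant to the query?
        ]
    },
    blockedInputMessaging="Sorry, I can't respond to that.",
    blockedOutputsMessaging="Sorry, I can't answer that from the source material.",
)

# Runtime check: qualifiers tell the guardrail which text is the reference
# source, which is the user query, and which is the response to validate.
result = runtime.apply_guardrail(
    guardrailIdentifier=guardrail["guardrailId"],
    guardrailVersion="DRAFT",
    source="OUTPUT",
    content=[
        {"text": {"text": "Full source document text...", "qualifiers": ["grounding_source"]}},
        {"text": {"text": "What does the policy cover?", "qualifiers": ["query"]}},
        {"text": {"text": "Model response to validate.", "qualifiers": ["guard_content"]}},
    ],
)
print(result["action"])
```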
