Building an Enterprise Chatbot with Open WebUI, AWS Bedrock, and Custom RAG Pipelines
An enterprise chatbot that integrates Open WebUI, AWS Bedrock, and custom RAG pipelines. It supports secure, scalable deployment on AWS EC2, flexible tool use, and extensive customization for enterprise settings.
Security
- Hosted on AWS EC2 inside a private network, accessible via VPN
- Okta OIDC single sign-on (SSO) for user authentication
- Adheres to enterprise-level security practices
License
- Permissive license for internal use and modification
- Fully customizable, with access to the complete codebase
- Redistribution for profit is restricted
Front-End
- UI: Open WebUI
- Branding Support: Customizable logos, colors, and interface text
LLM Integration
- Enforced Enterprise Policies
- No training or data sharing by provider
- LLM Providers:
- AWS Bedrock
- LiteLLM (local routing or multi-provider support)
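As a sketch of the routing layer, assuming LiteLLM fronts Bedrock, the snippet below sends a chat request through LiteLLM to a Bedrock-hosted model; the model ID, region, and prompt are illustrative:

```python
# Sketch: routing a chat request through LiteLLM to AWS Bedrock.
# The model ID and region are illustrative; use the Bedrock models
# enabled in your account.
import litellm

response = litellm.completion(
    model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{"role": "user", "content": "Summarize our security policy."}],
    aws_region_name="us-east-1",
)
print(response.choices[0].message.content)
```

LiteLLM normalizes provider differences behind an OpenAI-style interface, which is what makes swapping or multiplexing providers straightforward.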
RAG Architecture
1. Built-in Knowledgebase (Open WebUI)
- Uses a local embedding model: all-MiniLM-L6-v2 (ref: SBERT)
- Indexing & Vectorization done locally
- Configurable:
- Chunk size, top-k, and reranking models
- Template-driven response generation
- Limitation: < 1GB total data
- Example query: “Tell me about gradient descent”
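To make the local pipeline concrete, here is a minimal sketch that embeds a few placeholder chunks with all-MiniLM-L6-v2 via the sentence-transformers library and ranks them against that example query by cosine similarity:

```python
# Sketch: local embedding and top-k retrieval with all-MiniLM-L6-v2.
# The chunks are placeholders; Open WebUI performs equivalent indexing
# internally.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

chunks = [
    "Gradient descent updates parameters along the negative gradient.",
    "The learning rate controls the step size of each update.",
    "Dropout is a regularization technique for neural networks.",
]
query = "Tell me about gradient descent"

# Encode the chunks and the query into 384-dimensional vectors.
chunk_embeddings = model.encode(chunks, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Rank chunks by cosine similarity to the query.
scores = util.cos_sim(query_embedding, chunk_embeddings)[0]
for score, chunk in sorted(zip(scores.tolist(), chunks), reverse=True):
    print(f"{score:.3f}  {chunk}")
```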
2. AWS Knowledgebase (Enterprise-Scale)
- Designed for larger datasets (>1GB)
- Bedrock supports:
- Multiple parsing strategies
- Custom chunking logic
- Embedding + graph-based retrieval
- Integrated with proprietary enterprise content
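A minimal sketch of querying a Bedrock knowledge base with boto3 follows; the knowledge base ID and region are placeholders for your own deployment:

```python
# Sketch: retrieving chunks from an AWS Bedrock knowledge base.
# KNOWLEDGE_BASE_ID and the region are placeholders.
import boto3

client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = client.retrieve(
    knowledgeBaseId="KNOWLEDGE_BASE_ID",
    retrievalQuery={"text": "What is our data retention policy?"},
    retrievalConfiguration={
        "vectorSearchConfiguration": {"numberOfResults": 5}
    },
)

# Each result carries the matched text plus its source location,
# which downstream tools can surface as citations.
for result in response["retrievalResults"]:
    print(result["content"]["text"][:80], result.get("location"))
```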
Tool Use
- Tools are custom-built
- Allow Open WebUI to interact with:
- AWS Knowledgebase
- Bedrock-hosted LLMs
- Native tool-calling experience
- Returns citations for transparency
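As a sketch of what such a tool can look like: an Open WebUI tool is a Python class named Tools whose typed, docstring-documented methods the LLM can invoke. The version below wraps the Bedrock retrieval call above and appends source citations; the knowledge base ID is a placeholder:

```python
# Sketch of an Open WebUI custom tool: a Tools class whose typed,
# documented methods are exposed to the LLM for native tool calling.
# The knowledge base ID is a placeholder.
import boto3


class Tools:
    def __init__(self):
        self.client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

    def search_knowledge_base(self, query: str) -> str:
        """
        Search the enterprise AWS knowledge base and return matching
        passages with their source citations.
        :param query: The user's natural-language question.
        """
        response = self.client.retrieve(
            knowledgeBaseId="KNOWLEDGE_BASE_ID",  # placeholder
            retrievalQuery={"text": query},
            retrievalConfiguration={
                "vectorSearchConfiguration": {"numberOfResults": 3}
            },
        )
        # Pair each passage with its S3 source so the model can cite it.
        parts = []
        for result in response["retrievalResults"]:
            text = result["content"]["text"]
            uri = result.get("location", {}).get("s3Location", {}).get("uri", "unknown")
            parts.append(f"{text}\n[Source: {uri}]")
        return "\n\n".join(parts)
```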
MCP Proxy (Model Context Protocol)
- Exposes local MCP servers to remote applications as OpenAPI endpoints
- Acts as a proxy bridge
- Ref: MCP GitHub
- Enables:
- Time-based and memory-sensitive tools
- Unified context management
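Once an MCP server is proxied over OpenAPI, its tools become plain HTTP endpoints that Open WebUI (or any client) can call. A hypothetical request to a proxied time tool might look like the following; the port and endpoint path depend entirely on which MCP server the proxy wraps:

```python
# Sketch: calling an MCP tool through the OpenAPI proxy bridge.
# The URL, endpoint name, and payload are hypothetical examples.
import requests

response = requests.post(
    "http://localhost:8000/get_current_time",  # hypothetical proxied tool
    json={"timezone": "America/New_York"},
    timeout=10,
)
response.raise_for_status()
print(response.json())
```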
Features Covered
- 🔍 Web search integration
- 📁 Document upload and retrieval
- 🔧 Tool execution via LLM
- 📚 Scalable document ingestion (via AWS or local)
Next Steps
- Optimize performance and latency
- Improve multi-user load balancing
- Extend model support and feedback collection
- Add end-to-end logging and observability