Building an Enterprise Chatbot with Open WebUI, AWS Bedrock, and Custom RAG Pipelines
An enterprise chatbot that integrates Open WebUI, AWS Bedrock, and custom RAG pipelines. It supports secure, scalable deployment on AWS EC2, flexible tool use, and extensive customization for enterprise settings.
Security
- Hosted on AWS EC2 inside a private network, accessible via VPN
- Okta OIDC single sign-on (SSO) for user authentication
- Adheres to enterprise-level security practices
License
- Permissive license for internal use and modification
- Fully customizable, with access to the complete codebase
- Redistribution for profit is restricted
Front-End
- UI: Open WebUI
- Branding Support: Customizable logos, colors, and interface text
LLM Integration
- Enforced Enterprise Policies
- No training or data sharing by provider
- LLM Providers:
- AWS Bedrock
- LiteLLM (local routing or multi-provider support)
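As a sketch of the routing layer, assuming LiteLLM fronts Bedrock, the snippet below sends a chat request through LiteLLM to a Bedrock-hosted model; the model ID, region, and prompt are illustrative:

```python
# Sketch: routing a chat request through LiteLLM to AWS Bedrock.
# The model ID and region are illustrative; use the Bedrock models
# enabled in your account.
import litellm

response = litellm.completion(
    model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{"role": "user", "content": "Summarize our security policy."}],
    aws_region_name="us-east-1",
)
print(response.choices[0].message.content)
```

LiteLLM normalizes provider differences behind an OpenAI-style interface, which is what makes swapping or multiplexing providers straightforward.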
RAG Architecture
1. Built-in Knowledgebase (Open WebUI)
- Uses a local embedding model: all-MiniLM-L6-v2 (ref: SBERT)
- Indexing & Vectorization done locally
- Configurable:
- Chunk size, top-k, and reranking models
- Template-driven response generation
- Limitation: < 1GB total data
- Example query: “Tell me about gradient descent”
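To make the local pipeline concrete, here is a minimal sketch that embeds a few placeholder chunks with all-MiniLM-L6-v2 via the sentence-transformers library and ranks them against that example query by cosine similarity:

```python
# Sketch: local embedding and top-k retrieval with all-MiniLM-L6-v2.
# The chunks are placeholders; Open WebUI performs equivalent indexing
# internally.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

chunks = [
    "Gradient descent updates parameters along the negative gradient.",
    "The learning rate controls the step size of each update.",
    "Dropout is a regularization technique for neural networks.",
]
query = "Tell me about gradient descent"

# Encode the chunks and the query into 384-dimensional vectors.
chunk_embeddings = model.encode(chunks, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Rank chunks by cosine similarity to the query.
scores = util.cos_sim(query_embedding, chunk_embeddings)[0]
for score, chunk in sorted(zip(scores.tolist(), chunks), reverse=True):
    print(f"{score:.3f}  {chunk}")
```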
2. AWS Knowledgebase (Enterprise-Scale)
- Designed for larger datasets (>1GB)
- Bedrock supports:
- Multiple parsing strategies
- Custom chunking logic
- Embedding + graph-based retrieval
- Integrated with proprietary enterprise content
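A minimal sketch of querying a Bedrock knowledge base with boto3 follows; the knowledge base ID and region are placeholders for your own deployment:

```python
# Sketch: retrieving chunks from an AWS Bedrock knowledge base.
# KNOWLEDGE_BASE_ID and the region are placeholders.
import boto3

client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = client.retrieve(
    knowledgeBaseId="KNOWLEDGE_BASE_ID",
    retrievalQuery={"text": "What is our data retention policy?"},
    retrievalConfiguration={
        "vectorSearchConfiguration": {"numberOfResults": 5}
    },
)

# Each result carries the matched text plus its source location,
# which downstream tools can surface as citations.
for result in response["retrievalResults"]:
    print(result["content"]["text"][:80], result.get("location"))
```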
Tool Use
- Tools are custom-built
- Allow Open WebUI to interact with:
- AWS Knowledgebase
- Bedrock-hosted LLMs
- Native tool-calling experience
- Returns citations for transparency
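As a sketch of what such a tool can look like: an Open WebUI tool is a Python class named Tools whose typed, docstring-documented methods the LLM can invoke. The version below wraps the Bedrock retrieval call above and appends source citations; the knowledge base ID is a placeholder:

```python
# Sketch of an Open WebUI custom tool: a Tools class whose typed,
# documented methods are exposed to the LLM for native tool calling.
# The knowledge base ID is a placeholder.
import boto3


class Tools:
    def __init__(self):
        self.client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

    def search_knowledge_base(self, query: str) -> str:
        """
        Search the enterprise AWS knowledge base and return matching
        passages with their source citations.
        :param query: The user's natural-language question.
        """
        response = self.client.retrieve(
            knowledgeBaseId="KNOWLEDGE_BASE_ID",  # placeholder
            retrievalQuery={"text": query},
            retrievalConfiguration={
                "vectorSearchConfiguration": {"numberOfResults": 3}
            },
        )
        # Pair each passage with its S3 source so the model can cite it.
        parts = []
        for result in response["retrievalResults"]:
            text = result["content"]["text"]
            uri = result.get("location", {}).get("s3Location", {}).get("uri", "unknown")
            parts.append(f"{text}\n[Source: {uri}]")
        return "\n\n".join(parts)
```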
MCP Proxy (Model Context Protocol)
- Exposes local MCP servers to remote applications as OpenAPI endpoints
- Acts as a proxy bridge
- Ref: MCP GitHub
- Enables:
- Time-based and memory-sensitive tools
- Unified context management
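Once an MCP server is proxied over OpenAPI, its tools become plain HTTP endpoints that Open WebUI (or any client) can call. A hypothetical request to a proxied time tool might look like the following; the port and endpoint path depend entirely on which MCP server the proxy wraps:

```python
# Sketch: calling an MCP tool through the OpenAPI proxy bridge.
# The URL, endpoint name, and payload are hypothetical examples.
import requests

response = requests.post(
    "http://localhost:8000/get_current_time",  # hypothetical proxied tool
    json={"timezone": "America/New_York"},
    timeout=10,
)
response.raise_for_status()
print(response.json())
```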
Features Covered
- 🔍 Web search integration
- 📁 Document upload and retrieval
- 🔧 Tool execution via LLM
- 📚 Scalable document ingestion (via AWS or local)
Next Steps
- Optimize performance and latency
- Improve multi-user load balancing
- Extend model support and feedback collection
- Add end-to-end logging and observability