AI appliance • Local inferencing • Made in Europe
Beautiful. Top performance.
The Silent is a fine example of German engineering. It is based on our successful Silent Bricks performance storage solution and has everything a system needs.
AI Appliance
An AI appliance is a dedicated hardware unit in which an AI model, computing infrastructure (GPU), data storage and software stack are combined into a ready-to-operate system — as an alternative to a self-built AI server or cloud AI.
On-Premises AI
On-premises AI refers to AI systems operated entirely on an organization's own hardware in its own data center or server room — without cloud connectivity, without data transfer to external services.
Local AI systems are not all the same
On-premises AI has undeniable advantages. Data protection, data sovereignty and cost independence are all arguments in favour of companies and public authorities running their own AI systems. However, many solutions come with drawbacks and require compromises.
Too expensive
Large companies often rely on AI systems developed entirely in-house, including custom LLMs. However, these specialised language models must be trained using the company’s own data and regularly retrained to ensure they can provide the most accurate answers possible. This requires enormous computing power in the form of high-end GPUs, which are becoming prohibitively expensive.
Too cheap
Thanks to YouTube tutorials, it is possible to assemble your own system with relatively little effort. However, tuning the components so that they do not hinder one another requires a great deal of expertise. Furthermore, these systems lack professional support and ongoing development. And because prices for storage and GPUs have risen sharply, performance and scalability are usually expensive or even impossible to achieve.
Short-sighted
AI is evolving at an incredible pace. Some systems are locked into specific language models and are therefore unable to benefit from these advances. Others are undersized from the outset and not designed to accommodate growth or new possibilities. Both of these factors hinder the deployment of AI and can lead to more shadow AI, even though that is precisely what we are trying to avoid.
Silent AI is different
Silent AI is the turnkey, professional AI system for businesses, public authorities and organisations that wish to use AI with sensitive data without having to worry about lengthy setup processes or IT maintenance.
Performance
Current GPU
Silent AI Vault
CARE SLA
The Silent AI Vault
Silent AI stores all data locally in a vector database, which the language model then accesses via RAG (Retrieval-Augmented Generation). All data is stored with high availability on an easily replaceable storage module containing 12 NVMe modules, which is protected against data loss by redundancy.
Each Silent AI Vault offers 12, 24, 48 or 96 TB (gross) of storage; the Silent AI appliance has 2 slots for Silent AI Vaults.
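The relationship between gross and usable capacity can be sketched as follows. This is a rough estimate only: it assumes that "triple parity" (as listed in the technical specifications) reserves the equivalent of 3 of the 12 NVMe modules for redundancy, and it ignores filesystem and vector-database overhead. The function names are illustrative and not part of any Silent AI software.

```python
# Rough capacity estimate. The parity layout is an assumption, not a vendor figure.
DRIVES_PER_VAULT = 12   # from the text: 12 NVMe modules per Silent AI Vault
PARITY_DRIVES = 3       # assumption: triple parity reserves 3 modules' worth of space
VAULT_SLOTS = 2         # from the text: the appliance has 2 Vault slots


def usable_tb(gross_tb, drives=DRIVES_PER_VAULT, parity=PARITY_DRIVES):
    """Gross capacity minus the share reserved for parity (no other overhead)."""
    return gross_tb * (drives - parity) / drives


def capacity_table(gross_sizes=(12, 24, 48, 96)):
    """One row per Vault size, including the total with both slots populated."""
    rows = []
    for gross in gross_sizes:
        rows.append({
            "gross_tb": gross,
            "usable_tb": usable_tb(gross),
            "gross_tb_two_vaults": gross * VAULT_SLOTS,
            "usable_tb_two_vaults": usable_tb(gross) * VAULT_SLOTS,
        })
    return rows


if __name__ == "__main__":
    for row in capacity_table():
        print(row)
```

Under these assumptions, three quarters of the gross capacity remains available for data; the actual net capacity should be confirmed with the vendor.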
RAG (Retrieval-Augmented Generation)
RAG is an AI architecture in which a language model does not answer from memory alone but retrieves relevant passages from a defined, controlled dataset and generates its responses on that basis, which grounds answers in the source data and greatly reduces hallucinations.
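The retrieve-then-generate flow can be illustrated with a minimal sketch. It is not the Silent AI implementation: the bag-of-words "embedding" is a toy stand-in for a real embedding model and vector database, the sample documents are invented, and the final call to a language model is left out. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy 'embedding': term counts (a stand-in for a real embedding model)."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, passages):
    """Assemble a prompt that restricts the model to the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say so.\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Invented sample data for illustration only.
documents = [
    "The maintenance interval for pump A is 6 months.",
    "Vacation requests are submitted through the HR portal.",
    "Pump B requires a filter change every 3 months.",
]
query = "What is the maintenance interval for pump A?"
top = retrieve(query, documents, k=1)
prompt = build_prompt(query, top)

if __name__ == "__main__":
    print(prompt)
```

The prompt built here would then be passed to the language model, so that the answer is based on the retrieved passage rather than on the model's training data.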
Technical specifications
Platform | |
Provision | As a turnkey appliance, on-premises |
GPU | Latest Nvidia GPU, 96 GB VRAM, fanless |
Storage for local data | |
Storage | Silent AI Vault storage module with 12x NVMe, triple parity |
Storage slots | 2 slots for Silent AI Vault |
Storage capacity | 12, 24, 48 or 96 TB (gross) per Silent AI Vault |
Network & Interfaces | |
Data | Dual 10GbE (RJ45) or 10GbE SFP+ |
Optional | Dual 25/100GbE (QSFP) |
Management | 1× 1GbE Admin, 1× 1GbE IPMI |
Physical data | |
Power consumption | Idle: approx. 170 W; Typical: approx. 636 W (1 Silent AI Vault active); Maximum: 1,200 W |
Heat dissipation | Typical: 2,170 BTU/h; Maximum: 4,094 BTU/h |
Dimensions | 2U × 19 inches × 640 mm; Weight: approx. 17 kg |
AI & Integration | |
Architecture | RAG (Retrieval-Augmented Generation) |
Identity Management | LDAP/AD, OIDC/SSO (Okta, Entra ID, Google Workspace) or a hybrid solution |
User interface | Web-UI · Chat API from mid-2026 |
Authentication | LDAP, SSO |
Connectors | MS Office, MS SharePoint, MS Outlook/Exchange, Confluence & Jira, Dozuki, Slack, Nextcloud, SMB/File Server, PDF upload, Web Scraper; more in development / on request; API from mid-2026 |
Compliance | |
Data protection | GDPR-compliant, no transfers to third countries |
Regulatory framework | NIS2-ready, compliant with the EU AI Act |
Origin | German product · not subject to the US CLOUD Act / FISA Section 702 |
SLA | Up to 10 years, fixed terms |
The exact configuration may vary depending on availability and requirements. The information provided here is for reference purposes only.
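The heat dissipation figures in the specifications follow directly from the power consumption, since essentially all electrical power drawn is released as heat. A short sketch of the conversion, using the standard factor of roughly 3.412 BTU/h per watt (the function name is illustrative):

```python
BTU_PER_HOUR_PER_WATT = 3.412  # 1 W is approximately 3.412 BTU/h


def watts_to_btu_per_hour(watts):
    """Convert electrical power draw in watts to heat output in BTU/h."""
    return watts * BTU_PER_HOUR_PER_WATT


# Power figures taken from the specification table above.
typical_btu = round(watts_to_btu_per_hour(636))
maximum_btu = round(watts_to_btu_per_hour(1200))

if __name__ == "__main__":
    print(f"Typical: {typical_btu} BTU/h")
    print(f"Maximum: {maximum_btu} BTU/h")
```

The same conversion can be used when sizing server-room cooling for other configurations.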
Frequently asked questions
How many users and requests can the Silent AI hardware handle?
This depends largely on the specific use case and configuration. Complex queries involving a large number of data sources require more processing power than simple queries. Under normal circumstances, Silent AI can process and respond to several thousand queries per hour.
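To relate an hourly query rate to the number of requests the system handles at the same time, Little's law (average concurrency = arrival rate × average response time) offers a rough estimate. The figures below, 3,000 queries per hour and 6 seconds per response, are purely hypothetical placeholders and not measured Silent AI values.

```python
def queries_per_second(queries_per_hour):
    """Convert an hourly query rate to queries per second."""
    return queries_per_hour / 3600


def average_concurrency(queries_per_hour, avg_latency_s):
    """Little's law: average requests in flight = arrival rate x average latency."""
    return queries_per_hour * avg_latency_s / 3600


# Hypothetical example values, not vendor figures.
HOURLY_RATE = 3000
AVG_LATENCY_S = 6

in_flight = average_concurrency(HOURLY_RATE, AVG_LATENCY_S)

if __name__ == "__main__":
    print(f"{queries_per_second(HOURLY_RATE):.2f} queries/s, "
          f"about {in_flight:.0f} requests in flight on average")
```

Real load is rarely evenly distributed, so peak concurrency will be higher than this average.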
Which GPU does Silent AI use?
The appliance is equipped with a current Nvidia model. The exact specifications depend on availability and intended use, and are always selected in consultation with the customer. As the GPU is used solely for inference and not for training language models, the specific performance is not a critical factor.
How much storage space is available for the vector database, etc.?
The Silent AI appliance