Self-hosted document intelligence

Store your documents, then ask your AI — and get answers cited to the source.

DocuVault is an open-source, self-hosted alternative to cloud document tools. Your files and your AI never leave your servers, and role-based access is enforced on every search and every answer.

  • Permission-aware AI
  • Source-cited answers
  • Self-hosted & open-source


  • Role-Based Access Control
  • Version Control
  • Audit-ready
  • Permission-aware AI
  • Self-hosted
  • Source-Cited Answers
  • Desktop Agent Sync
  • Multi-tenant SaaS
  • Voice Q&A
  • Deterministic Analytics
  • Air-gapped Deployment
  • Open Source · MIT

Document intelligence

DocuVault is open-source, an assistant that reads only what you can. Role-based access, enforced on every query.

Built by Prince as an alternative to cloud document silos, DocuVault keeps your files and your AI inside your own walls. Qwen2.5-7B generates answers that are grounded in your documents and cited to the source, and the model never sees a page the asking user is not allowed to open.

An assistant that actually knows your documents.

Ask a question in natural language. DocuVault retrieves the right passages with vector search, generates a grounded answer with Qwen2.5-7B, and cites its sources — and it only ever reads documents the asking user has permission to access.

Live answer

What is our retention period for contracts?
Contracts are retained for 7 years after termination, then moved to cold archive. Legal holds override this schedule.
Sources: Policy_v3.pdf, p.4 · Retention_SOP.docx, §2.1

Scoped to your permissions · 2 of 118 contracts readable

Source citations on every answer

Each response links back to the exact documents and passages it was grounded in — no unverifiable claims.

Conversation memory for follow-ups

Ask a follow-up and the assistant keeps the thread — context carries across the whole conversation.

Permission-aware retrieval

Retrieval runs inside your access-control layer. The model never sees a document the asking user cannot open.
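
As a rough illustration of what "retrieval inside the access-control layer" can mean, the sketch below filters chunks by permission before any similarity scoring takes place. It is not DocuVault's actual code: the `Chunk` class, the `min_level` field, the `retrieve` function, the two-dimensional embeddings, and the `Board_minutes.pdf` document are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Chunk:
    doc_id: str
    text: str
    embedding: list[float]
    min_level: int  # hypothetical: lowest permission level allowed to read this chunk


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: list[float], chunks: list[Chunk], user_level: int, top_k: int = 3) -> list[Chunk]:
    # Filter first: chunks the user cannot open are never scored,
    # so they cannot end up in the prompt passed to the language model.
    readable = [c for c in chunks if user_level >= c.min_level]
    ranked = sorted(readable, key=lambda c: cosine(query_vec, c.embedding), reverse=True)
    return ranked[:top_k]


# Toy data; the embeddings are made up and far smaller than real ones.
chunks = [
    Chunk("Policy_v3.pdf", "Contracts are retained for 7 years.", [0.9, 0.1], min_level=10),
    Chunk("Board_minutes.pdf", "Confidential retention exceptions.", [1.0, 0.0], min_level=90),
]
guest_hits = retrieve([1.0, 0.0], chunks, user_level=10)
admin_hits = retrieve([1.0, 0.0], chunks, user_level=100)
print([c.doc_id for c in guest_hits])
print([c.doc_id for c in admin_hits])
```

The point of the ordering is that the restricted chunk is excluded before ranking, not hidden after generation, so a low-level user's answer cannot be influenced by it.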

Studio-grade document workflows for teams that self-host. Built for control. Powered by your own AI.

Your secure vault.

Permission-aware RAG.

01
  • Source citations on every answer
  • Conversation memory for follow-ups
  • Permission-scoped vector retrieval
  • Grounded generation with Qwen2.5-7B

Access & versioning.

02
  • Hierarchical RBAC, guest to admin
  • Version history with change notes
  • Document locking to prevent clashes

Secure collaboration.

03
  • Share links with password & expiry
  • Threaded comments and replies
  • Soft delete with full recovery

Everything an enterprise document platform needs.

Role-Based Access Control

Hierarchical permission levels from guest to admin, enforced on every document, search, and AI query.

Access modes

Public, private, role-based, or custom per-user sharing — pick the visibility model per document.

Version control

Track every version with change notes; lock documents to prevent concurrent edits.
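
One way to picture versioning with change notes and locking is the small in-memory model below. It is only a sketch under assumptions: `VersionedDocument`, `LockedError`, and the method names are invented here and are not taken from the DocuVault codebase.

```python
from dataclasses import dataclass, field
from typing import Optional


class LockedError(Exception):
    """Raised when someone other than the lock holder tries to lock or save."""


@dataclass
class VersionedDocument:
    name: str
    versions: list = field(default_factory=list)  # tuples of (number, author, note, content)
    locked_by: Optional[str] = None

    def lock(self, user: str) -> None:
        if self.locked_by not in (None, user):
            raise LockedError(f"{self.name} is locked by {self.locked_by}")
        self.locked_by = user

    def unlock(self, user: str) -> None:
        # Only the lock holder can release the lock in this sketch.
        if self.locked_by == user:
            self.locked_by = None

    def save(self, user: str, content: str, note: str) -> int:
        if self.locked_by not in (None, user):
            raise LockedError(f"{self.name} is locked by {self.locked_by}")
        number = len(self.versions) + 1
        self.versions.append((number, user, note, content))
        return number


doc = VersionedDocument("Retention_SOP.docx")
doc.lock("ana")
first = doc.save("ana", "Draft text", note="Initial draft")
try:
    doc.save("bob", "Conflicting text", note="Bob's edit")
    blocked = False
except LockedError:
    blocked = True
doc.unlock("ana")
second = doc.save("bob", "Reviewed text", note="Review changes")
```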

Collaboration

Threaded comments and replies on any document keep review discussions next to the source.

Secure sharing

Shareable links with password protection, expiry dates, and access limits.
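
The three checks named here (password, expiry, access limit) could be combined along the following lines. This is a hedged sketch using only the Python standard library; `ShareLink`, `create_link`, `try_open`, and the PBKDF2 parameters are assumptions for illustration, not DocuVault's implementation.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


def hash_password(password: str, salt: bytes) -> bytes:
    # Store a salted hash, never the plain password.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


@dataclass
class ShareLink:
    token: str
    salt: bytes
    password_hash: bytes
    expires_at: datetime
    max_uses: int
    uses: int = 0

    def try_open(self, password: str, now: datetime) -> bool:
        if now >= self.expires_at or self.uses >= self.max_uses:
            return False
        candidate = hash_password(password, self.salt)
        # Constant-time comparison avoids leaking how much of the hash matched.
        if not hmac.compare_digest(candidate, self.password_hash):
            return False
        self.uses += 1
        return True


def create_link(password: str, ttl: timedelta, max_uses: int, now: datetime) -> ShareLink:
    salt = secrets.token_bytes(16)
    return ShareLink(
        token=secrets.token_urlsafe(16),
        salt=salt,
        password_hash=hash_password(password, salt),
        expires_at=now + ttl,
        max_uses=max_uses,
    )


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
link = create_link("s3cret", ttl=timedelta(days=7), max_uses=2, now=now)
```

In this sketch a failed password attempt does not consume one of the allowed uses; a stricter design might count failed attempts as well.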

Advanced search

Full-text search with category, tag, owner, and date filters across the entire vault.

Notifications

Real-time alerts on shares, comments, and permission changes — no silent state drift.

Soft delete & recovery

Deleted documents are recoverable by design. Nothing is ever lost by accident.

Desktop Agent sync

A background agent watches your folders and auto-syncs files — just like Google Drive, but for your private vault.

Multi-tenant SaaS

Full per-organisation data and AI isolation. Deploy once, serve many tenants — each with their own admin, roles, and vector store.

Voice Q&A

Speak your query and hear the answer. Faster-Whisper transcribes, the RAG engine retrieves, and edge-tts narrates — all local.

Deterministic analytics

Excel and CSV questions run through pandas — numbers computed, never hallucinated. AI only narrates pre-computed results.
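
The split between computing and narrating can be sketched as follows. DocuVault is described as using pandas for this step; to keep the example dependency-free, the sketch swaps in the standard-library `csv` and `statistics` modules. The CSV contents, `compute_stats`, and `build_narration_prompt` are all made up for illustration.

```python
import csv
import io
from statistics import mean

# Hypothetical spreadsheet export.
CSV_TEXT = """equipment,downtime_hours
Press A,4.5
Press B,1.5
Lathe C,3.0
"""


def compute_stats(csv_text: str, column: str) -> dict:
    """Compute the numbers in ordinary code, so they are exact and repeatable."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    values = [float(row[column]) for row in rows]
    return {"count": len(values), "total": sum(values), "mean": mean(values), "max": max(values)}


def build_narration_prompt(question: str, stats: dict) -> str:
    """The model only receives finished numbers and is asked to describe them."""
    facts = ", ".join(f"{key}={value}" for key, value in stats.items())
    return (
        f"Question: {question}\n"
        f"Computed facts: {facts}\n"
        "Describe these facts in plain language. Do not calculate new numbers."
    )


stats = compute_stats(CSV_TEXT, "downtime_hours")
prompt = build_narration_prompt("How much downtime did we log?", stats)
print(prompt)
```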

Desktop Agent

No manual uploads, ever.

The DocuVault Desktop Agent runs quietly in the background — just like Google Drive — watching folders you choose and automatically syncing new and changed files into your vault. No drag-and-drop, no browser tabs. Documents appear the moment they are saved to disk.

  • Watches any folder on Windows or macOS
  • Auto-versions every file change using MD5 dedup
  • Queues syncs while offline and flushes when you reconnect
  • Live heartbeat and status visible in your dashboard
  • One-click installer built and downloaded directly from the app
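
The MD5 dedup and offline queue from the list above might work roughly like this. The sketch is an assumption about the general technique, not the agent's real code: `SyncAgent`, `on_file_saved`, `reconnect`, and the `upload` callback are hypothetical names, and a real agent would read file bytes from disk and persist its queue.

```python
import hashlib
from collections import deque


class SyncAgent:
    def __init__(self, upload):
        self.upload = upload      # hypothetical callable(path, data) that sends a new version
        self.online = True
        self.last_hash = {}       # path -> MD5 digest of the last content seen
        self.pending = deque()    # changes recorded while offline

    def on_file_saved(self, path: str, data: bytes) -> str:
        digest = hashlib.md5(data).hexdigest()
        if self.last_hash.get(path) == digest:
            return "unchanged"    # same bytes as before: no new version is created
        self.last_hash[path] = digest
        if not self.online:
            self.pending.append((path, data))
            return "queued"
        self.upload(path, data)
        return "synced"

    def reconnect(self) -> None:
        # Flush queued changes in the order they were saved.
        self.online = True
        while self.pending:
            self.upload(*self.pending.popleft())


uploads = []
agent = SyncAgent(lambda path, data: uploads.append((path, data)))
results = [
    agent.on_file_saved("Q1-Financial-Report.pdf", b"v1"),
    agent.on_file_saved("Q1-Financial-Report.pdf", b"v1"),  # re-saved without changes
]
agent.online = False
results.append(agent.on_file_saved("Q1-Financial-Report.pdf", b"v2"))
agent.reconnect()
```

MD5 is used here only to detect identical content, which matches the dedup role described above; it is not suitable as a security measure.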

DocuVault Agent · ~/Documents/Vault · Online

  • Q1-Financial-Report.pdf · 2.4 MB · Synced
  • SOP-Equipment-Maintenance.docx · 840 KB · Synced
  • Equipment-Log-May.xlsx · 1.1 MB · Syncing…
  • Annual-Compliance-Audit.pdf · 5.2 MB · Queued

4 files · 9.5 MB total · Auto-sync on

One platform. Every organisation.

Deploy DocuVault as a SaaS for multiple organisations — each fully isolated, each self-governing. Every new user goes through an admin-approval gate before gaining any access.

Full tenant isolation

Each organisation gets its own vector store, document namespace, and AI context. No data ever crosses tenant boundaries.

  • Per-org vector collections
  • Isolated document namespaces
  • Separate AI configuration per tenant
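
A minimal way to picture per-org collections is a store in which every read and write is keyed by organisation, so a query has no path into another tenant's data. The sketch below is illustrative only: `TenantVectorStores` is a hypothetical class, the org IDs are made up, and simple keyword matching stands in for real vector search.

```python
class TenantVectorStores:
    """One collection per organisation; nothing is shared between them."""

    def __init__(self):
        self._collections = {}  # org_id -> list of (doc_id, text)

    def add(self, org_id: str, doc_id: str, text: str) -> None:
        self._collections.setdefault(org_id, []).append((doc_id, text))

    def search(self, org_id: str, term: str) -> list:
        # Only the caller's own collection is ever consulted.
        collection = self._collections.get(org_id, [])
        return [doc_id for doc_id, text in collection if term.lower() in text.lower()]


stores = TenantVectorStores()
stores.add("org-a", "Policy_v3.pdf", "Contracts are retained for 7 years.")
stores.add("org-b", "Handbook.pdf", "Contracts must be countersigned.")

org_a_hits = stores.search("org-a", "contracts")
org_b_hits = stores.search("org-b", "contracts")
```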

Admin-gated onboarding

New users register, verify their email via OTP, then wait in a pending queue. The org admin reviews and approves before anyone gets in.

  • Approval dashboard with live pending count
  • Admin assigns role on approval

Hierarchical RBAC

Permission levels from 1–100 cover Guest, Regular, and Admin. Every document, search, and AI query respects the requester's role.

  • Custom roles — create, edit, delete
  • Per-document access level (public / private / role-based / custom)
  • RBAC enforced on every AI retrieval
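
To make the combination of numeric levels and per-document access modes concrete, here is a hedged sketch of a single access check. The specific level values in `ROLE_LEVELS`, the `Document` fields, and the rule that owners can always read their own documents are assumptions for the example; the page only states that levels span 1–100 and that four access modes exist.

```python
from dataclasses import dataclass, field

# Hypothetical mapping of built-in roles onto the 1-100 scale.
ROLE_LEVELS = {"guest": 1, "regular": 50, "admin": 100}


@dataclass
class Document:
    owner: str
    access: str                     # "public", "private", "role-based", or "custom"
    min_level: int = 1              # used only by "role-based" documents
    shared_with: set = field(default_factory=set)  # used only by "custom" documents


def can_read(user: str, level: int, doc: Document) -> bool:
    if not 1 <= level <= 100:
        raise ValueError("permission level must be between 1 and 100")
    if doc.access == "public" or user == doc.owner:
        return True
    if doc.access == "private":
        return False
    if doc.access == "role-based":
        return level >= doc.min_level
    if doc.access == "custom":
        return user in doc.shared_with
    raise ValueError(f"unknown access mode: {doc.access}")


sop = Document(owner="ana", access="role-based", min_level=50)
notes = Document(owner="ana", access="private")
draft = Document(owner="ana", access="custom", shared_with={"bob"})
```

Running the same function for document views, search results, and AI retrieval is one way to keep the three consistent with each other.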

User onboarding flow

01 · Register · Email OTP verified
02 · Pending · Awaiting admin review
03 · Approved · Admin grants a role
04 · Active · Full scoped access
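
The four stages above behave like a small state machine in which each step is only reachable from the one before it. The following sketch shows that idea under stated assumptions: `Stage`, `Account`, and the method names are hypothetical, and the one-time code is compared against a value passed in directly rather than one sent by email.

```python
import hmac
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Stage(Enum):
    REGISTERED = "registered"
    PENDING = "pending"
    APPROVED = "approved"
    ACTIVE = "active"


@dataclass
class Account:
    email: str
    stage: Stage = Stage.REGISTERED
    role: Optional[str] = None

    def _require(self, expected: Stage) -> None:
        if self.stage is not expected:
            raise ValueError(f"expected {expected.value}, account is {self.stage.value}")

    def verify_otp(self, submitted: str, expected: str) -> None:
        self._require(Stage.REGISTERED)
        if not hmac.compare_digest(submitted, expected):
            raise ValueError("wrong one-time code")
        self.stage = Stage.PENDING      # verified, now waiting for an admin

    def approve(self, role: str) -> None:
        self._require(Stage.PENDING)
        self.role = role                # the admin assigns the role on approval
        self.stage = Stage.APPROVED

    def activate(self) -> None:
        self._require(Stage.APPROVED)
        self.stage = Stage.ACTIVE

    def has_access(self) -> bool:
        return self.stage is Stage.ACTIVE


account = Account("new.user@example.com")
account.verify_otp("123456", "123456")
access_while_pending = account.has_access()
account.approve("regular")
account.activate()
```

Because `approve` refuses any account that is not pending, an unverified registration cannot be promoted by mistake.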

Production architecture, fully under your control.

  • Django + DRF: REST API · auth · RBAC
  • PostgreSQL: documents · versions
  • Qwen3-Embedding: vector embeddings
  • Qwen2.5-7B: grounded generation
  • GPU inference: accelerated serving


Run it inside your own walls.

DocuVault is open-source and self-hosted — your documents and your AI never leave your infrastructure. Clone, configure, and deploy.

bash — deploy
$ git clone https://github.com/ft-prince/DocuVault
$ cd DocuVault
# configure database + model settings in .env
$ docker compose up

Questions, answered.

On your servers. DocuVault is fully self-hosted — documents, embeddings, and the language model all run inside your own infrastructure. Nothing is sent to a third-party API.