On-Premises Deployment — Recall by The Retrieval Company

ENTERPRISE CAPABILITIES

BUILT FOR FIRMS THAT TAKE SECURITY SERIOUSLY.

Everything in Recall Desktop, plus the infrastructure controls your IT team expects.

FIRM-WIDE POWER

One server. Your entire firm.

Deploy Recall on your office server so every attorney can access the same research. It works on your existing hardware with one simple Docker setup.

DEPLOYMENT STATUS

docker compose up -d: running
recall-server: healthy
recall-embedder: healthy
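For IT teams who want to script the status check above, here is a minimal sketch that reads the output of `docker compose ps --format json` and reports any service that is not healthy. It assumes that `recall-server` and `recall-embedder` are the Compose service names and that both define a healthcheck; the helper names (`parse_compose_ps`, `unhealthy`, `check_deployment`) are hypothetical and not part of Recall.

```python
import json
import subprocess


def parse_compose_ps(output: str) -> dict[str, str]:
    """Map each Compose service to its health, or to its state if no health is reported."""
    text = output.strip()
    if not text:
        return {}
    if text.startswith("["):
        # Some Compose v2 releases print a single JSON array.
        entries = json.loads(text)
    else:
        # Other releases print one JSON object per line.
        entries = [json.loads(line) for line in text.splitlines() if line.strip()]
    return {e["Service"]: (e.get("Health") or e.get("State", "unknown")) for e in entries}


def unhealthy(statuses: dict[str, str],
              expected=("recall-server", "recall-embedder")) -> list[str]:
    """Return the expected services that are missing or not reporting 'healthy'."""
    # The service names are assumed from the status panel above.
    return [name for name in expected if statuses.get(name) != "healthy"]


def check_deployment(project_dir: str = ".") -> list[str]:
    """Run `docker compose ps` in project_dir and list services needing attention."""
    result = subprocess.run(
        ["docker", "compose", "ps", "--format", "json"],
        cwd=project_dir, capture_output=True, text=True, check=True,
    )
    return unhealthy(parse_compose_ps(result.stdout))
```

Calling `check_deployment()` from the directory that holds the Compose file returns an empty list when both services report healthy.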

STAYS IN THE BUILDING

Zero external calls.

Your client files never touch the internet. Recall works entirely behind your office firewall, keeping your research truly private and protected.

AIR-GAPPED CAPABLE

SHARED ACCESS

Collaborative research.

Multiple attorneys on the same case. Share document folders, coordinate your research, and manage your firm's archives in one central place.

MULTI-USER CONCURRENT

YOUR LIBRARY, CENTRALIZED

Every filing, brief, and opinion in one place.

Store your firm's entire document library on your own server. Your IT team manages backups and security using the tools they already know.

LIBRARY STATISTICS

Active matters: 47
Documents indexed: 12,847
Total embeddings: 1.2M vectors
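To relate these figures to storage planning, here is a rough back-of-the-envelope sketch. The embedding dimension (1024) and the 4-byte float32 values are assumptions for illustration only; the page does not state which embedding model or precision Recall uses, and a real index adds overhead beyond the raw vectors.

```python
def embedding_storage_gib(num_vectors: int, dim: int, bytes_per_value: int = 4) -> float:
    """Raw size of the vectors alone, in GiB (index overhead not included)."""
    return num_vectors * dim * bytes_per_value / 2**30


documents = 12_847        # "Documents indexed" from the panel above
vectors = 1_200_000       # "Total embeddings" from the panel above
assumed_dim = 1024        # assumption: not stated anywhere on this page

# Average number of vectors (text chunks) per document: about 93.
vectors_per_document = vectors / documents

# Raw float32 storage at the assumed dimension: about 4.58 GiB.
raw_gib = embedding_storage_gib(vectors, assumed_dim)

print(f"{vectors_per_document:.0f} vectors per document, {raw_gib:.2f} GiB raw")
```

Under these assumptions, a library of this size takes up only a small fraction of the NVMe storage listed in the server specifications.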

FLEXIBLE

Built for your firm.

Recall adapts to your office setup. Your IT team chooses the models, storage, and performance tier that fits your specific needs.

COMPLIANCE

Meets your rules.

Keep data in your jurisdiction. No complex workarounds, just simple, local control over your firm's data and how it meets regulatory requirements.

SUPPORT

We help you deploy.

Our team works directly with your IT staff to plan, configure, and validate your on-premises installation. Ongoing support included.

SERVER SPECIFICATIONS

WHAT YOU NEED TO GET STARTED.

Recall Server scales from single-rack deployments to full data center installations.

Standard

Small to mid-size firms (5–25 users)

CPU: 16+ cores (Xeon / EPYC)
RAM: 64 GB ECC
GPU: NVIDIA A4000 (16 GB VRAM)
Storage: 1 TB NVMe SSD
OS: Ubuntu Server 22.04 LTS
Runtime: Docker + Compose

Max models at this tier

Embedding model: ~400M–600M params
Large language model: up to ~70B params (incl. MoE)

70B-class models on a single GPU. Fast concurrent inference for the entire firm.

Enterprise

Large organizations (100+ users)

CPU: 64+ cores (dual EPYC)
RAM: 512 GB+ ECC
GPU: Multi-GPU (4–8 × H100 80 GB)
Storage: 10 TB+ NVMe RAID
Orchestration: Kubernetes
Networking: InfiniBand / NVLink

Max models at this tier

Frontier language model: ~400B–1T+ params (incl. MoE)
Any open-weight model: no upper limit

Frontier-class models at full precision. Multi-GPU tensor parallelism across H100 clusters with NVLink.
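When matching a model to a tier, a rough weights-only estimate is a useful first check. The sketch below multiplies parameter count by bits per parameter and compares the result with total VRAM. The function names and the 80% headroom factor are illustrative assumptions, not Recall settings. The estimate ignores the KV cache, activations, and CPU offload, so treat it as a starting point, not a sizing guarantee.

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Weights-only footprint in GB (10^9 bytes): parameters x bits / 8."""
    return params_billions * bits_per_param / 8


def fits_in_vram(params_billions: float, bits_per_param: int,
                 gpu_count: int, vram_gb_per_gpu: float,
                 headroom: float = 0.8) -> bool:
    """True if the weights fit within a fraction (headroom) of total VRAM.

    The headroom value is an assumed allowance for KV cache and activations.
    """
    budget_gb = gpu_count * vram_gb_per_gpu * headroom
    return weight_memory_gb(params_billions, bits_per_param) <= budget_gb


# A 70B-parameter model needs 140 GB of weights at 16-bit and 35 GB at 4-bit.
print(weight_memory_gb(70, 16), weight_memory_gb(70, 4))

# The same 16-bit model against one and against four 80 GB GPUs.
print(fits_in_vram(70, 16, gpu_count=1, vram_gb_per_gpu=80))
print(fits_in_vram(70, 16, gpu_count=4, vram_gb_per_gpu=80))
```

If the weights exceed the VRAM budget, the usual options are lower-precision quantization, offloading some layers to system RAM, or spreading the model across more GPUs.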

Your infrastructure. Your data. Your control.

READY TO DEPLOY ON YOUR TERMS?
