Weaviate

Name: Weaviate
Rating: 8.5 (1 review)

🇳🇱

Amsterdam-built open-source vector database with hybrid search and generative AI modules

8.5/10

EU-Built, GDPR, EU Data, Open Source, Free Tier

Weaviate is an Amsterdam-based open-source vector database built in Go, designed for AI-native applications requiring semantic search, hybrid retrieval, and generative AI modules. Founded in 2019 as SeMI Technologies and renamed Weaviate B.V. in 2023, the company raised a $50M Series B from Index Ventures and Battery Ventures. The core database is BSD-3-Clause licensed and self-hostable; Weaviate Cloud offers managed hosting with Flex, Plus, and Premium plans.

Headquarters

Amsterdam, Netherlands

Founded

2019

Pricing

Open Source

EU Data Hosting

Yes

Employees

51-200

Open Source

Yes

vector-database, open-source, hybrid-search, rag, generative-ai, multi-tenancy

Ratings

Ease of Use: 7.5

Feature Depth: 9.0

Value for Money: 8.5

EU Compliance: 9.0

Support Quality: 7.0

Integration Ecosystem: 9.0

Features

Core Features

✓ Vector similarity search (HNSW index)
✓ Hybrid search combining dense vectors with BM25 keyword scoring
✓ Multi-tenancy with per-tenant data isolation
✓ Generative AI modules (query-time LLM summarisation and generation)
✓ Built-in vectorizer modules (OpenAI, Cohere, Hugging Face, AWS Bedrock, Vertex AI)
✓ GraphQL and REST query interfaces
✓ Batch import and near-real-time indexing
✓ Replication and horizontal scaling
✓ Schema management with object cross-references
✓ Filters and metadata-based pre/post filtering

Standout Features

★ Generative modules let the database act as a retrieval-augmented generation engine: results can be summarised or transformed by an LLM at query time
★ Hybrid fusion (Reciprocal Rank Fusion) merges keyword and vector results without requiring external orchestration
★ Named vectors allow multiple vector representations per object (e.g., image + text embeddings on the same document)
★ BYOC (Bring Your Own Cloud) deployment option on AWS, GCP, or Azure within the customer's own VPC

Compliance

☖ GDPR compliant (Dutch B.V. entity under EU law)
☖ EU data residency available on Weaviate Cloud
☖ SOC 2 Type II (enterprise cloud tier)
☖ Data Processing Agreement available
☖ BSD-3-Clause open source licence: the full self-hosting option eliminates data transfer to third parties

Pricing

14-day free trial available

Open Source

Free

Full database feature set
Self-hosted on any infrastructure
BSD-3-Clause licence
Community support via GitHub and Discord
No restrictions on scale or tenants

Flex

Pay-as-you-go

Managed cloud hosting
All core database features
Built-in RBAC
AI-native services (Embeddings, Agents)
Automated upgrades
99.5% uptime SLA (shared cloud)
14-day free sandbox

Plus

$280/mo

Everything in Flex
Annual commitment options
Enhanced security controls
99.9% uptime SLA
Shared or dedicated deployments

Premium / BYOC

Contact Sales

Dedicated infrastructure
99.95% uptime SLA
Priority response times
Bring Your Own Cloud (AWS, GCP, Azure)
Business-critical SLA options

Billing: monthly, annual

Integrations & API

LangChain, LlamaIndex, Haystack, OpenAI, Cohere, Hugging Face, AWS Bedrock, Vertex AI, Snowflake, Databricks, Kubernetes

API Available, Webhook Support

Support

Community forum, GitHub, Documentation, Enterprise support; Docs: Excellent

Pros

✓ Open-source BSD-3-Clause licence means the full database can be self-hosted on any EU cloud without vendor lock-in
✓ Hybrid search combines dense vector search with BM25 keyword search in a single query; most vector databases require separate pipelines for this
✓ Built-in vectorizer modules (OpenAI, Cohere, Hugging Face, AWS Bedrock) eliminate external embedding preprocessing steps
✓ Multi-tenancy is first-class: thousands of isolated tenants share a single cluster with per-tenant data isolation, ideal for SaaS applications
✓ Dutch B.V. legal entity with Amsterdam HQ gives EU-native data governance; GDPR compliance is structural, not bolted on

Cons

✕ Managed cloud pricing has shifted significantly since late 2025 (new Flex/Plus/Premium tiers) and the Flex plan starts at $45/month, so there is no true free managed option
✕ The GraphQL-heavy query interface has a steeper learning curve than SQL-familiar developers expect from a database
✕ Memory footprint for large collections is substantial; production deployments at scale require significant infrastructure planning
✕ Community support is the primary channel for self-hosted users; enterprise support requires a Plus or Premium cloud contract

What Is Weaviate?

Before "vector database" was a category, Bob van Luijt and Etienne Dilocker were building one in Amsterdam. The project started in 2019 under the name SeMI Technologies, a consultancy working with knowledge graphs and semantic search. The team quickly realised that storing and querying vector embeddings was the actual hard problem, and pivoted to build the database layer directly. By 2023 the company renamed itself Weaviate B.V., reflecting that the open-source project had outgrown the consultancy origins entirely.

Today Weaviate is one of the most widely adopted vector databases in the world, with the core codebase downloaded millions of times and integrations across every major AI framework. The $50M Series B from Index Ventures and Battery Ventures in 2023 (at a $200M valuation) confirmed the trajectory. The operating entity remains Weaviate B.V., incorporated in Amsterdam at Prinsengracht 769A, with the Delaware holding company existing purely for US venture capital mechanics. This is the same structure Sanity (Oslo) and other European AI companies use to accept American institutional investment.

The database is written in Go, licensed under BSD-3-Clause, and can be self-hosted on any infrastructure without restrictions. Weaviate Cloud is the managed SaaS layer, currently structured across Flex ($45/month base), Plus ($280/month), and Premium/BYOC (custom) plans.

The primary use cases are retrieval-augmented generation (RAG), semantic search, recommendation systems, and image/multimodal search. Teams building LLM applications use Weaviate as the retrieval layer that fetches relevant context before sending it to a language model.

Key Features

Hybrid Search: Vector Plus Keyword in One Query

Most vector databases handle approximate nearest-neighbour search on dense embeddings. Weaviate goes further by fusing dense vector search with BM25 keyword scoring in a single query operation, using Reciprocal Rank Fusion (RRF) to merge the result sets.

The practical benefit is relevance. A pure vector search for "quarterly revenue decline" might surface documents about general financial performance; a pure keyword search might miss semantically relevant phrasing. The hybrid approach captures both the semantic meaning and the literal term match, returning ranked results that are typically more relevant for enterprise search and RAG applications. Competing tools such as Pinecone added sparse-dense hybrid search later, as an add-on; in Weaviate it has been native since v1.17.
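To make the fusion step concrete, here is a minimal Python sketch of Reciprocal Rank Fusion over two result lists. It is an illustration of the general technique, not Weaviate's internal code: the document IDs, the function name, and the constant k=60 (a commonly used default) are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists for the query "quarterly revenue decline".
vector_hits = ["doc_finance_overview", "doc_q3_report", "doc_sales_dip"]
keyword_hits = ["doc_q3_report", "doc_press_release", "doc_sales_dip"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that appear in both lists rise to the top, even when neither list ranked them first, which is the behaviour the hybrid approach relies on.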

Generative Modules: RAG at the Database Level

Weaviate's generative modules allow you to attach an LLM call to a query. The database retrieves relevant objects, then passes them directly to OpenAI, Cohere, Hugging Face, AWS Bedrock, or Vertex AI for summarisation, transformation, or answer generation, all within the same API call.

This means a RAG pipeline that would typically require three steps (retrieve, format context, call LLM) can execute in one. For teams building chatbots or document Q&A systems, removing that intermediate orchestration step reduces latency and simplifies application code. LangChain and LlamaIndex both integrate with Weaviate's generative modules, so existing orchestration pipelines can adopt the feature incrementally.
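For reference, the three application-side steps that a generative query collapses can be sketched as below. Everything here is a stand-in: `retrieve` uses naive term overlap instead of a vector search, `call_llm` is a stub that never contacts a model, and none of the function names come from Weaviate or any LLM SDK.

```python
def retrieve(query, documents, limit=2):
    """Step 1: rank documents by naive term overlap (stand-in for vector search)."""
    terms = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(terms & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:limit]


def format_context(objects, question):
    """Step 2: build a prompt from the retrieved objects."""
    context = "\n".join(f"- {obj}" for obj in objects)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


def call_llm(prompt):
    """Step 3: stub for a real LLM call; it only counts the context items."""
    items = sum(1 for line in prompt.splitlines() if line.startswith("- "))
    return f"[stub answer from {items} context items]"


def answer_question(question, documents):
    objects = retrieve(question, documents)      # 1. retrieve
    prompt = format_context(objects, question)   # 2. format context
    return call_llm(prompt)                      # 3. call the model


docs = [
    "Hybrid search fuses BM25 and vector results",
    "Multi-tenancy isolates tenant data",
    "Generative modules call an LLM at query time",
]
reply = answer_question("how does hybrid search work", docs)
print(reply)
```

With a generative module, the application issues one query and the database performs the equivalent of steps 2 and 3 itself, so only the orchestration in `answer_question` disappears from application code.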

Multi-Tenancy for SaaS Applications

Multi-tenancy in Weaviate is genuinely first-class, not an afterthought. A single Weaviate cluster can host thousands of isolated tenants, each with their own data partitioned at the storage level. Tenants can be activated or deactivated dynamically; inactive tenants are offloaded to disk, reducing the memory footprint to only those actively queried.

For SaaS builders, this design pattern is significant. A product serving 10,000 customers no longer needs 10,000 separate database instances or complex application-level sharding. The Weaviate team has documented production deployments handling 100,000+ tenants on a single cluster. Qdrant, Weaviate's Berlin-based EU counterpart, has added multi-tenancy as well, but Weaviate's implementation is more mature and better documented.
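The activate/deactivate behaviour can be pictured with a toy in-memory model. This is a conceptual sketch only: the `TenantStore` class and its method names are hypothetical and do not mirror Weaviate's storage engine or client API.

```python
class TenantStore:
    """Toy model of one cluster holding many tenants."""

    def __init__(self):
        self._disk = {}    # tenant name -> objects (always persisted)
        self._memory = {}  # tenant name -> objects (active tenants only)

    def add_tenant(self, name):
        self._disk[name] = []

    def insert(self, name, obj):
        self._disk[name].append(obj)

    def activate(self, name):
        # Load the tenant's partition so it can be queried.
        self._memory[name] = self._disk[name]

    def deactivate(self, name):
        # Drop the in-memory copy; the data stays on disk.
        self._memory.pop(name, None)

    def query(self, name):
        if name not in self._memory:
            raise RuntimeError(f"tenant {name!r} is inactive")
        return list(self._memory[name])

    def objects_in_memory(self):
        return sum(len(objs) for objs in self._memory.values())


store = TenantStore()
for tenant in ("acme", "globex", "initech"):
    store.add_tenant(tenant)
    store.insert(tenant, f"{tenant}-doc-1")
    store.insert(tenant, f"{tenant}-doc-2")

store.activate("acme")
print(store.query("acme"))        # only acme's objects are visible
print(store.objects_in_memory())  # two of the six stored objects are resident
```

Each tenant's data lives in its own partition, and only activated partitions count toward memory, which is why a cluster with mostly idle tenants stays small.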

Named Vectors: Multiple Representations per Object

Weaviate supports named vectors, which allow a single object to carry multiple separate vector representations. A product catalogue entry, for instance, can store one vector from a text embedding model (for natural-language search) and a separate vector from a multimodal vision model (for image similarity). Both representations are queryable independently or in combination.

This capability is particularly relevant for e-commerce, media, and healthcare applications where documents or items have inherently multimodal content. Named vectors eliminate the workaround of duplicating objects into separate collections with different vectorizers.
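A small Python sketch shows the idea of querying one object set through different named vectors. The catalogue entries, the three-dimensional vectors, and the `search` helper are invented for illustration; a real deployment would use model-generated embeddings and the database's own query interface.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Each object carries two named vectors: "text" and "image" (toy values).
catalogue = [
    {"name": "red running shoe",
     "vectors": {"text": [0.9, 0.1, 0.0], "image": [0.2, 0.8, 0.1]}},
    {"name": "blue hiking boot",
     "vectors": {"text": [0.1, 0.9, 0.1], "image": [0.3, 0.7, 0.2]}},
    {"name": "red handbag",
     "vectors": {"text": [0.2, 0.1, 0.9], "image": [0.9, 0.1, 0.1]}},
]


def search(items, query_vector, target_vector, limit=1):
    """Rank items by similarity against one chosen named vector."""
    ranked = sorted(
        items,
        key=lambda item: cosine(query_vector, item["vectors"][target_vector]),
        reverse=True,
    )
    return [item["name"] for item in ranked[:limit]]


query = [1.0, 0.0, 0.0]
text_match = search(catalogue, query, "text")
image_match = search(catalogue, query, "image")
print(text_match, image_match)
```

The same query vector returns different objects depending on which named vector it is compared against, without the catalogue being duplicated into two collections.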

Built-In Vectorizer Modules

Weaviate handles vectorization internally. Instead of requiring your application to call an embedding API, preprocess the result, and then insert vectors, you define which vectorizer module to use at the schema level. On import, Weaviate calls the configured model automatically.

Supported vectorizers include OpenAI, Cohere, Hugging Face (Inference API and local models), AWS Bedrock, and Google Vertex AI. For teams with existing model infrastructure, the raw vector import path is also fully supported.
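The difference between the two import paths can be sketched with a toy collection. The `Collection` class and the `toy_embed` function are hypothetical stand-ins: `toy_embed` just counts characters and is nothing like a real embedding model, and the class does not reproduce Weaviate's schema or client API.

```python
def toy_embed(text):
    """Hypothetical stand-in for an embedding model: vowel, consonant, space counts."""
    vowels = sum(ch in "aeiou" for ch in text.lower())
    consonants = sum(ch.isalpha() for ch in text) - vowels
    spaces = text.count(" ")
    return [float(vowels), float(consonants), float(spaces)]


class Collection:
    """Toy collection whose vectorizer is fixed when the collection is defined."""

    def __init__(self, vectorizer=None):
        self.vectorizer = vectorizer
        self.objects = []

    def insert(self, text, vector=None):
        if vector is None:
            if self.vectorizer is None:
                raise ValueError("no vectorizer configured and no vector supplied")
            # Vectorization happens inside the collection, not in the caller.
            vector = self.vectorizer(text)
        self.objects.append({"text": text, "vector": vector})


# Path 1: vectorizer configured once; callers insert plain text.
auto = Collection(vectorizer=toy_embed)
auto.insert("hybrid search")

# Path 2: no vectorizer; callers bring their own precomputed vectors.
raw = Collection()
raw.insert("precomputed", vector=[0.1, 0.2, 0.3])

print(auto.objects[0]["vector"], raw.objects[0]["vector"])
```

The design choice is where the embedding call lives: configured once at definition time, or repeated in every piece of application code that writes data.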

Pricing

The self-hosted open-source build is free with no restrictions. This is not a crippled community edition. It is the complete database, supporting all features including hybrid search, generative modules, multi-tenancy, and replication. For European engineering teams with an existing Kubernetes infrastructure on AWS, GCP, or an EU-based cloud like OVHcloud or Hetzner, self-hosting Weaviate is the economically dominant choice at most scales.

Weaviate Cloud's updated tier structure (revised in late 2025) starts with a Flex plan at approximately $45/month base, scaling with vector count, storage, and queries. The 14-day free sandbox is available without a credit card. The Plus plan at $280/month adds annual commitment options, stronger SLAs (99.9% uptime), and dedicated deployment options. Premium and BYOC (Bring Your Own Cloud) tiers serve enterprise requirements with dedicated infrastructure, 99.95% uptime SLAs, and deployment within the customer's own AWS, GCP, or Azure account.
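To put the three uptime figures in perspective, the arithmetic below converts each SLA percentage into permitted downtime. It assumes a 30-day month as the measurement window, which is an illustrative choice; the actual measurement period and exclusions are defined in the vendor's SLA terms.

```python
MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes (assumed window)


def allowed_downtime_minutes(uptime_percent, period_minutes=MINUTES_PER_30_DAY_MONTH):
    """Minutes of downtime permitted by an uptime percentage over the period."""
    return period_minutes * (100 - uptime_percent) / 100


# Uptime percentages as quoted for each plan in this review.
slas = {"Flex (shared cloud)": 99.5, "Plus": 99.9, "Premium / BYOC": 99.95}

for plan, uptime in slas.items():
    minutes = allowed_downtime_minutes(uptime)
    print(f"{plan}: {minutes:.1f} minutes per 30-day month")
```

Under that assumption, the step from Flex to Plus cuts tolerated downtime from roughly 3.6 hours to about 43 minutes a month, and Premium halves it again to about 22 minutes.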

BYOC is particularly relevant for regulated industries where data cannot transit through a third-party managed service. The database runs in the customer's VPC; Weaviate provides the software and operational tooling.

The previous Serverless pricing model (at roughly $0.095 per million vector dimensions stored per month) has been replaced by the Flex plan, which bundles storage, compute, and queries into a simpler monthly base with usage scaling. Historical pricing guides may reference the older model.

EU Compliance & Privacy

Weaviate B.V. is incorporated in Amsterdam under Dutch law, placing it squarely under GDPR jurisdiction. The Delaware holding company used for VC structuring does not affect the operational entity's legal obligations or data processing activities.

For Weaviate Cloud deployments, data can be hosted in EU regions. A Data Processing Agreement is available for enterprise customers. Self-hosted deployments run entirely on the customer's infrastructure, so stored data never leaves the customer's environment. Recent releases do send anonymous usage telemetry by default; it can be switched off with the DISABLE_TELEMETRY environment variable.

The BSD-3-Clause licence provides an additional layer of assurance: the code is auditable, forkable, and carries no proprietary restrictions. For organisations with strict data sovereignty requirements, this is the cleanest possible compliance posture for a managed database component.

Weaviate Cloud holds SOC 2 Type II certification at the enterprise tier. The company publishes a Trust Center with audit reports and security documentation.

Who It's Best For

If you are building a RAG application and want to reduce the number of external API calls in your retrieval pipeline, Weaviate's built-in generative modules and native LangChain/LlamaIndex integration make it one of the most productive starting points available.

If you are running a SaaS product with many distinct customer data sets, the multi-tenancy architecture fits naturally. Each customer gets isolated storage without requiring separate deployments.

If your team manages its own infrastructure and has Kubernetes experience, the self-hosted open-source build eliminates cloud database costs entirely while retaining the full feature set.

If you need simple, serverless vector storage with predictable per-query pricing and no operational overhead, Pinecone's managed service has lower setup friction. Weaviate's Flex plan is competitive but requires more configuration choices upfront.

If you want a fully EU-native vector database from an EU-headquartered company with comparable open-source credentials, Qdrant (Berlin, Rust-based) is the closest comparator. Both are strong choices for AI developer tools use cases; Qdrant benchmarks faster on raw throughput in some configurations, while Weaviate's generative modules and named vectors offer more built-in AI capabilities.

The Verdict

Weaviate earns its position as one of the most feature-complete vector databases available. The hybrid search implementation, generative modules, and multi-tenancy architecture address real engineering problems that application teams otherwise solve with custom middleware. The Amsterdam origin and Dutch B.V. legal structure make EU compliance structural rather than contractual.

The trade-offs are real. The GraphQL query interface is not intuitive for teams coming from SQL or document databases, and the managed cloud pricing has shifted significantly. Self-hosted deployments require genuine Kubernetes competence to operate reliably at scale. For teams willing to invest in that learning curve, Weaviate delivers substantial capabilities that reduce application-layer complexity considerably.

Frequently Asked Questions

Is Weaviate truly open source?

Yes. The Weaviate database is released under the BSD-3-Clause licence, permitting commercial use, modification, and distribution without restriction. The full feature set (hybrid search, multi-tenancy, generative modules) is available in the open-source build. Weaviate Cloud is the managed SaaS offering, but self-hosting has no limitations.

Where is Weaviate data hosted?

Weaviate B.V. is incorporated in Amsterdam, Netherlands. Weaviate Cloud can host data in EU regions. The BYOC option deploys within your own AWS, GCP, or Azure account, including EU-only regions, giving complete control over data residency.

How does Weaviate compare to Pinecone?

Weaviate is open-source and self-hostable; Pinecone is closed-source SaaS only. Weaviate's hybrid search is built in and more mature. Weaviate's generative modules reduce external API calls in RAG pipelines. Pinecone has simpler onboarding at small scale; Weaviate is cheaper at high vector volumes when self-hosted, and is incorporated in the EU.

Does Weaviate support multi-tenancy?

Yes. Multi-tenancy is a first-class feature. Thousands of isolated tenants share a single cluster with full data isolation at the storage level. Inactive tenants are offloaded to disk, reducing memory consumption. Production deployments with 100,000+ tenants on a single cluster are documented in Weaviate's case studies.

Is Weaviate GDPR compliant?

Weaviate B.V. is a Dutch company under GDPR jurisdiction by default. A Data Processing Agreement is available for Weaviate Cloud customers. Self-hosted deployments are fully air-gapped: no data leaves the customer's infrastructure. The BSD-3-Clause licence makes the codebase auditable for security review.

Weaviate is an EU alternative to

Pinecone Milvus Chroma

Related Products

Qdrant 🇩🇪

High-performance open-source vector database built in Rust

EU-Built, Freemium, Open Source, EU Data, Reviewed

Alternative to Pinecone, Weaviate

Vespa.ai 🇳🇴

Trondheim-built open-source tensor-native search and vector database, spun out of Yahoo

EEA, Open Source, EU Data, Reviewed

Alternative to Pinecone, Elasticsearch, Milvus
