# Vector Database

Managed vector store backed by Pinecone. Store, search, and retrieve embeddings over HTTP. Tenant isolation is enforced server-side — agents never see or control the underlying isolation mechanism.

## Prerequisites

Provision this resource before use. Edge requests without provisioning will error.

### Provision

```sh
curl -s -X POST https://cohesivity.ai/api/resources/vector-database \
  -H "Authorization: Bearer "
```

### Delete

```sh
curl -s -X DELETE https://cohesivity.ai/api/resources/vector-database \
  -H "Authorization: Bearer "
```

**Important:** Provision this resource now, before building or running the application. Provisioning is the agent's job, not the application's.

## Common Mistakes

- **Not specifying dimensions at provision time.** Dimensions are required and must match your embedding model's output size.
- **Trying to change dimensions after provisioning.** The profile (dimensions + metric) is immutable once provisioned. Delete the resource first if you need a different profile.
- **Passing a namespace field.** Tenant isolation is handled server-side. Any namespace field in your request body is ignored.

## Provision with Profile

Vector database requires a profile (dimensions + metric) at provision time:

```sh
curl -s -X POST https://cohesivity.ai/api/resources/vector-database \
  -H "Authorization: Bearer " \
  -H "Content-Type: application/json" \
  -d '{"dimensions": 768, "metric": "cosine"}'
```

**Supported dimensions:** 384, 768, 1024, 1536, 3072
**Supported metrics:** cosine (default), euclidean, dotproduct

Choose dimensions to match your embedding model.
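The profile constraints above can be sanity-checked client-side before the provision call is made. A minimal sketch under those documented constraints — `build_profile` is a hypothetical helper, not part of the API:

```python
import json

# Supported values as documented above; `build_profile` is a hypothetical
# client-side helper, not part of the provisioning API.
SUPPORTED_DIMENSIONS = {384, 768, 1024, 1536, 3072}
SUPPORTED_METRICS = {"cosine", "euclidean", "dotproduct"}

def build_profile(dimensions: int, metric: str = "cosine") -> str:
    """Return the JSON body for the provision request, or raise on a bad profile."""
    if dimensions not in SUPPORTED_DIMENSIONS:
        raise ValueError(
            f"unsupported dimensions {dimensions}; "
            f"pick one of {sorted(SUPPORTED_DIMENSIONS)}"
        )
    if metric not in SUPPORTED_METRICS:
        raise ValueError(f"unsupported metric {metric!r}")
    return json.dumps({"dimensions": dimensions, "metric": metric})
```

The returned string is what the `-d` flag of the provision call carries; validating it locally avoids a failed provision and an accidentally immutable wrong profile.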
`gemini-embedding-001` natively outputs 3072 dimensions but supports `outputDimensionality` to reduce the output size:

- 3072: `gemini-embedding-001` native output, or OpenAI text-embedding-3-large
- 1536: OpenAI text-embedding-3-small, or `gemini-embedding-001` with `outputDimensionality: 1536`
- 1024: Cohere embed-v3, or `gemini-embedding-001` with `outputDimensionality: 1024`
- 768: `gemini-embedding-001` with `outputDimensionality: 768`
- 384: all-MiniLM-L6-v2, or `gemini-embedding-001` with `outputDimensionality: 384`

## Edge Usage

- **Base URL:** https://cohesivity.ai/edge/vector-database
- **Auth:** `coh_application_key` as the **key** query parameter
- **Method:** POST only (all endpoints)

### Upsert Vectors

```
POST https://cohesivity.ai/edge/vector-database?key=

{
  "vectors": [
    { "id": "doc-1", "values": [0.1, 0.2, ...], "metadata": { "title": "My Doc", "category": "faq" } },
    { "id": "doc-2", "values": [0.3, 0.4, ...], "metadata": { "title": "Other Doc" } }
  ]
}
```

Response: `{ "upsertedCount": 2 }`

### Query (Similarity Search)

```
POST https://cohesivity.ai/edge/vector-database/query?key=

{
  "vector": [0.1, 0.2, ...],
  "topK": 5,
  "includeMetadata": true,
  "filter": { "category": { "$eq": "faq" } }
}
```

Response:

```
{
  "matches": [
    { "id": "doc-1", "score": 0.95, "metadata": { "title": "My Doc", "category": "faq" } },
    { "id": "doc-2", "score": 0.82, "metadata": { "title": "Other Doc" } }
  ]
}
```

### Fetch by IDs

```
POST https://cohesivity.ai/edge/vector-database/fetch?key=

{ "ids": ["doc-1", "doc-2"] }
```

Response: `{ "vectors": { "doc-1": { "id": "doc-1", "values": [...], "metadata": {...} }, ... } }`

### Delete

```
POST https://cohesivity.ai/edge/vector-database/delete?key=
```

- Delete by IDs: `{ "ids": ["doc-1", "doc-2"] }`
- Delete by filter: `{ "filter": { "category": { "$eq": "old" } } }`
- Delete all your vectors: `{ "deleteAll": true }`

Response: `{ "success": true }`

## Tenant Isolation

Every tenant's vectors are fully isolated server-side.
The isolation boundary is injected on every request — agents cannot read, set, or override it. Tenants sharing the same profile (dimensions + metric) share underlying infrastructure, but their vectors are completely invisible to each other.

## End-to-End RAG Workflow

Build a retrieval-augmented generation pipeline using Gemini embeddings + vector-database + Gemini generation.

### 1. Provision resources

```sh
curl -s -X POST https://cohesivity.ai/api/resources \
  -H "Authorization: Bearer " \
  -H "Content-Type: application/json" \
  -d '{"resources": ["google-generative-ai-api", "vector-database"], "vector-database": {"dimensions": 768}}'
```

### 2. Generate embeddings

Use `gemini-embedding-001`, reduced to 768 dimensions via `outputDimensionality`:

```
POST https://cohesivity.ai/edge/google-generative-ai-api/v1beta/models/gemini-embedding-001:embedContent?key=

{
  "content": { "parts": [{ "text": "Your document text here" }] },
  "outputDimensionality": 768
}
```

Response: `{ "embedding": { "values": [0.1, 0.2, ...] } }` (a 768-dimensional vector)

`gemini-embedding-001` natively outputs 3072 dimensions. Use `outputDimensionality` to reduce the output to match the dimensions you provisioned (e.g. 768, 1024, 1536). If omitted, the model returns 3072 dimensions.

### 3. Store in vector database

```
POST https://cohesivity.ai/edge/vector-database?key=

{
  "vectors": [{
    "id": "doc-1",
    "values": [0.1, 0.2, ...],
    "metadata": { "text": "Your document text", "source": "faq" }
  }]
}
```

### 4. Query with user question embedding

First embed the question using step 2, then:

```
POST https://cohesivity.ai/edge/vector-database/query?key=

{ "vector": [], "topK": 3, "includeMetadata": true }
```

### 5. Generate answer with retrieved context

```
POST https://cohesivity.ai/edge/google-generative-ai-api/v1beta/models/gemini-2.5-flash:generateContent?key=

{
  "contents": [{ "parts": [{ "text": "Context: \n\nQuestion: \nAnswer:" }] }]
}
```

## Metadata Filtering

Query and delete support Pinecone metadata filter syntax:

```
{ "category": { "$eq": "faq" } }
{ "price": { "$gt": 10, "$lte": 100 } }
{ "$and": [{ "category": { "$eq": "faq" } }, { "lang": { "$eq": "en" } }] }
```

See Pinecone docs for full filter operator reference.
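Filters like these can be composed programmatically rather than hand-written. A minimal sketch — the helper names `eq` and `and_` are hypothetical; only the operator syntax (`$eq`, `$and`) comes from the filter examples above:

```python
# Sketch: build Pinecone-style metadata filter dicts for the /query and
# /delete endpoints. Helper names (`eq`, `and_`) are hypothetical; the
# operator syntax is the documented Pinecone filter syntax.
def eq(field: str, value) -> dict:
    """Exact-match clause: {field: {"$eq": value}}."""
    return {field: {"$eq": value}}

def and_(*clauses: dict) -> dict:
    """Conjunction of clauses: {"$and": [...]}."""
    return {"$and": list(clauses)}

# English FAQ documents only:
faq_english = and_(eq("category", "faq"), eq("lang", "en"))
# → {"$and": [{"category": {"$eq": "faq"}}, {"lang": {"$eq": "en"}}]}
```

The resulting dict goes straight into the `filter` field of a query body, or into a delete-by-filter body.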
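Step 5 of the RAG workflow stitches retrieved metadata into the generation prompt. A minimal sketch of that glue, assuming each vector was upserted with its source text under a `text` metadata key (as in step 3); `build_rag_prompt` is a hypothetical helper, not part of the edge API:

```python
# Sketch: turn /query matches into the prompt text sent to
# gemini-2.5-flash in step 5. Assumes documents were stored with their
# source text in metadata["text"] (as in step 3); `build_rag_prompt`
# is a hypothetical helper, not part of the edge API.
def build_rag_prompt(matches: list[dict], question: str) -> str:
    """Join match metadata text into the Context/Question/Answer template."""
    context = "\n".join(
        m["metadata"]["text"] for m in matches if "metadata" in m
    )
    return f"Context: {context}\n\nQuestion: {question}\nAnswer:"

# Example with one retrieved match from the /query response shape:
matches = [
    {"id": "doc-1", "score": 0.95, "metadata": {"text": "Refunds take 5 days."}},
]
prompt = build_rag_prompt(matches, "How long do refunds take?")
# `prompt` becomes parts[0].text in the step-5 generateContent body.
```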