331 lines
9.7 KiB
Markdown
331 lines
9.7 KiB
Markdown
# Subscription Query Complexity Analysis
|
|
|
|
## Overview
|
|
|
|
This document analyzes how Nostr REQ subscription filters would be implemented across different schema designs, focusing on query complexity, performance implications, and implementation burden.
|
|
|
|
## Nostr REQ Filter Specification Recap
|
|
|
|
Clients send REQ messages with filters containing:
|
|
- **`ids`**: List of specific event IDs
|
|
- **`authors`**: List of pubkeys
|
|
- **`kinds`**: List of event kinds
|
|
- **`#<letter>`**: Tag filters (e.g., `#e` for event refs, `#p` for pubkey mentions)
|
|
- **`since`/`until`**: Time range filters
|
|
- **`limit`**: Maximum events to return
|
|
|
|
### Key Filter Behaviors:
|
|
- **Multiple filters = OR logic**: Match any filter
|
|
- **Within filter = AND logic**: Match all specified conditions
|
|
- **Lists = IN logic**: Match any value in the list
|
|
- **Tag filters**: Must have at least one matching tag
|
|
|
|
## Schema Comparison for REQ Handling
|
|
|
|
### Current Simple Schema (Single Table)
|
|
```sql
|
|
CREATE TABLE event (
|
|
id TEXT PRIMARY KEY,
|
|
pubkey TEXT NOT NULL,
|
|
created_at INTEGER NOT NULL,
|
|
kind INTEGER NOT NULL,
|
|
content TEXT NOT NULL,
|
|
sig TEXT NOT NULL
|
|
);
|
|
|
|
CREATE TABLE tag (
|
|
id TEXT NOT NULL, -- event ID
|
|
name TEXT NOT NULL,
|
|
value TEXT NOT NULL,
|
|
parameters TEXT
|
|
);
|
|
```
|
|
|
|
#### Sample REQ Query Implementation:
|
|
```sql
|
|
-- Filter: {"authors": ["pubkey1", "pubkey2"], "kinds": [1, 6], "#p": ["target_pubkey"]}
|
|
SELECT DISTINCT e.*
|
|
FROM event e
|
|
WHERE e.pubkey IN ('pubkey1', 'pubkey2')
|
|
AND e.kind IN (1, 6)
|
|
AND EXISTS (
|
|
SELECT 1 FROM tag t
|
|
WHERE t.id = e.id AND t.name = 'p' AND t.value = 'target_pubkey'
|
|
)
|
|
ORDER BY e.created_at DESC, e.id ASC
|
|
LIMIT ?;
|
|
```
|
|
|
|
### Multi-Table Schema Challenge
|
|
|
|
With separate tables (`events_regular`, `events_replaceable`, `events_ephemeral`, `events_addressable`), a REQ filter could potentially match events across ALL tables.
|
|
|
|
#### Problem Example:
|
|
Filter: `{"kinds": [1, 0, 20001, 30023]}`
|
|
- Kind 1 → `events_regular`
|
|
- Kind 0 → `events_replaceable`
|
|
- Kind 20001 → `events_ephemeral`
|
|
- Kind 30023 → `events_addressable`
|
|
|
|
This requires **4 separate queries + UNION**, significantly complicating the implementation.
|
|
|
|
## Multi-Table Query Complexity
|
|
|
|
### Scenario 1: Cross-Table Kind Filter
|
|
```sql
|
|
-- Filter: {"kinds": [1, 0, 30023]}
|
|
-- Requires querying 3 different tables
|
|
|
|
SELECT id, pubkey, created_at, kind, content, sig FROM events_regular
|
|
WHERE kind = 1
|
|
UNION ALL
|
|
SELECT id, pubkey, created_at, kind, content, sig FROM events_replaceable
|
|
WHERE kind = 0
|
|
UNION ALL
|
|
SELECT id, pubkey, created_at, kind, content, sig FROM events_addressable
|
|
WHERE kind = 30023
|
|
ORDER BY created_at DESC, id ASC
|
|
LIMIT ?;
|
|
```
|
|
|
|
### Scenario 2: Cross-Table Author Filter
|
|
```sql
|
|
-- Filter: {"authors": ["pubkey1"]}
|
|
-- Must check ALL tables for this author
|
|
|
|
SELECT id, pubkey, created_at, kind, content, sig FROM events_regular
|
|
WHERE pubkey = 'pubkey1'
|
|
UNION ALL
|
|
SELECT id, pubkey, created_at, kind, content, sig FROM events_replaceable
|
|
WHERE pubkey = 'pubkey1'
|
|
UNION ALL
|
|
SELECT id, pubkey, created_at, kind, content, sig FROM events_ephemeral
|
|
WHERE pubkey = 'pubkey1'
|
|
UNION ALL
|
|
SELECT id, pubkey, created_at, kind, content, sig FROM events_addressable
|
|
WHERE pubkey = 'pubkey1'
|
|
ORDER BY created_at DESC, id ASC
|
|
LIMIT ?;
|
|
```
|
|
|
|
### Scenario 3: Complex Multi-Condition Filter
|
|
```sql
|
|
-- Filter: {"authors": ["pubkey1"], "kinds": [1, 0], "#p": ["target"], "since": 1234567890}
|
|
-- Extremely complex with multiple UNIONs and tag JOINs
|
|
|
|
WITH regular_results AS (
|
|
SELECT DISTINCT r.*
|
|
FROM events_regular r
|
|
JOIN tags_regular tr ON r.id = tr.event_id
|
|
WHERE r.pubkey = 'pubkey1'
|
|
AND r.kind = 1
|
|
AND r.created_at >= 1234567890
|
|
AND tr.name = 'p' AND tr.value = 'target'
|
|
),
|
|
replaceable_results AS (
|
|
SELECT DISTINCT rp.*
|
|
FROM events_replaceable rp
|
|
JOIN tags_replaceable trp ON (rp.pubkey, rp.kind) = (trp.event_pubkey, trp.event_kind)
|
|
WHERE rp.pubkey = 'pubkey1'
|
|
AND rp.kind = 0
|
|
AND rp.created_at >= 1234567890
|
|
AND trp.name = 'p' AND trp.value = 'target'
|
|
)
|
|
SELECT * FROM regular_results
|
|
UNION ALL
|
|
SELECT * FROM replaceable_results
|
|
ORDER BY created_at DESC, id ASC
|
|
LIMIT ?;
|
|
```
|
|
|
|
## Implementation Burden Analysis
|
|
|
|
### Single Table Approach
|
|
```c
|
|
// Simple - one query builder function
|
|
char* build_filter_query(cJSON* filters) {
|
|
// Build single SELECT with WHERE conditions
|
|
// Single ORDER BY and LIMIT
|
|
// One execution path
|
|
}
|
|
```
|
|
|
|
### Multi-Table Approach
|
|
```c
|
|
// Complex - requires routing and union logic
|
|
char* build_multi_table_query(cJSON* filters) {
|
|
// 1. Analyze kinds to determine which tables to query
|
|
// 2. Split filters per table type
|
|
// 3. Build separate queries for each table
|
|
// 4. Union results with complex ORDER BY
|
|
// 5. Handle LIMIT across UNION (tricky!)
|
|
}
|
|
|
|
typedef struct {
|
|
bool needs_regular;
|
|
bool needs_replaceable;
|
|
bool needs_ephemeral;
|
|
bool needs_addressable;
|
|
cJSON* regular_filter;
|
|
cJSON* replaceable_filter;
|
|
cJSON* ephemeral_filter;
|
|
cJSON* addressable_filter;
|
|
} filter_routing_t;
|
|
```
|
|
|
|
### Query Routing Complexity
|
|
|
|
For each REQ filter, we must:
|
|
|
|
1. **Analyze kinds** → Determine which tables to query
|
|
2. **Split filters** → Create per-table filter conditions
|
|
3. **Handle tag filters** → Different tag table references per event type
|
|
4. **Union results** → Merge with proper ordering
|
|
5. **Apply LIMIT** → Complex with UNION queries
|
|
|
|
## Performance Implications
|
|
|
|
### Single Table Advantages:
|
|
- ✅ **Single query execution**
|
|
- ✅ **One index strategy**
|
|
- ✅ **Simple LIMIT handling**
|
|
- ✅ **Unified ORDER BY**
|
|
- ✅ **No UNION overhead**
|
|
|
|
### Multi-Table Disadvantages:
|
|
- ❌ **Multiple query executions**
|
|
- ❌ **UNION sorting overhead**
|
|
- ❌ **Complex LIMIT application**
|
|
- ❌ **Index fragmentation across tables**
|
|
- ❌ **Result set merging complexity**
|
|
|
|
## Specific REQ Filter Challenges
|
|
|
|
### 1. LIMIT Handling with UNION
|
|
```sql
|
|
-- WRONG: Limit applies to each subquery
|
|
(SELECT * FROM events_regular WHERE ... LIMIT 100)
|
|
UNION ALL
|
|
(SELECT * FROM events_replaceable WHERE ... LIMIT 100)
|
|
-- Could return 200 events!
|
|
|
|
-- CORRECT: Limit applies to final result
|
|
SELECT * FROM (
|
|
SELECT * FROM events_regular WHERE ...
|
|
UNION ALL
|
|
SELECT * FROM events_replaceable WHERE ...
|
|
ORDER BY created_at DESC, id ASC
|
|
) LIMIT 100;
|
|
-- But this sorts ALL results before limiting!
|
|
```
|
|
|
|
### 2. Tag Filter Complexity
|
|
Each event type needs different tag table joins:
|
|
- `events_regular` → `tags_regular`
|
|
- `events_replaceable` → `tags_replaceable` (with composite key)
|
|
- `events_addressable` → `tags_addressable` (with composite key)
|
|
- `events_ephemeral` → `tags_ephemeral`
|
|
|
|
### 3. Subscription State Management
|
|
With multiple tables, subscription state becomes complex:
|
|
- Which tables does this subscription monitor?
|
|
- How to efficiently check new events across tables?
|
|
- Different trigger/notification patterns per table
|
|
|
|
## Alternative: Unified Event View
|
|
|
|
### Hybrid Approach: Views Over Multi-Tables
|
|
```sql
|
|
-- Create unified view for queries
|
|
CREATE VIEW all_events AS
|
|
SELECT
|
|
'regular' as event_type,
|
|
id, pubkey, created_at, kind, content, sig
|
|
FROM events_regular
|
|
UNION ALL
|
|
SELECT
|
|
'replaceable' as event_type,
|
|
id, pubkey, created_at, kind, content, sig
|
|
FROM events_replaceable
|
|
UNION ALL
|
|
SELECT
|
|
'ephemeral' as event_type,
|
|
id, pubkey, created_at, kind, content, sig
|
|
FROM events_ephemeral
|
|
UNION ALL
|
|
SELECT
|
|
'addressable' as event_type,
|
|
id, pubkey, created_at, kind, content, sig
|
|
FROM events_addressable;
|
|
|
|
-- Unified tag view
|
|
CREATE VIEW all_tags AS
|
|
SELECT event_id, name, value, parameters FROM tags_regular
|
|
UNION ALL
|
|
SELECT CONCAT(event_pubkey, ':', event_kind), name, value, parameters FROM tags_replaceable
|
|
UNION ALL
|
|
SELECT event_id, name, value, parameters FROM tags_ephemeral
|
|
UNION ALL
|
|
SELECT CONCAT(event_pubkey, ':', event_kind, ':', d_tag), name, value, parameters FROM tags_addressable;
|
|
```
|
|
|
|
### REQ Query Against Views:
|
|
```sql
|
|
-- Much simpler - back to single-table complexity
|
|
SELECT DISTINCT e.*
|
|
FROM all_events e
|
|
JOIN all_tags t ON e.id = t.event_id
|
|
WHERE e.pubkey IN (?)
|
|
AND e.kind IN (?)
|
|
AND t.name = 'p' AND t.value = ?
|
|
ORDER BY e.created_at DESC, e.id ASC
|
|
LIMIT ?;
|
|
```
|
|
|
|
## Recommendation
|
|
|
|
**The multi-table approach creates significant subscription query complexity that may outweigh its benefits.**
|
|
|
|
### Key Issues:
|
|
1. **REQ filters don't map to event types** - clients filter by kind, author, tags, not storage semantics
|
|
2. **UNION query complexity** - much harder to optimize and implement
|
|
3. **Subscription management burden** - must monitor multiple tables
|
|
4. **Performance uncertainty** - UNION queries may be slower than single table
|
|
|
|
### Alternative Recommendation:
|
|
|
|
**Modified Single Table with Event Type Column:**
|
|
|
|
```sql
|
|
CREATE TABLE events (
|
|
id TEXT PRIMARY KEY,
|
|
pubkey TEXT NOT NULL,
|
|
created_at INTEGER NOT NULL,
|
|
kind INTEGER NOT NULL,
|
|
event_type TEXT NOT NULL, -- 'regular', 'replaceable', 'ephemeral', 'addressable'
|
|
content TEXT NOT NULL,
|
|
sig TEXT NOT NULL,
|
|
tags JSON,
|
|
|
|
-- Replaceable event fields
|
|
replaced_at INTEGER,
|
|
|
|
-- Addressable event fields
|
|
d_tag TEXT,
|
|
|
|
-- Unique constraints per event type
|
|
CONSTRAINT unique_replaceable
|
|
UNIQUE (pubkey, kind) WHERE event_type = 'replaceable',
|
|
CONSTRAINT unique_addressable
|
|
UNIQUE (pubkey, kind, d_tag) WHERE event_type = 'addressable'
|
|
);
|
|
```
|
|
|
|
### Benefits:
|
|
- ✅ **Simple REQ queries** - single table, familiar patterns
|
|
- ✅ **Type enforcement** - partial unique constraints handle replacement logic
|
|
- ✅ **Performance** - unified indexes, no UNIONs
|
|
- ✅ **Implementation simplicity** - minimal changes from current code
|
|
- ✅ **Future flexibility** - can split tables later if needed
|
|
|
|
This approach gets the best of both worlds: protocol compliance through constraints, but query simplicity through unified storage. |