Basic relay functionality completed
This commit is contained in:
331
docs/subscription_query_analysis.md
Normal file
331
docs/subscription_query_analysis.md
Normal file
@@ -0,0 +1,331 @@
|
||||
# Subscription Query Complexity Analysis
|
||||
|
||||
## Overview
|
||||
|
||||
This document analyzes how Nostr REQ subscription filters would be implemented across different schema designs, focusing on query complexity, performance implications, and implementation burden.
|
||||
|
||||
## Nostr REQ Filter Specification Recap
|
||||
|
||||
Clients send REQ messages with filters containing:
|
||||
- **`ids`**: List of specific event IDs
|
||||
- **`authors`**: List of pubkeys
|
||||
- **`kinds`**: List of event kinds
|
||||
- **`#<letter>`**: Tag filters (e.g., `#e` for event refs, `#p` for pubkey mentions)
|
||||
- **`since`/`until`**: Time range filters
|
||||
- **`limit`**: Maximum events to return
|
||||
|
||||
### Key Filter Behaviors:
|
||||
- **Multiple filters = OR logic**: Match any filter
|
||||
- **Within filter = AND logic**: Match all specified conditions
|
||||
- **Lists = IN logic**: Match any value in the list
|
||||
- **Tag filters**: Must have at least one matching tag
|
||||
|
||||
## Schema Comparison for REQ Handling
|
||||
|
||||
### Current Simple Schema (Single Table)
|
||||
```sql
|
||||
CREATE TABLE event (
|
||||
id TEXT PRIMARY KEY,
|
||||
pubkey TEXT NOT NULL,
|
||||
created_at INTEGER NOT NULL,
|
||||
kind INTEGER NOT NULL,
|
||||
content TEXT NOT NULL,
|
||||
sig TEXT NOT NULL
|
||||
);
|
||||
|
||||
CREATE TABLE tag (
|
||||
id TEXT NOT NULL, -- event ID
|
||||
name TEXT NOT NULL,
|
||||
value TEXT NOT NULL,
|
||||
parameters TEXT
|
||||
);
|
||||
```
|
||||
|
||||
#### Sample REQ Query Implementation:
|
||||
```sql
|
||||
-- Filter: {"authors": ["pubkey1", "pubkey2"], "kinds": [1, 6], "#p": ["target_pubkey"]}
|
||||
SELECT DISTINCT e.*
|
||||
FROM event e
|
||||
WHERE e.pubkey IN ('pubkey1', 'pubkey2')
|
||||
AND e.kind IN (1, 6)
|
||||
AND EXISTS (
|
||||
SELECT 1 FROM tag t
|
||||
WHERE t.id = e.id AND t.name = 'p' AND t.value = 'target_pubkey'
|
||||
)
|
||||
ORDER BY e.created_at DESC, e.id ASC
|
||||
LIMIT ?;
|
||||
```
|
||||
|
||||
### Multi-Table Schema Challenge
|
||||
|
||||
With separate tables (`events_regular`, `events_replaceable`, `events_ephemeral`, `events_addressable`), a REQ filter could potentially match events across ALL tables.
|
||||
|
||||
#### Problem Example:
|
||||
Filter: `{"kinds": [1, 0, 20001, 30023]}`
|
||||
- Kind 1 → `events_regular`
|
||||
- Kind 0 → `events_replaceable`
|
||||
- Kind 20001 → `events_ephemeral`
|
||||
- Kind 30023 → `events_addressable`
|
||||
|
||||
This requires **4 separate queries + UNION**, significantly complicating the implementation.
|
||||
|
||||
## Multi-Table Query Complexity
|
||||
|
||||
### Scenario 1: Cross-Table Kind Filter
|
||||
```sql
|
||||
-- Filter: {"kinds": [1, 0, 30023]}
|
||||
-- Requires querying 3 different tables
|
||||
|
||||
SELECT id, pubkey, created_at, kind, content, sig FROM events_regular
|
||||
WHERE kind = 1
|
||||
UNION ALL
|
||||
SELECT id, pubkey, created_at, kind, content, sig FROM events_replaceable
|
||||
WHERE kind = 0
|
||||
UNION ALL
|
||||
SELECT id, pubkey, created_at, kind, content, sig FROM events_addressable
|
||||
WHERE kind = 30023
|
||||
ORDER BY created_at DESC, id ASC
|
||||
LIMIT ?;
|
||||
```
|
||||
|
||||
### Scenario 2: Cross-Table Author Filter
|
||||
```sql
|
||||
-- Filter: {"authors": ["pubkey1"]}
|
||||
-- Must check ALL tables for this author
|
||||
|
||||
SELECT id, pubkey, created_at, kind, content, sig FROM events_regular
|
||||
WHERE pubkey = 'pubkey1'
|
||||
UNION ALL
|
||||
SELECT id, pubkey, created_at, kind, content, sig FROM events_replaceable
|
||||
WHERE pubkey = 'pubkey1'
|
||||
UNION ALL
|
||||
SELECT id, pubkey, created_at, kind, content, sig FROM events_ephemeral
|
||||
WHERE pubkey = 'pubkey1'
|
||||
UNION ALL
|
||||
SELECT id, pubkey, created_at, kind, content, sig FROM events_addressable
|
||||
WHERE pubkey = 'pubkey1'
|
||||
ORDER BY created_at DESC, id ASC
|
||||
LIMIT ?;
|
||||
```
|
||||
|
||||
### Scenario 3: Complex Multi-Condition Filter
|
||||
```sql
|
||||
-- Filter: {"authors": ["pubkey1"], "kinds": [1, 0], "#p": ["target"], "since": 1234567890}
|
||||
-- Extremely complex with multiple UNIONs and tag JOINs
|
||||
|
||||
WITH regular_results AS (
|
||||
SELECT DISTINCT r.*
|
||||
FROM events_regular r
|
||||
JOIN tags_regular tr ON r.id = tr.event_id
|
||||
WHERE r.pubkey = 'pubkey1'
|
||||
AND r.kind = 1
|
||||
AND r.created_at >= 1234567890
|
||||
AND tr.name = 'p' AND tr.value = 'target'
|
||||
),
|
||||
replaceable_results AS (
|
||||
SELECT DISTINCT rp.*
|
||||
FROM events_replaceable rp
|
||||
JOIN tags_replaceable trp ON (rp.pubkey, rp.kind) = (trp.event_pubkey, trp.event_kind)
|
||||
WHERE rp.pubkey = 'pubkey1'
|
||||
AND rp.kind = 0
|
||||
AND rp.created_at >= 1234567890
|
||||
AND trp.name = 'p' AND trp.value = 'target'
|
||||
)
|
||||
SELECT * FROM regular_results
|
||||
UNION ALL
|
||||
SELECT * FROM replaceable_results
|
||||
ORDER BY created_at DESC, id ASC
|
||||
LIMIT ?;
|
||||
```
|
||||
|
||||
## Implementation Burden Analysis
|
||||
|
||||
### Single Table Approach
|
||||
```c
|
||||
// Simple - one query builder function
|
||||
char* build_filter_query(cJSON* filters) {
|
||||
// Build single SELECT with WHERE conditions
|
||||
// Single ORDER BY and LIMIT
|
||||
// One execution path
|
||||
}
|
||||
```
|
||||
|
||||
### Multi-Table Approach
|
||||
```c
|
||||
// Complex - requires routing and union logic
|
||||
char* build_multi_table_query(cJSON* filters) {
|
||||
// 1. Analyze kinds to determine which tables to query
|
||||
// 2. Split filters per table type
|
||||
// 3. Build separate queries for each table
|
||||
// 4. Union results with complex ORDER BY
|
||||
// 5. Handle LIMIT across UNION (tricky!)
|
||||
}
|
||||
|
||||
typedef struct {
|
||||
bool needs_regular;
|
||||
bool needs_replaceable;
|
||||
bool needs_ephemeral;
|
||||
bool needs_addressable;
|
||||
cJSON* regular_filter;
|
||||
cJSON* replaceable_filter;
|
||||
cJSON* ephemeral_filter;
|
||||
cJSON* addressable_filter;
|
||||
} filter_routing_t;
|
||||
```
|
||||
|
||||
### Query Routing Complexity
|
||||
|
||||
For each REQ filter, we must:
|
||||
|
||||
1. **Analyze kinds** → Determine which tables to query
|
||||
2. **Split filters** → Create per-table filter conditions
|
||||
3. **Handle tag filters** → Different tag table references per event type
|
||||
4. **Union results** → Merge with proper ordering
|
||||
5. **Apply LIMIT** → Complex with UNION queries
|
||||
|
||||
## Performance Implications
|
||||
|
||||
### Single Table Advantages:
|
||||
- ✅ **Single query execution**
|
||||
- ✅ **One index strategy**
|
||||
- ✅ **Simple LIMIT handling**
|
||||
- ✅ **Unified ORDER BY**
|
||||
- ✅ **No UNION overhead**
|
||||
|
||||
### Multi-Table Disadvantages:
|
||||
- ❌ **Multiple query executions**
|
||||
- ❌ **UNION sorting overhead**
|
||||
- ❌ **Complex LIMIT application**
|
||||
- ❌ **Index fragmentation across tables**
|
||||
- ❌ **Result set merging complexity**
|
||||
|
||||
## Specific REQ Filter Challenges
|
||||
|
||||
### 1. LIMIT Handling with UNION
|
||||
```sql
|
||||
-- WRONG: Limit applies to each subquery
|
||||
(SELECT * FROM events_regular WHERE ... LIMIT 100)
|
||||
UNION ALL
|
||||
(SELECT * FROM events_replaceable WHERE ... LIMIT 100)
|
||||
-- Could return 200 events!
|
||||
|
||||
-- CORRECT: Limit applies to final result
|
||||
SELECT * FROM (
|
||||
SELECT * FROM events_regular WHERE ...
|
||||
UNION ALL
|
||||
SELECT * FROM events_replaceable WHERE ...
|
||||
ORDER BY created_at DESC, id ASC
|
||||
) LIMIT 100;
|
||||
-- But this sorts ALL results before limiting!
|
||||
```
|
||||
|
||||
### 2. Tag Filter Complexity
|
||||
Each event type needs different tag table joins:
|
||||
- `events_regular` → `tags_regular`
|
||||
- `events_replaceable` → `tags_replaceable` (with composite key)
|
||||
- `events_addressable` → `tags_addressable` (with composite key)
|
||||
- `events_ephemeral` → `tags_ephemeral`
|
||||
|
||||
### 3. Subscription State Management
|
||||
With multiple tables, subscription state becomes complex:
|
||||
- Which tables does this subscription monitor?
|
||||
- How to efficiently check new events across tables?
|
||||
- Different trigger/notification patterns per table
|
||||
|
||||
## Alternative: Unified Event View
|
||||
|
||||
### Hybrid Approach: Views Over Multi-Tables
|
||||
```sql
|
||||
-- Create unified view for queries
|
||||
CREATE VIEW all_events AS
|
||||
SELECT
|
||||
'regular' as event_type,
|
||||
id, pubkey, created_at, kind, content, sig
|
||||
FROM events_regular
|
||||
UNION ALL
|
||||
SELECT
|
||||
'replaceable' as event_type,
|
||||
id, pubkey, created_at, kind, content, sig
|
||||
FROM events_replaceable
|
||||
UNION ALL
|
||||
SELECT
|
||||
'ephemeral' as event_type,
|
||||
id, pubkey, created_at, kind, content, sig
|
||||
FROM events_ephemeral
|
||||
UNION ALL
|
||||
SELECT
|
||||
'addressable' as event_type,
|
||||
id, pubkey, created_at, kind, content, sig
|
||||
FROM events_addressable;
|
||||
|
||||
-- Unified tag view
|
||||
CREATE VIEW all_tags AS
|
||||
SELECT event_id, name, value, parameters FROM tags_regular
|
||||
UNION ALL
|
||||
SELECT CONCAT(event_pubkey, ':', event_kind), name, value, parameters FROM tags_replaceable
|
||||
UNION ALL
|
||||
SELECT event_id, name, value, parameters FROM tags_ephemeral
|
||||
UNION ALL
|
||||
SELECT CONCAT(event_pubkey, ':', event_kind, ':', d_tag), name, value, parameters FROM tags_addressable;
|
||||
```
|
||||
|
||||
### REQ Query Against Views:
|
||||
```sql
|
||||
-- Much simpler - back to single-table complexity
|
||||
SELECT DISTINCT e.*
|
||||
FROM all_events e
|
||||
JOIN all_tags t ON e.id = t.event_id
|
||||
WHERE e.pubkey IN (?)
|
||||
AND e.kind IN (?)
|
||||
AND t.name = 'p' AND t.value = ?
|
||||
ORDER BY e.created_at DESC, e.id ASC
|
||||
LIMIT ?;
|
||||
```
|
||||
|
||||
## Recommendation
|
||||
|
||||
**The multi-table approach creates significant subscription query complexity that may outweigh its benefits.**
|
||||
|
||||
### Key Issues:
|
||||
1. **REQ filters don't map to event types** - clients filter by kind, author, tags, not storage semantics
|
||||
2. **UNION query complexity** - much harder to optimize and implement
|
||||
3. **Subscription management burden** - must monitor multiple tables
|
||||
4. **Performance uncertainty** - UNION queries may be slower than single table
|
||||
|
||||
### Alternative Recommendation:
|
||||
|
||||
**Modified Single Table with Event Type Column:**
|
||||
|
||||
```sql
|
||||
CREATE TABLE events (
|
||||
id TEXT PRIMARY KEY,
|
||||
pubkey TEXT NOT NULL,
|
||||
created_at INTEGER NOT NULL,
|
||||
kind INTEGER NOT NULL,
|
||||
event_type TEXT NOT NULL, -- 'regular', 'replaceable', 'ephemeral', 'addressable'
|
||||
content TEXT NOT NULL,
|
||||
sig TEXT NOT NULL,
|
||||
tags JSON,
|
||||
|
||||
-- Replaceable event fields
|
||||
replaced_at INTEGER,
|
||||
|
||||
-- Addressable event fields
|
||||
d_tag TEXT,
|
||||
|
||||
-- Unique constraints per event type
|
||||
CONSTRAINT unique_replaceable
|
||||
UNIQUE (pubkey, kind) WHERE event_type = 'replaceable',
|
||||
CONSTRAINT unique_addressable
|
||||
UNIQUE (pubkey, kind, d_tag) WHERE event_type = 'addressable'
|
||||
);
|
||||
```
|
||||
|
||||
### Benefits:
|
||||
- ✅ **Simple REQ queries** - single table, familiar patterns
|
||||
- ✅ **Type enforcement** - partial unique constraints handle replacement logic
|
||||
- ✅ **Performance** - unified indexes, no UNIONs
|
||||
- ✅ **Implementation simplicity** - minimal changes from current code
|
||||
- ✅ **Future flexibility** - can split tables later if needed
|
||||
|
||||
This approach gets the best of both worlds: protocol compliance through constraints, but query simplicity through unified storage.
|
||||
Reference in New Issue
Block a user