9.7 KiB
Subscription Query Complexity Analysis
Overview
This document analyzes how Nostr REQ subscription filters would be implemented across different schema designs, focusing on query complexity, performance implications, and implementation burden.
Nostr REQ Filter Specification Recap
Clients send REQ messages with filters containing:
ids: List of specific event IDsauthors: List of pubkeyskinds: List of event kinds#<letter>: Tag filters (e.g.,#efor event refs,#pfor pubkey mentions)since/until: Time range filterslimit: Maximum events to return
Key Filter Behaviors:
- Multiple filters = OR logic: Match any filter
- Within filter = AND logic: Match all specified conditions
- Lists = IN logic: Match any value in the list
- Tag filters: Must have at least one matching tag
Schema Comparison for REQ Handling
Current Simple Schema (Single Table)
CREATE TABLE event (
id TEXT PRIMARY KEY,
pubkey TEXT NOT NULL,
created_at INTEGER NOT NULL,
kind INTEGER NOT NULL,
content TEXT NOT NULL,
sig TEXT NOT NULL
);
CREATE TABLE tag (
id TEXT NOT NULL, -- event ID
name TEXT NOT NULL,
value TEXT NOT NULL,
parameters TEXT
);
Sample REQ Query Implementation:
-- Filter: {"authors": ["pubkey1", "pubkey2"], "kinds": [1, 6], "#p": ["target_pubkey"]}
SELECT DISTINCT e.*
FROM event e
WHERE e.pubkey IN ('pubkey1', 'pubkey2')
AND e.kind IN (1, 6)
AND EXISTS (
SELECT 1 FROM tag t
WHERE t.id = e.id AND t.name = 'p' AND t.value = 'target_pubkey'
)
ORDER BY e.created_at DESC, e.id ASC
LIMIT ?;
Multi-Table Schema Challenge
With separate tables (events_regular, events_replaceable, events_ephemeral, events_addressable), a REQ filter could potentially match events across ALL tables.
Problem Example:
Filter: {"kinds": [1, 0, 20001, 30023]}
- Kind 1 →
events_regular - Kind 0 →
events_replaceable - Kind 20001 →
events_ephemeral - Kind 30023 →
events_addressable
This requires 4 separate queries + UNION, significantly complicating the implementation.
Multi-Table Query Complexity
Scenario 1: Cross-Table Kind Filter
-- Filter: {"kinds": [1, 0, 30023]}
-- Requires querying 3 different tables
SELECT id, pubkey, created_at, kind, content, sig FROM events_regular
WHERE kind = 1
UNION ALL
SELECT id, pubkey, created_at, kind, content, sig FROM events_replaceable
WHERE kind = 0
UNION ALL
SELECT id, pubkey, created_at, kind, content, sig FROM events_addressable
WHERE kind = 30023
ORDER BY created_at DESC, id ASC
LIMIT ?;
Scenario 2: Cross-Table Author Filter
-- Filter: {"authors": ["pubkey1"]}
-- Must check ALL tables for this author
SELECT id, pubkey, created_at, kind, content, sig FROM events_regular
WHERE pubkey = 'pubkey1'
UNION ALL
SELECT id, pubkey, created_at, kind, content, sig FROM events_replaceable
WHERE pubkey = 'pubkey1'
UNION ALL
SELECT id, pubkey, created_at, kind, content, sig FROM events_ephemeral
WHERE pubkey = 'pubkey1'
UNION ALL
SELECT id, pubkey, created_at, kind, content, sig FROM events_addressable
WHERE pubkey = 'pubkey1'
ORDER BY created_at DESC, id ASC
LIMIT ?;
Scenario 3: Complex Multi-Condition Filter
-- Filter: {"authors": ["pubkey1"], "kinds": [1, 0], "#p": ["target"], "since": 1234567890}
-- Extremely complex with multiple UNIONs and tag JOINs
WITH regular_results AS (
SELECT DISTINCT r.*
FROM events_regular r
JOIN tags_regular tr ON r.id = tr.event_id
WHERE r.pubkey = 'pubkey1'
AND r.kind = 1
AND r.created_at >= 1234567890
AND tr.name = 'p' AND tr.value = 'target'
),
replaceable_results AS (
SELECT DISTINCT rp.*
FROM events_replaceable rp
JOIN tags_replaceable trp ON (rp.pubkey, rp.kind) = (trp.event_pubkey, trp.event_kind)
WHERE rp.pubkey = 'pubkey1'
AND rp.kind = 0
AND rp.created_at >= 1234567890
AND trp.name = 'p' AND trp.value = 'target'
)
SELECT * FROM regular_results
UNION ALL
SELECT * FROM replaceable_results
ORDER BY created_at DESC, id ASC
LIMIT ?;
Implementation Burden Analysis
Single Table Approach
// Simple - one query builder function
char* build_filter_query(cJSON* filters) {
// Build single SELECT with WHERE conditions
// Single ORDER BY and LIMIT
// One execution path
}
Multi-Table Approach
// Complex - requires routing and union logic
char* build_multi_table_query(cJSON* filters) {
// 1. Analyze kinds to determine which tables to query
// 2. Split filters per table type
// 3. Build separate queries for each table
// 4. Union results with complex ORDER BY
// 5. Handle LIMIT across UNION (tricky!)
}
typedef struct {
bool needs_regular;
bool needs_replaceable;
bool needs_ephemeral;
bool needs_addressable;
cJSON* regular_filter;
cJSON* replaceable_filter;
cJSON* ephemeral_filter;
cJSON* addressable_filter;
} filter_routing_t;
Query Routing Complexity
For each REQ filter, we must:
- Analyze kinds → Determine which tables to query
- Split filters → Create per-table filter conditions
- Handle tag filters → Different tag table references per event type
- Union results → Merge with proper ordering
- Apply LIMIT → Complex with UNION queries
Performance Implications
Single Table Advantages:
- ✅ Single query execution
- ✅ One index strategy
- ✅ Simple LIMIT handling
- ✅ Unified ORDER BY
- ✅ No UNION overhead
Multi-Table Disadvantages:
- ❌ Multiple query executions
- ❌ UNION sorting overhead
- ❌ Complex LIMIT application
- ❌ Index fragmentation across tables
- ❌ Result set merging complexity
Specific REQ Filter Challenges
1. LIMIT Handling with UNION
-- WRONG: Limit applies to each subquery
(SELECT * FROM events_regular WHERE ... LIMIT 100)
UNION ALL
(SELECT * FROM events_replaceable WHERE ... LIMIT 100)
-- Could return 200 events!
-- CORRECT: Limit applies to final result
SELECT * FROM (
SELECT * FROM events_regular WHERE ...
UNION ALL
SELECT * FROM events_replaceable WHERE ...
ORDER BY created_at DESC, id ASC
) LIMIT 100;
-- But this sorts ALL results before limiting!
2. Tag Filter Complexity
Each event type needs different tag table joins:
events_regular→tags_regularevents_replaceable→tags_replaceable(with composite key)events_addressable→tags_addressable(with composite key)events_ephemeral→tags_ephemeral
3. Subscription State Management
With multiple tables, subscription state becomes complex:
- Which tables does this subscription monitor?
- How to efficiently check new events across tables?
- Different trigger/notification patterns per table
Alternative: Unified Event View
Hybrid Approach: Views Over Multi-Tables
-- Create unified view for queries
CREATE VIEW all_events AS
SELECT
'regular' as event_type,
id, pubkey, created_at, kind, content, sig
FROM events_regular
UNION ALL
SELECT
'replaceable' as event_type,
id, pubkey, created_at, kind, content, sig
FROM events_replaceable
UNION ALL
SELECT
'ephemeral' as event_type,
id, pubkey, created_at, kind, content, sig
FROM events_ephemeral
UNION ALL
SELECT
'addressable' as event_type,
id, pubkey, created_at, kind, content, sig
FROM events_addressable;
-- Unified tag view
CREATE VIEW all_tags AS
SELECT event_id, name, value, parameters FROM tags_regular
UNION ALL
SELECT CONCAT(event_pubkey, ':', event_kind), name, value, parameters FROM tags_replaceable
UNION ALL
SELECT event_id, name, value, parameters FROM tags_ephemeral
UNION ALL
SELECT CONCAT(event_pubkey, ':', event_kind, ':', d_tag), name, value, parameters FROM tags_addressable;
REQ Query Against Views:
-- Much simpler - back to single-table complexity
SELECT DISTINCT e.*
FROM all_events e
JOIN all_tags t ON e.id = t.event_id
WHERE e.pubkey IN (?)
AND e.kind IN (?)
AND t.name = 'p' AND t.value = ?
ORDER BY e.created_at DESC, e.id ASC
LIMIT ?;
Recommendation
The multi-table approach creates significant subscription query complexity that may outweigh its benefits.
Key Issues:
- REQ filters don't map to event types - clients filter by kind, author, tags, not storage semantics
- UNION query complexity - much harder to optimize and implement
- Subscription management burden - must monitor multiple tables
- Performance uncertainty - UNION queries may be slower than single table
Alternative Recommendation:
Modified Single Table with Event Type Column:
CREATE TABLE events (
id TEXT PRIMARY KEY,
pubkey TEXT NOT NULL,
created_at INTEGER NOT NULL,
kind INTEGER NOT NULL,
event_type TEXT NOT NULL, -- 'regular', 'replaceable', 'ephemeral', 'addressable'
content TEXT NOT NULL,
sig TEXT NOT NULL,
tags JSON,
-- Replaceable event fields
replaced_at INTEGER,
-- Addressable event fields
d_tag TEXT,
-- Unique constraints per event type
CONSTRAINT unique_replaceable
UNIQUE (pubkey, kind) WHERE event_type = 'replaceable',
CONSTRAINT unique_addressable
UNIQUE (pubkey, kind, d_tag) WHERE event_type = 'addressable'
);
Benefits:
- ✅ Simple REQ queries - single table, familiar patterns
- ✅ Type enforcement - partial unique constraints handle replacement logic
- ✅ Performance - unified indexes, no UNIONs
- ✅ Implementation simplicity - minimal changes from current code
- ✅ Future flexibility - can split tables later if needed
This approach gets the best of both worlds: protocol compliance through constraints, but query simplicity through unified storage.