9.8 KiB
9.8 KiB
Advanced Nostr Relay Schema Design
Overview
This document outlines the design for an advanced multi-table schema that enforces Nostr protocol compliance at the database level, with separate tables for different event types based on their storage and replacement characteristics.
Event Type Classification
Based on the Nostr specification, events are classified into four categories:
1. Regular Events
- Kinds:
1000 <= n < 10000||4 <= n < 45||n == 1||n == 2 - Storage Policy: All events stored permanently
- Examples: Text notes (1), Reposts (6), Reactions (7), Direct Messages (4)
2. Replaceable Events
- Kinds:
10000 <= n < 20000||n == 0||n == 3 - Storage Policy: Only latest per
(pubkey, kind)combination - Replacement Logic: Latest
created_at, then lowestidlexically - Examples: Metadata (0), Contacts (3), Mute List (10000)
3. Ephemeral Events
- Kinds:
20000 <= n < 30000 - Storage Policy: Not expected to be stored (optional temporary storage)
- Examples: Typing indicators, presence updates, ephemeral messages
4. Addressable Events
- Kinds:
30000 <= n < 40000 - Storage Policy: Only latest per
(pubkey, kind, d_tag)combination - Replacement Logic: Same as replaceable events
- Examples: Long-form content (30023), Application-specific data
SQLite JSON Capabilities Research
SQLite provides powerful JSON functions that could be leveraged for tag storage:
Core JSON Functions
-- Extract specific values
json_extract(column, '$.path')
-- Iterate through arrays
json_each(json_array_column)
-- Flatten nested structures
json_tree(json_column)
-- Validate JSON structure
json_valid(column)
-- Array operations
json_array_length(column)
json_extract(column, '$[0]') -- First element
Tag Query Examples
Find all 'e' tag references:
SELECT
id,
json_extract(value, '$[1]') as referenced_event_id,
json_extract(value, '$[2]') as relay_hint,
json_extract(value, '$[3]') as marker
FROM events, json_each(tags)
WHERE json_extract(value, '$[0]') = 'e';
Find events with specific hashtags:
SELECT id, content
FROM events, json_each(tags)
WHERE json_extract(value, '$[0]') = 't'
AND json_extract(value, '$[1]') = 'bitcoin';
Extract 'd' tag for addressable events:
SELECT
id,
json_extract(value, '$[1]') as d_tag_value
FROM events, json_each(tags)
WHERE json_extract(value, '$[0]') = 'd'
LIMIT 1;
JSON Functional Indexes
-- Index on hashtags
CREATE INDEX idx_hashtags ON events(
json_extract(tags, '$[*][1]')
) WHERE json_extract(tags, '$[*][0]') = 't';
-- Index on 'd' tags for addressable events
CREATE INDEX idx_d_tags ON events_addressable(
json_extract(tags, '$[*][1]')
) WHERE json_extract(tags, '$[*][0]') = 'd';
Proposed Schema Design
Option 1: Separate Tables with JSON Tags
-- Regular Events (permanent storage)
CREATE TABLE events_regular (
id TEXT PRIMARY KEY,
pubkey TEXT NOT NULL,
created_at INTEGER NOT NULL,
kind INTEGER NOT NULL,
content TEXT NOT NULL,
sig TEXT NOT NULL,
tags JSON,
first_seen INTEGER DEFAULT (strftime('%s', 'now')),
CONSTRAINT kind_regular CHECK (
(kind >= 1000 AND kind < 10000) OR
(kind >= 4 AND kind < 45) OR
kind = 1 OR kind = 2
)
);
-- Replaceable Events (latest per pubkey+kind)
CREATE TABLE events_replaceable (
pubkey TEXT NOT NULL,
kind INTEGER NOT NULL,
id TEXT NOT NULL,
created_at INTEGER NOT NULL,
content TEXT NOT NULL,
sig TEXT NOT NULL,
tags JSON,
replaced_at INTEGER DEFAULT (strftime('%s', 'now')),
PRIMARY KEY (pubkey, kind),
CONSTRAINT kind_replaceable CHECK (
(kind >= 10000 AND kind < 20000) OR
kind = 0 OR kind = 3
)
);
-- Ephemeral Events (temporary/optional storage)
CREATE TABLE events_ephemeral (
id TEXT PRIMARY KEY,
pubkey TEXT NOT NULL,
created_at INTEGER NOT NULL,
kind INTEGER NOT NULL,
content TEXT NOT NULL,
sig TEXT NOT NULL,
tags JSON,
expires_at INTEGER DEFAULT (strftime('%s', 'now', '+1 hour')),
CONSTRAINT kind_ephemeral CHECK (
kind >= 20000 AND kind < 30000
)
);
-- Addressable Events (latest per pubkey+kind+d_tag)
CREATE TABLE events_addressable (
pubkey TEXT NOT NULL,
kind INTEGER NOT NULL,
d_tag TEXT NOT NULL,
id TEXT NOT NULL,
created_at INTEGER NOT NULL,
content TEXT NOT NULL,
sig TEXT NOT NULL,
tags JSON,
replaced_at INTEGER DEFAULT (strftime('%s', 'now')),
PRIMARY KEY (pubkey, kind, d_tag),
CONSTRAINT kind_addressable CHECK (
kind >= 30000 AND kind < 40000
)
);
Indexes for Performance
-- Regular events indexes
CREATE INDEX idx_regular_pubkey ON events_regular(pubkey);
CREATE INDEX idx_regular_kind ON events_regular(kind);
CREATE INDEX idx_regular_created_at ON events_regular(created_at);
CREATE INDEX idx_regular_kind_created_at ON events_regular(kind, created_at);
-- Replaceable events indexes
CREATE INDEX idx_replaceable_created_at ON events_replaceable(created_at);
CREATE INDEX idx_replaceable_id ON events_replaceable(id);
-- Ephemeral events indexes
CREATE INDEX idx_ephemeral_expires_at ON events_ephemeral(expires_at);
CREATE INDEX idx_ephemeral_pubkey ON events_ephemeral(pubkey);
-- Addressable events indexes
CREATE INDEX idx_addressable_created_at ON events_addressable(created_at);
CREATE INDEX idx_addressable_id ON events_addressable(id);
-- JSON tag indexes (examples)
CREATE INDEX idx_regular_e_tags ON events_regular(
json_extract(tags, '$[*][1]')
) WHERE json_extract(tags, '$[*][0]') = 'e';
CREATE INDEX idx_regular_p_tags ON events_regular(
json_extract(tags, '$[*][1]')
) WHERE json_extract(tags, '$[*][0]') = 'p';
Option 2: Unified Tag Table Approach
-- Unified tag storage (alternative to JSON)
CREATE TABLE tags_unified (
event_id TEXT NOT NULL,
event_type TEXT NOT NULL, -- 'regular', 'replaceable', 'ephemeral', 'addressable'
tag_index INTEGER NOT NULL, -- Position in tag array
name TEXT NOT NULL,
value TEXT NOT NULL,
param_2 TEXT, -- Third element if present
param_3 TEXT, -- Fourth element if present
param_json TEXT, -- JSON for additional parameters
PRIMARY KEY (event_id, tag_index)
);
CREATE INDEX idx_tags_name_value ON tags_unified(name, value);
CREATE INDEX idx_tags_event_type ON tags_unified(event_type);
Implementation Strategy
1. Kind Classification Function (C Code)
typedef enum {
EVENT_TYPE_REGULAR,
EVENT_TYPE_REPLACEABLE,
EVENT_TYPE_EPHEMERAL,
EVENT_TYPE_ADDRESSABLE,
EVENT_TYPE_INVALID
} event_type_t;
event_type_t classify_event_kind(int kind) {
if ((kind >= 1000 && kind < 10000) ||
(kind >= 4 && kind < 45) ||
kind == 1 || kind == 2) {
return EVENT_TYPE_REGULAR;
}
if ((kind >= 10000 && kind < 20000) ||
kind == 0 || kind == 3) {
return EVENT_TYPE_REPLACEABLE;
}
if (kind >= 20000 && kind < 30000) {
return EVENT_TYPE_EPHEMERAL;
}
if (kind >= 30000 && kind < 40000) {
return EVENT_TYPE_ADDRESSABLE;
}
return EVENT_TYPE_INVALID;
}
2. Replacement Logic for Replaceable Events
-- Trigger for replaceable events
CREATE TRIGGER replace_event_on_insert
BEFORE INSERT ON events_replaceable
FOR EACH ROW
WHEN EXISTS (
SELECT 1 FROM events_replaceable
WHERE pubkey = NEW.pubkey AND kind = NEW.kind
)
BEGIN
DELETE FROM events_replaceable
WHERE pubkey = NEW.pubkey
AND kind = NEW.kind
AND (
created_at < NEW.created_at OR
(created_at = NEW.created_at AND id > NEW.id)
);
END;
3. D-Tag Extraction for Addressable Events
char* extract_d_tag(cJSON* tags) {
if (!tags || !cJSON_IsArray(tags)) {
return NULL;
}
cJSON* tag;
cJSON_ArrayForEach(tag, tags) {
if (cJSON_IsArray(tag) && cJSON_GetArraySize(tag) >= 2) {
cJSON* tag_name = cJSON_GetArrayItem(tag, 0);
cJSON* tag_value = cJSON_GetArrayItem(tag, 1);
if (cJSON_IsString(tag_name) && cJSON_IsString(tag_value)) {
if (strcmp(cJSON_GetStringValue(tag_name), "d") == 0) {
return strdup(cJSON_GetStringValue(tag_value));
}
}
}
}
return strdup(""); // Default empty d-tag
}
Advantages of This Design
1. Protocol Compliance
- Enforced at DB level: Schema constraints prevent invalid event storage
- Automatic replacement: Triggers handle replaceable/addressable event logic
- Type safety: Separate tables ensure correct handling per event type
2. Performance Benefits
- Targeted indexes: Each table optimized for its access patterns
- Reduced storage: Ephemeral events can be auto-expired
- Query optimization: SQLite can optimize queries per table structure
3. JSON Tag Benefits
- Atomic storage: Tags stored with their event
- Rich querying: SQLite JSON functions enable complex tag queries
- Schema flexibility: Can handle arbitrary tag structures
- Functional indexes: Index specific tag patterns efficiently
Migration Strategy
- Phase 1: Create new schema alongside existing
- Phase 2: Implement kind classification and routing logic
- Phase 3: Migrate existing data to appropriate tables
- Phase 4: Update application logic to use new tables
- Phase 5: Drop old schema after verification
Next Steps for Implementation
- Prototype JSON performance: Create test database with sample data
- Benchmark query patterns: Compare JSON vs normalized approaches
- Implement kind classification: Add routing logic to C code
- Create migration scripts: Handle existing data transformation
- Update test suite: Verify compliance with new schema