# Advanced Nostr Relay Schema Design ## Overview This document outlines the design for an advanced multi-table schema that enforces Nostr protocol compliance at the database level, with separate tables for different event types based on their storage and replacement characteristics. ## Event Type Classification Based on the Nostr specification, events are classified into four categories: ### 1. Regular Events - **Kinds**: `1000 <= n < 10000` || `4 <= n < 45` || `n == 1` || `n == 2` - **Storage Policy**: All events stored permanently - **Examples**: Text notes (1), Reposts (6), Reactions (7), Direct Messages (4) ### 2. Replaceable Events - **Kinds**: `10000 <= n < 20000` || `n == 0` || `n == 3` - **Storage Policy**: Only latest per `(pubkey, kind)` combination - **Replacement Logic**: Latest `created_at`, then lowest `id` lexically - **Examples**: Metadata (0), Contacts (3), Mute List (10000) ### 3. Ephemeral Events - **Kinds**: `20000 <= n < 30000` - **Storage Policy**: Not expected to be stored (optional temporary storage) - **Examples**: Typing indicators, presence updates, ephemeral messages ### 4. Addressable Events - **Kinds**: `30000 <= n < 40000` - **Storage Policy**: Only latest per `(pubkey, kind, d_tag)` combination - **Replacement Logic**: Same as replaceable events - **Examples**: Long-form content (30023), Application-specific data ## SQLite JSON Capabilities Research SQLite provides powerful JSON functions that could be leveraged for tag storage: ### Core JSON Functions ```sql -- Extract specific values json_extract(column, '$.path') -- Iterate through arrays json_each(json_array_column) -- Flatten nested structures json_tree(json_column) -- Validate JSON structure json_valid(column) -- Array operations json_array_length(column) json_extract(column, '$[0]') -- First element ``` ### Tag Query Examples #### Find all 'e' tag references: ```sql SELECT id, json_extract(value, '$[1]') as referenced_event_id, json_extract(value, '$[2]') as relay_hint, json_extract(value, '$[3]') as marker FROM events, json_each(tags) WHERE json_extract(value, '$[0]') = 'e'; ``` #### Find events with specific hashtags: ```sql SELECT id, content FROM events, json_each(tags) WHERE json_extract(value, '$[0]') = 't' AND json_extract(value, '$[1]') = 'bitcoin'; ``` #### Extract 'd' tag for addressable events: ```sql SELECT id, json_extract(value, '$[1]') as d_tag_value FROM events, json_each(tags) WHERE json_extract(value, '$[0]') = 'd' LIMIT 1; ``` ### JSON Functional Indexes ```sql -- Index on hashtags CREATE INDEX idx_hashtags ON events( json_extract(tags, '$[*][1]') ) WHERE json_extract(tags, '$[*][0]') = 't'; -- Index on 'd' tags for addressable events CREATE INDEX idx_d_tags ON events_addressable( json_extract(tags, '$[*][1]') ) WHERE json_extract(tags, '$[*][0]') = 'd'; ``` ## Proposed Schema Design ### Option 1: Separate Tables with JSON Tags ```sql -- Regular Events (permanent storage) CREATE TABLE events_regular ( id TEXT PRIMARY KEY, pubkey TEXT NOT NULL, created_at INTEGER NOT NULL, kind INTEGER NOT NULL, content TEXT NOT NULL, sig TEXT NOT NULL, tags JSON, first_seen INTEGER DEFAULT (strftime('%s', 'now')), CONSTRAINT kind_regular CHECK ( (kind >= 1000 AND kind < 10000) OR (kind >= 4 AND kind < 45) OR kind = 1 OR kind = 2 ) ); -- Replaceable Events (latest per pubkey+kind) CREATE TABLE events_replaceable ( pubkey TEXT NOT NULL, kind INTEGER NOT NULL, id TEXT NOT NULL, created_at INTEGER NOT NULL, content TEXT NOT NULL, sig TEXT NOT NULL, tags JSON, replaced_at INTEGER DEFAULT (strftime('%s', 'now')), PRIMARY KEY (pubkey, kind), CONSTRAINT kind_replaceable CHECK ( (kind >= 10000 AND kind < 20000) OR kind = 0 OR kind = 3 ) ); -- Ephemeral Events (temporary/optional storage) CREATE TABLE events_ephemeral ( id TEXT PRIMARY KEY, pubkey TEXT NOT NULL, created_at INTEGER NOT NULL, kind INTEGER NOT NULL, content TEXT NOT NULL, sig TEXT NOT NULL, tags JSON, expires_at INTEGER DEFAULT (strftime('%s', 'now', '+1 hour')), CONSTRAINT kind_ephemeral CHECK ( kind >= 20000 AND kind < 30000 ) ); -- Addressable Events (latest per pubkey+kind+d_tag) CREATE TABLE events_addressable ( pubkey TEXT NOT NULL, kind INTEGER NOT NULL, d_tag TEXT NOT NULL, id TEXT NOT NULL, created_at INTEGER NOT NULL, content TEXT NOT NULL, sig TEXT NOT NULL, tags JSON, replaced_at INTEGER DEFAULT (strftime('%s', 'now')), PRIMARY KEY (pubkey, kind, d_tag), CONSTRAINT kind_addressable CHECK ( kind >= 30000 AND kind < 40000 ) ); ``` ### Indexes for Performance ```sql -- Regular events indexes CREATE INDEX idx_regular_pubkey ON events_regular(pubkey); CREATE INDEX idx_regular_kind ON events_regular(kind); CREATE INDEX idx_regular_created_at ON events_regular(created_at); CREATE INDEX idx_regular_kind_created_at ON events_regular(kind, created_at); -- Replaceable events indexes CREATE INDEX idx_replaceable_created_at ON events_replaceable(created_at); CREATE INDEX idx_replaceable_id ON events_replaceable(id); -- Ephemeral events indexes CREATE INDEX idx_ephemeral_expires_at ON events_ephemeral(expires_at); CREATE INDEX idx_ephemeral_pubkey ON events_ephemeral(pubkey); -- Addressable events indexes CREATE INDEX idx_addressable_created_at ON events_addressable(created_at); CREATE INDEX idx_addressable_id ON events_addressable(id); -- JSON tag indexes (examples) CREATE INDEX idx_regular_e_tags ON events_regular( json_extract(tags, '$[*][1]') ) WHERE json_extract(tags, '$[*][0]') = 'e'; CREATE INDEX idx_regular_p_tags ON events_regular( json_extract(tags, '$[*][1]') ) WHERE json_extract(tags, '$[*][0]') = 'p'; ``` ### Option 2: Unified Tag Table Approach ```sql -- Unified tag storage (alternative to JSON) CREATE TABLE tags_unified ( event_id TEXT NOT NULL, event_type TEXT NOT NULL, -- 'regular', 'replaceable', 'ephemeral', 'addressable' tag_index INTEGER NOT NULL, -- Position in tag array name TEXT NOT NULL, value TEXT NOT NULL, param_2 TEXT, -- Third element if present param_3 TEXT, -- Fourth element if present param_json TEXT, -- JSON for additional parameters PRIMARY KEY (event_id, tag_index) ); CREATE INDEX idx_tags_name_value ON tags_unified(name, value); CREATE INDEX idx_tags_event_type ON tags_unified(event_type); ``` ## Implementation Strategy ### 1. Kind Classification Function (C Code) ```c typedef enum { EVENT_TYPE_REGULAR, EVENT_TYPE_REPLACEABLE, EVENT_TYPE_EPHEMERAL, EVENT_TYPE_ADDRESSABLE, EVENT_TYPE_INVALID } event_type_t; event_type_t classify_event_kind(int kind) { if ((kind >= 1000 && kind < 10000) || (kind >= 4 && kind < 45) || kind == 1 || kind == 2) { return EVENT_TYPE_REGULAR; } if ((kind >= 10000 && kind < 20000) || kind == 0 || kind == 3) { return EVENT_TYPE_REPLACEABLE; } if (kind >= 20000 && kind < 30000) { return EVENT_TYPE_EPHEMERAL; } if (kind >= 30000 && kind < 40000) { return EVENT_TYPE_ADDRESSABLE; } return EVENT_TYPE_INVALID; } ``` ### 2. Replacement Logic for Replaceable Events ```sql -- Trigger for replaceable events CREATE TRIGGER replace_event_on_insert BEFORE INSERT ON events_replaceable FOR EACH ROW WHEN EXISTS ( SELECT 1 FROM events_replaceable WHERE pubkey = NEW.pubkey AND kind = NEW.kind ) BEGIN DELETE FROM events_replaceable WHERE pubkey = NEW.pubkey AND kind = NEW.kind AND ( created_at < NEW.created_at OR (created_at = NEW.created_at AND id > NEW.id) ); END; ``` ### 3. D-Tag Extraction for Addressable Events ```c char* extract_d_tag(cJSON* tags) { if (!tags || !cJSON_IsArray(tags)) { return NULL; } cJSON* tag; cJSON_ArrayForEach(tag, tags) { if (cJSON_IsArray(tag) && cJSON_GetArraySize(tag) >= 2) { cJSON* tag_name = cJSON_GetArrayItem(tag, 0); cJSON* tag_value = cJSON_GetArrayItem(tag, 1); if (cJSON_IsString(tag_name) && cJSON_IsString(tag_value)) { if (strcmp(cJSON_GetStringValue(tag_name), "d") == 0) { return strdup(cJSON_GetStringValue(tag_value)); } } } } return strdup(""); // Default empty d-tag } ``` ## Advantages of This Design ### 1. Protocol Compliance - **Enforced at DB level**: Schema constraints prevent invalid event storage - **Automatic replacement**: Triggers handle replaceable/addressable event logic - **Type safety**: Separate tables ensure correct handling per event type ### 2. Performance Benefits - **Targeted indexes**: Each table optimized for its access patterns - **Reduced storage**: Ephemeral events can be auto-expired - **Query optimization**: SQLite can optimize queries per table structure ### 3. JSON Tag Benefits - **Atomic storage**: Tags stored with their event - **Rich querying**: SQLite JSON functions enable complex tag queries - **Schema flexibility**: Can handle arbitrary tag structures - **Functional indexes**: Index specific tag patterns efficiently ## Migration Strategy 1. **Phase 1**: Create new schema alongside existing 2. **Phase 2**: Implement kind classification and routing logic 3. **Phase 3**: Migrate existing data to appropriate tables 4. **Phase 4**: Update application logic to use new tables 5. **Phase 5**: Drop old schema after verification ## Next Steps for Implementation 1. **Prototype JSON performance**: Create test database with sample data 2. **Benchmark query patterns**: Compare JSON vs normalized approaches 3. **Implement kind classification**: Add routing logic to C code 4. **Create migration scripts**: Handle existing data transformation 5. **Update test suite**: Verify compliance with new schema