# Final Schema Recommendation: Hybrid Single Table Approach

## Executive Summary

After analyzing the subscription query complexity, **the multi-table approach creates more problems than it solves**. REQ filters don't align with storage semantics: clients filter by kind, author, and tags regardless of event type classification.

**Recommendation: Modified Single Table with Event Type Classification**

## The Multi-Table Problem

### REQ Filter Reality Check

- Clients send: `{"kinds": [1, 0, 30023], "authors": ["pubkey"], "#p": ["target"]}`
- Multi-table requires: 3 separate queries + UNION + complex ordering
- Single table requires: 1 query with simple WHERE conditions

### Query Complexity Explosion

```sql
-- Multi-table nightmare for simple filter
WITH results AS (
    SELECT * FROM events_regular WHERE kind = 1 AND pubkey = ?
    UNION ALL
    SELECT * FROM events_replaceable WHERE kind = 0 AND pubkey = ?
    UNION ALL
    SELECT * FROM events_addressable WHERE kind = 30023 AND pubkey = ?
)
SELECT r.* FROM results r
JOIN multiple_tag_tables t ON complex_conditions
ORDER BY created_at DESC, id ASC LIMIT ?;

-- vs Single table simplicity
SELECT e.*
FROM events e, json_each(e.tags) t
WHERE e.kind IN (1, 0, 30023)
  AND e.pubkey = ?
  AND json_extract(t.value, '$[0]') = 'p'
  AND json_extract(t.value, '$[1]') = ?
ORDER BY e.created_at DESC, e.id ASC LIMIT ?;
```

## Recommended Schema: Hybrid Approach

### Core Design Philosophy

- **Single table for REQ query simplicity**
- **Event type classification for protocol compliance**
- **JSON tags for atomic storage and rich querying**
- **Partial unique indexes for replacement logic**

### Schema Definition

```sql
CREATE TABLE events (
    id TEXT PRIMARY KEY,
    pubkey TEXT NOT NULL,
    created_at INTEGER NOT NULL,
    kind INTEGER NOT NULL,
    event_type TEXT NOT NULL CHECK (event_type IN ('regular', 'replaceable', 'ephemeral', 'addressable')),
    content TEXT NOT NULL,
    sig TEXT NOT NULL,
    tags JSON NOT NULL DEFAULT '[]',
    first_seen INTEGER NOT NULL DEFAULT (strftime('%s', 'now')),

    -- Additional field for addressable events: value of the first "d" tag.
    -- SQLite generated columns cannot contain subqueries or json_each(),
    -- so this is a plain column filled in by the addressable-event triggers.
    d_tag TEXT,

    -- Replacement tracking
    replaced_at INTEGER
);

-- Protocol compliance constraints.
-- SQLite does not accept a WHERE clause on a table-level UNIQUE constraint,
-- so these are expressed as partial unique indexes.
CREATE UNIQUE INDEX unique_replaceable ON events(pubkey, kind)
    WHERE event_type = 'replaceable';
CREATE UNIQUE INDEX unique_addressable ON events(pubkey, kind, d_tag)
    WHERE event_type = 'addressable' AND d_tag IS NOT NULL;
```

### Event Type Classification Function

```sql
-- Lookup view mapping every kind (0..65535) to its event type
CREATE VIEW event_type_lookup AS
SELECT
    CASE
        WHEN (kind >= 1000 AND kind < 10000) OR
             (kind >= 4 AND kind < 45) OR
             kind = 1 OR kind = 2 THEN 'regular'
        WHEN (kind >= 10000 AND kind < 20000) OR
             kind = 0 OR kind = 3 THEN 'replaceable'
        WHEN kind >= 20000 AND kind < 30000 THEN 'ephemeral'
        WHEN kind >= 30000 AND kind < 40000 THEN 'addressable'
        ELSE 'unknown'
    END AS event_type,
    kind
FROM (
    -- Generate all possible kind values for lookup
    WITH RECURSIVE kinds(kind) AS (
        SELECT 0
        UNION ALL
        SELECT kind + 1 FROM kinds WHERE kind < 65535
    )
    SELECT kind FROM kinds
);
```

### Performance Indexes

```sql
-- Core query patterns
CREATE INDEX idx_events_pubkey ON events(pubkey);
CREATE INDEX idx_events_kind ON events(kind);
CREATE INDEX idx_events_created_at ON events(created_at DESC);
CREATE INDEX idx_events_event_type ON events(event_type);

-- Composite indexes for common filters
CREATE INDEX idx_events_pubkey_created_at ON events(pubkey, created_at DESC);
CREATE INDEX idx_events_kind_created_at ON events(kind, created_at DESC);
CREATE INDEX idx_events_type_created_at ON events(event_type, created_at DESC);

-- Tag values (#e, #p, #t) cannot be indexed straight from the tags column:
-- SQLite JSON paths have no '$[*]' wildcard, and index expressions cannot
-- call json_each(). Tag filters are therefore evaluated with json_each()
-- on the rows already narrowed by the indexes above.

-- Addressable events d_tag index
CREATE INDEX idx_events_d_tag ON events(d_tag)
    WHERE event_type = 'addressable' AND d_tag IS NOT NULL;
```

### Replacement Logic Implementation

#### Replaceable Events Trigger

```sql
CREATE TRIGGER handle_replaceable_events
BEFORE INSERT ON events
FOR EACH ROW
WHEN NEW.event_type = 'replaceable'
BEGIN
    -- Delete older replaceable events with same pubkey+kind.
    -- On equal timestamps the event with the lower id is kept.
    DELETE FROM events
    WHERE event_type = 'replaceable'
      AND pubkey = NEW.pubkey
      AND kind = NEW.kind
      AND (
          created_at < NEW.created_at
          OR (created_at = NEW.created_at AND id > NEW.id)
      );
END;
```

#### Addressable Events Trigger

```sql
CREATE TRIGGER handle_addressable_events
BEFORE INSERT ON events
FOR EACH ROW
WHEN NEW.event_type = 'addressable'
BEGIN
    -- Delete older addressable events with same pubkey+kind+d tag.
    -- d_tag is not supplied by the INSERT, so the value is read from NEW.tags.
    DELETE FROM events
    WHERE event_type = 'addressable'
      AND pubkey = NEW.pubkey
      AND kind = NEW.kind
      AND d_tag = (SELECT json_extract(value, '$[1]') FROM json_each(NEW.tags)
                   WHERE json_extract(value, '$[0]') = 'd' LIMIT 1)
      AND (
          created_at < NEW.created_at
          OR (created_at = NEW.created_at AND id > NEW.id)
      );
END;

-- Fill in d_tag for the row that was just inserted. If a newer event with
-- the same pubkey+kind+d tag still exists, the unique index rejects this
-- UPDATE and the whole INSERT is rolled back.
CREATE TRIGGER set_addressable_d_tag
AFTER INSERT ON events
FOR EACH ROW
WHEN NEW.event_type = 'addressable'
BEGIN
    UPDATE events
    SET d_tag = (SELECT json_extract(value, '$[1]') FROM json_each(NEW.tags)
                 WHERE json_extract(value, '$[0]') = 'd' LIMIT 1)
    WHERE id = NEW.id;
END;
```

## Implementation Strategy

### C Code Integration

#### Event Type Classification

```c
typedef enum {
    EVENT_TYPE_REGULAR,
    EVENT_TYPE_REPLACEABLE,
    EVENT_TYPE_EPHEMERAL,
    EVENT_TYPE_ADDRESSABLE,
    EVENT_TYPE_UNKNOWN
} event_type_t;

event_type_t classify_event_kind(int kind) {
    if ((kind >= 1000 && kind < 10000) ||
        (kind >= 4 && kind < 45) ||
        kind == 1 || kind == 2) {
        return EVENT_TYPE_REGULAR;
    }
    if ((kind >= 10000 && kind < 20000) ||
        kind == 0 || kind == 3) {
        return EVENT_TYPE_REPLACEABLE;
    }
    if (kind >= 20000 && kind < 30000) {
        return EVENT_TYPE_EPHEMERAL;
    }
    if (kind >= 30000 && kind < 40000) {
        return EVENT_TYPE_ADDRESSABLE;
    }
    return EVENT_TYPE_UNKNOWN;
}

const char* event_type_to_string(event_type_t type) {
    switch (type) {
        case EVENT_TYPE_REGULAR: return "regular";
        case EVENT_TYPE_REPLACEABLE: return "replaceable";
        case EVENT_TYPE_EPHEMERAL: return "ephemeral";
        case EVENT_TYPE_ADDRESSABLE: return "addressable";
        default: return "unknown";
    }
}
```

#### Simplified Event Storage

```c
int store_event(cJSON* event) {
    // Extract fields
    cJSON* id = cJSON_GetObjectItem(event, "id");
    cJSON* pubkey = cJSON_GetObjectItem(event, "pubkey");
    cJSON* created_at = cJSON_GetObjectItem(event, "created_at");
    cJSON* kind = cJSON_GetObjectItem(event, "kind");
    cJSON* content = cJSON_GetObjectItem(event, "content");
    cJSON* sig = cJSON_GetObjectItem(event, "sig");

    // Reject events with missing or mistyped fields before touching the database
    if (!cJSON_IsString(id) || !cJSON_IsString(pubkey) ||
        !cJSON_IsNumber(created_at) || !cJSON_IsNumber(kind) ||
        !cJSON_IsString(content) || !cJSON_IsString(sig)) {
        return -1;
    }

    // Classify event type
    event_type_t type = classify_event_kind((int)cJSON_GetNumberValue(kind));

    // Serialize tags to JSON; an event without a tags array is stored as "[]"
    cJSON* tags = cJSON_GetObjectItem(event, "tags");
    char* tags_json = NULL;
    const char* tags_text = "[]";
    if (tags) {
        tags_json = cJSON_Print(tags);
        if (!tags_json) {
            return -1;
        }
        tags_text = tags_json;
    }

    // Single INSERT statement - database handles replacement via triggers
    const char* sql =
        "INSERT INTO events (id, pubkey, created_at, kind, event_type, content, sig, tags) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?)";

    sqlite3_stmt* stmt;
    int rc = sqlite3_prepare_v2(g_db, sql, -1, &stmt, NULL);
    if (rc != SQLITE_OK) {
        free(tags_json);
        return -1;
    }

    // SQLITE_STATIC is safe here because `event` outlives the statement
    sqlite3_bind_text(stmt, 1, cJSON_GetStringValue(id), -1, SQLITE_STATIC);
    sqlite3_bind_text(stmt, 2, cJSON_GetStringValue(pubkey), -1, SQLITE_STATIC);
    sqlite3_bind_int64(stmt, 3, (sqlite3_int64)cJSON_GetNumberValue(created_at));
    sqlite3_bind_int(stmt, 4, (int)cJSON_GetNumberValue(kind));
    sqlite3_bind_text(stmt, 5, event_type_to_string(type), -1, SQLITE_STATIC);
    sqlite3_bind_text(stmt, 6, cJSON_GetStringValue(content), -1, SQLITE_STATIC);
    sqlite3_bind_text(stmt, 7, cJSON_GetStringValue(sig), -1, SQLITE_STATIC);
    sqlite3_bind_text(stmt, 8, tags_text, -1, SQLITE_TRANSIENT);

    rc = sqlite3_step(stmt);
    sqlite3_finalize(stmt);
    free(tags_json);  // free(NULL) is a no-op when there were no tags

    return (rc == SQLITE_DONE) ? 0 : -1;
}
```

#### Simple REQ Query Building

```c
char* build_filter_query(cJSON* filter) {
    // Build single query against events table
    // Much simpler than multi-table approach
    GString* query = g_string_new("SELECT * FROM events WHERE 1=1");

    // Handle ids filter
    cJSON* ids = cJSON_GetObjectItem(filter, "ids");
    if (ids && cJSON_IsArray(ids)) {
        g_string_append(query, " AND id IN (");
        // Add parameter placeholders
        g_string_append(query, ")");
    }

    // Handle authors filter
    cJSON* authors = cJSON_GetObjectItem(filter, "authors");
    if (authors && cJSON_IsArray(authors)) {
        g_string_append(query, " AND pubkey IN (");
        // Add parameter placeholders
        g_string_append(query, ")");
    }

    // Handle kinds filter
    cJSON* kinds = cJSON_GetObjectItem(filter, "kinds");
    if (kinds && cJSON_IsArray(kinds)) {
        g_string_append(query, " AND kind IN (");
        // Add parameter placeholders
        g_string_append(query, ")");
    }

    // Handle tag filters (#e, #p, etc.)
    cJSON* item;
    cJSON_ArrayForEach(item, filter) {
        char* key = item->string;
        if (key && key[0] == '#' && strlen(key) == 2) {
            char tag_name = key[1];
            // The tag name is spliced into the SQL text, so accept only
            // single ASCII letters; anything else is skipped.
            if (!((tag_name >= 'a' && tag_name <= 'z') ||
                  (tag_name >= 'A' && tag_name <= 'Z'))) {
                continue;
            }
            g_string_append_printf(query,
                " AND EXISTS (SELECT 1 FROM json_each(tags) "
                "WHERE json_extract(value, '$[0]') = '%c' "
                "AND json_extract(value, '$[1]') IN (", tag_name);
            // Add parameter placeholders
            g_string_append(query, "))");
        }
    }

    // Handle time range
    cJSON* since = cJSON_GetObjectItem(filter, "since");
    if (since) {
        g_string_append(query, " AND created_at >= ?");
    }

    cJSON* until = cJSON_GetObjectItem(filter, "until");
    if (until) {
        g_string_append(query, " AND created_at <= ?");
    }

    // Standard ordering and limit
    g_string_append(query, " ORDER BY created_at DESC, id ASC");

    cJSON* limit = cJSON_GetObjectItem(filter, "limit");
    if (limit) {
        g_string_append(query, " LIMIT ?");
    }

    return g_string_free(query, FALSE);
}
```

## Benefits of This Approach

### 1. Query Simplicity

- ✅ Single table = simple REQ queries
- ✅ No UNION complexity
- ✅ Familiar SQL patterns
- ✅ Easy LIMIT and ORDER BY handling

### 2. Protocol Compliance

- ✅ Event type classification enforced
- ✅ Replacement logic via triggers
- ✅ Partial unique indexes prevent duplicates
- ✅ Proper handling of all event types

### 3. Performance

- ✅ Unified indexes across all events
- ✅ No join overhead for basic queries
- ✅ Tag filters evaluated in place with `json_each()`, without tag-table joins
- ✅ Single table scan for cross-kind queries

### 4. Implementation Simplicity

- ✅ Minimal changes from current code
- ✅ Database handles replacement logic
- ✅ Simple event storage function
- ✅ No complex routing logic needed

### 5. Future Flexibility

- ✅ Can add columns for new event types
- ✅ Can split tables later if needed
- ✅ Easy to add new indexes
- ✅ Extensible constraint system

## Migration Path

### Phase 1: Schema Update

1. Add `event_type` column to existing events table
2. Add JSON `tags` column
3. Create classification triggers
4. Add partial unique indexes

### Phase 2: Data Migration

1. Classify existing events by kind
2. Convert existing tag table data to JSON
3. Verify constraint compliance
4. Update indexes

### Phase 3: Code Updates

1. Update event storage to use new schema
2. Simplify REQ query building
3. Remove tag table JOIN logic
4. Test subscription filtering

### Phase 4: Optimization

1. Monitor query performance
2. Add specialized indexes as needed
3. Tune replacement triggers
4. Consider ephemeral event cleanup

## Conclusion

This hybrid approach achieves the best of both worlds:

- **Protocol compliance** through event type classification and constraints
- **Query simplicity** through unified storage
- **Performance** through targeted indexes
- **Implementation ease** through minimal complexity

The multi-table approach, while theoretically cleaner, turns every subscription query into a multi-way UNION and would significantly burden the implementation. The hybrid single-table approach keeps the protocol guarantees with manageable complexity.
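
## Appendix: Placeholder Generation Sketch

The `build_filter_query` function earlier in this document leaves each `// Add parameter placeholders` step unimplemented. One way to fill that gap is sketched below in plain C with no GLib dependency. The helper names `write_placeholders` and `is_tag_filter_key` are hypothetical and are not part of the existing code base; the sketch assumes the caller knows the array length (for example from `cJSON_GetArraySize`) and binds the actual values afterwards in the same order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Write n comma-separated placeholders ("?,?,?") into buf, which has
 * room for cap bytes including the terminating NUL.
 * Returns the number of characters written, or 0 if n is 0 or the
 * buffer is too small (buf is left untouched in that case).
 * Hypothetical helper: name and signature are illustrative only.
 */
size_t write_placeholders(char *buf, size_t cap, size_t n) {
    assert(buf != NULL);
    /* n placeholders need 2n-1 characters plus a NUL, i.e. 2n bytes. */
    if (n == 0 || n > cap / 2) {
        return 0;
    }
    for (size_t i = 0; i < n; i++) {
        if (i > 0) {
            *buf++ = ',';
        }
        *buf++ = '?';
    }
    *buf = '\0';
    return 2 * n - 1;
}

/*
 * Return 1 if key has the form "#x" where x is a single ASCII letter.
 * Only keys of this shape are safe to splice into SQL text as a tag name.
 * Hypothetical helper: name and signature are illustrative only.
 */
int is_tag_filter_key(const char *key) {
    if (key == NULL || strlen(key) != 2 || key[0] != '#') {
        return 0;
    }
    char c = key[1];
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}
```

For an `authors` array with three entries, `write_placeholders(buf, sizeof buf, 3)` fills `buf` with `?,?,?`, which would be appended between `IN (` and `)`. Keeping values as bound parameters and validating the tag letter separately means the only client-controlled text that reaches the SQL string is a single ASCII letter.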