Your Name 662feab881 Basic relay functionality completed

2025-09-04 07:10:13 -04:00

Final Schema Recommendation: Hybrid Single Table Approach

Executive Summary

Analysis of subscription query complexity shows that the multi-table approach creates more problems than it solves. REQ filters do not align with storage semantics: clients filter by kind, author, and tags regardless of how an event's type is classified.

Recommendation: Modified Single Table with Event Type Classification

The Multi-Table Problem

REQ Filter Reality Check

Clients send: {"kinds": [1, 0, 30023], "authors": ["pubkey"], "#p": ["target"]}
Multi-table requires: 3 separate queries + UNION + complex ordering
Single table requires: 1 query with simple WHERE conditions
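
To make the fan-out concrete, the following sketch counts how many per-type tables a kinds filter would touch under a multi-table layout. The names kind_class and tables_touched are illustrative and not part of the relay code; the kind ranges are the ones used by the classification code in this document.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative event classes, one per table in a multi-table layout. */
enum { CLASS_REGULAR, CLASS_REPLACEABLE, CLASS_EPHEMERAL,
       CLASS_ADDRESSABLE, CLASS_UNKNOWN, CLASS_COUNT };

/* Same kind ranges as the classification used elsewhere in this document. */
static int kind_class(int kind) {
    if ((kind >= 1000 && kind < 10000) || (kind >= 4 && kind < 45) ||
        kind == 1 || kind == 2) return CLASS_REGULAR;
    if ((kind >= 10000 && kind < 20000) || kind == 0 || kind == 3)
        return CLASS_REPLACEABLE;
    if (kind >= 20000 && kind < 30000) return CLASS_EPHEMERAL;
    if (kind >= 30000 && kind < 40000) return CLASS_ADDRESSABLE;
    return CLASS_UNKNOWN;
}

/* Number of distinct per-type tables a kinds filter would have to query. */
static int tables_touched(const int *kinds, size_t n) {
    int seen[CLASS_COUNT] = {0};
    int count = 0;
    assert(kinds != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int c = kind_class(kinds[i]);
        if (!seen[c]) {
            seen[c] = 1;
            count++;
        }
    }
    return count;
}
```

For the filter above, kinds 1, 0, and 30023 fall into three different classes, so a multi-table design needs three sub-queries, while a filter such as kinds 1, 7, and 1063 stays within one class.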

Query Complexity Explosion

-- Multi-table nightmare for simple filter
WITH results AS (
    SELECT * FROM events_regular WHERE kind = 1 AND pubkey = ? 
    UNION ALL
    SELECT * FROM events_replaceable WHERE kind = 0 AND pubkey = ?
    UNION ALL  
    SELECT * FROM events_addressable WHERE kind = 30023 AND pubkey = ?
)
SELECT r.* FROM results r 
JOIN multiple_tag_tables t ON complex_conditions
ORDER BY created_at DESC, id ASC LIMIT ?;

-- vs Single table simplicity
-- EXISTS avoids duplicate rows when an event carries several matching tags
SELECT e.* FROM events e
WHERE e.kind IN (1, 0, 30023)
  AND e.pubkey = ?
  AND EXISTS (
      SELECT 1 FROM json_each(e.tags) t
      WHERE json_extract(t.value, '$[0]') = 'p'
        AND json_extract(t.value, '$[1]') = ?
  )
ORDER BY e.created_at DESC, e.id ASC LIMIT ?;

Recommended Schema: Hybrid Approach

Core Design Philosophy

Single table for REQ query simplicity
Event type classification for protocol compliance
JSON tags for atomic storage and rich querying
Partial unique indexes for replacement logic

Schema Definition

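CREATE TABLE events (
    id TEXT PRIMARY KEY,
    pubkey TEXT NOT NULL,
    created_at INTEGER NOT NULL,
    kind INTEGER NOT NULL,
    event_type TEXT NOT NULL CHECK (event_type IN ('regular', 'replaceable', 'ephemeral', 'addressable')),
    content TEXT NOT NULL,
    sig TEXT NOT NULL,
    tags JSON NOT NULL DEFAULT '[]',
    first_seen INTEGER NOT NULL DEFAULT (strftime('%s', 'now')),

    -- Value of the first "d" tag, for addressable events only.
    -- SQLite generated columns cannot contain subqueries, so this is a
    -- plain column filled in by the set_d_tag trigger below.
    d_tag TEXT,

    -- Replacement tracking
    replaced_at INTEGER
);

-- Protocol compliance constraints.
-- SQLite table constraints cannot carry a WHERE clause, so partial
-- uniqueness is expressed as partial unique indexes.
CREATE UNIQUE INDEX unique_replaceable
    ON events(pubkey, kind)
    WHERE event_type = 'replaceable';

CREATE UNIQUE INDEX unique_addressable
    ON events(pubkey, kind, d_tag)
    WHERE event_type = 'addressable' AND d_tag IS NOT NULL;

-- Populate d_tag from the tags array after an addressable event is inserted.
-- d_tag stays NULL when the event has no "d" tag.
CREATE TRIGGER set_d_tag
AFTER INSERT ON events
FOR EACH ROW
WHEN NEW.event_type = 'addressable'
BEGIN
    UPDATE events
    SET d_tag = (
        SELECT json_extract(value, '$[1]')
        FROM json_each(NEW.tags)
        WHERE json_extract(value, '$[0]') = 'd'
        LIMIT 1
    )
    WHERE id = NEW.id;
END;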

Event Type Classification Function

-- Lookup view mapping every kind (0 to 65535) to its event type
CREATE VIEW event_type_lookup AS
SELECT 
    CASE 
        WHEN (kind >= 1000 AND kind < 10000) OR 
             (kind >= 4 AND kind < 45) OR 
             kind = 1 OR kind = 2 THEN 'regular'
        WHEN (kind >= 10000 AND kind < 20000) OR 
             kind = 0 OR kind = 3 THEN 'replaceable'
        WHEN kind >= 20000 AND kind < 30000 THEN 'ephemeral'
        WHEN kind >= 30000 AND kind < 40000 THEN 'addressable'
        ELSE 'unknown'
    END as event_type,
    kind
FROM (
    -- Generate all possible kind values for lookup
    WITH RECURSIVE kinds(kind) AS (
        SELECT 0
        UNION ALL
        SELECT kind + 1 FROM kinds WHERE kind < 65535
    )
    SELECT kind FROM kinds
);

Performance Indexes

-- Core query patterns
CREATE INDEX idx_events_pubkey ON events(pubkey);
CREATE INDEX idx_events_kind ON events(kind);  
CREATE INDEX idx_events_created_at ON events(created_at DESC);
CREATE INDEX idx_events_event_type ON events(event_type);

-- Composite indexes for common filters
CREATE INDEX idx_events_pubkey_created_at ON events(pubkey, created_at DESC);
CREATE INDEX idx_events_kind_created_at ON events(kind, created_at DESC);
CREATE INDEX idx_events_type_created_at ON events(event_type, created_at DESC);

-- Tag filters
-- SQLite JSON paths have no [*] wildcard, and an expression index stores a
-- single value per row, so the individual values inside the tags array
-- cannot be indexed directly. Tag filters are evaluated with json_each()
-- and rely on the kind, pubkey, and created_at indexes above to narrow
-- the candidate rows.

-- Addressable events d_tag index
CREATE INDEX idx_events_d_tag ON events(d_tag) 
WHERE event_type = 'addressable' AND d_tag IS NOT NULL;

Replacement Logic Implementation

Replaceable Events Trigger

CREATE TRIGGER handle_replaceable_events
BEFORE INSERT ON events
FOR EACH ROW
WHEN NEW.event_type = 'replaceable'
BEGIN
    -- Delete older replaceable events with same pubkey+kind
    DELETE FROM events 
    WHERE event_type = 'replaceable'
      AND pubkey = NEW.pubkey 
      AND kind = NEW.kind
      AND (
          created_at < NEW.created_at OR 
          (created_at = NEW.created_at AND id > NEW.id)
      );
END;

Addressable Events Trigger

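CREATE TRIGGER handle_addressable_events
BEFORE INSERT ON events
FOR EACH ROW
WHEN NEW.event_type = 'addressable'
BEGIN
    -- Delete older addressable events with same pubkey+kind+d_tag.
    -- The d value is read from NEW.tags directly, so the trigger does not
    -- depend on how the d_tag column of the new row gets populated.
    DELETE FROM events
    WHERE event_type = 'addressable'
      AND pubkey = NEW.pubkey
      AND kind = NEW.kind
      AND d_tag = (
          SELECT json_extract(value, '$[1]')
          FROM json_each(NEW.tags)
          WHERE json_extract(value, '$[0]') = 'd'
          LIMIT 1
      )
      AND (
          created_at < NEW.created_at OR
          (created_at = NEW.created_at AND id > NEW.id)
      );
END;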
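
Both triggers apply the same ordering rule: a newer created_at wins, and on a timestamp tie the lower id wins. A minimal C sketch of that rule follows, which could be used, for example, to decide before the INSERT whether an incoming event would lose to a stored one. The helper name should_replace is hypothetical and does not exist in the relay code.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 when the incoming event should replace the stored one.
 * Mirrors the trigger condition:
 *   created_at < NEW.created_at
 *   OR (created_at = NEW.created_at AND id > NEW.id)
 * Ids are compared byte-wise, matching SQLite's default text comparison. */
static int should_replace(long long stored_created_at, const char *stored_id,
                          long long new_created_at, const char *new_id) {
    assert(stored_id != NULL && new_id != NULL);
    if (stored_created_at != new_created_at) {
        return stored_created_at < new_created_at;
    }
    return strcmp(stored_id, new_id) > 0;
}
```

An event identical to the stored one (same timestamp, same id) does not count as a replacement.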

Implementation Strategy

C Code Integration

Event Type Classification

typedef enum {
    EVENT_TYPE_REGULAR,
    EVENT_TYPE_REPLACEABLE, 
    EVENT_TYPE_EPHEMERAL,
    EVENT_TYPE_ADDRESSABLE,
    EVENT_TYPE_UNKNOWN
} event_type_t;

event_type_t classify_event_kind(int kind) {
    if ((kind >= 1000 && kind < 10000) || 
        (kind >= 4 && kind < 45) || 
        kind == 1 || kind == 2) {
        return EVENT_TYPE_REGULAR;
    }
    if ((kind >= 10000 && kind < 20000) || 
        kind == 0 || kind == 3) {
        return EVENT_TYPE_REPLACEABLE;
    }
    if (kind >= 20000 && kind < 30000) {
        return EVENT_TYPE_EPHEMERAL;
    }
    if (kind >= 30000 && kind < 40000) {
        return EVENT_TYPE_ADDRESSABLE;
    }
    return EVENT_TYPE_UNKNOWN;
}

const char* event_type_to_string(event_type_t type) {
    switch (type) {
        case EVENT_TYPE_REGULAR: return "regular";
        case EVENT_TYPE_REPLACEABLE: return "replaceable";
        case EVENT_TYPE_EPHEMERAL: return "ephemeral";
        case EVENT_TYPE_ADDRESSABLE: return "addressable";
        default: return "unknown";
    }
}

Simplified Event Storage

#include <sqlite3.h>
#include "cJSON.h"   // include path depends on how cJSON is installed

// g_db is assumed to be the relay's already-opened database handle
int store_event(cJSON* event) {
    // Extract fields
    cJSON* id = cJSON_GetObjectItem(event, "id");
    cJSON* pubkey = cJSON_GetObjectItem(event, "pubkey");
    cJSON* created_at = cJSON_GetObjectItem(event, "created_at");
    cJSON* kind = cJSON_GetObjectItem(event, "kind");
    cJSON* content = cJSON_GetObjectItem(event, "content");
    cJSON* sig = cJSON_GetObjectItem(event, "sig");
    cJSON* tags = cJSON_GetObjectItem(event, "tags");

    // Reject events with missing or mistyped fields before touching the database
    if (!cJSON_IsString(id) || !cJSON_IsString(pubkey) ||
        !cJSON_IsString(content) || !cJSON_IsString(sig) ||
        !cJSON_IsNumber(created_at) || !cJSON_IsNumber(kind)) {
        return -1;
    }

    // Classify event type; 'unknown' would fail the event_type CHECK constraint
    event_type_t type = classify_event_kind((int)cJSON_GetNumberValue(kind));
    if (type == EVENT_TYPE_UNKNOWN) {
        return -1;
    }

    // Serialize tags to compact JSON; a missing tags array is stored as []
    char* tags_json = NULL;
    if (cJSON_IsArray(tags)) {
        tags_json = cJSON_PrintUnformatted(tags);
        if (!tags_json) {
            return -1;
        }
    }

    // Single INSERT statement - database handles replacement via triggers
    const char* sql =
        "INSERT INTO events (id, pubkey, created_at, kind, event_type, content, sig, tags) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?)";

    sqlite3_stmt* stmt;
    int rc = sqlite3_prepare_v2(g_db, sql, -1, &stmt, NULL);
    if (rc != SQLITE_OK) {
        cJSON_free(tags_json);
        return -1;
    }

    // SQLITE_STATIC is safe here: the event outlives the statement
    sqlite3_bind_text(stmt, 1, cJSON_GetStringValue(id), -1, SQLITE_STATIC);
    sqlite3_bind_text(stmt, 2, cJSON_GetStringValue(pubkey), -1, SQLITE_STATIC);
    sqlite3_bind_int64(stmt, 3, (sqlite3_int64)cJSON_GetNumberValue(created_at));
    sqlite3_bind_int(stmt, 4, (int)cJSON_GetNumberValue(kind));
    sqlite3_bind_text(stmt, 5, event_type_to_string(type), -1, SQLITE_STATIC);
    sqlite3_bind_text(stmt, 6, cJSON_GetStringValue(content), -1, SQLITE_STATIC);
    sqlite3_bind_text(stmt, 7, cJSON_GetStringValue(sig), -1, SQLITE_STATIC);
    sqlite3_bind_text(stmt, 8, tags_json ? tags_json : "[]", -1, SQLITE_TRANSIENT);

    rc = sqlite3_step(stmt);
    sqlite3_finalize(stmt);
    cJSON_free(tags_json);

    return (rc == SQLITE_DONE) ? 0 : -1;
}
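
store_event passes id, pubkey, and sig to the database as received. A relay will usually want to check their shape first. The sketch below assumes the NIP-01 encodings (64 lowercase hex characters for id and pubkey, 128 for sig); the helper names is_lower_hex and has_valid_hex_fields are hypothetical, and signature verification itself is out of scope here.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if s consists of exactly len lowercase hex characters. */
static int is_lower_hex(const char *s, size_t len) {
    assert(len > 0);
    if (s == NULL) {
        return 0;
    }
    for (size_t i = 0; i < len; i++) {
        char c = s[i];
        int ok = (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f');
        if (!ok) {
            return 0;   /* also stops at an early terminator */
        }
    }
    return s[len] == '\0';   /* reject strings longer than len */
}

/* Shape check for the three hex-encoded event fields. */
static int has_valid_hex_fields(const char *id, const char *pubkey,
                                const char *sig) {
    return is_lower_hex(id, 64) &&
           is_lower_hex(pubkey, 64) &&
           is_lower_hex(sig, 128);
}
```

Calling has_valid_hex_fields on the extracted strings before preparing the INSERT keeps malformed events out of the table without a database round trip.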

Simple REQ Query Building

#include <string.h>
#include <glib.h>
#include "cJSON.h"   // include path depends on how cJSON is installed

// Append one "?" per element of a JSON array: "?, ?, ?"
static void append_placeholders(GString* query, const cJSON* values) {
    int n = cJSON_GetArraySize(values);
    for (int i = 0; i < n; i++) {
        g_string_append(query, i == 0 ? "?" : ", ?");
    }
}

// Build a single parameterized query against the events table.
// Much simpler than the multi-table approach.
// The caller binds values in placeholder order (ids, authors, kinds,
// tag values in filter order, since, until, limit) and releases the
// returned string with g_free().
char* build_filter_query(cJSON* filter) {
    GString* query = g_string_new("SELECT * FROM events WHERE 1=1");

    // Handle ids filter
    cJSON* ids = cJSON_GetObjectItem(filter, "ids");
    if (ids && cJSON_IsArray(ids)) {
        g_string_append(query, " AND id IN (");
        append_placeholders(query, ids);
        g_string_append(query, ")");
    }

    // Handle authors filter
    cJSON* authors = cJSON_GetObjectItem(filter, "authors");
    if (authors && cJSON_IsArray(authors)) {
        g_string_append(query, " AND pubkey IN (");
        append_placeholders(query, authors);
        g_string_append(query, ")");
    }

    // Handle kinds filter
    cJSON* kinds = cJSON_GetObjectItem(filter, "kinds");
    if (kinds && cJSON_IsArray(kinds)) {
        g_string_append(query, " AND kind IN (");
        append_placeholders(query, kinds);
        g_string_append(query, ")");
    }

    // Handle tag filters (#e, #p, etc.)
    cJSON* item;
    cJSON_ArrayForEach(item, filter) {
        const char* key = item->string;
        // Only single-letter tag names are accepted; the letter check also
        // keeps quote characters out of the SQL text built with %c below
        if (key && key[0] == '#' && strlen(key) == 2 &&
            g_ascii_isalpha(key[1]) && cJSON_IsArray(item)) {
            char tag_name = key[1];
            g_string_append_printf(query,
                " AND EXISTS (SELECT 1 FROM json_each(tags) "
                "WHERE json_extract(value, '$[0]') = '%c' "
                "AND json_extract(value, '$[1]') IN (", tag_name);
            append_placeholders(query, item);
            g_string_append(query, "))");
        }
    }

    // Handle time range
    cJSON* since = cJSON_GetObjectItem(filter, "since");
    if (cJSON_IsNumber(since)) {
        g_string_append(query, " AND created_at >= ?");
    }

    cJSON* until = cJSON_GetObjectItem(filter, "until");
    if (cJSON_IsNumber(until)) {
        g_string_append(query, " AND created_at <= ?");
    }

    // Standard ordering and limit
    g_string_append(query, " ORDER BY created_at DESC, id ASC");

    cJSON* limit = cJSON_GetObjectItem(filter, "limit");
    if (cJSON_IsNumber(limit)) {
        g_string_append(query, " LIMIT ?");
    }

    return g_string_free(query, FALSE);
}
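
Each IN (...) list in build_filter_query needs one ? per filter value. For builds that prefer a fixed buffer over GLib strings, that step could look like the sketch below. The function name write_placeholders and its return convention are assumptions made for illustration, not part of the relay code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Writes n placeholders separated by ", " into buf, e.g. "?, ?, ?".
 * Returns the number of characters written (excluding the terminator),
 * or -1 if buf is too small. n == 0 produces an empty string. */
static int write_placeholders(char *buf, size_t size, int n) {
    size_t pos = 0;
    assert(buf != NULL);
    if (size == 0) {
        return -1;
    }
    for (int i = 0; i < n; i++) {
        const char *piece = (i == 0) ? "?" : ", ?";
        size_t len = strlen(piece);
        if (pos + len >= size) {
            return -1;   /* keep one byte free for the terminator */
        }
        memcpy(buf + pos, piece, len);
        pos += len;
    }
    buf[pos] = '\0';
    return (int)pos;
}
```

Only the placeholders go into the SQL text; the filter values themselves are bound afterwards, which keeps client-supplied strings out of the query.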

Benefits of This Approach

1. Query Simplicity

✅ Single table = simple REQ queries
✅ No UNION complexity
✅ Familiar SQL patterns
✅ Easy LIMIT and ORDER BY handling

2. Protocol Compliance

✅ Event type classification enforced
✅ Replacement logic via triggers
✅ Partial unique indexes prevent duplicates
✅ Proper handling of all event types

3. Performance

✅ Unified indexes across all events
✅ No join overhead for basic queries
✅ Tag filters via json_each, narrowed first by kind, author, or time
✅ Single table scan for cross-kind queries

4. Implementation Simplicity

✅ Minimal changes from current code
✅ Database handles replacement logic
✅ Simple event storage function
✅ No complex routing logic needed

5. Future Flexibility

✅ Can add columns for new event types
✅ Can split tables later if needed
✅ Easy to add new indexes
✅ Extensible constraint system

Migration Path

Phase 1: Schema Update

Add event_type column to existing events table
Add JSON tags column
Create replacement triggers
Add partial unique indexes

Phase 2: Data Migration

Classify existing events by kind
Convert existing tag table data to JSON
Verify constraint compliance
Update indexes

Phase 3: Code Updates

Update event storage to use new schema
Simplify REQ query building
Remove tag table JOIN logic
Test subscription filtering

Phase 4: Optimization

Monitor query performance
Add specialized indexes as needed
Tune replacement triggers
Consider ephemeral event cleanup

Conclusion

This hybrid approach achieves the best of both worlds:

Protocol compliance through event type classification and constraints
Query simplicity through unified storage
Performance through targeted indexes
Implementation ease through minimal complexity

The multi-table approach, while theoretically cleaner, turns every mixed-kind subscription into a multi-query UNION and would significantly burden the implementation. The hybrid single-table approach keeps the protocol guarantees while each REQ filter remains a single query.
