Files

Your Name 662feab881 Basic relay functionality completed

2025-09-04 07:10:13 -04:00

9.8 KiB

Raw Blame History

Advanced Nostr Relay Schema Design

Overview

This document outlines the design for an advanced multi-table schema that enforces Nostr protocol compliance at the database level, with separate tables for different event types based on their storage and replacement characteristics.

Event Type Classification

Based on the Nostr specification, events are classified into four categories:

1. Regular Events

Kinds: 1000 <= n < 10000 || 4 <= n < 45 || n == 1 || n == 2
Storage Policy: All events stored permanently
Examples: Text notes (1), Reposts (6), Reactions (7), Direct Messages (4)

2. Replaceable Events

Kinds: 10000 <= n < 20000 || n == 0 || n == 3
Storage Policy: Only latest per (pubkey, kind) combination
Replacement Logic: Latest created_at, then lowest id lexically
Examples: Metadata (0), Contacts (3), Mute List (10000)

3. Ephemeral Events

Kinds: 20000 <= n < 30000
Storage Policy: Not expected to be stored (optional temporary storage)
Examples: Typing indicators, presence updates, ephemeral messages

4. Addressable Events

Kinds: 30000 <= n < 40000
Storage Policy: Only latest per (pubkey, kind, d_tag) combination
Replacement Logic: Same as replaceable events
Examples: Long-form content (30023), Application-specific data

SQLite JSON Capabilities Research

SQLite provides powerful JSON functions that could be leveraged for tag storage:

Core JSON Functions

-- Extract specific values
json_extract(column, '$.path') 

-- Iterate through arrays
json_each(json_array_column)

-- Flatten nested structures  
json_tree(json_column)

-- Validate JSON structure
json_valid(column)

-- Array operations
json_array_length(column)
json_extract(column, '$[0]')  -- First element

Tag Query Examples

Find all 'e' tag references:

SELECT 
    id,
    json_extract(value, '$[1]') as referenced_event_id,
    json_extract(value, '$[2]') as relay_hint,
    json_extract(value, '$[3]') as marker
FROM events, json_each(tags)
WHERE json_extract(value, '$[0]') = 'e';

Find events with specific hashtags:

SELECT id, content 
FROM events, json_each(tags)
WHERE json_extract(value, '$[0]') = 't' 
  AND json_extract(value, '$[1]') = 'bitcoin';

Extract 'd' tag for addressable events:

SELECT 
    id,
    json_extract(value, '$[1]') as d_tag_value
FROM events, json_each(tags)
WHERE json_extract(value, '$[0]') = 'd'
LIMIT 1;

JSON Functional Indexes

-- Index on hashtags
CREATE INDEX idx_hashtags ON events(
    json_extract(tags, '$[*][1]')
) WHERE json_extract(tags, '$[*][0]') = 't';

-- Index on 'd' tags for addressable events
CREATE INDEX idx_d_tags ON events_addressable(
    json_extract(tags, '$[*][1]')
) WHERE json_extract(tags, '$[*][0]') = 'd';

Proposed Schema Design

Option 1: Separate Tables with JSON Tags

-- Regular Events (permanent storage)
CREATE TABLE events_regular (
    id TEXT PRIMARY KEY,
    pubkey TEXT NOT NULL,
    created_at INTEGER NOT NULL,
    kind INTEGER NOT NULL,
    content TEXT NOT NULL,
    sig TEXT NOT NULL,
    tags JSON,
    first_seen INTEGER DEFAULT (strftime('%s', 'now')),
    CONSTRAINT kind_regular CHECK (
        (kind >= 1000 AND kind < 10000) OR 
        (kind >= 4 AND kind < 45) OR 
        kind = 1 OR kind = 2
    )
);

-- Replaceable Events (latest per pubkey+kind)
CREATE TABLE events_replaceable (
    pubkey TEXT NOT NULL,
    kind INTEGER NOT NULL,
    id TEXT NOT NULL,
    created_at INTEGER NOT NULL,
    content TEXT NOT NULL,
    sig TEXT NOT NULL,
    tags JSON,
    replaced_at INTEGER DEFAULT (strftime('%s', 'now')),
    PRIMARY KEY (pubkey, kind),
    CONSTRAINT kind_replaceable CHECK (
        (kind >= 10000 AND kind < 20000) OR 
        kind = 0 OR kind = 3
    )
);

-- Ephemeral Events (temporary/optional storage)
CREATE TABLE events_ephemeral (
    id TEXT PRIMARY KEY,
    pubkey TEXT NOT NULL,
    created_at INTEGER NOT NULL,
    kind INTEGER NOT NULL,
    content TEXT NOT NULL,
    sig TEXT NOT NULL,
    tags JSON,
    expires_at INTEGER DEFAULT (strftime('%s', 'now', '+1 hour')),
    CONSTRAINT kind_ephemeral CHECK (
        kind >= 20000 AND kind < 30000
    )
);

-- Addressable Events (latest per pubkey+kind+d_tag)
CREATE TABLE events_addressable (
    pubkey TEXT NOT NULL,
    kind INTEGER NOT NULL,
    d_tag TEXT NOT NULL,
    id TEXT NOT NULL,
    created_at INTEGER NOT NULL,
    content TEXT NOT NULL,
    sig TEXT NOT NULL,
    tags JSON,
    replaced_at INTEGER DEFAULT (strftime('%s', 'now')),
    PRIMARY KEY (pubkey, kind, d_tag),
    CONSTRAINT kind_addressable CHECK (
        kind >= 30000 AND kind < 40000
    )
);

Indexes for Performance

-- Regular events indexes
CREATE INDEX idx_regular_pubkey ON events_regular(pubkey);
CREATE INDEX idx_regular_kind ON events_regular(kind);
CREATE INDEX idx_regular_created_at ON events_regular(created_at);
CREATE INDEX idx_regular_kind_created_at ON events_regular(kind, created_at);

-- Replaceable events indexes  
CREATE INDEX idx_replaceable_created_at ON events_replaceable(created_at);
CREATE INDEX idx_replaceable_id ON events_replaceable(id);

-- Ephemeral events indexes
CREATE INDEX idx_ephemeral_expires_at ON events_ephemeral(expires_at);
CREATE INDEX idx_ephemeral_pubkey ON events_ephemeral(pubkey);

-- Addressable events indexes
CREATE INDEX idx_addressable_created_at ON events_addressable(created_at);
CREATE INDEX idx_addressable_id ON events_addressable(id);

-- JSON tag indexes (examples)
CREATE INDEX idx_regular_e_tags ON events_regular(
    json_extract(tags, '$[*][1]')
) WHERE json_extract(tags, '$[*][0]') = 'e';

CREATE INDEX idx_regular_p_tags ON events_regular(
    json_extract(tags, '$[*][1]')  
) WHERE json_extract(tags, '$[*][0]') = 'p';

Option 2: Unified Tag Table Approach

-- Unified tag storage (alternative to JSON)
CREATE TABLE tags_unified (
    event_id TEXT NOT NULL,
    event_type TEXT NOT NULL, -- 'regular', 'replaceable', 'ephemeral', 'addressable'
    tag_index INTEGER NOT NULL, -- Position in tag array
    name TEXT NOT NULL,
    value TEXT NOT NULL,
    param_2 TEXT, -- Third element if present
    param_3 TEXT, -- Fourth element if present
    param_json TEXT, -- JSON for additional parameters
    PRIMARY KEY (event_id, tag_index)
);

CREATE INDEX idx_tags_name_value ON tags_unified(name, value);
CREATE INDEX idx_tags_event_type ON tags_unified(event_type);

Implementation Strategy

1. Kind Classification Function (C Code)

typedef enum {
    EVENT_TYPE_REGULAR,
    EVENT_TYPE_REPLACEABLE,
    EVENT_TYPE_EPHEMERAL, 
    EVENT_TYPE_ADDRESSABLE,
    EVENT_TYPE_INVALID
} event_type_t;

event_type_t classify_event_kind(int kind) {
    if ((kind >= 1000 && kind < 10000) || 
        (kind >= 4 && kind < 45) || 
        kind == 1 || kind == 2) {
        return EVENT_TYPE_REGULAR;
    }
    
    if ((kind >= 10000 && kind < 20000) || 
        kind == 0 || kind == 3) {
        return EVENT_TYPE_REPLACEABLE;
    }
    
    if (kind >= 20000 && kind < 30000) {
        return EVENT_TYPE_EPHEMERAL;
    }
    
    if (kind >= 30000 && kind < 40000) {
        return EVENT_TYPE_ADDRESSABLE;
    }
    
    return EVENT_TYPE_INVALID;
}

2. Replacement Logic for Replaceable Events

-- Trigger for replaceable events
CREATE TRIGGER replace_event_on_insert 
BEFORE INSERT ON events_replaceable
FOR EACH ROW
WHEN EXISTS (
    SELECT 1 FROM events_replaceable 
    WHERE pubkey = NEW.pubkey AND kind = NEW.kind
)
BEGIN
    DELETE FROM events_replaceable 
    WHERE pubkey = NEW.pubkey 
      AND kind = NEW.kind
      AND (
          created_at < NEW.created_at OR 
          (created_at = NEW.created_at AND id > NEW.id)
      );
END;

3. D-Tag Extraction for Addressable Events

char* extract_d_tag(cJSON* tags) {
    if (!tags || !cJSON_IsArray(tags)) {
        return NULL;
    }
    
    cJSON* tag;
    cJSON_ArrayForEach(tag, tags) {
        if (cJSON_IsArray(tag) && cJSON_GetArraySize(tag) >= 2) {
            cJSON* tag_name = cJSON_GetArrayItem(tag, 0);
            cJSON* tag_value = cJSON_GetArrayItem(tag, 1);
            
            if (cJSON_IsString(tag_name) && cJSON_IsString(tag_value)) {
                if (strcmp(cJSON_GetStringValue(tag_name), "d") == 0) {
                    return strdup(cJSON_GetStringValue(tag_value));
                }
            }
        }
    }
    
    return strdup(""); // Default empty d-tag
}

Advantages of This Design

1. Protocol Compliance

Enforced at DB level: Schema constraints prevent invalid event storage
Automatic replacement: Triggers handle replaceable/addressable event logic
Type safety: Separate tables ensure correct handling per event type

2. Performance Benefits

Targeted indexes: Each table optimized for its access patterns
Reduced storage: Ephemeral events can be auto-expired
Query optimization: SQLite can optimize queries per table structure

3. JSON Tag Benefits

Atomic storage: Tags stored with their event
Rich querying: SQLite JSON functions enable complex tag queries
Schema flexibility: Can handle arbitrary tag structures
Functional indexes: Index specific tag patterns efficiently

Migration Strategy

Phase 1: Create new schema alongside existing
Phase 2: Implement kind classification and routing logic
Phase 3: Migrate existing data to appropriate tables
Phase 4: Update application logic to use new tables
Phase 5: Drop old schema after verification

Next Steps for Implementation

Prototype JSON performance: Create test database with sample data
Benchmark query patterns: Compare JSON vs normalized approaches
Implement kind classification: Add routing logic to C code
Create migration scripts: Handle existing data transformation
Update test suite: Verify compliance with new schema

9.8 KiB Raw Blame History