337 lines
9.8 KiB
Markdown
337 lines
9.8 KiB
Markdown
# Advanced Nostr Relay Schema Design
|
|
|
|
## Overview
|
|
|
|
This document outlines the design for an advanced multi-table schema that enforces Nostr protocol compliance at the database level, with separate tables for different event types based on their storage and replacement characteristics.
|
|
|
|
## Event Type Classification
|
|
|
|
Based on the Nostr specification, events are classified into four categories:
|
|
|
|
### 1. Regular Events
|
|
- **Kinds**: `1000 <= n < 10000` || `4 <= n < 45` || `n == 1` || `n == 2`
|
|
- **Storage Policy**: All events stored permanently
|
|
- **Examples**: Text notes (1), Reposts (6), Reactions (7), Direct Messages (4)
|
|
|
|
### 2. Replaceable Events
|
|
- **Kinds**: `10000 <= n < 20000` || `n == 0` || `n == 3`
|
|
- **Storage Policy**: Only latest per `(pubkey, kind)` combination
|
|
- **Replacement Logic**: Latest `created_at`, then lowest `id` lexically
|
|
- **Examples**: Metadata (0), Contacts (3), Mute List (10000)
|
|
|
|
### 3. Ephemeral Events
|
|
- **Kinds**: `20000 <= n < 30000`
|
|
- **Storage Policy**: Not expected to be stored (optional temporary storage)
|
|
- **Examples**: Typing indicators, presence updates, ephemeral messages
|
|
|
|
### 4. Addressable Events
|
|
- **Kinds**: `30000 <= n < 40000`
|
|
- **Storage Policy**: Only latest per `(pubkey, kind, d_tag)` combination
|
|
- **Replacement Logic**: Same as replaceable events
|
|
- **Examples**: Long-form content (30023), Application-specific data
|
|
|
|
## SQLite JSON Capabilities Research
|
|
|
|
SQLite provides powerful JSON functions that could be leveraged for tag storage:
|
|
|
|
### Core JSON Functions
|
|
```sql
|
|
-- Extract specific values
|
|
json_extract(column, '$.path')
|
|
|
|
-- Iterate through arrays
|
|
json_each(json_array_column)
|
|
|
|
-- Flatten nested structures
|
|
json_tree(json_column)
|
|
|
|
-- Validate JSON structure
|
|
json_valid(column)
|
|
|
|
-- Array operations
|
|
json_array_length(column)
|
|
json_extract(column, '$[0]') -- First element
|
|
```
|
|
|
|
### Tag Query Examples
|
|
|
|
#### Find all 'e' tag references:
|
|
```sql
|
|
SELECT
|
|
id,
|
|
json_extract(value, '$[1]') as referenced_event_id,
|
|
json_extract(value, '$[2]') as relay_hint,
|
|
json_extract(value, '$[3]') as marker
|
|
FROM events, json_each(tags)
|
|
WHERE json_extract(value, '$[0]') = 'e';
|
|
```
|
|
|
|
#### Find events with specific hashtags:
|
|
```sql
|
|
SELECT id, content
|
|
FROM events, json_each(tags)
|
|
WHERE json_extract(value, '$[0]') = 't'
|
|
AND json_extract(value, '$[1]') = 'bitcoin';
|
|
```
|
|
|
|
#### Extract 'd' tag for addressable events:
|
|
```sql
|
|
SELECT
|
|
id,
|
|
json_extract(value, '$[1]') as d_tag_value
|
|
FROM events, json_each(tags)
|
|
WHERE json_extract(value, '$[0]') = 'd'
|
|
LIMIT 1;
|
|
```
|
|
|
|
### JSON Functional Indexes
|
|
```sql
|
|
-- Index on hashtags
|
|
CREATE INDEX idx_hashtags ON events(
|
|
json_extract(tags, '$[*][1]')
|
|
) WHERE json_extract(tags, '$[*][0]') = 't';
|
|
|
|
-- Index on 'd' tags for addressable events
|
|
CREATE INDEX idx_d_tags ON events_addressable(
|
|
json_extract(tags, '$[*][1]')
|
|
) WHERE json_extract(tags, '$[*][0]') = 'd';
|
|
```
|
|
|
|
## Proposed Schema Design
|
|
|
|
### Option 1: Separate Tables with JSON Tags
|
|
|
|
```sql
|
|
-- Regular Events (permanent storage)
|
|
CREATE TABLE events_regular (
|
|
id TEXT PRIMARY KEY,
|
|
pubkey TEXT NOT NULL,
|
|
created_at INTEGER NOT NULL,
|
|
kind INTEGER NOT NULL,
|
|
content TEXT NOT NULL,
|
|
sig TEXT NOT NULL,
|
|
tags JSON,
|
|
first_seen INTEGER DEFAULT (strftime('%s', 'now')),
|
|
CONSTRAINT kind_regular CHECK (
|
|
(kind >= 1000 AND kind < 10000) OR
|
|
(kind >= 4 AND kind < 45) OR
|
|
kind = 1 OR kind = 2
|
|
)
|
|
);
|
|
|
|
-- Replaceable Events (latest per pubkey+kind)
|
|
CREATE TABLE events_replaceable (
|
|
pubkey TEXT NOT NULL,
|
|
kind INTEGER NOT NULL,
|
|
id TEXT NOT NULL,
|
|
created_at INTEGER NOT NULL,
|
|
content TEXT NOT NULL,
|
|
sig TEXT NOT NULL,
|
|
tags JSON,
|
|
replaced_at INTEGER DEFAULT (strftime('%s', 'now')),
|
|
PRIMARY KEY (pubkey, kind),
|
|
CONSTRAINT kind_replaceable CHECK (
|
|
(kind >= 10000 AND kind < 20000) OR
|
|
kind = 0 OR kind = 3
|
|
)
|
|
);
|
|
|
|
-- Ephemeral Events (temporary/optional storage)
|
|
CREATE TABLE events_ephemeral (
|
|
id TEXT PRIMARY KEY,
|
|
pubkey TEXT NOT NULL,
|
|
created_at INTEGER NOT NULL,
|
|
kind INTEGER NOT NULL,
|
|
content TEXT NOT NULL,
|
|
sig TEXT NOT NULL,
|
|
tags JSON,
|
|
expires_at INTEGER DEFAULT (strftime('%s', 'now', '+1 hour')),
|
|
CONSTRAINT kind_ephemeral CHECK (
|
|
kind >= 20000 AND kind < 30000
|
|
)
|
|
);
|
|
|
|
-- Addressable Events (latest per pubkey+kind+d_tag)
|
|
CREATE TABLE events_addressable (
|
|
pubkey TEXT NOT NULL,
|
|
kind INTEGER NOT NULL,
|
|
d_tag TEXT NOT NULL,
|
|
id TEXT NOT NULL,
|
|
created_at INTEGER NOT NULL,
|
|
content TEXT NOT NULL,
|
|
sig TEXT NOT NULL,
|
|
tags JSON,
|
|
replaced_at INTEGER DEFAULT (strftime('%s', 'now')),
|
|
PRIMARY KEY (pubkey, kind, d_tag),
|
|
CONSTRAINT kind_addressable CHECK (
|
|
kind >= 30000 AND kind < 40000
|
|
)
|
|
);
|
|
```
|
|
|
|
### Indexes for Performance
|
|
|
|
```sql
|
|
-- Regular events indexes
|
|
CREATE INDEX idx_regular_pubkey ON events_regular(pubkey);
|
|
CREATE INDEX idx_regular_kind ON events_regular(kind);
|
|
CREATE INDEX idx_regular_created_at ON events_regular(created_at);
|
|
CREATE INDEX idx_regular_kind_created_at ON events_regular(kind, created_at);
|
|
|
|
-- Replaceable events indexes
|
|
CREATE INDEX idx_replaceable_created_at ON events_replaceable(created_at);
|
|
CREATE INDEX idx_replaceable_id ON events_replaceable(id);
|
|
|
|
-- Ephemeral events indexes
|
|
CREATE INDEX idx_ephemeral_expires_at ON events_ephemeral(expires_at);
|
|
CREATE INDEX idx_ephemeral_pubkey ON events_ephemeral(pubkey);
|
|
|
|
-- Addressable events indexes
|
|
CREATE INDEX idx_addressable_created_at ON events_addressable(created_at);
|
|
CREATE INDEX idx_addressable_id ON events_addressable(id);
|
|
|
|
-- JSON tag indexes (examples)
|
|
CREATE INDEX idx_regular_e_tags ON events_regular(
|
|
json_extract(tags, '$[*][1]')
|
|
) WHERE json_extract(tags, '$[*][0]') = 'e';
|
|
|
|
CREATE INDEX idx_regular_p_tags ON events_regular(
|
|
json_extract(tags, '$[*][1]')
|
|
) WHERE json_extract(tags, '$[*][0]') = 'p';
|
|
```
|
|
|
|
### Option 2: Unified Tag Table Approach
|
|
|
|
```sql
|
|
-- Unified tag storage (alternative to JSON)
|
|
CREATE TABLE tags_unified (
|
|
event_id TEXT NOT NULL,
|
|
event_type TEXT NOT NULL, -- 'regular', 'replaceable', 'ephemeral', 'addressable'
|
|
tag_index INTEGER NOT NULL, -- Position in tag array
|
|
name TEXT NOT NULL,
|
|
value TEXT NOT NULL,
|
|
param_2 TEXT, -- Third element if present
|
|
param_3 TEXT, -- Fourth element if present
|
|
param_json TEXT, -- JSON for additional parameters
|
|
PRIMARY KEY (event_id, tag_index)
|
|
);
|
|
|
|
CREATE INDEX idx_tags_name_value ON tags_unified(name, value);
|
|
CREATE INDEX idx_tags_event_type ON tags_unified(event_type);
|
|
```
|
|
|
|
## Implementation Strategy
|
|
|
|
### 1. Kind Classification Function (C Code)
|
|
```c
|
|
typedef enum {
|
|
EVENT_TYPE_REGULAR,
|
|
EVENT_TYPE_REPLACEABLE,
|
|
EVENT_TYPE_EPHEMERAL,
|
|
EVENT_TYPE_ADDRESSABLE,
|
|
EVENT_TYPE_INVALID
|
|
} event_type_t;
|
|
|
|
event_type_t classify_event_kind(int kind) {
|
|
if ((kind >= 1000 && kind < 10000) ||
|
|
(kind >= 4 && kind < 45) ||
|
|
kind == 1 || kind == 2) {
|
|
return EVENT_TYPE_REGULAR;
|
|
}
|
|
|
|
if ((kind >= 10000 && kind < 20000) ||
|
|
kind == 0 || kind == 3) {
|
|
return EVENT_TYPE_REPLACEABLE;
|
|
}
|
|
|
|
if (kind >= 20000 && kind < 30000) {
|
|
return EVENT_TYPE_EPHEMERAL;
|
|
}
|
|
|
|
if (kind >= 30000 && kind < 40000) {
|
|
return EVENT_TYPE_ADDRESSABLE;
|
|
}
|
|
|
|
return EVENT_TYPE_INVALID;
|
|
}
|
|
```
|
|
|
|
### 2. Replacement Logic for Replaceable Events
|
|
```sql
|
|
-- Trigger for replaceable events
|
|
CREATE TRIGGER replace_event_on_insert
|
|
BEFORE INSERT ON events_replaceable
|
|
FOR EACH ROW
|
|
WHEN EXISTS (
|
|
SELECT 1 FROM events_replaceable
|
|
WHERE pubkey = NEW.pubkey AND kind = NEW.kind
|
|
)
|
|
BEGIN
|
|
DELETE FROM events_replaceable
|
|
WHERE pubkey = NEW.pubkey
|
|
AND kind = NEW.kind
|
|
AND (
|
|
created_at < NEW.created_at OR
|
|
(created_at = NEW.created_at AND id > NEW.id)
|
|
);
|
|
END;
|
|
```
|
|
|
|
### 3. D-Tag Extraction for Addressable Events
|
|
```c
|
|
char* extract_d_tag(cJSON* tags) {
|
|
if (!tags || !cJSON_IsArray(tags)) {
|
|
return NULL;
|
|
}
|
|
|
|
cJSON* tag;
|
|
cJSON_ArrayForEach(tag, tags) {
|
|
if (cJSON_IsArray(tag) && cJSON_GetArraySize(tag) >= 2) {
|
|
cJSON* tag_name = cJSON_GetArrayItem(tag, 0);
|
|
cJSON* tag_value = cJSON_GetArrayItem(tag, 1);
|
|
|
|
if (cJSON_IsString(tag_name) && cJSON_IsString(tag_value)) {
|
|
if (strcmp(cJSON_GetStringValue(tag_name), "d") == 0) {
|
|
return strdup(cJSON_GetStringValue(tag_value));
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
return strdup(""); // Default empty d-tag
|
|
}
|
|
```
|
|
|
|
## Advantages of This Design
|
|
|
|
### 1. Protocol Compliance
|
|
- **Enforced at DB level**: Schema constraints prevent invalid event storage
|
|
- **Automatic replacement**: Triggers handle replaceable/addressable event logic
|
|
- **Type safety**: Separate tables ensure correct handling per event type
|
|
|
|
### 2. Performance Benefits
|
|
- **Targeted indexes**: Each table optimized for its access patterns
|
|
- **Reduced storage**: Ephemeral events can be auto-expired
|
|
- **Query optimization**: SQLite can optimize queries per table structure
|
|
|
|
### 3. JSON Tag Benefits
|
|
- **Atomic storage**: Tags stored with their event
|
|
- **Rich querying**: SQLite JSON functions enable complex tag queries
|
|
- **Schema flexibility**: Can handle arbitrary tag structures
|
|
- **Functional indexes**: Index specific tag patterns efficiently
|
|
|
|
## Migration Strategy
|
|
|
|
1. **Phase 1**: Create new schema alongside existing
|
|
2. **Phase 2**: Implement kind classification and routing logic
|
|
3. **Phase 3**: Migrate existing data to appropriate tables
|
|
4. **Phase 4**: Update application logic to use new tables
|
|
5. **Phase 5**: Drop old schema after verification
|
|
|
|
## Next Steps for Implementation
|
|
|
|
1. **Prototype JSON performance**: Create test database with sample data
|
|
2. **Benchmark query patterns**: Compare JSON vs normalized approaches
|
|
3. **Implement kind classification**: Add routing logic to C code
|
|
4. **Create migration scripts**: Handle existing data transformation
|
|
5. **Update test suite**: Verify compliance with new schema |