# Final Schema Recommendation: Hybrid Single Table Approach

## Executive Summary

After analyzing subscription query complexity, **the multi-table approach creates more problems than it solves**. REQ filters do not align with storage semantics: clients filter by kind, author, and tags regardless of how an event is classified.

**Recommendation: Modified Single Table with Event Type Classification**

## The Multi-Table Problem

### REQ Filter Reality Check
- Clients send: `{"kinds": [1, 0, 30023], "authors": ["pubkey"], "#p": ["target"]}`
- Multi-table requires: 3 separate queries + UNION + complex ordering
- Single table requires: 1 query with simple WHERE conditions

### Query Complexity Explosion
```sql
-- Multi-table nightmare for simple filter
WITH results AS (
    SELECT * FROM events_regular WHERE kind = 1 AND pubkey = ?
    UNION ALL
    SELECT * FROM events_replaceable WHERE kind = 0 AND pubkey = ?
    UNION ALL
    SELECT * FROM events_addressable WHERE kind = 30023 AND pubkey = ?
)
SELECT r.* FROM results r
JOIN multiple_tag_tables t ON complex_conditions
ORDER BY created_at DESC, id ASC LIMIT ?;

-- vs Single table simplicity
SELECT e.* FROM events e, json_each(e.tags) t
WHERE e.kind IN (1, 0, 30023)
  AND e.pubkey = ?
  AND json_extract(t.value, '$[0]') = 'p'
  AND json_extract(t.value, '$[1]') = ?
ORDER BY e.created_at DESC, e.id ASC LIMIT ?;
```
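
Both queries end with the same ordering: newest first, with ties on `created_at` broken by ascending `id`. When rows come from several tables, that ordering has to be applied to the merged result. A minimal sketch of the rule as a `qsort` comparator, assuming a hypothetical `event_ref` struct that is not part of the relay's code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, minimal view of an event: only the fields needed for ordering. */
typedef struct {
    long long created_at;   /* Unix timestamp in seconds */
    const char *id;         /* lowercase hex event id */
} event_ref;

/* qsort comparator mirroring ORDER BY created_at DESC, id ASC. */
int compare_events(const void *a, const void *b) {
    const event_ref *ea = a;
    const event_ref *eb = b;
    if (ea->created_at != eb->created_at) {
        return (ea->created_at > eb->created_at) ? -1 : 1;  /* newer first */
    }
    return strcmp(ea->id, eb->id);  /* lower id first on ties */
}

/* Sort a merged result set in place. */
void sort_events(event_ref *events, size_t count) {
    assert(events != NULL || count == 0);
    if (count > 1) {
        qsort(events, count, sizeof *events, compare_events);
    }
}
```

With a single table the database applies this ordering itself, so no application-side merge step is needed.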

## Recommended Schema: Hybrid Approach

### Core Design Philosophy
- **Single table for REQ query simplicity**
- **Event type classification for protocol compliance**
- **JSON tags for atomic storage and rich querying**
- **Partial unique constraints for replacement logic**

### Schema Definition

```sql
CREATE TABLE events (
    id TEXT PRIMARY KEY,
    pubkey TEXT NOT NULL,
    created_at INTEGER NOT NULL,
    kind INTEGER NOT NULL,
    event_type TEXT NOT NULL CHECK (event_type IN ('regular', 'replaceable', 'ephemeral', 'addressable')),
    content TEXT NOT NULL,
    sig TEXT NOT NULL,
    tags JSON NOT NULL DEFAULT '[]',
    first_seen INTEGER NOT NULL DEFAULT (strftime('%s', 'now')),

    -- Additional field for addressable events: value of the first "d" tag.
    -- SQLite generated columns cannot contain subqueries, so this is a plain
    -- column that the application fills in at insert time (NULL otherwise).
    d_tag TEXT,

    -- Replacement tracking
    replaced_at INTEGER
);

-- Protocol compliance constraints.
-- SQLite does not accept a WHERE clause on a table-level UNIQUE constraint,
-- so these are expressed as partial unique indexes.
CREATE UNIQUE INDEX unique_replaceable
    ON events(pubkey, kind)
    WHERE event_type = 'replaceable';

CREATE UNIQUE INDEX unique_addressable
    ON events(pubkey, kind, d_tag)
    WHERE event_type = 'addressable' AND d_tag IS NOT NULL;
```
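
The `d_tag` column holds the value of an addressable event's first `d` tag. A minimal sketch of that lookup in C, assuming the tags have already been parsed into a hypothetical `tag_pair` array (the struct and function names are illustrative, not part of the relay's code):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parsed tag: the first two elements of a tag array,
 * e.g. ["d", "my-article"]. */
typedef struct {
    const char *name;
    const char *value;
} tag_pair;

/* Return the value of the first "d" tag, or NULL if the event has none. */
const char *find_d_tag(const tag_pair *tags, size_t count) {
    assert(tags != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (tags[i].name != NULL && strcmp(tags[i].name, "d") == 0) {
            return tags[i].value;
        }
    }
    return NULL;
}
```

Only the first match is used, so an event carrying several `d` tags still maps to a single `(pubkey, kind, d_tag)` key.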

### Event Type Classification Function

```sql
-- Lookup view that maps every kind (0..65535) to its event type.
-- SQLite has no SQL-defined functions, so a view stands in for one.
CREATE VIEW event_type_lookup AS
SELECT
    CASE
        WHEN (kind >= 1000 AND kind < 10000) OR
             (kind >= 4 AND kind < 45) OR
             kind = 1 OR kind = 2 THEN 'regular'
        WHEN (kind >= 10000 AND kind < 20000) OR
             kind = 0 OR kind = 3 THEN 'replaceable'
        WHEN kind >= 20000 AND kind < 30000 THEN 'ephemeral'
        WHEN kind >= 30000 AND kind < 40000 THEN 'addressable'
        ELSE 'unknown'
    END AS event_type,
    kind
FROM (
    -- Generate all possible kind values for lookup
    WITH RECURSIVE kinds(kind) AS (
        SELECT 0
        UNION ALL
        SELECT kind + 1 FROM kinds WHERE kind < 65535
    )
    SELECT kind FROM kinds
);
```

### Performance Indexes

```sql
-- Core query patterns
CREATE INDEX idx_events_pubkey ON events(pubkey);
CREATE INDEX idx_events_kind ON events(kind);
CREATE INDEX idx_events_created_at ON events(created_at DESC);
CREATE INDEX idx_events_event_type ON events(event_type);

-- Composite indexes for common filters
CREATE INDEX idx_events_pubkey_created_at ON events(pubkey, created_at DESC);
CREATE INDEX idx_events_kind_created_at ON events(kind, created_at DESC);
CREATE INDEX idx_events_type_created_at ON events(event_type, created_at DESC);

-- JSON tag patterns (#e, #p, #t)
-- NOTE: SQLite JSON paths have no [*] wildcard, so expression indexes such as
-- json_extract(tags, '$[*][1]') cannot be created. Tag filters are evaluated
-- with json_each() over rows already narrowed by the indexes above.

-- Addressable events d_tag index
CREATE INDEX idx_events_d_tag ON events(d_tag)
    WHERE event_type = 'addressable' AND d_tag IS NOT NULL;
```

### Replacement Logic Implementation

#### Replaceable Events Trigger
```sql
CREATE TRIGGER handle_replaceable_events
BEFORE INSERT ON events
FOR EACH ROW
WHEN NEW.event_type = 'replaceable'
BEGIN
    -- Delete older replaceable events with same pubkey+kind
    DELETE FROM events
    WHERE event_type = 'replaceable'
      AND pubkey = NEW.pubkey
      AND kind = NEW.kind
      AND (
          created_at < NEW.created_at OR
          (created_at = NEW.created_at AND id > NEW.id)
      );
END;
```

#### Addressable Events Trigger
```sql
CREATE TRIGGER handle_addressable_events
BEFORE INSERT ON events
FOR EACH ROW
WHEN NEW.event_type = 'addressable'
BEGIN
    -- Delete older addressable events with same pubkey+kind+d_tag
    DELETE FROM events
    WHERE event_type = 'addressable'
      AND pubkey = NEW.pubkey
      AND kind = NEW.kind
      AND d_tag = NEW.d_tag
      AND (
          created_at < NEW.created_at OR
          (created_at = NEW.created_at AND id > NEW.id)
      );
END;
```
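
Both triggers apply the same rule: a stored event is removed when the incoming one is newer, or has the same timestamp and a lexically lower `id`. A minimal sketch of that decision as a C predicate, which could be used to check an incoming event before the INSERT; the function name and signature are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true when the incoming event should replace the stored one.
 * Mirrors the trigger condition:
 *   created_at < NEW.created_at OR
 *   (created_at = NEW.created_at AND id > NEW.id)
 */
bool should_replace(long long stored_created_at, const char *stored_id,
                    long long new_created_at, const char *new_id) {
    assert(stored_id != NULL && new_id != NULL);
    if (stored_created_at != new_created_at) {
        return stored_created_at < new_created_at;  /* newer event wins */
    }
    /* Same timestamp: the lexically lower id wins. */
    return strcmp(stored_id, new_id) > 0;
}
```

When this returns false the stored event is kept. With the unique constraints in place, inserting the older event then fails instead of creating a duplicate.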

## Implementation Strategy

### C Code Integration

#### Event Type Classification
```c
typedef enum {
    EVENT_TYPE_REGULAR,
    EVENT_TYPE_REPLACEABLE,
    EVENT_TYPE_EPHEMERAL,
    EVENT_TYPE_ADDRESSABLE,
    EVENT_TYPE_UNKNOWN
} event_type_t;

event_type_t classify_event_kind(int kind) {
    if ((kind >= 1000 && kind < 10000) ||
        (kind >= 4 && kind < 45) ||
        kind == 1 || kind == 2) {
        return EVENT_TYPE_REGULAR;
    }
    if ((kind >= 10000 && kind < 20000) ||
        kind == 0 || kind == 3) {
        return EVENT_TYPE_REPLACEABLE;
    }
    if (kind >= 20000 && kind < 30000) {
        return EVENT_TYPE_EPHEMERAL;
    }
    if (kind >= 30000 && kind < 40000) {
        return EVENT_TYPE_ADDRESSABLE;
    }
    return EVENT_TYPE_UNKNOWN;
}

const char* event_type_to_string(event_type_t type) {
    switch (type) {
        case EVENT_TYPE_REGULAR: return "regular";
        case EVENT_TYPE_REPLACEABLE: return "replaceable";
        case EVENT_TYPE_EPHEMERAL: return "ephemeral";
        case EVENT_TYPE_ADDRESSABLE: return "addressable";
        default: return "unknown";
    }
}
```
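
#### Simplified Event Storage
```c
#include <string.h>
#include <sqlite3.h>
#include <cjson/cJSON.h>   // or "cJSON.h", depending on how cJSON is installed

extern sqlite3* g_db;      // database handle opened elsewhere

int store_event(cJSON* event) {
    // Extract fields
    cJSON* id = cJSON_GetObjectItem(event, "id");
    cJSON* pubkey = cJSON_GetObjectItem(event, "pubkey");
    cJSON* created_at = cJSON_GetObjectItem(event, "created_at");
    cJSON* kind = cJSON_GetObjectItem(event, "kind");
    cJSON* content = cJSON_GetObjectItem(event, "content");
    cJSON* sig = cJSON_GetObjectItem(event, "sig");
    cJSON* tags = cJSON_GetObjectItem(event, "tags");

    // Reject events with missing or mistyped fields before touching the database
    if (!cJSON_IsString(id) || !cJSON_IsString(pubkey) ||
        !cJSON_IsNumber(created_at) || !cJSON_IsNumber(kind) ||
        !cJSON_IsString(content) || !cJSON_IsString(sig)) {
        return -1;
    }

    // Classify event type; 'unknown' is not allowed by the CHECK constraint
    event_type_t type = classify_event_kind((int)cJSON_GetNumberValue(kind));
    if (type == EVENT_TYPE_UNKNOWN) {
        return -1;
    }

    // For addressable events, take the value of the first "d" tag
    const char* d_tag = NULL;
    if (type == EVENT_TYPE_ADDRESSABLE) {
        cJSON* tag;
        cJSON_ArrayForEach(tag, tags) {
            cJSON* name = cJSON_GetArrayItem(tag, 0);
            cJSON* value = cJSON_GetArrayItem(tag, 1);
            if (cJSON_IsString(name) && cJSON_IsString(value) &&
                strcmp(name->valuestring, "d") == 0) {
                d_tag = value->valuestring;
                break;
            }
        }
    }

    // Serialize tags to compact JSON; an absent tags array is stored as []
    char* tags_json = cJSON_IsArray(tags) ? cJSON_PrintUnformatted(tags) : NULL;

    // Single INSERT statement - database handles replacement via triggers
    const char* sql =
        "INSERT INTO events (id, pubkey, created_at, kind, event_type, content, sig, tags, d_tag) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)";

    sqlite3_stmt* stmt;
    int rc = sqlite3_prepare_v2(g_db, sql, -1, &stmt, NULL);
    if (rc != SQLITE_OK) {
        cJSON_free(tags_json);
        return -1;
    }

    sqlite3_bind_text(stmt, 1, cJSON_GetStringValue(id), -1, SQLITE_STATIC);
    sqlite3_bind_text(stmt, 2, cJSON_GetStringValue(pubkey), -1, SQLITE_STATIC);
    sqlite3_bind_int64(stmt, 3, (sqlite3_int64)cJSON_GetNumberValue(created_at));
    sqlite3_bind_int(stmt, 4, (int)cJSON_GetNumberValue(kind));
    sqlite3_bind_text(stmt, 5, event_type_to_string(type), -1, SQLITE_STATIC);
    sqlite3_bind_text(stmt, 6, cJSON_GetStringValue(content), -1, SQLITE_STATIC);
    sqlite3_bind_text(stmt, 7, cJSON_GetStringValue(sig), -1, SQLITE_STATIC);
    sqlite3_bind_text(stmt, 8, tags_json ? tags_json : "[]", -1, SQLITE_TRANSIENT);
    sqlite3_bind_text(stmt, 9, d_tag, -1, SQLITE_STATIC);  // NULL pointer binds SQL NULL

    rc = sqlite3_step(stmt);
    sqlite3_finalize(stmt);
    cJSON_free(tags_json);  // allocated by cJSON, so released through cJSON

    // Any other result (e.g. a duplicate id or a newer version already stored) is a failure
    return (rc == SQLITE_DONE) ? 0 : -1;
}
```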

#### Simple REQ Query Building
```c
#include <string.h>
#include <glib.h>
#include <cjson/cJSON.h>   // or "cJSON.h", depending on how cJSON is installed

// Append one "?" per array element, comma-separated.
// An empty array yields "IN ()", which SQLite accepts and which matches nothing.
static void append_placeholders(GString* query, cJSON* array) {
    int count = cJSON_GetArraySize(array);
    for (int i = 0; i < count; i++) {
        g_string_append(query, i == 0 ? "?" : ", ?");
    }
}

// Returns a newly allocated SQL string; release it with g_free().
// Values must be bound in the order the clauses are appended:
// ids, authors, kinds, tag values (in filter key order), since, until, limit.
char* build_filter_query(cJSON* filter) {
    // Build single query against events table
    // Much simpler than multi-table approach

    GString* query = g_string_new("SELECT * FROM events WHERE 1=1");

    // Handle ids filter
    cJSON* ids = cJSON_GetObjectItem(filter, "ids");
    if (ids && cJSON_IsArray(ids)) {
        g_string_append(query, " AND id IN (");
        append_placeholders(query, ids);
        g_string_append(query, ")");
    }

    // Handle authors filter
    cJSON* authors = cJSON_GetObjectItem(filter, "authors");
    if (authors && cJSON_IsArray(authors)) {
        g_string_append(query, " AND pubkey IN (");
        append_placeholders(query, authors);
        g_string_append(query, ")");
    }

    // Handle kinds filter
    cJSON* kinds = cJSON_GetObjectItem(filter, "kinds");
    if (kinds && cJSON_IsArray(kinds)) {
        g_string_append(query, " AND kind IN (");
        append_placeholders(query, kinds);
        g_string_append(query, ")");
    }

    // Handle tag filters (#e, #p, etc.)
    cJSON* item;
    cJSON_ArrayForEach(item, filter) {
        char* key = item->string;
        // Only single ASCII letters are accepted as tag names, so the name can
        // be embedded in the SQL text safely; tag values are still bound.
        if (key && key[0] == '#' && strlen(key) == 2 &&
            g_ascii_isalpha(key[1]) && cJSON_IsArray(item)) {
            char tag_name = key[1];
            g_string_append_printf(query,
                " AND EXISTS (SELECT 1 FROM json_each(tags) "
                "WHERE json_extract(value, '$[0]') = '%c' "
                "AND json_extract(value, '$[1]') IN (", tag_name);
            append_placeholders(query, item);
            g_string_append(query, "))");
        }
    }

    // Handle time range
    cJSON* since = cJSON_GetObjectItem(filter, "since");
    if (cJSON_IsNumber(since)) {
        g_string_append(query, " AND created_at >= ?");
    }

    cJSON* until = cJSON_GetObjectItem(filter, "until");
    if (cJSON_IsNumber(until)) {
        g_string_append(query, " AND created_at <= ?");
    }

    // Standard ordering and limit
    g_string_append(query, " ORDER BY created_at DESC, id ASC");

    cJSON* limit = cJSON_GetObjectItem(filter, "limit");
    if (cJSON_IsNumber(limit)) {
        g_string_append(query, " LIMIT ?");
    }

    return g_string_free(query, FALSE);
}
```
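
Tag filter keys are the one place where text from the client ends up in the SQL string itself rather than in a bound parameter, so the tag name is worth validating strictly. A minimal sketch, assuming only single-letter `#x` keys (a-z, A-Z) are accepted; the function name is hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Parse a filter key of the form "#x", where x is a single ASCII letter.
 * On success, store the letter in *tag_name and return true.
 * Any other key (NULL, "#", "#pp", "#'", "authors", ...) returns false.
 */
bool parse_tag_filter_key(const char *key, char *tag_name) {
    assert(tag_name != NULL);
    if (key == NULL || key[0] != '#' || key[1] == '\0' || key[2] != '\0') {
        return false;
    }
    char c = key[1];
    if (!((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))) {
        return false;
    }
    *tag_name = c;
    return true;
}
```

Rejecting everything outside this narrow shape means a key such as `#'` can never reach the query text.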

## Benefits of This Approach

### 1. Query Simplicity
- ✅ Single table = simple REQ queries
- ✅ No UNION complexity
- ✅ Familiar SQL patterns
- ✅ Easy LIMIT and ORDER BY handling

### 2. Protocol Compliance
- ✅ Event type classification enforced
- ✅ Replacement logic via triggers
- ✅ Unique constraints prevent duplicates
- ✅ Proper handling of all event types

### 3. Performance
- ✅ Unified indexes across all events
- ✅ No join overhead for basic queries
- ✅ JSON tag matching via `json_each` for complex filters, with no separate tag table
- ✅ Single table scan for cross-kind queries

### 4. Implementation Simplicity
- ✅ Minimal changes from current code
- ✅ Database handles replacement logic
- ✅ Simple event storage function
- ✅ No complex routing logic needed

### 5. Future Flexibility
- ✅ Can add columns for new event types
- ✅ Can split tables later if needed
- ✅ Easy to add new indexes
- ✅ Extensible constraint system

## Migration Path

### Phase 1: Schema Update
1. Add `event_type` column to existing events table
2. Add JSON `tags` column
3. Create replacement triggers
4. Add partial unique indexes

### Phase 2: Data Migration
1. Classify existing events by kind
2. Convert existing tag table data to JSON
3. Verify constraint compliance
4. Update indexes

### Phase 3: Code Updates
1. Update event storage to use new schema
2. Simplify REQ query building
3. Remove tag table JOIN logic
4. Test subscription filtering

### Phase 4: Optimization
1. Monitor query performance
2. Add specialized indexes as needed
3. Tune replacement triggers
4. Consider ephemeral event cleanup

## Conclusion

This hybrid approach achieves the best of both worlds:

- **Protocol compliance** through event type classification and constraints
- **Query simplicity** through unified storage
- **Performance** through targeted indexes
- **Implementation ease** through minimal complexity

The multi-table approach, while theoretically cleaner, turns every subscription into a multi-query UNION that would significantly burden the implementation. The hybrid single-table approach keeps the protocol guarantees at a manageable level of complexity.