Finished BUD 1

Your Name
2025-08-18 21:51:54 -04:00
parent e641c813eb
commit 95ccb3a9c4
24 changed files with 1728 additions and 31 deletions

115
README.md

@@ -231,9 +231,122 @@ Successful uploads return blob descriptors:
│ └── b1674191a88ec5cdd733e4240a81803105dc412d6c6708d53ab94fc248f4f553.pdf
├── a8/
│ └── 47/
│ └── a8472f6d93e42c1e5b4e9f3a7b2c8d4e6f9a1b3c5d7e8f0a1b2c3d4e5f6a7b8.png
│ └── a8472f6d93e42c1e5b4e6f9a1b3c5d7e8f0a1b2c3d4e6f9a1b3c5d7e8f0a1b2c3d4e5f6a7b8.png
```
## File and Extension Handling
ginxsom identifies every blob by its SHA-256 hash alone: the database stores bare hashes, files on disk keep their extensions, and request URLs may carry any extension or none.
### Database-Driven Architecture
The system uses the SQLite database as the **single source of truth** for blob existence and metadata:
```sql
-- Database schema (clean SHA-256 hashes, no extensions)
CREATE TABLE blobs (
sha256 TEXT PRIMARY KEY, -- Clean hash: b1674191a88ec5cdd733e4240a81803105dc412d6c6708d53ab94fc248f4f553
size INTEGER NOT NULL,
type TEXT NOT NULL,
uploaded_at INTEGER NOT NULL,
uploader_pubkey TEXT,
filename TEXT -- Original filename: document.pdf
);
```
**Key Benefits:**
- **Database Lookup vs Filesystem**: FastCGI queries the database instead of checking the filesystem
- **Single Source of Truth**: The database definitively knows whether a blob exists
- **Extension Irrelevant**: The database stores clean SHA-256 hashes without extensions
- **Performance**: One primary-key lookup replaces probing the filesystem with `stat()` for each possible extension
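
As a rough illustration of the existence check described above, here is a minimal Python sketch using the standard `sqlite3` module against the schema shown earlier. The helper names `open_db` and `blob_exists` are hypothetical and are not part of ginxsom.

```python
import sqlite3

# Same columns as the README schema; IF NOT EXISTS makes the script re-runnable.
SCHEMA = """
CREATE TABLE IF NOT EXISTS blobs (
    sha256 TEXT PRIMARY KEY,
    size INTEGER NOT NULL,
    type TEXT NOT NULL,
    uploaded_at INTEGER NOT NULL,
    uploader_pubkey TEXT,
    filename TEXT
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the blobs table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def blob_exists(conn, sha256):
    """Return True if a blob with this bare hash (no extension) is recorded."""
    row = conn.execute(
        "SELECT 1 FROM blobs WHERE sha256 = ? LIMIT 1", (sha256,)
    ).fetchone()
    return row is not None
```

Because the primary key is the bare hash, a caller has to strip any extension before calling `blob_exists`; passing `<hash>.pdf` will not match.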
### URL and Extension Support
ginxsom supports flexible URL patterns for maximum client compatibility:
```
# Both forms work identically:
GET /b1674191a88ec5cdd733e4240a81803105dc412d6c6708d53ab94fc248f4f553
GET /b1674191a88ec5cdd733e4240a81803105dc412d6c6708d53ab94fc248f4f553.pdf
# HEAD requests work with or without extensions:
HEAD /b1674191a88ec5cdd733e4240a81803105dc412d6c6708d53ab94fc248f4f553
HEAD /b1674191a88ec5cdd733e4240a81803105dc412d6c6708d53ab94fc248f4f553.pdf
```
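
To make the equivalence of the two URL forms concrete, the following Python sketch reduces a request path to its bare hash. The function name `extract_hash` and the rule that an extension is a single dot followed by alphanumerics are assumptions made for illustration; the nginx pattern in this README accepts any trailing characters.

```python
import re

# 64 lowercase hex characters, optionally followed by one ".ext" suffix.
# The suffix rule is an assumption; it is stricter than the nginx regex.
HASH_PATH = re.compile(r"/([a-f0-9]{64})(?:\.[A-Za-z0-9]+)?")


def extract_hash(path):
    """Return the bare SHA-256 hash from a request path, or None if malformed."""
    match = HASH_PATH.fullmatch(path)
    return match.group(1) if match else None
```

Both `/<hash>` and `/<hash>.pdf` yield the same hash, which is why the extension has no effect on which blob is addressed.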
### nginx File Storage Strategy
Files are stored on disk **with extensions** so that nginx can derive the MIME type from the filename and serve blobs as static files:
```nginx
# Blossom-compliant nginx configuration
location ~ ^/([a-f0-9]{64}).*$ {
    root /var/lib/ginxsom/blobs;
    # try_files tests literal paths and does not expand glob patterns such as
    # /$1*, so each candidate extension is listed explicitly.
    # The extension list here is illustrative; it must cover every stored type.
    try_files /$1.pdf /$1.png /$1.jpg /$1.mp3 /$1.mp4 /$1.txt =404;
}
```
**How it works:**
1. **Hash-only lookup**: nginx captures the 64-character SHA-256 hash from the URL and ignores anything after it
2. **Extension probing**: `try_files` appends each listed extension to the hash in order and serves the first file that exists
3. **Blossom compliance**: The stored file is served regardless of the extension in the URL

**Examples:**

Client requests: `b1674191a88ec5cdd733e4240a81803105dc412d6c6708d53ab94fc248f4f553.pdf`
- nginx extracts hash: `b1674191a88ec5cdd733e4240a81803105dc412d6c6708d53ab94fc248f4f553`
- nginx tries the hash with each listed extension, starting with `.pdf`
- nginx finds: `b1674191a88ec5cdd733e4240a81803105dc412d6c6708d53ab94fc248f4f553.pdf`
- nginx serves the PDF with correct `Content-Type: application/pdf`

Client requests: `b1674191a88ec5cdd733e4240a81803105dc412d6c6708d53ab94fc248f4f553.mp3`
- nginx extracts same hash: `b1674191a88ec5cdd733e4240a81803105dc412d6c6708d53ab94fc248f4f553`
- nginx tries the same extension list; the `.mp3` in the URL plays no part in the lookup
- nginx finds: `b1674191a88ec5cdd733e4240a81803105dc412d6c6708d53ab94fc248f4f553.pdf`
- nginx serves the PDF (not 404) with correct `Content-Type: application/pdf`

**Key Benefits:**
- **Blossom Protocol Compliance**: Accepts any extension, returns correct MIME type
- **Performance**: nginx serves files directly without FastCGI involvement
- **Flexibility**: URLs work with correct extension, wrong extension, or no extension
- **MIME Detection**: nginx determines Content-Type from the actual file extension on disk

With this approach a blob is located by its SHA-256 hash whatever extension (if any) appears in the request URL, and nginx still serves it as a static file.
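
The lookup described in this section, resolving a hash to whichever stored file carries it and deriving the MIME type from the on-disk extension, can be sketched outside nginx in Python. This is an illustration under assumptions, not ginxsom's implementation: `find_blob` and `content_type_for` are hypothetical names, and the sketch assumes all blobs sit in one flat directory.

```python
import mimetypes
import re
from pathlib import Path

SHA256_HEX = re.compile(r"[a-f0-9]{64}")


def find_blob(blob_root, sha256):
    """Return the stored file whose name starts with the hash, or None."""
    if not SHA256_HEX.fullmatch(sha256):
        raise ValueError("expected 64 lowercase hex characters")
    # The hash is validated hex, so it contains no glob metacharacters.
    matches = sorted(Path(blob_root).glob(sha256 + "*"))
    return matches[0] if matches else None


def content_type_for(path):
    """Guess the Content-Type from the on-disk extension, as nginx does."""
    guessed, _encoding = mimetypes.guess_type(path.name)
    return guessed or "application/octet-stream"
```

The extension in the request never enters `find_blob`; only the hash does, so a request ending in `.mp3` still resolves to the stored `.pdf` file and its PDF content type.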
### HEAD Request Handling
HEAD requests bypass static file serving and are answered from the database:
1. **nginx receives HEAD request**: `/b1674191a88ec5cdd733e4240a81803105dc412d6c6708d53ab94fc248f4f553.pdf`
2. **nginx forwards to FastCGI**: All HEAD requests go to ginxsom for metadata
3. **ginxsom extracts clean hash**: Strips `.pdf`, leaving `b1674191a88ec5cdd733e4240a81803105dc412d6c6708d53ab94fc248f4f553`
4. **ginxsom queries database**: `SELECT size, type FROM blobs WHERE sha256 = ?`
5. **ginxsom returns headers**: Content-Length, Content-Type, etc.
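
The steps above can be sketched in Python as follows. This is a hedged illustration only: `head_headers` is a hypothetical name, the status/header return shape is an assumption, and only the `SELECT` statement is taken from the text.

```python
import sqlite3


def head_headers(conn, request_path):
    """Answer a HEAD request from the database alone.

    Returns (status, headers). The return shape is an assumption for this sketch.
    """
    # Steps 1-3: drop the leading slash and anything from the first dot onward.
    sha256 = request_path.lstrip("/").split(".", 1)[0]
    # Step 4: one primary-key lookup; no filesystem access.
    row = conn.execute(
        "SELECT size, type FROM blobs WHERE sha256 = ?", (sha256,)
    ).fetchone()
    if row is None:
        return 404, {}
    size, mime_type = row
    # Step 5: headers come from stored metadata, not from the file on disk.
    return 200, {"Content-Length": str(size), "Content-Type": mime_type}
```

Since the extension is discarded before the query, `HEAD /<hash>` and `HEAD /<hash>.pdf` produce identical headers.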
### Why This Approach?
**Performance:**
- GET requests: nginx serves directly from disk (no database hit)
- HEAD requests: Single database query (no filesystem checking)
- Extension mismatches: Resolved by hash, so a wrong or missing extension in the URL never causes a miss

**Reliability:**
- Database is the authoritative source of blob existence
- Metadata lookups never depend on the on-disk filename or extension
- Consistent behavior regardless of URL format

**Flexibility:**
- Clients can use URLs with or without extensions
- Browser-friendly URLs with proper file extensions
- API-friendly clean hash URLs

**Scalability:**
- nginx handles the heavy lifting (file serving)
- FastCGI only processes metadata operations
- Metadata lookups are primary-key queries rather than filesystem probes
## Development
### Project Structure