VLDB 2027 Systems Paper (In Preparation)

Target: PVLDB Vol. 20, Industry Track (rolling deadlines starting ~April 2026)

Working Title: Samyama: A Unified Graph-Vector Database with Cost-Based Query Planning and Late Materialization

Status: In preparation. Extends Paper 1 (arxiv:2603.08036) with deeper evaluation and VLDB-specific focus.

Strategy

The arXiv preprint covers the whole system broadly (11 pages). For VLDB, we narrow the scope to three core database contributions and evaluate each in more depth:

  1. Late materialization for property graphs (NodeRef/EdgeRef) — 4x traversal speedup, 60% memory reduction
  2. Graph-native cost-based planner (ADR-015) — triple-level statistics, plan enumeration, direction reversal, ExpandInto
  3. Hybrid CSR adjacency — frozen tier + write buffer, two-phase bulk loading, 1B edges in 3h46m for $2.50
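
To make contribution 3 concrete, the frozen-tier-plus-write-buffer idea can be sketched roughly as below. This is a minimal illustration of the general pattern, not Samyama's implementation: the names `FrozenCsr`, `HybridAdjacency`, `add_edge`, `compact`, and `buffered_edges` are hypothetical, node IDs are assumed to be dense `u32` values below `num_nodes`, and the two-pass build is only a stand-in for the two-phase bulk load mentioned above.

```rust
use std::collections::BTreeMap;

/// Read-only CSR tier: the neighbors of node `v` are
/// `targets[offsets[v]..offsets[v + 1]]`.
#[derive(Debug, Default)]
pub struct FrozenCsr {
    offsets: Vec<usize>,
    targets: Vec<u32>,
}

impl FrozenCsr {
    /// Two-pass build: pass 1 counts out-degrees, pass 2 scatters targets.
    /// Assumption: every source ID in `edges` is < `num_nodes`.
    pub fn build(num_nodes: usize, edges: &[(u32, u32)]) -> Self {
        let mut offsets = vec![0usize; num_nodes + 1];
        for &(src, _) in edges {
            offsets[src as usize + 1] += 1;
        }
        // Prefix sum turns per-node degrees into start offsets.
        for i in 0..num_nodes {
            offsets[i + 1] += offsets[i];
        }
        let mut cursor = offsets.clone();
        let mut targets = vec![0u32; edges.len()];
        for &(src, dst) in edges {
            let slot = &mut cursor[src as usize];
            targets[*slot] = dst;
            *slot += 1;
        }
        FrozenCsr { offsets, targets }
    }

    pub fn neighbors(&self, v: u32) -> &[u32] {
        let v = v as usize;
        if v + 1 >= self.offsets.len() {
            return &[];
        }
        &self.targets[self.offsets[v]..self.offsets[v + 1]]
    }
}

/// Frozen CSR tier plus a mutable write buffer for recent inserts.
pub struct HybridAdjacency {
    frozen: FrozenCsr,
    buffer: BTreeMap<u32, Vec<u32>>,
}

impl HybridAdjacency {
    pub fn new(frozen: FrozenCsr) -> Self {
        HybridAdjacency { frozen, buffer: BTreeMap::new() }
    }

    /// Writes never touch the frozen arrays; they land in the buffer.
    pub fn add_edge(&mut self, src: u32, dst: u32) {
        self.buffer.entry(src).or_default().push(dst);
    }

    /// Reads see the frozen neighbors first, then the buffered ones.
    pub fn neighbors(&self, v: u32) -> impl Iterator<Item = u32> + '_ {
        let buffered: &[u32] = match self.buffer.get(&v) {
            Some(list) => list.as_slice(),
            None => &[],
        };
        self.frozen.neighbors(v).iter().chain(buffered.iter()).copied()
    }

    pub fn buffered_edges(&self) -> usize {
        self.buffer.values().map(Vec::len).sum()
    }

    /// Folds the buffer into a fresh frozen tier and clears it.
    /// Assumption: all buffered source IDs are < `num_nodes`.
    pub fn compact(&mut self, num_nodes: usize) {
        let mut edges = Vec::new();
        for v in 0..num_nodes as u32 {
            for dst in self.neighbors(v) {
                edges.push((v, dst));
            }
        }
        self.frozen = FrozenCsr::build(num_nodes, &edges);
        self.buffer.clear();
    }
}
```

In this sketch, traversals read contiguous slices from the frozen tier and pay for a map lookup only on nodes with pending writes. `compact` trades a full rebuild for restoring the all-contiguous layout. A real system would likely compact incrementally or in the background.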

De-emphasize: GPU acceleration, agentic enrichment, optimization solvers, RDF (mention as system features, don’t evaluate deeply).

Add: LDBC SNB comparison, billion-edge evaluation, cross-KG federation at scale.

Current State

  • LaTeX: research/vldb/samyama-vldb.tex (62KB, most developed draft)
  • Plan: research/vldb/PLAN.md
  • Formatting: research/vldb/FORMATTING-TODO.md

Key New Results (since arxiv v2)

| Result | Impact |
|---|---|
| 1B-edge biomedical trifecta (74M nodes, $2.50 spot) | Scale credibility |
| WCO TrieJoin for cyclic patterns | Algorithmic contribution |
| Edge stub memory optimization (24GB savings at 1B edges) | Engineering contribution |
| 6-KG federation (biomedical + public health, 305K nodes, 40/40) | Federation story |
| Rayon parallel algorithms (PageRank, LCC, CDLP) | Parallelism |