Developer API

Graph Model

The KGL graph model is built through a three-stage pipeline: file loading, link resolution, and graph construction. This document covers the first two stages in detail.

Overview

The pipeline transforms a collection of .kgl files on disk into an in-memory representation suitable for querying and analysis:

pipeline
.kgl files on disk
        ↓
   Loader (Step 3a)
        ↓
  LoadResult (dict of ParseResults by path)
        ↓
   Resolver (Step 3b)
        ↓
  ResolvedGraph (nodes + edges + unresolved)
        ↓
   Graph (Step 3c)
        ↓
  GraphQuery (traversal API)

Step 3a — File Loader

Purpose

Walk a directory tree, find all .kgl files (excluding _schema.kgl), parse each one, and return a structured collection of parse results.
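The discovery step described above can be sketched with the standard library. This is a minimal illustration, not the Loader's actual code; `find_kgl_files` is a hypothetical helper name, and sorting the result is an assumption made here to keep the output order stable.

```python
import os


def find_kgl_files(root: str) -> list[str]:
    """Return absolute paths of all .kgl files under root, skipping _schema.kgl."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            # Only .kgl files are loaded; the schema file is excluded by name.
            if filename.endswith(".kgl") and filename != "_schema.kgl":
                found.append(os.path.abspath(os.path.join(dirpath, filename)))
    # Sorting is an assumption of this sketch, not documented Loader behaviour.
    return sorted(found)
```

Each returned path would then be read and parsed to fill LoadResult.files.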

API

python
from kgl.loader import Loader, LoadResult

result = Loader.load("./")

LoadResult

python
from dataclasses import dataclass

@dataclass
class LoadResult:
    files: dict[str, ParseResult]   # normalised absolute path → ParseResult
    errors: list[LoadError]         # files that could not be read

files

A dictionary mapping normalised absolute paths to ParseResult objects. Paths are made absolute with os.path.abspath(), so each file has exactly one unambiguous key for downstream resolution.

All absolute paths use forward slashes internally; Windows paths are converted to this form, which keeps the keys consistent across platforms.
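As a rough sketch of the normalisation described above, the helper below makes a path absolute and then switches to forward slashes. `normalise_path` is a hypothetical name, and the plain string replacement is an assumption about how the conversion might be done.

```python
import os


def normalise_path(path: str) -> str:
    """Return an absolute path that uses forward slashes (hypothetical helper)."""
    absolute = os.path.abspath(path)
    # Assumption: a plain replacement is enough. On POSIX this would also
    # rewrite a literal backslash inside a file name, which is rare.
    return absolute.replace("\\", "/")
```

Because os.path.abspath() also collapses segments such as ".." and ".", two different spellings of the same location produce the same key.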

errors

A list of LoadError objects for files that could not be opened (permissions, encoding, etc.). Parse errors within a file are recorded in the ParseResult itself, not here.
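One way to picture the split between load errors and parse errors is a read step that only reports I/O and decoding failures. The sketch below is illustrative: `LoadErrorSketch`, its fields, and the UTF-8 encoding are assumptions, since the real LoadError type is not described here.

```python
from dataclasses import dataclass


@dataclass
class LoadErrorSketch:
    """Hypothetical stand-in for LoadError; the field names are assumptions."""
    path: str
    message: str


def read_source(path: str):
    """Return (text, None) on success or (None, LoadErrorSketch) on failure."""
    try:
        # Assumption: .kgl files are UTF-8 encoded.
        with open(path, encoding="utf-8") as handle:
            return handle.read(), None
    except (OSError, UnicodeDecodeError) as exc:
        # Missing files, permission problems, and bad encodings all land here.
        return None, LoadErrorSketch(path=path, message=str(exc))
```

Anything that reads successfully would go on to the parser, and problems found there would stay inside the ParseResult.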

Behaviour

Example

python
from kgl.loader import Loader

result = Loader.load("./")

# result.files:
# {
#   "/home/user/project/people/alice.kgl": ParseResult(...),
#   "/home/user/project/people/diana.kgl": ParseResult(...),
#   "/home/user/project/projects/search.kgl": ParseResult(...),
#   ...
# }
#
# result.errors: []

Step 3b — Resolver

Purpose

Take a LoadResult and resolve all RawLink targets into concrete (source_node, target_node) pairs. Produces the full edge list and records any links whose targets could not be found.

API

python
from kgl.resolver import Resolver, ResolvedGraph

rg = Resolver.resolve(load_result)

ResolvedGraph

python
from dataclasses import dataclass

@dataclass
class ResolvedGraph:
    nodes: dict[str, Node]          # node.id → Node
    edges: list[Edge]               # all resolved edges
    unresolved: list[str]           # link strings that could not be resolved

nodes

A dictionary of all nodes, keyed by their globally unique ID: "{normalised_file}#{name}". The ID format ensures that nodes with the same name in different files have different IDs, and that IDs are deterministic across runs.

edges

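A list of all successfully resolved edges. Each edge has an id, a source node, a target node, rel_types, a weight, and properties, as shown in the example later in this section.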

unresolved

A list of link strings (e.g. "people/bob.kgl", "projects/search.kgl#Missing") that could not be resolved. These represent broken links or missing targets.
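When reporting broken links, it can help to split each link string into its file part and its optional fragment. The helper below is a small sketch; `split_link` is a hypothetical name and is not part of the documented API.

```python
from typing import Optional


def split_link(link: str) -> tuple[str, Optional[str]]:
    """Split 'file.kgl#Node Name' into (file, fragment); fragment may be None."""
    target_file, separator, fragment = link.partition("#")
    # No '#' in the string means the link did not name a specific node.
    return target_file, (fragment if separator else None)
```

For the two examples above, this yields a file with no fragment for "people/bob.kgl" and the fragment "Missing" for "projects/search.kgl#Missing".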

Resolution algorithm

Pass 1 — Node Registry

For every RawNode in every ParseResult, construct a Node and register it by ID "{normalised_file}#{name}":

python
node = Node(
    id=f"{norm_path}#{raw_node.name}",
    source_file=norm_path,
    types=raw_node.types,
    name=raw_node.name,
    fields=raw_node.fields,
    tags=raw_node.tags,
    body=raw_node.body,
)
resolved.nodes[node.id] = node

Pass 1 completes before any link resolution, so circular links are safe — the entire graph is indexed before traversal begins.
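The registry pass can be sketched with plain dictionaries. In this illustration the parsed input is reduced to a mapping from file path to node names, and `build_registry` and `first_in_file` are hypothetical names; the second index is an assumption about how "first node in a file" lookups might be supported.

```python
def build_registry(parsed: dict[str, list[str]]):
    """Index every node by ID before any link is examined.

    parsed maps a normalised file path to its node names in file order
    (a simplified stand-in for the real ParseResult objects).
    """
    nodes: dict[str, dict] = {}
    first_in_file: dict[str, str] = {}
    for norm_path, names in parsed.items():
        for name in names:
            node_id = f"{norm_path}#{name}"
            nodes[node_id] = {"id": node_id, "source_file": norm_path, "name": name}
            # Remember only the first node seen for each file.
            first_in_file.setdefault(norm_path, node_id)
    return nodes, first_in_file
```

Since both indexes are complete before Pass 2 starts, a link from A to B and a link from B to A can each be looked up without any ordering concerns.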

Pass 2 — Link Resolution

For every RawLink in every RawNode:

  1. Resolve the target file path relative to the source file's directory:
    python
    import os
    resolved_target_path = os.path.normpath(
        os.path.join(os.path.dirname(source_file), target_file)
    )
  2. Resolve the target node:
    • If target_node is set (fragment anchor), look up "{resolved_target_path}#{target_node}" in the registry.
    • If target_node is not set, take the first node registered for resolved_target_path.
  3. Create an edge if the target is found, or record the link string in unresolved if not.
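The three steps above can be sketched as a single lookup function. This is an illustration under assumptions: `resolve_target` is a hypothetical name, the registry is modelled as two plain dictionaries, and posixpath is used in place of os.path because paths are stored with forward slashes.

```python
import posixpath
from typing import Optional


def resolve_target(
    source_file: str,
    target_file: str,
    target_node: Optional[str],
    nodes: dict[str, object],
    first_in_file: dict[str, str],
) -> Optional[str]:
    """Return the ID of the link's target node, or None if it is unresolved."""
    # Step 1: resolve the target path relative to the source file's directory.
    resolved_path = posixpath.normpath(
        posixpath.join(posixpath.dirname(source_file), target_file)
    )
    # Step 2a: an explicit fragment must match a registered node ID exactly.
    if target_node is not None:
        node_id = f"{resolved_path}#{target_node}"
        return node_id if node_id in nodes else None
    # Step 2b: without a fragment, fall back to the first node in that file.
    return first_in_file.get(resolved_path)
```

A caller would create an edge when an ID comes back and append the original link string to unresolved when the result is None.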

Edge ID

Edge IDs are deterministic: each one is the first 12 hexadecimal characters of a SHA-1 hash over the source ID, the relationship types, and the target ID:

python
import hashlib

edge_id = hashlib.sha1(
    f"{source_node.id}|{rel_types_str}|{target_node.id}".encode()
).hexdigest()[:12]

The same (source, rel_types, target) triple always produces the same edge ID, regardless of the run or the order in which files were loaded.
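A self-contained version of the hashing step might look like the sketch below. How rel_types_str is built is not specified here, so joining the sorted relationship types with commas is an assumption, and `make_edge_id` is a hypothetical name.

```python
import hashlib


def make_edge_id(source_id: str, rel_types: list[str], target_id: str) -> str:
    """Return a 12-character hex ID derived from the edge's endpoints and types."""
    # Assumption: relationship types are sorted and comma-joined so that
    # their order in the source file does not change the ID.
    rel_types_str = ",".join(sorted(rel_types))
    payload = f"{source_id}|{rel_types_str}|{target_id}"
    return hashlib.sha1(payload.encode()).hexdigest()[:12]
```

Swapping source and target changes the hashed string, so the reverse edge gets its own ID.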

Example

Given this .kgl file structure:

input files
people/alice.kgl:
  @Person Alice Nguyen
  [mentors] → people/diana.kgl#Diana Park
      started: 2023-06

people/diana.kgl:
  @Person Diana Park
  → alice.kgl

projects/search.kgl:
  @Project Search Revamp
  [depends-on] → infrastructure.kgl

After resolution, ResolvedGraph contains:

python — result
ResolvedGraph(
    nodes={
        "/abs/people/alice.kgl#Alice Nguyen": Node(...),
        "/abs/people/diana.kgl#Diana Park": Node(...),
        "/abs/projects/search.kgl#Search Revamp": Node(...),
    },
    edges=[
        Edge(
            id="...",
            source=Node(name="Alice Nguyen", ...),
            target=Node(name="Diana Park", ...),
            rel_types=["mentors"],
            weight="hard",
            properties={"started": "2023-06"},
        ),
        Edge(
            id="...",
            source=Node(name="Diana Park", ...),
            target=Node(name="Alice Nguyen", ...),
            rel_types=[],
            weight="hard",
            properties={},
        ),
    ],
    # infrastructure.kgl is not among the input files, so the
    # [depends-on] link from Search Revamp cannot be resolved.
    unresolved=["infrastructure.kgl"],
)
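
The table below summarises how individual links resolve:

| Link                          | Source           | Fragment   | Resolved to                  | Notes                                  |
|-------------------------------|------------------|------------|------------------------------|----------------------------------------|
| → people/diana.kgl#Diana Park | people/alice.kgl | Diana Park | Node ...diana.kgl#Diana Park | Explicit fragment                      |
| → people.kgl                  | people/alice.kgl | (none)     | First node in people.kgl     | No fragment; first wins                |
| → ../projects/search.kgl      | people/alice.kgl | (none)     | First node in resolved path  | Relative path handled correctly        |
| → nonexistent.kgl             | people/alice.kgl | (none)     | (none)                       | Recorded in unresolved                 |
| → people/diana.kgl#Missing    | people/alice.kgl | Missing    | (none)                       | Fragment doesn't exist; in unresolved  |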

Next steps

Step 3c wraps ResolvedGraph in a Graph container and provides GraphQuery for efficient traversal and lookup. The Graph object is the entry point for the OpenCypher query engine.