
Vulpea v2.1: queries, fixes, and performance

Vulpea v2.1 adds graph diagnostics, a link querying API, extraction fixes, and non-blocking polling. A recap of everything since v2.0.0.

Vulpea v2.1.0 is the first feature release since the v2 rewrite. It covers a lot of ground: new query functions for exploring your note graph, fixes for edge cases in extraction, and a performance improvement that eliminates the last blocking path in the sync system.

This post covers everything that shipped since v2.0.0, including the v2.0.1 patch release.

#1Graph diagnostics

The biggest addition is a set of functions for auditing your note graph. These came out of my own need to clean up a 13k-note collection.

#2Dead links

    (vulpea-db-query-dead-links)
    ;; => ((#s(vulpea-note ...) . "nonexistent-id") ...)

Returns broken id: links - cases where a note links to an ID that doesn't exist in the database. Useful after bulk operations or migrations where notes may have been deleted without updating references.
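To make the definition concrete, here is a minimal sketch of the same check on plain data. It is not vulpea's implementation: `my/find-dead-links` is a hypothetical helper, and the `(SOURCE . DEST)` pairs are an assumed stand-in for what the database stores.

```elisp
(require 'seq)

(defun my/find-dead-links (known-ids links)
  "Return the LINKS whose destination is not in KNOWN-IDS.
LINKS is a list of (SOURCE . DEST) ID pairs (an assumed shape)."
  (let ((known (make-hash-table :test #'equal)))
    ;; Index the known IDs once so each lookup is cheap.
    (dolist (id known-ids)
      (puthash id t known))
    ;; Keep only the links that point at an unknown ID.
    (seq-remove (lambda (link) (gethash (cdr link) known)) links)))

(my/find-dead-links '("a" "b")
                    '(("a" . "b") ("b" . "gone")))
;; => (("b" . "gone"))
```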

#2Orphan and isolated notes

    ;; Notes nothing links to
    (vulpea-db-query-orphan-notes)

    ;; Notes with no connections at all (no incoming or outgoing links)
    (vulpea-db-query-isolated-notes)

Orphans have no incoming links. Isolated notes have no connections in either direction - they exist in the database but are completely disconnected from the graph. The distinction matters: an orphan might still link out to useful things, while an isolated note is truly floating.
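The difference is easy to see on a toy graph. The sketch below is illustrative only: `my/classify-notes` is a hypothetical helper working on bare ID strings and `(SOURCE . DEST)` pairs, not on vulpea's database. Under these definitions every isolated note is also an orphan; whether the real queries overlap the same way is an assumption.

```elisp
(require 'seq)

(defun my/classify-notes (ids links)
  "Classify note IDS given LINKS, a list of (SOURCE . DEST) ID pairs.
Return a plist with :orphans (no incoming links) and :isolated
(neither incoming nor outgoing links)."
  (let ((sources (mapcar #'car links))
        (dests (mapcar #'cdr links)))
    (list :orphans (seq-remove (lambda (id) (member id dests)) ids)
          :isolated (seq-remove (lambda (id)
                                  (or (member id dests)
                                      (member id sources)))
                                ids))))

;; "a" links to "b", "b" links to "c", "d" has no links at all.
(my/classify-notes '("a" "b" "c" "d")
                   '(("a" . "b") ("b" . "c")))
;; => (:orphans ("a" "d") :isolated ("d"))
```

Here "a" is an orphan that still links out, while "d" is isolated.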

#2Title collisions

    (vulpea-db-query-title-collisions)
    ;; => (("Wine" . (#s(vulpea-note ...) #s(vulpea-note ...))) ...)

    ;; File-level only
    (vulpea-db-query-title-collisions 0)

Finds notes sharing the same title. In a large collection, duplicates accumulate silently. This makes them visible.
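As a rough illustration of the idea, the following sketch groups notes by title and keeps only the titles that occur more than once. `my/title-collisions` is a hypothetical helper over `(ID . TITLE)` pairs; it compares titles exactly and case-sensitively, which may differ from what vulpea does.

```elisp
(defun my/title-collisions (notes)
  "Return groups of NOTES, a list of (ID . TITLE) pairs, that share a title.
Each group is (TITLE . IDS); titles used only once are dropped."
  (let ((table (make-hash-table :test #'equal))
        (result nil))
    ;; Collect the IDs seen for each title.
    (dolist (note notes)
      (push (car note) (gethash (cdr note) table)))
    ;; Keep only titles with at least two IDs.
    (maphash (lambda (title ids)
               (when (cdr ids)
                 (push (cons title (nreverse ids)) result)))
             table)
    result))

(my/title-collisions '(("id1" . "Wine") ("id2" . "Barolo") ("id3" . "Wine")))
;; => (("Wine" "id1" "id3"))
```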

#1Link querying API

Previously, vulpea stored links in the database but only exposed them indirectly through vulpea-db-query-by-links-some and vulpea-db-query-by-links-every, which find notes that link TO a target. There was no way to query links as objects.

Now there is:

    ;; All links
    (vulpea-db-query-links)
    ;; => ((:source "id1" :dest "id2" :type "id" :pos 100) ...)

    ;; Filter by type
    (vulpea-db-query-links-by-type "https")

    ;; Outgoing links from a note
    (vulpea-db-query-links-from "note-id")

    ;; Incoming links (backlinks) to a note
    (vulpea-db-query-links-to "note-id")

Each link is a plist with :source, :dest, :type, and :pos. The backlinks widget in vulpea-ui already uses this API, and it opens the door for link statistics, graph visualisation, and other tools built on top of the link data.
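As an example of the kind of statistic this enables, here is a small sketch that counts links per type. `my/count-links-by-type` is a hypothetical helper, not part of vulpea, and the sample plists are invented; only the four keys come from the description above.

```elisp
(defun my/count-links-by-type (links)
  "Count LINKS per :type.
LINKS is a list of plists with :source, :dest, :type and :pos.
Return an alist of (TYPE . COUNT), most frequent first."
  (let ((counts nil))
    (dolist (link links)
      (let* ((type (plist-get link :type))
             (cell (assoc type counts)))
        (if cell
            (setcdr cell (1+ (cdr cell)))
          (push (cons type 1) counts))))
    (sort counts (lambda (a b) (> (cdr a) (cdr b))))))

;; Invented sample data in the documented plist shape.
(my/count-links-by-type
 '((:source "id1" :dest "id2" :type "id" :pos 100)
   (:source "id1" :dest "example" :type "https" :pos 240)
   (:source "id3" :dest "id1" :type "id" :pos 12)))
;; => (("id" . 2) ("https" . 1))
```

With vulpea loaded, the same helper should accept the result of (vulpea-db-query-links) directly, assuming it returns plists in the shape shown above.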

#1Extraction fixes

Two bugs in the extraction pipeline that affected real-world usage:

#2Links in titles

agzam reported that notes with links in their titles were stored with raw org markup:

The Memory Illusion [[id:26868D41][book]] by Dr. [[id:julia_shaw_dr][Julia Shaw]]

This happened because org-element returns :raw-value for heading titles and raw strings for #+title: keywords. Neither strips bracket link syntax.

The fix applies org-link-display-format at extraction time, so titles are stored as clean display text. The links themselves are still extracted and persisted in the links table - you don't lose any graph connectivity.

This also applies to outline paths, so heading hierarchies display cleanly in selection interfaces.
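To see what org-link-display-format does on its own, you can run it on the title from the bug report. This only demonstrates the Org function; it is not vulpea's extraction code.

```elisp
(require 'org)

;; Each bracket link is replaced by its description.
(org-link-display-format
 "The Memory Illusion [[id:26868D41][book]] by Dr. [[id:julia_shaw_dr][Julia Shaw]]")
;; => "The Memory Illusion book by Dr. Julia Shaw"
```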

#2Case-insensitive property keys

Org-mode treats property names case-insensitively, but org-element returns them as-is. A file with :id: (lowercase) instead of :ID: would have its note silently skipped, because the downstream assoc lookups expected uppercase keys.

The fix is one line: upcase the property key at the extraction boundary. Since all properties flow through a single function, this handles every consumer at once.
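A minimal sketch of that kind of normalisation, assuming properties are held as an alist with string keys. `my/normalize-property` is a hypothetical name and the data shape is an assumption, not vulpea's actual code.

```elisp
(defun my/normalize-property (prop)
  "Return PROP, a (KEY . VALUE) pair, with KEY upcased."
  (cons (upcase (car prop)) (cdr prop)))

;; After normalisation, a lookup for "ID" finds the lowercase :id: too.
(let ((props (mapcar #'my/normalize-property
                     '(("id" . "abc-123") ("Category" . "wine")))))
  (cdr (assoc "ID" props)))
;; => "abc-123"
```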

#1Non-blocking polling

I covered the startup side of this in a previous post on startup performance. The same principle applies to polling.

When fswatch isn't available, vulpea falls back to polling - periodically scanning directories for changed files. Before this release, each poll cycle called directory-files-recursively synchronously, blocking the editor for ~700ms on my 13,800-file collection. Every two seconds.

The fix: replace the synchronous scan with an async fd (or find) subprocess. The same approach that fixed startup now fixes polling.

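| Scenario                      | Before | After               |
|-------------------------------|--------|---------------------|
| Blocking time per poll (fd)   | ~700ms | 0ms                 |
| Blocking time per poll (find) | ~700ms | 0ms                 |
| Total scan time (fd)          | ~700ms | ~50ms (background)  |
| Total scan time (find)        | ~700ms | ~900ms (background) |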

The key number is blocking time: zero in both cases. The scan runs in a subprocess, the comparison runs in the callback. Emacs never freezes.
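The general pattern looks roughly like the sketch below: start the scan with make-process and handle the file list in the sentinel. This is not vulpea's code. The `my/` names are hypothetical, and it uses find rather than fd because find is present on most Unix-like systems.

```elisp
(defun my/org-scan-sentinel (proc _event)
  "When PROC has finished, pass its output lines to the stored callback."
  (when (memq (process-status proc) '(exit signal))
    (let* ((buf (process-buffer proc))
           (files (with-current-buffer buf
                    (split-string (buffer-string) "\n" t))))
      (kill-buffer buf)
      (funcall (process-get proc 'callback) files))))

(defun my/async-list-org-files (dir callback)
  "List .org files under DIR in a subprocess.
CALLBACK is called with the list of paths once the scan finishes.
Returns the process object immediately, without blocking."
  (let ((proc (make-process
               :name "my-org-scan"
               ;; Output (stdout and stderr) is collected in a hidden buffer.
               :buffer (generate-new-buffer " *my-org-scan*")
               :noquery t
               :command (list "find" (expand-file-name dir)
                              "-type" "f" "-name" "*.org")
               :sentinel #'my/org-scan-sentinel)))
    (process-put proc 'callback callback)
    proc))

;; Usage: "~/org" is a placeholder directory.
(my/async-list-org-files
 "~/org"
 (lambda (files) (message "Found %d org files" (length files))))
```

The call returns as soon as the subprocess starts. Comparing the result against the previous scan would happen inside the callback.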

That said, the difference between fd and find is significant. If you use polling mode, install fd:

    brew install fd       # macOS
    apt install fd-find   # Debian/Ubuntu
    pacman -S fd          # Arch

And if you sync notes via git or cloud tools, install fswatch too - it detects changes instantly with zero overhead, no scanning needed.

#1Other improvements from v2.0.1

The v2.0.1 patch release included several fixes worth mentioning:

  • Heading-level metadata was not being persisted to the database due to an org-element-map scoping issue. Fixed.
  • Heading link extraction was including links from child headlines. A heading-level note would incorrectly inherit links from its nested sub-headings.
  • Links in non-note subtrees (headings without an ID) were lost entirely. They're now attributed to the nearest ancestor note.
  • Attachment query (vulpea-db-query-attachments-by-path) was added for efficiently querying attachment destinations with a single SQL query.
  • Async startup was reworked to be non-blocking (the 171x improvement - 1371ms down to 8ms).

#1Upgrading

Delete your database file and rebuild:

    (vulpea-db-close)
    (delete-file vulpea-db-location)
    (vulpea-db-sync-full-scan)

The schema hasn't changed, but the extraction fixes (title cleanup, property key normalisation) mean existing data may be stale.

If you're coming from org-roam or vulpea v1, see the migration guide.

#1What's next

Vulpea is the core - it provides the data layer and query API. Many of the features mentioned here will eventually have corresponding UI in vulpea-ui. The planned work spans both projects:

vulpea (core):

  • Unlinked mentions - find notes whose title appears in other notes' text without an id: link
  • Schema validation - define expectations for notes (required metadata, properties) and validate against them
  • Note renaming with link updating - rename a note's title and update all incoming link descriptions across the knowledge base

vulpea-ui:

  • Diagnostics view - surface dead links, orphans, and title collisions in a browsable UI
  • Unlinked mentions widget - show potential links in the sidebar
  • Collection view - table/card mode for browsing notes with saved queries and filtering
  • Semantic backlink previews - show surrounding context for each backlink, not just the note title

#1Thanks

This release was shaped by community feedback. Thanks to agzam for the detailed bug report on #220 that led to the title and property key fixes, Mugu-Mugu for catching the heading-level metadata bug (#200), ventruvian for reporting the subtree link scoping issue (#202), and John Wiegley for both reporting and fixing missing buffer-meta functions (#192).

And to everyone using vulpea - thank you. Knowing people rely on this for their daily workflows is what keeps the project moving.

If you have ideas or run into issues, open an issue.

The code is at github.com/d12frosted/vulpea. Available on MELPA.