Nodal Points Digest #2: LLM personas, circles as esoteric interfaces, and the anxiety of writing
A few things from last week that shaped how I’m seeing. Some are projects or experiments I’m building, some are things I read, some are just collisions between ideas I haven’t settled yet: my first attempt at LLM persona steering; research on circular interfaces since medieval times; an interface that puts Bret Victor’s 391 references on one page; Midjourney experiments; random books I picked up.
🧭 Steering LLM personality with vectors, not prompts
I’ve been spending some time with representation engineering, a sub-field of mechanistic interpretability that asks: what if you could find “directions” inside an LLM’s internal activations that correspond to personality or mood, and then nudge the model along those directions at inference time? Without fine-tuning or re-training, that is.
The foundational idea comes from the Representation Engineering paper (Zou et al., 2023), which showed that concepts like honesty, fairness, and harmlessness correspond to linear directions you can extract and manipulate. Theia Vogel made this very accessible with the repeng library and a wonderful blog post that walks through training vectors for things like happiness and creativity. More recently, Lu et al.’s The Assistant Axis mapped out a full “persona space” using 275 character archetypes, and found that the main dimension separating different personas captures a spectrum from “assistant-like” to “role-playing.”
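To make the core move concrete, here’s a toy sketch of the difference-of-means version of the idea: collect hidden activations for contrastive prompt pairs (e.g. “happy” vs. “sad” completions), take the mean difference as the control vector, and add a scaled copy of it to the hidden state at inference. Everything here is a stand-in I made up for illustration: `fake_activation` and `happy_direction` simulate what you’d actually get by hooking a transformer layer, which is what repeng does for real models (it also supports a PCA-based extraction, not shown).

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy hidden size; real models are thousands of dimensions

# A hypothetical "happy" direction baked into our fake activations.
happy_direction = np.zeros(d)
happy_direction[0] = 1.0

def fake_activation(happy: bool) -> np.ndarray:
    """Stand-in for a layer's hidden state on a happy/sad prompt."""
    return rng.normal(size=d) + (2.0 if happy else -2.0) * happy_direction

# 1. Collect activations from contrastive prompt pairs.
pos = np.stack([fake_activation(True) for _ in range(50)])
neg = np.stack([fake_activation(False) for _ in range(50)])

# 2. Control vector = difference of the two means, normalized.
control = pos.mean(axis=0) - neg.mean(axis=0)
control /= np.linalg.norm(control)

# 3. At inference, nudge a hidden state along the direction.
h = fake_activation(happy=False)
alpha = 3.0  # steering strength; in practice you sweep this
h_steered = h + alpha * control

# The steered state scores higher along the happy direction.
print(h @ happy_direction, h_steered @ happy_direction)
```

The appeal over prompting is that `alpha` gives you a continuous dial (including negative values to suppress a trait), and the vector is computed once rather than spent on context tokens every call.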

