Hadto note
Autoresearch for ontologies needs a field crew
A Hadto prototype idea: score the distance between a base ontology and a proposed expansion, then use owner interviews to reduce uncertainty before the model changes.
A model proposes a clean expansion to a service-business ontology.
Add WarrantyCallbackDelay. Connect it to dispatch, technician assignment, customer communication, and unpaid rework. Give it a few example cases. The name sounds right. The shape looks useful. A dashboard could count it by branch, crew, and customer.
The owner hesitates.
Some callbacks are workmanship failures. Some are manufacturer warranty issues. Some are access problems where the customer was not home. Some are sales promises that never should have been made. Some are goodwill visits that protect the account even when nobody is technically at fault.
The proposed class has found a real pressure point. It has also blurred the distinctions that decide money, accountability, and customer trust.
Hadto already knows AI can propose ontology candidates. The harder question is, “How far is this proposed expansion from the base ontology, and which field question would reduce that distance fastest?”
Asking that question carries Karpathy’s autoresearch into a different domain.
Autoresearch works because the loop is tight. An agent changes training code, runs a fixed experiment, reads the score, keeps the change if it improves val_bpb, and rolls back if it does not. The metric is narrow. It still gives the loop a spine. Ontology work needs a spine too: reducing uncertainty without breaking the commitments the business already runs on.
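The keep-or-roll-back discipline described above can be sketched in a few lines. This is a toy illustration and not Karpathy's code: `keep_or_rollback`, `propose`, and `score` are hypothetical names, the absolute-difference metric stands in for val_bpb, and the sketch assumes a lower score is better.

```python
import random
from typing import Callable


def keep_or_rollback(
    state: float,
    propose: Callable[[float], float],
    score: Callable[[float], float],
    steps: int,
) -> float:
    """Greedy loop: keep a proposed change only when the score improves."""
    best = score(state)
    for _ in range(steps):
        candidate = propose(state)
        candidate_score = score(candidate)
        if candidate_score < best:  # assumption: lower is better
            state, best = candidate, candidate_score
        # Otherwise roll back, which here just means dropping the candidate.
    return state


# Toy stand-in for "change the code, run the fixed experiment, read the score".
rng = random.Random(0)
start, target = 10.0, 3.0
final = keep_or_rollback(
    state=start,
    propose=lambda x: x + rng.uniform(-1.0, 1.0),
    score=lambda x: abs(x - target),
    steps=200,
)
```

Because a change is kept only when the score improves, `final` can never score worse than `start`. That guarantee is the "spine" a narrow metric gives the loop.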
Distance before acceptance
A proposed expansion should enter a distance-scoring lane before it reaches the accepted ontology. The score should ask several kinds of questions at once.
Structural distance: how many new classes, relations, roles, and constraints does the candidate add? Does it fit under an existing distinction, or is it trying to create a new operating kind?
Logical distance: does the candidate stay consistent with the base model? Does it inherit rules that would make ordinary cases impossible or route authority to the wrong person?
Competency-question distance: which existing business questions can now be answered better? Which questions become ambiguous? Which new questions must be answered before the candidate is useful?
Evidence distance: what proof supports the candidate? Did it come from one owner’s memory, repeated job notes, payer rules, customer complaints, dispatch records, photos, interviews, or only a model cluster?
Manifold distance: do the examples sit near known cases, or do they occupy a region the current ontology does not explain well? Goodfire’s recent work on steering along manifolds points at a useful habit for Hadto: curved structure can make the safe move different from the obvious straight edit through one label.
Transformation distance: can the candidate be reached from the base ontology by a small, explainable transformation, or does it require changing what the base model treats as evidence, authority, responsibility, or closure? Christandl’s tensor-resource framing is useful here as an analogy because it asks what transformations are possible while preserving the resource constraints that matter.
These scores do not need to be final truth. They need to create a ranked uncertainty list.
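One way to hold the six distances and turn them into a ranked uncertainty list is sketched below. The class name, the 0-to-1 scale, and the example numbers are all assumptions made for illustration. The note does not specify how any of the distances would actually be computed.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple


@dataclass(frozen=True)
class DistanceScores:
    """Hypothetical container: one float per distance, higher = more uncertain."""
    structural: float
    logical: float
    competency_question: float
    evidence: float
    manifold: float
    transformation: float

    def ranked_uncertainty(self) -> List[Tuple[str, float]]:
        """Return (dimension, score) pairs, largest uncertainty first."""
        pairs = [(f.name, getattr(self, f.name)) for f in fields(self)]
        return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Illustrative numbers only, not measurements.
candidate = DistanceScores(
    structural=0.2,
    logical=0.7,
    competency_question=0.5,
    evidence=0.9,
    manifold=0.4,
    transformation=0.6,
)
ranking = candidate.ranked_uncertainty()
```

The ranking, not the individual numbers, is what the rest of the loop consumes: the top entry names the dimension where a field question is most likely to pay off.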
The next experiment is a field question
Karpathy’s loop runs another training experiment. The ontology loop should often run another field question.
For WarrantyCallbackDelay, the system might ask who decides whether a revisit is warranty, goodwill, manufacturer defect, customer access, or sales correction. It might ask which answer changes billing, which answer changes technician accountability, which answer changes the customer promise, and what evidence closes the case without asking the owner to remember the story.
The distance score should generate those questions. If logical distance is high, ask about rules and exceptions. If evidence distance is high, ask for records and examples. If competency-question distance is high, ask what decision the business still cannot make. If transformation distance is high, ask whether the proposed category is really one category or several smaller ones.
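The routing in the paragraph above can be written as a small lookup. This is a sketch under assumptions: the threshold value, the function name, and the exact question wording are hypothetical. Only the four dimensions the paragraph names are routed, so structural and manifold distance are deliberately left without a question here.

```python
from typing import Dict, List

# Question focus per dimension, paraphrasing the paragraph above.
QUESTION_FOCUS: Dict[str, str] = {
    "logical": "Which rules and exceptions apply to this case?",
    "evidence": "Which records and examples show this happening?",
    "competency_question": "Which decision can the business still not make?",
    "transformation": "Is this really one category, or several smaller ones?",
}


def next_field_questions(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return questions for routed dimensions at or above the threshold,
    ordered from the largest distance to the smallest."""
    high = [
        (name, value)
        for name, value in scores.items()
        if value >= threshold and name in QUESTION_FOCUS
    ]
    high.sort(key=lambda pair: pair[1], reverse=True)
    return [QUESTION_FOCUS[name] for name, _ in high]


# Illustrative scores only.
example_scores = {
    "structural": 0.9,           # high, but no routed question in this sketch
    "logical": 0.8,
    "evidence": 0.6,
    "competency_question": 0.2,  # below threshold, so no question
    "transformation": 0.55,
}
questions = next_field_questions(example_scores)
```

With these example scores the owner would be asked about rules and exceptions first, then records, then whether the category should be split.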
Simulated owner answers can help test the prototype. They are scaffolding. The real loop needs human field work. Someone has to ask the owner, dispatcher, technician, office manager, and apprentice how the work is actually distinguished when money, blame, trust, and scheduling are on the line.
Then the system rescores the candidate.
The result might be acceptance: WarrantyCallbackDelay is a real operating kind with stable evidence and a review path.
It might be revision: split the candidate into workmanship callback, manufacturer warranty revisit, customer-access delay, and goodwill retention visit.
It might be rejection: the label was only a report shortcut and should not become shared ontology.
Each outcome is progress if the business learned something reviewable.
The geometry is only useful if it reaches the truck
Frank Nielsen’s overview of information geometry is a reminder that distance can be more than a flat spreadsheet difference. Families of models can have shape. Paths can matter. Local moves can be cheap in one direction and expensive in another.
For Hadto, the math is a design provocation rather than decoration.
Service owners do not need a lecture about manifolds. They need a system that notices when a proposed edit is close to accepted business memory in name only. If a new class would change who owns the next action, which record counts as evidence, or whether a customer can be billed, the distance is operationally large even when the label sounds familiar.
That distinction belongs in the product. A bare score with no explanation would be weak. Instead, the owner should receive the next useful question, and the apprentice should inherit the accepted distinction, the rejected alternatives, the evidence trail, and the reason the business chose one category over another.
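What the apprentice inherits could be stored as a small record like the one below. The field names and the `is_reviewable` check are assumptions for illustration. The category names come from the WarrantyCallbackDelay example in this note, while the evidence entries are placeholders, not real records.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DistinctionRecord:
    """Hypothetical handoff record for one ontology decision."""
    accepted: Tuple[str, ...]
    rejected_alternatives: Tuple[str, ...]
    evidence_trail: Tuple[str, ...]
    reason: str

    def is_reviewable(self) -> bool:
        """Reviewable only if it cites some evidence and states a reason."""
        return bool(self.evidence_trail) and bool(self.reason.strip())


record = DistinctionRecord(
    accepted=(
        "workmanship callback",
        "manufacturer warranty revisit",
        "customer-access delay",
        "goodwill retention visit",
    ),
    rejected_alternatives=("WarrantyCallbackDelay as a single class",),
    evidence_trail=("<dispatch record>", "<owner interview>"),  # placeholders
    reason="A single class blurred who pays and who is accountable.",
)
```

Making the record immutable is a design choice in this sketch: a later change of mind would produce a new record and leave the reasoning trail intact.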
Hadto’s loop
The prototype shape is simple enough to build:
- Propose an ontology expansion over the base model.
- Score structural, logical, competency-question, evidence, manifold, and transformation distance.
- Generate the smallest set of field questions likely to reduce the largest uncertainty.
- Collect or simulate owner/operator answers.
- Rescore the candidate.
- Accept, revise, or reject the expansion.
The loop should keep its base commitments visible at every step. A candidate cannot win by sounding plausible. It has to lower uncertainty without damaging the operating contract the current business already trusts.
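The whole loop, including the rule that a candidate cannot win at the expense of base commitments, might look like the sketch below. Every name, threshold, and callback is hypothetical. The sketch also simplifies the outcomes: it rejects whenever a base commitment breaks, and it returns "revise" whenever uncertainty is still high after the allowed rounds.

```python
from enum import Enum
from typing import Callable, Dict, List, Tuple

Scores = Dict[str, float]


class Outcome(Enum):
    ACCEPT = "accept"
    REVISE = "revise"
    REJECT = "reject"


def run_distance_loop(
    scores: Scores,
    field_round: Callable[[str, Scores], Scores],  # ask about one dimension, rescore
    commitments_hold: Callable[[Scores], bool],    # base-commitment check
    accept_below: float = 0.3,
    max_rounds: int = 3,
) -> Tuple[Outcome, Scores, List[str]]:
    asked: List[str] = []
    for _ in range(max_rounds + 1):
        if not commitments_hold(scores):
            return Outcome.REJECT, scores, asked
        worst = max(scores, key=lambda name: scores[name])
        if scores[worst] < accept_below:
            return Outcome.ACCEPT, scores, asked
        if len(asked) == max_rounds:
            break  # out of field rounds with uncertainty still high
        asked.append(worst)
        scores = field_round(worst, scores)
    return Outcome.REVISE, scores, asked


# Simulated field work: each answered question halves that dimension's distance.
def halve(dimension: str, current: Scores) -> Scores:
    return {**current, dimension: current[dimension] / 2}


outcome, final_scores, asked = run_distance_loop(
    scores={"logical": 0.5, "evidence": 0.8},
    field_round=halve,
    commitments_hold=lambda s: True,
)
```

In this toy run the loop asks about evidence, then logic, then evidence again before accepting. A real `field_round` could just as easily raise a score when an answer exposes a gap, and the loop would keep asking.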
Here the autoresearch analogy earns its place.
The machine is allowed to keep proposing. The score is allowed to get worse when new evidence exposes a gap. The owner is allowed to answer the question that the model could not settle. The ontology is allowed to change only after the field answer has somewhere durable to live.
Hadto’s thesis says owner/operator knowledge should become infrastructure instead of private founder memory.
An ontology-distance loop is one way to make that thesis testable. It turns a proposed model edit into a measurable uncertainty, turns that uncertainty into field work, and turns the field answer into business memory another operator can inherit.
Autoresearch for ontologies is a business learning which distinctions are real enough to carry work.
Source evidence used in this note: public source materials reviewed for the prototype framing, including Andrej Karpathy’s autoresearch repository, Karpathy’s autoresearch program loop, Frank Nielsen’s The Many Faces of Information Geometry, Goodfire’s public research listing for Steering Along Manifolds to Control Neural Networks, and Matthias Christandl’s The tensor as an informational resource.