The ontology learned when the proof got better

A proof note from the OntoGPT-to-Hadto pass: schema-bound extraction counted only after the evaluator showed better home-services ontology behavior.

Why this matters

This post shows how handoff discipline and customer-facing work turn private founder skill into something the business can keep using.

Why this note is here

Evidence: Adds facts or examples behind an existing point.

What supports it: Uses evidence, definitions, and cause-and-effect.

Model suggestions improve the business only when evidence gates make the ontology better.

ontology research, home services, owner operators, hadto

David’s first pointer was arXiv 2412.00608. It was useful, but it was not the target. He redirected the work to Monarch OntoGPT, and that changed the build.

The first paper points toward LLM-assisted ontology extraction and knowledge graph generation with user control. The OntoGPT correction made the lesson sharper. OntoGPT and its SPIRES method ask for a schema and source text, then produce structured extraction shaped by that schema. The SPIRES paper also keeps grounding in view: extracted terms should connect back to existing identifiers and allowed vocabularies where possible.

That distinction matters: a schema-bound, grounded method blocks free-form LLM ontology invention.

Hadto’s useful pattern was schema plus source text, then grounded structured extraction. The machine can propose a service-call fact, an invoice fact, or a dispatch exception candidate. It does not get to smuggle a new business model into the ontology because the sentence sounded plausible.
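A minimal sketch of that pattern, assuming a candidate arrives as a subject, predicate, object, and a quoted evidence span. This is not OntoGPT's API or Hadto's code; the `Candidate` shape, the `ALLOWED_PREDICATES` set, the `is_admissible` helper, and the sample sentence are all hypothetical. It only shows the two conditions working together: the schema must allow the shape, and the source text must contain the evidence.

```python
from dataclasses import dataclass

# Hypothetical schema slice: the predicates an extractor is allowed to fill.
ALLOWED_PREDICATES = {"hs:is_emergency", "hs:assigned_technician", "inv:hasTotalAmount"}


@dataclass(frozen=True)
class Candidate:
    subject: str
    predicate: str
    obj: str
    evidence: str  # verbatim span the model claims supports the fact


def is_admissible(candidate: Candidate, source_text: str) -> bool:
    """Keep a candidate only if the schema allows its predicate
    and its evidence span appears verbatim in the source text."""
    in_schema = candidate.predicate in ALLOWED_PREDICATES
    grounded = bool(candidate.evidence) and candidate.evidence in source_text
    return in_schema and grounded


# Illustrative source text and candidates (invented for this sketch).
source = "Call 1042 was flagged as an emergency and assigned to technician Ruiz."
grounded_fact = Candidate("call:1042", "hs:is_emergency", "true", "flagged as an emergency")
invented_fact = Candidate("call:1042", "hs:subscription_plan", "gold", "premium subscriber")

print(is_admissible(grounded_fact, source))
print(is_admissible(invented_fact, source))
```

The second candidate fails on both counts: its predicate is outside the schema and its evidence is not in the sentence. A plausible-sounding fact gets no further than that.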

The older rule still holds

This does not replace the April rule that AI should propose ontology candidates instead of authoring the business model. It makes that rule runnable.

A candidate can look clean and still be wrong. EmergencyCall sounds like a reasonable home-services class until the base ontology already has hs:ServiceCall with hs:is_emergency. ServiceVisit sounds normal until the current work model already distinguishes the service call, work order, technician assignment, and invoice trail. invoice_amount sounds harmless until the invoicing model already has inv:hasTotalAmount.
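One way to picture that check is a reuse map consulted before any candidate reaches review. The sketch below is an assumption about mechanism, not the evaluator's actual implementation. Only the two mappings shown come from the examples above; the `EXISTING_COVERAGE` table and the `review_candidate` helper are hypothetical names. ServiceVisit is left out because the note does not name the exact terms that cover it.

```python
# Hypothetical reuse map: proposed name -> existing terms that already cover it.
EXISTING_COVERAGE = {
    "EmergencyCall": ["hs:ServiceCall", "hs:is_emergency"],
    "invoice_amount": ["inv:hasTotalAmount"],
}


def review_candidate(name: str) -> tuple:
    """Reject a proposed term when existing vocabulary already covers it;
    otherwise pass it on for human review."""
    covered_by = EXISTING_COVERAGE.get(name)
    if covered_by:
        return ("reject_duplicate", covered_by)
    return ("needs_review", [])


for proposed in ["EmergencyCall", "invoice_amount", "hs:DispatchException"]:
    print(proposed, review_candidate(proposed))
```

A term that passes this lookup is not accepted, only not yet rejected. It still has to clear the rest of the evaluator.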

The evaluator has to catch those errors before the accepted ontology teaches them to the next operator.

The score had to be real

Bench4KE and OE-Assist supplied the second lesson. Ontology help needs gold fixtures and deterministic metrics. Otherwise a model can sound helpful while leaving the business no better at answering its own questions.

Bench4KE evaluates competency-question generation against gold datasets with repeatable metrics. OE-Assist checks whether competency questions are actually modeled, then compares the answer to a manually created gold standard. The operator lesson is plain: do not trust the assistant because it explained itself. Trust it only when its score against the fixture gets better.

HAD-770 turned that lesson into a schema-guided evaluator for Hadto. It scored precision, recall, F1, schema conformance, source grounding, competency-question traceability, prefix validity, and unsupported assertions. HAD-771 then used that evaluator on a real home-services expansion instead of a toy example.
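The first three of those metrics are standard set comparisons against a gold fixture. The sketch below shows the arithmetic only, assuming assertions are compared as exact triples. The `score` function and the toy fixture are illustrative and are not HAD-770's code or data. Counting `not_in_gold` is a simplification: the evaluator's unsupported-assertion check may be defined against source text, not the fixture.

```python
def score(predicted: set, gold: set) -> dict:
    """Deterministic precision, recall, and F1 for predicted assertions
    against a gold fixture, using exact-match comparison."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "not_in_gold": len(predicted - gold),  # assertions the fixture does not support
    }


# Toy fixture, invented for illustration.
gold = {
    ("call:1042", "rdf:type", "hs:ServiceCall"),
    ("call:1042", "hs:is_emergency", "true"),
    ("call:1042", "hs:creates_invoice", "invoice:77"),
}
predicted = {
    ("call:1042", "rdf:type", "hs:ServiceCall"),
    ("call:1042", "hs:is_emergency", "true"),
    ("call:1042", "rdf:type", "EmergencyCall"),  # duplicate-style error
}

result = score(predicted, gold)
print(result)
```

Running the same function before and after an expansion, on the same fixture, is what turns a model suggestion into a measured change.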

The proof moved:

precision +0.600
recall +0.667
F1 +0.636
schema conformance +0.600
source grounding +0.400
CQ traceability +0.600
prefix validity +0.600
unsupported/hallucinated assertions reduced by 3

At that point the ontology work became more than a model suggestion.

What the home-services pass learned

The accepted expansion promoted hs:response_tier, hs:ServiceInvoice, hs:creates_invoice, hs:DispatchException, hs:TechnicianRouteQueue, hs:has_dispatch_exception, hs:queued_service_call, and hs:queued_work_order.

It also reused hs:ServiceCall, hs:is_emergency, hs:assigned_technician, and inv:hasTotalAmount instead of adding duplicate EmergencyCall, ServiceVisit, Customer, or invoice_amount terms.

The difference matters to an operator.

hs:response_tier lets the business distinguish emergency, same-day, and routine promises without making a separate emergency-call class. hs:DispatchException names the broken dispatch path, while hs:has_dispatch_exception attaches the exception to the work already in the system. hs:TechnicianRouteQueue, hs:queued_service_call, and hs:queued_work_order say where the work is waiting before a technician can act. hs:ServiceInvoice and hs:creates_invoice connect completed service to billing, while inv:hasTotalAmount keeps the money fact in the invoice vocabulary that already owns it.

The ontology acquired dispatch, route-queue, response-tier, and invoice facts without pretending every familiar phrase deserved a new class.

The operator test

The full adventure was not paper to prompt to ontology. It was paper to correction, correction to extraction method, extraction method to benchmark lesson, benchmark lesson to evaluator, evaluator to measured expansion.

An owner-facing system needs that whole chain.

When a dispatcher asks which calls are waiting for a technician route, the answer should come from modeled queue facts. When a manager asks which emergency calls missed the promise, the answer should reuse service calls and response tiers instead of hunting for a separate emergency-call object. When finance asks what invoice a service call created, the answer should cross the service and invoice boundary without inventing a second amount field.
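Two of those questions can be sketched as lookups over modeled facts. The triples below are invented sample data, and the direction of each property (queue to call, call to invoice) is an assumption, since the note names the terms but not their domains and ranges. The helper names are hypothetical too.

```python
# Invented sample facts using the term names from the accepted expansion.
TRIPLES = {
    ("queue:north", "rdf:type", "hs:TechnicianRouteQueue"),
    ("queue:north", "hs:queued_service_call", "call:1042"),
    ("call:1042", "rdf:type", "hs:ServiceCall"),
    ("call:1042", "hs:response_tier", "emergency"),
    ("call:1042", "hs:creates_invoice", "invoice:77"),
    ("invoice:77", "inv:hasTotalAmount", "480.00"),
    ("call:1043", "rdf:type", "hs:ServiceCall"),
    ("call:1043", "hs:response_tier", "routine"),
}


def objects(subject: str, predicate: str) -> list:
    """All objects for a given subject and predicate, sorted for stable output."""
    return sorted(o for s, p, o in TRIPLES if s == subject and p == predicate)


def calls_waiting_for_route() -> list:
    """Dispatcher question: which calls sit in a technician route queue?"""
    queues = {s for s, p, o in TRIPLES if p == "rdf:type" and o == "hs:TechnicianRouteQueue"}
    return sorted(o for s, p, o in TRIPLES if s in queues and p == "hs:queued_service_call")


def invoice_totals_for(call: str) -> list:
    """Finance question: cross from the service call to its invoice amount."""
    return [
        total
        for invoice in objects(call, "hs:creates_invoice")
        for total in objects(invoice, "inv:hasTotalAmount")
    ]


print(calls_waiting_for_route())
print(invoice_totals_for("call:1042"))
```

Neither query needs an EmergencyCall class or a second amount field. The answers come from terms the ontology already owns.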

The model can help find those facts. The ontology only learns when the source text is attached, the schema accepts the shape, the competency questions trace through, the prefixes stay valid, duplicate terms are rejected, and the score improves against the fixture.
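Those conditions read naturally as a single gate in which every check must pass. The sketch below is one hedged way to express it. The `GateReport` fields and function names are hypothetical, and using F1 as "the score" is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class GateReport:
    source_attached: bool
    schema_conformant: bool
    cq_traceable: bool
    prefixes_valid: bool
    duplicates_rejected: bool
    f1_before: float  # score on the fixture before the expansion
    f1_after: float   # score on the same fixture after the expansion


def failed_checks(report: GateReport) -> list:
    """Return the names of every condition the expansion failed."""
    checks = {
        "source_attached": report.source_attached,
        "schema_conformant": report.schema_conformant,
        "cq_traceable": report.cq_traceable,
        "prefixes_valid": report.prefixes_valid,
        "duplicates_rejected": report.duplicates_rejected,
        "score_improved": report.f1_after > report.f1_before,
    }
    return [name for name, passed in checks.items() if not passed]


def accept(report: GateReport) -> bool:
    """The ontology learns only when nothing failed."""
    return not failed_checks(report)


# Illustrative values only, not measured results.
passing = GateReport(True, True, True, True, True, f1_before=0.3, f1_after=0.9)
bad_prefix = GateReport(True, True, True, False, True, f1_before=0.3, f1_after=0.9)

print(accept(passing), failed_checks(bad_prefix))
```

Returning the failed check names, not just a boolean, gives the operator a reason for each rejection.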

That is the public lesson from the OntoGPT-to-Hadto pass: AI-assisted ontology engineering is useful when it creates a better operating contract, not when it produces a more confident list of names.

Source evidence used in this note: public materials reviewed for the proof framing, including arXiv 2412.00608, Monarch OntoGPT on GitHub, the OntoGPT documentation, the Bioinformatics paper on Structured Prompt Interrogation and Recursive Extraction of Semantics (SPIRES), Bench4KE, and OE-Assist. Existing Hadto posts checked for anti-duplication included "AI should propose ontology candidates, not author the business model", "Autoresearch for ontologies needs a field crew", and recent ontology and business-fact notes from May 2026.
