
Logic Guard Layer: A Tutorial

From fundamentals to architecture - step by step


Introduction: What Is This About?

Imagine you ask an AI: "What was the water level on the Rhine yesterday?" The AI responds: "The water level in Cologne was 4.23 meters."

The answer sounds convincing. But how do you know if it's correct?

The problem: Modern language models (like ChatGPT or Claude) are excellent at generating plausible-sounding text. But they have no built-in mechanism that checks whether their statements are true. They can fabricate facts without "noticing" it.

The Logic Guard Layer is a system that sits between the AI and the user, checking: Is what the AI claims actually true?

This tutorial explains step by step how this works.


Part 1: Understanding the Problem

1.1 How Do Language Models Generate Text?

Before we can solve the problem, we need to understand it. Language models generate text word by word (more precisely: token by token). At each step, they "guess" which word is most likely to come next.

Example: Given the beginning "The sky is...", the model calculates probabilities:

Next Word | Probability
----------|------------
blue      | 45%
cloudy    | 25%
gray      | 15%
green     | 0.1%

The model then selects one of these words (usually one with high probability) and continues.
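This selection step can be sketched in a few lines of Python. The probabilities are the illustrative values from the table above (the remaining probability mass, spread over all other words, is omitted), so this is a toy, not a real model:

```python
import random

# Toy next-token distribution for the prefix "The sky is ..."
# (illustrative values from the table; all other words omitted)
next_token_probs = {
    "blue": 0.45,
    "cloudy": 0.25,
    "gray": 0.15,
    "green": 0.001,
}

def sample_next_token(probs, seed=None):
    """Pick one token, weighted by its probability."""
    rng = random.Random(seed)
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=1)[0]

# Greedy decoding would simply take the most probable token:
greedy = max(next_token_probs, key=next_token_probs.get)  # "blue"
```

Greedy decoding always takes the maximum; sampling occasionally picks a lower-probability word, which is one reason two runs of the same prompt can differ.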

1.2 The Mathematical Formulation

Mathematically expressed: The model maximizes the conditional probability of the next token.

If we write a text as a sequence of tokens \(x_1, x_2, x_3, \ldots, x_T\), then the model computes:

$$P(x_t \mid x_1, x_2, \ldots, x_{t-1})$$

This reads as: "The probability of token \(x_t\), given all previous tokens."
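Multiplying these conditional probabilities over all positions gives the probability of the entire sequence (the chain rule of probability):

$$P(x_1, x_2, \ldots, x_T) = \prod_{t=1}^{T} P(x_t \mid x_1, \ldots, x_{t-1})$$

The model is trained so that this product is high for texts like those in its training data.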

1.3 The Core Problem

Here lies the problem: The model optimizes for linguistic plausibility, not for truth.

The sentence "The water level in Cologne was 4.23 meters" is linguistically flawless. It follows the rules of grammar. It sounds like something that could appear in a news article.

But the model has no idea whether this value is correct. It generated the sentence because it sounds plausible - not because it is true.

Key Point: Linguistic plausibility \(\neq\) factual correctness

Part 2: What Is a "Claim"?

2.1 From Text to Structure

To check whether a statement is correct, we must first formalize it. We need to transform the unstructured text into a structure that a computer can process.

Example Text:

"The water level at station Cologne was 3.45 meters on July 15, 2024 at 14:00."

This sentence contains a claim. To make it verifiable, we break it down into its components.

2.2 The Claim Structure

We define a Claim as a 5-tuple - meaning an ordered list of five elements:

$$c = (\text{subject}, \text{predicate}, \text{object}, \text{unit}, \text{provenance})$$

What do these terms mean?

Element    | Meaning                                 | Example
-----------|-----------------------------------------|--------
subject    | What is the claim about?                | Station Cologne
predicate  | What is being claimed? (the relation)   | hasWaterLevel
object     | What value is being claimed?            | 3.45
unit       | In which unit?                          | meters
provenance | When/where does the claim originate?    | 2024-07-15T14:00:00Z

2.3 Example of Extraction

From our example sentence we get:

$$c = (\text{"Station Cologne"}, \text{hasWaterLevel}, 3.45, \text{meters}, \text{2024-07-15T14:00:00Z})$$

Now we have something we can verify! We can ask:

  1. Is there a station named "Cologne"?
  2. Did this station have a water level reading on July 15, 2024 at 14:00?
  3. Was this water level really 3.45 meters?
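The 5-tuple can be written down directly as a small data structure. The field names mirror the table in 2.2; this is an illustrative sketch, not a fixed implementation:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Claim:
    """A claim as a 5-tuple: (subject, predicate, object, unit, provenance)."""
    subject: str
    predicate: str
    object: object             # number, string, or boolean
    unit: Optional[str]        # None for unit-less claims
    provenance: Optional[str]  # e.g. an ISO 8601 timestamp

# The claim extracted from the example sentence:
c = Claim(
    subject="Station Cologne",
    predicate="hasWaterLevel",
    object=3.45,
    unit="meters",
    provenance="2024-07-15T14:00:00Z",
)
```

Making the dataclass frozen (immutable) means claims can be hashed and collected in sets, which becomes useful for the cycle detection in Part 5.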

2.4 The Extraction Function

We call the transformation from text to claims claim extraction and write it as a function:

$$\phi: T \rightarrow \mathcal{S}$$

Where:

  • \(T\) = the set of all possible texts
  • \(\mathcal{S}\) = the set of all possible claims
  • \(\phi\) = the extraction function

In words: \(\phi\) takes a text and returns a set of claims.
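A toy version of \(\phi\) for exactly this sentence pattern might look as follows. Real systems would use an NLP pipeline or an LLM for extraction, so the regular expression here is purely illustrative:

```python
import re

def extract_claims(text):
    """Toy version of phi: T -> S, using one hand-written pattern.
    Returns claims as (subject, predicate, object, unit, provenance) tuples."""
    claims = set()
    pattern = re.compile(
        r"water level at (?:the )?(?P<subject>[\w\s]+?) was "
        r"(?P<value>\d+(?:\.\d+)?) (?P<unit>\w+)"
    )
    for m in pattern.finditer(text):
        claims.add((m.group("subject"), "hasWaterLevel",
                    float(m.group("value")), m.group("unit"), None))
    return claims

claims = extract_claims(
    "The water level at station Cologne was 3.45 meters on July 15, 2024."
)
```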


Part 3: Schema Validation (TBox)

3.1 What Is a Schema?

Before we check whether a fact is true, we can check whether it is possible. We call this schema validation.

A schema defines the rules of a domain:

  • What entities exist? (Stations, motors, documents, ...)
  • What properties can they have?
  • What values are allowed?

Examples of schema rules:

Rule                 | Meaning
---------------------|--------
WaterLevel >= 0      | A water level cannot be negative
Temperature has unit | Temperature values need a unit (C, K, F)
Start < End          | A start time must precede an end time

3.2 TBox Constraints

In knowledge representation, the schema is called the TBox (Terminological Box). The TBox contains concepts and rules, but no concrete facts.

A TBox constraint is a validation function:

$$\tau: \mathcal{S} \rightarrow \{\texttt{VALID}, \texttt{INVALID}\}$$

This means: \(\tau\) takes a claim and returns whether it conforms to the schema or not.

3.3 Types of Constraints

There are different types of schema constraints:

1. Type Constraints: Is the value of the correct type?

hasWaterLevel expects a number, not text

2. Range Constraints: Is the value within the allowed range?

$$\text{object} \in [\text{min}, \text{max}]$$

3. Unit Constraints: Is the unit compatible?

$$\text{unit} \in \text{allowedUnits}(\text{predicate})$$

4. Physical Constraints: Is the value physically possible?

Temperature >= -273.15 C (absolute zero)

Important: Schema validation tells us whether a claim is possible - not whether it is true.
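A minimal sketch of \(\tau\) combining type, range, and unit constraints might look like this. The concrete bounds (e.g. a 20-meter maximum water level) are invented for illustration; a real system would define them in an ontology language such as OWL or SHACL:

```python
# Toy TBox: per-predicate type, range, and unit rules (values are assumptions)
TBOX = {
    "hasWaterLevel": {"type": (int, float), "min": 0.0, "max": 20.0,
                      "units": {"meters", "m"}},
    "hasTemperature": {"type": (int, float), "min": -273.15, "max": None,
                       "units": {"C", "K", "F"}},
}

def tbox_validate(predicate, value, unit):
    """tau: S -> {VALID, INVALID} for one claim's predicate/value/unit."""
    rules = TBOX.get(predicate)
    if rules is None:
        return "INVALID"                      # unknown predicate
    if not isinstance(value, rules["type"]):
        return "INVALID"                      # type constraint
    if rules["min"] is not None and value < rules["min"]:
        return "INVALID"                      # range constraint
    if rules["max"] is not None and value > rules["max"]:
        return "INVALID"
    if unit not in rules["units"]:
        return "INVALID"                      # unit constraint
    return "INVALID" if False else "VALID"
```

Note that a claim like "water level -1 m" fails the range check, and "temperature -300 C" fails the physical constraint, even though both are grammatically flawless sentences.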

Part 4: Fact Validation (ABox)

4.1 From Schema to Facts

Schema validation filters impossible claims. But many possible claims are still false.

To check whether a claim is true, we need external data sources - authoritative APIs and knowledge bases.

4.2 ABox: The Fact Base

In knowledge representation, the collection of concrete facts is called the ABox (Assertional Box). The ABox contains instance data.

4.3 The Problem of Missing Information

What happens when we check a claim and the data source has no answer?

Possible responses:

  1. "Yes, the value was 3.45 m" -> Claim confirmed
  2. "No, the value was 2.87 m" -> Claim refuted
  3. "I have no data for this time period" -> ???

4.4 Two World Assumptions: CWA and OWA

Closed-World Assumption (CWA):

"What is not in the knowledge base is false."

Open-World Assumption (OWA):

"What is not in the knowledge base is unknown."

4.5 The Error Algebra

The Logic Guard Layer distinguishes different validation results:

$$\mathcal{R} = \{\texttt{EXISTS}, \texttt{MISMATCH}, \texttt{NOT\_FOUND\_*}, \texttt{LOOKUP\_FAILURE}, \texttt{UNKNOWN}\}$$

Status         | Meaning                 | Example
---------------|-------------------------|--------
EXISTS         | Claim matches source    | Water level was indeed 3.45 m
MISMATCH       | Claim contradicts source| Source says 2.87 m, not 3.45 m
LOOKUP_FAILURE | Technical error         | API not reachable
UNKNOWN        | Not decidable           | Insufficient information

4.6 The NOT_FOUND Differentiation

The crucial contribution: NOT_FOUND is further broken down:

Status                 | Meaning                            | World Assumption
-----------------------|------------------------------------|-----------------
NOT_FOUND_ABSENCE      | Entity definitively does not exist | CWA
NOT_FOUND_OUT_OF_SCOPE | Source does not cover this area    | -
NOT_FOUND_INCOMPLETE   | Source is incomplete               | OWA
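The full error algebra \(\mathcal{R}\), including the three NOT_FOUND variants, maps naturally onto an enumeration. The member values here just paraphrase the tables above:

```python
from enum import Enum

class ValidationStatus(Enum):
    """The error algebra R, with NOT_FOUND split into three epistemic states."""
    EXISTS = "claim matches the source"
    MISMATCH = "claim contradicts the source"
    NOT_FOUND_ABSENCE = "entity definitively does not exist (CWA)"
    NOT_FOUND_OUT_OF_SCOPE = "source does not cover this area"
    NOT_FOUND_INCOMPLETE = "source is incomplete (OWA)"
    LOOKUP_FAILURE = "technical error, e.g. API unreachable"
    UNKNOWN = "not decidable from available information"
```

Keeping these as distinct states (rather than collapsing everything into "true/false") is what lets the system react differently to a refuted claim versus a merely unverifiable one.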

Part 5: The Self-Correction Loop

5.1 What to Do When Errors Occur?

Naive solution: Discard everything, regenerate.

Problem: This also loses the correct parts.

Better solution: Only correct the faulty claims - the Self-Correction Loop.

5.2 The Basic Idea

  1. Validate all claims
  2. If errors are found: Ask the LLM to correct only the faulty claims
  3. Validate again
  4. Repeat until everything is correct (or abort)

5.3 The Algorithm

Input:
  - S: the original claims
  - V: the validation results
  - k_max: maximum number of attempts

Algorithm:
  S' <- S                          // Work with a copy
  history <- {hash(S)}             // Remember all states

  Repeat k_max times:
    errors <- all claims with errors in V

    If no errors:
      Return S'                    // Success!

    S'_new <- LLM corrects S' based on errors

    If hash(S'_new) already in history:
      Abort: CYCLE_DETECTED        // We're going in circles

    If drift(S', S'_new, errors) > epsilon:
      Abort: DRIFT_TOO_HIGH        // Too much changed

    Add hash(S'_new) to history
    S' <- S'_new

  Abort: MAX_ATTEMPTS_REACHED
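The pseudocode above can be turned into a runnable sketch. Here `validate`, `correct`, and `drift_rate` are caller-supplied stand-ins; in a real system, `correct` would call the LLM:

```python
import hashlib
import json

def state_hash(claims):
    """Stable hash of a claim list, used for cycle detection."""
    canonical = json.dumps(sorted(map(str, claims)))
    return hashlib.sha256(canonical.encode()).hexdigest()

def repair_loop(claims, validate, correct, drift_rate, k_max=3, epsilon=0.1):
    """Self-correction loop with cycle detection and drift control.

    validate(claims)  -> list of faulty claims
    correct(claims, errors) -> corrected claim list (normally an LLM call)
    drift_rate(old, new, errors) -> fraction of correct claims changed anyway
    """
    current = list(claims)                   # work with a copy
    history = {state_hash(current)}          # remember all visited states
    for _ in range(k_max):
        errors = validate(current)
        if not errors:
            return current, "SUCCESS"
        candidate = correct(current, errors)
        h = state_hash(candidate)
        if h in history:
            return current, "CYCLE_DETECTED"     # we're going in circles
        if drift_rate(current, candidate, errors) > epsilon:
            return current, "DRIFT_TOO_HIGH"     # too much changed
        history.add(h)
        current = candidate
    return current, "MAX_ATTEMPTS_REACHED"
```

Returning the last safe state together with an explicit abort reason keeps the loop auditable: the caller always knows why correction stopped.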

5.4 Problem 1: Cycles

What if the LLM jumps back and forth between two states?

Solution: We store a hash of each state. When a hash appears again, we detect the cycle and abort.

5.5 Problem 2: Semantic Drift

What if the LLM accidentally changes correct claims?

5.6 The Drift Metric

We define the drift rate \(d\) mathematically:

$$d(S, S', E) = \frac{|\{c \in S \setminus E : c \notin S' \lor \text{modified}(c, S')\}|}{|S \setminus E|}$$

In words:

$$\text{Drift Rate} = \frac{\text{Number of non-erroneous claims that were changed anyway}}{\text{Total number of non-erroneous claims}}$$
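A direct translation of this formula, for the simple case where claims are plain hashable values (so "modified" collapses to "no longer present in \(S'\)"):

```python
def drift_rate(original, corrected, erroneous):
    """d(S, S', E): fraction of non-erroneous claims that were changed anyway.

    `erroneous` is the subset E of `original` that was flagged as faulty;
    everything else should have survived the correction untouched.
    """
    protected = [c for c in original if c not in erroneous]  # S \ E
    if not protected:
        return 0.0            # nothing to protect, so no drift by definition
    changed = sum(1 for c in protected if c not in corrected)
    return changed / len(protected)
```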

Part 6: The Overall Architecture

6.1 The Pipeline

+-------------+     +--------------+     +--------------+
|  LLM Output | --> |    Claim     | --> |    TBox      |
|    (Text)   |     |  Extraction  |     | Validation   |
+-------------+     +--------------+     +--------------+
                           |                    |
                           v                    v
                    +--------------+     +--------------+
                    |   Repair     | <-- |    ABox      |
                    |    Loop      |     | Validation   |
                    +--------------+     +--------------+
                           |
                           v
                    +--------------+
                    |  Validated   |
                    |    Output    |
                    +--------------+

6.2 Components Overview

Component        | Input                 | Output             | Function
-----------------|-----------------------|--------------------|---------
Claim Extraction | Text                  | Claims             | Text -> Structure
TBox Validation  | Claims + Ontology     | Valid/Invalid      | Schema checking
ABox Validation  | Claims + Data sources | Validation results | Fact checking
Repair Loop      | Claims + Errors       | Corrected claims   | Iterative correction

6.3 Mathematical Notation

  1. Claim Extraction: \(\phi: T \rightarrow \mathcal{S}\)
  2. TBox Validation: \(v_T: \mathcal{S} \times \mathcal{O} \rightarrow \{0,1\}^n\)
  3. ABox Validation: \(v_A: \mathcal{S} \times \mathcal{K} \rightarrow \mathcal{R}^n\)
  4. Repair: \(\rho: (\mathcal{S}, \mathcal{V}) \rightarrow \mathcal{S}'\)

Part 7: A Complete Example

7.1 The Task

Input to the LLM:

"Describe the current water level on the Rhine near Cologne."

Output from the LLM:

"The water level at the Cologne measuring station was 3.45 meters on July 15, 2024 at 14:00. This is within the normal range for this time of year."

7.2 Step 1: Claim Extraction

Claim 1:

$$c_1 = (\text{"Cologne measuring station"}, \text{exists}, \text{true}, -, -)$$

Claim 2:

$$c_2 = (\text{"Cologne measuring station"}, \text{hasWaterLevel}, 3.45, \text{m}, \text{2024-07-15T14:00})$$

7.3 Step 2: TBox Validation

Claim    | Constraint                   | Result
---------|------------------------------|-------
\(c_1\)  | Stations can exist           | VALID
\(c_2\)  | Water level is a number >= 0 | VALID
\(c_2\)  | Unit is meters               | VALID

7.4 Step 3: ABox Validation

Query 1: Does station "Cologne" exist?

  • API response: Yes
  • Result: EXISTS

Query 2: Water level Cologne on July 15, 2024 at 14:00?

  • API response: 2.87 m
  • Claim states: 3.45 m
  • Result: MISMATCH
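The comparison behind this MISMATCH can be expressed as a small helper. The numeric tolerance is an invented assumption (e.g. to absorb rounding in the API), not part of the architecture:

```python
def abox_check(claimed, observed, tolerance=0.005):
    """Compare a claimed numeric value against the authoritative source.

    Returns a status string from the error algebra; the tolerance is an
    illustrative assumption to absorb rounding differences.
    """
    if observed is None:
        return "NOT_FOUND_INCOMPLETE"  # OWA: absence of data is not refutation
    if abs(claimed - observed) <= tolerance:
        return "EXISTS"
    return "MISMATCH"

# The values from Query 2: the claim says 3.45 m, the API reports 2.87 m.
status = abox_check(claimed=3.45, observed=2.87)  # "MISMATCH"
```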

7.5 Step 4: Repair Loop

Iteration 1:

  • Error: \(c_2\) has MISMATCH (3.45 m instead of 2.87 m)
  • LLM corrects the value
  • Re-validation: EXISTS
  • Drift check: 0% - OK

7.6 Final Output

"The water level at the Cologne measuring station was 2.87 meters on July 15, 2024 at 14:00. This is within the normal range for this time of year."

This output is now verified - not just plausible.


Part 8: Summary

8.1 The Core Ideas

  1. Language models optimize for plausibility, not truth.
  2. Claims formalize assertions. A claim is a 5-tuple.
  3. Two-stage validation checks possibility and truth. TBox (Schema) + ABox (Facts)
  4. The error algebra differentiates epistemic states. "Not found" != "False"
  5. The Self-Correction Loop corrects precisely. With cycle detection and drift control.

8.2 The Mathematical Building Blocks

Concept         | Notation                                            | Meaning
----------------|-----------------------------------------------------|--------
Claim           | \(c = (s, p, o, u, prov)\)                          | Structured assertion
Extraction      | \(\phi: T \rightarrow \mathcal{S}\)                 | Text -> Claims
TBox Constraint | \(\tau: \mathcal{S} \rightarrow \{V, I\}\)          | Schema checking
Error Algebra   | \(\mathcal{R}\)                                     | Possible validation results
Drift Rate      | \(d(S, S', E)\)                                     | Measure of unintended changes

8.3 Why This Matters

In safety-critical applications - medicine, engineering, law, finance - false information is not just annoying, but dangerous. The Logic Guard Layer provides a way to ground AI systems in reality.


Appendix: Glossary

Term          | Explanation
--------------|------------
ABox          | Assertional Box - the collection of concrete facts in a knowledge base
Claim         | An atomic, verifiable assertion
CWA           | Closed-World Assumption - what is not known is false
Drift         | Unintended modification of correct claims during correction
Hallucination | An AI-generated statement without factual basis
OWA           | Open-World Assumption - what is not known is unknown
TBox          | Terminological Box - the schema of a knowledge base
Token         | The smallest unit that a language model processes

This tutorial conveys the conceptual foundations of the Logic Guard Layer. The technical implementation requires additional knowledge about ontologies (OWL), APIs, and language model integration.