
"LLMs [large language models] confidently claim things that are manifestly untrue. Enforce has developed Verity, a tool that helps minimise false claims and fake sources from self-hosted LLMs."
"Verity is not simply an LLM-as-judge setup in which one LLM evaluates the output of another. Rather, it's a set of seven layers designed to review model output. The system involves: strict rules for fact sourcing; a strong critic LLM that differs from the primary model family; a small critic LLM similar to the strong critic but from different training data; an encoder transformer trained on entailment labels; a regex evaluator; a stochastic re-sampler for catching low-confidence guesses; and a logprob analyser that checks token entropy."
"The Verity MCP server offers access to a set of smaller models that will try to assess the accuracy of a primary local LLM, something more people have begun to explore in response to rising prices at cloud AI providers, availability issues, and privacy concerns."
"The hardware recommendations call for a system with two GPUs, but that's to allow concurrent delivery of a second opinion from the Verity checker. On a machine with one GPU, like a MacBook P"
Verity is a self-hosted MCP server that helps minimize false claims and fake sources produced by local LLMs. It provides access to smaller models that assess the accuracy of a primary local LLM, addressing concerns about cloud pricing, availability, and privacy. Verity is not a simple LLM-as-judge setup; it reviews model output through seven layers: strict rules for fact sourcing, a strong critic LLM from a different model family, a small critic LLM trained on different data, an encoder transformer trained on entailment labels, a regex evaluator, a stochastic re-sampler that catches low-confidence guesses, and a logprob analyzer that checks token entropy. Hardware guidance assumes two GPUs so the Verity checker can deliver its second opinion concurrently.
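The article does not publish Verity's interfaces, so the following is a hypothetical sketch of how two of the seven layers might be chained: a regex evaluator that flags suspicious-looking citations, and a logprob analyzer that computes mean token entropy from top-k probability distributions. All names here (`Verdict`, `run_pipeline`, the layer functions, the 2.0-bit threshold, the future-year heuristic) are invented for illustration and do not come from Verity.

```python
import math
import re
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Verdict:
    layer: str
    passed: bool
    note: str = ""

def regex_layer(text: str, logprobs: List[List[float]]) -> Verdict:
    # Toy heuristic (not Verity's actual rules): flag citations dated in the
    # far future, e.g. "(Doe, 2087)", as likely fabricated sources.
    suspicious = re.findall(r"\b(20[5-9]\d)\b", text)
    return Verdict("regex", not suspicious, f"future years: {suspicious}")

def entropy_layer(text: str, logprobs: List[List[float]]) -> Verdict:
    # Shannon entropy over each token's top-k probability distribution.
    # High mean entropy suggests the model was guessing; the 2.0-bit
    # threshold is an arbitrary placeholder for this sketch.
    def entropy(dist: List[float]) -> float:
        return -sum(p * math.log2(p) for p in dist if p > 0)
    mean_h = sum(entropy(d) for d in logprobs) / len(logprobs)
    return Verdict("logprob", mean_h < 2.0, f"mean entropy {mean_h:.2f} bits")

def run_pipeline(text: str,
                 logprobs: List[List[float]],
                 layers: List[Callable[[str, List[List[float]]], Verdict]]
                 ) -> List[Verdict]:
    # Run every layer and collect its verdict; a real system might
    # short-circuit, weight layers, or escalate to a critic LLM.
    return [layer(text, logprobs) for layer in layers]

# Example: a confident distribution alongside a maximally uncertain one.
verdicts = run_pipeline(
    "As cited in (Doe, 2087).",
    [[0.25, 0.25, 0.25, 0.25], [0.9, 0.1]],
    [regex_layer, entropy_layer],
)
```

Here the regex layer fails the text (it contains a future-dated citation) while the entropy layer passes it, illustrating why stacking heterogeneous checks catches failure modes that any single layer would miss.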
Read at The Register