Quantization at the Edge: Evaluating Inference Performance and Quality for SLM Driven Conversational Agents in Virtual Worlds

Nisiotis, Louis; Markov, Nikita

Nisiotis, Louis ORCID: 0000-0002-8018-1352 and Markov, Nikita (2026) Quantization at the Edge: Evaluating Inference Performance and Quality for SLM Driven Conversational Agents in Virtual Worlds. N/A. (Submitted)

Full text not available from this repository.

Official URL: https://doi.org/10.36227/techrxiv.177273703.328884...

Abstract

Quantised small language models (SLMs) deployed at the edge are increasingly important for enabling high-performance AI agents in interactive virtual worlds, where low-latency operation is key to a seamless user experience. This paper investigates deploying quantised SLMs at the edge to meet performance requirements while maintaining high response quality, in order to support real-time conversational agent interactions in an immersive virtual world. We developed a back-end AI system based on retrieval-augmented generation (RAG) to give virtual agents conversational capabilities in a virtual world prototype, and experimented with two SLM families and three quantisation levels (8-bit, 4-bit, and 3-bit). System performance was evaluated through end-to-end latency, time-to-first-token, throughput, and memory headroom. Response quality was assessed through expert evaluations of accuracy, relevance, safety, and human alignment against non-quantised baselines. Results show that moderate quantisation enables practical, performant edge deployment with good response quality. This demonstrates that edge-deployed quantised SLMs can support real-time narrative interactions and remain practically useful for virtual-world agents, and provides evidence for deploying high-performance conversational AI in immersive environments.
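The performance metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' measurement code, since the full text is not available here. The names `RunTiming`, `weight_memory_gib`, and `memory_headroom_gib`, the definition of throughput as tokens divided by end-to-end latency, and all numbers in the usage lines are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunTiming:
    """Timestamps for one streamed response (hypothetical structure)."""
    request_sent: float        # seconds, when the query left the client
    token_times: List[float]   # seconds, arrival time of each generated token


def time_to_first_token(run: RunTiming) -> float:
    """Delay between sending the request and receiving the first token."""
    return run.token_times[0] - run.request_sent


def end_to_end_latency(run: RunTiming) -> float:
    """Delay between sending the request and receiving the last token."""
    return run.token_times[-1] - run.request_sent


def throughput(run: RunTiming) -> float:
    """Tokens per second over the whole request (one common definition)."""
    return len(run.token_times) / end_to_end_latency(run)


def weight_memory_gib(n_params: float, bits: int) -> float:
    """Rough weight-only footprint in GiB.

    Assumption: ignores the KV cache, activations, and any per-block
    scale factors that real quantisation formats store.
    """
    return n_params * bits / 8 / 2**30


def memory_headroom_gib(device_gib: float, used_gib: float) -> float:
    """Memory left on the device after the model and runtime are loaded."""
    return device_gib - used_gib


# Illustrative values only, not results from the paper.
run = RunTiming(request_sent=0.0, token_times=[0.5, 1.0, 1.5, 2.0])
print("TTFT (s):", time_to_first_token(run))
print("End-to-end latency (s):", end_to_end_latency(run))
print("Throughput (tokens/s):", throughput(run))
for bits in (8, 4, 3):
    print(f"{bits}-bit weights for a 3e9-parameter model (GiB):",
          round(weight_memory_gib(3e9, bits), 2))
```

The weight estimate scales linearly with bit width, which is why lower-bit quantisation frees memory headroom on a constrained edge device. Measured footprints will differ from this estimate because of the overheads noted in the comments.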

Summary Table

Item Type:	Article
Creators (Authors or editors):	Nisiotis, Louis (email: lnisiotis@uclan.ac.uk; ORCID: https://orcid.org/0000-0002-8018-1352); Markov, Nikita (email: nmarkov@uclan.ac.uk)
Uncontrolled Keywords (separate with ;):	Artificial Intelligence; Small Language Models; Edge Computing; Virtual Worlds
Subjects:	I - Computer science > I400 - Artificial intelligence
Schools:	School of Engineering and Computing > Computing
ID Code:	58624
Depositing User ID:	Christopher Waddington
Date Deposited:	12 Mar 2026 15:36
Last Modified:	12 Mar 2026 15:36

Welcome to

Lancashire Online Knowledge
