§ Research

Evaluating what cannot be measured by accuracy alone.

Technological systems increasingly shape how we think, feel, and act, yet our means of evaluating them rarely keep pace with their influence. Latency, throughput, and accuracy tell us how a system performs, but not how it aligns with the people who use it — their cognitive rhythms, emotional states, ethical intuitions, or real-world constraints.

§ A Central Question

How do we design and evaluate human-centered technologies in ways that reflect the richness of human cognition and the variability of human values?

This question cuts across domains, whether the system is a video game, a brain–computer interface, an AI image generator, or an augmented reality tool in a control room. Each project stems from the same commitment: to design evaluation methods that center the full human experience.

§ Orientation

A methods-first laboratory.

Our work is domain-agnostic. We do not bind ourselves to a single application area, but rather design and refine methodologies that travel across contexts — clinical, computational, societal. The unifying thread: evaluation must reflect the complexity of human cognition, values, and lived experience.

§ Research Areas

Areas we have worked in.

01

Theme

Evaluating Hedonic Technologies

The limitations of conventional usability metrics are most obvious in pleasure-driven technologies — games, social platforms, and social robots — where performance alone cannot explain engagement. We develop engagement frameworks that account for emotional valence, challenge–skill balance, and perceived connectedness: what we term the playful consumption experience. Applied to video games, these metrics uncover both therapeutic benefits and risks of addiction; extended to advertising and social robotics, they reveal how personality, culture, and domain features shape affective responses.

02

Theme

Computational Evaluation of Human-System Interaction

In high-stakes or inaccessible domains, direct observation of users may be infeasible. We develop cognitive architectures that predict how users interact with technology by modelling workload, situation awareness, and decision strategies. Our primary application has been transportation systems, where we modelled driver situation awareness and brake perception–response time, surfacing usability and safety risks before deployment. These simulations do not substitute for human studies; they guide empirical inquiry toward the interactions most likely to challenge cognition.

03

Theme

Augmented Reality and Wearables in Critical Contexts

Many tools aim not to replace cognition but to extend it. In aviation, emergency response, and healthcare, even marginal gains in attention, memory, or spatial awareness can save lives. We design and evaluate AR systems supporting procedural memory, navigation, and training, including interactive AR checklists for pilot trainees, spatial overlays for indoor navigation, and AR procedural support for process-control facilities. Well-designed augmentation can reduce error rates, improve situational awareness, and accelerate learning under stress; this work contributes to a broader theory of cognitive scaffolding.

04

Theme

Human-Centered Evaluation of Artificial Intelligence

Generative AI produces speech, images, and decisions that closely resemble human outputs, yet evaluation remains stubbornly mechanical. Accuracy, BLEU scores, and pixel fidelity miss what matters most: does this feel real, understandable, and trustworthy? We build hybrid frameworks combining subjective judgments with automated metrics — validating perceptual realism questionnaires for text-to-image systems, collecting prosody and emotional-resonance ratings for generative speech, and studying explainability via SHAP, LIME, and symbolic systems in sensitive domains like healthcare and finance.
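One way to picture a hybrid framework of this kind is the minimal Python sketch below. It is an illustration under stated assumptions, not the lab's actual method: the function names (`z_scores`, `hybrid_scores`, `pearson`), the equal default weighting, and the sample numbers are all hypothetical. It standardises human ratings and automated metric scores onto a common scale, blends them per item, and reports how strongly the two sources agree.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardise values to zero mean and unit (population) variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # No variation: every item sits at the mean.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def hybrid_scores(human_ratings, automated_scores, human_weight=0.5):
    """Blend standardised human ratings with standardised automated scores.

    human_weight is a hypothetical tuning parameter in [0, 1].
    """
    if len(human_ratings) != len(automated_scores):
        raise ValueError("need one human rating and one automated score per item")
    h = z_scores(human_ratings)
    a = z_scores(automated_scores)
    return [human_weight * hi + (1 - human_weight) * ai for hi, ai in zip(h, a)]


def pearson(xs, ys):
    """Pearson correlation, computed as the mean product of z-scores."""
    return mean(x * y for x, y in zip(z_scores(xs), z_scores(ys)))


# Hypothetical data: mean realism ratings (1-7 scale) and an automated
# quality score for four generated images.
ratings = [5.8, 3.1, 4.4, 6.2]
metric = [0.71, 0.65, 0.40, 0.80]

blended = hybrid_scores(ratings, metric, human_weight=0.6)
agreement = pearson(ratings, metric)
print(blended, agreement)
```

A low agreement value would flag items where the automated metric and human perception diverge, which are the cases a purely mechanical evaluation would misjudge.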

05

Theme

Designing for Moral Pluralism

Technological systems are deployed into morally complex situations, yet their decision logic tends to be rigid. We explore how context-sensitive ethical frameworks can inform design — studying how public moral judgments in autonomous driving shift between utilitarianism and deontology, how user trust in recommenders hinges on transparency about the moral assumptions embedded in curation, and how explanatory feedback in generative language models influences trust on ethically ambiguous queries. We argue for ethics as interaction — moral reasoning as emergent from context, feedback, and explanation.

06

Theme

Privacy, Safety, and Long-Term Impact

Evaluation must extend across time. Short-term usability gains can mask long-term harms or unanticipated shifts in behaviour. We investigate temporal dimensions through longitudinal studies of wearables, smart-home devices, and cyber-safety tools. Our privacy research spans regions and cultures, showing how design clarity and user agency shape data-sharing behaviour. We also evaluate generative AI for phishing detection that pairs algorithmic analysis with user education — empowering users as active participants in their own security.

§ Ongoing Research

In the lab right now.

LLM-Personality

LLM and Personality

Investigating how large language models express and respond to personality traits, and how human-like trait profiles shape interaction quality, trust, and task outcomes.

LLM-Usability

LLM-Based Usability Evaluation

Using large language models to simulate heuristic evaluation, identify interface flaws early, and accelerate accessible system design.

Embodied-AI

Embodied AI in Virtuality

Studying embodied artificial intelligence within immersive and virtual environments — how virtual agents learn, interact, and adapt when situated in simulated physical spaces.