human-centered computing group (hccg)

Technological systems increasingly shape how we think, feel, and act, yet our means of evaluating them rarely keep pace with their influence. Metrics such as latency, throughput, or accuracy offer only a partial view. They tell us how a system performs, but not how it aligns with the people who use it: their cognitive rhythms, emotional states, ethical intuitions, or real-world constraints. Our research is driven by a central question: How do we design and evaluate human-centered technologies in ways that reflect the richness of human cognition and the variability of human values?

This question cuts across domains, whether the system is a video game, a brain-computer interface, an AI-generated image, or an augmented reality tool in a control room. Our work explores how we can move beyond traditional benchmarks to develop new forms of evidence: computational models that simulate human behavior, empirical metrics that capture affect and meaning, and ethical frameworks that account for context and moral diversity. Each project stems from this core commitment: to design evaluation methods that center the full human experience.

We are a methods-first laboratory. Our work is domain-agnostic: we do not bind ourselves to a single application area, but rather design and refine methodologies that can travel across contexts, whether clinical, computational, or societal. The unifying thread is our belief that evaluation must reflect the complexity of human cognition, values, and lived experience.

Research themes

Some of our ongoing research themes include:

Evaluating Hedonic Technologies

The limitations of conventional usability metrics are perhaps most obvious in pleasure-driven technologies such as games, social platforms, and social robots, where performance alone cannot explain user engagement. These systems aim to elicit joy, immersion, or social connection, yet we lack robust ways to measure such outcomes.

In response, we have developed engagement frameworks that account for emotional valence, challenge-skill balance, and perceived connectedness, what we term the playful consumption experience. Applied to video games, these metrics uncover both the therapeutic benefits of play and its potential for addiction. When extended to domains like digital advertising or social robotics, they reveal how personality traits, cultural norms, and domain-specific features interact to shape users’ affective responses. This body of work reframes hedonic technology not just as entertainment, but as a site for cognitive and emotional meaning-making, deserving of equally nuanced evaluation.

Computational Evaluation of Human-System Interaction

In high-stakes or inaccessible domains, direct observation of user behavior may be infeasible. To address this, we developed cognitive architectures to predict how users interact with technology, focusing on models of workload, situational awareness, and decision strategies.

Our primary application has been in transportation systems, where we modeled driver situation awareness and brake perception–response time. By simulating human cognitive limits in this context, these models highlighted potential usability and safety risks before deployment. Crucially, such simulations are not substitutes for human studies; rather, they guide empirical inquiry toward the interactions most likely to challenge cognition, not just system efficiency.
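To illustrate why perception-response time matters for safety, here is a minimal sketch based on textbook stopping-distance kinematics. It is not the cognitive model described above; the function name and the example speed and deceleration values are assumptions chosen purely for illustration.

```python
def stopping_distance_m(speed_mps: float, prt_s: float, decel_mps2: float) -> float:
    """Total stopping distance in metres under constant deceleration.

    speed_mps  : initial vehicle speed in m/s (assumed value)
    prt_s      : brake perception-response time in seconds
    decel_mps2 : constant braking deceleration in m/s^2 (assumed value)
    """
    if decel_mps2 <= 0:
        raise ValueError("deceleration must be positive")
    # Distance covered before the brakes are applied.
    reaction_distance = speed_mps * prt_s
    # Distance covered while braking: v^2 / (2a).
    braking_distance = speed_mps ** 2 / (2 * decel_mps2)
    return reaction_distance + braking_distance


if __name__ == "__main__":
    # Compare two hypothetical response times at the same speed.
    for prt in (1.5, 2.5):
        print(prt, stopping_distance_m(20.0, prt, 5.0))
```

Under these assumed values, each additional second of response time adds 20 m of travel before braking even begins, which is why a model that predicts response time can flag safety risks early.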

Augmented Reality and Wearables in Critical Contexts

Many of today’s tools aim not to replace human cognition but to extend it. In domains like aviation, emergency response, and healthcare, even marginal gains in attention, memory, or spatial awareness can save lives.

Our work in this space involves the design and evaluation of AR systems that support procedural memory, navigation, and training. For example, we’ve developed interactive AR checklists for pilot trainees and spatial overlays for indoor navigation in unfamiliar environments. We have also developed AR-based procedural support for process control facilities. Across these studies, we found that well-designed augmentation reduces error rates, improves situational awareness, and accelerates learning, especially under stress. These findings contribute to a broader theory of cognitive scaffolding, in which interfaces offload mental demands to optimize decision quality in complex settings.

Human-Centered Evaluation of Artificial Intelligence

AI systems, particularly generative models, now produce speech, images, and decisions that closely resemble human outputs. Yet their evaluation remains stubbornly mechanical. Accuracy, BLEU scores, and pixel fidelity miss what matters most to users: does this output feel real, understandable, and trustworthy?

To bridge this gap, we’ve created hybrid evaluation frameworks that combine subjective user judgments with automated metrics. For text-to-image systems, we validated perceptual realism questionnaires that align with computational scores. For generative speech models, we collected user ratings of prosody and emotional resonance, dimensions often omitted from standard benchmarks.
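One simple way to check whether a questionnaire aligns with a computational score is a rank correlation between the two. The sketch below computes Spearman's rho in plain Python. The function names are hypothetical, any data passed in would be your own, and this is not necessarily the analysis used in the studies described above.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(user_ratings, metric_scores):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    if len(user_ratings) != len(metric_scores):
        raise ValueError("inputs must have the same length")
    return pearson(average_ranks(user_ratings), average_ranks(metric_scores))
```

A rho near 1 would suggest that the automated metric orders outputs the way users do; a value near 0 would suggest the metric is capturing something users do not perceive.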

We also study explainability, exploring how model-agnostic techniques (e.g., SHAP, LIME) and symbolic systems can foster trust, particularly in sensitive domains like healthcare and finance. These projects treat evaluation not as an afterthought, but as a co-design process, one that helps users understand, challenge, and adapt AI systems in meaningful ways.

Designing for Moral Pluralism

Technological systems are often deployed into morally complex situations, yet their decision logic tends to be rigid. We explore how context-sensitive ethical frameworks can inform design in domains like autonomous vehicles, recommender systems, and language models.

In the case of autonomous driving, for instance, we studied how public moral judgments shift between utilitarianism and deontology and developed adaptive decision models that reflect these variations. In recommender systems, our research shows that users’ trust hinges on transparency not just about data usage but about the moral assumptions embedded in algorithmic curation. And in generative language models, we evaluate how explanatory feedback influences user trust, especially when navigating socially charged or ethically ambiguous queries. Across these projects, we argue for ethics as interaction, a view that sees moral reasoning not as fixed rules, but as emergent from context, feedback, and explanation.

Privacy, Safety, and Long-Term Impact

Technological evaluation must also extend across time. Short-term usability gains can mask long-term harms or unanticipated shifts in user behavior. We investigate these temporal dimensions through longitudinal studies of wearables, smart home devices, and cyber safety tools.

Our research in privacy spans regions and cultures, showing how design clarity and user agency affect data-sharing behavior. We also evaluate generative AI tools for phishing detection that pair algorithmic analysis with user education, empowering users as active participants in their own security. These efforts reflect a broader concern: how can we ensure that technologies remain safe, ethical, and meaningful not just on day one, but across years of use?

Ongoing Research

Statistical explainability (Stat-XAI): Creating a framework that integrates inferential statistics (ANOVA, regression, chi-square) with effect size analysis to rigorously evaluate feature importance.

LLM-based usability evaluation: Using large language models to simulate heuristic testing, identifying interface flaws early and accelerating accessible system design.

Agentic AI & LLM evaluation: Developing dynamic benchmark competitions for LLMs, assessing reasoning, adaptability, and robustness in evolving environments.
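To make the Stat-XAI item above more concrete, here is a minimal sketch of pairing an inferential test with an effect size: a one-way ANOVA F statistic reported alongside eta squared for a single categorical feature. The function name and input layout are assumptions for illustration, not the framework itself.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, eta_squared) for a list of numeric groups.

    Each group holds the outcome values observed at one level of a
    categorical feature. Assumes at least two groups and some
    within-group variation (otherwise F is undefined).
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation explained by group membership.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation left over inside each group.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    # Effect size: share of total variation explained by the feature.
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, eta_squared


if __name__ == "__main__":
    # Hypothetical outcome values at three levels of one feature.
    print(one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

Reporting the effect size next to the test statistic separates "this feature has a detectable effect" from "this feature explains a large share of the outcome", which is the distinction a feature-importance analysis usually needs.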
