Human Latent Metrics

Kye Shimizu; Kazuma Takada; Shunichi Kasahara; Naoto Ienaga; Maki Sugimoto

doi:https://dl.acm.org/doi/10.1145/3548814.3551460

Abstract

Perceptual and Cognitive Response Correlates to Distance in GAN Latent Space for Facial Images

Research Overview

Generative adversarial networks (GANs) learn high-dimensional vector spaces (latent spaces) whose vectors can be interchangeably represented as images. Advances have extended their ability to computationally generate images, such as faces, that are indistinguishable from real ones and, more importantly, to manipulate images through their inherent vector values in the latent space.

This interchangeability of latent vectors makes it potentially possible to calculate not only the distance in the latent space but also the human perceptual and cognitive distance between images, that is, how humans perceive and recognize images. However, it is still unclear how distance in the latent space correlates with human perception and cognition.

Research Methodology

Our studies investigated the relationship between latent vectors and human perception or cognition through psycho-visual experiments that manipulate the latent vectors of face images.

Perception Study

A change perception task was used to examine whether participants could perceive visual changes in face images before and after moving an arbitrary distance in the latent space.
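As a rough illustration of what moving an arbitrary distance in the latent space could look like, the sketch below shifts a latent vector by a fixed distance along a random direction. The Euclidean distance, the random direction, the 512-dimensional vector, and the names `move_in_latent_space` and `euclidean` are all assumptions made for illustration; the GAN, sampling procedure, and distance definition actually used in the study are not reproduced here.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def move_in_latent_space(z, distance, rng):
    """Return a copy of z shifted by `distance` along a random unit direction.

    Hypothetical helper: the direction is sampled from an isotropic Gaussian
    and normalised, which is an assumption, not the study's procedure.
    """
    direction = [rng.gauss(0.0, 1.0) for _ in z]
    norm = math.sqrt(sum(d * d for d in direction))
    return [zi + distance * d / norm for zi, d in zip(z, direction)]


rng = random.Random(0)
# Stand-in latent vector; 512 dimensions is an assumed size.
z = [rng.gauss(0.0, 1.0) for _ in range(512)]
z_moved = move_in_latent_space(z, 3.0, rng)

# The "before" and "after" vectors are separated by the requested distance.
print(euclidean(z, z_moved))
```

In an actual experiment, `z` and `z_moved` would each be passed through a generator to produce the before and after face images shown to participants.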

Cognition Study

A face recognition task was utilized to examine whether participants could recognize a face as the same, even after moving an arbitrary distance in the latent space.

Key Findings

Our experiments show that the distance between face images in the latent space correlates with human perception and cognition for visual changes in face imagery, and that this relationship can be modeled with a logistic function.
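To make the logistic model concrete, the following sketch fits a two-parameter logistic curve to response proportions measured at several latent distances. Everything in it is assumed for illustration: the parameterisation p(d) = 1 / (1 + exp(-k (d - d0))), the Newton-Raphson fitting routine, and the data, which are synthetic values generated from a known curve rather than results from the study.

```python
import math


def logistic(d, k, d0):
    """Assumed model: probability of a response at latent distance d."""
    return 1.0 / (1.0 + math.exp(-k * (d - d0)))


def fit_logistic(distances, proportions, trials, iterations=25):
    """Maximum-likelihood fit of p = sigmoid(a + b * d) by Newton-Raphson.

    Returns (k, d0) with k = b (slope) and d0 = -a / b (50% point).
    """
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for d, y, n in zip(distances, proportions, trials):
            p = 1.0 / (1.0 + math.exp(-(a + b * d)))
            w = n * p * (1.0 - p)      # binomial weight
            g_a += n * (y - p)         # gradient of the log-likelihood
            g_b += n * (y - p) * d
            h_aa += w                  # entries of the 2x2 information matrix
            h_ab += w * d
            h_bb += w * d * d
        det = h_aa * h_bb - h_ab * h_ab
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return b, -a / b


# Synthetic data: noise-free proportions from a known curve (k=1.5, d0=3.0).
distances = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
proportions = [logistic(d, 1.5, 3.0) for d in distances]
trials = [100] * len(distances)

k_hat, d0_hat = fit_logistic(distances, proportions, trials)
print(k_hat, d0_hat)
```

Because the synthetic proportions are noise-free, the fit should recover the generating parameters closely; real response data would be noisier and would call for confidence intervals on both parameters.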

Impact & Applications

By utilizing our methodology, it will be possible to interchangeably convert between the distance in the latent space and the metric of human perception and cognition, potentially leading to:

Image processing that better reflects human perception and cognition
Improved AI-generated content quality assessment
Better understanding of how AI models represent visual information
Enhanced human-computer interaction through perceptual alignment
Applications in facial recognition and biometrics
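One way to picture the conversion between latent distance and a perceptual or cognitive metric is to invert a fitted logistic curve. The sketch below assumes the form p(d) = 1 / (1 + exp(-k (d - d0))) and uses made-up parameter values; the function names and numbers are hypothetical and do not come from the paper.

```python
import math


def response_probability(distance, k, d0):
    """Latent distance -> assumed probability that a change is noticed."""
    return 1.0 / (1.0 + math.exp(-k * (distance - d0)))


def latent_distance_for(probability, k, d0):
    """Assumed probability -> latent distance (inverse of the logistic)."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return d0 + math.log(probability / (1.0 - probability)) / k


# Hypothetical fitted parameters, chosen only for illustration.
k, d0 = 1.5, 3.0

# Distance at which half of the responses would report a change.
d_half = latent_distance_for(0.5, k, d0)

# Round trip: probability -> distance -> probability.
d_75 = latent_distance_for(0.75, k, d0)
p_back = response_probability(d_75, k, d0)
print(d_half, d_75, p_back)
```

With such a mapping, an image edit could be specified as a target probability of being noticed and then translated into a step size in the latent space, under the assumption that a single curve describes the stimuli in question.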

Collaborators

This research was conducted in collaboration between Sony CSL (Computer Science Laboratories) and Keio University’s Sugimoto Lab, with support from the JST Moonshot R&D Program.

Acknowledgments

Research supported by JST Moonshot R&D Program JPMJMS2013.

Credits

Kye Shimizu (Sony CSL)
Kazuma Takada (Sony CSL / OIST)
Shunichi Kasahara (Sony CSL / OIST)
Naoto Ienaga (Keio, Sugimoto Lab)
Maki Sugimoto (Keio, Sugimoto Lab)

Exhibitions

ACM Symposium on Applied Perception (SAP) 2022

Online

2022-09-22 to 2022-09-23

Grants

JST Moonshot R&D Program JPMJMS2013 (2021)

MIT Media Lab Sandbox Award (2025)

ASPIRE (Achieving Sustainable Partnerships in Innovation, Research, and Entrepreneurship) (2025)

Publications

ACM SAP (2022)

Shimizu, Kye, Naoto Ienaga, Kazuma Takada, Maki Sugimoto, and Shunichi Kasahara. 2022. "Human Latent Metrics: Perceptual and Cognitive Response Correlates to Distance in GAN Latent Space for Facial Images." In ACM Symposium on Applied Perception 2022, 1-10. SAP '22 3. New York, NY, USA: Association for Computing Machinery.

Talks & Presentations

Human Latent Metrics: Perceptual and Cognitive Response Correlates to Distance in GAN Latent Space for Facial Images (2022)

https://sap.acm.org/2022/index.html