Can Multimodal Large Language Models (MLLMs), with capabilities in perception, recognition, understanding, and reasoning, act as independent assistants in art evaluation dialogues? Current MLLM evaluation methods, which rely on subjective human scoring or costly interviews, lack comprehensive coverage of realistic scenarios. This paper proposes a process-oriented Human-Computer Interaction (HCI) space design that provides more accurate and detailed feedback for MLLM assessment and development. The approach helps teachers evaluate art efficiently while recording their interactions for MLLM capability assessment. We introduce ArtMentor, a comprehensive space that integrates a dataset and three systems to optimize MLLM evaluation. The dataset comprises 380 sessions conducted by five art teachers across nine critical dimensions. The modular system includes agents for entity recognition, review generation, and suggestion generation, enabling iterative upgrades. Machine learning and natural language processing techniques ensure the reliability of evaluations. Results confirm GPT-4o’s effectiveness in assisting teachers in art evaluation dialogues. Our contributions are available at https://artmentor.github.io/.
Paper (ACM Digital Library): https://dl.acm.org/doi/10.1145/3706598.3713274
The ACM CHI Conference on Human Factors in Computing Systems (https://chi2025.acm.org/)
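To make the modular agent design concrete, here is a minimal sketch of a three-agent pipeline (entity recognition → review generation → suggestion generation) driving GPT-4o through the OpenAI chat completions API. The function names, prompts, and single-model routing are illustrative assumptions, not ArtMentor's actual implementation; the point is that each agent is an independently replaceable module, which is what enables iterative upgrades.

```python
# Hypothetical sketch of ArtMentor-style modular agents; prompts and
# function names are assumptions, not the authors' code.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def _ask(system_prompt: str, user_content) -> str:
    """Send one chat turn to GPT-4o and return the text of its reply."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_content},
        ],
    )
    return response.choices[0].message.content


def recognize_entities(image_url: str) -> str:
    """Agent 1: name the salient visual entities in the student artwork."""
    return _ask(
        "List the salient entities (subjects, colors, composition) in this artwork.",
        [{"type": "image_url", "image_url": {"url": image_url}}],
    )


def generate_review(entities: str, dimension: str) -> str:
    """Agent 2: write a short review along one evaluation dimension."""
    return _ask(
        f"Write a concise art review focused on the dimension: {dimension}.",
        f"Recognized entities:\n{entities}",
    )


def generate_suggestions(review: str) -> str:
    """Agent 3: turn the review into actionable suggestions for the student."""
    return _ask(
        "Give two or three concrete, encouraging improvement suggestions.",
        f"Review:\n{review}",
    )


if __name__ == "__main__":
    # Hypothetical artwork URL for illustration only.
    entities = recognize_entities("https://example.com/student_artwork.png")
    review = generate_review(entities, dimension="use of color")
    print(generate_suggestions(review))
```

Because each agent only exchanges plain text with its neighbors, any one of them could be swapped for a different model or prompt without touching the others, matching the paper's emphasis on iterative, per-module upgrades.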