ArtMentor: AI-Assisted Evaluation of Artworks to Explore Multimodal Large Language Models Capabilities

Abstract

Can Multimodal Large Language Models (MLLMs), with capabilities in perception, recognition, understanding, and reasoning, act as independent assistants in art evaluation dialogues? Current MLLM evaluation methods, reliant on subjective human scoring or costly interviews, lack comprehensive scenario coverage. This paper proposes a process-oriented Human-Computer Interaction (HCI) space design for more accurate MLLM assessment and development. This approach aids teachers in efficient art evaluation and records interactions for MLLM capability assessment. We introduce ArtMentor, a comprehensive space integrating a dataset and three systems for optimized MLLM evaluation. It includes 380 sessions from five art teachers across nine critical dimensions. The modular system features entity recognition, review generation, and suggestion generation agents, enabling iterative upgrades. Machine learning and natural language processing ensure reliable evaluations. Results confirm GPT-4o’s effectiveness in assisting teachers in art evaluation dialogues. Our contributions are available at https://artmentor.github.io/.
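The abstract describes a modular pipeline of three agents (entity recognition, review generation, and suggestion generation). Below is a minimal, hypothetical sketch of how such a pipeline could be chained; the class, function names, and the `chat` helper are illustrative assumptions only and do not reflect the authors' implementation or the code released at https://artmentor.github.io/.

```python
# Hypothetical sketch of a three-agent evaluation pipeline
# (entity recognition -> review generation -> suggestion generation).
# The `chat` helper is an assumed stand-in for a call to an MLLM such as GPT-4o.
from dataclasses import dataclass, field


@dataclass
class EvaluationSession:
    """Accumulates one art-evaluation dialogue across the modular agents."""
    artwork_description: str
    entities: list[str] = field(default_factory=list)
    review: str = ""
    suggestions: list[str] = field(default_factory=list)


def chat(prompt: str) -> str:
    """Placeholder for an MLLM call; wire this to your own model endpoint."""
    raise NotImplementedError


def entity_recognition_agent(session: EvaluationSession) -> None:
    # Extract salient entities (subjects, colors, composition cues) from the artwork.
    reply = chat(f"List the key entities in this artwork: {session.artwork_description}")
    session.entities = [e.strip() for e in reply.split(",") if e.strip()]


def review_generation_agent(session: EvaluationSession) -> None:
    # Draft a review grounded in the recognized entities.
    session.review = chat(
        f"Write an art review covering these entities: {', '.join(session.entities)}"
    )


def suggestion_generation_agent(session: EvaluationSession) -> None:
    # Turn the review into actionable suggestions for the student.
    reply = chat(f"Give improvement suggestions based on this review: {session.review}")
    session.suggestions = [s for s in reply.splitlines() if s.strip()]
```

Because each agent only reads and writes the shared session object, any one of them can be upgraded or swapped independently, which is the kind of iterative, modular replacement the abstract alludes to.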

Authors
Chanjin Zheng
Shanghai Institute of Artificial Intelligence for Education, Shanghai, China
Zengyi Yu
East China Normal University, Shanghai, China
Yilin Jiang
Zhejiang University of Technology, Hangzhou, China
Mingzi Zhang
East China Normal University, Shanghai, China
Xunuo Lu
Zhejiang University of Technology, Hangzhou, China
Jing Jin
Zhejiang Normal University, Jinhua, China
Liteng Gao
University of Shanghai for Science and Technology, Shanghai, China
DOI

10.1145/3706598.3713274

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713274


Conference: CHI 2025

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2025.acm.org/)

Session: Learning, Creating, and Understanding Art

G416+G417
7 presentations
2025-04-30 20:10:00 – 2025-04-30 21:40:00