Reconstructing realistic digital twins has become crucial as advances in mixed reality, the metaverse, and robotics demand more accurate simulations of the physical world. Despite technical progress, building high-fidelity digital twins from a systematic, human-centered perspective remains underexplored. Drawing on the human processing model, we decompose human-centric reality into perception, motion, and cognition, and define a reality-preserving digital twin (RPDT) as a reconstruction that integrates these three dimensions. We present RealTwin, an attribute-graph-based representation and inference framework for RPDT. Leveraging the grounding capabilities of Multimodal Large Language Models (MLLMs), RealTwin chains AI tools to construct attribute graphs that faithfully encode real-world properties. We validate RealTwin through a technical evaluation, which shows promising results in graph parsing and attribute inference, and a user study, which assesses its applicability across diverse user groups. Informed by RealTwin, we discuss critical issues, including ecology, interaction space, and real-world adoption, for future end-to-end, fine-grained, and scalable digital twin reconstruction.
ACM CHI Conference on Human Factors in Computing Systems