Large-scale urban 3D models are now central to smart-city visualisation, digital-twin platforms, urban planning review, and spatial decision support. Their effective deployment, however, remains constrained by the computational burden of real-time rendering and by the semantic fragmentation introduced by conventional tile-based level-of-detail (LOD) scheduling. This paper presents a spatial-distribution-aware data organisation framework for large-scale urban 3D scenes that integrates a statically constructed three-level R-tree with a dynamically constructed adaptive quadtree. The method first classifies buildings at macro-, meso-, and micro-scales according to administrative boundaries, planning blocks, and inter-building similarity, and then records these relationships in an R-tree. A 3D tiled LOD model is subsequently organised through an adaptive quadtree, while the R-tree is used as a semantic constraint to guide tile scheduling and preserve the integrity of the user's area of interest. The framework is demonstrated using the Kowloon Peninsula of Hong Kong, China, covering approximately 39.028 km² with a 90 GB 3D model dataset containing about 26,000 buildings. The model is processed into 21 LOD levels and evaluated against a conventional loading strategy. The results show that the proposed approach reduces visible building fragmentation at macro-, meso-, and micro-scales, preserves semantic coherence within 1.5 s of loading, improves cross-scale loading speed, and increases real-time rendering performance by approximately 10 frames per second. These findings position the method as a practically relevant data-organisation strategy for smart-city systems that require both rendering fluency and cognitively coherent urban scene presentation.
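The central scheduling idea, using precomputed semantic groups as a constraint on quadtree tile selection so that a group touched by the viewer's area of interest is loaded as a whole rather than fragmented, can be illustrated with a deliberately simplified 2D sketch. All names, the flat dictionary standing in for the three-level R-tree, and the tiny tile counts below are hypothetical; the paper's actual pipeline operates on a 21-level tiled 3D model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)

def intersects(a: Box, b: Box) -> bool:
    """Axis-aligned box overlap test (shared edges count as overlap)."""
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

@dataclass
class QuadTile:
    box: Box
    lod: int
    children: List["QuadTile"] = field(default_factory=list)

def build_quadtree(box: Box, max_lod: int, lod: int = 0) -> QuadTile:
    """Recursively subdivide the scene extent into four children per level."""
    tile = QuadTile(box, lod)
    if lod < max_lod:
        xmin, ymin, xmax, ymax = box
        xm, ym = (xmin + xmax) / 2, (ymin + ymax) / 2
        for child in ((xmin, ymin, xm, ym), (xm, ymin, xmax, ym),
                      (xmin, ym, xm, ymax), (xm, ym, xmax, ymax)):
            tile.children.append(build_quadtree(child, max_lod, lod + 1))
    return tile

def schedule(tile: QuadTile, view: Box, groups: Dict[str, Box],
             target_lod: int, out: List[Box]) -> None:
    """Collect tile boxes to load at target_lod.

    The query region is expanded from the raw view box to the bounding
    box of every semantic group the view touches (here a plain dict of
    group name -> box stands in for the R-tree query), so a block that
    is partly visible is scheduled in full instead of being fragmented.
    """
    region = [view] + [g for g in groups.values() if intersects(g, view)]
    if not any(intersects(tile.box, r) for r in region):
        return
    if tile.lod == target_lod or not tile.children:
        out.append(tile.box)
        return
    for child in tile.children:
        schedule(child, view, groups, target_lod, out)
```

As a usage example under these assumptions: over an 8 x 8 extent with a group box `(0, 0, 3, 3)`, a small view window in the corner schedules all four level-2 tiles covering the group, whereas the same view with no group constraint schedules only the single tile it overlaps, which is exactly the fragmentation the semantic constraint is meant to avoid.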