Ontology-Guided YOLOv8 for Semantic Object Detection and Scene Interpretation in Remote-Sensing Smart-City Environments

Baojia Gong

Research article

Journal of Urban Development and Smart Cities

Volume 2 Issue 1
Pages: 196-209
DOI: https://doi.org/10.66033/judsc2025-220

Author(s): Baojia Gong¹

¹College of Civil Engineering, Lanzhou Jiaotong University, Lanzhou, Gansu, 730070, China

Published: 15/10/2025

Abstract

Object detection in satellite imagery is increasingly important for urban planning, disaster management, and environmental monitoring in smart-city settings. This article presents an ontology-guided deep learning framework that integrates a lightweight YOLOv8 detector with an ontology reasoning module for semantic scene interpretation. The system detects five urban-environment classes (residences, roads, shorelines, swimming pools, and vegetation) in Sentinel-2 MSI imagery collected over the southern Durban metropolitan region of KwaZulu-Natal, South Africa. The dataset consists of 92 annotated images resized to 640 × 640 pixels and partitioned into 61 training, 21 validation, and 10 testing images, which are then augmented to 6,100 training, 2,100 validation, and 1,000 testing images. The visual recognition component uses a YOLOv8 architecture with a C2f-based backbone and neck and anchor-free detection heads. The semantic layer uses RDF/OWL concepts, queried through SPARQL, to represent hierarchical class relations, object adjacency, and interpretable scene semantics. On the proposed dataset, the YOLOv8 model attains 68% precision, 60% recall, 43% mAP@50, and 17.5% mAP@50–95, with the highest class-specific precision observed for swimming pools (62.7%) and the highest class-specific mAP@50 for shorelines (99.5%). The ontology remains lightweight and scalable, with a maximum inheritance depth of 3 and at most 4 children per class, which keeps reasoning efficient and its computational demand low. By combining object detection with structured semantic inference, the framework provides an interpretable analytical layer for smart-city land-cover understanding, disaster-aware urban monitoring, and knowledge-driven scene analysis.

Keywords: smart cities; urban planning; remote sensing; object detection; image analysis; ontology; knowledge representation; semantic reasoning; YOLOv8; disaster management
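The dataset figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the names `ORIGINAL_SPLIT`, `AUGMENTED_SPLIT`, `augmentation_factors`, and `split_fractions` are hypothetical, and the per-subset augmentation factor is inferred from the reported counts rather than stated in the abstract.

```python
# Split sizes as reported in the abstract (before and after augmentation).
ORIGINAL_SPLIT = {"train": 61, "val": 21, "test": 10}
AUGMENTED_SPLIT = {"train": 6100, "val": 2100, "test": 1000}


def augmentation_factors(original, augmented):
    """Return the augmented/original ratio per subset.

    Raises ValueError if a subset was not scaled by a whole number,
    which would suggest the reported counts are inconsistent.
    """
    factors = {}
    for name, count in original.items():
        quotient, remainder = divmod(augmented[name], count)
        if remainder:
            raise ValueError(f"{name}: {augmented[name]} is not a multiple of {count}")
        factors[name] = quotient
    return factors


def split_fractions(split):
    """Return each subset's share of the total image count."""
    total = sum(split.values())
    return {name: count / total for name, count in split.items()}


if __name__ == "__main__":
    print("total original images:", sum(ORIGINAL_SPLIT.values()))
    print("augmentation factors:", augmentation_factors(ORIGINAL_SPLIT, AUGMENTED_SPLIT))
    for name, share in split_fractions(ORIGINAL_SPLIT).items():
        print(f"{name}: {share:.1%}")
```

Under these figures, the three subsets sum to the 92 annotated images, each subset appears to be expanded by the same factor, and the partition is close to a 66/23/11 split.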

Copyright © 2025 Baojia Gong. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Cite this Article

APA

Gong, B. (2025). Ontology-Guided YOLOv8 for Semantic Object Detection and Scene Interpretation in Remote-Sensing Smart-City Environments. Journal of Urban Development and Smart Cities, 2(1), 196-209. https://doi.org/10.66033/judsc2025-220

MLA

Gong, Baojia. "Ontology-Guided YOLOv8 for Semantic Object Detection and Scene Interpretation in Remote-Sensing Smart-City Environments." Journal of Urban Development and Smart Cities, vol. 2, no. 1, 2025, pp. 196-209.

Harvard

Gong, B., 2025. Ontology-Guided YOLOv8 for Semantic Object Detection and Scene Interpretation in Remote-Sensing Smart-City Environments. Journal of Urban Development and Smart Cities, 2(1), pp.196-209.

Vancouver

Gong B. Ontology-Guided YOLOv8 for Semantic Object Detection and Scene Interpretation in Remote-Sensing Smart-City Environments. Journal of Urban Development and Smart Cities. 2025;2(1):196-209.
