Ontology-Guided YOLOv8 for Semantic Object Detection and Scene Interpretation in Remote-Sensing Smart-City Environments

Author(s): B. W. Gong¹
¹School of Mechanical and Electronic Engineering, Hubei Polytechnic University, Huangshi, 435003, China

Abstract

Object detection in satellite remote-sensing imagery is increasingly important for urban planning, disaster management, and environmental monitoring in smart-city settings. This paper presents an ontology-guided deep learning framework that integrates a lightweight YOLOv8 detector with an ontology reasoning module for semantic scene interpretation. The system detects five urban-environment classes (residences, roads, shorelines, swimming pools, and vegetation) in Sentinel-2 MSI imagery collected over the southern Durban metropolitan region of KwaZulu-Natal, South Africa. The dataset consists of 92 annotated images resized to 640 × 640 pixels and partitioned into 61 training, 21 validation, and 10 testing images; augmentation expands each split 100-fold, to 6,100 training, 2,100 validation, and 1,000 testing images. The visual recognition component uses a YOLOv8 architecture with a C2f-based backbone/neck design and anchor-free detection heads, while the semantic layer uses RDF/OWL concepts queried through SPARQL to represent hierarchical class relations, object adjacency, and interpretable scene semantics. On the proposed dataset, the YOLOv8 model attains 68% precision, 60% recall, 43% mAP@50, and 17.5% mAP@50–95, with the highest class-specific precision observed for swimming pools (62.7%) and the highest class-specific mAP@50 for shorelines (99.5%). The ontology remains lightweight and scalable, with a maximum depth of inheritance of 3 and a maximum number of children of 4, which keeps reasoning efficient and its computational demand low. By combining object detection with structured semantic inference, the framework provides an interpretable analytical layer for smart-city land-cover understanding, disaster-aware urban monitoring, and knowledge-driven scene analysis.
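
The two ontology size metrics quoted above (maximum depth of inheritance and maximum number of children) can be illustrated with a minimal Python sketch. The class hierarchy below is hypothetical: only the five leaf class names come from the abstract, while `Thing`, `SceneObject`, `BuiltFeature`, `NaturalFeature`, and the helper `hierarchy_metrics` are assumptions introduced for illustration and are not the ontology described in this paper. The sketch also assumes the subclass relation forms a tree with no cycles.

```python
from collections import defaultdict

# Hypothetical (child, parent) subclass edges. Only the five leaf names come
# from the abstract; the intermediate classes are illustrative assumptions.
SUBCLASS_OF = [
    ("SceneObject", "Thing"),
    ("BuiltFeature", "SceneObject"),
    ("NaturalFeature", "SceneObject"),
    ("Residence", "BuiltFeature"),
    ("Road", "BuiltFeature"),
    ("SwimmingPool", "BuiltFeature"),
    ("Shoreline", "NaturalFeature"),
    ("Vegetation", "NaturalFeature"),
]


def hierarchy_metrics(edges, root="Thing"):
    """Return (max depth of inheritance, max number of direct children).

    Depth is counted in subclass edges from the root. The hierarchy is
    assumed to be a tree, so no cycle detection is performed.
    """
    children = defaultdict(list)
    for child, parent in edges:
        children[parent].append(child)

    # Iterative depth-first walk from the root, tracking the deepest level.
    max_depth = 0
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        max_depth = max(max_depth, depth)
        for child in children.get(node, []):
            stack.append((child, depth + 1))

    # Largest number of direct subclasses under any single class.
    max_children = max((len(c) for c in children.values()), default=0)
    return max_depth, max_children


if __name__ == "__main__":
    depth, fan_out = hierarchy_metrics(SUBCLASS_OF)
    print(f"max depth of inheritance: {depth}, max children: {fan_out}")
```

For this toy hierarchy the function reports a depth of 3 and at most 3 children per class. The values reported for the actual ontology (3 and 4) would have to be computed from its own subclass axioms, for example after loading the OWL file with an RDF library.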

Keywords: smart cities; urban planning; remote sensing; object detection; image analysis; ontology; knowledge representation; semantic reasoning; YOLOv8; disaster management
Copyright © 2025 B. W. Gong. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.