Robust and lightweight UAV visual localization in GNSS-denied environments using a variational autoencoder

Authors

  • Phan Huy Anh (Corresponding Author) Institute of Information Technology and Electronics, Academy of Military Science and Technology
  • Ngo Van Quan Institute of Information Technology and Electronics, Academy of Military Science and Technology
  • Bui Thi Thanh Tam Institute of Information Technology and Electronics, Academy of Military Science and Technology
  • Cao Van Toan Institute of Information Technology and Electronics, Academy of Military Science and Technology

DOI:

https://doi.org/10.54939/1859-1043.j.mst.112.2026.56-63

Keywords:

UAV; Visual localization; Variational autoencoder.

Abstract

This paper proposes a robust, lightweight visual localization framework for UAVs operating in GNSS-denied environments. A Variational Autoencoder (VAE) trained on full RGB imagery extracts rich features and compresses them into an optimized 256-dimensional latent space that accommodates onboard constraints. These features are matched using unnormalized Euclidean distance, and a Linear Kalman Filter (LKF) smooths the resulting trajectory. In experiments, the model outperforms the baselines, achieving a raw RMSE of 0.087 m, which improves to 0.065 m with the LKF. The approach supports stable, highly accurate real-time navigation.
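
The matching and smoothing steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `nearest_reference` and `AxisKalman`, the constant-velocity motion model, and the noise values `q` and `r` are all assumptions chosen for illustration, and the VAE encoder that would produce the 256-dimensional latent vectors is omitted.

```python
import math


def nearest_reference(query, ref_latents, ref_positions):
    """Return (position, distance) of the reference latent closest to `query`.

    Uses plain (unnormalized) Euclidean distance between latent vectors.
    In the paper's setting each vector would have 256 dimensions.
    """
    best = min(range(len(ref_latents)),
               key=lambda i: math.dist(query, ref_latents[i]))
    return ref_positions[best], math.dist(query, ref_latents[best])


class AxisKalman:
    """Linear Kalman filter for one position axis.

    Assumes a constant-velocity model with state [position, velocity];
    q (process noise) and r (measurement noise) are hypothetical values.
    """

    def __init__(self, p0, dt=1.0, q=1e-3, r=1e-2):
        self.p, self.v = p0, 0.0
        self.P = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
        self.dt, self.q, self.r = dt, q, r

    def step(self, z):
        """Predict one time step, then update with position measurement z."""
        dt, P = self.dt, self.P
        # Predict: x = F x, P = F P F^T + Q, with F = [[1, dt], [0, 1]]
        p_pred = self.p + dt * self.v
        v_pred = self.v
        P00 = P[0][0] + dt * (P[0][1] + P[1][0]) + dt * dt * P[1][1] + self.q
        P01 = P[0][1] + dt * P[1][1]
        P10 = P[1][0] + dt * P[1][1]
        P11 = P[1][1] + self.q
        # Update with measurement matrix H = [1, 0]
        S = P00 + self.r            # innovation variance
        K0, K1 = P00 / S, P10 / S   # Kalman gain
        y = z - p_pred              # innovation
        self.p = p_pred + K0 * y
        self.v = v_pred + K1 * y
        # P = (I - K H) P_pred
        self.P = [[(1 - K0) * P00, (1 - K0) * P01],
                  [P10 - K1 * P00, P11 - K1 * P01]]
        return self.p
```

In use, each camera frame would be encoded to a latent vector, matched against a database of geo-referenced latents with `nearest_reference`, and the matched x and y coordinates fed to one `AxisKalman` instance per axis to obtain the smoothed trajectory.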

Published

25-06-2026

How to Cite

[1]
Phan Huy Anh, Ngo Van Quan, Bui Thi Thanh Tam, and Cao Van Toan, “Robust and lightweight UAV visual localization in GNSS-denied environments using a variational autoencoder”, J. Mil. Sci. Technol., vol. 112, no. 112, pp. 56–63, Jun. 2026.

Section

Electronics & Automation
