Real-time long-term target tracking on an ARM platform with NPU acceleration and integration into UAV line-of-sight stabilization
DOI: https://doi.org/10.54939/1859-1043.j.mst.109.2026.25-34
Keywords: Long-term target tracking; YOLOv10s; fDSST; NPU; ARM; UAV line-of-sight stabilization.
Abstract
This paper presents a real-time long-term target tracking algorithm optimized for ARM embedded platforms with integrated NPU acceleration. The system combines a pruned and quantized YOLOv10s detector with a NEON-optimized fDSST tracker. The two blocks are linked by an adaptive confidence index based on multi-feature fusion, together with a hysteresis mechanism that activates the detector only when necessary. Theoretical analysis establishes the boundedness of the correlation filter, the stability of the adaptive weight update process, and an exponential bound on the probability of false state transitions. Experimental results on the Orange Pi 5 Max platform show that the system achieves an average speed of 19 FPS for detection and over 100 FPS for tracking, while remaining stable in the presence of delay, noise, and transient occlusion. Monte Carlo simulations and line-of-sight (LOS) stabilization simulations on a UAV rotating platform confirm a mean maximum angular error of approximately 0.006 rad and fast target reacquisition after target loss. The algorithm has potential applications in real-time optical surveillance, reconnaissance, and line-of-sight stabilization systems.
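The hysteresis-gated switching between tracker and detector described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `HysteresisGate`, the helper `fused_confidence`, the thresholds 0.3 and 0.6, and the normalized weighted-sum fusion rule are all assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Sequence


def fused_confidence(features: Sequence[float], weights: Sequence[float]) -> float:
    """Hypothetical fusion rule: normalized weighted sum of per-feature scores in [0, 1]."""
    total = sum(weights)
    return sum(w * f for w, f in zip(weights, features)) / total


@dataclass
class HysteresisGate:
    """Two-threshold switch that decides when the (expensive) detector should run."""
    low: float = 0.3    # assumed threshold: below this, hand over to the detector
    high: float = 0.6   # assumed threshold: above this, return to tracker-only mode
    detecting: bool = False

    def update(self, confidence: float) -> bool:
        """Return True if the detector should run for the current frame."""
        if self.detecting:
            # Stay in detection mode until confidence clearly recovers.
            if confidence >= self.high:
                self.detecting = False
        elif confidence <= self.low:
            # Confidence collapsed: start re-detection.
            self.detecting = True
        return self.detecting


gate = HysteresisGate()
confidences = [0.8, 0.5, 0.25, 0.5, 0.7, 0.5]
decisions = [gate.update(c) for c in confidences]
print(decisions)
```

Because the two thresholds differ, a confidence value of 0.5 leaves the gate in whichever mode it was already in, which is what prevents rapid toggling between tracker and detector when the confidence hovers near a single cut-off.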
