
Indoor Scene recognition for Micro Aerial Vehicles Navigation using Enhanced SIFT-ScSPM Descriptors

Published online by Cambridge University Press: 05 July 2019

B. Anbarasu*
Affiliation:
(Madras Institute of Technology Campus, Anna University, Chennai, India)
G. Anitha
Affiliation:
(Madras Institute of Technology Campus, Anna University, Chennai, India)

Abstract

In this paper, a new visual descriptor for scene recognition, the Enhanced Scale Invariant Feature Transform-based Sparse coding Spatial Pyramid Matching (Enhanced SIFT-ScSPM) descriptor, is proposed by combining a Bag of Words (BOW)-based visual descriptor (SIFT-ScSPM) with Gist-based descriptors (Enhanced Gist-Enhanced multichannel Gist (Enhanced mGist)). Indoor scene classification is carried out by multi-class linear and non-linear Support Vector Machine (SVM) classifiers. The feature extraction methodology is described, and several visual descriptors used for indoor scene recognition are critically reviewed from an experimental perspective. An empirical study on the Massachusetts Institute of Technology (MIT) 67 indoor scene classification data set assesses the classification accuracy of state-of-the-art visual descriptors and of the proposed Enhanced mGist, Speeded Up Robust Features-Spatial Pyramid Matching (SURF-SPM) and Enhanced SIFT-ScSPM visual descriptors. Experimental results show that the proposed Enhanced SIFT-ScSPM visual descriptor achieves a higher classification rate, precision, recall and area under the Receiver Operating Characteristic (ROC) curve than the state-of-the-art descriptors and the proposed Enhanced mGist and SURF-SPM visual descriptors.
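The abstract compares descriptors by classification rate, precision and recall. As a rough illustration of how these quantities can be computed for a multi-class problem, the following minimal sketch derives them from true and predicted labels. The function name `classification_metrics`, the class names and the label lists are all hypothetical and are not taken from the paper, and the per-class (one-vs-rest) definition of precision and recall is an assumption about how the authors report them.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return (classification rate, per-class precision, per-class recall).

    Hypothetical helper for illustration only; per-class precision and
    recall are computed one-vs-rest, which is an assumption.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    classes = sorted(set(y_true) | set(y_pred))
    predicted = Counter(y_pred)  # how often each class was predicted
    actual = Counter(y_true)     # how often each class truly occurs
    correct = Counter()          # true positives per class
    for t, p in zip(y_true, y_pred):
        if t == p:
            correct[t] += 1

    # Classification rate: fraction of images assigned the right class.
    rate = sum(correct.values()) / len(y_true)
    # Precision: of the images predicted as class c, the fraction that are c.
    precision = {c: correct[c] / predicted[c] if predicted[c] else 0.0 for c in classes}
    # Recall: of the images that truly are class c, the fraction predicted as c.
    recall = {c: correct[c] / actual[c] if actual[c] else 0.0 for c in classes}
    return rate, precision, recall


# Made-up labels for three indoor scene classes (not data from the paper).
y_true = ["corridor", "corridor", "kitchen", "kitchen", "office", "office"]
y_pred = ["corridor", "kitchen", "kitchen", "kitchen", "office", "corridor"]
rate, precision, recall = classification_metrics(y_true, y_pred)
```

With 67 classes, as in the MIT indoor data set, per-class values like these would typically be averaged before descriptors are compared; whether the paper uses macro- or micro-averaging is not stated in the abstract.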

Type
Research Article
Copyright
Copyright © The Royal Institute of Navigation 2019 

