[1] Baraldi, L., Douze, M., Cucchiara, R., and Jégou, H. LAMV: Learning to align and match videos with kernelized temporal layers. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2018), pp. 7804–7813.

[2] Cai, Y., Yang, L., Ping, W., Wang, F., Mei, T., Hua, X.-S., and Li, S. Million-scale near-duplicate video retrieval system. In ACM Multimedia (2011), pp. 837–838.

[3] Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei, L. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition (2009), IEEE, pp. 248–255.

[4] Douze, M., Jégou, H., and Schmid, C. An image-based approach to video copy detection with spatio-temporal post-filtering. IEEE Transactions on Multimedia 12, 4 (2010), 257–266.

[5] Douze, M., Jégou, H., Schmid, C., and Pérez, P. Compact video description for copy detection with precise temporal alignment. In European Conference on Computer Vision (2010), Springer, pp. 522–535.

[6] Esmaeili, M. M., Fatourechi, M., and Ward, R. K. A robust and fast video copy detection system using content-based fingerprinting. IEEE Transactions on Information Forensics and Security 6, 1 (2011), 213–226.

[7] Guzman-Zavaleta, Z. J., and Feregrino-Uribe, C. Partial-copy detection of non-simulated videos using learning at decision level. Multimedia Tools and Applications (2018), 1–20.

[8] Heikkilä, M., Pietikäinen, M., and Schmid, C. Description of interest regions with local binary patterns. Pattern Recognition 42, 3 (2009), 425–436.

[9] Ilg, E., Mayer, N., Saikia, T., Keuper, M., Dosovitskiy, A., and Brox, T. FlowNet 2.0: Evolution of optical flow estimation with deep networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017), pp. 2462–2470.

[10] Jiang, Y.-G., Jiang, Y., and Wang, J. VCDB: A large-scale database for partial copy detection in videos. In European Conference on Computer Vision (2014), Springer, pp. 357–371.

[11] Jiang, Y.-G., and Wang, J. Partial copy detection in videos: A benchmark and an evaluation of popular methods. IEEE Transactions on Big Data 2, 1 (2016), 32–42.

[12] Kordopatis-Zilos, G., Papadopoulos, S., Patras, I., and Kompatsiaris, I. FIVR: Fine-grained incident video retrieval. IEEE Transactions on Multimedia (2019), 1–1.

[13] Kordopatis-Zilos, G., Papadopoulos, S., Patras, I., and Kompatsiaris, Y. Near-duplicate video retrieval by aggregating intermediate CNN layers. In International Conference on Multimedia Modeling (2017), Springer, pp. 251–263.

[14] Kordopatis-Zilos, G., Papadopoulos, S., Patras, I., and Kompatsiaris, Y. Near-duplicate video retrieval with deep metric learning. In 2017 IEEE International Conference on Computer Vision Workshops (ICCVW) (2017), IEEE, pp. 347–356.

[15] Krizhevsky, A., Sutskever, I., and Hinton, G. E. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (2012), pp. 1097–1105.

[16] Long, X., Gan, C., de Melo, G., Wu, J., Liu, X., and Wen, S. Attention clusters: Purely attention based local feature integration for video classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2018), pp. 7834–7843.

[17] Lowe, D. G. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60, 2 (2004), 91–110.

[18] Maaten, L. v. d., and Hinton, G. Visualizing data using t-SNE. Journal of Machine Learning Research 9, Nov (2008), 2579–2605.

[19] Schroff, F., Kalenichenko, D., and Philbin, J. FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2015), pp. 815–823.

[20] Simonyan, K., and Zisserman, A. Two-stream convolutional networks for action recognition in videos. In Advances in Neural Information Processing Systems (2014), pp. 568–576.

[21] Smeaton, A. F., Over, P., and Kraaij, W. Evaluation campaigns and TRECVid. In Proceedings of the 8th ACM International Workshop on Multimedia Information Retrieval (2006), ACM, pp. 321–330.

[22] Song, J., Yang, Y., Huang, Z., Shen, H. T., and Hong, R. Multiple feature hashing for real-time large scale near-duplicate video retrieval. In Proceedings of the 19th ACM International Conference on Multimedia (2011), ACM, pp. 423–432.

[23] Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 2818–2826.

[24] Tan, H.-K., Ngo, C.-W., Hong, R., and Chua, T.-S. Scalable detection of partial near-duplicate videos by visual-temporal consistency. In Proceedings of the 17th ACM International Conference on Multimedia (2009), ACM, pp. 145–154.

[25] Uchida, Y., Takagi, K., and Sakazawa, S. Fast and accurate content-based video copy detection using bag-of-global visual features. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2012), IEEE, pp. 1029–1032.

[26] Wang, L., Bao, Y., Li, H., Fan, X., and Luo, Z. Compact CNN based video representation for efficient video copy detection. In International Conference on Multimedia Modeling (2017), Springer, pp. 576–587.

[27] Wang, L., Xiong, Y., Wang, Z., Qiao, Y., Lin, D., Tang, X., and Van Gool, L. Temporal segment networks: Towards good practices for deep action recognition. In European Conference on Computer Vision (2016), Springer, pp. 20–36.

[28] Wu, C.-Y., Manmatha, R., Smola, A. J., and Krahenbuhl, P. Sampling matters in deep embedding learning. In Proceedings of the IEEE International Conference on Computer Vision (2017), pp. 2840–2848.

[29] Wu, X., Hauptmann, A. G., and Ngo, C.-W. Practical elimination of near-duplicates from web video search. In Proceedings of the 15th ACM International Conference on Multimedia (2007), ACM, pp. 218–227.

[30] Zhang, Y., and Zhang, X. Effective real-scenario video copy detection. In 2016 23rd International Conference on Pattern Recognition (ICPR) (2016), IEEE, pp. 3951–3956.

[31] Zhu, Y., Huang, X., Huang, Q., and Tian, Q. Large-scale video copy retrieval with temporal-concentration SIFT. Neurocomputing 187 (2016), 83–91.