research-article
Just Accepted
Authors: Dan Song, Shumeng Huo, Xinwei Fu, Chumeng Zhang, Wenhui Li, An-An Liu
ACM Transactions on Multimedia Computing, Communications and Applications
Accepted on 16 August 2024
Online AM: 30 August 2024
Abstract
Image-based 3D shape retrieval (IBSR) is a cross-modal matching task that searches a 3D repository for shapes similar to a natural query image. The topic has received sustained attention, with approaches including joint embedding, adversarial learning, and contrastive learning. The modality gap and the diversity of instance similarities are two obstacles to accurate and fine-grained cross-modal matching. To overcome these two obstacles, we propose a style-mixed contrastive learning method (SC-IBSR). On the one hand, we propose a style transition module that mixes the styles of images and rendered shape views into an intermediate style and injects it into image contents. The resulting style-mixed image features serve as a bridge for the subsequent contrastive learning, alleviating the modality gap. On the other hand, the proposed fine-grained consistency constraint targets cross-domain contrast and accounts for the differing importance of negative (positive) samples. Extensive experiments demonstrate the superiority of the style-mixed cross-modal contrastive learning on both the instance-level retrieval benchmarks (i.e., Pix3D, Stanford Cars, and Comp Cars, which annotate shapes to images) and the unsupervised category-level retrieval benchmarks (i.e., MI3DOR-1 and MI3DOR-2, with unlabeled 3D shapes). Moreover, experiments on the Office-31 dataset validate the generalization capability of our method. Code and pretrained models will be available at https://github.com/honoria0204/SC-IBSR.
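As a rough illustration of the two ideas named in the abstract, the sketch below mixes channel-wise feature statistics of an image and a rendered view into an intermediate style, then scores a cross-modal pair with a plain InfoNCE loss. It is not the SC-IBSR implementation. It assumes that "style" can be approximated by per-channel mean and standard deviation (as in AdaIN-like methods), and the names `mix_style`, `info_nce`, `alpha`, and `tau` are hypothetical. The paper's weighting of negative (positive) samples is not modeled here.

```python
import math
from statistics import fmean, pstdev


def channel_stats(feat):
    """Return (mean, std) per channel; feat is a list of channels (lists of floats)."""
    return [(fmean(ch), pstdev(ch)) for ch in feat]


def mix_style(image_feat, view_feat, alpha=0.5, eps=1e-5):
    """Re-style image content with statistics interpolated between the two domains.

    Assumption: style is the per-channel mean/std; alpha=1 keeps the image style,
    alpha=0 adopts the rendered-view style.
    """
    mixed = []
    stats = zip(image_feat, channel_stats(image_feat), channel_stats(view_feat))
    for ch, (mu_i, sd_i), (mu_v, sd_v) in stats:
        mu_m = alpha * mu_i + (1 - alpha) * mu_v  # intermediate mean
        sd_m = alpha * sd_i + (1 - alpha) * sd_v  # intermediate std
        # Normalize the image content, then inject the intermediate style.
        mixed.append([sd_m * (x - mu_i) / (sd_i + eps) + mu_m for x in ch])
    return mixed


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(anchor, positive, negatives, tau=0.1):
    """Unweighted InfoNCE loss for a single anchor embedding."""
    logits = [cosine(anchor, positive) / tau]
    logits += [cosine(anchor, n) / tau for n in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


# Toy usage: one channel per feature map, two-dimensional embeddings.
image_feat = [[1.0, 2.0, 3.0, 4.0]]
view_feat = [[10.0, 10.0, 14.0, 14.0]]
bridge = mix_style(image_feat, view_feat, alpha=0.5)
loss = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
```

In this toy setup the bridge features keep the spatial pattern of the image while their statistics sit halfway between the two domains, which is the intuition behind using them as an intermediate anchor for contrast.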
Index Terms
Cross-modal Contrastive Learning with a Style-mixed Bridge for Single Image 3D Shape Retrieval
Computing methodologies
Artificial intelligence
Computer vision
Computer vision tasks
Visual content-based indexing and retrieval
Published In
ACM Transactions on Multimedia Computing, Communications, and Applications, Just Accepted
EISSN:1551-6865
Copyright © 2024 Copyright held by the owner/author(s).
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [emailprotected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
Online AM: 30 August 2024
Accepted: 16 August 2024
Revised: 16 July 2024
Received: 19 March 2024
Author Tags
- 3D model retrieval
- Style transition
- Contrastive learning