Abstract
In unsupervised domain adaptation (UDA), knowledge is transferred from label-rich source domains to related but unlabeled target domains. Most popular state-of-the-art works suggest that performing domain alignment from the class perspective can alleviate domain shift. However, most of them are based on domain-adversarial training, which is hard to train and slow to converge. In this paper, we propose a novel contrastive learning method to improve diversity and discriminability for domain adaptation, dubbed IDD_ICL, which improves the discriminativeness of the model while increasing sample diversity. To be precise, we first design a novel implicit contrastive learning loss at the sample level by implicitly augmenting samples of the source domain. While augmenting the diversity of the source domain, we cluster the samples of the same category in the source domain together and disperse the samples of different categories, thereby improving the discriminative ability of the model. Furthermore, we show that our algorithm is effective by implicitly learning an infinite number of similar samples. Our results demonstrate that our method does not require complex technologies or specialized equipment, making it readily adoptable and applicable in practical scenarios.
1 Introduction
Although deep neural networks (DNN) [1] have achieved remarkable results in many computer vision tasks, they generally assume that the training and test sets follow the same distribution. However, in real environments, the training and test sets may come from different distributions. Unsupervised Domain Adaptation (UDA) [2,3,4,5] aims to alleviate the domain gap by leveraging unlabeled target domain data. To this end, researchers design different unsupervised losses on the target data for learning a well-performing model in the target domain. The losses used by existing unsupervised domain adaptation methods can be roughly grouped into three categories: 1) self-training losses that iteratively retrain the network with highly confident pseudo-labeled target samples [6,7,8,9]; 2) image transformation losses that transform the source image into a target-like style and appearance [10,11,12,13]; 3) adversarial losses that force the two domains to align in the output space [14, 15].
In order to minimize domain discrepancies, most researchers have developed adversarial losses [2, 14]. For this purpose, GAN-style [16] architectures, which contain a generator and a discriminator, are widely used. The generator extracts features from raw images, and the discriminator identifies which of the two domains the features come from. This can be accomplished in an adversarial or cooperative manner by using the discriminator to guide the generator toward extracting target features that are close to the distribution of source features. While these methods match the marginal distributions of the two domains, they do not guarantee that features from different categories within the target domain will be well separated. It is important to describe the feature distribution separately for each category to ensure semantic consistency. In recent years, many approaches have incorporated semantic information [7, 17] into their features in order to align categories. These methods use category-level adversarial training to align semantic features across the source and target domains independently. During adaptation, however, the mini-batch size used for training is small, so an object instance from the source domain typically differs greatly from one in another image. Therefore, these methods inevitably introduce image-level bias, which leads to learned features that are misaligned between domains or aligned unstably.
Based on the analysis above, we present a new approach to domain adaptation that minimizes domain shift by learning sample-wise representations that attract similar samples and repel dissimilar ones, as illustrated in Fig. 1. In order to guide the directions of category alignment, our first step is to determine the holistic distribution of each category in the source domain, as the distribution can be efficiently estimated with sufficient supervision. Unlike category-centroid-based counterparts, our method is able to provide diverse generations from the estimated distributions. Second, a better sample classifier can be obtained by increasing intra-category compactness and inter-category separability in sample-wise representations. For each sample-wise representation in both the source and target domains, we define an infinite number of positive pairs by sampling from the estimated distribution of the same category. The remaining semantic distributions are then used to draw an infinite number of negative sample pairs. For contrastive adaptation, a particular form of contrastive loss is used, for which we further derive a practical upper bound. Furthermore, we propose to enhance the discriminability of our model by using segmentation predictions with high confidence to retrain the model. In order to confirm the validity of our method regarding sample-wise category alignment, we conduct an analysis using sample-wise discrimination distance. Experimental results demonstrate that contrastively driving the source and target sample-wise representations toward semantic distributions decreases domain discrepancies and improves generalization capabilities.
The following summarizes our main contributions.
- We propose a novel contrastive learning method to improve diversity and discriminability for domain adaptation (IDD_ICL). Specifically, similarity between sample-wise representations and semantic distributions of the same category is explicitly encouraged, while similarity between sample-wise representations and semantic distributions of different categories is penalized.
- The statistics of each category are used to derive an upper bound on the expected contrastive loss, making it simple yet effective to learn invariant and distinctive sample-wise representations.
- Empirical evaluations on several competitive benchmarks, including Office-31, Office-Home and VisDA-2017, show that IDD_ICL significantly improves the baseline model. Its effectiveness is validated by analytical evidence.
The remaining sections are organized as follows: Section 2 delves into the relevant literature. Section 3 provides a concise overview of the proposed design. Section 4 presents and analyzes the results of the experiments conducted. Finally, in Section 5, we conclude and wrap up the discussion.
2 Related work
2.1 Domain adaptation
Unsupervised domain adaptation (UDA) alleviates domain shift by transferring knowledge from a similar source domain to a target domain. The problem of image classification has been tackled in a number of pioneering works [2, 3, 18,19,20,21,22]. Early domain adaptation works reduce the gap between domains by reweighting instances or learning domain-invariant features [23, 24]. More recently, given the power of CNNs, various deep DA methods have been proposed to improve transfer performance. Minimizing the divergence between feature representations is a common strategy [20, 25, 26], for example via the maximum mean discrepancy [27]. Drawing inspiration from generative adversarial networks [16], adversarial training is another popular method for learning domain-invariant features [14, 28,29,30]. Semantic representations are among the most relevant works in this subset [31, 32]. Taghiyarrenani et al. [33] propose a method for limited domain adaptation that leverages geometry information of both the source and target domains; preserving this geometry within domains allows source samples to compensate for classes missing in the target domain. Kang et al. [31] use a new metric named contrastive domain discrepancy to explicitly model intra- and inter-class discrepancies. Many recent works have adopted the adversarial learning mechanism and achieved state-of-the-art performance for unsupervised domain adaptation. The Adversarial Discriminative Domain Adaptation [28] method uses an untied weight sharing strategy to align the feature distributions of the source and target domains. The Maximum Classifier Discrepancy [34] method utilizes different task-specific classifiers to learn a feature extractor that can generate category-related discriminative features. Multi-Adversarial Domain Adaptation [35] can exploit multiplicative interactions between feature representations and category predictions to enforce adversarial learning.
Table 1 summarizes the advantages (Pros) and limitations (Cons) of some of the current techniques, together with our proposed method, which aims to improve diversity and discriminability for domain adaptation.
2.2 Contrastive learning
In recent times, contrastive learning has demonstrated remarkable performance in representation learning, yielding state-of-the-art outcomes in the field of computer vision. The fundamental objective of this approach is to create an embedding space in which similar or positive pairs are brought closer together, while dissimilar or negative pairs are pushed apart. Positive pairs are established by pairing augmentations of the same image, whereas negative pairs are formed using augmentations from different images. Various existing contrastive learning methods employ different strategies for generating positive and negative samples. For example, Wu et al. [36] maintain sample representations in a memory bank, MoCo [37] maintains an on-the-fly momentum encoder alongside a limited queue of previous samples, Tian et al. [38] employ all generated multi-view samples in a mini-batch approach, and both SimCLR v1 [39] and SimCLR v2 [40] utilize all generated sample representations within the mini-batch. While these methods provide pre-trained networks for downstream tasks, they do not explicitly address domain shift when applied directly. In contrast, our approach focuses on learning representations that are generalizable without the need for labeled data. Notably, contrastive learning has recently been applied in the context of unsupervised domain adaptation [37, 41,42,43,44,45]. In these settings, models have access to source labels and typically employ models pre-trained on ImageNet as their backbone network. In comparison, our work is rooted in contrastive learning, often referred to as unsupervised representation learning, and distinguishes itself by not relying on labeled data or pre-trained ImageNet parameters.
3 Method
3.1 Motivation and preliminaries
Formally, we denote the two domains in unsupervised domain adaptation as \(\mathcal {D}_{S}=\{(x_{sk}, y_{sk})\}_{k=1}^{n_{s}}\) with \(n_{s}\) labeled samples and \(\mathcal {D}_{T}=\{x_{tk}\}_{k=1}^{n_{t}}\) with \(n_{t}\) unlabeled samples, where \(y_{sk}\in \{1,2, ... ,K\}\) is the label of \(x_{sk}\). Because the distributions of the two domains differ, a model trained only on the source domain does not transfer directly to the target domain. The different domain adaptation algorithms are presented in Table 1.
Our IDD_ICL framework can be seen in Fig. 2. First, we mine comprehensive semantic information from the distribution statistics for each category; then, to mitigate the domain gap, we design a novel contrastive loss which uses a sample level learning algorithm to simultaneously learn an infinite number of similar/dissimilar pairs.
3.1.1 Contrastive learning
In recent years, contrastive learning [37, 41, 43, 44] has been shown to be an effective method of learning meaningful representations from unlabeled data. Let f be an embedding function (realized via a CNN) that transforms a sample a to an embedding vector \(z=f(a)\,, z \in \mathbb {R}^d\). Then, we normalize z onto the unit sphere. Let \((a\,,a^-)\) be dissimilar pairs and \((a\,,a^+)\) be similar pairs. Then the contrastive loss of InfoNCE [42] can be written as follows:
It is common practice to replace expectations with empirical estimates. The contrastive loss above is essentially a softmax formulation with temperature \(\tau \) [42].
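As an illustration, an empirical InfoNCE estimate for a single anchor can be sketched as a softmax cross-entropy over one positive and several negative similarities. This is a minimal sketch of the standard formulation, not the authors' implementation; the function name `info_nce`, the default `tau=0.1`, and the assumption that all embeddings are already L2-normalized are ours.

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.1):
    """Standard InfoNCE loss for one anchor (assumed form, for illustration).

    anchor, positive: arrays of shape (d,), assumed L2-normalized.
    negatives: array of shape (N, d), assumed L2-normalized.
    """
    pos_logit = anchor @ positive / tau        # similarity to the positive
    neg_logits = negatives @ anchor / tau      # similarities to the N negatives
    logits = np.concatenate(([pos_logit], neg_logits))
    # Numerically stable log-sum-exp over all N + 1 candidates.
    shift = logits.max()
    log_denominator = shift + np.log(np.exp(logits - shift).sum())
    # -log softmax probability assigned to the positive pair.
    return log_denominator - pos_logit
```

The loss is close to zero when the anchor is much more similar to its positive than to any negative, and equals \(\log (N+1)\) when all candidates are equally similar.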
Clearly, the contrastive loss promotes sample discrimination. Building on this, our research explores sample predictions for UDA, which have received little consideration in previous studies. In this study, we demonstrate that sample-by-sample representation alignment outperforms existing algorithms by a significant margin. Below we introduce the contrastive loss [46] used in this paper.
3.1.2 Estimation of semantic distributions
It is essential to identify all possible directions of feature transformation in order to facilitate meaningful cross-domain semantic augmentations. In an implementation, computing the required statistics over the whole source domain at every step would demand a large amount of computation. To resolve this issue, the statistics are calculated online by aggregating them one batch at a time. Formally, the online estimation algorithm is as follows:
where \({\Sigma '}^i_{(t)}\) represents the covariance matrix of the features of the \(i^{th}\) category in the \(t^{th}\) image. As an initialization, K mean values and K covariance matrices are computed on the whole source domain for each category before training. These semantic distributions are dynamically updated during adaptation. The estimated semantic distributions are therefore more informative for guiding the alignment of categories.
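The online update can be illustrated with the standard formula for merging a running mean and covariance with the statistics of a new batch. The sketch below assumes this standard pooled form with biased (1/n) covariances; the paper's own update equations are not reproduced in this copy, and the name `update_stats` is hypothetical. In practice one such triple (mean, covariance, count) would be kept per category.

```python
import numpy as np

def update_stats(mean, cov, count, batch):
    """Merge running (mean, cov, count) of one category with a new batch.

    batch: array of shape (m, d) holding the features of that category.
    Covariances are the biased (1/n) estimates.
    """
    m = batch.shape[0]
    batch_mean = batch.mean(axis=0)
    centered = batch - batch_mean
    batch_cov = centered.T @ centered / m
    total = count + m
    diff = mean - batch_mean
    new_mean = (count * mean + m * batch_mean) / total
    # Weighted average of the covariances plus a between-group correction.
    new_cov = (count * cov + m * batch_cov) / total \
        + count * m * np.outer(diff, diff) / total**2
    return new_mean, new_cov, total

# Usage: stream 60 synthetic features in 6 chunks of 10.
rng = np.random.default_rng(0)
features = rng.normal(size=(60, 4))
mean, cov, count = np.zeros(4), np.zeros((4, 4)), 0
for chunk in np.split(features, 6):
    mean, cov, count = update_stats(mean, cov, count, chunk)
```

After all chunks are processed, the running statistics agree with the mean and biased covariance computed over the full feature set at once, so no pass over the whole source domain is needed during training.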
3.2 Contrastive domain adaptation
Recently, several prior methods have leveraged category feature centroids [4, 47] or instance and stuff features in the source domain to serve as anchors to remedy the domain shift problem. However, in their works, these anchors merely preserve the basic characteristic of each category, but at the expense of the diversity and discriminability within the category. Additionally, their potential capability in dense prediction tasks is severely limited by an insufficient margin between categories.
By contrast, our approach maximizes the statistics of the distribution for category alignment at the pixel level, which is different from previous methods. A particular form of contrastive loss is obtained by incorporating multiple positive/negative sample pairs into our framework. To improve UDA, this modification forces similar and dissimilar pairs to establish stronger intra-category and inter-category connections.
Therefore, every sample representation in the source and target features must return a low loss value. We combine multiple positive samples \(a^{m+}\) and negative samples \(a^{n-}_j\), where \(a^{m+}\) denotes the \(m^{th}\) positive example from the same category and \(a^{n-}_j\) denotes the \(n^{th}\) negative example from the \(j^{th}\) different category. The following is a formal definition of a sample-wise contrastive loss:
Here M and N denote the numbers of positive and negative examples, respectively. A naive implementation of \(\mathcal {L}^{M,N}\) explicitly samples M examples from the semantic distribution with the same latent class and N examples from each distribution with a different semantic label.
By taking the limit as M and N go to infinity, we hope to absorb the effect of M and N probabilistically. In this limit, we achieve the same goal of multiple pairing without explicit sampling. Mathematically, as M and N approach infinity, \(\mathcal {L}^{M,N}\) becomes an estimate of the following:
Here the positive semantic distribution has the same semantic label as the sample, whereas each negative semantic distribution has a different semantic label. Although its analytic form is intractable, it admits a rigorous closed-form upper bound:
The distribution of the features requires some further assumptions to facilitate our formulation. For a random variable a that follows a Gaussian distribution \(a\sim \mathcal {N}(\mu , \Sigma )\), where \(\mu \) is the expectation of a and \(\Sigma \) is the covariance matrix of a, the moment generating function [48] satisfies:
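For reference, the standard moment generating function of a multivariate Gaussian is \(\mathbb {E}[e^{t^{\top } a}] = \exp (t^{\top }\mu + \frac{1}{2} t^{\top }\Sigma t)\). Setting \(t = a_i/\tau \) gives a closed form for the expected exponentiated similarity between a fixed representation \(a_i\) and a Gaussian-distributed sample, which is what makes an infinite number of samples tractable. The sketch below checks this identity numerically; the function name and all numeric values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def gaussian_expected_exp(a_i, mu, sigma, tau):
    """Closed form of E[exp(a_i^T a / tau)] for a ~ N(mu, sigma)."""
    t = a_i / tau
    return np.exp(t @ mu + 0.5 * t @ sigma @ t)

# Illustrative values (assumptions): a unit-norm anchor and a small covariance.
a_i = np.array([0.6, 0.0, 0.8])
mu = np.array([0.5, 0.1, 0.7])
sigma = np.array([[0.05, 0.01, 0.00],
                  [0.01, 0.04, 0.00],
                  [0.00, 0.00, 0.03]])
tau = 0.5

# Monte Carlo estimate of the same expectation from explicit samples.
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mu, sigma, size=200_000)
monte_carlo = np.exp(samples @ a_i / tau).mean()
closed_form = gaussian_expected_exp(a_i, mu, sigma, tau)
relative_gap = abs(monte_carlo - closed_form) / closed_form
```

The Monte Carlo average approaches the closed form as the number of samples grows, so the closed form plays the role of explicit sampling with infinitely many pairs.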
Under the Gaussian assumption \(a^{+} \sim \mathcal {N}(\mu ^{+}, \Sigma ^{+})\,, a^{-}_j\) \(\sim \mathcal {N}(\mu ^{-}_j, \Sigma ^{-}_j)\), along with (10), we find that (8) for a certain pixel representation \(a_i\) immediately reduces to:
3.3 Overall formulation
The mutual information I(X; Y) measures the degree of dependence between two random variables. Due to strong correlations between target features and predictions, our semantic augmentations will contain more meaningful semantic information and ignore trivial semantic information. Therefore, we maximize mutual information on target data, i.e., minimize the loss in (13).
where \(\varvec{\hat{P}} = \frac{1}{n_{t}} \sum _{j=1}^{n_t} \varvec{P}_{tj}\). The ground-truth distribution on the target domain is approximated by the average of the target predictions.
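A common way to write such a mutual-information objective is the entropy of the averaged prediction \(\varvec{\hat{P}}\) minus the average entropy of the individual predictions, negated to form a loss. The sketch below assumes this standard information-maximization form; the exact expression of the loss in (13) is not reproduced in this copy, and the function names are hypothetical.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy along the last axis; eps guards against log(0)."""
    return -(p * np.log(p + eps)).sum(axis=-1)

def mutual_information_loss(probs):
    """Negative mutual-information estimate (assumed standard form).

    probs: array of shape (n_t, K) with softmax predictions on target data.
    """
    p_hat = probs.mean(axis=0)                   # averaged target prediction
    marginal_entropy = entropy(p_hat)            # high when classes are balanced
    conditional_entropy = entropy(probs).mean()  # low when predictions are confident
    return conditional_entropy - marginal_entropy
```

Under this form the loss is minimized, at \(-\log K\), when every prediction is confident and the predicted classes are balanced, and it is zero when all predictions are uniform or collapse onto a single class.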
Therefore, IDD_ICL optimizes the following objective function:
In this case, \(\alpha \) represents the trade-off parameter. We summarize our training process in Algorithm 1. A detailed analysis of the IDD_ICL will be performed in the ablation study.
4 Experiment
4.1 Datasets
Office-31 [66] uses images from 31 distinct categories based on three distinct domains: Amazon (A), Webcam (W), and DSLR (D). Amazon: 2,817 images, an average of 90 per class, with a single image background. Webcam: 795 images, images exhibit significant noise, color, and white balance artifacts. DSLR: 498 images, 5 objects per class, each object taken on average 3 times from different viewpoints.
Office-Home [67] contains 15,500 images divided into 65 categories across four domains: Artistic (Ar), Clip Art (Cl), Product (Pr), and Real-World (Rw). Office-Home is a more complex dataset than Office-31.
VisDA-2017 [68] contains synthetic and real images spanning 12 categories. Following [34], we use 152,397 synthetic images as the source and 55,388 realistic images as the target.
4.2 Implementation details
Our backbone network for these datasets is ResNet [69] pre-trained on ImageNet [70]. The experiments in this paper are implemented using PyTorch [71]. For network optimization, we use a mini-batch SGD optimizer with momentum 0.9, and we use deep embedded validation [72] to select the hyperparameter \(\alpha \) from \(\{0.01, 0.05, 0.1, 0.15, 0.2\}\). On all datasets, we find that \(\alpha =0.1\) works well.
4.3 Results
Results on Office-31 can be found in Table 2. IDD_ICL outperforms JAN and DANN by a large margin, showing that revised contrastive learning is also indispensable for UDA. In particular, IDD_ICL significantly improves GSP by 2.5%, demonstrating that IDD_ICL complements previous UDA methods. Additionally, IDD_ICL is superior to recent classifier adaptation methods, such as SymNets or TAT, showing it is capable of exploring useful semantic information for better diversity and discrimination.
Results on Office-Home can be found in Table 3. With large domain discrepancies, Office-Home is one of the most challenging datasets for UDA. Compared with the other methods, IDD_ICL consistently improves generalization ability. In particular, IDD_ICL improves GVB-GD's accuracy by 4.1%, with the average accuracy reaching 74.5%. These promising results indicate that IDD_ICL stably enhances the transferability of classifiers across domains.
Results on VisDA-2017 can be found in Table 4. Compared to other augmentation methods, IDD_ICL performs dramatically better. Generally, IDD_ICL generates better augmentation results since it exploits mean difference and target covariance to capture semantic information class-wise. Furthermore, IDD_ICL proves its effectiveness and versatility over other baseline methods as well.
4.4 Analysis
Ablation study. To demonstrate the effectiveness of the proposed method, we use CDAN, DANN, and BSP as baselines. Table 5 shows that adding IDD_ICL yields a large improvement over each baseline, including an 8% improvement on DANN, which further demonstrates the effectiveness of our method. The models without the proposed component produce worse classification accuracies on these tasks, demonstrating that contrastive learning information in the target domain makes a significant contribution to domain adaptation. The full model with IDD_ICL produces the best results in all experiments, which validates the contribution of the proposed method.
Hyper-parameter sensitivity. To study how the hyper-parameter \(\alpha \) affects the performance of our method, we conduct a sensitivity test. We run experiments on two Office-31 tasks, A\(\rightarrow \)W and W\(\rightarrow \)A, by varying \(\alpha \in \{0.01, 0.05, 0.1, 0.15, 0.2\}\). Figure 3 shows that IDD_ICL is not very sensitive to \(\alpha \) and achieves competitive results with different hyper-parameter values. Empirically, we recommend \(\alpha =0.1\) for a naive implementation.
Quantitative Distribution Discrepancy. The distribution discrepancy between the source and target domains, measured by the \(\mathcal {A}\)-distance [73], is used here to evaluate the functionality of each component in our model (Fig. 4). Based on [73], the \(\mathcal {A}\)-distance is defined as \(d_{\mathcal {A}}=2(1-2\epsilon )\), where \(\epsilon \) represents a binary domain classifier's classification error in discriminating between the source and target domains. In general, the smaller the \(\mathcal {A}\)-distance, the better the alignment of the distributions, as shown in Fig. 4. The \(\mathcal {A}\)-distance between the two domains is smaller with our model than with the other three baselines. In other words, our model reduces the domain discrepancy more effectively.
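The definition \(d_{\mathcal {A}}=2(1-2\epsilon )\) can be evaluated directly once the error of a domain classifier is known. The helper below is a minimal sketch; the function name and the input check are our own additions.

```python
def a_distance(domain_classifier_error):
    """A-distance d_A = 2 * (1 - 2 * eps) from a domain classifier's error eps."""
    if not 0.0 <= domain_classifier_error <= 1.0:
        raise ValueError("classification error must lie in [0, 1]")
    return 2.0 * (1.0 - 2.0 * domain_classifier_error)
```

A domain classifier at chance level (\(\epsilon = 0.5\)) gives a distance of 0, meaning the two domains are indistinguishable, while a perfect classifier (\(\epsilon = 0\)) gives the maximum distance of 2.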
5 Conclusion
This paper proposes a novel contrastive learning approach for improving diversity and discriminability for domain adaptation (IDD_ICL). Through sample-wise alignment guided by semantic distributions, the IDD_ICL model successfully adapts to the target domain. For each sample-wise representation of both domains, we use a particular form of contrastive loss that implicitly involves learning infinitely many similar/dissimilar sample pairs. A practical implementation of this intractable loss function is then derived. The combination of this simple but effective strategy and self-supervised learning is surprisingly effective. IDD_ICL is superior on a variety of benchmarks, as demonstrated by the experimental results.
One limitation of our study is that it only focuses on image classification. In future work, we plan to extend the scope of our benchmarking to include semantic segmentation tasks. Additionally, we have only considered the closed-set domain adaptation scenario, where the source and target domains share the same classes. In future work, we aim to consider partial and open-set domain adaptation scenarios, which are common in image classification applications and involve different classes in the source and target domains.
References
O’Shea K, Nash R (2015) An introduction to convolutional neural networks. arXiv:1511.08458
Wang M, Wang W, Li B, Zhang X, Lan L, Tan H, Liang T, Yu W, Luo Z (2021) Interbn: channel fusion for adversarial unsupervised domain adaptation. In: Proceedings of the 29th ACM international conference on multimedia, pp 3691–3700
Wang M, An S, Luo X, Peng X, Yu W, Chen J, Luo Z (2022) Attention-based adversarial partial domain adaptation. In: ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 3144–3148
Wang M, Li P, Shen L, Wang Y, Wang S, Wang W, Zhang X, Chen J, Luo Z (2022) Informative pairs mining based adaptive metric learning for adversarial domain adaptation. Neural Networks
Wang M, Wang S, Yang X, Yuan J, Zhang W (2024) Equity in unsupervised domain adaptation by nuclear norm maximization. IEEE Transactions on Circuits and Systems for Video Technology
Guan D, Huang J, Xiao A, Lu S (2021) Domain adaptive video segmentation via temporal consistency regularization. In: Proceedings of the IEEE/CVF international conference on computer Vision, pp 8053–8064
Li Y, Yuan L, Vasconcelos N (2019) Bidirectional learning for domain adaptation of semantic segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 6936–6945
Zou Y, Yu Z, Liu X, Kumar B, Wang J (2019) Confidence regularized self-training. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 5982–5991
Zou Y, Yu Z, Kumar B, Wang J (2018) Unsupervised domain adaptation for semantic segmentation via class-balanced self-training. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 289–305
Wang M, Liu Y, Yuan J, Wang S, Wang Z, Wang W (2024) Inter-class and interdomain semantic augmentation for domain generalization. IEEE Transactions on Image Processing
Wang M, Yuan J, Wang Z (2023) Mixture-of-experts learner for single long-tailed domain generalization. In: Proceedings of the 31st ACM International Conference on Multimedia, pp 290–299
Wang M, Chen J, Wang H, Wu H, Liu Z, Zhang Q (2023) Interpolation normalization for contrast domain generalization. In: Proceedings of the 31st ACM International Conference on Multimedia, pp 2936–2945
Wang M, Wang S, Wang Y, Wang W, Liang T, Chen J, Luo Z (2023) Boosting unsupervised domain adaptation: A fourier approach. Knowledge-Based Systems 264, 110325
Saito K, Watanabe K, Ushiku Y, Harada T (2018) Maximum classifier discrepancy for unsupervised domain adaptation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3723–3732
Luo Y, Zheng L, Guan T, Yu J, Yang Y (2019) Taking a closer look at domain shift: category-level adversaries for semantics consistent domain adaptation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 2507–2516
Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville AC, Bengio Y (2014) Generative adversarial nets. In: Proc. NeurIPS, pp 2672–2680
Yang Y, Soatto S (2020) Fda: fourier domain adaptation for semantic segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 4085–4095
Tzeng E, Hoffman J, Darrell T, Saenko K (2015) Simultaneous deep transfer across domains and tasks. In: Proc ICCV, pp 4068–4076
Li S, Liu CH, Lin Q, Wen Q, Su L, Huang G, Ding Z (2020) Deep residual correction network for partial domain adaptation. IEEE Trans Pattern Anal Mach Intell 1–1. https://doi.org/10.1109/TPAMI.2020.2964173
Long M, Cao Y, Wang J, Jordan MI (2015) Learning transferable features with deep adaptation networks. In: Proc ICML, vol 37, pp 97–105
Liu L, Pietikäinen M, Chen J, Zhao G, Wang X, Chellappa R (2019) Guest editors’ introduction to the special section on compact and efficient feature representation and learning in computer vision. IEEE Trans Pattern Anal Mach Intell 41(10):2287–2290
Zhang W, Xu D, Ouyang W, Li W (2019) Self-paced collaborative and adversarial network for unsupervised domain adaptation. IEEE Trans Pattern Anal Mach Intell 1–1
Gong B, Shi Y, Sha F, Grauman K (2012) Geodesic flow kernel for unsupervised domain adaptation. In: Proc CVPR, pp 2066–2073
Li S, Song S, Huang G, Ding Z, Wu C (2018) Domain invariant and class discriminative feature learning for visual domain adaptation. IEEE Trans Image Process 27(9):4260–4273
Long M, Zhu H, Wang J, Jordan MI (2017) Deep transfer learning with joint adaptation networks. In: Proc ICML 70:2208–2217
Tzeng E, Hoffman J, Zhang N, Saenko K, Darrell T (2014) Deep domain confusion: maximizing for domain invariance. CoRR abs/1412.3474
Gretton A, Sriperumbudur BK, Sejdinovic D, Strathmann H, Balakrishnan S, Pontil M, Fukumizu K (2012) Optimal kernel choice for large-scale two-sample tests. In: Proc NeurIPS, pp 1214–1222
Tzeng E, Hoffman J, Saenko K, Darrell T (2017) Adversarial discriminative domain adaptation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7167–7176
Ganin Y, Lempitsky VS (2015) Unsupervised domain adaptation by backpropagation. In: Proc ICML 37:1180–1189
Long M, Cao Z, Wang J, Jordan MI (2018) Conditional adversarial domain adaptation. In: Proc NeurIPS, pp 1647–1657
Kang G, Jiang L, Yang Y, Hauptmann AG (2019) Contrastive adaptation network for unsupervised domain adaptation. In: Proc CVPR, pp 4893–4902
Xie S, Zheng Z, Chen L, Chen C (2018) Learning semantic representations for unsupervised domain adaptation. In: Proc ICML 80:5419–5428
Taghiyarrenani Z, Nowaczyk S, Pashami S, Bouguelia MR (2022) Towards geometry-preserving domain adaptation for fault identification. In: Joint European conference on machine learning and knowledge discovery in databases, pp 451–460
Saito K, Watanabe K, Ushiku Y, Harada T (2018) Maximum classifier discrepancy for unsupervised domain adaptation. In: CVPR, pp 3723–3732
Pei Z, Cao Z, Long M, Wang J (2018) Multi-adversarial domain adaptation. In: Proceedings of the AAAI conference on artificial intelligence, vol 32
Wu Z, Xiong Y, Yu SX, Lin D (2018) Unsupervised feature learning via non-parametric instance discrimination. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3733–3742
He K, Fan H, Wu Y, Xie S, Girshick R (2020) Momentum contrast for unsupervised visual representation learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9729–9738
Tian Y, Krishnan D, Isola P (2020) Contrastive multiview coding. In: Computer vision–ECCV 2020: 16th European conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XI 16, pp 776–794. Springer
Chen T, Kornblith S, Norouzi M, Hinton G (2020) A simple framework for contrastive learning of visual representations. In: International conference on machine learning, pp 1597–1607. PMLR
Chen T, Kornblith S, Swersky K, Norouzi M, Hinton GE (2020) Big self-supervised models are strong semi-supervised learners. Adv Neural Inf Process Syst 33:22243–22255
Hadsell R, Chopra S, LeCun Y (2006) Dimensionality reduction by learning an invariant mapping. In: Proc CVPR, pp 1735–1742
Oord A, Li Y, Vinyals O (2018) Representation learning with contrastive predictive coding. CoRR abs/1807.03748
Chen T, Kornblith S, Norouzi M, Hinton GE (2020) A simple framework for contrastive learning of visual representations. In: Proc ICML, vol 119, pp 1597–1607
Cai Q, Wang Y, Pan Y, Yao T, Mei T (2020) Joint contrastive learning with infinite possibilities. In: Proc NeurIPS
Chuang C, Robinson J, Lin Y, Torralba A, Jegelka S (2020) Debiased contrastive learning. In: Proc NeurIPS
Li S, Xie B, Zang B, Liu CH, Cheng X, Yang R, Wang G (2021) Semantic distribution-aware contrastive adaptation for semantic segmentation. arXiv:2105.05013
Zhang Q, Zhang J, Liu W, Tao D (2019) Category anchor-guided unsupervised domain adaptation for semantic segmentation. In: Proc NeurIPS, pp 433–443
Wang Y, Huang G, Song S, Pan X, Xia Y, Wu C (2021) Regularizing deep networks with semantic data augmentation. IEEE Trans Pattern Anal Mach Intell 1–1
Tzeng E, Hoffman J, Saenko K, Darrell T (2017) Adversarial discriminative domain adaptation. In: CVPR, vol 1, p 4
Long M, Zhu H, Wang J, Jordan MI (2017) Deep transfer learning with joint adaptation networks. In: ICML, pp 2208–2217. ACM
Li M, Zhai YM, Luo YW, Ge PF, Ren CX (2020) Enhanced transport distance for unsupervised domain adaptation. In: CVPR
Cui S, Wang S, Zhuo J, Li L, Huang Q, Tian Q (2020) Towards discriminability and diversity: batch nuclear-norm maximization under label insufficient situations. In: CVPR, pp 3940–3949
Wu Y, Inkpen D, El-Roby A (2020) Dual mixup regularized learning for adversarial domain adaptation. In: ECCV, vol 12374, pp 540–555
Zhang Y, Tang H, Jia K, Tan M (2019) Domain-symmetric networks for adversarial domain adaptation. In: CVPR, pp 5031–5040
Liu H, Long M, Wang J, Jordan M (2019) Transferable adversarial training: a general approach to adapting deep classifiers. In: ICML, pp 4013–4022
Zhang Y, Liu T, Long M, Jordan M (2019) Bridging theory and algorithm for domain adaptation. In: ICML, pp 7404–7413
Cui S, Wang S, Zhuo J, Su C, Huang Q, Tian Q (2020) Gradually vanishing bridge for adversarial domain adaptation. In: CVPR
Xia H, Ding Z (2020) Structure preserving generative cross-domain learning. In: CVPR
Pan Y, Yao T, Li Y, Wang Y, Ngo CW, Mei T (2019) Transferrable prototypical networks for unsupervised domain adaptation. In: CVPR
Long M, Cao Y, Wang J, Jordan MI (2015) Learning transferable features with deep adaptation networks. In: ICML, pp 97–105. ACM
Ganin Y, Lempitsky VS (2015) Unsupervised domain adaptation by backpropagation. In: ICML, pp 1180–1189
Long M, Cao Z, Wang J, Jordan MI (2018) Conditional adversarial domain adaptation. In: NeurIPS, pp 1647–1657
Chen X, Wang S, Long M, Wang J (2019) Transferability vs. discriminability: batch spectral penalization for adversarial domain adaptation. In: ICML, pp 1081–1090
Chang WG, You T, Seo S, Kwak S, Han B (2019) Domain-specific batch normalization for unsupervised domain adaptation. In: CVPR, pp 7354–7362
Zhu Y, Zhuang F, Wang J, Ke G, Chen J, Bian J, Xiong H, He Q (2020) Deep subdomain adaptation network for image classification. TNNLS, 1713–1722
Saenko K, Kulis B, Fritz M, Darrell T (2010) Adapting visual category models to new domains. In: ECCV, pp 213–226
Venkateswara H, Eusebio J, Chakraborty S, Panchanathan S (2017) Deep hashing network for unsupervised domain adaptation. In: CVPR, pp 5018–5027
Peng X, Usman B, Kaushik N, Hoffman J, Wang D, Saenko K (2017) Visda: the visual domain adaptation challenge. arXiv:1710.06924
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: CVPR, pp 770–778
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M (2015) Imagenet large scale visual recognition challenge. IJCV 115(3):211–252
Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, Desmaison A, Köpf A, Yang E, DeVito Z, Raison M, Tejani A, Chilamkurthy S, Steiner B, Fang L, Bai J, Chintala S (2019) Pytorch: an imperative style, high-performance deep learning library. In: Proc. NeurIPS, pp 8024–8035
You K, Wang X, Long M, Jordan M (2019) Towards accurate model selection in deep unsupervised domain adaptation. In: ICML, pp 7124–7133
Ben-David S, Blitzer J, Crammer K, Kulesza A, Pereira F, Vaughan JW (2010) A theory of learning from different domains. Machine learning, 151–175
Additional information
Heng Xu and Chuanqi Shi contributed equally to this work.
About this article
Cite this article
Xu, H., Shi, C., Fan, W. et al. Improving diversity and discriminability based implicit contrastive learning for unsupervised domain adaptation. Appl Intell 54, 10007–10017 (2024). https://doi.org/10.1007/s10489-024-05351-y
DOI: https://doi.org/10.1007/s10489-024-05351-y