Abstract
In unsupervised domain adaptation (UDA), knowledge is transferred from label-rich source domains to related but unlabeled target domains. Most current state-of-the-art works suggest that performing domain alignment from the class perspective can alleviate domain shift. However, most of them are based on domain-adversarial training, which is hard to train and slow to converge. In this paper, we propose a novel contrastive learning method to improve diversity and discriminability for domain adaptation, dubbed IDD_ICL, which improves the discriminativeness of the model while increasing sample diversity. To be precise, we first design a novel implicit contrastive learning loss at the sample level by implicitly augmenting samples of the source domain. While augmenting the diversity of the source domain, we cluster the samples of the same category in the source domain together and disperse the samples of different categories, thereby improving the discriminative ability of the model. Furthermore, we show that our algorithm is effective by implicitly learning from an infinite number of similar samples. Our results demonstrate that our method does not require complex technologies or specialized equipment, making it readily adoptable and applicable in practical scenarios.
1 Introduction
Although deep neural networks (DNNs) [1] have achieved remarkable results in many computer vision tasks, they generally assume that the training and test sets follow the same distribution. However, in real environments, the training and test sets may come from different distributions. Unsupervised domain adaptation (UDA) [2,3,4,5] aims to alleviate the domain gap by leveraging unlabeled target domain data. To this end, researchers design different unsupervised losses on the target data to learn a model that performs well in the target domain. The losses of existing unsupervised domain adaptation methods can be roughly summarized into three categories: 1) self-training losses that iteratively retrain the network with highly confident pseudo-labeled target samples [6,7,8,9]; 2) image transformation losses that transform the source image into a target-like style and appearance [10,11,12,13]; 3) adversarial losses that force the two domains to align in the output space [14, 15].
In order to minimize domain discrepancies, most researchers have developed adversarial losses [2, 14]. For this purpose, GAN-style [16] architectures, which contain a generator and a discriminator, are widely used. The generator extracts features from raw images, and the discriminator identifies which of the two domains those features come from. This can be accomplished by both adversarial and cooperative methods, in which the discriminator guides the generator toward extracting target features that are close to the distribution of source features. While these methods match the marginal distributions of the two domains, they do not guarantee that features from different categories within the target domain will be well separated. It is important to describe the feature distribution separately for each category to ensure semantic consistency. In recent years, many approaches have incorporated semantic information [7, 17] into their features in order to align categories. These methods use category-level adversarial training to align the semantic features of each category across the source and target domains independently. During adaptation, however, the minibatch size used for training is small, so an object instance from the source domain typically differs greatly from those in other images. Therefore, these methods inevitably introduce image-level bias, which leaves the learned features misaligned between domains or aligned unstably.
Based on the analysis above, we present a new approach to domain adaptation that minimizes domain shift by learning sample-wise representations that attract similar samples and repel dissimilar ones, as illustrated in Fig. 1. First, in order to guide the directions of category alignment, we estimate the holistic distribution of each category in the source domain, since this distribution can be efficiently estimated with sufficient supervision. Unlike category-centroid-based counterparts, our method is able to provide diverse generations from the estimated distributions. Second, a better sample classifier can be obtained by increasing intra-category compactness and inter-category separability in the sample-wise representations. For each sample-wise representation in both the source and target domains, we define an infinite number of positive pairs by sampling from the estimated distribution of the same category. The remaining semantic distributions are then used to draw an infinite number of negative pairs. For contrastive adaptation, a particular form of contrastive loss is used. This formulation is further equipped with a practical upper bound. Furthermore, we propose to enhance the discriminability of our model by using high-confidence predictions to retrain the model. In order to confirm the validity of our method regarding sample-wise category alignment, we conduct an analysis using the sample-wise discrimination distance. Experimental results demonstrate that contrastively driving the source and target sample-wise representations toward semantic distributions decreases domain discrepancies and improves generalization.
The following summarizes our main contributions.

We propose a novel contrastive learning method to improve diversity and discriminability for domain adaptation (IDD_ICL). Specifically, agreement between sample-wise representations and the semantic distribution of the same category is explicitly encouraged, while agreement between sample-wise representations and the semantic distributions of different categories is penalized.

The statistics of each category are used to derive an upper bound on the expected contrastive loss, making it simple yet effective to learn invariant and distinctive sample-wise representations.

Several empirical evaluations on competitive benchmarks, including Office-31, Office-Home and VisDA-2017, show that IDD_ICL significantly improves the baseline model. Its effectiveness is validated by analytical evidence.
The remaining sections are organized as follows: Section 2 delves into the relevant literature. Section 3 provides a concise overview of the proposed design. Section 4 presents and analyzes the results of the experiments conducted. Finally, in Section 5, we conclude and wrap up the discussion.
2 Related work
2.1 Domain adaptation
Unsupervised domain adaptation (UDA) alleviates domain shift by transferring knowledge from a similar source domain to a target domain. The problem of image classification has been tackled in a number of pioneering works [2, 3, 18,19,20,21,22]. Early domain adaptation works reduce the gap between domains by reweighting instances or learning domain-invariant features [23, 24]. Subsequently, various deep DA works have been proposed to improve transfer performance by exploiting the power of CNNs. Minimizing the divergence between feature representations is a common strategy [20, 25, 26], for example with maximum mean discrepancy [27]. Drawing inspiration from generative adversarial networks [16], adversarial training is another popular method for learning domain-invariant features [14, 28,29,30]. Works that exploit semantic representations are among the most relevant to ours [31, 32]. Taghiyarrenani et al. [33] propose a method for limited domain adaptation that leverages and preserves the geometry information of both the source and target domains, which allows source samples to compensate for classes missing in the target domain. Kang et al. [31] use a new metric named contrastive domain discrepancy to explicitly model intra- and inter-class discrepancies. Many recent works have adopted the adversarial learning mechanism and achieved state-of-the-art performance for unsupervised domain adaptation. Adversarial Discriminative Domain Adaptation [28] uses an untied weight-sharing strategy to align the feature distributions of the source and target domains. Maximum Classifier Discrepancy [34] utilizes different task-specific classifiers to learn a feature extractor that can generate category-related discriminative features. Multi-Adversarial Domain Adaptation [35] can exploit multiplicative interactions between feature representations and category predictions to enforce adversarial learning.
Table 1 summarizes the advantages (Pros) and limitations (Cons) of some current techniques. Our proposed method aims to improve diversity and discriminability for domain adaptation.
2.2 Contrastive learning
In recent times, contrastive learning has demonstrated remarkable performance in representation learning, yielding state-of-the-art results in computer vision. The fundamental objective of this approach is to create an embedding space in which similar (positive) pairs are brought closer together, while dissimilar (negative) pairs are pushed apart. Positive pairs are formed from augmentations of the same image, whereas negative pairs are formed from augmentations of different images. Existing contrastive learning methods employ different strategies for generating positive and negative samples. For example, Wu et al. [36] maintain sample representations in a memory bank, MoCo [37] maintains an on-the-fly momentum encoder alongside a limited queue of previous samples, Tian et al. [38] employ all generated multi-view samples in the minibatch, and both SimCLR v1 [39] and SimCLR v2 [40] use all generated sample representations within the minibatch. While these methods provide pretrained networks for downstream tasks, they do not explicitly address domain shift when applied directly. In contrast, our approach focuses on learning representations that are generalizable without the need for labeled data. Notably, contrastive learning has recently been applied in the context of unsupervised domain adaptation [37, 41,42,43,44,45]. In these settings, models have access to source labels and typically employ models pretrained on ImageNet as their backbone network. In comparison, our work is rooted in contrastive learning, often referred to as unsupervised representation learning, and distinguishes itself by not relying on labeled data or pretrained ImageNet parameters.
3 Method
3.1 Motivation and preliminaries
Formally, we denote the two domains in unsupervised domain adaptation as \(\mathcal {D}_{S}=\{(x_{sk}, y_{sk})\}_{k=1}^{n_{s}}\) with \(n_{s}\) labeled samples and \(\mathcal {D}_{T}=\{x_{tk}\}_{k=1}^{n_{t}}\) with \(n_{t}\) unlabeled samples, where \(y_{sk}\in \{1,2, ... ,K\}\) is the label of \(x_{sk}\). The distributions of the two domains are different. Different domain adaptation algorithms are compared in Table 1.
Our IDD_ICL framework is shown in Fig. 2. First, we mine comprehensive semantic information from the distribution statistics of each category; then, to mitigate the domain gap, we design a novel contrastive loss that uses a sample-level learning algorithm to simultaneously learn from an infinite number of similar/dissimilar pairs.
3.1.1 Contrastive learning
In recent years, contrastive learning [37, 41, 43, 44] has been shown to be an effective method of learning meaningful representations from unlabeled data. Let f be an embedding function (realized via a CNN) that maps a sample a to an embedding vector \(z=f(a)\,, z \in \mathbb {R}^d\). We then normalize z onto the unit sphere. Let \((a\,,a^{+})\) be a similar pair and \((a\,,a^{-})\) be a dissimilar pair. The InfoNCE contrastive loss [42] can then be written as follows:
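For reference, a commonly used form of the InfoNCE objective in this notation is sketched below. The number of negatives \(N\) and the index \(n\) are symbols assumed here for illustration, and \(\tau > 0\) is a temperature; the exact normalization used in this paper may differ.

```latex
\mathcal{L}_{\mathrm{NCE}}
  = \mathbb{E}_{a^{+},\,\{a^{-}_{n}\}_{n=1}^{N}}
    \left[ -\log
      \frac{\exp\!\big(f(a)^{\top} f(a^{+})/\tau\big)}
           {\exp\!\big(f(a)^{\top} f(a^{+})/\tau\big)
            + \sum_{n=1}^{N} \exp\!\big(f(a)^{\top} f(a^{-}_{n})/\tau\big)}
    \right]
```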
It is common practice to replace the expectations with empirical estimates. As the formulation shows, the contrastive loss is essentially a softmax formulation with temperature \(\tau \) [42].
Clearly, the contrastive loss promotes sample discrimination. Our research explores sample-wise predictions for UDA, which have received little consideration in previous studies. We demonstrate that sample-by-sample representation alignment outperforms existing algorithms by a significant margin. Below we introduce the contrastive loss [46] used in this paper.
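As a minimal sketch of such an empirical estimate, the snippet below computes the InfoNCE loss for a single anchor as a softmax cross-entropy in which the positive pair plays the role of the target class. The function names (`info_nce`, `l2_normalize`, `dot`), the toy vectors, and the default `tau=0.1` are illustrative assumptions, not values taken from the paper.

```python
import math


def l2_normalize(v):
    """Project a vector onto the unit sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def info_nce(anchor, positive, negatives, tau=0.1):
    """Empirical InfoNCE for one anchor.

    The logits are cosine similarities divided by the temperature tau;
    the loss is the cross-entropy of the softmax with the positive at index 0.
    """
    z = l2_normalize(anchor)
    logits = [dot(z, l2_normalize(positive)) / tau]
    logits += [dot(z, l2_normalize(neg)) / tau for neg in negatives]
    peak = max(logits)  # subtract the maximum for a stable log-sum-exp
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_denominator - logits[0]


# Toy usage (hypothetical embeddings): a positive close to the anchor
# gives a much smaller loss than a positive orthogonal to it.
anchor = [1.0, 0.0]
loss_close = info_nce(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
loss_far = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
```

Lowering `tau` sharpens the softmax, so hard negatives dominate the denominator more strongly.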
3.1.2 Estimation of semantic distributions
It is essential to identify all possible directions of feature transformation in order to facilitate meaningful cross-domain semantic augmentations. Computing the required statistics directly would demand a large amount of computation over the whole source domain. To resolve this issue, the mean is calculated online by aggregating statistics one by one. The online mean estimation algorithm is as follows:
where \({\Sigma '}^i_{(t)}\) represents the covariance matrix of the features of the \(i^{th}\) category in the \(t^{th}\) image. As an initialization, K mean vectors and K covariance matrices, one per category, are computed on the whole source domain before training. These semantic distributions are dynamically updated during adaptation. The estimated semantic distributions are more informative than category centroids for guiding the alignment of categories.
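The online estimation described above can be sketched with the standard rule for merging the running mean and covariance of one category with the statistics of a new batch. This is an illustration under assumptions: the helper names `batch_stats` and `online_update` are hypothetical, covariances are the biased (divide-by-count) estimates, and the update granularity (per batch here) may differ from the authors' implementation.

```python
def batch_stats(features):
    """Mean vector and biased covariance matrix of a list of feature vectors."""
    m, d = len(features), len(features[0])
    mean = [sum(f[k] for f in features) / m for k in range(d)]
    cov = [[sum((f[r] - mean[r]) * (f[c] - mean[c]) for f in features) / m
            for c in range(d)] for r in range(d)]
    return mean, cov


def online_update(n, mean, cov, features):
    """Merge running statistics (n, mean, cov) of one category with a new batch.

    The cross term n*m*diff*diff^T/(n+m)^2 accounts for the shift between
    the running mean and the batch mean.
    """
    m = len(features)
    b_mean, b_cov = batch_stats(features)
    total = n + m
    d = len(mean)
    diff = [mean[k] - b_mean[k] for k in range(d)]
    new_mean = [(n * mean[k] + m * b_mean[k]) / total for k in range(d)]
    new_cov = [[(n * cov[r][c] + m * b_cov[r][c]) / total
                + n * m * diff[r] * diff[c] / total ** 2
                for c in range(d)] for r in range(d)]
    return total, new_mean, new_cov


# Toy usage: two batches of 2-D features belonging to the same category.
first = [[1.0, 2.0], [3.0, 0.0]]
second = [[5.0, 4.0], [7.0, 6.0], [9.0, 2.0]]
n, mu, sigma = online_update(0, [0.0, 0.0], [[0.0, 0.0], [0.0, 0.0]], first)
n, mu, sigma = online_update(n, mu, sigma, second)
```

After both updates, `mu` and `sigma` match the statistics computed in one pass over all five vectors, which is what makes the online scheme a drop-in replacement for a full pass over the source domain.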
3.2 Contrastive domain adaptation
Recently, several prior methods have leveraged category feature centroids [4, 47] or instance and stuff features in the source domain to serve as anchors to remedy the domain shift problem. However, in their works, these anchors merely preserve the basic characteristic of each category, but at the expense of the diversity and discriminability within the category. Additionally, their potential capability in dense prediction tasks is severely limited by an insufficient margin between categories.
By contrast, our approach exploits the statistics of each category distribution for category alignment at the sample level, which differs from previous methods. A particular form of contrastive loss is obtained by incorporating multiple positive/negative sample pairs into our framework. To improve UDA, this modification forces similar and dissimilar pairs to establish stronger intra-category and inter-category connections.
Therefore, every sample representation in the source and target features should yield a low loss value. We combine multiple positive samples \(a^{m+}\) and negative samples \(a^{n-}_j\), where \(a^{m+}\) denotes the \(m^{th}\) positive example from the same category and \(a^{n-}_j\) denotes the \(n^{th}\) negative example from the \(j^{th}\) different category. The sample-wise contrastive loss is formally defined as follows:
Here, M and N denote the numbers of positive and negative examples, respectively. A naive implementation of \(\mathcal {L}^{M,N}\) explicitly samples M examples from the semantic distribution that has the same latent class, and N examples from each distribution with a different semantic label.
By taking the limit of M and N to infinity, we hope to absorb the effect of M and N probabilistically, achieving the goal of multiple pairing without explicit sampling. Mathematically, as M and N go to infinity, \(\mathcal {L}^{M,N}\) becomes an estimate of the following:
Here, the positive semantic distribution has the same semantic label as the sample, while each negative semantic distribution has a different semantic label. Although the analytic form of this loss is intractable, it admits a rigorous closed-form upper bound:
Some further assumptions on the feature distribution are required to facilitate our formulation. For a random variable a that follows a Gaussian distribution \(a\sim \mathcal {N}(\mu , \Sigma )\), where \(\mu \) is the expectation of a and \(\Sigma \) is the covariance matrix of a, the moment generating function [48] satisfies:
Under the Gaussian assumption \(a^{+} \sim \mathcal {N}(\mu ^{+}, \Sigma ^{+})\) and \(a^{-}_j \sim \mathcal {N}(\mu ^{-}_j, \Sigma ^{-}_j)\), along with (10), we find that (8) for a certain sample representation \(a_i\) immediately reduces to:
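The chain from the multi-pair loss to its upper bound can be sketched as follows, assuming the loss averages over the M positives and over the N negatives of each other category. The index set \(\mathcal{K}^{-}\) (the categories other than that of \(a_i\)) and the subscript \(i\) are notation introduced here for illustration, so the normalization may differ from the authors' exact equations. The final inequality applies Jensen's inequality to the concave logarithm.

```latex
\mathcal{L}^{M,N}_{i}
  = \frac{1}{M}\sum_{m=1}^{M}
    -\log \frac{\exp(a_i^{\top} a^{m+}/\tau)}
               {\exp(a_i^{\top} a^{m+}/\tau)
                + \sum_{j \in \mathcal{K}^{-}} \frac{1}{N}\sum_{n=1}^{N}
                  \exp(a_i^{\top} a^{n-}_{j}/\tau)}

\mathcal{L}^{\infty}_{i}
  = \lim_{M,N \to \infty} \mathcal{L}^{M,N}_{i}
  = \mathbb{E}_{a^{+}}\!\left[
      \log\!\Big( \exp(a_i^{\top} a^{+}/\tau)
        + \sum_{j \in \mathcal{K}^{-}} \mathbb{E}_{a^{-}_{j}}
          \big[\exp(a_i^{\top} a^{-}_{j}/\tau)\big] \Big)
    \right]
    - \frac{a_i^{\top}\,\mathbb{E}_{a^{+}}[a^{+}]}{\tau}

\mathcal{L}^{\infty}_{i}
  \le \log\!\Big( \mathbb{E}_{a^{+}}\big[\exp(a_i^{\top} a^{+}/\tau)\big]
        + \sum_{j \in \mathcal{K}^{-}} \mathbb{E}_{a^{-}_{j}}
          \big[\exp(a_i^{\top} a^{-}_{j}/\tau)\big] \Big)
    - \frac{a_i^{\top}\,\mathbb{E}_{a^{+}}[a^{+}]}{\tau}
```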
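To make the Gaussian step concrete: for \(x \sim \mathcal{N}(\mu, \Sigma)\) the moment generating function is \(\mathbb{E}[e^{t^{\top}x}] = \exp(t^{\top}\mu + \tfrac{1}{2}t^{\top}\Sigma t)\), so with \(t = a_i/\tau\) every expectation in a Jensen-style bound has a closed form. The sketch below evaluates the resulting expression, \(\log\sum_k \exp(a_i^{\top}\mu_k/\tau + a_i^{\top}\Sigma_k a_i/(2\tau^2)) - a_i^{\top}\mu^{+}/\tau\), summing over all K categories. The function names and toy statistics are hypothetical, and this is one reading of the derivation rather than the authors' code.

```python
import math


def quad_form(a, cov):
    """Quadratic form a^T cov a."""
    d = len(a)
    return sum(a[r] * cov[r][c] * a[c] for r in range(d) for c in range(d))


def contrastive_upper_bound(a, label, means, covs, tau=1.0):
    """Closed-form bound for one representation `a` whose category is `label`.

    Each category k contributes the Gaussian MGF term
    exp(a.mu_k / tau + a^T Sigma_k a / (2 tau^2)); the positive linear
    term a.mu_label / tau is subtracted at the end.
    """
    logits = []
    for mu, cov in zip(means, covs):
        linear = sum(x * y for x, y in zip(a, mu)) / tau
        logits.append(linear + quad_form(a, cov) / (2.0 * tau ** 2))
    peak = max(logits)  # stable log-sum-exp
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    positive_linear = sum(x * y for x, y in zip(a, means[label])) / tau
    return log_sum - positive_linear


# Toy usage with two categories in 2-D (hypothetical statistics).
zero = [[0.0, 0.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
means = [[1.0, 0.0], [0.0, 1.0]]
base = contrastive_upper_bound([1.0, 0.0], 0, means, [zero, zero])
wider = contrastive_upper_bound([1.0, 0.0], 0, means, [zero, identity])
```

With zero covariances the bound collapses to an ordinary softmax cross-entropy over the category means; a larger covariance for a negative category raises the bound, reflecting the wider spread of implicit negative samples.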
3.3 Overall formulation
The mutual information I(X; Y) measures how strongly two random variables are related. When target features and predictions are strongly correlated, our semantic augmentations capture more meaningful semantic information and ignore trivial semantic information. Therefore, we maximize mutual information on target data, i.e., minimize the loss in (13).
where \(\varvec{\hat{P}} = \frac{1}{n_{t}} \sum _{j=1}^{n_t} \varvec{P}_{tj}\). The ground-truth distribution on the target domain is approximated by the average of the target predictions.
Therefore, IDD_ICL optimizes the following objective function:
In this case, \(\alpha \) represents the trade-off parameter. We summarize our training process in Algorithm 1. A detailed analysis of IDD_ICL is given in the ablation study.
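A common way to turn this idea into a loss is to estimate the mutual information as the entropy of the average prediction minus the average entropy of the individual predictions, and to minimize its negative. The sketch below follows that reading; the function names, the `eps` smoothing constant, and the toy predictions are assumptions for illustration, and the exact form of the paper's loss may differ.

```python
import math


def entropy(p, eps=1e-12):
    """Shannon entropy of a probability vector (eps avoids log(0))."""
    return -sum(q * math.log(q + eps) for q in p)


def mutual_information_loss(predictions):
    """Negative mutual-information estimate on a set of target predictions.

    predictions: list of softmax outputs P_tj, one per target sample.
    Returns mean_j H(P_tj) - H(P_hat), where P_hat is the average prediction.
    """
    n, k = len(predictions), len(predictions[0])
    p_hat = [sum(p[c] for p in predictions) / n for c in range(k)]
    conditional = sum(entropy(p) for p in predictions) / n
    return conditional - entropy(p_hat)


# Toy usage: confident and class-balanced predictions minimize the loss,
# while uniform or collapsed predictions give a loss of about zero.
loss_balanced = mutual_information_loss([[1.0, 0.0], [0.0, 1.0]])
loss_uniform = mutual_information_loss([[0.5, 0.5], [0.5, 0.5]])
loss_collapsed = mutual_information_loss([[1.0, 0.0], [1.0, 0.0]])
```

The two terms pull in complementary directions: low per-sample entropy rewards confident predictions, and high entropy of the average prediction discourages assigning every target sample to the same category.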
4 Experiment
4.1 Datasets
Office-31 [66] contains images from 31 categories across three domains: Amazon (A), Webcam (W), and DSLR (D). Amazon: 2,817 images, an average of 90 per class, with a single image background. Webcam: 795 images that exhibit significant noise, color, and white balance artifacts. DSLR: 498 images, 5 objects per class, each object taken on average 3 times from different viewpoints.
Office-Home [67] contains 15,500 images in 65 categories across four domains: Artistic (Ar), Clip Art (Cl), Product (Pr), and Real-World (Rw). Office-Home is a more complex dataset than Office-31.
VisDA-2017 [68] spans 12 categories. Following [34], we use 152,397 synthetic images as the source and 55,388 real images as the target.
4.2 Implementation details
Our backbone network for these datasets is ResNet [69] pretrained on ImageNet [70]. The experiments in this paper are implemented using PyTorch [71]. For network optimization, we use a minibatch SGD optimizer with momentum 0.9, and deep embedded validation [72] to select the hyperparameter \(\alpha \) from \(\{0.01, 0.05, 0.1, 0.15, 0.2\}\). On all datasets, we find that \(\alpha =0.1\) works well.
4.3 Results
Results on Office-31 can be found in Table 2. IDD_ICL outperforms JAN and DANN by a large margin, showing that revised contrastive learning is also indispensable for UDA. In particular, IDD_ICL significantly improves GSP by 2.5%, demonstrating that IDD_ICL complements previous UDA methods. Additionally, IDD_ICL is superior to recent classifier adaptation methods, such as SymNets and TAT, showing that it is capable of exploring useful semantic information for better diversity and discrimination.
Results on Office-Home can be found in Table 3. With large domain discrepancies, Office-Home is one of the most challenging datasets for UDA. Compared with the other methods, IDD_ICL consistently improves generalization ability. In particular, IDD_ICL improves the accuracy of GVB-GD by 4.1%, with the average accuracy reaching 74.5%. These promising results indicate that IDD_ICL stably enhances the transferability of classifiers across cross-domain datasets.
Results on VisDA-2017 can be found in Table 4. Compared to other augmentation methods, IDD_ICL performs dramatically better. Generally, IDD_ICL generates better augmentation results since it exploits the mean difference and target covariance to capture semantic information class-wise. Furthermore, IDD_ICL proves its effectiveness and versatility over other baseline methods as well.
4.4 Analysis
Ablation study. To demonstrate the effectiveness of the proposed method, we use CDAN, DANN, and BSP as baselines. Table 5 shows that adding IDD_ICL yields a large improvement over each baseline, including an 8% gain on DANN, which further demonstrates the effectiveness of our method. The models without the proposed component produce worse classification accuracies on these tasks, demonstrating that contrastive learning information in the target domain can make a significant contribution to domain adaptation. All baselines alone produce inferior results, and the full model with IDD_ICL produces the best results. This validates the contribution of the proposed method.
Hyperparameter sensitivity. To study how the hyperparameter \(\alpha \) affects the performance of our method, we conduct a sensitivity test on two Office-31 tasks, A\(\rightarrow \)W and W\(\rightarrow \)A, by varying \(\alpha \in \{0.01, 0.05, 0.1, 0.15, 0.2\}\). Figure 3 shows that IDD_ICL is not very sensitive to \(\alpha \) and achieves competitive results across these values. Empirically, we recommend \(\alpha =0.1\) for a naive implementation.
Quantitative distribution discrepancy. The distribution discrepancy between the source and target domains, measured by the \(\mathcal {A}\)-distance [73], is used here to evaluate the functionality of each component in our model (Fig. 4). Following [73], the \(\mathcal {A}\)-distance is defined as \(d_{\mathcal {A}}=2(1-2\epsilon )\), where \(\epsilon \) is the classification error of a binary domain classifier trained to discriminate between the source and target domains. In general, the smaller the \(\mathcal {A}\)-distance, the better the distributions are aligned. As shown in Fig. 4, the \(\mathcal {A}\)-distance between the two domains is smaller with our model than with the other three baselines. In other words, our model reduces the domain discrepancy more effectively.
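As a small worked example of this definition, the snippet below converts a domain classifier's error into the proxy \(\mathcal{A}\)-distance \(2(1-2\epsilon)\). The helper names and the toy domain labels are hypothetical; in practice \(\epsilon\) would be measured on held-out features from both domains.

```python
def classifier_error(predicted_domains, true_domains):
    """Fraction of samples whose domain (0 = source, 1 = target) is misclassified."""
    wrong = sum(p != t for p, t in zip(predicted_domains, true_domains))
    return wrong / len(true_domains)


def a_distance(domain_classifier_error):
    """Proxy A-distance d_A = 2 * (1 - 2 * eps) for a domain classifier error eps."""
    if not 0.0 <= domain_classifier_error <= 1.0:
        raise ValueError("error must lie in [0, 1]")
    return 2.0 * (1.0 - 2.0 * domain_classifier_error)


# Toy usage: a domain classifier that mislabels 2 of 8 samples (eps = 0.25).
true_domains = [0, 0, 0, 0, 1, 1, 1, 1]
predicted = [0, 0, 0, 1, 1, 1, 1, 0]
eps = classifier_error(predicted, true_domains)
distance = a_distance(eps)
```

An error of 0.5 (the classifier cannot tell the domains apart) gives a distance of 0, while a perfect domain classifier gives the maximum value of 2.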
5 Conclusion
This paper proposes a novel contrastive learning approach for improving diversity and discriminability in domain adaptation (IDD_ICL). Through sample-wise alignment guided by semantic distributions, the IDD_ICL model successfully adapts to the target domain. For each sample-wise representation of both domains, we use a particular form of contrastive loss that implicitly involves learning from infinitely many similar/dissimilar sample pairs. A practical implementation of this otherwise intractable loss is then derived. The combination of this simple but effective strategy and self-supervised learning is surprisingly effective. Experimental results on a variety of benchmarks demonstrate the superiority of IDD_ICL.
One limitation of our study is that it only focuses on image classification. In future work, we plan to extend the scope of our benchmarking to include semantic segmentation tasks. Additionally, we have only considered the closed-set domain adaptation scenario, where the source and target domains share the same classes. In future work, we aim to consider partial and open-set domain adaptation scenarios, which are common in image classification applications and involve different classes in the source and target domains.
References
O’Shea K, Nash R (2015) An introduction to convolutional neural networks. arXiv:1511.08458
Wang M, Wang W, Li B, Zhang X, Lan L, Tan H, Liang T, Yu W, Luo Z (2021) Interbn: channel fusion for adversarial unsupervised domain adaptation. In: Proceedings of the 29th ACM international conference on multimedia, pp 3691–3700
Wang M, An S, Luo X, Peng X, Yu W, Chen J, Luo Z (2022) Attention-based adversarial partial domain adaptation. In: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 3144–3148
Wang M, Li P, Shen L, Wang Y, Wang S, Wang W, Zhang X, Chen J, Luo Z (2022) Informative pairs mining based adaptive metric learning for adversarial domain adaptation. Neural Networks
Wang M, Wang S, Yang X, Yuan J, Zhang W (2024) Equity in unsupervised domain adaptation by nuclear norm maximization. IEEE Transactions on Circuits and Systems for Video Technology
Guan D, Huang J, Xiao A, Lu S (2021) Domain adaptive video segmentation via temporal consistency regularization. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 8053–8064
Li Y, Yuan L, Vasconcelos N (2019) Bidirectional learning for domain adaptation of semantic segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 6936–6945
Zou Y, Yu Z, Liu X, Kumar B, Wang J (2019) Confidence regularized self-training. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 5982–5991
Zou Y, Yu Z, Kumar B, Wang J (2018) Unsupervised domain adaptation for semantic segmentation via class-balanced self-training. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 289–305
Wang M, Liu Y, Yuan J, Wang S, Wang Z, Wang W (2024) Inter-class and inter-domain semantic augmentation for domain generalization. IEEE Transactions on Image Processing
Wang M, Yuan J, Wang Z (2023) Mixture-of-experts learner for single long-tailed domain generalization. In: Proceedings of the 31st ACM International Conference on Multimedia, pp 290–299
Wang M, Chen J, Wang H, Wu H, Liu Z, Zhang Q (2023) Interpolation normalization for contrast domain generalization. In: Proceedings of the 31st ACM International Conference on Multimedia, pp 2936–2945
Wang M, Wang S, Wang Y, Wang W, Liang T, Chen J, Luo Z (2023) Boosting unsupervised domain adaptation: A fourier approach. Knowledge-Based Systems 264, 110325
Saito K, Watanabe K, Ushiku Y, Harada T (2018) Maximum classifier discrepancy for unsupervised domain adaptation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3723–3732
Luo Y, Zheng L, Guan T, Yu J, Yang Y (2019) Taking a closer look at domain shift: category-level adversaries for semantics consistent domain adaptation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 2507–2516
Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville AC, Bengio Y (2014) Generative adversarial nets. In: Proc. NeurIPS, pp 2672–2680
Yang Y, Soatto S (2020) Fda: fourier domain adaptation for semantic segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 4085–4095
Tzeng E, Hoffman J, Darrell T, Saenko K (2015) Simultaneous deep transfer across domains and tasks. In: Proc ICCV, pp 4068–4076
Li S, Liu CH, Lin Q, Wen Q, Su L, Huang G, Ding Z (2020) Deep residual correction network for partial domain adaptation. IEEE Trans Pattern Anal Mach Intell 1–1. https://doi.org/10.1109/TPAMI.2020.2964173
Long M, Cao Y, Wang J, Jordan MI (2015) Learning transferable features with deep adaptation networks. In: Proc ICML, vol 37, pp 97–105
Liu L, Pietikäinen M, Chen J, Zhao G, Wang X, Chellappa R (2019) Guest editors’ introduction to the special section on compact and efficient feature representation and learning in computer vision. IEEE Trans Pattern Anal Mach Intell 41(10):2287–2290
Zhang W, Xu D, Ouyang W, Li W (2019) Self-paced collaborative and adversarial network for unsupervised domain adaptation. IEEE Trans Pattern Anal Mach Intell 1–1
Gong B, Shi Y, Sha F, Grauman K (2012) Geodesic flow kernel for unsupervised domain adaptation. In: Proc CVPR, pp 2066–2073
Li S, Song S, Huang G, Ding Z, Wu C (2018) Domain invariant and class discriminative feature learning for visual domain adaptation. IEEE Trans Image Process 27(9):4260–4273
Long M, Zhu H, Wang J, Jordan MI (2017) Deep transfer learning with joint adaptation networks. In: Proc ICML 70:2208–2217
Tzeng E, Hoffman J, Zhang N, Saenko K, Darrell T (2014) Deep domain confusion: maximizing for domain invariance. CoRR abs/1412.3474
Gretton A, Sriperumbudur BK, Sejdinovic D, Strathmann H, Balakrishnan S, Pontil M, Fukumizu K (2012) Optimal kernel choice for large-scale two-sample tests. In: Proc NeurIPS, pp 1214–1222
Tzeng E, Hoffman J, Saenko K, Darrell T (2017) Adversarial discriminative domain adaptation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7167–7176
Ganin Y, Lempitsky VS (2015) Unsupervised domain adaptation by backpropagation. In: Proc ICML 37:1180–1189
Long M, Cao Z, Wang J, Jordan MI (2018) Conditional adversarial domain adaptation. In: Proc NeurIPS, pp 1647–1657
Kang G, Jiang L, Yang Y, Hauptmann AG (2019) Contrastive adaptation network for unsupervised domain adaptation. In: Proc CVPR, pp 4893–4902
Xie S, Zheng Z, Chen L, Chen C (2018) Learning semantic representations for unsupervised domain adaptation. In: Proc ICML 80:5419–5428
Taghiyarrenani Z, Nowaczyk S, Pashami S, Bouguelia MR (2022) Towards geometry-preserving domain adaptation for fault identification. In: Joint European conference on machine learning and knowledge discovery in databases, pp 451–460
Saito K, Watanabe K, Ushiku Y, Harada T (2018) Maximum classifier discrepancy for unsupervised domain adaptation. In: CVPR, pp 3723–3732
Pei Z, Cao Z, Long M, Wang J (2018) Multi-adversarial domain adaptation. In: Proceedings of the AAAI conference on artificial intelligence, vol 32
Wu Z, Xiong Y, Yu SX, Lin D (2018) Unsupervised feature learning via non-parametric instance discrimination. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3733–3742
He K, Fan H, Wu Y, Xie S, Girshick R (2020) Momentum contrast for unsupervised visual representation learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9729–9738
Tian Y, Krishnan D, Isola P (2020) Contrastive multiview coding. In: Computer vision–ECCV 2020: 16th European conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XI 16, pp 776–794. Springer
Chen T, Kornblith S, Norouzi M, Hinton G (2020) A simple framework for contrastive learning of visual representations. In: International conference on machine learning, pp 1597–1607. PMLR
Chen T, Kornblith S, Swersky K, Norouzi M, Hinton GE (2020) Big self-supervised models are strong semi-supervised learners. Adv Neural Inf Process Syst 33:22243–22255
Hadsell R, Chopra S, LeCun Y (2006) Dimensionality reduction by learning an invariant mapping. In: Proc CVPR, pp 1735–1742
Oord A, Li Y, Vinyals O (2018) Representation learning with contrastive predictive coding. CoRR abs/1807.03748
Chen T, Kornblith S, Norouzi M, Hinton GE (2020) A simple framework for contrastive learning of visual representations. In: Proc ICML, vol 119, pp 1597–1607
Cai Q, Wang Y, Pan Y, Yao T, Mei T (2020) Joint contrastive learning with infinite possibilities. In: Proc NeurIPS
Chuang C, Robinson J, Lin Y, Torralba A, Jegelka S (2020) Debiased contrastive learning. In: Proc NeurIPS
Li S, Xie B, Zang B, Liu CH, Cheng X, Yang R, Wang G (2021) Semantic distribution-aware contrastive adaptation for semantic segmentation. arXiv:2105.05013
Zhang Q, Zhang J, Liu W, Tao D (2019) Category anchor-guided unsupervised domain adaptation for semantic segmentation. In: Proc NeurIPS, pp 433–443
Wang Y, Huang G, Song S, Pan X, Xia Y, Wu C (2021) Regularizing deep networks with semantic data augmentation. IEEE Trans Pattern Anal Mach Intell 1–1
Tzeng E, Hoffman J, Saenko K, Darrell T (2017) Adversarial discriminative domain adaptation. In: CVPR, vol 1, p 4
Long M, Zhu H, Wang J, Jordan MI (2017) Deep transfer learning with joint adaptation networks. In: ICML, pp 2208–2217. ACM
Li M, Zhai YM, Luo YW, Ge PF, Ren CX (2020) Enhanced transport distance for unsupervised domain adaptation. In: CVPR
Cui S, Wang S, Zhuo J, Li L, Huang Q, Tian Q (2020) Towards discriminability and diversity: batch nuclear-norm maximization under label insufficient situations. In: CVPR, pp 3940–3949
Wu Y, Inkpen D, El-Roby A (2020) Dual mixup regularized learning for adversarial domain adaptation. In: ECCV, vol 12374, pp 540–555
Zhang Y, Tang H, Jia K, Tan M (2019) Domain-symmetric networks for adversarial domain adaptation. In: CVPR, pp 5031–5040
Liu H, Long M, Wang J, Jordan M (2019) Transferable adversarial training: a general approach to adapting deep classifiers. In: ICML, pp 4013–4022
Zhang Y, Liu T, Long M, Jordan M (2019) Bridging theory and algorithm for domain adaptation. In: ICML, pp 7404–7413
Cui S, Wang S, Zhuo J, Su C, Huang Q, Tian Q (2020) Gradually vanishing bridge for adversarial domain adaptation. In: CVPR
Xia H, Ding Z (2020) Structure preserving generative cross-domain learning. In: CVPR
Pan Y, Yao T, Li Y, Wang Y, Ngo CW, Mei T (2019) Transferrable prototypical networks for unsupervised domain adaptation. In: CVPR
Long M, Cao Y, Wang J, Jordan MI (2015) Learning transferable features with deep adaptation networks. In: ICML, pp 97–105. ACM
Ganin Y, Lempitsky VS (2015) Unsupervised domain adaptation by backpropagation. In: ICML, pp 1180–1189
Long M, Cao Z, Wang J, Jordan MI (2018) Conditional adversarial domain adaptation. In: NeurIPS, pp 1647–1657
Chen X, Wang S, Long M, Wang J (2019) Transferability vs. discriminability: batch spectral penalization for adversarial domain adaptation. In: ICML, pp 1081–1090
Chang WG, You T, Seo S, Kwak S, Han B (2019) Domain-specific batch normalization for unsupervised domain adaptation. In: CVPR, pp 7354–7362
Zhu Y, Zhuang F, Wang J, Ke G, Chen J, Bian J, Xiong H, He Q (2020) Deep subdomain adaptation network for image classification. TNNLS, 1713–1722
Saenko K, Kulis B, Fritz M, Darrell T (2010) Adapting visual category models to new domains. In: ECCV, pp 213–226
Venkateswara H, Eusebio J, Chakraborty S, Panchanathan S (2017) Deep hashing network for unsupervised domain adaptation. In: CVPR, pp 5018–5027
Peng X, Usman B, Kaushik N, Hoffman J, Wang D, Saenko K (2017) Visda: the visual domain adaptation challenge. arXiv:1710.06924
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: CVPR, pp 770–778
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M (2015) Imagenet large scale visual recognition challenge. IJCV 115(3):211–252
Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, Desmaison A, Köpf A, Yang E, DeVito Z, Raison M, Tejani A, Chilamkurthy S, Steiner B, Fang L, Bai J, Chintala S (2019) Pytorch: an imperative style, high-performance deep learning library. In: Proc. NeurIPS, pp 8024–8035
You K, Wang X, Long M, Jordan M (2019) Towards accurate model selection in deep unsupervised domain adaptation. In: ICML, pp 7124–7133
Ben-David S, Blitzer J, Crammer K, Kulesza A, Pereira F, Vaughan JW (2010) A theory of learning from different domains. Machine learning, 151–175
Additional information
Heng Xu and Chuanqi Shi contributed equally to this work.
About this article
Cite this article
Xu, H., Shi, C., Fan, W. et al. Improving diversity and discriminability based implicit contrastive learning for unsupervised domain adaptation. Appl Intell 54, 10007–10017 (2024). https://doi.org/10.1007/s10489-024-05351-y
DOI: https://doi.org/10.1007/s10489-024-05351-y