1 Introduction

Diamonds, epitomizing both luxury and complexity, have long been a symbol of beauty and strength. As of 2019, the global diamond mining industry had extracted approximately 142 million carats, with major contributions from countries such as Australia, Canada, the Democratic Republic of Congo, Botswana, South Africa, and notably Russia, which alone accounts for over half of the world's estimated 1.2 billion carats in reserves (Diamond Production Value Worldwide by Country, Statista 2022) (Olya Linde, Ari Epstein, Sophia Kravchenko, 2021) (Diamond Reserves Worldwide, Statista 2023). In 2020, the global jewellery market, significantly influenced by Russian production, reached a value of 68 billion USD (Statista 2022). The United States, in particular, exhibits a profound affinity for diamonds, with expenditures reaching 35 billion USD in 2020 (Olya Linde, Ari Epstein, Sophia Kravchenko, 2021), and it maintained its position as the largest market for polished diamonds in 2019, valued at 12.8 billion USD (Diamond Reserves Worldwide, Statista 2023). The diamond industry, however, is navigating evolving trends and challenges. Technological advancements such as blockchain, AI, and ML are being introduced to enhance transparency in diamond pricing and certification (De Beers 2022).

Nevertheless, the industry continues to grapple with ethical dilemmas, environmental concerns, and the unpredictability of diamond prices, alongside the growing market presence of synthetic diamonds (Botswana 2023; Joshua Freedman 2022). Beyond the traditional powerhouses, emerging markets like India and China are gaining prominence, influenced by their expanding middle classes. African nations, rich in diamond reserves, are striving for greater economic benefits from their natural resources, exemplified by Botswana’s initiatives (Botswana 2023). The ongoing geopolitical dynamics, including events like the Ukraine war, add further complexity to the diamond industry, potentially affecting supply chains and market stability (Joshua Freedman 2022). As the industry stands at a crossroads between traditional allure and modern ethical considerations, its future trajectory remains a fascinating subject of global interest.

Diamonds, renowned for their unmatched hardness and beauty, are not just industrial gems but symbols of luxury and desire, transcending their utility to become the most sought-after gemstone (GIA 2022). Introduced by the Gemmological Institute of America (GIA) in the 1950s, the 4Cs—Cut, Carat, Color, and Clarity—are the most recognized features of diamonds. According to Chu (2001), these four features are what make a gemstone special and determine how much it is worth. In addition to the famous 4Cs, diamonds have several further characteristics, such as length, breadth, height, and table size. Using ML methods (Ming et al. 2023; W. Zhang et al. 2023a, b), which can successfully identify non-linear interactions in the dataset, this research undertakes a thorough analysis of how these combined qualities impact diamond price. Notably, the diamond business has been encountering difficulties and shifts in value and price, mostly as a consequence of the limited adoption of innovative technology. The extraction, cutting, and refining phases of the sector's production chain, as well as the retailing phases, have been affected by many variables but have mostly operated without major technological disruption. Price transparency has been one of the main issues, since even other diamond-trading participants often find the procedures and economics of middle-market companies to be opaque. Customers have a hard time determining how much gemstones are really worth due to the opaque nature of the assessment procedure (Charlotte McLeod 2013).

ML algorithms trained on large datasets that include information on diamond qualities, industry patterns (Ming et al. 2023; Mosqueira-Rey et al. 2023; Ren & Du 2023; W. Zhang et al. 2023a, b), and bidding outcomes are attempting to shed light on the arbitrary and murky world of diamond pricing. These algorithms can uncover intricate correlations and unspoken connections among different parameters, which can result in more precise and impartial price estimates. Diamond traders have developed a number of theories and approaches in response to the difficulties they face, most notably in the areas of costing and appraisal. The 4Cs framework has long been used to assess diamonds: carat weight, cut quality, clarity, and color (GIA 2022). When calculating a diamond's value, each of these criteria is assigned a certain percentage. However, this conventional technique fails to adequately capture the intricate interaction between these components, which leads to limits and errors in pricing. Because of its inherent intricacy, diamond pricing is notoriously difficult to forecast, which in turn hinders investors' and purchasers' capacity to make well-informed financial choices.

A number of approaches and methodologies have been developed in response to the difficulties encountered in the diamond trade, specifically with regard to pricing and valuation. This study makes use of the following models: 1. CatBoost Regressor (CBR), 2. XGBoost Regressor (XGBR), 3. Random Forest Regressor (RFR), 4. Decision Tree Regressor (DTR), 5. Gradient Boosting Regressor (GBR), 6. K-Nearest Neighbour Regressor (KNN), 7. Lasso Regression (Lasso), 8. Ridge Regression (RR), 9. Linear Regression (LR), 10. Polynomial Regression (PR), 11. Elastic Net Regression (ENR), 12. Support Vector Regressor (SVR), 13. Decision Tree Classifier (DTC), 14. Random Forest Classifier (RFC), 15. Support Vector Machine Classifier (SVC), 16. Logistic Regression (LR), 17. K-Nearest Neighbour Classifier (KNNC), 18. Gradient Boosting Classifier (GBC), 19. AdaBoost Classifier (ABC), 20. XGBoost Classifier (XGBC), 21. CatBoost Classifier (CBC), 22. Naive Bayes Classifier (NBC), and 23. LightGBM Classifier (LGBMC). Precise diamond price estimation is the focus of this study, which offers a distinctive measurement of modelling effectiveness in terms of accuracy, predicted values (categorized as under, correct, and over predictions), and execution time before and after hyperparameter tuning. We selected these algorithms to showcase a variety of approaches because of their flexibility, extensive documentation of their successes in many domains, and general popularity (Gajula & Rajesh 2024; Reddy & Kumar 2024).

A thorough evaluation of the literature on gemstone appraisal and ML approaches was conducted to determine their capacity to reliably predict diamond prices as part of a valuation strategy (Iyortsuun et al. 2023; Mohtasham Moein et al. 2023), and the model selection was based on this evaluation. Other techniques could have been considered, but they were outside the scope of this study; future work may examine additional methods to learn more about how to estimate the value of gemstones. These results could lead to more accurate and fair ways of valuing diamonds. For anyone working in the field or having a stake in the outcome, this research has far-reaching implications. Customers and investors would benefit from a more accurate and transparent gemstone costing system, were one available. In addition (Harris & Noack 2023; Hasan et al. 2024), it may help resolve the current disparities and inconsistencies in the industry. ML algorithms might also revolutionize diamond valuation by streamlining the process and sparing specialists valuable time and resources. When it comes to dealing with multidimensional data, which are frequently non-linear and do not adhere to the assumptions of conventional statistical procedures, ML algorithms have proved to be formidable rivals to conventional statistical models (Méndez et al. 2023). This has resulted in substantial improvements in modern statistical computation. Their ability to accurately forecast and classify diamond prices is a major selling point, as it bodes well for future price precision. Numerous predictive performance metrics, such as R2, MAE, RMSE, F-Measure, Precision, Accuracy, and Recall, and computational parameters like speed and run time, corroborate the statements made above. When it comes to ML, knowing which methods work best is essential for focusing the modeller's efforts where they will have the most impact (Banga et al. 2023).

Several characteristics, including carat weight, color, clarity, and cut, determine the intrinsic value of diamonds, which are among the most precious and costly jewels on Earth. The cut is what most interests the major players in the diamond industry; it governs three important optical qualities: sparkle, dispersion, and scintillation. As noted above, professional evaluation is the backbone of conventional diamond price forecasting, but it can be subjective and laborious. This highlights the need for better tools for predicting diamond prices. The predictive power of ML models has led to their increased application in the diamond market (Méndez et al. 2023; Mohtasham Moein et al. 2023). These ML methods are capable of learning from historical patterns and providing reliable results (Tian et al. 2023). On the other hand, the prediction abilities of ML methods differ depending on the datasets employed.

Historical analysis of work on gemstone value prediction has taken factors including diamond size, exterior qualities, and pricing forces into account. Nevertheless, these studies had a number of flaws. They fall short in several respects, including addressing the importance of classification, providing relevant evaluation metrics, and appropriately comparing predictions. These gaps are addressed by this study, which analyses many ML methods that successfully predict gemstone values (Alsuraihi et al. 2020; Fitriani et al. 2022; Mihir et al. 2021a; G. Sharma et al. 2021). Classification and regression techniques are both part of this set. The study handles standardizing predictor variables, imputing missing values, and resolving multicollinearity, among other things, to get the data ready for analysis. It then tests the algorithms' performance in regression and classification tasks using appropriate metrics (Banga et al. 2023).

Consequently, this study's overarching goal is to determine, using a combination of classification and regression methods, which supervised ML strategy is best suited to forecasting diamond prices. Stakeholders in the diamond sector rely on comprehensive price projections to make educated choices, mitigate risk, and maximize profits (Alsuraihi et al. 2020; Chu 2001; José M Peña Marmolejos, 2018; G. Sharma et al. 2021). Therefore, a significant topic of study for the business community is the establishment of superior modelling systems employing ML approaches (Demir & Sahin 2023). To forecast diamond rates, this research employs ensemble approaches and ML techniques, such as boosting and bootstrap aggregation (Demir & Sahin 2023). The efficacy of classifiers may be improved by the use of boosting (Sahu et al. 2022). Although ensemble techniques have shown high experimental achievement, they have not been used in the majority of comparative simulation investigations (Thabtah et al. 2019). Our diamond rate forecasting methods combine classification and regression techniques. Because the cut intervals are not uniformly distributed, the classification-based technique uses equal-width binning on the cut variable. This strategy enhances understanding by transforming value ranges into bin numbers, splitting the spectrum of each attribute into a predetermined number of bins of equal size. With equal-width binning, the width of each bin remains consistent, which helps keep the data (Sahu et al. 2022) in each bin balanced and evenly represented (Chakrabarti et al. 2023). This prevents the system from favouring certain data points over others, which is vital for building predictive models. Among the diamond attributes, and within the 4Cs in particular, the cut dimension has its own significance: the cut feature may take on Fair, Good, Very Good, Premium, or Ideal values. Incorporating the cut factor therefore helps the ML system make more accurate diamond pricing predictions (Pandey et al. 2019). Researchers categorize gemstones based on their cut using binning, a prominent technique for turning continuous features into discrete ones. This technique helps prevent outliers and very large values from skewing quantitative results. By first categorizing diamonds according to cut and then focusing on identifying commonalities and relationships within each category, the researchers enhanced the ML strategy's capacity to predict diamond prices (Mihir et al. 2021b). By collecting similar gemstones into sets and feeding those sets into the ML approach (Ming et al. 2023; Ren & Du 2023), investigators can improve the model's ability to detect patterns and provide more precise estimates. To evaluate the efficacy of the regression-based techniques used in this work, we use metrics such as RMSE, R2, MAE, and MSE; the classification-based strategies are evaluated using metrics such as recall, accuracy, precision, and F-measure. The hope is that by delving into the potential of ML in diamond value calculation, this study may pave the way for a more efficient and accurate approach (Ren & Du 2023).
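As a concrete illustration of the binning step described above, the sketch below applies equal-width binning to a continuous diamond attribute with pandas; the column name, sample values, and choice of five bins are illustrative assumptions rather than the study's exact configuration.

```python
import pandas as pd

# Hedged sketch of equal-width binning: the `price` values and the choice
# of 5 bins are illustrative, not the study's actual settings.
df = pd.DataFrame({"price": [326, 554, 2757, 4268, 9142, 15320, 18823]})

# pd.cut with an integer `bins` argument splits the full value range into
# intervals of equal width and assigns each row the index of its bin.
df["price_class"] = pd.cut(df["price"], bins=5, labels=False)

print(df)
```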

Recent literature highlights the need for improved methods of predicting diamond prices, accounting for the industry’s complexities and evolving trends (Alsuraihi et al. 2020; Fitriani et al. 2022; Mihir et al. 2021a; G. Sharma et al. 2021). Our paper addresses this by presenting a comprehensive analysis of ML models, including regression and classification techniques, to accurately forecast diamond prices (Harris & Noack 2023; Hasan et al. 2024).

Notably, while previous studies focus primarily on regression, our paper encompasses both regression and classification, providing a holistic approach to diamond price prediction. This enables us to identify and evaluate the most effective models for the industry's specific needs.

The main contributions of this study articulate the unique value of our research and its impact on advancing the state of the art in diamond price prediction. Our study integrates a comprehensive ML approach that significantly enhances predictive accuracy and provides deeper analytical insight into the factors influencing diamond prices. The paper contributes to the literature in the following key ways, addressing gaps in the existing literature and setting new benchmarks for accuracy and comprehensiveness:

  • Comprehensive Model Evaluation: We perform an extensive evaluation of both regression and classification models, assessing the accuracy, execution time, and predictive value alignment. This robust baseline comparison enhances understanding of model performance across various metrics.

  • Hyperparameter Tuning: By introducing hyperparameter tuning across diverse ML models, we significantly improve the predictive accuracy and operational efficiency. This addresses a critical literature gap, showcasing the before-and-after tuning effects on model performance.

  • Advanced Feature Extraction Techniques: Our methodology leverages sophisticated feature extraction methods to decipher intrinsic data patterns, crucial for accurate price determination. This not only boosts predictive accuracy but also sheds light on the impact of key characteristics such as cut, color, clarity, and carat weight.

  • Cross-Model Comparisons and Innovations: Beyond comparing existing models, our study introduces innovative approaches that combine various models to enhance predictive power. The use of ensemble techniques, for instance, helps reduce bias and variance, demonstrating the advantages of model integration.

  • Practical Implications for the Diamond Industry: The research provides actionable insights for the diamond industry by analyzing price determinants and market trends. Our findings aid stakeholders in making informed decisions, essential in a volatile and complex market.

Overall, by integrating multiple ML strategies and optimizing model parameters, our study not only fills existing research gaps but also enhances the theoretical and practical understanding of diamond price dynamics. The comprehensive evaluation and novel analytical insights offered here set a new benchmark for comprehensiveness and accuracy in the predictive analytics of diamond prices. By integrating advanced ML techniques and providing a rigorous evaluation of their effectiveness (Wu et al. 2010), we advance the state of the art, offering both theoretical and practical contributions to the field, at a time when the industry stands at a crossroads between traditional allure and modern ethical considerations (Bontempi et al. 2013; Walid et al. 2022).

This research clearly states the objectives the study aims to achieve, focusing on the analysis of diamond price prediction through ML approaches.

  1.

    To Evaluate the Performance of Machine Learning Models in Diamond Price Prediction: Assess the base accuracy, execution time, and estimated value alignment (under, accurate, over predictions) of various regression and classification models in predicting diamond prices. This evaluation aims to identify initial performance benchmarks for each model.

  2.

    To Optimize Model Performance through Hyperparameter Tuning: Implement hyperparameter tuning techniques across selected Machine Learning models to enhance the predictive accuracy and efficiency. This objective targets the improvement of model outcomes, focusing on the tuning process’s impact on model performance.

  3.

    To Compare the Effects of Hyperparameter Tuning on Different Models: Analyze the before-and-after effects of hyperparameter tuning on model accuracy, execution time, and predictive value distribution. This comparison aims to highlight the tuning process’s efficacy across different models, identifying which models benefit most from optimization.

  4.

    To Identify the Best-Performing Models for Diamond Price Prediction: Determine which Machine Learning models, both regression and classification, show superior performance in terms of accuracy, efficiency, and predictive value alignment after hyperparameter tuning. The objective is to single out the most effective models for practical application in diamond price prediction.

Using ML's capacity to improve diamond price forecasting, the proposed work aims to pioneer breakthroughs in predictive insights for the diamond sector (Alsuraihi et al. 2020; Fitriani et al. 2022; Mihir et al. 2021a; G. Sharma et al. 2021). This project intends to set a new standard for precision, effectiveness, and practical use in diamond price prediction by carefully analysing and improving a wide range of existing algorithms. Our goal is to improve the models' forecasting abilities via hyperparameter tuning so that diamond valuation, inventory management, and investment choices may be made with greater subtlety and precision. In addition to strengthening the analytical foundation for price forecasting, these goals will provide investors, merchants, and evaluators in the diamond business with state-of-the-art resources and knowledge to better navigate market conditions. By accomplishing these goals, we hope to make a substantial contribution to the predictive analytics literature, pave the way for similar studies in the diamond industry and elsewhere, and improve decision-making and operational efficiency for businesses in this field.

The next parts of this article thoroughly investigate the many aspects of our research. ‘Review of Literature,’ which follows this introduction, examines the previous research to clarify the present level of understanding and emphasize the gaps that our study intends to address. Following this, in the ‘Proposed Work’ part, we detail the approaches and structures used to deal with the intricacies of ML algorithms used to forecast diamond prices. Following that, the ‘Results’ section displays the outcomes of our thorough investigation, highlighting the efficiency of the several models that were considered. This leads us into the ‘Discussion’ section, where we interpret the significance of our results, placing them within the broader context of predictive modeling and its implications for the gemmology field. Finally, the ‘Conclusion’ section wraps up the study, summarizing the key insights, affirming or refuting our research objectives, and suggesting avenues for future research.

2 Review of literature

The collected studies provide abundant expertise on using several ML methods to anticipate diamond prices. Closer investigation of the scientific papers and methods cited in these works reveals flaws and room for improvement.

Pandey et al. (2019) took on the challenging job of predicting the future worth of precious metals using ensemble techniques and feature selection algorithms. To overcome over- and under-fitting, an integrated model comprising RF and PCA was suggested. The study concluded that RF outperformed LR, with an average effectiveness of 0.9730 versus 0.8695 for LR. The top accuracy of 0.9754 was achieved using Chi-Square feature selection with RFR on the five best characteristics, compared with 0.8663 for LR alone. The research could have addressed overfitting alongside the identification of relevant features and compared its efficacy with other high-performing ML approaches, but it did not. Moreover, the results were not evaluated using important assessment indices such as R2 or RMSE, and the somewhat arbitrary use of PCA during the evaluation process raises questions about the empirical integrity and independence of the findings.

G. Sharma et al. (2021) set out to offer ML strategies for diamond price prediction. Eight candidate ML methods were reviewed in the investigation: LR, Lasso, RR, DT, RF, EN, ABR, and GBR. The goal was to determine which paradigm performed best. When the dataset was divided into 80% for training and 20% for testing, RF achieved an R2 score of 0.9793, indicating greater efficiency according to the study's results. The research did not compare RF against classification algorithms or against other innovative ML strategies such as XGB. The significance of classification in pricing was disregarded, especially with respect to the diamond cut. In addition, regression metrics such as MAE and RMSE were not reported, so the outcomes could not be fully evaluated. Mihir et al. (2021c) trained ML methods with a variety of characteristics to tackle the problem of diamond price prediction. An assortment of procedures was employed, including LR, SVR, DT, RFR, KNN, CBR, Huber, ET, PR, BR, and XGB. CBR stood out as the top strategy for gemstone cost forecasting, boasting an impressive R2 score of 0.9872 along with substantially lower RMSE and MAE values. The investigation noted that other characteristics, such as form, table value, polish, and symmetry, should be included to improve the precision of projections.

Taking into account the wide variety of diamond dimensions and other pertinent variables, Alsuraihi et al. (2020) set out to create a method that could accurately estimate gemstone prices. Price projections were made using a variety of ML techniques, such as Neural Networks, LR, GB, PR, and RFR. RFR emerged as the top performer after training and evaluating the various algorithms, with an MAE of 112.93 and an RMSE of 241.97. Unfortunately, the study neglected to account for the dataset's significant class imbalance. Similarly, the study ignored diamond classification, an important consideration given the impact of cut on pricing. Complementing the regression results with classification might have strengthened the evaluation's conclusions.

In an effort to determine the impact of tangible parameters on diamond prices, Mamonov & Triantoro (2018) looked into the correlation between gemstone exterior characteristics and valuation within the framework of online commerce. Weight, color, and clarity were determined to be the main factors driving diamond values. Treating diamond price as a continuous target variable, DT, BDT, and ANN were utilized as predictive data mining procedures. Over the whole dataset, DT achieved the smallest MAE at 5.8%. When the analysis was restricted to diamonds ranging from 0.2 to 2.5 carats, ANN outperformed the other methods, achieving an MAE of 8.2%. Although XGB and other promising predictive data mining techniques have shown significant results in model comparison investigations, they were not included in the research, and neither R2 nor RMSE nor other assessment metrics were used. In addition, the investigation did not take into account the significance of diamond cut, which has a major impact on market valuation.

Chu (2001) set out to develop a system for valuing diamonds that takes varying levels of the 4Cs into account. To forecast gemstone prices, the investigation used MLR with certification, color, clarity, and carat weight as inputs. The final algorithm's R2 estimate was 0.972. The investigation could have used ML approaches instead of MLR to tackle the non-linear association between carat and pricing, but it did not. Other important factors that could affect diamond prices were also not considered in the examination.

The investigation by Fitriani et al. (2022) provides a useful look at the application of ML methods to forecasting diamond prices using k-NN and LASSO algorithms. A close look at the methodology, though, shows that the models considered were rather narrow in scope. While k-NN and LASSO are effective designs, they are the only ones examined; other ML techniques, including both classification and regression models, could offer more accurate and insightful predictions of diamond pricing. An opportunity to investigate the classification aspects of diamond value was lost in this analysis. Because the study only looked at these two models, other regression methods that could provide better results or new insights were not considered.

The technique of Jose M Pena Marmolejos (2018) aims to understand the structure of each diamond arrangement by using three algorithms based on NN, linear regression, and M5P regression tree methods. An analysis is conducted by comparing the correlations among the most significant characteristics that impact diamond price, and the effect of multicollinearity on the effectiveness of data mining models is investigated. However, the work lacks a broader comparative perspective across regression models and does not explore hyperparameter tuning or a spectrum of classification models, which represents a significant limitation.

Table 1 provides a concise summary of the main points and caveats of research on ML methods for diamond price prediction, together with a comparative overview of how various ML features were implemented across the studies. Each column indicates the presence (✔) or absence (–) of key components such as optimizers, feature extraction techniques, and the type of models used, highlighting the comprehensive nature of the proposed model relative to previous work.

Table 1 Comparative analysis of previous studies with proposed model

The proposed model distinguishes itself by incorporating and optimizing all listed aspects into our methodology, marking a significant advancement over previous models.

The specific terms used within the table are explained below for clarity. For instance:

  • Optimizers refer to methods used to adjust the algorithm parameters automatically to minimize errors.

  • Feature Extraction involves techniques to identify and use the most relevant information from the raw data to improve model accuracy.

  • Classification Models and Regression Models refer to the type of predictive modeling techniques used, depending on the nature of the prediction.

  • Hyperparameters involve settings on models that are tuned to optimize performance.

In distinguishing our proposed model from the reviewed literature, several critical enhancements have been implemented to advance the predictive analytics of diamond prices. A notable disparity is observed in the application of optimizers, feature extraction, and hyperparameter tuning, which our proposed model integrates comprehensively. Previous studies, such as those by Alsuraihi et al. (2020) and Sharma et al. (2021), while adept in employing regression models for price prediction, did not harness the full potential of ML by neglecting optimizers and hyperparameter tuning. These components are essential for navigating the intricacies of high-dimensional data and extracting nuanced relationships within the features, which, in our study, have been judiciously used to refine model precision. Furthermore, our proposed model leverages both classification and regression models, unlike the singular approach adopted in the majority of the prior works. For instance, Fitriani et al. (2022) focused solely on regression models, possibly overlooking the insights that classification models could provide in predicting categorical price ranges. By utilizing a dichotomous strategy, our proposed model accommodates a more holistic view, making it versatile in predicting both continuous prices and categorical price segments.

Feature extraction has also been a focal point in our methodology, enabling the identification of intrinsic patterns that contribute significantly to the accuracy of price estimation. This aspect of our model sets it apart from studies like that of Mamonov & Triantoro (2018), which did not utilize feature extraction methods, potentially limiting the depth of analysis possible from the given data. Moreover, our extensive use of hyperparameter tuning stands in contrast with the methodologies of Chu (2001) and Jose M Pena Marmolejos (2018), which did not exploit this dimension of model optimization. Hyperparameter tuning is a sophisticated method that offers a customized approach to model training, thereby fine-tuning the models to the specific characteristics of the dataset at hand. In essence, our proposed model delineates a pioneering approach in diamond price prediction, encapsulating the efficacy of optimizers, the precision of feature extraction, the robustness of classification and regression models, and the finesse of hyperparameter tuning. This comprehensive integration of techniques not only enhances the performance metrics but also underlines the importance of a multifaceted approach in predictive modeling, ensuring that each model reaches its apex of performance.

3 Proposed work

In this research, we present Diamond Prediction, a state-of-the-art predictive model that unites classification and regression methods with hyperparameter tuning to radically alter the process of diamond price prediction. With Diamond Prediction, diamond buyers, sellers, and appraisers can obtain accurate and tailored price projections, allowing for better buying, selling, and assessment judgements. To capture the complex nature of diamond appraisal, the model architecture integrates a wide range of advanced ML approaches, as shown in Fig. 1. Various factors impact diamond prices, and Diamond Prediction uses a combination of ML strategies to account for all of them. Regression models predict continuous price values for diamonds based on their physical properties. To improve the model's adaptability to different prediction requirements, classification approaches are used to assign diamonds to predefined price ranges as estimated values. Optimizing modelling effectiveness through hyperparameter tuning is an important characteristic of Diamond Prediction: we employ random search to fine-tune the techniques so that each approach works as well as it can. By comparing the base approaches with the tuned variations, this work demonstrates how the tuned variations outperform the base strategies in terms of forecasting accuracy and runtime efficiency. The prediction experience is tailored by gemstone qualities such as carat weight, cut, color, clarity, depth, and table. This framework makes approximations that are true to each diamond's unique characteristics because it considers the complicated link between these characteristics and the diamond's price. The study also looks closely at predicted estimates and processing time, contrasting base models and tuned models in particular, and shows how hyperparameter tuning can improve accuracy (Y. A. Ali et al. 2023a, b; Vincent & Jidesh 2023). This process compares the costs and benefits of complicated models and their predicted rates to see how well different methods work in practice. It uses a large collection of diamond prices and characteristics to train and test the estimates; because it learns from such a big dataset, it can easily and accurately handle a wide range of forecasting tasks. Finally, Diamond Prediction is significant in the diamond estimation space because it combines classification and regression models with advanced hyperparameter tuning and an understanding of diamond features. Diamond Prediction aims to change the gem market by giving users access to AI-driven data through accurate, personalized, and easy-to-understand value predictions.

Fig. 1 Proposed Model

Figure 1. illustrates the comprehensive workflow of developing and evaluating ML models, starting from the data source to the final model evaluation. This figure has been simplified by removing color coding to enhance clarity and focus on the structure of the process. Below is a description of the symbols used in the flowchart:

  • Rectangles (Boxes): Represent process steps or operations. Each box in the workflow contains a specific task such as 'Data Preprocessing,' 'Model Development,' 'Models Evaluation,' and 'Tuning Evaluation,' which are crucial stages in the ML model pipeline.

  • Circles (Dots): These are used to denote connection points in the flowchart, linking different sections of the process through arrows. They facilitate the flow from one stage to the next and are essential for depicting the sequential steps in model evaluation and tuning.

  • Summing Junctions: Represented by the circle with a cross inside, these junctions are used to indicate points where different inputs are combined or where a decision impacts the subsequent process flow, such as combining various metrics results to finalize the model evaluation.

  • Arrows: Indicate the direction of the workflow, guiding the reader through the sequential steps from data preprocessing to the final evaluation of the models.

This notation system ensures that each element of the process is clearly understood and the flow from one task to another is connected.

3.1 Methodology

The proposed model uses a sophisticated design that includes classification and regression methods, with hyperparameter selection used to improve their performance. This makes it possible to accurately predict the price of diamonds. The design draws on the best parts of both approaches to give a complete answer that can be tailored to the tricky problem of figuring out how much a diamond is worth. The model examines many aspects of diamonds with great care, which allows accurate price predictions and good use of computing power.

3.2 Overview of the model architecture

3.2.1 Integration of classification and regression

Classification methods are employed to categorize gems into different price ranges based on their features, while regression techniques, which are well suited to continuous data, estimate how much a diamond costs ahead of time. How well a model works with continuous data is one of the most important considerations when choosing an ML model (Sharifani & Amini 2023). Setting the hyperparameters: Randomized Search and other techniques are used in a planned way to improve model parameters (Wahyutama & Hwang 2022). Each classification and regression model is fine-tuned on its own to meet its own needs and reach its full potential (Hoffmann et al. 2019; Loh 2011). This approach draws on the strengths of more than one algorithm, so that mistakes are less likely than when relying on just one.
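A minimal, hedged sketch of this randomized-search step is shown below, assuming a RandomForestRegressor and an illustrative parameter space on synthetic data; the estimators and ranges actually tuned in the study may differ.

```python
from scipy.stats import randint
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Illustrative search space; the study's actual ranges are assumptions here.
param_distributions = {
    "n_estimators": randint(100, 500),      # number of trees
    "max_depth": randint(3, 20),            # maximum tree depth
    "min_samples_split": randint(2, 10),    # minimum samples to split a node
}

# Synthetic stand-in for the preprocessed diamond features and prices.
X_train, y_train = make_regression(n_samples=300, n_features=6, noise=0.1, random_state=42)

search = RandomizedSearchCV(
    estimator=RandomForestRegressor(random_state=42),
    param_distributions=param_distributions,
    n_iter=10,          # number of randomly sampled parameter settings
    cv=3,               # 3-fold cross-validation
    scoring="r2",
    random_state=42,
    n_jobs=-1,
)
search.fit(X_train, y_train)

print(search.best_params_)
print(round(search.best_score_, 3))
```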

For the diamond forecaster to make accurate statements about specific gemstones, it is trained on a number of traits and factors. To meet the required standard of comprehensibility, the diamond prediction model uses several processes that make the reasons behind its predictions clear. Visualizations and saliency maps are employed to show how different characteristics and external variables affect the projected price. This makes the model's reasoning easy for both experts and everyday users to understand. By making the AI's suggestions less opaque, these interpretation tools give stakeholders the information they need to make smart choices about diamond value and investment.
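The sketch below gives one hedged way to produce such a feature-level view, plotting a tree ensemble's feature importances with matplotlib; the feature names, synthetic data, and fitted model are placeholders, not the exact visualizations used in the study.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Placeholder attribute names standing in for the diamond features.
feature_names = ["carat", "cut", "color", "clarity", "depth", "table"]

# Synthetic data as a stand-in for the encoded diamond dataset.
X, y = make_regression(n_samples=500, n_features=len(feature_names), random_state=0)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Bar chart of relative importances, one bar per input feature.
plt.bar(feature_names, model.feature_importances_)
plt.ylabel("Relative importance")
plt.title("Illustrative feature-importance view")
plt.tight_layout()
plt.show()
```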

To train the diamond prediction model, a huge and varied dataset containing gemstone traits and prices was utilized. The framework is run through extensive tests on this large dataset to see how well it works with different types of diamonds and in different markets. Because the algorithm is continually improved based on user feedback and new data, the model stays ahead of the curve in accurately forecasting a diamond marketplace that is constantly evolving. When it comes to predicting diamond prices, this forecasting approach is a game-changer because of the way it uses ML techniques in new ways and makes them easy to understand (Mishra et al. 2019; Shaukat et al. 2020). This method increases trust in AI-driven gem valuation by making predictions more accurate and giving users useful, actionable information.

3.3 Dataset

The data gathering step is a crucial part of this study, which is mainly about predicting gem prices using classification and regression methods. With a carefully thought-out study strategy, the researcher navigates the tricky process of choosing data, recognizing that secondary data is very important in this case (Shivam, 2017).

The dataset we used for this study is from Kaggle, which has also been widely adopted by other studies in the literature. This dataset is comprehensive, containing key attributes for diamond price prediction, including carat, color, clarity, cut, and various dimensions such as length, width, and depth. This richness of features makes it suitable for both regression and classification models, which aligns well with our research objectives. Using a dataset that has been widely adopted by previous studies(Alsuraihi et al. 2020; Fitriani et al. 2022; Mamonov & Triantoro 2018; Jose M Pena Marmolejos, 2018; Mihir et al. 2021c; G. Sharma et al. 2021) adds a valuable layer of comparability to our research. Our findings can thus be contrasted directly with those of other studies, enhancing the validity and context of our work.

In sum, the dataset provides a robust foundation for the current study, contributing to the broader discourse on diamond price prediction and offering a consistent benchmark against previous research.
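As a hedged illustration, the snippet below loads a publicly available copy of the diamonds data; the study itself uses the Kaggle dataset, and the seaborn copy is assumed here only for convenience.

```python
import seaborn as sns

# seaborn ships a copy of the classic diamonds dataset; it is used here as a
# stand-in for the Kaggle file referenced in the study (an assumption).
diamonds = sns.load_dataset("diamonds")

print(diamonds.shape)                      # rows x columns
print(diamonds.columns.tolist())           # carat, cut, color, clarity, depth, table, price, x, y, z
print(diamonds[["carat", "cut", "color", "clarity", "price"]].head())
```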

  • The dataset typically includes the following essential features:

  • Carat: The weight of the diamond, measured in carats. One carat is equivalent to 200 mg. The carat size is a significant determinant of the diamond’s price.

  • Cut: The quality of the diamond’s cut, which affects its brilliance and overall appearance. Common categories include Fair, Good, Very Good, Premium, and Ideal.

  • Color: The color of the diamond, usually graded from D (colorless and rare) to Z (light yellow or brown). Less color generally results in a higher price.

  • Clarity: A measure of the diamond’s purity, with categories such as IF (Internally Flawless), VVS1 and VVS2 (Very Very Slightly Included), VS1 and VS2 (Very Slightly Included), SI1 and SI2 (Slightly Included), and I1, I2, I3 (Included).

  • Depth: The height of a diamond, measured from the culet to the table, divided by its average girdle diameter. The depth percentage can affect the diamond’s brilliance.

  • Table: The width of the diamond’s table (the top flat facet) expressed as a percentage of its average diameter.

  • Price: The price of the diamond in US dollars. This is usually the target variable in predictive modeling.

  • X, Y, Z: The dimensions of the diamond in millimeters. These represent the length, width, and depth, respectively.

3.4 Data preprocessing

The preprocessing step involves cleaning the data to handle any missing values or inconsistencies, ensuring that the data fed into the model is of high quality and reliability. Unlike EDA, which is an extensive and often graphical examination of data relationships (Shabbir et al. 2023), initial data analysis (IDA) in this context refers to the initial screening and understanding of the dataset. It includes summary statistics (basic measures such as mean, median, mode, standard deviation, and range) to understand the data distribution, and the handling of missing values by identifying and imputing or removing them (Ikram et al. 2023; Werner de Vargas et al., 2023).

  • Summary Statistics: Our preprocessing begins with summary statistics such as the mean, median, mode, standard deviation, and range of different variables to understand the data distribution. This aligns with standard practice in data science and ML, as outlined by Chou & Lin (2023), and helps to detect skewness, outliers, and overall data spread.

  • Handling Missing Values: We identify and address any missing values, either by imputing or removing them, to ensure the integrity of the dataset. This step is crucial for accurate predictive modeling, ensuring that the data is suitable for analysis (Psychogyios et al. 2023).

  • Exploratory Data Analysis (EDA): We conduct an extensive examination of data relationships, including graphical analysis, to uncover patterns, correlations, and insights to inform model development. This process is detailed by Shabbir et al. (2023), emphasizing its importance for understanding data trends and relationships(Verbeeck et al. 2020).

Together, these preprocessing steps, grounded in the literature cited above, enhance the robustness and reliability of our analysis; a minimal preprocessing sketch is given below.
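The sketch assumes a small stand-in DataFrame; the imputation and encoding choices are illustrative rather than the exact pipeline used in the study.

```python
import pandas as pd

# Small stand-in for the diamonds DataFrame (values are illustrative).
df = pd.DataFrame({
    "carat": [0.23, 0.31, None, 0.90],
    "cut": ["Ideal", "Premium", "Good", "Ideal"],
    "price": [326, 335, 404, 2757],
})

# Summary statistics: mean, std, quartiles, and counts for each column.
print(df.describe(include="all"))

# Handling missing values: impute numeric gaps with the column median.
df["carat"] = df["carat"].fillna(df["carat"].median())

# Encode the ordinal `cut` attribute as integers for modelling.
cut_order = {"Fair": 0, "Good": 1, "Very Good": 2, "Premium": 3, "Ideal": 4}
df["cut_encoded"] = df["cut"].map(cut_order)

print(df)
```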

3.5 Model development

This phase involves the construction of both classification and regression models to predict diamond prices; an illustrative training sketch follows the list below.

  • Classification Models: These models categorize diamonds into different price ranges. Algorithms such as LR, DTC, RFC, CBC, SVC, KNN, GBC, ABC, NBC, LGBC, XGBC could be employed. The categories can be defined based on the price distribution of the diamonds in the dataset.

  • Regression Models: These models predict the continuous price of a diamond. Algorithms including LR, Lasso, RFR, DT, CBR, XGBR, KNN, SVR, PR, ENR, RR, GBR models are suitable candidates.
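As an illustration of the two model families just listed, the hedged sketch below trains one regressor on the continuous price and one classifier on binned price categories; the synthetic data, chosen estimators, and bin count are assumptions for demonstration only.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in features and prices (illustrative only).
rng = np.random.default_rng(0)
X = pd.DataFrame({"carat": rng.uniform(0.2, 2.5, 500), "depth": rng.uniform(55, 70, 500)})
price = 4000 * X["carat"] + rng.normal(0, 300, 500)

# Regression target: the continuous price itself.
X_tr, X_te, y_tr, y_te = train_test_split(X, price, test_size=0.2, random_state=42)
reg = RandomForestRegressor(random_state=42).fit(X_tr, y_tr)
print("Regressor R2:", round(reg.score(X_te, y_te), 3))

# Classification target: equal-width price bins (3 bins assumed here).
price_class = pd.cut(price, bins=3, labels=False)
Xc_tr, Xc_te, c_tr, c_te = train_test_split(X, price_class, test_size=0.2, random_state=42)
clf = RandomForestClassifier(random_state=42).fit(Xc_tr, c_tr)
print("Classifier accuracy:", round(clf.score(Xc_te, c_te), 3))
```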

3.6 Description of models

There are two main kinds of ML approaches: regression and classification. The models utilized in this study are detailed below.

3.6.1 Regression models

Regression is a supervised ML algorithm that takes labelled data as its input. It helps in understanding the connection between the dependent and independent variables.

  1.

    Linear Regression (LR): In linear regression (LR), a linear equation is fitted to the observed data in order to model the relationship between Y, the dependent variable, and X, the independent variables. The intercept and slope, denoted by the coefficients, are the line parameters in question(James et al. 2023). In linear regression, the sum of squared residuals is minimized in order to find the best-fit line through the data points.

  2.

    Lasso Regression: A regularization term is also used in another kind of linear regression called Lasso regression (Lasso). Because it punishes the absolute value of the coefficients, it causes some of them to be zero, which enables feature selection. Using the L1 norm for regularization(Andriopoulos & Kornaros 2023), Lasso is comparable to Ridge Regression; however, it removes features that have coefficients of zero from the model.

  3.

    Decision Tree Regressor (DTR): Decision trees use the feature values as inputs to partition the data into subsets. A decision tree is produced by iteratively repeating this process. Essentially, the model is a hierarchical framework. Ability to model non-linear relationships; does not necessitate feature scaling(Garcia & Koo 2023).

  4.

    Random Forest Regressor (RFR): The Random Forest (RFR) algorithm constructs a number of decision trees and then integrates them into a single, more reliable estimate. RF takes a number of data segments and uses them to train a number of trees, which it then aggregates(Rajković et al. 2023). It improves forecasting accuracy by decreasing over fitting and variability. Resolves high-dimensional, massive data sets with ease.

  5.

    Gradient Boosting Regressor (GBR): The trees are added in a sequential fashion until there is no room for improvement. In GBR, trees are constructed sequentially, with the goal that each tree will rectify the mistakes committed by its predecessors. Although it is a powerful model, it needs to be tuned carefully to avoid overfitting. Offers consistently high accuracy and does a good job with different kinds of data(Rajković et al. 2023).

  6.

    XGBoost Regressor (XGBR): The acronym for eXtreme Gradient Boosting is XGBoost. It is a versatile reliable way to apply gradient boosting. It is a variation on gradient boosting that incorporates regularization and tree pruning, two features that enhance its power and efficiency. Fast, works with incomplete data, and gives feature importance ratings (Abdelhedi et al. 2023).

  7.

    Support Vector Regression (SVR): SVR finds a good line (or hyperplane in higher dimensions) to fit the data, and it gives us a versatile way of determining the allowable level of modelling deviation. Regression problems are the domain of SVR, which employs the same concepts that SVM uses for classification. Very useful in high-dimensional spaces (Bi et al. 2023).

  8.

    Cat Boost Regressor (CBR): Yandex created Cat Boost, a technique for decision trees that uses gradient boosting. For datasets that contain qualitative characteristics, it works wonders. Without pre-encoding, CBR reliably deals with categories of characteristics(Sobolewski et al. 2023).

  9.

    K-Nearest Neighbors (KNN) Regressor: The KNN Regressor predicts the target by averaging the k instances in the training data that are most similar to the query instance. It requires feature scaling and becomes computationally costly for big datasets (Sumayli 2023).

  10.

    Polynomial Regression (PR): In polynomial regression, an expansion of linear regression, the nth degree polynomial model represents the relationship between the independent and dependent variables. This method offers a means of fitting a non-linear connection between x and y’s expected values(Shi et al. 2023).

  11.

    Elastic Net regression: One regularized regression method is Elastic Net regression, which combines the penalties of the Lasso and Ridge regularization methods linearly. This setup combines the regularization capabilities of Ridge with those of Lasso, allowing for the learning of a sparse model with few non-zero weights. When dealing with a large number of correlated features, Elastic Net shines(Srinivasan & Deepalakshmi 2023).

  12.

    Ridge Regression: When it comes to ML, ridge regression is a tool for keeping models from getting too comfortable with their training data. Overfitting occurs when a model becomes so used to the training data’s noise that it fails miserably when presented with novel, unseen data(Srinivasan & Deepalakshmi 2023).

3.6.2 Classification models

Classification is a supervised learning approach that partitions the dataset into classes using a set of model settings. After being trained on a training set, the model uses this knowledge to categorize new data into different groups.

  1.

    Decision Tree Classifier: In decision trees, occurrences are sorted from root to a leaf node, which in turn offers an instance’s categorization. The dataset’s features are represented by the tree’s nodes, while class labels are represented by the leaf nodes. The tree is built using a recursive top-down approach(Tariq et al. 2023).

  2.

    Random Forest Classifier: The "bagging" technique is commonly used to train an ensemble of Decision Trees. Forecasting precision and over-fitting control are both enhanced by this. The RF algorithm constructs a number of decisions trees and then merges them into a single, more reliable estimate(Tariq et al. 2023).

  3.

    SVM Classifiers: In order to sort data into categories, SVM search for the hyperplane that does the job best. SVMs simplify the process of finding a distinguishing hyperplane by transforming the input space using kernel operations(Keerthana et al. 2023).

  4.

    Logistic Regression: Logistic Regression is mostly employed for issues involving binary categorization, despite its name. The likelihood that an occurrence falls into a specific categorization is forecasted. Squeezes a linear equation’s output into the interval between 0 and 1 using the logistic function(Y. Zhou et al. 2023a, b).

  5.

    K-Nearest Neighbors (KNN): To generate predictions, the KNN classifier finds the k instances in the training data that are most similar to the new instance and returns the most prevalent class among them. A distance measure is employed to determine the separation between cases, and the estimate is derived from the k nearest occurrences (Vommi & Battula 2023).

  6.

    Gradient Boosting Classifier: Slowly constructs a network of trees, with each new tree serving to fix the mistakes of its predecessors. Iteratively adding trees continues until no more improvements are possible. As it grows, each tree fixes the mistakes made by its predecessor. It can model complicated relationships in the data and often gives extremely accurate results(Louk & Tama 2023).

  7.

    AdaBoost Classifier: Assists decision trees in solving binary classification problems more effectively. Functions by assigning a weight to each instance in the dataset based on its classification difficulty, so that subsequent models concentrate on the instances that are harder to classify (L. Zhang et al. 2023a, b).

  8.

    XGBoost Classifier: Gradient boosting made efficient and scalable. It manages missing values, prunes trees, and incorporates standardized boosting. Delivers cutting-edge outcomes while outperforming competing gradient-boosting algorithms in terms of speed(Raihan et al. 2023).

  9.

    Cat Boost Classifier: Effective encoding of categorical features through gradient boosting on decision trees. Developed to deal with categorical features directly, without requiring any prior encoding. Effective when working with features that fall into specific categories. Highly effective encoding of categorical features, resistant to overfitting(Z. Liu et al. 2023).

  10.

    Naive Bayes Classifier: It is based on Bayes' theorem and the assumption that features are independent of one another. It combines each class's prior probability with the conditional probabilities of each input value to make a prediction. Good for separating gems into various price groups, especially when working with large datasets (Romano et al. 2023).

  11.

    LightGBM Classifier: LightGBM stands for "Light Gradient Boosting Machine." It is an expandable gradient boosting system designed to work quickly and effectively. It is most commonly used in real-life situations and ML competitions that use large datasets with lots of different traits. To make predictive models that get better over time, LightGBM uses a process called "boosting," in which the models learn from their mistakes(H. Yang et al. 2023).

3.6.3 Model evaluation and performance metrics

3.6.3.1 The performance of the models is critically evaluated using suitable metrics.

  • Classification Models: Metrics such as accuracy, precision, recall, and F1-score are employed.

  • Regression Models: Metrics like MAE, MSE, RMSE, and R2 are employed.

3.6.3.2 Regression metrics

Regression analysis is a powerful statistical tool used to investigate the relationship between multiple important factors. It can be used to predict the price of a diamond by analyzing how its 4Cs (carat, cut, color, and clarity) impact its price. To evaluate the performance of regression models, several measures are commonly used; the four most common are R2, RMSE, MSE, and MAE. These measures help determine how well the regression model is able to predict the diamond's price, and a brief computation sketch follows their definitions below.

  • Root Mean Square Error (RMSE): RMSE is a measure used to compare predicted values with actual measured values. It quantifies the difference between what was predicted and what was actually observed. A small RMSE indicates that the regression test is good at predicting diamond prices, while a large RMSE suggests that the predictions are not as accurate(F. Ali et al. 2023a, b).

\(RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^{2}}\)

  • Mean Squared Error (MSE): MSE is a measure that helps us understand how different the real and predicted numbers are from each other. It is similar to RMSE, or Root Mean Squared Error, as both give us a numerical value to represent the average size of mistakes. In the context of predicting diamond prices, MSE can be used to determine if the model is significantly inaccurate, which could have significant financial consequences. In order to make more accurate predictions of diamond prices, it is desirable to have a low MSE. This indicates that the model is fitting the data well (Wang & Bovik 2009).

\(MSE = \frac{1}{n}\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^{2}\)

  • Mean Absolute Error (MAE): To find MAE, take the average of the absolute differences between the observed and predicted values. It makes clear how far off the predictions are on average. If the dataset contains gemstone values that are very high or very low, MAE may be better than RMSE or MSE because it is less affected by these "outliers". A smaller MAE value means the model is better at predicting gem prices (Robeson & Willmott 2023).

\(MAE = \frac{1}{n}\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|\)

  • R-squared (R2): R2 is a statistical value that tells us how much of the variation in the dependent variable can be explained by the changes in the independent variables. It shows how well the model matches the data. A higher R2 value means that the model and the data are more closely related, because it explains more of the variation in diamond prices. It is helpful for judging how well the model can handle changes in the price of diamonds (Gao 2023).

\(R^{2} = 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^{2}}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^{2}}\)
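The sketch below computes these four regression metrics with scikit-learn and NumPy for a small set of assumed predictions; the price values are illustrative only.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Illustrative actual and predicted diamond prices (assumed values).
y_true = np.array([500.0, 1500.0, 3200.0, 7800.0])
y_pred = np.array([520.0, 1450.0, 3350.0, 7600.0])

mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)                       # RMSE is the square root of MSE
mae = mean_absolute_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)

print(f"MSE={mse:.2f}, RMSE={rmse:.2f}, MAE={mae:.2f}, R2={r2:.4f}")
```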

3.6.3.3 Classification metrics

We categorize diamonds into different price groups or classes using a classification method to estimate cost, and we evaluate the effectiveness of this categorization in several ways. Precision, Recall, F1-Score, and Support are important measures of how well an algorithm performs in classification tasks. These metrics provide more information about the algorithm's performance, particularly in terms of the trade-offs between different types of errors.

  • Precision: Precision measures how many of the predicted positive cases actually turn out to be positive. A high precision means that the model has a low false positive rate, which is important in situations where the cost of incorrect positive predictions is high (Morstatter et al. 2016).

    • \(\text{Precision} = \text{True Positives} / (\text{True Positives} + \text{False Positives})\)

  • Recall (Sensitivity): Recall measures how many of the actual positive cases were correctly predicted. A model with high recall identifies a large share of positive cases accurately. Recall is particularly important when false negatives are costly, that is, when cases that were actually positive are mistakenly classified as negative (Morstatter et al. 2016).

    • \(\text{Recall} = \text{True Positives} / (\text{True Positives} + \text{False Negatives})\)

  • F1-Score: The F1-score combines precision and recall into a single value through their harmonic mean. Because it accounts for both false positives and false negatives, it balances precision and recall. A high F1-score means that recall and precision are well balanced, and the metric is especially useful when the class distribution is imbalanced (Chicco & Jurman 2020).

    • \(\text{F1-Score} = 2 \times (\text{Precision} \times \text{Recall}) / (\text{Precision} + \text{Recall})\)
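As a hedged illustration, these classification metrics can be computed with scikit-learn; the price-band labels below are hypothetical, and macro averaging is one reasonable choice for multi-class price groups.

    from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

    # Hypothetical price-band labels (0 = low, 1 = mid, 2 = high), for illustration only
    y_true = [0, 2, 1, 1, 0, 2, 1, 0]
    y_pred = [0, 2, 1, 0, 0, 2, 2, 0]

    acc = accuracy_score(y_true, y_pred)
    # Macro averaging weights every price band equally, useful when classes are imbalanced
    prec = precision_score(y_true, y_pred, average="macro")
    rec = recall_score(y_true, y_pred, average="macro")
    f1 = f1_score(y_true, y_pred, average="macro")

    print(f"Accuracy={acc:.3f}, Precision={prec:.3f}, Recall={rec:.3f}, F1={f1:.3f}")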

3.6.4 Programming language

  • Python: The entirety of our ML workflow—including data preprocessing, model development, tuning, and evaluation—is implemented in Python. Python is a leading programming language in data science due to its readability, vast community support, and a rich ecosystem of libraries that facilitate various data analysis tasks.

3.6.5 Key libraries and usage

  • Pandas: This library is instrumental in data manipulation and cleaning. We use it for reading the dataset, handling missing values, and encoding categorical variables, which is crucial for preparing the data for modeling.

  • NumPy: Leveraging this library allows us to perform numerical computations efficiently. It plays a pivotal role in manipulating arrays and implementing various mathematical operations needed during the modeling process.

  • Scikit-learn: A foundational tool for ML, scikit-learn is utilized for splitting the data, training models, hyperparameter tuning via RandomizedSearchCV, and computing performance metrics such as MSE, MAE, RMSE, and R2 scores.

  • XGBoost and CatBoost: These libraries provide optimized and scalable ML algorithms under the Gradient Boosting framework. They are used for training more sophisticated models that potentially yield better predictive performance.

  • Matplotlib and Seaborn: Both libraries are utilized for creating informative visualizations, such as performance metric comparisons and model accuracy graphs, which aid in the interpretability of our results.

  • scipy.stats: This module from the SciPy library is used to define distributions for hyperparameters, which are essential for the randomized search during hyperparameter tuning.

3.6.6 Data preprocessing and model evaluation

  • Preprocessing: The train_test_split function is applied to divide the data into training and testing sets, ensuring that the models are evaluated on unseen data. We also use pipeline mechanisms from scikit-learn to streamline the process of applying polynomial feature transformations in regression tasks(Fan et al. 2021).

  • Evaluation: We implement custom functions to categorize predictions based on a defined tolerance level, further utilizing scikit-learn’s metrics to calculate and report model performance comprehensively.
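A minimal sketch of this preprocessing and tolerance-based evaluation step is shown below; the file path, feature columns, and the 10% tolerance are illustrative assumptions rather than the exact choices used in the study.

    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import PolynomialFeatures

    df = pd.read_csv("diamonds.csv")                      # illustrative path
    X = df[["carat", "depth", "table", "x", "y", "z"]]    # assumed numeric predictors
    y = df["price"]

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Pipeline applying a polynomial feature transformation before a linear model
    poly_reg = Pipeline([
        ("polynomialfeatures", PolynomialFeatures(degree=2)),
        ("regressor", LinearRegression()),
    ])
    poly_reg.fit(X_train, y_train)
    y_pred = poly_reg.predict(X_test)

    def categorize_predictions(y_true, y_pred, tolerance=0.10):
        """Label each prediction as under, accurate, or over, within a relative tolerance."""
        ratio = (y_pred - y_true) / y_true
        return np.where(ratio < -tolerance, "under",
                        np.where(ratio > tolerance, "over", "accurate"))

    print(pd.Series(categorize_predictions(y_test.to_numpy(), y_pred)).value_counts())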

3.6.7 Hyperparameter tuning

  • RandomizedSearchCV: We apply this technique to efficiently search through a specified hyperparameter space for each model. This method is a crucial part of model optimization as it can significantly enhance model performance by finding the most effective hyperparameters (Bergstra et al. 2011).
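A hedged sketch of this step is given below for a single regressor; the distributions and the n_iter value are illustrative, while the fivefold cross-validation mirrors the setup described later in Sect. 3.7.

    from scipy.stats import randint
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import RandomizedSearchCV

    # Illustrative hyperparameter distributions; the study's exact ranges are listed in Sect. 3.7
    param_distributions = {
        "n_estimators": randint(50, 200),
        "max_depth": randint(1, 20),
        "min_samples_split": randint(2, 20),
        "min_samples_leaf": randint(1, 20),
    }

    search = RandomizedSearchCV(
        RandomForestRegressor(random_state=42),
        param_distributions=param_distributions,
        n_iter=20,          # number of sampled configurations (illustrative)
        cv=5,               # fivefold cross-validation
        scoring="r2",
        random_state=42,
        n_jobs=-1,
    )
    # search.fit(X_train, y_train)   # X_train/y_train come from the split shown earlier
    # print(search.best_params_, search.best_score_)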

3.6.8 Execution time measurement

  • Time Library: The time module is used to measure the duration of model training and prediction, allowing us to compare the computational efficiency of different algorithms.
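A minimal sketch of the timing measurement follows; the fit and predict calls are left as commented placeholders for whichever model is being benchmarked.

    import time

    start = time.time()
    # model.fit(X_train, y_train)        # training step included in the measurement
    # y_pred = model.predict(X_test)     # prediction step included in the measurement
    elapsed = time.time() - start
    print(f"Execution time: {elapsed:.2f} s")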

3.6.9 Version control and reproducibility

  • Python Version: Our experiments were conducted using Python 3.8. This information is crucial for replicating the study, as the behavior of some functions can vary between versions.

  • Library Versions: All libraries used are kept up-to-date with the latest stable releases as of the time of the experiment to leverage recent improvements and bug fixes.

3.7 Hyperparameter tuning (tuned models) and model optimization

This step involves fine-tuning the models’ hyperparameters to obtain the best results; methods such as random search can be used.

Comparison Analysis (Base vs. Tuned) and Model Choice: All trained models are carefully compared in terms of predictive performance, computational cost, and complexity. The model, or group of models, that performs best is chosen for deployment.

3.7.1 Tuning methodology for regression and classification models

3.7.1.1 Tuning methodology

The tuning methodology applied in the code is systematic and thorough, ensuring that the chosen models are optimized effectively:

3.7.1.2 Randomized search
  • RandomizedSearchCV: This function from the scikit-learn library is used for both regression and classification models. It performs a random search over a defined set of hyperparameters, iterating through various combinations to find the best configuration (Takkala et al. 2022).

  • Cross-Validation: The random search utilizes a fivefold cross-validation, dividing the data into five subsets. For each iteration, one subset is used for testing while the remaining four are used for training. This process ensures each combination of hyperparameters is thoroughly tested on different portions of the dataset(Bergstra et al. 2011).

  • Parameter Grids: For each model, a specific set of hyperparameters is defined for tuning (an illustrative sketch of such grids follows the listings below):

3.7.1.3 Tree-based models
  • Decision Tree Classifier/Regressor: max_depth (1–10 for classifiers, 1–20 for regressors) controls the depth of the tree, while min_samples_split (2–20) and min_samples_leaf (1–20) manage the number of samples required for splitting and for leaf nodes, respectively (Gomes Mantovani et al. 2024).

  • Random Forest Classifier/Regressor: n_estimators (10–100 for classifiers, 50–200 for regressors) controls the number of trees in the ensemble, while the depth, splitting, and leaf parameters manage model complexity (Probst et al. 2019).

3.7.1.4 Ensemble models
  • Gradient Boosting Classifier/Regressor: n_estimators (10–100/50–200) controls the number of boosting iterations, while learning_rate (0.01–0.1/0.01–0.2) and max_depth (3–10) manage the learning progression and model depth (Bentéjac et al. 2021).

  • AdaBoost Classifier: Similar parameters to Gradient Boosting, but with a simpler boosting mechanism (X. Huang et al. 2022). XGBoost Classifier/Regressor: similar parameters to Gradient Boosting, with additional mechanisms for handling missing values and managing features (Putatunda & Rama 2018).

3.7.1.5 Linear models
  • Ridge and Lasso Regression: alpha (0–10) controls regularization strength, balancing overfitting risk. Elastic Net Regression: alpha (0–10) and l1_ratio (0–1) balance L1 and L2 regularization (Hui et al. 2015; J. Liu et al. 2018; Roozbeh et al. 2020).

3.7.1.6 SVM models
  • SVM Classifier/Regressor: C (0.1–10) controls regularization strength, gamma (0.01–0.1) adjusts influence in kernel functions, and kernel (linear, rbf, poly) specifies the type of decision boundary used (Yao et al. 2015).

3.7.1.7 Other models
  • KNN Classifier/Regressor: n_neighbors (1–20) determines the number of neighbors to consider. CatBoost Classifier/Regressor: iterations (50–200), learning_rate (0.01–0.2), and depth (2–10) manage training iterations and model depth (Prokhorenkova et al. 2018; S. Zhang et al. 2018).
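As referenced above, the per-model grids can be organized as plain dictionaries of scipy.stats distributions and candidate lists; the sketch below mirrors the classifier ranges listed in this subsection, with the caveat that scipy.stats.uniform(loc, scale) samples from the interval [loc, loc + scale].

    from scipy.stats import randint, uniform

    # Illustrative per-model grids mirroring the classifier ranges above;
    # uniform(loc, scale) draws values from [loc, loc + scale]
    classifier_param_grids = {
        "DecisionTree": {"max_depth": randint(1, 10)},
        "RandomForest": {"n_estimators": randint(10, 100), "max_depth": randint(1, 10)},
        "GradientBoosting": {"n_estimators": randint(10, 100), "learning_rate": uniform(0.01, 0.09)},
        "SVM": {"C": uniform(0.1, 9.9), "gamma": uniform(0.01, 0.09), "kernel": ["linear", "rbf", "poly"]},
        "KNN": {"n_neighbors": randint(1, 20)},
        "CatBoost": {"iterations": randint(50, 200), "learning_rate": uniform(0.01, 0.19), "depth": randint(2, 10)},
    }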

3.7.2 Training and testing

Once tuned, the best configurations are selected, and models are trained on the training dataset. Predictions are made on the test set, allowing for comprehensive evaluation and comparison against base performances.

3.7.3 Evaluation

3.7.3.1 Regression models
  • Mean Absolute Error (MAE): Measures the average absolute difference between predicted and actual values, reflecting overall prediction accuracy.

  • Mean Squared Error (MSE): Squares the errors before averaging, amplifying the impact of larger errors.

  • Root Mean Squared Error (RMSE): Takes the square root of MSE, bringing errors into the same unit as the target variable.

  • R-squared: Measures the proportion of variance in the target variable explained by the model, indicating its explanatory power.

3.7.3.2 Classification models
  • Accuracy: Measures the proportion of correctly classified instances.

  • Confusion Matrix: Provides a detailed breakdown of correct and incorrect classifications.

  • Precision: Proportion of positive predictions that are correct.

  • Recall: Proportion of actual positives that are correctly classified.

  • F1-Score: Harmonic mean of precision and recall, balancing both metrics.

3.7.3.3 Execution time

The time taken to train and test models is recorded, reflecting the efficiency.

3.7.3.4 Impact

Improved Performance: Tuning improves key metrics such as accuracy, precision, recall, F1-Score, MAE, MSE, RMSE, and R-squared, making models more reliable and accurate.

Efficiency Gains: Execution times are reduced, enhancing efficiency and making models more suitable for real-world applications.

Better Generalization: Tuning helps balance model complexity, mitigating overfitting and underfitting risks, improving generalization to new datasets.

3.7.3.5 Comprehensive comparisons
  • Comparative analyses between base and tuned models provide insights into strengths and weaknesses, guiding future research and practical applications.

  • Literature Contribution: Our research highlights the impact of hyperparameter tuning on predictive modeling, providing insights into its ability to refine models and improve outcomes.

Hyperparameter tuning is a crucial process in ML research, enabling models to reach their full potential. By optimizing relevant parameters (Bergstra et al. 2011), the study presents comprehensive performance evaluations, contributing to the field’s understanding of how to refine and balance model outcomes effectively. The following subsections detail the tuning methodologies for both regression and classification models, along with their evaluations and impacts.

3.7.3.6 Regressor hyperparameter selection

Hyperparameter tuning is an integral part of optimizing ML models. It involves selecting parameters that control the learning process and model architecture and that are not learned directly from the data (N. Sharma et al. 2023). This study employs RandomizedSearchCV to tune the hyperparameters of various regression and classification models. A detailed breakdown of the key hyperparameters and their value ranges follows:

3.7.3.7 Linear models
  • Ridge Regression: This model tunes the alpha parameter, which controls the regularization strength, with values sampled from a uniform distribution between 0 and 10. Higher alpha values increase regularization, reducing overfitting but potentially increasing bias(Roozbeh et al. 2020).

  • Lasso Regression: Similar to Ridge, Lasso also tunes its alpha parameter, with values ranging from 0 to 10. This controls the strength of Lasso regularization, encouraging sparsity by setting some coefficients to zero(Hui et al. 2015).

  • Elastic Net: This model tunes both alpha (0–10) and l1_ratio (0–1). l1_ratio determines the balance between Lasso (L1) and Ridge (L2) regularization, allowing for a flexible combination of both regularization techniques(J. Liu et al. 2018).

  • Polynomial Regression: This model tunes polynomialfeatures__degree (2–5), which controls the degree of polynomial features added to the model. Higher degrees allow for capturing more complex relationships at the cost of increased model complexity and potential overfitting(Blank & Deb 2022).
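The polynomialfeatures__degree name reflects scikit-learn’s step__parameter convention; a hedged sketch of how the degree can be tuned inside a pipeline is shown below, with the n_iter value being an illustrative choice.

    from scipy.stats import randint
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import RandomizedSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import PolynomialFeatures

    # The step name "polynomialfeatures" is what makes the key "polynomialfeatures__degree" valid
    poly_pipeline = Pipeline([
        ("polynomialfeatures", PolynomialFeatures()),
        ("regressor", LinearRegression()),
    ])
    degree_search = RandomizedSearchCV(
        poly_pipeline,
        param_distributions={"polynomialfeatures__degree": randint(2, 6)},  # degrees 2-5
        n_iter=4,
        cv=5,
        random_state=42,
    )
    # degree_search.fit(X_train, y_train)   # trained on the regression training split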

3.7.3.8 Tree-Based models
  • Decision Tree Regressor: The model tunes max_depth (1–20), min_samples_split (2–20), and min_samples_leaf (1–20). These parameters control tree growth, affecting model complexity, depth, and the number of samples required to split or form leaves(Gomes Mantovani et al. 2024).

  • Random Forest Regressor: This model tunes n_estimators (50–200), max_depth (1–20), min_samples_split (2–20), and min_samples_leaf (1–20). These control the number of trees in the ensemble, tree depth, and sample requirements, ensuring a balanced model that avoids overfitting or underfitting(Probst et al. 2019).

3.7.3.9 Boosting models
  • Gradient Boosting Regressor: This model tunes n_estimators (50–200), learning_rate (0.01–0.2), and max_depth (3–10). These parameters manage ensemble size, learning pace, and tree depth, ensuring steady growth and generalizability (Bentéjac et al. 2021).

  • XGBoost Regressor: Similar to Gradient Boosting, XGBoost also tunes n_estimators (50–200), learning_rate (0.01–0.2), and max_depth (3–10), offering additional fine-tuning capabilities for boosting(Putatunda & Rama 2018).

3.7.3.10 Support vector machines
  • SVR: This model tunes C (0.1–10), epsilon (0.01–0.1), and kernel (linear, rbf, poly). C controls the regularization strength, epsilon determines the tolerance for error margins, and kernel defines the model’s functional form (Q. Huang et al. 2012).

3.7.3.11 Ensemble models
  • CatBoost Regressor: This model tunes iterations (50–200), learning_rate (0.01–0.2), and depth (2–10), balancing model depth and learning pace to improve accuracy and efficiency(Prokhorenkova et al. 2018).

  • KNN Regressor: This model tunes n_neighbors (1–20), determining how many neighbors are considered in making predictions. Fewer neighbors lead to higher variance, while more neighbors can smooth out predictions(S. Zhang et al. 2018).

3.7.3.12 Evaluation and results

The tuned models are evaluated on several metrics, including:

  • Accuracy and Predictive Value: By tuning these hyperparameters, the models show improved accuracy and alignment with test set values, reducing errors and achieving balanced predictions.

  • Execution Time: The models also show varied execution times, highlighting the impact of hyperparameter tuning on efficiency.

  • Practical Impact: This study provides a comprehensive evaluation of hyperparameter tuning across multiple models, emphasizing its role in balancing performance metrics for practical applications.

Hyperparameter tuning is a vital aspect of ML model optimization, balancing complexity, accuracy, and efficiency. By exploring key hyperparameters across different models (L. Yang & Shami 2020), this study highlights their impact on predictive modeling, providing valuable insights for future research and practical applications.

3.7.3.13 Classifier hyperparameter selection

Hyperparameter tuning is essential for optimizing ML classifiers. By fine-tuning these parameters, we aim to enhance model performance, achieving better classification accuracy and reducing errors. This study applies RandomizedSearchCV to optimize key hyperparameters for various classifiers (Takkala et al. 2022). A detailed overview of the selected classifiers and their hyperparameters follows:

3.7.3.14 Tree-based classifiers
  • Decision Tree Classifier: This model tunes the max_depth parameter (1–10), which controls the maximum depth of the tree. A shallow tree may not capture complex relationships, while a deep tree can overfit. This balance ensures the model generalizes well(Gomes Mantovani et al. 2024).

  • Random Forest Classifier: This model tunes n_estimators (10–100) and max_depth (1–10). n_estimators controls the number of trees in the ensemble, while max_depth limits individual tree depth. This combination helps create a robust ensemble that generalizes well across different datasets(Probst et al. 2019).

3.7.3.15 Support vector machine (SVM)
  • SVM Classifier: This model tunes C (0.1–10) and gamma (0.01–0.1). C controls the regularization strength, balancing model complexity and accuracy. gamma influences the decision boundary, determining how much influence each data point has on the model (Yao et al. 2015).

3.7.3.16 Ensemble classifiers
  • Gradient Boosting Classifier: This model tunes n_estimators (10–100) and learning_rate (0.01–0.1). n_estimators defines the number of trees in the ensemble, while learning_rate adjusts how much each tree contributes to the final model, balancing learning speed and accuracy (Bentéjac et al. 2021).

  • AdaBoost Classifier: This model tunes n_estimators (10–100) and learning_rate (0.01–0.1). Similar to Gradient Boosting, these parameters manage ensemble size and learning pace, ensuring the model’s robustness and efficiency(X. Huang et al. 2022).

  • XGBoost Classifier: This model tunes n_estimators (10–100) and learning_rate (0.01–0.1), providing additional fine-tuning capabilities for ensemble learning (Putatunda & Rama 2018).

  • LightGBM Classifier: This model also tunes n_estimators (10–100) and learning_rate (0.01–0.1), offering a balanced approach to ensemble learning with rapid execution(Panigrahi et al. 2022).

3.7.3.17 Logistic regression
  • Logistic Regression: This model tunes the C parameter (0.1–10), controlling the regularization strength. This balance between complexity and regularization ensures the model generalizes effectively, avoiding overfitting (Li & Lederer 2019).

3.7.3.18 K-Nearest neighbors (KNN)
  • KNN Classifier: This model tunes n_neighbors (1–20), determining how many neighbors are considered when making predictions. Fewer neighbors lead to higher variance, while more neighbors can smooth out predictions (S. Zhang et al. 2018).

3.7.3.19 Evaluation and results

The tuned classifiers are evaluated using various metrics:

  • Accuracy: This measures how often the classifier makes correct predictions, reflecting its overall performance.

  • Precision, Recall, and F1-Score: Precision measures the proportion of positive predictions that are correct. Recall assesses the proportion of actual positives correctly identified. The F1-Score provides a harmonic mean of precision and recall, balancing both metrics.

  • Execution Time: The tuned models also show varied execution times, highlighting the impact of hyperparameter tuning on efficiency.

Hyperparameter tuning significantly impacts classifier models, balancing complexity, accuracy, and efficiency. This study demonstrates its value across different models, providing insights for future research and practical applications.

4 Results

The proposed model for predicting diamond prices combines classification and regression methods, offering a complete solution. It is designed to provide precise and reliable projections by focusing on two important stages, IDA (Initial Data Analysis) and EDA (Exploratory Data Analysis), which help fine-tune the models for better accuracy. The model also prioritizes simplicity and readability, making it easy to understand and use, and the resulting framework is designed to work effectively in various real-life situations, making it a valuable resource for anyone involved in the diamond business. Accurate and thorough data loading is emphasized, as it ensures that the information is properly prepared and examined; the care taken in this phase lays a solid foundation for building the classification and regression models, ensuring accurate forecasts and robust evaluation.

This section presents the findings from our detailed analysis and assessment of the ML techniques used to predict gem values. We begin by thoroughly examining the data to understand the distribution of the different components of our dataset, such as the number and category of gems. This analysis helps identify patterns that may affect decision-making and the effectiveness of the models. In addition, correlation analysis examines the links between the different features of a diamond. We then present a close comparison of the regression and classification models, which were selected for their predictive strength. An analysis of the algorithm results is given, including the precision, estimation accuracy, and processing time of each algorithm. Finally, we bring together all of the results of the comparative research to give a full evaluation of how well the ML algorithms can forecast diamond prices.

4.1 Numerical features

  • Carat: The right-skewed distribution indicates that most diamonds in the dataset are of lower carat values, with fewer diamonds having higher carat values. Skewness in the distribution might influence the performance of certain ML models, especially linear models, as they assume a normal distribution of the features.

  • Depth: The roughly normal distribution is a good sign for many statistical analyses and ML models. The small peak might indicate a specific range of depth values that are more common.

  • Table: The distribution is roughly normal but with some outliers, which might be extreme values that deviate significantly from the rest of the data.

  • Price: The highly right-skewed distribution indicates that most diamonds are priced lower, with a few exceptions of very high-priced diamonds. This skewness might require transformation (e.g., a log transform) to make the data more normally distributed, which could improve the performance of some models (see the sketch after this list).

  • x (length), y (width), and z (depth): The distributions are roughly normal, but the presence of diamonds with dimensions of 0 is unusual and could indicate incorrect or missing data.
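As noted for the price feature above, a log transformation is one way to address the right skew, and the zero dimensions can be flagged at the same time; the sketch below assumes the dataset is loaded from an illustrative diamonds.csv path.

    import numpy as np
    import pandas as pd

    df = pd.read_csv("diamonds.csv")          # illustrative path

    # log1p reduces the right skew of price; zero dimensions are physically impossible and flagged
    df["log_price"] = np.log1p(df["price"])
    zero_dims = df[(df[["x", "y", "z"]] == 0).any(axis=1)]
    print(f"Rows with a zero dimension: {len(zero_dims)}")
    print(df[["price", "log_price"]].describe())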

Fig. 2 The histograms below show the distributions of the numeric variables in the dataset. Most diamonds are less than 2 carats, with a peak around 0.3 carats. The depth percentage and table size have approximately normal and slightly right-skewed distributions, respectively. The price distribution is right-skewed, indicating a few diamonds with very high prices. The dimensions of the diamonds (x, y, z) show distributions similar to that of carat, and the unusually high values in "y" and "z" indicate outliers. Addressing these issues is important, as they could impact the accuracy of predictive models

Fig. 2 Distribution of numerical features

4.2 Categorical features

  • Cut: "Ideal" is the most common, suggesting that the dataset has a high proportion of diamonds with the best quality cuts.

  • Color: "G" is the most common, indicating a high proportion of high-quality color diamonds in the dataset.

  • Clarity: "SI1" is the most common, meaning many diamonds have slight inclusions that are noticeable under 10× magnification.

The bar plots shown in Fig. 3 depict the distribution of the categorical variables ‘cut’, ‘color’, and ‘clarity’. The ‘Ideal’ cut is the most common, followed by ‘Premium’ and ‘Very Good’. The most common colors are G, E, and F, while J is the least common. SI1 and VS2 are the most common clarity grades, whereas I1 is the least common.

Fig. 3 Distribution of categorical features

The box plots, also shown in Fig. 3, depict the relationships between diamond price and the categorical variables. There is a wide range of prices across different cuts, colors, and clarity grades, indicating that these factors significantly influence price. Diamonds with a ‘Premium’ cut, D color, and ‘IF’ clarity tend to have higher median prices.

There are no missing values in the dataset, and all data types appear appropriate for their respective columns. The "Unnamed: 0" column appears to be an identifier and may not be needed for analysis.

4.3 Overall utility of summary statistics

  • Data Understanding: Summary statistics provide a quick and intuitive understanding of the data’s distribution, central tendency, and spread.

  • Modeling: In predictive modeling, understanding the distribution of the target variable (price) and predictor variables is crucial for selecting appropriate models and transformations.

  • Validation: When building forecasting models, comparing the summary statistics of the predicted values with those of the actual values can help confirm that the model behaves correctly.

To sum up, summary statistics are very useful for exploring, cleaning, and analyzing data, especially when the data are complex and involve many variables, as in diamond price prediction. They provide the basic information that guides further research and modeling, ultimately leading to more accurate and reliable forecasts. Table 2 reports summary statistics for the key features of the diamonds, including carat, depth, table, price, and dimensions (x, y, z). Each feature is described by its mean, median, standard deviation, minimum, and maximum values, providing a comprehensive overview of the dataset’s distribution characteristics and aiding understanding of the central tendency and variability of each diamond feature.

Table 2 Summary Statistics

Column Descriptions:

  • Feature: The specific attribute of the diamond being analyzed.

  • Mean: The average value of the feature across all samples.

  • Median: The middle value of the feature when all samples are arranged in order.

  • Standard Deviation: A measure of the amount of variation or dispersion of the feature values.

  • Minimum: The smallest value observed for the feature across all samples.

  • Maximum: The largest value observed for the feature across all samples.

Definitions and Explanations:

  • Carat: Weight of the diamond, typically impacting price and size.

  • Depth: The height of a diamond, measured from the culet to the table, divided by its average girdle diameter.

  • Table: The width of the diamond’s table expressed as a percentage of its average diameter.

  • Price: The retail price of the diamond in US dollars.

  • x, y, z: Measurements of the diamond in millimeters representing length, width, and depth respectively.

All measurements are provided in standard units with carat in weight, dimensions (x, y, z) in millimeters, and price in US dollars. The data presented in this table are crucial for preliminary analyses to understand the distribution and range of diamond characteristics before further detailed modeling.

4.4 Correlation analysis

Correlation analysis measures the strength and direction of the linear relationships between numerical variables. For the diamonds dataset, it helps reveal how the different features of the diamonds are related (Q. Zhou et al. 2023a, b). We compute correlation coefficients for all pairs of numerical features, typically using the Pearson correlation coefficient, which ranges from -1 to 1: a value of 1 indicates a perfect positive linear relationship, -1 a perfect negative linear relationship, and 0 no linear relationship.
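A brief sketch of this computation with pandas and Seaborn is shown below; the file path is an illustrative assumption, and the heatmap reproduces the kind of matrix shown in Fig. 4.

    import pandas as pd
    import seaborn as sns
    import matplotlib.pyplot as plt

    df = pd.read_csv("diamonds.csv")          # illustrative path

    # Pearson correlations between the numerical features (values range from -1 to 1)
    corr = df[["carat", "depth", "table", "price", "x", "y", "z"]].corr(method="pearson")
    print(corr["price"].sort_values(ascending=False))

    sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm")
    plt.title("Correlation Matrix")
    plt.show()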

  1. Carat (Weight of the Diamond): Strong positive correlation with price (0.92): as the weight of the diamond increases, its price tends to increase significantly. Strong positive correlation with dimensions (x, y, z): larger diamonds in terms of dimensions tend to have a higher carat weight.

  2. Depth (Total Depth Percentage): Slight negative correlation with table (− 0.30): as the depth percentage increases, the table size tends to decrease slightly, though the relationship is weak. Very weak correlation with other features: depth percentage does not show strong linear relationships with carat, price, or dimensions.

  3. Table (Width of Top of Diamond): Weak positive correlation with carat (0.18) and price (0.13): larger table sizes are somewhat associated with higher carat weights and prices. Negative correlation with depth (− 0.30): as mentioned earlier, a larger table size tends to be associated with a lower depth percentage.

  4. Price (Price in USD): Strong positive correlation with carat (0.92): heavier diamonds tend to be more expensive. Moderate positive correlation with dimensions (x, y, z): larger diamonds in terms of dimensions tend to be more expensive.

  5. Dimensions (x, y, z): These dimensions (length, width, depth in mm) are highly correlated with each other (> 0.95), which makes sense as they are all measures of size. They also show strong positive correlations with carat, reflecting that larger diamonds (in dimensions) tend to have higher carat weights.

  6. Unnamed: 0 (Index): This feature shows negative correlations with most other features, but since it is likely just an identifier, these correlations are probably not meaningful.

The correlation matrix in Fig. 4 provides further insight into the linear relationships between the numerical features of the diamonds dataset.

Fig. 4 Correlation matrix

4.5 Comparative results

The results present a comprehensive comparison of the various regression and classification models in the context of diamond price prediction. Each model has been evaluated on base accuracy versus tuned accuracy, estimated values in terms of underestimation, accurate estimation, and overestimation, and execution time before and after tuning. These metrics are crucial for understanding the efficacy of each model, as summarized in Tables 3 and 4.

Table 3 Comparative Results of Regression Models
Table 4 Comparative Results of Classification Models

Tables 3 and 4 present a detailed comparison of the performance of the regression and classification models, respectively, before and after parameter tuning. They include metrics such as base and tuned accuracy, estimated value distributions (underestimate, accurate, overestimate), execution times, and the improvement in execution times post-tuning. Each metric aims to provide a holistic view of model efficiency and effectiveness in predictive tasks.

Column descriptions

  • Model: The name of the regression model and classification model used.

  • Base Accuracy/Tuned Accuracy: Accuracy of the model before and after tuning.

  • Base Estimated Values (Under, Accurate, Over): Distribution of prediction results as underestimates, accurate estimates, and overestimates before tuning.

  • Tuned Estimated Values (Under, Accurate, Over): Distribution of prediction results as underestimates, accurate estimates, and overestimates after tuning.

  • Base Execution Time (seconds): Time taken by the model to execute before tuning.

  • Tuned Execution Time (seconds): Time taken by the model to execute after tuning.

  • Execution Time Improvement: The difference in execution times before and after tuning, indicating efficiency gains or losses.

Definitions and explanations

  • Underestimate: Predictions where the model’s output was lower than the actual value.

  • Accurate Estimate: Predictions where the model’s output matched the actual value closely.

  • Overestimate: Predictions where the model’s output was higher than the actual value.

  • Accuracy: Accuracy of the regression and classification models.

  • Execution Time: Measured in seconds, representing the computational efficiency of the model.

Table 3 focuses exclusively on the results for the regression models and Table 4 on the results for the classification models. This clear division helps readers understand which metrics apply to each type of model, avoiding confusion. Table 3 includes columns such as "Base Accuracy/Tuned Accuracy," "Base Estimated Values," "Tuned Estimated Values," and "Execution Time Improvement."

Table 4 includes similar columns tailored to the classification models. These labels provide precise descriptions, making it easier to interpret the data within each column. Each table presents data in a consistent format: the "Accuracy" columns show both base and tuned accuracies for direct comparison; the "Estimated Values" columns categorize predictions into "Under," "Accurate," and "Over," showing both base and tuned counts; and the "Execution Time" columns provide both pre- and post-tuning times, along with the difference between them. This format simplifies comparative analysis across different models. For the regression models in Table 3, we summarize the key differences between base and tuned models, comparing accuracy, estimation categories, and execution time. For the classification models in Table 4, we offer a similar summary, highlighting the performance changes after tuning.

4.6 Analysis of regression models

4.6.1 CatBoost regressor

  • Accuracy: Remains constant at 0.9893, indicating that the model was optimally configured from the outset and further tuning did not enhance predictive performance.

  • Estimation: The estimated values remain unchanged post-tuning, suggesting that the model’s predictions are stable and tuning does not impact its estimation distribution.

  • Execution Time: There is a significant increase in execution time from 4.72 to 21.89 s, which is a considerable drawback, considering there is no accuracy gain.

4.6.2 XGBoost regressor

  • Accuracy: Consistently at 0.9800, indicating a strong predictive ability that tuning did not improve, possibly because the default parameters were already suitable.

  • Estimation: No change in the estimation distribution suggests that the model’s predictive behavior is stable across different configurations.

  • Execution Time: Tuning increases the execution time from 2.88 to 6.99 s, which is not as drastic as some other models, but still a factor to consider if deployment constraints exist.

4.6.3 Random forest regressor

  • Accuracy: Maintains a constant accuracy of 0.9783, suggesting that the model is robust and tuning did not further refine its performance.

  • Estimation: Slight changes in estimation suggest minor shifts in the model’s decision-making process, but these are not significant.

  • Execution Time: Interestingly, execution time decreases post-tuning from 58.76 to 49.2 s, indicating that the tuning process may have simplified the model, improving computational efficiency.

4.6.4 Decision tree regressor

  • Accuracy: A marginal decrease in accuracy from 0.9580 to 0.9578 could be due to the tuning process simplifying the model slightly, which may be a favorable trade-off for reducing complexity.

  • Estimation: The slight shift in estimation figures post-tuning is negligible and does not indicate a significant change in model behavior.

  • Execution Time: The execution time is nearly unchanged, remaining under one second, which is excellent for real-time predictions.

4.6.5 Gradient boosting regressor

  • Accuracy: Unchanged at 0.9576, indicating that tuning did not improve the model’s ability to predict diamond prices.

  • Estimation: The estimation counts remain the same, suggesting a consistent model performance.

  • Execution Time: Slightly reduced from 11.17 to 9.99 s, which is one of the few models where tuning has reduced the execution time.

4.6.6 KNN regressor

  • Accuracy: Remains at 0.9398, implying that the base configuration was already well-suited for the dataset.

  • Estimation: No change in estimation suggests that tuning did not alter the model’s predictions.

  • Execution Time: The execution time increases from 0.68 to 2.19 s post-tuning, which is a factor to consider for time-sensitive applications.

4.6.7 Lasso regression

  • Accuracy: Stays at 0.9233, showing that this model’s performance is stable and not enhanced by tuning.

  • Estimation: The estimation distribution remains constant, indicating stable prediction behavior.

  • Execution Time: The execution time increases after tuning (from 1.53 to 5.37 s), which may not be justifiable given the lack of accuracy improvement.

4.6.8 Ridge regression

  • Accuracy: Consistent at 0.9198, suggesting that the default settings were adequate.

  • Estimation: The estimation does not change with tuning, indicating a stable predictive model.

  • Execution Time: Increases slightly post-tuning, which should be considered when evaluating the practicality of deploying this model.

4.6.9 Linear regression

  • Accuracy: Unchanged at 0.9195, indicating that linear regression’s performance is not impacted by tuning.

  • Estimation: The estimation counts remain the same, reflecting consistent model behavior.

  • Execution Time: There is a noticeable increase in execution time post-tuning (from 0.19 to 3.12 s), which may affect its suitability for deployment.

4.6.10 Polynomial regression

  • Accuracy: The accuracy is low at 0.7862 and remains unaffected by tuning.

  • Estimation: There are no changes in estimation, suggesting the model’s prediction distribution is consistent.

  • Execution Time: There is virtually no change in execution time, which is expected given the nature of polynomial regression.

4.6.11 Elastic net regression

  • Accuracy: Remains constant at 0.8250, which is lower compared to other models, indicating limited predictive power for this dataset.

  • Estimation: The estimation distribution is unchanged, which means the model’s predictions are stable.

  • Execution Time: A slight increase in execution time post-tuning is observed, but it remains minimal.

4.6.12 SVR regressor

  • Accuracy: The negative accuracy (-0.1331) indicates that the model’s predictions are worse than a simple average, and tuning does not improve this.

  • Estimation: The estimation suggests that the model is consistently inaccurate, with a large number of under and overestimations.

  • Execution Time: Execution time is extremely high and increases even further post-tuning (from 277.1 to 523.5 s), making it impractical for most applications.

From the above detailed analysis of Table 3, we can conclude the following: CatBoost, XGBoost, and Random Forest Regressors show top-tier performance with high accuracy and stable estimation abilities; however, CatBoost’s increased execution time post-tuning is a concern. Decision Tree and Gradient Boosting Regressors offer good accuracy with the added benefit of reduced execution time post-tuning. KNN, Lasso, Ridge, and Linear Regression models do not benefit from tuning in terms of accuracy, and the increased execution time post-tuning should be factored into the decision to use them. Polynomial Regression and Elastic Net Regression show moderate performance, and the unchanged execution time post-tuning indicates stability but not necessarily efficiency or effectiveness. The SVR Regressor is the least suitable model for this dataset, with poor accuracy and prohibitive execution times. In practical terms, for a diamond price prediction system, one should consider models that balance accuracy with execution time. If prediction speed is crucial, models with only minor increases in execution time post-tuning, like XGBoost, or Random Forest with its execution time reduction, may be optimal. Where the highest accuracy is paramount and execution time is less of a concern, models like the CatBoost Regressor, despite its longer execution time, might be preferred. However, any model selection should be preceded by an extensive cost–benefit analysis, especially if the model is to be deployed in a real-time pricing environment.

4.7 Analysis of classification models

For classification, the XGBoost Classifier and CatBoost Classifier showed the highest base accuracies. However, the CatBoost Classifier’s performance remained consistent after tuning, while the XGBoost Classifier’s accuracy decreased upon tuning. Notably, the CatBoost Classifier’s execution time improved after tuning, which is a rare and valuable outcome.

4.7.1 Decision tree classifier

  • Accuracy: The Decision Tree Classifier has a high base accuracy (0.9343) which slightly decreases after tuning (0.9317). This might be a sign of overfitting in the base model, which tuning has mitigated.

  • Estimation: The base model slightly overestimates and underestimates with nearly equal frequency. Tuning reduces underestimations but increases overestimations, indicating a shift in the decision boundary.

  • Execution Time: The tuning process greatly increases execution time (from 0.52 to 9.6 s), likely due to a more extensive search through the hyperparameter space or more complex split criteria.

4.7.2 Random forest classifier

  • Accuracy: There is a notable drop in accuracy after tuning (from 0.9522 to 0.9336). This suggests that the default parameters may already be well-suited for the dataset, and tuning might introduce unnecessary complexity.

  • Estimation: The number of underestimations increases after tuning, while accurate estimations decrease, which might indicate an overfitting to the noise in the training data.

  • Execution Time: Tuning substantially increases execution time (from 10.54 to 87.65 s), likely due to a larger number of trees or increased depth of each tree, which raises questions about the practicality of the tuned model in an operational environment.

4.7.3 SVM classifier

  • Accuracy: There is a significant improvement in accuracy post-tuning (from 0.9114 to 0.9496), suggesting that the tuning process found a more optimal hyperplane for classification.

  • Estimation: Tuning has notably reduced overestimations and slightly increased accurate estimations, indicating an improved fit to the data.

  • Execution Time: The execution time after tuning is extremely high (from 248.14 to 1854.41 s), making it the least practical model in terms of speed.

4.7.4 Logistic regression

  • Accuracy: The model shows a slight improvement in accuracy after tuning (from 0.9165 to 0.9205), which is a marginal gain.

  • Estimation: Post-tuning, the model does a slightly better job at reducing both overestimations and underestimations.

  • Execution Time: There is an increase in execution time post-tuning (from 1.39 to 28.74 s), which could be attributed to a more rigorous regularization process during tuning.

4.7.5 KNN classifier

  • Accuracy: The accuracy increases slightly after tuning (from 0.9102 to 0.9153), suggesting that the model benefits from optimizing the number of neighbors.

  • Estimation: Tuning reduces underestimations but increases overestimations, which could result from changes in the neighborhood size.

  • Execution Time: The execution time increases after tuning (from 2.52 to 36.81 s), indicating a higher computational cost likely due to an increase in the number of neighbors considered.

4.7.6 Gradient boosting classifier

  • Accuracy: There is a minor decrease in accuracy after tuning (from 0.9431 to 0.9380), possibly indicating that the base model was already well-configured for the dataset.

  • Estimation: Tuning seems to increase both underestimations and overestimations slightly, which might be a sign of the model struggling to generalize with the tuned parameters.

  • Execution Time: The tuning leads to a significant increase in execution time (from 47.52 to 341.35 s), implying a more complex model with potentially more estimators or deeper trees.

4.7.7 AdaBoost classifier

  • Accuracy: The AdaBoost Classifier shows an improvement in accuracy after tuning (from 0.6802 to 0.6917), but it still lags behind other models.

  • Estimation: The tuned model has a higher frequency of underestimations, and significantly reduces overestimations, which could indicate a conservative shift in the decision boundary.

  • Execution Time: There is an increase in execution time after tuning (from 4.39 to 65.89 s), which may not be justified given the relatively small gain in accuracy.

4.7.8 XGBoost classifier

  • Accuracy: There is a slight decrease in accuracy after tuning (from 0.9607 to 0.9452), which might suggest that the base model was already near-optimal.

  • Estimation: The number of accurate estimations decreases, and overestimations increase after tuning, potentially indicating overfitting.

  • Execution Time: The execution time significantly increases post-tuning (from 10.81 to 121.67 s), which might make it less suitable for time-sensitive applications.

4.7.9 CatBoost classifier

  • Accuracy: The accuracy remains consistent before and after tuning (0.9585), indicating a robust base configuration.

  • Estimation: There is no change in the estimation abilities of the model post-tuning, which is exceptional.

  • Execution Time: There is an improvement in execution time post-tuning (from 25.08 to 10.88 s), which is unusual and suggests effective optimization during tuning.

4.7.10 Naive bayes classifier

  • Accuracy: The accuracy remains the same before and after tuning (0.8805), which is typical given the model’s lack of hyperparameters that significantly affect performance.

  • Estimation: The estimation breakdown remains consistent, with no changes after tuning.

  • Execution Time: The execution time is negligible and remains largely unchanged post-tuning, making it the fastest among the models evaluated.

4.7.11 LightGBM classifier

  • Accuracy: There is a negligible decrease in accuracy after tuning (from 0.9607 to 0.9598), implying that the base model was already well-tuned.

  • Estimation: The tuned model has a slight increase in underestimations and a decrease in accurate estimations, although these changes are minimal.

  • Execution Time: There is an increase in execution time post-tuning (from 1.76 to 16.24 s), indicating a more complex model which may not be necessary given the small changes in accuracy.

The investigation assessed several ML classifiers for their effectiveness in prediction, as reported in Table 4, with particular attention to the XGBoost (XGB) and CatBoost (CB) classifiers. Both models showed excellent initial accuracies, but only the CB Classifier sustained its accuracy after tuning while also improving its execution time, which is a significant accomplishment. Other models, such as the DT and RF classifiers, exhibited reduced accuracy after tuning, indicating probable overfitting or the addition of superfluous complexity. After tuning, the SVM Classifier improved in accuracy but became impractical because of a substantial increase in execution time. Models like the GB and AdaBoost classifiers improved their estimates after tuning but required much longer execution times. Compared to the other models, the CB Classifier performed better by maintaining accuracy and reducing processing time after tuning, showing that it is robust and efficient, whereas the NB and LGBM classifiers did not improve much after tuning. This highlights the importance of choosing algorithms that strike a good balance between accuracy, estimation precision, and computational efficiency for predictive tasks.

4.8 Overview of hyperparameter tuning

Hyperparameter tuning involves finding the optimal set of parameters that improve a model’s performance. This is crucial in ML, where models rely on configurations like learning rates, maximum depths, or the number of estimators.

4.8.1 Study’s approach

This study employs hyperparameter tuning techniques across both regression and classification models, incorporating a diverse range of algorithms:

  • Randomized Search: For models like Decision Tree, Random Forest, and Gradient Boosting, RandomizedSearchCV is used to explore hyperparameters (N. Sharma et al. 2023; Takkala et al. 2022). This technique samples configurations from a distribution, allowing for efficient tuning.

  • Predefined Grids: Each model has specific parameter grids tailored to its nature. For example, Decision Tree has a grid covering maximum depth, while Ridge Regression explores different alpha values(Abdollahi & Nouri-Moghaddam 2022).

4.8.2 Comparison to other hyperparameter methods

  • Manual Search: This method involves manually adjusting parameters and testing results, which is impractical for the variety of models(Y. A. Ali et al. 2023a, b) and hyperparameters in this study.

  • Grid Search: This exhaustive approach systematically explores every combination in a predefined range, but is infeasible here due to the large number of models and configurations(Belete & Huchaiah 2022).

  • Bayesian Optimization: This method uses a probabilistic model to guide hyperparameter exploration(Victoria & Maragatham 2021). While not employed in this study, it offers efficient exploration of promising configurations.

  • Genetic Algorithms: This evolutionary approach applies genetic operators like crossover and mutation to evolve configurations(Katoch et al. 2021). It is not used but can handle complex parameter spaces effectively.

  • Hyperband: This method dynamically allocates resources, balancing exploration, and exploitation(L. Yang & Shami 2020). Though not part of this study, it could offer efficient resource management.

  • Successive Halving: This approach progressively allocates resources to configurations, discarding weak performers early(Goay et al. 2021). It is not explicitly used, but its resource efficiency could be beneficial.

  • Adaptive Methods: Adaptive optimizers like Adam or RMSProp automatically adjust learning rates, improving training efficiency(Baik et al. 2020). These are not directly discussed in our study.

  • Meta-Learning: This involves training a model to generalize tuning experiences across tasks(Baik et al. 2020), which is not part of this study. It requires substantial data and similar tasks.

This study’s hyperparameter tuning methodology effectively balances exploration and performance:

  • Efficient: Randomized Search provides diverse configurations without exhaustive exploration, making it practical for this study’s scope.

  • Comprehensive: Covers various models with tailored grids, enhancing both regression and classification performance.

This study provides a strong foundation for future enhancements, balancing exploration, performance, and evaluation across diverse models.

5 Discussion

This discussion returns to the goals of our research on the use of ML algorithms for predicting gemstone prices. Our research is characterized by the extensive use of regression and classification models. We not only analyze the numerical results but also examine how they relate to, and in some cases challenge, our original assumptions and objectives. We emphasize the crucial trade-offs that emerged, such as the balance between accuracy and computational efficiency, as well as the varied effects of hyperparameter tuning. This study situates our findings within the broader framework of predictive modeling, highlighting its importance for both gemmology and the wider discipline. By clearly stating the goals of our study and how we achieved them, we emphasize the contribution of our research to the existing literature and its potential to shape future studies applying ML to diamond assessment.

5.1 Objective 1: machine learning models’ effectiveness assessment

Our extensive study of ML models for diamond price forecasting yielded detailed insight into accuracy, execution time, and value-forecasting performance. This analysis is distinctive in that it covers more types of regression and classification algorithms than prior research, and the wide range of models tested provided a nuanced understanding of model performance in the context of diamond price prediction. For regression, we used CatBoost, XGBoost, Random Forest, Decision Tree, Gradient Boosting, KNN, Lasso, Ridge, Linear, Polynomial, Elastic Net, and SVR; for classification tasks, we used a similar variety of models. Comparing these models against benchmarks derived from prior research produced notable results, for example the capacity of sophisticated ensemble algorithms to capture the intricate patterns in diamond price data. Our study found excellent base accuracies, especially with CatBoost and XGBoost. The incorporation of diverse models enhances the body of knowledge by offering a thorough evaluation across numerous model families, drawing attention to the practical trade-offs between accuracy, runtime, and alignment of predicted values.

5.2 Objective 2: Optimizing hyperparameters

The studied models showed a wide range of responses to hyperparameter tuning. Execution times and the distribution of predicted values differed markedly from one model to the next, even when the variations in accuracy were small. For example, the Random Forest model’s processing time went from 58.76 s to 49.2 s after optimization, which shows how optimization can improve computational efficiency. On the other hand, SVR’s execution time surged from 277.1 s to 523.5 s after tuning, raising concerns about the practical viability of such models in time-sensitive applications. These results show that hyperparameter tuning involves varied trade-offs: accuracy may improve slightly or stay the same, but the effects on execution time and processing efficiency can differ greatly. This underlines how important it is to consider all the ways that changing hyperparameters can affect model performance.

5.3 Objective 3: An evaluation of the impact of hyperparameter tuning in comparison

Studying how hyperparameter optimization affects different models in different ways is a useful way to learn how algorithms respond to tuning efforts. Models that are inherently robust, like Random Forest, showed faster processing times without losing accuracy. Models like SVR and Gradient Boosting had issues: SVR’s execution time rose sharply after tuning, while Gradient Boosting’s execution time decreased but its accuracy dropped slightly. These differences may stem from model complexity, the characteristics of the dataset, or the specific hyperparameters being tuned. To get the most out of parameter tuning, it is advisable to use models with simpler structures or ones that closely align with the characteristics of the dataset. Based on the findings, hyperparameter tuning should be customized for each model to account for its distinct traits and requirements.

5.4 Objective 4: To identify the best-performing models for diamond price prediction

After extensive testing and analysis, the best models for predicting diamond prices after tuning achieved the best mix of accuracy, computational efficiency, and consistent value predictions. It was a close competition between CatBoost and XGBoost; both could handle tasks requiring high accuracy thanks to their good execution speeds and accuracy. On the other hand, the Random Forest model’s reduced processing time after tuning could be useful in cases where speed is very important. The implications of these findings are substantial for predictive modeling in gemmology and related areas. They show how important it is to look at models as a whole, considering predictive performance, prediction accuracy, and efficiency. This all-around view is important for using predictive models in the real world, because it ensures they give accurate predictions quickly enough to be practical.

5.5 Comparative analysis

This study contributes to the existing body of work on the use of ML in predicting diamond prices by using a more extensive range of regression and classification models. By comparing the efficacy of various algorithms on different datasets and under varied settings, this research has shed light on the promise of ML for diamond price prediction.

Nevertheless, the proposed method stands out by using an extensive array of regression and classification techniques, enabling thorough comparisons with prior results. The proposed method uses 23 ML models, as shown in Table 5. For example, our work confirms the results of (Alsuraihi et al. 2020; Sharma et al. 2021) regarding RFR, but goes a step further by investigating the effects of hyperparameter optimization and execution time analysis across a broader range of models. Our research expands knowledge of successful algorithms for diamond price prediction by showing that models such as CatBoost and XGBoost in regression, and LightGBM and XGBoost in classification, perform well beyond RFR.

Table 5 Comparative Analysis

Table 5 presents a comparative analysis of the performance of the various ML models used in the regression and classification tasks, as proposed here and compared with other studies. The table lists features, models, and accuracy metrics for base and tuned settings, providing a clear view of model performance improvements.

Column Descriptions

  • Authors & Methods: References the specific studies along with the methods they used.

  • Proposed Model: The specific ML model applied in the proposed research.

  • Features: The number of features used in each model.

  • ML Models: The total number of machine learning models considered.

  • Accuracy: Divided into two sub-columns:

    • Base: The initial accuracy of the model before tuning.

    • Tuned: The improved accuracy after tuning.

Definitions and Explanations

  • Regression/Classification: Indicates whether the model is used for regression or classification tasks.

  • Accuracy: Represents the percentage of correct predictions made by the model. Displayed as Base for initial accuracy and Tuned for post-optimization accuracy.

  • Authors & Methods: Lists studies associated with similar models and notes their base accuracy.

This extensive research on diamond price prediction included an experimental comparison of diverse ML models, covering both classification and regression techniques. Unlike previous research, the proposed approach makes use of 23 distinct ML techniques and a large set of features. In Table 5, we highlight the improvements and contributions of this work by comparing it with prominent predecessors, particularly Alsuraihi et al. (2020), Fitriani et al. (2022), José M. Peña Marmolejos (2018), Pandey et al. (2019), and G. Sharma et al. (2021). This study increases the number of regression models examined and incorporates classification models, expanding the analytical scope and application possibilities in diamond price prediction.

This paper introduces CBR as an innovative model for predicting diamond values, with a remarkable accuracy of 98.9%. This performance showcases the model’s exceptional predictive capability and sets a higher benchmark for the area, surpassing the accuracies reported in previous studies. The RFR model, reported by Alsuraihi et al. (2020) and Sharma et al. (2021), was previously regarded as the highest-performing, with an accuracy of 97%. The improvement achieved with CBR, although it may seem small, represents meaningful progress, which matters in a field where even small gains can substantially affect the accuracy of valuations and financial outcomes. Pandey et al. (2019) achieved 95% accuracy using GB, a powerful ensemble approach known for its strong performance in regression tasks; surpassing this result demonstrates the effectiveness of CatBoost in handling complex datasets with multiple factors that affect diamond values, such as carat weight, clarity, color, and cut quality, and further validates the algorithm in this context. Fitriani et al.’s (2022) KNN model reached 90% accuracy, which is respectable but below the more advanced algorithms discussed here, underlining the importance of choosing models that can capture the many relationships and interactions present in diamond pricing data. José M. Peña Marmolejos (2018) presented an LR model with a 95% accuracy rate; while LR is valued for its simplicity and interpretability, our study found that advanced data-driven techniques such as gradient-boosted trees (CatBoost) can greatly improve predictive accuracy because they identify subtle patterns in the data.

The CBR model outperforms previous models due to its native handling of the categorical features commonly found in diamond datasets and its capability to limit overfitting. Additionally, CatBoost’s gradient boosting design handles missing data effectively and exploits complex relationships between features, which is crucial for accurately predicting gem prices. The increased accuracy of the CBR model suggests that it has great potential as a tool for predicting diamond prices and demonstrates the continuous improvement of ML algorithms. These models should be developed further through additional research, for instance by combining them with other data sources and predictive features to increase their accuracy and usefulness in the diamond industry and other domains.
This study contributes to the existing research on predicting diamond prices by examining ML algorithms in a more specific and comprehensive manner. The findings of this paper lay the foundation for further study and practical applications in the field of diamond pricing by improving the accuracy of the analysis, incorporating classification models, and showcasing the versatility of these models in various prediction tasks.
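
To illustrate the claim that CatBoost consumes categorical diamond attributes directly, the following is a minimal sketch of a CatBoostRegressor fitted on a diamonds-style schema (carat, cut, color, clarity, dimensions). The file path, column names, and hyperparameters are assumptions for illustration, not the exact configuration used in this study.

```python
import pandas as pd
from catboost import CatBoostRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Assumed schema resembling the classic diamonds dataset (hypothetical path).
df = pd.read_csv("diamonds.csv")
features = ["carat", "cut", "color", "clarity", "depth", "table", "x", "y", "z"]
categorical = ["cut", "color", "clarity"]

X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["price"], test_size=0.2, random_state=42)

# CatBoost consumes categorical columns directly via cat_features,
# avoiding manual one-hot encoding.
model = CatBoostRegressor(iterations=500, learning_rate=0.1,
                          depth=6, verbose=0, random_seed=42)
model.fit(X_train, y_train, cat_features=categorical)

print("Test R^2:", round(r2_score(y_test, model.predict(X_test)), 4))
```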

This study has successfully demonstrated the effectiveness of various ML models in predicting diamond prices with high accuracy. These findings have significant implications for the diamond industry, particularly in areas such as pricing strategy development, inventory management, and consumer targeting. The points below illustrate how specific models and findings can be applied in everyday business practice within the industry. For instance:

  • Pricing Strategy Optimization: The high accuracy of the CatBoost and XGBoost regressors in predicting diamond prices suggests that these models can be integrated into pricing tools to help jewellers set competitive and profitable prices based on anticipated market trends and consumer behavior (a minimal pricing-tool sketch follows this list).

  • Inventory Management: Models like Random Forest and Gradient Boosting can analyze historical sales data to predict future demand, enabling suppliers and retailers to optimize stock levels, reduce overhead costs, and minimize the risk of overstocking or stockouts.

  • Customer Segmentation and Targeting: The precision of classification models in identifying customer preferences can be leveraged to tailor marketing strategies, personalize customer interactions, and enhance customer satisfaction and loyalty.
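
As a sketch of the pricing-tool integration mentioned in the first bullet above, the function below wraps an already trained regressor (for example, the CatBoost model sketched earlier) and applies a configurable margin to its predicted price. The margin policy and field names are hypothetical.

```python
from dataclasses import dataclass

import pandas as pd

@dataclass
class PricingTool:
    """Suggest a listing price from a trained price-prediction model."""
    model: object                 # any fitted regressor exposing .predict()
    target_margin: float = 0.15   # hypothetical 15% markup over the predicted price

    def suggest_price(self, stone: dict) -> float:
        features = pd.DataFrame([stone])
        predicted = float(self.model.predict(features)[0])
        return round(predicted * (1 + self.target_margin), 2)

# Example usage with the hypothetical CatBoost model from the earlier sketch:
# tool = PricingTool(model=model)
# tool.suggest_price({"carat": 1.01, "cut": "Ideal", "color": "G",
#                     "clarity": "VS1", "depth": 61.5, "table": 57.0,
#                     "x": 6.4, "y": 6.43, "z": 3.95})
```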

The diamond industry stands to gain from implementing these ML solutions through increased efficiency, reduced costs, enhanced competitive advantage, and improved customer satisfaction. The industry faces unique challenges such as price volatility and complex supply-chain logistics; advanced predictive analytics can provide more stable pricing mechanisms and improve supply-chain efficiency, offering substantial economic benefits. Challenges that might arise when applying these models in practice include the need for high-quality data, integration with existing systems, and ongoing maintenance. To mitigate the risk of inaccuracies due to poor-quality data, companies should invest in robust data collection and processing systems. In addition, continuous retraining and updating of the models is recommended so that they adapt to changing market conditions; a minimal retraining sketch follows below.
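
As one minimal sketch of the continuous-retraining recommendation, the routine below refits a model whenever its error on recent data drifts past a tolerance. The error metric, threshold, and refit policy are illustrative assumptions.

```python
from sklearn.metrics import mean_absolute_percentage_error

def maybe_retrain(model, recent_X, recent_y, full_X, full_y, tolerance=0.08):
    """Refit the model on all available data if recent error exceeds a tolerance.

    `tolerance` (8% MAPE here) is an arbitrary illustrative threshold.
    Returns the (possibly refitted) model and the measured drift.
    """
    drift = mean_absolute_percentage_error(recent_y, model.predict(recent_X))
    if drift > tolerance:
        model.fit(full_X, full_y)   # retrain on historical plus recent data
    return model, drift
```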

6 Conclusion

In our comprehensive analysis of ML algorithms for diamond price prediction, we observed a complex interplay between model accuracy, estimation ability, and processing time before and after hyperparameter modification. We examined the variations in performance metrics, estimated values, and execution times among the models to assess their efficiency and practicality. To judge the effectiveness of ML algorithms in providing diamond price forecasts, we revisited the original goals and expectations to see whether they were fulfilled. Our investigation probed a wide range of ML techniques, including regression and classification models, and evaluated accuracy, runtime, and prediction precision as they pertain to diamond value.

The outcomes revealed clear differences among the algorithms. Ensemble strategies (such as the CatBoost and XGBoost regressors) were better at achieving high accuracies, and hyperparameter adjustment affected their efficiency in different ways, which made them particularly interesting. The comparison highlights the delicate balance between model accuracy and processing speed, which is important to consider when using prediction models in practice. Some models, especially SVR, saw processing times grow to the point of becoming impractical for time-sensitive tasks. The results support our hypothesis that complex ensemble algorithms can accurately capture trends in diamond price data, which provides useful guidance for making predictions. In the diamond business, where the stakes are high and the margins are thin, being able to predict diamond prices quickly and correctly can have a major effect on how inventory is managed and how prices are set. It is therefore important to use ML models that perform well in terms of both accuracy and processing time; models that strike this balance can give companies a competitive edge by letting them make better-informed decisions more quickly.

Researchers and practitioners can both benefit from a longer-term agenda that explores hybrid models or ensemble methods. These more advanced approaches might improve performance further, increasing accuracy without slowing prediction down. The best model for predicting diamond prices is one that strikes a good balance between accuracy and practicality. As the field moves forward, developing and improving predictive models will remain central to making the diamond industry better at both running its business and planning its future. The fact that different models react differently to tuning, however, shows how important it is to understand the strengths and weaknesses of each model fully. Future work should focus on making these models even better, exploring ensemble strategies, or using more advanced methods to improve performance without sacrificing efficiency.

7 Research implications

This section outlines both the managerial and practical implications of the study.

7.1 Managerial implications

  • Industry Impact: The findings from this study provide valuable insights into the diamond industry, including trends in pricing and valuation accuracy. By applying ML techniques, stakeholders can gain a clearer understanding of market dynamics, enabling more informed decision-making.

  • Strategic Planning: The comparative analysis of different models provides managers with a clear understanding of which approaches yield the most accurate predictions. This, in turn, can inform strategic planning, particularly in pricing strategies, inventory management, and supply chain optimization.

7.2 Practical implications

  • Valuation Accuracy: The models developed and tuned in this study offer practical tools for accurately predicting diamond prices, reducing reliance on subjective human evaluation. This can lead to more consistent pricing strategies across the industry.

  • Operational Efficiency: The use of ML models and hyperparameter tuning can streamline operations within the diamond industry, improving processing times and reducing costs.

8 Limitations

This section identifies potential limitations of the study and outlines areas for future exploration.

8.1 Limitations

  • Data Sources: The study relies on data from a specific dataset, which may not fully represent all diamond market scenarios. Future studies may benefit from including additional data sources or real-time market data.

  • Generalizability: While the study’s findings are robust, they may not necessarily apply to all contexts within the diamond industry. Further research could investigate how these models perform in different geographical markets or under varying economic conditions.

  • Model Constraints: While the selected ML models demonstrate high accuracy, performance is contingent upon the quality and the granularity of the data available, which may limit the generalizability of the results across different market segments.

  • Computational Resources: The computational demand for some of the advanced models requires significant processing power, which might not be feasible for all industry players, especially small-scale jewellers.

These limitations suggest that while the findings are robust within the context of the current study, care should be taken when extrapolating these results to different economic conditions or to the broader diamond market.

9 Future research directions

  • Technological Advancements: Future research could explore how emerging technologies, such as blockchain and artificial intelligence, can enhance diamond industry operations. Additionally, continued advancements in ML algorithms may offer even greater predictive accuracy.

  • Cross-Industry Comparisons: Research could also explore how ML models used in this study might apply to other industries with similar characteristics, such as the gemstone or precious metals markets.

  • Further Model Optimization: Future work may delve deeper into hyperparameter tuning and model optimization techniques, exploring new approaches to improve performance across both regression and classification models.

  • Exploring Bayesian optimization could guide hyperparameter selection intelligently (a minimal sketch follows this list). Incorporating adaptive optimizers like Adam could tune learning rates dynamically. Developing a meta-learning model to generalize tuning experience could streamline future efforts.

  • Future research could focus on integrating real-time market data and broader economic indicators to enhance the models’ predictive power and applicability in dynamic market conditions. Investigating the use of hybrid models that combine ML with traditional econometric methods may offer a way to mitigate the limitations posed by volatile market data.

  • Further studies could explore the application of these models in emerging markets, which may present different challenges and opportunities compared to established markets.
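
As one way to act on the Bayesian optimization suggestion above, the sketch below uses the Optuna library to search a CatBoost hyperparameter space by maximizing cross-validated R². Optuna, the search ranges, and the trial budget are illustrative choices, not tools or settings used in this study; categorical-feature handling is omitted for brevity, so numeric features are assumed.

```python
import optuna
from catboost import CatBoostRegressor
from sklearn.model_selection import cross_val_score

def objective(trial, X, y):
    # Search space is illustrative; adjust to the model being tuned.
    params = {
        "iterations": trial.suggest_int("iterations", 200, 1000),
        "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.3, log=True),
        "depth": trial.suggest_int("depth", 4, 10),
        "verbose": 0,
    }
    model = CatBoostRegressor(**params)
    return cross_val_score(model, X, y, cv=3, scoring="r2").mean()

# Example usage, assuming numeric training arrays X_train and y_train:
# study = optuna.create_study(direction="maximize")
# study.optimize(lambda t: objective(t, X_train, y_train), n_trials=30)
# print(study.best_params)
```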

The ongoing evolution of machine learning and data analytics will undoubtedly open new avenues for research. By addressing the limitations identified in this study and exploring the suggested areas for future research, scholars can continue to refine and enhance the predictive capabilities of models used within the diamond industry and beyond.