Intellectual Technology of Analysis and Price Forecasting of Used Cars

Authors

  • V. B. Mokin Vinnytsia National Technical University
  • A. V. Losenko Vinnytsia National Technical University
  • M. V. Dratovanyi Vinnytsia National Technical University

DOI:

https://doi.org/10.31649/1997-9266-2019-147-6-62-72

Keywords:

intellectual technology, data mining, price prediction, used car, machine learning models

Abstract

To sell a used car profitably, sellers should rely not only on their own judgment or third-party expert appraisals but also on other suitable resources. Price prediction systems are one such resource: from common features of a car (manufacturer, model, mileage, fuel type, body type, etc.) they predict its likely price. Such systems can support decision-making not only for ordinary car dealers but also for agencies that order and ship used cars from abroad in bulk. To select the key features and identify the optimal structure and parameters of the models, relevant datasets must be chosen, exploratory data analysis and feature selection carried out, and a number of machine learning models built, from which the optimal model is chosen by defined criteria. To build an information system and test the proposed intellectual technology, two comparable datasets of used cars, from the USA and Ukraine, were selected. Python methods and libraries for exploratory data analysis were systematized, and general recommendations for applying them to this task were formulated. The general principles of the intellectual technology are proposed and tested on the selected datasets. In particular, an exploratory data analysis of the US data was conducted, and a rule for filtering anomalous, possibly erroneous, data was substantiated. A set of candidate models was selected and trained, and the optimal one was chosen according to the R-squared criterion. The car price was predicted with an accuracy of 86.1%. A similar problem was solved for the Ukrainian data, where an accuracy of 85.6% was achieved. These results confirm the workability of the proposed technology and its practical usefulness.
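The workflow summarized in the abstract (filter anomalous records, train several candidate models, keep the one with the highest R-squared) can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic data, the quantile-based filter, the three candidate models, and all function names are assumptions made for illustration only, and the abstract does not specify the actual filtering rule or the list of models.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def make_synthetic_cars(n=500, seed=0):
    """Generate toy data (assumption): price depends on age and mileage."""
    rng = np.random.default_rng(seed)
    age = rng.uniform(0, 20, n)                       # years
    mileage = age * 12_000 + rng.normal(0, 8_000, n)  # km
    price = 30_000 - 1_100 * age - 0.03 * mileage + rng.normal(0, 1_500, n)
    return np.column_stack([age, mileage]), price


def filter_price_outliers(X, y, low_q=0.01, high_q=0.99):
    """Hypothetical filtering rule: keep rows whose price lies between two quantiles."""
    low, high = np.quantile(y, [low_q, high_q])
    mask = (y >= low) & (y <= high)
    return X[mask], y[mask]


def select_best_model(X, y, seed=0):
    """Train each candidate and return the name with the highest test R-squared."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    candidates = {
        "linear": LinearRegression(),
        "ridge": Ridge(alpha=1.0),
        "random_forest": RandomForestRegressor(n_estimators=50, random_state=seed),
    }
    scores = {}
    for name, model in candidates.items():
        model.fit(X_train, y_train)
        scores[name] = r2_score(y_test, model.predict(X_test))
    best_name = max(scores, key=scores.get)
    return best_name, scores


X_raw, y_raw = make_synthetic_cars()
X, y = filter_price_outliers(X_raw, y_raw)
best_name, scores = select_best_model(X, y)
print(best_name, scores)
```

On real listings the categorical features named in the abstract (manufacturer, model, fuel type, body type) would need to be encoded before training, and the filtering thresholds would have to be justified from the data rather than fixed in advance.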

Author Biographies

V. B. Mokin, Vinnytsia National Technical University

Dr. Sc. (Eng.), Professor, Head of the Chair of System Analysis, Computer Monitoring and Computer Graphics

A. V. Losenko, Vinnytsia National Technical University

Student of the Department of Computer Systems and Automation

M. V. Dratovanyi, Vinnytsia National Technical University

Post-Graduate Student of the Chair of System Analysis, Computer Monitoring and Engineering Graphics

References

A. Bezerra, I. Silva, L. A. Guedes, D. Silva, G. Leitão, and K. Saito, “Extracting Value from Industrial Alarms and Events: A Data-Driven Approach Based on Exploratory Data Analysis,” Sensors, 2019, no 19, issue 12, pp. 11-32.

S. Lessmann and S. Voß, “Car resale price forecasting: The impact of regression method, private information, and heterogeneity on forecast accuracy,” International Journal of Forecasting, 2017, no 33, issue 4, pp. 864-877.

K. Noor and S. Jan, “Vehicle Price Prediction System using Machine Learning Techniques,” International Journal of Computer Applications, 2017, no 167, issue 9, pp. 27-31.

N. Sun, H. Bai, Y. Geng, and H. Shi, “Price evaluation model in second-hand car system based on BP neural network theory,” IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, 2017, pp. 431-436.

Python leads the 11 top Data Science, Machine Learning platforms: Trends and Analysis. [Electronic resource]. Available: https://www.kdnuggets.com/2019/05/poll-top-data-science-machine-learning-platforms.html .

Comprehensive Data Exploration with Python [Electronic resource]. Available: https://www.kaggle.com/pmarcelino/comprehensive-data-exploration-with-python .

Module pandas_profiling. [Electronic resource]. Available: https://pandas-profiling.github.io/pandas-profiling/docs/

Matplotlib API Overview. [Electronic resource]. Available: https://matplotlib.org/api/index.html .

A new correlation coefficient between categorical, ordinal and interval variables with Pearson characteristics. [Electronic resource]. Available: https://arxiv.org/abs/1811.11440 .

Used Cars Dataset, Vehicles listings from Craigslist. [Electronic resource]. Available: https://www.kaggle.com/austinreese/craigslist-carstrucks-data .

Supervised Learning API Overview. [Electronic resource]. Available: https://scikit-learn.org/stable/supervised_learning.html#supervised-learning .

T. Houska, P. Kraft, A. Chamorro-Chavez, and L. Breuer, SPOTting Model Parameters Using a Ready-Made Python Package. [Electronic resource]. Available: https://doi.org/10.1371/journal.pone.0145180 .

Metrics and scoring: quantifying the quality of predictions. [Electronic resource]. Available: https://scikit-learn.org/stable/modules/model_evaluation.html#r2-score .

Published

2019-12-23

How to Cite

[1]
V. B. Mokin, A. V. Losenko, and M. V. Dratovanyi, “Intellectual Technology of Analysis and Price Forecasting of Used Cars”, Вісник ВПІ, no. 6, pp. 62–72, Dec. 2019.

Section

Information technologies and computer sciences
