Solving The Issues Of Traffic Risk Mining
At present, a large amount of traffic-related data is obtained manually and through sensors and social media, e.g., traffic statistics, accident statistics, road information, and user comments. Traffic risk refers to the possibility of occurrence of traffic accidents. Specifically, we focus on two issues: predicting the number of accidents on any road or at any intersection, and clustering roads to identify risk factors for risky road clusters.
To solve these issues, we use a unified approach based on feature-based non-negative matrix factorization (FNMF). In particular, a new multiplicative update algorithm is developed for FNMF to handle big traffic data. Using real traffic data from India, we demonstrate that the proposed algorithm can be used to predict traffic risk at any location more accurately and efficiently than existing methods, and that a number of clusters of risky roads can be identified and characterized by two risk factors.
Goals and Objectives
- Overcoming the unavailability of data for all of the locations on the map
- Collecting information to address the isolated nature of the traffic risk information associated with each location
- Reporting the traffic mining results obtained for real datasets, including (i) the results for predicting the number of accidents on roads and at intersections and (ii) the results related to knowledge discovery concerning traffic risk factors.
Relevant mathematics associated with the Project
- Traffic accident dataset
- Dataset containing traffic flow
- Roadway data statistics
- Brake application rate dataset
- Accident number predicted for a road or intersection
- Calculated risk factor for a specific cluster formed
A matrix data structure is used, which makes it possible to exploit distributed processing.
- Formation of matrix by combining all datasets
- Prediction of the number of accidents
- Formation of clusters
- Calculation of risk factor
- Predicted number of accidents is accurate
- Calculated risk factor is accurate
- Predicted number of accidents is not accurate
- Calculated risk factor is not accurate
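The steps listed above (combining the datasets into one matrix, predicting the number of accidents, and forming clusters) can be illustrated with a minimal sketch. It assumes plain non-negative matrix factorization with a mask for unobserved entries, not the FNMF algorithm itself. The toy numbers, the column layout (traffic flow, brake application rate, accidents), and the function name `masked_nmf` are hypothetical and chosen only for illustration.

```python
import numpy as np

def masked_nmf(X, mask, rank=2, n_iter=500, eps=1e-9, seed=0):
    """Plain NMF fitted only on observed entries (mask == 1).

    X    : (locations x features) non-negative matrix
    mask : same shape, 1.0 where a cell was observed, 0.0 where it is missing
    Returns non-negative factors W (locations x rank) and H (rank x features).
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank)) + 0.1   # strictly positive starting point
    H = rng.random((rank, m)) + 0.1
    Xm = mask * X                     # zero out the unobserved cells
    for _ in range(n_iter):
        # Multiplicative updates for the masked squared error
        # || mask * (X - W H) ||^2; factors stay non-negative.
        W *= (Xm @ H.T) / ((mask * (W @ H)) @ H.T + eps)
        H *= (W.T @ Xm) / (W.T @ (mask * (W @ H)) + eps)
    return W, H

# Hypothetical combined matrix: one row per road or intersection,
# columns = [traffic flow, brake application rate, number of accidents].
X = np.array([
    [120.0, 30.0, 6.0],
    [100.0, 25.0, 5.0],
    [ 20.0, 40.0, 4.0],
    [ 10.0, 20.0, 2.0],
    [ 60.0, 15.0, 0.0],   # accident count unknown; 0.0 is only a placeholder
])
mask = np.ones_like(X)
mask[4, 2] = 0.0          # mark the unknown accident count as missing

W, H = masked_nmf(X, mask, rank=2)

# Prediction: read the missing cell from the low-rank reconstruction.
predicted_accidents = (W @ H)[4, 2]
print("predicted accidents at location 4:", predicted_accidents)

# Clustering: assign each location to its dominant latent factor.
# The rows of H describe each factor in terms of the input features.
clusters = W.argmax(axis=1)
print("cluster of each location:", clusters)
```

In this sketch the mask keeps the placeholder value from influencing the fit, so the same factorization handles locations with and without accident records. The rows of `H` can be inspected as candidate risk factors for each cluster.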
Names of Conferences / Journals where papers can be published
- IEEE/ACM Conference/Journal 1
- Conferences/workshops in IITs
- Central Universities or SPPU Conferences
- IEEE/ACM Conference/Journal 2
Review of Conference/Journal Papers supporting Project idea
Koichi Moriya, Shin Matsushima, and Kenji Yamanishi, in "Traffic Risk Mining From Heterogeneous Road Statistics", used real traffic data from Tokyo and demonstrated that a feature-based non-negative matrix factorization algorithm can be used to predict traffic risk at any location more accurately and efficiently than existing methods, and that a number of clusters of risky roads can be identified and characterized by two risk factors.
E. Bayam et al., in "Older drivers and accidents: A meta analysis and data mining application on traffic accident data", studied older drivers and the accidents associated with them. The paper provides a meta-analysis of the existing literature on senior drivers and shows how data mining techniques could be used in this application.
T. Beshah and S. Hill, in "Mining road traffic accident data to improve safety: Role of road-related factors on accident severity in Ethiopia", applied data mining technologies to link recorded road characteristics to accident severity in Ethiopia, and developed a set of rules that could be used by the Ethiopian Traffic Agency to improve safety.
L. Chang and W. Chen, in "Data mining of tree-based models to analyze freeway accident frequency", explained that Classification and Regression Tree (CART), one of the most widely applied data mining techniques, has been commonly employed in business administration, industry, and engineering. CART does not require any pre-defined underlying relationship between the target (dependent) variable and the predictors (independent variables) and has been shown to be a powerful tool, particularly for dealing with prediction and classification problems.
M. Chong, A. Abraham, and M. Paprzycki, in "Traffic accident analysis using machine learning paradigms", summarize the performance of four machine learning paradigms applied to modeling the severity of injury that occurred during traffic accidents. They considered neural networks trained using hybrid learning approaches, support vector machines, decision trees, and a concurrent hybrid model involving decision trees and neural networks. Experimental results reveal that, among the machine learning paradigms considered, the hybrid decision tree-neural network approach outperformed the individual approaches.
I. S. Dhillon and S. Sra, in "Generalized nonnegative matrix approximations with Bregman divergences", make algorithmic progress by modeling and solving (using multiplicative updates) new generalized NNMA problems that minimize Bregman divergences between the input matrix and its low-rank approximation.
In addition, Dhillon and Sra show how to use penalty functions for incorporating constraints other than nonnegativity into the problem. Further, some interesting extensions to the use of "link" functions for modeling nonlinear relationships are also discussed.
D. D. Lee and H. S. Seung, in "Algorithms for non-negative matrix factorization", analyzed two different multiplicative algorithms for NMF. They differ only slightly in the multiplicative factor used in the update rules. One algorithm was shown to minimize the conventional least squares error, while the other minimizes the generalized Kullback-Leibler divergence.
Lee and Seung proved the monotonic convergence of both algorithms using an auxiliary function analogous to that used for proving convergence of the Expectation-Maximization algorithm.
D. D. Lee and H. S. Seung, in "Learning the parts of objects by non-negative matrix factorization", demonstrated an algorithm for non-negative matrix factorization that is able to learn parts of faces and semantic features of text. This is in contrast to other methods, such as principal components analysis and vector quantization, that learn holistic, not parts-based, representations.
S. Krishnaveni and M. Hemalatha, in "A perspective analysis of traffic accident using data mining techniques", deal with several classification models used to predict the severity of injury that occurred during traffic accidents. They compared the Naive Bayes Bayesian classifier, AdaBoostM1 Meta classifier, PART Rule classifier, J48 Decision Tree classifier, and Random Forest Tree classifier for classifying the type of injury severity of various traffic accidents. The final result showed that Random Forest outperforms the other four algorithms.
J. Kim and H. Park, in "Sparse nonnegative matrix factorization for clustering", studied the properties of Nonnegative Matrix Factorization (NMF) as a clustering method by relating its formulation to other methods such as K-means clustering.
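For reference, the two multiplicative update rules of Lee and Seung reviewed above can be written out as follows. The notation is an assumption, since the review does not fix one: V is the non-negative input matrix, approximated by the product WH of two non-negative factors.

```latex
% Update rules for the least-squares objective \| V - WH \|^2
H_{a\mu} \leftarrow H_{a\mu}\,\frac{(W^{\top}V)_{a\mu}}{(W^{\top}WH)_{a\mu}},
\qquad
W_{ia} \leftarrow W_{ia}\,\frac{(VH^{\top})_{ia}}{(WHH^{\top})_{ia}}

% Update rules for the generalized Kullback-Leibler divergence D(V \,\|\, WH)
H_{a\mu} \leftarrow H_{a\mu}\,
  \frac{\sum_{i} W_{ia}\,V_{i\mu}/(WH)_{i\mu}}{\sum_{k} W_{ka}},
\qquad
W_{ia} \leftarrow W_{ia}\,
  \frac{\sum_{\mu} H_{a\mu}\,V_{i\mu}/(WH)_{i\mu}}{\sum_{\nu} H_{a\nu}}
```

In both cases each entry is multiplied by a non-negative ratio, so factors that start non-negative remain non-negative without any explicit projection step.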