Analysis of Data Mining Based KNN Classification on Twitter
Every day, millions of tweets are published on Twitter, reflecting the opinions and beliefs of their authors. It is therefore important to analyze these tweets and to identify and classify the trends of different users. Such analysis may provide valuable information and help monitor what is displayed on social sites so that unusual events can be addressed. One study introduces a new idea for detecting user affiliation with anomalous groups by analyzing short text extracted from various Twitter accounts. Its results are reported as very promising, and the stated next step is to apply the technique to a larger number of accounts and to investigate the effectiveness and robustness of the proposed approach [1]. In another study, an N-gram based statistical approach is used to identify significant terms, which are then used for vector-space modelling of the tweets.
Thereafter, a social graph generation method is proposed that treats tweets as nodes and the degree of similarity between a pair of tweets as a weighted edge between them. The social graph is decomposed into clusters using the Markov Clustering technique, with each cluster corresponding to a particular event. The paper thus presents a Twitter data mining technique for event classification and analysis. LDA is used to identify significant key terms for tweet representation using the vector-space model. The social network generation method models tweets as a weighted graph in which the weight of an edge represents the topical similarity of the two tweets. Finally, Markov clustering is used to crystallize the social network into clusters, each one representing a particular event. Since the Twitter API provides various structural features, developing a hybrid approach that analyzes Twitter data using both structural and content-related features could be a promising area of future research [2]. Another paper analyses some of the approaches used to gather information and knowledge from Twitter for Twitter mining.
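The pipeline described for [2] (vector-space representation, weighted tweet graph, Markov clustering) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method of the cited paper: it uses plain word n-gram counts rather than LDA, cosine similarity as the edge weight, and a similarity threshold, inflation value, and iteration count chosen arbitrarily. All function names are hypothetical.

```python
import math
from collections import Counter


def ngram_vector(text, n=2):
    """Count word n-grams of size 1..n in a lower-cased tweet."""
    tokens = text.lower().split()
    grams = Counter()
    for size in range(1, n + 1):
        for i in range(len(tokens) - size + 1):
            grams[" ".join(tokens[i:i + size])] += 1
    return grams


def cosine(a, b):
    """Cosine similarity of two sparse count vectors (assumed edge weight)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_graph(tweets, threshold=0.1):
    """Adjacency matrix: tweets are nodes, similarity is the edge weight."""
    vectors = [ngram_vector(t) for t in tweets]
    n = len(vectors)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            weight = cosine(vectors[i], vectors[j])
            if weight >= threshold:  # threshold is an assumption
                adj[i][j] = adj[j][i] = weight
    return adj


def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[r][c] for r in range(n)) for c in range(n)]
    return [[m[r][c] / sums[c] if sums[c] else 0.0 for c in range(n)]
            for r in range(n)]


def markov_cluster(adj, inflation=2.0, iterations=30, eps=1e-6):
    """Basic Markov clustering: alternate expansion and inflation."""
    n = len(adj)
    # Add self-loops so that every column has non-zero mass.
    m = [[adj[r][c] + (1.0 if r == c else 0.0) for c in range(n)]
         for r in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        # Expansion: square the matrix (two-step random walks).
        m = [[sum(m[r][k] * m[k][c] for k in range(n)) for c in range(n)]
             for r in range(n)]
        # Inflation: raise entries to a power, then renormalize columns.
        m = normalize_columns([[v ** inflation for v in row] for row in m])
    # Each non-empty row lists the members of one cluster.
    clusters = set()
    for row in m:
        members = frozenset(c for c, v in enumerate(row) if v > eps)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


if __name__ == "__main__":
    # Made-up example tweets, for illustration only.
    sample = [
        "earthquake hits city center",
        "strong earthquake hits city",
        "team wins cricket final",
        "cricket final team wins trophy",
    ]
    print(markov_cluster(similarity_graph(sample)))
```

In this sketch each resulting cluster is a list of tweet indices, which corresponds to the idea that one cluster represents one event. The inflation parameter controls granularity: larger values tend to produce more, smaller clusters.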
In addition, this paper reviews a number of applications that employ Twitter mining, investigating Twitter information for prediction, for discovery, and as an informational basis of causation [3]. Sentiment analysis on Twitter allows users to understand the opinions expressed in tweets and to classify them into positive or negative categories. Organizations can use sentiment analysis to get an idea of customer reviews of their products and subsequently try to improve their services based on those reviews [4]. A further survey explores how online social networks (OSNs), such as the microblogging tool Twitter, can help in detecting the spread of epidemics. It highlights significant challenges in the field of Natural Language Processing (NLP) when using microblog-based Early Disease Detection Systems. For instance, microblogging data is an unstructured collection of short messages (140 characters in Twitter), with noise and non-standard use of the English language. Hence, research is currently exploring the field of linguistics in order to determine the semantics of the text, and is using data mining techniques to extract useful information for disease spread detection. Furthermore, the survey discusses applications and existing early disease detection systems based on OSNs and outlines directions for future research on improving such systems through a combination of linguistic methods, data mining techniques, and recommendation systems [5]. Another study presents a paradigm for extracting sentiment from the well-known microblogging service Twitter.
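The noise and non-standard language that the survey [5] identifies as an NLP challenge is often reduced by a normalization pass before any mining step. The following is a minimal sketch of such a pass, assuming that URLs, user mentions, hashtag markers, and elongated words are the noise of interest. The specific rules and the name `normalize_tweet` are illustrative assumptions and are not taken from the survey.

```python
import re

URL = re.compile(r"https?://\S+")      # links carry little lexical content
MENTION = re.compile(r"@\w+")          # user handles such as @someone
REPEAT = re.compile(r"(.)\1{2,}")      # a character repeated 3+ times


def normalize_tweet(text):
    """Return a lower-cased tweet with common microblog noise removed."""
    text = URL.sub(" ", text)
    text = MENTION.sub(" ", text)
    text = text.replace("#", " ")       # keep the hashtag word, drop the marker
    text = REPEAT.sub(r"\1\1", text)    # "soooo" -> "soo"
    return " ".join(text.lower().split())
```

Collapsing repeated characters to two rather than one is a deliberate compromise in this sketch: it shortens elongated words while leaving legitimate double letters untouched.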
Twitter is one of the most popular portals, where people post their opinions and views on everything. In this paper, data mining techniques are used to automatically classify the sentiments of tweets taken from a Twitter dataset. The open-source CPython interpreter is used to find frequent data items in tweets tagged ‘#ipl’ and ‘#ipl2016’. Different algorithms are used to assign a sentiment (positive, neutral, or negative) to each tweet. Because the volume of complex tweets is more than normal data adapters can handle, a complex event processing engine is required to process the data. This research focuses on applying sentiment analysis to Twitter and comparing the performance of different classification algorithms on this problem [6]. The use of microblogging for eWOM (electronic word-of-mouth) branding has also been examined. By examining several datasets from a variety of angles, that research sheds light on critical aspects of this phenomenon. One implication is that microblogging is a potentially rich avenue for companies to explore as part of their overall branding strategy. Customer brand perceptions and purchasing decisions appear increasingly influenced by web communications and social networking services, as consumers increasingly use these communication technologies as trusted sources of information, insights, and opinions. This trend offers new opportunities to build brand relationships with potential customers and eWOM communication platforms. It is apparent that microblogging services such as Twitter could become key applications in the attention economy. Given the ease of monitoring any brand’s sentiment, one can view microblogging as a competitive intelligence source.
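The frequent-item step described for [6] (finding frequent terms in tweets tagged ‘#ipl’ and ‘#ipl2016’) could look roughly like the sketch below. It assumes simple regular-expression tokenization and a small hand-picked stopword list. The function name `frequent_items` and its parameters are hypothetical and do not come from the cited paper.

```python
import re
from collections import Counter

# A token is a word, optionally prefixed by '#' or '@'.
TOKEN = re.compile(r"[#@]?\w+")

# Tiny illustrative stopword list (an assumption, not from the source).
STOPWORDS = frozenset({"a", "an", "the", "is", "in", "to", "and", "of"})


def frequent_items(tweets, hashtags, top_n=5):
    """Count the most frequent terms in tweets carrying any given hashtag."""
    wanted = {h.lower() for h in hashtags}
    counts = Counter()
    for tweet in tweets:
        tokens = TOKEN.findall(tweet.lower())
        if wanted.isdisjoint(tokens):
            continue  # tweet has none of the requested hashtags
        counts.update(t for t in tokens
                      if t not in wanted and t not in STOPWORDS)
    return counts.most_common(top_n)
```

A call such as `frequent_items(tweets, ["#ipl", "#ipl2016"])` returns (term, count) pairs. Because hashtags are matched as whole tokens, a tweet tagged only ‘#ipl2016’ is not mistaken for one tagged ‘#ipl’, although it is still counted here because both tags are requested.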
The essence of eWOM communication and customer relationship management is knowing what customers and potential customers are saying about the brand. Microblogging provides a venue for seeing what customers really feel about the brand and its competitors in real time. Additionally, microblogging sites provide a platform to connect directly with customers, again in real time, which can build and enhance customer relationships [7]. Another survey reviews text classification techniques that have been used to overcome the challenges posed by the short length of messages such as those found on social media sites like Twitter. It also presents techniques that have not yet been used to classify short text in the context of social media. The second challenge in classifying social media text is that the data is streamed, so the survey also reviews stream data classification techniques that have not yet been used for classifying social media text. As future work, the authors plan to implement the suggested solutions to overcome the challenges of short text and stream data simultaneously, and to evaluate the performance of these techniques. In addition, they want to extend the research to cover all kinds of streamed text, such as news feeds, instead of focusing only on social media [8]. The focus of another work is designing and building a highly accurate classifier of Arabic Twitter users.
The proposed user classifier can help social scientists, teachers, companies, and governments to classify users of social networks or to learn by experiment. A supervised, text-based approach that uses profile properties to classify users is presented. It is applicable to Twitter but may also be useful for other social networks [9]. The area of opinion mining and sentiment analysis has grown rapidly. It aims to explore the opinions and text present on different social media platforms through machine-learning techniques, using sentiment analysis, subjectivity analysis, or polarity calculations. Despite the use of various machine-learning techniques and tools for sentiment analysis during elections, there is a dire need for a state-of-the-art approach. Moreover, this paper provides a comparison of sentiment analysis techniques for analyzing political views, applying supervised machine-learning algorithms such as Naive Bayes and support vector machines [10].
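To make the supervised sentiment classification mentioned for [10] more concrete, the following is a minimal sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing. It assumes whitespace tokenization and a bag-of-words representation. The class name `NaiveBayesSentiment` is hypothetical, the implementation is not the one used in the cited paper, and any training sentences supplied to it in examples are made up for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSentiment:
    """Multinomial Naive Bayes over whitespace tokens (illustrative only)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # word counts per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values()
                      for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            # Add-one smoothing: every vocabulary word gets one extra count.
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in text.lower().split():
                if word in self.vocab:             # ignore unseen words
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

A typical use is `NaiveBayesSentiment().fit(train_texts, train_labels).predict(new_tweet)`. Working in log space avoids numerical underflow when many word probabilities are multiplied, and smoothing prevents a single unseen word from zeroing out a class.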