The Development of a Hindi-Speaking Personal Assistant

A smart personal assistant is a mobile software system that performs tasks or services on behalf of an individual user, based on a combination of user input, location awareness, and access to information from a variety of online sources. We propose a system that uses Hindi as the assistant's language, and we plan to extend its scope to other Indian languages. The system uses speech recognition and a machine learning model to perform the action that the user requests in Hindi.

Smart personal assistants have been hugely successful and are well accepted by people across the globe, and their advantages are felt in all walks of life. However, language proves to be a significant barrier to their adoption, since most assistants are language- and region-specific: Siri for iOS, Cortana for Windows, and Google Now for Android all support only English.


A virtual assistant is a software agent that can perform tasks or services for an individual. The popularity of virtual assistants has grown considerably, as integrating them into new systems has become a trend. Several efforts have been made towards creating personal assistant software (PAS) to help people in their daily activities, at home or at work. Google Now for Android-based smartphones, Apple’s Siri for iOS, Cortana for Windows, and Amazon’s Alexa are some examples. These systems perform tasks by interpreting the user’s voice commands given in English. The tasks include calling a contact, messaging, opening an installed application, setting an alarm, searching the web, and many more. Support for local languages in these systems is limited to web search. They cannot interpret commands in languages other than English, which creates a huge gap between people who are comfortable with English and those who are not. Hence there is a need for assistants that support local languages.

This project aims to develop a personal assistant for Android-based smartphones that supports Hindi, the most widely spoken language in India. Through this project we propose a model that can be extended to support many prominent local languages across the globe. Eliminating the language barrier in the use of virtual assistants is the prime motivation for the project.

Proposed Solution

The application takes Hindi speech from the user as input, for example 'Umang ko call karo' (call Umang), 'Aditi ko message karo' (message Aditi), 'kripya camera kholde' (please open the camera), and many similar sentences. The aim is to identify the object, action, and additional parameters in the input so that the appropriate functionality can be performed. The input is an audio signal, which the application passes to the speech recognition service of the android.speech package developed by Google (accessed, for example, through its SpeechRecognizer class). The speech-to-text output returned by the recognizer is shown to the user in a text view. The text is then passed to a remote server, where a classifier analyses and processes it. Almost all known classification techniques, such as decision trees, decision rules, Bayes methods, nearest-neighbor classifiers, SVM classifiers, and neural networks, have been extended to text data.

Recently, a considerable amount of emphasis has been placed on linear classifiers such as neural networks and SVM classifiers, with the latter being particularly suited to the characteristics of text data [2]. Hence, we propose to use an SVM classifier for the text classification required on the server side of the system. The SVM model is pre-trained on sample inputs and parameters and is then used to classify each new input. It takes the speech-to-text output from the application and assigns it to one of the action classes, such as call or message, that corresponds to the user’s command.
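The classification step described above can be illustrated with a minimal sketch, assuming a bag-of-words representation and a one-vs-rest linear SVM trained by sub-gradient descent on the L2-regularized hinge loss. The training sentences, the three action labels, the hyperparameters, and all function names are illustrative assumptions and are not taken from the proposed system; a real deployment would more likely rely on an established SVM library and a much larger corpus.

```python
from collections import defaultdict

# Toy training set of transliterated Hindi commands (illustrative assumption).
TRAINING_DATA = [
    ("umang ko call karo", "call"),
    ("papa ko call karo", "call"),
    ("didi ko call karo", "call"),
    ("aditi ko message karo", "message"),
    ("mummy ko message karo", "message"),
    ("bhaiya ko message karo", "message"),
    ("kripya camera kholde", "open"),
    ("kripya gallery kholde", "open"),
    ("kripya music kholde", "open"),
]


def tokenize(text):
    """Binary bag-of-words: the set of lower-cased tokens in the sentence."""
    return set(text.lower().split())


def train_binary_svm(samples, epochs=200, lr=0.01, lam=0.01):
    """Linear SVM for one class; samples are (token_set, +1.0 or -1.0) pairs."""
    weights = defaultdict(float)
    bias = 0.0
    for _ in range(epochs):
        for features, target in samples:
            margin = target * (sum(weights[t] for t in features) + bias)
            # Gradient step on the L2 penalty: shrink every weight slightly.
            for token in list(weights):
                weights[token] *= 1.0 - lr * lam
            # Sub-gradient step on the hinge loss, only if the margin is violated.
            if margin < 1.0:
                for token in features:
                    weights[token] += lr * target
                bias += lr * target
    return dict(weights), bias


def train_one_vs_rest(data):
    """Train one binary SVM per action label."""
    labels = sorted({label for _, label in data})
    model = {}
    for label in labels:
        samples = [(tokenize(text), 1.0 if y == label else -1.0)
                   for text, y in data]
        model[label] = train_binary_svm(samples)
    return model


def scores(model, text):
    """Decision value of every per-label SVM for one sentence."""
    features = tokenize(text)
    return {label: sum(w.get(t, 0.0) for t in features) + b
            for label, (w, b) in model.items()}


def predict(model, text):
    """Pick the action whose SVM gives the highest decision value."""
    s = scores(model, text)
    return max(s, key=s.get)


MODEL = train_one_vs_rest(TRAINING_DATA)
for sentence in ("rahul ko call karo", "kripya calculator kholde"):
    print(sentence, "->", predict(MODEL, sentence))
```

Tokens that never occurred in training, such as an unfamiliar contact name, simply contribute nothing to the score, so the decision rests on the words the classifier has already seen.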

As soon as the classifier identifies the action for the user’s command, the Dialogue Manager for that class extracts the parameters and objects required to fulfil the action. The action and its associated parameters are passed back to the application so that it can perform the user’s task. Depending on the action to be triggered, the Android application issues the appropriate intents, carrying the corresponding data, to the Android system.
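The parameter extraction described above might look like the following sketch, which assumes simple rule-based slot filling over transliterated Hindi. The word lists, the function names, and the shape of the reply dictionary are hypothetical. The intent strings are the values of Android's Intent.ACTION_CALL, Intent.ACTION_SENDTO, and Intent.ACTION_MAIN constants, chosen here as one plausible mapping rather than the one the system necessarily uses.

```python
# Hypothetical word lists; a real system would need far broader coverage.
POLITE_WORDS = {"kripya"}
OPEN_VERBS = {"kholo", "kholde", "kholiye"}
RECIPIENT_MARKER = "ko"  # postposition that follows the contact name

# One plausible mapping from action class to Android intent action (assumption).
INTENT_ACTIONS = {
    "call": "android.intent.action.CALL",
    "message": "android.intent.action.SENDTO",
    "open": "android.intent.action.MAIN",
}


def extract_parameters(action, text):
    """Pull out the contact or application name for an already classified command."""
    tokens = text.split()
    lowered = [t.lower() for t in tokens]
    if action in ("call", "message"):
        if RECIPIENT_MARKER not in lowered:
            return {}
        idx = lowered.index(RECIPIENT_MARKER)
        # Everything before "ko", minus politeness words, is taken as the name.
        name = [t for t, low in zip(tokens[:idx], lowered[:idx])
                if low not in POLITE_WORDS]
        return {"contact": " ".join(name)} if name else {}
    if action == "open":
        # Whatever is left after dropping politeness words and the verb.
        app = [t for t, low in zip(tokens, lowered)
               if low not in POLITE_WORDS and low not in OPEN_VERBS]
        return {"app": " ".join(app)} if app else {}
    return {}


def build_reply(action, text):
    """Reply the server could send back to the Android application."""
    return {
        "action": action,
        "intent": INTENT_ACTIONS.get(action),
        "parameters": extract_parameters(action, text),
    }


print(build_reply("call", "Umang ko call karo"))
print(build_reply("open", "kripya camera kholde"))
```

On the device, the application would still have to resolve the contact name to a phone number, or the application name to an installed package, before issuing the intent.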


This paper discusses the need for an Android-based personal assistant that supports regional languages. We propose a system that uses machine learning techniques for text categorization and performs basic functions for an Android user, such as calling, messaging, and opening applications. The system will also generate appropriate responses to the input query, making it interactive. It uses a client-server architecture to reduce the processing overhead on phones, which have limited resources, making the assistant more responsive and efficient.

