Handwritten Gujarati Script Recognition With Image Processing And Deep Learning Network
This paper presents a proposed application for detecting and recognizing handwritten Gujarati script using image processing and machine learning techniques, and highlights the key technologies involved. Handwriting varies widely from person to person, and Gujarati characters contain many curves, both of which pose a challenge for recognition. The paper covers all the important phases of the character detection and recognition process, namely image acquisition, preprocessing, segmentation, classification and recognition, and post-processing. It also addresses key aspects such as designing a neural network suited to handwritten Gujarati character recognition, training and testing that model, and fine-tuning its hyperparameters to achieve the best accuracy. The paper deals with the special properties of Gujarati script and can be used by researchers and technology enthusiasts to develop Gujarati script recognition systems.
Keywords—clustering; deep neural networks; activation function; grayscale image; histogram; Otsu thresholding
Classification is the task of assigning each object in a set to its known category. Detecting and recognizing the characters of a given script is therefore a classification problem. Offline recognition of handwritten characters is one of the major open issues in pattern recognition. Character detection and recognition systems have improved considerably with advances in machine learning and deep neural networks. Optical Character Recognition (OCR) is now built into many smartphone cameras and is used in sectors such as education, government, research, navigation, postal processing, script recognition, security and authentication.
The idea of having a machine recognize characters can be traced back to the 19th century. By 1914, Emanuel Goldberg had developed a machine that read characters and converted them into standard telegraph code, an early form of OCR. With the development of artificial intelligence and deep learning, research in character recognition gained momentum, and these technologies came to be seen as a way to achieve better results. In earlier times, before such technology existed, scripts had to be written by hand on paper or leaves with ink. Written material of this kind is always at risk: it may be torn, it is highly susceptible to weather conditions, and it may simply be lost. In this era of globalization, people also tend to forget their cultural roots. Digitizing ancient Gujarati scripts is one of the best ways to preserve this culture, so that coming generations can read the scripts and learn from the moral lessons they contain. This paper shares a methodology for automating the digitization of such handwritten scripts using image processing and deep learning.
THE PROPOSED RECOGNITION SYSTEM
The proposed system takes a handwritten Gujarati script as input in the form of an image. The image may be in a common format such as .jpg, .jpeg or .png. In the next phase, preprocessing, the image is converted to grayscale, so that each pixel intensity lies in the range 0 to 255, where a black pixel has an intensity value of 0 and a white pixel has an intensity value of 255. Thresholding is then performed, which helps to remove noise from the input image. The next significant phase is segmentation, which detects the individual characters in the script. A clustering algorithm is used for this purpose, such that each character forms a cluster of its own. After all the image preprocessing operations have been performed, the segmented characters are recognized and classified by a deep neural network model, and the results of the model are then analyzed.
[Figure: Flow diagram of the proposed handwritten character recognition system.]
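The grayscale conversion step can be sketched as follows. This is a minimal illustration, not the paper's implementation: the luminance weights (0.299, 0.587, 0.114) are the commonly used ITU-R BT.601 coefficients and are an assumption here, since the paper does not state which conversion it uses, and the function name to_grayscale is hypothetical.

```python
def to_grayscale(rgb_image):
    """Convert a nested list of (R, G, B) tuples into 0-255 grayscale intensities."""
    gray = []
    for row in rgb_image:
        # Weighted sum of the colour channels (assumed BT.601 luminance weights).
        gray.append([int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row])
    return gray


# Tiny 2x2 example: black, white, pure red and pure blue pixels.
rgb = [[(0, 0, 0), (255, 255, 255)],
       [(255, 0, 0), (0, 0, 255)]]
gray = to_grayscale(rgb)
```

Black maps to 0 and white to 255, which matches the intensity range described above; coloured pixels fall in between.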
METHODOLOGY TO IMPLEMENT THE PROPOSED SYSTEM
Thresholding algorithms play a vital role in image segmentation. Image thresholding is generally used to separate the foreground objects in an image from the background. It retains the part of the image that carries useful information and discards the rest, and it also helps to remove noise. The thresholding algorithms discussed here operate on grayscale images rather than RGB images, so an RGB input image must first be converted to grayscale before a thresholding algorithm is applied. Fig. 2 displays a 6x6 grayscale image and its corresponding grayscale histogram. The pixel intensity of a grayscale image lies between 0 and 255. The threshold is selected from the histogram of the grayscale image: the pixel value that best separates the background from the foreground is chosen as the threshold value. Binary thresholding, adaptive thresholding and Otsu thresholding are some of the commonly used algorithms, and their results depend on the threshold values they use.
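To make the histogram idea concrete, here is a small sketch that counts how often each intensity occurs in a 6x6 grayscale image. The pixel values are made up for illustration and are not the image shown in the paper's figure; the function name grayscale_histogram is hypothetical.

```python
def grayscale_histogram(image, levels=256):
    """Count how many pixels take each intensity value from 0 to levels - 1."""
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1
    return hist


# Illustrative 6x6 image: a bright ring (200) on a dark background (20).
image = [
    [20, 20, 20, 20, 20, 20],
    [20, 200, 200, 200, 200, 20],
    [20, 200, 20, 20, 200, 20],
    [20, 200, 20, 20, 200, 20],
    [20, 200, 200, 200, 200, 20],
    [20, 20, 20, 20, 20, 20],
]
hist = grayscale_histogram(image)
```

The histogram of this image has two peaks, one at 20 and one at 200, and any threshold between them separates the ring from the background.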
Binary thresholding works in an all-or-nothing way. If the intensity value of a pixel is greater than or equal to the threshold value, the intensity of that pixel is set to 255; if it is less than the threshold value, the intensity is set to 0. Every pixel is therefore treated as either black or white. Here the background pixels become black and the foreground pixels become white. In adaptive thresholding, by contrast, a separate threshold is computed for each pixel from the central tendency (for example the mean) of the intensities in its neighborhood, and the class of the pixel (foreground or background) is determined against that local threshold.
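The two rules can be sketched in a few lines. This is a simplified illustration under stated assumptions: the adaptive version uses the mean of a square neighborhood clipped at the image border, with an optional offset c, and the function names, the radius parameter and the offset are all hypothetical choices rather than details given in the paper.

```python
def binary_threshold(image, threshold):
    """Set pixels at or above the threshold to 255 and all others to 0."""
    return [[255 if v >= threshold else 0 for v in row] for row in image]


def adaptive_mean_threshold(image, radius=1, c=0):
    """Compare each pixel with the mean of its neighbourhood minus an offset c."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        out_row = []
        for x in range(width):
            # Collect the neighbourhood, clipped at the image border.
            neighbours = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            local_threshold = sum(neighbours) / len(neighbours) - c
            out_row.append(255 if image[y][x] >= local_threshold else 0)
        result.append(out_row)
    return result


global_result = binary_threshold([[10, 128], [127, 255]], 128)
# A single brighter pixel surrounded by darker ones stands out locally.
local_result = adaptive_mean_threshold([[0, 0, 0], [0, 90, 0], [0, 0, 0]])
```

Note that the center pixel (90) would be classed as background by a global threshold of 128, but the adaptive rule marks it as foreground because it is brighter than its surroundings.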
In Otsu's thresholding algorithm, every possible intensity value is evaluated as a candidate threshold. Each candidate splits the pixels into two classes, background and foreground, and the variance of the intensities is measured for that split. The ideal threshold is the one that segregates the pixels into two classes such that the inter-class variance is maximum and the intra-class variance is minimum. Each pixel is then assigned to the background or the foreground according to which side of this threshold its intensity falls on.
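A compact sketch of this search is given below. It maximizes the between-class variance computed from the histogram, which is equivalent to minimizing the within-class variance. The function name otsu_threshold is hypothetical, and the convention that the returned value is the highest intensity assigned to the darker class is an assumption of this sketch.

```python
def otsu_threshold(image, levels=256):
    """Return the intensity t that maximizes the between-class variance.

    Pixels with value <= t form the darker class, pixels with value > t the brighter one.
    """
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_variance = 0, -1.0
    weight_dark, sum_dark = 0, 0
    for t in range(levels):
        weight_dark += hist[t]
        sum_dark += t * hist[t]
        if weight_dark == 0:
            continue  # no pixels in the darker class yet
        weight_bright = total - weight_dark
        if weight_bright == 0:
            break  # every pixel is already in the darker class
        mean_dark = sum_dark / weight_dark
        mean_bright = (sum_all - sum_dark) / weight_bright
        # Between-class variance, up to a constant factor of 1 / total**2.
        variance = weight_dark * weight_bright * (mean_dark - mean_bright) ** 2
        if variance > best_variance:
            best_variance, best_t = variance, t
    return best_t


image = [[20, 20, 200],
         [20, 200, 200]]
t = otsu_threshold(image)
binary = [[255 if v > t else 0 for v in row] for row in image]
```

For this two-valued image any threshold between 20 and 199 gives the same split, and the sketch returns the first of them.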
Segmentation subdivides an image into its constituent regions or objects and is an extremely important step in image processing. It makes it possible to extract or highlight the important and necessary parts of the text. Formally, image segmentation is the division of an image into regions or categories that correspond to different objects or parts of objects, with every pixel in the image allocated to one of these categories. After the Otsu thresholding algorithm has been applied to the handwritten Gujarati script images, the next important phase is to segment each image into individual characters. The segmentation follows a top-down approach: the lines are first segmented from the whole Gujarati script, the words are then segmented from each line, and finally the individual characters are detected within each word.
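The top-down idea can be illustrated with projection profiles, a simple stand-in technique and not the clustering algorithm the paper proposes. The sketch assumes a binarized image in which foreground pixels are non-zero, finds text lines as runs of rows that contain foreground pixels, and then finds candidate characters as runs of columns within each line. All function names are hypothetical.

```python
def find_runs(profile):
    """Return (start, end) pairs for consecutive non-zero entries; end is exclusive."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_lines(binary):
    """Find text lines as runs of rows containing at least one foreground pixel."""
    row_profile = [sum(1 for v in row if v) for row in binary]
    return find_runs(row_profile)


def segment_columns(binary, top, bottom):
    """Within rows top..bottom-1, find runs of columns containing foreground pixels."""
    width = len(binary[0])
    col_profile = [sum(1 for y in range(top, bottom) if binary[y][x]) for x in range(width)]
    return find_runs(col_profile)


# Illustrative binarized page: two text lines, each with two blobs.
page = [
    [0, 0, 0, 0, 0, 0, 0],
    [255, 255, 0, 255, 0, 0, 0],
    [255, 0, 0, 255, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 255, 255, 0, 0, 255, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
lines = segment_lines(page)
characters = [segment_columns(page, top, bottom) for top, bottom in lines]
```

Real handwriting has touching and overlapping strokes, which is where a gap-based split like this breaks down and a clustering or connected-component approach becomes more suitable.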
Recognition is the task of correctly identifying each character that was detected during segmentation.
For handwritten script recognition, a deep neural network is proposed. The motivation is that handwriting varies a great deal between individuals, and a self-learning, self-optimizing neural network can adapt to such a large amount of variation. A deep neural network generally consists of an input layer, one or more hidden layers and one output layer. Each connection from a neuron to the next layer carries a weight value, which signifies how much that neuron contributes to the recognition task. As the model is trained, these weight values are tuned so as to increase the accuracy of the model. The backpropagation technique is used to adjust the weights according to the loss of the model, while the accuracy is monitored to track progress.
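How training tunes the weights can be shown with a single gradient-descent step. This is a deliberately tiny sketch under stated assumptions: it uses one layer with a softmax output and a cross-entropy loss, for which the gradient with respect to each output is simply the predicted probability minus the target. Full backpropagation applies the chain rule to push this error back through the hidden layers as well. The loss function, the learning rate and all names are assumptions, since the paper does not specify them.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(weights, inputs):
    """One weighted sum per output class, followed by softmax."""
    return softmax([sum(w * x for w, x in zip(row, inputs)) for row in weights])


def cross_entropy(probabilities, target):
    """Loss is low when the target class receives a high probability."""
    return -math.log(probabilities[target])


def train_step(weights, inputs, target, learning_rate=0.1):
    """Move each weight a small step against the gradient of the loss (in place)."""
    probabilities = forward(weights, inputs)
    for k, row in enumerate(weights):
        error = probabilities[k] - (1.0 if k == target else 0.0)
        for j, x in enumerate(inputs):
            row[j] -= learning_rate * error * x
    return weights


# Three output classes, two input features, all weights starting at zero.
weights = [[0.0, 0.0] for _ in range(3)]
inputs, target = [1.0, 0.5], 0
loss_before = cross_entropy(forward(weights, inputs), target)
train_step(weights, inputs, target)
loss_after = cross_entropy(forward(weights, inputs), target)
```

After the step the loss on this example is lower than before, which is the behaviour repeated over many examples during training.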
The proposed neural network model has 784 neurons (nodes) in the input layer, since each segmented character in the script is transformed into a 28x28 pixel matrix. The output layer has 45 neurons, since the Gujarati character set considered here has 45 characters in total (34 consonants and 11 vowels). A segmented character image of any size is therefore first resized to a 28x28 pixel matrix and then flattened into a one-dimensional array of 784 pixels, which is fed to the input layer with one pixel per node. The hidden layers extract features from the training data fed to them. The initial hidden layers extract features such as the curves in each character and the orientation of its lines, and the later hidden layers extract further features from the training dataset. The proposed activation function for the output layer is softmax, because it converts the outputs into a probability for each of the 45 classes; the class with the highest probability is taken as the recognized character.
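The data flow described above can be sketched as a forward pass through an untrained network. Only the 784 inputs, the 45 outputs and the softmax output layer come from the text. The single hidden layer of 64 units, the ReLU activation, the nearest-neighbor resizing, the scaling of pixels to the range 0 to 1 and the random initial weights are all assumptions made for illustration, so the predicted class here is meaningless until the network is trained.

```python
import math
import random


def resize_nearest(image, size=28):
    """Resize a 2D list to size x size using nearest-neighbour sampling."""
    height, width = len(image), len(image[0])
    return [[image[y * height // size][x * width // size] for x in range(size)]
            for y in range(size)]


def flatten(image):
    """Flatten to one dimension and scale intensities from 0-255 to 0-1."""
    return [v / 255.0 for row in image for v in row]


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(n_out, n_in, rng):
    """Small random weights and zero biases (untrained, for illustration only)."""
    weights = [[rng.uniform(-0.05, 0.05) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
w1, b1 = random_layer(64, 784, rng)   # assumed hidden layer of 64 units
w2, b2 = random_layer(45, 64, rng)    # 45 output classes, as in the text

# Stand-in for a segmented character: a 60x40 image with a white rectangle.
character = [[255 if 10 <= x < 30 and 10 <= y < 50 else 0 for x in range(40)]
             for y in range(60)]
pixels = flatten(resize_nearest(character))
probabilities = softmax(dense(relu(dense(pixels, w1, b1)), w2, b2))
predicted_class = max(range(45), key=lambda k: probabilities[k])
```

The softmax output is a list of 45 probabilities that sum to 1, and the recognized character is the index with the largest value.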
Existing handwritten Gujarati script recognition systems have lower accuracy because of the complexity of handwritten Gujarati characters. These characters involve several curves and discontinuities, which make detection difficult. The proposed system aims to improve the segmentation step so that each entire consonant or vowel is clustered as a separate entity. With extensive training of the deep neural network model on the training dataset, a vowel or consonant can then be recognized more reliably. The advantage of this system is that, because every consonant or vowel is segmented as a separate cluster, the recognition model can give better accuracy. If successful, the proposed system would be a useful contribution to this field of research. It can be applied to preserving cultural literature as well as government documents handwritten in the Gujarati language.