Research on Emergence of Leukemia, Its Causes and Symptoms
Abstract
Leukemia is a cancer of the blood or bone marrow in which the number of white blood cells increases dramatically; these cells grow abnormally and lose their function. They interfere with other cells in the blood and disrupt their normal activity. According to current statistics, a person is diagnosed with cancer approximately every 3 minutes. The purpose of this project is to design a Convolutional Neural Network (CNN) system that diagnoses cancer cells automatically by scanning images of blood cells. The appearance of these defective cells is often unclear and misleading. Detecting them automatically would help deliver proper treatment at the appropriate time, and the system could be used in places that lack diagnostic expertise.
Introduction
Blood is a specialized body fluid with four main components: plasma, red blood cells, white blood cells, and platelets. It has many functions, which include transporting oxygen and nutrients to the lungs and tissues, carrying cells and antibodies that fight infection, forming blood clots to prevent excess blood loss, and regulating body temperature. White blood cells account for about 1% of the blood and are responsible for protecting the body from infection. Leukemia is a cancer of the blood or bone marrow. Bone marrow produces blood cells, and leukemia can develop when there is a problem with blood cell production. It usually affects the leukocytes, or white blood cells. Leukemia is either acute or chronic: the former progresses rapidly and quickly harms the body, whereas the latter worsens over a longer period. Leukemia is further divided into two types according to the cells it affects. Lymphocytic leukemia (also known as lymphoid or lymphoblastic leukemia) develops in the white blood cells called lymphocytes in the bone marrow. Myeloid (also known as myelogenous) leukemia may start in white blood cells other than lymphocytes, as well as in red blood cells and platelets.
Currently, a person is diagnosed with leukemia every 3 minutes. Before treatment begins, it is very important to determine the type and stage of the cancer. For types such as acute leukemia, a person who is not treated at the appropriate time may die within a few months. Because of a lack of expertise, it is becoming difficult to detect cancer cells manually; they can go undetected, resulting in delayed treatment and death. To overcome this issue, we want to design a system that detects cancerous cells at an early stage so that adequate treatment can be provided at the appropriate time. The system is built on a Convolutional Neural Network that takes blood images as input and classifies them as cancerous or not. We believe this will be of great benefit to parts of the healthcare industry that lack diagnostic expertise. CNNs can be thought of as automatic feature extractors for images. An algorithm that operates on a flat vector of pixels loses much of the spatial interaction between pixels, whereas a CNN uses information from adjacent pixels: it first down-samples the image through convolution and pooling and then applies a prediction layer at the end.
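As a rough illustration of how a convolution uses neighbouring pixels and how pooling then down-samples the result, here is a minimal pure-Python sketch. The 6x6 image, the edge-detecting kernel, and the function names are hypothetical and chosen only for the example; they are not taken from the project.

```python
def convolve2d(image, kernel):
    """Unpadded 2D cross-correlation, the operation CNN layers call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2x2(feature_map):
    """Down-sample by keeping the maximum of each non-overlapping 2x2 block."""
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, len(feature_map[0]) - 1, 2)
        ]
        for r in range(0, len(feature_map) - 1, 2)
    ]


# Hypothetical 6x6 grayscale patch: dark left half, bright right half.
image = [[0, 0, 0, 9, 9, 9] for _ in range(6)]

# Hypothetical 3x3 kernel that responds where brightness rises from left to right.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

features = convolve2d(image, kernel)  # 4x4 feature map, non-zero only near the edge
pooled = max_pool2x2(features)        # 2x2 map after down-sampling

print(features[0])
print(pooled)
```

Each output value depends on a 3x3 neighbourhood of the input, which is the spatial information a flat pixel vector discards.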
Background
- An automatic method for detecting white blood cells (WBCs) in peripheral blood images has been proposed. It first detects WBCs in microscope images using a simple relation between the R and B color channels together with morphological operations. SVMs are then applied to classify eosinophils and basophils, and a CNN is later used to extract high-level features from the WBCs automatically.
- A fully automatic system has been presented that recognizes 17 classes of myelogenous leukemia from images of bone marrow aspirate. Cells are segmented using the watershed algorithm combined with region-growing and edge-detection techniques. A total of 117 descriptive features were generated and selected using a linear SVM.
- Another study addresses the recognition of acute lymphoblastic leukemia (ALL) subtypes under the WHO classification. The authors implemented a CNN classifier to explore the feasibility of a deep learning approach for identifying lymphocytes and ALL subtypes, and benchmarked it against the dominant approach, support vector machines. Two further traditional machine learning classifiers, a multilayer perceptron (MLP) and a random forest, were also applied for comparison.
Approach
The primary function of this project is to classify blood images as showing normal or cancerous cells. For this purpose, we implemented a Convolutional Neural Network, which performs both feature selection and classification. A Convolutional Neural Network is a class of deep learning model that is primarily used to analyze visual data and is designed specifically to process pixel data. The network consists of 12 layers, including the input layer. In the input layer, each color channel is processed separately. The convolutional layers run from the second layer to the seventh layer, and each uses a 3x3 filter (kernel). Without padding, a convolution shrinks the image and under-uses the pixels at its borders and corners. To preserve the corners, a layer of padding is introduced, that is, zeros are added around the input of the layer.
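The effect of zero padding on a 3x3 convolution can be sketched with the standard output-size formula. The helper names below are hypothetical, and the stride of 1 and the one-pixel padding width are assumptions; the text does not state them.

```python
def conv_output_size(n, kernel=3, padding=0, stride=1):
    """Spatial size after a convolution: floor((n + 2*padding - kernel) / stride) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


def zero_pad(image, pad=1):
    """Surround a 2D list with `pad` rows and columns of zeros."""
    width = len(image[0]) + 2 * pad
    top = [[0] * width for _ in range(pad)]
    middle = [[0] * pad + list(row) + [0] * pad for row in image]
    bottom = [[0] * width for _ in range(pad)]
    return top + middle + bottom


# Six stacked 3x3 convolutions on a 150-pixel side, without and with padding.
unpadded = padded = 150
for _ in range(6):
    unpadded = conv_output_size(unpadded)            # loses 2 pixels per layer
    padded = conv_output_size(padded, padding=1)     # size is preserved

print(unpadded, padded)
print(zero_pad([[1, 2], [3, 4]]))
```

With one pixel of zero padding, every original pixel, including those in the corners, sits at the center of some 3x3 window.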
The input shape is 150x150x3. We apply 16 3x3 kernels in the first two convolutional layers, 32 3x3 filters in the next two, and 64 3x3 filters in the last two. The ReLU activation function is used throughout the convolutional layers. To reduce the image size, a 2x2 filter is applied in the pooling layer, after which the network has extracted 64 features, each represented as a 32x32 array. Between the convolutional layers and the fully connected layer (the eighth layer) there is a 'Flatten' layer. Flattening transforms the two-dimensional feature matrices into a vector that can be fed into a fully connected neural network classifier. The output of this layer is a single vector of size 4800. The next layer is a fully connected layer that maps its input to 64 output units. A dropout layer was introduced to reduce overfitting; during training it randomly drops 50% of the inputs to the eleventh layer (a fully connected layer). The final layer uses the sigmoid activation function to map its input to the 2 class labels (Normal or Cancer).
The dataset was divided into three sets: training, validation, and test. First, the convolutional neural network was trained on the training set to obtain weights that minimize the error. This process was repeated for 10 epochs. Next, the validation error and the cross-entropy error were obtained on the validation set. Finally, the performance of the model was measured on the test set.
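To make the flow of tensor shapes through such a network concrete, the sketch below traces them layer by layer in plain Python. The filter counts, the 150x150x3 input, the 64-unit dense layer, and the 50% dropout come from the text. The positions of the pooling layers (one 2x2 step after each pair of convolutions) and the single sigmoid output unit are assumptions, since the text does not specify them, so the flattened length this layout produces need not match the figure quoted above. `trace_shapes` is a hypothetical helper, not part of any library.

```python
from math import prod


def trace_shapes(input_shape, layers):
    """Return (layer kind, output shape) pairs for a simple sequential CNN."""
    shape = tuple(input_shape)
    trace = [("input", shape)]
    for kind, arg in layers:
        if kind == "conv":
            # 3x3 kernel, stride 1, zero padding: height and width are kept,
            # the channel count becomes the number of filters.
            shape = (shape[0], shape[1], arg)
        elif kind == "pool":
            # Non-overlapping arg x arg pooling; leftover rows/columns are dropped.
            shape = (shape[0] // arg, shape[1] // arg, shape[2])
        elif kind == "flatten":
            shape = (prod(shape),)
        elif kind == "dense":
            shape = (arg,)
        elif kind == "dropout":
            pass  # dropout zeroes activations during training; the shape is unchanged
        else:
            raise ValueError(f"unknown layer kind: {kind}")
        trace.append((kind, shape))
    return trace


# Filter counts and layer types follow the text; the pooling positions and the
# single sigmoid output unit are assumptions.
layers = [
    ("conv", 16), ("conv", 16), ("pool", 2),
    ("conv", 32), ("conv", 32), ("pool", 2),
    ("conv", 64), ("conv", 64), ("pool", 2),
    ("flatten", None),
    ("dense", 64),
    ("dropout", 0.5),
    ("dense", 1),
]

for kind, shape in trace_shapes((150, 150, 3), layers):
    print(f"{kind:8s} {shape}")
```

Editing the `layers` list, for example by moving or removing a pooling step, shows how strongly the size of the flattened vector depends on the pooling layout.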
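The cross-entropy error used during training and validation can be illustrated with the standard binary form. This is a minimal sketch: the label encoding (1 for Cancer, 0 for Normal), the clipping constant `eps`, and the example probabilities are assumptions made for illustration, not values from the project.

```python
import math


def binary_cross_entropy(labels, probabilities, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    if len(labels) != len(probabilities):
        raise ValueError("labels and probabilities must have the same length")
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


# Hypothetical labels (1 = Cancer, 0 = Normal) and sigmoid outputs.
labels = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]   # mostly right and confident
uncertain = [0.5, 0.5, 0.5, 0.5]   # no information about the class

print(binary_cross_entropy(labels, confident))
print(binary_cross_entropy(labels, uncertain))
```

The loss is lower for the confident, mostly correct predictions than for the uninformative ones, which is the quantity training tries to reduce from one epoch to the next.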
Results
Dataset:
The dataset was obtained from Kaggle, where it is publicly available online. It was divided into two subsets: a training set of 4961 images, of which 2483 were from healthy persons and 2478 from cancer patients, and a test set of 620 images of each type (Normal and Cancer). The resolution of the images was 320x240.
Analysis/Performance Evaluation:
The primary goal of this project is to distinguish normal cells from defective cells using image recognition. The output was assessed by calculating the loss and accuracy of the model. A series of iterations was performed to minimize the loss and improve accuracy. We evaluated the results using a confusion matrix.
Predicted/Actual    Normal    Cancer
Normal              108       126
Cancer              11        379
Because the training set and the test set contain unequal numbers of images, accuracy was not the only metric considered when evaluating performance; precision and recall were also calculated. The model is designed above all to avoid identifying a person who has cancer as a non-cancer patient (a false negative). Our top priority was therefore to design a model that reduces this kind of error when it is made by experts such as doctors and lab technicians.
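The metrics mentioned above can be computed directly from the four counts of the confusion matrix. The sketch below assumes that the rows of the matrix are the actual classes, the columns are the predicted classes, and Cancer is the positive class; this reading is an assumption, although it is the one that reproduces the 78% accuracy and 97% recall reported in the Conclusion. The function name is hypothetical.

```python
def classification_metrics(tn, fp, fn, tp):
    """Accuracy, precision and recall from binary confusion-matrix counts."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,   # share of all images classified correctly
        "precision": tp / (tp + fp),     # share of predicted Cancer that is Cancer
        "recall": tp / (tp + fn),        # share of actual Cancer that is detected
    }


# Counts from the confusion matrix, assuming rows = actual, columns = predicted:
# actual Normal -> 108 predicted Normal (TN), 126 predicted Cancer (FP)
# actual Cancer ->  11 predicted Normal (FN), 379 predicted Cancer (TP)
metrics = classification_metrics(tn=108, fp=126, fn=11, tp=379)

for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Under this reading, precision comes out at roughly 75%: the model rarely misses a cancer case, at the cost of flagging a substantial number of normal samples as cancerous.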
Future Work:
- To extend the model to identify the different stages of cancer
- To implement to detect
The graph below shows the measured accuracy, which increases with each iteration. This indicates that the model is learning at each iteration and thereby improving its classification.
Conclusion
We illustrate a deep learning approach for distinguishing cancerous cells from normal cells as defined by the WHO classification. A convolutional network was implemented that takes raw images as input and outputs a classification. We obtained an accuracy of 78% and a recall of 97%. This result was obtained with a CNN rather than a traditional SVM approach, and it achieved better accuracy.