The Impact Of Accurate Inputs On Neural

Published Date: 02 Nov 2017

Artificial neural networks are widely used in medical diagnosis and, because of their accuracy and speed, are replacing most conventional diagnostic methods. This paper analyses how the accuracy of type II diabetes diagnosis by an artificial neural network varies with the accuracy of the inputs given to the network, and compares the efficiency of the network for two input formats. The data for this comparison were collected by interviewing patients who approached a diabetologist with various symptoms of the disease. These symptoms can be modeled in two forms. The first form specifies only the presence or absence of a symptom and can be represented using Boolean values. The second form specifies the severity or frequency of occurrence of the symptom. Both forms of input are given to the system and the accuracy of the output is analyzed. The result indicates the impact of input specification on the output. The comparison is made by performing regression analysis on both outputs, which gives the correlation between the output of the system and the target [1]. The study uses only the most general symptoms of the disease; further analysis can be done on other diabetes-specific symptoms.

Key Terms: Diabetes, Artificial Neural Networks, Feed forward neural networks, regression.

I. Introduction

Classification is the most important part of medical diagnosis, as it identifies the disease correctly and leads to proper treatment. Multilayer neural networks (MLNNs) have been used successfully to replace conventional pattern recognition methods in disease diagnosis systems [5]. Many studies have been undertaken to improve the efficiency of neural networks, which are widely used in diagnosis. Their results showed that for any system to produce the best output, the input given must be as accurate as possible. Different types of neural networks can be used, but feed forward neural networks are widely recognized as the most efficient networks in medical diagnosis. Efficient knowledge acquisition and representation is one of the central challenges for the successful construction and subsequent use of medical-expert and knowledge-based systems in clinical practice [4].

II. Related work

Diabetes Mellitus:

Diabetes mellitus is the most common endocrine disorder and is characterized by glucose underutilization and hyperglycemia. It is a major health problem in both industrial and developing countries, and its incidence is rising. Diabetes occurs when the body does not produce an adequate amount of insulin, the hormone responsible for glucose entering the body cells, or when the insulin produced is not used properly [3]. The latter state is called "insulin resistance" and is the cause of type II diabetes. Type II is considered to be the most commonly occurring type of diabetes [6]. Diabetes increases the risk of developing kidney disease, blindness, nerve damage and blood vessel damage, and it contributes to heart disease [7]. Correct and early diagnosis of diabetes is therefore of paramount importance. Proper representation and interpretation of data is important for medical classification and is the main concern of this paper.

Artificial Neural Networks:

Artificial neural networks are inspired by biological neurons. The network consists of multiple layers of neurons connected with each other. In general there is one input layer, one or more hidden layers and one output layer. Each neuron in the input layer is connected with every neuron in the hidden layer, and each neuron in the hidden layer is connected with every neuron in the output layer. The neurons process their inputs and produce an output based on the transfer function and the weights on the interconnections.

The three main parameters which determine the performance of the neural network are:

The way in which the neurons are connected with each other.

The learning methodology.

The transfer function which converts the neuron input to the processed output.

The most commonly used learning methodology in medical diagnosis is supervised learning. In supervised learning, the network is provided with both the inputs and the corresponding target outputs. The network adjusts its weights until the error is minimized and its output matches the target. Pattern recognition and regression are the most common applications of this kind of network.
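The weight adjustment described above can be illustrated with a minimal sketch. It assumes a single linear neuron trained by gradient descent on the squared error; the training pairs, learning rate, epoch count and function names are hypothetical and are not taken from the study.

```python
def train_linear_neuron(inputs, targets, rate=0.05, epochs=500):
    """Fit output = w * x + b by repeatedly reducing the squared error."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            error = (w * x + b) - t   # network output minus target
            w -= rate * error * x     # gradient of 0.5 * error**2 w.r.t. w
            b -= rate * error         # gradient of 0.5 * error**2 w.r.t. b
    return w, b


def mean_squared_error(inputs, targets, w, b):
    """Average squared difference between neuron output and target."""
    total = sum(((w * x + b) - t) ** 2 for x, t in zip(inputs, targets))
    return total / len(inputs)


# Hypothetical training pairs generated from target = 2 * x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ts = [1.0, 3.0, 5.0, 7.0]

error_before = mean_squared_error(xs, ts, 0.0, 0.0)
w, b = train_linear_neuron(xs, ts)
error_after = mean_squared_error(xs, ts, w, b)
print(error_before, error_after)
```

The error after training is far smaller than before, which is the sense in which the network output "matches the target" once the weights have been adjusted.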

Figure 1: Artificial neural network using supervised learning.

Feed forward Neural network:

This network is trained by supervised learning. Backpropagation is the most commonly used algorithm in these networks: the network signals travel in the forward direction and the errors are propagated backwards. The network weights are initially assigned random values and are then adjusted to obtain the desired output. Feed forward neural networks play a more prominent role in medical diagnosis than other artificial neural networks.
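As a rough illustration of the forward signal and backward error flow, the sketch below trains a deliberately tiny network with one hidden neuron and one output neuron. The network size, the rectifying hidden activation, the learning rate and the single training pair are all assumptions chosen for brevity, not the configuration used in this study.

```python
import random


def poslin(n):
    """Positive linear activation: passes positive values, clips the rest to 0."""
    return n if n > 0 else 0.0


def backprop_step(w1, w2, x, target, rate):
    """One forward pass followed by one backward pass for a 1-1-1 network."""
    # Forward direction: the signal travels input -> hidden -> output.
    pre_hidden = w1 * x
    hidden = poslin(pre_hidden)
    output = w2 * hidden
    error = output - target

    # Backward direction: the error is propagated output -> hidden -> input.
    grad_w2 = error * hidden
    grad_hidden = error * w2
    slope = 1.0 if pre_hidden > 0 else 0.0   # derivative of poslin
    grad_w1 = grad_hidden * slope * x

    return w1 - rate * grad_w1, w2 - rate * grad_w2, 0.5 * error ** 2


# Weights start at random values, as described in the text (seeded for repeatability).
rng = random.Random(0)
w1 = rng.uniform(0.1, 1.0)
w2 = rng.uniform(0.1, 1.0)

losses = []
for _ in range(200):
    w1, w2, loss = backprop_step(w1, w2, x=1.0, target=2.0, rate=0.1)
    losses.append(loss)

final_output = w2 * poslin(w1 * 1.0)
print(losses[0], losses[-1], final_output)
```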

III. Method

Data preparation:

The dataset was collected from the Diabetic Care and Research Centre, Sivapreethi Hospital, Tanjore, Tamil Nadu. It also includes the records of some patients who were not diagnosed with the disease. When a doctor questions a patient about the symptoms of the disease, the answer can be a simple "yes" or "no". Such an answer does not state how often the symptom occurs and may lead to a diagnosis error. When the severity of a symptom is specified, either as a frequency per day or on a Likert scale where the frequency cannot be counted on a daily basis, the chances of a diagnosis error are reduced. A Likert scale is a psychometric scale used when a study collects data through questionnaires. Here it is used to rate the frequency on a scale of about 10 points for research purposes. The data was collected with extreme care so that proper analysis could be carried out.
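The difference between the two input forms can be sketched as follows. The symptom names come from the list used in this study, but the patient values, the dictionary layout and the function names are hypothetical and only serve to show what information each encoding keeps.

```python
# Subset of the study's symptoms, used here for illustration only.
SYMPTOMS = ["polyuria", "polydipsia", "nocturia", "tiredness"]


def encode_boolean(report):
    """Presence/absence form: 1 if the symptom is reported at all, else 0."""
    return [1 if report.get(name, 0) > 0 else 0 for name in SYMPTOMS]


def encode_severity(report):
    """Severity form: keeps the reported frequency or rating as a number."""
    return [float(report.get(name, 0)) for name in SYMPTOMS]


# Hypothetical patients: counts per day or Likert-style ratings (invented values).
patient_a = {"polyuria": 2, "polydipsia": 1, "nocturia": 0, "tiredness": 3}
patient_b = {"polyuria": 9, "polydipsia": 6, "nocturia": 0, "tiredness": 9}

print(encode_boolean(patient_a), encode_boolean(patient_b))
print(encode_severity(patient_a), encode_severity(patient_b))
```

Both hypothetical patients produce the same Boolean vector even though their reported severities differ considerably, which is the loss of information that the severity-based input avoids.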

Network structure:

The network used in this comparative study consists of an input layer, two hidden layers and one output layer. The hidden layers use the "poslin" transfer function and the output layer uses the "purelin" transfer function.

The poslin transfer function returns its input if the input is greater than 0; otherwise it returns 0.

$$\mathrm{poslin}(n) = \begin{cases} n, & n > 0 \\ 0, & n \le 0 \end{cases}$$

Purelin is a linear transfer function which returns its input as the output.

$$\mathrm{purelin}(n) = n$$
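The two transfer functions and the layer arrangement described above can be sketched in a few lines. The layer sizes, weights and biases below are hypothetical values picked so that the arithmetic is easy to follow; the study does not report its actual layer sizes or trained weights.

```python
def poslin(n):
    """Positive linear transfer function: n for n > 0, otherwise 0."""
    return n if n > 0 else 0.0


def purelin(n):
    """Linear transfer function: returns the input unchanged."""
    return n


def layer(inputs, weights, biases, transfer):
    """Fully connected layer: every neuron sees every input."""
    return [
        transfer(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(x, params):
    """Input -> hidden (poslin) -> hidden (poslin) -> output (purelin)."""
    h1 = layer(x, params["W1"], params["b1"], poslin)
    h2 = layer(h1, params["W2"], params["b2"], poslin)
    return layer(h2, params["W3"], params["b3"], purelin)


# Hypothetical parameters for a 2-2-2-1 network.
params = {
    "W1": [[1.0, -1.0], [0.5, 0.5]], "b1": [0.0, -1.0],
    "W2": [[1.0, 1.0], [-1.0, 0.0]], "b2": [0.0, 0.0],
    "W3": [[2.0, 3.0]],              "b3": [-1.0],
}

result = forward([2.0, 1.0], params)
print(result)  # [2.0]
```

In this example the second neuron of the second hidden layer receives a negative sum and is clipped to 0 by poslin, while the output layer passes its weighted sum through unchanged.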

Figure 2: Neural network with two hidden layers.

Training algorithm:

"Trainlm" is the training algorithm used to train the network to achieve the desired output. It updates the weights and biases according to Levenberg-Marquardt (LM) optimization and is often preferred because it converges quickly and with high accuracy. The LM algorithm can also provide better generalization performance than other algorithms. However, its high memory requirement and longer training time have limited its practical application. To apply this algorithm, a balance is therefore needed between the size of the ANN model and the choice of learning algorithm [3].

Algorithm:

The Jacobian jX of the performance is calculated with respect to the weight and bias variables X using backpropagation. The weights and biases are then adapted according to the Levenberg-Marquardt algorithm:

$$jj = jX^{T} \, jX$$

$$je = jX^{T} \, E$$

$$dX = -(jj + \mu I)^{-1} \, je$$

Here I is the identity matrix, E is the vector of all errors, and mu (μ) is the damping parameter of the Levenberg-Marquardt algorithm. In general, the metric used to estimate the network performance is the mean squared error.
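The update rule can be made concrete with a small sketch. It assumes a two-parameter linear model output = a * x + b so that the 2x2 system can be solved by hand, with the Jacobian transposed where the products require it; the data, the starting value of mu and the factors used to raise or lower mu are hypothetical, and this is not the trainlm implementation itself.

```python
def mse(params, xs, ts):
    """Mean squared error of the model a * x + b."""
    a, b = params
    return sum(((a * x + b) - t) ** 2 for x, t in zip(xs, ts)) / len(xs)


def lm_step(params, xs, ts, mu):
    """One Levenberg-Marquardt update: dX = -(jj + mu * I)^-1 * je."""
    a, b = params
    errors = [(a * x + b) - t for x, t in zip(xs, ts)]
    jac = [(x, 1.0) for x in xs]  # d(error_i)/da = x_i, d(error_i)/db = 1

    # jj = J^T J (symmetric 2x2 matrix)
    j00 = sum(r[0] * r[0] for r in jac)
    j01 = sum(r[0] * r[1] for r in jac)
    j11 = sum(r[1] * r[1] for r in jac)
    # je = J^T E
    g0 = sum(r[0] * e for r, e in zip(jac, errors))
    g1 = sum(r[1] * e for r, e in zip(jac, errors))

    # Solve (jj + mu * I) dX = -je with the explicit 2x2 inverse.
    m00, m11 = j00 + mu, j11 + mu
    det = m00 * m11 - j01 * j01
    da = -(m11 * g0 - j01 * g1) / det
    db = -(m00 * g1 - j01 * g0) / det
    return (a + da, b + db)


# Hypothetical data generated from target = 2 * x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ts = [1.0, 3.0, 5.0, 7.0]

params, mu = (0.0, 0.0), 0.01
for _ in range(20):
    candidate = lm_step(params, xs, ts, mu)
    if mse(candidate, xs, ts) < mse(params, xs, ts):
        params, mu = candidate, mu * 0.1   # step helped: trust the model more
    else:
        mu = mu * 10.0                     # step did not help: damp more strongly

print(params, mse(params, xs, ts))
```

A small mu makes the step close to a Gauss-Newton step, while a large mu shortens the step and turns it towards plain gradient descent, which is why mu is adjusted after each trial step.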

IV. Experimental Results

The tables below show the input symptom values for 10 patients. The 10 symptoms used are polyuria, polyphagia, polydipsia, nocturia, tiredness, giddiness, non-healing ulcer, sleeplessness, itching and shoulder pain.

Table 1: Patient Vs Symptom matrix without specifying severity of the symptom.

Rows: Patient1 to Patient10. Columns: Symptom1 to Symptom10. (Cell values are missing.)

In Table 1, the value 0 indicates the absence of the symptom and 1 indicates its presence.

Table 2: Patient Vs Symptom matrix specifying the severity of the symptom.

Rows: Patient1 to Patient10. Columns: Symptom1 to Symptom10. (Cell values are missing.)

In Table 2, the first four symptom values are based on the frequency of occurrence of the symptom per day, and the remaining symptom values are specified on the Likert scale.

Regression Analysis:

Regression analysis is performed to measure the system performance. It indicates the correlation between the system output and the target of the system. A regression value (R) of 1 indicates the maximum correlation between the output and the target, and a value of 0 indicates no linear correlation between them. The regression value is also called the correlation coefficient.
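A minimal sketch of how such a regression value can be computed is given below. It assumes that R is the Pearson correlation coefficient between the network outputs and the targets; the output and target values are hypothetical and are not the study's data.

```python
import math


def correlation_coefficient(outputs, targets):
    """Pearson correlation between network outputs and targets."""
    n = len(outputs)
    mean_o = sum(outputs) / n
    mean_t = sum(targets) / n
    covariance = sum((o - mean_o) * (t - mean_t) for o, t in zip(outputs, targets))
    spread_o = sum((o - mean_o) ** 2 for o in outputs)
    spread_t = sum((t - mean_t) ** 2 for t in targets)
    return covariance / math.sqrt(spread_o * spread_t)


# Hypothetical targets (1 = diabetic, 0 = not diabetic) and two sets of outputs.
targets = [0.0, 1.0, 0.0, 1.0, 1.0, 0.0]
exact_outputs = list(targets)                    # output equals target everywhere
rough_outputs = [0.4, 0.7, 0.1, 0.6, 0.9, 0.5]   # output only loosely tracks target

r_exact = correlation_coefficient(exact_outputs, targets)
r_rough = correlation_coefficient(rough_outputs, targets)
print(r_exact, r_rough)
```

Outputs that reproduce the targets give an R of 1, while outputs that only loosely follow the targets give a lower R, which is how the two input formats are compared in this study.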

Figure 3: Regression plot for input specified as in Table 1.

Figure 4: Regression plot for input specified as in Table 2.

In the above regression plots, the dashed line represents the ideal result, in which the output equals the target, and the solid line represents the best linear fit between the output and the target; the closer the solid line lies to the dashed line, the better the output [1]. The regression plots show that the system performance improves to a great extent when the input to the network is more precise and accurate. The correlation coefficient in Figure 4 is higher than that in Figure 3, which confirms this.

Table 3: Network simulation parameters.

Input format | Epochs | Correlation coefficient (R)
Input not specifying severity | (missing) | 0.83046
Input specifying severity | (missing) | 0.99997

V. Conclusion

This study analyses how the performance of the system improves with the accuracy of the input. From the regression plots and from Table 3 it can be seen that the first input set, which does not specify severity, achieves a correlation coefficient of 0.83046, which is lower than the 0.99997 achieved by the second set, which specifies severity. Hence the network performance improves when the input is more accurate. In future, further analysis can be done on the symptoms used in the diagnosis of the disease.
