Character Recognition Using Neural Network Learned by Artificial Bee Algorithm

Character Recognition is a text recognition system that allows hard copies of written or printed text to be rendered into editable, soft copy versions. In this paper, patterns are recognized using a multilayer perceptron trained by the Artificial Bee Colony (ABC) algorithm, which simulates the intelligent foraging behavior of a honey bee swarm. A Multilayer Perceptron (MLP) trained with the standard back propagation (BP) algorithm normally relies on computationally intensive training. One of the crucial problems with the BP algorithm is that it can sometimes yield networks with suboptimal weights because of the presence of many local optima in the solution space. The suggested method uses ABC to train the neural network, by updating its weights, in order to solve the text character recognition problem. A comparison is made between the ABC and BP methods in NN learning to determine which is better at solving the character recognition problem.


Introduction
Humans have developed highly sophisticated skills for sensing their environment and taking actions according to what they observe, e.g., recognizing a face, understanding spoken words, reading handwriting and distinguishing fresh food from its smell. Pattern recognition aims to give similar capabilities to machines. A pattern is an entity, vaguely defined, that could be given a name.
In this paper, the Artificial Bee algorithm is used to train the NN, and the results are compared with the BP method in order to determine which one is better in character recognition problems.

Artificial Neural Network
An ANN is an information-processing system that has certain performance characteristics in common with biological neural networks. ANNs represent an important area of research, which opens a variety of new possibilities in different fields, including classification, pattern recognition and prediction.
It is known that NNs can approximate functions and mathematical operators arbitrarily well as the number of neurons in the network tends to infinity. In this respect, Feed Forward Artificial Neural Networks (FFANNs) can be considered "universal approximators", capable of describing the input-output relationships of mechanical systems [3].
In the application domain of ANNs, there is a long list of research and real applications, including speech processing, image processing and computer vision, pattern classification and recognition, system control, robotics, forecasting and modeling, optimization and management of information, and medical diagnosis [4]. NNs also play a great role as a means for implementing expert systems, because of their ability to solve a specific problem as if it were a black-box solution where the mode of producing answers is not clearly understood [5]. Due to their adaptive and parallel processing ability, they have many applications in the engineering field [6].
NNs can be grouped into six areas of application: prediction, pattern recognition, associative memories, classification, optimization and general mapping.
The back propagation learning algorithm can be divided into two phases: propagation and weight update. For each weight (synapse), its output delta and input activation are multiplied to get the gradient of the weight, where delta is defined in equation (1), $\eta$ is the learning rate, $t$ is the target, $o$ is the output of the net and $x_i$ is the input of the net.
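As a minimal sketch, assuming equation (1) is the standard delta rule $\Delta w_i = \eta (t - o) x_i$ (which matches the symbols listed above), one weight update could look as follows. The function name `delta_rule_update` and the numeric values are illustrative only.

```python
def delta_rule_update(weights, inputs, target, output, eta=0.1):
    """Return the weights after one delta-rule step.

    Assumed rule (standard delta rule): w_i <- w_i + eta * (t - o) * x_i
    """
    error = target - output  # (t - o)
    return [w + eta * error * x for w, x in zip(weights, inputs)]


# Illustrative values: two weights, one active input, target 1, output 0.
updated = delta_rule_update([0.5, 0.5], inputs=[1.0, 0.0],
                            target=1.0, output=0.0, eta=0.5)
print(updated)  # [1.0, 0.5]
```

Only the weight attached to the active input changes, because the update is proportional to $x_i$.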

The Artificial Bee Colony (ABC)
The Artificial Bee Colony (ABC) algorithm is a swarm-based algorithm that simulates the intelligent foraging behavior of a honey bee swarm. Its steps include the following:
5: Produce new solutions (food source positions) $\upsilon_{i,j}$ in the neighborhood of $x_{i,j}$ for the employed bees using the formula $\upsilon_{i,j} = x_{i,j} + \Phi_{ij}(x_{i,j} - x_{k,j})$, where $k$ is a solution in the neighborhood of $i$ and $\Phi$ is a random number in the range [-1, 1], and evaluate them.
6: Apply the greedy selection process between $x_i$ and $\upsilon_i$.
7: Calculate the probability values $P_i$ for the solutions $x_i$ by means of their fitness values using equation (2).
The fitness values of the solutions are calculated using equation (3).
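The employed-bee move and the greedy selection can be sketched as below. This is an illustration under assumptions, not the authors' implementation: the partner $k$ is drawn at random among the other solutions, only one randomly chosen dimension $j$ is perturbed (the conventional ABC choice), and `sphere` is a hypothetical cost function used only for the demonstration.

```python
import random


def neighbour(solutions, i, rng=random):
    """Produce v_i from x_i: v_ij = x_ij + phi * (x_ij - x_kj), phi in [-1, 1]."""
    x_i = solutions[i]
    # Assumption: k is any other solution, chosen uniformly at random.
    k = rng.choice([idx for idx in range(len(solutions)) if idx != i])
    # Assumption: a single random dimension j is perturbed.
    j = rng.randrange(len(x_i))
    phi = rng.uniform(-1.0, 1.0)
    v_i = list(x_i)
    v_i[j] = x_i[j] + phi * (x_i[j] - solutions[k][j])
    return v_i


def greedy_select(x_i, v_i, cost):
    """Keep the new solution only if it has a strictly lower cost."""
    return v_i if cost(v_i) < cost(x_i) else x_i


def sphere(x):
    """Hypothetical cost function for the demonstration."""
    return sum(c * c for c in x)


random.seed(0)
foods = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
candidate = neighbour(foods, 0)
kept = greedy_select(foods[0], candidate, sphere)
```

Because of the greedy step, the cost of the kept solution never exceeds the cost of the solution it started from.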

Fitness Criterion
One of the stopping criteria is the fitness value. Since the ABC algorithm is used here as a supervised learning algorithm, there are observed values ($t_i$) and desired output values ($f_i$). These two values have to be compared: if they are close to each other then the fitness is good; otherwise the algorithm must continue its calculations until this condition is satisfied or the specified number of iterations is completed.
The corrections to the weights are selected to minimize the residual error between $t_i$ and the output $f_i$. The Mean Squared Error (MSE), given in equation (5), is one way of making this comparison:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (t_i - f_i)^2 \qquad (5)$$

where $n$ is the number of the compared categories.

Iraqi Commission for Computers & Informatics (ICCI), Iraqi Journal for Computers and Informatics (IJCI), Vol (1) Issue (1), 2014
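The MSE comparison between $t_i$ and $f_i$ can be sketched as follows, assuming the usual definition (the mean of the squared differences over the $n$ compared values). The function name and the sample values are illustrative.

```python
def mean_squared_error(targets, outputs):
    """Mean of squared differences between two equally long sequences."""
    if not targets or len(targets) != len(outputs):
        raise ValueError("targets and outputs must be non-empty and of equal length")
    return sum((t - f) ** 2 for t, f in zip(targets, outputs)) / len(targets)


# Illustrative values: two of the four outputs are off by 0.5.
mse = mean_squared_error([1, 0, 0, 1], [0.5, 0.0, 0.5, 1.0])
print(mse)  # 0.125
```

A value close to zero means the two sequences are close to each other, which is the "good fitness" condition described in the text.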

Training the ANN by using ABC
The ANN is a statistical model of cognition that takes vectors of independent variables as input and outputs estimates of vectors of dependent variables. The network is structured as a set of weights, usually arranged in layers, and the optimization problem is to find values for the weights that give the mapping minimal error.
For the purpose of NN learning, the mean square error is used. The Artificial Bee algorithm has been used in several optimization problems, including the optimization of the synaptic weights of an Artificial Neural Network to generate a robust ANN [11].
In this paper, the ABC was implemented in classification applications. Among the classification applications, the character recognition problem was chosen.
An NN-learning system has been proposed to solve the mentioned problems by the ABC and BP algorithms.
• M: number of neurons in the hidden layer.
The proposed CR neural network consists of 3 layers, with (RxC, M, k) neurons, which represent the number of neurons in the input, hidden and output layers respectively. The proposed CR-NN is shown in Figure (3).
First, random weights ranging between 0 and 1 are chosen for the network interconnections.
Character Recognition has been defined as the conversion of text characters into machine-readable codes. The aim was to recognize character patterns represented by a 5*5 matrix (R=5, C=5), which means that there are 25 neurons in the input layer.
The training set has Ns=32=8*4 patterns: each character was represented by four matrices, only one of which gives the correct shape. In other words, the noise in the input data is about 75% of the whole input data. Table (1) shows the binary representation of the eight characters.
Eight neurons are suggested for the hidden layer.
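A minimal sketch of the (25-8-8) network follows. The layer sizes and the initial weight range [0, 1] come from the text; the sigmoid activation, the inclusion of one bias per neuron in a flat weight vector, and the sample 5*5 pattern are assumptions made for illustration (the pattern is hypothetical and is not taken from Table (1)).

```python
import math
import random

R, C, M, K = 5, 5, 8, 8                 # 25-8-8 layout from the text
N_IN = R * C                            # 25 input neurons
N_WEIGHTS = N_IN * M + M + M * K + K    # weights plus biases (assumed layout)


def sigmoid(z):
    """Assumed activation function; the source does not name one."""
    return 1.0 / (1.0 + math.exp(-z))


def random_weights(rng=random):
    """Random initial weights in [0, 1], stored as one flat vector."""
    return [rng.uniform(0.0, 1.0) for _ in range(N_WEIGHTS)]


def layer(weights, pos, inputs, n_neurons):
    """Evaluate one layer; each neuron owns len(inputs) weights then one bias."""
    outputs = []
    for _ in range(n_neurons):
        w = weights[pos:pos + len(inputs)]
        bias = weights[pos + len(inputs)]
        pos += len(inputs) + 1
        outputs.append(sigmoid(sum(wi * xi for wi, xi in zip(w, inputs)) + bias))
    return outputs, pos


def forward(weights, pattern):
    """Map a flat 25-element 0/1 pattern to K output activations."""
    hidden, pos = layer(weights, 0, pattern, M)
    outputs, _ = layer(weights, pos, hidden, K)
    return outputs


# Hypothetical 5*5 character: 1's are the object, 0's the background.
letter = [
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
pattern = [bit for row in letter for bit in row]

random.seed(1)
outputs = forward(random_weights(), pattern)
print(N_WEIGHTS, len(outputs))  # 280 8
```

Keeping all weights and biases in one flat vector makes the network directly usable as a candidate solution for a population-based optimizer.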
The results of the ABC implementation in the (25-8-8) CR-NN for 5 experiments are shown in Table (3). The following symbols are used in the tables of the ABC implementation:
• Exp: Experiment number.
• NoI: Number of Iterations.
• AL: Accuracy Level of the learning.
Table 1: The (5*5) matrix representation of the eight characters
Table (2) shows the 8 characters used for CR.

Phase 1: Propagation
Each propagation involves the following steps:
1. Forward propagation of a training pattern's input through the neural network in order to generate the propagation's output activations.
2. Backward propagation of the propagation's output activations through the neural network using the training pattern target in order to generate the deltas of all output and hidden neurons.
Phase 2: Weight update

Figure 1: Flow chart of the ABC algorithm

8: Produce the new solutions (new positions) $\upsilon_i$ for the onlookers from the solutions $x_i$, selected depending on $P_i$, and evaluate them.
9: Apply the greedy selection process for the onlookers between $x_i$ and $\upsilon_i$.
10: Determine the abandoned solution (source), if it exists, and replace it with a new randomly produced solution $x_i$ for the scout using equation (4): $x_{ij} = \min_j + \mathrm{rand}(0,1) \cdot (\max_j - \min_j)$.
11: Memorize the best food source position (solution) achieved so far.
12: cycle = cycle + 1
13: until cycle = Maximum Cycle Number (MCN) [7, 8, 11].
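The scout step with equation (4) can be sketched as follows. The text does not say how an abandoned source is detected; the per-solution trial counter and the `limit` threshold used here are the conventional ABC mechanism and are assumptions, as are the function name and the sample values.

```python
import random


def scout_replace(solutions, trials, lower, upper, limit, rng=random):
    """Replace at most one exhausted solution using equation (4).

    x_ij = min_j + rand(0, 1) * (max_j - min_j)
    """
    # Assumption: the source with the most unsuccessful trials is the candidate.
    worst = max(range(len(solutions)), key=lambda i: trials[i])
    if trials[worst] > limit:
        solutions[worst] = [lo + rng.random() * (hi - lo)
                            for lo, hi in zip(lower, upper)]
        trials[worst] = 0  # the fresh source starts with no failed trials
    return solutions, trials


random.seed(2)
foods = [[0.2, 0.8], [0.5, 0.5]]
trials = [3, 12]
foods, trials = scout_replace(foods, trials,
                              lower=[0.0, 0.0], upper=[1.0, 1.0], limit=10)
```

Only the second source exceeds the limit in this example, so only it is re-drawn inside the bounds; the first source is left untouched.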
The mean square error is referred to as the objective function to be optimized by the optimization method. Each patch represents a candidate solution to the optimization problem; that is, a patch represents the weight vector of the NN, including all biases. Updating a position means updating the weights of the network in order to reduce the error. The new patch is the set of new weights and biases used to obtain the new error. The training process continues until a satisfactory error is reached or the computational limits are exceeded. When the training ends, the weights are used to calculate the classification error for the training patterns. The same set of weights is then used to test the network using the test patterns.
One of the drawbacks of the algorithm is the number of tunable parameters used. However, it is possible to set the parameter values by conducting a small number of trials. Other optimization algorithms usually employ gradient information. The proposed algorithm, however, makes little use of this type of information and thus can readily escape from local optima.
The ABC algorithm is vastly different from any of the traditional methods of training. ABC does not train just one network, but rather a set of networks. ABC builds a set number of ANNs, initializes all network weights to random values and starts training each one. On each pass through a data set, ABC compares each network's fitness. The network with the highest fitness is considered the global best [10].
The proposed procedure for training the ANN for character recognition is shown in Figure (2). The procedure starts by reading the input character data, which are saved as a matrix and fed into ABC. ABC initializes all network weights to random values and continues the ABC algorithm steps, using the mean square error as the objective function, until the maximum number of iterations is reached. The NN weights and biases are then saved in matrices for use in the test phase.

Figure 2: The proposed procedure of training ANN
The patterns are represented by a Row*Column (RxC) matrix. The data of the matrix consist of 1's and 0's only. The objects of interest are the 1's and the background consists of 0's.
Where:
• Ns: the size of the training set.
• k: the number of characters to be recognized.
The results of five runs of the ABC algorithm, with the corresponding mean square error and accuracy level, are shown in Table (3).

Comparison Results between ABC and BP
The multi-layer perceptrons established for the relevant purposes are trained with the ABC algorithm and the BP algorithm, respectively, on the selected training set. In order to establish a fair start-up state for BP-based perceptrons, the training processes of BP-based perceptrons always start with the best solution in the initial population used by their counterparts, the ABC-based perceptrons. For BP-based perceptrons, the "evolution time" directly equals the number of update iterations, while for ABC-based perceptrons it is equivalent to the product of the population size and the number of evolving generations. In order to compare the performance of ABC-based and BP-based perceptrons, the training histories are recorded with the same fitness evolution times for both. In this study, some examples of letter recognition are chosen in order to make a comparison study for the CR-NN between the ABC and BP algorithms. In the CR problems, Ns=32 was chosen for the comparative study between ABC and BP. The implementation is done for 5 experiments for each problem.
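The "evolution time" bookkeeping described above can be illustrated with a short sketch. The function name and the population size and generation count are hypothetical; only the rule itself (BP: one unit per update iteration, ABC: population size times generations) comes from the text.

```python
def abc_evolution_times(population_size, generations):
    """Evolution time reached by ABC after each generation.

    After generation g, ABC has spent population_size * g evolution
    time units, which is compared with BP iteration number
    population_size * g.
    """
    return [population_size * g for g in range(1, generations + 1)]


# Hypothetical setting: 20 solutions evolved for 3 generations.
times = abc_evolution_times(20, 3)
print(times)  # [20, 40, 60]
```

Under this convention, the ABC error recorded after generation 3 in the example would be compared with the BP error after 60 update iterations.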

Figure 4: MSE value in each iteration for BP and ABC

Examples of patterns include a fingerprint image, a handwritten word, a human face, a speech signal and a DNA sequence. Character recognition is a sub-field of pattern recognition. Various areas of pattern recognition include:
• Template matching.
• Statistical pattern recognition.
• Artificial Neural Networks.
ANNs have been applied in fields such as vision, physics, retail, battlefield management, and finance. Their performance depends on several factors, including the number of neuronal layers, the number of neurons in each layer, the activation functions used by the neurons, and the choice of initial connection weights. There are many ANN learning algorithms, such as Adaline, Hebbian, the Perceptron learning rule and back propagation.
Yana M., Mohmad H. and Rozaida G. [1], 2010, studied training a Functional Link Neural Network using an Artificial Bee Colony for solving classification problems. Faez H. and Hazem N. [9], 2013, used Particle Swarm Optimization for training an artificial neural network in Greek letter recognition. Sudarshan N., et al. [12], 2012, proposed method

Table 2: The output binary coding of CR

Table (4) shows the comparison results. It is evident that BP is much better than ABC, given the thousands of iterations and the poor accuracy level shown in Table (4).
Table 4: Comparison results between ABC and BP for CR-NN