Feature Extraction Using DCT Fusion


02 Nov 2017


Facial Symmetry for Enhanced Face Recognition

Prathik P.

Student

Dept. of Electronics

and Communication Engg.

M.S. Ramaiah Inst. of Tech.

Bangalore-560054, INDIA

[email protected]

Rahul Ajay Nafde

Student

Dept. of Electronics

and Communication Engg.

M.S. Ramaiah Inst. of Tech.

Bangalore-560054, INDIA

[email protected]

K. Manikantan#

Associate Professor

Dept. of Electronics

and Communication Engg.

M.S. Ramaiah Inst. of Tech.

Bangalore-560054, INDIA

[email protected]

S. Ramachandran

Professor

Dept. of Electronics

and Communication Engg.

S.J.B. Inst. of Tech.

Bangalore-560060, INDIA

[email protected]

#Corresponding Author

Abstract: Feature extraction plays a very important role

in Face Recognition technology. This paper proposes a novel

Discrete Cosine Transform (DCT) fusion technique based on

facial symmetry. Also proposed are DCT subset matrix selection

based on aspect ratio of the image and pre-processing concepts,

namely Local Histogram Equalization to remove illumination

variation and Scale normalization using skin detection for colored

images. The performance of the proposed techniques is evaluated by

computing the recognition rate and number of features selected

for ORL, Extended Yale B and Color FERET databases.

I. INTRODUCTION

Face Recognition (FR) is one of the most extensively

studied research topics, spanning multiple disciplines. Refs. [1],

[2] provide an excellent survey on the various FR techniques.

Feature extraction and feature selection play a major role in

the success of the FR system. Block diagram of the proposed

FR system is shown in Fig. 1.

For enhanced face recognition, this paper proposes the

following four new ideas.

1) Feature extraction using DCT fusion: This is a novel

way of extracting low frequency features obtained by

applying DCT to the image based on facial symmetry.

2) Optimal DCT subset selection: Selection of subset DCT

matrix dimensions plays a vital role in determining

the number of distinguishing features. Best feature set

is obtained by selecting DCT dimensions based on

dimensions of the image.

3) Local Histogram Equalization: An image pre-processing

technique in which areas of lower local contrast in each

region gain higher contrast to render an image of better

quality. The various regions formed are determined by

the window size specified.

4) Scale Normalization using skin detection: This is a

technique to focus on the Region Of Interest (ROI) based

on the application. A novel approach to scale normalize

colored image based on skin detection is proposed.

The rest of this paper is organized as follows: An

overview of DCT, Binary Particle Swarm Optimization

(BPSO), Euclidean classifier, Image enhancement techniques

and Edge Detection is presented in Section II. In Section III

the proposed techniques: DCT fusion, DCT subset selection,

Local Histogram Equalization and Scale Normalization using

skin detection are explained. Sections IV and V contain the

experimental results and conclusion.

II. FUNDAMENTAL CONCEPTS

A. Discrete Cosine Transform (DCT)

DCT is a powerful feature extraction tool. DCT's excellent

energy compaction property enables it to concentrate most of

the signal information in its low frequency components [3].

The use of DCT in facial recognition has been described by

several research groups [4], [5].

The general equation for the DCT of an N × M image f(x, y) is defined by Eq. (1):

F(u, v) = α(u) α(v) Σ_{x=0}^{N−1} Σ_{y=0}^{M−1} cos(a) · cos(b) · f(x, y)   (1)

where a = (πu / 2N)(2x + 1), b = (πv / 2M)(2y + 1),

f(x, y) is the intensity of the pixel in row x and column y; u = 0, 1, ..., N−1; v = 0, 1, ..., M−1; and α(u), α(v) are defined as:

α(u) = √(1/N) for u = 0;  √(2/N) for 1 ≤ u ≤ N−1

α(v) = √(1/M) for v = 0;  √(2/M) for 1 ≤ v ≤ M−1
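As a minimal sketch (assuming NumPy is available; this is an illustration of Eq. (1), not the authors' implementation), the transform can be built directly from the cosine basis and the scaling factors α(u), α(v), and checked for the orthonormality that underlies its energy-compaction behaviour:

```python
import numpy as np

def dct2(f):
    """2-D DCT of an N x M image f(x, y) following Eq. (1)."""
    N, M = f.shape
    u, x = np.arange(N), np.arange(N)
    v, y = np.arange(M), np.arange(M)
    # Cosine terms: cos(pi*u*(2x+1)/(2N)) and cos(pi*v*(2y+1)/(2M))
    Cu = np.cos(np.pi * np.outer(u, 2 * x + 1) / (2 * N))
    Cv = np.cos(np.pi * np.outer(v, 2 * y + 1) / (2 * M))
    # Scaling factors alpha(u), alpha(v)
    au = np.where(u == 0, np.sqrt(1.0 / N), np.sqrt(2.0 / N))
    av = np.where(v == 0, np.sqrt(1.0 / M), np.sqrt(2.0 / M))
    return np.outer(au, av) * (Cu @ f @ Cv.T)

img = np.random.default_rng(0).random((8, 6))
F = dct2(img)
# Orthonormal DCT preserves total energy (Parseval)
print(np.allclose((F ** 2).sum(), (img ** 2).sum()))  # True
```

For a constant image, all energy lands in the single coefficient F(0, 0), which is the extreme case of the compaction property the text describes.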

B. Binary Particle Swarm Optimization (BPSO)

Swarm optimization introduced by J. Kennedy and

R. Eberhart is a computational paradigm based on the

collaborative behavior and swarming inspired by biological

entities like fish school or bird flocking [6].

Computation involves a population (swarm) of processing

elements called particles. Each particle is a potential solution.

The system is first initialized with a population of random

Fig. 1. Proposed Face Recognition system

solutions and searches for optima by updating generations.

Scatter index is used as the fitness function. Fitness function

is the metric to evaluate the best probable optimum solution.

Each particle in the search space evolves its candidate solution

over time, making use of its individual memory and knowledge

gained by the swarm as a whole.

The Binary PSO algorithm has been developed in [7]. It is a discrete version of Particle Swarm Optimization (PSO) in which each particle's position is a string of 1s and 0s. The positional values are determined by a sigmoid function and a probabilistic rule; the particle velocity serves as the probabilistic input for the position update. A potential solution is represented as a particle with positional coordinates X_i^t = [x_i1, x_i2, ..., x_iD] in a D-dimensional space, where i denotes the particle number and t the iteration number. Each particle i maintains a record of the position of its previous best performance in a personal best position vector Pbest_i. An iteration comprises the evaluation of each particle, followed by a stochastic adjustment of its velocity V_i^t = [v_i1, v_i2, ..., v_iD] in the direction of its own previous best and the best previous position of any particle in the neighbourhood. The best position of any individual in the whole swarm is stored as the global best position Gbest. PSO is described by the following velocity and position update equations:

V_i^(t+1) = w·V_i^t + c1·rand·(Pbest_i − X_i^t) + c2·rand·(Gbest − X_i^t)   (2)

where w = inertia weight, c1 = cognitive parameter, c2 = social parameter.

X_i^(t+1) = X_i^t + V_i^(t+1)   (3)

for i = 1 to P, where P is the number of particles. If r is a random number between 0 and 1, the equation that updates the particle position is:

X_i^(t+1) = 1 if r < 1/(1 + e^(−V_i^(t+1))), else X_i^(t+1) = 0.   (4)

The Binary PSO algorithm is:

1) Initialize w, c1, c2

2) Initialize particle positions X_i^t and velocities V_i^t

3) Repeat steps 4 to 8 for a fixed number of iterations

4) For particles 1 to P, do steps 5 to 8

5) If fitness of X_i^t > fitness of Pbest_i, then update the personal best position: Pbest_i = X_i^t

6) If fitness of X_i^t > fitness of Gbest, then update the global best position: Gbest = X_i^t

7) Update the velocity vector using Eq. (2)

8) Update the position vector using Eq. (3) and Eq. (4)

9) Get Gbest

10) End

The BPSO algorithm is used as an optimization technique: among the extracted features, the important ones, capable of differentiating one subject from another, are selected. The main advantage of this technique over alternatives such as ANN and GA is that the time required to achieve the same performance is very low, because the number of samples needed is much smaller.
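The steps above can be sketched in Python as follows. This is a hedged illustration, not the paper's code: a toy maximum-ones fitness stands in for the scatter-index fitness the paper uses, and the parameter defaults are illustrative.

```python
import numpy as np

def bpso(fitness, dim, particles=10, iters=50, w=0.6, c1=2.0, c2=2.0, seed=0):
    """Binary PSO: positions are bit strings, updated via Eqs. (2) and (4)."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 2, (particles, dim)).astype(float)  # step 2: random bit strings
    V = rng.uniform(-1, 1, (particles, dim))                # step 2: random velocities
    pbest, pbest_fit = X.copy(), np.array([fitness(x) for x in X])
    gbest = pbest[pbest_fit.argmax()].copy()
    for _ in range(iters):                                  # step 3
        for i in range(particles):                          # step 4
            f = fitness(X[i])
            if f > pbest_fit[i]:                            # step 5: personal best
                pbest_fit[i], pbest[i] = f, X[i].copy()
            if f > fitness(gbest):                          # step 6: global best
                gbest = X[i].copy()
        r1, r2 = rng.random(X.shape), rng.random(X.shape)
        V = w * V + c1 * r1 * (pbest - X) + c2 * r2 * (gbest - X)           # Eq. (2)
        X = (rng.random(X.shape) < 1.0 / (1.0 + np.exp(-V))).astype(float)  # Eq. (4)
    return gbest                                            # step 9

best = bpso(lambda x: x.sum(), dim=8)   # toy fitness: count of 1s in the string
```

In the FR system each bit of Gbest marks whether the corresponding DCT coefficient is kept in the final feature set.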

C. Euclidean classifier

The Euclidean distance classifier is employed to measure

similarity between the test and reference vectors in the image

gallery. A reference vector is obtained by multiplying the

feature vector from DCT fusion and the Gbest vector from

BPSO. In a K-dimensional space, if p_i and q_i are the i-th components of the feature vectors of a training image and a testing image respectively, then the Euclidean distance between them is given as

D(p, q) = √( Σ_{i=1}^{K} (p_i − q_i)² )   (5)
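A minimal nearest-neighbour sketch of this classifier (the names and sample vectors are hypothetical, chosen only to illustrate Eq. (5)):

```python
import numpy as np

def classify(test_vec, gallery, labels):
    """Return the label of the gallery vector nearest to test_vec by Eq. (5)."""
    dists = np.sqrt(((gallery - test_vec) ** 2).sum(axis=1))  # Euclidean distances
    return labels[int(dists.argmin())]

gallery = np.array([[0.0, 0.0], [3.0, 4.0]])
print(classify(np.array([1.0, 1.0]), gallery, ["subject_a", "subject_b"]))  # subject_a
```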

D. Image enhancement techniques

1) Gamma Intensity Correction (GIC): GIC is a non-linear operation used to control the luminance of the image [8]. It compensates for badly illuminated images by increasing the overall brightness of the image.

2) Log Transform (LT): In this technique dynamic range

of an image is compressed due to the inherent nature of the

logarithmic function. The transform when applied to images,

replaces each pixel value by its respective logarithmic value.
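Both enhancements reduce to simple point-wise mappings over 8-bit pixel values; a sketch (the gamma value and scaling constant are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def gamma_correct(img, gamma=0.6):
    """Gamma Intensity Correction: out = 255 * (in/255)^gamma; gamma < 1 brightens."""
    return (255.0 * (img / 255.0) ** gamma).astype(np.uint8)

def log_transform(img):
    """Log Transform: out = c * log(1 + in), with c chosen so 255 maps near 255."""
    c = 255.0 / np.log(256.0)
    return (c * np.log1p(img.astype(float))).astype(np.uint8)
```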

E. Edge Detection

In an image, edges characterize object boundaries which

are useful for segmentation, registration, and identification of

objects. Various edge detection methods differ in the type of

smoothing filter applied and the way the measures of edge

strength are computed.

1) Laplacian of Gaussian Filter (LoG): The LoG filter

measures the second spatial derivative of an image. Laplacian

of an image highlights regions of rapid intensity change. It is

applied to the smoothened image to compensate for noise.

2) Sobel Filter: In this method at each image point, the

gradient vector points in the direction of largest possible

intensity increase, and the length of the gradient vector

corresponds to the rate of change in that direction.
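Assuming SciPy is available, both filters can be sketched over `scipy.ndimage` (an illustration of the two methods, not the authors' implementation):

```python
import numpy as np
from scipy.ndimage import sobel, gaussian_laplace

def sobel_edges(img):
    """Gradient magnitude from horizontal and vertical Sobel responses."""
    g = img.astype(float)
    return np.hypot(sobel(g, axis=1), sobel(g, axis=0))

def log_edges(img, sigma=1.0):
    """Laplacian of Gaussian: smooth with a Gaussian, then take the Laplacian."""
    return gaussian_laplace(img.astype(float), sigma=sigma)

step = np.zeros((10, 10)); step[:, 5:] = 255.0   # vertical step edge
edges = sobel_edges(step)
```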

III. PROPOSED TECHNIQUES

A. Feature extraction using DCT Fusion based on facial

symmetry

The human face is structurally left-right symmetrical in content, though not exactly in function or size. This bilateral symmetry can be used to recognize a person, and the proposed technique is based on this premise.

Feature extraction using DCT is conventionally done by transforming the image into its frequency domain and then extracting the low frequency components. In the DCT fusion technique, however, the image is split vertically into two parts so that the low frequency components of both parts are given equal importance. The features from each DCT subset matrix are then converted to a row vector by raster scanning, and the two vectors are merged into a single row vector and stored in the face feature gallery. Fig. 2 illustrates this technique.
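Assuming SciPy for the 2-D DCT, the fusion pipeline can be sketched as follows (the 12 × 6 subset per half matches the dimensions used later for ORL; the function name is ours):

```python
import numpy as np
from scipy.fft import dctn

def dct_fusion_features(img, sub_h=12, sub_w=6):
    """Split the image vertically, DCT each half, raster-scan the low-frequency
    sub_h x sub_w corner of each, and merge into one fused row vector."""
    h, w = img.shape
    halves = (img[:, : w // 2], img[:, w // 2 :])
    parts = [dctn(half.astype(float), norm='ortho')[:sub_h, :sub_w].ravel()
             for half in halves]                     # raster scan of each subset
    return np.concatenate(parts)                     # single fused row vector

img = np.random.default_rng(0).random((112, 92))    # ORL-sized image
features = dct_fusion_features(img)
print(features.shape)  # (144,)
```

The fused vector (2 × 12 × 6 = 144 coefficients here) is what gets stored in the face feature gallery before BPSO selection.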

An obvious question is why the division is constrained to two halves rather than four or more parts. The possible presence of occlusion, rotation and shift in the image limits further division: these cause a disproportionate division of image content, so different features are extracted and the success rate drops. Splitting the image vertically into two parts, however, keeps the image content equally distributed even under slight occlusion, so the features extracted from the two parts are less likely to deviate across images of the same subject.

A vertical split is preferred over a horizontal one because the former exploits the bilateral symmetry of the human face. With a horizontal split, low frequency components corresponding to features such as hair supersede the distinguishing regions in the upper half of the image, hampering the performance of the FR system. The vertical split places equal emphasis on the low frequency components of both halves, unlike the conventional method, resulting in improved performance.

B. Optimal DCT subset selection

The dimensions of the selected DCT subset matrix play a vital role in selecting the most discriminative features. Moreover, the number of features selected, and hence the recognition time, is directly proportional to the dimensions of the selected matrix.

The upper left corner of the DCT matrix contains the

low frequency components of an image (Ref. Fig. 2).

Conventionally, a square matrix is chosen in the upper left

corner of the DCT matrix [9]. Choosing DCT subset matrix

dimensions proportional to aspect ratio of the image selects

better features. (For example, for a 112 × 46 image, a DCT subset size of 12 × 6 is chosen.)
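One way to make this concrete is the following heuristic, which is our hypothetical reading of the rule (the paper states only that the subset dimensions are chosen proportional to the aspect ratio): solve for a subset whose side ratio matches the image's, subject to an approximate feature budget.

```python
import math

def subset_dims(img_h, img_w, n_features):
    """Hypothetical heuristic: subset dimensions with sub_h/sub_w close to
    img_h/img_w and sub_h*sub_w close to the requested feature count."""
    sub_w = max(1, round(math.sqrt(n_features * img_w / img_h)))
    sub_h = max(1, round(n_features / sub_w))
    return sub_h, sub_w
```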

Fig. 2. Feature extraction using DCT fusion technique

C. Local Histogram Equalization (LHE) to achieve

illumination invariance

Histogram Equalization (HE) is a global contrasting

technique for an image. An image has pixels of varying

intensities from 0 to 255 (for a gray image). The pixel intensity

variations are not uniform for a badly illuminated image. These

variations are remapped by spreading out the most frequent pixel intensity values to render an image of better quality.

In Local Histogram Equalization, a window size is defined

which divides an image into various regions. Zeros are padded

on all sides of the raw image prior to the processing and

regions are formed by tracing the first pixel of the image.

In every region, pixels with low intensity values are replaced

with higher intensity values. The probability distribution of pixel intensity values in each region is computed, following which the cumulative distribution function is computed to determine the most likely pixel intensity value that replaces the central pixel value.

Histograms of the original image and processed image are

shown in Fig. 3.
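A direct (unoptimized) sketch of the windowed remapping described above, assuming an 8-bit grayscale input; the window size is illustrative:

```python
import numpy as np

def local_hist_eq(img, win=15):
    """Local HE: zero-pad, then remap each centre pixel to the CDF value of
    its win x win neighbourhood, scaled back to [0, 255]."""
    pad = win // 2
    padded = np.pad(img, pad, mode='constant')       # zeros padded on all sides
    out = np.empty_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            region = padded[i:i + win, j:j + win]
            rank = (region <= img[i, j]).sum()       # CDF of centre pixel's value
            out[i, j] = np.uint8(255 * rank // region.size)
    return out
```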

D. Scale Normalization using skin detection

Scale normalization is used to remove shift variance in

images. Color images are usually represented in RGB system.

YCrCb system is an alternate way to represent color image.

The conversion from RGB to YCrCb is given by Eqs. (6-8):

Y = 0.299R + 0.587G + 0.114B   (6)

Cr = 0.5R − 0.41869G − 0.08131B + 128   (7)

Cb = −0.16874R − 0.33126G + 0.5B + 128   (8)

The YCrCb color system is suitable for skin extraction. Skin detection is performed using Eqs. (9, 10) as indicated in [10]:

133 ≤ Cr ≤ 173   (9)

77 ≤ Cb ≤ 127   (10)

Fig. 3. Local Histogram Equalization of a given image: (a) original image; (b) histogram of image (a); (c) image after LHE; (d) histogram of image (c)

Fig. 4. Scale normalization using skin detection

After the skin detection a binary pattern of the detected

skin is formed. Scale Normalization is achieved by cropping

outermost edges of the binary image. The process is illustrated

in Fig. 4.
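Putting Eqs. (6-10) together, the normalization can be sketched as below (the sample skin colour and image layout are illustrative assumptions):

```python
import numpy as np

def skin_crop(rgb):
    """Threshold Cr/Cb per Eqs. (7-10), then crop to the bounding box of
    the detected skin pixels (the outermost edges of the binary pattern)."""
    R, G, B = (rgb[..., k].astype(float) for k in range(3))
    Cr = 0.5 * R - 0.41869 * G - 0.08131 * B + 128
    Cb = -0.16874 * R - 0.33126 * G + 0.5 * B + 128
    mask = (133 <= Cr) & (Cr <= 173) & (77 <= Cb) & (Cb <= 127)
    if not mask.any():
        return rgb                                   # no skin found: leave as-is
    rows, cols = np.where(mask)
    return rgb[rows.min():rows.max() + 1, cols.min():cols.max() + 1]

# Blue background with a skin-coloured patch; cropping isolates the patch.
img = np.zeros((20, 20, 3), dtype=np.uint8); img[..., 2] = 255
img[5:10, 6:12] = (200, 120, 90)                     # illustrative skin tone
print(skin_crop(img).shape)  # (5, 6, 3)
```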

IV. EXPERIMENTAL RESULTS

To establish the performance of the proposed technique, two

parameters are defined: Recognition Rate (RR) and number of

features. Recognition rate is defined as

Recognition Rate = (Correctly identified test images / Total no. of test images) × 100%

and the number of features selected are the number of

DCT coefficients remaining after applying BPSO, which are used for recognition with the Euclidean classifier. The parameters for

BPSO are swarm size = 30, w = 0.6, c1 = 2, c2 = 2 and

number of iterations = 100.

Three benchmark face databases ORL, Extended Yale B

and Color FERET are used to evaluate the performance

of proposed concepts using MATLAB [11]. The ORL database contains images with slight pose variations, the Extended Yale B database contains illumination-variant images, while the Color FERET database contains both pose- and illumination-variant images. We have used only the Fa, Fb subset from Color FERET.

Fig. 5. Comparison of DCT fusion technique with normal DCT computed over the entire image

Fig. 6. Comparison of the features selected in DCT fusion with normal DCT

A. Experiment 1: ORL Database

This database comprises 40 distinct subjects, each subject

having 10 different images [12]. The original size of each

image is 112 (height) # 92 (width) pixels. Image resolution

is reduced to half using bicubic interpolation.

1) Comparison of DCT fusion with normal DCT: DCT

matrix dimensions are chosen such that the number of features

for both the techniques are equal. A matrix dimension of 12×6 is used for both halves in DCT fusion, while 12×12 is used in the conventional technique. Refer to Fig. 5.

2) Optimum DCT subset matrix dimension: Recognition

rates obtained for different DCT subset matrix dimensions

are listed in Table I. It is observed that DCT matrix size

proportional to aspect ratio of the image gives higher RR.

3) Comparison of number of features: To compare the

number of features, recognition rate obtained by both

techniques should be the same. 12×6 is chosen as the subset DCT matrix dimension for each half in the fusion technique, while 24×24 is used for the conventional method. The number of

features selected is shown in Fig. 6.

4) Comparison with other FR techniques: DCT fusion

technique is compared with several standard methods and

results are tabulated for training to testing ratio of 5:5 in Table

IV and training to testing ratio of 9:1 in Table V.

B. Experiment 2: Extended Yale B Database

The original database can be found at [17]. The database

used here contains images of 28 different subjects, pose 0, subset 5, totalling 532 images (19 images per subject). The

TABLE I
DCT FUSION APPLIED TO DIFFERENT DCT SUBSET MATRICES IN ORL DATABASE

DCT dimensions   RR in %   Features selected
6x3              90.41     31
6x6              91.25     55
12x6             93.75     99
12x12            93.25     153
24x12            92.50     304

TABLE II
DCT FUSION APPLIED TO DIFFERENT DCT SUBSET MATRICES IN YALE B DATABASE

DCT dimensions   RR in %   Features selected
6x4              86.29     41
12x8             96.51     126
12x12            97.29     183
18x12            99.06     254
18x18            98.81     356

TABLE III
DCT FUSION APPLIED TO DIFFERENT DCT SUBSET MATRICES IN FERET DATABASE

DCT dimensions   RR in %   Features selected
6x3              98.62     30
6x6              97.75     56
12x6             99.60     102
12x12            92.50     180
24x12            94.00     338

TABLE IV
COMPARISON OF DCT FUSION TECHNIQUE WITH STANDARD TECHNIQUES IN ORL DATABASE FOR TRAINING TO TESTING RATIO 5:5

Methods used        Average RR in %
KSen1-RS (M) [13]   93.35
KSen1-RS (E) [13]   94.22
PSen1-RS (M) [13]   94.50
PSen1-RS (E) [13]   95.00
Proposed method     95.40

TABLE V
COMPARISON OF DCT FUSION TECHNIQUE WITH STANDARD TECHNIQUES IN ORL DATABASE FOR TRAINING TO TESTING RATIO 9:1

Methods used               Average RR in %
ICA [14]                   93.80
Gradient direction [15]    95.75
Correlation filters [16]   96.25
Eigen faces [14]           97.50
Kernel Eigen faces [14]    98.00
Proposed method            98.50

TABLE VI
COMPARISON OF DCT FUSION TECHNIQUE WITH STANDARD TECHNIQUES IN EXTENDED YALE B DATABASE

Methods used      Average RR in %
LTP (L1) [18]     94.00
LTP (CHI) [18]    94.10
R-LDA [19]        96.48
LBP (DT) [18]     97.20
LTP (DT) [18]     97.20
Proposed method   99.04

Fig. 7. Pre-processing steps in Extended Yale B database

Fig. 8. Comparison of the features selected in DCT fusion with normal DCT

in Yale B database

image size is 480 × 640 pixels. Image resolution is reduced to (1/8)th using bicubic interpolation. Pre-processing steps are

shown in Fig. 7.

1) Comparison of DCT fusion with normal DCT technique:

A matrix dimension of 18×24 is used for both halves in DCT fusion, while 18×12 is used for the conventional technique.

The comparison is shown in Fig. 8.

2) Comparison of preprocessed images with raw images:

The DCT subset matrix dimension is chosen as 18×12 each

for both cases. The results are shown in Fig. 9.

3) Optimum DCT subset matrix dimension: DCT fusion

technique is applied to different dimensions for training to

Fig. 9. Effectiveness of the pre-processing steps in Yale B database

testing ratio of 3:16 and their values are tabulated in Table II.

4) Comparison with other FR techniques: DCT fusion

technique is compared with several standard methods and

results are tabulated in Table VI.

C. Experiment 3: Color FERET Database

The Color FERET (Face Recognition Technology) is a

standard dataset which is available at [20]. For our testing

purpose, a separate database is created choosing Fa and Fb

images from 80 individuals, where Fa indicates a regular frontal image and Fb an alternative frontal image (with a different facial expression). The size of the original image is 384 × 256. Image resolution is reduced to (1/4)th using bicubic interpolation.

Pre-processing steps are shown in Fig. 10.

1) Comparison of DCT fusion with conventional method:

DCT matrix dimensions are chosen such that number of

features for both techniques are equal. A matrix dimension of 12×6 is used for both halves in DCT fusion, while 12×12 is used for the conventional technique (Fig. 11 (a)).

2) Comparison of pre-processed images with raw images:

The DCT subset matrix dimension was chosen as 12×6 each

for both cases. The results are shown in Fig. 11 (b).

3) Optimum DCT matrix dimension: DCT fusion technique

is applied to different dimensions for training to testing ratio

Fig. 10. Pre-processing steps for FERET images

Fig. 11. Experiments on FERET database: (a) Comparison of DCT fusion technique with normal DCT in FERET database; (b) Importance of pre-processing steps using DCT fusion with size 12x6 each in FERET database

TABLE VII
COMPARISON OF DCT FUSION TECHNIQUE WITH STANDARD TECHNIQUES IN FERET DATABASE FOR FA, FB IMAGES

Methods used                 Top RR in %
PSen1-RS (M) [13]            74.50
PSen1-RS (M) [13]            76.50
Projection approach [21]     83.00
KSen1-RS (E) [13]            86.25
CS2 [22]                     86.50
PSen1-RS (E) [13]            88.80
SVD-Based FLDA [23]          90.50
DFT-Based extraction [24]    97.00
Proposed method              99.60

of 3:16 and their values are tabulated in Table III.

4) Comparison with other FR techniques: DCT fusion

technique is compared with several standard methods and

results are tabulated in Table VII.

V. CONCLUSION

The proposed extraction technique using DCT fusion was

applied to ORL, Extended Yale B and Color FERET databases.

Recognition rates of 98.50% for a training to testing ratio of 9:1

per subject for ORL database, 99.04% for 3 training images

and 16 testing images per subject in case of Extended Yale B

database and 99.60% for Color FERET Fa, Fb database were

obtained, respectively. These results establish that the proposed technique achieves better performance.

The importance of pre-processing steps in Face Recognition

was clearly indicated by comparing the performance of FR

system for raw images and pre-processed images. Techniques

like DCT subset matrix selection based on the aspect ratio

of the image, Local Histogram Equalization to eliminate

illumination variation and scale normalization based on skin

detection have improved the recognition rate significantly,

simultaneously reducing the number of features and hence the

computational time for the FR process. Furthermore, the recognition rates were compared with those of other techniques across all three databases.


