Credit Card Fraud Detection Using Weighted Statistical


02 Nov 2017


Abstract—As the world moves toward compact, convenient transactions, more and more people are paying by credit card. Credit cards can be used for both online and regular (in-person) transactions. Their increased usage leads to higher fraud rates and raises the question of security. In this paper, we propose a credit card fraud detection algorithm in which a dispersion matrix of weighted deviations is created and used to calculate a threshold. Each incoming transaction is checked against this threshold. We describe our idea in detail to show the effectiveness of the proposed approach.

Keywords—credit card; fraud detection; weighted attribute; dispersion matrix

Introduction

The use of credit cards has increased drastically over the last few years because of their many advantages, including simplicity, compactness, convenience, flexibility and universal acceptance. According to some research studies, around 27 percent of the world’s consumers shop via the internet [1]. According to the 2010 U.S. Census Bureau, there is a projected increase in the total number of credit card holders, a projected decrease in the number of credit cards and a projected increase in credit card purchase volume [2]. As the number of credit card users increases worldwide, so do the chances of a card being physically stolen, or of its information being stolen and subsequently misused.

Credit cards are used mainly for two types of transactions: i) online transactions and ii) regular transactions. The latter require the physical presence of the customer at the transaction site. They become dangerous if the customer does not keep an eye on the card and the person handling it "skims" the information through electronic capture. For the former, the card’s physical presence does not matter. Only some key information (card type, credit card number, security code, expiry date and the card holder’s name) is required to make a payment. This kind of transaction is usually made via the internet or over the phone. Because it requires only these details, the chances of committing fraud are higher. Sometimes the card holder does not know that the card’s information has been stolen, so one way of detecting fraud and alerting the card holder is to analyze past transactions and look for inconsistencies introduced by the current transaction. Another approach is statistical modeling. Various works have been done in this field. The second section gives an overview of related work in credit card fraud detection. The third section describes our proposed algorithm.

Related work

A number of fraud detection techniques have been proposed and implemented in the past, chiefly based on neural networks, data mining, meta-learning, very fast decision trees, hidden Markov models, fuzzy logic and game theory.

Ghosh and Reilly [9] carried out a neural-network-based feasibility study on credit card transactions for Mellon Bank. The system was trained on a sample of good and bad transactions and then tested on a blind, unlabeled set of samples to measure its effectiveness. The results indicated a 20% to 40% reduction in total credit card fraud losses.

Aleskerov, Freisleben and Rao [10], in their work named ‘CARDWATCH’, used a data mining approach for credit card fraud detection. They use three parameters to determine a customer’s spending pattern: the category of item purchased, the time since the last transaction and the transaction amount. This GUI-based system produced convincing results: a fraud detection rate of 85% and a genuine transaction detection rate of 100%.

Syeda, Zhang and Pan [11] developed a parallel granular neural network to speed up data mining and knowledge discovery for credit card fraud detection. It gives fewer errors when a large set of past training data is available.

Stolfo [4] used the technique of meta-learning for the detection process. A meta-learning system is based on the collective knowledge of local classifiers, so a meta-classifier is built on two or more local agents (base classifiers). The work applies four base classifiers and uses the class-combiner strategy to select the best classifier for meta-learning. The fraud catching rate (true positive rate) and the false alarm rate (false positive rate) are the metrics used for evaluation.

Brause, Langsdorf and Hepp [12] combined advanced data mining techniques and neural network algorithms for the same purpose. A parameter called the confidence limit is determined to classify a transaction as legal or fraudulent. Transactions whose fraud confidence exceeds 10% are flagged for review or aborted. This approach resulted in high fraud coverage. Phua, Lee et al. [13] produced an exhaustive survey of the technical articles published on credit card fraud detection, which gives an insight into the various techniques.

Fang Yu and Wang [3] presented a different data mining approach to detecting credit card fraud. It is based on outlier detection, which finds the objects that differ significantly from the other objects in their common behavior and characteristics. The results show that outlier mining detects credit card fraud better than clustering when fraudulent transactions are far fewer than normal ones.

Srivastava, Kundu et al. [1] proposed a hidden Markov model for detecting credit card fraud. The model is first trained on the normal behavior of the cardholder, with purchase type as the hidden state and purchase amount as the observable state. An incoming transaction is accepted only if its probability is higher than the threshold probability.

Proposed algorithm

In our proposed approach, we take into account a number of transaction parameters of a credit card holder. A weight is assigned to each parameter based on its priority in determining credit card fraud. The parameters considered are listed in Table 1 in order of increasing priority.

Parameter Number    Parameter Name
1                   Frequency of transaction
2                   Amount of transaction
3                   Frequency of transaction in a month
4                   Amount of transaction in a month
5                   Frequency of transaction in a week
6                   Amount of transaction in a week
7                   Frequency of transaction in a day
8                   Amount of transaction in a day
9                   Number of overdraft transactions

Table 1. Parameters of a transaction according to their priority.

Since the algorithm works at the back end, the bank provides the credit card holder’s transaction details, which include the card number, date and time of transaction, purchase type and purchase amount. From these details, we calculate the parameters for every transaction in our algorithm.
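As a rough illustration of how the windowed parameters in Table 1 might be derived from the raw transaction details, the sketch below counts and sums the transactions that fall in a rolling window. The function name `window_features`, the rolling windows of 1, 7 and 30 days, and the sample history are assumptions made for illustration; the paper does not specify how the windows are defined.

```python
from datetime import datetime, timedelta

def window_features(history, now, window):
    """Return (count, total amount) of transactions in the window ending at `now`.

    `history` is a list of (timestamp, amount) pairs for one card.
    The rolling-window definition is an assumption, not taken from the paper.
    """
    start = now - window
    recent = [amount for ts, amount in history if start < ts <= now]
    return len(recent), sum(recent)

# Hypothetical transaction history of one cardholder.
history = [
    (datetime(2017, 10, 10, 9, 0), 40.0),
    (datetime(2017, 10, 28, 18, 30), 25.0),
    (datetime(2017, 11, 1, 12, 0), 60.0),
    (datetime(2017, 11, 2, 8, 15), 15.0),
]
now = datetime(2017, 11, 2, 8, 15)

daily = window_features(history, now, timedelta(days=1))     # frequency, amount in a day
weekly = window_features(history, now, timedelta(days=7))    # frequency, amount in a week
monthly = window_features(history, now, timedelta(days=30))  # frequency, amount in a month
```

Each call yields one frequency parameter and one amount parameter, so the three windows together cover parameters 3 to 8 of Table 1.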

Data Normalization

Before the data set of n transactions with m attributes each is created, every attribute is transformed and normalized so that it can be used in further calculations. In our algorithm, we normalize the data with the standard approach based on the standard deviation.

The matrix T denotes the transaction matrix for a given user, where each row represents a particular transaction and each column represents an attribute, making it an n × m matrix.

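\[
T = \begin{pmatrix}
t_{11} & t_{12} & \cdots & t_{1m} \\
t_{21} & t_{22} & \cdots & t_{2m} \\
\vdots & \vdots & \ddots & \vdots \\
t_{n1} & t_{n2} & \cdots & t_{nm}
\end{pmatrix}
\]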

This matrix is now standardized (normalized).

$\bar{T}_j$ denotes the mean of the jth attribute and $A_j$ denotes its standard deviation:

\[
\bar{T}_j = \frac{1}{n}\sum_{i=1}^{n} t_{ij}, \qquad
A_j = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(t_{ij} - \bar{T}_j\right)^2}
\]

\[
t^{*}_{ij} = \frac{t_{ij} - \bar{T}_j}{A_j}
\]

The values $t^{*}_{ij}$, where i = 1 to n and j = 1 to m, are the new normalized values of the transaction matrix T.
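The normalization step can be sketched in a few lines of Python. This is a minimal sketch that assumes each attribute is centered on its mean and divided by its population standard deviation (the usual z-score); the function name `standardize` and the handling of constant columns are assumptions, not part of the paper.

```python
import math

def standardize(T):
    """Column-wise standardization of an n x m matrix given as a list of rows."""
    n, m = len(T), len(T[0])
    means = [sum(row[j] for row in T) / n for j in range(m)]
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in T) / n)
        for j in range(m)
    ]
    # Assumption: a constant attribute has zero deviation, so it is mapped to 0.0
    # instead of dividing by zero.
    return [
        [(row[j] - means[j]) / stds[j] if stds[j] > 0 else 0.0 for j in range(m)]
        for row in T
    ]

# Three hypothetical transactions with two attributes each.
T = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
Z = standardize(T)
```

After this step every non-constant attribute has mean 0 and unit standard deviation, so attributes measured on different scales (counts versus amounts) contribute comparably to the distances computed later.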

Applying statistical deviation algorithm

After standardizing the transaction parameters and obtaining the desired transaction matrix, we calculate the dispersion matrix D. Its entry $d_{ij}$ is the distance between transactions i and j of a particular user, so D has one row and one column per transaction:

\[
D = \begin{pmatrix}
d_{11} & d_{12} & \cdots & d_{1n} \\
d_{21} & d_{22} & \cdots & d_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
d_{n1} & d_{n2} & \cdots & d_{nn}
\end{pmatrix}
\]

The distance between any two transactions is calculated as the Euclidean distance, with a weight assigned to each parameter. Let U and V be two transactions with m attributes each. The distance between them is given as:

\[
\mathrm{DIS}(U, V) = \sqrt{\sum_{i=1}^{m} w_i^{2}\,\left(U_i - V_i\right)^{2}}
\]

where $w_i$ is the weight of the ith parameter. The weights reflect the priority of each parameter in determining fraudulent transactions.
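The weighted distance translates directly into code. The sketch below assumes the sum runs over the m attributes of the two transactions; the function name `weighted_distance` and the sample weights are hypothetical.

```python
import math

def weighted_distance(u, v, w):
    """Weighted Euclidean distance between two transactions u and v.

    u, v: attribute vectors of equal length; w: one weight per attribute.
    """
    if not (len(u) == len(v) == len(w)):
        raise ValueError("u, v and w must have the same length")
    return math.sqrt(sum((wk ** 2) * (uk - vk) ** 2 for uk, vk, wk in zip(u, v, w)))

# Hypothetical example: the second attribute has twice the weight of the first.
d = weighted_distance([0.0, 0.0], [3.0, 2.0], [1.0, 2.0])  # sqrt(1*9 + 4*4) = 5.0
```

Because the weights are squared inside the sum, doubling a weight doubles that attribute's contribution to the distance when the other attributes agree.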

\[
S_i = \sum_{j=1}^{n} d_{ij}
\]

\[
\lambda_n = \frac{S_n - S_{\min}}{S_{\min}}
\]

$S_i$ is the sum of all the distances in row i, where i varies from 1 to n, and $S_{\min}$ is the smallest of these row sums.

$\lambda_n$ is calculated for the new transaction entering the system, which is the last row of the matrix. It is checked against the threshold value; if it is greater than the threshold, the transaction is considered fraudulent.
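A small sketch of the row sums and the resulting check follows. It assumes D is a square matrix of pairwise distances whose last row belongs to the new transaction, and that S_min is the smallest row sum over all rows. The function name `deviation_score`, the sample matrix and the threshold of 0.5 are hypothetical.

```python
def deviation_score(D):
    """Relative deviation of the last row sum of D from the smallest row sum."""
    sums = [sum(row) for row in D]
    s_min = min(sums)
    if s_min == 0:
        raise ValueError("smallest row sum is zero; the score is undefined")
    return (sums[-1] - s_min) / s_min

# Hypothetical pairwise distances for three transactions; the last is the new one.
D = [
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 4.0],
    [5.0, 4.0, 0.0],
]
lam = deviation_score(D)    # row sums are 6, 5 and 9, so lam = (9 - 5) / 5
THRESHOLD = 0.5             # hypothetical pre-defined threshold
is_fraud = lam > THRESHOLD
```

A score of 0 means the new transaction is the most central one; larger values mean it lies farther from the cardholder's past behavior than the most typical transaction does.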

Structural design of the algorithm

The fraudulent transaction detection process follows these steps:

Transaction data set
Normalization
Deviation calculation
Dispersion matrix
Setting threshold
Compare threshold with the calculated value

Figure: Steps in the algorithm processing.

In detail, fraudulent transaction detection proceeds as follows:

First, obtain the required transaction parameters with the specified attributes. This data is then normalized and transformed to give the transaction matrix, which is used in further processing.

Next, calculate the dispersion matrix, which holds the weighted distance between every pair of transactions. The row sum $S_i = \sum_{j=1}^{n} d_{ij}$ is then computed for each transaction. A larger sum indicates that the transaction lies farther from the others.

After calculating the dispersion matrix, compute the deviation value $\lambda_n = (S_n - S_{\min}) / S_{\min}$ for the incoming transaction. This value is compared against the pre-defined threshold in order to check for a fraudulent transaction.
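The steps above can be tied together in one function. This is only a sketch under several assumptions: attributes are standardized by their population standard deviation, the new transaction is appended as the last row before standardization, S_min is taken over all rows, and the weights, sample data and threshold of 1.0 are hypothetical. It is not the authors' implementation, which the paper leaves as future work.

```python
import math

def fraud_check(history, new_tx, weights, threshold):
    """Return (is_fraud, lam) for new_tx given past transactions of one cardholder."""
    rows = history + [new_tx]          # the new transaction becomes the last row
    n, m = len(rows), len(weights)

    # Step 1: column-wise standardization (constant columns are mapped to 0.0).
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(m)]
    z = [[(r[j] - means[j]) / stds[j] if stds[j] > 0 else 0.0 for j in range(m)]
         for r in rows]

    # Step 2: weighted Euclidean distance between every pair, summed per row.
    def dist(u, v):
        return math.sqrt(sum((w * (a - b)) ** 2 for a, b, w in zip(u, v, weights)))

    sums = [sum(dist(z[i], z[k]) for k in range(n)) for i in range(n)]

    # Step 3: relative deviation of the new row from the smallest row sum.
    s_min = min(sums)
    if s_min == 0:
        return False, 0.0              # all transactions identical: nothing deviates
    lam = (sums[-1] - s_min) / s_min
    return lam > threshold, lam

# Hypothetical rows of (transactions per day, amount per day).
history = [[2, 20.0], [3, 25.0], [2, 22.0], [3, 24.0]]
weights = [1.0, 2.0]

flagged_normal, lam_normal = fraud_check(history, [2, 23.0], weights, threshold=1.0)
flagged_outlier, lam_outlier = fraud_check(history, [9, 400.0], weights, threshold=1.0)
```

With this sample data the typical transaction stays below the threshold while the unusually frequent, high-value one exceeds it. In practice the threshold would need to be tuned against labeled transactions.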

Future work

Our future work will include implementing the proposed algorithm. We will generate a database of random values with the specified parameters for each transaction.


