Recommendation Using Regression Model Computer Science Essay

02 Nov 2017

Abstract:

A recommendation system suggests a list of the most popular items to a set of users on the basis of their interests. ERPM is a movie recommendation technique that overcomes the scalability and sparsity limitations of recommendation systems. It is one of the simplest recommendation methods, but the predictions it generates from its probability model are less accurate and take more time to calculate. To overcome this problem we propose a novel method based on a regression model, named CRRM (Category based Recommendation using Regression Model), which improves both prediction accuracy and speed. The performance of our method is evaluated by mean absolute error (MAE) and shows a 35% improvement in rating prediction on a data set of 100000 ratings.

Keywords: ERPM, Regression model, MAE, Recommendation system.

1. Introduction:

With the growing amount of information across the world, it is necessary to process this data quickly in a demanding environment. In real-world data processing, two fields play a key role: data mining and recommendation systems. Data mining is the process of filtering out unnecessary information and summarizing the rest into a useful form, while a recommendation system suggests that information to users and incorporates data mining techniques such as clustering and association rules to help in decision making. A recommendation system, then, processes an enormous range of data and suggests the useful and interesting items among it to a set of users on the basis of their interests.

One of the major applications of recommendation systems is online shopping, where a user looks for an item he or she wants to purchase. An item has many features, and it is the responsibility of the recommendation system to suggest useful items to users on the basis of their feature requirements.

Recommendation systems are a prominent part of the networked environment and are applied in many different areas, such as tag recommendation, music recommendation, item recommendation and movie recommendation. Among these, movie recommendation is one of the most valuable areas.

Jian Chen et al. [7] proposed an algorithm for movie recommendation named ERPM. In ERPM, the prediction is computed on the basis of a probability model. ERPM provides an easy way of computing the prediction from the whole database, but it suffers from poor prediction accuracy. To overcome this problem, a novel method based on a regression model, named CRRM, is proposed. Instead of computing the prediction from the whole database, the proposed algorithm computes it from the movie category.

The proposed algorithm calculates the rating that a user would give to a movie. We then compare the ratings generated by the proposed CRRM method and by the existing ERPM method, each against the actual rating present in the database. A smaller difference shows that the proposed method predicts more accurately and enhances the performance of the recommendation system.

The remainder of the paper is organised as follows: section 2 describes the related work, section 3 describes the regression based recommendation approach, section 4 presents the experimental results and performance study, and section 5 concludes the paper.

2. Related work:

In this section we briefly describe some of the research literature related to collaborative filtering, recommendation systems and regression models.

Collaborative filtering is the fundamental method used in recommendation systems and is applied in various domains. Much research has been done on recommendation systems using collaborative filtering algorithms, each showing some improvement over previously implemented algorithms. The first recommendation system based on collaborative filtering was Tapestry [8]. Collaborative filtering finds the similarity between users and between items in order to recommend items to similar users. Tapestry [8] works well for a small organisation where people know each other, but in a large organisation or community it is not possible to know every person well. Since then, various collaborative filtering methods for recommendation systems have been proposed by different researchers.

Clustering is another technique used in recommendation systems: it creates clusters of similar users based on similar user preferences. For a user who belongs to several clusters, the prediction is obtained by averaging across those clusters, weighted by the user's degree of participation in each.

Xiang Cui et al. [2] proposed a collaborative filtering algorithm based on uncertain user interest clusters, in which clusters are formed on the basis of uncertain features. Using a clustering algorithm, the authors resolve the uncertain features by computing the entropy between two clusters and obtain stable classes. They address the limitations of the traditional CF method on the basis of the trustworthiness degree of uncertain user interests.

Another collaborative filtering method used for recommendation is TyCo (typicality-based collaborative filtering), a novel method proposed by Yi Cai et al. [1]. It addresses data sparsity, recommendation inaccuracy and large prediction errors by borrowing the idea of object typicality from cognitive psychology. It finds the neighbours of a user on the basis of the user's typicality degrees in user groups, and it outperforms many CF based recommendation methods, with a 6.35% improvement in accuracy and a lower time cost.

Many recommendation systems focus on improving accuracy, but diversity is another aspect, addressed by Gediminas Adomavicius et al. [3]. In a traditional recommender system, the relevant items are ranked for each user in descending order of predicted rating and the highest-ranked items are recommended, which yields high accuracy. The authors use item popularity to increase the diversity of recommendations, and the technique they use for diversity improvement is flexible, efficient and parameterized.

Along with diversity, contextual information (information related to the recommendation) is another important aspect that has received little attention. The time and location of the recommendation, and information about the actor, director or writer, are examples of context. Toine Bogers [6] proposed the ContextWalk algorithm for movie recommendation, which addresses the difficulties of collecting contextual information and of generating a computable formalization of it. Bogers applies a random walk on the contextual graph, in which contexts are connected by links.

Noulas et al. [5] proposed a method for ranking venues that overcomes the problem of working with mobility data in collaborative filtering. The random walk approach handles the relationships among check-in data, social data and spatial data. Items are connected in a linked structure, and each link has a transition probability, which the random walk model uses to select the next item. The walk spends a different amount of time at each node, which yields a steady-state probability for each node. The output of the random walk model is this set of steady-state probabilities, and items are ranked in decreasing order of steady-state probability.
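The steady-state ranking described above can be illustrated with a small power-iteration sketch. The three-item transition matrix and the function name steady_state are made up for illustration and are not taken from Noulas et al.; the sketch only assumes a row-stochastic matrix of transition probabilities.

```python
def steady_state(transition, iterations=200):
    """Approximate the stationary distribution of a row-stochastic matrix."""
    n = len(transition)
    dist = [1.0 / n] * n  # start from a uniform distribution over the items
    for _ in range(iterations):
        # One step of the walk: new mass at j is the mass flowing in from every i.
        dist = [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]
    return dist


# Hypothetical transition probabilities between three items (each row sums to 1).
transition = [
    [0.1, 0.6, 0.3],
    [0.4, 0.2, 0.4],
    [0.5, 0.3, 0.2],
]

probabilities = steady_state(transition)

# Rank item indices by decreasing steady-state probability.
ranking = sorted(range(len(probabilities)), key=lambda i: probabilities[i], reverse=True)
print(probabilities, ranking)
```

Because every entry of this toy matrix is positive, repeated multiplication converges to a single distribution regardless of the starting point; a real venue graph would need the construction described in [5].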

Nowadays recommendation systems are growing in every field, and tag recommendation is one of their applications. Tag recommendation is useful when searching for a topic, and other tags related to the topic can be recommended for a new resource. Retrieving useful information is a major task in social networks, so Jun Hu et al. [4] proposed a method, named personalised tag recommendation, for retrieving information in an easy and suitable way. The authors first create the network, then propose a topology based on tagging history and latent personalised preference, and recommend tags on the basis of the users who have the most influence on other users. The result is better than the non-personalized global co-occurrence method, even when the experiment is performed on large-scale real-world data.

Jian Chen et al. [7] addressed the limitations of collaborative filtering, such as scalability and sparsity, and proposed ERPM, a method based on a probability model. ERPM computes the predicted rating by finding, for each rating value, the probability that the user gives that rating to the movies he or she has watched, and similarly the probability that the movie receives that rating from users. A comparison of MAE shows that ERPM outperforms collaborative filtering and gives better prediction accuracy.

3. Regression based recommendation approaches:

[A] Data modeling and representation:

A typical recommendation system that provides an e-business facility contains a list of n users, represented by the set U = [u1, u2, u3, ..., un]. A user selects an item from a list of m items, represented by the set I = [i1, i2, i3, ..., im], and the relationship between users and items is represented by an n×m matrix. Each entry of the matrix is the rating given by user ui ∈ U to item ij ∈ I, written ri,j, meaning the rating given by user i to item j.

        i1      i2      …       ij      …       im
u1      r1,1    …       …       r1,j    …       r1,m
u2      r2,1    r2,2    …       …       …       …
…       …       …       …       …       …       …
ui      ri,1    …       …       ri,j    …       ri,m
…       …       …       …       …       …       …
un      rn,1    rn,2    …       …       …       rn,m

Table 1: Representation of the user-item matrix.
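As a minimal sketch of the representation in Table 1, the matrix can be held as a list of rows, one per user. The function name build_rating_matrix, the use of None for an unrated item and the sample triples are assumptions made for illustration only.

```python
def build_rating_matrix(ratings, n_users, n_items):
    """Return an n_users x n_items matrix; None marks an item the user has not rated."""
    matrix = [[None] * n_items for _ in range(n_users)]
    for user, item, rating in ratings:
        matrix[user - 1][item - 1] = rating  # user and item ids are 1-based
    return matrix


# Hypothetical (user id, item id, rating) triples.
ratings = [(1, 1, 5), (1, 3, 2), (2, 2, 4), (3, 1, 1), (3, 3, 3)]
matrix = build_rating_matrix(ratings, n_users=3, n_items=3)

for row in matrix:
    print(row)
```

Row i of the result corresponds to user ui and column j to item ij, so matrix[i - 1][j - 1] plays the role of ri,j.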

[B] What is a regression model?

A regression model is a model that has both deterministic and probabilistic components. In a deterministic model, the value of one variable is predicted from another variable and is represented by y = f(x), which means that the value of y is determined by x. The model is called deterministic because y depends entirely on x. In real life, however, y can rarely be determined entirely from x, so we use a probabilistic model. A probabilistic model (or probability model) predicts the value of a variable on the basis of previous information and is represented by Y ~ p(y), where Y is generated at random from the probability distribution p(y); the distribution does not tell exactly what the value of Y will be. Like a deterministic model, a regression model predicts the value of one variable from another, and it combines the features of both models. A regression model is represented by Y ~ p(y|x), where Y is generated at random from the probability distribution for a known x, and the values of x and y used to construct the model are taken from a sample of objects. Compared with other models, a regression model takes less time and/or money to retrieve the information needed for computing the prediction.
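The difference between y = f(x) and Y ~ p(y|x) can be sketched as follows. The linear form a + b*x, the normal noise, the parameter values and the function names are all assumptions chosen for illustration; the text above does not fix a particular distribution.

```python
import random


def deterministic(x, a=1.0, b=0.5):
    """Deterministic model: y is fully determined by x."""
    return a + b * x


def regression_sample(x, a=1.0, b=0.5, noise_sd=0.3, rng=random):
    """Regression model: Y is drawn from p(y|x), here a normal centred on a + b*x."""
    return rng.gauss(deterministic(x, a, b), noise_sd)


rng = random.Random(0)  # fixed seed so the run is repeatable
x = 4.0
samples = [regression_sample(x, rng=rng) for _ in range(10000)]
mean_y = sum(samples) / len(samples)

# The deterministic value is always the same; the sampled values scatter around it.
print(deterministic(x), mean_y)
```

For a fixed x the deterministic model returns one value, while the regression model returns a different draw each time whose average settles near that value.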

[C] Problem description:

The main goal of a recommendation system is to suggest the available items to listed users according to their interests, and to generate high-quality predictions at high speed for an active user-movie pair. Instead of calculating the predicted rating from the whole database as in ERPM, we evaluate the rating from the category of the movie using a regression model.

A movie is watched by a number of users, each of whom gives it a rating between 1 and 5. A user gives rating 1 to a movie he or she does not like and rating 5 to a movie he or she likes very much. The quality of a movie depends on the ratings of many users and cannot be evaluated from a single user.

Movie m is rated by N_m different users. The number of users who give movie m rating 1 is denoted Q_1, the number who give it rating 2 is denoted Q_2, and so on. The probability P_{m,r} of movie m receiving rating r is calculated as

$$P_{m,r} = \frac{Q_r}{N_m} \qquad (1)$$

where $N_m = Q_1 + Q_2 + Q_3 + \dots + Q_{r_{max}}$, r is the rating and m is the movie number. In the same way we calculate the probability for a user, by counting the number of movies of category c to which the user gives rating r. User d has rated N_d different movies of category c; S_{c,1} is the number of movies of category c rated 1 by user d, S_{c,2} is the number rated 2, and so on. The probability for the user within a movie category is calculated as

$$P_{d,c,r} = \frac{S_{c,r}}{N_d} \qquad (2)$$

where $N_d = S_{c,1} + S_{c,2} + S_{c,3} + \dots + S_{c,r_{max}}$, $r_{max}$ is the highest rating given by user d, and c is the category of movie m.
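A minimal sketch of these two counts, assuming equations (1) and (2) are the relative frequencies Q_r / N_m and S_{c,r} / N_d that the surrounding definitions suggest. The function names, the zero-probability fallback for an empty sample and the sample ratings are hypothetical.

```python
from collections import Counter

R_MAX = 5  # ratings run from 1 to 5


def rating_distribution(ratings):
    """Relative frequency of each rating value 1..R_MAX in a list of ratings."""
    total = len(ratings)
    counts = Counter(ratings)
    if total == 0:
        # Assumption: with no ratings at all, every probability is taken as 0.
        return {r: 0.0 for r in range(1, R_MAX + 1)}
    return {r: counts[r] / total for r in range(1, R_MAX + 1)}


def movie_probabilities(movie_ratings):
    """P(m, r) = Q_r / N_m, computed from all ratings given to one movie."""
    return rating_distribution(movie_ratings)


def user_category_probabilities(user_ratings, category):
    """P(d, c, r) = S_{c,r} / N_d, from one user's (category, rating) pairs."""
    in_category = [rating for cat, rating in user_ratings if cat == category]
    return rating_distribution(in_category)


# Hypothetical data: ratings received by one movie, and one user's rating history.
p_movie = movie_probabilities([5, 4, 4, 3, 5, 4, 1, 4])
p_user = user_category_probabilities(
    [("Comedy", 4), ("Comedy", 5), ("Drama", 2), ("Comedy", 4), ("Drama", 3)],
    "Comedy",
)
print(p_movie)
print(p_user)
```

Each distribution sums to 1 over the rating values whenever at least one rating is available, which is a quick sanity check on the counts.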

[D] Recommendation generation:

The CRRM method computes the prediction on the basis of the movie category, whereas the ERPM method computes the prediction over the whole database.

=

where $P_{d,m,c,r}$ is defined as the probability that user d gives rating r to movie m of category c, and is calculated as

$$P_{d,m,c,r} = [P_{m,r} \times P_{d,c,r}]\,[\text{Regression}] \qquad (4)$$

The value of the regression is calculated by the equation

$$Y' = a + bX \qquad (5)$$

The regression model finds the regression value from one independent variable and one dependent variable using equation (5), where X is the independent variable, Y is the dependent variable and Y' is the regression value. The values of a and b are calculated by equations (6) and (7):

$$a = \bar{Y} - b\bar{X} \qquad (6)$$

where $\bar{X} = \frac{1}{n}\sum X$ and $\bar{Y} = \frac{1}{n}\sum Y$ are the means of the n sample values, and

$$b = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} \qquad (7)$$

Substituting the values of a and b into equation (5) produces the regression value, and the predicted rating is produced from this regression value.
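The computation of a and b can be sketched as below, assuming equations (6) and (7) are the usual least-squares estimates for a line through sample points (X, Y). The function name fit_line and the sample data are hypothetical, and the sketch does not say which quantities CRRM uses as X and Y.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for Y' = a + b * X."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: co-variation of X and Y divided by the variation of X.
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    b = s_xy / s_xx
    # Intercept: the fitted line passes through the point of means.
    a = y_mean - b * x_mean
    return a, b


# Hypothetical sample of (X, Y) values.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = fit_line(xs, ys)

# Regression value Y' for a new X, as in equation (5).
regression_value = a + b * 6
print(a, b, regression_value)
```

The slope is undefined when all X values are equal, so a caller would need to handle that case before dividing.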

[E] Advantages of CRRM compared with ERPM:

The predictions computed with the regression model in CRRM give better prediction accuracy and increase the speed of recommendation. The reason is that CRRM computes the predicted value for a movie from the category of the movie rather than from the whole database. The second important advantage is that CRRM removes the dependency on any parameter (α and β in ERPM) and calculates the prediction from a mathematical equation. Predictions are therefore improved, because in ERPM a change in the value of any parameter changes the result. Compared with other models, the CRRM model takes less time and money to retrieve the information needed for prediction.

4. Experimental result and performance study:

[A] Description of dataset used:

We used the MovieLens data set, which was collected by the GroupLens Research Project at the University of Minnesota. The MovieLens data set has 943 users and 1682 movies, obtained by considering only those users who rated at least 20 movies, out of 50000 users and more than 3000 different movies. The data set consists of 100000 ratings, each between 1 (for bad movies) and 5 (for good movies).

The data set contains the user, movie and rating information in the user, movie and data files respectively. The user file contains the user id, age, gender, occupation and zip code. The movie file contains the movie id, release date, movie title, IMDb URL and list of genres. The data file contains the user id (uid), movie id (mid), rating and timestamp. Of these, the uid from the user file, the mid and movie genre from the movie file, and the uid, mid and rating from the data file are useful for our work; the rest are ignored.

The data set was converted into a user-movie matrix with 943 rows and 1682 columns, in which each entry is the rating, between 1 and 5, given by one of the 943 users to one of the 1682 movies. The experiments were performed on Windows 7 with 4 GB of main memory, a Core i3 processor and Java JDK 1.7.
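Reading the data file into (uid, mid, rating) triples might look like the sketch below. It assumes one record per line with tab-separated fields in the order user id, movie id, rating, timestamp; the sample lines and the function name parse_ratings are illustrative, not taken from the experiment.

```python
# Illustrative records in the assumed layout: uid, mid, rating, timestamp.
SAMPLE = "196\t242\t3\t881250949\n186\t302\t3\t891717742\n22\t377\t1\t878887116\n"


def parse_ratings(text):
    """Parse tab-separated rating records, dropping the timestamp field."""
    triples = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        uid, mid, rating, _timestamp = line.split("\t")
        triples.append((int(uid), int(mid), int(rating)))
    return triples


triples = parse_ratings(SAMPLE)
print(triples)
```

The timestamp is discarded here because, as stated above, only the uid, mid and rating fields of the data file are used.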

[B] MAE evaluation metric:

The quality of a recommendation system can be evaluated by several measures. MAE (mean absolute error) is one of the most popular among them; it measures the error between the actual rating and the predicted rating over the user-movie pairs.

$$MAE = \frac{1}{N} \sum_{m=1}^{N} \left| r_m - pr_m \right|$$

Here r_m denotes the actual rating, pr_m denotes the predicted rating, and N is the total number of items. A lower MAE value means better prediction by the proposed recommendation algorithm.
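A minimal sketch of the MAE computation, using the standard definition (the mean of the absolute differences between actual and predicted ratings). The function name and the sample ratings are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average of |r_m - pr_m| over all rated items."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("need two non-empty rating lists of equal length")
    return sum(abs(r - p) for r, p in zip(actual, predicted)) / len(actual)


# Hypothetical actual and predicted ratings for four user-movie pairs.
actual = [5, 3, 4, 1]
predicted = [4.5, 3.5, 2.0, 1.0]
mae = mean_absolute_error(actual, predicted)
print(mae)
```

With these sample values the absolute errors are 0.5, 0.5, 2.0 and 0.0, so the mean is 0.75.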

[C] Result:

The table below gives the MAE values for both methods, ERPM and CRRM, for 100 users and 1682 movies.

Parameter    ERPM           CRRM
MAE          3.364328638    2.035591765

A lower MAE indicates better prediction. This result was computed for 100 users and 1682 movies, and it shows a 35-40% improvement in prediction over 100000 ratings.
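The size of the improvement can be checked from the two reported MAE values, assuming it is measured as the relative reduction in MAE with respect to ERPM. The variable names are illustrative.

```python
# MAE values reported in the result table.
mae_erpm = 3.364328638
mae_crrm = 2.035591765

# Relative reduction in MAE achieved by CRRM with respect to ERPM.
improvement = (mae_erpm - mae_crrm) / mae_erpm
print(f"{improvement:.1%}")
```

Under that assumption the reduction comes to roughly 39.5%, which falls inside the stated 35-40% range.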

5. Conclusion:

A recommendation system is a decision making system that suggests the most popular items (movies, music) to users on the basis of their interests. The ERPM method used for movie recommendation suffers from poor prediction and takes more time to produce a recommendation. To overcome these problems, in this paper we proposed a novel method, CRRM, which predicts the rating from the category of the movie using a regression model and copes with dynamic data changes. The experiments show that CRRM gives better prediction accuracy and speeds up the recommendation process.

References:

[1] Yi Cai, Ho-fung Leung, Qing Li, Huaqing Min, Jie Tang and Juanzi Li, "Typicality-based Collaborative Filtering Recommendation", IEEE Transactions on Knowledge and Data Engineering, Jan 2013.

[2] Xiang Cui and Guisheng Yin, "Method of collaborative filtering based on uncertain user interests cluster", Journal of Computers, vol. 8, no. 1, January 2013, pp. 186-193.

[3] Gediminas Adomavicius and YoungOk Kwon, "Improving Aggregate Recommendation Diversity Using Ranking-Based Techniques", IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 5, May 2012.

[4] Jun Hu, Bing Wang, Yu Liu and De-Yi Li, "Personalized Tag Recommendation Using Social Influence", Journal of Computer Science and Technology 27(3): 527-540, May 2012.

[5] Anastasios Noulas, Salvatore Scellato, Neal Lathia and Cecilia Mascolo, "A Random Walk Around the City: New Venue Recommendation in Location-Based Social Networks", 2012.

[6] Toine Bogers, "Movie Recommendation using Random Walks over the Contextual Graph", 2010.

[7] Jian Chen, Jin Huang and Huaqing Min, "Easy Recommendation Based on Probability Model", IEEE, 2008, pp. 441-444.

[8] David Goldberg, David Nichols, Brian M. Oki and Douglas Terry, "Using collaborative filtering to weave an information Tapestry", Communications of the ACM, Dec 1992.

[9] http://www.psychstat.missouristate.edu/introbook/sbk16.htm

[10] http://courses.ttu.edu/isqs5349westfall/images/5349/deterministic_stochastic.htm


