Recommendation Techniques For Items Computer Science Essay

Published Date: 02 Nov 2017

A. Gourav Jain, B. Nishchol Mishra, C. Sanjeev Sharma

School of Information Technology

Rajiv Gandhi Proudyogiki Vishwavidyalaya, Bhopal, India


Abstract: The amount of information available on the internet is increasing rapidly, which makes it challenging to provide relevant information to users: whenever a user wants to find data of interest, he or she has to search whole databases, which is tedious and time consuming. A system is therefore needed that recommends useful information based on user interest; such a system is called a recommendation system. A recommendation system is a powerful tool that supports decision making and ranks the most popular items based on user preference. Researchers have proposed various algorithms for the recommendation of web pages and items, and some of the most popular and efficient of these are discussed in this paper. The paper gives a snapshot of the work accomplished in the field of recommendation of web pages and items.

Index Terms: Collaborative Filtering, LDA, Naive Bayes, ERPM, TyCo.

1. INTRODUCTION

Whenever a user wants to find data of interest, he or she has to search whole databases, which is tedious and time consuming. A system is therefore needed that recommends useful information based on user interest by filtering out the unnecessary information. A recommendation system is a type of information filtering system that builds a model from the characteristics of items and the ratings users give to them, and uses it to predict the rating a user would give to an item. The main motivations for using a recommendation system are that it is based on real activity, reduces information overload and requires little organizational maintenance. Any recommendation system consists of two basic entities, users and items. A recommendation system is based on the ranking of items, where a ranking is a relationship between a set of items such that, for any two items, the first is either 'ranked higher than', 'ranked lower than' or 'ranked equal to' the second; higher ranked items are preferred over lower ranked items.

The output of a recommendation system can be a prediction or a recommendation, depending on the type of input given to the system. The input may be either ratings or data, where a rating is the opinion a user expresses about an item and data may be the age, gender and education of users. A prediction is the estimated opinion of a user about an item and should have a low error rate; a recommendation is the item that is most likely to be liked by the user. Recommendation systems are applied in e-commerce and on the internet.

Data mining is the process of analyzing data from different perspectives and summarizing it into useful information. It is used in recommendation to analyze data and find patterns in it. This paper focuses on the recommendation algorithms applied to web pages and to items separately. The motivation behind this paper is to analyze the various algorithms used for the recommendation of web pages and items.

The remaining part of this paper is organized as follows: Section 2 describes the algorithms used for the recommendation of items, Section 3 describes the recommendation of web pages, and the final section concludes the paper.

2. RECOMMENDATION FOR ITEMS

The recommendation of items can be done using several methods. Sections 2.1 to 2.8 below describe each of the methods in brief.

2.1 Collaborative Filtering

Collaborative filtering is a technique in which users provide opinions on a set of products, and the opinions of other users are used to establish the quality of a product for a given user. Traditional collaborative filtering systems have the drawback that they do not scale to large data sets and do not provide high quality recommendations. To overcome these problems, a recommendation algorithm is needed that produces high quality recommendations even for large scale data. To address these issues Sarwar et al. [1] proposed the item-based collaborative filtering recommendation algorithm. The algorithm first analyzes the user-item matrix to identify relationships between different items, and then uses these relationships to compute recommendations for users indirectly. The authors analyze different techniques for calculating item similarity, on which the recommendation is based. The experimental results reveal that the item-based algorithm gives better results than the user-based recommendation algorithm.
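The item-based idea can be illustrated with a minimal sketch. The ratings below are hypothetical, and cosine similarity over co-rating users is assumed here as just one possible item similarity measure; the sketch is not the exact procedure of [1].

```python
from math import sqrt

# Hypothetical user-item ratings; a missing key means "not rated".
RATINGS = {
    "alice": {"item1": 5, "item2": 3, "item3": 4},
    "bob": {"item1": 4, "item2": 2, "item3": 5, "item4": 4},
    "carol": {"item1": 1, "item2": 5, "item4": 2},
}


def item_vector(ratings, item):
    """Return {user: rating} for every user who rated the item."""
    return {user: r[item] for user, r in ratings.items() if item in r}


def cosine_similarity(vec_a, vec_b):
    """Cosine similarity restricted to users who rated both items."""
    common = set(vec_a) & set(vec_b)
    if not common:
        return 0.0
    dot = sum(vec_a[u] * vec_b[u] for u in common)
    norm_a = sqrt(sum(vec_a[u] ** 2 for u in common))
    norm_b = sqrt(sum(vec_b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b)


def predict_rating(ratings, user, target_item):
    """Similarity-weighted average of the user's own ratings on other items."""
    target_vec = item_vector(ratings, target_item)
    numerator = denominator = 0.0
    for item, rating in ratings[user].items():
        if item == target_item:
            continue
        sim = cosine_similarity(target_vec, item_vector(ratings, item))
        numerator += sim * rating
        denominator += abs(sim)
    return numerator / denominator if denominator else None


if __name__ == "__main__":
    print(predict_rating(RATINGS, "alice", "item4"))
```

Because item-to-item similarities change more slowly than user neighbourhoods, they can be precomputed offline, which is one reason the item-based approach scales better.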

2.2 Ranking by Naive Bayes

Suggesting the truly popular items to users is difficult, because giving more importance to some items may affect the ranking of other items whose ranking depends on them. Gouthami et al. [2] therefore proposed using the Naive Bayes algorithm for ranking and suggesting popular items. It is a model-based recommender and provides a fast and highly scalable model. For ranking, the Naive Bayes algorithm uses Bayes' theorem, which is based on conditional probabilities. The algorithm treats every attribute in the data as equally important and independent of the others. The algorithm constructs a decision tree, which helps in ranking the popular items.

The Naive Bayes algorithm constructs a suggestion set that contains fewer items than the entire set (or an equal number, if the total number of items is small). Each requested item is checked against the suggestion set: if the item is present, nothing changes; if it is not present, the requested item replaces the item in the suggestion set that has the lowest probability.

2.3 Improving Aggregate Recommendation Diversity Using Ranking-Based Techniques

Recommender systems are becoming increasingly important to individual users and businesses for providing personalized recommendations. Recommendation techniques mainly emphasize improving accuracy, while other aspects, such as the diversity of recommendations, have received less attention. Gediminas et al. [3] therefore proposed several item ranking techniques that produce recommendations with high diversity while maintaining accuracy. A traditional recommender system achieves high accuracy because it ranks the relevant items for each user in descending order of predicted rating and recommends the top N items. The proposed methods consider additional factors, such as item popularity, when ranking the recommendation list, which increases diversity. The diversity gains were evaluated using real-world rating data sets and different rating prediction algorithms. The benefit of the ranking techniques is that they offer flexibility to system designers, because they are parameterizable and can be used in conjunction with different rating prediction algorithms. They are also based on scalable sorting-based heuristics and are therefore extremely efficient. The proposed methods show a diversity improvement of about 20-25 percent with 0.1 percent accuracy loss.
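A popularity-aware re-ranking step of this kind might look like the following sketch. The item names, predicted ratings, popularity counts and the threshold value are all hypothetical, and the rule used here (items whose predicted rating clears a threshold are ordered by ascending popularity) is an assumed simplification rather than the exact technique of [3].

```python
def rerank_by_popularity(predicted, popularity, threshold, top_n):
    """Re-rank candidates so that less popular but still highly rated items come first.

    predicted:  {item: predicted rating for one user}
    popularity: {item: number of known ratings for the item}
    threshold:  items predicted at or above this value are ranked by popularity
    """
    above = [item for item in predicted if predicted[item] >= threshold]
    below = [item for item in predicted if predicted[item] < threshold]
    # Least popular first; ties are broken by higher predicted rating.
    above.sort(key=lambda item: (popularity[item], -predicted[item]))
    # Remaining items keep the standard order: highest predicted rating first.
    below.sort(key=lambda item: -predicted[item])
    return (above + below)[:top_n]


if __name__ == "__main__":
    predicted = {"blockbuster": 4.9, "indie": 4.6, "niche": 4.5, "flop": 2.0}
    popularity = {"blockbuster": 900, "indie": 40, "niche": 12, "flop": 300}
    # A threshold above every prediction reproduces the standard ranking.
    print(rerank_by_popularity(predicted, popularity, threshold=5.1, top_n=2))
    # A lower threshold trades a little accuracy for long-tail items.
    print(rerank_by_popularity(predicted, popularity, threshold=4.5, top_n=2))
```

The threshold acts as the tuning parameter: raising it moves the list back towards the accuracy-oriented ranking, lowering it favours diversity.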

2.4 Method of Collaborative Filtering Based on Uncertain User Interest Cluster

A recommender system helps in decision making and is one of the most powerful and popular tools in electronic commerce. With the help of a recommendation system, users can easily decide what music to listen to, what news to read and what items to buy. Collaborative filtering based recommendation systems suffer from the problem of data scalability. Xiang et al. [4] therefore proposed a collaborative filtering based recommendation method using clustering techniques, in which the uncertain interests of users are considered, because the data recorded in computer logs have uncertain features. These uncertain features are handled with the help of a clustering algorithm. A weight factor is used to measure the quality of the clustering result, which in turn leads to an improved collaborative filtering recommendation method based on the weight factor. The experimental results show that the proposed method outperforms the traditional method.

2.5 Recommendation of Movies

In this section we discuss the different methods used for the recommendation of movies.

2.5.1 Typicality-Based Collaborative Filtering Recommendation

Current CF methods suffer from data sparsity, recommendation inaccuracy and big-error predictions. To deal with these problems Yi Cai [5] proposed a novel method named TyCo, which uses the idea of object typicality from cognitive psychology: the neighbours of a user are found based on user typicality degrees in user groups. The TyCo method clusters all items into several item groups and creates a user group corresponding to each item group. Each user has a different typicality degree in each of the user groups. With the help of the user typicality matrix, the similarity between users is measured based on their typicality degrees in all user groups, which gives the neighbours of each user. The unknown rating of a user on an item is then predicted from the ratings of that user's neighbours on the item. On the MovieLens data sets, TyCo shows better accuracy (an improvement of 6.35%), better performance with sparse training data (an improvement of 9.89%) and lower time cost than traditional CF algorithms. In addition, it obtains more accurate predictions with fewer big-error predictions.
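The neighbour selection and prediction steps can be sketched as follows. The typicality vectors and ratings are hypothetical, and the similarity function (an exponential of the negative Euclidean distance between typicality vectors) and the `min_similarity` cut-off are assumptions made for illustration, not necessarily the exact definitions used in [5].

```python
from math import exp, sqrt

# Hypothetical typicality degrees of each user in three user groups.
TYPICALITY = {
    "u1": [0.9, 0.1, 0.3],
    "u2": [0.8, 0.2, 0.4],
    "u3": [0.1, 0.9, 0.7],
}

# Hypothetical known ratings of the neighbours on one movie.
KNOWN_RATINGS = {"u2": {"movie_x": 5}, "u3": {"movie_x": 2}}


def typicality_similarity(vec_a, vec_b):
    """Assumed similarity: close typicality vectors give values near 1."""
    distance = sqrt(sum((a - b) ** 2 for a, b in zip(vec_a, vec_b)))
    return exp(-distance)


def predict(user, item, typicality, ratings, min_similarity=0.0):
    """Similarity-weighted average of the ratings given by the user's neighbours."""
    numerator = denominator = 0.0
    for other, vector in typicality.items():
        if other == user or item not in ratings.get(other, {}):
            continue
        sim = typicality_similarity(typicality[user], vector)
        if sim < min_similarity:
            continue  # not similar enough to count as a neighbour
        numerator += sim * ratings[other][item]
        denominator += sim
    return numerator / denominator if denominator else None


if __name__ == "__main__":
    print(predict("u1", "movie_x", TYPICALITY, KNOWN_RATINGS))
    print(predict("u1", "movie_x", TYPICALITY, KNOWN_RATINGS, min_similarity=0.5))
```

Because neighbours are compared on a small number of group typicality degrees rather than on sparse co-rated items, two users can be neighbours even if they have rated no items in common.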

2.5.2 Contextual Walk

Recommendation systems are useful in solving the problem of information overload, but most existing recommender systems focus only on ratings and not on context. The context can be the time and location of the recommendation, or movie information such as actor, director and writer. Toine Bogers [6] therefore proposed ContextWalk, a recommendation algorithm that includes different types of contextual information. ContextWalk makes it easy to add contextual features such as time, mood information and social network information. The algorithm is intended to address two difficulties: first, contextual information is difficult to collect; second, it is difficult to produce a computable formalization of contextual information. Bogers applies the ContextWalk algorithm to movie data sets: movies and their context are connected by links, which gives a contextual graph on which a random walk is performed. The advantage of ContextWalk is that it can support many recommendation tasks, such as recommending movies for a group of users or tag recommendation, with the same random walk model and without retraining.

2.5.3 ERPM

Collaborative filtering (CF) is classified into two categories, user-based CF and item-based CF, and each has limitations. User-based CF lacks scalability, while item-based CF suffers from the data sparsity problem and fails to capture dynamic changes in the relationships between items. Jian et al. [7] address these problems by proposing a method based on a probability model, named ERPM (Easy Recommendation based on Probability Model). ERPM generates predictions more efficiently and accurately than collaborative filtering algorithms and also requires far less storage. Instead of comparing user or item similarity, ERPM finds the probable rating with the help of a probability model: it computes the probability of each rating value for each user over the movies that user has rated, and the probability of each rating value for each movie over the users who rated it, and derives the predicted rating from these probabilities. The quality of the recommendations is checked using the mean absolute error (MAE), on which ERPM gives better prediction accuracy than the CF algorithm.
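For reference, MAE is the average absolute difference between predicted and actual ratings, so lower values mean more accurate predictions. A minimal sketch, with hypothetical rating lists, is:

```python
def mean_absolute_error(predicted, actual):
    """Average of |predicted - actual| over all test ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("both lists must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


if __name__ == "__main__":
    # Hypothetical predicted and actual ratings for three user-movie pairs.
    print(mean_absolute_error([4, 3, 5], [5, 3, 3]))
```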


2.6 Tag Recommendation

2.6.1 Personalised Tag Recommendation

Tag recommendation helps bridge the semantic gap between humans and the features of media objects by allowing users to add more tags, which provides a feasible solution for content-based multimedia information retrieval. The tag-based recommendation proposed by Jun et al. [9] is used for retrieving information in an easy and convenient way. An online social network is constructed from the social relationship information of users. The authors propose a network topology in which nodes are used to characterize users' social influence. Recommendation is performed based on the user's tagging history and the latent personalized preference learned from those who are most influential in the user's social network. The experiment is performed on large-scale real-world data, and the results show that the proposed method outperforms the non-personalised global co-occurrence method and two other state-of-the-art personalised methods that use social networks.

2.6.2 LDA

Nowadays tagging systems make a major contribution to recommendation on the web. Tags help in searching, and other tags belonging to the same topic can be recommended for new resources. For recommending tags for resources, Ralf et al. [10] proposed an approach based on Latent Dirichlet Allocation (LDA). LDA is a generative model in which observations are explained by unobserved groups, which accounts for why some parts of the data are similar. The goal of this method is to overcome the cold start problem for tagging new resources, which "reduces the usefulness of tags in particular for resources annotated only by a few users". LDA takes three input parameters: the number of terms to represent latent topics, the number of latent topics to represent a document, and the overall number of latent topics to be identified in the given corpus. This method shows better precision and recall than association rules; for a high threshold value, precision increases and recall decreases.

2.7 Venue Recommendation

Noulas et al. [11] proposed a random walk method for ranking items (venues). Previously proposed collaborative filtering (CF) algorithms work well in online recommendation scenarios but give unsatisfactory results on mobility data. The random walk approach addresses this by relating check-in, social and spatial data to one another. It operates on a linked structure of connected items in which every link has a transition probability, which the random walker uses to choose the next node. The random walker moves from one node to another according to these transition probabilities. The walker spends a different amount of time at each node, and in the long run a steady state is reached in which each node has a steady-state probability. These steady-state probabilities are the output of the random walk model. The walk starts from a target node and returns to it, so nodes that are closer to the target node rank higher than other nodes. Places are ranked in decreasing order of steady-state probability. The random walk shows a 5-18% improvement compared to other algorithms.
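The steady-state computation can be sketched with a random walk with restart on a small graph. The graph, node names, uniform transition probabilities and the restart probability of 0.15 are all hypothetical choices for illustration; the model in [11] combines check-in, social and spatial links in its own way.

```python
# Hypothetical undirected graph linking a user, a friend and three venues.
GRAPH = {
    "user": ["friend", "cafe"],
    "friend": ["user", "cafe", "museum"],
    "cafe": ["user", "friend"],
    "museum": ["friend", "stadium"],
    "stadium": ["museum"],
}


def random_walk_with_restart(graph, start, restart=0.15, iterations=100):
    """Approximate the steady-state probabilities by repeated propagation."""
    prob = {node: 0.0 for node in graph}
    prob[start] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        for node, neighbours in graph.items():
            # Uniform transition probability to each neighbour.
            share = (1.0 - restart) * prob[node] / len(neighbours)
            for neighbour in neighbours:
                nxt[neighbour] += share
        nxt[start] += restart  # the walker jumps back to the start node
        prob = nxt
    return prob


def rank_venues(graph, user, candidates):
    """Rank candidate venues in decreasing order of steady-state probability."""
    prob = random_walk_with_restart(graph, user)
    return sorted(candidates, key=lambda venue: -prob[venue])


if __name__ == "__main__":
    # Venues the user has not visited yet.
    print(rank_venues(GRAPH, "user", ["stadium", "museum"]))
```

In this toy graph the museum is reachable through the user's friend, so it ends up with a higher steady-state probability than the more distant stadium.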

2.8 Video Recommendation

As the number of items (videos) on YouTube or any other site increases, users have more videos of interest to choose from, but it also becomes harder for them to find those videos. To address this problem, Baluja et al. [12] proposed a novel method named Adsorption, which provides personalized video suggestions for users based on an analysis of the user-video graph with the help of random walks. The adsorption algorithm is used for classification and learning when only some of the objects are labelled and a graph structure connects the labelled and unlabelled objects. The proposed algorithm constructs a personalized page that provides recommendations according to the user's viewing habits, along with the latest and most popular videos. With the adsorption algorithm the authors try to improve the efficacy of suggestions in YouTube. The authors also evaluate the recommendations on a three-month snapshot of live data from YouTube.

3. RECOMMENDATION FOR WEB PAGES

The recommendation of web pages can be done using several methods. Below from section 3.1 to 3.4 we describe each of the methods in brief:

3.1 Recommendation of web pages for online users using web log data

Data mining was introduced to gain a better understanding of the user than can be obtained from statistical analysis and online analytical processing (OLAP). Data mining is the process of discovering useful information from the large amounts of data stored in databases, data warehouses and other information repositories. Similarly, finding useful information in web data is known as web mining. Web mining is classified into three categories: web structure mining, web content mining and web usage mining. V. Chitraa et al. [13] proposed a recommendation system based on web usage mining, which mines user access patterns from usage logs that record every user's clicks and reflect user interest in a web site. Web usage mining consists of three distinct phases: pre-processing, pattern discovery and pattern analysis. To increase the efficiency of mining, pre-processing takes the usage data recorded in the server log and converts it into data abstractions. These data abstractions, which are necessary for pattern discovery, are obtained by extracting, decomposing, combining and deleting raw data. In the pattern discovery phase, data mining techniques are applied to the pre-processed log data to identify useful patterns. In the pattern analysis phase the main aim is to analyze the mined patterns and rules and to find the interesting ones. The paper focuses on web user clustering and recommendation for personalization. Fuzzy clustering is used to handle the ambiguity in the data during the pattern discovery phase, and the LCS (longest common subsequence) algorithm [ ] is used for recommendation, which classifies the current user's activities. The recommendations can help web site owners provide personalised services to users for effective browsing.
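The longest common subsequence step can be sketched as follows. The stored navigation patterns and page names are hypothetical, and the recommendation rule used here (suggest the unseen pages of the best-matching pattern) is an assumed simplification, not the exact procedure of [13].

```python
# Hypothetical navigation patterns discovered from the web log.
PATTERNS = [
    ["home", "products", "cart", "checkout"],
    ["home", "blog", "contact"],
]


def longest_common_subsequence(a, b):
    """Return one longest common subsequence of two page sequences."""
    rows, cols = len(a), len(b)
    # table[i][j] holds the LCS length of a[:i] and b[:j].
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            if a[i] == b[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])
    # Walk back through the table to recover the subsequence itself.
    result = []
    i, j = rows, cols
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            result.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return result[::-1]


def recommend_pages(current_session, patterns):
    """Suggest the unseen pages of the pattern that best matches the session."""
    best = max(
        patterns,
        key=lambda p: len(longest_common_subsequence(current_session, p)),
    )
    return [page for page in best if page not in current_session]


if __name__ == "__main__":
    print(recommend_pages(["home", "products"], PATTERNS))
```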

3.2 Page ranking algorithm

Larry Page and Sergey Brin (co-founders of Google) proposed the PageRank algorithm [14] for computing the ranking of web pages in their search engine. The algorithm focuses on the link structure of the web. The importance of a page is given by its rank score. The rank score is based on the inlinks of a page, that is, the links that point to the page from other pages. In contrast to the traditional method, inlinks from good pages carry more weight; as the number of sites linking to a page increases, the importance of the page also increases. PageRank depends on back links: if the sum of the ranks of the back links of a page is higher, that page has a higher PageRank than others. The PageRank method is best demonstrated by the figure below.

Fig. 1: Demonstration of Page Rank Algorithm.

As seen from the figure, although there are fewer links to Page C, it has a higher PageRank than Page E, because one link to C comes from an important page (B) and hence is of high value.
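The same effect can be reproduced with a short power-iteration sketch. The link graph below is hypothetical (it is not the graph from the figure), the damping factor of 0.85 is the commonly used value, and the sketch assumes that every page has at least one outgoing link.

```python
# Hypothetical link graph: page -> pages it links to.
# C has a single inlink (from B); E has three inlinks (from F, G and H).
LINKS = {
    "B": ["C"],
    "C": ["B"],
    "D": ["B"],
    "E": ["B"],
    "F": ["B", "E"],
    "G": ["B", "E"],
    "H": ["B", "E"],
}


def pagerank(links, damping=0.85, iterations=100):
    """Iteratively redistribute rank along links until it stabilises."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            # A page shares its rank equally among its outgoing links.
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    for page, score in sorted(pagerank(LINKS).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 3))
```

In this graph C outranks E even though E has more inlinks, because C's only inlink comes from the highly ranked page B.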

Some of the most popular techniques that help in PageRank computation are as follows. First, breadth-first search is used to figure out the structure of the network. Second, a sparse matrix representation is used to reduce memory storage without affecting the resulting ranking. Third, a compressed row vector helps to accelerate matrix multiplication: with a compressed row vector only L additions and L multiplications are required in place of N additions and 2N multiplications, where L is the size of the compressed row vector and N is the size of the full row vector. Apart from the PageRank algorithm there are many other algorithms for the recommendation of web pages, which are described as follows:

3.3 FlexiRank algorithm

The HITS algorithm used for ranking web pages has one drawback: it does not consider textual content. Debajyoti et al. [15] therefore proposed the FlexiRank algorithm, which focuses on the syntactic classification of web pages. The classification used in FlexiRank does not depend on the semantics of page content. It selects an appropriate class of pages for a given query based on the user's requirement. The classes include index page, home page, article, advertisement page and so on. Instead of working on a large number of pages, FlexiRank provides a way to rank a small number of pages. In FlexiRank, changes in the user query change the weights; for example, for article-type pages the authority weight is high, while for advertisement-type pages the hub weight is high. The main purpose of the FlexiRank algorithm is to provide flexibility to the user. It provides flexibility in two areas: property selection and determining the weights of the selected properties. FlexiRank provides accurate results along with flexibility, because the weights of the parameters used for ranking, such as relevance weight, hub and authority weights and link analysis of pages, are easy to change.

3.4 Weighted Page Rank

Wenpu Xing and Ali Ghorbani [16] proposed the weighted PageRank (WPR) algorithm, which is an extension of the PageRank (PR) algorithm. The algorithm assigns larger rank values to more popular pages, where the popularity of a page depends on its number of incoming and outgoing links. Unlike PageRank, this algorithm does not divide the rank value of a page evenly among its outgoing links; instead, each outgoing link gets a weight according to the popularity of the page it points to. The important parameters of this algorithm are back links and forward links. The pages are classified into four categories based on their relevancy to a given query: very relevant pages (VR), relevant pages (R), weakly relevant pages (WR) and irrelevant pages (IR). A relevancy rule is applied to compute the relevancy score of each page in the list of pages, which is used to differentiate WPR and PR. The experimental results indicate that the weighted PageRank algorithm gives a higher relevancy score and has lower complexity (< O(log N)) than the PageRank algorithm (O(log N)).
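The link weighting can be sketched as follows. The formulation assumed here is that the weight of a link from page v to page u is the product of an in-weight (the inlink count of u divided by the total inlink count of all pages v links to) and an out-weight (the same ratio for outlink counts); this is our reading of [16] and should be checked against the original paper. The link graph is hypothetical, and every linked page is assumed to appear as a key.

```python
# Hypothetical link graph: page -> pages it links to.
LINKS = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
}


def link_weights(links):
    """Weight each link (v, u) by the relative popularity of its target u."""
    in_degree = {page: 0 for page in links}
    for targets in links.values():
        for target in targets:
            in_degree[target] += 1
    out_degree = {page: len(targets) for page, targets in links.items()}

    weights = {}
    for v, targets in links.items():
        in_total = sum(in_degree[p] for p in targets)
        out_total = sum(out_degree[p] for p in targets)
        for u in targets:
            w_in = in_degree[u] / in_total if in_total else 0.0
            w_out = out_degree[u] / out_total if out_total else 0.0
            weights[(v, u)] = w_in * w_out
    return weights


def weighted_pagerank(links, damping=0.85, iterations=100):
    """Propagate rank along links in proportion to the link weights."""
    weights = link_weights(links)
    rank = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_rank = {page: 1.0 - damping for page in links}
        for (v, u), weight in weights.items():
            new_rank[u] += damping * rank[v] * weight
        rank = new_rank
    return rank


if __name__ == "__main__":
    print(link_weights(LINKS))
    print(weighted_pagerank(LINKS))
```

In this graph page A passes twice as much rank to C as to B, because C has more inlinks than B, whereas standard PageRank would split A's rank equally between them.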

4. CONCLUSION

As the amount of information in the world increases day by day, some means is needed to help users find the valuable information in it. One such means is recommendation, which helps users find data of interest. Recommendation systems are appearing everywhere, from movies to news to travel and leisure. They provide valuable personalized information that can greatly influence the way we use the web. Any recommendation system consists of two basic entities, users and items. A recommendation system is based on real activity: the system records actual user behaviour and gives recommendations based on it. It is therefore based not on guesswork but on objective reality. Recommendation systems are no longer a novelty, because they are applied in almost every domain; two such domains are web pages and items. This survey paper focuses primarily on these two domains and on the work done by different researchers on recommendation in them, which further helps in decision making.

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their detailed, valuable comments and constructive suggestions.
