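This thesis designs a personalized movie recommendation system. Movies now make up an important share of online resources, and as the number of movies available online keeps growing, a personalized recommendation system is increasingly needed. The goal of this work is therefore to recommend to each user movies that closely match their interests.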
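The thesis first describes the research status and significance of recommendation systems, then introduces the relevant recommendation algorithms with a focus on collaborative filtering, and examines the technologies needed to implement the system. It then presents the implementation of the complete recommendation system and closes with a review and summary of the project.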
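The system comprises a front-end movie display interface, a movie rating module, the implementation of the recommendation algorithms, and the design of the back-end database. Implementing the recommendation algorithms is the core of the system. The system uses the ml-latest-small dataset compiled by the GroupLens project team from the well-known American movie website MovieLens; it contains about 100,000 ratings by 671 users for more than 9,000 movies. All files in the dataset are first filtered and reorganized and then stored in the prepared database. The dataset is split into a training set and a test set at a fixed ratio; the algorithms are run on the training set to generate a Top-N personalized movie recommendation list and are then evaluated on the test set using at least two metrics, precision and recall.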
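The split-and-evaluate procedure described above can be sketched as follows. This is a minimal illustration, not the thesis code: the 80/20 split ratio, the random seed, and the function names `split_ratings` and `precision_recall` are assumptions, since the text only states that the data is divided "at a fixed ratio". Precision is taken as the share of recommended movies that appear in the user's test set, and recall as the share of test-set movies that were recommended, both pooled over all users.

```python
import random


def split_ratings(ratings, test_ratio=0.2, seed=42):
    """Randomly split (user, movie, rating) triples into train and test lists.

    test_ratio and seed are illustrative defaults, not values from the thesis.
    """
    rng = random.Random(seed)
    train, test = [], []
    for row in ratings:
        # Each rating goes to the test set with probability test_ratio.
        (test if rng.random() < test_ratio else train).append(row)
    return train, test


def precision_recall(recommended, relevant):
    """Compute Top-N precision and recall pooled over all users.

    recommended: dict mapping user -> list of recommended movie ids
    relevant:    dict mapping user -> set of movie ids in the user's test set
    """
    hits = n_recommended = n_relevant = 0
    for user, relevant_items in relevant.items():
        rec_items = recommended.get(user, [])
        hits += len(set(rec_items) & set(relevant_items))
        n_recommended += len(rec_items)
        n_relevant += len(relevant_items)
    precision = hits / n_recommended if n_recommended else 0.0
    recall = hits / n_relevant if n_relevant else 0.0
    return precision, recall
```

Pooling the counts over all users, rather than averaging per-user scores, keeps users with very short test sets from dominating the result; either convention could be used as long as it is applied consistently to both algorithms.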
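Collaborative filtering is the best-known and most widely used family of recommendation algorithms. The system therefore uses two collaborative filtering algorithms to produce two different sets of recommendations: user-based collaborative filtering and item-based collaborative filtering. Users can consult both result lists to choose a suitable movie. The system adopts the improved ItemCF-IUF and UserCF-IIF algorithms, which refine the computation of item similarity and user similarity, respectively. Finally, the system is evaluated by computing the precision, recall, and popularity of both algorithms, and their respective strengths and weaknesses are compared. The experiments show that the improved algorithms recommend better than the original collaborative filtering algorithms and achieve higher precision.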
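The abstract names ItemCF-IUF without defining it. The sketch below assumes one commonly used formulation, in which the similarity of movies i and j is the sum of 1/log(1 + |N(u)|) over all users u who rated both, divided by sqrt(|N(i)| * |N(j)|), where N(u) is the set of movies rated by u and N(i) the set of users who rated i. The weighting reduces the influence of very active users. The function name and the input layout are illustrative and may differ from the thesis implementation.

```python
import math
from collections import defaultdict
from itertools import combinations


def item_similarity_iuf(user_items):
    """Item-item similarity with an inverse-user-frequency (IUF) weight.

    user_items: dict mapping user -> set of movie ids the user rated.
    Returns a dict mapping (movie_i, movie_j) -> similarity for co-rated pairs.
    """
    co_weight = defaultdict(float)  # IUF-weighted co-occurrence of movie pairs
    raters = defaultdict(int)       # number of users who rated each movie
    for items in user_items.values():
        if not items:
            continue  # a user with no ratings contributes nothing
        # Active users (many ratings) contribute less to each pair.
        weight = 1.0 / math.log(1.0 + len(items))
        for i in items:
            raters[i] += 1
        for i, j in combinations(sorted(items), 2):
            co_weight[(i, j)] += weight
            co_weight[(j, i)] += weight
    return {
        (i, j): w / math.sqrt(raters[i] * raters[j])
        for (i, j), w in co_weight.items()
    }
```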
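The system is built with Python, HTML5, jQuery, CSS3, and MySQL. It uses Django, a full-featured ("heavyweight") web framework, to connect the front end and the back end. A user must first register with a username, password, and email address before logging in to the system. The home page shows eight movie categories, including horror, action, and drama. Users rate the movies they have seen on a scale from 0.5 to 5.0 in ten steps. After rating each movie, the user clicks the submit button, and the movie's imdbId and the corresponding rating are saved to the database. When the user clicks the “Recommendation Results” button, the system invokes the recommendation algorithms over the data stored in the database, returns the resulting recommendation list to the browser, and retrieves the movie poster images stored in the database for display. Clicking the nickname under which the user is logged in opens a page listing the movies the user has already rated.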
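The thesis also analyzes the system requirements and designs the system accordingly. Finally, the system is implemented with the Django framework, and the main data tables and the interface of each function are presented.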
Abstract
This thesis designs a personalized movie recommendation system. Movies are now an important part of online resources, and as their number on the Internet keeps growing, a personalized movie recommendation system is increasingly needed. This project therefore aims to implement such a system, recommending to each user movies that match their interests.
The thesis first describes the research status and significance of recommendation systems. It then introduces the relevant recommendation algorithms, focusing on collaborative filtering, and examines the technologies required to implement the system. Next, it presents the implementation of the complete recommendation system, and finally it reviews and summarizes the whole project.
The system includes a front-end movie display interface, a movie rating module, the implementation of the recommendation algorithms, and the design of the back-end database. The implementation of the recommendation algorithms is the core of the entire system. The system uses the ml-latest-small dataset compiled by the GroupLens project team from the well-known American movie site MovieLens. This dataset contains about 100,000 ratings by 671 users for more than 9,000 movies. First, the CSV files in the dataset are filtered, reorganized, and stored in the database. The dataset is then divided into a training set and a test set. The algorithms are run on the training set to generate a Top-N personalized movie recommendation list and are then evaluated on the test set using at least two metrics: precision and recall.
Collaborative filtering algorithms are the best-known and most widely used recommendation algorithms. The system therefore uses two collaborative filtering algorithms to produce two different sets of recommendations: a user-based collaborative filtering algorithm and an item-based collaborative filtering algorithm. Users can consult both sets of results to choose a suitable movie. The system uses the improved ItemCF-IUF and UserCF-IIF algorithms, which refine the computation of item similarity and user similarity, respectively. Finally, the system is evaluated by computing the precision, recall, and popularity of the two algorithms, and their advantages and disadvantages are compared. The experiments show that the improved algorithms recommend better than the original collaborative filtering algorithms and achieve higher precision.
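As an illustration of the user-based variant, the sketch below assumes a common formulation of UserCF-IIF: the similarity of users u and v is the sum of 1/log(1 + |N(i)|) over all movies i that both have rated, divided by sqrt(|N(u)| * |N(v)|), so that agreeing on a rarely rated movie counts for more than agreeing on a popular one. The neighbourhood size `k`, the list length `n`, and both function names are hypothetical; the abstract does not state the thesis's actual parameters.

```python
import math
from collections import defaultdict


def user_similarity_iif(user_items):
    """User-user similarity with an inverse-item-frequency (IIF) weight.

    user_items: dict mapping user -> set of movie ids the user rated.
    """
    # Invert the table so each movie maps to the users who rated it.
    item_users = defaultdict(set)
    for user, items in user_items.items():
        for item in items:
            item_users[item].add(user)

    co_weight = defaultdict(float)
    for users in item_users.values():
        # Popular movies (many raters) contribute less to each user pair.
        weight = 1.0 / math.log(1.0 + len(users))
        for u in users:
            for v in users:
                if u != v:
                    co_weight[(u, v)] += weight
    return {
        (u, v): w / math.sqrt(len(user_items[u]) * len(user_items[v]))
        for (u, v), w in co_weight.items()
    }


def recommend_top_n(user, user_items, sim, k=3, n=10):
    """Return up to n unseen movies, scored by the k most similar users."""
    neighbours = sorted(
        ((s, v) for (u, v), s in sim.items() if u == user), reverse=True
    )[:k]
    scores = defaultdict(float)
    for s, v in neighbours:
        for item in user_items[v]:
            if item not in user_items[user]:
                scores[item] += s
    # Sort by descending score, breaking ties by movie id for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]
```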
The system is built with Python, HTML5, jQuery, CSS3, and MySQL. It uses Django, a full-featured (“heavyweight”) web framework, to connect the front end and the back end. A user must first register with a username, password, and email address before logging in to the recommendation system. The home page shows eight movie categories, including horror, action, and drama. Users rate the movies they have seen on a scale from 0.5 to 5.0 in ten steps. After rating each movie, the user clicks the submit button to save the movie's imdbId and the corresponding rating to the database. When the user clicks the “Recommendation Results” button, the system invokes the recommendation algorithms over the data stored in the database, returns the resulting recommendation list to the browser, and retrieves the movie poster images stored in the database for display. Clicking the nickname under which the user is logged in opens a page listing the movies the user has already rated.
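The ten-step rating scale can be checked on the server before a rating is stored. The following is a minimal sketch in plain Python rather than the thesis's Django code; `VALID_RATINGS` and `parse_rating` are hypothetical names introduced only for illustration.

```python
# The ten allowed ratings: 0.5, 1.0, ..., 5.0 (halves are exact in binary floats).
VALID_RATINGS = {k / 2 for k in range(1, 11)}


def parse_rating(raw):
    """Convert a submitted value to a float rating, or raise ValueError."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        raise ValueError(f"rating is not a number: {raw!r}") from None
    if value not in VALID_RATINGS:
        raise ValueError(f"rating must be 0.5 to 5.0 in steps of 0.5: {raw!r}")
    return value
```

Validating on the server as well as in the form guards against hand-crafted requests that bypass the page's rating widget.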
This thesis also analyzes the system requirements and designs the system accordingly. Finally, the system is implemented with the Django framework, and the main data tables and functional interfaces are presented.
Keywords: Movie recommendation system; Collaborative filtering; Evaluation criteria; Neighborhood-based recommendation; Personalized service
Contents