Recommender System with Python Code Implementation-Part 1 — Data Science Horizon

Jan 15, 2024

Introduction to Recommender Systems

The digital world pours an unprecedented and ever-increasing amount of information on us, and it is very difficult to pick out the parts that matter. Information has been recognized as valuable since ancient times. With today’s sheer volume, the real value lies in sorting out the information that is relevant to each user.

The purpose of recommender systems is to filter this flow of information accurately, so that we deal only with relevant content and spend as little time as possible doing so. The proliferation of smartphones has made this help permanently available to us in many areas. In my personal experience, however, the help on offer still falls short in many aspects of our lives, such as shopping, services, and tourism.

The Goal of the Recommender System

Put very simply, the goal of a recommender system is to sell more products. “Selling” can mean sales of tangible products, but also how much time someone spends viewing the content on a particular website or how many people click on an article. The system aims to work out users’ preferences and needs as it interacts with them and to “understand” what they need. By increasing their satisfaction, it builds consumer trust and loyalty, which can create a profitable relationship in the long run.

By building enough trust and familiarity with your users, you can sell a much wider range of products through a given channel. If we don’t know someone, it’s less risky to offer only the most popular products, because it is harder to make a mistake that way. For a well-known user, however, a product ignored by most people can be recommended if it fits his or her profile. In addition to the general recommendation function, the system can perform many other special tasks: it can sort items by perceived relevance, or find all the “good” or “useful” ones.

The results are usually recommendations of things the user needs or wants, often a series of items that may be useful to them. For example, after purchasing a gaming keyboard and mouse, the customer may be interested in purchasing a gaming headset. Most of the time, users are unaware of that need or want until the item is recommended to them.

Recommender systems are often seen as a “black box”, and the models built by large companies are not easily interpretable.

Use Cases of Recommender Systems

There are various use cases for recommendation engines. Two major use cases are:

A. Personalized Content

It helps to improve the on-site experience by creating dynamic recommendations for different kinds of audiences, as on Amazon, Netflix, YouTube, etc.

B. Better Product Search Experience

It helps to categorize products based on their features, e.g. brand, quality, color, style, etc.

Types of Recommender Systems

There are three types of recommender systems: content-based filtering, collaborative filtering, and hybrid systems. This part covers the first.

Content-based Filtering

Content-based filtering is a type of recommender system that guesses what a user may like based on his or her activity and recommends similar items. It generates recommendations by using keywords and attributes assigned to objects in a database (e.g., items on an e-commerce site) and matching them to a user profile.

The user profile is built from data derived from the user’s actions, such as past purchases, ratings (likes and dislikes), reviews (positive and negative), downloads, items searched for or frequently visited on a website, and clicks on product links. The basic idea of the content-based method is to build a model, based on the available “features”, that explains the observed user interactions with the items.

For example:

In a movie recommender system, the additional information can be the age, sex, or job of the users, as well as the director, the producer, the main actors, the duration, or other characteristics of the movies, i.e. the items.

Still considering users and movies, we can also build the model so that it gives us insight into why users like the movies they do. Such a model makes it easy to produce new predictions for a user: we just look at that user’s profile and, based on its information, determine relevant movies to suggest.

Assigning attributes

Content-based filtering relies on assigning attributes to database objects, as shown in the diagram below, so the algorithm knows something about each object. These attributes largely depend on the type of product, service, or content being recommended.

Assigning attributes can be a monumental task, and companies often rely on domain experts to assign attributes to each item manually.

For example, Netflix has hired a team of screenwriters to rate shows on categories ranging from shooting locations and actors to plotlines, story, tone, and emotional effects. The resulting tags are then combined algorithmically to group together films that share similar aspects.
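To make the idea of assigned attributes concrete, here is a minimal sketch of how hand-assigned tags could be turned into vectors an algorithm can compare. The catalogue, the tag names, and the helper `encode_items` are hypothetical and invented for this illustration; they do not come from Netflix or from the dataset used later in this post.

```python
# Hypothetical catalogue: each item carries tags assigned by a human expert.
catalogue = {
    "Movie A": {"comedy", "romance", "new-york"},
    "Movie B": {"comedy", "road-trip"},
    "Movie C": {"thriller", "new-york"},
}


def encode_items(items):
    """Turn tag sets into multi-hot vectors over a shared, sorted vocabulary."""
    # The vocabulary is the union of every tag used anywhere in the catalogue.
    vocabulary = sorted(set().union(*items.values()))
    # Each item gets a 1 where it has the tag and a 0 where it does not.
    vectors = {
        title: [1 if tag in tags else 0 for tag in vocabulary]
        for title, tags in items.items()
    }
    return vocabulary, vectors


vocabulary, item_vectors = encode_items(catalogue)
print(vocabulary)
for title, vector in item_vectors.items():
    print(title, vector)
```

Every item ends up as a vector of the same length, so any two items (or an item and a user profile) can be compared position by position.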

Building a user profile

User profiles are another important component of a content-based recommender system. Profiles contain the database objects the user has interacted with (recently purchased, browsed, shared, read, watched, or listened to) as well as their assigned attributes. Attributes that appear across multiple objects are given more weight than those that appear less frequently.

This helps to establish importance, because not all of an object’s attributes are equally important to the user. User reviews are also critical when weighting items, which is why websites that provide recommendations continuously ask you to rate products, services, or content. Based on attribute weights and histories, the recommender system creates a unique preference model for each user. The model consists of the attributes the user likes and dislikes, derived from past activity and weighted by importance.

User models are then compared against all database objects, and each object is assigned a score based on its similarity to the user profile.
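The profile-building and scoring steps described above can be sketched roughly as follows. This is only one simple way to do it, under stated assumptions: the item vectors, the rating scale from -1 to 1, and the helpers `build_profile`, `cosine`, and `recommend` are all hypothetical, and cosine similarity is one common choice of similarity measure rather than the only one.

```python
import math

# Hypothetical multi-hot item vectors over the attributes
# [comedy, new-york, road-trip, romance, thriller].
item_vectors = {
    "Movie A": [1, 1, 0, 1, 0],
    "Movie B": [1, 0, 1, 0, 0],
    "Movie C": [0, 1, 0, 0, 1],
    "Movie D": [1, 0, 0, 1, 0],
}

# Hypothetical history: items the user rated, on an assumed scale from
# -1 (disliked) to 1 (liked).
history = {"Movie A": 1.0, "Movie C": -0.5}


def build_profile(history, item_vectors):
    """Sum the vectors of rated items, each weighted by the user's rating.

    Attributes shared by several liked items accumulate a larger weight.
    """
    size = len(next(iter(item_vectors.values())))
    profile = [0.0] * size
    for title, rating in history.items():
        for i, value in enumerate(item_vectors[title]):
            profile[i] += rating * value
    return profile


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(history, item_vectors, top_n=2):
    """Score every item the user has not interacted with and return the best ones."""
    profile = build_profile(history, item_vectors)
    scores = {
        title: cosine(profile, vector)
        for title, vector in item_vectors.items()
        if title not in history
    }
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)[:top_n]


recommendations = recommend(history, item_vectors)
print(recommendations)
```

In this toy data, "Movie D" shares both comedy and romance with the liked "Movie A", so it scores higher than "Movie B", which shares only comedy.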

Pros of Content-based filtering

Content-based filtering is widely used because of the following benefits:

1. Recommendations are highly relevant to the user

Content-based recommenders can be highly tailored to the user’s interests, including recommendations for niche items because this method works by matching the attributes of a database object with the user’s profile.

2. Recommendations are transparent to the user

Content-based recommenders are highly transparent to the user, because each recommendation can be traced back to attributes the user has already shown interest in. This bolsters the user’s trust in the recommendations offered.

3. Ease of Implementation

The data science behind a content-based filtering system is relatively simple compared to collaborative filtering systems, which aim to mimic user-to-user recommendations. The real work in content-based filtering lies in assigning the attributes.

4. New Item Recommendations

New items can be recommended before they have been rated by a large number of users, unlike in collaborative filtering.

5. No “cold start” problem

Collaborative filtering models face a potential cold start problem when a website or community is new and has few users and no user connections. Content-based filtering still needs a few initial inputs from users to start making recommendations, but the quality of its early recommendations is generally better than that of a collaborative system, which needs to accumulate and correlate millions of data points before it performs well.

Cons of Content-based filtering

Some drawbacks of content-based filtering are:

1. Incorrect or inconsistent Attributes

Since attributes can be subjective, many may be mislabeled. A process that ensures consistent and accurate application of attributes is paramount. Otherwise, the content-based recommendation system will not work properly.

2. Lack of Novelty and Diversity

There’s more to recommendations than relevance. Suppose you liked the movie Dhoom 2. Chances are you’ll like Dhoom 3, too. But you could have predicted that without a recommender system. Therefore, for a recommender system to be useful, it must also come up with diverse and unexpected results.

Python Code Implementation

For implementation, we are using the movie dataset from Kaggle. Click here for the dataset.

We have produced recommendations of films similar to a given film, e.g. “Splash”. Click here for the code.
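The linked code is not reproduced here. As a rough, self-contained sketch of how “films similar to Splash” could be computed, the example below assumes each film has a short keyword description and compares films by TF-IDF weighted cosine similarity. That choice of technique is an assumption and may differ from the linked code. The placeholder films, all keyword descriptions, and the helpers `tfidf_vectors`, `cosine`, and `similar_to` are hypothetical and not taken from the Kaggle dataset.

```python
import math
from collections import Counter

# Toy stand-in for a movie table. "Film B", "Film C", and "Film D" are
# placeholders, and every keyword description here is made up for illustration.
movies = {
    "Splash": "mermaid romance comedy new york",
    "Film B": "mermaid romance island drama",
    "Film C": "comedy new york friends",
    "Film D": "alien space thriller",
}


def tfidf_vectors(docs):
    """Return {title: {term: tf-idf weight}} for whitespace-tokenised descriptions."""
    tokenised = {title: text.split() for title, text in docs.items()}
    # Document frequency: in how many descriptions does each term appear?
    doc_freq = Counter(term for tokens in tokenised.values() for term in set(tokens))
    n_docs = len(docs)
    vectors = {}
    for title, tokens in tokenised.items():
        counts = Counter(tokens)
        # Term frequency times log inverse document frequency:
        # rarer terms weigh more than terms shared by many films.
        vectors[title] = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similar_to(title, docs, top_n=2):
    """Rank all other films by similarity to the given film."""
    vectors = tfidf_vectors(docs)
    query = vectors[title]
    scores = [
        (other, cosine(query, vector))
        for other, vector in vectors.items()
        if other != title
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


similar = similar_to("Splash", movies)
print(similar)
```

On a real dataset the descriptions would come from columns such as genres, keywords, or plot overviews, and a library implementation of TF-IDF would usually replace the hand-written one.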

Stay Tuned

We will continue with collaborative filtering and the Hybrid Recommender System in upcoming posts.

Keep learning and keep implementing!!

Originally published at https://www.datasciencehorizon.com on January 15, 2024.