Product Recommendations

One of the most important features of an e-commerce retail solution is the ability to provide high-quality recommendations for other products or items to buy. A system responsible for this feature is known as a Product Recommendations (PR) system. A good PR system can significantly increase a website's profits by driving additional product sales to existing customers and by growing the customer base. From the end user's point of view, product recommendations are the feature that provides this type of information:

  • Customers who bought this also bought this;
  • This product is usually bought together with this;
  • These are related recommended products

Product recommendation systems can be found everywhere in the modern e-commerce world. Below are some relevant numbers (source):

  • Research conducted by Barilliance in 2018 concluded that product recommendations accounted for up to 31% of ecommerce revenues. On average, customers saw 12% of their overall purchases coming from products that were recommended to them.
  • A Salesforce study of product recommendations concluded that visits where the shopper clicked a recommendation comprise just 7% of total site traffic but make up 24% of orders and 26% of revenue.
  • The conversion rate for visitors clicking on product recommendations was found to be 5.5x higher than for visitors who didn’t click.
  • An Accenture report says personalization increases the likelihood of a prospect purchasing from you by 75 percent.

As proof of the importance of good product recommendation systems, a few years ago Netflix organized a challenge (the “Netflix prize”) whose goal was to produce a recommender system that performed better than its own algorithm, with a prize of 1 million dollars for the winner.

So let’s go through the main concepts used in building recommendation systems, what is at the core of such systems, and why our multi-model database, ArniaDB, is the right technology choice for building a product recommendations solution.

Recommendation systems

There are three main models used to design and implement a product recommendation system:

  • Content-based filtering models. In these models, recommendations are generated based on the properties of the product/item. For example, if a user likes movies with certain actors or directors, we can recommend other movies involving the same actors/directors. This is historically the first approach used to build a PR system, and it is still in wide use today because it is relatively easy to implement. The biggest limitation of these models is that they do not account for complex relationships between users and products. These models are usually implemented using a relational database structure.
  • Collaborative filtering models. These models generate recommendations based on the relationship between users and items, and on similarity to other users’ profiles. Users are similar if they have relationships to products in common. For example, if similar users enjoyed a particular movie, that movie can be a good one to recommend. This is a more modern approach and is increasingly used because it better accounts for complex relationships. These models are usually built using a graph database.
  • Hybrid models. These models are a mix of the other two. Hybrid models are widely used because they can provide the best of both worlds. Even though they are relatively harder to implement, the additional benefits they provide are significant.
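
As an illustration of the content-based idea above, here is a minimal Python sketch. The catalog, the attribute labels, and the function names are hypothetical, and Jaccard overlap is only one simple way to compare item properties; it is not a description of how ArniaDB implements this.

```python
def jaccard(a: set, b: set) -> float:
    """Share of attributes two items have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def content_based(liked: str, catalog: dict, top_n: int = 3) -> list:
    """Rank the other catalog items by attribute overlap with the liked item."""
    scores = {
        title: jaccard(catalog[liked], attrs)
        for title, attrs in catalog.items()
        if title != liked
    }
    # Highest overlap first; ties broken alphabetically for a stable order.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return [t for t in ranked if scores[t] > 0][:top_n]


# Hypothetical catalog: movie title -> set of actor/director attributes.
catalog = {
    "Movie A": {"actor:X", "director:Y"},
    "Movie B": {"actor:X", "director:Z"},
    "Movie C": {"actor:W", "actor:X", "director:Y"},
    "Movie D": {"actor:Q", "director:R"},
}

print(content_based("Movie A", catalog))  # ['Movie C', 'Movie B']
```

Note that the sketch only looks at item properties; it knows nothing about what other users did, which is exactly the limitation the collaborative models address.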

The usual components we see in a product recommendation system are:

  • The e-commerce relational database; this type of database is at the heart of most existing e-commerce systems – Magento, Shopware etc. Most of the time these solutions use MySQL, PostgreSQL, Oracle, MS SQL Server, or another relational database.
  • A graph database. If you are implementing support for PR collaborative models (see above), a graph database is a must-have. This type of database is the one best suited to returning fast results for complex relationship queries, which in a relational database require expensive multi-join operations.
  • An ETL system. A PR system, especially when using a graph database for the collaborative/hybrid PR models, must have its specific data populated from the primary data source (the relational database used by the e-commerce application). This means that the PR solution needs to have an ETL tool/module responsible for:
    • Populating the graph database;
    • Populating specific relational data structures needed for content-based PR models;
    • Populating back the e-commerce relational database with the recommendations data.
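
The first ETL responsibility above (populating the graph from the relational source) can be sketched roughly as follows. This is an assumption-laden illustration: the `order_items` table and its columns are hypothetical, an in-memory SQLite database stands in for the e-commerce database, and a plain adjacency map stands in for the graph database.

```python
import sqlite3
from collections import defaultdict

# Extract: a stand-in for the e-commerce relational database (hypothetical schema).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE order_items (customer_id TEXT, product_id TEXT)")
db.executemany(
    "INSERT INTO order_items VALUES (?, ?)",
    [("c1", "p1"), ("c1", "p2"), ("c2", "p2"), ("c2", "p3"), ("c1", "p2")],
)
rows = db.execute("SELECT customer_id, product_id FROM order_items").fetchall()

# Transform: collapse repeat purchases into one weighted BOUGHT edge.
edges = defaultdict(int)
for customer_id, product_id in rows:
    edges[(customer_id, product_id)] += 1

# Load: the "graph store" here is just an adjacency map; a real loader
# would write these nodes and edges to the graph database instead.
graph = defaultdict(dict)
for (customer_id, product_id), weight in edges.items():
    graph[customer_id][product_id] = weight

print(dict(graph))  # {'c1': {'p1': 1, 'p2': 2}, 'c2': {'p2': 1, 'p3': 1}}
```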

Depending on how often the data is updated, there are two types of PR systems:

  • Offline systems. In such systems, the data used to provide recommendations is updated periodically, following a fixed schedule or on demand (a manual trigger for the data update operations).
  • Real-time systems. In these systems, the data which is used to provide recommendations is updated in real-time (or very close to real-time). Parallel processing is a must in these scenarios. The key technology in enabling real-time recommendations is the graph database, a technology that provides support for data queries that are almost impossible to do using a relational database structure.
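
To make the difference between the two update styles concrete, here is a small Python sketch using co-purchase counts as the recommendation data. The function names and sample baskets are hypothetical; the point is only that an offline run recomputes everything, while a real-time update folds each new order into the existing state.

```python
from collections import Counter
from itertools import combinations


def pairs(basket):
    """All unordered product pairs in one basket."""
    return combinations(sorted(set(basket)), 2)


def rebuild(baskets):
    """Offline style: recompute every co-purchase count from scratch."""
    counts = Counter()
    for basket in baskets:
        counts.update(pairs(basket))
    return counts


def apply_order(counts, basket):
    """Real-time style: fold a single new basket into the existing counts."""
    counts.update(pairs(basket))


history = [["p1", "p2"], ["p1", "p2", "p3"]]
new_order = ["p2", "p3"]

live = rebuild(history)        # state after the last scheduled run
apply_order(live, new_order)   # updated as soon as the order arrives

# Both approaches end in the same state; they differ in when the work is done.
assert live == rebuild(history + [new_order])
print(live[("p2", "p3")])  # 2
```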

Obviously, real-time systems provide the most accurate recommendations, and they should be implemented whenever possible. In practice, however, a real-time system is not always feasible, due to constraints such as:

  • Hardware and software capacity constraints. Updating the data requires dedicated software procedures, and sometimes, depending on the data size and on the data structure/design, real-time operations are not possible.
  • The (huge) size of the data that must be updated, which does not always permit a real-time update.
  • Infrastructure operating costs that are too high.

Therefore, it might be a good idea to start with an offline system design and, once the system is complete and running within the targeted parameters, make the adjustments needed to transform it into a real-time system.

Why ArniaDB

There are many reasons for choosing ArniaDB as your database component for a high-quality PR system:

Technology

  • You get a top-grade graph database.
  • You get a rich built-in graph algorithms library, optimized to cover the specifics of product recommendations solutions.
  • You get a top-grade relational database which lives in the same core as the graph database. You can read more about ArniaDB here.

Performance

When you talk about PR systems, you usually talk about big data (millions of entities, with a rich collection of properties). In order to not slow down the system and to be able to provide fast recommendations, you will need a database with high performance capabilities. ArniaDB is built and optimized for dealing with big data in both its relational and graph engines.

Cost advantages

Unlike most commercial solutions, such as Oracle or MS SQL Server, ArniaDB comes with very affordable pricing plans. We also offer a free community edition.

Team

Our team has extensive experience in building database solutions, with a proven track record of success. Most of our engineers have more than 10 years of experience in this area; we also have specialized consultants with vast knowledge of building solutions using back-end databases. Read more about our team here.

Our solutions

Create a full e-commerce solution from scratch

The assumption is that you have selected your e-commerce application solution (such as Magento) and what you need on top of it is a high-quality PR system. There are two main choices to select from:

  1. Use the common relational database of choice associated with your e-commerce application solution (MySQL, PostgreSQL etc.) and use ArniaDB graph database capabilities to implement the layer of PR collaborative filtering models.
  2. Choose ArniaDB as the single database backend: use the relational engine in ArniaDB for the core e-commerce solution and for the content-based filtering PR models, and top it off by using ArniaDB graph database capabilities to implement the PR collaborative filtering models. ArniaDB is compatible with the ANSI SQL standard, so using ArniaDB instead of MySQL, PostgreSQL etc. is easy.

Improve your existing e-commerce solution

In this case, the assumption is that you already have an e-commerce solution in place (Magento, Shopware etc.), with a PR system based on a content-based filtering method, and what you need is to add a high-quality PR system based on a collaborative model. There are two main choices to select from:

  1. Fully replace your current relational database with ArniaDB. ArniaDB is compatible with the ANSI SQL standard, so the replacement will go smoothly. Use ArniaDB graph database capabilities to implement the PR collaborative filtering models.
  2. Keep your current relational database along with your current content-based PR system and use ArniaDB graph database capabilities to implement PR collaborative filtering models.

The main differentiator between ArniaDB and other databases is the multi-model capability built into ArniaDB. Having both relational and graph capabilities in the same unified database engine provides important advantages in design, implementation, maintenance, performance, and overall costs.

The current market solutions design for a hybrid model is usually the following:

With ArniaDB, by contrast, the system design is optimized for data processing, organized by specific functional responsibilities, more performant, and easier to maintain. Another potential advantage is that the e-commerce relational database is not “polluted” with PR-specific processing data: everything related to PR implementation and processing (such as PR analytical data) can be stored in a single database, ArniaDB.

In all scenarios, our services cover all the solution implementation phases:

Analysis

We will do a complete analysis of your specific requirements and will recommend the selection of methods and algorithms that will best fit your needs. For content-based filtering implementations we will analyze the fit of the most used models, such as:

  • Vector space models
  • Probabilistic models (Bayes Classifiers)
  • Decision trees
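  • Neural networks

For example, a relational query of this kind counts how often each customer buys each product model:

SELECT c.CustomerId,
       p.Model,
       COUNT(p.Model) AS FrequentBuy
FROM …
GROUP BY
    p.Model,
    c.CustomerId
ORDER BY FrequentBuy DESC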

For collaborative filtering, the analysis will target the fit for known models such as correlation-based, cosine-based, dimensionality reduction techniques, latent semantic methods, regression, clustering, neural networks, particle filtering and others.

MATCH
(customer:Customer {name: 'Joe'})-[:BOUGHT]->(:Basket)<-[:IN]-(product:Product)
RETURN product, count(product)
ORDER BY count(product) DESC LIMIT 10
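
Of the collaborative techniques listed above, the cosine-based one is perhaps the easiest to show in a few lines. The following Python sketch is an illustration under stated assumptions: the user names, ratings, and function names are hypothetical, and a production system would run this kind of computation inside the graph engine rather than in application code.

```python
import math


def cosine(u: dict, v: dict) -> float:
    """Cosine similarity between two sparse rating vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(user: str, ratings: dict, top_n: int = 3) -> list:
    """Score unseen products by the similarity-weighted ratings of other users."""
    scores = {}
    for other, their in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], their)
        if sim <= 0:
            continue  # users with nothing in common contribute nothing
        for product, rating in their.items():
            if product not in ratings[user]:
                scores[product] = scores.get(product, 0.0) + sim * rating
    return sorted(scores, key=lambda p: (-scores[p], p))[:top_n]


# Hypothetical data: user -> {product: rating}.
ratings = {
    "ann": {"p1": 5, "p2": 4},
    "bob": {"p1": 5, "p2": 5, "p3": 4},
    "cat": {"p4": 5},
}
print(recommend("ann", ratings))  # ['p3']
```

Here "bob" is similar to "ann" because they rated the same products, so his extra product is recommended; "cat" shares nothing with "ann" and is ignored.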

Solution architecture & Design

Based on the analysis results, we will propose the best solution, and we will build the solution architecture and the system design. This includes the design of the data storage (graph database & relational database, as applicable).

As a side note, a good solution must address the typical issues associated with PR systems:

  • Cold-start
  • Data sparsity
  • Scalability
  • Trust
  • Synonymity
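
Cold-start, for example, arises when a new customer has no purchase history to base recommendations on. One common way to handle it is to fall back to overall best sellers, as in this minimal Python sketch. The function name, the `personalized` callback, and the sample data are hypothetical.

```python
from collections import Counter


def recommend_with_fallback(user, history, personalized, top_n=3):
    """Use personalized results when the user has history, else best sellers."""
    if history.get(user):
        return personalized(user)[:top_n]
    # Cold start: no purchases yet, so rank products by overall popularity.
    popularity = Counter(p for items in history.values() for p in items)
    ranked = sorted(popularity.items(), key=lambda kv: (-kv[1], kv[0]))
    return [product for product, _ in ranked][:top_n]


# Hypothetical purchase history: user -> list of purchased products.
history = {"ann": ["p1", "p2"], "bob": ["p2", "p3"], "new": []}


def personalized(user):
    """Placeholder for the real collaborative or content-based engine."""
    return ["p9"]


print(recommend_with_fallback("new", history, personalized))  # ['p2', 'p1', 'p3']
print(recommend_with_fallback("ann", history, personalized))  # ['p9']
```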

Development of the ETL module(s)

For both models, and in particular for the collaborative filtering model, which requires a graph database back-end, we need to select, prepare, and load the data needed to run the algorithms (i.e. populate the dedicated graph and relational stores). This is achieved by implementing an ETL (Extract-Transform-Load) module/application.

The ETL solution runs at scheduled intervals or, for real-time solutions, continuously. For example, when you scale to big data (millions of entities), it is very important to have an ETL batch process that calculates in parallel the various similarities between the system entities in scope (such as users and items), in order to provide fast response times.
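
A parallel similarity batch of the kind described above might be sketched as follows. This is only an illustration: the data and function names are hypothetical, Jaccard overlap is an assumed similarity measure, and a thread pool is used to keep the example short. For CPU-bound work at real scale you would more likely use a process pool, a distributed job, or the database's own algorithms.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def similarity(pair, purchases):
    """Jaccard overlap of the product sets of two users."""
    a, b = purchases[pair[0]], purchases[pair[1]]
    union = a | b
    return pair, (len(a & b) / len(union) if union else 0.0)


def batch_similarities(purchases, workers=4):
    """Compute all user-pair similarities, spreading the pairs across workers."""
    user_pairs = list(combinations(sorted(purchases), 2))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(lambda p: similarity(p, purchases), user_pairs))


# Hypothetical data: user -> set of purchased products.
purchases = {
    "ann": {"p1", "p2"},
    "bob": {"p2", "p3"},
    "cat": {"p1", "p2"},
}
result = batch_similarities(purchases)
print(result[("ann", "cat")])  # 1.0
```

Because the similarities are precomputed by the batch, serving a recommendation later becomes a lookup rather than a calculation.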

Product recommendations engine implementation

  1. Collaborative module implementation: graph-based algorithms.
  2. Content module implementation: relational (or NoSQL) algorithms.

In this phase we will implement the product recommendations engine, using one or multiple (hybrid) model(s).
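
When both modules are used together, their outputs have to be combined. A weighted blend is one simple option, sketched below in Python. The scores, the `weight` parameter, and the function name are hypothetical, and the right blending strategy depends on the analysis results.

```python
def hybrid_scores(content, collaborative, weight=0.5):
    """Blend two score maps; weight is the share given to collaborative scores."""
    products = content.keys() | collaborative.keys()
    return {
        p: (1 - weight) * content.get(p, 0.0) + weight * collaborative.get(p, 0.0)
        for p in products
    }


# Hypothetical per-product scores from each module, on a 0-1 scale.
content = {"p1": 0.9, "p2": 0.2}
collaborative = {"p2": 0.8, "p3": 0.6}

blended = hybrid_scores(content, collaborative, weight=0.5)
ranking = sorted(blended, key=lambda p: (-blended[p], p))
print(ranking)  # ['p2', 'p1', 'p3']
```

A product scored by only one module still gets ranked; it simply receives no contribution from the other.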

Solution maintenance and support

Our product team will ensure proper system functionality, periodic updates, solution improvements and evolution.

Last but not least, as part of the solution, we will implement tools to support specific metrics and efficiency data analysis. The quality of the PR system must be measured constantly, so that improvements can be made.
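
As one example of such a metric, precision at k measures how many of the top recommendations a customer actually went on to buy. The Python sketch below is a minimal illustration with hypothetical data; which metrics are tracked in practice depends on the project.

```python
def precision_at_k(recommended, purchased, k):
    """Fraction of the top-k recommendations that the customer actually bought."""
    top_k = recommended[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for product in top_k if product in purchased)
    # Divides by the number of items shown, which equals k when the list is long enough.
    return hits / len(top_k)


# Hypothetical data: ranked recommendations and the products later purchased.
score = precision_at_k(["p1", "p2", "p3", "p4"], {"p2", "p4", "p9"}, k=3)
print(round(score, 3))  # 0.333
```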

Other scenarios

Recommendation systems are not limited to users and products/items. They form a more general domain than e-commerce product recommendation solutions: a recommendation system is any system that provides a recommendation, prediction, opinion, or user-configured list of items that assists the user in evaluating items.

What is important is that the same theory, models, techniques, and solutions used for e-commerce PR systems can be applied in other domains and scenarios beyond the typical e-commerce online store, such as:

  • Data mining
  • Business intelligence
  • Online advertising
  • Content recommenders for social media platforms
  • Web content recommenders
  • Restaurants
  • Online dating
  • Media streaming

If you are interested in a recommendation system that is not used for an e-commerce solution, don’t hesitate – we can absolutely do it together, so please contact us.

Extra ingredient: ML/AI

Making recommendations is mostly about taking past behavior and preferences and trying to predict a future preference. But it is not an exact science; a certain degree of uncertainty is always present. The recommended products can depend on factors that are not easily considered in a “standard” system, such as:

  • Weather
  • Geography and culture
  • Product description keywords
  • Fashion tendencies

All these can influence the quality of the recommendations, which is measured by the increase in sales volume. To build a more flexible PR system that can learn by itself over time and make connections that are not immediately spotted by the usual models/algorithms, Machine Learning and Artificial Intelligence engines are increasingly being added to PR systems.

We can put our rich experience in designing and building ML/AI solutions to work to create new recommendation systems or enhance existing ones.

Summary

We have the right people and skills, rich and very successful software development experience, knowledge of e-commerce solutions, and the right technologies (ArniaDB in particular) to build performant and efficient recommendation solutions that will help your e-commerce business sell more products and grow its customer base.

Please contact us so that we can analyze your specific requirements together and propose the best solution for your business.