Recommender Model Training Flow
A recommender system is deployed in an environment where items (products, movies, events, articles) are recommended to users. Building a recommender system requires a dataset of items, users, and the interactions of users with items.
The interaction between users, the provided service, and the recommender system can be described as follows.
A dataset of items may also be called a feed. A "product feed" is a catalog of products in which each item has a set of features that describe it. The structure is flexible: you can add any extra features. The basic "product feed" structure is as follows:
- sku (<product_id>_<variant_id>) [basic, not_none]
- title [opengraph basic, not_none]
- type [opengraph basic, not_none]
- url [opengraph basic, not_none]
- image [opengraph basic]
- description [opengraph optional]
- brand [basic]
- categories [basic, array]
- active [basic]
- price [optional]
- discount [optional]
- created_at [optional, default=autocomplete]
- updated_at [optional, default=autocomplete]
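The structure above can be sketched as a typed record. This is only an illustration, not the actual storage schema: the class name `ProductFeedItem`, the `extra` dictionary for additional features, the default for `active`, and the reading of "default=autocomplete" as "filled with the current time" are all assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional

# Fields marked not_none in the list above.
REQUIRED_FIELDS = ("sku", "title", "type", "url")


def _now() -> datetime:
    # Assumed meaning of "default=autocomplete": use the current UTC time.
    return datetime.now(timezone.utc)


@dataclass
class ProductFeedItem:
    sku: str                           # "<product_id>_<variant_id>"
    title: str
    type: str
    url: str
    image: Optional[str] = None
    description: Optional[str] = None
    brand: Optional[str] = None
    categories: list[str] = field(default_factory=list)
    active: bool = True                # assumed default, not stated in the source
    price: Optional[float] = None
    discount: Optional[float] = None
    created_at: datetime = field(default_factory=_now)
    updated_at: datetime = field(default_factory=_now)
    extra: dict[str, Any] = field(default_factory=dict)  # any additional features

    def __post_init__(self) -> None:
        # Reject records whose not_none fields are missing or empty.
        for name in REQUIRED_FIELDS:
            value = getattr(self, name)
            if value is None or value == "":
                raise ValueError(f"feed field '{name}' must not be empty")
```

Validating the not_none fields when the record is created keeps incomplete items out of the catalog early, before they reach training.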
The recommender model is trained on usage events. Usage events are records of how users have interacted with items: a user may display, hover over, click, like, share, or buy a product. All user actions on items are collected, stored, and analyzed by the recommender system. Usage events are essential for the recommendation algorithm, which looks for correlations between users and usage events.
The recommender algorithm assigns a score to every item according to user behavior. The system then displays the highest-scoring items to users.
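To make the idea of scoring concrete, here is a minimal sketch that sums weighted usage events per item and returns the highest-scoring items. It is not the algorithm used by the system: the weights in `EVENT_WEIGHTS`, the event tuple layout, and the function names are assumptions, and a real recommender would typically compute per-user scores rather than one global ranking.

```python
from collections import defaultdict

# Hypothetical weights: stronger signals of interest count for more.
EVENT_WEIGHTS = {
    "display": 0.1,
    "hover": 0.2,
    "click": 1.0,
    "like": 2.0,
    "share": 3.0,
    "buy": 5.0,
}


def score_items(events):
    """Sum event weights per item.

    `events` is an iterable of (user_id, sku, event_type) tuples.
    Unknown event types contribute nothing to the score.
    """
    scores = defaultdict(float)
    for _user_id, sku, event_type in events:
        scores[sku] += EVENT_WEIGHTS.get(event_type, 0.0)
    return dict(scores)


def top_n(scores, n):
    """Return the n skus with the highest score (ties broken by sku)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [sku for sku, _score in ranked[:n]]


events = [
    ("u1", "A", "click"),
    ("u2", "A", "buy"),
    ("u1", "B", "display"),
    ("u3", "C", "like"),
]
print(top_n(score_items(events), 2))  # ['A', 'C']
```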
The recommender algorithm analyzes user behavior in conjunction with the feed. Final product recommendations cannot be filtered or displayed on the basis of features that are not in the feed. For example, suppose the product feed does not say whether a product is new. In that case the recommender system cannot show users only new products, because that feature does not exist in the feed. To make this possible, the client must provide additional product information that indicates whether each product is new.
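The dependency on feed features can be illustrated with a small filter over ranked recommendations. This is a sketch under assumptions: the feed is represented as a dictionary from sku to attributes, and `filter_by_feature` and the `is_new` attribute are hypothetical names chosen for the example.

```python
def filter_by_feature(ranked_skus, feed, feature, value):
    """Keep ranked items whose feed record has `feature` equal to `value`.

    `feed` maps sku -> dict of attributes. If no record in the feed
    defines the feature at all, filtering on it is impossible, so a
    KeyError is raised instead of silently returning an empty list.
    """
    if not any(feature in record for record in feed.values()):
        raise KeyError(f"feature '{feature}' is not present in the feed")
    return [sku for sku in ranked_skus if feed.get(sku, {}).get(feature) == value]


feed = {
    "A": {"brand": "X"},
    "B": {"brand": "Y"},
}

# Works: "brand" exists in the feed.
print(filter_by_feature(["A", "B"], feed, "brand", "X"))  # ['A']

# Fails until the client adds the attribute to the feed.
try:
    filter_by_feature(["A", "B"], feed, "is_new", True)
except KeyError as error:
    print(error)
```

Once the client extends the feed, for example with `feed["B"]["is_new"] = True`, the same call returns only the new items.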
Requirements For Product Feeds
The recommender model matches the offers in the client's catalog to user behavior. The product feed is the foundation of the recommender system, so clients should provide as much information about each product as possible.
Required attributes for the recommender
- sku - Identifies a specific version of a product, often used to distinguish different sizes or colors.
- asset_id - Autogenerated feed identifier, numeric type.
- title - The product name (the title of a book, magazine, DVD, CD, game, etc.).
- type - Type of the product.
- url - Link to the page where a potential buyer can purchase the product.
Optional attributes for the recommender
- image - Link to an image of the product.
- description - Additional description of the product.
- brand - Product manufacturer’s name, brand name or publisher’s name.
- categories - Category hierarchy for the product. The category hierarchy usually does not change over the lifetime of a product.
- active - Availability for sale.
- price - The base price of the product.
- discount - Available discount on the product.
Product Feed Example
Feeds can be provided to the engineering team in TXT, CSV, JSON, XLS, XLSX, HTML, or XML format via an updatable link or an API.
<title>Crema para manos</title>
You can add any additional fields you need.
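As an illustration of how an XML feed with extra fields might be read, the sketch below parses each item into a dictionary using Python's standard `xml.etree.ElementTree`. The `<feed>` and `<item>` wrapper elements, the sample values other than the title, and the extra `is_new` field are assumptions made for the example; the actual feed layout may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed layout: one <item> element per product.
SAMPLE_XML = """
<feed>
  <item>
    <sku>1001_1</sku>
    <title>Crema para manos</title>
    <type>product</type>
    <url>https://example.com/products/1001</url>
    <is_new>true</is_new>
  </item>
</feed>
"""


def parse_feed(xml_text):
    """Return one dict per <item>, mapping each child tag to its text.

    Every child element is kept, so additional fields such as
    <is_new> pass through without any change to the parser.
    """
    root = ET.fromstring(xml_text.strip())
    return [
        {child.tag: (child.text or "").strip() for child in item}
        for item in root.iter("item")
    ]


items = parse_feed(SAMPLE_XML)
print(items[0]["title"])  # Crema para manos
```

All values arrive as strings, so typed fields such as price or a boolean flag still need to be converted after parsing.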
Unlike a product feed, a content feed is usually provided in RSS format. RSS is mainly used for publishing frequently updated information such as blog entries, news headlines, audio, and video.
A content feed, like a product feed, is saved in the database and has almost the same fields. The main difference between the two lies in the "type" column: content feed items are marked as "article" and product feed items as "product". There are also some differences in the optional fields. For example, articles do not have 'price' and 'discount' fields, but they do have an 'author' field.
Content Feed Example
The "image" feature of a content feed contains the header image of an article (in a product feed it contains the product image). The full text of an article is not saved in the database; a link to the full text is saved instead. In addition, the "data" column contains a description of the article, which is usually its first paragraph.
The "active" feature of a content feed is set manually. By default, an article is considered inactive if it was published more than three months ago. An inactive article is not shown in the recommender. This status is updated once a day.
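The default rule for the "active" status could look like the following sketch, which a daily job might apply to every article. The names `is_active` and `manual_override` are hypothetical, and "three months" is approximated here as 90 days, which is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "three months" is approximated as 90 days.
MAX_AGE = timedelta(days=90)


def is_active(published_at, now=None, manual_override=None):
    """Decide whether an article should be shown in the recommender.

    A manually set status wins; otherwise the article is active only
    while it is at most MAX_AGE old.
    """
    if manual_override is not None:
        return manual_override
    if now is None:
        now = datetime.now(timezone.utc)
    return now - published_at <= MAX_AGE


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(is_active(datetime(2024, 5, 1, tzinfo=timezone.utc), now=now))  # True
print(is_active(datetime(2024, 1, 1, tzinfo=timezone.utc), now=now))  # False
```

Passing `now` explicitly keeps the rule easy to test and guarantees that one daily run evaluates all articles against the same reference time.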