Can cloud and machine learning deliver personalised shopping?

At Percona Live in Amsterdam, Enterprise Times talked with Pavel Genov, CTO of Pepper Media Holding GmbH. Genov describes the company as: “A social eCommerce platform. A shopping community where visitors and the community can share where they can buy a certain product or where they should not buy a certain product. The goal is to help people make better buying decisions.”

Coupon sharing and shopping tip sites are not uncommon. But how many actually use them? According to Genov, the company gets: “25 million unique visitors and over 500 million views per month.” While that sounds huge, this is not about a single website but is, instead, traffic spread over 11 sites around the world.

Focused on the consumer not the retailer

Pepper is focused on the consumer, not the retailer. It provides a platform for people to share information rather than a channel for retailers to clear old stock. This gives it a different engagement model to the likes of Groupon. Retailers are not paying Pepper to promote deals. For the consumers using the site, it means that posts about deals are simply that: deals. There is no hyping by a retailer.

Pavel Genov, CTO of Pepper Media Holding GmbH

This does not mean that there is no relationship with retailers. As Genov said: “Sometimes they don’t like us, the vendors, sometimes they love us. So far there is no other service they can use to drive the traffic that we send them.” From an eCommerce perspective, traffic is king and Pepper is the kingmaker.

ET asked Genov how they dealt with the risk of disinformation and libel. It’s a tricky problem, especially as Genov admits that Pepper does not intervene in any discussion. Discussions are owned by the consumers, which is why Pepper sees itself as a social eCommerce site.

Pepper takes a different approach with scammers. It uses a mix of automation and manual techniques that include fraud detection and ratings around users. It also monitors how a deal is then promoted on the site by other users. Genov believes that this helps to prevent scamming.

ET asked Genov what makes a good deal. His response: heatness. He explained this saying: “People vote whether that’s a hot deal or not a hot deal. If it’s a good deal then it pops up more and more and it gets more and more audiences.” Hot deals have heat, hence the term the company has coined: heatness.
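
The voting mechanism Genov describes can be sketched in a few lines. This is a minimal illustration and not Pepper's implementation: the Deal class, the one-degree-per-vote rule and the rank_by_heat function are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Deal:
    """Hypothetical deal record; the field names are assumptions."""
    title: str
    temperature: int = 0

    def vote(self, hot: bool) -> None:
        # Assumed rule: a "hot" vote adds one degree, a "cold" vote removes one.
        self.temperature += 1 if hot else -1


def rank_by_heat(deals: list[Deal]) -> list[Deal]:
    """Return the deals ordered hottest first."""
    return sorted(deals, key=lambda d: d.temperature, reverse=True)


deals = [Deal("Headphones"), Deal("Laptop"), Deal("Kettle")]
deals[1].vote(hot=True)   # Laptop: +1
deals[1].vote(hot=True)   # Laptop: +2
deals[2].vote(hot=True)   # Kettle: +1
deals[0].vote(hot=False)  # Headphones: -1

ranked_titles = [d.title for d in rank_by_heat(deals)]
print(ranked_titles)
```

With these votes the laptop deal rises to the top and the down-voted headphones sink to the bottom, which is the "pops up more and more" effect in Genov's description.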

What sort of technology challenges is Genov dealing with?

With so many visitors and page views, Pepper gathers around 1 terabyte of data every day. While it sounds a lot, it is not a number that is subject to significant variation, according to Genov. He does not expect this to suddenly jump to 2TB or more in a short space of time. But even if it did, the architecture that Pepper uses is designed to scale.
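
To put that figure in context, a rough back-of-the-envelope calculation relates the daily data volume to the page views quoted earlier. It assumes a 30-day month, decimal terabytes, and that all of the data is attributable to page views. None of these assumptions comes from Genov, so the result is only an order-of-magnitude guide.

```python
# Figures quoted in the article
daily_bytes = 1 * 10**12        # about 1 TB gathered per day (decimal TB assumed)
monthly_views = 500_000_000     # page views per month across all sites

# Assumption: a 30-day month with evenly spread traffic
days_per_month = 30
views_per_day = monthly_views / days_per_month

# Average data gathered per page view, in kilobytes
kb_per_view = daily_bytes / views_per_day / 1000
print(f"{views_per_day:,.0f} views per day, about {kb_per_view:.0f} KB per view")
```

On those assumptions Pepper gathers somewhere in the region of 60 KB for each page view.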

Early on, Pepper used a single MySQL database to support all of its different sites. This left it with a monolithic application and dataset that created its own problems. Among those were issues with backup scripts and updates. Updates are a real issue as users heat up a deal. Genov said that there could be 20,000 to 25,000 people, or more, increasing the temperature of a single deal. That’s a lot of write traffic for the database to capture.

To support this, Genov and his team have moved away from a monolithic app to microservices. Each of these has its own database which means 50-60 different databases across the 12 websites. To do this, Genov and his team have had to create a replicable architecture that they can deploy quickly. One challenge it dealt with early this year was the problem of running on-premises. It has now completed its move to AWS.

Genov sees a number of benefits from cloud. It provides the ability to scale on demand and to deploy new sites quickly. It also means that the team that Genov has had for five years can continue to run the business without having to bring in new people.

What new technologies is Genov considering?

Genov is looking at what he needs to do for the future. The biggest challenge is around the database technology: big data, a columnar database and the ability to process large amounts of data. Genov said: “Aurora is an amazing product but it doesn’t work right out of the box. We are using it on some projects so we gain a little more knowledge. The next thing is columnar databases. We’ve tried Redshift, it’s pretty good and we’ve tried ClickHouse which is one of the best open source databases for on-premises. We’ve decided to use BigQuery, it’s where the BI knowledge within the team is present.”

One of the key targets is to understand personalisation. At the moment, every visitor to every site gets the same page when they arrive. This is a problem as it means there is no personalisation. According to Genov: “All the personalisation of the different deals is based on their temperature. With the high number of users per day we can’t anymore deliver the best service to the users of the website. We have so many deals you will not scroll down to find exact ones. We just need to reorder a little bit what you want to see.”

Enter machine learning

Reordering the data comes with risk. You might be able to filter the data for each user but that leaves them always getting the same deals. It is the challenge with connected fridges that simply reorder what has been used. There is no variety. But can machine learning solve this? Genov sees two very distinct issues.

The first is: “For what are we doing this?” Genov explained the challenge this creates. “What is the main KPI so we can see the model as a success? Will that be a purchase? No! We do not optimise anything for a purchase because the goal of what Pepper does is help with a better buying decision. A good scenario is are you interested? We have KPI about how you consume the website so we know if you are interested in that particular content or not.”

The second issue that Genov identified was: “How much of the content will we personalise? It is not like a regular shop where the items are available all the year. It varies a lot. If it is available it will be by some marketing campaign or something. Maybe they will run out of stock so we need to maintain a really good selection of items, whether they are available or not. We only personalise the last 3-4 days of content. It’s like a constant streaming of content.”
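
The idea of personalising only a short, rolling window of content can be sketched as follows. This is an illustration under stated assumptions and not Pepper's model: the deal schema, the per-category interest weights, the scoring formula and the choice to list recent deals ahead of older ones are all hypothetical.

```python
from datetime import datetime, timedelta

# Assumed upper bound of the "3-4 days" Genov mentions
WINDOW = timedelta(days=4)


def personalise(deals, interest, now, window=WINDOW):
    """Reorder recent deals for one user; older deals stay in temperature order.

    deals:    list of dicts with "title", "category", "temperature" and
              "posted" keys (hypothetical schema)
    interest: dict mapping category to a weight between 0 and 1, assumed to be
              inferred from how the user consumes the site
    """
    by_heat = sorted(deals, key=lambda d: d["temperature"], reverse=True)
    recent = [d for d in by_heat if now - d["posted"] <= window]
    older = [d for d in by_heat if now - d["posted"] > window]

    # Assumed scoring: temperature boosted by the user's interest in the category
    def score(deal):
        return deal["temperature"] * (1 + interest.get(deal["category"], 0.0))

    recent.sort(key=score, reverse=True)
    return recent + older


now = datetime(2019, 10, 1, 12, 0)
deals = [
    {"title": "TV", "category": "electronics", "temperature": 300,
     "posted": now - timedelta(days=1)},
    {"title": "Trainers", "category": "fashion", "temperature": 250,
     "posted": now - timedelta(days=2)},
    {"title": "Blender", "category": "kitchen", "temperature": 500,
     "posted": now - timedelta(days=10)},
]

ordered = [d["title"] for d in personalise(deals, {"fashion": 0.5}, now)]
print(ordered)
```

In this example a user with an interest in fashion sees the trainers ahead of the hotter TV deal, while the ten-day-old blender deal falls outside the window and is not personalised at all. Because the window slides forward every day, any such scoring has to be recomputed continuously.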

Most machine learning solutions rely on vast historical data sets. Focusing on just the last 3-4 days is interesting. The model will be constantly changing, putting pressure on the algorithms that Pepper develops. It will also be interesting to see how effective Genov and his team can make the machine learning solution. Arguably, it will have to completely restructure its decisions on a continuous basis.

Enterprise Times: What does this mean?

Building a social eCommerce business is no trivial matter. Nor is running a company with 12 sites, across multiple countries, getting 25 million visitors and 500 million page views per month. That it was only this year that Pepper moved from on-premises to the cloud says much about the architecture that Genov and his team have put in place.

The question is always about the future. While the daily volume of data is unlikely to change much, continuing to meet user expectation comes with challenges. Which database technology will Genov eventually settle on? Will machine learning deliver the personalisation the way he wants? Does the existing team have the data science knowledge to build that machine learning solution?

It will take time to know the answer to these questions. For now, however, it is clear that Genov and his team are willing to look at the options available to them.


