Portfolio | LeverX

Data Science Solution Development for the Fintech Sector

Written by Danuta | Jun 7, 2022 9:20:39 AM

Merchant recognition solution based on machine learning.

Customer

Data collection service company.

Pain

Insufficient data to attract relevant audiences and maintain the desired level of customer retention.

Solution

A machine-learning-based data science application that lets banks extract valuable data about merchants for further analysis and use it to build stronger strategies.

Project Details

Emerline was involved in the development of a data science solution aimed at providing European banks with detailed, categorized information about how their products (debit and credit cards) are used. The goal was to build a mechanism that automatically extracts information about merchants from customer payments and sorts it into categories. This way, a bank can identify its key merchants and gain insight into customer behavior and the associated risks.

Our team was responsible for creating the ML algorithms that cover the following tasks:

  • Scoring of merchant URLs against multiple criteria
  • Extraction of valid information
  • Product recognition
  • Merchant categorization by accompanying risks

The categorization had to follow the list of categories provided by the client.

Another challenge was to tune URL scoring so that it matched, as closely as possible, the way humans select websites when browsing for information.
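To make the idea of scoring a URL against several criteria more concrete, here is a minimal sketch. The features, the fixed weights, the `AGGREGATOR_HOSTS` set, and all function names are assumptions made up for illustration; they are not the project's actual criteria. According to the technology list, the delivered solution relied on trained models rather than hand-set weights.

```python
import re
from urllib.parse import urlparse

# Hypothetical hosts that are rarely a merchant's own website (assumption).
AGGREGATOR_HOSTS = {"facebook.com", "yelp.com", "wikipedia.org"}

# Hypothetical hand-set weights; a trained model would learn these instead.
WEIGHTS = {
    "is_https": 0.5,
    "name_in_host": 2.0,
    "is_aggregator": -1.5,
    "path_depth": -0.25,
}


def url_features(url: str, merchant_name: str) -> dict:
    """Turn a candidate URL into a few numeric features."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Merchant name tokens of at least 3 characters, e.g. ["acme", "bakery"].
    tokens = [t for t in re.findall(r"[a-z0-9]+", merchant_name.lower()) if len(t) >= 3]
    path_depth = len([part for part in parsed.path.split("/") if part])
    return {
        "is_https": float(parsed.scheme == "https"),
        "name_in_host": float(any(t in host for t in tokens)),
        "is_aggregator": float(
            any(host == a or host.endswith("." + a) for a in AGGREGATOR_HOSTS)
        ),
        "path_depth": float(path_depth),
    }


def score_url(url: str, merchant_name: str) -> float:
    """Weighted sum of the features; higher means a more plausible merchant site."""
    features = url_features(url, merchant_name)
    return sum(WEIGHTS[name] * value for name, value in features.items())


def rank_urls(urls: list, merchant_name: str) -> list:
    """Order candidate URLs from most to least plausible."""
    return sorted(urls, key=lambda u: score_url(u, merchant_name), reverse=True)


candidates = [
    "https://www.facebook.com/acmebakery",
    "https://acmebakery.example/",
    "http://blog.example/posts/2020/acme",
]
best = rank_urls(candidates, "Acme Bakery")[0]
```

In this toy setup, a short HTTPS URL whose host contains the merchant name ranks above a social media page or a deep blog link, which roughly mirrors what a person would click first.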

How Does The Solution Work?

The delivered solution works as follows:

  1. A customer makes a purchase with a debit or credit card.
  2. Brief information about the purchase (where it took place, the merchant's name, the merchant category code (MCC), etc.) is sent to the bank.
  3. The bank's server passes this information to the server of the data science solution we worked on.
  4. The solution's servers search for information about the merchant and use machine learning to determine what products or services the merchant provides.
  5. Based on the extracted data, the solution assigns categories to the merchant and sends them to the bank.
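The data flow in the steps above can be sketched in a few lines. Everything here is hypothetical: the `Transaction` and `MerchantReport` types, the `process_transaction` function, and the stub fetch and classify callables stand in for components the case study does not describe in detail.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Transaction:
    """Purchase details a bank forwards to the solution (step 2 and step 3)."""
    merchant_name: str
    location: str
    mcc: str  # merchant category code


@dataclass
class MerchantReport:
    """What the solution sends back to the bank (step 5)."""
    merchant_name: str
    categories: List[str]


def process_transaction(
    tx: Transaction,
    fetch_text: Callable[[Transaction], str],
    classify: Callable[[str], List[str]],
) -> MerchantReport:
    text = fetch_text(tx)        # step 4: gather information about the merchant
    categories = classify(text)  # steps 4 and 5: ML-based categorization
    return MerchantReport(tx.merchant_name, categories)


# Stubs standing in for the web lookup and the trained classifier (assumptions).
def fake_fetch(tx: Transaction) -> str:
    return f"{tx.merchant_name} sells fresh bread and pastries"


def fake_classify(text: str) -> List[str]:
    return ["bakery"] if "bread" in text else ["unknown"]


report = process_transaction(
    Transaction("Acme Bakery", "Vilnius", "5462"), fake_fetch, fake_classify
)
```

Passing the lookup and the classifier in as callables keeps the flow testable without network access or a trained model.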

Technologies Used

MERCHANT CLASSIFICATION

  • Gensim
  • Doc2Vec
  • TF-IDF
  • Logistic regression
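To show how TF-IDF features and logistic regression fit together for text classification, here is a minimal sketch written with the standard library only. It is a two-class toy example with made-up training texts; the real project categorized merchants against the client's category list and would normally use a library implementation rather than this hand-rolled one. All function names are assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}


def tfidf(doc, idf):
    """L2-normalized sparse TF-IDF vector; terms unseen in training are dropped."""
    counts = Counter(tok for tok in tokenize(doc) if tok in idf)
    vec = {tok: count * idf[tok] for tok, count in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {tok: v / norm for tok, v in vec.items()}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(vectors, labels, epochs=200, lr=0.5):
    """Binary logistic regression trained with plain stochastic gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            z = bias + sum(weights.get(t, 0.0) * v for t, v in vec.items())
            err = sigmoid(z) - y  # gradient of the log loss with respect to z
            bias -= lr * err
            for t, v in vec.items():
                weights[t] = weights.get(t, 0.0) - lr * err * v
    return weights, bias


def predict_proba(text, idf, weights, bias):
    vec = tfidf(text, idf)
    return sigmoid(bias + sum(weights.get(t, 0.0) * v for t, v in vec.items()))


# Made-up merchant descriptions: 1 = restaurant, 0 = electronics.
docs = [
    "pizza pasta and fresh salads delivered",
    "burgers fries and milkshakes menu",
    "laptops phones and tablets on sale",
    "headphones cameras and phones store",
]
labels = [1, 1, 0, 0]

idf = fit_idf(docs)
weights, bias = train_logreg([tfidf(d, idf) for d in docs], labels)

p_food = predict_proba("pizza and burgers menu", idf, weights, bias)
p_tech = predict_proba("cameras and laptops", idf, weights, bias)
```

Here `p_food` ends up above 0.5 and `p_tech` below it, because terms seen only in one class pull the score toward that class while a shared word such as "and" carries little weight.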

URL SCORING

  • XGBoost
  • LightGBM
  • Logistic regression
  • Optuna

PRODUCT RECOGNITION

  • Plain Python
  • NumPy
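As a rough illustration of product recognition in plain Python, the sketch below matches a product list against page text. The `PRODUCTS` list, the sample text, and the function names are assumptions for illustration; the list actually compiled during the project is not given in this case study, and the sketch does not use NumPy.

```python
import re

# Hypothetical product list (assumption); the real one was compiled during the project.
PRODUCTS = ["coffee", "espresso machine", "running shoes", "laptop"]


def build_matcher(products):
    """Compile one case-insensitive pattern that matches any whole product name."""
    # Longest names first so a multi-word product wins over a shorter overlapping term.
    ordered = sorted(products, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(p) for p in ordered) + r")\b"
    return re.compile(pattern, re.IGNORECASE)


def recognize_products(text, matcher):
    """Count how often each known product is mentioned in the text."""
    counts = {}
    for match in matcher.finditer(text):
        name = match.group(0).lower()
        counts[name] = counts.get(name, 0) + 1
    return counts


matcher = build_matcher(PRODUCTS)
found = recognize_products(
    "We sell Coffee beans, an espresso machine, and more coffee.", matcher
)
```

The word boundaries keep partial matches out, so "laptops" or "coffeehouse" would not count as a hit for "laptop" or "coffee" in this simple version.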

Results

During development, our team promptly addressed challenges such as handling invalid URLs and compiling the list of products for the system to recognize. The result is a well-thought-out solution that gathers important information about merchants and customer behavior. With this system, the client can strengthen its position in the market of data collection service providers by offering it to banks, which can use it to extract useful information.