
International Journal of Innovative Science and Research Technology
Volume 10, Issue 3, March 2025
ISSN No: 2456-2165    https://doi.org/10.38124/ijisrt/25mar416

Comparative Analysis of Gradient Boosting and Transformer Based Models for Binary Classification in Tabular Data
A Customer Churn Case Study

Jebaraj Vasudevan¹
¹ Visa Inc., Atlanta, GA, USA

Publication Date: 2025/03/20

Abstract: This study compares the classification performance of Gradient Boosting (XGBoost) and a Transformer-based model with multi-head self-attention on tabular data. While the two methods exhibit broadly similar performance, the Transformer model exceeds XGBoost in Recall by about 8%, suggesting that it is better suited to applications such as fraud detection in payment processing and medical diagnostics.

Keywords: Transformer, Gradient Boosting, XGBoost, Tabular Data.

How to Cite: Jebaraj Vasudevan (2025). Comparative Analysis of Gradient Boosting and Transformer Based Models for Binary Classification in Tabular Data. International Journal of Innovative Science and Research Technology, 10(3), 466-470. https://doi.org/10.38124/ijisrt/25mar416

I. INTRODUCTION

Tabular data is ubiquitous in industry because it is inherently structured, easily interpretable, and compatible with a wide range of analytical and reporting tools. Its organization in rows and columns simplifies the process of data storage, retrieval, and manipulation, which is why relational databases, spreadsheets, and data warehouses predominantly use this format.

Industries such as finance, healthcare, retail, telecommunications, and manufacturing rely heavily on tabular data. In finance, for instance, transaction records, market data, and risk assessments are typically stored in structured tables, facilitating quantitative analyses and regulatory reporting. In healthcare, patient records, laboratory results, and treatment histories are maintained in tabular formats to support clinical decision-making and research. Retail and e-commerce sectors use tabular data for inventory management, sales tracking, and customer behavior analysis, while telecommunications companies employ it for billing, service usage, and churn prediction.

The prevalence of tabular data across these sectors highlights its role in enabling robust, data-driven decision-making and operational efficiency. Its simplicity and versatility make it a cornerstone of analytical workflows in both traditional and modern digital enterprises.

This case study presents a comparative analysis of XGBoost [1] and TabTransformer [2], two of the most popular supervised learning algorithms for tabular data. We chose the task of evaluating their performance on a binary churn prediction problem using the Telco Customer Churn data [3]. The algorithms exhibit a similar level of performance on multiple classification metrics, while the TabTransformer outperforms XGBoost on Recall by +8%.

Comparing XGBoost and TabTransformer reveals distinct methodologies that cater to different aspects of tabular data modeling. XGBoost, a gradient boosting framework, is lauded for its efficiency in handling structured data. It builds ensembles of decision trees using gradient statistics and regularization, resulting in robust models that mitigate overfitting and offer clear interpretability. The algorithm has been refined over years and is widely adopted in industry and research due to its computational speed and ease of deployment. In contrast, TabTransformer harnesses the power of transformer architectures originally designed for natural language processing. By applying self-attention mechanisms, TabTransformer captures complex, non-linear interactions among features, providing a deep representation of data relationships. While XGBoost excels in scenarios where model transparency and speed are paramount, TabTransformer demonstrates potential in situations with intricate feature dependencies that require nuanced contextual understanding. The choice between these methods depends on the problem domain, computational resources, and the need for model interpretability versus expressive power. Both approaches offer complementary strengths; combining them might even enhance performance in hybrid systems. Ultimately, their continued development reflects the dynamic evolution of machine learning techniques for structured data analysis. This comparative review highlights the importance of aligning algorithm selection with specific data challenges.


II. SCOPE

 XGBoost
XGBoost is a highly efficient, scalable gradient boosting algorithm that has revolutionized machine learning practices across various domains. It constructs an ensemble of decision trees in a sequential manner, optimizing each new tree based on the residual errors of previous iterations. By employing both first-order and second-order gradient statistics, XGBoost effectively minimizes loss functions while integrating regularization techniques to prevent overfitting. The algorithm is well known for its speed and performance, especially on large and complex datasets. Its implementation supports parallel processing and distributed computing, enabling the analysis of massive datasets with ease. Additionally, XGBoost provides robust handling of missing values and sparse data through innovative approaches such as the weighted quantile sketch. The framework is highly customizable, accommodating various objective functions, including regression, classification, and ranking. As a result, it has become a favored choice in data science competitions and industry applications. With a strong emphasis on interpretability and computational efficiency, XGBoost has significantly contributed to the advancement of predictive analytics and remains a critical tool for researchers and practitioners aiming to extract meaningful insights from data. Furthermore, its design enables seamless integration with various programming languages and data processing libraries, making it a versatile solution for research and industry applications.

 TabTransformer
TabTransformer is an innovative neural architecture designed specifically for tabular data analysis by leveraging the principles of transformer models. It extends the self-attention mechanism, which is central to transformers, to capture intricate relationships among features in structured datasets. Transformers, initially introduced for natural language processing, utilize multi-head self-attention to assess the significance of each input element, regardless of their order. In TabTransformer, categorical features are first transformed into dense embeddings, which are then processed through a series of transformer layers. These layers enable the model to learn complex, non-linear interactions among variables, facilitating superior feature representation. The self-attention mechanism allows the model to dynamically weigh contributions from different features, thus enhancing predictive accuracy and robustness. Moreover, the architecture seamlessly integrates with traditional deep learning frameworks, making it adaptable to various data science tasks. By combining the strengths of transformer architectures with specialized adaptations for tabular data, TabTransformer offers a novel approach to overcome limitations of conventional methods. Its design represents a convergence of ideas from natural language processing and structured data modeling, offering promising potential in fields requiring high interpretability and performance. This approach not only enhances model efficiency but also paves the way for future innovations in data representation.

 Data
The Telco Customer Churn data from Kaggle contains real-world data collected from a telecommunications company, capturing various aspects of customer behavior and account characteristics. The dataset includes demographic details, account information, service subscriptions, billing data, and usage metrics. The primary target variable is a binary indicator representing whether a customer has discontinued their service ("Churn"), making it a popular benchmark for binary classification tasks focused on customer attrition.

The dataset's structure—with a mix of categorical features (e.g., gender, contract type, payment method) and numerical features (e.g., tenure, monthly charges, total charges)—requires robust preprocessing and feature engineering. Researchers and practitioners have leveraged this dataset to test various data transformation and modeling approaches, as its inherent challenges, such as handling missing values and imbalanced classes, reflect real business scenarios.

Due to its practical significance, the Telco Customer Churn dataset is frequently used in both academic studies and industrial applications. It helps organizations develop predictive models aimed at understanding and mitigating churn, ultimately supporting customer retention strategies through data-driven insights.
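To make the preprocessing concrete, the following is a minimal loading sketch. The file name, the handling of the blank TotalCharges entries, and the label encoding are assumptions based on the public Kaggle version of the dataset [3], not steps reported in this paper.

```python
import pandas as pd

# Load the public Telco Customer Churn CSV (file name assumed from the Kaggle dataset [3]).
df = pd.read_csv("WA_Fn-UseC_-Telco-Customer-Churn.csv")

# TotalCharges is read as text because a handful of rows contain blanks;
# coerce to numeric and drop the resulting missing values (imputation is an alternative).
df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce")
df = df.dropna(subset=["TotalCharges"])

# Map the binary target to 0/1 and drop the customer identifier.
df["Churn"] = df["Churn"].map({"No": 0, "Yes": 1})
df = df.drop(columns=["customerID"])

categorical_cols = df.select_dtypes(include="object").columns.tolist()
numerical_cols = [c for c in df.columns if c not in categorical_cols + ["Churn"]]
```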
III. IMPLEMENTATION

The TabTransformer [2] is organized into three principal components: a dedicated column embedding layer, a succession of N Transformer layers, and a concluding multilayer perceptron. Each Transformer layer, as described by [4], integrates a multi-head self-attention mechanism that dynamically models inter-feature dependencies, followed by a position-wise feed-forward network that refines the learned representations. This configuration facilitates the extraction of complex interactions within categorical data while seamlessly integrating numerical inputs, ultimately enhancing predictive performance on tabular datasets.
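The reference implementation is described in [2]; the following is a minimal PyTorch sketch of the three components named above (per-column embeddings, a stack of Transformer layers, and a final MLP). The layer sizes and the use of nn.TransformerEncoder are illustrative choices, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class TabTransformerSketch(nn.Module):
    """Column embeddings -> Transformer encoder -> concat with numeric features -> MLP."""

    def __init__(self, cat_cardinalities, num_continuous, embed_dim=32,
                 n_layers=6, n_heads=8, mlp_hidden=64, n_classes=2):
        super().__init__()
        # One embedding table per categorical column.
        self.embeddings = nn.ModuleList(
            [nn.Embedding(card, embed_dim) for card in cat_cardinalities]
        )
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=n_heads,
            dim_feedforward=4 * embed_dim, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        in_features = len(cat_cardinalities) * embed_dim + num_continuous
        self.mlp = nn.Sequential(
            nn.Linear(in_features, mlp_hidden),
            nn.ReLU(),
            nn.Linear(mlp_hidden, n_classes),
        )

    def forward(self, x_cat, x_cont):
        # x_cat: (batch, num_cat) integer codes; x_cont: (batch, num_continuous) floats.
        E = torch.stack(
            [emb(x_cat[:, i]) for i, emb in enumerate(self.embeddings)], dim=1
        )                                   # (batch, num_cat, embed_dim)
        E_ctx = self.encoder(E)             # contextualized categorical embeddings
        flat = E_ctx.flatten(start_dim=1)   # (batch, num_cat * embed_dim)
        return self.mlp(torch.cat([flat, x_cont], dim=1))  # prediction logits
```

Training then minimizes a cross-entropy loss over the embedding, encoder, and MLP parameters with gradient descent, as detailed in the Forward Pass section below.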


Fig 1 Tab Transformer Architecture [2]

 Forward Pass

 Embedding Categorical Inputs
In the forward method, each column of the categorical input x_cat is passed through its corresponding embedding layer. These embeddings are stacked along a new dimension to form a tensor E with shape (batch, num_cat, embed_dim).

 Transformer Encoder
The stacked embeddings E are passed through the transformer encoder. This layer applies multi-head self-attention (explained in detail below), allowing the model to learn complex interdependencies between the different categorical features.

 Concatenation and Prediction
The output E′ from the previous layer is flattened to a vector and concatenated with the numerical features x_cont ∈ ℝ^c, where c is the number of continuous features. The resulting vector is processed by the MLP to yield the final prediction logits.

For our classification task, let C be the cross-entropy loss; we want to minimize the following loss function L(x, y) to learn all the parameters end-to-end with gradient descent. The TabTransformer parameters include φ for the column embedding, θ for the Transformer layers, and ψ for the top MLP layer:

L(x, y) ≡ C( g_ψ( f_θ(E_φ(x_cat)), x_cont ), y )    (1)

 Multi-Head Self-Attention
In the formulation presented by [4], the Transformer architecture is structured around a multi-head self-attention mechanism followed by a position-wise feed-forward network, with both sub-layers augmented by residual connections and layer normalization. The self-attention mechanism operates via three learnable projection matrices—namely, Key, Query, and Value. Each input embedding is projected onto these matrices to produce its corresponding key, query, and value vectors. Formally, let K ∈ ℝ^(m×k), Q ∈ ℝ^(m×k), V ∈ ℝ^(m×v) denote the matrices containing the key, query, and value vectors for m input embeddings, where k and v represent the dimensions of the key and value vectors, respectively. Each embedding then computes attention over all embeddings via an attention head defined by

Attention(K, Q, V) = A · V    (2)

with the attention weights given by

A = softmax( QKᵀ / √k )    (3)

Here, the matrix A ∈ ℝ^(m×m) quantifies the degree to which each embedding attends to every other embedding, thereby producing contextually enriched representations. Following the attention operation, the output—originally of dimension v—is re-projected to the embedding dimension d via a fully connected layer. This is then processed sequentially by two position-wise feed-forward layers, where the first layer expands the dimensionality to four times the original size and the second layer subsequently reduces it back to d.
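As a worked illustration of Eqs. (2) and (3), the single-head sketch below computes the attention weights A and the attended output for m input embeddings. It deliberately omits the multi-head split, the output re-projection, and the residual and normalization steps described above, and the dimensions used are arbitrary.

```python
import torch
import torch.nn.functional as F

def single_head_attention(E, W_q, W_k, W_v):
    """E: (m, d) input embeddings; W_q, W_k: (d, k) projections; W_v: (d, v) projection."""
    Q = E @ W_q                                     # queries, shape (m, k)
    K = E @ W_k                                     # keys,    shape (m, k)
    V = E @ W_v                                     # values,  shape (m, v)
    k_dim = Q.shape[-1]
    A = F.softmax(Q @ K.T / k_dim ** 0.5, dim=-1)   # Eq. (3): (m, m) attention weights
    return A @ V                                    # Eq. (2): contextualized output, (m, v)

# Example: 5 feature embeddings of dimension 16, with k = v = 8.
E = torch.randn(5, 16)
W_q, W_k, W_v = (torch.randn(16, 8) for _ in range(3))
out = single_head_attention(E, W_q, W_k, W_v)       # shape (5, 8)
```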


IV. ANALYSIS

 Feature Engineering
We used several new features to enhance model performance by providing additional context and capturing non-linear relationships within the data. Below is an explanation of the key engineered features and their potential impact:

 Average Monthly Charge

AvgMonthlyCharge = TotalCharges / Tenure

This feature normalizes the total spending by the length of the customer's relationship, highlighting customers who incur higher charges relative to their engagement duration. It may indicate dissatisfaction or financial stress, both of which can correlate with churn.

 Service Count
By summing binary indicators for various service features (e.g., OnlineSecurity, OnlineBackup, DeviceProtection, TechSupport, StreamingTV, and StreamingMovies), we created a feature:

ServiceCount = Σ_{i=1}^{n} 1(Service_i = "Yes")

This aggregation provides a measure of customer engagement with additional services, which can be a proxy for loyalty. A higher count may imply a deeper investment in the ecosystem, potentially reducing churn risk.

 Tenure Binning
Instead of using the continuous tenure variable directly, we segmented tenure into categorical bins (e.g., 0–12 months, 13–24 months, etc.). This transformation captures non-linear effects, as churn likelihood may change drastically at different stages of a customer's lifecycle.

 Interaction Features
We explored interaction terms such as the product of Monthly Charges and Contract type, which can reveal combined effects where, for example, high charges paired with a month-to-month contract might be a stronger churn signal than either feature in isolation.

These engineered features enrich the dataset by providing more nuanced signals for the learning algorithms. For XGBoost, the additional numerical variables enhance tree-splitting decisions, while for TabTransformer, they offer extra context that complements the embedded representations of categorical data. Overall, these features aim to improve the models' ability to detect subtle patterns and relationships that contribute to customer churn.
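A minimal pandas sketch of these transformations is shown below. Column names follow the Kaggle dataset; the bin edges and the ordinal encoding of Contract used for the interaction term are illustrative assumptions rather than choices reported in this paper.

```python
import pandas as pd

service_cols = ["OnlineSecurity", "OnlineBackup", "DeviceProtection",
                "TechSupport", "StreamingTV", "StreamingMovies"]

def add_engineered_features(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()

    # Average monthly charge: total spend normalized by tenure (guard against tenure = 0).
    out["AvgMonthlyCharge"] = out["TotalCharges"] / out["tenure"].replace(0, 1)

    # Service count: number of add-on services the customer subscribes to.
    out["ServiceCount"] = (out[service_cols] == "Yes").sum(axis=1)

    # Tenure binning: segment the customer lifecycle (bin edges are illustrative).
    out["TenureBin"] = pd.cut(out["tenure"], bins=[0, 12, 24, 48, 72],
                              labels=["0-12", "13-24", "25-48", "49-72"],
                              include_lowest=True)

    # Interaction: monthly charges weighted by contract length (assumed ordinal encoding).
    contract_rank = {"Month-to-month": 0, "One year": 1, "Two year": 2}
    out["ChargeXContract"] = out["MonthlyCharges"] * out["Contract"].map(contract_rank)

    return out
```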
 Methodology and Metrics
Both models were trained using the same set of features, and training was stopped as soon as the loss on unseen data stopped improving (early stopping); a sketch of this procedure appears after the metric definitions below.

The models are compared using several performance metrics, reported in Table 1, that provide a comprehensive view of their classification abilities. These include:

 Accuracy: Measures the overall proportion of correct predictions.
 Precision: Evaluates the correctness of positive predictions, indicating how many predicted positives are true positives.
 Recall (Sensitivity): Assesses the model's ability to identify all actual positive cases.
 F1 Score: The harmonic mean of precision and recall, offering a balance between them.
 Area Under the ROC Curve (AUC): Captures the trade-off between true positive and false positive rates across different thresholds.
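For the XGBoost side, a hedged sketch of this procedure is shown below, assuming xgboost ≥ 1.6 and the loading and feature-engineering sketches from earlier sections. The split ratio, hyperparameters, and one-hot encoding of categoricals are illustrative; the TabTransformer is trained analogously with early stopping on the same split.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)
from xgboost import XGBClassifier

# Continuing from the loading / feature-engineering sketches above.
X = add_engineered_features(df).drop(columns=["Churn"])
y = df["Churn"]
X_enc = pd.get_dummies(X)  # one-hot encode categoricals for XGBoost
X_train, X_valid, y_train, y_valid = train_test_split(
    X_enc, y, test_size=0.2, stratify=y, random_state=42)

model = XGBClassifier(
    n_estimators=500, learning_rate=0.05, max_depth=4,
    eval_metric="logloss", early_stopping_rounds=20)
# Stop adding trees once the held-out log loss stops improving.
model.fit(X_train, y_train, eval_set=[(X_valid, y_valid)], verbose=False)

proba = model.predict_proba(X_valid)[:, 1]
pred = (proba >= 0.5).astype(int)
print("Accuracy :", accuracy_score(y_valid, pred))
print("Precision:", precision_score(y_valid, pred))
print("Recall   :", recall_score(y_valid, pred))
print("F1       :", f1_score(y_valid, pred))
print("AUC      :", roc_auc_score(y_valid, proba))
```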

Table 1 Metrics Comparing the Model Performance on Unseen Data

                 Accuracy   Precision   Recall   F1      AUC
XGBoost          79.4%      64.3%       50.2%    56.5%   84.1%
TabTransformer   79.5%      63.1%       54.8%    58.6%   83.6%

As evident from the table shown above, the models have very similar overall performance, in line with what [2] also observed in their results. What we also see, however, is that the Transformer model outperforms the boosting method in Recall on the positive examples by about 8%. This matters in scenarios where the cost of missing a true positive far outweighs the inconvenience or cost of incorrectly flagging a negative instance as positive. For instance, in medical diagnostics—such as screening for cancer or infectious diseases—failing to identify a diseased patient (a false negative) can have severe or even fatal consequences, whereas a false positive might lead to further testing that, while potentially anxiety-inducing and costly, is comparatively less harmful. Similarly, in fraud detection systems, overlooking a fraudulent transaction could result in substantial financial loss, making it preferable to flag more transactions for review even if some are false alarms. In these circumstances, the Transformer-based model can be preferred over Gradient Boosting (XGBoost).




Fig 2 ROC Curve

Additionally, ROC curves, as shown in Fig 2, are plotted to visually analyze the distribution of classification errors and to assess model discrimination capabilities. These combined metrics allow for a detailed scientific comparison between the XGBoost and TabTransformer models, highlighting strengths and potential trade-offs in different aspects of performance.
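A comparison along the lines of Fig 2 can be produced with the sketch below, assuming held-out positive-class probabilities from the two fitted models (here named xgb_proba and tab_proba) and the y_valid labels from the earlier training sketch.

```python
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

def plot_roc(y_true, scores_by_model):
    """scores_by_model: dict mapping model name -> predicted positive-class probabilities."""
    for name, scores in scores_by_model.items():
        fpr, tpr, _ = roc_curve(y_true, scores)
        plt.plot(fpr, tpr, label=f"{name} (AUC = {roc_auc_score(y_true, scores):.3f})")
    plt.plot([0, 1], [0, 1], linestyle="--", color="grey")  # chance diagonal
    plt.xlabel("False Positive Rate")
    plt.ylabel("True Positive Rate")
    plt.legend()
    plt.show()

# xgb_proba / tab_proba: held-out probabilities from the two models (assumed available).
plot_roc(y_valid, {"XGBoost": xgb_proba, "TabTransformer": tab_proba})
```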

V. CONCLUSION

In conclusion, this study provides a comprehensive comparative analysis of Gradient Boosting (XGBoost) and Transformer-based models for binary classification in tabular data. Both models exhibit similar performance across various metrics, with the Transformer model demonstrating a notable advantage in recall. This suggests that the Transformer model may be better suited for applications where the cost of false negatives is high, such as fraud detection and medical diagnostics. The findings underscore the importance of aligning model selection with specific data challenges and application requirements. Future research could explore hybrid approaches that combine the strengths of both models to further enhance performance. Overall, this study contributes valuable insights into the evolving landscape of machine learning techniques for structured data analysis.

REFERENCES

[1]. T. Chen and C. Guestrin, "XGBoost: A Scalable Tree Boosting System," 2016.
[2]. X. Huang, A. Khetan, M. Cvitkovic and Z. Karnin, "TabTransformer: Tabular Data Modeling Using Contextual Embeddings," 2020.
[3]. "Kaggle," [Online]. Available: https://www.kaggle.com/datasets/blastchar/telco-customer-churn/data.
[4]. A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones and A. Gomez, "Attention Is All You Need," 2017.

