Mab2Rec: Bandit-based Recommenders

Bandit Policies

MABWiser

Core engine to create multi-armed bandit recommendation algorithms used by Mab2Rec.

pip install mabwiser

Code API Docs Bridge@AAAI'24 Arxiv'24 TMLR'22 IJAIT'21 JDSA'21 ICTAI'19

# An example that shows how to use the UCB1 learning policy
# to choose between two arms based on their expected rewards.

# Import MABWiser Library
from mabwiser.mab import MAB, LearningPolicy, NeighborhoodPolicy

# Data
arms = ['Arm1', 'Arm2']
decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
rewards = [20, 17, 25, 9]

# Model
mab = MAB(arms, LearningPolicy.UCB1(alpha=1.25))

# Train
mab.fit(decisions, rewards)

# Test
mab.predict()
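
To make the selection step above concrete, here is a minimal pure-Python sketch of a UCB1-style score. It assumes the textbook form, mean reward plus alpha * sqrt(2 * ln(N) / n_arm). The helper name `ucb1_scores` is hypothetical and is not part of MABWiser, whose internal computation may differ in detail.

```python
import math

def ucb1_scores(decisions, rewards, alpha=1.0):
    """Return a UCB1-style score per arm (hypothetical helper, not MABWiser's API)."""
    totals, counts = {}, {}
    for arm, reward in zip(decisions, rewards):
        totals[arm] = totals.get(arm, 0.0) + reward
        counts[arm] = counts.get(arm, 0) + 1

    n_total = len(decisions)  # total number of observed decisions
    scores = {}
    for arm, n_arm in counts.items():
        mean = totals[arm] / n_arm
        # Exploration bonus shrinks as an arm is tried more often (assumed textbook form)
        bonus = alpha * math.sqrt(2.0 * math.log(n_total) / n_arm)
        scores[arm] = mean + bonus
    return scores

decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
rewards = [20, 17, 25, 9]

scores = ucb1_scores(decisions, rewards, alpha=1.25)
best_arm = max(scores, key=scores.get)
print(scores, best_arm)
```

With this data, Arm2 has both the higher mean reward and the larger exploration bonus (it was tried only once), so the sketch selects Arm2.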

@article{mabwiser,
  title={MABWiser: parallelizable contextual multi-armed bandits},
  author={Strong, Emily and Kleynhans, Bernard and Kad{\i}o{\u{g}}lu, Serdar},
  journal={International Journal on Artificial Intelligence Tools},
  volume={30},
  number={04},
  pages={2150021},
  year={2021},
  publisher={World Scientific}
}

Bernard Kleynhans, Emily Strong, Xin Wang, Serdar Kadıoğlu

Item Features

TextWiser

Text featurization toolkit to create item representations for recommender models.

pip install textwiser

Code API Docs AAAI'21 Slides

# Conceptually, TextWiser is composed of an Embedding, potentially with a pretrained model,
# that can be chained into zero or more Transformations
from textwiser import TextWiser, Embedding, Transformation, WordOptions, PoolOptions

# Data
documents = ["Some document", "More documents. Including multi-sentence documents."]

# Model: TFIDF `min_df` parameter gets passed to sklearn automatically
emb = TextWiser(Embedding.TfIdf(min_df=1))

# Model: TFIDF followed by NMF and then SVD
emb = TextWiser(Embedding.TfIdf(min_df=1), [Transformation.NMF(n_components=30), Transformation.SVD(n_components=10)])

# Model: Word2Vec with no pretraining that learns from the input data
emb = TextWiser(Embedding.Word(word_option=WordOptions.word2vec, pretrained=None), Transformation.Pool(pool_option=PoolOptions.min))

# Model: BERT with the pretrained bert-base-uncased embedding
emb = TextWiser(Embedding.Word(word_option=WordOptions.bert), Transformation.Pool(pool_option=PoolOptions.first))

# Features
vecs = emb.fit_transform(documents)
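
The pooling transformations used above turn a sequence of word vectors into a single document vector. The following pure-Python sketch illustrates that idea under the assumption that `min` means an element-wise minimum over words and `first` means keeping the first word's vector. The function `pool` and the toy vectors are hypothetical and are not TextWiser code.

```python
def pool(word_vectors, option):
    """Collapse equal-length word vectors into one document vector (illustrative only)."""
    if not word_vectors:
        raise ValueError("need at least one word vector")
    if option == "first":
        # Keep only the vector of the first word
        return list(word_vectors[0])
    if option == "min":
        # Element-wise minimum across all words
        return [min(column) for column in zip(*word_vectors)]
    if option == "max":
        # Element-wise maximum across all words
        return [max(column) for column in zip(*word_vectors)]
    raise ValueError(f"unknown pooling option: {option}")

# Toy 3-dimensional vectors for a two-word document (made-up numbers)
word_vectors = [[0.2, -1.0, 0.5],
                [0.1, 0.3, 0.9]]

doc_min = pool(word_vectors, "min")
doc_first = pool(word_vectors, "first")
print(doc_min, doc_first)
```

Whatever the document length, the pooled output has the dimensionality of a single word vector, which is what lets downstream models consume fixed-size item features.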

@inproceedings{textwiser,
  title={Representing the unification of text featurization using a context-free grammar},
  author={Kilit{\c{c}}io{\u{g}}lu, Doruk and Kad{\i}o{\u{g}}lu, Serdar},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={35},
  number={17},
  pages={15439--15445},
  year={2021}
}

Karthik Uppuluri, Doruk Kilitçioğlu, Serdar Kadıoğlu

User Features

Selective

Feature selection toolkit to create user representations from structured and high-dimensional data.

pip install selective

Code AMAI'24 CPAIOR'21 CPAIOR'21 DSO@IJCAI'21 🤗 Dataset NVIDIA GTC Slides

# Import Selective and SelectionMethod
from sklearn.datasets import fetch_california_housing
from feature.utils import get_data_label
from feature.selector import Selective, SelectionMethod

# Data
data, label = get_data_label(fetch_california_housing())

# Feature selectors from simple to more complex
selector = Selective(SelectionMethod.Variance(threshold=0.0))
selector = Selective(SelectionMethod.Correlation(threshold=0.5, method="pearson"))
selector = Selective(SelectionMethod.Statistical(num_features=3, method="anova"))
selector = Selective(SelectionMethod.Linear(num_features=3, regularization="none"))
selector = Selective(SelectionMethod.TreeBased(num_features=3))

# Feature reduction
subset = selector.fit_transform(data, label)
print("Reduction:", list(subset.columns))
print("Scores:", list(selector.get_absolute_scores()))
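
As a rough illustration of the simplest selector above, the sketch below filters columns by variance in pure Python. It assumes population variance and a strict "greater than threshold" rule. The function `variance_filter` and the toy columns are hypothetical; Selective's own implementation may differ.

```python
def variance_filter(columns, threshold=0.0):
    """Keep column names whose variance exceeds the threshold (illustrative only)."""
    kept = []
    for name, values in columns.items():
        mean = sum(values) / len(values)
        # Population variance (an assumption made for this sketch)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        if variance > threshold:
            kept.append(name)
    return kept

# Toy data: "constant" never changes, so it carries no signal
columns = {
    "rooms": [3, 4, 5, 6],
    "constant": [1, 1, 1, 1],
    "age": [10, 10, 20, 20],
}

kept = variance_filter(columns, threshold=0.0)
print("Kept:", kept)
```

A threshold of 0.0 only drops constant columns; raising it removes features that vary little across users.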

@article{selective,
  title={Integrating optimized item selection with active learning for continuous exploration in recommender systems},
  author={Kad{\i}o{\u{g}}lu, Serdar and Kleynhans, Bernard and Wang, Xin},
  journal={Annals of Mathematics and Artificial Intelligence},
  volume={92},
  number={6},
  pages={1585--1607},
  year={2024},
  publisher={Springer}
}

Xin Wang, Bernard Kleynhans, Serdar Kadıoğlu

Sequential Pattern Mining

Seq2Pat

Sequential pattern mining toolkit to construct user representations from temporal interaction sequences.

pip install seq2pat

Code API Docs AI Magazine'23 Bridge@AAAI'23 AAAI'22 Frontiers'22 KDF@AAAI'22 CMU Blog Post

# Example to show how to find frequent sequential patterns
# from a given sequence database subject to constraints
from sequential.seq2pat import Seq2Pat, Attribute

# Seq2Pat over 3 sequences
seq2pat = Seq2Pat(sequences=[["A", "A", "B", "A", "D"],
                             ["C", "B", "A"],
                             ["C", "A", "C", "D"]])

# Price attribute corresponding to each item
price = Attribute(values=[[5, 5, 3, 8, 2],
                          [1, 3, 3],
                          [4, 5, 2, 1]])

# Average price constraint
seq2pat.add_constraint(3 <= price.average() <= 4)

# Patterns that occur at least twice (A-D)
patterns = seq2pat.get_patterns(min_frequency=2)
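
To see why A-D is the pattern found above, the brute-force sketch below enumerates every subsequence, keeps those whose average price lies in [3, 4], and counts how many sequences support each pattern. It assumes patterns have at least two items and that frequency counts sequences rather than occurrences. The function `constrained_patterns` is hypothetical; Seq2Pat itself uses a far more efficient algorithm.

```python
from itertools import combinations

def constrained_patterns(sequences, values, low, high, min_frequency, min_length=2):
    """Brute-force constrained pattern mining (illustrative, exponential time)."""
    support = {}
    for seq, vals in zip(sequences, values):
        found = set()
        for length in range(min_length, len(seq) + 1):
            # Index combinations preserve order, so each one is a subsequence
            for idx in combinations(range(len(seq)), length):
                average = sum(vals[i] for i in idx) / length
                if low <= average <= high:
                    found.add(tuple(seq[i] for i in idx))
        # Each sequence supports a pattern at most once
        for pattern in found:
            support[pattern] = support.get(pattern, 0) + 1
    return {p: c for p, c in support.items() if c >= min_frequency}

sequences = [["A", "A", "B", "A", "D"],
             ["C", "B", "A"],
             ["C", "A", "C", "D"]]
prices = [[5, 5, 3, 8, 2],
          [1, 3, 3],
          [4, 5, 2, 1]]

result = constrained_patterns(sequences, prices, low=3, high=4, min_frequency=2)
print(result)
```

In the first sequence, A (price 5) followed by D (price 2) averages 3.5; in the third, A (5) followed by D (1) averages 3. No other pattern of length two or more satisfies the constraint in two sequences.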

@article{seq2pat,
  title={Seq2Pat: Sequence-to-pattern generation to bridge pattern mining with machine learning},
  author={Kad{\i}o{\u{g}}lu, Serdar and Wang, Xin and Hosseininasab, Amin and van Hoeve, Willem-Jan},
  journal={AI Magazine},
  volume={44},
  number={1},
  pages={54--66},
  year={2023},
  publisher={Wiley Online Library}
}

Xin Wang, Amin Hosseininasab, Willem-Jan van Hoeve, Serdar Kadıoğlu

Fairness, Bias & Performance Evaluation

Jurity

Evaluation library for recommendation quality, including ranking, classification, and fairness metrics.

pip install jurity

Code API Docs ACM'24 ArXiv'24 LION'23 CIKM'22 ICMLA'21 Intel Blog Post

# Import binary and multi-class fairness metrics
from jurity.fairness import BinaryFairnessMetrics, MultiClassFairnessMetrics

# Data
binary_predictions = [1, 1, 0, 1, 0, 0]
multi_class_predictions = ["a", "b", "c", "b", "a", "a"]
multi_class_multi_label_predictions = [["a", "b"], ["b", "c"], ["b"], ["a", "b"], ["c", "a"], ["c"]]
memberships = [0, 0, 0, 1, 1, 1]
classes = ["a", "b", "c"]

# Metrics (see also other available metrics)
metric = BinaryFairnessMetrics.StatisticalParity()
multi_metric = MultiClassFairnessMetrics.StatisticalParity(classes)

# Scores
print("Metric:", metric.description)
print("Lower Bound: ", metric.lower_bound)
print("Upper Bound: ", metric.upper_bound)
print("Ideal Value: ", metric.ideal_value)
print("Binary Fairness score: ", metric.get_score(binary_predictions, memberships))
print("Multi-class Fairness scores: ", multi_metric.get_scores(multi_class_predictions, memberships))
print("Multi-class multi-label Fairness scores: ", multi_metric.get_scores(multi_class_multi_label_predictions, memberships))
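
For intuition, binary statistical parity compares the rate of positive predictions between the two membership groups. The pure-Python sketch below computes both rates for the data above. The helper `positive_rate` is hypothetical, and the sign convention of the difference (group 1 minus group 0) is an assumption that may not match the one Jurity reports.

```python
def positive_rate(predictions, memberships, group):
    """Share of positive predictions within one membership group (illustrative only)."""
    selected = [p for p, m in zip(predictions, memberships) if m == group]
    if not selected:
        raise ValueError(f"no samples for group {group}")
    return sum(selected) / len(selected)

binary_predictions = [1, 1, 0, 1, 0, 0]
memberships = [0, 0, 0, 1, 1, 1]

rate_0 = positive_rate(binary_predictions, memberships, group=0)
rate_1 = positive_rate(binary_predictions, memberships, group=1)

# Assumed sign convention: group 1 minus group 0
gap = rate_1 - rate_0
print(f"rate_0={rate_0:.3f} rate_1={rate_1:.3f} gap={gap:.3f}")
```

A gap of zero would mean both groups receive positive predictions at the same rate; here group 0 is favored by one third.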

@article{jurity,
  title={Surrogate Modeling to Address the Absence of Protected Membership Attributes in Fairness Evaluation},
  author={Kad{\i}o{\u{g}}lu, Serdar and Thielbar, Melinda},
  journal={ACM Transactions on Evolutionary Learning},
  volume={5},
  number={3},
  pages={1--25},
  year={2025},
  publisher={ACM New York, NY}
}

Filip Michalsky, Melinda Thielbar, Serdar Kadıoğlu

Explainable AI

BoolXAI

Explainable AI based on expressive Boolean formulas.

pip install boolxai

Code API Docs AAAI'25 AAAI Slides ArXiv'26 MAKE'23 arXiv'23

import numpy as np
from sklearn.metrics import balanced_accuracy_score

from boolxai import BoolXAI, Operator

# Create random toy data for binary classification. X and y must be binary!
rng = np.random.default_rng(seed=42)
X = rng.choice([0, 1], size=(100, 10))
y = rng.choice([0, 1], size=100)

# Rule classifier with maximum depth, complexity, possible operators
rule_classifier = BoolXAI.RuleClassifier(max_depth=3,
                                         max_complexity=6,
                                         operators=[Operator.And, Operator.Or, Operator.Choose, Operator.AtMost, Operator.AtLeast],
                                         random_state=42)

# Learn the best rule
rule_classifier.fit(X, y)

# Best rule and best score
best_rule = rule_classifier.best_rule_
best_score = rule_classifier.best_score_
print(f"{best_rule=} {best_score=:.2f}")

# The depth of a rule is the number of edges in the longest path from the root to any leaf/literal.
# The complexity of a rule is the total number of operators and literals.
print(f"depth={best_rule.depth()} complexity={best_rule.complexity()}")

# Predict and score
y_pred = rule_classifier.predict(X)
score = balanced_accuracy_score(y, y_pred)
print(f"{score=:.2f}")

# It is also possible to plot the best rule --requires installing plot dependencies
best_rule.plot()

# or get a networkx.DiGraph representation of the rule --requires installing plot dependencies
G = best_rule.to_graph()
print(G)
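
The depth and complexity definitions in the comments above can be checked on a hand-built rule. The sketch below models a rule as a small tree and assumes that a negated literal counts as one literal and that Choose means "exactly k". The classes `Literal` and `Op` are hypothetical stand-ins and are not BoolXAI's own rule classes.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Literal:
    index: int              # column of the binary input
    negated: bool = False

    def evaluate(self, x) -> bool:
        value = bool(x[self.index])
        return (not value) if self.negated else value

@dataclass
class Op:
    name: str               # "And", "Or", "AtLeast", "AtMost" or "Choose"
    children: List
    k: int = 0              # only used by AtLeast / AtMost / Choose

    def evaluate(self, x) -> bool:
        true_count = sum(child.evaluate(x) for child in self.children)
        if self.name == "And":
            return true_count == len(self.children)
        if self.name == "Or":
            return true_count >= 1
        if self.name == "AtLeast":
            return true_count >= self.k
        if self.name == "AtMost":
            return true_count <= self.k
        if self.name == "Choose":   # assumed to mean exactly k
            return true_count == self.k
        raise ValueError(f"unknown operator: {self.name}")

def depth(node) -> int:
    """Edges on the longest path from this node down to a literal."""
    if isinstance(node, Literal):
        return 0
    return 1 + max(depth(child) for child in node.children)

def complexity(node) -> int:
    """Total number of operators and literals in the rule."""
    if isinstance(node, Literal):
        return 1
    return 1 + sum(complexity(child) for child in node.children)

# And(x0, AtLeast2(x1, ~x2, x3))
rule = Op("And", [Literal(0),
                  Op("AtLeast", [Literal(1), Literal(2, negated=True), Literal(3)], k=2)])

print(depth(rule), complexity(rule), rule.evaluate([1, 1, 0, 0]))
```

This rule has two operators and four literals, so it would just fit the `max_complexity=6` budget used above, with a depth of 2.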

Fidelity Blog Post FCAT Blog Post Amazon Quantum Blog Post Amazon AWS Blog Post

@inproceedings{boolxai,
  title={BoolXAI: Explainable AI Using Expressive Boolean Formulas},
  author={Kad{\i}o{\u{g}}lu, Serdar and Zhu, Elton Yechao and Rosenberg, Gili and Brubaker, John Kyle and Schuetz, Martin JA and Salton, Grant and Zhu, Zhihuai and Katzgraber, Helmut G},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={39},
  number={28},
  pages={28900--28906},
  year={2025}
}

Serdar Kadıoğlu, Elton Zhu, Gili Rosenberg, Martin Schuetz

Mab2Rec:
Multi-Armed Bandit Recommenders

Overview

Open-Source AI at Scale:
Establishing an Enterprise AI Strategy

Modular Architecture

MABWiser

TextWiser

Selective

Seq2Pat

Jurity

BoolXAI

Industry-Strength Open-Source Software

Patents

Resources
