Ahan M R - Academic Portfolio

About Me

Hi! I am Ahan M R, an Applied Scientist at Amazon's Alexa International team, where I work at the intersection of cultural AI and multilingual systems. My work focuses on making AI systems more culturally aware and accessible across different languages and regions, with a particular emphasis on LLM evaluation frameworks and synthetic data generation for low-resource languages.

My research interests center on Large Language Models, particularly developing novel approaches for domain-specific fine-tuning, optimizing in-context learning, and building autonomous agentic systems. I'm also deeply involved in architecting scalable recommendation systems, with a focus on cold-start problems and personalization at scale. Recently, I've been exploring the cultural aspects of AI, developing frameworks for evaluating and enhancing the cultural awareness of language models.

Outside of my technical work, I'm an enthusiastic quizzer and regularly participate in tech and sports quiz competitions. Ultimate Frisbee has been a significant part of my life, offering the perfect balance to my technical pursuits. I'm also passionate about cricket and maintain a general vlogging channel where I share the fun things I've been up to and my experiences over the year. These diverse interests help me bring a well-rounded perspective to my technical work and life.

Professional Experience

Applied Scientist @ Amazon - Alexa International

May 2022 - Present

Building Evaluation for Cultural Framing of LLMs: Implemented an i18n framework for understanding different cultural pillars and for red-teaming LLMs on contextual and cultural relevancy; built a synthetic data generator for instruction-tuned ICL and fine-tuning; conducted reward-model-based evaluation of LLM output.
Optimization of Contextual Recommendation System for Marketing: Developed a two-tower neural network for streaming propensity prediction, specifically addressing cold-start domains in promotional and personalization contexts. Improved user engagement and content relevancy metrics across i18n markets.
Enhancement of Multilingual Translation and Localization: Developed a robust pipeline covering language-to-LLM mapping, prompt generation, translation execution, and quality assurance, optimizing the translation process. Engineered an internal DataGenerationService to enable high-quality translations of structured datasets across multiple languages, employing state-of-the-art LLMs such as Command R+ and Claude 3.5 Sonnet.
Synthetic Dataset Generation for i18n Languages: Used multilingual LLMs for ICL-based translation and dataset generation tasks across multiple languages, evaluated with metrics such as COMET and LLM-as-a-judge.

Data Scientist @ Lowe's - Search and Personalisation

May 2021 - May 2022

Building Personalization Solutions for Purchase Prediction: Implemented sequence- and session-aware recommendation systems for purchase prediction, framed as next-item prediction tasks, using GRU, XLNet, and Transformer models.
Improving Search Results with Query Reformulation: Implemented an OpenNMT and Transformer-based solution for reformulating low-performing queries using a neural query expansion approach.
Impact: Enhanced the search and personalization experience for customers visiting the Lowe's website.

Open-Source Contributor @ Deepchecks - Relevancy

Oct 2023 - Feb 2024

LLMOps solutions: Applied Large Language Model (LLM) engineering expertise to identify, mitigate, and prevent issues such as hallucinations, harmful content, performance degradation, and data pipeline disruptions.
Impact: Detected and resolved over 90% of identified issues before they impacted end-users, ensuring the LLMs operated at optimal efficiency.

Research Intern @ Microsoft Research - AI & IoT for Sustainability

Aug 2020 - Jun 2021

Thesis Advisor: Dr. Akshay Nambi

Computer vision-based solution for fault localisation in solar panels: Implemented an end-to-end pipeline for fault detection and classification using RetinaNet, EfficientNet, and Faster R-CNN.
Optimised cell segmentation: Designed a segmentation algorithm for optimised model training and inference.
Impact: Reduced manual workload, with a 45% reduction in time and increased efficiency in the deployed solution.
Outcome: Paper published at ACM SenSys '21 - AIChallengeIoT on successful completion of Undergraduate Thesis.

Research Intern @ American Express - Document Recognition & Processing

May 2019 - Jul 2019

AART: AI-Assisted Review Tool: Created a vision and text extraction solution that generates rich structured representations of marketing documents, along with an interactive GUI with dependency parsing.
Understanding error comments and creative comparison: Implemented a word-embedding-based topic modelling system for interpreting user feedback and an attention-based LSTM for sentence classification.

Summer Intern @ UST Global - Infinity Lab

May 2018 - Jul 2018

Automating in-store billing with U-Store: Created a vision- and text-based system that automates product billing by identifying in-store users with an in-house face detection algorithm and generating their bills, with the aim of improving retail conversion rates using Electronic Shelf Labels (ESL).
NLP Bug Tracking System using Sequential Models: Implemented a Doc2Vec model to contextually classify the bug type flagged by the user.

Research Interests

Large Language Models

Cultural awareness in AI, multilingual evaluation frameworks, and domain adaptation. Expertise in parameter-efficient fine-tuning (PEFT) techniques such as LoRA and QLoRA, in-context learning optimization, and building autonomous agentic systems.

Recommendation Systems

Contextual and personalized recommendation systems, and neural approaches to user-item modeling.

Search & Information Retrieval

Neural IR models, semantic search architectures, and query understanding for large-scale systems.

Publications

BritLex: Development and Evaluation of a Comprehensive British English Dataset

Ahan M.R, Shivam Mangale, Amani Namboori, Abhishek Singhania, Ameya Datar

Amazon Machine Learning Conference (AMLC '24)

Addressing Bias in Face Detectors using Decentralised Data collection with incentives

Ahan M.R, Robin Lehmann, Richard Blythman

Workshop on Decentralization and Trustworthy Machine Learning in Web3 at NeurIPS '22

AI-assisted Cell-Level Fault Detection and Localization in Solar PV Electroluminescence Images

Ahan M.R, Akshay Nambi, Tanuja Ganu, D Nahata, S Kalyanaraman

AIChallengeIoT at ACM SenSys '21

Towards Automatic Transformer-based Cloud Classification and Segmentation

Ahan M.R, Roshan Roy, Vaibhav S, Ashish Chittora

Tackling Climate Change with Machine Learning Workshop at NeurIPS '21

Social Network Analysis using Data Segmentation and Neural Networks

Ahan M.R, Honnesh R, Ayush Mungad

International Research Journal of Engineering and Technology (IRJET '18)

Achievements

Data Science Competitions

2nd Place - Numerai Hedge Fund Challenge (Sep 2024)
Developed "A Detailed Case Study on Crypto Multi-factor Risk Analysis" investigating cryptocurrency investment strategies through traditional equity market frameworks. Analyzed market capitalization of $1,676B using Fama-MacBeth regression and modified Fama-French models, revealing unique cryptocurrency market dynamics and systematic return predictability patterns.

1st Place - OCEAN Protocol VC Challenge (May 2024)
Led analysis of venture capital landscape examining founder demographics and funding dynamics. Revealed key insights including median acquisition price of $72.6M and average time to acquisition of 695 days. Identified significant success rate disparities (40.3% vs 27.4%) between male and female founders, highlighting systemic industry patterns.

2nd Place - GitHub Developer Dynamics Challenge (Jun 2024)
Analyzed how developer activity correlates with token prices, achieving 87% prediction accuracy through comprehensive modeling of repository patterns and commit frequencies across major crypto-AI projects.

3rd Place - OCEAN Protocol Google Trends Challenge (May 2024)
Conducted a comprehensive analysis of Google Trends' impact on cryptocurrency prices across market capitalizations. Found notable correlations, with the squared correlation (the share of price variability explained) in parentheses: 0.34 for Bitcoin at a 1-day lag (11.56%), 0.48 for Ethereum at a 7-day lag (23.04%), and 0.65 for Dogecoin at a 1-day lag (42.25%). The analysis covered 2016 to 2024, capturing major market events such as Bitcoin's 1200% surge in search interest, which correlated with a price movement from $1,000 to $20,000.

Awards & Recognition

Most Innovative Project - Amazon Devices Demo Crawl '23 (Oct 2023)
Best Paper Award - ACM SenSys '21 (Nov 2021)
AIChallengeIoT Workshop
Selected for Google Research AI Summer School (Oct 2020)
Natural Language Understanding Track

Other Distinctions

National Winner - Smart India Hackathon 2019 (May 2019)
Project deployed at IRCTC
Merit-cum-Need Scholarship (Aug 2019 - May 2021)
40% scholarship from BITS Pilani based on academic performance
Top-Rated Freelancer on UpWork (Jan 2020 - Jun 2021)
Data Science Researcher with projects worth $4k
Intel Software Innovator (May 2018)
Part of Intel Ambassador Program
Rank 12 - Flipkart GRID ML Challenge (May 2020)
Document Invoice Processing Challenge

Contact

Email: ahanmr98@gmail.com