math - Mudit Bachhawat

Neural Encoders: From Autoencoders to Modern Retrieval Architectures

Apr 19, 2025

Reading Time: 14 minutes

Autoencoders. Dual Encoder: [Figure labels: a, b, c, d; b·d > c·d; Item1, Q1, Q2; A, B, C; A + B ≥ C] If A = 0.1 and B = 0.1, then C ≤ 0.2. If A = 0.1 and B =…

Piecewise Linear Curves in PyTorch

Jul 21, 2024

Reading Time: 8 minutes

In this blog, I will train a simple piecewise linear curve on dummy data using PyTorch. But first, why a piecewise linear curve? PWL curves are a set of linear equations joined at common points. They allow you to approximate any nonlinear curve, and their simplicity helps you explain the predictions. Moreover, they can be…

On Preference Optimization and DPO

May 10, 2024

Reading Time: 6 minutes

Introduction: Training with preference data has allowed large language models (LLMs) to be optimized for specific qualities such as trust, safety, and harmlessness. Preference optimization is the process of using this data to improve LLMs. This method is particularly useful for tuning the model to emphasize certain features or for training scenarios where relative feedback…

Creating a Tiny Vector Database, Part 1

Apr 20, 2024

Reading Time: 8 minutes

[Medium discussion] Introduction: Vector databases allow you to search for approximate nearest neighbors from a large set of vectors. They provide an alternative to brute-force nearest neighbor searches at the cost of accuracy and additional memory. Previously, many advancements in NLP and deep learning were limited to small scales due to the lack of…

External Knowledge in LLMs

Mar 9, 2024

Reading Time: 10 minutes

[Medium and Substack discussion] LLMs are trained on a finite set of data. While they can answer a wide variety of questions across multiple domains, they often fail to answer questions that are highly domain-specific and outside their training context. Additionally, training LLMs from scratch for any new information is not possible like traditional models with…

Mathematical Expectation for Interviews

Dec 17, 2021

Reading Time: 5 minutes

In this post, I will solve for the expected value of a probabilistic model using as many methods as I can. You can encounter these types of problems in data science and quant interviews. Problem: Assume there is a 2×2 grid, as shown in the figure below. You can randomly walk to a neighboring block…

Tag: math