Category: Article
-
Neural Encoders: From Autoencoders to Modern Retrieval Architectures
Reading Time: 14 minutes
[diagram: autoencoder vs. dual-encoder architectures, with a triangle-inequality example on embedding distances — since A + B ≥ C, A = 0.1 and B = 0.1 imply C ≤ 0.2]…
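The triangle-inequality note in this excerpt can be checked numerically: for any proper distance metric (Euclidean distance here), the distance between two points can never exceed the sum of the distances through an intermediate point. A minimal sketch, with randomly generated points standing in for embeddings:

```python
import numpy as np

# Triangle inequality for Euclidean distances between three points:
# dist(x, z) <= dist(x, y) + dist(y, z).
rng = np.random.default_rng(0)
x, y, z = rng.normal(size=(3, 8))  # three random 8-dim "embeddings"

a = np.linalg.norm(x - y)  # A: dist(x, y)
b = np.linalg.norm(y - z)  # B: dist(y, z)
c = np.linalg.norm(x - z)  # C: dist(x, z)

assert c <= a + b + 1e-9   # so A = 0.1 and B = 0.1 force C <= 0.2
```

This is why, in a metric embedding space, two items that are both close to a shared anchor cannot be far from each other.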
-
Scaling AI Teams and Workload
Reading Time: 4 minutes
Scaling AI teams and workloads involves organizing efforts across various levels as companies grow. Initially, teams focus on complete products like search, recommendations, or chatbots, with potential sub-product divisions. As systems mature, scaling can occur at the component level (e.g., retrieval, ranking), feature level (e.g., knowledge graphs, trust signals), or by targeting specific user cohorts…
-
Inefficiencies in Markets and Evolution
Reading Time: 6 minutes
Q: Are there any ideas in common between the efficient market hypothesis and the Red Queen hypothesis? A: Let me explore the connections between the Efficient Market Hypothesis (EMH) and the Red Queen Hypothesis by analyzing their core principles. The Efficient Market Hypothesis, primarily from economic theory, suggests that financial markets are informationally efficient, meaning stock prices reflect…
-
LLM-as-a-Judge for AI Systems
Reading Time: 10 minutes
Introduction · Common Patterns of LLM-as-a-Judge Method · Basic Evaluating Judge Model · Improving Judge Performance · Scaling Judgments · Closing · References
-
Piecewise Linear Curves in PyTorch
Reading Time: 8 minutes
In this blog, I will train a simple piecewise linear curve on dummy data using PyTorch. But first, why a piecewise linear curve? PWL curves are sets of linear equations joined at common points. They let you approximate any non-linear curve, and their simplicity makes the predictions easy to explain. Moreover, they can be…
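The idea in this excerpt can be sketched in a few lines (a minimal illustration, not the post's actual code): fix the x-positions of the joints (knots) and learn only the y-value at each knot, so the curve between knots is linear interpolation.

```python
import torch

torch.manual_seed(0)

# Piecewise linear curve: fixed knot positions, learnable knot heights.
knots = torch.linspace(0.0, 1.0, 9)           # x-positions where segments join
heights = torch.zeros(9, requires_grad=True)  # learnable y-value at each knot

def pwl(x):
    # Linear interpolation between the two knots bracketing each x.
    idx = torch.clamp(torch.searchsorted(knots, x) - 1, 0, len(knots) - 2)
    t = (x - knots[idx]) / (knots[idx + 1] - knots[idx])
    return heights[idx] * (1 - t) + heights[idx + 1] * t

# Dummy data: a noisy non-linear target for the PWL curve to mimic.
x = torch.rand(256)
y = torch.sin(2 * torch.pi * x) + 0.05 * torch.randn(256)

opt = torch.optim.Adam([heights], lr=0.1)
for _ in range(500):
    opt.zero_grad()
    loss = torch.mean((pwl(x) - y) ** 2)
    loss.backward()
    opt.step()
```

After training, `heights` holds the fitted y-values, and each prediction is explainable as a weighted average of the two nearest knot heights.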
-
On Preference Optimization and DPO
Reading Time: 6 minutes
Introduction Training with preference data has allowed large language models (LLMs) to be optimized for specific qualities such as trust, safety, and harmlessness. Preference optimization is the process of using this data to improve LLMs. The method is particularly useful for tuning a model to emphasize certain qualities, or for training scenarios where relative feedback…
-
Keeping Up with RAGs: Recent Developments and Optimization Techniques
Reading Time: 10 minutes
[medium discussion] RAG Basics [diagram: RAG pipeline — indexing (documents → chunking → chunks → embedding model → embeddings → written to vector DB) and inference (query → embedding → nearest-neighbour scan of vector DB → prompt + retrieved passages → LLM → response)] Chunking Embedding Model Fine-tuning Embedding…
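The indexing and inference phases described in this excerpt can be sketched end to end. In this toy version, `embed()` is a random stand-in for a real embedding model (so the retrieved chunk is arbitrary here), and a brute-force nearest-neighbour scan stands in for the vector DB:

```python
import numpy as np

# Stand-in embedding model: deterministic random unit vector per text.
def embed(text, dim=64):
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

# Indexing phase: chunk documents, embed each chunk, store vectors.
chunks = ["Paris is the capital of France.",
          "The Eiffel Tower is in Paris.",
          "Python is a programming language."]
index = np.stack([embed(c) for c in chunks])

# Inference phase: embed the query, nn-scan the index, build the prompt.
query = "Where is the Eiffel Tower?"
scores = index @ embed(query)          # cosine similarity (unit vectors)
top = chunks[int(np.argmax(scores))]   # nearest-neighbour chunk

prompt = f"Answer using this context:\n{top}\n\nQuestion: {query}"
# prompt + retrieved passage would now go to the LLM for generation.
```

At scale, the `np.stack` index and the brute-force scan are replaced by a vector database with an approximate nearest-neighbour index.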
-
External Knowledge in LLMs
Reading Time: 10 minutes
[medium and substack discussion] LLMs are trained on a finite set of data. While they can answer a wide variety of questions across multiple domains, they often fail on questions that are highly domain-specific or outside their training context. Additionally, retraining LLMs from scratch for every piece of new information is not feasible, unlike traditional models with…