Subhajit

Git 101 – Commands and Workflows Cheat Sheet

24 September 2025·448 words·3 mins

A quick, task-oriented Git reference. Pair this with the in-depth guide for concepts and best practices.

TL;DR

- Working tree → git add → Staging area → git commit → Local repo → git push → Remote.
- Most common daily commands: status, add, commit, push, pull, checkout, merge.
- Undo staged changes: git restore --staged <file>.
- Undo last commit (keep changes): git reset HEAD~1.
- For concepts and workflows, read the full Git guide.

Minimal Mental Model

    graph LR
      WD[Working Dir] -- add --> ST[Staging]
      ST -- commit --> REPO[Local Repo]
      REPO -- push --> ORI[Origin]
      ORI -- fetch/pull --> REPO

Setup

    git --version
    git config --global user.name "Your Name"
    git config --global user.email "you@example.com"
    git config --global init.defaultBranch main

Create or Clone

    git init
    git clone <url>

Status and Diffs

    git status
    git diff            # unstaged
    git diff --staged   # staged vs HEAD

Stage and Commit

    git add <path>
    git add -p                      # interactive hunks
    git commit -m "feat: message"
    git commit --amend              # edit last commit

Branching

    git branch
    git switch -c feature/x
    git switch main
    git branch -d feature/x

Sync with Remote

    git remote -v
    git fetch
    git pull            # merge
    git pull --rebase   # rebase
    git push -u origin my-branch

Merge vs Rebase

    git switch my-branch && git merge main
    git switch my-branch && git rebase main

Resolve Conflicts

    git status
    # edit files, remove markers
    git add <file>
    git commit              # after merge
    git rebase --continue   # during rebase

Stash Work

    git stash push -m "wip"
    git stash list
    git stash pop

Undo Safely

    git restore --staged <file>   # unstage
    git restore <file>            # discard local edits
    git revert <sha>              # new commit to undo
    git reset --soft HEAD~1       # keep changes, drop last commit
    git reflog                    # find lost commits

Tags and Releases

    git tag -a v1.0.0 -m "msg"
    git push --tags

Ignore and Clean

    echo "node_modules/" >> .gitignore
    git clean -fdx   # dangerous: removes untracked and ignored files

Authentication (Quick)

    # HTTPS + PAT
    git clone https://github.com/owner/repo.git
    # SSH
    ssh-keygen -t ed25519 -C "you@example.com"
    ssh-add ~/.ssh/id_ed25519
    git clone git@github.com:owner/repo.git

Conventional Commits (Optional)

    feat(auth): add oauth login
    fix(api): handle null pointer in user service
    chore(ci): update node to 20

Common One-Liners

    # See last commit summary
    git log -1 --stat
    # Interactive rebase last 5 commits
    git rebase -i HEAD~5
    # Squash branch onto main
    git switch my-branch && git rebase -i main

Quick PR Flow (GitHub)

    git switch -c feat/x
    # edit, add, commit
    git push -u origin feat/x
    # open PR on GitHub

See also: the full guide “The Definitive Guide to Version Control with Git and GitHub”.

Git & GitHub: The Definitive Version Control Guide

24 September 2025·1559 words·8 mins

Software Engineering

Version control is the foundation of reliable software delivery. This guide teaches Git from first principles, then layers in practical GitHub workflows used by high-performing teams. You’ll learn the mental models, the everyday commands, and the advanced tools to collaborate confidently without fear of breaking anything.

Designing Secure and Scalable APIs — A Comprehensive Guide

24 September 2025·1487 words·7 mins

Software Engineering

APIs are the connective tissue of modern products. This guide distills proven practices for API design, security, observability, and reliability—covering the most frequent questions and edge cases teams face in production. Examples use FastAPI and Pydantic v2, but the principles generalize to any stack.

Beginners Guide to Building and Securing APIs

24 September 2025·771 words·4 mins

Software Engineering

New to APIs? This guide explains core concepts in clear language, then walks you through building a small FastAPI service with essential security and testing tips. When you’re ready for advanced patterns, read the companion: Designing Secure and Scalable APIs — A Comprehensive Guide.

Handle Missing Values in Pandas Without Losing Information

17 September 2025·1090 words·6 mins

Data Science

Missing values are inevitable in real-world datasets. This guide covers proven methods to handle missing data in pandas without compromising data integrity or analytical accuracy.

TL;DR

- Use df.isnull().sum() to audit missing values before doing anything.
- Drop rows/columns only when missingness is random and < 5% of data.
- Fill with mean/median for numerical columns with low missingness.
- Forward/backward fill for time series; interpolation for smooth numerical sequences.
- Never fill categoricals with mean; use mode or a dedicated “Unknown” category.

What Are Missing Values in Pandas

Missing values in pandas are represented as NaN (Not a Number), None, or NaT (Not a Time) for datetime objects. These occur due to:

NLP Entity Matching with Fuzzy Search

14 August 2025·1100 words·6 mins

LLM Engineering

Product catalogs rarely match 1:1. Supplier A calls it “Apple iPhone 13 Pro 256GB Space Grey” while your system has “iPhone 13 Pro - 256 - Gray”. String equality fails. This guide covers a three-stage approach combining lexical, surface, and semantic similarity to match entities at scale with minimal false positives.
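
As a rough illustration of why normalization plus a similarity score succeeds where string equality fails, here is a minimal sketch of a single surface-similarity stage using only Python's standard-library difflib. It is not the article's three-stage pipeline: the normalize rules (lowercasing, mapping "grey" to "gray", dropping a "GB" suffix, sorting tokens), the helper names, and the two extra catalog entries are all assumptions made up for this example.

```python
import re
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Hypothetical normalization: lowercase, unify spelling, drop 'gb', sort tokens."""
    name = name.lower().replace("grey", "gray")
    name = re.sub(r"(\d+)\s*gb\b", r"\1", name)   # "256gb" -> "256"
    tokens = re.findall(r"[a-z0-9]+", name)       # strip punctuation such as "-"
    return " ".join(sorted(tokens))               # make word order irrelevant


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] on the normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def best_match(query: str, catalog: list[str]) -> tuple[str, float]:
    """Return the catalog entry with the highest similarity to the query."""
    best = max(catalog, key=lambda item: similarity(query, item))
    return best, similarity(query, best)


# Example catalog; only the first entry comes from the text above.
catalog = [
    "iPhone 13 Pro - 256 - Gray",
    "iPhone 13 - 128 - Blue",
    "Galaxy S21 - 128 - Gray",
]
match, score = best_match("Apple iPhone 13 Pro 256GB Space Grey", catalog)
print(match, round(score, 2))
```

A real system would still need a score threshold and a review step for borderline pairs; this sketch only ranks candidates.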

Document Summarization: Eval First

14 August 2025·823 words·4 mins

LLM Engineering

Document summarization is a critical NLP task that helps users quickly grasp key information from long documents. But how do you know if your model is actually working? This guide shows a workflow that starts with evaluation and acceptance criteria before touching models — the approach that got a finance report summarizer from prototype to production in three weeks.

RAG with LangChain: Architecture, Code, and Metrics

2 August 2025·1260 words·6 mins

LLM Engineering

RAG is a design pattern, not a product. LangChain supports it out of the box. This guide shows a production-ready RAG setup in LangChain with architecture, retrieval choices, runnable code, evaluation metrics, and trade-offs from my client projects.

TL;DR

- Short answer: LangChain doesn’t “contain” RAG; it provides the building blocks to implement RAG cleanly. You wire up chunking, embeddings, vector store, and a retrieval-aware prompt chain.
- What you get below: Architecture diagram, runnable code (LangChain 0.2+), evaluation harness, parameter trade-offs, and when to avoid LangChain for leaner stacks.
- Related deep dives:
  - Foundations of RAG → RAG for Knowledge-Intensive Tasks.
  - Lightweight pipelines → LightRAG: Lean RAG with Benchmarks.

Who should read this

- You’re building an internal knowledge assistant, support bot, or compliance Q&A system.
- You need answers that cite real documents with predictable latency and cost.
- You want a minimal, maintainable RAG in LangChain with evaluation, not a toy demo.

The problem I solved in production

When I implemented an extractive summarizer for financial and compliance reports, two pain points surfaced:

LightRAG: Lean RAG with Benchmarks

30 July 2025·884 words·5 mins

LLM Engineering

LightRAG is a minimal RAG toolkit that strips away heavy abstractions. Here’s a complete build with code, performance numbers versus a LangChain baseline, and when LightRAG is the right choice.

TL;DR

- LightRAG is a minimal RAG stack: FAISS + embeddings + prompt composition, ~120 lines.
- ~20% faster p50 latency vs LangChain on small corpora (≤ 500 chunks) due to fewer abstractions.
- Best for: serverless/edge deployments, small teams, single-purpose Q&A.
- Use LangChain instead when you need agents, tracing, callbacks, or multi-step workflows.
- Don’t skip data quality: clean text, handle missing values, validate numeric tables before indexing.

Why LightRAG

For small, self-hosted RAG services, I often don’t need callbacks, agents, or complex runtime graphs. I need:

Difference between reshape() and flatten() in NumPy

25 July 2025·1442 words·7 mins

Data Science

NumPy’s reshape() and flatten() are both used for array manipulation, but they serve different purposes and have distinct behaviors. This guide explains when and how to use each method effectively.

TL;DR

- reshape() returns a view (no copy) when possible: memory-efficient, but changes affect the original.
- flatten() always returns a copy, which is safe to modify independently.
- Use ravel() instead of flatten() when you want a view (like reshape(-1)) to save memory.
- Use reshape(-1) to flatten without copying; use flatten() only when you need an independent 1D copy.

What is reshape() in NumPy

The reshape() method changes the shape of an array without changing its data. It returns a new view of the array with a different shape when possible; otherwise it returns a copy.
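
The view-versus-copy distinction is easy to check directly. The sketch below is an illustrative example rather than code from the article; the variable names are made up, and it relies on np.shares_memory to test whether two arrays share a buffer.

```python
import numpy as np

a = np.arange(6)              # contiguous 1D array: 0..5
view = a.reshape(2, 3)        # contiguous input, so reshape can return a view
copy = a.flatten()            # flatten always allocates a new array

view[0, 0] = 99               # writes through to `a`
copy[1] = -1                  # leaves `a` untouched

shares_view = np.shares_memory(a, view)
shares_copy = np.shares_memory(a, copy)
print(a, shares_view, shares_copy)

# "When possible" matters: a transposed array is not contiguous in C order,
# so flattening it with reshape(-1) has to copy the data.
t = np.arange(6).reshape(2, 3).T
flat_t = t.reshape(-1)
print(np.shares_memory(t, flat_t))
```

If downstream code must never mutate the source array, flatten() (or an explicit .copy()) is the safer choice, since reshape() gives no guarantee either way.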
