checklist
dsa
| category | topic | revision1 | revision2 | revision3 |
|---|---|---|---|---|
| basics | getting started | ✅ | | |
| | io | ✅ | | |
| | loops | ✅ | | |
| | conditionals | ✅ | | |
| | data types | ✅ | | |
| basic dsa | number system | ✅ | | |
| | arrays | | | |
| | strings | | | |
| | vectors (2d arrays) | | | |
| | searching | | | |
| | sorting | | | |
| recursion | recursion fundamentals | | | |
| | time complexity | | | |
| | space complexity | | | |
| | dynamic programming intro | | | |
| basic dsa 2 | linked list | | | |
| | stack | | | |
| | queue | | | |
| | trees - general tree (gt) | | | |
| | trees - binary tree (bt) | | | |
| | trees - binary search tree (bst) | | | |
| advanced dsa | hash map | | | |
| | heap | | | |
| | graph | | | |
| | advanced dp patterns | | | |
mini-projects
| category | topic | revision1 | revision2 | revision3 |
|---|---|---|---|---|
| data engineering projects | dataset cleaning & etl pipeline | ✅ | | |
| | api to db pipeline | ✅ | | |
| | data lake + dbt transformation | | | |
| | airflow dags | ✅ | | |
| | realtime payments + fraud detection (streaming) | | | |
| | cdc pipeline | | | |
| | feature store + realtime scoring | | | |
| | data mesh | ✅ | | |
| | crypto trading pipeline | ✅ | | |
core subjects
| category | topic | revision1 | revision2 | revision3 |
|---|---|---|---|---|
| core cse | sql | | | |
| | dbms | | | |
| | operating system (os) | | | |
| | computer networks | | | |
| | system design | | | |
| | security | | | |
| | distributed systems | | | |
| | concurrency & parallelism | | | |
projects on resume
| category | topic | revision1 | revision2 | revision3 |
|---|---|---|---|---|
| data engineering projects | bp ai data platform (elt, 1m+ events/day) | | | |
| | mg etl + cdc streaming | | | |
| | image to text ml pipeline (resnet + lstm) | | | |
| | duplicate question detector (roberta + lstm) | | | |
| big data & streaming | spark (on glue) | | | |
| | kafka | | | |
| | kinesis + firehose | | | |
| | cdc (dynamodb → s3) | | | |
| orchestration & modeling | airflow (dags, retries, reconciliation) | | | |
| | dbt (modeling in bp) | | | |
| | star-schema (redshift) | | | |
| cloud & storage | aws glue, lambda, s3, redshift, athena | | | |
| | s3 zoning (raw/processed/curated) | | | |
| devops & cicd | python-based ci/cd (bp) | | | |
| | git + jenkins (mg) | | | |
| visualization & reporting | tableau dashboards (kpis, kyc, revenue) | | | |
| fintech domains | real-time fraud detection (<150ms) | | | |
| | p&l, subscriptions, wallet monitoring | | | |
| | investor kpi tracking (seed round) | | | |
| leadership & impact | founding data engineer (from scratch) | | | |
| | sole data engineer (end-to-end ownership) | | | |
| | reduced tickets 65%, deploy time 40% | | | |
| academic & mentoring | scalable architectures research | | | |
| | taught 100+ students (python, sql, aws, dsa) | | | |