bark-with-voice-clone
A fork of Suno's Bark that adds voice cloning. Feed it a few seconds of audio and it generates new speech in that voice. Built this before the big labs made it easy.
19+ years building software that ships — from AI/ML systems and cloud infrastructure to the engineering teams that run them. Currently deep in LLM tooling, AWS Bedrock, and MCP server design.
Where I've worked, what I studied, and the certifications that stuck
💬 Available for conference talks, technical panels, and podcasts — get in touch
What I reach for when something needs to actually work
Things I built, some of which other people found useful
Python library for the Razer Hydra motion controller. I needed it for a VR project and the official SDK was painful, so I wrote a cleaner wrapper.
Converts Office and image formats to PDF. Wrote it to batch-process files without depending on LibreOffice or paid converters.
Fully private, entirely in-browser LLM chatbot. Runs open-source models (Llama 3, Mistral, Phi) directly in your browser using WebGPU — no server, no data ever leaves your machine.
A self-hosted library of 160+ curated shell commands for AWS, Azure, GCP, Kubernetes, Docker, and Terraform. Served from S3 and consumed by a minimal single-file Go CLI — your distributed cloud operations cheat sheet.
42+ repos — cloud tooling, AI experiments, VR projects, and things I built to solve specific problems. 4.3k+ stars, mostly from bark-with-voice-clone.
Long-form thinking on technology, economics, and structural change
Every LLM request pays a prefill cost to process your system prompt and documents, even when nothing changed. Here's what KV pairs are, why prompt structure determines whether caching works, and when the savings are actually worth chasing.
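The core idea in that teaser can be sketched in a few lines: inference servers can only reuse cached KV pairs for the longest exact shared prefix between requests, so putting volatile content (timestamps, request IDs) ahead of static content (system prompt, documents) defeats caching. A minimal illustration, where `shared_prefix_len` is a hypothetical helper and whitespace words stand in for real tokens:

```python
# Illustrative sketch: prefix caching only reuses KV pairs for the
# longest *exact* shared prefix, so prompt ordering determines reuse.
# (Whitespace-split words stand in for tokenizer output.)

def shared_prefix_len(a: list[str], b: list[str]) -> int:
    """Length of the common leading run of two token sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

SYSTEM = "You are a support bot. Policy doc: refunds within 30 days."

# Stable layout: static content first, per-request content last.
stable_1 = f"{SYSTEM} User question: where is my order?".split()
stable_2 = f"{SYSTEM} User question: can I get a refund?".split()

# Volatile layout: a per-request timestamp before the static content
# breaks the cached prefix on every call.
volatile_1 = f"[2025-01-01 09:00] {SYSTEM} where is my order?".split()
volatile_2 = f"[2025-01-01 09:05] {SYSTEM} can I get a refund?".split()

print(shared_prefix_len(stable_1, stable_2))     # long shared prefix: cache reuse
print(shared_prefix_len(volatile_1, volatile_2)) # prefix broken almost immediately
```

Same content, same model, but only the first layout lets consecutive requests skip re-prefilling the system prompt and documents.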
SLM, LLM, and Frontier aren't a size spectrum — they're three different tradeoff profiles. Here's how I pick between them, and why defaulting to the biggest model you have access to is usually the wrong call.
Context windows went from 4K to 1M tokens in two years. I've built production RAG systems and shipped long-context solutions. Here's how I decide between retrieval and long context, and why most teams are still overbuilding.
The retrieval-augmented generation patterns that survived contact with production traffic on AWS Bedrock — chunking strategies, embedding choices, reranking, and the failure modes nobody writes about.
How I gave Claude Code real-time access to CloudWatch metrics, logs, and alarms — in both TypeScript and Python — and what I learned about the design decisions that make observability tools actually useful for AI agents.
How to build a custom MCP server that connects Claude Code to any REST API — the protocol explained, a working TypeScript and Python implementation, and the patterns that make it production-ready.
Both frameworks try to make AI useful across long projects. BMAD structures the process with 12+ specialized agents and 34+ workflows. MASTERPLAN persists the state with two markdown files and zero dependencies. Here's when each wins — and how to use both.
Claude Code's CLAUDE.md tells the AI how to work. But nothing tells it what's already built, what's in progress, or what comes next — until now. MASTERPLAN.md is the persistent project brain that survives context compression, structured like an Agile sprint board for AI work.
An SRE's years of hard-won expertise — keyless OIDC deployment, IAM least-privilege, CDK stack design — can now be distilled into a 15-minute Claude Code prompt. What happens to specialized engineering value when superpowers become consumable AI skills, and who ends up writing the skills vs. running them?
A sector-by-sector breakdown of who wins and who gets erased as AI agents eliminate "human friction." Compute owners, stablecoin rails, and data center landlords thrive. SaaS seat-counters, card networks, and middle management face structural collapse — and the market hasn't priced any of it yet.
Custom skills in Claude Code aren't prompts — they're codified institutional knowledge that compounds. Skill chaining turns individual expertise into team infrastructure, and the gap between teams that do this and teams that don't is widening fast.
Open-weight AI running on consumer phones doesn't just democratize intelligence — it democratizes spam. With 4.6 billion mobile internet users and zero marginal cost per message, the signal-to-noise ratio of the internet is approaching collapse. Here's what that looks like, and how to protect yourself.
When Anthropic released a 200-line legal contract plugin, $285 billion in SaaS market cap evaporated in 48 hours. The crash wasn't about the plugin — it revealed that the per-seat pricing model was already broken.
I ran a 10-year economic simulation: AI compresses white-collar wages, capital concentrates into compute, and the curve looks worse than the Great Depression in three of four scenarios.
How agentic LLM workflows crossed a coherence threshold in late 2025 — reshaping how engineers build software, where the real leverage comes from, and why human verification is now the core bottleneck.