bark-with-voice-clone
A fork of Suno's Bark that adds voice cloning. Feed it a few seconds of audio and it generates new speech in that voice. Built this before the big labs made it easy.
19+ years building software that ships — from AI/ML systems and cloud infrastructure to the engineering teams that run them. Currently deep in LLM tooling, AWS Bedrock, and MCP server design.
Where I've worked, what I studied, and the certifications that stuck
💬 Available for conference talks, technical panels, and podcasts — get in touch
What I reach for when something needs to actually work
Things I built, some of which other people found useful
Python library for the Razer Hydra motion controller. I needed it for a VR project and the official SDK was painful, so I wrote a cleaner wrapper.
Converts Office and image formats to PDF. Wrote it to batch-process files without depending on LibreOffice or paid converters.
Fully private, entirely in-browser LLM chatbot. Runs open-source models (Llama 3, Mistral, Phi) directly in your browser using WebGPU — no server, no data ever leaves your machine.
A self-hosted library of 160+ curated shell commands for AWS, Azure, GCP, Kubernetes, Docker, and Terraform. Served from S3 and consumed by a minimal single-file Go CLI — your distributed cloud operations cheat sheet.
42+ repos — cloud tooling, AI experiments, VR projects, and things I built to solve specific problems. 4.3k+ stars, mostly from bark-with-voice-clone.
Long-form thinking on technology, economics, and structural change
Every LLM request pays a prefill cost to process your system prompt and documents, even when nothing changed. Here's what KV pairs are, why prompt structure determines whether caching works, and when the savings are actually worth chasing.
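The core idea in that teaser can be sketched in a few lines: inference servers can only reuse cached KV pairs for the longest exact shared prefix between requests, so putting volatile content (timestamps, request IDs) ahead of static content (system prompt, documents) defeats caching. A minimal illustration, where `shared_prefix_len` is a hypothetical helper and whitespace words stand in for real tokens:

```python
# Illustrative sketch: prefix caching only reuses KV pairs for the
# longest *exact* shared prefix, so prompt ordering determines reuse.
# (Whitespace-split words stand in for tokenizer output.)

def shared_prefix_len(a: list[str], b: list[str]) -> int:
    """Length of the common leading run of two token sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

SYSTEM = "You are a support bot. Policy doc: refunds within 30 days."

# Stable layout: static content first, per-request content last.
stable_1 = f"{SYSTEM} User question: where is my order?".split()
stable_2 = f"{SYSTEM} User question: can I get a refund?".split()

# Volatile layout: a per-request timestamp before the static content
# breaks the cached prefix on every call.
volatile_1 = f"[2025-01-01 09:00] {SYSTEM} where is my order?".split()
volatile_2 = f"[2025-01-01 09:05] {SYSTEM} can I get a refund?".split()

print(shared_prefix_len(stable_1, stable_2))     # long shared prefix: cache reuse
print(shared_prefix_len(volatile_1, volatile_2)) # prefix broken almost immediately
```

Same content, same model, but only the first layout lets consecutive requests skip re-prefilling the system prompt and documents.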
SLM, LLM, and Frontier aren't a size spectrum — they're three different tradeoff profiles. Here's how I pick between them, and why defaulting to the biggest model you have access to is usually the wrong call.
Context windows went from 4K to 1M tokens in two years. I've built production RAG systems and shipped long-context solutions. Here's how I decide between retrieval and long context, and why most teams are still overbuilding.
The retrieval-augmented generation patterns that survived contact with production traffic on AWS Bedrock — chunking strategies, embedding choices, reranking, and the failure modes nobody writes about.
How I gave Claude Code real-time access to CloudWatch metrics, logs, and alarms — in both TypeScript and Python — and what I learned about the design decisions that make observability tools actually useful for AI agents.
How to build a custom MCP server that connects Claude Code to any REST API — the protocol explained, a working TypeScript and Python implementation, and the patterns that make it production-ready.
Both frameworks try to make AI useful across long projects. BMAD structures the process with 12+ specialized agents and 34+ workflows. MASTERPLAN persists the state with two markdown files and zero dependencies. Here's when each wins — and how to use both.
Claude Code's CLAUDE.md tells the AI how to work. But nothing tells it what's already built, what's in progress, or what comes next — until now. MASTERPLAN.md is the persistent project brain that survives context compression, structured like an Agile sprint board for AI work.
An SRE's years of hard-won expertise — keyless OIDC deployment, IAM least-privilege, CDK stack design — can now be distilled into a 15-minute Claude Code prompt. What happens to specialized engineering value when superpowers become consumable AI skills, and who ends up writing the skills vs. running them?
A sector-by-sector breakdown of who wins and who gets erased as AI agents eliminate "human friction." Compute owners, stablecoin rails, and data center landlords thrive. SaaS seat-counters, card networks, and middle management face structural collapse — and the market hasn't priced any of it yet.
Custom skills in Claude Code aren't prompts — they're codified institutional knowledge that compounds. Skill chaining turns individual expertise into team infrastructure, and the gap between teams that do this and teams that don't is widening fast.
Open-weight AI running on consumer phones doesn't just democratize intelligence — it democratizes spam. With 4.6 billion mobile internet users and zero marginal cost per message, the signal-to-noise ratio of the internet is approaching collapse. Here's what that looks like, and how to protect yourself.
When Anthropic released a 200-line legal contract plugin, $285 billion in SaaS market cap evaporated in 48 hours. The crash wasn't about the plugin — it revealed that the per-seat pricing model was already broken.
I ran a 10-year economic simulation: AI compresses white-collar wages, capital concentrates into compute, and the curve looks worse than the Great Depression in three of four scenarios.
How agentic LLM workflows crossed a coherence threshold in late 2025 — reshaping how engineers build software, where the real leverage comes from, and why human verification is now the core bottleneck.