
Thoughts.

Writing on AI engineering, cloud systems, and the craft of building things that last.

Building "The Catch" - A Full-Stack AI Accountability Journal

Stepping outside my comfort zone to build a daily accountability journal with a built-in AI mentor using Flutter and Gemini.

ai flutter firebase full-stack gemini app-development

2026-03-15

1 min read


1 Ticket. 8,000 Users. 0 Race Conditions.

Simulating a massive traffic spike to test Redis distributed locks and prevent data corruption in a ticket booking engine.

system-design redis python devops engineering backend

2026-01-18

1 min read


Auto-Healing Infrastructure - AI Fixed My Database in 66 Seconds

Exploring AIOps with LogSentinel and Gemini 1.5 Flash to create a self-healing system that resolves production incidents while you sleep.

sre devops ai python aiops automation open-source

2026-01-18

3 min read


Silence is Not Reliability - The Scream Test

Giving the system a "voice" with Prometheus and Discord alerts to reduce Time To Detect (TTD).

sre devops prometheus alerting

2026-01-17

1 min read


Embracing the Chaos - Automated Recovery in 45 Seconds

True reliability isn't about preventing failure, but mastering automated recovery through Chaos Engineering.

sre chaos-engineering devops python resilience

2026-01-16

1 min read


From Black Box to Glass Box - Real-Time Observability Unleashed

Transforming a distributed system into a transparent powerhouse with Prometheus and Grafana.

sre devops observability infrastructure

2026-01-15

2 min read
