// INITIALIZING PIPELINES...

Turning Raw Data into _

I build the engines that power modern AI. Specializing in high-throughput ETL pipelines, Vector RAG systems, and Autonomous Agent workflows.

Core Competencies

Data Engineering

Designing robust ELT/ETL architectures using Airflow, DBT, and Spark. Ensuring data quality and lineage for downstream AI consumption.

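To make that concrete, here is a minimal sketch of a daily ELT job, assuming Airflow 2.4+ and its TaskFlow API; the extract and load tasks are hypothetical stubs standing in for real source and warehouse connections, with DBT handling transformations downstream.

from datetime import datetime

from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def daily_elt():
    @task
    def extract():
        # Pull raw records from a source system (stubbed placeholder data).
        return [{"order_id": 1, "amount": 42.0}]

    @task
    def load(rows):
        # Land the rows in a warehouse staging table (stubbed here);
        # DBT models take over transformation from this point on.
        print(f"Loading {len(rows)} rows into staging")

    load(extract())

daily_elt()
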
AI Integration

Building RAG (Retrieval-Augmented Generation) systems with Vector Databases. Fine-tuning LLMs for specific domain tasks.

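The retrieval step at the core of RAG fits in a few lines, sketched here without any framework: rank pre-embedded chunks by cosine similarity against the query embedding, then assemble a grounded prompt. The index and embeddings are assumed to already exist; in a real deployment the index would live in a vector database such as Pinecone, with embeddings produced by a model served through Ollama or similar.

import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, index, k=3):
    # index is a list of (chunk_text, embedding) pairs built at ingestion time.
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question, chunks):
    # Ground the LLM on the retrieved context only.
    context = "\n\n".join(chunks)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"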

Process Automation

Replacing manual spreadsheet workflows with Python scripts and autonomous agents. If you do it twice, I automate it.

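A typical before-and-after, sketched with hypothetical file and column names: the manual "filter, sum, paste into Excel" routine becomes one repeatable pandas function (assumes pandas and openpyxl are installed).

import pandas as pd

def build_daily_report(source_csv="raw_sales.csv", out_xlsx="daily_report.xlsx"):
    # Read the raw export that used to be opened and filtered by hand.
    df = pd.read_csv(source_csv, parse_dates=["date"])
    summary = (
        df.groupby("region", as_index=False)["amount"]
        .sum()
        .sort_values("amount", ascending=False)
    )
    # Write the aggregated report instead of copy-pasting it manually.
    summary.to_excel(out_xlsx, index=False)
    return summary

if __name__ == "__main__":
    print(build_daily_report())

Wire a script like this into Airflow or a plain cron job and the report ships itself.
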
The Engine Room

Tools and technologies I use to bend data.

>> print(stack.current_status)
Python
SQL
Ollama
Docker
K8s
Terraform
AWS
Azure
Airflow
Kafka
DBT
Snowflake
TensorFlow
Pandas
LangChain
Pinecone
Git
n8n

Engineering Logs

Deep dives into architecture, code, and system design.

Airflow · Automation

The Death of the Daily Spreadsheet

A step-by-step guide to automating Excel reporting using Python, Pandas, and AWS Lambda, saving 15 hours per week.

Sep 28, 2024 · 12 min read

Architecture · Data Mesh

Decentralizing Data Ownership

Why the monolithic warehouse is struggling and how a domain-oriented Data Mesh approach improves velocity.

Aug 15, 2024 · 15 min read

Ready to Automate?

Whether you need a custom RAG pipeline or a complete data infrastructure overhaul, let's build systems that scale.

© 2026 Michal Dyzma. Hosted on Cloudflare Pages.