AI / LLM Systems

Production GenAI & RAG Pipelines

Building retrieval-augmented generation workflows, optimizing LLM pipelines for a client deployment, and fine-tuning generative models — across PwC and Digiiq.

Where: PwC · Digiiq
Timeline: Dec 2024 – Jul 2025
Stack: LangChain · LLaMA · Mistral
Impact: +40% pipeline performance

Context

At PwC, internal teams needed AI tooling that could answer questions over large document sets reliably enough for client-facing work. Separately, at Digiiq, a digital agency needed to scale creative production without scaling headcount.

My role

At PwC, I worked as a Data Analyst & Cloud Intern, building RAG infrastructure and leading the LLM optimization workstream on the client-facing Nokia Project. At Digiiq, I owned the generative AI implementation end to end: model selection, fine-tuning, and dataset construction.

Approach

At PwC, I built modular RAG workflows with LangChain and OpenAI APIs, designed so that the same retrieval and orchestration components could be redeployed across tools rather than rebuilt for each one. The workflows shipped in 4 internal tools. On the Nokia Project, I benchmarked and tuned open-weight models (LLaMA, Mistral) against the existing pipeline, optimizing for both latency and response accuracy in production tests.
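To make the "redeploy rather than rebuild" idea concrete, here is a minimal sketch of a pipeline whose retriever and model call are swappable parts. It is not the PwC code and deliberately avoids LangChain so it runs with the standard library alone; Retriever, KeywordRetriever, RagPipeline, and stub_model are all hypothetical names chosen for illustration, and the keyword-overlap ranking stands in for a real vector search.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class Retriever(Protocol):
    """Anything that can return the top-k passages for a query."""

    def retrieve(self, query: str, k: int) -> list[str]: ...


class KeywordRetriever:
    """Toy retriever that ranks documents by word overlap with the query."""

    def __init__(self, documents: list[str]) -> None:
        self.documents = documents

    def retrieve(self, query: str, k: int) -> list[str]:
        query_words = set(query.lower().split())
        scored = [
            (len(query_words & set(doc.lower().split())), doc)
            for doc in self.documents
        ]
        # Drop documents with no overlap, then sort best match first.
        matches = [pair for pair in scored if pair[0] > 0]
        matches.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in matches[:k]]


@dataclass
class RagPipeline:
    """Orchestration layer: depends only on the two interfaces below."""

    retriever: Retriever
    generate: Callable[[str], str]  # any model call: prompt in, text out
    k: int = 2

    def build_prompt(self, question: str) -> str:
        passages = self.retriever.retrieve(question, self.k)
        context = "\n".join(f"- {p}" for p in passages)
        return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

    def answer(self, question: str) -> str:
        return self.generate(self.build_prompt(question))


def stub_model(prompt: str) -> str:
    """Placeholder for a real LLM call; reports the prompt size only."""
    return f"stub answer ({len(prompt)} prompt characters)"
```

Because RagPipeline only sees a retrieve method and a generate callable, a second tool could reuse it with a different document store or a different model without touching the orchestration code. That is the property the modular design was aiming for.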

At Digiiq, I fine-tuned Stable Diffusion for brand-consistent image generation and built a 40K+ entry prompt dataset to fine-tune LLaMA. The dataset work mattered more than the model work: it improved content originality while minimizing plagiarism.
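The write-up does not say how originality was enforced in the dataset, so the following is only one plausible building block, not the method used at Digiiq: filtering near-duplicate prompts by Jaccard similarity over word shingles. The function names, the shingle size of 3, and the 0.6 threshold are all assumptions made for this sketch.

```python
def shingles(text: str, n: int = 3) -> set[tuple[str, ...]]:
    """Return the set of n-word windows in the text, case-insensitive."""
    words = text.lower().split()
    if len(words) < n:
        # Short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def drop_near_duplicates(prompts: list[str], threshold: float = 0.6) -> list[str]:
    """Keep a prompt only if it is below the threshold against every kept prompt."""
    kept: list[str] = []
    kept_shingles: list[set] = []
    for prompt in prompts:
        current = shingles(prompt)
        if all(jaccard(current, other) < threshold for other in kept_shingles):
            kept.append(prompt)
            kept_shingles.append(current)
    return kept
```

This pairwise comparison is quadratic in the number of prompts, so at tens of thousands of entries an approximate method such as MinHash with locality-sensitive hashing would be the more usual choice.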

Outcome

PwC: a 30% improvement in average response efficiency across the 4 internal tools. On the Nokia Project, pipeline performance improved by 40%, with 20% lower latency and higher accuracy in production tests.
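For readers who want to see how figures like these are typically derived, here is a minimal benchmarking sketch. It assumes accuracy means "the expected string appears in the answer" and that a percentage change is measured relative to the baseline pipeline. Neither definition comes from the project itself, and benchmark and relative_change are hypothetical helper names.

```python
import statistics
import time
from typing import Callable


def benchmark(model: Callable[[str], str], cases: list[tuple[str, str]]) -> dict[str, float]:
    """Run each (question, expected) case and record latency and accuracy."""
    latencies: list[float] = []
    correct = 0
    for question, expected in cases:
        start = time.perf_counter()
        answer = model(question)
        latencies.append(time.perf_counter() - start)
        # Assumed accuracy rule: expected text appears in the answer.
        correct += int(expected.lower() in answer.lower())
    return {
        "mean_latency_s": statistics.mean(latencies),
        "accuracy": correct / len(cases),
    }


def relative_change(baseline: float, candidate: float) -> float:
    """Percentage change of candidate versus baseline (negative means lower)."""
    return (candidate - baseline) / baseline * 100.0
```

Under this definition, a mean latency that drops from 2.0 s to 1.6 s gives relative_change(2.0, 1.6) of about -20, which is what "20% lower latency" would mean. Running the same case list against the existing pipeline and each candidate model keeps the comparison like for like.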

Digiiq: the generative pipeline automated roughly 80% of the creative production workload, and the fine-tuned models shipped in a live product (pulsar.digiiq.ai).

What I learned

Production LLM work is 20% model choice and 80% everything around it — retrieval quality, dataset construction, evaluation, and latency budgets. The teams that win treat prompts and datasets as engineering artifacts with the same rigor as code.

Live product: pulsar.digiiq.ai