TIMES OF TECH

Key Lessons from Veronika Durgin

In the world of data engineering, the most impactful work is often the least glamorous. At ODSC East, Veronika Durgin, VP of Data at Saks, struck a chord with her talk on the “10 Most Neglected Data Engineering Tasks.” Drawing from decades of experience in data architecture, engineering, and analytics, she emphasized the foundational practices that keep pipelines stable, teams agile, and businesses prepared for rapid technological change.

These principles aren’t just tactical checklists—they’re a mindset shift for data teams navigating fast-moving technology landscapes.

This is a summary of an episode of ODSC’s AiX Podcast. You can listen to the full episode on Spotify, Apple, or SoundCloud.

Rethinking Build vs. Buy: Adding the “Bridge” Strategy

The classic build vs. buy debate often gets framed as a binary choice. Durgin recommends adding a third option: the bridge solution.

AI’s rapid evolution makes long-term commitments risky. Locking into a three-year contract for an AI tool—or sinking months into a custom internal build—can leave teams stuck with outdated solutions within a year.

Instead, a bridge solution provides a short-term, value-delivering approach that can be swapped out as better tools emerge. “Today is tomorrow’s legacy,” Durgin reminds us. Designing systems with modularity in mind allows organizations to replace components quickly without destabilizing the whole platform.

“Every decision we make now—whether to build or buy—has to be made with the focus that it’s a short-term decision. We don’t want to lock ourselves into a situation where we’re left behind.” – Veronika Durgin
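One way to keep a bridge solution swappable is to hide the vendor (or the internal build) behind a thin interface the rest of the platform depends on. The sketch below is illustrative, not from the talk; the provider names and the toy embedding logic are assumptions.

```python
from typing import Protocol


class EmbeddingProvider(Protocol):
    """The only contract the rest of the platform knows about."""

    def embed(self, texts: list[str]) -> list[list[float]]: ...


class VendorAEmbedder:
    """Hypothetical bought tool, wrapped behind our interface."""

    def embed(self, texts: list[str]) -> list[list[float]]:
        # A real implementation would call the vendor SDK; stubbed here.
        return [[float(len(t))] for t in texts]


class InHouseEmbedder:
    """Tomorrow's replacement slots in without touching any caller."""

    def embed(self, texts: list[str]) -> list[list[float]]:
        return [[float(sum(map(ord, t)) % 97)] for t in texts]


def build_features(provider: EmbeddingProvider, texts: list[str]) -> list[list[float]]:
    # Callers depend on the Protocol, so swapping providers is a one-line change.
    return provider.embed(texts)
```

Because `build_features` only knows the `Protocol`, retiring the bridge when a better tool emerges is a configuration change, not a rewrite.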

Defining “Done” the Right Way

One of the most overlooked engineering disciplines is the definition of done. Too often, “done” is equated with code completion. Durgin stresses that a feature isn’t done until:

  • All acceptance criteria are met
  • Data is validated end-to-end
  • Downstream impacts are understood
  • SLAs are established
  • Monitoring and alerting are in place
  • The feature is accepted by the requester (human or system)

For data teams, this is especially critical—code that runs doesn’t always mean data that works. A robust definition of done builds accountability, reduces production surprises, and fosters better collaboration with business stakeholders.
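The criteria above can even be made executable as a release gate. This is a minimal sketch with assumed field names, not a prescribed tool:

```python
from dataclasses import dataclass


@dataclass
class DoneChecklist:
    """Illustrative 'definition of done' gate mirroring the criteria above."""

    acceptance_criteria_met: bool = False
    data_validated_end_to_end: bool = False
    downstream_impacts_reviewed: bool = False
    sla_established: bool = False
    monitoring_in_place: bool = False
    accepted_by_requester: bool = False

    def is_done(self) -> bool:
        # "Done" means every criterion, not just code completion.
        return all(vars(self).values())

    def missing(self) -> list[str]:
        # Surface exactly which criteria still block release.
        return [name for name, ok in vars(self).items() if not ok]
```

A feature with only `acceptance_criteria_met=True` reports the remaining items via `missing()` instead of silently shipping.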

Walking a Mile in the Business’s Shoes

Technical teams often underestimate the value of empathy in engineering. Durgin encourages data engineers to spend time understanding the vocabulary, priorities, and constraints of their business counterparts.

Borrowing from the book Digital Mindset, she notes that effective collaboration doesn’t require becoming a business expert, but it does mean learning “30% of the other side’s language.” By understanding enough of the business context, engineers can prioritize the right work, communicate trade-offs clearly, and avoid friction caused by misaligned expectations.

Planning for Seasonality

Seasonality isn’t just a retail concern—it affects industries from finance to agriculture. Durgin points out that data teams often assume a steady-state system load, only to be surprised when usage spikes during peak times.

Her advice:

  • Plan code freezes during critical business windows to minimize deployment risks.
  • Ensure scalability—especially in cloud environments—both up and down to avoid surprise costs.
  • Stress test systems well in advance.

She likens high-volume periods to a Chinese restaurant on New Year’s Eve: focus on volume, reduce variety, and ensure everything runs smoothly.
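In code, that advice can reduce to a shared peak-season calendar that drives both scaling and deploy freezes. The windows and warehouse sizes below are made-up placeholders; real calendars vary by business.

```python
from datetime import date

# Hypothetical peak windows as (month, first_day, last_day) tuples,
# e.g. Black Friday through the end of December for a retailer.
PEAK_WINDOWS = [(11, 20, 30), (12, 1, 31)]


def in_peak(today: date) -> bool:
    return any(m == today.month and lo <= today.day <= hi for m, lo, hi in PEAK_WINDOWS)


def warehouse_size(today: date) -> str:
    """Scale up ahead of peaks and, just as important, back down after."""
    return "XL" if in_peak(today) else "S"


def deploys_frozen(today: date) -> bool:
    """Code freeze during critical business windows to cut deployment risk."""
    return in_peak(today)
```

Keeping the calendar in one place means the freeze policy and the cost-saving scale-down can never drift apart.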

Building Self-Recovering Pipelines

Modern data stacks have shifted many teams from ETL (Extract, Transform, Load) to ELT (Extract, Load, Transform), which enables parallelism but also introduces complexity when things break.

Durgin emphasizes self-recovery mechanisms to reduce downtime and on-call burnout:

  • Retries for transient failures
  • High-watermark processing to handle only new data
  • Quarantining bad records instead of halting entire pipelines
  • Clear alerting to avoid noise fatigue
  • Data validation early and often

A well-designed pipeline should handle common failure modes automatically and only escalate critical issues.
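Several of those mechanisms fit in a few lines. The sketch below combines retries with backoff, quarantining, and a high-watermark filter; the exception class and field names are assumptions for illustration, not a specific framework.

```python
import time


class TransientError(Exception):
    """A failure worth retrying, e.g. a flaky network call."""


def process_batch(records, transform, max_retries=3, backoff=0.0):
    """Retry transient failures; quarantine records that still fail
    instead of halting the whole batch."""
    loaded, quarantined = [], []
    for rec in records:
        for attempt in range(max_retries):
            try:
                loaded.append(transform(rec))
                break
            except TransientError:
                time.sleep(backoff * 2**attempt)  # exponential backoff
            except Exception:
                quarantined.append(rec)  # permanently bad: park it, alert later
                break
        else:
            quarantined.append(rec)  # retries exhausted
    return loaded, quarantined


def since_watermark(rows, watermark, key="updated_at"):
    """High-watermark filter: process only rows newer than the last run."""
    return [r for r in rows if r[key] > watermark]
```

One corrupt record lands in the quarantine list for later triage while the rest of the batch loads, and reruns stay cheap because `since_watermark` skips already-processed rows.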

Testing with Production Data (But Not in Production)

For data engineering, lower environments rarely mirror production perfectly—especially when it comes to data volume and quality.

Durgin stresses the need to test code against production-scale datasets before deployment. Synthetic or sample data is valuable for early development, but production data reveals issues like skew, missing fields, and format mismatches that often don’t appear until go-live.

The key distinction: test with production data, not in the production environment.
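A lightweight way to apply this is to pull a deterministic sample of production data into a scratch environment and compare its profile against your synthetic fixtures. The helpers below are a sketch under those assumptions; column names are illustrative.

```python
import random


def sample_production(rows, fraction=0.1, seed=42):
    """Deterministic sample of production-scale data to copy into a
    scratch (non-production) environment for testing."""
    rng = random.Random(seed)
    return [r for r in rows if rng.random() < fraction]


def null_profile(rows, columns):
    """Null rate per column. Comparing a production sample's profile
    against a synthetic fixture's surfaces skew and missing fields
    that otherwise only appear at go-live."""
    n = max(len(rows), 1)
    return {c: sum(1 for r in rows if r.get(c) is None) / n for c in columns}
```

If the production sample shows a 40% null rate in a column your fixtures always populate, you have found a go-live surprise before go-live.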

The Date Problem: Store in UTC, Present Clearly

Date and time issues are a universal pain point. From time zone offsets to daylight saving time changes, mismanaged dates cause downstream confusion.

Durgin’s advice:

  • Store dates in UTC for consistency
  • Be explicit in naming conventions (e.g., “_UTC” or “New_York_Time”)
  • Present data in the business’s preferred time zone for clarity

Even small misalignments can create major misunderstandings—especially in global systems.
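Python’s standard library covers this pattern directly: store an aware UTC timestamp, convert only at presentation time. The variable names follow the explicit suffix convention above; the example timestamp is chosen to sit near a US daylight-saving transition, where ad-hoc offset math typically goes wrong.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Store in UTC, with the time zone explicit in the name.
order_created_utc = datetime(2025, 3, 9, 6, 30, tzinfo=timezone.utc)

# Convert to the business's preferred zone only for presentation.
# zoneinfo handles the daylight-saving offset for us.
order_created_new_york = order_created_utc.astimezone(ZoneInfo("America/New_York"))

print(order_created_new_york.isoformat())  # offset is -05:00 here, not a fixed constant
```

Both variables describe the same instant; only the display differs, which is exactly the property that keeps global systems consistent.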

Protecting the “Forgotten Bucket” of Work

Data teams juggle business projects, bug fixes, and unplanned emergencies. What often gets left behind is the ongoing maintenance and modernization work—the “forgotten bucket.”

This includes addressing technical debt, upgrading aging systems, and experimenting with new technologies. Durgin recommends dedicating a fixed percentage of team capacity—whether it’s a sprint every few months or a day each week—to this type of work.

Neglecting it leads to brittle systems that can’t support new features without breaking.

Engineering with Environmental Responsibility

Beyond code quality and delivery speed, Durgin calls attention to the environmental impact of data engineering.

Data centers already account for an estimated 3% of global greenhouse gas emissions, a number that will grow as AI workloads expand. Engineers can play a role by:

  • Optimizing compute usage
  • Scaling infrastructure down after peak loads
  • Advocating for efficient system design

Environmental responsibility isn’t just good citizenship—it also aligns with cost efficiency for businesses.

The Coming Wave of “Vibe” Coding

Low-code, no-code, and AI-assisted “vibe coding” will make it easier for non-engineers to create data workflows. While this democratizes access, it also risks creating poorly designed pipelines that data engineers will need to maintain.

Durgin’s take: the trend is inevitable. The best approach is to embrace it—while putting guardrails in place, such as pull requests, reviews, and robust testing frameworks.

“Everybody’s going to vibe code everything because it’s easy. Two years from now, experienced engineers will still have a job—fixing everything that was vibe coded.” – Veronika Durgin

Key Takeaway

The neglected tasks Durgin highlights aren’t flashy—but they are the foundation of sustainable, resilient data systems. From thoughtful decision-making (build, buy, or bridge) to operational discipline (definition of done, seasonality planning, self-recovery), her advice centers on building platforms that adapt to change without crumbling under pressure.

For organizations racing to deliver AI and analytics capabilities, these lessons are a reminder: long-term agility depends on doing the unglamorous work well.


