How to Fix 10 Neglected Data Engineering Tasks

Data engineering is often seen as the backbone of modern organisations, quietly enabling businesses to harness the potential of their data. Yet, as Veronika Durgin, VP of Data at Saks, highlighted in her thought-provoking discussion, the field is full of overlooked and neglected tasks. These tasks may not be glamorous or headline-grabbing, but they are critical for organisations aiming to thrive in a fast-evolving data landscape.

Drawing on more than two decades of experience, Veronika unpacked key lessons from across her career, offering actionable insights both for professionals managing data teams and for businesses relying on data systems. This article distils her advice into strategies that help organisations tackle challenges such as defining "done", adopting bridge solutions, balancing innovation with maintenance, and prioritising environmental sustainability.

Introduction: The Hidden Work in Data Engineering

While AI and other buzzworthy technologies advance rapidly, many teams focus solely on delivering the next big feature, neglecting the unglamorous yet essential work behind the scenes. That work is crucial for building resilient systems, reducing downtime, and enabling scalability. Veronika’s discussion offers a roadmap for closing these gaps and creating robust, future-proof systems that deliver value over the long term.

In this article, we explore the most important lessons Veronika shared, breaking them into actionable insights that professionals and organisations can adopt immediately.

The Bridge Solution: A Balanced Approach to Build vs Buy

One of Veronika’s most transformative ideas is the "bridge solution" - a middle ground in the classic build-versus-buy debate. In today’s world of fast-moving AI and technology, decisions need to factor in flexibility and scalability rather than locking teams into rigid long-term contracts or resource-heavy internal development.

What Is the Bridge Solution?

The bridge solution involves adopting a temporary, "good enough" tool or system to unlock immediate value while allowing teams the time and space to evaluate a more permanent solution. This approach acknowledges that:

  • AI and technology evolve rapidly, and the perfect solution today may become obsolete tomorrow.
  • Flexibility to swap components is critical to avoid being "tied down" to legacy systems.

Key Considerations for Bridge Solutions

  1. Evaluate Total Cost of Ownership (TCO): Look beyond surface-level costs. Include infrastructure, salaries, and the long-term maintenance burden.
  2. Focus on Business Value: Only build solutions internally if they directly create a competitive advantage.
  3. Design for Modularity: Ensure systems are built with replaceable components to accommodate future technology upgrades, as the sketch below illustrates.
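
To make the modularity point concrete, here is a minimal Python sketch of the pattern, not a tool from the talk: callers depend on a narrow interface, so the bridge solution adopted today can be swapped out later without a rewrite. `WarehouseClient`, `BridgeWarehouse`, and the query are illustrative names.

```python
from typing import Protocol


class WarehouseClient(Protocol):
    """The narrow interface the rest of the codebase depends on."""

    def query(self, sql: str) -> list[dict]: ...


class BridgeWarehouse:
    """The 'good enough' tool adopted today (stubbed for illustration)."""

    def query(self, sql: str) -> list[dict]:
        # A real implementation would call the interim vendor's SDK here.
        return [{"day": "2025-01-01", "revenue": 1200}]


def daily_revenue(client: WarehouseClient) -> list[dict]:
    # Business logic depends only on the protocol, so replacing the bridge
    # solution later is a one-line change where the client is constructed.
    return client.query("SELECT day, SUM(amount) AS revenue FROM orders GROUP BY day")


print(daily_revenue(BridgeWarehouse()))
```

The point is the seam, not the stub: when the permanent tool is eventually chosen, only the place where the client is constructed needs to change.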

As Veronika aptly put it, "Everything we do today is legacy tomorrow." The bridge solution enables teams to move fast without losing sight of the bigger picture.

Redefining "Done" in Data Engineering

The concept of "done" in technology projects often varies by stakeholder. For engineers, it might be when the code compiles. For business users, it might be when the data is usable and generates actionable insights. Veronika presented six criteria for defining "done" that ensure alignment and reduce miscommunication.

Six Pillars of "Done"

  1. Code Completion: The foundation of all projects, but far from sufficient on its own.
  2. Data Validation: Ensure data quality by checking for missing, incorrect, or unexpected values (a validation sketch follows this list).
  3. Downstream Impact Assessment: Analyse how changes affect downstream users or systems.
  4. Service Level Agreements (SLAs): Establish expectations for performance and availability.
  5. Monitoring and Alerting: Implement robust systems to flag potential issues proactively.
  6. Handover Processes: Properly transition completed work to relevant teams or stakeholders.
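
Pillar 2 lends itself to a short example. The Python sketch below checks rows for missing, incorrect, or unexpected values; the field names and rules are hypothetical, and a real pipeline would usually express them in a validation framework or in the warehouse itself.

```python
from datetime import date

# Hypothetical rules for an orders feed; the thresholds are illustrative only.
REQUIRED_FIELDS = {"order_id", "customer_id", "amount", "order_date"}


def validate_row(row: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the row passes."""
    problems = []
    missing = REQUIRED_FIELDS - {k for k, v in row.items() if v is not None}
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    amount = row.get("amount")
    if isinstance(amount, (int, float)) and amount < 0:
        problems.append("unexpected negative amount")
    if row.get("order_date") and row["order_date"] > date.today().isoformat():
        problems.append("order_date is in the future")
    return problems


rows = [
    {"order_id": 1, "customer_id": 7, "amount": 42.0, "order_date": "2025-01-03"},
    {"order_id": 2, "customer_id": None, "amount": -5.0, "order_date": "2099-01-01"},
]
for row in rows:
    for problem in validate_row(row):
        print(f"order {row['order_id']}: {problem}")
```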

By adopting this comprehensive definition, teams can minimise scope creep and misaligned expectations while improving accountability. As Veronika noted, "If nobody's using it, it's not done. It’s just work in progress."

Seasonality in Data Engineering: Planning for Peaks and Valleys

Seasonality can create unexpected challenges in data engineering, particularly in industries such as retail, healthcare, and finance. Veronika emphasised the importance of designing systems that can handle both predictable and unpredictable fluctuations in demand.

Best Practices for Managing Seasonality

  • Scale Up and Down: Leverage cloud infrastructure that allows for short-term scaling during peak periods without locking in excessive resources.
  • Prepare for Known Spikes: Test systems under peak load conditions before high-demand periods.
  • Minimise Deployments During Critical Times: Implement code freezes to avoid unnecessary risks during peak business periods (a freeze-window guard is sketched after this list).
  • Set Clear SLAs: Focus on resolving critical issues during peak times, rather than addressing minor bugs.
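
One way to enforce such a freeze is a simple CI-style gate. The sketch below hard-codes freeze windows for illustration; in practice they would live in configuration and reflect your own peak calendar.

```python
from datetime import date

# Placeholder freeze windows - an actual calendar would be configuration, not code.
FREEZE_WINDOWS = [
    (date(2025, 11, 24), date(2025, 12, 2)),  # e.g. Black Friday / Cyber Monday
    (date(2025, 12, 20), date(2026, 1, 2)),   # e.g. holiday peak
]


def deploys_allowed(today: date | None = None) -> bool:
    """Return False inside a freeze window, for use as a CI deployment gate."""
    today = today or date.today()
    return not any(start <= today <= end for start, end in FREEZE_WINDOWS)


if not deploys_allowed():
    raise SystemExit("Deployment blocked: peak-season code freeze in effect.")
print("No freeze window active; deployment may proceed.")
```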

Her analogy of Chinese restaurants streamlining operations for New Year’s Eve neatly illustrates how preparation and a focus on essentials can make or break a critical period.

Building Self-Recovering Systems

When pipelines fail, they disrupt operations and drain resources. Veronika’s advocacy for self-recovering pipelines stems from her belief in automating away as many manual processes as possible. This ensures smoother operations and reduces the burden on data engineering teams.

Five Key Components of Self-Recovering Pipelines

  1. High Watermarking: Only process new or changed data to avoid unnecessary rework (the sketch after this list combines items 1-3).
  2. Retry Logic: Implement intelligent retries to handle temporary failures.
  3. Error Queues: Reroute bad records to a holding area for asynchronous troubleshooting.
  4. Alert Optimisation: Reduce noise by setting up meaningful alerts that prioritise urgent issues.
  5. Data Validation: Proactively check for anomalies to prevent cascading errors.
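
The first three components can be combined in a few lines. The Python sketch below is illustrative rather than Veronika's implementation: the watermark and error queue are in-memory stand-ins for what would normally be a metadata table and a durable queue.

```python
import time

watermark = "2025-01-01T00:00:00"   # persisted in a metadata table in practice
error_queue: list[dict] = []        # a durable queue or holding table in practice


def fetch_since(ts: str) -> list[dict]:
    """Hypothetical source read that returns only rows newer than ts."""
    return [
        {"id": 1, "updated_at": "2025-01-02T09:00:00", "amount": 10.0},
        {"id": 2, "updated_at": "2025-01-02T09:05:00", "amount": None},  # bad row
    ]


def load_row(row: dict) -> None:
    """Hypothetical target write; rejects rows that fail validation."""
    if row["amount"] is None:
        raise ValueError("amount is required")


def run_incremental_load(max_retries: int = 3) -> None:
    global watermark
    # High watermarking: only new or changed data is processed.
    for row in fetch_since(watermark):
        for attempt in range(1, max_retries + 1):
            try:
                load_row(row)
                break
            except ValueError:
                # Permanent data problem: park the record for asynchronous
                # triage instead of failing the whole pipeline.
                error_queue.append(row)
                break
            except OSError:
                # Transient failure (network, I/O): back off and retry.
                time.sleep(2 ** attempt)
        else:
            # Retries exhausted: park the record rather than losing it.
            error_queue.append(row)
        watermark = max(watermark, row["updated_at"])


run_incremental_load()
print(f"watermark now {watermark}; {len(error_queue)} record(s) parked")
```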

Automation here is not just a convenience - it’s a necessity. As Veronika noted, "The best DBA is a lazy DBA", emphasising the value of systems that "just work."

Environmental Responsibility in Data Engineering

Data centres currently account for over 3% of global greenhouse gas emissions, and this figure is expected to grow. Veronika’s call to action highlighted the need for data engineers to consider the environmental impact of their decisions.

Steps to Reduce Environmental Impact

  • Optimise Code and Queries: More efficient workloads consume less power (see the example after this list).
  • Minimise Redundant Data: Store only what’s necessary to reduce storage costs and energy use.
  • Leverage Green Data Centres: Choose providers prioritising renewable energy and energy-efficient infrastructure.
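
As a small illustration of the first point, the sketch below contrasts a full scan with column pruning and a pushed-down filter when reading Parquet via pandas (pyarrow required); the `events.parquet` file and its columns are invented for the example. Reading less data means less I/O and compute, and therefore less energy.

```python
import pandas as pd

# Build a tiny example file so the sketch is self-contained.
pd.DataFrame(
    {
        "event_date": ["2025-01-01", "2025-01-02", "2025-01-02"],
        "revenue": [5.0, 7.5, 2.5],
        "payload": ["...", "...", "..."],  # stand-in for wide, unused columns
    }
).to_parquet("events.parquet")

# Wasteful: scan every column and row, then filter in memory.
df = pd.read_parquet("events.parquet")
naive = df.loc[df["event_date"] == "2025-01-02", "revenue"].sum()

# Leaner: read only the needed columns and push the filter down to the reader,
# cutting I/O and compute - and with them, energy use.
lean = pd.read_parquet(
    "events.parquet",
    columns=["event_date", "revenue"],
    filters=[("event_date", "==", "2025-01-02")],
)["revenue"].sum()

assert naive == lean == 10.0
```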

By taking ownership of their role in environmental sustainability, data engineers can make a meaningful difference while also reducing operational costs.

Key Takeaways

  • Adopt Bridge Solutions: When faced with "build versus buy" decisions, consider flexible, temporary solutions to unlock immediate value.
  • Redefine "Done": Align engineering and business teams with a robust, shared definition of project completion.
  • Plan for Seasonality: Design systems for scalability during predictable and unpredictable demand spikes.
  • Automate Recovery: Build self-recovering pipelines to minimise operational disruptions.
  • Focus on Environmental Impact: Reduce emissions by optimising workloads and choosing eco-friendly data centres.
  • Dedicate Time to Maintenance: Protect time for fixing technical debt and experimenting with new technologies.
  • Test with Production Data: Simulate real-world conditions to ensure systems can handle production-level challenges.
  • Empathise with Stakeholders: Build stronger business relationships by understanding their needs and speaking their language.

Conclusion

Data engineering is an ever-evolving field, requiring teams to balance technical depth with the need for agility, scalability, and sustainability. By addressing neglected tasks such as seasonality planning, bridge solutions, and self-recovery automation, organisations can not only enhance their operations but also future-proof their systems in a rapidly changing environment.

Veronika Durgin’s insights serve as a powerful reminder that even the "unglamorous" aspects of data engineering play a crucial role in driving business success. The lessons shared here are a call to action for professionals and teams to revisit their priorities and refocus on building systems that are robust, adaptable, and environmentally conscious.

Source: "The Most Neglected Tasks in Data Engineering with Veronika Durgin" - Open Data Science, YouTube, Aug 19, 2025 - https://www.youtube.com/watch?v=3nRSRBEn_j0
