#data-pipeline

How Agoda Unified Multiple Data Pipelines Into a Single Source of Truth

A centralized Apache Spark-based financial pipeline (FINUDP) establishes a single source of truth and uses a multi-layered quality framework to ensure accurate, consistent financial metrics.

DevOps

from HackerNoon

11 months ago

Partitioning Large Messages and Normalizing Workloads Can Boost Your AWS CloudWatch Ingestion | HackerNoon

Architecture choices significantly impact performance in data ingestion systems.

Scala

from medium.com

11 months ago

How I Made My Apache Spark Jobs Schema-Agnostic ( Part-2 )

Dynamic transformations enable flexible schema adaptations without code changes.

Using schema metadata simplifies column management, renaming, and casting.

Scala

from medium.com

1 year ago

Spark Scala Exercise 24: Error Handling and Logging in Spark: Build Safe, Auditable ETL Pipelines

Build a defensive Spark ETL pipeline to ensure robust data processing.

Handle data issues like schema mismatches and corrupt records effectively.

Implement custom logging and audit trails for better failure management.
