#data-partitioning

[ follow ]
fromMedium
2 months ago

Apache Spark: Fix data skew issue using salting technique (practical example)

Data skew in Apache Spark is a performance issue where a few keys dominate the data distribution, leading to uneven partitions and slow queries, especially during operations that require shuffling.
Data science
Java
fromHackernoon
5 months ago

This Code Change Sped Up MySQL Queries from 25 Minutes to 23 Seconds | HackerNoon

Dynamic partitioning in JDBC batch processing requires adjustments to include where_condition for improved performance.
[ Load more ]