Amazon Redshift 對壓縮率的改善:「Data Compression Improvements in Amazon Redshift Bring Compression Ratios Up to 4x」。
首先是引入了 Zstandard:
First, we added support for the Zstandard compression algorithm, which offers a good balance between a high compression ratio and speed in build 1.0.1172. When applied to raw data in the standard TPC-DS, 3 TB benchmark, Zstandard achieves 65% reduction in disk space. Zstandard is broadly applicable.
然後是自動選擇壓縮,對於之前沒有設定壓縮參數的人,會直接有改善:
Second, we’ve improved the automation of compression on tables created by the CREATE TABLE AS, CREATE TABLE or ALTER TABLE ADD COLUMN commands. Starting with Build 1.0.1161, Amazon Redshift automatically chooses a default compression for the columns created by those commands. Automated compression happens when we estimate that we can reduce disk space without degrading query performance. Our customers have seen up to 40% reduction in disk space.
再來是改善資料結構:
Third, we’ve been optimizing our internal on-disk data structures. Our preview customers averaged a 7% reduction in disk space usage with this improvement. This feature is delivered starting with Build 1.0.1271.
最後是提供更好的分析判斷:
Finally, we have enhanced the ANALYZE COMPRESSION command to estimate disk space reduction.
不過其他幾個產品線的使用方式更成熟 (像是 Amazon Athena 這類產品),不知道會不會讓 Amazon Redshift 慢慢退出第一線...