May 12, 2020
I’m not too familiar with how Impala reads from S3 (I’ve only used it on-premises), but my guess is that this article does not apply to the S3 case. Unlike Impala over HDFS, there is no data-locality optimization when reading from S3, so hotspotting is probably not an issue there.