I'm importing a terabyte of data into an AWS Aurora MySQL table from an EC2 instance. Because our service will be down while migrating prod, I care a lot about the import speed.
Currently I can't break 1.0 Gb/s import speed, measured using `iftop`. The speed is suspiciously not 1.1 Gb/s or even 0.99 Gb/s; it sits very, very close to exactly 1.0 Gb/s, which makes me think I'm hitting some sort of artificial bandwidth cap. Any suggestions as to what the bottleneck might be?
- I'm loading the data in 150 MB TSV chunks using `LOAD DATA LOCAL INFILE "chunk1.tsv"` statements executed with 4x-16x parallelism from my EC2 instance.
- My EC2 instance is currently an `m5zn.6xlarge` ("50 Gbps"), but I started experiments on a `c5.4xlarge`. They both hit the same bandwidth limit.
- The RDS instance is a `db.r5.4xlarge` ("Up to 10 Gbps").
- Running the same job on my local laptop against a local MySQL exceeds 2.2 Gb/s, and because the chunks are quite large (I've also tried 500 MB chunks), I don't think it's latency. My laptop shouldn't be this much faster...