Yes, gzip is a good deal slower. That’s only in there for, well, legacy expectations. We found that other algorithms were roughly 10x as fast single-threaded, and parallelized compression makes it faster still if you have the cores to use (the default s2 is parallelized Snappy).
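If you want to see the gap on your own hardware, here’s a minimal sketch using Go’s stdlib gzip and the klauspost/compress s2 package (the input path is hypothetical; point it at any large file, e.g. a BSON dump):

```go
package main

import (
	"compress/gzip"
	"fmt"
	"io"
	"os"
	"time"

	"github.com/klauspost/compress/s2"
)

// countingWriter discards the data but records how many bytes were
// written, so we can report compressed size without touching disk.
type countingWriter struct{ n int64 }

func (c *countingWriter) Write(p []byte) (int, error) {
	c.n += int64(len(p))
	return len(p), nil
}

// timeCompress streams src through the given compressor and prints
// the elapsed time and compressed size.
func timeCompress(name, src string, wrap func(io.Writer) io.WriteCloser) {
	in, err := os.Open(src)
	if err != nil {
		panic(err)
	}
	defer in.Close()

	out := &countingWriter{}
	w := wrap(out)

	start := time.Now()
	if _, err := io.Copy(w, in); err != nil {
		panic(err)
	}
	if err := w.Close(); err != nil {
		panic(err)
	}
	fmt.Printf("%-5s %12v %12d bytes\n", name, time.Since(start), out.n)
}

func main() {
	src := "/path/to/sample.bson" // hypothetical input; use any large file

	timeCompress("gzip", src, func(w io.Writer) io.WriteCloser {
		return gzip.NewWriter(w)
	})
	// WriterConcurrency sets how many goroutines compress blocks in parallel.
	timeCompress("s2", src, func(w io.Writer) io.WriteCloser {
		return s2.NewWriter(w, s2.WriterConcurrency(4))
	})
}
```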
Regarding your original test: the compressed size in the backup storage with the default s2 should be approximately the same as the live mongod node’s data directory size, because WiredTiger also uses the snappy compression library. So if it’s 50GB on disk it should be ~50GB in backup storage too.
This also means that transferring 50GB was slow in my opinion too. Even at 100MB/s (say 1Gb Ethernet, which tops out around 125MB/s, is the tightest bottleneck in the network), 50GB works out to roughly 500 seconds, so it should be more like 10m, not 37m.
But now that I think about it, I have observed speeds well below 100MB/s when uploading to AWS S3 from my home. I don’t recall offhand what speeds are possible when doing it from an EC2 server inside AWS to S3.
Could it be that the network bandwidth to (or accepted by) AWS S3 from where you’re running the test is the bottleneck?
In the end, though, backing up 1TB of data (compressed) will take a long time unless you can sustain, say, 1GB/s, or at least 0.5GB/s.
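For a quick back-of-the-envelope check, here’s a tiny sketch (the sizes and rates are just the examples from this thread) converting backup size and sustained throughput into minutes:

```go
package main

import "fmt"

// transferMinutes converts a payload size in GB and a sustained
// throughput in MB/s into minutes (using 1GB = 1024MB).
func transferMinutes(sizeGB, mbPerSec float64) float64 {
	return sizeGB * 1024 / mbPerSec / 60
}

func main() {
	fmt.Printf("50GB @ 100MB/s: %4.1f min\n", transferMinutes(50, 100))    // ~8.5 min
	fmt.Printf("1TB  @ 0.5GB/s: %4.1f min\n", transferMinutes(1024, 512))  // ~34 min
	fmt.Printf("1TB  @ 1GB/s:   %4.1f min\n", transferMinutes(1024, 1024)) // ~17 min
}
```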