Hi All,
I set up Percona Distribution for MongoDB as a 3-shard cluster (each shard a two-node replica set, plus a combined config server / router). Below is the config:
shard 1 Primary server
shard 1 Secondary server
shard 2 Primary server
shard 2 Secondary server
shard 3 Primary server
shard 3 Secondary server
Config+ Router server
Each of the servers has 800GB of NVMe SSD storage. I have disabled compression and set the WiredTiger cache to 50GB.
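For reference, a minimal sketch of the relevant shard mongod options (the dbpath and replica set name are placeholders, not my exact command line; the two WiredTiger flags are the settings I mean above):

# Sketch only: cache size and compression settings for one shard member
mongod --shardsvr --replSet Shard1ReplSet --port 27018 --dbpath /data/shard1 \
  --wiredTigerCacheSizeGB 50 \
  --wiredTigerCollectionBlockCompressor none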
To test the performance of the cluster, I ran the YCSB benchmark from a single client. Before the load phase of workload A (50% read, 50% update) I ran a pre-split script, then loaded 512 million records of about 2KB each, giving a ycsb database of roughly 1.1TB. A sketch of both steps is below.
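Roughly what the pre-split and load looked like. The router hostname and split points are illustrative, not my exact script, and I assume fieldlength=200 here since YCSB's default 10 fields of 100 bytes would only give ~1KB records:

# Illustrative pre-split: shard ycsb.usertable on _id (unique) and create
# coarse split points across the "user..." keyspace before loading
mongo mongodb://router:27017/admin <<'EOF'
sh.enableSharding("ycsb")
sh.shardCollection("ycsb.usertable", { _id: 1 }, true)
for (var i = 1; i <= 9; i++) {
  db.adminCommand({ split: "ycsb.usertable", middle: { _id: "user" + i } })
}
EOF

# Load phase from the single YCSB client
./bin/ycsb load mongodb -s -P workloads/workloada \
  -p mongodb.url="mongodb://router:27017/ycsb" \
  -p recordcount=512000000 \
  -p fieldcount=10 -p fieldlength=200 \
  -threads 128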
I ran workload A with operation counts of 8 and 16 million at 128 threads, but I cannot get the total throughput above about 46K operations/second.
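The run phase, again with a placeholder router hostname:

# Run phase (I tried operationcount of 8M and 16M)
./bin/ycsb run mongodb -s -P workloads/workloada \
  -p mongodb.url="mongodb://router:27017/ycsb" \
  -p recordcount=512000000 -p operationcount=16000000 \
  -threads 128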
To test out my theory, I created a single-node (unsharded) database and repeated the workload A test with 250 million records, totaling 543GB. There I am able to get about 94K operations/second.
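That comparison run is the same client invocation pointed at the standalone (hostname again a placeholder):

./bin/ycsb run mongodb -s -P workloads/workloada \
  -p mongodb.url="mongodb://standalone-host:27017/ycsb" \
  -p recordcount=250000000 -p operationcount=16000000 \
  -threads 128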
Below is the output of sh.status(), db.usertable.getShardDistribution(), and show dbs for the sharded config. The data looks balanced to me. Can I please get some guidance on where and what to look for to debug this bottleneck?
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("600e0cd99c2bf168b1682c42")
  }
  shards:
    { "_id" : "Shard1ReplSet", "host" : "Shard1ReplSet/gtm-server27:27018,gtm-server28:27018", "state" : 1 }
    { "_id" : "Shard2ReplSet", "host" : "Shard2ReplSet/gtm-server29:27018,gtm-server30:27018", "state" : 1 }
    { "_id" : "Shard3ReplSet", "host" : "Shard3ReplSet/gtm-server31:27018,gtm-server32:27018", "state" : 1 }
  active mongoses:
    "4.4.2-4" : 1
  autosplit:
    Currently enabled: yes
  balancer:
    Currently enabled: yes
    Currently running: no
    Failed balancer rounds in last 5 attempts: 5
    Last reported error: Could not find host matching read preference { mode: "primary" } for set Shard2ReplSet
    Time of Reported error: Tue Jan 26 2021 12:05:40 GMT-0800 (PST)
  Migration Results for the last 24 hours:
    No recent migrations
  databases:
    { "_id" : "config", "primary" : "config", "partitioned" : true }
      config.system.sessions
        shard key: { "_id" : 1 }
        unique: false
        balancing: true
        chunks:
          Shard1ReplSet 342
          Shard2ReplSet 341
          Shard3ReplSet 341
        too many chunks to print, use verbose if you want to force print
    { "_id" : "ycsb", "primary" : "Shard2ReplSet", "partitioned" : true, "version" : { "uuid" : UUID("843f54f3-38cd-4bc4-885e-855fcc363f12"), "lastMod" : 1 } }
      ycsb.usertable
        shard key: { "_id" : 1 }
        unique: true
        balancing: true
        chunks:
          Shard1ReplSet 16412
          Shard2ReplSet 16412
          Shard3ReplSet 16411
        too many chunks to print, use verbose if you want to force print
mongos> show dbs
admin 0.000GB
config 0.090GB
ycsb 1149.934GB
mongos> db.usertable.getShardDistribution()
Shard Shard3ReplSet at Shard3ReplSet/gtm-server31:27018,gtm-server32:27018
data : 344.28GiB docs : 170522442 chunks : 16411
estimated data per chunk : 21.48MiB
estimated docs per chunk : 10390
Shard Shard2ReplSet at Shard2ReplSet/gtm-server29:27018,gtm-server30:27018
data : 344.31GiB docs : 170538315 chunks : 16412
estimated data per chunk : 21.48MiB
estimated docs per chunk : 10391
Shard Shard1ReplSet at Shard1ReplSet/gtm-server27:27018,gtm-server28:27018
data : 345.12GiB docs : 170939243 chunks : 16412
estimated data per chunk : 21.53MiB
estimated docs per chunk : 10415
Totals
data : 1033.72GiB docs : 512000000 chunks : 49235
Shard Shard3ReplSet contains 33.3% data, 33.3% docs in cluster, avg obj size on shard : 2KiB
Shard Shard2ReplSet contains 33.3% data, 33.3% docs in cluster, avg obj size on shard : 2KiB
Shard Shard1ReplSet contains 33.38% data, 33.38% docs in cluster, avg obj size on shard : 2KiB