Cluster performance really slow compared to just one server

I can see there is some fundamental confusion here about how Galera works and how Galera replication differs from traditional asynchronous MySQL replication.
Ovi, it is not entirely true that PXC/Galera is asynchronous by default. Galera-based replication (the kind used in PXC) works completely differently from standard MySQL replication (available since the very early MySQL days).
Galera is synchronous in the sense that each transaction must reach all the nodes at commit time; only applying those transactions to the real data happens later, asynchronously. At the same time, the cluster won’t allow any node to fall too far behind in applying transactions, and will pause replication if needed (this is flow control). This model is also called “virtually synchronous”. Please read the first chapter here:
[url]http://galeracluster.com/documentation-webpages/[/url]
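
To make the "virtually synchronous" idea a bit more concrete, here is a minimal toy model in Python. It is only a sketch of the behaviour described above, not how Galera is actually implemented: the class names, the fc_limit parameter and the one-write-set-per-pause rule are all made up for illustration.

```python
from collections import deque


class Node:
    """Toy cluster node (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()  # write-sets received but not yet applied
        self.data = {}        # the "real data" on this node

    def apply_one(self):
        # Apply the oldest pending write-set, if there is one.
        if self.queue:
            key, value = self.queue.popleft()
            self.data[key] = value


class Cluster:
    """Toy model: delivery at commit is synchronous, applying is not."""

    def __init__(self, names, fc_limit=2):
        self.nodes = {name: Node(name) for name in names}
        self.fc_limit = fc_limit  # made-up flow control threshold
        self.pauses = 0           # how many times commits had to wait

    def commit(self, origin, key, value):
        # Flow control: while any node is too far behind, pause and
        # let the lagging nodes catch up a little.
        while any(len(n.queue) >= self.fc_limit for n in self.nodes.values()):
            self.pauses += 1
            for n in self.nodes.values():
                if len(n.queue) >= self.fc_limit:
                    n.apply_one()
        # The write-set reaches every node as part of the commit...
        for n in self.nodes.values():
            if n.name == origin:
                n.data[key] = value            # applied locally right away
            else:
                n.queue.append((key, value))   # ...but applied later elsewhere


if __name__ == "__main__":
    cluster = Cluster(["a", "b", "c"], fc_limit=2)
    for i in range(5):
        cluster.commit("a", f"k{i}", i)
    for node in cluster.nodes.values():
        print(node.name, "applied:", len(node.data), "pending:", len(node.queue))
    print("commits paused", cluster.pauses, "times")
```

The point of the sketch is that the other nodes may be behind in what they have applied, but never by more than the limit, because the writer is slowed down instead.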

It is also true that enabling wsrep_sync_wait forces a kind of fully synchronous mode, but this makes sense only in very special use cases.
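
As a rough illustration of what such a causality check buys you, here is a small Python sketch. It is a toy under stated assumptions: the ReplicaNode class and the sync_wait argument are hypothetical and only mimic the idea that a read waits until the node has applied everything it has already received.

```python
from collections import deque


class ReplicaNode:
    """Toy node with a queue of received but not yet applied write-sets."""

    def __init__(self):
        self.pending = deque()
        self.data = {}

    def receive(self, key, value):
        # The write-set was delivered at commit time, but not applied yet.
        self.pending.append((key, value))

    def apply_next(self):
        if self.pending:
            key, value = self.pending.popleft()
            self.data[key] = value

    def read(self, key, sync_wait=False):
        # With sync_wait (hypothetical flag), catch up on everything
        # already received before answering the read.
        if sync_wait:
            while self.pending:
                self.apply_next()
        return self.data.get(key)


if __name__ == "__main__":
    node = ReplicaNode()
    node.receive("x", 1)
    print(node.read("x"))                  # plain read may see stale data
    print(node.read("x", sync_wait=True))  # read waits for the apply queue
```

The price is that every such read may have to wait for the apply queue to drain, which is why it only pays off when a read really must see the latest committed write.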

Standard MySQL replication, on the other hand, is completely asynchronous regardless of the topology you choose (master->slave or master<->master), and it has no mechanism for keeping track of what the slaves are doing (with the small exception of semi-sync mode). In standard replication the slaves are allowed to lag behind the master, and the master does not really care about that.
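
For contrast, a similarly simplified Python sketch of the asynchronous case. Again this is only a toy: AsyncMaster, AsyncSlave and counting lag in events are assumptions for illustration (a real slave reads and applies the master's binary log through its own threads).

```python
class AsyncMaster:
    """Toy master: commits never wait for, or even look at, the slaves."""

    def __init__(self):
        self.binlog = []

    def commit(self, event):
        self.binlog.append(event)  # returns immediately


class AsyncSlave:
    """Toy slave: pulls events from the master's log at its own pace."""

    def __init__(self, master):
        self.master = master
        self.position = 0  # index of the next binlog event to apply
        self.applied = []

    def lag(self):
        # Lag counted in events here, purely for illustration.
        return len(self.master.binlog) - self.position

    def replicate(self, max_events):
        end = min(self.position + max_events, len(self.master.binlog))
        self.applied.extend(self.master.binlog[self.position:end])
        self.position = end


if __name__ == "__main__":
    master = AsyncMaster()
    slave = AsyncSlave(master)
    for i in range(100):
        master.commit(i)   # nothing ever slows the master down
    slave.replicate(30)
    print("slave lag in events:", slave.lag())
```

Nothing in this model bounds the lag: the master keeps committing no matter how far behind the slave is.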

In PXC you can mix Galera and traditional replication: a PXC node can be a slave of a normal MySQL instance or of a node in another PXC cluster. A PXC node can also be a master to async slaves. See some examples here:
[url]http://www.percona.com/blog/2013/06/21/changing-an-async-slave-of-a-pxc-cluster-to-a-new-master/[/url]