How to make a fair comparison study (MySQL vs. Sphinx full-text search)

Hello everyone and thanks for a great forum.

I am a web developer with about 2 years of industry experience.

I am currently in the last phase of my graduate studies and I have decided to write my thesis on FULLTEXT search in MySQL and Sphinx.

My goal is a fair study of how they work, their differences, their pros and cons, and, of course, how they perform.

I have set up an Ubuntu web server with the latest versions of Apache, PHP, MySQL and Sphinx.

My hardware is:
-Core 2 Duo 2.13 GHz
-4 GB of RAM
-160 GB HD

I plan to generate a HUGE database, say 5-15 million rows, using a dictionary database(?). Every row will hold a short article (300-500 words long).
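
A minimal sketch of how such rows might be generated, written in Python purely for illustration. The word list, the function names and the fixed seed are all assumptions; a real run would load a full dictionary instead of the tiny sample list. Note that words drawn uniformly at random do not follow natural word frequencies, which may make the search results and timings less realistic.

```python
import random

# Hypothetical stand-in for a real dictionary word list.
SAMPLE_WORDS = ["search", "index", "query", "engine", "thesis",
                "keyword", "article", "server", "result", "data"]


def make_article(words, rng, min_words=300, max_words=500):
    """Build one article of 300-500 randomly chosen words."""
    length = rng.randint(min_words, max_words)
    return " ".join(rng.choice(words) for _ in range(length))


def generate_rows(words, count, seed=42):
    """Yield (id, article) pairs.

    The fixed seed makes the output reproducible, so the same rows
    can be loaded into both engines.
    """
    rng = random.Random(seed)
    for row_id in range(1, count + 1):
        yield row_id, make_article(words, rng)
```

Writing the generated rows to a file once and importing that same file into both MySQL and Sphinx would keep the two data sets identical.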

I will create two instances (one MySQL, one Sphinx) that will be complete replicas of each other, at least as far as the data goes.

I then want to run FULLTEXT searches on single and multiple keywords and measure the performance. For this study I will be using PHP.

I also want to test how they perform under heavy load. I want to create multiple threads, run searches simultaneously and benchmark the results.

I will be testing:
-Single thread
-4 threads
-8 threads
-16 threads
-32 threads
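
A rough sketch of what such a load test could look like, in Python for illustration only. The `fake_query` function is a placeholder and not a real MySQL or Sphinx call; the function names and the reported metrics are assumptions, and the keyword list is assumed to be non-empty.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Thread counts taken from the list above.
THREAD_LEVELS = [1, 4, 8, 16, 32]


def fake_query(keyword):
    """Placeholder for a real search call against one engine."""
    time.sleep(0.001)


def timed_call(run_query, keyword):
    """Run one query and return its latency in seconds."""
    start = time.perf_counter()
    run_query(keyword)
    return time.perf_counter() - start


def benchmark(run_query, keywords, thread_count):
    """Run all keywords through a pool of thread_count workers."""
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        latencies = list(pool.map(lambda kw: timed_call(run_query, kw), keywords))
    wall = time.perf_counter() - wall_start
    latencies.sort()
    return {
        "threads": thread_count,
        "queries": len(latencies),
        "throughput_qps": len(latencies) / wall,
        "median_s": latencies[len(latencies) // 2],
        "max_s": latencies[-1],
    }
```

Calling `benchmark(fake_query, keywords, n)` for each `n` in `THREAD_LEVELS`, with the same keyword list for both engines, gives one row of numbers per thread count. Replacing `fake_query` with a function that sends the query to the engine under test is left open here.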

So my questions are:

  1. How do I make this study as fair as possible?
  2. What tool is recommended for this? (Currently I am thinking of using JMeter.)
  3. What is a fair size for the database (10 million rows)?
  4. How big should the searched text fields be (are 300-500 words OK)?
  5. I think the settings are the most essential part of this study. Both engines should perform the same type of search and yield (more or less) the same results, right? What would be the ideal settings for this?
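
On question 5, one possible way to check whether both engines return "more or less" the same results is to compare the sets of row IDs they return for the same query. This is only a sketch in Python; the function name and the choice of Jaccard overlap as the measure are assumptions, and it ignores ranking order.

```python
def result_overlap(ids_a, ids_b):
    """Jaccard overlap of two result sets: 1.0 means identical, 0.0 means disjoint."""
    a, b = set(ids_a), set(ids_b)
    if not a and not b:
        return 1.0  # both engines found nothing, which counts as agreement
    return len(a & b) / len(a | b)
```

For example, `result_overlap([1, 2, 3], [2, 3, 4])` gives 0.5, because two of the four distinct IDs appear in both result sets.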

Note: Sorry for the bad English; it's not my native language.

I am thankful for any type of help, guidance, etc.