Varchar/text, row size limit, and row format

We’re finally moving our application from MySQL 5.6 to MariaDB 10.3. I’ll accept any congratulations.

When doing an application update, I saw error 1118 about the row size limit, which we never encountered during testing. I see what the problem is, but I’m trying to figure out the best way forward.

The table that holds our users includes 64 fields that are customizable in the application. Currently they are all varchar(255). After the DB upgrade, I see that at least one table that holds these fields has the Dynamic row format, and another one has Compact. Looking at a database that is still on MySQL 5.6, it’s using Compact. Maybe the DB upgrade process in WHM/cPanel set one of the tables to Dynamic, I’m not sure.
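For reference, I checked the row formats with something along these lines (the schema name here is a placeholder):

SHOW TABLE STATUS LIKE 'users';
-- or across the whole schema:
SELECT TABLE_NAME, ROW_FORMAT
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'our_app_db';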

Our application can spend a lot of time running select queries looking for users based on those 64 fields in the WHERE clause. It might be running those select queries on a few hundred thousand users, and it might also run several hundred or thousand of those queries (over all users) at a time. My immediate thought was to change all 64 fields from varchar to text, but I’m worried about the impact on select performance. Is there an obvious best combination of row format (Dynamic/Compact) and column type (varchar/text)? From what I’m reading, I’m assuming that Compact with all varchar isn’t possible given the row size limit.

Even though the 64 fields allow 255 characters, in practice most of our clients are probably using between 10 and 30 fields, and the data in each one is at most a couple dozen characters, although there are certainly outliers.

It looks like the InnoDB page size is set to the default.

Hi tcsteve,

You did not paste the error, but I assume it is similar to “max size for secondary index is 767” or “maximum row size for InnoDB”: https://dev.mysql.com/doc/refman/5.6/en/innodb-limits.html

Is the error related to table creation? Index creation? Row insert?

One notable change between MySQL 5.6 and MariaDB 10.3 (or MySQL/Percona Server 8.0) is that the default charset changed from latin1 to utf8:

https://dev.mysql.com/doc/refman/5.6/en/server-system-variables.html#sysvar_character_set_database

https://mariadb.com/docs/reference/es/system-variables/character_set_database/

As can be seen at https://dev.mysql.com/doc/refman/5.6/en/charset-mysql.html , latin1 characters are 1 byte each, while utf8 characters can take 3 or 4 bytes (depending on the variant/collation), since a much wider range of characters can be stored. If you will only ever insert latin1 characters (numbers, letters a to z), you can specify latin1 as the charset/collation for each column or table; otherwise the default for the entire database will be converted to utf8.
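If you go the latin1 route, the conversion can be sketched like this (table and column names here are hypothetical; verify the data first, since non-latin1 characters cannot be converted):

ALTER TABLE t CONVERT TO CHARACTER SET latin1 COLLATE latin1_swedish_ci;
-- or, per column:
ALTER TABLE t MODIFY c VARCHAR(255) CHARACTER SET latin1 COLLATE latin1_swedish_ci;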

Regards,

The error was about the max row size for InnoDB. We were already using utf8 for our character set. The specific point where I got the error was a query to alter a table that has the 64 varchar(255) fields. I haven’t seen any errors when actually updating or inserting into those tables, which I thought was strange.

Like I said, those 64 fields get used pretty extensively in a lot of select queries, as part of the conditions in the WHERE clause. So I was concerned that if I changed all of the fields from varchar to text, and that data gets stored on another page, it would hurt the performance of those select queries. Is that accurate?

I believe the InnoDB page size is set to 16KB. Would it make the most sense to just increase the page size so that those 64 fields as varchar(255), plus the other fields in that table, would fit into one page? A page size of 64KB or 128KB, for example?
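For reference, this is how I’m checking it:

SHOW VARIABLES LIKE 'innodb_page_size';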

I’m trying to figure out the best way forward from this upgrade for our application. I tried to change those 64 fields from varchar(255) to text on an existing installation of our application; it ran for five and a half hours before eventually failing. I believe a backup export command that locks the database ultimately caused the ALTER to fail before it could finish.

Hi again,

What error are you exactly getting? Is it the following?

ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 8126. You have to change some columns to TEXT or BLOBs

I wouldn’t suggest changing the default page size, since it can have other implications, and you might not be fixing the underlying issue.

If it is the above, you can consider changing the column type from varchar to TEXT or BLOB.
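A sketch of that change, assuming the columns are named field1 through field64 as you describe (adjust to your actual schema); batching the MODIFYs into one ALTER avoids rebuilding the table once per column:

ALTER TABLE users
  MODIFY field1 TEXT COLLATE utf8_unicode_ci,
  MODIFY field2 TEXT COLLATE utf8_unicode_ci;
-- repeat the MODIFY clause through field64 inside the same ALTER statement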

You can also consider disabling innodb_strict_mode ( https://mariadb.com/kb/en/innodb-system-variables/#innodb_strict_mode ), whose new default is ON; when enabled, several conditions that used to produce a warning now produce an error.
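If you just want the old, permissive behaviour back, that would be:

SET GLOBAL innodb_strict_mode = OFF;
-- and set innodb_strict_mode = OFF under [mysqld] in my.cnf so it survives a restart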

If the above does not fix your issue, provide the exact error message you are getting and an example table definition that fails for you.

Regards

I believe it was the 42000 one.

If I disable strict mode, will everything still work? Will I be able to insert a row that is larger than the page size, or alter a table that would exceed the page size?

I understand that I can change all of the columns from varchar(255) to text, and I understand what the error is telling me and why. What I don’t understand is whether changing all of those columns to text will negatively impact the performance of select queries that use those fields in the WHERE condition. That is my question, along with whether changing the fields to text is my only option or whether I have another one, like using the Dynamic row format instead of Compact. I AM considering changing the columns to text, but I don’t know all of the implications of doing that, specifically for select performance. That is what I’m trying to get information about: whether select performance will suffer, and if so, whether there are other options like changing the row format.

Here is the table structure; obviously those 64 fields are causing this issue. My question is what the implications of the various possible solutions are. The application might run a lot of queries that use them, e.g. SELECT * FROM users WHERE field1 = '…' AND field2 LIKE '%…%' AND field3 != '…', etc.

CREATE TABLE users (
  id int(10) UNSIGNED NOT NULL,
  username varchar(50) COLLATE utf8_unicode_ci NOT NULL,
  password varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  pw_ver tinyint(3) UNSIGNED NOT NULL DEFAULT '1',
  fname varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  lname varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  email varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  active tinyint(1) UNSIGNED NOT NULL DEFAULT '1',
  date_registered int(10) UNSIGNED NOT NULL,
  last_login int(10) UNSIGNED NOT NULL,
  last_ip varchar(15) COLLATE utf8_unicode_ci DEFAULT NULL,
  main_admin tinyint(1) UNSIGNED NOT NULL,
  ug_admin tinyint(1) UNSIGNED NOT NULL,
  sub_admin tinyint(1) UNSIGNED NOT NULL,
  instructor tinyint(1) UNSIGNED NOT NULL,
  link_to int(10) UNSIGNED DEFAULT NULL,
  pw_reset_token varchar(20) COLLATE utf8_unicode_ci DEFAULT NULL,
  pw_reset_time int(10) UNSIGNED DEFAULT NULL,
  pw_reset_ip varchar(15) COLLATE utf8_unicode_ci DEFAULT NULL,
  pw_last_change int(10) UNSIGNED NOT NULL DEFAULT '0',
  view_type varchar(15) COLLATE utf8_unicode_ci NOT NULL DEFAULT 'catview',
  welcome_show tinyint(1) NOT NULL DEFAULT '1',
  new_fav tinyint(1) UNSIGNED NOT NULL DEFAULT '0',
  student_email_option tinyint(1) NOT NULL DEFAULT '1',
  tz_country varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
  tz_zone varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
  date_format varchar(20) COLLATE utf8_unicode_ci DEFAULT NULL,
  time_format varchar(20) COLLATE utf8_unicode_ci DEFAULT NULL,
  field1 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field2 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field3 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field4 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field5 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field6 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field7 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field8 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field9 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field10 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field11 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field12 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field13 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field14 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field15 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field16 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field17 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field18 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field19 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field20 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field21 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field22 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field23 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field24 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field25 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field26 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field27 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field28 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field29 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field30 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field31 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field32 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field33 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field34 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field35 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field36 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field37 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field38 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field39 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field40 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field41 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field42 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field43 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field44 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field45 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field46 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field47 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field48 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field49 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field50 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field51 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field52 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field53 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field54 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field55 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field56 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field57 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field58 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field59 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field60 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field61 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field62 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field63 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  field64 varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  created_by varchar(50) COLLATE utf8_unicode_ci NOT NULL DEFAULT 'unknown',
  instructor_certs varchar(150) COLLATE utf8_unicode_ci DEFAULT NULL,
  api_access_token varchar(40) COLLATE utf8_unicode_ci NOT NULL,
  api_refresh_token varchar(40) COLLATE utf8_unicode_ci NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

Hi again,

The provided table has many varchar(255) columns. Since varchar is variable-length, some rows can be much shorter than others.

On previous MySQL versions, you could create a table definition whose potential row size exceeds 8126 bytes (the limit exists because at least two rows must fit on a 16 KB page) without getting an error. On newer versions (upstream MySQL, Percona Server, or MariaDB), you may get an error when creating a table definition or index that exceeds that threshold.
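Rough arithmetic for your case, assuming utf8 at up to 3 bytes per character and ignoring per-column overhead:

64 custom fields × 255 chars × 3 bytes/char = 48,960 bytes potential row size,
well above the ~8,126-byte limit for a 16 KB page.

Note that the strict check at CREATE/ALTER time is against this potential maximum, while inserts and updates are checked against the actual stored size, which would explain why your short values have been going in without errors.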

If you use TEXT, the content will be stored off-page and the size limit will be much larger, so there shouldn’t be any problem storing large values.

You previously said:

Our application can spend a lot of time running select queries looking for users based on those 64 fields in the WHERE clause. It might be running those select queries on a few hundred thousand users, and it might also run several hundred or thousand of those queries (over all users) at a time.

Even though the 64 fields allow 255 characters, in practice most of our clients are probably using between 10 and 30 fields, and the data in each one is at most a couple dozen characters, although there are certainly outliers.

and

Here is the table structure; obviously those 64 fields are causing this issue. My question is what the implications of the various possible solutions are. The application might run a lot of queries that use them, e.g. SELECT * FROM users WHERE field1 = '…' AND field2 LIKE '%…%' AND field3 != '…', etc.

Using filters such as LIKE, or comparing large values to a fixed string, is costly. Using an index should improve performance, but with such large column values it can be lengthy. Remember that an index is basically an additional structure holding a copy of the column data, and each table in a MySQL query will generally use at most one index (unless you use UNION or an index merge occurs, in which results from multiple indexes are merged together). If you need to create an index for every possible combination of fields you might filter on, it will be expensive; and with no indexes at all, a full table scan will occur, which takes longer the more rows exist in the users table.
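To illustrate with your schema (the index name is hypothetical): a TEXT column can still be indexed, but it requires a prefix length, and a leading-wildcard LIKE cannot use a B-tree index at all:

ALTER TABLE users ADD INDEX idx_field1 (field1(100)); -- TEXT needs an index prefix length
EXPLAIN SELECT id FROM users WHERE field1 = 'some value';      -- can use idx_field1
EXPLAIN SELECT id FROM users WHERE field2 LIKE '%some value%'; -- leading % forces a full scan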

You should check how long queries take now with the current number of users, and check how much performance degrades as data grows, since degradation will occur the more rows you add.
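A quick way to measure a representative customer query today (filter values elided):

FLUSH STATUS; -- reset session counters
SELECT SQL_NO_CACHE COUNT(*) FROM users WHERE field1 = '…' AND field2 LIKE '%…%';
SHOW SESSION STATUS LIKE 'Handler_read%'; -- Handler_read_rnd_next approximates rows scanned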

If you need to manipulate a very high number of records that do not fit a relational structure (MySQL benefits from a defined schema and the use of joins), you might consider migrating to a NoSQL store such as MongoDB, which specializes in this kind of generic structure where varying field types and values are stored per document.

Regards

OK, I don’t think moving to another database is an option right now. This is an application that people have been using for over 10 years, and every customer has their own data that they want to capture about their users; that’s why we have so many customizable fields. The queries to find these users are actually written by the customers themselves, so there isn’t a particular index that works well for every customer. This issue didn’t come up during testing with MariaDB 10.3 because we didn’t need to run any ALTER TABLE queries during testing. It only came up after I upgraded one of our servers and then tried to update a customer’s test environment that hadn’t been updated in a while, and some of the tables needed to be changed.

I guess you’re telling me that we need to change all of the fields to text and work on performance after that. Unfortunately, as I mentioned, it can take many hours to convert these fields to text in a single database.

The time to rebuild the table, whether with a plain ALTER or with https://www.percona.com/doc/percona-toolkit/2.1/pt-online-schema-change.html , will vary depending on table size.
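For example, a pt-online-schema-change run might be sketched as follows (the database name mydb is a placeholder; test on a copy first):

pt-online-schema-change \
  --alter "MODIFY field1 TEXT COLLATE utf8_unicode_ci, MODIFY field2 TEXT COLLATE utf8_unicode_ci" \
  D=mydb,t=users --execute

It copies the table in the background and swaps it in at the end, avoiding the hours-long blocking ALTER you described.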

Current performance can be fine while the table is small and the number of concurrent queries being executed is low. But a full table scan over a few hundred thousand users can mean reading anywhere from a few MB up to some GB, which can be costly. If the data fits in memory, it will be faster than going to disk (depending on whatever other DB load is ongoing). You should also consider average row size multiplied by the number of concurrent executions.

You can examine the current load (number of rows and concurrent thread execution) and do the math to extrapolate the load you expect in the near future. MySQL will probably be good enough up to a certain threshold, beyond which document-based DBs (like MongoDB) do better.

Unfortunately, these fields also exist in more than one table. We have a table for current users and another for deleted users, one for groups of users with an unrelated set of 64 fields, and then 4 different tables (2 live, 2 for deleted users) that each have the same 64 fields, where customers wanted to store the values of those fields at a particular point in time rather than just linking to the current fields in the users table. It’s kind of a mess; that’s why the update takes so long.