pt-archiver: exporting tables with JSON or HTML data to ClickHouse

Hello,

Can someone with more experience help me?

I am trying to export some MySQL tables to ClickHouse, but I run into problems whenever a table contains HTML or JSON data. I have tried both CSV and TSV output with the same result. Can you recommend a method for this migration?

Commands I used are:

pt-archiver --source h=localhost,u=ser,p="password",D=bdd,t=ticket_history --where "column_datetime < '2022-03-01'" --file load_to_clickhouse.txt --no-delete --limit 100000 --progress=100000 --no-check-charset

with this result:

Code: 27. DB::ParsingException: Cannot parse input: expected '\t' at end of stream.: Buffer has gone, cannot extract information about what has been parsed.: While executing ParallelParsingBlockInputFormat: data for INSERT was parsed from stdin: (in query: INSERT INTO man_campaign_statistics_app FORMAT TSV): (at row 9178)
. (CANNOT_PARSE_INPUT_ASSERTION_FAILED)
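
My guess is that the HTML/JSON columns contain literal tabs or newlines, which the TSV parser reads as field and row separators. As a debugging aid, ClickHouse can be told to tolerate a bounded number of bad rows and report them instead of aborting; a minimal sketch, assuming the file is piped through clickhouse-client into the table named in the error:

# allow up to 10 unparsable rows so the load reports them rather than aborting
clickhouse-client --input_format_allow_errors_num=10 \
  --query "INSERT INTO man_campaign_statistics_app FORMAT TSV" < load_to_clickhouse.txt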

The CSV attempt fails in a similar way:

pt-archiver --source h=localhost,u=bdd,p="password",D=analytics,t=campaign_statistics_app --where "id_campaign_statitics <= 1681809 " --file load_to_clickhouse.csv --output-format csv --no-delete --limit 100000 --progress=100000 --no-check-charset

with this result:

Code: 27. DB::ParsingException: Cannot parse input: expected ',' before: 'viewport\\" content=\\"width=device-width, initial-scale=1\\">\r\\\n <link rel=\\"stylesheet\\" href=\\"https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.mi':
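
The truncated row above shows backslash-escaped quotes (\"), which is how MySQL and pt-archiver write the data out, while ClickHouse's CSV reader expects a double quote inside a quoted field to be escaped by doubling it (""). One way to narrow down where parsing breaks is to feed a handful of rows at a time; a sketch, assuming a target table name (the real ClickHouse table is not shown above):

# load only the first rows to find the failing line
head -n 10 load_to_clickhouse.csv | clickhouse-client \
  --query "INSERT INTO campaign_statistics_app FORMAT CSV"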

Thanks in advance

Chris


The pt-archiver manual says its --file output uses the same format as SELECT ... INTO OUTFILE. Can you try doing the export manually with that statement and see whether you get a similar result? If the manual export loads cleanly, that would point to a possible bug in pt-archiver.
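
For example, a manual export of the first table might look like this (a sketch: connection details and the output path are placeholders, and the WHERE clause is copied from your pt-archiver run):

# defaults to tab-separated fields with backslash escaping, matching --file output
mysql -u user -p bdd -e "SELECT * FROM ticket_history WHERE column_datetime < '2022-03-01' INTO OUTFILE '/tmp/ticket_history_manual.txt'"

Note that INTO OUTFILE writes the file on the MySQL server host and requires the FILE privilege.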
