Problem Description
After noticing that an application tended to discard random emails due to incorrect string value errors, I went through and switched many text columns to the `utf8` character set with its default collation (`utf8_general_ci`) so that they would accept those emails. This fixed most of the errors, and the application also stopped getting SQL errors when it hit non-Latin emails.
Despite this, some of the emails still cause the program to hit incorrect string value errors: `(Incorrect string value: '\xE4\xC5\xCC\xC9\xD3\xD8...' for column 'contents' at row 1)`
The `contents` column is a `MEDIUMTEXT` column that uses the `utf8` character set and the `utf8_general_ci` collation. There are no flags that I can toggle on this column.
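For reference, here is a minimal sketch of how the bytes quoted in the error message can be checked outside MySQL. It only tests the six leading bytes that the message shows (the rest is elided there), and the helper name `is_valid_utf8` is made up for illustration. The KOI8-R decode at the end is purely a guess about where such bytes might come from, not something the application or MySQL reports.

```python
# Leading bytes copied from the error message; the remainder is elided there.
sample = b"\xE4\xC5\xCC\xC9\xD3\xD8"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` decodes as strict UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 0xE4 announces a three-byte UTF-8 sequence, but 0xC5 is not a valid
# continuation byte (those lie in 0x80-0xBF), so strict decoding fails.
print(is_valid_utf8(sample))  # prints False

# Assumption: the bytes may be text in a legacy single-byte encoding.
# KOI8-R is only one candidate; other 8-bit encodings also decode without error.
print(sample.decode("koi8_r"))
```

If the check reports `False`, the data reaching the column is not UTF-8 at all, which would suggest the problem is in the encoding of the incoming bytes and not in how permissive the `utf8` column is.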
Keeping in mind that I don't want to touch or even look at the application source code unless absolutely necessary:
* What is causing that error? (yes, I know the emails are full of random garbage, but I thought utf8 would be pretty permissive)
* How can I fix it?
* What are the likely effects of such a fix?
One thing I considered was switching to a `utf8` `VARCHAR([some large number])` with the binary flag turned on, but I'm rather unfamiliar with MySQL and have no idea whether such a fix makes sense.