Use utf8mb4 as default encoding #1127

Merged
merged 4 commits into from
Nov 28, 2024

gochujang-c
Contributor

In MySQL, utf8 is an alias for utf8mb3, which stores at most three bytes per character and therefore covers only a subset of Unicode. An example that fails is:

└─>>> echo -n 🍌 | xxd
00000000: f09f 8d8c                                ....
└─>>> echo -n 🍌 | md5sum
806a7d7522baf4c10d2b45949075b382  -

If you create a hashlist with this md5 hash and try to import the solution as a pre-crack, you get the following error:

[12-Nov-2024 16:48:18 UTC] PHP Fatal error:  Uncaught PDOException: SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xF0\x9F\x8D\x8C' for column 'plaintext' at row 1 in /var/www/html/src/dba/AbstractModelFactory.class.php:244
Stack trace:
#0 /var/www/html/src/dba/AbstractModelFactory.class.php(244): PDOStatement->execute(Array)
#1 /var/www/html/src/inc/utils/HashlistUtils.class.php(478): DBA\AbstractModelFactory->mset(Object(DBA\Hash), Array)
#2 /var/www/html/src/inc/handlers/HashlistHandler.class.php(52): HashlistUtils::processZap('5', ':', 'paste', Array, Array, Object(DBA\User))
#3 /var/www/html/src/hashlists.php(31): HashlistHandler->handle('processZap')
#4 {main}
  thrown in /var/www/html/src/dba/AbstractModelFactory.class.php on line 244
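The rejected value '\xF0\x9F\x8D\x8C' is a four-byte UTF-8 sequence, and any plaintext containing one should hit the same error under utf8mb3. A minimal sketch for checking whether a given plaintext is affected, assuming a POSIX shell with od and grep available; needs_utf8mb4 is a hypothetical helper name, not something that exists in Hashtopolis:

```shell
#!/bin/sh
# Hypothetical helper: exits 0 if the argument contains a four-byte UTF-8
# sequence. In valid UTF-8, the bytes 0xF0..0xF4 only ever appear as the
# lead byte of such a sequence, so it is enough to look for them in a hex dump.
needs_utf8mb4() {
    printf '%s' "$1" | od -An -tx1 | grep -Eq '(^| )f[0-4]( |$)'
}

# The banana emoji from the example above is four bytes (f0 9f 8d 8c).
needs_utf8mb4 '🍌' && echo "banana: needs utf8mb4"

# A two-byte character such as é (c3 a9) still fits in utf8mb3.
needs_utf8mb4 'é' || echo "e-acute: fits in utf8mb3"
```

This only inspects raw bytes; it assumes the input is valid UTF-8 and does not validate it.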

It seems that MySQL clients since version 8.0 use utf8mb4 as the default, so removing the charset from the connection string would also work automatically. Perhaps it is still better to keep it explicit, in case of other use cases?

For changing it server side (if necessary), you could add something like the following to /etc/my.cnf:

[client]
default-character-set = utf8mb4
 
[mysql]
default-character-set = utf8mb4
 
[mysqld]
character-set-server = utf8mb4
collation-server = utf8mb4_0900_ai_ci

and restart the server.
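
To confirm that the settings took effect after the restart, the server variables can be queried. A minimal sketch, assuming the mysql command-line client is installed and can reach the server; show_charset_vars is a hypothetical helper name and the credentials in the example call are placeholders:

```shell
#!/bin/sh
# Hypothetical helper: print the server's character set and collation variables.
# Any arguments (for example -u root -p) are passed through to the mysql client.
show_charset_vars() {
    mysql "$@" -e "SHOW VARIABLES WHERE Variable_name LIKE 'character_set%' OR Variable_name LIKE 'collation%';"
}

# Example call (placeholder credentials, not run here):
# show_charset_vars -u root -p
```

If the [mysqld] section was picked up, character_set_server should report utf8mb4 and collation_server should report utf8mb4_0900_ai_ci.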

@@ -855,7 +855,7 @@ public function getDB($test = false) {
 }
 else {
 global $CONN;
-$dsn = 'mysql:dbname=' . $CONN['db'] . ";host=" . $CONN['server'] . ";port=" . $CONN['port'] . ";charset=utf8";
+$dsn = 'mysql:dbname=' . $CONN['db'] . ";host=" . $CONN['server'] . ";port=" . $CONN['port'] . ";charset=utf8mb4";
Member

Add a code comment with the following text:

# The utf8mb4 charset is set here to force PHP to connect with full UTF-8, so that emojis and other four-byte characters can be saved to the database. If you run into issues with this line, we could make this configurable.

Member

Also add an item to changelog.md; you might need to create a header there for the new version.

Contributor Author

Done.

@jessevz merged commit 80c8c14 into hashtopolis:master on Nov 28, 2024
1 check passed
jessevz added a commit that referenced this pull request Nov 28, 2024
* FIXED error for unsupported php 8.4 in xdebug (#1134)

* FIXED error for unsupported php 8.4 version by upgrading xdebug to version 3.4.0beta1

* temporarily fix by disabling php 8.4 deprecation warnings in apiv2

* Fixed preprocessor skip command bug. (#1126)

* Fixed preprocessor skip command bug.

* Added changelog entry.

* Updated changelog entry version.

* Use utf8mb4 as default encoding (#1127)

* Use utf8mb4.

* Added comment, added changelog entry.

---------

Co-authored-by: jessevz <[email protected]>

---------

Co-authored-by: gochujang-c <[email protected]>