Managing 4-byte Characters
4-byte Characters in Different Databases
PhixFlow stores its data in a database. With the recommended database configurations, described on the database pages in Infrastructure Planning and Delivery, your PhixFlow instance can store UTF-8 characters in its database. However, each database handles 4-byte characters (such as emojis) differently.
Database | Behaviour |
---|---|
Oracle | Stores 4-byte characters correctly. |
SQL Server | Discards or converts 4-byte characters. |
MySQL | MySQL databases configured with utf8mb4 are supported and recommended. This character set allows emojis and certain other special characters to be stored. |
If you are loading data from another database, a file, or emails, you must either escape or remove any invalid characters before writing the data to a PhixFlow instance running on MySQL.
For information about how to escape characters, see Text Expressions and Escape Characters.
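Before deciding whether to escape or remove characters, it can help to see which characters in the incoming data are 4-byte characters. The following is a minimal Python sketch, not a PhixFlow expression; the function name `four_byte_chars` is hypothetical. It relies on the fact that any code point above U+FFFF needs 4 bytes in UTF-8.

```python
def four_byte_chars(text: str) -> list:
    """Return the characters in text that need 4 bytes in UTF-8.

    Code points above U+FFFF (for example, most emojis) are the
    ones that UTF-8 encodes as 4 bytes.
    """
    return [ch for ch in text if ord(ch) > 0xFFFF]


if __name__ == "__main__":
    sample = "Hi \U0001F600 \u00e9 \u6f22"
    for ch in four_byte_chars(sample):
        # Show the code point and the UTF-8 byte length of each hit.
        print(f"U+{ord(ch):04X} uses {len(ch.encode('utf-8'))} bytes")
```

Running a check like this on a sample of the source data shows whether any cleanup is needed before loading.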
Removing Characters
The following expression uses a regular expression that lists the characters expected in a JSON file, and removes every character that is not in that list. You could adapt this regular expression to validate the data that you want to load into PhixFlow.
replaceAll(JSON Code,"[^\\p{Space}0-9A-Za-z!:\\\"%&\\[*()\\],-/_\\\\{}\\.]","")
For information about how this line uses escape characters, see Regular Expressions.
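To experiment with the character list outside PhixFlow, the same idea can be sketched in Python. This is an approximation rather than the PhixFlow implementation: Python's `re` module does not support `\p{Space}`, so the sketch uses `\s` with the `re.ASCII` flag, which is assumed to cover the same whitespace characters. The names `UNEXPECTED` and `remove_unexpected` are hypothetical. Note that `,-/` inside the character class is a range, so it allows the comma, hyphen, full stop and forward slash.

```python
import re

# Negated character class: matches anything that is NOT whitespace,
# a digit, an ASCII letter, or one of the listed punctuation marks.
# The backslashes here are single-level (raw string), unlike the
# doubled escapes needed inside a PhixFlow string literal.
UNEXPECTED = re.compile(r'[^\s0-9A-Za-z!:"%&\[*()\],-/_\\{}.]', re.ASCII)


def remove_unexpected(text: str) -> str:
    """Delete every character that is not in the allowed list."""
    return UNEXPECTED.sub("", text)


if __name__ == "__main__":
    print(remove_unexpected('{"name": "Zo\u00eb \U0001F600", "n": 1}'))
```

Because the list is an allow-list, it removes accented letters and other characters that are not 4-byte, as well as emojis; extend the class if your data legitimately contains them.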
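An alternative expression, which matches any character outside the range U+0000 to U+0FFF so that it can be removed, is:
[^\u0000-\u0FFF]
Note that this range is narrower than the set of characters that fit in 3 bytes of UTF-8 (U+0000 to U+FFFF), so the expression also matches some characters that are not 4-byte characters.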
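As a point of comparison, the following Python sketch removes only the characters that need 4 bytes in UTF-8. It uses U+FFFF as the upper bound, which is the highest code point that fits in 3 bytes, rather than the U+0FFF used in the expression above; that choice is an assumption about the intent, and the function name `strip_four_byte` is hypothetical.

```python
import re

# Matches any code point above U+FFFF, that is, any character
# that UTF-8 encodes as 4 bytes.
FOUR_BYTE = re.compile(r"[^\u0000-\uFFFF]")


def strip_four_byte(text: str) -> str:
    """Remove 4-byte characters and keep everything else."""
    return FOUR_BYTE.sub("", text)


if __name__ == "__main__":
    # The emoji is removed; the 3-byte CJK character is kept.
    print(strip_four_byte("ok \U0001F600 \u6f22"))
```

With this bound, 3-byte characters such as CJK text are kept, which would not be the case with a bound of U+0FFF.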