View Issue Details
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0027886||Runner||HTML5||Public||2017-09-21 14:14||2019-05-09 15:49|
|Reporter||YellowAfterlife||Assigned To||Mike Dailly|
|Priority||Medium||Severity||B - Major||Reproducibility||100%|
|Platform||Windows||OS||Windows 8||OS Version||8.1|
|Target Version||Fixed in Version|
|Summary||0027886: HTML5: buffer_string and string_byte_ functions are not UTF-8 aware (implementation included)|
|Description||Currently buffer_read(_, buffer_string), buffer_write(_, buffer_string, _), string_byte_at(_, _), and string_byte_length(_) are not UTF-8 aware, which renders them useless for interoperation with anything that expects the client to handle UTF-8 correctly.|
buffer_read and buffer_write use 16-bit integers for char codes, so any glyph that takes 3 or more bytes in UTF-8 is lost on write, even when no external code is involved.
The string_byte_ functions return the ordinary character length and character codes instead of the UTF-8 byte length and byte values, which defeats their purpose.
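To illustrate the difference between character length and byte length, here is a minimal TypeScript sketch of a UTF-8 byte count based on range checks. The names `utf8Size`, `codePointAtIndex` and `utf8ByteLength` are hypothetical and are not the runner's actual functions; the sketch assumes the string is held as UTF-16 code units, as JavaScript strings are.

```typescript
// Hypothetical helpers for illustration; not the HTML5 runner's real API.

// Number of UTF-8 bytes needed to store one Unicode code point.
function utf8Size(codePoint: number): number {
  if (codePoint < 0x80) return 1;     // U+0000..U+007F
  if (codePoint < 0x800) return 2;    // U+0080..U+07FF
  if (codePoint < 0x10000) return 3;  // U+0800..U+FFFF
  return 4;                           // U+10000..U+10FFFF
}

// Read the code point starting at `index`, combining a surrogate pair if present.
function codePointAtIndex(text: string, index: number): number {
  const hi = text.charCodeAt(index);
  if (hi >= 0xD800 && hi <= 0xDBFF && index + 1 < text.length) {
    const lo = text.charCodeAt(index + 1);
    if (lo >= 0xDC00 && lo <= 0xDFFF) {
      return ((hi - 0xD800) << 10) + (lo - 0xDC00) + 0x10000;
    }
  }
  return hi; // BMP character (an unpaired surrogate is counted as 3 bytes)
}

// UTF-8 byte length of a whole string.
function utf8ByteLength(text: string): number {
  let total = 0;
  for (let i = 0; i < text.length; i++) {
    const cp = codePointAtIndex(text, i);
    total += utf8Size(cp);
    if (cp > 0xFFFF) i++; // skip the low surrogate already consumed
  }
  return total;
}
```

With this approach a string holding one glyph from each size class (for example "A", "é", "€" and an emoji) has a byte length that differs from its character count, which is the distinction the string_byte_ functions are meant to expose.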
Attached is a project with a test case (reading/writing 1, 2, 3, 4 byte glyphs; polling byte length; polling bytes) to highlight the issues.
Also included is a GML-only implementation of the proposed fix (UTF-8 range checks). This is standard practice (I believe the runtime already works this way on native targets) and gives results that are both accurate and fast enough when implemented at the runtime level.
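For reference, the range-check approach can be sketched in TypeScript roughly as follows. This is not the attached GML implementation and not the runner's code; `utf8Encode` and `utf8Decode` are hypothetical names, plain number arrays stand in for buffers, and the decoder assumes well-formed input (it does no validation of continuation bytes or overlong forms).

```typescript
// Hypothetical sketch of UTF-8 encoding/decoding via range checks.

// Encode a UTF-16 string into an array of UTF-8 byte values.
function utf8Encode(text: string): number[] {
  const out: number[] = [];
  for (let i = 0; i < text.length; i++) {
    let cp = text.charCodeAt(i);
    // Combine a surrogate pair into a single code point above U+FFFF.
    if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < text.length) {
      const lo = text.charCodeAt(i + 1);
      if (lo >= 0xDC00 && lo <= 0xDFFF) {
        cp = ((cp - 0xD800) << 10) + (lo - 0xDC00) + 0x10000;
        i++;
      }
    }
    if (cp < 0x80) {
      out.push(cp);
    } else if (cp < 0x800) {
      out.push(0xC0 | (cp >> 6), 0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
      out.push(0xE0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3F), 0x80 | (cp & 0x3F));
    } else {
      out.push(0xF0 | (cp >> 18), 0x80 | ((cp >> 12) & 0x3F),
               0x80 | ((cp >> 6) & 0x3F), 0x80 | (cp & 0x3F));
    }
  }
  return out;
}

// Decode well-formed UTF-8 byte values back into a UTF-16 string.
function utf8Decode(bytes: number[]): string {
  let text = "";
  let i = 0;
  while (i < bytes.length) {
    const b0 = bytes[i++];
    let cp: number;
    if (b0 < 0x80) {
      cp = b0;                                   // 1-byte sequence
    } else if (b0 < 0xE0) {
      cp = ((b0 & 0x1F) << 6) | (bytes[i++] & 0x3F);   // 2-byte sequence
    } else if (b0 < 0xF0) {
      cp = ((b0 & 0x0F) << 12) | ((bytes[i++] & 0x3F) << 6) | (bytes[i++] & 0x3F);
    } else {
      cp = ((b0 & 0x07) << 18) | ((bytes[i++] & 0x3F) << 12) |
           ((bytes[i++] & 0x3F) << 6) | (bytes[i++] & 0x3F);
    }
    if (cp > 0xFFFF) {
      // Split back into a surrogate pair for the UTF-16 string.
      cp -= 0x10000;
      text += String.fromCharCode(0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF));
    } else {
      text += String.fromCharCode(cp);
    }
  }
  return text;
}
```

A round trip such as `utf8Decode(utf8Encode(s))` should return the original string for 1, 2, 3 and 4 byte glyphs alike, which mirrors what the attached test case checks.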
|Tags||No tags attached.|
|1.4 Found In||1.4.17|
|2.x Runtime Found In||18.104.22.168|
|2.x Runtime Verified In||22.214.171.1241|