Utf8 How Unicode can be easy...
It has been quite some time since I last wrote something. I recently started attending university, which has taken up all of my time these past weeks :)
Anyway, during a very easy lecture about programming I decided to actually program something myself, which turned out to be Unicode support for x64dbg!
At first I thought I would have to rewrite the command parser and whatnot, but that turned out not to be needed at all…
My first plan was to convert every char pointer and constant to the Windows-supported wchar_t type, but that would take far too long to carry out, and it would break plugin compatibility and the complete internal API, which just sucks. Various discussions and a blogpost later I decided to use UTF-8 internally and call WinAPI through conversion functions.
Basically it required three things:
1) A C++ class like QString that allows string operations on UTF-8 strings;
2) Conversion functions from UTF-8 to UTF-16 and the other way around;
3) ‘Converting’ all external ANSI calls to their Unicode variants (WinAPI, TitanEngine, dbghelp, etc).
The second step was also very easy: the blogpost I mentioned earlier had two ready-to-use functions called ConvertFromUtf8ToUtf16. Those worked great, except that they would crash when passed null as an argument. Wrapping them in UString solved that issue without having to think :)
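To make the null-safe wrapping idea concrete, here is a minimal sketch. It is not the code from the blogpost: on Windows you would normally build this on top of MultiByteToWideChar and WideCharToMultiByte, while this version uses a small hand-written decoder so it compiles anywhere. The names Utf8ToUtf16 and Utf16ToUtf8 are made up for illustration, malformed input is simply replaced with U+FFFD, and overlong encodings are not rejected.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not the blogpost's function): UTF-8 -> UTF-16.
// A null pointer yields an empty string instead of a crash.
std::u16string Utf8ToUtf16(const char* utf8)
{
    std::u16string out;
    if(!utf8) // null-safe: treat null as an empty string
        return out;
    const unsigned char* p = reinterpret_cast<const unsigned char*>(utf8);
    while(*p)
    {
        uint32_t cp;
        int extra; // number of continuation bytes expected
        if(*p < 0x80)               { cp = *p;        extra = 0; }
        else if((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else                         { cp = 0xFFFD;    extra = 0; } // invalid lead byte
        p++;
        for(int i = 0; i < extra; i++)
        {
            if((*p & 0xC0) != 0x80) // truncated sequence (also stops at the terminator)
            {
                cp = 0xFFFD;
                break;
            }
            cp = (cp << 6) | (*p & 0x3F);
            p++;
        }
        if(cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            cp = 0xFFFD; // out of range or an encoded surrogate
        if(cp >= 0x10000) // needs a surrogate pair in UTF-16
        {
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        else
            out.push_back(static_cast<char16_t>(cp));
    }
    return out;
}

// Hypothetical helper: UTF-16 -> UTF-8, with the same null handling.
std::string Utf16ToUtf8(const char16_t* utf16)
{
    std::string out;
    if(!utf16)
        return out;
    for(const char16_t* p = utf16; *p; p++)
    {
        uint32_t cp = *p;
        if(cp >= 0xD800 && cp <= 0xDBFF && p[1] >= 0xDC00 && p[1] <= 0xDFFF)
        {
            // combine a high/low surrogate pair into one code point
            cp = 0x10000 + ((cp - 0xD800) << 10) + (p[1] - 0xDC00);
            p++;
        }
        else if(cp >= 0xD800 && cp <= 0xDFFF)
            cp = 0xFFFD; // unpaired surrogate
        if(cp < 0x80)
            out += static_cast<char>(cp);
        else if(cp < 0x800)
        {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else if(cp < 0x10000)
        {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else
        {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With helpers shaped like this, a call such as Utf8ToUtf16(nullptr) just returns an empty string, so callers do not have to check for null before every conversion.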
The third step seemed easy at first: I could debug a random application with a Chinese path within minutes. After that, however, came a small moment of confusion, because Qt appears to interpret const char* strings as Latin1 by default. The following code solved this, and after that the log etc. worked correctly:
// Set QString codec to UTF-8
QTextCodec::setCodecForLocale(QTextCodec::codecForName("UTF-8"));
QTextCodec::setCodecForCStrings(QTextCodec::codecForName("UTF-8"));
QTextCodec::setCodecForTr(QTextCodec::codecForName("UTF-8"));
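Why a Latin1 default garbles UTF-8 text can be shown without Qt. The sketch below uses a made-up helper, DecodeLatin1, that treats every byte as one character, which is what a Latin1 interpretation amounts to. It is only an illustration of the effect, not how QString is implemented.

```cpp
#include <string>

// Hypothetical helper: interpret each byte as one Latin-1 character.
// Latin-1 byte values map directly to the code point of the same value.
std::u16string DecodeLatin1(const std::string& bytes)
{
    std::u16string out;
    for(unsigned char c : bytes)
        out.push_back(static_cast<char16_t>(c));
    return out;
}
```

The UTF-8 encoding of "é" is the two bytes 0xC3 0xA9. Fed through DecodeLatin1, those become two separate characters, U+00C3 and U+00A9 ("Ã©"), instead of the single U+00E9. A Chinese character, which takes three bytes in UTF-8, turns into three unrelated characters the same way.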
Now all that is left is the tedious task of snooping through the code looking for incompatible function calls like GetModuleFileNameExA and converting them.
The main concern will be that plugins will need to support UTF-8 and that new developers for x64dbg will have to adapt their coding a little. For plugin coders there will be conversion functions in the Bridge, but the conversion functions from the blogpost are really easy to copy-paste.
Overall, adding UTF-8 support turned out to be quite easy; the work involved is just tedious, not really hard or very annoying. It can be done in little free time by almost anyone, so feel free to submit pull requests :)