Unicode paths

Started by Ters, April 02, 2017, 12:40:35 PM

Ters

I've posted an earlier version of this patch here, but since that discussion is about something that has been completed, I'm starting a new topic for this.

The purpose of this patch is to convert the internal UTF-8 strings Simutrans uses throughout to UTF-16 when compiling for Windows, and then call the wide Windows API functions, or the similar Windows-specific extensions of the C and gzip APIs. There are some hacks in place already, but they do things very oddly and only in a few places.

I've been running Simutrans with these changes for a year, and I've never noticed problems. However, I have only done sporadic testing of running Simutrans in a directory with a non-ASCII name. There might also be recent problems due to merging in other changes, though. And the fact that Simutrans reads Unicode paths correctly does not mean that it can display the names correctly, if the font does not contain the glyphs.

Although prissi was against it, the implementation is still located in simio.cc and not simsys.c, because some of the modified code is shared between simutrans and makeobj, and dragging simsys into makeobj just causes problems.

The patch is based on revision 8169.

prissi

Simutrans works well with Unicode paths. I tested it with Japanese (for instance with a Japanese user name). It works, and Japanese file names work too: they are saved correctly and load fine. However, at least my version of bzip2 cannot open a file with UTF-16 and needs to use the short name anyway. And Linux and Mac use UTF-8, as does Simutrans internally for all display actions. So where is the advantage?

Using Windows-specific extensions in the non-OS-dependent part does not sound like a great idea to me, if it does not fix a real problem.

The display problem will not be solved by UTF-16, since the standard font does not have the needed characters. Internally everything is UTF-8 already (which would allow even for more characters than UTF-16). Changing to freefont lib will solve the display problem; then you will see the correct name even in another language.

Ters

Well, there is some strange code involving short path names and creation of non-existent files. My code just does things straight. And bzip2 does not open any files in Simutrans; it just operates on files already opened by fopen. From what I can tell, it has no idea what the file name is.

Yes, Linux and Mac use UTF-8. I said this was for Windows only; however, all path stuff goes through the new functions to keep the platform conditional compilation contained in one place (plus simsys_*.cc). Windows uses either "ANSI" or UTF-16 in its APIs. Since Simutrans is all UTF-8 internally, one must convert to one or the other. "ANSI" is deprecated. Most new parts of the API only get a Unicode implementation.
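
To make the idea concrete, here is a minimal sketch of such a wrapper. It is not the code from the patch: the names example_fopen_utf8 and utf8_to_utf16 are made up for illustration, and error handling is kept to the bare minimum. On Windows it converts the UTF-8 path to UTF-16 with MultiByteToWideChar and calls _wfopen; everywhere else it passes the UTF-8 path straight to fopen.

```cpp
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <windows.h>
#include <wchar.h>

// Hypothetical helper: convert a NUL-terminated UTF-8 string to UTF-16.
// Returns an empty string if the conversion fails.
static std::wstring utf8_to_utf16(const char *utf8)
{
	// With cbMultiByte = -1 the returned length includes the terminating NUL.
	const int len = MultiByteToWideChar(CP_UTF8, 0, utf8, -1, NULL, 0);
	if (len <= 0) {
		return std::wstring();
	}
	std::wstring wide(len, L'\0');
	MultiByteToWideChar(CP_UTF8, 0, utf8, -1, &wide[0], len);
	wide.resize(len - 1); // drop the NUL that was written into the buffer
	return wide;
}
#endif

// Hypothetical wrapper: open a file whose path is given in UTF-8.
// All platform conditional code for opening files lives in this one function.
FILE *example_fopen_utf8(const char *path_utf8, const char *mode)
{
#ifdef _WIN32
	// Windows: the narrow fopen would interpret the path as "ANSI",
	// so convert to UTF-16 and use the wide variant instead.
	const std::wstring wpath = utf8_to_utf16(path_utf8);
	const std::wstring wmode = utf8_to_utf16(mode);
	return _wfopen(wpath.c_str(), wmode.c_str());
#else
	// Linux and Mac: paths are UTF-8 already, nothing to convert.
	return fopen(path_utf8, mode);
#endif
}
```

Callers then only ever deal with UTF-8, for example example_fopen_utf8("save/\xE6\x97\xA5\xE6\x9C\xAC.sve", "rb"), and never see a wchar_t.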

prissi

gzopen did not read unicode filenames when I tested it last time. It choked on real Japanese characters. Maybe I need to test it again. (It is rarely used nowadays, only for network games, since saving defaults to bzip2. Are you sure it does the correct stuff with UTF-8 characters?)

Why the different code in win32_sound.cc?

And why changing the searchfolder? That worked well for a long time?

Ters

Quote from: prissi on April 03, 2017, 03:32:00 AM
gzopen did not read unicode filenames when I tested it last time.

Exactly! (That is, on Windows.) That is what this patch is all about. It uses gzopen_w instead.

Quote from: prissi on April 03, 2017, 03:32:00 AM
Why the different code in win32_sound.cc?

The only difference is that I didn't bother going through dr_fopen since this file is platform specific anyway. However, I did use dr_fopen in simsys_w.cc, so it's not a big deal.

Quote from: prissi on April 03, 2017, 03:32:00 AM
And why changing the searchfolder? That worked well for a long time?

All I did there was change the ifdefs to check for WIN32 and not MSC_VER. There is no reason why your Unicode improvements from 2015 should be for MSVC only.

prissi

Oh, indeed, searchfolder works only for MSVC; the MinGW builds just display garbage names. That is another very longstanding bug in the 102.2.2 release. It must have been there again for a long time. The Japanese community should have complained!

prissi

Since the DSG patch was submitted, this is also solved. Sorry, it dropped off my radar. Anyway, both patches were quite similar in their main function, i.e. providing dr_... functions for all file operations needed.

Ters

I'll be in for a fun merge job next time I fetch the latest code. Oh, well.

prissi

Sorry, but I think most differences were rather in simsys.

Ters

That might be where I potentially have other changes. They may have been reverted earlier, I don't remember. Otherwise, I could have just reverted everything first.