AROS build system and Unicode

Joined:
2005/4/1 8:40
Group:
Member
Posts: 476
Offline
Hello! I'm back after a long absence (and I hope to stay for a long time now). I've changed my job again, and here I'm not banned from using the Internet.
I'm going to set up a new environment to build Windows-hosted AROS, and I have again come across an old problem:
1. AROS build environment expects all filenames to be in Latin-1.
2. We really do have some filenames in Latin-1 (country names).
3. AROS build environment is completely unaware of Unicode in any form.
This is perfectly solved under UNIX by using 'LANG=en_US.ISO-8859-1 svn update', but there is NO way to work around it under Windows. On Windows you simply can't change a process's 8-bit codepage; it's system-wide. On the other hand, the Windows Subversion client simply doesn't care about it, because it's fully Unicode-compliant.
Friends, on your European PCs you don't really have this problem either. You use Latin-1 as your 8-bit encoding and all filenames translate to it cleanly. That is not my case, however: my 8-bit encoding is CP1251, not CP1252. In short, this means that all 8-bit software (including the AROS build environment) simply can't access files with 'foreign' letters in their names.
At first I wanted to hack an SVN client to emulate an 8-bit codepage under Windows, however:
1. It's a large task by itself.
2. It's useless effort because SVN team will never accept such patches. They'll simply say 'don't blame us, bring your project into order instead'. And they'll be right.
What do you think about making AROS build system more locale-independent?
The first point is that AROS itself doesn't understand Unicode. I think we should leave it as it is (at least for now), otherwise we'll seriously hurt compatibility. I believe we should change just the build system, which should either:
1. Be able to accept source files in any (UTF?) encoding and produce files with correct 8-bit names (as if the 8-bit encoding were Latin-1). Because the destination file will be created using the 8-bit API, it will still be accessible on non-European systems (although its name will look a bit wrong).
or:
2. Not rely on file names at all, and rely on some table instead. For example, we would explicitly specify somewhere that "french.c" should produce "français.language". The result will be the same as with (1).
Such a change should completely eliminate the problem. What do you think?
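Option 2 could be sketched roughly as follows. This is only an illustration in Python, not code from the AROS build system: the table `TARGET_NAMES`, the function `target_filename()` and the second table entry are hypothetical, and only the "french.c" to "français.language" pair comes from the post above.

```python
# Hypothetical lookup table: ASCII-only source name -> intended target name.
# Only the "french.c" entry is taken from the post; the other is illustrative.
TARGET_NAMES = {
    "french.c": "français.language",
    "english.c": "english.language",
}


def target_filename(source_name: str) -> bytes:
    """Return the target name as Latin-1 bytes, ready for an 8-bit file API.

    The source name never contains non-ASCII letters, so the host code page
    no longer matters when the build tools look the source file up.
    """
    return TARGET_NAMES[source_name].encode("latin-1")


print(target_filename("french.c"))  # b'fran\xe7ais.language'
```

The point of the design is that the non-ASCII name lives only in data, so tools that list or stat source files see plain ASCII.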
P.S. I can't present on the mailing list because GMail's ports are firewalled here and i can't receive my emails. I think i'll change email address once more.

Posted on: 2009/12/17 2:54


Re: AROS build system and Unicode

Joined:
2008/1/7 12:41
From Poland
Group:
Member
Posts: 2345
Offline
Hi Sonic,

Nice to see you back :)

Do you know what the scope of this problem is? Is it only about the language definition files?

I think both options should work, the first one taking more time but being more "correct".

Please do resubmit this on DevML once you have access - not everyone there reads Aros-exec. ;)

Posted on: 2009/12/17 6:52
_________________
Krzysztof

"There is no such thing as software for free. If it is not the user who covers the cost of software creation with money, it is the developer who covers this cost with his own free time."

www.aros3d.org
www.twitter.com/ddeadwood


Re: AROS build system and Unicode

Joined:
2005/4/1 8:40
Group:
Member
Posts: 476
Offline
This affects only language/country/etc. definitions, but that is enough to completely break the build. First, MetaMake chokes on such files in scandirnode() because it sees them but can't stat() them.
I've already started playing with this and I've modified MetaMake to ignore such files. At least it has started working. Now I'm going to see what actually happens when it comes to building the locale stuff.
For me personally it would be enough just to ignore these files and not build them. This could probably be the simplest (temporary) solution.
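The "ignore what can't be stat()ed" workaround described above might look something like this. It is a Python sketch of the logic only, not MetaMake's actual C code; `scan_dir()` and its `stat` parameter are hypothetical names. Note also that Python itself uses the Unicode file APIs on Windows, so the real failure would not reproduce here without injecting a failing `stat`.

```python
import os


def scan_dir(path, stat=os.stat):
    """List the entries of `path`, skipping any entry that cannot be stat()ed.

    Returns (usable, skipped). `stat` is injectable so the failure case can
    be exercised without a directory that really contains unreachable names.
    """
    usable, skipped = [], []
    for name in sorted(os.listdir(path)):
        try:
            stat(os.path.join(path, name))
        except OSError:
            # The entry is visible in the listing but not reachable by name:
            # remember it and carry on instead of aborting the whole scan.
            skipped.append(name)
        else:
            usable.append(name)
    return usable, skipped
```

Returning the skipped names as well makes it possible to warn about them, so the missing locale files do not go unnoticed.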

Posted on: 2009/12/18 0:12


Re: AROS build system and Unicode

Joined:
2004/4/12 13:07
Group:
Member
Posts: 496
Offline
Quote:

SonicAmiga wrote:
First, MetaMake chokes on such files in scandirnode() because it sees them but can't stat().


Isn't this a bug in MinGW then? I would think that if scandirnode() sees the file, stat() should work on it. MetaMake uses the file name it got from scandirnode() to do the stat(). Or am I missing something?

greets,
Staf.

Posted on: 2009/12/18 8:21


Re: AROS build system and Unicode

Joined:
2004/10/30 17:13
From Ireland
Group:
Member
Posts: 2073
Offline
Another option is to use English names for both source and output files, as on OS4, and translate native names to English within locale.library where necessary.

Having said that, I already compile AROS in a UTF-8 environment (OS X), and I don't have any problem building the main language files. The output files are in the wrong encoding, but that doesn't prevent me from using AROS, at least in English. The only files that stop the build are the catalogs for some third-party Zune classes, where the source files are in English and the output files use the languages' native ISO 8859 names. The problem occurs because ISO 8859 filenames can't be created on an HFS partition. So if you implemented a solution that translated UTF-8 source names to ISO 8859 output names, it would break compilation for me.

Posted on: 2009/12/20 13:55


Re: AROS build system and Unicode

Joined:
2005/4/1 8:40
Group:
Member
Posts: 476
Offline
2 Fats: you are missing one thing. Windows does not handle Unicode transparently. Instead it has a second API which works with UTF-16 (and not UTF-8). So it has to be wreaddir() instead of readdir() and wscanf() instead of scanf(). If you use the 8-bit API, it transparently converts strings from the ACP (the currently selected "ANSI" code page) to Unicode and back. If your Unicode string has some characters that can't be converted to the ACP (because they are missing from your local code page), "best match" characters are used. So "français" becomes "francais" during readdir(), and then scanf() can't find it because there's actually no "francais".
It's not a bug, it's a feature of MinGW, which is an 8-bit environment that doesn't make use of Unicode in any way.
I could fix MetaMake to work with Unicode on Windows, but this won't make things better because GNU make will not accept Unicode names. Overcoming this means serious changes in the MSYS core and is outside the scope of my task.
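To make the failure mode concrete, here is a rough Python imitation of the lossy conversion (with stat() as the lookup step, per the correction later in the thread). It does not call any Windows API: `best_fit()` is a hypothetical helper that assumes accents are simply dropped when the code page lacks the letter, and the real Windows mapping tables differ in detail.

```python
import unicodedata


def best_fit(name: str, codepage: str = "cp1251") -> str:
    """Roughly imitate a lossy "best match" conversion to an 8-bit code page.

    Assumption: a letter missing from the code page is replaced by its base
    letter with the accent dropped. This is an approximation for illustration.
    """
    out = []
    for ch in name:
        try:
            ch.encode(codepage)
            out.append(ch)
        except UnicodeEncodeError:
            # Decompose (e.g. "ç" -> "c" + combining cedilla), keep what fits.
            decomposed = unicodedata.normalize("NFKD", ch)
            kept = decomposed.encode(codepage, errors="ignore")
            out.append(kept.decode(codepage))
    return "".join(out)


on_disk = {"français.language"}            # the real (Unicode) directory entry
seen_by_8bit_api = best_fit("français.language")

print(seen_by_8bit_api)                    # francais.language
print(seen_by_8bit_api in on_disk)         # False: the lookup by name fails
```

The directory listing hands back a name that no longer matches any real entry, which is why the later lookup by name fails.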
2 Neil: as you can see this is completely different from MacOS (and any other UNIX). Under UNIXes there's no second special API for dealing with Unicode. UNIXes simply use the same functions with UTF-8 character set. This is why you get some broken names but no problems.

Posted on: 2009/12/22 0:11


Re: AROS build system and Unicode

Joined:
2005/4/1 8:40
Group:
Member
Posts: 476
Offline
Oops... I meant wstat() and stat() instead of wscanf() and scanf()... :)

Posted on: 2009/12/22 2:59


Re: AROS build system and Unicode

Joined:
2004/10/30 17:13
From Ireland
Group:
Member
Posts: 2073
Offline
I see. Would you like me to make a proposal on the dev list to switch completely to English/7-bit names (if you don't have access to the list ATM)?

Posted on: 2009/12/22 16:27


Re: AROS build system and Unicode

Joined:
2005/4/1 8:40
Group:
Member
Posts: 476
Offline
Yes, I would like you to do it. Thanks.
BTW, do you know what happened to the Windows-hosted nightly build? I see it was last performed long ago. I tried to log in to the machine via ssh, but the machine seems to be dead. At least accessing http://www.falloutshelter.de does not work either.

Posted on: 2009/12/23 1:00


Re: AROS build system and Unicode

Joined:
2005/4/1 8:40
Group:
Member
Posts: 476
Offline
I've just committed my changes in Workbench/Locale/Countries. Now country source files are named only in ASCII, while the resulting files are still created with correct Latin-1 names.
Neil: I guess you'll have to further fix makecountry to recognise a host OS operating with UTF-8 and convert file names to it. There is some incomplete support for it there.
The technique I used for makecountry can probably be adapted to the other places where it is needed. The main idea is to use the file contents (instead of the file name) to find out the corresponding target name.
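The idea of taking the target name from the file contents could be sketched like this. It is purely illustrative: the `target-name:` comment marker and the function `target_name_from_contents()` are hypothetical and are not how makecountry actually stores or reads the name. The sketch also assumes the caller has already decoded the file into text.

```python
import re

# Hypothetical marker line, e.g.  /* target-name: français */
# The real makecountry sources may record the name in a different way.
NAME_LINE = re.compile(r"^\s*/\*\s*target-name:\s*(.+?)\s*\*/\s*$")


def target_name_from_contents(text: str, default: str) -> str:
    """Return the target name declared inside the source text.

    Falls back to `default` (for example the ASCII source name) when the
    text carries no marker line.
    """
    for line in text.splitlines():
        match = NAME_LINE.match(line)
        if match:
            return match.group(1)
    return default


source = "/* target-name: français */\nint dummy;\n"
print(target_name_from_contents(source, "french"))  # français
```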
As to UTF-8... I guess the rules should be simple:
1. If host OS uses 8-bit encoding, we assume that it's Latin-1 and do no conversion.
2. If host OS uses UTF-8, we should know that we get file names in Latin-1 and convert them to UTF-8.
The same rules apply to emul.handler on hosted AROS. I think this would completely eliminate our problem.
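The two rules above could be written down as a small function. This is a sketch in Python under the assumptions stated in the post; `host_filename()` and its `host_uses_utf8` flag are hypothetical names, and how the host encoding is actually detected is left open.

```python
def host_filename(latin1_name: bytes, host_uses_utf8: bool) -> bytes:
    """Convert a Latin-1 file name into the bytes the host file API expects."""
    if not host_uses_utf8:
        # Rule 1: an 8-bit host is assumed to be Latin-1, so pass it through.
        return latin1_name
    # Rule 2: a UTF-8 host needs the Latin-1 name re-encoded as UTF-8.
    return latin1_name.decode("latin-1").encode("utf-8")


print(host_filename(b"fran\xe7ais", host_uses_utf8=False))  # b'fran\xe7ais'
print(host_filename(b"fran\xe7ais", host_uses_utf8=True))   # b'fran\xc3\xa7ais'
```

Every byte value is valid Latin-1, so the decode step in rule 2 cannot fail, whatever name comes in.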

Posted on: 2009/12/23 2:28



© 2004-2014 AROS Exec