Tuesday, January 1, 2013

JNI and Modified UTF-8

In Modified UTF-8, the null character (U+0000) is encoded as 0xC0,0x80; this is not valid UTF-8 because it is not the shortest possible representation. Modified UTF-8 strings never contain any actual null bytes but can contain all Unicode code points including U+0000, which allows such strings (with a null byte appended) to be processed by traditional null-terminated string functions. (Wikipedia)
A lot of NDK/JNI samples use GetStringUTFLength when allocating memory for a new native string. Even this great book by Sylvain Ratabouil does it. That works well with simple English words, where every character takes exactly one byte. I used it with a small Estonian word list, and of course it failed on accented letters.
So, to make it clear once more: according to the JNI functions reference, GetStringUTFLength returns the length in bytes of the modified UTF-8 representation of a string, and GetStringLength returns the count of Unicode characters (UTF-16 code units) in the string. For the string 'täht', GetStringUTFLength returns 5, because 'ä' takes two bytes, and GetStringLength returns 4. So 4 is the right count for a native jchar array filled with GetStringRegion, while a char buffer holding the modified UTF-8 bytes needs 5 bytes plus one more for the terminating null.
 
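/* number of UTF-16 code units: 4 for "täht" */
const jsize unicode_length = (*pEnv)->GetStringLength(pEnv, lString);
/* number of modified UTF-8 bytes, terminator not included: 5 for "täht" */
const jsize utf8_length = (*pEnv)->GetStringUTFLength(pEnv, lString);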
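
To see where the difference comes from without a JVM at hand, here is a minimal sketch that counts modified UTF-8 bytes for an array of UTF-16 code units. The function name modified_utf8_length is my own and is not part of JNI; it only follows the encoding rules of Modified UTF-8 (U+0000 takes two bytes, and each half of a surrogate pair takes three), so treat it as an illustration of what GetStringUTFLength counts, not as its actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of JNI: returns how many bytes the
   modified UTF-8 encoding of `count` UTF-16 code units would take,
   without the terminating null byte. */
size_t modified_utf8_length(const uint16_t *units, size_t count)
{
    size_t bytes = 0;
    size_t i;

    assert(units != NULL || count == 0);

    for (i = 0; i < count; i++) {
        uint16_t u = units[i];

        if (u >= 0x0001 && u <= 0x007F) {
            bytes += 1;   /* plain ASCII, except U+0000 */
        } else if (u <= 0x07FF) {
            bytes += 2;   /* U+0080..U+07FF, and U+0000 as 0xC0 0x80 */
        } else {
            bytes += 3;   /* rest of the BMP; each surrogate half separately */
        }
    }
    return bytes;
}
```

For 'täht' the code units are 0x74, 0xE4, 0x68, 0x74, so the result is 5 while the unit count stays 4. A char buffer for the converted string would then need that result plus one byte for the terminator.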


Happy New Year, btw :)

Sunday, December 30, 2012

Crystax. Just because Google can't.

Today, running through habrahabr search results for 'ndk', I found an almost year-old article about Crystax, an "improved Android NDK". OK, I added it to the 'android' folder in my Favourites. A few hours later, trying to port existing C code to Android with NDK r8d, I got an undefined reference to the 'wctomb' function. It seemed to exist in the included headers, but the build failed. Adding
LOCAL_ALLOW_UNDEFINED_SYMBOLS := true
resulted in a successful build and a runtime error on the device. I started googling and found a lot about missing wide-character support in the official NDK. And then I came back to the Crystax page.
Features supported by CrystaX NDK:
1. Wide characters.
Google's NDK doesn't support wide chars properly - neither in C or C++. Using CrystaX NDK you get full standard compliant wide characters support. You can easily port existing code which use wide characters/strings/streams or write new one.
Nice :) It compiles, it works without exceptions.

And yes, I will try to write more about my programming adventures.