There is now a link to a Kanji page at aule-browser.com.
It is not as varied as my own personal Kanji aule-page, but I will make one of those available soon ...
And yes, in some Japanese temples there is a penultimate inner passageway about the core that might be termed an "aule" of sorts ...
At this time the page links to 2 Curl applets for reviewing sets of Kanji: the Henshall set and the Heisig set (both extracted from the kanjidic2 XML Japanese-English dictionary).
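For readers who want to pull the same two sets out of kanjidic2 themselves, here is a minimal sketch. It uses Python's standard `xml.etree.ElementTree` rather than Curl, so it is not the code behind the applets. The sample XML is hand-made, its index numbers are placeholders rather than real Heisig or Henshall numbers, and `kanji_set` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-made sample in the kanjidic2 layout. The index numbers are
# placeholders for illustration, not real Heisig or Henshall numbers.
SAMPLE = """<kanjidic2>
  <character>
    <literal>日</literal>
    <dic_number>
      <dic_ref dr_type="heisig">1</dic_ref>
      <dic_ref dr_type="henshall">2</dic_ref>
    </dic_number>
  </character>
  <character>
    <literal>月</literal>
    <dic_number>
      <dic_ref dr_type="heisig">3</dic_ref>
    </dic_number>
  </character>
  <character>
    <literal>亜</literal>
  </character>
</kanjidic2>"""


def kanji_set(root, dr_type):
    """Return sorted (index, literal) pairs for characters that carry
    a dic_ref element of the given dr_type."""
    pairs = []
    for character in root.iter("character"):
        ref = character.find(f"dic_number/dic_ref[@dr_type='{dr_type}']")
        if ref is not None:
            pairs.append((int(ref.text), character.findtext("literal")))
    return sorted(pairs)


root = ET.fromstring(SAMPLE)
heisig = kanji_set(root, "heisig")
henshall = kanji_set(root, "henshall")
```

Characters without a matching `dic_ref` are simply skipped, which is how the third sample entry drops out of both sets.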
Friday, April 27, 2012
Thursday, April 19, 2012
Kanjidic2 codepoint
Over at aule-browser I have added an HTML page which correctly displays the 13,108 characters in the Kanjidic2 according to the UCS codepoint found in each XML 'character' element.
The XML was parsed using an XPath expression with the Curl XML Document Model library.
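The same idea can be sketched outside Curl. The following is a rough Python equivalent using the standard `xml.etree.ElementTree` module and its limited XPath support; it is an illustration, not the Curl code used for the page. The sample XML is hand-made and abbreviated, and `ucs_codepoints` and `html_page` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Hand-made, abbreviated sample in the kanjidic2 layout.
SAMPLE = """<kanjidic2>
  <character>
    <codepoint>
      <cp_value cp_type="ucs">4e9c</cp_value>
      <cp_value cp_type="jis208">1-16-01</cp_value>
    </codepoint>
  </character>
  <character>
    <codepoint>
      <cp_value cp_type="ucs">fa6a</cp_value>
    </codepoint>
  </character>
</kanjidic2>"""


def ucs_codepoints(root):
    """Collect the UCS codepoint of every character element as an int."""
    path = "./character/codepoint/cp_value[@cp_type='ucs']"
    return [int(cp.text, 16) for cp in root.findall(path)]


def html_page(codepoints):
    """Build a small HTML page that shows each codepoint as a
    hexadecimal numeric character reference."""
    cells = " ".join(f"&#x{cp:X};" for cp in codepoints)
    return (
        '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
        "<title>Kanjidic2 by codepoint</title></head>\n"
        f"<body><p>{cells}</p></body></html>\n"
    )


root = ET.fromstring(SAMPLE)
codepoints = ucs_codepoints(root)
page = html_page(codepoints)
```

Writing numeric character references keeps the HTML file itself plain ASCII, so what displays depends only on the browser's fonts and not on how the file was encoded.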
The Curl company's website in Japan
Labels: Curl programming language, Curl RTE, HTML, Kanji, kanjidic2, learn Japanese, learn Kanji, parse, UCS, UNICODE, XML, XPath
Posted at 12:45 PM
Wednesday, April 18, 2012
Kanjidic2 UTF-8 character literal
Today I found that not all characters in the kanjidic2 XML file could be parsed and displayed as UTF-8 – at least not in the fonts available to me in IE.
I have posted a UTF-8 web page with the characters that did display, after using Curl's XML path library to extract the content of the "literal" element of each "character" element.
Parsing gave a total of 13,108 elements when I was expecting fewer than 7000. Issues only arose with the last thousand or so elements - and a few in that tail-end did display. Problems start after the first 12,166 elements.
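One way to see where the trouble starts is to count the literals and tally them by Unicode block. This is a sketch in Python rather than Curl, on a hand-made five-character sample. That display problems follow block boundaries is only my guess, and `block_of` and `tally` are hypothetical names.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Unicode block ranges (inclusive) relevant to kanji.
BLOCKS = [
    (0x3400, 0x4DBF, "CJK Unified Ideographs Extension A"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
    (0xF900, 0xFAFF, "CJK Compatibility Ideographs"),
    (0x20000, 0x2A6DF, "CJK Unified Ideographs Extension B"),
]

# Hand-made sample; escapes are used so the source stays ASCII.
SAMPLE = (
    "<kanjidic2>"
    "<character><literal>\u4e9c</literal></character>"
    "<character><literal>\u5897</literal></character>"
    "<character><literal>\u3402</literal></character>"
    "<character><literal>\ufa6a</literal></character>"
    "<character><literal>\U00020089</literal></character>"
    "</kanjidic2>"
)


def block_of(ch):
    """Name the block a single character falls in, or 'other'."""
    cp = ord(ch)
    for low, high, name in BLOCKS:
        if low <= cp <= high:
            return name
    return "other"


def tally(root):
    """Return the number of literals and a per-block count."""
    literals = [c.findtext("literal") for c in root.iter("character")]
    return len(literals), Counter(block_of(lit) for lit in literals)


total, per_block = tally(ET.fromstring(SAMPLE))
for name, count in per_block.most_common():
    print(f"{count:6d}  {name}")
```

Run against the full file, a tally like this would show how many of the 13,108 literals fall outside the basic CJK Unified Ideographs block, which is where font coverage tends to thin out.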
In the last thousand elements, only the following few literal values would display:
匇 匤 咊 增 寬 嵓 德 晥 栁 橫 瀨 炻 甁 皞 礰 竧 綠 緖 荢 薰 譿 賴 郞 鄕 霻 靍 馞 魲 黑 朗 隆 﨏 塚 﨑 﨓 凞 猪 神 祥 福 﨟 蘒 﨡 諸 﨤 都
If you have been parsing the kanjidic XML dictionary and have an idea, please drop a line.
UPDATE
Using a tail view of kanjidic2.xml I can see that the last UCS codepoint is FA6A which displays correctly in my browser as 頻.
I will revise my parse path to pull the codepoint instead of the literal.
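A sketch of that revision, in Python rather than Curl: build each character from its UCS codepoint and compare it with the parsed literal. The sample is hand-made, the third entry carries a deliberately damaged literal ("?") to simulate a character lost along the way, and `from_codepoint` and `mismatches` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Hand-made sample; the third literal is deliberately damaged ("?").
SAMPLE = (
    "<kanjidic2>"
    "<character><literal>\u4e9c</literal>"
    "<codepoint><cp_value cp_type='ucs'>4e9c</cp_value></codepoint></character>"
    "<character><literal>\ufa6a</literal>"
    "<codepoint><cp_value cp_type='ucs'>fa6a</cp_value></codepoint></character>"
    "<character><literal>?</literal>"
    "<codepoint><cp_value cp_type='ucs'>5897</cp_value></codepoint></character>"
    "</kanjidic2>"
)


def from_codepoint(character):
    """Build the character from the hexadecimal UCS cp_value."""
    hex_value = character.findtext("codepoint/cp_value[@cp_type='ucs']")
    return chr(int(hex_value, 16))


def mismatches(root):
    """List (position, literal, rebuilt) where literal and codepoint disagree."""
    found = []
    for position, character in enumerate(root.iter("character")):
        literal = character.findtext("literal")
        rebuilt = from_codepoint(character)
        if literal != rebuilt:
            found.append((position, literal, rebuilt))
    return found


root = ET.fromstring(SAMPLE)
rebuilt_chars = [from_codepoint(c) for c in root.iter("character")]
problems = mismatches(root)
```

The codepoint is plain ASCII hex inside the file, so it survives even if a literal is mangled somewhere between the file and the page.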
Wednesday, April 11, 2012
Bloated Desktop Applications
While reviewing my options for caching some graphic containers in Curl, I checked my process explorer.
My browser was creeping up to 1.8 GB as a busy, greedy process on an XP Pentium with 1 GB RAM.
The browser is not the only offender: the otherwise useful Evernote appears to lack some smarts (remember when Prolog simply required too much memory?).
Let the OS worry about it? Throw more memory in the device? Ascend to the free cloud?
Is there now a generation of Perl/PHP scripters working in HTML + JavaScript for whom memory, CPU cycles and swapping to disk are free? Are they avoiding multiple trips this way? Is this the AJAX heritage? SSD merely warm, but not whirring?
Just as it appears that Rebol may disappear.
Remember why Smalltalk was not in use (outside Amex and JP Morgan and ?): it required a VM and too much memory. Now, as some Smalltalk implementations shrink and their VMs enter a new generation, ask a corporate programmer when she last prototyped an app in a Smalltalk dialect.
I can hear it ringing in my ears: "All we need is one little Perl script to do that !" But CGI with Perl was too costly on servers, remember? But now it appears that nothing is too costly on a client device.
But free memory, like free DASD, is only as believable as free money, or 33% annual return on your investment if you get in early.
Where is the desktop UNICODE-savvy text editor app that knows how to edit multiple small files without burping up a 30MB or more process for each tiny text file?
These apps are the bungalows of the sprawling suburbs of 50's Middle-America - and now they propose to bloat invisibly within the cloud. Is this a viable global picture of net resource consumption? Which economic model suggests greater bandwidth as evolution's competitive winning variant?
Maybe kids running Rebol on Raspberry Pi knock-offs will get us on another track.
Try this: compare some smart e-commerce sites to the UIs of Google PLAY or Canada's zip.ca.
Those UIs are not JavaScript generated by Smalltalk - or by Prolog. These sad web pages must each re-evolve from dumb and primitive. These are not smart lean apps standing on the shoulders of bulky slow apps. That is just not how the web is evolving.
PS
Ruby is just a dumbed-down Smalltalk for Perl scripting.
Scala ? Let me tell you a story that is not up-Lift'ing ...
Multi-core? Watch for MC Smalltalk. But bloat and greedy processes remain what they are just as a Ponzi-scheme by any name remains your personal goldmine .... and your cousin's, too !