-----------------------------------------------------------------------
PROPOSAL FOR A COMMON APL ENCODING                            D R A F T 
March 29, 1995

APL has long been a single-platform programming language.  When moving APL
code to another platform, one immediately encounters the problem of
incompatible character sets and encodings.  In recent years, computing has
migrated from mainframes to workstations and personal computers, and the
"life-cycles" of computers and operating systems have shrunk to a matter of
only 2-3 years.  As a result there has been an increasing need to move APL
code.  Furthermore, the greater connectedness of both computers and users
through the "Information Superhighway" has increased the desirability and
necessity of publishing and sharing APL code.  At present we are fortunate
to have two very fine international journals, VECTOR of the British APL
Association and Quote-Quad of ACM SIGAPL.  Both provide timely and informative
articles on the use of APL.  It is quite clear, however, that the (near)
future will be "electronic" -- as Europe deregulates telecommunications
many more APLers will go "on-line", as they have in North America.  The
newsgroup "comp.lang.apl" on the Internet has a large following of several
hundred APLers and an impressive archive of discussions and articles going
back many years.

"Traffic" has grown from 500K in 1989 to more than 5M/yr, so in
some ways c.l.a. has already become the world's most active APL (and J)
user group!  During those years several "transliterations" and "ASCII APLs"
have been proposed and used, but it remains unclear whether APL glyphs can
survive the Information Age.  It would be disappointing if the character set
we love (or hate) never made it to the World Wide Web.  Well, I am a firm
believer that ASCII APL can be elegant and beautiful, but also that APL
glyphs can co-exist with, even flourish under, modern computing.  So why not
try both approaches?  

I have used APL*PLUS/Mac, Dyalog/SunView, APL2/RS6000, APL.68000/Mac, and APL!
(APL "bang", my own ASCII rendition of old APL\11 that runs on Unix) for my
research on pattern matching algorithms and computational biology.  I have had
to hack custom screen and printer fonts and keyboard mappings for all of them
-- in order to enter, view, print, and move APL outside of the interpreter.  
I am tired of it and want to do it one last time, for keeps!  I think I have
solved most of the technical problems, and designed a "neutral" character set
encoding that is compatible with modern computers and networks.  Indeed I have
implemented it fully on the Mac and Sun (xterm), and Steve Halasz and I are
working on the PC.  I am confident Steve's APL95NET project will be
successful, and SIGAPL's WWW homepage will soon have APL articles, tutorials,
and more.

WWW's HTTP file transfer protocol and most Internet news servers are
compatible with 8-bit characters; the user simply has to download and
install the appropriate font in order to see APL.

In the October 94 issue of VECTOR, Adrian Smith proposed a common
APL encoding for Microsoft Windows that gives up underscore-caps for (most of
the) lowercase national characters.  In the January 95 issue, he and others
commented that Microsoft Word insists on treating certain positions in the
range 128-159 as special typographical symbols (quotes, dashes, bullet, etc.). 
My main critique of his encoding (compatible with APL*PLUS) is that it uses
the range 128-159 for indispensable APL characters, while many network
gateways and modem/terminal programs either treat these as 0-31 control
characters or mishandle them in other ways.  (Another problematic character is
255, sometimes treated as 127 "delete".)  For example, xterm (X windows) will
not display 0-31 and 128-159.  The well-established ISO Latin-1 (8859-1)
encoding avoids 128-159 entirely.  Almost all modern computers already support
this 8-bit standard, and mail/news/web have come very close as well.  As a
result, we can expect 8-bit characters (at least 160-254/255) to be freely
usable and transmittable.  I believe APL can hitch a ride with very little
effort.  
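
The range argument above can be sketched in a few lines of Python.  This is
only an illustration under stated assumptions: the function names and the
category labels ("unsafe", "doubtful", etc.) are hypothetical and are not
part of the proposal; only the numeric ranges come from the text.

```python
def classify_code(code):
    """Classify an 8-bit character code by the ranges discussed in the text.

    The labels are illustrative names only (an assumption of this sketch).
    """
    if not 0 <= code <= 255:
        raise ValueError("expected an 8-bit code (0-255)")
    if code <= 31:
        return "control"    # 0-31: control characters
    if code <= 126:
        return "ascii"      # 32-126: printable ASCII
    if code == 127:
        return "delete"     # 127: "delete"
    if code <= 159:
        return "unsafe"     # 128-159: often treated as 0-31 or mishandled
    if code <= 254:
        return "safe"       # 160-254: the region ISO Latin-1 also uses
    return "doubtful"       # 255: sometimes treated as 127 "delete"


def risky_offsets(data):
    """Return offsets of bytes that fall in 128-159 or at 255."""
    return [i for i, b in enumerate(data)
            if classify_code(b) in ("unsafe", "doubtful")]
```

For example, risky_offsets(bytes([65, 200, 130, 255])) reports offsets 2 and
3, the two bytes a gateway or terminal program might mangle.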

My encoding places all (function) glyphs defined in the APL
Extended Standard (Draft) in the "safe" region 160-254.  Additionally, this
region contains nine box-drawing characters (used together with slightly
elongated ASCII - and |), a full set of lowercase national characters (as in
Adrian's proposal) including the "oe", and symbols for the pound and yen. 
That leaves just one position, filled with "squish-quad" (index in APL2). 
Position 160 is nonbreaking-space (required by Microsoft) and position 255
(sometimes problematic) is I-beam.

The range 128-159 contains the leftover APL glyphs and typographic characters
(quotes, dashes, bullet, etc.), some of which are suspected to be required by
Microsoft.  The underlined alphabetics are in the Extended Standard but not in
this encoding.  However, they can be represented using underline "style".  The
nonstandard glyphs I-beam, hoof (upon in Sharp APL), iota_ (search in Sharp
APL), epsilon_ (find in APL2), 0~ (iota 0 in Dyalog APL), unequal_ (unmatch in
Dyalog APL), and four quad-arrow characters (file handling in APL.68000) can
also be shown using underlining if necessary (hoof as o_, 0~ as 0_, quad-arrow
as arrow_), although "cover functions" are much preferable.

Again, the basic motivation is to put all "essential" APL characters outside
of regions with known network/terminal/Microsoft incompatibilities. 

 * ISO Latin-1 characters (quote marks << >> up! up? have been moved)
 $ Microsoft typographic characters

 128 | box   $| up!   *| low'  $| folder$| low"  $| ...   $| <<   *| >>    *|
 136 | high^ $| \quad  | oquad  | <     $| <-quad | ->quad |^quad  | vquad  |
 144 | up?   *| left' $| right'$| left" $| right"$| bullet$|n-dash$| m-dash$|
 152 | high~ $| diaer .| hoof   | >     $| 0~     | /=_    |iota_  | eps_   |
 160 | nbsp  $| times  | rotate | pound *| diamond| yen   *|gradeup| format |
 168 | each  *| comment| execute| divide | /-     | \-     |gradedn| high- *|
 176 | or     | boxLL  | boxLM  | boxLR  | boxML  | boxMM  |boxMR  | boxUL  |
 184 | boxUM  | boxUR  | rotate-| nand   | <=     | /=     | >=    | nor    |
 192 | comma- | alpha  | decode | cap    | floor  | member |delta_ | del    |
 200 | delta  | iota   | jot    | qq     | quad   | match  |encode | circle |
 208 | log    | divquad| rho    | ceiling| transp | drop   | cup   | omega  |
 216 | discl  | take   | enclose| -|     | <-     | index  | ->    | |-     |
 224 | a`    *| a'    *| a^    *| a~    *| a"    *| ao    *| ae   *| c,    *|
 232 | e`    *| e'    *| e^    *| e"    *| i`    *| i'    *| i^   *| i"    *|
 240 | del~   | n~    *| o`    *| o'    *| o^    *| o~    *| o"   *| oe     |
 248 | o/    *| u`    *| u'    *| u^    *| u"    *| commute| rank  | I-beam |
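
As a hedged sketch of how the table might be used in software, the Python
fragment below transcribes four rows (positions 192-223) into a lookup and
names each byte of an encoded string.  The names ROWS, NAMES and describe
are hypothetical, chosen for this illustration; the glyph labels are the
abbreviations used in the table, and the remaining rows would be filled in
the same way.

```python
# Four rows of the table, keyed by the position of their first column.
ROWS = {
    192: ["comma-", "alpha", "decode", "cap",
          "floor", "member", "delta_", "del"],
    200: ["delta", "iota", "jot", "qq",
          "quad", "match", "encode", "circle"],
    208: ["log", "divquad", "rho", "ceiling",
          "transp", "drop", "cup", "omega"],
    216: ["discl", "take", "enclose", "-|",
          "<-", "index", "->", "|-"],
}

# Flatten to a position -> label mapping (192..223 only in this sketch).
NAMES = {base + i: name
         for base, row in ROWS.items()
         for i, name in enumerate(row)}


def describe(data):
    """Name each byte: ASCII as itself, known glyphs by their table label."""
    out = []
    for b in data:
        if 32 <= b <= 126:
            out.append(chr(b))
        else:
            # Positions not transcribed above are shown as ?<number>.
            out.append(NAMES.get(b, "?%d" % b))
    return out
```

For example, describe(bytes([65, 220, 201, 53])) yields the labels
A, <-, iota, 5.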


SOFTWARE AVAILABILITY

I can provide the following upon request (some are not quite
finished):

o Macintosh screen font and custom keyboard mapping (drag-and-drop into 
  System) that work in all applications.
o The popular shareware terminal program Zterm 0.9, patched to use APL font.
o Instructions for making APL the system font (instead of Monaco), so the 
  Mac becomes an "APL machine".
o Sun (X windows) screen font (BDF format) and keyboard mapping (xterm 
  resources) that can be used in all xterm-based programs, such as the
  command shell, unix utilities, and text editors.
o Simple installation instructions.
o A re-encoding of Adrian Smith's APL2741 (Type 3) PostScript font.
o Printing instructions and utilities.
o Dyalog APL keyboard and output translation tables.
o (work in progress) Fonts and keyboard mapping for the PC, based on either
  Dyadic's utilities or xwindemo (BDF->FON converter) and WinKeySwap.

The custom keyboard is bona-fide APL -- capslock toggles between
"unified" and "traditional" APL keyboards (overstrikes are entered using the
Alt/Option key).  The font originated in 1987 as a hand-edited, highly
optimized version of Courier/APL*PLUS that has the same bitmap size as the
Mac's system font.

Over the years it has been revised and re-encoded for APL2 (RS/6000), VAXAPL,
and APL.68000 (at least).  Also, IBM has generously given us permission to use
its APL fonts.

ACKNOWLEDGMENTS (...)

CONTACT

William Chang
ACM SIGAPL Local Liaison
8 Gary Place, Huntington, NY 11743  USA
Tel: 516-367-8866
Fax: 516-367-8461
Net: wchang@acm.org
