Notes

Mapping

Mapping requires some more explanation. For starters, each mapping program...

The last three are created by mapsetup, which can be used in one of two ways. It always calculates the transformation, based either on the coordinate file or on the clipping path. When you use the latter option, it will also create the clipping path as PostScript, and possibly the

Separately from that, you need a map configuration file, which refers to all the required files and any optional files, and contains settings that control drawing details. (See this for details.)

Because the intermediate files are somewhat bothersome, mapping of all the various types is combined into one page. This does mean you can't, e.g., alter the PostScript, but if you can write PostScript you can probably run the package yourself.

Notes on Unicode:

To analyse Unicode data, you should probably convert it to X-SAMPA, since all current models expect X-SAMPA. It is technically possible to make a model that reads some Unicode encoding directly, but you may run into byte ordering and other such problems, and will likely end up largely duplicating an existing configuration.

My experimentation led me to use Excel's `Unicode Text' format, which seems to export as UTF-16 (UTF-16LE, I believe, judging from the single BOM at the start of the file; I couldn't find any documentation on this at all, mind). I assume this means the file is a single string encoded as UTF-16. The proper way of handling it would then be to decode to unpacked Unicode codepoints (UCS) on reading and apply a [ codepoint → X-SAMPA ] string mapping from there.
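A minimal sketch of that reading step in Python, under a few assumptions: the function names and the mapping table are made up for illustration and are not the existing code, the table covers only a handful of IPA symbols, and anything not in the table is passed through unchanged.

```python
import codecs

# Hypothetical, deliberately tiny [ codepoint -> X-SAMPA ] table.
# A real table would need to cover the whole symbol inventory in the data.
IPA_TO_XSAMPA = {
    0x0283: "S",  # ʃ
    0x014B: "N",  # ŋ
    0x0259: "@",  # ə
    0x025B: "E",  # ɛ
    0x02D0: ":",  # ː (length mark)
}


def decode_utf16(raw):
    """Decode UTF-16 bytes to a str (a sequence of codepoints).

    If a BOM is present, the generic 'utf-16' codec consumes it and picks
    the byte order. Without a BOM, little-endian is assumed here, which
    matches what the Excel export appeared to produce.
    """
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return raw.decode("utf-16")
    return raw.decode("utf-16-le")


def to_xsampa(text):
    """Map each codepoint through the table; unknown ones pass through."""
    return "".join(IPA_TO_XSAMPA.get(ord(ch), ch) for ch in text)


def read_as_xsampa(path):
    """Read a UTF-16 text file and return its contents as X-SAMPA."""
    with open(path, "rb") as handle:
        return to_xsampa(decode_utf16(handle.read()))
```

Working on decoded codepoints means byte order is dealt with once, at the point of reading, and the mapping itself never sees bytes.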

The current code that effectively does this needs cleaning up. It should in fact be entirely simple in Python, but the current implementation reads byte pairs and treats each pair as a codepoint. I seem to have gotten away with the 'UTF-16 word' == 'UCS codepoint' assumption on the IPALA data I handled because it doesn't use any particularly large (escaped) codepoints, i.e. none above U+FFFF, which UTF-16 encodes as two words (a surrogate pair). This is not the way to go if any other Unicode handling is ever added.
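To illustrate where the assumption breaks, here is a small sketch comparing the two readings of the same bytes. The helper names are hypothetical and this is not the current implementation; it only contrasts raw 16-bit words with properly decoded codepoints.

```python
import struct


def utf16le_words(raw):
    """Split little-endian bytes into raw 16-bit words (the byte-pair view)."""
    count = len(raw) // 2
    return list(struct.unpack("<%dH" % count, raw[:count * 2]))


def utf16le_codepoints(raw):
    """Decode properly, so surrogate pairs are combined into one codepoint."""
    return [ord(ch) for ch in raw.decode("utf-16-le")]


# Codepoints up to U+FFFF: both views agree, which is why IPA data works.
small = "ʃə".encode("utf-16-le")
print(utf16le_words(small) == utf16le_codepoints(small))

# A codepoint above U+FFFF (here U+1D11E) takes two words in UTF-16.
large = "\U0001D11E".encode("utf-16-le")
print([hex(w) for w in utf16le_words(large)])
print([hex(c) for c in utf16le_codepoints(large)])
```

For the large codepoint the byte-pair view yields two surrogate values, neither of which is a real character, so any table lookup on them would miss.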