Typing Unicode characters

Today I discovered a handy shortcut you can use to type accented letters and other Unicode characters without messing about with language-specific keyboards. All you need to do is type the decimal value of the character you want, then press Alt and x, and it should change into the character.

For example, if you’re writing a bit of Romanian and want to type the t with a comma below (ţ), type 0163 and then Alt-x. You can find the character codes in the Character Map or in BabelMap. This works in Word and WordPad in Windows XP and Vista, though unfortunately it doesn’t seem to work in other programs.

I also found a useful site you can use to type in a variety of languages from Czech to Welsh.

Do you know of any other ways to input Unicode characters? I normally use BabelMap.

This entry was posted in Language.

18 Responses to Typing Unicode characters

  1. Arakun says:

    In Mac OS X there’s the US Extended keyboard layout. If you press the alt key, half of the keys on the keyboard will produce different diacritics (e.g. for ‘č’ you’d type alt-v followed by a ‘c’ and for ‘ů’ you’d type alt-k followed by ‘u’). The rest of the keys produce various exotic characters (e.g. alt-t for ‘þ’). Many more characters can be found by pressing shift-alt-; or shift-alt-. (shift-alt-; followed by ‘w’ will produce ‘ƿ’, the letter wynn).

    There’s also a Norwegian Extended layout which I modified into a Swedish layout using an editor called Ukelele. Keyboard layouts can be found in the System Preferences: click Language & Text, and then click Input Sources.

    For non-latin characters I use the Character Viewer which can be reached by pressing alt-command-t.

  2. LAttilaD says:

    This requires you to remember the numbers. A real solution must be much easier.
    I’m always working on utilities for real Unicode keyboard support. A tiny, undocumented one is located at http://turkalofile.lattilad.org/sajat_programok/exent.zip . It has an unlimited number of dead keys, and any of them can turn any character into any other character. The text file in the zip configures it. Start the exe and press letters followed by function keys like A, F2, F4.
    I’m not using it, by the way. I prefer some utilities I haven’t published yet. If anybody is interested in Unicode keyboards, contact me at http://postoffice.lattilad.org .

  3. LAttilaD says:

    PS. I’m talking about Windows XP.

  4. Delodephius says:

    I use the Microsoft Keyboard Layout Creator and I have modified and created several keyboard layouts for myself. For example, I have modified the Slovak keyboard by adding all the letters necessary for transcribing Old Church Slavonic as well as expanding the “dead” keys that add letters with diacritics. I have created Old Cyrillic, Glagolitic, Gothic, Old Irish, Avestan, Bosnian Arabic, Runic, Mongol, Lycian, Lydian and Old Italic layouts, though I only use a few of those.

  5. Petréa Mitchell says:

    When I need to know a Unicode number, I just go to Unicode’s own official list.

    When I create Web pages, I use the HTML-to-Unicode escape sequence: type “&#x”, then the Unicode number, then “;”. So the Romanian character above is rendered to (spaces added so you can see it) & # x 0 1 6 3 ;.

    For comment forms like this, I also use cut-and-paste, especially if it’s something I’ve just looked up anyway.
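The escape sequence described above can be checked with a short Python sketch. The helper names `to_hex_reference` and `to_decimal_reference` are made up for this illustration; only `html.unescape` and `ord` come from the standard library.

```python
import html


def to_hex_reference(char: str) -> str:
    """Build the hexadecimal form: "&#x", the code point in hex, then ";"."""
    return f"&#x{ord(char):04X};"


def to_decimal_reference(char: str) -> str:
    """Build the decimal form of the same reference: "&#", the number, ";"."""
    return f"&#{ord(char)};"


t_cedilla = "\u0163"  # the Romanian character used as the example in the post

hex_reference = to_hex_reference(t_cedilla)          # "&#x0163;"
decimal_reference = to_decimal_reference(t_cedilla)  # "&#355;"

# html.unescape turns either form back into the character itself.
decoded = html.unescape(hex_reference)
```

Both forms decode to the same character, so which one to use in a page is mostly a matter of which number you happen to have looked up.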

  6. Petréa Mitchell says:

    Also, the code you gave isn’t decimal, it’s hexadecimal (base-16).
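To see why the base matters, here is a small Python sketch that reads the same digits "0163" both ways. It only uses the standard library; the variable names are chosen for this example.

```python
import unicodedata

code = "0163"

# Read as hexadecimal (base 16), as Alt-x does: code point U+0163.
as_hex = chr(int(code, 16))

# Read as decimal (base 10): code point 163, which is U+00A3.
as_decimal = chr(int(code, 10))

hex_name = unicodedata.name(as_hex)          # LATIN SMALL LETTER T WITH CEDILLA
decimal_name = unicodedata.name(as_decimal)  # POUND SIGN
```

The same four digits give the Romanian letter in one base and the pound sign in the other, so it is worth checking which base a code chart or input method expects.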

  7. Qcumber says:

    I don’t understand because I have never used Unicode.
    If I type 0163 then Alt-x, all I get in MS-Word is “0163 Alt-x”
    There must be some step(s) you didn’t mention.

  8. Jayarava says:

    Since I often type Sanskrit and Pāli I set my day to day Roman keyboard up to do so without fuss using Microsoft Keyboard Creator (āīūṛḷṃḥṭḍ). I also have a customised Devanāgarī keyboard which I can switch to by pressing control-1 (Roman is control-0). Since I created this to use Roman transliterations it is very easy to use: d = द.

    I’ve found this less good with Tibetan which has more letters than a standard Roman keyboard can handle while retaining the relationship to standard transliterations. For Tibetan, which I only use sparingly, I now tend to use a picker on the web: http://rishida.net/scripts/pickers/tibetan/

    For converting Indic scripts back and forth this app is awesome: http://www.virtualvinodh.com/aksharamukha

    The problem with character codes is that one either has to remember them or look them up. If I use them often enough to remember them, I don’t want to faff around with codes, so I modify my keyboard. In Windows, if I’m looking them up anyway, I use the Windows Character Map app, so that having found the character I can then copy and paste it.

    I’ve been trying to get my Buddhist community to embrace Unicode for a few years now so that we can use diacritics for technical terms in Sanskrit and Pāli. Very few takers so far.

  9. TJ says:

    I have three main ways to enter non-English characters, and which one I use depends on what I’m doing.
    1. I have already added several keyboard layouts to Windows (even Japanese and Chinese). Sometimes I just pick one of those and type. I’ve also assigned shortcuts in the Control Panel to switch to the essential keyboards that I use frequently: Arabic, Irish (Gaelic) and German. I use the Irish keyboard layout by default since I can type the accents more easily there (but you would have a problem with the backslash; I think you must have a 102-key keyboard).

    2. I use a program called “Swapkeys.” This program enables you to modify certain buttons and characters in your default keyboard. I use it mainly to solve the backslash problem with the Irish layout, and also to add certain characters that I use regularly when I work with Ayvarith, like the “þ” and “ð”.

    3. Finally, when none of the tricks above work, I run the “Character map” software [Start>Run>type "charmap" and hit Enter], and I usually choose Arial or Times New Roman to look for certain characters. This program comes with Windows by default, and it is also helpful for checking your fonts (not necessarily for linguistic purposes).

  10. Kevin Brubeck Unhammer says:

    Qcumber: in Windows, you typically hold Alt and type a number with the numeric keypad to get the “special chars” (didn’t know they corresponded to the Unicode numbers before)

    Under GNU/Linux, you typically use the Compose key (many people use the flag key for this). Press it once, then two keys from your keyboard, and you can get pretty much any character you want. You can also, of course, change which characters the combinations produce. However, I find the multiple keypresses rather slow, so instead I use a program called xmodmap, which lets me change, well, any keypress or key combination (e.g. I now have “š” on “alt-s”).

  11. Simon says:

    Jayarava – there’s a Tibetan input system built into BabelPad (and also Manchu, Mongolian, Uyghur, Yi and Unicode).

  12. DA says:

    I type in Welsh, so I often want letters with circumflex accents. Some applications, e.g. Microsoft Word, let you use keyboard shortcuts for such characters, e.g. to get the letter a with a ^, press Ctrl + Shift + the 6^ key, release all keys, then press a, and a^ appears properly (it doesn’t work here!). It works with a, e, i, o and, I think, u, but unfortunately not w.
    Similarly, Ctrl + Shift + the ;: key followed by u gives you the vowel with an umlaut on top. For acute accents, use the Ctrl key and single quote followed by the vowel. Don’t forget the Alt Gr key to the right of the space-bar, which with e gives an e with a grave accent. Word’s Insert Symbol window gives the shortcut details for each character.
    Hope this helps.

  13. Jerry says:

    Thanks for the BabelPad link – didn’t know about that one!

    I am working on a keyboard definition (Windows only) that defines more dead keys than the ones in the US-International keyboard definition. MS Word has a few more than the keyboard definition, but I often need more. I am trying to make a keyboard definition that has all accents from European languages.

    Also, I don’t like MS Word and most importantly, I want those dead keys available in whatever program I am using!

    The idea is to press a dead key and then the letter the accent has to be combined with. I hope to be able to use only Ctrl plus a character for the dead key, like DA said. Ctrl-` for grave accent, Ctrl-~ for tilde, Ctrl-^ (which is Ctrl-Shift-6) for circumflex, Ctrl-, for cedilla, Ctrl-: (which is Ctrl-Shift-;) for umlaut, etc. Those are the obvious ones.
    The weirder ones are Ctrl-$ (Ctrl-Shift-4) for currencies, Ctrl-@ (Ctrl-Shift-2) plus A or O for the combined AE or OE letters, Ctrl-! and Ctrl-? for the Spanish ! and ? at the beginning of the sentence, Ctrl-/ plus L for the Polish ł and Ł, Ctrl-. for several dot accents, etc.

    I haven’t been working on this keyboard definition for a while, though.
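The dead-key idea described above (press the accent key, then the letter) can be sketched in Python using Unicode combining marks. This is only an illustration of the principle, not how any actual keyboard definition is built: the `DEAD_KEYS` table and the `apply_dead_key` function are hypothetical names invented for this example.

```python
import unicodedata

# Hypothetical table: the character pressed with Ctrl -> a combining mark.
DEAD_KEYS = {
    "`": "\u0300",  # combining grave accent
    "'": "\u0301",  # combining acute accent
    "^": "\u0302",  # combining circumflex accent
    "~": "\u0303",  # combining tilde
    ":": "\u0308",  # combining diaeresis (umlaut)
    ",": "\u0327",  # combining cedilla
}


def apply_dead_key(dead_key: str, letter: str) -> str:
    """Attach the dead key's accent to the letter and compose the result.

    NFC normalization merges the letter and the combining mark into a
    single precomposed character where Unicode defines one.
    """
    mark = DEAD_KEYS[dead_key]
    return unicodedata.normalize("NFC", letter + mark)


w_circumflex = apply_dead_key("^", "w")  # Welsh w with circumflex
c_cedilla = apply_dead_key(",", "c")     # c with cedilla
u_umlaut = apply_dead_key(":", "u")      # u with umlaut
```

Letters such as the Polish ł are not formed from a base letter plus a combining mark, so a scheme like this would need an explicit lookup table for those cases.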

  14. Macsen says:

    In Welsh we can use Alt + 147 for ô etc. That’s fine, but it’s cumbersome and, I presume because it’s based on the French use of circumflexes, it doesn’t include our ŵ or ŷ.

    It’s easier to go to http://www.draig.co.uk and download for free the circumflexes which go on all 7 Welsh vowels: a, e, i, o, u, w, y. It’s free and easy. You then press AltGr (on the bottom tab) and keep your finger on it whilst pressing the vowel, w, etc. It works on email too.

  15. Simon says:

    I use the United Kingdom Extended / Welsh keyboard layout in Windows. With it you can type:

    - acute accents with Alt Gr and the letter
    - grave accents with the ` key (the one next to 1) and the letter
    - circumflexes with Alt Gr ^ and the letter (works for ŵ and ŷ)
    - umlauts with Alt Gr ” and the letter
    - tildes with Alt Gr ~ and the letter
    - cedillas with Alt Gr and c (only works for ç)

  16. Ivan says:

    Arabists tend to use Yamli: http://www.yamli.com/arabic-keyboard/.

  17. Qcumber says:

    Kevin Brubeck Unhammer Says:
    “November 18th, 2010 at 11:05 am
    Qcumber: in Windows, you typically hold Alt and type a number with the numeric keypad to get the “special chars” (didn’t know they corresponded to the Unicode numbers before)”
    I have always done this for ASCII codes. For example I get “á” with ALT 160.
    Is this also a Unicode number?
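One way to look at this question is to compare the Alt number with the character's Unicode code point. The Python sketch below assumes the Alt + number method (without a leading zero) picks the character from an old DOS code page such as 437 or 850; which code page Windows actually uses depends on the system's language settings, so treat `cp437` here as an assumption. The function name `alt_code_to_char` is made up for this example.

```python
def alt_code_to_char(number: int, codepage: str = "cp437") -> str:
    """Decode a single byte value using the given legacy code page."""
    return bytes([number]).decode(codepage)


a_acute = alt_code_to_char(160)       # the character at position 160
a_acute_code_point = ord(a_acute)     # its Unicode code point, in decimal

o_circumflex = alt_code_to_char(147)  # the Welsh example from comment 14
o_circumflex_code_point = ord(o_circumflex)

# ASCII itself only covers the values 0 to 127, so 160 is outside it.
is_ascii = 160 < 128
```

Under that assumption, Alt 160 gives á, whose Unicode number is 225 (hexadecimal E1), and Alt 147 gives ô, whose Unicode number is 244 (hexadecimal F4). So these Alt numbers are code page positions rather than Unicode numbers, and strictly speaking they are not ASCII codes either.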