Unicode lookup


Input

The character set is up to date with Unicode 12. The tool as a whole is a new version, made public at an early stage. While I work on it, it will be missing features and occasionally data, and it will sometimes give errors.

Input currently recognised:

String with any characters, e.g. ㋛ ☺ ㋡ or € or (ノಠ益ಠ)ノ彡┻━┻ or ᘛ⁐̤ᕐᐷ or ༼ つ ◕_◕ ༽つ or ♥︎ ♥️ or ◡̈.
References to one or more codepoints (one style at a time):
- U+205AC, U+1F9C0, U+2764
- 0420, c3a9, e2808b - hex that is codepoint and/or UTF8 (e.g. e2808b is probably U+200B, c3a9 could be either U+E9 or U+C3A9)
- %C3%A9 - percent-encoded hex UTF8 (probably from URLs)
- \xd0\xa0, \uc3a9, \u{c3a9}, \ud83e\uddc0, \U0001F9C0 - some escape styles used in code
- &#10152;, &#x27A8;, &middot; - decimal/hex/named HTML/XML entity
Name search, such as cyrillic de, n, numeral, plus-minus sign, exclamation question, arrow right, hieroglyph. This includes entity names like middot and Ntilde, plus some fuzzy matching of my own, partly to help find confusables.
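
The ambiguity of bare hex input (codepoint number versus UTF8 bytes) can be sketched in Python. This is only an illustration of the idea, not the tool's actual parser; the helper name `interpret_hex` is hypothetical.

```python
import html
from urllib.parse import unquote


def interpret_hex(text):
    """Return the plausible readings of a bare hex string (hypothetical helper)."""
    readings = {}
    value = int(text, 16)
    # Reading 1: the digits are a codepoint number (surrogates excluded).
    if value <= 0x10FFFF and not 0xD800 <= value <= 0xDFFF:
        readings["codepoint"] = chr(value)
    # Reading 2: the digits are a sequence of UTF8 bytes.
    if len(text) % 2 == 0:
        try:
            readings["utf8"] = bytes.fromhex(text).decode("utf-8")
        except UnicodeDecodeError:
            pass
    return readings


print(interpret_hex("c3a9"))    # two readings: U+C3A9, or U+E9 via UTF8
print(interpret_hex("e2808b"))  # one reading: 0xE2808B is above U+10FFFF
print(unquote("%C3%A9"))        # percent-encoded UTF8
print(html.unescape("&#10152; &#x27A8; &middot;"))  # decimal/hex/named entities
```

When both readings succeed, as for c3a9, something else has to pick one; the sketch simply returns both.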

Or you could get a random character.


Input was

Interpretation

You didn't ask for anything, so the input is treated as an empty string.

Name search

no interesting words to search for
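
A very plain form of name search can be sketched with Python's `unicodedata` module. It assumes simple substring matching on the official character names, has none of the fuzziness or entity-name support described above, and sees only the Unicode version bundled with the Python build. The function name `search_names` is made up for this example.

```python
import sys
import unicodedata


def search_names(query, limit=10):
    """Return (codepoint, name) pairs whose name contains every query word."""
    words = query.upper().split()
    hits = []
    for cp in range(sys.maxunicode + 1):
        name = unicodedata.name(chr(cp), "")  # "" for unnamed codepoints
        if name and all(word in name for word in words):
            hits.append((cp, name))
            if len(hits) >= limit:
                break
    return hits


for cp, name in search_names("plus-minus sign"):
    print(f"U+{cp:04X} {name}")
```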

Unicode string properties

Normalization

None of the normalization forms change the data
(this does not necessarily mean that nothing decomposes to this form)
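
As a rough sketch of what this check amounts to, the following Python applies the four standard normalization forms and reports which ones alter the string. The helper name `changed_forms` is an assumption for illustration.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def changed_forms(text):
    """Return the normalization forms under which `text` would change."""
    return [form for form in FORMS if unicodedata.normalize(form, text) != text]


print(changed_forms(""))          # []
print(changed_forms("\u00e9"))    # ['NFD', 'NFKD']  (precomposed e-acute)
print(changed_forms("e\u0301"))   # ['NFC', 'NFKC']  (e + combining acute)
```

The last two lines illustrate the parenthetical remark: the decomposed sequence is unchanged by NFD, yet the precomposed character still decomposes to it.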

Encodings that can encode this properly

utf_8 utf_16 utf_32 ascii latin_1 iso8859_2 iso8859_3 iso8859_4 iso8859_5 iso8859_6 iso8859_7 iso8859_8 iso8859_9 iso8859_10 iso8859_13 iso8859_14 iso8859_15 iso2022_jp iso2022_jp_1 iso2022_jp_2 iso2022_jp_2004 iso2022_jp_3 iso2022_jp_ext iso2022_kr gb2312 gbk gb18030 big5 big5hkscs euc_jp euc_jis_2004 euc_jisx0213 euc_kr hz johab koi8_r koi8_u mac_cyrillic mac_greek mac_iceland mac_latin2 mac_roman mac_turkish ptcp154 shift_jis shift_jis_2004 shift_jisx0213 cp037 cp424 cp437 cp500 cp737 cp775 cp850 cp852 cp855 cp856 cp857 cp860 cp861 cp862 cp863 cp864 cp865 cp866 cp869 cp874 cp875 cp932 cp949 cp950 cp1006 cp1026 cp1140 cp1250 cp1251 cp1252 cp1253 cp1254 cp1255 cp1256 cp1257 cp1258
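
One way such a list could be produced is to attempt a round trip through each codec. The sketch below assumes that approach and tries only a handful of the codec names above; `encodable_in` and `CANDIDATES` are hypothetical names.

```python
CANDIDATES = ("utf_8", "ascii", "latin_1", "iso8859_5", "koi8_r", "cp1252", "shift_jis")


def encodable_in(text, codecs=CANDIDATES):
    """Split codec names into those that round-trip `text` and those that do not."""
    ok, failed = [], []
    for name in codecs:
        try:
            round_trip = text.encode(name).decode(name)
        except UnicodeError:
            failed.append(name)  # the codec cannot represent some character
        else:
            (ok if round_trip == text else failed).append(name)
    return ok, failed


print(encodable_in(""))        # the empty string survives every codec
print(encodable_in("\u00e9"))  # e-acute fails in ascii, among others
```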

Encodings that will mangle your text

String encoding

String stuff

HTML/XML numeric entities

All but a-zA-Z0-9 and space are encoded, which is a little overzealous
hexadecimal:

decimal:
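
The rule stated above (everything except a-zA-Z0-9 and space becomes a numeric entity) can be sketched as follows. The function name and the `hexadecimal` flag are assumptions for illustration.

```python
import string

SAFE = set(string.ascii_letters + string.digits + " ")


def numeric_entities(text, hexadecimal=True):
    """Encode every character outside SAFE as an HTML/XML numeric entity."""
    out = []
    for ch in text:
        if ch in SAFE:
            out.append(ch)
        elif hexadecimal:
            out.append(f"&#x{ord(ch):X};")
        else:
            out.append(f"&#{ord(ch)};")
    return "".join(out)


print(numeric_entities("a \u20ac!"))                     # a &#x20AC;&#x21;
print(numeric_entities("a \u20ac!", hexadecimal=False))  # a &#8364;&#33;
```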

UTF8 bytestring

as hex:
(UTF8 bytestring length is 0)

URL-encoded UTF8
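
For reference, the UTF8 hex and URL-encoded forms of a string can be reproduced in Python along these lines. The sample text (U+00E9 followed by U+200B) is chosen to match the hex examples in the input notes.

```python
from urllib.parse import quote

text = "\u00e9\u200b"
utf8 = text.encode("utf-8")

print(utf8.hex())            # c3a9e2808b
print(len(utf8))             # 5
print(quote(text, safe=""))  # %C3%A9%E2%80%8B
```

For the empty string all three are trivially empty, with a bytestring length of 0.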

JavaScript
~ES3

""

ES6

""
Python
py2

Unicode string:
u''
UTF8 bytestring:
''

py3

Unicode string:
''
UTF8 bytestring:
b''
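
The JavaScript and Python rows show the input as escaped string literals. A minimal sketch of how such literals might be generated is below; it assumes every non-ASCII character is escaped, and `js_escape` is a hypothetical helper rather than the tool's code.

```python
def js_escape(text, es6=False):
    """Return `text` as a double-quoted JavaScript string literal."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in '"\\':
            out.append("\\" + ch)
        elif cp < 0x80:
            out.append(ch)  # other ASCII is passed through unchanged
        elif es6:
            out.append(f"\\u{{{cp:x}}}")  # ES6 code point escape
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04x}")
        else:
            # Before ES6, astral characters need a UTF-16 surrogate pair.
            units = ch.encode("utf-16-be")
            hi = int.from_bytes(units[:2], "big")
            lo = int.from_bytes(units[2:], "big")
            out.append(f"\\u{hi:04x}\\u{lo:04x}")
    return '"' + "".join(out) + '"'


print(js_escape("\U0001F9C0"))            # "\ud83e\uddc0"
print(js_escape("\U0001F9C0", es6=True))  # "\u{1f9c0}"
print(ascii("\U0001F9C0"))                # '\U0001f9c0' (Python 3 literal)
```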

Ruby

''

CSS (in :before/:after)

''

TeX (experiment)

nothing interesting to report here

Emoji (experiment; TODO)

CJK (experiment; TODO)

Unicode layout, and blocks used by the input

BMP - Basic Multilingual Plane:

Basic Latin (128) pdf

Latin-1 Supplement (128) pdf

Latin Extended-A (128) pdf

Latin Extended-B (208) pdf

IPA Extensions (96) pdf

Spacing Modifier Letters (80) pdf

Combining Diacritical Marks (112) pdf

Greek and Coptic (144) pdf

Cyrillic (256) pdf

Cyrillic Supplement (48) pdf

Armenian (96) pdf

Hebrew (112) pdf

Arabic (256) pdf

Syriac (80) pdf

Arabic Supplement (48) pdf

Thaana (64) pdf

NKo (64) pdf

Samaritan (64) pdf

Mandaic (32) pdf

Syriac Supplement (16) pdf

not allocated (48)

Arabic Extended-A (96) pdf

Devanagari (128) pdf

Bengali (128) pdf

Gurmukhi (128) pdf

Gujarati (128) pdf

Oriya (128) pdf

Tamil (128) pdf

Telugu (128) pdf

Kannada (128) pdf

Malayalam (128) pdf

Sinhala (128) pdf

Thai (128) pdf

Lao (128) pdf

Tibetan (256) pdf

Myanmar (160) pdf

Georgian (96) pdf

Hangul Jamo (256) pdf

Ethiopic (384) pdf

Ethiopic Supplement (32) pdf

Cherokee (96) pdf

Unified Canadian Aboriginal Syllabics (640) pdf

Ogham (32) pdf

Runic (96) pdf

Tagalog (32) pdf

Hanunoo (32) pdf

Buhid (32) pdf

Tagbanwa (32) pdf

Khmer (128) pdf

Mongolian (176) pdf

Unified Canadian Aboriginal Syllabics Extended (80) pdf

Limbu (80) pdf

Tai Le (48) pdf

New Tai Lue (96) pdf

Khmer Symbols (32) pdf

Buginese (32) pdf

Tai Tham (144) pdf

Combining Diacritical Marks Extended (80) pdf

Balinese (128) pdf

Sundanese (64) pdf

Batak (64) pdf

Lepcha (80) pdf

Ol Chiki (48) pdf

Cyrillic Extended-C (16) pdf

Georgian Extended (48) pdf

Sundanese Supplement (16) pdf

Vedic Extensions (48) pdf

Phonetic Extensions (128) pdf

Phonetic Extensions Supplement (64) pdf

Combining Diacritical Marks Supplement (64) pdf

Latin Extended Additional (256) pdf

Greek Extended (256) pdf

General Punctuation (112) pdf

Superscripts and Subscripts (48) pdf

Currency Symbols (48) pdf

Combining Diacritical Marks for Symbols (48) pdf

Letterlike Symbols (80) pdf

Number Forms (64) pdf

Arrows (112) pdf

Mathematical Operators (256) pdf

Miscellaneous Technical (256) pdf

Control Pictures (64) pdf

Optical Character Recognition (32) pdf

Enclosed Alphanumerics (160) pdf

Box Drawing (128) pdf

Block Elements (32) pdf

Geometric Shapes (96) pdf

Miscellaneous Symbols (256) pdf

Dingbats (192) pdf

Miscellaneous Mathematical Symbols-A (48) pdf

Supplemental Arrows-A (16) pdf

Braille Patterns (256) pdf

Supplemental Arrows-B (128) pdf

Miscellaneous Mathematical Symbols-B (128) pdf

Supplemental Mathematical Operators (256) pdf

Miscellaneous Symbols and Arrows (256) pdf

Glagolitic (96) pdf

Latin Extended-C (32) pdf

Coptic (128) pdf

Georgian Supplement (48) pdf

Tifinagh (80) pdf

Ethiopic Extended (96) pdf

Cyrillic Extended-A (32) pdf

Supplemental Punctuation (128) pdf

CJK Radicals Supplement (128) pdf

Kangxi Radicals (224) pdf

not allocated (16)

Ideographic Description Characters (16) pdf

CJK Symbols and Punctuation (64) pdf

Hiragana (96) pdf

Katakana (96) pdf

Bopomofo (48) pdf

Hangul Compatibility Jamo (96) pdf

Kanbun (16) pdf

Bopomofo Extended (32) pdf

CJK Strokes (48) pdf

Katakana Phonetic Extensions (16) pdf

Enclosed CJK Letters and Months (256) pdf

CJK Compatibility (256) pdf

CJK Unified Ideographs Extension A (6592) pdf

Yijing Hexagram Symbols (64) pdf

CJK Unified Ideographs (20992) pdf

Yi Syllables (1168) pdf

Yi Radicals (64) pdf

Lisu (48) pdf

Vai (320) pdf

Cyrillic Extended-B (96) pdf

Bamum (96) pdf

Modifier Tone Letters (32) pdf

Latin Extended-D (224) pdf

Syloti Nagri (48) pdf

Common Indic Number Forms (16) pdf

Phags-pa (64) pdf

Saurashtra (96) pdf

Devanagari Extended (32) pdf

Kayah Li (48) pdf

Rejang (48) pdf

Hangul Jamo Extended-A (32) pdf

Javanese (96) pdf

Myanmar Extended-B (32) pdf

Cham (96) pdf

Myanmar Extended-A (32) pdf

Tai Viet (96) pdf

Meetei Mayek Extensions (32) pdf

Ethiopic Extended-A (48) pdf

Latin Extended-E (64) pdf

Cherokee Supplement (80) pdf

Meetei Mayek (64) pdf

Hangul Syllables (11184) pdf

Hangul Jamo Extended-B (80) pdf

High Surrogates (896) pdf

High Private Use Surrogates (128)

Low Surrogates (1024) pdf

Private Use Area (6400) pdf

CJK Compatibility Ideographs (512) pdf

Alphabetic Presentation Forms (80) pdf

Arabic Presentation Forms-A (688) pdf

Variation Selectors (16) pdf

Vertical Forms (16) pdf

Combining Half Marks (16) pdf

CJK Compatibility Forms (32) pdf

Small Form Variants (32) pdf

Arabic Presentation Forms-B (144) pdf

Halfwidth and Fullwidth Forms (240) pdf

Specials (16) pdf

End of the range that UCS2-based Unicode implementations can store.
UCS4 implementations have no real limit; UTF-16 implementations can go beyond it by using surrogate pairs.
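
The surrogate mechanism is plain arithmetic: a codepoint above the BMP is reduced to a 20-bit offset, and each half is placed in one of the two surrogate ranges listed above. A small sketch, with function names of my own choosing:

```python
def to_surrogates(cp):
    """Split a codepoint above U+FFFF into a UTF-16 (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only codepoints above the BMP need surrogates")
    offset = cp - 0x10000               # 20-bit value
    high = 0xD800 + (offset >> 10)      # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits
    return high, low


def from_surrogates(high, low):
    """Recombine a surrogate pair into the original codepoint."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogates(0x1F9C0)
print(f"{high:04X} {low:04X}")          # D83E DDC0
print(hex(from_surrogates(high, low)))  # 0x1f9c0
```

With 1024 high and 1024 low surrogates, the pairs cover exactly 1024 * 1024 = 1,048,576 codepoints, which is the 16 planes beyond the BMP.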

SMP - Supplementary Multilingual Plane:

Linear B Syllabary (128) pdf

Linear B Ideograms (128) pdf

Aegean Numbers (64) pdf

Ancient Greek Numbers (80) pdf

Ancient Symbols (64) pdf

Phaistos Disc (48) pdf

not allocated (128)

Lycian (32) pdf

Carian (64) pdf

Coptic Epact Numbers (32) pdf

Old Italic (48) pdf

Gothic (32) pdf

Old Permic (48) pdf

Ugaritic (32) pdf

Old Persian (64) pdf

not allocated (32)

Deseret (80) pdf

Shavian (48) pdf

Osmanya (48) pdf

Osage (80) pdf

Elbasan (48) pdf

Caucasian Albanian (64) pdf

not allocated (144)

Linear A (384) pdf

not allocated (128)

Cypriot Syllabary (64) pdf

Imperial Aramaic (32) pdf

Palmyrene (32) pdf

Nabataean (48) pdf

not allocated (48)

Hatran (32) pdf

Phoenician (32) pdf

Lydian (32) pdf

not allocated (64)

Meroitic Hieroglyphs (32) pdf

Meroitic Cursive (96) pdf

Kharoshthi (96) pdf

Old South Arabian (32) pdf

Old North Arabian (32) pdf

not allocated (32)

Manichaean (64) pdf

Avestan (64) pdf

Inscriptional Parthian (32) pdf

Inscriptional Pahlavi (32) pdf

Psalter Pahlavi (48) pdf

not allocated (80)

Old Turkic (80) pdf

not allocated (48)

Old Hungarian (128) pdf

Hanifi Rohingya (64) pdf

not allocated (288)

Rumi Numeral Symbols (32) pdf

not allocated (128)

Old Sogdian (48) pdf

Sogdian (64) pdf

not allocated (112)

Elymaic (32) pdf

Brahmi (128) pdf

Kaithi (80) pdf

Sora Sompeng (48) pdf

Chakma (80) pdf

Mahajani (48) pdf

Sharada (96) pdf

Sinhala Archaic Numbers (32) pdf

Khojki (80) pdf

not allocated (48)

Multani (48) pdf

Khudawadi (80) pdf

Grantha (128) pdf

not allocated (128)

Newa (128) pdf

Tirhuta (96) pdf

not allocated (160)

Siddham (128) pdf

Modi (96) pdf

Mongolian Supplement (32) pdf

Takri (80) pdf

not allocated (48)

Ahom (64) pdf

not allocated (192)

Dogra (80) pdf

not allocated (80)

Warang Citi (96) pdf

not allocated (160)

Nandinagari (96) pdf

Zanabazar Square (80) pdf

Soyombo (96) pdf

not allocated (16)

Pau Cin Hau (64) pdf

not allocated (256)

Bhaiksuki (112) pdf

Marchen (80) pdf

not allocated (64)

Masaram Gondi (96) pdf

Gunjala Gondi (80) pdf

not allocated (304)

Makasar (32) pdf

not allocated (192)

Tamil Supplement (64) pdf

Cuneiform (1024) pdf

Cuneiform Numbers and Punctuation (128) pdf

Early Dynastic Cuneiform (208) pdf

not allocated (2736)

Egyptian Hieroglyphs (1072) pdf

Egyptian Hieroglyph Format Controls (16) pdf

not allocated (4032)

Anatolian Hieroglyphs (640) pdf

not allocated (8576)

Bamum Supplement (576) pdf

Mro (48) pdf

not allocated (96)

Bassa Vah (48) pdf

Pahawh Hmong (144) pdf

not allocated (688)

Medefaidrin (96) pdf

not allocated (96)

Miao (160) pdf

not allocated (64)

Ideographic Symbols and Punctuation (32) pdf

Tangut (6144) pdf

Tangut Components (768) pdf

not allocated (9472)

Kana Supplement (256) pdf

Kana Extended-A (48) pdf

Small Kana Extension (64) pdf

Nushu (400) pdf

not allocated (2304)

Duployan (160) pdf

Shorthand Format Controls (16) pdf

not allocated (4944)

Byzantine Musical Symbols (256) pdf

Musical Symbols (256) pdf

Ancient Greek Musical Notation (80) pdf

not allocated (144)

Mayan Numerals (32) pdf

Tai Xuan Jing Symbols (96) pdf

Counting Rod Numerals (32) pdf

not allocated (128)

Mathematical Alphanumeric Symbols (1024) pdf

Sutton SignWriting (688) pdf

not allocated (1360)

Glagolitic Supplement (48) pdf

not allocated (208)

Nyiakeng Puachue Hmong (80) pdf

not allocated (368)

Wancho (64) pdf

not allocated (1280)

Mende Kikakui (224) pdf

not allocated (32)

Adlam (96) pdf

not allocated (784)

Indic Siyaq Numbers (80) pdf

not allocated (64)

Ottoman Siyaq Numbers (80) pdf

not allocated (176)

Arabic Mathematical Alphabetic Symbols (256) pdf

not allocated (256)

Mahjong Tiles (48) pdf

Domino Tiles (112) pdf

Playing Cards (96) pdf

Enclosed Alphanumeric Supplement (256) pdf

Enclosed Ideographic Supplement (256) pdf

Miscellaneous Symbols and Pictographs (768) pdf

Emoticons (80) pdf

Ornamental Dingbats (48) pdf

Transport and Map Symbols (128) pdf

Alchemical Symbols (128) pdf

Geometric Shapes Extended (128) pdf

Supplemental Arrows-C (256) pdf

Supplemental Symbols and Pictographs (256) pdf

Chess Symbols (112) pdf

Symbols and Pictographs Extended-A (144) pdf

not allocated (1280)

SIP - Supplementary Ideographic Plane:

CJK Unified Ideographs Extension B (42720) pdf

not allocated (32)

CJK Unified Ideographs Extension C (4160) pdf

CJK Unified Ideographs Extension D (224) pdf

CJK Unified Ideographs Extension E (5776) pdf

CJK Unified Ideographs Extension F (7488) pdf

not allocated (3088)

CJK Compatibility Ideographs Supplement (544) pdf

not allocated (1504)

TIP - Tertiary Ideographic Plane:

CJK Unified Ideographs Extension G (4944) pdf

not allocated (60592)

Planes 4 through 13 - not allocated:

plane 4 (not allocated) (65536)

plane 5 (not allocated) (65536)

plane 6 (not allocated) (65536)

plane 7 (not allocated) (65536)

plane 8 (not allocated) (65536)

plane 9 (not allocated) (65536)

plane 10 (not allocated) (65536)

plane 11 (not allocated) (65536)

plane 12 (not allocated) (65536)

plane 13 (not allocated) (65536)

SSP - Supplementary Special-purpose Plane:

Tags (128) pdf

not allocated (128)

Variation Selectors Supplement (240) pdf

not allocated (65040)

PUA-A - Private Use Area A:

Supplementary Private Use Area-A (65536) pdf

PUA-B - Private Use Area B:

Supplementary Private Use Area-B (65536) pdf

Note that of the ~1.1 million codepoints up to U+10FFFF (the current cap), only ~140K are general-purpose graphic codepoints (about half in BMP), ~130K are private use (with no defined characters), and ~830K are unused.
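
Some of these totals can be cross-checked against the block sizes in the listing. The arithmetic below is only a sanity check using numbers that appear above; the variable names are mine.

```python
PLANE = 0x10000                    # 65536 codepoints per plane
total = 0x10FFFF + 1               # every codepoint from U+0000 to U+10FFFF

# Private use: the BMP Private Use Area plus the two private use planes.
private_use = 6400 + 2 * PLANE

# Planes 4 through 13 are entirely unallocated.
empty_planes = 10 * PLANE

# High Surrogates + High Private Use Surrogates + Low Surrogates.
surrogates = 896 + 128 + 1024

print(total)         # 1114112
print(private_use)   # 137472
print(empty_planes)  # 655360
print(surrogates)    # 2048
```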

The grouping used above is somewhat arbitrary, but looks halfway sensible.
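
Because the listing gives every block and every unallocated gap in order with its size, block start offsets can be recovered by cumulative sum, and a codepoint can then be mapped to its block with a binary search. A sketch using only the first few BMP entries (the names `BLOCKS`, `STARTS` and `block_of` are assumptions for illustration):

```python
from bisect import bisect_right
from itertools import accumulate

# (name, size) pairs in listing order; only the start of the BMP is included.
BLOCKS = [
    ("Basic Latin", 128),
    ("Latin-1 Supplement", 128),
    ("Latin Extended-A", 128),
    ("Latin Extended-B", 208),
    ("IPA Extensions", 96),
    ("Spacing Modifier Letters", 80),
    ("Combining Diacritical Marks", 112),
    ("Greek and Coptic", 144),
    ("Cyrillic", 256),
]

# STARTS[i] is the first codepoint of block i; the last entry ends the table.
STARTS = [0] + list(accumulate(size for _, size in BLOCKS))


def block_of(cp):
    """Return the block name for `cp`, or None if it is outside the table."""
    if not 0 <= cp < STARTS[-1]:
        return None
    return BLOCKS[bisect_right(STARTS, cp) - 1][0]


print(block_of(0x00E9))  # Latin-1 Supplement
print(block_of(0x0420))  # Cyrillic
```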