UnicodeByteString

This draft proposal for a new Unicode layer on top of ByteString is still being written.

Motivation

ByteString provides a faster and more memory-efficient data type than [Word8] for processing raw bytes. A Unicode layer on top of ByteString that deals in units of characters instead of units of bytes could achieve similar performance improvements over String for text processing. A Unicode data type also removes the error-prone task of keeping track of which encoding was used for text stored as raw bytes in a ByteString. Functions such as length then just work on a Unicode string, counting characters rather than bytes, even though different encodings use different numbers of bytes to represent a character.
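
To make the length problem concrete, here is a minimal sketch of how byte length and character length diverge for UTF-8 text held in a plain ByteString. The names naiveUtf8 and charLengthUtf8 are hypothetical and are not part of this proposal or of the bytestring library. The helper assumes well-formed UTF-8 input and does no validation.

```haskell
import qualified Data.ByteString as B
import Data.Bits ((.&.))
import Data.Word (Word8)

-- The text "naïve" encoded as UTF-8 by hand.
-- The character 'ï' (U+00EF) occupies two bytes: 0xC3 0xAF.
naiveUtf8 :: B.ByteString
naiveUtf8 = B.pack [0x6E, 0x61, 0xC3, 0xAF, 0x76, 0x65]

-- Hypothetical helper: count characters in well-formed UTF-8 by
-- counting every byte that is not a continuation byte (10xxxxxx).
charLengthUtf8 :: B.ByteString -> Int
charLengthUtf8 = B.length . B.filter isLeadByte
  where
    isLeadByte :: Word8 -> Bool
    isLeadByte w = w .&. 0xC0 /= 0x80
```

Here B.length naiveUtf8 counts six bytes, while charLengthUtf8 naiveUtf8 counts five characters. With a raw ByteString the caller has to remember the encoding and pick the matching helper; a Unicode layer would carry that knowledge in the type so that its own length could report characters directly.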

Specification

Open Issues

References

http://www.python.org/dev/peps/pep-0358/ - PEP 358 -- The "bytes" Object
http://python.org/dev/peps/pep-3116/ - PEP 3116 -- New I/O
