UnicodeByteString

This draft proposal for a new Unicode layer on top of ByteString is still being written.

Motivation

ByteString provides a faster and more memory-efficient data type than [Word8] for processing raw bytes. By creating a Unicode layer on top of ByteString that works in units of characters rather than units of bytes, we can achieve similar performance improvements over String for text processing. A Unicode data type also removes the error-prone bookkeeping of tracking which encoding was used for text stored as raw bytes in a ByteString. Functions such as length just work on a Unicode string, even though different encodings use different numbers of bytes to represent a character.
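To illustrate the gap between bytes and characters, here is a minimal sketch that uses only the existing Data.ByteString API. The helper charLength is hypothetical and is not part of this proposal or of the bytestring library; it assumes the input is valid UTF-8 and counts every byte that is not a continuation byte.

```haskell
import qualified Data.ByteString as B
import Data.Bits ((.&.))
import Data.Word (Word8)

-- The word "naïve" encoded as UTF-8. The character 'ï' (U+00EF)
-- occupies two bytes, 0xC3 0xAF; the other characters take one each.
naiveUtf8 :: B.ByteString
naiveUtf8 = B.pack [0x6E, 0x61, 0xC3, 0xAF, 0x76, 0x65]

-- Hypothetical helper (an assumption for illustration only):
-- count characters by keeping the bytes that are NOT UTF-8
-- continuation bytes (bit pattern 10xxxxxx). Assumes valid UTF-8.
charLength :: B.ByteString -> Int
charLength = B.length . B.filter isLeadByte
  where
    isLeadByte :: Word8 -> Bool
    isLeadByte w = w .&. 0xC0 /= 0x80

main :: IO ()
main = do
  print (B.length naiveUtf8)    -- number of bytes: 6
  print (charLength naiveUtf8)  -- number of characters: 5
```

With a raw ByteString the caller has to remember the encoding and choose the right counting function by hand. A Unicode layer would let length return the character count directly.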

Specification

Open Issues

References

http://www.python.org/dev/peps/pep-0358/ - PEP 358 -- The "bytes" Object
http://python.org/dev/peps/pep-3116/ - PEP 3116 -- New I/O
