Libraries and tools/MIME Strike Force: Difference between revisions

From HaskellWiki

Revision as of 19:49, 18 March 2007

MIME Strike Force

The goal of the MIME Strike Force is to create the one, true MIME library for Haskell. Currently, there are a lot of partial MIME libraries, but nothing really complete.

In this document MIME includes basic RFC2822 messages.

Use Cases

This section describes different tasks the MIME library will be used for, and any special requirements of each usage.

Composing MIME Messages

The MIME library must provide a set of combinators for creating valid MIME messages. The combinators should allow the user to compose any valid MIME message, but prevent the user from creating invalid MIME messages.

Error conditions, such as missing required header fields (orig-date, originator, etc), should ideally be checked via the type-system at compile time.

Formatting issues, like line-length limitations, string encoding, etc, should be handled transparently at run-time.

The code that shows the final, formatted message should be able to terminate the lines with LF or CRLF.

It might also be nice if the combinators did not require a monadic interface.
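A minimal sketch of what such a combinator interface might look like follows. All names here (`message`, `withField`, `render`, `LineEnd`, and so on) are hypothetical and not taken from any existing library. The two fields RFC 2822 requires are arguments of the smart constructor, so omitting one is a compile-time error; optional fields are added with ordinary function composition rather than a monad; and the line terminator is chosen at render time. Line-length folding and string encoding are left out for brevity.

```haskell
-- Hypothetical names throughout; a sketch, not an existing library's API.

-- | Line terminator used when rendering.
data LineEnd = LF | CRLF

-- | A header field: name and value (assumed already encoded and short
-- enough; folding and encoding are not handled in this sketch).
data Field = Field String String

-- | A message: header fields plus body. In a real library the
-- constructor would stay unexported so 'message' is the only entry point.
data Message = Message [Field] String

-- Distinct types for the two required fields, so they cannot be
-- omitted or swapped by accident.
newtype OrigDate = OrigDate String
newtype From = From String

-- | Smart constructor: the required fields are mandatory arguments.
message :: OrigDate -> From -> String -> Message
message (OrigDate d) (From f) body =
  Message [Field "Date" d, Field "From" f] body

-- | Pure combinator that appends an optional field.
withField :: String -> String -> Message -> Message
withField name value (Message fs body) =
  Message (fs ++ [Field name value]) body

eolOf :: LineEnd -> String
eolOf LF = "\n"
eolOf CRLF = "\r\n"

-- | Render the message with the chosen line terminator.
render :: LineEnd -> Message -> String
render le (Message fs body) =
  concatMap renderField fs ++ eol ++ concatMap (++ eol) (lines body)
  where
    eol = eolOf le
    renderField (Field n v) = n ++ ": " ++ v ++ eol

-- | Example message built without any monadic code.
example :: Message
example =
  withField "Subject" "Hello"
    . withField "To" "bob@example.org"
    $ message (OrigDate "18 Mar 2007 19:49:00 +0000")
              (From "alice@example.org")
              "Hi Bob\n"
```

Calling `render CRLF example` or `render LF example` produces the same message with different line terminators, which is the only place the choice needs to be made.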

Modifying MIME Messages

Some programs, such as a mail transfer agent (MTA), will want to look at only a few select headers, and add or modify a few headers. In this case the MIME library should allow the program to:

  • Only parse as much semantic information as is needed. For example, the MTA does not need to decode all the attachments, etc. In fact, an MTA might only care about RFC2822, so it should not be forced to deal with the rest of the layers added by additional RFCs.
  • Add new headers without modifying any of the formatting of existing headers.
  • Modify an existing header, preserving the comments and formatting as much as is sensible.
  • Process mail messages that contain syntax errors that don't directly interfere with what the program is trying to do. For example, an invalidly formatted Date field should not cause an error if the program does not examine the Date field.
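One way to meet these requirements is to keep every header field as the exact text it was read from and to interpret a field only on demand. The sketch below assumes that approach; `RawField`, `splitMessage`, `addField`, and `renderMessage` are hypothetical names used for illustration only. Because nothing is re-rendered, rendering an unmodified message reproduces the input exactly, and a malformed Date field is simply carried along untouched.

```haskell
import Data.Char (toLower)

-- | One header field kept as its original text, including folded
-- continuation lines, comments, and line terminators.
newtype RawField = RawField String
  deriving (Eq, Show)

-- | Split into lines, keeping each terminator (LF or CRLF) attached.
linesKeepEnds :: String -> [String]
linesKeepEnds "" = []
linesKeepEnds s = case break (== '\n') s of
  (l, '\n' : rest) -> (l ++ "\n") : linesKeepEnds rest
  (l, _)           -> [l]

-- | Split a message into raw header fields and the untouched
-- remainder (the blank separator line plus the body).
splitMessage :: String -> ([RawField], String)
splitMessage = go [] . linesKeepEnds
  where
    go acc [] = (reverse acc, "")
    go acc ls@(l : rest)
      | isBlank l = (reverse acc, concat ls)
      | otherwise =
          let (conts, rest') = span isContinuation rest
          in go (RawField (concat (l : conts)) : acc) rest'
    isBlank l = l `elem` ["\n", "\r\n"]
    -- A folded continuation line starts with a space or a tab.
    isContinuation (c : _) = c == ' ' || c == '\t'
    isContinuation []      = False

-- | Lower-cased field name, computed only when asked for.
fieldName :: RawField -> String
fieldName (RawField raw) = map toLower (takeWhile (/= ':') raw)

-- | Prepend a new field; existing fields are left byte-for-byte intact.
-- The first argument is the line terminator to use for the new field.
addField :: String -> String -> String -> [RawField] -> [RawField]
addField eol name value fs = RawField (name ++ ": " ++ value ++ eol) : fs

-- | Concatenate the raw pieces back together.
renderMessage :: ([RawField], String) -> String
renderMessage (fs, rest) = concatMap (\(RawField r) -> r) fs ++ rest
```

With this representation, `renderMessage (splitMessage s)` equals `s` for any input, and adding a header with `addField` changes nothing else in the message.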

Most existing Haskell MIME parsers have the following properties:

  • The whole message must be parsed, even if most of the information is not used.
  • They store only the semantic information, so (showMessage . parseMessage) does not reproduce the input byte for byte (the output does not have the same MD5SUM as the input).
  • They are too strict about rejecting invalid messages.

In addition to an MTA, another program to consider is an email virus checker, which will decode the attachments to check for viruses. It will always add a header to indicate the message has been scanned, and occasionally remove an attachment that contained a virus.

Mail User Agent

A mail user agent (MUA), such as mutt, pine, or Thunderbird, will need to parse a message and display it to the user. It will also need a mechanism to recognize MIME content and display it to the user, save it to disk, etc.

An MUA would probably want the following features:

  • Ability to download and decode large attachments only at a user's request (e.g. for a dial-up user using IMAP).
  • Ability to display invalid emails. For example, the client should be able to display a message with an invalid Date field. Although this might interfere with sorting by date, the user will be more upset about not being able to read the email at all.
  • Ability to read messages that use LF instead of CRLF as the line terminator
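The last two points can be sketched as follows, under stated assumptions: `splitLines`, `headerFields`, `Lenient`, and `lenient` are hypothetical names, folded header lines are not handled, and `lenientInt` merely stands in for a real field parser such as an RFC 2822 date parser. The idea is that line splitting accepts LF and CRLF alike, and a field that fails to parse is kept as raw text for display instead of causing the whole message to be rejected.

```haskell
import Data.Char (isSpace, toLower)
import Text.Read (readMaybe)

-- | Split on LF and drop one trailing CR, so LF-terminated and
-- CRLF-terminated input yield the same lines.
splitLines :: String -> [String]
splitLines = map dropCR . lines
  where
    dropCR l
      | not (null l) && last l == '\r' = init l
      | otherwise = l

-- | Header fields up to the first blank line, as (lower-cased name,
-- value) pairs. Lines without a colon are skipped rather than
-- rejected; folded lines are not handled in this sketch.
headerFields :: String -> [(String, String)]
headerFields msg =
  [ (map toLower name, dropWhile isSpace (drop 1 rest))
  | l <- takeWhile (not . null) (splitLines msg)
  , let (name, rest) = break (== ':') l
  , not (null rest)
  ]

-- | A field value that either parsed, or is kept as raw text.
data Lenient a = Parsed a | Unparsed String
  deriving (Eq, Show)

-- | Wrap any strict parser so that failure keeps the raw text.
lenient :: (String -> Maybe a) -> String -> Lenient a
lenient parse raw = maybe (Unparsed raw) Parsed (parse raw)

-- | What a client might show to the user in either case.
display :: Show a => Lenient a -> String
display (Parsed x)     = show x
display (Unparsed raw) = raw ++ " (unparsed)"

-- | Stand-in for a real field parser, used only for illustration.
lenientInt :: String -> Lenient Int
lenientInt = lenient readMaybe
```

Since Haskell evaluates lazily, a value such as `fmap lenientInt (lookup "date" (headerFields msg))` is not computed until the client actually needs it, for example when sorting by date.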

Summary of Features

So, the library needs to:

  • Provide an API for creating MIME messages that does not require the user to have read any RFCs
  • Provide the ability to decode the message lazily
  • Have a permissive parser that attempts to parse invalid email
  • Provide an API for modifying messages that modifies the message as little as possible
  • Allow applications to use as much or little of the MIME stack as they want.

Other Desired Features

The library should also:

  • Support Strings, ByteStrings, etc
  • Support different string encodings (Unicode, etc).
  • Have a good test suite
  • Be extensible. It would be nice if support for additional RFCs could be implemented without having to modify the existing libraries.