Watch out for date irregularities:
- If you have some very old messages to import, first check whether they have a "Date" field in the correct format. If not, ZOE will use the day of import as their date.
- The right format, according to RFC822, is
[day of week ","] 1*2DIGIT month 2DIGIT ; day month year
- (things in square brackets are optional). The most common error is to place month in front of day.
- Some old email clients don't format the "Date" field correctly, e.g., Eudora Light and some early versions of Eudora. Early versions of Eudora put a semicolon (";") at the start of the "Date" field, so searching the headers for "Date: ;" (with one space between the colon and the semicolon) is a good way to start hunting down such mail.
- Some messages don't have a "Date" field at all! If it's a received message, you may create one for it by borrowing the date/time from the earliest "Received" field (usually the one at the bottom of the pile of "Received" fields at the top or close to the top of the message header.)
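The checks above can be sketched in Python with the standard email package. This is only an illustration, not something ZOE ships: the function names parse_or_none and recover_date are made up here, and Python's date parser is more forgiving than a strict RFC822 reader (it accepts month-before-day, for instance), so a date that passes this check is not guaranteed to satisfy ZOE.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def parse_or_none(value):
    """Return a datetime for an RFC822-style date string, or None."""
    try:
        return parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Older Pythons raise TypeError, newer ones ValueError.
        return None


def recover_date(raw_message):
    """Return (datetime, source_field), or (None, None) if nothing parses.

    Tries the "Date" field first, then the "Received" fields from the
    bottom of the pile (the earliest hop) upwards.
    """
    msg = message_from_string(raw_message)
    date_value = msg.get("Date")
    if date_value is not None:
        parsed = parse_or_none(date_value)
        if parsed is not None:
            return parsed, "Date"
    for field in reversed(msg.get_all("Received") or []):
        # The timestamp of a "Received" field follows its last semicolon.
        stamp = field.rpartition(";")[2]
        parsed = parse_or_none(stamp)
        if parsed is not None:
            return parsed, "Received"
    return None, None
```

If recover_date reports "Received" as the source, the value can be turned back into header text with email.utils.format_datetime and pasted into a new "Date" field by hand.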
Dealing with 8bit headers
- According to RFC rules, only 7bit characters (us-ascii) are allowed in headers. When 8bit characters are needed, the email client needs to encode them into 7bit ones using some encoding method, like "=?BIG5?B?rOOwUbd8vdek5cNEpdg=?=", and that's what ZOE expects to see.
- Unfortunately, many email clients let you relax this rule and send 8bit (and DBCS) characters in headers unencoded, and many email servers (especially in East Asian countries) go along with it. (They used to strip the high bit from headers.) ZOE can't handle such headers at the moment.
- There's no easy workaround. The only one I know of is to "redirect" (not forward) such a message using Becky (a Windows-based email client), because Becky re-encodes the header in an RFC-compliant manner. None of the other emailers I've tested (Thunderbird 0.8, The Bat! 3.0 trial, Outlook Express 6) do that. Probably because of that, however, Becky redirects only one message at a time, and you have to give it the recipient each time. I wrote a PowerPro script to automate it a bit, but it still took me hours to redirect all such messages.
- The easier way is to wait for the new version, which is in the making as I type (many thanks).
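For readers who would rather script the fix than redirect messages one by one, here is a minimal sketch of the encoding step, assuming you already know which charset the raw header bytes are in. The function names are hypothetical and this is not how Becky or ZOE does it. The sketch also ignores the length limit that the RFC puts on each encoded word, so a long subject would still have to be split into several words.

```python
import base64


def has_raw_8bit(header_line: bytes) -> bool:
    """True if the header line contains un-encoded 8bit bytes."""
    return any(byte > 0x7F for byte in header_line)


def encode_header_value(raw_value: bytes, charset: str) -> str:
    """Wrap raw 8bit header bytes in a base64 ("B") encoded word."""
    # Sanity check: raises UnicodeDecodeError if the charset guess
    # does not fit the bytes at all.
    raw_value.decode(charset)
    payload = base64.b64encode(raw_value).decode("ascii")
    return f"=?{charset}?B?{payload}?="
```

Fed the raw big5 bytes behind the sample header shown earlier in this section, encode_header_value(raw, "BIG5") produces that same "=?BIG5?B?...?=" string.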
Dealing with DBCS mail
- Background: DBCS means "double byte character sets," invented in pre-Unicode days to deal with languages that need far more characters than the 256 values a single byte can provide. They are very common in East Asian countries, though Unicode is gaining ground. As of version 0.6.1, ZOE has trouble handling some DBCS mail. The following is an attempt at documenting the issues I've seen.
- The content doesn't display correctly.
- This happens when the "charset" setting in the "Content-Type" header field is improperly set. Check the setting in the source, and you may see "charset=us-ascii" or "charset=iso-8859-1". Such mail can still look fine in other programs because Windows (I don't know about other OSes) has something called the "default system language" (user-definable on some systems) that overrides or ignores some charset settings and displays the text in the "default system language" (traditional Chinese using big5 code, e.g.). That's the easiest way to add support for East Asian languages to non-Unicode programs. The main drawback is that on such systems, non-Unicode programs can no longer display genuine 8bit iso-8859-1 characters (letters with accent marks, e.g.), for they are intercepted by the system and displayed as some local language. IOW, on such systems, when a program tells the system to display something in iso-8859-1, the system ignores the request and treats it as some local language.
- ZOE is a Unicode program, and it strives to display all languages properly. In order to do that, ZOE displays everything in utf-8. When doing the conversion, ZOE depends on the charset setting to tell what the content is made of and to convert it accordingly. ZOE doesn't ignore the "iso-8859-1" setting and substitute some other charset for it.
- Knowing this, the workaround is simple, albeit tedious if you have many such messages. Change the "us-ascii" or "iso-8859-1" part into the correct one (e.g., big5) before importing such mail. If you have imported it already, you have to delete it, purge, and then import the corrected one.
- Sometimes the charset or the whole "Content-Type" field is missing. Add it by hand.
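If there are many such messages, the relabeling can be scripted. The following is a rough sketch using Python's standard email package, assuming you know the real charset of the mail (big5 in the usage below). The function name and the SUSPECT set are illustrative assumptions, not anything ZOE provides. It only touches text parts that are labeled us-ascii or iso-8859-1 (or not labeled at all) and that actually contain 8bit bytes.

```python
from email import message_from_bytes

# Labels that often hide DBCS content (an assumption for this sketch).
SUSPECT = {"us-ascii", "iso-8859-1"}


def relabel_text_parts(raw_message: bytes, real_charset: str) -> bytes:
    """Return the message with suspect text parts relabeled."""
    msg = message_from_bytes(raw_message)
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        # A missing charset (or missing Content-Type) means us-ascii.
        declared = part.get_content_charset() or "us-ascii"
        if declared not in SUSPECT:
            continue
        payload = part.get_payload(decode=True) or b""
        if not any(byte > 0x7F for byte in payload):
            continue  # pure 7bit text, the label does no harm
        # Raises UnicodeDecodeError if real_charset is a bad guess.
        payload.decode(real_charset)
        # Replaces the charset, or adds "Content-Type" if it is missing.
        part.set_param("charset", real_charset)
    return msg.as_bytes()
```

Typical use would be fixed = relabel_text_parts(open(path, "rb").read(), "big5"), written to a new file. Re-serializing a message can change details such as header order, so compare the result with the original before importing it.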
- The content displays correctly, but the header doesn't.
- Two possibilities here. The first is the header isn't encoded. See the above section about 8bit headers.
- Another possibility is that the header is encoded, but the charset is wrong. E.g., it says
"=?ISO-8859-1?B?rOOwUbd8vdek5cNEpdg=?="
- when in fact it means
"=?BIG5?B?rOOwUbd8vdek5cNEpdg=?="
- Your emailer might display it correctly because of the "default system language" feature described above. The solution is obvious: correct the charset and ZOE should import it just fine.
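Correcting the label of an encoded header can also be scripted. Below is a small sketch, with hypothetical function names, that swaps the charset label of each encoded word and leaves the encoded payload untouched. It assumes every word carrying the wrong label really is in the charset you pass as "right".

```python
import re
from email.header import decode_header, make_header

# =?charset?encoding?payload?=  (encoding is B or Q)
ENCODED_WORD = re.compile(r"=\?([^?]+)\?([bBqQ])\?([^?]*)\?=")


def relabel_encoded_words(header_value: str, wrong: str, right: str) -> str:
    """Replace the charset label `wrong` with `right` in encoded words."""
    def swap(match):
        charset, encoding, payload = match.groups()
        if charset.lower() != wrong.lower():
            return match.group(0)  # leave other words alone
        return f"=?{right}?{encoding}?{payload}?="
    return ENCODED_WORD.sub(swap, header_value)


def decode_value(header_value: str) -> str:
    """Decode a header value to text, to check the result by eye."""
    return str(make_header(decode_header(header_value)))
```

Applied to the example above, relabel_encoded_words(value, "ISO-8859-1", "BIG5") turns the first form into the second, and decode_value on the result should then show readable Chinese instead of accented Latin letters.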
- The plain text part displays correctly, but the HTML part (shown by clicking on the "HTML" link at the top) doesn't, or vice versa.
- If it's a multipart message with a plain text part and an HTML part, then one of the charsets is wrong (though I don't know how this is possible; my emailer can't do that).
- If it's a "Content-type: text/html" message without a plain text part, and the charset is correct, then you have probably run into a bug in ZOE (actually a bug in a 3rd-party library ZOE uses). The author is working on it.
- As mentioned a few times, a new version is on its way, which should solve most problems. But you shouldn't expect ZOE to work when the charset is wrong. So making sure your mail has the right charset is a must, even if you plan to wait for the new version.
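To find out which messages need attention before importing, a quick report of the charset each text part declares can help. This is a minimal sketch with a made-up function name. Note its limit: a part labeled iso-8859-1 always passes the decode test, because every byte is valid in that charset, so for such parts the label itself is the thing to look at.

```python
from email import message_from_bytes


def charset_report(raw_message: bytes):
    """List (content type, declared charset, decodes_ok) per text part."""
    report = []
    for part in message_from_bytes(raw_message).walk():
        if part.get_content_maintype() != "text":
            continue
        declared = part.get_content_charset() or "us-ascii"
        payload = part.get_payload(decode=True) or b""
        try:
            payload.decode(declared)
            decodes_ok = True
        except (UnicodeDecodeError, LookupError):
            # LookupError: Python does not know the declared charset.
            decodes_ok = False
        report.append((part.get_content_type(), declared, decodes_ok))
    return report
```

For a multipart message whose plain text part says big5 while the HTML part says us-ascii, the report would flag only the HTML part, which matches the symptom described above.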