
Author rednaw
Recipients barry, r.david.murray, rednaw
Date 2014-02-23.17:59:35
Message-id <1393178377.04.0.906048305625.issue20747@psf.upfronthosting.co.za>
Content
If you look at the `header_encode` method of the `Charset` class in `email.charset`, you'll see that, depending on the `header_encoding` set on the `Charset` instance, it encodes the header using either base64 or quoted-printable (QP):

http://hg.python.org/cpython/file/3a1db0d2747e/Lib/email/charset.py#l351

However, the QP branch always passes `maxlinelen=None`, while the base64 branch does not and so keeps the default line length. This results in the following behaviour:

- If you use base64 encoding and your header is longer than the default `maxlinelen`, it is split over multiple lines.
- If you use QP encoding with the same header, it is not split over multiple lines.

You can easily test it with this snippet:

    from email.charset import Charset, BASE64, QP

    header = (
        'tejkstj tlkjes takldjf aseio neaoiflk asnfoieas nflkdan foeias '
        'naskln ioeasn kldan flkansoie naslk dnaslk fndaslk fneoisaf '
        'neklasn dfklasnf oiasenf lkadsn lkfanldk fas dfknaioe nas'
    )

    charset = Charset('utf-8')

    charset.header_encoding = BASE64
    print 'BASE64:'
    print charset.header_encode(header)

    charset.header_encoding = QP
    print 'QP:'
    print charset.header_encode(header)

Which will output:

    BASE64:
    =?utf-8?b?dGVqa3N0aiB0bGtqZXMgdGFrbGRqZiBhc2VpbyBuZWFvaWZsayBhc25mb2llYXMg?=
     =?utf-8?b?bmZsa2RhbiBmb2VpYXMgbmFza2xuIGlvZWFzbiBrbGRhbiBmbGthbnNvaWUgbmFz?=
     =?utf-8?b?bGsgZG5hc2xrIGZuZGFzbGsgZm5lb2lzYWYgbmVrbGFzbiBkZmtsYXNuZiBvaWFz?=
     =?utf-8?b?ZW5mIGxrYWRzbiBsa2ZhbmxkayBmYXMgZGZrbmFpb2UgbmFz?=
    QP:
    =?utf-8?q?tejkstj_tlkjes_takldjf_aseio_neaoiflk_asnfoieas_nflkdan_foeias_naskln_ioeasn_kldan_flkansoie_naslk_dnaslk_fndaslk_fneoisaf_neklasn_dfklasnf_oiasenf_lkadsn_lkfanldk_fas_dfknaioe_nas?=

The two encodings fold the same header differently, which is inconsistent behaviour.

Aside from that, I think the `header_encode` method should accept a `maxlinelen` argument that defaults to an appropriate value (probably 76) but that callers can override at will.
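
To make the proposal concrete, here is a minimal sketch of a QP ("Q") header encoder that takes such a `maxlinelen` argument. It is not the stdlib implementation: the names `q_header_encode` and `_q_encode_char` are hypothetical, the set of characters left unencoded is a conservative assumption, and `maxlinelen` is assumed to bound each encoded word (as in the base64 output above), with `None` disabling splitting. It expects a unicode string as input.

```python
import re

# Characters left unencoded in a "Q" encoded word (a conservative subset,
# chosen for this sketch).
_SAFE = re.compile(r'[A-Za-z0-9!*+\-/]')


def _q_encode_char(char, charset):
    """Return the Q encoding of a single character (hypothetical helper)."""
    if char == ' ':
        return '_'
    if _SAFE.match(char):
        return char
    # Encode every byte of the character as =XX.
    return ''.join('=%02X' % byte for byte in bytearray(char.encode(charset)))


def q_header_encode(text, charset='utf-8', maxlinelen=76):
    """Encode text as one or more Q encoded words.

    Each encoded word is at most maxlinelen characters long;
    maxlinelen=None disables splitting. Words are joined with a newline
    followed by a single space.
    """
    prefix = '=?%s?q?' % charset
    suffix = '?='
    overhead = len(prefix) + len(suffix)
    words = []
    current = ''
    for char in text:
        piece = _q_encode_char(char, charset)
        # Start a new encoded word when this character would not fit.
        # Characters are never split, so multi-byte sequences stay intact.
        if (maxlinelen is not None and current
                and overhead + len(current) + len(piece) > maxlinelen):
            words.append(prefix + current + suffix)
            current = ''
        current += piece
    words.append(prefix + current + suffix)
    return '\n '.join(words)
```

With the default, a long header comes back as several encoded words; with `maxlinelen=None` it comes back as a single word, which is what the QP branch does today. If `maxlinelen` is too small to hold even one encoded character, the sketch still emits that character in an over-long word rather than failing.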

I think a `maxlinelen` argument is also necessary because the `Header` class in `email.header` has a `maxlinelen` attribute that serves the same purpose. Normally this works fine, but when you specify a charset for your header, it uses the `Charset` class and the `maxlinelen` is lost. This happens here:

http://hg.python.org/cpython/file/3a1db0d2747e/Lib/email/header.py#l368

As you can see, `_encode_chunks` takes the `maxlinelen` argument but doesn't pass it on to the `header_encode` method of `charset` (which is a `Charset` instance).

As such, you can see this issue in action with the following snippet:

    from email.header import Header

    maxlinelen = 9999999

    print 'No charset:'
    print Header(
        u'asdfjk lasjdf sajdfl ajsdfaj sdlkfjas kfladjs flkajsdflk jsadklf jadslkfj adslkfj asdlkjf lksadjfkldas jfkldasj fkadsj fladsjf kladsjfk asdjfkldasasd kfaj  kfladsj fkadsjf asdf ',
        maxlinelen=maxlinelen
    ).encode()

    print 'Charset with special characters:'
    print Header(
        u'attachment; filename="ajdsklfj klasdjfkl asdjfkl jadsfja sdflkads fad fads adsf dasjfkl jadslkfj dlasf asd \u6211\u6211\u6211 jo \u6211\u6211 jo \u6211\u6211"',
        charset='utf-8',
        maxlinelen=maxlinelen
    ).encode()

Which will output:

    No charset:
    asdfjk lasjdf sajdfl ajsdfaj sdlkfjas kfladjs flkajsdflk jsadklf jadslkfj adslkfj asdlkjf lksadjfkldas jfkldasj fkadsj fladsjf kladsjfk asdjfkldasasd kfaj  kfladsj fkadsjf asdf
    Charset with special characters:
    =?utf-8?b?YXR0YWNobWVudDsgZmlsZW5hbWU9ImFqZHNrbGZqIGtsYXNkamZrbCBhc2RqZmts?=
     =?utf-8?b?IGphZHNmamEgc2RmbGthZHMgZmFkIGZhZHMgYWRzZiBkYXNqZmtsIGphZHNsa2Zq?=
     =?utf-8?b?IGRsYXNmIGFzZCDmiJHmiJHmiJEgam8g5oiR5oiRIGpvIOaIkeaIkSI=?=
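
The folding in that output can also be checked mechanically. The sketch below uses a hypothetical helper, `fold_report`, that counts the physical lines of an encoded header and measures the longest one, ignoring the leading space on continuation lines. The `observed` string is the charset output quoted above, copied verbatim.

```python
def fold_report(encoded, maxlinelen):
    """Return (line count, longest line length, fits within maxlinelen)."""
    lines = encoded.splitlines() or ['']
    # Continuation lines start with one space of folding whitespace; strip
    # it so that only the encoded word itself is measured.
    lengths = [len(line.lstrip(' ')) for line in lines]
    longest = max(lengths)
    return len(lines), longest, longest <= maxlinelen


# Output quoted above for the charset='utf-8' case (maxlinelen=9999999).
observed = (
    '=?utf-8?b?YXR0YWNobWVudDsgZmlsZW5hbWU9ImFqZHNrbGZqIGtsYXNkamZrbCBhc2RqZmts?=\n'
    ' =?utf-8?b?IGphZHNmamEgc2RmbGthZHMgZmFkIGZhZHMgYWRzZiBkYXNqZmtsIGphZHNsa2Zq?=\n'
    ' =?utf-8?b?IGRsYXNmIGFzZCDmiJHmiJHmiJEgam8g5oiR5oiRIGpvIOaIkeaIkSI=?='
)

line_count, longest, within_limit = fold_report(observed, 9999999)
# A header that honoured maxlinelen=9999999 would not need folding, so
# line_count > 1 here indicates that the limit was ignored.
was_folded = line_count > 1
```

A check like this could serve as a regression test: with a very large `maxlinelen`, the expectation would be a single line.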

We're currently experiencing this issue in Django; see the ticket in our issue tracker:
https://code.djangoproject.com/ticket/20889#comment:4