Title: stdlib wrongly uses len() for bytes-like object
Created on 2021-06-17 05:05 by malin, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (14)
msg395971 - (view) Author: Ma Lin (malin) * Date: 2021-06-17 05:05
Running this code raises an exception:

    import pickle
    import lzma
    import pandas as pd
    with lzma.open("test.xz", "wb") as file:
        pickle.dump(pd.DataFrame(range(1_000_000)), file, protocol=5)

The exception:

    Traceback (most recent call last):
      File "E:\", line 7, in <module>
        pickle.dump(pd.DataFrame(range(1_000_000)), file, protocol=5)
      File "D:\Python39\lib\", line 234, in write
        self._pos += len(data)
    TypeError: object of type 'pickle.PickleBuffer' has no len()
The exception is raised in the lzma.LZMAFile.write() method, at the `self._pos += len(data)` line shown in the traceback.
PickleBuffer doesn't have a .__len__ method. Is that intended?
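The failure can likely be reproduced without pandas or lzma. The snippet below is only a minimal sketch: the variable names are made up for illustration, and it assumes a Python version where `pickle.PickleBuffer` exists (3.8 or later).

```python
import pickle

# Wrap an ordinary bytes-like object in a PickleBuffer, as pickle
# protocol 5 does for out-of-band-capable data.
buf = pickle.PickleBuffer(bytearray(b"hello"))

# len() fails because PickleBuffer defines no __len__ method.
try:
    len(buf)
except TypeError as exc:
    len_error = exc
else:
    len_error = None

# The buffer protocol still works, so the byte count is available
# through a memoryview.
nbytes = memoryview(buf).nbytes
```

Here `len_error` holds the same kind of TypeError as in the traceback above, while `nbytes` is obtained without calling len() at all.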
msg395973 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-06-17 05:36
Oh, LZMAFile.write() should not use len() directly on input data because it does not always work correctly with memoryview and other objects supporting the buffer protocol. It should use memoryview(data).nbytes or data = memoryview(data).cast('B') if other byte-oriented operations (indexing, slicing) are used. See for example Lib/, Lib/, Lib/, Lib/, Lib/, Lib/
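To illustrate the difference between len() and the two alternatives mentioned above, here is a small sketch. The helper name `byte_length` and the sample data are assumptions made for the example, not code from the stdlib.

```python
import array
import pickle


def byte_length(data):
    """Return the size in bytes of any object supporting the buffer protocol."""
    return memoryview(data).nbytes


# For multi-byte items, len() counts items rather than bytes.
words = array.array("I", [1, 2, 3])
item_count = len(words)             # number of items
words_nbytes = byte_length(words)   # item_count * words.itemsize

# PickleBuffer has no __len__, but memoryview() accepts it.
buf_nbytes = byte_length(pickle.PickleBuffer(b"abcdef"))

# cast('B') yields a flat view of unsigned bytes, so indexing, slicing
# and len() all operate on bytes.
flat = memoryview(words).cast("B")
first_half = flat[: len(flat) // 2]
```

With this approach `len(flat)` equals `words_nbytes`, which is the quantity a write() method needs when it tracks a file position.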
msg395976 - (view) Author: Ma Lin (malin) * Date: 2021-06-17 06:26
Ok, I'm working on a PR.
msg396305 - (view) Author: Ma Lin (malin) * Date: 2021-06-22 05:28
I am checking all the .py files in `Lib` folder. has two len() bugs:

I think PR 26764 is ready; it fixes the len() bugs in files.
msg396309 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-06-22 07:04
New changeset bc6c12c72a9536acc96e7b9355fd69d1083a43c1 by Ma Lin in branch 'main':
bpo-44439: BZ2File.write() / LZMAFile.write() handle buffer protocol correctly (GH-26764)
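As a rough sketch of what "handle buffer protocol correctly" can look like in a write() method that tracks its position (this is not the committed patch; `CountingWriter` and its attributes are hypothetical names used only for illustration):

```python
import array
import io
import pickle


class CountingWriter:
    """Hypothetical wrapper that counts the bytes passed to write()."""

    def __init__(self, raw):
        self._raw = raw
        self._pos = 0

    def write(self, data):
        if isinstance(data, (bytes, bytearray)):
            # len() is the byte count for these two types.
            length = len(data)
        else:
            # Accept any object supporting the buffer protocol and
            # measure it in bytes instead of items.
            data = memoryview(data)
            length = data.nbytes
        self._raw.write(data)
        self._pos += length
        return length

    def tell(self):
        return self._pos


sink = io.BytesIO()
writer = CountingWriter(sink)
writer.write(b"ab")
writer.write(array.array("H", [1, 2]))        # 2 items, 2 * itemsize bytes
writer.write(pickle.PickleBuffer(b"xyz"))     # no __len__, still works
```

After these calls `writer.tell()` matches the number of bytes actually stored in `sink`.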
msg396334 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-06-22 13:57
New changeset 8bc26d8c9d092840054f57f9b4620de0d40d8423 by Ma Lin in branch '3.9':
bpo-44439: BZ2File.write()/LZMAFile.write() handle length correctly (GH-26846)
msg396335 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-06-22 13:59
Thank you for your contribution Ma Lin.
msg396336 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-06-22 14:00
New changeset 01858fbe31e8e0185edfbd3f10172f7c61391c9d by Miss Islington (bot) in branch '3.10':
bpo-44439: BZ2File.write() / LZMAFile.write() handle buffer protocol correctly (GH-26764) (GH-26845)
msg405948 - (view) Author: Ma Lin (malin) * Date: 2021-11-08 13:11
Serhiy Storchaka:

Sorry, I found that the `zipfile` module also has this bug; it is fixed in PR29468.

This bug was first reported and fixed by GitHub user `marcoffee`, so I listed him as a co-author. His work:

The second commit fixes an omission from issue41735. It is a very simple fix, so I included it in PR29468 as well.
msg414737 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2022-03-08 09:35
New changeset 36dd7396fcd26d8bf9919d536d05d7000becbe5b by Ma Lin in branch 'main':
bpo-44439: _ZipWriteFile.write() handle buffer protocol correctly (GH-29468)
msg414742 - (view) Author: miss-islington (miss-islington) Date: 2022-03-08 10:03
New changeset 21c5b3f73fb11fb0d3239971f72e8f0574a07245 by Miss Islington (bot) in branch '3.10':
bpo-44439: _ZipWriteFile.write() handle buffer protocol correctly (GH-29468)
msg414743 - (view) Author: miss-islington (miss-islington) Date: 2022-03-08 10:05
New changeset 0663ca17f5535178c083c6734fa52e40bd2db2de by Miss Islington (bot) in branch '3.9':
bpo-44439: _ZipWriteFile.write() handle buffer protocol correctly (GH-29468)
msg415528 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2022-03-18 21:36
Can this be closed now or is there anything else to do?
msg415549 - (view) Author: Ma Lin (malin) * Date: 2022-03-19 13:19
`_Stream.write` method in also has this code:

But this bug cannot be triggered, because the method is always called with bytes data.

`_ConnectionBase.send_bytes` method in multiprocessing\ can be micro-optimized:
This can be done in another issue.

So I think this issue can be closed.
