Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PEP 3137: make bytesobject.c methods #45602

Closed
gpshead opened this issue Oct 11, 2007 · 12 comments
Closed

PEP 3137: make bytesobject.c methods #45602

gpshead opened this issue Oct 11, 2007 · 12 comments
Assignees

Comments

@gpshead
Copy link
Member

gpshead commented Oct 11, 2007

BPO 1261
Nosy @gvanrossum, @gpshead
Files
  • bytes-pep3137-methods-04b.diff.txt
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/gvanrossum'
    closed_at = <Date 2007-10-16.06:33:22.272>
    created_at = <Date 2007-10-11.08:45:43.845>
    labels = []
    title = 'PEP 3137: make bytesobject.c methods'
    updated_at = <Date 2008-01-06.22:29:45.553>
    user = 'https://github.com/gpshead'

    bugs.python.org fields:

    activity = <Date 2008-01-06.22:29:45.553>
    actor = 'admin'
    assignee = 'gvanrossum'
    closed = True
    closed_date = <Date 2007-10-16.06:33:22.272>
    closer = 'gregory.p.smith'
    components = []
    creation = <Date 2007-10-11.08:45:43.845>
    creator = 'gregory.p.smith'
    dependencies = []
    files = ['8550']
    hgrepos = []
    issue_num = 1261
    keywords = ['patch']
    message_count = 12.0
    messages = ['56341', '56351', '56352', '56354', '56466', '56468', '56474', '56477', '56479', '56480', '56487', '56495']
    nosy_count = 2.0
    nosy_names = ['gvanrossum', 'gregory.p.smith']
    pr_nums = []
    priority = 'normal'
    resolution = 'accepted'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue1261'
    versions = ['Python 3.0']

    @gpshead
    Copy link
    Member Author

    gpshead commented Oct 11, 2007

    This makes all existing bytesobject.c methods use the buffer API rather
    than explicitly requiring bytes objects as input. It also fixes input
    to append() and remove() that was not strict enough and improves a few
    unit tests in that area.

    NOTE: this patch likely depends on http://bugs.python.org/issue1260
    removing the buffer API from the unicode type in order for all unit
    tests to pass (i only tested it with that applied since thats where
    we're headed).

    @gvanrossum gvanrossum self-assigned this Oct 11, 2007
    @gpshead
    Copy link
    Member Author

    gpshead commented Oct 12, 2007

    Patch updated. It now implements the is*() methods for PyBytes. It
    moves common code into a shared bytes_ctype.c and .h file so that
    stringobject.c and bytesobject.c can share as much as possible. Unit
    tests are updated as needed for new method coverage and any behavior
    changes.

    still TODO: adding the missing methods (listed in a comment in the
    patch). I'm working on it.

    @gpshead gpshead changed the title PEP 3137: make bytesobject.c methods use PEP 3118 buffer API PEP 3137: make bytesobject.c methods Oct 12, 2007
    @gvanrossum
    Copy link
    Member

    Patch updated. It now implements the is*() methods for PyBytes. It
    moves common code into a shared bytes_ctype.c and .h file so that
    stringobject.c and bytesobject.c can share as much as possible.

    Did you move this into the stringlib subdirectory? That's more for
    sharing between PyString and PyUnicode, but I think there are more
    opportunities for sharing still, and PyString/PyBytes sharing makes
    sense separately.

    @gps
    Copy link
    Mannequin

    gps mannequin commented Oct 12, 2007

    > Patch updated. It now implements the is*() methods for PyBytes. It
    > moves common code into a shared bytes_ctype.c and .h file so that
    > stringobject.c and bytesobject.c can share as much as possible.

    Did you move this into the stringlib subdirectory? That's more for
    sharing between PyString and PyUnicode, but I think there are more
    opportunities for sharing still, and PyString/PyBytes sharing makes
    sense separately.

    Good idea, I haven't done that yet. At the moment it lives in
    Include/bytes_ctype.h and Object/bytes_ctype.c directly. stringlib is a
    good place for it and is something I pondered but hadn't gotten to. I'll do
    that as I implement the remaining missing PyBytes_ methods to be in the next
    update to this patch.

    -gps

    @gvanrossum
    Copy link
    Member

    Is it worth my time to review this yet?

    @gpshead
    Copy link
    Member Author

    gpshead commented Oct 15, 2007

    here's the latest update. this takes care of all methods listed in the
    PEP i believe. please review.

    i'm currently working porting some test case code for more methods from
    string_tests.py to buffer_tests.py to be shared between PyString and
    PyBytes similar to the ones already in the patch that way.

    @gpshead
    Copy link
    Member Author

    gpshead commented Oct 15, 2007

    Same diff 03 here but it also adds unit test coverage for the new
    PyBytes buffer methods.

    @gvanrossum
    Copy link
    Member

    Very impressive!

    (Apologies if these lines are occasionally out of order.)

    +extern PyObject* _bytes_isspace(const char *cptr, const Py_ssize_t len);

    IMO all these functions should have names starting with _Py or _Py_,
    since they are visible to the linker.

    Also, why 'const Py_ssize_t'? It's a scalar!

    +/* these store their len sized answer in the given preallocated *result
    arg */

    Mind using a Capital first letter?

    +extern const char isspace__doc__[];

    This needs a _Py or _Py_ prefix too.

    +extern const unsigned int ctype_table[256];

    Now this is no longer static, it also needs a _Py or _Py_ prefix.

    +extern const unsigned char ctype_tolower[256];
    +extern const unsigned char ctype_toupper[256];

    Ditto (the macros are fine though, since they are only visible to code
    that #includes this file, which is only a few files).

    +Return True if all characters in S are whitespace\n\

    Shouldn't that be bytes instead of characters (and consistently
    throughout)? And probably use B for the variable name instead of S.

    +/* ------- GPS --------- */

    What's this? Your initials? :-) I don't think it's needed. I'm
    guessing you

    /* The nullbytes are used by the stringlib during partition.

    • If partition is removed from bytes, nullbytes and its helper

    This comment block refers to something that ain't gonna happen
    (partition being removed). IMO the whole comment block can go, it's a
    red herring.

    + /* XXX: this depends on a signed integer overflow to < 0 */
    + /* C compilers, including gcc, do -NOT- guarantee this. */

    (And repeated twice more.) Wouldn't it be better to write this in a way
    that doesn't require this XXX comment? ISTR that we already had one
    area where we had to fight with gcc because it had proved to itself that
    something could never be negative, even though it could be due to
    overflow. The GCC authors won. :-(

    +/* TODO(gps):

    Don't you mean XXX(gps)? :-)

    + * These methods still need implementing (porting over from
    stringobject.c):
    + *
    + * IN PROGRESS:
    + * .capitalize(), .title(), .upper(), .lower(), .center(), .zfill(),
    + * .expandtabs(), .ljust(), .rjust(), .swapcase(), .splitlines(),
    + */
    +

    Hmmm... Aren't these done now?

    + /* XXX(gps): the docstring above says any iterable int will do but the
    + * bytes_setslice code only accepts something supporting PEP-3118.
    + * Is a list or tuple of 0 <= ints <= 255 also supposed to work? */

    Yes, it is, and it's a bug that it currently doesn't.

    @gpshead
    Copy link
    Member Author

    gpshead commented Oct 16, 2007

    Ok, I believe I've fixed everything you commented on, accidentally
    included in initials used as a search placeholder included. :)

    •            /* XXX: this depends on a signed integer overflow to
      

    < >0 */

    •            /* C compilers, including gcc, do -NOT- guarantee
      

    this. */

    (And repeated twice more.) Wouldn't it be better to write this in a way
    that doesn't require this XXX comment? ISTR that we already had one
    area where we had to fight with gcc because it had proved to itself that
    something could never be negative, even though it could be due to
    overflow. The GCC authors won. :-(

    This code was copied as is from stringobject.c, I just added the XXX
    notes upon reading it. There is an existing unittest to try and make
    sure the code has compiled properly with these tests (its not
    comprehensive, it'll only test one of these three test cases).
    Regardless, yes, it needs fixing. I didn't want to take that on that as
    part of this patch.

    @gpshead
    Copy link
    Member Author

    gpshead commented Oct 16, 2007

    04b is the same as 04, i just fixed the docstrings that i had missed in
    stringlib/transmogrify.h to use 'B' instead of 'S' and say they return a
    "modified copy of B" instead of a "string"

    wording could probably be improved further if anyone has any ideas.

    @gvanrossum
    Copy link
    Member

    Good! Check it in before I change my mind. :-)

    The words can be tweaked later.

    04b is the same as 04, i just fixed the docstrings that i had missed in
    stringlib/transmogrify.h to use 'B' instead of 'S' and say they return a
    "modified copy of B" instead of a "string"

    wording could probably be improved further if anyone has any ideas.

    @gpshead
    Copy link
    Member Author

    gpshead commented Oct 16, 2007

    Committed revision 58493

    @gpshead gpshead closed this as completed Oct 16, 2007
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    None yet
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants