Removal of basestring type #45599

Closed
tiran opened this issue Oct 10, 2007 · 18 comments

@tiran
Member

tiran commented Oct 10, 2007

BPO 1258
Nosy @gvanrossum, @tiran
Files
  • py3k_basestring_removal.patch
  • py3k_basestring_removal3.patch
  • fix_basestr.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/gvanrossum'
    closed_at = <Date 2007-10-16.18:13:58.125>
    created_at = <Date 2007-10-10.21:23:31.864>
    labels = ['interpreter-core']
    title = 'Removal of basestring type'
    updated_at = <Date 2007-10-24.19:57:40.723>
    user = 'https://github.com/tiran'

    bugs.python.org fields:

    activity = <Date 2007-10-24.19:57:40.723>
    actor = 'gvanrossum'
    assignee = 'gvanrossum'
    closed = True
    closed_date = <Date 2007-10-16.18:13:58.125>
    closer = 'gvanrossum'
    components = ['Interpreter Core']
    creation = <Date 2007-10-10.21:23:31.864>
    creator = 'christian.heimes'
    dependencies = []
    files = ['8506', '8542', '8548']
    hgrepos = []
    issue_num = 1258
    keywords = ['patch']
    message_count = 18.0
    messages = ['56326', '56327', '56328', '56329', '56330', '56331', '56337', '56444', '56446', '56447', '56467', '56469', '56470', '56471', '56473', '56475', '56501', '56725']
    nosy_count = 2.0
    nosy_names = ['gvanrossum', 'christian.heimes']
    pr_nums = []
    priority = 'normal'
    resolution = 'accepted'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue1258'
    versions = ['Python 3.0']

    @tiran
    Member Author

    tiran commented Oct 10, 2007

    The patch removes the basestring type from Python 3.0. PyString and
    PyUnicode are now direct subclasses of PyBaseObject_Type. Each occurrence
    of basestring was replaced with str, mostly in isinstance(egg, basestring)
    checks, with a few exceptions. PyObject_TypeCheck(args, &PyBaseString_Type)
    is replaced with a check for PyUnicode and PyString.
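For Python-level code, the change described above amounts to testing against str instead of basestring. The following is a minimal sketch, assuming code that still has to run on both Python 2 and Python 3. The names string_types and is_text are hypothetical and are not part of the patch.

```python
# Hypothetical compatibility shim; not taken from the patch itself.
try:
    string_types = (basestring,)  # Python 2: common base of str and unicode
except NameError:
    string_types = (str,)         # Python 3: basestring no longer exists


def is_text(value):
    """Return True if value is a text string on the running interpreter."""
    return isinstance(value, string_types)
```

On Python 3 this reduces to isinstance(value, str), so bytes objects are not treated as text.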

    @tiran tiran added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Oct 10, 2007
    @gvanrossum
    Member

    Thanks, evaluating!

    @gvanrossum
    Member

    I see 10 failing tests:

    test_ctypes test_email test_httplib test_inspect test_os test_re
    test_subprocess test_sys test_xml_etree test_xml_etree_c
    

    @gvanrossum gvanrossum self-assigned this Oct 10, 2007
    @tiran
    Member Author

    tiran commented Oct 10, 2007

    test_ctypes: works for me
    test_email: need some help from an email expert
    test_httplib: __file__ has the wrong type, str8. I'm looking into it.
    test_inspect: same issue as httplib
    test_os: same issue
    test_re: this test was already failing before my changes:
        File "Lib/test/test_re.py", line 622, in test_empty_array
        ValueError: bad typecode (must be b, B, u, h, H, i, I, l, L, f or d)
    test_subprocess: I don't understand why it fails. The traceback is missing a line
    test_sys: related to __file__
    test_xml_etree / test_xml_etree_c: a str8 / io error that may be related to __file__

    @gvanrossum
    Member

    On 10/10/07, Christian Heimes <report@bugs.python.org> wrote:

    > Christian Heimes added the comment:
    >
    > test_ctypes: works for me

    Did you svn up, make clean and rebuild?

    > test_email: need some help from an email expert

    Which test is failing?

    > test_httplib: __file__ has a wrong type str8. I'm looking into it.

    Yes, __file__ always has that type. Fixing it is messy because it
    requires using the default filesystem encoding. Can you try that as a
    separate patch?

    > test_inspect: same issue as httplib
    > test_os: same issue
    > test_re: I had the failing test before my changes

    But it passes for me.

    > File "Lib/test/test_re.py", line 622, in test_empty_array
    > ValueError: bad typecode (must be b, B, u, h, H, i, I, l, L, f or d)

    Hm. It passes for me.

    > test_subprocess: I don't understand why it fails. The traceback is
    > missing a line
    > test_sys: related to __file__
    > test_xml_etree / test_xml_etree_c: a str8 / io error that may be related
    > to __file__

    Thanks for looking into these!!

    @tiran
    Member Author

    tiran commented Oct 10, 2007

    Guido van Rossum wrote:

    > Did you svn up, make clean and rebuild?

    The ctypes package hasn't changed since my last rebuild an hour ago. I'm
    on Linux (Ubuntu i386).

    >> test_email: need some help from an email expert
    >
    > Which test is failing?

    test_decoded_generator()
    The generator tries to print a str8 to a text file.

    > Yes, __file__ always has that type. Fixing it is messy because it
    > requires using the default filesystem encoding. Can you try that as a
    > separate patch?

    I'm already working on it. Can I introduce a new function
    _PyUnicode_AsDefaultFSEncodedString that encodes unicode using
    Py_FileSystemDefaultEncoding or UTF-8?

    >> File "Lib/test/test_re.py", line 622, in test_empty_array
    >> ValueError: bad typecode (must be b, B, u, h, H, i, I, l, L, f or d)
    >
    > Hm. It passes for me.

    I'm going to look into the issue later.

    @gvanrossum
    Member

    On 10/10/07, Christian Heimes <report@bugs.python.org> wrote:

    > Christian Heimes added the comment:
    >
    > Guido van Rossum wrote:
    >> Did you svn up, make clean and rebuild?
    >
    > The ctypes package didn't change since my last rebuild an hour ago. I'm
    > on Linux (Ubuntu i386)

    Odd. I'll investigate when I have more time.

    >>> test_email: need some help from an email expert
    >>
    >> Which test is failing?
    >
    > test_decoded_generator()
    > The generator tries to print a str8 to a text file.

    Thought so. I have a tentative fix that I want approved by Barry
    Warsaw before checking it in; you can see if it works for you too:

    --- Lib/email/generator.py      (revision 58412)
    +++ Lib/email/generator.py      (working copy)
    @@ -288,7 +288,7 @@
             for part in msg.walk():
                 maintype = part.get_content_maintype()
                 if maintype == 'text':
    -                print(part.get_payload(decode=True), file=self)
    +                print(part.get_payload(decode=False), file=self)
                 elif maintype == 'multipart':
                     # Just skip this
                     pass
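The one-word change in this diff switches the payload from its decoded form to its stored form. As a rough illustration of the difference, here is a minimal sketch against the current email package, which may not match the py3k code of 2007. The sample message text is made up for the example.

```python
import email

# Made-up sample message; not taken from the test suite.
raw = "Content-Type: text/plain; charset=us-ascii\n\nhello\n"
msg = email.message_from_string(raw)

# decode=False returns the payload as stored, which is a str here.
as_text = msg.get_payload(decode=False)

# decode=True undoes the Content-Transfer-Encoding and returns bytes.
as_bytes = msg.get_payload(decode=True)
```

Printing the bytes form to a text stream is the kind of mismatch that test_decoded_generator ran into with str8.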

    >> Yes, __file__ always has that type. Fixing it is messy because it
    >> requires using the default filesystem encoding. Can you try that as a
    >> separate patch?
    >
    > I'm already working on it. Can I introduce a new function
    > _PyUnicode_AsDefaultFSEncodedString that encodes unicode using
    > Py_FileSystemDefaultEncoding or UTF-8?

    That's a rather long name... I don't think it needs a leading
    underscore. How about

    PyUnicode_AsFSString()?
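The helper discussed here is a C function, but the idea can be sketched at the Python level: encode a text path with the filesystem encoding, falling back to UTF-8 when none is reported. This is only an illustration under that assumption. encode_for_filesystem is a hypothetical name, and the code is not the implementation that was eventually committed.

```python
import os
import sys


def encode_for_filesystem(path, fallback="utf-8"):
    """Encode a str path using the filesystem encoding, or a fallback."""
    encoding = sys.getfilesystemencoding() or fallback
    return path.encode(encoding)


# The standard library offers os.fsencode() (Python 3.2+) for the same job.
stdlib_result = os.fsencode("spam.py")
```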

    @tiran
    Member Author

    tiran commented Oct 15, 2007

    Here is an updated patch which applies cleanly and fixes some additional
    unit tests and removes one that doesn't make sense any more (re.compile
    doesn't accept bytes).

    The unit tests profile, cProfile and doctest fail w/ and w/o this patch.
    They seem to suffer from the latest changes of our previous patch and
    additional calls to utf_8_decode().

    @gvanrossum
    Member

    Hm? This is a one-word patch to email/generator.py.

    On 10/15/07, Christian Heimes <report@bugs.python.org> wrote:

    > Christian Heimes added the comment:
    >
    > Here is an updated patch which applies cleanly and fixes some additional
    > unit tests and removes one that doesn't make sense any more (re.compile
    > doesn't accept bytes).
    >
    > The unit tests profile, cProfile and doctest fail w/ and w/o this patch.
    > They seem to suffer from the latest changes of our previous patch and
    > additional calls to utf_8_decode().

    @tiran
    Member Author

    tiran commented Oct 15, 2007

    > Hm? This is a one-word patch to email/generator.py.

    Yes, I already noticed it and I'm creating a new patch now. I saw your
    fix for the email generator problem in the bug report and wanted to add
    it to my patch. I accidentally replaced the patch with the one liner.

    Here is the new patch

    @gvanrossum
    Member

    > The unit tests profile, cProfile and doctest fail w/ and w/o this patch.
    > They seem to suffer from the latest changes of our previous patch and
    > additional calls to utf_8_decode().

    Any details on those? They don't fail for me.

    @gvanrossum
    Member

    I'll check this in as soon as there's agreement on the list about this.

    Not that I expect disagreement, but I just realized it was never brought
    up and it isn't in PEP-3137 (yet).

    @tiran
    Member Author

    tiran commented Oct 15, 2007

    > Any details on those? They don't fail for me.

    Here you are.

    $ ./python Lib/test/test_cProfile.py
         121 function calls (101 primitive calls) in 1.000 CPU seconds

    Ordered by: standard name

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000    1.000    1.000 <string>:1(<module>)
            8    0.064    0.008    0.080    0.010 test_cProfile.py:103(subhelper)
           28    0.028    0.001    0.028    0.001 test_cProfile.py:115(getattr)
            1    0.270    0.270    1.000    1.000 test_cProfile.py:30(testfunc)
         23/3    0.150    0.007    0.170    0.057 test_cProfile.py:40(factorial)
           20    0.020    0.001    0.020    0.001 test_cProfile.py:53(mul)
            2    0.040    0.020    0.600    0.300 test_cProfile.py:60(helper)
            4    0.116    0.029    0.120    0.030 test_cProfile.py:78(helper1)
            2    0.000    0.000    0.140    0.070 test_cProfile.py:89(helper2_indirect)
            8    0.312    0.039    0.400    0.050 test_cProfile.py:93(helper2)
            1    0.000    0.000    0.000    0.000 utf_8.py:15(decode)
            1    0.000    0.000    0.000    0.000 {_codecs.utf_8_decode}
            1    0.000    0.000    1.000    1.000 {exec}
           12    0.000    0.000    0.012    0.001 {hasattr}
            4    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
            1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
            4    0.000    0.000    0.000    0.000 {sys.exc_info}

    Ordered by: standard name

    Function                                            called...
                                                        ncalls  tottime  cumtime
    <string>:1(<module>)                             ->      1    0.270    1.000  test_cProfile.py:30(testfunc)
    test_cProfile.py:103(subhelper)                  ->     16    0.016    0.016  test_cProfile.py:115(getattr)
    test_cProfile.py:115(getattr)                    ->
    test_cProfile.py:30(testfunc)                    ->      1    0.014    0.130  test_cProfile.py:40(factorial)
                                                             2    0.040    0.600  test_cProfile.py:60(helper)
    test_cProfile.py:40(factorial)                   ->   20/3    0.130    0.147  test_cProfile.py:40(factorial)
                                                            20    0.020    0.020  test_cProfile.py:53(mul)
    test_cProfile.py:53(mul)                         ->
    test_cProfile.py:60(helper)                      ->      4    0.116    0.120  test_cProfile.py:78(helper1)
                                                             2    0.000    0.140  test_cProfile.py:89(helper2_indirect)
                                                             6    0.234    0.300  test_cProfile.py:93(helper2)
    test_cProfile.py:78(helper1)                     ->      4    0.000    0.004  {hasattr}
                                                             4    0.000    0.000  {method 'append' of 'list' objects}
                                                             4    0.000    0.000  {sys.exc_info}
    test_cProfile.py:89(helper2_indirect)            ->      2    0.006    0.040  test_cProfile.py:40(factorial)
                                                             2    0.078    0.100  test_cProfile.py:93(helper2)
    test_cProfile.py:93(helper2)                     ->      8    0.064    0.080  test_cProfile.py:103(subhelper)
                                                             8    0.000    0.008  {hasattr}
    utf_8.py:15(decode)                              ->      1    0.000    0.000  {_codecs.utf_8_decode}
    {_codecs.utf_8_decode}                           ->
    {exec}                                           ->      1    0.000    1.000  <string>:1(<module>)
                                                             1    0.000    0.000  utf_8.py:15(decode)
    {hasattr}                                        ->     12    0.012    0.012  test_cProfile.py:115(getattr)
    {method 'append' of 'list' objects}              ->
    {method 'disable' of '_lsprof.Profiler' objects} ->
    {sys.exc_info}                                   ->

    Ordered by: standard name

    Function                                            was called by...
                                                        ncalls  tottime  cumtime
    <string>:1(<module>)                             <-      1    0.000    1.000  {exec}
    test_cProfile.py:103(subhelper)                  <-      8    0.064    0.080  test_cProfile.py:93(helper2)
    test_cProfile.py:115(getattr)                    <-     16    0.016    0.016  test_cProfile.py:103(subhelper)
                                                            12    0.012    0.012  {hasattr}
    test_cProfile.py:30(testfunc)                    <-      1    0.270    1.000  <string>:1(<module>)
    test_cProfile.py:40(factorial)                   <-      1    0.014    0.130  test_cProfile.py:30(testfunc)
                                                          20/3    0.130    0.147  test_cProfile.py:40(factorial)
                                                             2    0.006    0.040  test_cProfile.py:89(helper2_indirect)
    test_cProfile.py:53(mul)                         <-     20    0.020    0.020  test_cProfile.py:40(factorial)
    test_cProfile.py:60(helper)                      <-      2    0.040    0.600  test_cProfile.py:30(testfunc)
    test_cProfile.py:78(helper1)                     <-      4    0.116    0.120  test_cProfile.py:60(helper)
    test_cProfile.py:89(helper2_indirect)            <-      2    0.000    0.140  test_cProfile.py:60(helper)
    test_cProfile.py:93(helper2)                     <-      6    0.234    0.300  test_cProfile.py:60(helper)
                                                             2    0.078    0.100  test_cProfile.py:89(helper2_indirect)
    utf_8.py:15(decode)                              <-      1    0.000    0.000  {exec}
    {_codecs.utf_8_decode}                           <-      1    0.000    0.000  utf_8.py:15(decode)
    {exec}                                           <-
    {hasattr}                                        <-      4    0.000    0.004  test_cProfile.py:78(helper1)
                                                             8    0.000    0.008  test_cProfile.py:93(helper2)
    {method 'append' of 'list' objects}              <-      4    0.000    0.000  test_cProfile.py:78(helper1)
    {method 'disable' of '_lsprof.Profiler' objects} <-
    {sys.exc_info}                                   <-      4    0.000    0.000  test_cProfile.py:78(helper1)
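For readers who want to produce the same three report sections (the flat listing, the callees table and the callers table), here is a minimal sketch using cProfile and pstats. The profiled factorial function is a stand-in invented for the example and is not the code in test_cProfile.py.

```python
import cProfile
import io
import pstats


def factorial(n):
    """Stand-in workload for the example; recursive on purpose."""
    return 1 if n <= 1 else n * factorial(n - 1)


profiler = cProfile.Profile()
profiler.enable()
factorial(10)
profiler.disable()

# Collect the report in memory instead of printing to stdout.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer).sort_stats("stdname")
stats.print_stats()    # flat listing, "Ordered by: standard name"
stats.print_callees()  # "Function called..." table
stats.print_callers()  # "Function was called by..." table
report = buffer.getvalue()
```

Any extra call made while the profiler is active, such as the utf_8.py decode entries in the output above, shows up as additional rows. That is enough to break a test that compares the report against a fixed expected text.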

    ####################################
    $ ./python Lib/test/test_doctest.py

    doctest (doctest) ... 66 tests with zero failures
    **********************************************************************
    File "/home/heimes/dev/python/py3k/Lib/test/test_doctest.py", line 1570, in test.test_doctest.test_debug
    Failed example:
        try: doctest.debug_src(s)
        finally: sys.stdin = real_stdin
    Expected:
        > <string>(1)<module>()
        (Pdb) next
        12
        --Return--
        > <string>(1)<module>()->None
        (Pdb) print(x)
        12
        (Pdb) continue
    Got:
        > /home/heimes/dev/python/py3k/Lib/encodings/utf_8.py(16)decode()
        -> return codecs.utf_8_decode(input, errors, True)
        (Pdb) next
        --Return--
        > /home/heimes/dev/python/py3k/Lib/encodings/utf_8.py(16)decode()->('<string>', 8)
        -> return codecs.utf_8_decode(input, errors, True)
        (Pdb) print(x)
        *** NameError: NameError("name 'x' is not defined",)
        (Pdb) continue
        12
    **********************************************************************
    1 items had failures:
       1 of   4 in test.test_doctest.test_debug
    ***Test Failed*** 1 failures.
    Traceback (most recent call last):
      File "Lib/test/test_doctest.py", line 2422, in <module>
        test_main()
      File "Lib/test/test_doctest.py", line 2406, in test_main
        test_support.run_doctest(test_doctest, verbosity=True)
      File "/home/heimes/dev/python/py3k/Lib/test/test_support.py", line 569, in run_doctest
        raise TestFailed("%d of %d doctests failed" % (f, t))
    test.test_support.TestFailed: 1 of 414 doctests failed

    ####################################
    $ ./python Lib/test/test_email.py

    Traceback (most recent call last):
      File "Lib/test/test_email.py", line 13, in <module>
        test_main()
      File "Lib/test/test_email.py", line 10, in test_main
        test_support.run_unittest(suite())
      File "/home/heimes/dev/python/py3k/Lib/test/test_support.py", line 541, in run_unittest
        _run_suite(suite)
      File "/home/heimes/dev/python/py3k/Lib/test/test_support.py", line 524, in _run_suite
        raise TestFailed(err)
    test.test_support.TestFailed: Traceback (most recent call last):
      File "/home/heimes/dev/python/py3k/Lib/email/test/test_email.py", line 1445, in test_same_boundary_inner_outer
        msg = self._msgobj('msg_15.txt')
      File "/home/heimes/dev/python/py3k/Lib/email/test/test_email.py", line 67, in _msgobj
        return email.message_from_file(fp)
      File "/home/heimes/dev/python/py3k/Lib/email/__init__.py", line 46, in message_from_file
        return Parser(*args, **kws).parse(fp)
      File "/home/heimes/dev/python/py3k/Lib/email/parser.py", line 68, in parse
        data = fp.read(8192)
      File "/home/heimes/dev/python/py3k/Lib/io.py", line 1240, in read
        readahead, pending = self._read_chunk()
      File "/home/heimes/dev/python/py3k/Lib/io.py", line 1136, in _read_chunk
        pending = self._decoder.decode(readahead, not readahead)
      File "/home/heimes/dev/python/py3k/Lib/codecs.py", line 291, in decode
        (result, consumed) = self._buffer_decode(data, self.errors, final)
    UnicodeDecodeError: 'utf8' codec can't decode byte 0xbe in position 86: unexpected code byte
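The last traceback boils down to a text file being decoded as UTF-8 although it contains a byte that is not valid UTF-8. A minimal sketch of that failure follows. The sample bytes are invented, and decoding as Latin-1 is only an assumption for illustration, since the thread does not state the actual encoding of msg_15.txt.

```python
# Invented sample data containing the same offending byte, 0xbe.
data = b"caf\xbe"

try:
    data.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc  # 0xbe is a continuation byte with no lead byte before it

# Assumption for illustration only: Latin-1 maps every byte to a character.
text = data.decode("latin-1")
```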

    @gvanrossum
    Member

    BTW we need a 2to3 fixer for this. Should be trivial -- just replace
    *all* occurrences of basestring with str.
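The replacement rule is simple enough to sketch without the 2to3 machinery. The version below uses the standard tokenize module so that only NAME tokens are rewritten and string literals or comments that merely mention basestring are left alone. It is an illustration only: replace_basestring is a hypothetical helper, not the fixer that was committed.

```python
import io
import tokenize


def replace_basestring(source):
    """Return source with every NAME token 'basestring' renamed to 'str'."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string == "basestring":
            # Keep the original positions; untokenize() derives the
            # spacing from the gap between consecutive tokens.
            tok = tok._replace(string="str")
        tokens.append(tok)
    return tokenize.untokenize(tokens)


converted = replace_basestring('isinstance(x, basestring) or "basestring"\n')
```

A real fixer works on the parse tree rather than on raw tokens, which lets it preserve formatting reliably across a whole file.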

    @gvanrossum
    Member

    Even before this patch, the re module doesn't work very well on byte
    strings. IMO this should be fixed. I've filed a separate bug to remind
    us: bug 1282.

    @tiran
    Member Author

    tiran commented Oct 15, 2007

    Guido van Rossum wrote:

    > BTW we need a 2to3 fixer for this. Should be trivial -- just replace
    > *all* occurrences of basestring with str.

    I believe you that it's trivial for *you* but I've never dealt with the
    fixers or the grammar. Fortunately for me I was able to copy the fixer
    for standarderror. It took just some minor tweaks :)

    Let's see if the mail interface can handle attachments.

    @gvanrossum
    Member

    Committed revision 58495.

    Thanks Christian!!!

    @gvanrossum
    Member

    2007/10/15, Christian Heimes <report@bugs.python.org>:

    > Christian Heimes added the comment:
    >
    > Guido van Rossum wrote:
    >> BTW we need a 2to3 fixer for this. Should be trivial -- just replace
    >> *all* occurrences of basestring with str.
    >
    > I believe you that it's trivial for *you* but I've never dealt with the
    > fixers or the grammar. Fortunately for me I was able to copy the fixer
    > for standarderror. It took just some minor tweaks :)
    >
    > Let's see if the mail interface can handle attachments.

    It did. :-) I renamed it to fix_basestring and submitted it. See:

    Committed revision 58644.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022