Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use getaddrinfo() in urllib2.py for IPv6 support #44672

Open
dcantrell-rh mannequin opened this issue Mar 7, 2007 · 12 comments
Open

Use getaddrinfo() in urllib2.py for IPv6 support #44672

dcantrell-rh mannequin opened this issue Mar 7, 2007 · 12 comments
Assignees
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@dcantrell-rh
Copy link
Mannequin

dcantrell-rh mannequin commented Mar 7, 2007

BPO 1675455
Nosy @facundobatista, @orsenthil, @ned-deily
Files
  • urllib2-getaddrinfo.patch: Patch for Lib/urllib2.py replacing gethostbyname() calls with getaddrinfo() calls
  • urllib2-getaddrinfo.patch
  • test_urllib2-getaddrinfo.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/orsenthil'
    closed_at = None
    created_at = <Date 2007-03-07.05:19:06.000>
    labels = ['type-feature', 'library']
    title = 'Use getaddrinfo() in urllib2.py for IPv6 support'
    updated_at = <Date 2012-07-20.14:31:47.568>
    user = 'https://bugs.python.org/dcantrell-rh'

    bugs.python.org fields:

    activity = <Date 2012-07-20.14:31:47.568>
    actor = 'Ramchandra Apte'
    assignee = 'orsenthil'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2007-03-07.05:19:06.000>
    creator = 'dcantrell-rh'
    dependencies = []
    files = ['7833', '8464', '8465']
    hgrepos = []
    issue_num = 1675455
    keywords = ['patch']
    message_count = 12.0
    messages = ['52082', '52083', '56125', '69209', '78716', '78717', '78750', '78754', '78759', '84810', '116619', '165931']
    nosy_count = 7.0
    nosy_names = ['facundobatista', 'jjlee', 'orsenthil', 'dcantrell-rh', 'dmorr', 'ned.deily', 'Ramchandra Apte']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue1675455'
    versions = ['Python 3.2']

    @dcantrell-rh
    Copy link
    Mannequin Author

    dcantrell-rh mannequin commented Mar 7, 2007

    A number of base Python modules use gethostbyname() when they should be using getaddrinfo(). The big limitation hit when using gethostbyname() is the lack of IPv6 support.

    This first patch is for urllib2.py. It replaces all uses of gethostbyname() with getaddrinfo() instead. getaddrinfo() returns a 5-tuple, so additional code needs to wrap a getaddrinfo() call when replacing gethostbyname() calls. Still should be pretty simple to read.

    I'd like to see this patch added to the next stable release of Python, if at all possible. I am working up patches for the other modules I see in the Lib/ subdirectory that could use getaddrinfo() instead of gethostbyname().

    @dcantrell-rh dcantrell-rh mannequin added stdlib Python modules in the Lib dir labels Mar 7, 2007
    @jjlee
    Copy link
    Mannequin

    jjlee mannequin commented Jul 11, 2007

    • Where are the tests? A functional test, perhaps in test_urllib2net.py, for IPv6 support in urllib2 would be especially welcome, I think.
    • Why does .check_host() not begin with an underscore?
    • "check_host" is a poor name. How about "_is_localhost"?
    • locals is a built-in function, hence usually considered good style not to use it as a name.
    • Is is necessary to call make_host_tuple(searchlist) twice?
    • The patch appears to fix several bugs at once (e.g. adding a try: / except: suite around a large part of an existing method to catch socket.error).

    @orsenthil
    Copy link
    Member

    Hi,
    The patch attached required a complete rewrite. I am attaching the
    modified patch, which will just substitute socket.gethostbyname with a
    function gethost_addrinfo which internally uses getaddrinfo and takes
    care of the IPv4 or IPv6 addresses translation.

    jjlee, skip: let me know your comments on this.

    One note we have to keep in mind is, testing on IPv6 address.
    For eg. on my system /etc/hosts
    10.98.1.6 goofy.goofy.com
    #fe80::219:5bff:fefd:6270 localhost
    127.0.0.1 localhost

    test_urllib2 will PASS for the above.
    But if I uncomment the IPv6 address, opening the local file fails. I am
    not sure how local file access is done with IPv6 and should urllib2
    (local file opening function) itself needs to be modified. Shall check
    into that, with next version.

    @orsenthil orsenthil added type-feature A feature request or enhancement labels Jun 30, 2008
    @facundobatista
    Copy link
    Member

    What I don't understand here is... if gethostbyname() lacks of IPv6
    support, instead of creating a new function why not to add the
    functionality to that same function?

    Right now gethostbyname() is implemented in C, which would be the
    drawback of making it a Python function?

    @dmorr
    Copy link
    Mannequin

    dmorr mannequin commented Jan 1, 2009

    Senthil,

    I don't think your gethost_addrinfo() function will work. On a v6-
    enabled machine, it will only return v6 or v4 names. Shouldn't it
    return both (since a machine could have both v4 and v6 addresses)? For
    example, on my machine, I have the following addresses for
    "localhost": ::1, fe80::1%lo0, 127.0.0.1.

    Also, why is the AI_CANONNAME flag set? The canonname field isn't used.
    And you only appear to take the last IP address returned (sa[0]).
    Shouldn't you return all the addresses?

    @dmorr
    Copy link
    Mannequin

    dmorr mannequin commented Jan 1, 2009

    Question: Why does FTPHandler.ftp_open() try to resolve the hostname()?
    The hostname will be passed into connect_ftp(), then into
    urllib.ftpwrapper(), and eventually into ftplib.FTP.connect(), which is
    IPv6-aware.

    @orsenthil
    Copy link
    Member

    Derek,

    This patch was along the lines that when IPv6 address is present, return
    the first address,which I assumed to be active address and would make
    the urllib2 work.

    I am not sure, if returning all the addresses would help and how would
    we define which address to use?

    AI_CANONNAME flag, I don't accurately remember it now. But I had
    encountered issues when testing on IPv-4 systems without it.

    I am having different opinion on this issue now.

    First is, taking from Facundo's comment on having this functionality in
    gethostbyname() and implementing it in C.

    Second is, the wrapper function and suitable way needs to be defined.

    I am sorry, I fail to understand the question on why ftp_open does
    hostname resolution. You mean to say without it, if we pass it to
    ftplib.FTP.connect() it would work for IPv6 address?

    @dmorr
    Copy link
    Mannequin

    dmorr mannequin commented Jan 2, 2009

    My understanding is that the FileHandler checks if the file:// URL
    contains the hostname or localhost IP of the local machine (isn't that
    what FileHandler.names is for?). So, shouldn't the following URLs all
    open the same file:

    file:///foo.txt
    file://localhost/foo.txt
    file://127.0.0.1/foo.txt
    file://[::1]/foo.txt

    If that is the case, then doesn't FileHandler.names need to have all of
    those values in it?

    I am a little confused by this though. It looks like
    FileHandler.file_open() checks if there is a hostname in the URL, and
    if so, uses FTPHandler instead. So why does FileHandler.open_local_file
    check the hostname value?

    For your other points, gethostbyname() in libc can only handle IPv4
    addresses. The IETF defined the getaddrinfo() interface as an IP
    version neutral replacement. I would recommend using getaddrinfo().

    Yes, FTPHandler creates an urllib.FTPWrapper object. That object calls
    into ftplib, which is already IPv6-capable. So, I don't think we need
    to do hostname resolution in FTPHandler.

    @orsenthil
    Copy link
    Member

    I am a little confused by this though. It looks like
    FileHandler.file_open() checks if there is a hostname in the URL, and
    if so, uses FTPHandler instead. So why does FileHandler.open_local_file
    check the hostname value?

    You are right. Even I had observed this, but did not dispute it. Let
    me try to look into the history to see why it so. Perhaps it needs to
    change.

    For your other points, gethostbyname() in libc can only handle IPv4
    addresses. The IETF defined the getaddrinfo() interface as an IP
    version neutral replacement. I would recommend using getaddrinfo().
    Yes, FTPHandler creates an urllib.FTPWrapper object. That object calls
    into ftplib, which is already IPv6-capable. So, I don't think we need
    to do hostname resolution in FTPHandler.

    Thanks for the info. I shall look into both in revision of the path.

    1. using getaddrinfo() for IP version neutral call.
    2. passing the hostname directly to ftplib. ( I am not sure of
      consequences, need to investigate).

    @ned-deily
    Copy link
    Member

    Note also bpo-5625 - any work for IPv6 should keep in mind that local
    hosts may have more than one IP address.

    @BreamoreBoy
    Copy link
    Mannequin

    BreamoreBoy mannequin commented Sep 16, 2010

    @senthil should this be assigned to your good self?

    @RamchandraApte
    Copy link
    Mannequin

    RamchandraApte mannequin commented Jul 20, 2012

    Bump.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants