classification
Title: Add 2to3 fixer to change send and recv methods of socket object.
Type: enhancement Stage: resolved
Components: 2to3 (2.x to 3.x conversion tool) Versions:
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: benjamin.peterson Nosy List: benjamin.peterson, devarakondapranav, rhettinger, serhiy.storchaka, xtreak
Priority: normal Keywords:

Created on 2018-10-04 14:45 by devarakondapranav, last changed 2018-11-27 04:41 by benjamin.peterson. This issue is now closed.

Files
File name Uploaded Description Edit
fix_socket_send_recv.py devarakondapranav, 2018-10-04 17:56 Fix to change send and recv of socket.
fix_socket_send_recv_updated.py devarakondapranav, 2018-10-06 18:15 Updated fixer that handles cases when data is converted to bytes.
fix_socket_send_recv_reupdated.py devarakondapranav, 2018-10-12 03:00
Messages (14)
msg327053 - (view) Author: Pranav Devarakonda (devarakondapranav) * Date: 2018-10-04 14:45
The send() method of the the Socket object in Python 3.x requires the data to be sent to be first converted into bytes(object) which was not the case with Python 2.x. The 2to3 tool doesn't handle this case and hence an explicit fixer would help many beginners learning the socket module(most tutorials regarding this online are in Python 2.x)

Similarly the fixer can change the recv() method by converting the returned bytes object back to a str object.
For example, consider the following lines in Python 2.x, (for demonstration only)

s.send("Foo") #where 's' is a socket object
data = s.recv(1024)

After the 2to3 fixer has been applied the above lines will change to,

s.send(str.encode("Foo"))
data = s.recv(1024).decode("utf-8")

PS: I am a beginner in open source so any changes or suggestions are welcome.
msg327064 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2018-10-04 16:18
Thanks for the report. In my opinion it's difficult to handle this scenario and as far as I know 2to3 can only do syntax level changes with modular fixers for each case. Since Python is dynamic and there is no static type system available it's difficult to differentiate between a socket object and any other object that has the same API with send and recv methods like below : 

if foo():
    s = Socket()
else:
    s = AnotherSocketAPI()

Hence in this case s might be another object that might have the same interface like Socket expecting string for send and recv. Since there is no static typing in Python it depends on runtime introspection to get it right. I will wait for others opinion on this.
msg327070 - (view) Author: Pranav Devarakonda (devarakondapranav) * Date: 2018-10-04 17:56
Thanks for taking time and updating this, Karthikeyan Singaravelan. I do agree that there there is no proper way to find out if an object is of type socket or not. 

However other fixers in lib2to3 are not any different. For example the fix_dict.py changes every instance of viewkeys() method to keys() irrespective of the type of the object. And I guess that applies to all other fixers in lib2to3 as well. So that convinced me that a fix for this can also be accommodated.

A compromise that we could foster, as I already mentioned is to make this fix explicit so users can use this only if they need it. Please find attached the fixer I wrote. I haven't made a PR yet.
msg327072 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2018-10-04 18:16
Ah, sorry I totally forgot about dict fixer. Thanks for the details. I would wait for Benjamin's call on this.

Thanks
msg327079 - (view) Author: Pranav Devarakonda (devarakondapranav) * Date: 2018-10-04 19:47
Thanks Karthikeyan
msg327198 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2018-10-05 23:23
> Please find attached the fixer I wrote. 

Wouldn't this break programs that had already run encode() on the input (as they were already supposed to be doing, and as they would be doing for code that runs under both Python2 and Python3)?
msg327222 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-10-06 05:30
I'm sure that this would break the code which sends bytes objects and expects to receive bytes objects.

s.send(struct.pack('!I', x))
q, w, e = struct.unpack('!IHQ', s.recv(4))
msg327257 - (view) Author: Pranav Devarakonda (devarakondapranav) * Date: 2018-10-06 18:15
Thanks for pointing out these cases. I kept in mind the Python 2 documentation which says socket.send method expects a string and hence made that fixer. I have tweaked that fixer to handle the pointed cases and a few additional ones too. Please find attached the updated fixer. To summarize, the following send() cases would not be changed,

data = "Data"
s.send(str.encode(data))
s.send(data.encode('utf-8))
s.send(bytes(data, 'utf-8'))
s.send(struct.pack('!I', x))

Similary, the following recv() cases would not be changed, 

data = s.recv(1024).decode('utf-8')
q, w, e = struct.unpack('!IHQ', s.recv(4))

The remaining generic cases will be handled as expected. I know we cannot handle cases where the data has already been converted to bytes(since there is no way to find the 'type' of the object here) and then sent as an argument to socket.send(). But if we have to go by the documentation, then this is probably the right thing to do.
msg327295 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2018-10-07 17:53
> I have tweaked that fixer to handle the pointed cases 
> and a few additional ones too

Playing whack-a-mole with a few cases will always fall short of being able to guarantee correct transformations and not break existing code that is working correctly.

One possibility is to add a type check to the generated code, "send(x)" -> send(x.encode() if type(x)==bytes else x)" but this has its own issues and isn't satisfying.
msg327299 - (view) Author: Pranav Devarakonda (devarakondapranav) * Date: 2018-10-07 18:10
> One possibility is to add a type check to the generated code, "send(x)" -> send(x.encode() if type(x)==bytes else x)"

That would have solved the problem. However we cannot check the type of the object while the parsing is going on. For example, printing out the type(x) for the above example in any of the fixers would print "lib2to3.pytree.Node" (or "lib2to3.pytree.Leaf" depending on the object) but not the expected type() of the object like str or bytes.

I haven't found out any method that can actually find out the type of the object in the fixer. If that can be done, then writing the fixer is pretty much straight forward. 

The other fixers in lib2to3, for example fix_dict.py, would convert all instances of viewkeys() to keys() irrespective of the type of the object that has called the method. That is also the case with all other fixers as well. Would appreciate very much if somebody can suggest how to do this.

But since that is not the case, the fixer code has to handle these cases individually and I expect the current fixer to do a good job for the same.
msg327300 - (view) Author: Pranav Devarakonda (devarakondapranav) * Date: 2018-10-07 18:24
I am sorry. I got you completely wrong with this > "One possibility is to add a type check to the generated code"

I thought you were asking to check the type in the fixer itself.
I get it now. So you meant to add the type check for the changed/generate code.I think that is a good idea.
 
> but this has its own issues and isn't satisfying.
Why do you think this way? Can you please elaborate?
msg327320 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2018-10-08 03:42
Benjamin, I'm dubious about this going forward, but it is up to you.
msg327557 - (view) Author: Pranav Devarakonda (devarakondapranav) * Date: 2018-10-12 03:00
I have added a final condition that converts the arguments passed to bytes only if the type of object is socket and the method is send() in the generated code. All the other conversions still function as expected.
msg330489 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2018-11-27 04:41
I concur with Raymond. Thank you for the patch, but this seems too error-prone to commit.
History
Date User Action Args
2018-11-27 04:41:27benjamin.petersonsetstatus: open -> closed
resolution: rejected
messages: + msg330489

stage: resolved
2018-10-12 03:01:00devarakondapranavsetfiles: + fix_socket_send_recv_reupdated.py

messages: + msg327557
2018-10-08 03:42:59rhettingersetassignee: benjamin.peterson
messages: + msg327320
2018-10-07 18:24:05devarakondapranavsetmessages: + msg327300
2018-10-07 18:10:04devarakondapranavsetmessages: + msg327299
2018-10-07 17:53:29rhettingersetmessages: + msg327295
2018-10-06 18:15:16devarakondapranavsetfiles: + fix_socket_send_recv_updated.py

messages: + msg327257
2018-10-06 05:30:57serhiy.storchakasetmessages: + msg327222
2018-10-05 23:23:46rhettingersetnosy: + serhiy.storchaka, rhettinger
messages: + msg327198
2018-10-04 19:47:00devarakondapranavsetmessages: + msg327079
2018-10-04 18:16:24xtreaksetmessages: + msg327072
2018-10-04 17:56:23devarakondapranavsetfiles: + fix_socket_send_recv.py

messages: + msg327070
2018-10-04 16:18:59xtreaksetnosy: + xtreak
messages: + msg327064
2018-10-04 14:45:00devarakondapranavcreate