Author baikie
Recipients baikie, jackdied, synapse, therve, wiml
Date 2010-03-01.01:03:42
Message-id <1267405426.53.0.174632266739.issue6560@psf.upfronthosting.co.za>
In-reply-to
Content
Thanks for your interest!  I'm still working on the patch I
posted, along with docs and a test suite, and I'll post something
soon.

Yes, you could just use b"".join() with sendmsg() (and get
slightly annoyed because it doesn't accept buffers ;) ).  I made
sendmsg() take multiple buffers because that's the way the system
call works, but also to match recvmsg_into(), which gives you the
convenience of being able to receive part of the message into a
bytearray and part into an array.array("i"), say, if that's how
the data is formatted.
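
As a rough illustration of the multiple-buffer idea, here is a minimal sketch.  It assumes sendmsg() and recvmsg_into() behave as the socket module documents them in Python 3.3 and later (the patch under discussion may differ in detail), and the function name and the "HDR!" header are made up for the example.

```python
import array
import socket


def scatter_gather_demo():
    """Send a bytes header plus an int array, then receive each part
    into a differently typed buffer.  Hypothetical helper for illustration."""
    # A datagram socketpair keeps message boundaries, so one sendmsg()
    # corresponds to exactly one recvmsg_into().
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        header = b"HDR!"
        values = array.array("i", [1, 2, 3])

        # Gather-write: both buffers go out as a single message,
        # without a b"".join() in Python first.
        a.sendmsg([header, values])

        # Scatter-read: the first 4 bytes land in the bytearray,
        # the remainder in the integer array.
        head_buf = bytearray(len(header))
        int_buf = array.array("i", [0, 0, 0])
        nbytes, ancdata, msg_flags, address = b.recvmsg_into([head_buf, int_buf])
        return nbytes, bytes(head_buf), int_buf.tolist()
    finally:
        a.close()
        b.close()
```

The receiving side never has to slice the message or call struct.unpack(); the integers arrive directly in the array in native byte order.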

As you might know, gather-write with sendmsg() can give a
performance benefit by letting the kernel assemble the message
while copying the data from userspace rather than having
userspace copy the data once to form the message and then having
the kernel copy it again when the system call is made.  I suppose
with Python you just need a larger message to see the benefit :)
Since it can read from buffers, though, socket.sendmsg() can pull
a large chunk of data straight out of an mmap object, say, and
attach headers from a bytes object without the mmapped data being
touched by Python at all (or even entering userspace, in this
case).
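
A sketch of that mmap case, again assuming the sendmsg() signature documented for Python 3.3 and later.  The helper name, the temporary file and the payload sizes are hypothetical, and whether the kernel really avoids an extra copy depends on the platform.

```python
import mmap
import socket
import tempfile


def send_header_and_mmap(header, payload):
    """Send `header` (bytes) followed by the contents of a memory-mapped
    file in one sendmsg() call.  Hypothetical helper for illustration."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        with tempfile.TemporaryFile() as f:
            # Stand-in for a real data file.
            f.write(payload)
            f.flush()
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                # The mmap object is passed as a buffer; Python code never
                # copies or slices the mapped data.
                sent = a.sendmsg([header, mm])
                received = b.recv(65536)
        return sent, received
    finally:
        a.close()
        b.close()
```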

The patch is for 3.x, BTW - "y*" is valid there (and does take a
buffer).

As for a good reference, I haven't personally seen one.  There's
POSIX and RFC 3542, but they don't provide a huge amount of
detail.  Perhaps the (updated) W. Richard Stevens networking
books?  I've got the Stevens/Rago second edition of Advanced
Programming in the Unix Environment, which discusses FD and
credential passing with sendmsg/recvmsg, but not very well (it
misuses CMSG_LEN, for one thing).  The networking books were
updated by different people though, so perhaps they do better.
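
For what it's worth, FD passing can be sketched from Python roughly as below.  This assumes the sendmsg()/recvmsg() interface and the socket.CMSG_LEN() helper as documented in Python 3.3 and later; send_fd() and recv_fds() are names invented for the example, not part of any patch.

```python
import array
import os
import socket


def send_fd(sock, fd):
    """Pass one file descriptor over a Unix socket using SCM_RIGHTS."""
    fds = array.array("i", [fd])
    # At least one byte of ordinary data is sent alongside the
    # ancillary data.
    sock.sendmsg([b"x"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fds(sock, maxfds=1):
    """Receive up to `maxfds` descriptors; returns them as a list of ints."""
    fds = array.array("i")
    # CMSG_LEN() gives the header size plus payload with no trailing
    # padding, which is enough room for `maxfds` descriptors.
    msg, ancdata, msg_flags, address = sock.recvmsg(
        1, socket.CMSG_LEN(maxfds * fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any truncated trailing bytes.
            usable = len(data) - (len(data) % fds.itemsize)
            fds.frombytes(data[:usable])
    return list(fds)
```

The received descriptor is a new number in the receiving process that refers to the same open file as the one that was sent.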

The question of whether to use CMSG_NXTHDR() to step to the next
header when constructing the buffer for sendmsg() is a bit murky,
in particular.  I've assumed that this is the way to do it since
the examples in RFC 3542 (and most of the code I've seen
generally) use CMSG_FIRSTHDR() to get the initial pointer, but
I've found that glibc's CMSG_NXTHDR() can (wrongly, I think)
return NULL if the buffer hasn't been zero-filled beforehand
(this causes segfaults with the patch I initially posted).

@Wim:

Yes, the rfc3542 module from that package looks as if it would be
usable with these patches - although it's Python 2-only, GPL-only
and looks unmaintained.  Those kinds of ancillary data
constructors will actually be needed to make fully portable use of
sendmsg() and recvmsg() for things like IPv6, SCTP, Linux's
socket error queues, etc.  The same goes for data for the
existing get/setsockopt() methods, in fact - the present
suggestion to use the struct module is pretty inadequate when
there are typedefs involved and implementations might add and
reorder fields, etc.

The objects in that package seem a bit overcomplicated, though,
messing about with setter methods instead of just subclassing
"bytes" and having different constructors to create the object
from individual arguments or received bytes (say, ucred(1, 2, 3)
or ucred.from_bytes(...)).
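
Something along these lines is what I have in mind.  It is only a sketch: the "iII" layout is an assumption about Linux's struct ucred (pid, uid, gid), and it uses the struct module for brevity even though a real implementation would want the layout from the C headers.

```python
import struct


class ucred(bytes):
    """Credentials as a bytes subclass, usable directly as ancillary data.

    The field layout below is an assumption (Linux: int pid, unsigned
    uid, unsigned gid), not something taken from the system headers.
    """

    _layout = struct.Struct("iII")

    def __new__(cls, pid, uid, gid):
        # Construct from individual arguments.
        return super().__new__(cls, cls._layout.pack(pid, uid, gid))

    @classmethod
    def from_bytes(cls, data):
        # Construct from bytes received via recvmsg().
        pid, uid, gid = cls._layout.unpack(data[:cls._layout.size])
        return cls(pid, uid, gid)

    @property
    def pid(self):
        return self._layout.unpack(self)[0]

    @property
    def uid(self):
        return self._layout.unpack(self)[1]

    @property
    def gid(self):
        return self._layout.unpack(self)[2]
```

Because the object is itself bytes, it can be handed to sendmsg() as the data of an ancillary item without any conversion step, and there are no setter methods to get out of sync with the packed form.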

Maybe the problem of testing patches well has been putting people
off so far?  Really exercising the system's CMSG_*HDR() macros in
particular isn't entirely straightforward.  I suppose there's
also a reluctance to write tests while still uncertain about how
to present the interface - that's another reason why I went for
the most general multiple-buffer form of sendmsg()!