test_getgroups of test_posix fails (on OS X 10.10) #73748

Closed
jdlh mannequin opened this issue Feb 15, 2017 · 7 comments
Labels
3.7 (EOL): end of life; tests: Tests in the Lib/test dir; type-bug: An unexpected behavior, bug, or error

Comments

@jdlh
Mannequin

jdlh mannequin commented Feb 15, 2017

BPO 29562
Nosy @ronaldoussoren, @ned-deily, @JDLH

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2017-02-15.02:03:05.714>
labels = ['3.7', 'type-bug', 'tests']
title = 'test_getgroups of test_posix fails (on OS X 10.10)'
updated_at = <Date 2017-02-20.07:32:53.146>
user = 'https://github.com/JDLH'

bugs.python.org fields:

activity = <Date 2017-02-20.07:32:53.146>
actor = 'JDLH'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Tests']
creation = <Date 2017-02-15.02:03:05.714>
creator = 'JDLH'
dependencies = []
files = []
hgrepos = []
issue_num = 29562
keywords = []
message_count = 7.0
messages = ['287806', '287807', '287911', '287915', '287917', '288185', '288186']
nosy_count = 3.0
nosy_names = ['ronaldoussoren', 'ned.deily', 'JDLH']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue29562'
versions = ['Python 3.7']

@jdlh
Mannequin Author

jdlh mannequin commented Feb 15, 2017

When I run test.test_posix.PosixTester.test_getgroups on my Mac OS X system, it fails:

% ./python.exe -m unittest -v test.test_posix.PosixTester.test_getgroups
test_getgroups (test.test_posix.PosixTester) ... FAIL

======================================================================
FAIL: test_getgroups (test.test_posix.PosixTester)
----------------------------------------------------------------------

Traceback (most recent call last):
  File "/Users/jdlh/workspace/cpython/Lib/test/test_posix.py", line 824, in test_getgroups
    self.assertTrue(not symdiff or symdiff == {posix.getegid()})
AssertionError: False is not true

Ran 1 test in 0.013s

FAILED (failures=1)

Details of my system:
% sw_vers
ProductName: Mac OS X
ProductVersion: 10.10.5
BuildVersion: 14F2109

% id -G
20 507 12 61 80
98 399 33 100
204 395 398
701
% id -G -n
staff xampp everyone localaccounts admin
_lpadmin com.apple.access_ssh _appstore _lpoperator
_developer com.apple.access_ftp com.apple.access_screensharing
com.apple.sharepoint.group.1
# I wrapped these lines similarly, to make the correspondence clearer

% ./python.exe -c 'import grp,os; g={i: (n, p, i, mem) for (n, p, i, mem) in grp.getgrall()}; print(sorted([(i, g[i][0]) for i in os.getgroups()]) )'
[(12, 'everyone'), (20, 'staff'), (33, '_appstore'), (61, 'localaccounts'), (80, 'admin'), (98, '_lpadmin'), (100, '_lpoperator'), (204, '_developer'), (395, 'com.apple.access_ftp'), (399, 'com.apple.access_ssh'), (507, 'xampp')]

So the difference, which triggers the test failure, is that id -G is returning groups (701, 'com.apple.sharepoint.group.1'), and (398, 'com.apple.access_screensharing'), while posix.getgroups() is not. I do not yet understand why.
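To make the failing condition concrete, here is a minimal sketch of the comparison behind the assertion in the traceback, applied to the group lists shown above. The helper names `parse_id_output` and `unexplained_difference` are hypothetical and are not part of test_posix.py; only the symmetric-difference check itself comes from the traceback.

```python
def parse_id_output(text):
    """Turn the whitespace-separated GIDs printed by `id -G` into a set of ints."""
    return {int(field) for field in text.split()}


def unexplained_difference(id_groups, getgroups_result, egid):
    """Return the GIDs that would make the test's assertion fail.

    The assertion passes when the symmetric difference is empty or is
    exactly the effective GID, so in those cases nothing is unexplained.
    """
    symdiff = set(id_groups).symmetric_difference(getgroups_result)
    if not symdiff or symdiff == {egid}:
        return set()
    return symdiff


# Values copied from the report above (the variable names are illustrative).
id_groups = parse_id_output("20 507 12 61 80 98 399 33 100 204 395 398 701")
py_groups = [12, 20, 33, 61, 80, 98, 100, 204, 395, 399, 507]
print(sorted(unexplained_difference(id_groups, py_groups, egid=20)))
```

With the values from this report, the sketch prints `[398, 701]`, the two groups named in the paragraph above.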

Others say this test works on their OS X 10.10 system, so maybe it's triggered by something in my environment.

Also: python3.6 from MacPorts, and python2.7 from MacPorts, return the same set of groupids as does the dev build of python3.7.

This bug affects the same test, and the same posix.getgroups() call, as http://bugs.python.org/issue17557 "test_getgroups of test_posix can fail on OS X 10.8 if more than 16 groups" (2013-2014, closed). But I think it is a different problem: bpo-17557 is related to how posix.getgroups() deals with large numbers of groups, and it is fixed.

I would appreciate help in getting this test to pass. Maybe my environment is wrong, in which case I should fix my environment. But maybe the cpython code is sensitive to some detail of my environment, in which case perhaps I should fix the cpython code.

@jdlh jdlh mannequin added the 3.7 (EOL), tests, and type-bug labels Feb 15, 2017
@jdlh
Mannequin Author

jdlh mannequin commented Feb 15, 2017

I have pushed a branch for this issue to my cpython fork:

https://github.com/JDLH/cpython/tree/bpo-29562_failing_test_getgroups_on_os_x

It modifies test_getgroups in test_posix.py to give better diagnostics when the test fails. It reports specifically which groups appear in the output of id -G but not in posix.getgroups(), and which appear in posix.getgroups() but not in id -G.

% ./python.exe -m unittest -v test.test_posix.PosixTester.test_getgroups
test_getgroups (test.test_posix.PosixTester) ... FAIL

======================================================================
FAIL: test_getgroups (test.test_posix.PosixTester)
----------------------------------------------------------------------

Traceback (most recent call last):
  File "/Users/jdlh/workspace/cpython/Lib/test/test_posix.py", line 841, in test_getgroups
    self.assertEqual(len(symdiff), 0, msg)
AssertionError: 2 != 0 : id -G and posix.groups() should have zero difference.
Groups in id -G but not posix.groups(): [(701, 'com.apple.sharepoint.group.1'), (398, 'com.apple.access_screensharing')]
Groups in posix.groups() but not id -G: []
(Effective GID (20) was disregarded.)

Ran 1 test in 0.020s

I don't think this branch is ready yet to submit to the main codebase, but it may help people diagnose the issue.
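For readers who do not want to check out the branch, the following is a rough sketch of how such a diagnostic message could be assembled. It is not the code from the branch: the function name `describe_group_diff` and the optional `names` mapping are assumptions made for illustration, and the message wording only approximates the output shown above.

```python
def describe_group_diff(id_groups, py_groups, egid, names=None):
    """Build a failure message listing the groups seen by only one source.

    `names` is an optional {gid: name} mapping; unknown GIDs are shown as '?'.
    Returns (set of differing GIDs, message), with the effective GID ignored.
    """
    names = names or {}
    id_set, py_set = set(id_groups), set(py_groups)

    def labelled(gids):
        # Pair each GID with its name so the message is readable.
        return [(gid, names.get(gid, "?")) for gid in sorted(gids)]

    only_id = id_set - py_set - {egid}
    only_py = py_set - id_set - {egid}
    lines = [
        "id -G and os.getgroups() should have zero difference.",
        f"Groups in id -G but not os.getgroups(): {labelled(only_id)}",
        f"Groups in os.getgroups() but not id -G: {labelled(only_py)}",
        f"(Effective GID ({egid}) was disregarded.)",
    ]
    return only_id | only_py, "\n".join(lines)
```

Inside a test method, the result could be used as `diff, msg = describe_group_diff(...)` followed by `self.assertEqual(len(diff), 0, msg)`. On a live system the `names` mapping could be built with `{g.gr_gid: g.gr_name for g in grp.getgrall()}`.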

@jdlh
Mannequin Author

jdlh mannequin commented Feb 16, 2017

Some diagnosis.

Group com.apple.sharepoint.group.1 appears to be related to a certain kind of file sharing, but I don't have hard evidence.

Its only member was a test user I created as part of screen sharing with Apple Support.

% dscacheutil -q group -a name com.apple.sharepoint.group.1
name: com.apple.sharepoint.group.1
password: *
gid: 701
users: testuser

I removed File Sharing for this user's home directory.

  1. Open System Preferences... Sharing.
  2. Click on "File Sharing", which is checked. In the right pane, a list of shared folders appears.
  3. Click on the entry "Testuser Public Folder" in the Shared Folders list.
  4. Click on the "-" button below the Shared Folders list. The "Testuser Public Folder" entry disappears.

Having done that, the group com.apple.sharepoint.group.1 no longer appeared.

% dscacheutil -q group -a name com.apple.sharepoint.group.1
%

Interestingly, test_getgroups still failed, and still had a discrepancy of two groups from the output of id -G.

% ./python.exe -m unittest -v test.test_posix.PosixTester.test_getgroups
test_getgroups (test.test_posix.PosixTester) ... FAIL

======================================================================
FAIL: test_getgroups (test.test_posix.PosixTester)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/jdlh/workspace/cpython/Lib/test/test_posix.py", line 841, in test_getgroups
    self.assertEqual(len(symdiff), 0, msg)
AssertionError: 2 != 0 : id -G and posix.groups() should have zero difference.
Groups in id -G but not posix.groups(): [(395, 'com.apple.access_ftp'), (398, 'com.apple.access_screensharing')]
Groups in posix.groups() but not id -G: []
(Effective GID (20) was disregarded.)

----------------------------------------------------------------------
Ran 1 test in 0.013s

FAILED (failures=1)

Earlier, group com.apple.access_ftp was not part of the difference. Now it is. The output of id -G didn't change. The implementation of posix.getgroups() didn't change. It calls getgroups(2), I believe: https://github.com/python/cpython/blob/master/Modules/posixmodule.c#L6078-L6103

That makes me think that getgroups(2) on Mac OS behaves differently than we expect.

man 2 getgroups gives documentation. (I can't find this page at an apple URL, but http://www.manpagez.com/man/2/getgroups/ seems to have the same content.) It says,

>> "To provide compatibility with applications that use getgroups() in environments where users may be in more than {NGROUPS_MAX} groups, a variant of getgroups(), obtained when compiling with either the macros _DARWIN_UNLIMITED_GETGROUPS or _DARWIN_C_SOURCE defined, can be used that is not limited to {NGROUPS_MAX} groups. However, this variant only returns the user's default group access list and not the group list modified by a call to setgroups(2) (either in the current process or an ancestor process). Use of setgroups(2) is highly discouraged, and there is no foolproof way to determine if it has been previously called."

I don't know how to determine if my copy of Mac OS X 10.10 was compiled with either of these two macros.

On my system, I chased NGROUPS_MAX down to /usr/include/sys/syslimits.h:84, where it is set to 16. That is more than the number of groups id -G is reporting, so I don't see how that is relevant.

20 507 12 61 80 98 399 33 100 204 395 398

This is 12 groups, whereas before it was 13 groups (see my message from 2017-02-15 02:03). This is unsurprising. However, the number of groups returned by posix.getgroups() has also shrunk by 1:

[(12, 'everyone'), (20, 'staff'), (33, '_appstore'), (61, 'localaccounts'), (80, 'admin'), (98, '_lpadmin'), (100, '_lpoperator'), (204, '_developer'), (399, 'com.apple.access_ssh'), (507, 'xampp')]

Notice that group (395, 'com.apple.access_ftp') is no longer being returned by os.getgroups(). This is as a consequence of a different group being deleted.

The test_getgroups comment asserts: "# 'id -G' and 'os.getgroups()' should return the same groups, ignoring order, duplicates, and the effective gid." https://github.com/python/cpython/blob/master/Lib/test/test_posix.py#L819-L820

I'm getting skeptical about that claim. Does Mac OS X actually guarantee that 'id -G' and 'getgroups(2)' return the same groups?

@jdlh
Mannequin Author

jdlh mannequin commented Feb 16, 2017

I guess I didn't state what I find odd about the new test_getgroups results.

  1. os.getgroups() used to return group (395, 'com.apple.access_ftp'), but no longer does. I don't see a reason why.

  2. os.getgroups() is returning 2 fewer group IDs than id -G, even as the total number of groups is reduced. This is not the behaviour of an API limited by {NGROUPS_MAX}.
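The reasoning in point 2 can be checked with a small sketch. It assumes that `os.sysconf("SC_NGROUPS_MAX")` reports the same limit as the NGROUPS_MAX header constant, which may not hold on every system. The function names are hypothetical, and the counts in the usage lines are taken from the messages in this thread.

```python
import os


def ngroups_max():
    """Return the limit reported by sysconf, or None if it is unavailable."""
    try:
        return os.sysconf("SC_NGROUPS_MAX")
    except (ValueError, OSError):
        return None


def limit_could_explain(id_count, getgroups_count, limit):
    """Return True only if a cap of `limit` groups could cause the shortfall.

    A cap can only drop entries when the full list is longer than the cap.
    """
    return getgroups_count < id_count and id_count > limit


# Counts from this thread: 13 vs 11 groups before, 12 vs 10 after,
# with NGROUPS_MAX reported as 16 in syslimits.h.
print(limit_could_explain(13, 11, 16))
print(limit_could_explain(12, 10, 16))
```

Both calls return `False`, which matches the observation that a 16-group cap does not account for the two missing groups.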

@jdlh
Mannequin Author

jdlh mannequin commented Feb 16, 2017

The Mac OS 10.10 man page for initgroups(3) says:

"Processes should not use the group ID numbers from getgroups(2) to determine a user's group membership. The list obtained from getgroups() may only be a partial list of a user's group membership. Membership checks should use the mbr_gid_to_uuid(3), mbr_uid_to_uuid(3), and mbr_check_membership(3) functions."
(http://www.manpagez.com/man/3/initgroups/ -- not official Apple page, but it matches what I see in my OS.)

When the man page says, "The list obtained from getgroups() may only be a partial list of a user's group membership.", and the list from id -G is presumably a complete list, should we understand that Apple is saying their getgroups(2) implementation isn't POSIX-compliant? If so, maybe we should skip test_getgroups on Mac OS X systems?

Or, should we consider rewriting os_getgroups_impl() to use a Mac-specific implementation on Mac OS X?
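If skipping turned out to be the preferred option, it might look something like the sketch below. This is only an illustration of the idea raised above, not a change anyone in this thread agreed to. The class name, `should_skip`, and `SKIP_REASON` are hypothetical, and the test body is a simplified version of the comparison described in the test's comment.

```python
import os
import subprocess
import sys
import unittest

SKIP_REASON = "getgroups(2) may return only a partial group list on macOS"


def should_skip(platform):
    """Return True for platforms where the comparison is assumed unreliable."""
    return platform == "darwin"


class GetgroupsComparison(unittest.TestCase):
    @unittest.skipIf(should_skip(sys.platform), SKIP_REASON)
    def test_getgroups_matches_id(self):
        # Compare `id -G` with os.getgroups(), ignoring order, duplicates,
        # and the effective GID.
        out = subprocess.run(
            ["id", "-G"], capture_output=True, text=True, check=True
        ).stdout
        id_groups = {int(field) for field in out.split()}
        symdiff = id_groups.symmetric_difference(os.getgroups())
        self.assertTrue(not symdiff or symdiff == {os.getegid()})
```

Keeping the platform check in a separate function makes the skip condition easy to adjust, for example if only some macOS versions are affected.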

@ronaldoussoren
Contributor

Note that the result of getgroups(2) is fixed on login, while "id -G" reflects the current state of the user database on macOS. Could this explain this failure? That is, have you tried logging out and in again before running the test suite?
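One way to probe this explanation without logging out is to compare the group list attached to the running process with a fresh lookup in the user database. The sketch below uses `os.getgrouplist()` for the database side. Whether that call returns exactly what `id -G` prints on macOS is an assumption, and the function names are hypothetical.

```python
import os
import pwd


def database_groups():
    """Return the current user's groups per the user database, or None.

    os.getgrouplist() looks the user up in the group database instead of
    reading the list attached to the running process.
    """
    try:
        entry = pwd.getpwuid(os.getuid())
        return set(os.getgrouplist(entry.pw_name, entry.pw_gid))
    except (KeyError, OSError):
        return None  # no passwd entry for this UID, or the lookup failed


def changed_since_login(process_groups, db_groups):
    """Return (added, removed) GIDs relative to the process's group list."""
    process_set, db_set = set(process_groups), set(db_groups)
    return db_set - process_set, process_set - db_set
```

A call such as `changed_since_login(os.getgroups(), database_groups())` (after checking for `None`) would show which groups the database lists that the current login session does not. A non-empty result would be consistent with the suggestion to log out and back in before running the test suite.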

@jdlh
Mannequin Author

jdlh mannequin commented Feb 20, 2017

>> Note that the result of getgroups(2) is fixed on login, while "id -G" reflects the current state of the user database on macOS.

Wow, that's interesting! Thank you for this information.

The test code for test_getgroups does not mention this interaction. I can certainly see how it could affect the test. Maybe it should be added?

Since I last tried that test, I've logged out and restarted several times, and changed OS to Mac OS X 10.11 El Capitan. Nothing like changing several independent variables at once while diagnosing! I will try the test again and report back.

@ezio-melotti transferred this issue from another repository Apr 10, 2022
@iritkatriel closed this as not planned Jun 17, 2022