
classification
Title: AMD64 Debian PGO 3.x, AMD64 Clang UBSan 2.7 buildbots: No space left on device
Type:
Stage: resolved
Components: Tests
Versions: Python 3.9
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: gregory.p.smith
Nosy List: corona10, cstratak, gregory.p.smith, pablogsal, steve.dower, vstinner, xtreak
Priority: normal
Keywords:

Created on 2019-09-25 07:40 by corona10, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (14)
msg353154 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2019-09-25 07:40
https://buildbot.python.org/all/#/builders/47/builds/3578

This issue was found in GH-16359.

The root cause of this failure is a lack of storage space on the buildbot worker.

OSError: [Errno 28] No space left on device: '/tmp/tmpnmcjxia9/bin/python' -> '/tmp/tmpnmcjxia9/bin/python3'
msg353155 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-09-25 07:50
I think there was a bug in the past in regrtest or tempfile where the temporary files for tests were not deleted, which led to the disk filling up on several buildbots.
msg353163 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-09-25 10:28
I contacted Gregory P. Smith, the buildbot worker owner, to ask him to have a look.

> I think there was a bug in the past in regrtest or tempfile where the temporary files for tests were not deleted, which led to the disk filling up on several buildbots.

regrtest now has a --cleanup option to remove all build/test_python_xxx directories. But buildbot cannot run it on a worker that allows multiple jobs in parallel, since it would also remove the temporary directories of the other running jobs...

I fixed many bugs in regrtest recently to reduce the risk of leaving temporary files on the disk.
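
One possible way around the parallel-jobs problem is to remove only directories that have not been touched for a while. The following is a minimal sketch of that idea, not what regrtest's --cleanup does: the function name, the `max_age_seconds` threshold, and the age-based heuristic are all illustrative assumptions. Only the `test_python_*` naming comes from the message above.

```python
import shutil
import time
from pathlib import Path


def remove_stale_test_dirs(build_dir, max_age_seconds=24 * 3600, now=None):
    """Remove test_python_* directories under build_dir that look abandoned.

    Hypothetical helper: a directory modified within max_age_seconds is left
    alone, on the assumption that it may belong to a job that is still
    running. Returns the list of removed paths.
    """
    now = time.time() if now is None else now
    removed = []
    for path in sorted(Path(build_dir).glob("test_python_*")):
        if not path.is_dir():
            continue
        age = now - path.stat().st_mtime
        if age > max_age_seconds:
            shutil.rmtree(path, ignore_errors=True)
            removed.append(path)
    return removed
```

Note that a directory's modification time only changes when its direct entries change, so this is a heuristic; a long-running job that writes only into nested subdirectories could still be misjudged as stale.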
msg353164 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-09-25 10:29
https://mail.python.org/archives/list/buildbot-status@python.org/message/JCR6FQBUMLMOESWE4IVVUIATX7KTEV7C/
msg353229 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2019-09-25 20:44
It appears that something in the buildbot configuration (typo?) has changed, which caused an entirely new set of directories to be created for the builder:

@clang-ubsan:/var/lib/buildbot/clang-ubsan$ ls -al
total 68056
drwxr-xr-x 14 buildbot buildbot     4096 Sep 19 14:33 .
drwxr-xr-x  5 buildbot buildbot     4096 Sep 24 02:36 ..
drwx------  3 buildbot buildbot     4096 May 20  2018 2.7.gps-clang-ubsan
drwx------  3 buildbot buildbot     4096 Sep 24 02:43 2.7.gps-clang-ubsan.clang-usban
drwx------  3 buildbot buildbot     4096 May 20  2018 3.6.gps-clang-ubsan
drwx------  3 buildbot buildbot     4096 May 20  2018 3.7.gps-clang-ubsan
drwx------  3 buildbot buildbot     4096 Sep 19 16:41 3.7.gps-clang-ubsan.clang-usban
drwx------  3 buildbot buildbot     4096 Jun  4 20:30 3.8.gps-clang-ubsan
drwx------  3 buildbot buildbot     4096 Sep 19 16:05 3.8.gps-clang-ubsan.clang-usban
drwx------  3 buildbot buildbot     4096 May 20  2018 3.x.gps-clang-ubsan
drwx------  3 buildbot buildbot     4096 Sep 19 14:39 3.x.gps-clang-ubsan.clang-usban
-rw-------  1 buildbot buildbot     1333 May 20  2018 buildbot.tac
drwx------  3 buildbot buildbot     4096 Jun  2  2018 custom.gps-clang-ubsan
drwx------  2 buildbot buildbot     4096 Sep 19 14:33 custom.gps-clang-ubsan.clang-usban
drwxr-xr-x  2 buildbot buildbot     4096 May 20  2018 info
-rw-------  1 buildbot buildbot       12 May 15 22:14 twistd.hostname
-rw-r--r--  1 buildbot buildbot  9574124 Sep 25 20:39 twistd.log
-rw-r--r--  1 buildbot buildbot 10000179 Jul 30 18:35 twistd.log.1
-rw-r--r--  1 buildbot buildbot 10000101 Jun  2 22:17 twistd.log.2
-rw-r--r--  1 buildbot buildbot 10000025 Mar  9  2019 twistd.log.3
-rw-r--r--  1 buildbot buildbot 10000056 Dec 10  2018 twistd.log.4
-rw-r--r--  1 buildbot buildbot 10000014 Oct  9  2018 twistd.log.5
-rw-r--r--  1 buildbot buildbot 10000168 Jul 24  2018 twistd.log.6
-rw-------  1 buildbot buildbot        3 May 15 22:14 twistd.pid


Notice the directories named -clang-ubsan.clang-usban that appeared on September 19th.  From twistd.log:

2019-09-19 14:33:20+0000 [Broker,client] Lost connection to buildbot.python.org:9020
2019-09-19 14:33:20+0000 [Broker,client] <twisted.internet.tcp.Connector instance at 0x7fae10e69c68> will retry in 3 seconds
2019-09-19 14:33:20+0000 [-] Stopping factory <buildslave.bot.BotFactory instance at 0x7fae10e697a0>
2019-09-19 14:33:23+0000 [-] Starting factory <buildslave.bot.BotFactory instance at 0x7fae10e697a0>
2019-09-19 14:33:23+0000 [-] Connecting to buildbot.python.org:9020
2019-09-19 14:33:23+0000 [Uninitialized] Connection to buildbot.python.org:9020 failed: Connection Refused
2019-09-19 14:33:23+0000 [Uninitialized] <twisted.internet.tcp.Connector instance at 0x7fae10e69c68> will retry in 8 seconds
2019-09-19 14:33:23+0000 [-] Stopping factory <buildslave.bot.BotFactory instance at 0x7fae10e697a0>
2019-09-19 14:33:32+0000 [-] Starting factory <buildslave.bot.BotFactory instance at 0x7fae10e697a0>
2019-09-19 14:33:32+0000 [-] Connecting to buildbot.python.org:9020
2019-09-19 14:33:32+0000 [Uninitialized] Connection to buildbot.python.org:9020 failed: Connection Refused
2019-09-19 14:33:32+0000 [Uninitialized] <twisted.internet.tcp.Connector instance at 0x7fae10e69c68> will retry in 22 seconds
2019-09-19 14:33:32+0000 [-] Stopping factory <buildslave.bot.BotFactory instance at 0x7fae10e697a0>
2019-09-19 14:33:54+0000 [-] Starting factory <buildslave.bot.BotFactory instance at 0x7fae10e697a0>
2019-09-19 14:33:54+0000 [-] Connecting to buildbot.python.org:9020
2019-09-19 14:33:55+0000 [Broker,client] message from master: attached
2019-09-19 14:33:55+0000 [Broker,client] Peer will receive following PB traceback:
2019-09-19 14:33:55+0000 [Broker,client] Unhandled Error
        Traceback (most recent call last):
          File "/usr/lib/python2.7/dist-packages/twisted/spread/banana.py", line 173, in gotItem
            self.callExpressionReceived(item)
          File "/usr/lib/python2.7/dist-packages/twisted/spread/banana.py", line 136, in callExpressionReceived
            self.expressionReceived(obj)
          File "/usr/lib/python2.7/dist-packages/twisted/spread/pb.py", line 575, in expressionReceived
            method(*sexp[1:])
          File "/usr/lib/python2.7/dist-packages/twisted/spread/pb.py", line 896, in proto_message
            self._recvMessage(self.localObjectForID, requestID, objectID, message, answerRequired, netArgs, netKw)
        --- <exception caught here> ---
          File "/usr/lib/python2.7/dist-packages/twisted/spread/pb.py", line 913, in _recvMessage
            netResult = object.remoteMessageReceived(self, message, netArgs, netKw)
          File "/usr/lib/python2.7/dist-packages/twisted/spread/flavors.py", line 118, in remoteMessageReceived
            raise NoSuchMethod("No such method: remote_%s" % (message,))
        twisted.spread.flavors.NoSuchMethod: No such method: remote_getWorkerInfo
        
2019-09-19 14:33:55+0000 [Broker,client] message from master: buildbot-slave detected, failing back to deprecated buildslave API. (Ignoring missing getWorkerInfo method.)
2019-09-19 14:33:55+0000 [Broker,client] changing builddir for builder AMD64 Clang UBSan 2.7 from 2.7.gps-clang-ubsan to 2.7.gps-clang-ubsan.clang-usban
2019-09-19 14:33:55+0000 [Broker,client] changing builddir for builder AMD64 Clang UBSan 3.8 from 3.8.gps-clang-ubsan to 3.8.gps-clang-ubsan.clang-usban
2019-09-19 14:33:55+0000 [Broker,client] changing builddir for builder AMD64 Clang UBSan 3.7 from 3.7.gps-clang-ubsan to 3.7.gps-clang-ubsan.clang-usban
2019-09-19 14:33:55+0000 [Broker,client] changing builddir for builder AMD64 Clang UBSan 3.x from 3.x.gps-clang-ubsan to 3.x.gps-clang-ubsan.clang-usban
2019-09-19 14:33:55+0000 [Broker,client] changing builddir for builder AMD64 Clang UBSan custom from custom.gps-clang-ubsan to custom.gps-clang-ubsan.clang-usban
2019-09-19 14:33:55+0000 [Broker,client] I have a leftover directory '3.7.gps-clang-ubsan' that is not being used by the buildmaster: you can delete it now
2019-09-19 14:33:55+0000 [Broker,client] I have a leftover directory 'custom.gps-clang-ubsan' that is not being used by the buildmaster: you can delete it now
2019-09-19 14:33:55+0000 [Broker,client] I have a leftover directory '3.6.gps-clang-ubsan' that is not being used by the buildmaster: you can delete it now
2019-09-19 14:33:55+0000 [Broker,client] I have a leftover directory '3.x.gps-clang-ubsan' that is not being used by the buildmaster: you can delete it now
2019-09-19 14:33:55+0000 [Broker,client] I have a leftover directory '3.8.gps-clang-ubsan' that is not being used by the buildmaster: you can delete it now
2019-09-19 14:33:55+0000 [Broker,client] I have a leftover directory '2.7.gps-clang-ubsan' that is not being used by the buildmaster: you can delete it now
2019-09-19 14:33:56+0000 [Broker,client] message from master: attached
2019-09-19 14:33:56+0000 [Broker,client] message from master: attached
2019-09-19 14:33:56+0000 [Broker,client] message from master: attached
2019-09-19 14:33:56+0000 [Broker,client] message from master: attached
2019-09-19 14:33:56+0000 [Broker,client] message from master: attached
2019-09-19 14:33:56+0000 [Broker,client] Connected to buildbot.python.org:9020; slave is ready
2019-09-19 14:33:56+0000 [Broker,client] sending application-level keepalives every 600 seconds
2019-09-19 14:39:12+0000 [Broker,client] message from master: ping
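
The relevant lines can be pulled out of a log like the one above with a short script. This is a rough sketch under the assumption that the log lines keep the exact wording quoted here; the function name and the regular expressions are illustrative and not part of buildbot.

```python
import re

# Patterns modelled on the twistd.log lines quoted above.
CHANGED_RE = re.compile(
    r"changing builddir for builder (?P<builder>.+) "
    r"from (?P<old>\S+) to (?P<new>\S+)$"
)
LEFTOVER_RE = re.compile(r"I have a leftover directory '(?P<dir>[^']+)'")


def summarize_builddir_changes(log_lines):
    """Return (renames, leftovers) found in an iterable of log lines.

    renames maps a builder name to its (old, new) build directory names;
    leftovers is the set of directories the worker reported as unused.
    """
    renames = {}
    leftovers = set()
    for line in log_lines:
        line = line.rstrip("\n")
        match = CHANGED_RE.search(line)
        if match:
            renames[match.group("builder")] = (match.group("old"), match.group("new"))
            continue
        match = LEFTOVER_RE.search(line)
        if match:
            leftovers.add(match.group("dir"))
    return renames, leftovers
```

Passing an open file object for twistd.log as `log_lines` would work, since the function only iterates over lines.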
msg353230 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2019-09-25 20:47
https://github.com/python/buildmaster-config/pull/108 is to blame.
msg353231 - (view) Author: Charalampos Stratakis (cstratak) * Date: 2019-09-25 20:50
Yep, that was my change: some jobs couldn't be run on the same worker because their configs used the same directory.
msg353232 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2019-09-25 20:52
I'm not going to spend time manually deleting the unused build directories until the typo in the new buildsuffix that caused the disk to fill up is fixed.

https://github.com/python/buildmaster-config/pull/111

I don't even understand why the buildsuffix entries were added.  It doesn't matter on these systems and the original PR doesn't link to any issue describing why.
msg353233 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2019-09-25 20:55
From the "yet another of a plethora of reasons to hate buildbot" point of view: a _log message_ saying "I'm not using this anymore, you can delete it" is infinitely worse than just going ahead and deleting it automatically. I shouldn't, as a human, have needed to be involved in this one. :P
msg353234 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2019-09-25 20:59
Let me know when PR 111 is deployed on the build master so I can log in and clean up the directories with the typo in their names.

Otherwise, things are probably running fine for the moment.
msg353235 - (view) Author: Charalampos Stratakis (cstratak) * Date: 2019-09-25 21:01
It's already explained that the build directories were duplicated; however, the explanation could indeed have been more verbose. When a config is used alongside another config that inherits from it, buildbot aborts with an error because it tries to compile CPython in the same directory.
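
The collision described here can be illustrated with a small sketch. It assumes that a build directory name is formed as `<branch>.<worker>` plus an optional suffix, which matches the directory names in the log above (for example `2.7.gps-clang-ubsan`) but is not taken from the buildmaster-config source. The function names and the example config names are hypothetical.

```python
from collections import defaultdict


def builddir(branch, worker, suffix=""):
    """Compose a build directory name as '<branch>.<worker><suffix>' (assumed scheme)."""
    return f"{branch}.{worker}{suffix}"


def find_collisions(configs):
    """Map each build directory shared by more than one config to those configs.

    configs is an iterable of (config_name, branch, worker, suffix) tuples.
    """
    by_dir = defaultdict(list)
    for name, branch, worker, suffix in configs:
        by_dir[builddir(branch, worker, suffix)].append(name)
    return {path: names for path, names in by_dir.items() if len(names) > 1}
```

Under this assumed scheme, two configs with the same branch, worker, and suffix map to one directory, while giving the derived config its own suffix separates them.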
msg353237 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-09-25 21:26
> Let me know when PR 111 is deployed on the build master so I can log in and clean up the directories with the typo in their names.

Done ( https://github.com/python/buildmaster-config/pull/111 )
msg353346 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2019-09-27 06:19
@gregory.p.smith @vstinner
Looks like https://github.com/python/buildmaster-config/pull/111 is merged.
Can we close this issue?
msg353350 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-09-27 09:18
> Can we close this issue?

When it's an issue about specific buildbots, I prefer to first check that the buildbot is back to green.

> https://buildbot.python.org/all/#/builders/47/builds/3578

AMD64 Debian PGO 3.x buildbot is back to green.

> https://buildbot.python.org/all/#builders/136/builds/311

AMD64 Clang UBSan 2.7 is back to green.

So yes, it seems like the disk has free space again ;-)
History
Date                 User             Action  Args
2022-04-11 14:59:20  admin            set     github: 82450
2019-09-27 09:18:57  vstinner         set     status: open -> closed; resolution: fixed; stage: resolved
2019-09-27 09:18:52  vstinner         set     messages: + msg353350
2019-09-27 06:19:05  corona10         set     messages: + msg353346
2019-09-25 21:26:13  vstinner         set     messages: + msg353237
2019-09-25 21:01:48  cstratak         set     messages: + msg353235
2019-09-25 20:59:07  gregory.p.smith  set     messages: + msg353234
2019-09-25 20:55:40  gregory.p.smith  set     messages: + msg353233
2019-09-25 20:52:43  gregory.p.smith  set     messages: + msg353232
2019-09-25 20:50:05  cstratak         set     nosy: + cstratak; messages: + msg353231
2019-09-25 20:47:11  gregory.p.smith  set     messages: + msg353230
2019-09-25 20:44:51  gregory.p.smith  set     messages: + msg353229
2019-09-25 17:51:34  gregory.p.smith  set     assignee: gregory.p.smith
2019-09-25 10:29:04  vstinner         set     messages: + msg353164
2019-09-25 10:28:42  vstinner         set     nosy: + gregory.p.smith; messages: + msg353163; title: test_venv failed on AMD64 Debian PGO 3.x -> AMD64 Debian PGO 3.x, AMD64 Clang UBSan 2.7 buildbots: No space left on device
2019-09-25 07:50:55  xtreak           set     nosy: + xtreak, pablogsal, vstinner; messages: + msg353155
2019-09-25 07:40:46  corona10         create