166 Commits

Author SHA1 Message Date
Richard Purdie
f209f127e6 bitbake: main: Add an option to specify what to profile
Starting with python 3.12, profiling now stays enabled over threads yet
you can't extract the profile data in the threads themselves, which makes it
difficult to use for our use case.

Our main loop starts the idle loop, which in turn starts the parsing threads.
As a result, we can't profile the main loop and the parsing threads or the idle
loop at the same time.

Add options to the commandline so you can specify which piece of bitbake
you want to enable profiling for. This allows some profiling with python 3.12
onwards rather than crashing.

(Bitbake rev: 09f29a4968841ee5070f70277ba8c253bb14f017)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2025-07-17 10:45:57 +01:00
Richard Purdie
e16dd31445 bitbake: cooker/process/utils: Create profiling common function to remove code duplication
We have code duplication in the way we handle profiling of code sections.
Create a common function in utils which covers this.

The main loop and idle loop profile files were also reversed. Fix this and the naming,
removing a couple of unused variables containing the profile log names in the process too.
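
As an illustration of the kind of shared helper described above, here is a minimal sketch built on the standard cProfile and pstats modules. The names profile_function and summarise, and their signatures, are assumptions made for this example; they are not taken from bitbake's utils module.

```python
import cProfile
import io
import pstats


def profile_function(fn, profile_path, *args, **kwargs):
    """Run fn under cProfile, write the raw stats to profile_path, return fn's result."""
    prof = cProfile.Profile()
    try:
        return prof.runcall(fn, *args, **kwargs)
    finally:
        # Dump the data even if fn raised, so a failing run can still be inspected.
        prof.dump_stats(profile_path)


def summarise(profile_path, limit=5):
    """Return a short text report of the most expensive calls (hypothetical helper)."""
    out = io.StringIO()
    stats = pstats.Stats(profile_path, stream=out)
    stats.sort_stats("cumulative").print_stats(limit)
    return out.getvalue()
```

Keeping the dump in a single place means every profiled section writes its data the same way, which is the duplication the commit message refers to.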

(Bitbake rev: b4f6bae97ac9607420fc49fd4c9e957d89c9a5f3)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2025-07-17 10:45:57 +01:00
Richard Purdie
fc027ef55f bitbake: server/process: Decrease idle/main loop frequency
The idle and main loops have socket select calls to know when to execute.
This means we can lengthen the normal timeout, since it is
just a fallback, and gain some small efficiency.

(Bitbake rev: 8d8e17af8619c976819170c9d5d9a686a666c317)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2024-11-07 13:36:04 +00:00
Richard Purdie
9500cf65a7 bitbake: server/process: Don't send heartbeats when no idle functions
If there are no idle functions present, don't send heartbeat events. These
are only meant to happen while builds are active.

(Bitbake rev: 9a2d5e63b07c3912838781776c61f0f1ac9640e1)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2024-11-07 13:36:04 +00:00
Richard Purdie
f8f45ebde4 bitbake: server/process: Merge a function to simplify code
Keeping this code separate just makes the code harder to understand,
merge them.

(Bitbake rev: e5ac26a0e1779df1da3277bf48899c8f7642f1f8)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2024-11-07 13:36:04 +00:00
Richard Purdie
babd5fea4a bitbake: process/server: Fix typo
Ensure the message matches the filenames the code actually uses.

(Bitbake rev: deb7db2e2b125c6a6732db4f185f4de5926494fd)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2024-02-10 15:25:22 +00:00
Richard Purdie
0f50f21151 bitbake: process: Add profile logging for main loop
When the idle/main loop was added, we didn't include profiling information
for it. There is a performance issue in there, add logging for it.

(Bitbake rev: d8d5cd43a60560f67e86f4f625113b0f73b944c0)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2024-02-10 14:13:51 +00:00
Mark Asselstine
bc22d82c2f bitbake: server/process: catch and expand multiprocessing connection exceptions
When doing builds on systems with limited resources, or with high demand
package builds such as chromium, it isn't uncommon for the OOM Killer
to be triggered and for bitbake-server to be selected as the process
to be killed. When the bitbake-server does terminate unexpectedly due
to the OOM Killer or otherwise, this currently results in a generic
python traceback with little indication as to what has failed.

Here we trap and raise the exceptions while extending the exception
text in runCommand() to make it clear that this is most likely caused
by the bitbake-server unexpectedly terminating.

Callers of runCommand() should be updated to properly handle the
BrokenPipeError and EOFError exceptions to avoid printing a python
traceback, but even if they don't, the added text in the exceptions
should provide some hints as to what might have caused the failure.
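
The trap-and-re-raise approach described above might look roughly like the following. This is a sketch under assumptions: run_command and the connection object are stand-ins invented for the example, and the hint text is illustrative rather than the wording bitbake uses.

```python
def run_command(connection, command):
    """Send a command and wait for the reply, adding context if the peer has gone away."""
    try:
        connection.send(command)
        return connection.recv()
    except (BrokenPipeError, EOFError) as exc:
        # Re-raise the same exception type so existing handlers still match,
        # but extend the text with a hint about the most likely cause.
        hint = "the server may have terminated unexpectedly (for example, OOM killed)"
        raise type(exc)(f"{exc}: {hint}") from exc
```

A caller can then catch BrokenPipeError and EOFError and print the message without a traceback.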

(Bitbake rev: 5ff62b802f79acc86bbd6a99484f08501ff5dc2d)

Signed-off-by: Mark Asselstine <mark.asselstine@windriver.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2024-01-10 14:02:38 +00:00
Ross Burton
362c70a993 bitbake: bitbake/lib: spawn server/worker using the current Python interpreter
The user may have invoked ./bin/bitbake using a different Python
interpreter than whatever python3 is on $PATH (for example, explicitly
using a different version).  However, as the server and workers are
spawned directly they'll use the hashbang and thus a different Python.

We also ensure that argv[0] is set to sys.executable instead of
'bitbake-server' or 'bitbake-worker', so that sys.executable is set to
the right value inside the child.  Without this the server won't be
able to start any workers.
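
To make the interpreter point concrete, here is a small sketch that launches a child with the interpreter running the parent instead of relying on the child's hashbang. The function name spawn_with_current_python is hypothetical, and the sketch does not reproduce how bitbake actually sets argv[0].

```python
import subprocess
import sys


def spawn_with_current_python(script_args):
    """Run a Python child using the same interpreter as this process."""
    # sys.executable is the interpreter running this code, which may differ
    # from whatever "python3" resolves to on $PATH.
    return subprocess.run(
        [sys.executable, *script_args],
        capture_output=True,
        text=True,
        check=True,
    )
```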

(Bitbake rev: b44d5d2a53d3082c8ce94e09c0cf833e33e25aec)

Signed-off-by: Ross Burton <ross.burton@arm.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2023-09-26 10:37:17 +01:00
Richard Purdie
022deeb0ef bitbake: server/process: Disable the flush() call in server logging
We've been chasing bitbake timeouts for a while and it was unclear where things
were blocking on IO. It appears the flush() call in server logging can cause
pauses up to minutes long on systems with slow (spinning) disks that are heavily
loaded with IO.

Since the flush() was added to aid debugging of other timing issues, we shouldn't
need it now and it can be disabled. Leave a comment as a reminder of the pain this
can cause.

(Bitbake rev: afbc169e1490a86d6250969f780062c426eb4682)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2023-09-18 11:35:05 +01:00
Richard Purdie
37c31a5adc bitbake: lib: Drop inotify support and replace with mtime checks
With the flush in serverlog() removed and a memory resident bitbake with a
60s timeout, the following could fail in strange ways:

rm bitbake-cookerdaemon.log
bitbake-layers add-layer ../meta-virtualization/
bitbake-layers add-layer ../meta-openembedded/meta-oe/
bitbake -m

specifically that it might error adding meta-oe with an error related to meta-virt.

This clearly shows that whilst bblayers.conf was modified, bitbake was not
recognising that. This would fit with the random autobuilder issues seen when
the serverlog flush() call was removed.

The issue appears to be that you have no way to "sync()" the inotify events with
the command stream coming over the socket. There is no way to know if there are
changes in the IO queue which bitbake needs to wait for before proceeding with
the next command.

I did experiment with os.sync() and fsync on the inotify fd, however nothing
addressed the issue. Since it is extremely important we have accurate cache data,
the only realistic thing to do is to switch to stat() calls and check mtime.

For bitbake commands, this is straightforward since we can revalidate the cache
upon new connections/commands. For tinfoil this is problematic and we need to
introduce an explicit command "revalidateCaches" that the code can use to force
bitbake to re-check its cache validity. I've exposed this through tinfoil with
a new "modified_files" function.

So, this patch:

a) drops inotify support within bitbake's cooker/server and switches to using mtime
b) requires a new function call in tinfoil when metadata has been modified
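
The stat()-based approach can be sketched as below. MtimeWatcher is a hypothetical class written for this illustration; it is not bitbake's cache code, and it only shows the general idea of recording mtimes and re-checking them on demand.

```python
import os


class MtimeWatcher:
    """Remember the mtime of each watched file and report which ones changed."""

    def __init__(self):
        self._mtimes = {}

    @staticmethod
    def _mtime(path):
        try:
            return os.stat(path).st_mtime_ns
        except OSError:
            # A missing or unreadable file counts as a distinct state.
            return None

    def watch(self, path):
        self._mtimes[path] = self._mtime(path)

    def modified(self):
        """Return the watched paths whose mtime differs from the recorded value."""
        return [p for p, seen in self._mtimes.items() if self._mtime(p) != seen]
```

Because the check runs synchronously when a new command arrives, there is no event queue to fall out of step with the command stream.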

(Bitbake rev: da3ec3801bdb80180b3f1ac24edb27a698415ff7)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2023-09-18 11:35:05 +01:00
Richard Purdie
733afeffd1 bitbake: server/process: Add more timing debug
It is helpful to have timestamps on the ping failures so that they
can be matched against the bitbake logs. It is also useful to understand
how long the server takes to form a reply versus when it is sent.

(Bitbake rev: 65969a7a8f5ae22c230431d2db080eb187a27708)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2023-09-05 08:14:50 +01:00
Yang Xu
b77e23c541 bitbake: server/process: fix sig handle
process.signal_received is a list for signum and not iterable,
change a suitable method to handle sig.

(Bitbake rev: bfc53b190bd2530c2bfcea0690127d7eff620f45)

Signed-off-by: Yang Xu <yang.xu@mediatek.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2023-08-11 07:58:31 +01:00
Richard Purdie
4bb1f0e236 bitbake: server/process: Show command in timeout message
To learn more about the server timeout issues, be clear in the error
message about which command is showing the timeout. It is currently
unclear if this is the original command or a ping to the server.

(Bitbake rev: ac3cd866274f67b29eff89e393132bdabf76dbfd)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2023-06-30 17:58:47 +01:00
Joshua Watt
28a7202ac5 bitbake: server: Fix crash when checking lock file
Fixes a crash when the server process attempts to check the PID of the
lock file. The crash occurred because the code attempted to concatenate
an integer (os.getpid()) to a string.

(Bitbake rev: 5d499682a0a739b5269247a8f6dbb874e3eec456)

Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2023-06-01 08:08:33 +01:00
Richard Purdie
76db0baba2 bitbake: server/process: Improve idle thread exception handling
If the inotifier code has an exception, bitbake currently hangs. Catch any
exception and exit if seen. Also check the idle thread is alive and exit
if it disappears. This should stop bitbake hanging if such a situation arises
in future such as this example:

3323260 21:48:31.554468 Running command ['getVariable', 'BBINCLUDELOGS']
Exception in thread Thread-1 (idle_thread):
Traceback (most recent call last):
  File "/usr/lib64/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib64/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/server/process.py", line 408, in idle_thread
    self.cooker.process_inotify_updates()
  File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/cooker.py", line 256, in process_inotify_updates
    n.read_events()
  File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/pyinotify.py", line 1207, in read_events
    if fcntl.ioctl(self._fd, termios.FIONREAD, buf_, 1) == -1:
OSError: [Errno 9] Bad file descriptor
3323260 21:48:32.206995 Command Completed (socket: True)

(Bitbake rev: 358b5b02d5de1ab0f98104c4ec4953e46999b9a5)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2023-02-20 15:18:53 +00:00
Richard Purdie
18d2c489f0 bitbake: server/process: Fix lockfile contents check bug
We need to check against the first line of the file, fix the typo.

(Bitbake rev: 4abc598fb01d426394f4222dfc752e620a8e1b7b)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2023-01-24 21:59:44 +00:00
Richard Purdie
11698e027d bitbake: server/process: Improve lockfile handling at exit
If memory resident bitbake is active and the build directory is renamed upon
build completion, several bad things can happen:

* the old build directory could be re-created to contain a lockfile
  leaving an empty directory behind
* a lockfile for a new build could be found and the server could attempt to lock it

This patch avoids creating an empty directory (not perfectly, but should
work in the majority of cases - an empty directory is cosmetic).

It also now compares the lock file contents to its own pid and
just exits if it doesn't match, since the lock file then clearly belongs to some
new process.
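
A minimal sketch of such an ownership check follows. It assumes, for illustration only, that the first line of the lock file holds the owning pid as plain text; the function name lockfile_is_ours is hypothetical.

```python
import os


def lockfile_is_ours(lockfile, pid=None):
    """Return True if the first line of lockfile matches our pid."""
    pid = os.getpid() if pid is None else pid
    try:
        with open(lockfile) as f:
            first_line = f.readline().strip()
    except OSError:
        # The file is gone or unreadable, so it cannot be ours.
        return False
    # Compare as strings: the file holds text, the pid is an int.
    return first_line == str(pid)
```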

This will be combined with bitbake shutdown calls on the autobuilder to
ensure "saved" build directories, or build directories being deleted by
clobberdir don't do strange things.

(Bitbake rev: b986eac18b6a8bf633f5ef15f32f68de4c86173b)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2023-01-14 14:33:00 +00:00
Richard Purdie
e26e0b92e8 bitbake: server/process: Move heartbeat to idle thread
Rather than risk the heartbeat event code locking up the server control
socket, handle it in the 'idle' thread with the other work. The aim
is to remove it as a possible issue with some ongoing hangs.

(Bitbake rev: 0f9a0c7853b181817bf01863a26da21412376294)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2023-01-11 23:16:31 +00:00
Richard Purdie
165e8b563d bitbake: process/cooker/command: Fix currentAsyncCommand locking/races
currentAsyncCommand currently doesn't have any locking and we have
a conflict in "idle" conditions since the idle functions count needs
to be zero *and* there needs to be no active command.

Move the changes/checks of currentAsyncCommand to within the lock
and then we can add it to the condition for idle, simplifying some
of the code.

(Bitbake rev: b5215887d2f8ea3f28f1ebda721bd5b8f93ec7f3)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2023-01-11 10:58:36 +00:00
Richard Purdie
7f556c6861 bitbake: cooker: Clean up inotify idle handler
We no longer need to abstract the inotify callback handler, remove the
abstraction and simplify/clean up the code.

(Bitbake rev: af4ccab8acc49e91bf7647f209d69f4858618466)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2023-01-06 17:40:01 +00:00
Richard Purdie
a19687acd1 bitbake: lib/bb: Update thread/process locks to use a timeout
The thread/process locks we use translate to futexes in Linux. If a
process dies holding the lock, anything else trying to take the lock
will hang indefinitely. An example would be the OOM killer taking out
a parser process.

To avoid bitbake processes just hanging indefinitely, add a timeout to
our lock calls using a context manager. If we can't obtain the lock
after waiting 5 minutes, hard exit out using os._exit(1). Use _exit()
to avoid locking in any other places trying to write error messages to
event handler queues (which also need locks).
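
The timeout-and-hard-exit pattern described above can be sketched as follows. This is an illustration only: lock_with_timeout and its on_timeout hook are hypothetical names (the hook exists so the sketch can be exercised without killing the interpreter), and the real helper may differ.

```python
import contextlib
import os


def _hard_exit():
    # os._exit() skips cleanup handlers, so nothing else can block on a lock.
    os._exit(1)


@contextlib.contextmanager
def lock_with_timeout(lock, timeout=5 * 60, on_timeout=_hard_exit):
    """Hold lock for the duration of the block, giving up after timeout seconds."""
    if not lock.acquire(timeout=timeout):
        on_timeout()
        # Only reached if on_timeout returned instead of exiting.
        raise TimeoutError("could not obtain the lock")
    try:
        yield
    finally:
        lock.release()
```

Both threading.Lock and multiprocessing.Lock accept a timeout argument to acquire(), so the same wrapper covers thread and process locks.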

Whilst a bit harsh, this should mean we stop having lots of long running
processes in cases where things are never going to work out and also
avoids hanging builds on the autobuilder.

(Bitbake rev: d2a3f662b0eed900fc012a392bfa0a365df0df9b)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2023-01-05 11:50:17 +00:00
Richard Purdie
4c57c6eeec bitbake: server/process: Run idle commands in a separate idle thread
When bitbake is off running heavier "idle" commands, it doesn't service its
command socket which means stopping/interrupting it is hard. It also means we
can't "ping" from the UI to know if it is still alive.

For those reasons, split idle command execution into its own thread.

The commands are generally already self-contained so this is easier than
expected. We do have to be careful to only handle inotify poll() from a single
thread at a time. It also means we always have to use a thread lock when sending
events since both the idle thread and the command thread may generate log messages
(and hence events). The patch depends on previous fixes to the builtins locking
in event.py and the heartbeat enable/disable changes as well as other locking
additions.

We use a condition to signal from the idle thread when other sections of code
can continue, thanks to Joshua Watt for the review and tweaks squashed into this
patch. We do have some sync points where we need to ensure any currently executing
commands have finished before we can start a new async command for example.

(Bitbake rev: 67dd9a5e84811df8869a82da6a37a41ee8fe94e2)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-12-31 17:05:17 +00:00
Richard Purdie
3cc9aed5a5 bitbake: server/process: Add locking around idle functions accesses
In preparation for splitting bitbake's work into two threads,
add locking around the idle functions list accesses.

(Bitbake rev: a9c63ce8932898b595fb7776cf5467d3c0afe4f7)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-12-31 17:05:17 +00:00
Richard Purdie
c4ecfc4dc5 bitbake: server/process: Improve idle loop exit code
When idle handlers want to exit, returning "False" isn't very clear
and also causes challenges with the ordering of removing the idle
handler and marking that no async command is running.

Use a specific class to signal the exit condition allowing clearer code
and allowing the async command to be cleared after the handler has been
removed, reducing any opportunity for races.
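
The idea of signalling completion with a dedicated class instead of a bare False can be sketched like this. IdleFinished, run_idle_pass and clear_async_command are hypothetical names used only for the example.

```python
class IdleFinished:
    """Returned by an idle handler to say it has finished and should be removed."""

    def __init__(self, message=""):
        self.message = message


def run_idle_pass(handlers, clear_async_command):
    """Call each handler once, dropping those that report completion."""
    for handler in list(handlers):
        result = handler()
        if isinstance(result, IdleFinished):
            # Remove the handler first, then mark the async command as done,
            # so nothing observes "no command running" while the handler is
            # still registered.
            handlers.remove(handler)
            clear_async_command()
```

An explicit type also means a handler that returns False or None by accident is not mistaken for one that wants to exit.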

(Bitbake rev: 102e8d0d4c5c0dd8c7ba09ad26589deec77e4308)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-12-31 17:05:17 +00:00
Richard Purdie
7723baafa6 bitbake: server/process: Improve exception and idle function logging
Currently if the idle functions loop suffers a traceback, it is
silently dropped and there is no log message to say what happened.
This change at least means the traceback is in the cooker log, making
some debugging possible.

Add some logging to show when handlers are added/removed to allow
a better idea of what the server code is doing from the server log
file.

(Bitbake rev: 9cf3102dc36513124fe5ead2f1e448b51833b6ac)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-12-31 17:05:17 +00:00
Richard Purdie
cb8efd4d20 bitbake: event: Add enable/disable heartbeat code
Currently heartbeat events are always generated by the server whilst it is
active. Change this so they only appear when builds are running, which is
when most code would expect to be executed. This removes a number of races
around changes in the datastore which can happen outside of builds.

(Bitbake rev: 8c36c90afc392980d999a981a924dc7d22e2766e)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-12-31 17:05:17 +00:00
Richard Purdie
98f1b3d6ae bitbake: server/process: Add bitbake.sock race handling
We've seen cases where the bitbake.sock file appears to disappear but the
server continues to hold bitbake.lock. The most likely explanation is
that some previous build directory was moved out of the way, a server there
kept running, eventually exited and removed the sock file from the wrong
directory.

To guard against this, save the inode information for the sock file and check
it before deleting the file. The new code isn't entirely race free but should
guard against what is a rare but annoying potential issue.
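
A sketch of the inode guard, under the assumption that comparing the device and inode numbers is enough to identify the file. The helper names are hypothetical, and as the message notes, a check like this is not entirely race free because the file can change between the stat() and the unlink().

```python
import os


def remember_identity(path):
    """Record which file currently lives at path."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def unlink_if_unchanged(path, identity):
    """Delete path only if it is still the file we recorded; return True if deleted."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return False
    if (st.st_dev, st.st_ino) != identity:
        # Someone else's file now lives at this path; leave it alone.
        return False
    os.unlink(path)
    return True
```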

(Bitbake rev: b02ebbffdae27e564450446bf84c4e98d094ee4a)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-12-21 14:15:26 +00:00
Frank de Brabander
3c6753d03d bitbake: process: log odd unlink events with bitbake.sock
Log when the socket file already exists and is removed before
recreating a new socket.

Log when unlinking the socket file failed.

(Bitbake rev: cfd7c9899f988bab6d9fe7bbfbdb60603fb5ed34)

Signed-off-by: Frank de Brabander <debrabander@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-12-17 23:50:13 +00:00
Richard Purdie
77a9c9b66b bitbake: main/process: Add extra sockname debugging
We're struggling to understand how bitbake.sock can sometimes disappear
in live builds when we can't see where it could have been deleted.
This causes connection failures to the server and failed builds.

Add some extra debugging around the server log and client retry
log messages to give more information for the next time this issue
occurs.

(Bitbake rev: 376a516dc8c96727fd042ada65f803013601ee2d)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-12-13 23:26:28 +00:00
Richard Purdie
a9505a86fd bitbake: main/server: Add lockfile debugging upon server retry
We keep seeing server issues where the lockfile is present but we can't
connect to it. Reuse the lockfile debugging code from the server to
dump better information to the console from the client side when we
run into this issue. Whilst not pretty, this might give us a chance
of being able to debug the problems further.

(Bitbake rev: 22685460b5ecb1aeb4ff3436088ecdacb43044d7)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-12-09 13:22:11 +00:00
Richard Purdie
16bc168084 bitbake: server: Ensure cooker profiling works
The previous cleanups meant that when the cooker was started, profiling
was always disabled as configuration was sent to the server later and this
was too late to profile the main loop.

Pass the "profile" option over the server commandline so that we can
profile cooker itself again, the setting can now take effect early enough.

(Bitbake rev: c97c1f1c127ef3f8fbbd1b4e187ab58bfb0a73e5)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-11-20 08:31:28 +00:00
Richard Purdie
4a241d0cfb bitbake: server/process: Fix logging issues where only the first message was displayed
I realised only the first logging message was being displayed in a given
parsing process. The reason turned out to be the UI handler failing
with a "pop from empty list". The default handler was then lost and
no further messages were processed.

Fix this by catching the exception correctly in the connection writer code.

(Bitbake rev: d3e64f64525187f1409531a0bd99df576e627f7f)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-06-25 21:14:07 +01:00
Richard Purdie
6c9a516e76 bitbake: server/process: Avoid tracebacks at exit
In theory this should have been worked around but is still occurring. Add
it to the list of things to ignore when bitbake is shutting down.

Traceback (most recent call last):
  File "/usr/lib64/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/server/process.py", line 698, in startCallbackHandler
    event = self.reader.get()
  File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/server/process.py", line 722, in get
    res = self.reader.recv_bytes()
  File "/usr/lib64/python3.9/multiprocessing/connection.py", line 221, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/usr/lib64/python3.9/multiprocessing/connection.py", line 426, in _recv_bytes
    return self._recv(size)
  File "/usr/lib64/python3.9/multiprocessing/connection.py", line 384, in _recv
    chunk = read(handle, remaining)
TypeError: an integer is required (got type NoneType)'

(Bitbake rev: 7a28ac4fe478bee1e52e84412da9626495f9c6c7)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-06-10 13:40:02 +01:00
Richard Purdie
ce592bc9ac bitbake: server/process: Remove daemonic thread usage
We're seeing UI deadlocks occasionally and this is possibly due to the
use of a daemonic thread in the UI event queue processing. This thread
could terminate holding a threading Lock() which would cause issues
for the process when exiting.

Change the shutdown process to handle this more cleanly.

(Bitbake rev: f5ad8349a5dbff9824a89f5708cfd011d61888c9)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-06-08 21:53:15 +01:00
Richard Purdie
5940020cfb bitbake: server/process: Avoid risk of exception deadlocks
The open coded lock acquire/release in the UI event handler doesn't
cover the case where an exception occurs and, if one did, it could deadlock
the code. Switch to use 'with' statements which would handle this
possibility.

We have seen deadlocks in the UI at exit, so this removes a
possible cause.

(Bitbake rev: bd12792f28efd2f03510653ec947ebf961315272)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-06-08 21:53:15 +01:00
Richard Purdie
a2e0ed2cbf bitbake: server/process: Drop unused import
(Bitbake rev: 543315e6463f15ca7ab2b4ef3e8ed41bb4207ccf)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-04-21 21:00:35 +01:00
Richard Purdie
496cbc01ca bitbake: server/process: Disable gc around critical section
The python gc can trigger whilst we're holding the event stream lock
and when cleaning up objects, they can trigger warnings. This translates
into a new event which would then need the lock and we can deadlock.

Disable gc whilst we hold that lock to avoid this unfortunate and
problematic situation.
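
One way to pause the collector around a critical section is a small context manager like the one below. The name gc_paused is hypothetical, and this sketch restores the previous state instead of enabling gc unconditionally, which is a design choice of the example.

```python
import contextlib
import gc


@contextlib.contextmanager
def gc_paused():
    """Disable the cyclic garbage collector for the duration of the block."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        # Only re-enable if it was on before, so nested use behaves sensibly.
        if was_enabled:
            gc.enable()
```

Used as "with gc_paused(): ..." inside the locked region, this prevents a collection (and any warnings it triggers) from starting while the lock is held.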

(Bitbake rev: 96a6303949cefd469bcf5ed250ff512271354357)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-04-03 17:51:26 +01:00
Richard Purdie
928bcb10a4 bitbake: cooker/process: Fix signal handling lockups
If a parser process is terminated while holding a write lock, then it
will lead to a deadlock (see
https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Process.terminate).

With SIGTERM, we don't want to terminate holding the lock. We also don't
want a SIGINT to cause a partial write to the event stream.

I tried using signal masks to avoid this but it doesn't work, see
https://bugs.python.org/issue47139

Instead, add a signal handler and catch the calls around the critical section.
We also need a thread lock to ensure other threads in the same process don't
handle the signal until no thread is inside the lock.

(Bitbake rev: a40efaa5556a188dfe46c8d060adde37dc400dcd)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-03-30 13:05:03 +01:00
Peter Kjellerstedt
279e754d86 bitbake: server/process: Correct a typo in a comment
(Bitbake rev: b4a157b2fe2fb481ffa40e0f32659d05dd6320c2)

Signed-off-by: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-03-28 13:33:28 +01:00
Richard Purdie
cff6c1a18d bitbake: server/process: Move threads left debug to after cooker shutdown
This debug is useful but the cooker shutdown or post_serve() may have cleanup
left so run after those.

(Bitbake rev: 1463fc0448d1a6a7265806a4a8b165b610dfb43f)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-03-26 09:27:43 +00:00
Richard Purdie
34e4eebc32 bitbake: lib/bb: Fix string concatenation potential performance issues
Python scales badly when concatenating strings in loops. Most of these
references aren't problematic but at least one (in data.py) is probably
a performance issue as the issue is compounded as strings become large.

The way to handle this in python is to create lists which don't reconstruct
all the objects when appending to them. We may as well fix all the references
since it stops them being copy/pasted into something problematic in the future.
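
The list-based pattern the message refers to looks like this in general form. The function render_lines is a made-up example and is not taken from data.py.

```python
def render_lines(items):
    """Build one large string from many pieces without repeated concatenation."""
    parts = []
    for key, value in items:
        # Appending to a list does not copy what has been accumulated so far,
        # unlike "result += piece", which may rebuild the whole string.
        parts.append(f"{key}={value}\n")
    # A single join at the end copies each piece once.
    return "".join(parts)
```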

This patch was based on issues highlighted by a report from AWS Codeguru.

(Bitbake rev: d654139a833127b16274dca0ccbbab7e3bb33ed0)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-11-03 10:12:42 +00:00
Alexander Kanavin
fbbc0f7461 bitbake: bitbake: correct deprecation warning in process.py
(Bitbake rev: aff52fe21a0b27f6302555c1e52a864550eb46ce)

Signed-off-by: Alexander Kanavin <alex@linutronix.de>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-09-17 07:26:24 +01:00
Martin Jansa
b978f7c3a0 bitbake: cooker/process: Fix typos in exiting message
(Bitbake rev: 1ff1ea3880d293b14ce0fc65e3bc4c938d587a2f)

Signed-off-by: Martin Jansa <Martin.Jansa@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-09-01 18:56:25 +01:00
Richard Purdie
dc7ef0f896 bitbake: process: Improve traceback error reporting from main loop
Currently the code can just show nothing as the exception if there was a double
fault, which in this code path is quite likely. This leads to an error log
which effectively says "it failed" with no information about how.

Improve things so we get a nice verbose traceback left in the logs/output
which is preferable to no logs.

(Bitbake rev: e5782b71647d1eb6de53bde7bc4f6019a5589f21)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-08-06 06:34:58 +01:00
Joshua Watt
75f491e5e2 bitbake: server: Fix early parsing errors preventing zombie bitbake
If the client process never sends cooker data, the server timeout will
be 0.0, not None. This will prevent the server from exiting, as it is
waiting for a new client. In particular, the client will disconnect with
a bad "INHERIT" line, such as:

    INHERIT += "this-class-does-not-exist"

Instead of checking explicitly for None, check for a false value, which
means either 0.0 or None.
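
The distinction between the two checks can be shown with a tiny sketch. The function name no_timeout_configured is hypothetical and stands in for whatever condition the server evaluates.

```python
def no_timeout_configured(timeout):
    """Return True when the server has no idle timeout set."""
    # "timeout is None" would miss 0.0, which is what the server ends up
    # with when the client never sends cooker data. A truthiness test
    # treats None, 0 and 0.0 the same way.
    return not timeout
```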

(Bitbake rev: 13e2855bff6a6ead6dbd33c5be4b988aafcd4afa)

Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-07-20 19:00:32 +01:00
Richard Purdie
2f12a20935 bitbake: server/process: Handle error in heartbeat function in OOM case
We've seen cases where an OOM error causes bitbake server to hang:

9171 02:21:09.127810 Command Completed
Traceback (most recent call last):
  File "/home/pokybuild/yocto-worker/qemux86/build/bitbake/bin/bitbake-server", line 51, in <module>
    bb.server.process.execServer(lockfd, readypipeinfd, lockname, sockname, timeout, xmlrpcinterface)
  File "/home/pokybuild/yocto-worker/qemux86/build/bitbake/lib/bb/server/process.py", line 550, in execServer
    server.run()
  File "/home/pokybuild/yocto-worker/qemux86/build/bitbake/lib/bb/server/process.py", line 108, in run
    ret = self.main()
  File "/home/pokybuild/yocto-worker/qemux86/build/bitbake/lib/bb/server/process.py", line 242, in main
    ready = self.idle_commands(.1, fds)
  File "/home/pokybuild/yocto-worker/qemux86/build/bitbake/lib/bb/server/process.py", line 370, in idle_commands
    bb.event.fire(heartbeat, self.cooker.data)
  File "/home/pokybuild/yocto-worker/qemux86/build/bitbake/lib/bb/event.py", line 216, in fire
    fire_class_handlers(event, d)
  File "/home/pokybuild/yocto-worker/qemux86/build/bitbake/lib/bb/event.py", line 123, in fire_class_handlers
    execute_handler(name, handler, event, d)
  File "/home/pokybuild/yocto-worker/qemux86/build/bitbake/lib/bb/event.py", line 93, in execute_handler
    ret = handler(event)
  File "/home/pokybuild/yocto-worker/qemux86/build/meta/classes/buildstats.bbclass", line 182, in defaultrun_buildstats
    write_host_data(os.path.join(bsdir, "host_stats"), e, d, "interval")
  File "/home/pokybuild/yocto-worker/qemux86/build/meta/classes/buildstats.bbclass", line 160, in write_host_data
    output = subprocess.check_output(c.split(), stderr=subprocess.STDOUT, timeout=limit).decode('utf-8')
  File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1295, in _execute_child
    restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory

We need to wrap the calls in the same high level wrapper as idle function calls
and trigger an exit upon an unhandled exception.

(Bitbake rev: 74042b5b89d5a170013fc1a327ce3a6530fbf7d5)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-05-18 23:53:15 +01:00
Ross Burton
01066a584a bitbake: bitbake-server: ensure server timeout is a float
bitbake-server is spawned by process.py and passes the arguments it is
given to ProcessServer.  There's some type confusion here:

bitbake-server is called with a string representation of the timeout,
which may be None.  If the timeout is not set, pass 0 instead of None.

Inside bitbake-server a ProcessServer is created which expects the
timeout to be a float not a string, so always float() the value.

[ YOCTO #14350 ]

(Bitbake rev: c93ae1f861208f6d39fd15c84fbcd0e2b54331f5)

Signed-off-by: Ross Burton <ross.burton@arm.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-04-20 13:57:50 +01:00
Richard Purdie
0c0b236b4c bitbake: process: Show command exceptions in the server log as well
There are autobuilder logs where the server commands are failing
but we have no debug info in the server log. Improve this to try and
understand what is failing.

(Bitbake rev: 04d3a79226c9ea448b22f4efbab33876a72c9bdb)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2020-10-11 13:44:26 +01:00
Richard Purdie
ef21d08424 bitbake: server/process: Note when commands complete in logs
It's hard to tell from the server logs whether commands complete or not
(or how long they take). Add extra info to allow more debugging of
server timeouts.

(Bitbake rev: 56285ada585ec1481449522282b335bcb5a2671e)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2020-09-05 11:45:18 +01:00