Commit Graph

511 Commits

Author SHA1 Message Date
Joshua Watt
c6763d27ef bitbake: runqueue: Pass hashfn in taskdep data
Include the hashfn (the value of BB_HASHFILENAME) in the task dependency
data. This allows tasks to get a specific unique hash for dependent
tasks when one is available.

(Bitbake rev: 4dbecf6059e495246267b09d0f43086d51e6df2c)

Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2023-06-01 08:08:33 +01:00
Richard Purdie
695998f921 bitbake: cooker: Add FILE_LAYERNAME variable containing the layername for a recipe
There are times when it would be useful for code to know which layer
(or collection in old bitbake terms) it is contained within.

Add support for FILE_LAYERNAME to be set by bitbake when parsing a recipe
so that it is possible to determine this. To do it, we need to pass data
from the cooker into the recipe endpoints, since only the top level cooker
information knows about the layer structure which makes the patch a bit
painful.

The idea is that this would make layer overrides possible:

OVERRIDES .= ":layer-${FILE_LAYERNAME}"

which then opens possibilities like:

WARN_QA:append:layer-core = " patch-fuzz"

as an example where OE-Core could enable specific QA tests only for that
specific layer.

(Bitbake rev: 7090a14b0035842112d073acf7f2ed1a01fdeccf)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2023-05-25 13:16:24 +01:00
Chen Qi
ba94f9a3b1 bitbake: runqueue: fix PSI check calculation
The current PSI check calculation does not take into consideration
the possibility of the time interval between last check and current
check being much larger than 1s. In fact, the current behavior does
not match what the manual says about BB_PRESSURE_MAX_XXX, even if
the value is set to upper limit, 1000000, we still get many blocks
on new task launch. The difference between 'total' should be divided
by the time interval if it's larger than 1s.

(Bitbake rev: b4763c2c93e7494e0a27f5970c19c1aac66c228b)

Signed-off-by: Chen Qi <Qi.Chen@windriver.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2023-04-13 12:01:45 +01:00
Richard Purdie
42c92cbc83 bitbake: event/cooker/runqueue: Add ability to interrupt longer running code
Bitbake is now able to understand when a UI wants it to stop the current
processing. There are some long running functions which currently have no
mechanism to interrupt them however.

This patch adds a call, bb.event.check_for_interrupts(d) which can be
placed in long running code and allows an internal state flag within
bitbake to be checked. If set, that flag will trigger an exit.

This means that Ctrl+C can be made to work in a variety of places where
it currently would not.

Long running event handlers in OE-Core can also then benefit from this
new approach with the addition of the function call as well.

(Bitbake rev: b7ed7e9a815c4e10447fd499508be3dbb47f06e8)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2023-02-23 12:12:18 +00:00
Richard Purdie
a7e3c0f046 bitbake: runqueue: Drop SystemExit usage
Using bb.fatal for a fatal error message is the best practise, switch
the code to match other call sites.

(Bitbake rev: c27e48fa81c2327a4a355a028884ab457cde3ae7)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2023-02-19 07:48:28 +00:00
Richard Purdie
907e91796e bitbake: event: builtins fix for 'd' deletion
I've been seeing event handlers where 'd' seems to disappear half way through
event handler execution. This is problematic when multiple threads are active
since this code assumes single threading.

The easiest fix is to change the handler function calls to contain d as a
parameter as we do elsewhere for other functions. This will break any non-text
handlers but I was only able to spot one of those in runqueue. It will also
break handlers than call functions that assume 'd' is in the global namespace
but those failures should be obvious and we can fix those to pass d around.

This solution avoids manipulating builtins which was always a horrible thing
to do anyway and solves the issue without needing locking, thankfully.

(Bitbake rev: 1e12f0a4b592dacd006d370ec29cd71d2a44312e)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-12-29 00:07:06 +00:00
Richard Purdie
1a45c29ff1 bitbake: build/siggen: Rework stamps functions
The current method of passing either a task's datastore, or
dataCaches and a filename into the stamp functions is rather
horrible.

Due to the different contexts, fixing this is hard but we do control
the bitbake side of the API usage so we can migrate those to use other
functions and then only support a datastore in the public bb.build API
which is only called from task context in OE-Core.

(Bitbake rev: c79ecec580e4c2a141ae483ec0f6448f70593dcf)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-12-17 08:52:28 +00:00
Richard Purdie
2946c56b23 bitbake: bitbake: siggen/runqueue: Switch to using RECIPE_SIGGEN_INFO feature for signature dumping
Now that we have cache support for the taskdep/gendep/lookupcache data,
we can switch to use that cooker feature and skip the secondary reparse to
write the sig files. This does make the initial parse longer but means the
secondary one isn't needed.

At present parsing with the larger cache isn't optimal but we have plans
in place which will make this faster than the current reparse code being
removed here.

(Bitbake rev: 5951b5b56449855bc2a30146af65eb287a35fcef)

(Bitbake rev: 1252e5bce51ae912ecff9dcc354a371786ff2c72)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-12-17 08:52:28 +00:00
Richard Purdie
4754b1021e bitbake: siggen: Directly store datacaches reference
It is becomming clear the siggen needs access to our cache data but we
can't always obtain it in the contexts we need to. Add it directly,
meaning over time we should be able to simplify the APIs and stop
convoluting new ones!

(Bitbake rev: 6b213590ed0e77683cf7fbce6bbe9605ddecf3d3)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-12-17 08:52:28 +00:00
Richard Purdie
f329142a31 bitbake: build/siggen/runqueue: Drop do_setscene references
do_setscene was from a different era before our modern setscene
per task code. It hasn't been used for years so remove some old
obsolete references to it.

(Bitbake rev: ef72282298f7c4db74383c23bb0251dd06d3c6d3)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-12-11 23:21:27 +00:00
Richard Purdie
a40875b727 bitbake: siggen: Drop non-multiconfig aware siggen support
All siggens in common use should now support multiconfig, drop the
compatibility code.

(Bitbake rev: b36545b4df6d935ed312ff407d4e0474c3ed8d1a)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-12-11 23:21:27 +00:00
Richard Purdie
70a7e7337b bitbake: runqueue: Improve error message for missing multiconfig
If you place a multiconfig which isn't enabled into an mcdepends you currently
get a traceback from runqueue. We can do better, add some code to tell the user
what happened in a more readable way without the traceback.

[YOCTO #14970]

(Bitbake rev: a4693b70764bb394ee2cf8dd12a5f6fce866008b)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-11-29 10:25:51 +00:00
Richard Purdie
5fdd28e37f bitbake: runqueue: Fix race issues around hash equivalence and sstate reuse
We identified a use case where a native recipe (autoconf-native) was
rebuilt with no change in output yet the sstate for do_package tasks
wasn't being used.

The issue is that do_package tasks have a hard dependency on
pseudo-native:do_populate_sysroot. That task was one of the many
tasks being rehashed when autoconf-native's hash was changed.

If update_tasks processed a recipe before it had processed pseudo-native,
that recipe would be marked as not possible from sstate and would
run the full tasks.

The fix is to split the processing into two passes, first to handle
the existing covered/notcovered updates, then in the second pass,
check whether there are "harddep" issues.

This defers the do_package tasks until after pseudo-native is installed
from sstate as expected and everything works well again.

(Bitbake rev: e479d1e418a7d34f0a4663b4a0e22bb11503c8ab)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-11-22 15:28:27 +00:00
Richard Purdie
09da786273 bitbake: runqueue: Add further debug for sstate reuse issues
Even after enabling all our debugging, we had a reproducible test case
where sstate wasn't being reused and it was unclear from the logs why.

This patch adds debugging on the possible codepaths that were breaking
and allowed the issue to be debugged and fixed.

(Bitbake rev: 9233ad685b9b5e9eeb775fc71761712aaf0e876c)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-11-22 15:28:27 +00:00
Richard Purdie
228f9a3a2d bitbake: worker/runqueue: Reduce initial data transfer in workerdata
When setting up the worker we were transfering large amounts of data
which aren't needed until task execution time.

Defer the fakeroot and taskdeps data until they're needed for a specific
task. This will duplicate some information when executing different tasks
for a given recipe but as is is spread over the build run, it shouldn't
be an issue overall.

Also take the opportunity to clean up the silly length argument lists
that were being passed around at the expense of extra dictionary keys.

(Bitbake rev: 3a82acdcf40bdccd933c4dcef3d7e480f0d7ad3a)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-11-20 08:31:28 +00:00
Richard Purdie
3a4aeda0fa bitbake: cache/cookerdata: Move recipe parsing functions from cache to databuilder
When 'NoCache' was written, databuilder/cookerdata didn't exist. It does
now and the recipe parsing functionality contained in NoCache clearly
belongs there, it isn't a cache function. Move those functions, renaming
to match the style in databuilder but otherwise not changing functionality
for now. Fix up the callers to match (which make it clear this is the right
move).

(Bitbake rev: 783879319c6a4cf3639fcbf763b964e42f602eca)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-11-20 08:31:28 +00:00
Oliver Lang
a09806e943 bitbake: runqueue: fix a typo
(Bitbake rev: 82f40261a06d39f0e7748942f480da5b44282fa3)

Signed-off-by: Oliver Lang <quantenkeks@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-10-25 13:58:49 +01:00
Richard Purdie
1a5c4140b0 bitbake: runqueue: Change pressure file warning to a note
The user does need to be told about this but it isn't really a warning,
just something they may need to be aware of. Drop the level accordingly.

(Bitbake rev: 9bdedc8074990e613c9567e2cd8072f8d885f07f)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-08-24 15:43:36 +01:00
Aryaman Gupta
f5c616d5af bitbake: runqueue: add memory pressure regulation
Prevent new tasks from being scheduled if the memory pressure is above
a certain threshold, specified through the "BB_MAX_PRESSURE_MEMORY"
variable in the conf/local.conf file. This is an extension to the
following commit and hence regulates pressure in the same way:
   48a6d84de1 bitbake: runqueue: add cpu/io pressure regulation

Memory pressure is experienced when time is spent swapping, refaulting
pages from the page cache or performing direct reclaim. This is why
memory pressure is rarely seen but might be useful as a last resort to
prevent OOM errors.

(Bitbake rev: 44c395434c7be8dab968630a610c8807f512920c)

Signed-off-by: Aryaman Gupta <aryaman.gupta@windriver.com>
Signed-off-by: Randy Macleod <Randy.Macleod@windriver.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-08-12 11:49:29 +01:00
Richard Purdie
c58fca98e4 bitbake: runqueue: Drop deadlock breaking force fail
I'm 99% certain this failing of a scenequeue task corrupts runqueue and
causes all kinds of breakage. I'd rather runqueue deadlocked than corrupted
and did weird things so drop this code.

We've seen builds where the deadlock triggers and it then tries to run tasks
where the SQ task already ran with very confusing failures. It is likely it
is this code causing it.

(Bitbake rev: 8efced47fcb47851a370fd6786df6fb377f99963)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-08-08 15:44:47 +01:00
Richard Purdie
645aa39512 bitbake: runqueue: Improve deadlock warning messages
Tweak the deadlock breaking messages to be explict about which task is
blocked on which other task. The messages currently imply it is "freeing"
the blocking task which is confusing.

(Bitbake rev: cf7f60b83adaded180f6717cb4681edc1d65b66d)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-08-08 15:44:47 +01:00
Richard Purdie
377a671e07 bitbake: runqueue: Ensure deferred tasks are sorted by multiconfig
We have to prefer one multiconfig over another when deferring tasks, else
we'll have cross-linked build trees and nothing will be able to build.

In the original population code, we sort like this but we don't after
rehashing. Ensure we have the same sorting after rehashing toa void
deadlocks.

(Bitbake rev: 27228c7f026acb8ae9e1211d0486ffb7338123a2)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-08-08 15:44:47 +01:00
Aryaman Gupta
48a6d84de1 bitbake: runqueue: add cpu/io pressure regulation
Prevent the scheduler from starting new tasks if the current cpu or io
pressure is above a certain threshold and there is at least one active
task. This threshold can be specified through the
"BB_PRESSURE_MAX_{CPU|IO}" variables in conf/local.conf.

The threshold represents the difference in "total" pressure from the
previous second. The pressure data is discussed in this oe-core commit:
   061931520b buildstats.py: enable collection of /proc/pressure data
where one can see that the average and "total" values are available.
From tests, it was seen that while using the averaged data was somewhat
useful, the latency in regulating builds was too high. By taking the
difference between the current pressure and the pressure seen in the
previous second, better regulation occurs. Using a shorter time period
is appealing but due to fluctations in pressure, comparing the current
pressure to 1 second ago achieves a reasonable compromise. One can look
at the buildstats logs, that usually sample once per second, to decide a
sensible threshold.

If the thresholds aren't specified, pressure is not monitored and hence
there is no impact on build times. Arbitary lower limit of 1.0 results
in a fatal error to avoid extremely long builds. If the limits are higher
than 1,000,000, then warnings are issued to inform users that the specified
limit is very high and unlikely to result in any regulation.

The current bitbake scheduling algorithm requires that at least one
task be active. This means that if high pressure is seen, then new tasks
will not be started and pressure will be checked only for as long as at
least one task is active. When there are no active tasks, an additional task
will be started and pressure checking resumed. This behaviour means that
if an external source is causing the pressure to exceed the threshold,
bitbake will continue to make some progress towards the requested target.
This violates the intent of limiting pressure but, given the current
scheduling algorithm as described above, there seems to be no other option.
In the case where only one bitbake build is running, the implications of
the scheduler requirement will likely result in pressure being higher
than the threshold. More work would be required to ensure that
the pressure threshold is never exceeded, for example by adding pressure
monitoring to make and ninja.

(Bitbake rev: 502e05cbe67fb7a0e804dcc2cc0764a2e05c014f)

Signed-off-by: Aryaman Gupta <aryaman.gupta@windriver.com>
Signed-off-by: Randy Macleod <randy.macleod@windriver.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-07-28 11:55:06 +01:00
Richard Purdie
3664efc86a bitbake: runqueue: Fix unihash cache mismatch issues
Very occasionally we see errors in eSDK testing on the autobuilder where the task
hashes in the eSDK don't match what was just built. I was able to inspect one of
these build directories and noticed that the bb_unihashes.dat file in the eSDK
was zero sized. Whilst inspecting the code to understand the cause, I noticed that
updated hashes are not saved out in subsequent updates of the values in the rehash
process.

Add a missing sync call to ensure this happens.

(Bitbake rev: 7912dabbcf444a3c3d971cca4a944a8b931e301b)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-06-02 12:28:21 +01:00
Richard Purdie
a0b8419a7c bitbake: runqueue: Fix sig file location when using multiconfig
We're using the wrong data store when trying to locate siginfo files,
fix this. Thanks to Gregory Lumen <gregorylumen@microsoft.com> for
spotting.

[YOCTO #14774]

(Bitbake rev: 0ed800e19a3197f8e622c8d3b630aae384e60aba)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-04-30 07:33:52 +01:00
Richard Purdie
6d1d9f3d78 bitbake: runqueue: Drop pointless variable assignment
This is set at the start of the loop anyway so it does nothing. Drop
the pointless code.

(Bitbake rev: e6a3173c9cdf349ccbd4cf612868f92cce8717c8)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-04-21 21:00:35 +01:00
Scott Murray
424269caea bitbake: lib/bb: Replace "abort" usage in task handling
In line with the inclusive language migration defined at:

https://wiki.yoctoproject.org/wiki/Inclusive_language

replace the use of "abort" with "halt" in code related to handling
task failure.

(Bitbake rev: 831fb7f2329a3cd95b71e9c85d7d7f0d717f947f)

Signed-off-by: Scott Murray <scott.murray@konsulko.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-02-21 23:37:26 +00:00
Scott Murray
3ff067647f bitbake: bitbake: Rename allowed multiple provider variable
In line with the inclusive language migration defined at:

https://wiki.yoctoproject.org/wiki/Inclusive_language

rename:

MULTI_PROVIDER_WHITELIST -> BB_MULTI_PROVIDER_ALLOWED

(Bitbake rev: a09546b725fda13c0279638c7c904110da7bf6cd)

(Bitbake rev: d035435c1a4951a45481867cf932faa4a6f8f936)

Signed-off-by: Scott Murray <scott.murray@konsulko.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-02-21 23:37:26 +00:00
Scott Murray
3c971c0400 bitbake: bitbake: Rename setscene enforce filtering variable
In line with the inclusive language migration defined at:

https://wiki.yoctoproject.org/wiki/Inclusive_language

rename:

BB_SETSCENE_ENFORCE_WHITELIST -> BB_SETSCENE_ENFORCE_IGNORE_TASKS

(Bitbake rev: 2e243ac06581c4de8c6e697dfba460ca017d067c)

(Bitbake rev: f8f7b80a0df4646247e58238a52a7d85a37116d4)

Signed-off-by: Scott Murray <scott.murray@konsulko.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-02-21 23:37:26 +00:00
Richard Purdie
15465422ec bitbake: runqueue: Drop BB_STAMP_POLICY/BB_STAMP_WHITELIST
The different stamp policies were poor versions of the siggen code and task
hashes, predating it and being used by packaged staging. They had many
limitations, hence their replacement. I'm not aware of any users of that
code any more so I believe it and the assoicated stamp whitelist variable
can simply be removed.

(Bitbake rev: 98407efc8c670abd71d3fa88ec3776ee9b5c38f3)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2022-01-12 21:10:24 +00:00
Richard Purdie
28937721ab bitbake: runqueue: Fix runall option handling
The previous fix for runall option handling had a small bug in it, it
didn't clear the originally processed task list which meant it was running
too many tasks. Fix this so the list is reset and rebuild correctly.

(Bitbake rev: 87c9e120897ed04dfc64d4752fc602f9bfcb8645)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-11-08 22:01:19 +00:00
Richard Purdie
47e5c456a2 bitbake: runqueue: Fix runall option task deletion ordering issue
The runbuild option handling in runqueue was flawed as items deleted from the
main task list may be dependencies and hence cause index errors.

Rather than modify runtaskentries straight away, compute a new shorted list
and use that as an input to the second phase. This avoids the need to add tasks
back to the list meaning delcount can be simplifed to a simple counter.

The second use case in runonly doen't re-add items so doesn't have this
issue.

(Bitbake rev: 3428e3c54eb5cc03ff96f9cee6dc839afee7a419)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-11-05 11:41:46 +00:00
Richard Purdie
34e4eebc32 bitbake: lib/bb: Fix string concatination potential performance issues
Python scales badly when concatinating strings in loops. Most of these
references aren't problematic but at least one (in data.py) is probably
a performance issue as the issue is compounded as strings become large.

The way to handle this in python is to create lists which don't reconstruct
all the objects when appending to them. We may as well fix all the references
since it stops them being copy/pasted into something problematic in the future.

This patch was based on issues highligthted by a report from AWS Codeguru.

(Bitbake rev: d654139a833127b16274dca0ccbbab7e3bb33ed0)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-11-03 10:12:42 +00:00
Richard Purdie
9abab49f2b bitbake: lib/bb: Clean up use of len()
(Bitbake rev: bbbc843e86639604d00d76b1949b94a78cf1d95d)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-11-03 10:12:42 +00:00
Richard Purdie
6b52061123 bitbake: runqueue/knotty: Improve UI handling of setscene task counting
The recent fixes to merge setscene and normal task accounting in runqueue
fixed some display issues but broke the task numbering of setscene tasks.

Add new accounting methods to the stats structure specifically designed
for setscene. This accounts for the fact that setscene tasks can rerun
multiple times in the build.

Then use the new data in the UI to correctly display the numbers the
user wants to see to understand progress.

(Bitbake rev: ed7e2da88bf4b7bfc7ebfc12b9bd6c0fb7d8c1aa)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-09-17 07:26:23 +01:00
Richard Purdie
15c87c405f bitbake: runqueue: Clean up task stats handling
When we parallelised normal and setscene tasks, the task stats
handling was left separate pending further thought. We had to remove
handling of the setscene tasks from the UI in order to maintain
consistent task numbering.

Currently, "0 of 0" tasks can be shown as setscene tasks execute
until the first normal task runs.

The only use left for sq_stats is in the active task numbers which
we can use the length of sq_ive for instead. We can therefore
drop it and return stats in all cases. This removes the "0 of 0" task
problem since the stats in all normal and setscene tasks matches.

[YOCTO #14479]

(Bitbake rev: eae6e947e37e18cded053814bd2a268b44fb25cd)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-09-11 22:39:19 +01:00
Richard Purdie
fd9c3a5a14 bitbake: runqueue: Fix issues with multiconfig deferred task deadlock messages
In multiconfig builds with large numbers of identical tasks, builds were
deadlocking after recent runqueue changes upon rebuilds where there was
heavy sstate usage (i.e. on second builds after a first completed).

The issue was that deferred tasks were being left indefinitely on
the deferred list. The deadlock handler was then "breaking" things
by failing tasks that had already succeeded, leading to the task
being on both covered and not covered lists, giving a further error.

The fix is to clean up the deferred task list when each setscene task
completes. I'd previously been hoping to avoid iterating that list
but it appears unavoidable.

[YOCTO #14342]

(Bitbake rev: ae24a0f2d2d8b4b5ec10efabd0e9362e560832ea)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-09-08 15:33:14 +01:00
Richard Purdie
7e599a1feb bitbake: runqueue: Avoid deadlock avoidance task graph corruption
If the deferred task deadlock avoidance code triggers, it could mark an executed
task as failed which leads to "covered and not covered" error messages. Improve
the logic so if the deadlock code is triggered, it doesn't cause the errors.

(Bitbake rev: 51bdd6cb3bd9e2c02e261fb578bb945b86b82c75)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-09-08 15:33:14 +01:00
Richard Purdie
a15eea5077 bitbake: runqueue: Improve multiconfig deferred task issues
The previous patches have exposed new issues with this code path,
the issues being around what should happen when the hash of a task
changes and the task is or is not on the deferred task list.

Rather than rebuilding the deferred task list during each rehash
event, build it once at the start of a build. This avoids the problem
of tasks being added back after they have run and also avoids problems
of always ensuring the same task is deferred. It also allows the
'outrightfail' codepath to be handled separately as the conditions
are subtly differnt.

One significant win for the new approch is the build is not continually
printing out lists of deferred tasks, that list remains fairly static
from the start of the build. Logic is added in to ensure a rehashed
task with a hash matching other deferred tasks is deferred along with
them as a small optimization.

An interesting test case for this code was reported by Mark Hatle
with four multiconfigs, each the same apart from TMPDIR and running a
build of:

bitbake buildtools-tarball mc:{one,two,three,four}:core-image-minimal

which is interesting in that the build of buildtools partially overlaps
core-image-minimal and the build has a rehash event for qemuwrapper-cross
even without any external hash equivalence server or preexisting data.

(Bitbake rev: bb424e0a6d274d398f434f7df63951da9ce305b3)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-08-05 17:15:54 +01:00
Richard Purdie
93aaa5e994 bitbake: runqueue: Handle deferred task rehashing in multiconfig builds
If the hash of a task changes and that hash is a deferred task (e.g. a multiconfig
build), we need to ensure that the hash change propagates through to all the tasks
else the build will run multiple copies of the task, sometimes with oddly differing
results as the outhashes of native tasks built in differing locations can confuse
things.

(Bitbake rev: 2db571324f755edc4981deecbcfdf0aaa5a97627)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-05-01 22:51:06 +01:00
Richard Purdie
76891afd76 bitbake: runqueue: Fix multiconfig deferred task sstate validity caching issue
We were testing the validity of deferred tasks setscene status "up front" which
is very unlikely to succeed and leads to cache invalidation issues. With the
change to rebuild the deferred task list, this status becomes out of sync. The
result was tasks being executed when they should not have been leading to extra
work for the build unnecessarily.

Instead, don't process validity status for deferred tasks and assume their
data will become available. If it doesn't, this will now result in a build
error as the setscene task will fail and the main task will run instead.

In theory we could try and track the state changes in the deferred list and
re-test validity then but I'm not sure it is worth the effort when the other
code path and errors in setscene tasks will give a pretty good idea of what
is happening anyway.

(Bitbake rev: edcafac13b3b241b6687419e59018d21811507a1)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-05-01 22:51:06 +01:00
Richard Purdie
7ea37b9291 bitbake: runqueue: Fix deferred task issues
In a multiconfig situation there are circumstances where firstly, tasks
are deferred when they shouldn't be, then later, tasks can end up as
both covered and not covered.

This patch fixes two related issues. Firstly, the stamp validity checking
is done up front in the build and not reevaulated. When rebuilding the
deferred task list after scenequeue hash change updates, we need therefore
need to check if a task was in notcovered *or* covered when deciding to
defer it. This avoids strange logs like:

NOTE: Running setscene task X of Y (mc:initrfs_guest:/A/alsa-state.bb:do_deploy_source_date_epoch_setscene)
NOTE: Deferring mc:initrfs_guest:/A/alsa-state.bb:do_deploy_source_date_epoch after mc:host:/A/alsa-state.bb:do_deploy_source_date_epoch

where tasks have run but are then deferred.

Since we're recalculating the whole list, we also need to clear it before
iterating to rebuild it. By ensuring covered tasks aren't added to the
deferred queue, the covered + notcovered issue should also be avoided.
in the task deadlock forcing code.

[YOCTO #14342]

(Bitbake rev: 3c8717fb9ee1114dd80fc1ad22ee6c9e312bdac7)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-04-18 11:38:22 +01:00
Richard Purdie
9982b2c0f2 bitbake: runqueue: Further fixes for confused setscene tasks
There is further evidence of tasks ending up being "covered" and "notcovered"
which shouldn't happen and is bad. The code that caused this problem last
time appears to have issues where stamps for tasks already exist.

Split out the setscene stamp checking code to a separate function and
use this when checking "hard dependencies" (like pseudo-native) so
that if the stamps exist and it will be "covered", it is not put on
the notcovered list.

(Bitbake rev: a1848a481e36b729c8e4130c394b1d462d4b488a)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-04-06 09:31:24 +01:00
Richard Purdie
86ec266d91 bitbake: runqueue/event: Add an event for notifying of stale setscene tasks
Use the new functionality in build.py to identify stale setscene tasks
and send an event to the metadata listing them. The metadata then
has the option of performing cleanup operations if it thinks that
appropriate.

(Bitbake rev: ef8c980a3ae92c168b7ca16a4d19cd38a9574761)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-03-23 22:51:25 +00:00
Tomasz Dziendzielski
5386b3db50 bitbake: runqueue: Print pseudo.log if fakeroot task failed
Currently if pseudo fails we can only see the path to pseudo.log. If we
have no access to server and can only rely on bitbake log then debugging
becomes impossible. This printing needs to be added in runqueue level,
not inside task execution, because in some cases task fails with pseudo
abort really early and we don't even see any log.

In this change I'm adding pseudo log printing in every fakeroot task
failure that logged `mismatch`, `error` or `fatal` to logfile, because
we have no other way to communicate with pseudo if it failed or not.
Only lines from last pseudo server execution will be printed.

(Bitbake rev: e7c664a947903ed7b868abef62af2ff5f8ef0dc6)

Signed-off-by: Tomasz Dziendzielski <tomasz.dziendzielski@gmail.com>
Signed-off-by: Jan Brzezanski <jan.brzezanski@gmail.com>
Signed-off-by: Adrian Walag
Signed-off-by: Paulo Neves <ptsneves@gmail.com>
Signed-off-by: Mikolaj Lasota <mikolaj.lasota@protonmail.com>
Signed-off-by: Wiktor Baura <wbaura@gmail.com>
Signed-off-by: Kamil Kwiek <kamil.kwiek@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-03-11 14:04:45 +00:00
Richard Purdie
31c4eec40a bitbake: runqueue: Add setscene task overlap sanity check
We've seen hard to debug issues where a task ends up in both the
covered and notcovered list. Add a sanity check to ensure if this
happens in future, we see it in the logs.

(Bitbake rev: 6e001410854792f9bb66a0409a2ac176171b0507)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-03-09 00:02:05 +00:00
Richard Purdie
341208aa87 bitbake: runqueue: Fix task execution corruption issue
We've seen occasional issues where linux-yocto:do_compile_kernelmodules would
run without do_shared_workdir running before it. do_shared_workdir is an
setscene task but never has an sstate object generated so it will always
rerun. This should not happen since compile_kernemodules should only
execute if a setscene that depends on it didn't run and that should trigger
do_shared_workdir not to be marked as covered.

The issue is that build-appliance-image:do_package is one of the tasks which
covers linux-yocto:do_compile_kernelmodules but it is also a noexec task
and has a dependecy on pseudo-native:do_populate_sysroot.

In the problem case, pseudo-native:do_populate_sysroot is unavailable but
marked as covered since it is noexec. The "harddeps" code then also marks it
as notcovered. No task should ever be both covered and notcovered and this
is where the problems come from.

The solution is for the harddeps code only to to fail tasks if they've not
already been handled in some way. The code is assuming code couldn't have
handled revdeps at this point but we now have clear evidence they can.

(Bitbake rev: f66556bbb38449789ceea2fd105e9f68df7fb660)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-03-09 00:02:05 +00:00
Richard Purdie
f4fb744657 bitbake: bitbake-worker/runqueue: Add support for BB_DEFAULT_UMASK
Currently each task has to have a umask specified individually. This
is leading to determinism issues since it is easy to miss specifying
this for an extra task.

Add support for specifing the default task umask globally which
simplifies the problem.

(Bitbake rev: 3e664599fd54a8a37ce587022fcbce5ca26f2ed3)

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-02-16 11:26:11 +00:00
Tomasz Dziendzielski
8b792d4f75 bitbake: event: Prevent bitbake from executing event handler for wrong multiconfig target
When multiconfig is used bitbake might try to run events that don't
exist for specific mc target. In cooker.py we pass
`self.databuilder.mcdata[mc]` data that contains names of events'
handlers per mc target, but fire_class_handlers uses global _handlers
variable that is created during parsing of all the targets.

This leads to a problem where bitbake runs event handler that don't
exist for a target or even overrides them - if multiple targets use
event handler with the same name but different code then only one
version will be executed for all targets.

See [YOCTO #13071] for detailed bug information.

Add mc target name as a prefix to event handler name so there won't be
two different handlers with the same name. Add internal __BBHANDLERS_MC
variable to have the handlers lists per machine.

(Bitbake rev: 5f7fdf7b2d8c59805c8ef4dae84f536baa5e172b)

Signed-off-by: Tomasz Dziendzielski <tomasz.dziendzielski@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-02-16 11:26:11 +00:00
Joshua Watt
75f87db413 bitbake: logging: Make bitbake logger compatible with python logger
The bitbake logger overrode the definition of the debug() logging call
to include a debug level, but this causes problems with code that may
be using standard python logging, since the extra argument is
interpreted differently.

Instead, change the bitbake loggers debug() call to match the python
logger call and add a debug2() and debug3() API to replace calls that
were logging to a different debug level.

[RP: Small fix to ensure bb.debug calls bbdebug()]
(Bitbake rev: f68682a79d83e6399eb403f30a1f113516575f51)

Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-02-10 23:48:16 +00:00