- Remove tristate choice support from Kconfig
- Stop using the PROVIDE() directive in the linker script
- Reduce the number of links for the combination of CONFIG_DEBUG_INFO_BTF
and CONFIG_KALLSYMS
- Enable the warning for symbol reference to .exit.* sections by default
- Fix warnings in RPM package builds
- Improve scripts/make_fit.py to generate a FIT image with separate base
DTB and overlays
- Improve choice value calculation in Kconfig
- Fix conditional prompt behavior in choice in Kconfig
- Remove support for the uncommon EMAIL environment variable in Debian
package builds
- Remove support for the uncommon "name <email>" form for the DEBEMAIL
environment variable
- Raise the minimum supported GNU Make version to 4.0
- Remove stale code for the absolute kallsyms
- Move header files commonly used for host programs to scripts/include/
- Introduce the pacman-pkg target to generate a pacman package used in
Arch Linux
- Clean up Kconfig
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmagBLUVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGmoUQAJ8pnURs0g+Rcyk6bdY/qtXBYkS+
nXpIK1ssFgRRgAQdeszYtvBqLFzb0wRCSie87G1AriD/JkVVTjCCY1For1y+vs0u
a7HfxitHhZpPyZW/T+WMQ3LViNccpkx+DFAcoRH8xOY/XPEJKVUby332jOIXMuyg
+NKIELQJVsLhcDofTUGb5VfIQektw219n5c4jKjXdNk4ZtE24xCRM5X528ZebwWJ
RZhMvJ968PyIH1IRXvNt6dsKBxoGIwPP8IO6yW9hzHaNsBqt7MGSChSel7r1VKpk
iwCNApJvEiVBe5wvTSVOVro7/8p/AZ70CQAqnMJV+dNnRqtGqW7NvL6XAjZRJgJJ
Uxe5NSrXgQd3FtqfcbXLetBgp9zGVt328nHm1HXHR5rFsvoOiTvO7hHPbhA+OoWJ
fs+jHzEXdAMRgsNrczPWU5Svq6MgGe4v8HBf0m8N1Uy65t/O+z9ti2QAw7kIFlbu
/VSFNjw4CHmNxGhnH0khCMsy85FwVIt9Ux+2d6IEc0gP8S1Qa1HgHGAoVI4U51eS
9dxEPVJNPOugaIVHheuS3wimEO6wzaJcQHn4IXaasMA7P6Yo4G/jiGoy4cb9qPTM
Hb+GaOltUy7vDoG4D2LSym8zR8rdKwbIf/5psdZrq/IWVKq5p+p7KWs3aOykSoM7
o6Hb532Ioalhm8je
=BYu7
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Remove tristate choice support from Kconfig
- Stop using the PROVIDE() directive in the linker script
- Reduce the number of links for the combination of CONFIG_KALLSYMS and
CONFIG_DEBUG_INFO_BTF
- Enable the warning for symbol reference to .exit.* sections by
default
- Fix warnings in RPM package builds
- Improve scripts/make_fit.py to generate a FIT image with separate
base DTB and overlays
- Improve choice value calculation in Kconfig
- Fix conditional prompt behavior in choice in Kconfig
- Remove support for the uncommon EMAIL environment variable in Debian
package builds
- Remove support for the uncommon "name <email>" form for the DEBEMAIL
environment variable
- Raise the minimum supported GNU Make version to 4.0
- Remove stale code for the absolute kallsyms
- Move header files commonly used for host programs to scripts/include/
- Introduce the pacman-pkg target to generate a pacman package used in
Arch Linux
- Clean up Kconfig
* tag 'kbuild-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (65 commits)
kbuild: doc: gcc to CC change
kallsyms: change sym_entry::percpu_absolute to bool type
kallsyms: unify seq and start_pos fields of struct sym_entry
kallsyms: add more original symbol type/name in comment lines
kallsyms: use \t instead of a tab in printf()
kallsyms: avoid repeated calculation of array size for markers
kbuild: add script and target to generate pacman package
modpost: use generic macros for hash table implementation
kbuild: move some helper headers from scripts/kconfig/ to scripts/include/
Makefile: add comment to discourage tools/* addition for kernel builds
kbuild: clean up scripts/remove-stale-files
kconfig: recursive checks drop file/lineno
kbuild: rpm-pkg: introduce a simple changelog section for kernel.spec
kallsyms: get rid of code for absolute kallsyms
kbuild: Create INSTALL_PATH directory if it does not exist
kbuild: Abort make on install failures
kconfig: remove 'e1' and 'e2' macros from expression deduplication
kconfig: remove SYMBOL_CHOICEVAL flag
kconfig: add const qualifiers to several function arguments
kconfig: call expr_eliminate_yn() at least once in expr_eliminate_dups()
...
Commit cf8e865810 ("arch: Remove Itanium (IA-64) architecture")
removed the last use of the absolute kallsyms.
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/all/20240221202655.2423854-1-jannh@google.com/
[masahiroy@kernel.org: rebase the code and reword the commit description]
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Commit b2b9d3a3f0 ("perf pmu: Support wildcards on pmu name in dynamic
pmu events") gives the following example for wildcarding a subset of
PMUs:
E.g., in a system with the following dynamic pmus:
mypmu_0
mypmu_1
mypmu_2
mypmu_4
perf stat -e mypmu_[01]/<config>/
Since commit f91fa2ae63 ("perf pmu: Refactor perf_pmu__match()"), only
"*" has been supported, removing the ability to subset PMUs, even though
parse-events.l still supports ? and [] characters.
Fix it by using fnmatch() when any glob character is detected and add a
test which covers that and other scenarios of
perf_pmu__match_ignoring_suffix().
Fixes: f91fa2ae63 ("perf pmu: Refactor perf_pmu__match()")
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: robin.murphy@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240626145448.896746-2-james.clark@arm.com
The test has been failing for some time when two separate runs of
perf benchmarks are recorded for cycles events and their counts are
compared, while once the recording was done with option --bpf-counters
and once without it. It is expected that the count of the samples
should be within a certain range, firstly the difference was set to be
within 10%, which was then later raised to 20%. However, the test case
keeps failing on certain architectures as recording the provided
benchmark can produce completely different counts based on the
current load of the system.
Sampling two separate runs on intel-eaglestream-spr-13 of "perf stat
--no-big-num -e cycles -- perf bench sched messaging -g 1 -l 100 -t":
Performance counter stats for 'perf bench sched messaging -g 1 -l 100 -t':
396782898 cycles
0.010051983 seconds time elapsed
0.008664000 seconds user
0.097058000 seconds sys
Performance counter stats for 'perf bench sched messaging -g 1 -l 100 -t':
1431133032 cycles
0.021803714 seconds time elapsed
0.023377000 seconds user
0.349918000 seconds sys
, which is ranging from 400mil to 1400mil samples.
Instead of recording the cycles use instructions event, which provides
more stable values. At the same time change the tested workload to one
of the provided testing workloads by perf that is not based on a
scheduler, which can provide another dependency on the current load.
Sampling instructions event with the new workload provide much more
stable results on intel-eaglestream-spr-13 of "perf stat --no-big-num
-e instructions -- perf test -w brstack":
Performance counter stats for 'perf test -w brstack':
64584494 instructions
0.009173945 seconds time elapsed
0.007262000 seconds user
0.002071000 seconds sys
Performance counter stats for 'perf test -w brstack':
64672669 instructions
0.008888135 seconds time elapsed
0.005018000 seconds user
0.004018000 seconds sys
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: mpetlan@redhat.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625092001.10909-1-vmolnaro@redhat.com
Make the tests code its own library. This is done to avoid compiling
code twice, once for the perf tool and once for the perf python
module.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625214117.953777-5-irogers@google.com
Test "perf probe of function from different CU" only checks if the perf
command has failed and doesn't test the --funcs output. In the issue
reported in the previous commit, the garbage output of the --funcs
command was being ignored by the test when it could have been caught.
The script first makes use of --funcs option with the perf probe command
to check if the function "foo" exists in the testfile before adding a
probe to it in the next command. The output of probe...--funcs command
is redirected to stdout, therefore, add '| grep "foo"' to validate the
result.
Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: anshuman.khandual@arm.com
Cc: james.clark@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240601125946.1741414-11-ChaitanyaS.Prakash@arm.com
The 2 second sleep can cause the test to fail on very slow network file
systems because Perf ends up being killed before it finishes starting
up.
Fix it by making the leafloop workload end after a fixed time like the
other workloads so there is no need to kill it after 2 seconds.
Also remove the 1 second start sampling delay because it is similarly
fragile. Instead, search through all samples for a matching one, rather
than just checking the first sample and hoping it's in the right place.
Fixes: cd6382d827 ("perf test arm64: Test unwinding using fame-pointer (fp) mode")
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: German Gomez <german.gomez@arm.com>
Cc: Spoorthy S <spoorts2@in.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240612140316.3006660-1-james.clark@arm.com
Running "perftool-testsuite_probe" fails as below:
./perf test -v "perftool-testsuite_probe"
83: perftool-testsuite_probe : FAILED
There are three fails:
1. Regexp not found: "\s*probe:inode_permission(?:_\d+)?\s+\(on inode_permission(?:[:\+][0-9A-Fa-f]+)?@.+\)"
-- [ FAIL ] -- perf_probe :: test_adding_kernel :: listing added probe :: perf probe -l (output regexp parsing)
2. Regexp not found: "probe:vfs_mknod"
Regexp not found: "probe:vfs_create"
Regexp not found: "probe:vfs_rmdir"
Regexp not found: "probe:vfs_link"
Regexp not found: "probe:vfs_write"
-- [ FAIL ] -- perf_probe :: test_adding_kernel :: wildcard adding support (command exitcode + output regexp parsing)
3. Regexp not found: "Failed to find"
Regexp not found: "somenonexistingrandomstuffwhichisalsoprettylongorevenlongertoexceed64"
Regexp not found: "in this function|at this address"
Line did not match any pattern: "The /boot/vmlinux file has no debug information."
Line did not match any pattern: "Rebuild with CONFIG_DEBUG_INFO=y, or install an appropriate debuginfo package."
These three tests depends on kernel debug info.
1. Fail 1 expects file name along with probe which needs debuginfo
2. Fail 2 :
perf probe -nf --max-probes=512 -a 'vfs_* $params'
Debuginfo-analysis is not supported.
Error: Failed to add events.
3. Fail 3 :
perf probe 'vfs_read somenonexistingrandomstuffwhichisalsoprettylongorevenlongertoexceed64'
Debuginfo-analysis is not supported.
Error: Failed to add events.
There is already helper function skip_if_no_debuginfo in
lib/probe_vfs_getname.sh which does perf probe and returns
"2" if debug info is not present. Use the skip_if_no_debuginfo
function and skip only the three tests which needs debuginfo
based on the result.
With the patch:
83: perftool-testsuite_probe:
--- start ---
test child forked, pid 3927
-- [ PASS ] -- perf_probe :: test_adding_kernel :: adding probe inode_permission ::
-- [ PASS ] -- perf_probe :: test_adding_kernel :: adding probe inode_permission :: -a
-- [ PASS ] -- perf_probe :: test_adding_kernel :: adding probe inode_permission :: --add
-- [ PASS ] -- perf_probe :: test_adding_kernel :: listing added probe :: perf list
Regexp not found: "\s*probe:inode_permission(?:_\d+)?\s+\(on inode_permission(?:[:\+][0-9A-Fa-f]+)?@.+\)"
-- [ SKIP ] -- perf_probe :: test_adding_kernel :: 2 2 Skipped due to missing debuginfo :: testcase skipped
-- [ PASS ] -- perf_probe :: test_adding_kernel :: using added probe
-- [ PASS ] -- perf_probe :: test_adding_kernel :: deleting added probe
-- [ PASS ] -- perf_probe :: test_adding_kernel :: listing removed probe (should NOT be listed)
-- [ PASS ] -- perf_probe :: test_adding_kernel :: dry run :: adding probe
-- [ PASS ] -- perf_probe :: test_adding_kernel :: force-adding probes :: first probe adding
-- [ PASS ] -- perf_probe :: test_adding_kernel :: force-adding probes :: second probe adding (without force)
-- [ PASS ] -- perf_probe :: test_adding_kernel :: force-adding probes :: second probe adding (with force)
-- [ PASS ] -- perf_probe :: test_adding_kernel :: using doubled probe
-- [ PASS ] -- perf_probe :: test_adding_kernel :: removing multiple probes
Regexp not found: "probe:vfs_mknod"
Regexp not found: "probe:vfs_create"
Regexp not found: "probe:vfs_rmdir"
Regexp not found: "probe:vfs_link"
Regexp not found: "probe:vfs_write"
-- [ SKIP ] -- perf_probe :: test_adding_kernel :: 2 2 Skipped due to missing debuginfo :: testcase skipped
Regexp not found: "Failed to find"
Regexp not found: "somenonexistingrandomstuffwhichisalsoprettylongorevenlongertoexceed64"
Regexp not found: "in this function|at this address"
Line did not match any pattern: "The /boot/vmlinux file has no debug information."
Line did not match any pattern: "Rebuild with CONFIG_DEBUG_INFO=y, or install an appropriate debuginfo package."
-- [ SKIP ] -- perf_probe :: test_adding_kernel :: 2 2 Skipped due to missing debuginfo :: testcase skipped
-- [ PASS ] -- perf_probe :: test_adding_kernel :: function with retval :: add
-- [ PASS ] -- perf_probe :: test_adding_kernel :: function with retval :: record
-- [ PASS ] -- perf_probe :: test_adding_kernel :: function argument probing :: script
## [ PASS ] ## perf_probe :: test_adding_kernel SUMMARY
---- end(0) ----
83: perftool-testsuite_probe : Ok
Only the three specific tests are skipped and remaining
ran successfully.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: akanksha@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: disgoel@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240617122121.7484-1-atrajeev@linux.vnet.ibm.com
PowerPC has mixed case events matching legacy hardware cache
events. Warn but don't fail in this case. Event parsing will still
work in this case by matching the legacy case.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: James Clark <james.clark@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240612124027.2712643-1-irogers@google.com
On some s390 linux machine (mostly older models) and with debug
packages installed, the test case 'perf annotate basic tests' runs
for some longer time.
Speed up the test and save the output of command perf annotate
in a temporary file. This is used to perform pattern matching via
grep command. This saves on invocation of perf annotate which
runs for some time.
Output before:
# time bash -x tests/shell/annotate.sh >/dev/null 2>&1; echo EXIT CODE $?
real 4m35.543s
user 3m19.442s
sys 1m14.322s
EXIT CODE 0
#
Output after:
# time bash -x tests/shell/annotate.sh >/dev/null 2>&1; echo EXIT CODE $?
real 2m2.881s
user 1m30.980s
sys 0m30.684s
EXIT CODE 0
#
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: gor@linux.ibm.com
Cc: hca@linux.ibm.com
Cc: sumanthk@linux.ibm.com
Cc: svens@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240607054352.2774936-1-tmricht@linux.ibm.com
Test behavior of PMU names and comparisons wrt suffixes using Intel
uncore_cha, marvell mrvl_ddr_pmu and S390's cpum_cf as examples.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: Bhaskara Budiredla <bbudiredla@marvell.com>
Cc: Tuan Phan <tuanphan@os.amperecomputing.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240515060114.3268149-3-irogers@google.com
Tracepoints can start with digits, although we don't have many of these:
$ rg -g '*.h' '\bTRACE_EVENT\([0-9]'
net/mac802154/trace.h
53:TRACE_EVENT(802154_drv_return_int,
...
net/ieee802154/trace.h
66:TRACE_EVENT(802154_rdev_add_virtual_intf,
...
include/trace/events/9p.h
124:TRACE_EVENT(9p_client_req,
...
Just allow names to start with digits too so e.g. "perf trace -e '9p:*'"
works
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240510-perf_digit-v4-3-db1553f3233b@codewreck.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The next commit will allow tracepoints starting with digits, but most
systems do not have any available by default so tests should skip the
actual "check if it exists in /sys/kernel/debug/tracing" step.
In order to do that, add a new boolean flag specifying if we should
actually "format" the probe or not.
Originally-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240510-perf_digit-v4-2-db1553f3233b@codewreck.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add reference count checking and switch 'struct mem_info' usage to use
accessor functions.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240507183545.1236093-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move mem-info to its own header rather than having it split between
mem-events and symbol.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240507183545.1236093-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The dso pointer in 'struct dso_data' is necessary for reference count
checking to account for the dso_data forming a global list of open dso's
with references to the dso.
The dso pointer also allows for the indirection that reference count
checking needs. Outside of reference count checking the indirection
isn't needed and container_of() is more efficient and saves space.
The reference count won't be increased by placing items onto the global
list, matching how things were before the reference count checking
change, but we assert the dso is in dsos holding it live (and that the
set of open dsos is a subset of all dsos for the machine).
Update the DSO data tests so that they use a dsos struct to make the
invariant true.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://lore.kernel.org/r/20240506180104.485674-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add reference count checking to struct dso, this can help with
implementing correct reference counting discipline. To avoid
RC_CHK_ACCESS everywhere, add accessor functions for the variables in
struct dso.
The majority of the change is mechanical in nature and not easy to
split up.
Committer testing:
'perf test' up to this patch shows no regressions.
But:
util/symbol.c: In function ‘dso__load_bfd_symbols’:
util/symbol.c:1683:9: error: too few arguments to function ‘dso__set_adjust_symbols’
1683 | dso__set_adjust_symbols(dso);
| ^~~~~~~~~~~~~~~~~~~~~~~
In file included from util/symbol.c:21:
util/dso.h:268:20: note: declared here
268 | static inline void dso__set_adjust_symbols(struct dso *dso, bool val)
| ^~~~~~~~~~~~~~~~~~~~~~~
make[6]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:106: /tmp/tmp.ZWHbQftdN6/util/symbol.o] Error 1
MKDIR /tmp/tmp.ZWHbQftdN6/tests/workloads/
make[6]: *** Waiting for unfinished jobs....
This was updated:
- symbols__fixup_end(&dso->symbols, false);
- symbols__fixup_duplicate(&dso->symbols);
- dso->adjust_symbols = 1;
+ symbols__fixup_end(dso__symbols(dso), false);
+ symbols__fixup_duplicate(dso__symbols(dso));
+ dso__set_adjust_symbols(dso);
But not build tested with BUILD_NONDISTRO and libbfd devel files installed
(binutils-devel on fedora).
Add the missing argument:
symbols__fixup_end(dso__symbols(dso), false);
symbols__fixup_duplicate(dso__symbols(dso));
- dso__set_adjust_symbols(dso);
+ dso__set_adjust_symbols(dso, true);
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Chengen Du <chengen.du@canonical.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com>
Link: https://lore.kernel.org/r/20240504213803.218974-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Being either lower or upper case means event name probes can avoid
scanning the directory doing case insensitive comparisons, just the
lower or upper case version of the name can be checked for
existence.
For the majority of PMUs event names are all lower case, upper case
names are present on S390.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240502213507.2339733-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Allow events/aliases to be eagerly loaded for a PMU. Factor out the
pmu_aliases_parse to allow this.
Parse a test event and check it configures the attribute as expected.
There is overlap with the parse-events tests, but this test is done with
a PMU created in a temp directory and doesn't rely on PMUs in sysfs.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240502213507.2339733-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In tests/pmu.c, make a common utility that creates a PMU in a mkdtemp
directory and uses regular PMU parsing logic to load that PMU. Formats
must still be eagerly loaded as by default the PMU code assumes devices
are going to be in sysfs.
In util/pmu.[ch], hide perf_pmu__format_parse but add the eager argument
to perf_pmu__lookup called by perf_pmus__add_test_pmu. Later patches
will eagerly load other non-sysfs files when eager loading is enabled.
In tests/pmu.c, rather than manually constructing a list of term
arguments, just use the term parsing code from a string.
Add more comments and debug logging.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240502213507.2339733-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add JSON to the test name.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240502213507.2339733-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We can't default to doing parallel tests as there are tests that compete
for the same resources and thus clash, for instance tests that put in
place 'perf probe' probes, that clean the probes without regard to other
tests needs, ARM64 coresight tests, Intel PT ones, etc.
So reintroduce --p/--parallel and make -S/--sequential the default.
We need to come up with infrastructure that state which tests can't run
in parallel because they need exclusive access to some resource,
something as simple as "probes" that would then avoid 'perf probe' tests
from running while other such test is running, or make the tests more
resilient, till then we can't use parallel mode as default.
While at it, document all these options in the 'perf test' man page.
Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reported-by: James Clark <james.clark@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Ziwm18BqIn_vc1vn@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Switch from "cache-references" to "branches" in test as Intel has a
sysfs event for "cache-references" and changing the priority for sysfs
over legacy causes the test to fail.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Beeman Strong <beeman@rivosinc.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240416061533.921723-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
These tests record in a mode that includes kernel trace but look for
samples of a userspace process. This makes them sensitive to any kernel
compilation options that increase the amount of time spent in the
kernel. If the trace buffer is completely filled before userspace is
reached then the test will fail. Double the buffer size to fix this.
The other tests in the same file aren't sensitive to this for various
reasons, for example the iterate devices test filters by userspace trace
only. But in order to keep coverage of all the modes, increase the
buffer size rather than filtering by userspace for the basic tests.
Fixes: d1efa4a0a6 ("perf cs-etm: Add separate decode paths for timeless and per-thread modes")
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20240326113749.257250-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Refactor test to better enable sharing of logic, to give an idea of
progress and introduce test functions. Add test of measuring both
cycles and cycles:b simultaneously.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Song Liu <song@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240416170014.985191-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Test "perf probe of function from different CU" fails due to certain
configs not being enabled. Building the kernel with
CONFIG_KPROBE_EVENTS=y and CONFIG_UPROBE_EVENTS=y fixes the issue. As
CONFIG_KPROBE_EVENTS is dependent on CONFIG_KPROBES, enable it as well.
Some platforms enable these configs as a part of their defconfig, so
this change is only required for the ones that don't do so.
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20240408062230.1949882-1-ChaitanyaS.Prakash@arm.com
Link: https://lore.kernel.org/r/20240408062230.1949882-7-ChaitanyaS.Prakash@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This check can be done with uname which is more portable. At the same
time re-arrange it into a standard if statement so that it's more
readable.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Spoorthy S <spoorts2@in.ibm.com>
Link: https://lore.kernel.org/r/20240410103458.813656-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
PERF_PMU_CAP_EXTENDED_HW_TYPE results in multiple events being opened on
heterogeneous systems. Currently this test only sets its required
attributes on the first event. Not disabling enable_on_exec on the other
events causes the test to fail because the forked objdump processes are
sampled. No tracking event is opened so Perf only knows about its own
mappings causing the objdump samples to give the following error:
$ perf test -vvv "object code reading"
Reading object code for memory address: 0xffff9aaa55ec
thread__find_map failed
---- end(-1) ----
24: Object code reading : FAILED!
Fixes: 251aa04024 ("perf parse-events: Wildcard most "numeric" events")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Spoorthy S <spoorts2@in.ibm.com>
Link: https://lore.kernel.org/r/20240410103458.813656-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To prevent anyone from seeing a test failure appear as a regression and
thinking that it was caused by their code change, insert some noise into
the loop which makes it immune to sampling bias issues (errata 1694299).
The "test data symbol" test can fail with any unrelated change that
shifts the loop into an unfortunate position in the Perf binary which is
almost impossible to debug as the root cause of the test failure.
Ultimately it's caused by the referenced errata.
Fixes: 60abedb8aa ("perf test: Introduce script for data symbol testing")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Spoorthy S <spoorts2@in.ibm.com>
Link: https://lore.kernel.org/r/20240410103458.813656-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Switch from running tests sequentially to running in parallel by
default. Change the opt-in '-p' or '--parallel' flag to '-S' or
'--sequential'.
On an 8 core tigerlake an address sanitizer run time changes from:
326.54user 622.73system 6:59.91elapsed 226%CPU
to:
973.02user 583.98system 3:01.17elapsed 859%CPU
So over twice as fast, saving 4 minutes.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240301174711.2646944-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make the perf test output smoother by timing out the poll of the child
process after 100ms rather than 1s.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240301074639.2260708-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Switch from dumping err then out, to a single file descriptor for both
of them. This allows the err and output to be correctly interleaved in
verbose output.
Fixes: b482f5f8e0 ("perf tests: Add option to run tests in parallel")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240301074639.2260708-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Per-thread mode requires either system-wide (-a), a pid (-p) or a tid
(-t).
The stat output tests were using system-wide mode but this is racy when
threads are starting and exiting - something that happens a lot when
running the tests in parallel (perf test -p).
Avoid the race conditions by using pid mode with the pid of the parent
process.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240301074639.2260708-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rather than manually iterating the CPU map, use
perf_cpu_map__for_each_cpu(). When possible tidy local variables.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: André Almeida <andrealmeid@igalia.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Link: https://lore.kernel.org/r/20240202234057.2085863-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Just to make things clearer, return TEST_FAIL (-1) instead of an open
coded -1.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/ZdepeMsjagbf1ufD@x1
By default tests are forked, add an option (-p or --parallel) so that
the forked tests are all started in parallel and then their output
gathered serially. This is opt-in as running in parallel can cause
test flakes.
Rather than fork within the code, the start_command/finish_command
from libsubcmd are used. This changes how stderr and stdout are
handled. The child stderr and stdout are always read to avoid the
child blocking. If verbose is 1 (-v) then if the test fails the child
stdout and stderr are displayed. If the verbose is >1 (e.g. -vv) then
the stdout and stderr from the child are immediately displayed.
An unscientific test on my laptop shows the wall clock time for perf
test without parallel being 5 minutes 21 seconds and with parallel
(-p) being 1 minute 50 seconds.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: llvm@lists.linux.dev
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240221034155.1500118-9-irogers@google.com
Rather than special shell test logic, do a single pass to create an
array of test suites. Hold the shell test file name in the test suite
priv field. This makes the special shell test logic in builtin-test.c
redundant so remove it.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: llvm@lists.linux.dev
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240221034155.1500118-8-irogers@google.com
Avoid filename appending buffers by using openat, faccessat and
scandirat more widely. Turn the script's path back to a file name
using readlink from /proc/<pid>/fd/<fd>.
Read the script's description using api/io.h to avoid fdopen
conversions. Whilst reading perform additional sanity checks on the
script's contents.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: llvm@lists.linux.dev
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240221034155.1500118-7-irogers@google.com
builtin-test-list is primarily concerned with shell script
tests. Rename the file to better reflect this and add a missed header
guard.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: llvm@lists.linux.dev
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240221034155.1500118-6-irogers@google.com
perf test -vv Symbols is used to indentify symbols within the perf
binary. Add the -F flag so that the test command doesn't fork the test
before running. This removes a little overhead.
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: llvm@lists.linux.dev
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240221034155.1500118-4-irogers@google.com
Later we will use libcapstone to disassemble instructions of samples.
Signed-off-by: Changbin Du <changbin.du@huawei.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: changbin.du@gmail.com
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240217074046.4100789-2-changbin.du@huawei.com
As a form of validation, it is a common practice to check the outputs
of commands whether they contain expected patterns or match a certain
regex.
Add helpers for verifying that all regexes are found in the output, that
all lines match any pattern from a set and that a certain expression is
not present in the output.
In verbose mode these helpers log mismatches for easier failure
investigation.
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: kjain@linux.ibm.com
Cc: atrajeev@linux.vnet.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240215110231.15385-6-mpetlan@redhat.com
Add new perf probe test case that acts as an entry element in perf test
list. Runs multiple subtests from directory "base_probe", which will be
added in incomming patches and can be expanded without further editing.
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: kjain@linux.ibm.com
Cc: atrajeev@linux.vnet.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240215110231.15385-5-mpetlan@redhat.com
Unify perf regexes for checking testing output into a single file
to reduce duplicates and prevent errors when editing.
This will be used in upcomming patches in shell tests.
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: kjain@linux.ibm.com
Cc: atrajeev@linux.vnet.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240215110231.15385-2-mpetlan@redhat.com
The test needs a struct machine and creates one for the current host,
but a side-effect is that struct machine has set up kernel maps
including module maps.
If the 'Symbols' test --dso option specifies a current kernel module,
it will already be present as a kernel dso, and a map with kmaps needs
to be used otherwise there will be a segfault - see below.
For that case, find the existing map and use that. In that case also,
the dso is split by section into multiple dsos, so test those dsos
also. That in turn, shows up that those dsos have not had overlapping
symbols removed, so the test fails.
Example:
Before:
$ perf test -F -v Symbols --dso /lib/modules/$(uname -r)/kernel/arch/x86/kvm/kvm-intel.ko
70: Symbols :
--- start ---
Testing /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko
Segmentation fault (core dumped)
After:
$ perf test -F -v Symbols --dso /lib/modules/$(uname -r)/kernel/arch/x86/kvm/kvm-intel.ko
70: Symbols :
--- start ---
Testing /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko
Overlapping symbols:
41d30-41fbb l vmx_init
41d30-41fbb g init_module
---- end ----
Symbols: FAILED!
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240131192416.16387-1-adrian.hunter@intel.com
Move the struct into the C file. Add maps__equal to work around
exposing the struct for reference count checking. Add accessors for
the unwind_libunwind_ops. Move maps_list_node to its only use in
symbol.c.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Artem Savkov <asavkov@redhat.com>
Cc: bpf@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240210031746.4057262-6-irogers@google.com
Finding a map is done under a lock, returning the map without a
reference count means it can be removed without notice and causing
uses after free. Grab a reference count to the map within the lock
region and return this. Fix up locations that need a map__put
following this. Also fix some reference counted pointer comparisons.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Artem Savkov <asavkov@redhat.com>
Cc: bpf@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240210031746.4057262-4-irogers@google.com
Finding a map is done under a lock, returning the map without a
reference count means it can be removed without notice and causing
uses after free. Grab a reference count to the map within the lock
region and return this. Fix up locations that need a map__put
following this.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Artem Savkov <asavkov@redhat.com>
Cc: bpf@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240210031746.4057262-3-irogers@google.com
Maps is a collection of maps primarily sorted by the starting address
of the map. Prior to this change the maps were held in an rbtree
requiring 4 pointers per node. Prior to reference count checking, the
rbnode was embedded in the map so 3 pointers per node were
necessary. This change switches the rbtree to an array lazily sorted
by address, much as the array sorting nodes by name. 1 pointer is
needed per node, but to avoid excessive resizing the backing array may
be twice the number of used elements. Meaning the memory overhead is
roughly half that of the rbtree. For a perf record with
"--no-bpf-event -g -a" of true, the memory overhead of perf inject is
reduce fom 3.3MB to 3MB, so 10% or 300KB is saved.
Map inserts always happen at the end of the array. The code tracks
whether the insertion violates the sorting property. O(log n) rb-tree
complexity is switched to O(1).
Remove slides the array, so O(log n) rb-tree complexity is degraded to
O(n).
A find may need to sort the array using qsort which is O(n*log n), but
in general the maps should be sorted and so average performance should
be O(log n) as with the rbtree.
An rbtree node consumes a cache line, but with the array 4 nodes fit
on a cache line. Iteration is simplified to scanning an array rather
than pointer chasing.
Overall it is expected the performance after the change should be
comparable to before, but with half of the memory consumed.
To avoid a list and repeated logic around splitting maps,
maps__merge_in is rewritten in terms of
maps__fixup_overlap_and_insert. maps_merge_in splits the given mapping
inserting remaining gaps. maps__fixup_overlap_and_insert splits the
existing mappings, then adds the incoming mapping. By adding the new
mapping first, then re-inserting the existing mappings the splitting
behavior matches.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Artem Savkov <asavkov@redhat.com>
Cc: bpf@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240210031746.4057262-2-irogers@google.com
Some platforms have 'cluster' topology and CPUs in the cluster will
share resources like L3 Cache Tag (for HiSilicon Kunpeng SoC) or L2
cache (for Intel Jacobsville). Currently parsing and building cluster
topology have been supported since [1].
perf stat has already supported aggregation for other topologies like
die or socket, etc. It'll be useful to aggregate per-cluster to find
problems like L3T bandwidth contention.
This patch add support for "--per-cluster" option for per-cluster
aggregation. Also update the docs and related test. The output will
be like:
[root@localhost tmp]# perf stat -a -e LLC-load --per-cluster -- sleep 5
Performance counter stats for 'system wide':
S56-D0-CLS158 4 1,321,521,570 LLC-load
S56-D0-CLS594 4 794,211,453 LLC-load
S56-D0-CLS1030 4 41,623 LLC-load
S56-D0-CLS1466 4 41,646 LLC-load
S56-D0-CLS1902 4 16,863 LLC-load
S56-D0-CLS2338 4 15,721 LLC-load
S56-D0-CLS2774 4 22,671 LLC-load
[...]
On a legacy system without cluster or cluster support, the output will
be look like:
[root@localhost perf]# perf stat -a -e cycles --per-cluster -- sleep 1
Performance counter stats for 'system wide':
S56-D0-CLS0 64 18,011,485 cycles
S7182-D0-CLS0 64 16,548,835 cycles
Note that this patch doesn't mix the cluster information in the outputs
of --per-core to avoid breaking any tools/scripts using it.
Note that perf recently supports "--per-cache" aggregation, but it's not
the same with the cluster although cluster CPUs may share some cache
resources. For example on my machine all clusters within a die share the
same L3 cache:
$ cat /sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list
0-31
$ cat /sys/devices/system/cpu/cpu0/topology/cluster_cpus_list
0-3
[1] commit c5e22feffd ("topology: Represent clusters of CPUs within a die")
Tested-by: Jie Zhan <zhanjie9@hisilicon.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Cc: james.clark@arm.com
Cc: 21cnbao@gmail.com
Cc: prime.zeng@hisilicon.com
Cc: Jonathan.Cameron@huawei.com
Cc: fanghao11@huawei.com
Cc: linuxarm@huawei.com
Cc: tim.c.chen@intel.com
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240208024026.2691-1-yangyicong@huawei.com
stat+std_output.sh test fails on my arm64 machine:
[root@localhost shell]# ./stat+std_output.sh
Checking STD output: no args Unknown event name in TopDownL1 # 0.18 retiring
[root@localhost shell]# ./stat+std_output.sh
Checking STD output: no args [Success]
Checking STD output: system wide [Success]
Checking STD output: interval [Success]
Checking STD output: per thread Unknown event name in tmux: server-1114960 # 0.41 frontend_bound
When no args specified `perf stat` will add TopdownL1 metric group
and the output will be like:
[root@localhost shell]# perf stat -- stress-ng --vm 1 --timeout 1
stress-ng: info: [3351733] setting to a 1 second run per stressor
stress-ng: info: [3351733] dispatching hogs: 1 vm
stress-ng: info: [3351733] successful run completed in 1.02s
Performance counter stats for 'stress-ng --vm 1 --timeout 1':
1,037.71 msec task-clock # 1.000 CPUs utilized
13 context-switches # 12.528 /sec
1 cpu-migrations # 0.964 /sec
67,544 page-faults # 65.090 K/sec
2,691,932,561 cycles # 2.594 GHz (74.56%)
6,571,333,653 instructions # 2.44 insn per cycle (74.92%)
521,863,142 branches # 502.901 M/sec (75.21%)
425,879 branch-misses # 0.08% of all branches (87.57%)
TopDownL1 # 0.61 retiring (87.67%)
# 0.03 frontend_bound (87.67%)
# 0.02 bad_speculation (87.67%)
# 0.34 backend_bound (74.61%)
1.038138390 seconds time elapsed
0.844849000 seconds user
0.189053000 seconds sys
Metrics in group TopDownL1 don't have event name on arm64 but are not
listed in the $skip_metric list which they should be listed. Add them
to the skip list as what does for x86 platforms in [1].
[1] commit 4d60e83dfc ("perf test: Skip metrics w/o event name in stat STD output linter")
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: linuxarm@huawei.com
Cc: kan.liang@linux.intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240207091222.54096-1-yangyicong@huawei.com
Prior to this patch '0' would be dropped as the config values default
to 0. Some json values are hex and the string '0' wouldn't match '0x0'
as zero. Add a more robust is_zero test to drop these event terms.
When encoding numbers as hex, if the number is between 0 and 9
inclusive then don't add a 0x prefix.
Update test expectations for these changes.
On x86 this reduces the event/metric C string by 58,411 bytes.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240131201429.792138-1-irogers@google.com
Prior to this patch the first and the last error encountered during
parsing are printed. To see other errors verbose needs
enabling. Unfortunately this can drop useful errors, in particular on
terms. This patch changes the errors so that instead of the first and
last all errors are recorded and printed, the underlying data
structure is changed to a list.
Before:
```
$ perf stat -e 'slots/edge=2/' true
event syntax error: 'slots/edge=2/'
\___ Bad event or PMU
Unable to find PMU or event on a PMU of 'slots'
Initial error:
event syntax error: 'slots/edge=2/'
\___ Cannot find PMU `slots'. Missing kernel support?
Run 'perf list' for a list of valid events
Usage: perf stat [<options>] [<command>]
-e, --event <event> event selector. use 'perf list' to list available events
```
After:
```
$ perf stat -e 'slots/edge=2/' true
event syntax error: 'slots/edge=2/'
\___ Bad event or PMU
Unable to find PMU or event on a PMU of 'slots'
event syntax error: 'slots/edge=2/'
\___ value too big for format (edge), maximum is 1
event syntax error: 'slots/edge=2/'
\___ Cannot find PMU `slots'. Missing kernel support?
Run 'perf list' for a list of valid events
Usage: perf stat [<options>] [<command>]
-e, --event <event> event selector. use 'perf list' to list available events
```
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: tchen168@asu.edu
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240131134940.593788-3-irogers@google.com
The original test report was too complicated to read with information
that not really useful. This new update simplify the report which should
largely improve the readibility.
Signed-off-by: Weilin Wang <weilin.wang@intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240130180907.639729-1-weilin.wang@intel.com
The daemon signal test sends signals and then expects files to be
written. It was observed on an Intel Alderlake that the signals were
sent too quickly leading to the 3 expected files not appearing.
To avoid this send the next signal only after the expected previous file
has appeared. To avoid an infinite loop the number of retries is
limited.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ross Zwisler <zwisler@chromium.org>
Cc: Shirisha G <shirisha@linux.ibm.com>
Link: https://lore.kernel.org/r/20240124043015.1388867-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
"grep -cv" can exit with an error code that causes the "set -e" to abort
the script. Switch to using the grep exit code in the if condition to
avoid this.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ross Zwisler <zwisler@chromium.org>
Cc: Shirisha G <shirisha@linux.ibm.com>
Link: https://lore.kernel.org/r/20240124043015.1388867-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Write the JSON output to a specific file to avoid debug output
breaking it.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ross Zwisler <zwisler@chromium.org>
Cc: Shirisha G <shirisha@linux.ibm.com>
Link: https://lore.kernel.org/r/20240124043015.1388867-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In linux next repo, test case 'perf script tests' fails on s390.
The root case is a command line invocation of 'perf record' with
call-graph information. On s390 only DWARF formatted call-graphs are
supported and only on software events.
Change the command line parameters for s390.
Output before:
# perf test 89
89: perf script tests : FAILED!
#
Output after:
# perf test 89
89: perf script tests : Ok
#
Fixes: 0dd5041c9a ("perf addr_location: Add init/exit/copy functions")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20240125100351.936262-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Even though this is a frame pointer unwind test, it's testing that a
frame pointer stack can be augmented correctly with a partial
Dwarf unwind. So add a feature check so that this test skips instead of
fails if Dwarf unwinding isn't present.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Spoorthy S <spoorts2@in.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Link: https://lore.kernel.org/r/20240123163903.350306-3-james.clark@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
This test case often fails on s390 (about 2 out of 10) because the
10% percent limit on the difference between --bpf-counters event counting
and s390 hardware counting is more than 10% in all failure cases.
Raise the limit to 20% on s390 and the test case succeeds.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: gor@linux.ibm.com
Cc: hca@linux.ibm.com
Cc: sumanthk@linux.ibm.com
Cc: svens@linux.ibm.com
Link: https://lore.kernel.org/r/20240108084009.3959211-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
perf test 17 'Setup struct perf_event_attr' fails on s390 z/VM guest,
using linux-next kernel.
Root cause is the fall-back from hardware counter cycles
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
size 136
config 0 (PERF_COUNT_HW_CPU_CYCLES)
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|ADDR|PERIOD|DATA_SRC
read_format ID|LOST
which returns -ENOENT on s390 z/VM guest. This causes the code to fall
back to software counter task-clock, as can be seen in the debug output:
------------------------------------------------------------
perf_event_attr:
type 1 (PERF_TYPE_SOFTWARE)
size 136
config 0x1 (PERF_COUNT_SW_TASK_CLOCK) <-here
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|ADDR|PERIOD|DATA_SRC
read_format ID|LOST
This succeeds on s390 z/VM guest.
This successful installation of the counter task-clock is not listed in
the expected results and the test case fails.
This is caused by commit eb2eac0c7b ("perf evsel: Fallback to
"task-clock" when not system wide") which introduced fall back from
event 'cycles' to event 'task-clock'.
To fix this on s390 allow event number 0 (cycles) and event number 1
(task-clock) as expected result.
Output before:
# ./perf test -Fv 17
17: Setup struct perf_event_attr :
--- start ---
running './tests/attr/test-stat-group1'
unsupp './tests/attr/test-stat-group1'
running './tests/attr/test-record-graph-default'
test limitation '!aarch64'
excluded architecture list ['aarch64']
expected config=0, got 1
FAILED './tests/attr/test-record-graph-default' - match failure
---- end ----
Setup struct perf_event_attr: FAILED!
#
Output after:
# ./perf test -F 17
17: Setup struct perf_event_attr : Ok
#
Fixes: eb2eac0c7b ("perf evsel: Fallback to "task-clock" when not system wide")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20231219143235.1075522-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Start a new set of shell tests for testing perf script. The initial
contribution is checking that some perf db-export functionality works
as reported in this regression by Ben Gainey <ben.gainey@arm.com>:
https://lore.kernel.org/lkml/20231207140911.3240408-1-ben.gainey@arm.com/
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Ben Gainey <ben.gainey@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231207174057.1482161-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Switch loop macro maps__for_each_entry to maps__for_each_map function
that takes a callback. The function holds the maps lock, which should
be held during iteration.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Li Dong <lidong@vivo.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20231207011722.1220634-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make the DSO data tests a suite rather than individual so their output
is grouped.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20231128194624.1419260-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Passing NULL to perf_cpu_map__new() performs
perf_cpu_map__new_online_cpus(), just directly call
perf_cpu_map__new_online_cpus() to be more intention revealing.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: André Almeida <andrealmeid@igalia.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: bpf@vger.kernel.org
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20231129060211.1890454-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename perf_cpu_map__dummy_new() to perf_cpu_map__new_any_cpu() to
better indicate this is creating a CPU map for the perf_event_open "any"
CPU case.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: André Almeida <andrealmeid@igalia.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: bpf@vger.kernel.org
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20231129060211.1890454-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The diff test depends on finding the symbol test_loop in perf and will
fail if perf has been stripped and no debug object is available. In that
case, skip the test instead.
Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231205164924.835682-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There are some old bug reports on perf diff crashing:
https://rhaas.blogspot.com/2012/06/perf-good-bad-ugly.html
Happening across them I was prompted to add two very basic tests that
will give some 'perf diff' coverage.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231120190408.281826-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Test that JSON output produces valid JSON.
Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231129213428.2227448-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Avoid replicated logic by having a common library to set the PYTHON
environment variable.
Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231129213428.2227448-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Migrate Makefile.tests to Build so that variables like rule_mkdir are
defined via Makefile.build (needed so the output directory can be
created). This requires SHELLCHECK being exported and the clean rule
tweaking to remove the files in find.
Change find "-perm -o=x" as it was failing on my Debian based Linux
kernel tree, switch to using "-executable".
Adding a filename prefix of "." to the shellcheck log files is a pain
and error prone in make, remove this prefix and just add the
shellcheck log files to .gitignore.
Fix the command echo so that running the test is displayed.
Fixes: 1638b11ef8 ("perf tools: Add perf binary dependent rule for shellcheck log in Makefile.perf")
Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231129213428.2227448-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'vg' register for arm64 shows up in --user_regs as available when
masking the variable AT_HWCAP with 1 << 22 returns '1' as done in
perf_regs.c.
However, in subtests for support of SVE, the check for the 'vg' register
is done by masking the variable AT_HWCAP with the value 0x200000 which
is equals to 1 << 21 instead of 1 << 22.
This results in inconsistencies on certain systems where the test
expects that the 'vg' register is not operational when it is, and
vice-versa.
During the testing on a machine that the test expected not to have the
'vg' register available, 'perf record' with the option --user-regs
showed records for the 'vg' register together with all of the others,
which means that the mask for the subtest of perf_event_attr is off by
one.
Change the value of the mask from 0x200000 to 0x400000 to correct it.
Fixes: 9440ebdc33 ("perf test arm64: Add attr tests for new VG register")
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Link: https://lore.kernel.org/r/20231201194617.13012-1-vmolnaro@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf test "probe libc's inet_pton & backtrace it with ping" fails on
powerpc as below:
# perf test -v "probe libc's inet_pton & backtrace it with
ping"
85: probe libc's inet_pton & backtrace it with ping :
--- start ---
test child forked, pid 96028
ping 96056 [002] 127271.101961: probe_libc:inet_pton: (7fffa1779a60)
7fffa1779a60 __GI___inet_pton+0x0 (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
7fffa172a73c getaddrinfo+0x121c (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
FAIL: expected backtrace entry
"gaih_inet.*\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/glibc-hwcaps/power10/libc.so.6\)$"
got "7fffa172a73c getaddrinfo+0x121c (/usr/lib64/glibc-hwcaps/power10/libc.so.6)"
test child finished with -1
---- end ----
probe libc's inet_pton & backtrace it with ping: FAILED!
This test installs a probe on libc's inet_pton function, which will use
uprobes and then uses perf trace on a ping to localhost. It gets 3
levels deep backtrace and checks whether it is what we expected or not.
The test started failing from RHEL 9.4 where as it works in previous
distro version (RHEL 9.2). Test expects gaih_inet function to be part of
backtrace. But in the glibc version (2.34-86) which is part of distro
where it fails, this function is missing and hence the test is failing.
From nm and ping command output we can confirm that gaih_inet function
is not present in the expected backtrace for glibc version glibc-2.34-86
[root@xxx perf]# nm /usr/lib64/glibc-hwcaps/power10/libc.so.6 | grep gaih_inet
00000000001273e0 t gaih_inet_serv
00000000001cd8d8 r gaih_inet_typeproto
[root@xxx perf]# perf script -i /tmp/perf.data.6E8
ping 104048 [000] 128582.508976: probe_libc:inet_pton: (7fff83779a60)
7fff83779a60 __GI___inet_pton+0x0 (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
7fff8372a73c getaddrinfo+0x121c (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
11dc73534 [unknown] (/usr/bin/ping)
7fff8362a8c4 __libc_start_call_main+0x84 (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
FAIL: expected backtrace entry
"gaih_inet.*\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/glibc-hwcaps/power10/libc.so.6\)$"
got "7fff9d52a73c getaddrinfo+0x121c (/usr/lib64/glibc-hwcaps/power10/libc.so.6)"
With version glibc-2.34-60 gaih_inet function is present as part of the
expected backtrace. So we cannot just remove the gaih_inet function from
the backtrace.
[root@xxx perf]# nm /usr/lib64/glibc-hwcaps/power10/libc.so.6 | grep gaih_inet
0000000000130490 t gaih_inet.constprop.0
000000000012e830 t gaih_inet_serv
00000000001d45e4 r gaih_inet_typeproto
[root@xxx perf]# ./perf script -i /tmp/perf.data.b6S
ping 67906 [000] 22699.591699: probe_libc:inet_pton_3: (7fffbdd80820) 7fffbdd80820 __GI___inet_pton+0x0
(/usr/lib64/glibc-hwcaps/power10/libc.so.6) 7fffbdd31160 gaih_inet.constprop.0+0xcd0
(/usr/lib64/glibc-hwcaps/power10/libc.so.6) 7fffbdd31c7c getaddrinfo+0x14c
(/usr/lib64/glibc-hwcaps/power10/libc.so.6) 1140d3558 [unknown] (/usr/bin/ping)
This patch solves this issue by doing a conditional skip. If there is a
gaih_inet function present in the libc then it will be added to the
expected backtrace else the function will be skipped from being added
to the expected backtrace.
Output with the patch
[root@xxx perf]# ./perf test -v "probe libc's inet_pton & backtrace it
with ping"
83: probe libc's inet_pton & backtrace it with ping :
--- start ---
test child forked, pid 102662
ping 102692 [000] 127935.549973: probe_libc:inet_pton: (7fff93379a60)
7fff93379a60 __GI___inet_pton+0x0 (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
7fff9332a73c getaddrinfo+0x121c (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
11ef03534 [unknown] (/usr/bin/ping)
test child finished with 0
---- end ----
probe libc's inet_pton & backtrace it with ping: Ok
Reported-by: Disha Goel <disgoel@linux.ibm.com>
Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Likhitha Korrapati <likhitha@linux.ibm.com>
Tested-by: Disha Goel <disgoel@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20231126070914.175332-1-likhitha@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There are issues as reported that need some more investigation on the
RT kernel front, till that is addressed, skip this test.
This test is already skipped for multiple hardware architectures where
the tested kernel feature is not supported.
Acked-by: Marco Elver <elver@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Kate Carcia <kcarcia@redhat.com>
Cc: Marco Elver <elver@google.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/e368f2c848d77fbc8d259f44e2055fe469c219cf.camel@gmx.de/
Link: https://lore.kernel.org/r/20231129154718.326330-3-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the part that loads the BTF info to a "btf__available()" that will
lazy load the BTF info so that if we need it for some other test, which
we will in the following cset, we can reuse it.
At some point this will move from this specific 'perf test' entry to be
used in other parts of perf, do it when needed.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kate Carcia <kcarcia@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20231129154718.326330-2-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There is already an existing config value for changing the objdump path,
so instead of having two values that do the same thing, make 'perf test'
use annotate.objdump as well.
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/ZU5Cx4LTrB5q0sIG@kernel.org
Link: https://lore.kernel.org/r/20231113102327.695386-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf data symbol test depends on finding symbol buf1 in perf, and fails if
perf has been stripped and no debug object is available. In that case, skip
the test instead.
Example:
Before:
$ strip tools/perf/perf
$ tools/perf/perf buildid-cache -p `realpath tools/perf/perf`
$ tools/perf/perf test -v 'data symbol'
113: Test data symbol :
--- start ---
test child forked, pid 125646
Recording workload...
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 0.577 MB /tmp/__perf_test.perf.data.Jhbdp (7794 samples) ]
Cleaning up files...
test child finished with -1
---- end ----
Test data symbol: FAILED!
After:
$ tools/perf/perf test -v 'data symbol'
113: Test data symbol :
--- start ---
test child forked, pid 125747
perf does not have symbol 'buf1'
perf is missing symbols - skipping test
test child finished with -2
---- end ----
Test data symbol: Skip
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231123075848.9652-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf data symbol test waits 1 second for perf to run and collect data,
which may be too little if perf takes a long time to start up, which has
been noticed on systems with many CPUs. Use existing wait_for_perf_to_start
helper to wait for perf to start.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231123075848.9652-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The test "Check branch stack sampling" depends on finding symbol
brstack_bench (and several others) in perf, and fails if perf has been
stripped and no debug object is available. In that case, skip the test
instead.
Example:
Before:
$ strip tools/perf/perf
$ tools/perf/perf buildid-cache -p `realpath tools/perf/perf`
$ tools/perf/perf test -v 'branch stack sampling'
112: Check branch stack sampling :
--- start ---
test child forked, pid 123741
Testing user branch stack sampling
+ grep -E -m1 ^brstack_bench\+[^ ]*/brstack_foo\+[^ ]*/IND_CALL/.*$ /tmp/__perf_test.program.5Dz1U/perf.script
+ cleanup
+ rm -rf /tmp/__perf_test.program.5Dz1U
test child finished with -1
---- end ----
Check branch stack sampling: FAILED!
After:
$ tools/perf/perf test -v 'branch stack sampling'
112: Check branch stack sampling :
--- start ---
test child forked, pid 125157
perf does not have symbol 'brstack_bench'
perf is missing symbols - skipping test
test child finished with -2
---- end ----
Check branch stack sampling: Skip
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231123075848.9652-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The test "Check Arm64 callgraphs are complete in fp mode" depends on
finding symbol leafloop in perf, and fails if perf has been stripped and no
debug object is available. In that case, skip the test instead.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231123075848.9652-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf record test depends on finding symbol test_loop in perf, and fails if
perf has been stripped and no debug object is available. In that case, skip
the test instead.
Example:
Note, building with perl support adds option -Wl,-E which causes the
linker to add all (global) symbols to the dynamic symbol table. So the
test_loop symbol, being global, does not get stripped unless NO_LIBPERL=1
Before:
$ make NO_LIBPERL=1 -C tools/perf >/dev/null 2>&1
$ strip tools/perf/perf
$ tools/perf/perf buildid-cache -p `realpath tools/perf/perf`
$ tools/perf/perf test -v 'record tests'
91: perf record tests :
--- start ---
test child forked, pid 118750
Basic --per-thread mode test
Per-thread record [Failed missing output]
Register capture test
Register capture test [Success]
Basic --system-wide mode test
System-wide record [Skipped not supported]
Basic target workload test
Workload record [Failed missing output]
test child finished with -1
---- end ----
perf record tests: FAILED!
After:
$ tools/perf/perf test -v 'record tests'
91: perf record tests :
--- start ---
test child forked, pid 120025
perf does not have symbol 'test_loop'
perf is missing symbols - skipping test
test child finished with -2
---- end ----
perf record tests: Skip
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231123075848.9652-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf pipe recording and injection test depends on finding symbol noploop in
perf, and fails if perf has been stripped and no debug object is available.
In that case, skip the test instead.
Example:
Before:
$ strip tools/perf/perf
$ tools/perf/perf buildid-cache -p `realpath tools/perf/perf`
$ tools/perf/perf test -v pipe
86: perf pipe recording and injection test :
--- start ---
test child forked, pid 47734
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]
47741 47741 -1 |perf
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]
cannot find noploop function in pipe #1
test child finished with -1
---- end ----
perf pipe recording and injection test: FAILED!
After:
$ tools/perf/perf test -v pipe
86: perf pipe recording and injection test :
--- start ---
test child forked, pid 48996
perf does not have symbol 'noploop'
perf is missing symbols - skipping test
test child finished with -2
---- end ----
perf pipe recording and injection test: Skip
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231123075848.9652-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some shell tests depend on finding symbols for perf itself, and fail if
perf has been stripped and no debug object is available. Add helper
functions to check if perf has a needed symbol. This is preparation for
amending the tests themselves to be skipped if a needed symbol is not
found.
The functions make use of the "Symbols" test which reads and checks symbols
from a dso, perf itself by default. Note the "Symbols" test will find
symbols using the same method as other perf tests, including, for example,
looking in the buildid cache.
An alternative would be to prevent the needed symbols from being stripped,
which seems to work with gcc's externally_visible attribute, but that
attribute is not supported by clang.
Another alternative would be to use option -Wl,-E (which is already used
when perf is built with perl support) which causes the linker to add all
(global) symbols to the dynamic symbol table. Then the required symbols
need only be made global in scope to avoid being strippable. However that
goes beyond what is needed.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231123075848.9652-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add rule in new Makefile "tests/Makefile.tests" for running shellcheck
on shell test scripts. This automates below shellcheck into the build.
$ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done
Condition for shellcheck is added in Makefile.perf to avoid build
breakage in the absence of shellcheck binary. Update Makefile.perf to
contain new rule for "SHELLCHECK_TEST" which is for making shellcheck
test as a dependency on perf binary.
Added "tests/Makefile.tests" to run shellcheck on shellscripts in
tests/shell. The make rule "SHLLCHECK_RUN" ensures that, every time
during make, shellcheck will be run only on modified files during
subsequent invocations. By this, if any newly added shell scripts or
fixes in existing scripts breaks coding/formatting style, it will get
captured during the perf build.
Example build failure by modifying probe_vfs_getname.sh in tests/shell:
In tests/shell/probe_vfs_getname.sh line 8:
. $(dirname $0)/lib/probe.sh
^-----------^ SC2046 (warning): Quote this to prevent word splitting.
For more information:
https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt...
make[3]: *** [/root/athira/perf-tools-next/tools/perf/tests/Makefile.tests:18: tests/shell/.probe_vfs_getname.sh.shellcheck_log] Error 1
make[2]: *** [Makefile.perf:686: SHELLCHECK_TEST] Error 2
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [Makefile.perf:244: sub-make] Error 2
make: *** [Makefile:70: all] Error 2
Here, like other files which gets created during compilation (ex:
.builtin-bench.o.cmd or .perf.o.cmd ), create .shellcheck_log also as a
hidden file. Example: tests/shell/.probe_vfs_getname.sh.shellcheck_log
shellcheck is re-run if any of the script gets modified based on its
dependency of this log file.
After this, for testing, changed "tests/shell/trace+probe_vfs_getname.sh" to
break shellcheck format. In the next make run, it is also captured:
In tests/shell/probe_vfs_getname.sh line 8:
. $(dirname $0)/lib/probe.sh
^-----------^ SC2046 (warning): Quote this to prevent word splitting.
For more information:
https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt...
make[3]: *** [/root/athira/perf-tools-next/tools/perf/tests/Makefile.tests:18: tests/shell/.probe_vfs_getname.sh.shellcheck_log] Error 1
make[3]: *** Waiting for unfinished jobs....
In tests/shell/trace+probe_vfs_getname.sh line 14:
. $(dirname $0)/lib/probe.sh
^-----------^ SC2046 (warning): Quote this to prevent word splitting.
For more information:
https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt...
make[3]: *** [/root/athira/perf-tools-next/tools/perf/tests/Makefile.tests:18: tests/shell/.trace+probe_vfs_getname.sh.shellcheck_log] Error 1
make[2]: *** [Makefile.perf:686: SHELLCHECK_TEST] Error 2
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [Makefile.perf:244: sub-make] Error 2
make: *** [Makefile:70: all] Error 2
Failure log can be found in the stdout of make itself.
This is reported at build time. To be able to go ahead with the build or
disable shellcheck even though it is known that some test is broken, add
a "NO_SHELLCHECK" option. Example:
make NO_SHELLCHECK=1
INSTALL libsubcmd_headers
INSTALL libsymbol_headers
INSTALL libapi_headers
INSTALL libperf_headers
INSTALL libbpf_headers
LINK perf
Note:
This is tested on RHEL and also SLES. Use below check:
"$(shell which shellcheck 2> /dev/null)" to look for presence
of shellcheck binary. The approach "shell command -v" is not
used here. In some of the distros(RHEL), command is available
as executable file (/usr/bin/command). But in some distros(SLES),
it is a shell builtin and not available as executable file.
Committer testing:
$ type shellcheck
shellcheck is hashed (/usr/bin/shellcheck)
$ rpm -qf /usr/bin/shellcheck
ShellCheck-0.9.0-2.fc38.x86_64
$
$ alias m
$ git diff
diff --git a/tools/perf/tests/shell/probe_vfs_getname.sh b/tools/perf/tests/shell/probe_vfs_getname.sh
index 554e12e83c55fd56..dbc14634678e2bf6 100755
--- a/tools/perf/tests/shell/probe_vfs_getname.sh
+++ b/tools/perf/tests/shell/probe_vfs_getname.sh
@@ -5,7 +5,7 @@
# Arnaldo Carvalho de Melo <acme@kernel.org>, 2017
# shellcheck source=lib/probe.sh
-. "$(dirname $0)"/lib/probe.sh
+. $(dirname $0)/lib/probe.sh
skip_if_no_perf_probe || exit 2
alias m='rm -rf ~/libexec/perf-core/ ; make -k CORESIGHT=1 O=/tmp/build/$(basename $PWD) -C tools/perf install-bin && perf test python'
$ m
make: Entering directory '/home/acme/git/perf-tools-next/tools/perf'
BUILD: Doing 'make -j32' parallel build
<SNIP>
INSTALL libbpf_headers
In tests/shell/probe_vfs_getname.sh line 8:
. $(dirname $0)/lib/probe.sh
^-----------^ SC2046 (warning): Quote this to prevent word splitting.
For more information:
https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt...
make[3]: *** [/home/acme/git/perf-tools-next/tools/perf/tests/Makefile.tests:18: tests/shell/.probe_vfs_getname.sh.shellcheck_log] Error 1
make[2]: *** [Makefile.perf:686: SHELLCHECK_TEST] Error 2
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [Makefile.perf:244: sub-make] Error 2
make: *** [Makefile:113: install-bin] Error 2
make: Leaving directory '/home/acme/git/perf-tools-next/tools/perf'
$
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20231123160232.94253-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
These variables are never referenced in the code, just remove them.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: zhujun2 <zhujun2@cmss.chinamobile.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20231115064255.11057-1-zhujun2@cmss.chinamobile.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf tool has previously made legacy events the priority so with
or without a PMU the legacy event would be opened:
$ perf stat -e cpu-cycles,cpu/cpu-cycles/ true
Using CPUID GenuineIntel-6-8D-1
intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch
Attempting to add event pmu 'cpu' with 'cpu-cycles,' that may result in non-fatal errors
After aliases, add event pmu 'cpu' with 'cpu-cycles,' that may result in non-fatal errors
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
size 136
config 0 (PERF_COUNT_HW_CPU_CYCLES)
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 833967 cpu -1 group_fd -1 flags 0x8 = 3
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
size 136
config 0 (PERF_COUNT_HW_CPU_CYCLES)
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
...
Fixes to make hybrid/BIG.little PMUs behave correctly, ie as core PMUs
capable of opening legacy events on each, removing hard coded "cpu_core"
and "cpu_atom" Intel PMU names, etc. caused a behavioral difference on
Apple/ARM due to latent issues in the PMU driver reported in:
https://lore.kernel.org/lkml/08f1f185-e259-4014-9ca4-6411d5c1bc65@marcan.st/
As part of that report Mark Rutland <mark.rutland@arm.com> requested
that legacy events not be higher in priority when a PMU is specified
reversing what has until this change been perf's default behavior. With
this change the above becomes:
$ perf stat -e cpu-cycles,cpu/cpu-cycles/ true
Using CPUID GenuineIntel-6-8D-1
Attempt to add: cpu/cpu-cycles=0/
..after resolving event: cpu/event=0x3c/
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
size 136
config 0 (PERF_COUNT_HW_CPU_CYCLES)
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 827628 cpu -1 group_fd -1 flags 0x8 = 3
------------------------------------------------------------
perf_event_attr:
type 4 (PERF_TYPE_RAW)
size 136
config 0x3c
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
...
So the second event has become a raw event as
/sys/devices/cpu/events/cpu-cycles exists.
A fix was necessary to config_term_pmu in parse-events.c as check_alias
expansion needs to happen after config_term_pmu, and config_term_pmu may
need calling a second time because of this.
config_term_pmu is updated to not use the legacy event when the PMU has
such a named event (either from JSON or sysfs).
The bulk of this change is updating all of the parse-events test
expectations so that if a sysfs/JSON event exists for a PMU the test
doesn't fail - a further sign, if it were needed, that the legacy event
priority was a known and tested behavior of the perf tool.
Reported-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Hector Martin <marcan@marcan.st>
Tested-by: Marc Zyngier <maz@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231123042922.834425-1-irogers@google.com
[ Initialize the 'alias_rewrote_terms' variable to false to address a clang warning ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a basic test for the branch counter feature.
The test verifies that
- The new filter can be successfully applied on the supported platforms.
- The counter value can be outputted via the perf report -D
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tinghao Zhang <tinghao.zhang@intel.com>
Link: https://lore.kernel.org/r/20231107184020.1497571-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The current use of atomics can lead to test failures, as tests (such as
tests/shell/record.sh) search for samples with "test_loop" as the
top-most stack frame, but find frames related to the atomic operation
(e.g. __aarch64_ldadd4_relax).
This change simply removes the "count" variable, as it is not necessary.
Fixes: 1962ab6f6e ("perf test workload thloop: Make count increments atomic")
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Nick Forrington <nick.forrington@arm.com>
Acked-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231102162225.50028-1-nick.forrington@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
All of the other Perf subcommands that use objdump have an option to
specify the binary, so add the same option to 'perf test'.
This is useful if you have built the kernel with a different toolchain
to the system one, where the system objdump may fail to disassemble
vmlinux.
Now this can be fixed with something like this:
$ perf test --objdump llvm-objdump "object code reading"
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Tom Rix <trix@redhat.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20231106151051.129440-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On s390 using linux-next the test case:
87: perf record offcpu profiling tests
fails. The root cause is this command
# ./perf record --off-cpu -e dummy -- ./perf bench sched messaging -l 10
# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 10 groups == 400 processes run
Total time: 0.231 [sec]
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.077 MB perf.data (401 samples) ]
#
It does not generate 800+ sample entries, on s390 usually around
40[1-9], sometimes a few more, but never more than 450. The higher the
number of CPUs the lower the number of samples.
Looking at function chain:
bench_sched_messaging()
+--> group()
the senders and receiver threads are created. The senders and receivers
call function ready() which writes one bytes and wait for a reply using
poll system() call.
As context switches are counted, the function ready() will trigger a
context switch when no input data is available after the write system
call. The write system call does not trigger context switches when the
data size is small. And writing 1000 bytes (10 iterations with
100 bytes) is not much and certainly won't block.
The 400+ context switch on s390 occur when the some receiver/sender
threads call ready() and wait for the response from function
bench_sched_messaging() being kicked off.
Lower the number of expected context switches to 400 to succeed on s390.
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Co-developed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20231106091627.2022530-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>