814759 Commits

Author SHA1 Message Date
Bruce Ashfield
55dd15336b arch/x86/boot: use prefix map to avoid embedded paths
It was observed that the kernel embeds the path in the x86 boot
artifacts.

From https://bugzilla.yoctoproject.org/show_bug.cgi?id=13458:

[
   If you turn on the buildpaths QA test, or try a reproducible build, you
   discover that the kernel image contains build paths.

   $ strings bzImage-5.0.19-yocto-standard |grep tmp/
   out of pgt_buf in
   /data/poky-tmp/reproducible/tmp/work-shared/qemux86-64/kernel-source/arch/x86/boot/compressed/kaslr_64.c!?

   But what's this in the top-level Makefile:

   $ git grep prefix-map
   Makefile:KBUILD_CFLAGS  += $(call cc-option,-fmacro-prefix-map=$(srctree)/=)

   So the __FILE__ shouldn't be using the full path.  However
   arch/x86/boot/compressed/Makefile has this:

   KBUILD_CFLAGS := -m$(BITS) -O2

   So that clears KBUILD_CFLAGS, removing the -fmacro-prefix-map option.
]

Other architectures do not clear the flags, but instead prune before
adding boot or specific options. There's no obvious reason why x86 isn't
doing the same thing (pruning vs clearing) and no build or boot issues
have been observed.

So we make x86 do the same thing, and we no longer have embedded paths.
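
The difference between clearing and pruning can be sketched as follows. This is a minimal Python model rather than kbuild itself: the flag values and the helper names `clear_flags` and `prune_flags` are hypothetical, and the pruned flag is chosen only for illustration.

```python
# Hypothetical flag list; the values are illustrative only.
TOP_LEVEL_FLAGS = ["-O2", "-pg", "-fmacro-prefix-map=/build/src/="]


def clear_flags(flags, boot_flags):
    """Model of ':=': the inherited flags are discarded entirely."""
    return list(boot_flags)


def prune_flags(flags, unwanted, boot_flags):
    """Model of pruning: drop only the unwanted flags, then append."""
    kept = [f for f in flags if f not in unwanted]
    return kept + list(boot_flags)


cleared = clear_flags(TOP_LEVEL_FLAGS, ["-m64", "-O2"])
pruned = prune_flags(TOP_LEVEL_FLAGS, {"-pg"}, ["-m64"])
```

In this model only the pruned list still carries the prefix-map option, which is the property the change relies on.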

Signed-off-by: Bruce Ashfield <bruce.ashfield@gmail.com>
2019-08-08 23:39:26 -04:00
Bruce Ashfield
00638cdd8f This is the 5.0.19 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlzpa8gACgkQONu9yGCS
 aT50+RAAxYmH/oX6aUNVgA2IfeODEmvD4DsGisMyr3rvnyoOt7brH6r2DDgQ9edh
 yk+BdplT9yjqsbwFl7yWjzIfR9RI/zU6enUHM2q+NVRmeeIhAsSQwOx/q77kjvZe
 kqiLPgTpzEUekbvll2AdQhE0WNx3SOzKpvmONdBQXAtweXs7rPzB3ckdyBUuM2g8
 inVgEusm3KmjUsctVDYmT3uu7EIh+QAQkYVcl6TxqZXEqEKhWev71dGUHXqO0L2M
 lSnzj37n7FMZh9g81ma9ILcdC7imuP8Q2sXQA8FCAGKghvm+XqYSemYydo2qdoM3
 oW9RC0VyV8cHwvenULxtLxxiA3I10JrYE4RjK9EBlx79tTMVL6/OTHFuo4G6gCMt
 dEqHzJ/gcLX0IyGvOwXbla/17SmWYgHx2ovLhIsdd3FE3ocWyEnwKCliRe4Dd8C1
 HSozSQv3M/2m9AQ6D+AnbKUvqiH0QeAvgXFj0zvqHftHFnxFefoz7gr6yzSq1EYA
 1PxQhNaICDCwTo5NEz/pTTAIhoA+VSWPdd8fe6Eda4HhW6rYA1EaNNF4IxLIXUQ9
 Xq2ssT7NPOKvk0OJRShPQyfbE57CT/UqVDLAVemXlQeLRy9/OMoGFE5nRvihAEng
 hg0pG8m2kSnJQlHp5dDBzwjwxcBk/nvoqKtmvWnQ3ZlfQe3xFHI=
 =WuEC
 -----END PGP SIGNATURE-----

Merge tag 'v5.0.19' into v5.0/standard/base

This is the 5.0.19 stable release
2019-05-29 22:39:08 -04:00
Bruce Ashfield
455fca293d This is the 5.0.18 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlzk4HIACgkQONu9yGCS
 aT5fBw/9GPu5LTe/NPUdQApZglEwNLKYTKjtpy8s25UFFpkEkvtV/aYFt9z32eUV
 6M14I40gq5D0nNxNAjUamRObCa7wmD1nYnoY9cAEBMNNNkpJscEzKwmNqXePfjHh
 t6ViLYAaO1JBqWDIlBSztqYcqzt99WdK0bTxgMq1DdNzIfpQWgQl8fRf+p76QoTB
 qb/iynu0tsgykbwvi/xsV3c7rcGB/i8i9rfa16u8tSHi8Vgkoc4umCdyskWHmHQD
 69HG4ursOOf3xgwcx/ONA3XlBjYX8jXa5o1um+k3C0OumKf5rW5cLwoSp6RteOMU
 Qz3v5uUtLtqujeJRVLHBHyOz+PJwZx2sCfuvAYx98M/QJpgf3OFPnqgcMIP1L6+9
 GZkSUPKkzvflOaIwPmSXYdh77hLbJlOqs9t3bFK4s2emaCDgLBrg7IAAvNz8lF9N
 ZNU1DUtjj6szuCM3F0OOQ+BvS9X0dyMLW8/anNZy0M5vsZBHL6OFjLATMk8TMUag
 pzX5Xhh+2o8YN/sjY4ykTfZAgfWZ+d7Wx1HOBQbeqpev0o809LMpv1Bh6Pla+B5w
 nABd175W+CQQ/lgM5+BeC0wK2WlMqQj2BerPsYpA6BXPLDVbyZ0nzCd7skb/Bc4P
 roCKNY2WewJ0Gjr0ArAXhPbW7PzurUppU7pLJEfcUiWYrFeTc5Q=
 =jcFH
 -----END PGP SIGNATURE-----

Merge tag 'v5.0.18' into v5.0/standard/base

This is the 5.0.18 stable release
2019-05-29 22:09:34 -04:00
Greg Kroah-Hartman
3f7c1cab1a Linux 5.0.19 2019-05-25 18:22:32 +02:00
Yifeng Li
64d314bd8c fbdev: sm712fb: fix memory frequency by avoiding a switch/case fallthrough
commit 9dc2011398 upstream.

A fallthrough in switch/case was introduced in f627caf55b ("fbdev:
sm712fb: fix crashes and garbled display during DPMS modesetting"),
due to my copy-paste error, which would cause the memory clock frequency
for SM720 to be programmed to SM712.

Since it only reprograms the clock to a different frequency, it's only
a benign issue without visible side-effect, so it also evaded Sudip
Mukherjee's code review and regression tests. scripts/checkpatch.pl
also failed to discover the issue, possibly due to nested switch
statements.

This issue was found by Stephen Rothwell by building linux-next with
-Wimplicit-fallthrough.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: f627caf55b ("fbdev: sm712fb: fix crashes and garbled display during DPMS modesetting")
Signed-off-by: Yifeng Li <tomli@tomli.me>
Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-25 18:22:31 +02:00
Daniel Borkmann
e5c6d75b0f bpf, lru: avoid messing with eviction heuristics upon syscall lookup
commit 50b045a8c0 upstream.

One of the biggest issues we face right now with picking LRU map over
regular hash table is that a map walk out of user space, for example,
to just dump the existing entries or to remove certain ones, will
completely mess up LRU eviction heuristics and wrong entries such
as just created ones will get evicted instead. The reason for this
is that we mark an entry as "in use" via bpf_lru_node_set_ref() from
system call lookup side as well. Thus upon walk, all entries are
being marked, so information of actual least recently used ones
are "lost".

In case of Cilium where it can be used (besides others) as a BPF
based connection tracker, this current behavior causes disruption
upon control plane changes that need to walk the map from user space
to evict certain entries. Discussion result from bpfconf [0] was that
we should simply just remove marking from system call side as no
good use case could be found where it's actually needed there.
Therefore this patch removes marking for regular LRU and per-CPU
flavor. If there ever should be a need in future, the behavior could
be selected via map creation flag, but due to mentioned reason we
avoid this here.
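
The effect described above can be illustrated with a toy model. This is a sketch only: the class `LruMap` and its method names are hypothetical, and an ordered dictionary stands in for the kernel's reference-bit based LRU lists.

```python
from collections import OrderedDict


class LruMap:
    """Toy LRU map; not the kernel's bpf_lru implementation."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # least recently used entry first

    def update(self, key, value):
        if key not in self.entries and len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the LRU entry
        self.entries[key] = value
        self.entries.move_to_end(key)

    def lookup(self, key):
        """Data-path lookup: marks the entry as recently used."""
        if key in self.entries:
            self.entries.move_to_end(key)
        return self.entries.get(key)

    def lookup_sys_only(self, key):
        """Syscall-side lookup: leaves the eviction order untouched."""
        return self.entries.get(key)


m = LruMap(capacity=2)
m.update("cold", 1)
m.update("hot", 2)
m.lookup("hot")             # data path keeps "hot" in use
m.lookup_sys_only("cold")   # user space inspects "cold" without marking it
m.update("fresh", 3)        # "cold" is still the LRU entry and is evicted
```

Had the user-space inspection gone through `lookup()` instead, "cold" would have been refreshed and "hot" evicted, which is the kind of disruption the message describes.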

  [0] http://vger.kernel.org/bpfconf.html

Fixes: 29ba732acb ("bpf: Add BPF_MAP_TYPE_LRU_HASH")
Fixes: 8f8449384e ("bpf: Add BPF_MAP_TYPE_LRU_PERCPU_HASH")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-25 18:22:31 +02:00
Daniel Borkmann
b5f95aa7a8 bpf: add map_lookup_elem_sys_only for lookups from syscall side
commit c6110222c6 upstream.

Add a callback map_lookup_elem_sys_only() that map implementations
could use over map_lookup_elem() from system call side in case the
map implementation needs to handle the latter differently than from
the BPF data path. If map_lookup_elem_sys_only() is set, this will
be preferred pick for map lookups out of user space. This hook is
used in a follow-up fix for LRU map, but once development window
opens, we can convert other map types from map_lookup_elem() (here,
the one called upon BPF_MAP_LOOKUP_ELEM cmd is meant) over to use
the callback to simplify and clean up the latter.
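
The dispatch rule can be sketched as below. This is an illustrative Python model, assuming a callback table with an optional hook; `MapOps` and `syscall_lookup` are hypothetical names, not kernel structures.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MapOps:
    """Hypothetical stand-in for a map implementation's callbacks."""
    map_lookup_elem: Callable[[str], Optional[int]]
    map_lookup_elem_sys_only: Optional[Callable[[str], Optional[int]]] = None


def syscall_lookup(ops: MapOps, key: str) -> Optional[int]:
    """Prefer the syscall-only hook when the map provides one."""
    if ops.map_lookup_elem_sys_only is not None:
        return ops.map_lookup_elem_sys_only(key)
    return ops.map_lookup_elem(key)


plain = MapOps(map_lookup_elem=lambda key: 1)
special = MapOps(map_lookup_elem=lambda key: 1,
                 map_lookup_elem_sys_only=lambda key: 2)
```

Map types that do not set the optional hook keep their existing behaviour, so the change is opt-in per implementation.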

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-25 18:22:31 +02:00
Chenbo Feng
d811930f74 bpf: relax inode permission check for retrieving bpf program
commit e547ff3f80 upstream.

For the iptables module to load a bpf program from a pinned location, it
only retrieves a loaded program and cannot change the program content, so
requiring write permission for it might not be necessary.
Also, when adding or removing an unrelated iptables rule, it might need to
flush and reload the xt_bpf related rules as well, which triggers the inode
permission check. It might be better to remove the write permission
check for the inode so we won't need to grant write access to all the
processes that flush and restore iptables rules.

Signed-off-by: Chenbo Feng <fengc@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-25 18:22:31 +02:00
John Garry
ca7ef7e3dd driver core: Postpone DMA tear-down until after devres release for probe failure
commit 0b777eee88 upstream.

In commit 376991db4b ("driver core: Postpone DMA tear-down until after
devres release"), we changed the ordering of tearing down the device DMA
ops and releasing all the device's resources; this was because the DMA ops
should be maintained until we release the device's managed DMA memories.

However, we have seen another crash on an arm64 system when a
device driver probe fails:

  hisi_sas_v3_hw 0000:74:02.0: Adding to iommu group 2
  scsi host1: hisi_sas_v3_hw
  BUG: Bad page state in process swapper/0  pfn:313f5
  page:ffff7e0000c4fd40 count:1 mapcount:0
  mapping:0000000000000000 index:0x0
  flags: 0xfffe00000001000(reserved)
  raw: 0fffe00000001000 ffff7e0000c4fd48 ffff7e0000c4fd48 0000000000000000
  raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
  page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
  bad because of flags: 0x1000(reserved)
  Modules linked in:
  CPU: 49 PID: 1 Comm: swapper/0 Not tainted 5.1.0-rc1-43081-g22d97fd-dirty #1433
  Hardware name: Huawei D06/D06, BIOS Hisilicon D06 UEFI RC0 - V1.12.01 01/29/2019
  Call trace:
  dump_backtrace+0x0/0x118
  show_stack+0x14/0x1c
  dump_stack+0xa4/0xc8
  bad_page+0xe4/0x13c
  free_pages_check_bad+0x4c/0xc0
  __free_pages_ok+0x30c/0x340
  __free_pages+0x30/0x44
  __dma_direct_free_pages+0x30/0x38
  dma_direct_free+0x24/0x38
  dma_free_attrs+0x9c/0xd8
  dmam_release+0x20/0x28
  release_nodes+0x17c/0x220
  devres_release_all+0x34/0x54
  really_probe+0xc4/0x2c8
  driver_probe_device+0x58/0xfc
  device_driver_attach+0x68/0x70
  __driver_attach+0x94/0xdc
  bus_for_each_dev+0x5c/0xb4
  driver_attach+0x20/0x28
  bus_add_driver+0x14c/0x200
  driver_register+0x6c/0x124
  __pci_register_driver+0x48/0x50
  sas_v3_pci_driver_init+0x20/0x28
  do_one_initcall+0x40/0x25c
  kernel_init_freeable+0x2b8/0x3c0
  kernel_init+0x10/0x100
  ret_from_fork+0x10/0x18
  Disabling lock debugging due to kernel taint
  BUG: Bad page state in process swapper/0  pfn:313f6
  page:ffff7e0000c4fd80 count:1 mapcount:0
  mapping:0000000000000000 index:0x0
  flags: 0xfffe00000001000(reserved)
  raw: 0fffe00000001000 ffff7e0000c4fd88 ffff7e0000c4fd88 0000000000000000
  raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000

The crash occurs for the same reason.

In this case, on the really_probe() failure path, we are still clearing
the DMA ops prior to releasing the device's managed memories.

This patch fixes this issue by reordering the DMA ops teardown and the
call to devres_release_all() on the failure path.
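
Why the ordering matters can be shown with a small model. This is a sketch under stated assumptions: the `Device` class, `dma_deconfigure()` and the two `probe_failed_*` helpers are hypothetical stand-ins, and a boolean records whether the DMA ops were still present when the managed memory was released.

```python
class Device:
    """Toy device: managed DMA memory should be freed via its DMA ops."""

    def __init__(self):
        self.dma_ops = "ops"
        self.managed = ["dma-buffer"]
        self.released = []  # (resource, ops_were_present) pairs

    def devres_release_all(self):
        while self.managed:
            res = self.managed.pop()
            # Record whether the ops still existed at release time.
            self.released.append((res, self.dma_ops is not None))

    def dma_deconfigure(self):
        self.dma_ops = None


def probe_failed_old(dev):
    dev.dma_deconfigure()      # ops cleared first
    dev.devres_release_all()   # managed memory released without ops


def probe_failed_new(dev):
    dev.devres_release_all()   # managed memory released while ops exist
    dev.dma_deconfigure()


old, new = Device(), Device()
probe_failed_old(old)
probe_failed_new(new)
```

Only the reordered path releases the buffer while the ops are still in place.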

Reported-by: Xiang Chen <chenxiang66@hisilicon.com>
Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-25 18:22:31 +02:00
Nigel Croxon
bad4fbe76c md/raid: raid5 preserve the writeback action after the parity check
commit b2176a1dfb upstream.

The problem is that any 'uptodate' vs 'disks' check is not precise
in this path. Put a "WARN_ON(!test_bit(R5_UPTODATE, &dev->flags))" on the
device that might try to kick off writes and then skip the action.
Better to prevent the raid driver from taking unexpected action *and* keep
the system alive vs killing the machine with BUG_ON.

Note: fixed warning reported by kbuild test robot <lkp@intel.com>

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-25 18:22:31 +02:00
Song Liu
3770eb3721 Revert "Don't jump to compute_result state from check_result state"
commit a25d8c327b upstream.

This reverts commit 4f4fd7c579.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Nigel Croxon <ncroxon@redhat.com>
Cc: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-25 18:22:30 +02:00
Jiri Olsa
07116a6548 perf/x86/intel: Fix race in intel_pmu_disable_event()
[ Upstream commit 6f55967ad9 ]

New race in x86_pmu_stop() was introduced by replacing the
atomic __test_and_clear_bit() of cpuc->active_mask by separate
test_bit() and __clear_bit() calls in the following commit:

  3966c3feca ("x86/perf/amd: Remove need to check "running" bit in NMI handler")

The race causes panic for PEBS events with enabled callchains:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
  ...
  RIP: 0010:perf_prepare_sample+0x8c/0x530
  Call Trace:
   <NMI>
   perf_event_output_forward+0x2a/0x80
   __perf_event_overflow+0x51/0xe0
   handle_pmi_common+0x19e/0x240
   intel_pmu_handle_irq+0xad/0x170
   perf_event_nmi_handler+0x2e/0x50
   nmi_handle+0x69/0x110
   default_do_nmi+0x3e/0x100
   do_nmi+0x11a/0x180
   end_repeat_nmi+0x16/0x1a
  RIP: 0010:native_write_msr+0x6/0x20
  ...
   </NMI>
   intel_pmu_disable_event+0x98/0xf0
   x86_pmu_stop+0x6e/0xb0
   x86_pmu_del+0x46/0x140
   event_sched_out.isra.97+0x7e/0x160
  ...

The event is configured to make samples from PEBS drain code,
but when it's disabled, we'll go through NMI path instead,
where data->callchain will not get allocated and we'll crash:

          x86_pmu_stop
            test_bit(hwc->idx, cpuc->active_mask)
            intel_pmu_disable_event(event)
            {
              ...
              intel_pmu_pebs_disable(event);
              ...

EVENT OVERFLOW ->  <NMI>
                     intel_pmu_handle_irq
                       handle_pmi_common
   TEST PASSES ->        test_bit(bit, cpuc->active_mask))
                           perf_event_overflow
                             perf_prepare_sample
                             {
                               ...
                               if (!(sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY))
                                     data->callchain = perf_callchain(event, regs);

         CRASH ->              size += data->callchain->nr;
                             }
                   </NMI>
              ...
              x86_pmu_disable_event(event)
            }

            __clear_bit(hwc->idx, cpuc->active_mask);

Fixing this by disabling the event itself before setting
off the PEBS bit.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Arcari <darcari@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Lendacky Thomas <Thomas.Lendacky@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: 3966c3feca ("x86/perf/amd: Remove need to check "running" bit in NMI handler")
Link: http://lkml.kernel.org/r/20190504151556.31031-1-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:30 +02:00
Leo Yan
58d1e074c7 perf cs-etm: Always allocate memory for cs_etm_queue::prev_packet
[ Upstream commit 35bb59c10a ]

Robert Walker reported that a segmentation fault is observed when processing
CoreSight trace data; this issue can be easily reproduced by the command
'perf report --itrace=i1000i' for decoding tracing data.

If neither the 'b' flag (synthesize branches events) nor the 'l' flag
(synthesize last branch entries) is specified to option '--itrace',
cs_etm_queue::prev_packet will not be initialised.  After merging the
code to support exception packets and sample flags, a number of uses of
cs_etm_queue::prev_packet were introduced without checking whether it
is valid; in these cases any access to an uninitialised prev_packet
will cause a crash.

As cs_etm_queue::prev_packet is used more widely now and it's already
hard to follow which functions have been called in a context where the
validity of cs_etm_queue::prev_packet has been checked, this patch
always allocates memory for cs_etm_queue::prev_packet.

Reported-by: Robert Walker <robert.walker@arm.com>
Suggested-by: Robert Walker <robert.walker@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Robert Walker <robert.walker@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Fixes: 7100b12cf4 ("perf cs-etm: Generate branch sample for exception packet")
Fixes: 24fff5eb2b ("perf cs-etm: Avoid stale branch samples when flush packet")
Link: http://lkml.kernel.org/r/20190428083228.20246-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:30 +02:00
Arnaldo Carvalho de Melo
cd448c27b0 perf bench numa: Add define for RUSAGE_THREAD if not present
[ Upstream commit bf561d3c13 ]

While cross building perf to the ARC architecture on a fedora 30 host,
we were failing with:

      CC       /tmp/build/perf/bench/numa.o
  bench/numa.c: In function ‘worker_thread’:
  bench/numa.c:1261:12: error: ‘RUSAGE_THREAD’ undeclared (first use in this function); did you mean ‘SIGEV_THREAD’?
    getrusage(RUSAGE_THREAD, &rusage);
              ^~~~~~~~~~~~~
              SIGEV_THREAD
  bench/numa.c:1261:12: note: each undeclared identifier is reported only once for each function it appears in

[perfbuilder@60d5802468f6 perf]$ /arc_gnu_2019.03-rc1_prebuilt_uclibc_le_archs_linux_install/bin/arc-linux-gcc --version | head -1
arc-linux-gcc (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1) 8.3.1 20190225
[perfbuilder@60d5802468f6 perf]$

Trying to reproduce a report by Vineet, I noticed that, with just
cross-built zlib and numactl libraries, I ended up with the above
failure.

So, since RUSAGE_THREAD is available as a define in the system headers,
check if it is defined in the 'perf bench numa' sources and define it if
not.

Now it builds and I have to figure out if the problem reported by Vineet
only takes place if we have libelf or some other library available.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: linux-snps-arc@lists.infradead.org
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Link: https://lkml.kernel.org/n/tip-2wb4r1gir9xrevbpq7qp0amk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:30 +02:00
Wolfram Sang
7325696ce2 i2c: designware: ratelimit 'transfer when suspended' errors
[ Upstream commit 6bac9bc273 ]

There are two problems with dev_err() here. One: It is not ratelimited.
Two: We don't see which driver tried to transfer something with a
suspended adapter. Switch to dev_WARN_ONCE to fix both issues. Drawback
is that we don't see if multiple drivers are trying to transfer while
suspended. They need to be discovered one after the other now. This is
better than a high CPU load because a really broken driver might try to
resend endlessly.
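
The trade-off can be sketched with a warn-once helper. This is an illustrative Python model, not dev_WARN_ONCE itself; `warn_once`, the site key and the driver names are hypothetical.

```python
_warned_sites = set()


def warn_once(site, message, log):
    """Emit `message` only the first time `site` triggers."""
    if site in _warned_sites:
        return False
    _warned_sites.add(site)
    log.append(message)
    return True


log = []
for driver in ["driver_a", "driver_a", "driver_b"]:
    warn_once("xfer-while-suspended",
              f"{driver}: transfer when suspended", log)
```

The log ends up with a single entry for `driver_a`; the later `driver_b` offence is not reported, which mirrors the drawback noted above.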

Link: https://bugs.archlinux.org/task/62391
Fixes: 2751541555 ("i2c: designware: Do not allow i2c_dw_xfer() calls while suspended")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reported-by: skidnik <skidnik@gmail.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: skidnik <skidnik@gmail.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:30 +02:00
Al Viro
8258661858 ufs: fix braino in ufs_get_inode_gid() for solaris UFS flavour
[ Upstream commit 4e9036042f ]

To choose whether to pick the GID from the old (16bit) or new (32bit)
field, we should check if the old gid field is set to 0xffff.  Mainline
checks the old *UID* field instead - cut'n'paste from the corresponding
code in ufs_get_inode_uid().
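
The selection rule, and the mistake, can be sketched as follows. This is a simplified Python model with hypothetical function and parameter names; it ignores endianness and the on-disk inode layout.

```python
def get_inode_gid(old_gid16, new_gid32):
    """Use the 32-bit field only when the 16-bit gid field holds 0xffff."""
    if old_gid16 == 0xFFFF:
        return new_gid32
    return old_gid16


def get_inode_gid_buggy(old_uid16, old_gid16, new_gid32):
    """The braino: the sentinel check looks at the old *UID* field."""
    if old_uid16 == 0xFFFF:
        return new_gid32
    return old_gid16
```

With a small UID and a GID that only fits the 32-bit field, the buggy variant returns the 0xffff sentinel instead of the real GID.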

Fixes: 252e211e90
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:30 +02:00
Vitaly Kuznetsov
5b73764a5d KVM: selftests: make hyperv_cpuid test pass on AMD
[ Upstream commit eba3afde1c ]

Enlightened VMCS is only supported on Intel CPUs but the test shouldn't
fail completely.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:29 +02:00
Paolo Bonzini
fb654d0763 KVM: fix KVM_CLEAR_DIRTY_LOG for memory slots of unaligned size
[ Upstream commit 76d58e0f07 ]

If a memory slot's size is not a multiple of 64 pages (256K), then
the KVM_CLEAR_DIRTY_LOG API is unusable: clearing the final 64 pages
either requires the requested page range to go beyond memslot->npages,
or requires log->num_pages to be unaligned, and kvm_clear_dirty_log_protect
requires log->num_pages to be both in range and aligned.

To allow this case, allow log->num_pages not to be a multiple of 64 if
it ends exactly on the last page of the slot.
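
The relaxed check can be sketched as below. This is a Python model, not the kernel function: `check_clear_dirty_log` and its parameters are hypothetical names, and the requirement that the first page be 64-aligned is an assumption not stated in this message.

```python
EINVAL = 22
PAGES_PER_WORD = 64  # 64 pages of 4K each is the 256K mentioned above


def check_clear_dirty_log(first_page, num_pages, slot_npages):
    """Return 0 if the range is acceptable, -EINVAL otherwise."""
    # Assumption: the start of the range must be 64-page aligned.
    if first_page % PAGES_PER_WORD != 0:
        return -EINVAL
    # The range must stay inside the slot.
    if first_page > slot_npages or num_pages > slot_npages - first_page:
        return -EINVAL
    # An unaligned length is allowed only when it ends on the last page.
    ends_on_last_page = first_page + num_pages == slot_npages
    if num_pages % PAGES_PER_WORD != 0 and not ends_on_last_page:
        return -EINVAL
    return 0
```

For a 100-page slot, the tail range starting at page 64 with 36 pages is now accepted, while rounding it up to 64 pages would still run past the end of the slot.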

Reported-by: Peter Xu <peterx@redhat.com>
Fixes: 98938aa8ed ("KVM: validate userspace input in kvm_clear_dirty_log_protect()", 2019-01-02)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:29 +02:00
Gary Hook
497ce5c7f5 x86/mm/mem_encrypt: Disable all instrumentation for early SME setup
[ Upstream commit b51ce3744f ]

Enablement of AMD's Secure Memory Encryption feature is determined very
early after start_kernel() is entered. Part of this procedure involves
scanning the command line for the parameter 'mem_encrypt'.

To determine intended state, the function sme_enable() uses library
functions cmdline_find_option() and strncmp(). Their use occurs early
enough such that it cannot be assumed that any instrumentation subsystem
is initialized.

For example, making calls to a KASAN-instrumented function before KASAN
is set up will result in the use of uninitialized memory and a boot
failure.

When AMD's SME support is enabled, conditionally disable instrumentation
of these dependent functions in lib/string.c and arch/x86/lib/cmdline.c.

 [ bp: Get rid of intermediary nostackp var and cleanup whitespace. ]

Fixes: aca20d5462 ("x86/mm: Add support to make use of Secure Memory Encryption")
Reported-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Boris Brezillon <bbrezillon@kernel.org>
Cc: Coly Li <colyli@suse.de>
Cc: "dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: "luto@kernel.org" <luto@kernel.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: "mingo@redhat.com" <mingo@redhat.com>
Cc: "peterz@infradead.org" <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/155657657552.7116.18363762932464011367.stgit@sosrh3.amd.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:29 +02:00
Tobin C. Harding
96f0be982c sched/cpufreq: Fix kobject memleak
[ Upstream commit 9a4f26cc98 ]

Currently the error return path from kobject_init_and_add() is not
followed by a call to kobject_put() - which means we are leaking
the kobject.

Fix it by adding a call to kobject_put() in the error path of
kobject_init_and_add().

Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tobin C. Harding <tobin@kernel.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Link: http://lkml.kernel.org/r/20190430001144.24890-1-tobin@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:29 +02:00
Luca Coelho
2a9605f177 iwlwifi: mvm: check for length correctness in iwl_mvm_create_skb()
[ Upstream commit de1887c064 ]

We don't check for the validity of the lengths in the packet received
from the firmware.  If the MPDU length received in the rx descriptor
is too short to contain the header length and the crypt length
together, we may end up trying to copy a negative number of bytes
(headlen - hdrlen < 0) which will underflow and cause us to try to
copy a huge amount of data.  This causes oopses such as this one:

BUG: unable to handle kernel paging request at ffff896be2970000
PGD 5e201067 P4D 5e201067 PUD 5e205067 PMD 16110d063 PTE 8000000162970161
Oops: 0003 [#1] PREEMPT SMP NOPTI
CPU: 2 PID: 1824 Comm: irq/134-iwlwifi Not tainted 4.19.33-04308-geea41cf4930f #1
Hardware name: [...]
RIP: 0010:memcpy_erms+0x6/0x10
Code: 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3
 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe
RSP: 0018:ffffa4630196fc60 EFLAGS: 00010287
RAX: ffff896be2924618 RBX: ffff896bc8ecc600 RCX: 00000000fffb4610
RDX: 00000000fffffff8 RSI: ffff896a835e2a38 RDI: ffff896be2970000
RBP: ffffa4630196fd30 R08: ffff896bc8ecc600 R09: ffff896a83597000
R10: ffff896bd6998400 R11: 000000000200407f R12: ffff896a83597050
R13: 00000000fffffff8 R14: 0000000000000010 R15: ffff896a83597038
FS:  0000000000000000(0000) GS:ffff896be8280000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff896be2970000 CR3: 000000005dc12002 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 iwl_mvm_rx_mpdu_mq+0xb51/0x121b [iwlmvm]
 iwl_pcie_rx_handle+0x58c/0xa89 [iwlwifi]
 iwl_pcie_irq_rx_msix_handler+0xd9/0x12a [iwlwifi]
 irq_thread_fn+0x24/0x49
 irq_thread+0xb0/0x122
 kthread+0x138/0x140
 ret_from_fork+0x1f/0x40

Fix that by checking the lengths for correctness and triggering a warning
to show that we have received wrong data.
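
The wrap-around can be reproduced with plain arithmetic. This is a sketch assuming a 32-bit unsigned length; the function names and the sample lengths are hypothetical.

```python
U32_MASK = 0xFFFFFFFF


def copy_len_unchecked(headlen, hdrlen):
    """Model an unsigned 32-bit subtraction that wraps around."""
    return (headlen - hdrlen) & U32_MASK


def copy_len_checked(headlen, hdrlen):
    """Reject descriptors whose lengths are inconsistent."""
    if headlen < hdrlen:
        raise ValueError("invalid lengths from firmware")
    return headlen - hdrlen
```

A shortfall of 8 bytes wraps to 0xfffffff8 in this model, which is also the value shown in RDX in the trace above.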

Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:28 +02:00
Bjørn Mork
df5eba5f41 qmi_wwan: new Wistron, ZTE and D-Link devices
[ Upstream commit 88ef66a283 ]

Adding device entries found in vendor modified versions of this
driver.  Function maps for some of the devices follow:

WNC D16Q1, D16Q5, D18Q1 LTE CAT3 module (1435:0918)

MI_00 Qualcomm HS-USB Diagnostics
MI_01 Android Debug interface
MI_02 Qualcomm HS-USB Modem
MI_03 Qualcomm Wireless HS-USB Ethernet Adapter
MI_04 Qualcomm Wireless HS-USB Ethernet Adapter
MI_05 Qualcomm Wireless HS-USB Ethernet Adapter
MI_06 USB Mass Storage Device

 T:  Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=480 MxCh= 0
 D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
 P:  Vendor=1435 ProdID=0918 Rev= 2.32
 S:  Manufacturer=Android
 S:  Product=Android
 S:  SerialNumber=0123456789ABCDEF
 C:* #Ifs= 7 Cfg#= 1 Atr=80 MxPwr=500mA
 I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
 E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
 E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
 E:  Ad=84(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
 E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
 E:  Ad=86(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
 E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
 E:  Ad=88(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
 E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
 E:  Ad=8a(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
 E:  Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 E:  Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms

WNC D18 LTE CAT3 module (1435:d182)

MI_00 Qualcomm HS-USB Diagnostics
MI_01 Android Debug interface
MI_02 Qualcomm HS-USB Modem
MI_03 Qualcomm HS-USB NMEA
MI_04 Qualcomm Wireless HS-USB Ethernet Adapter
MI_05 Qualcomm Wireless HS-USB Ethernet Adapter
MI_06 USB Mass Storage Device

ZM8510/ZM8620/ME3960 (19d2:0396)

MI_00 ZTE Mobile Broadband Diagnostics Port
MI_01 ZTE Mobile Broadband AT Port
MI_02 ZTE Mobile Broadband Modem
MI_03 ZTE Mobile Broadband NDIS Port (qmi_wwan)
MI_04 ZTE Mobile Broadband ADB Port

ME3620_X (19d2:1432)

MI_00 ZTE Diagnostics Device
MI_01 ZTE UI AT Interface
MI_02 ZTE Modem Device
MI_03 ZTE Mobile Broadband Network Adapter
MI_04 ZTE Composite ADB Interface

Reported-by: Lars Melin <larsm17@gmail.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:28 +02:00
Peter Zijlstra
bd61ddd3e9 bpf: Fix preempt_enable_no_resched() abuse
[ Upstream commit 0edd6b64d1 ]

Unless the very next line is schedule(), or implies it, one must not use
preempt_enable_no_resched(). It can cause a preemption to go missing and
thereby cause arbitrary delays, breaking the PREEMPT=y invariant.

Cc: Roman Gushchin <guro@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:28 +02:00
Alban Crequy
bd3713424a tools: bpftool: fix infinite loop in map create
[ Upstream commit 8694d8c1f8 ]

"bpftool map create" has an infinite loop on "while (argc)". The error
case is missing.

Symptoms: when forgetting to type the keyword 'type' in front of 'hash':
$ sudo bpftool map create /sys/fs/bpf/dir/foobar hash key 8 value 8 entries 128
(infinite loop, taking all the CPU)
^C

After the patch:
$ sudo bpftool map create /sys/fs/bpf/dir/foobar hash key 8 value 8 entries 128
Error: unknown arg hash
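
The shape of the fix can be sketched as a keyword/value parser. This is an illustrative Python model, not bpftool's code; `parse_map_create` and the set of accepted keywords are assumptions based on the command line shown above.

```python
def parse_map_create(args):
    """Parse 'keyword value' pairs; unknown keywords are an error."""
    known = {"type", "key", "value", "entries"}
    opts = {}
    i = 0
    while i < len(args):
        keyword = args[i]
        if keyword not in known:
            # Without this branch nothing is consumed and the loop spins.
            raise ValueError(f"unknown arg {keyword}")
        if i + 1 >= len(args):
            raise ValueError(f"missing value for {keyword}")
        opts[keyword] = args[i + 1]
        i += 2  # consume the keyword and its value
    return opts
```

Every iteration either consumes arguments or raises, so the loop always terminates.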

Fixes: 0b592b5a01 ("tools: bpftool: add map create command")
Signed-off-by: Alban Crequy <alban@kinvolk.io>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:27 +02:00
Andrey Smirnov
1e61a21909 power: supply: sysfs: prevent endless uevent loop with CONFIG_POWER_SUPPLY_DEBUG
[ Upstream commit 349ced9984 ]

Fix a similar endless event loop as was done in commit
8dcf32175b ("i2c: prevent endless uevent loop with
CONFIG_I2C_DEBUG_CORE"):

  The culprit is the dev_dbg printk in the i2c uevent handler. If
  this is activated (for instance by CONFIG_I2C_DEBUG_CORE) it results
  in an endless loop with systemd-journald.

  This happens if user-space scans the system log and reads the uevent
  file to get information about a newly created device, which seems
  fair use to me. Unfortunately reading the "uevent" file uses the
  same function that runs for creating the uevent for a new device,
  generating the next syslog entry

Both CONFIG_I2C_DEBUG_CORE and CONFIG_POWER_SUPPLY_DEBUG were reported
in https://bugs.freedesktop.org/show_bug.cgi?id=76886 but only the former
seems to have been fixed. Drop the debug prints, as was done in the I2C
subsystem, to resolve the issue.

Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Chris Healy <cphealy@gmail.com>
Cc: linux-pm@vger.kernel.org
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:27 +02:00
Andrew Jones
e6ae439228 KVM: arm/arm64: Ensure vcpu target is unset on reset failure
[ Upstream commit 811328fc32 ]

A failed KVM_ARM_VCPU_INIT should not set the vcpu target,
as the vcpu target is used by kvm_vcpu_initialized() to
determine if other vcpu ioctls may proceed. We need to set
the target before calling kvm_reset_vcpu(), but if that call
fails, we should then unset it and clear the feature bitmap
while we're at it.
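
The set-then-roll-back pattern described above might look like the sketch below. All names here (struct vcpu, NO_TARGET, reset_vcpu(), the accepted target value 5) are hypothetical stand-ins chosen for illustration, not the KVM code.

```c
#include <string.h>

#define NO_TARGET (-1)	/* hypothetical "not initialized" marker */

struct vcpu {
	int target;
	unsigned long features[1];
};

/* Stand-in for the reset call: fails for any target other than 5. */
static int reset_vcpu(const struct vcpu *v)
{
	return v->target == 5 ? 0 : -1;
}

int vcpu_set_target(struct vcpu *v, int target, unsigned long features)
{
	int ret;

	/* The reset needs the target and features to be set first. */
	v->target = target;
	v->features[0] = features;

	ret = reset_vcpu(v);
	if (ret) {
		/* Roll back so the vcpu does not look initialized. */
		v->target = NO_TARGET;
		memset(v->features, 0, sizeof(v->features));
	}
	return ret;
}

/* Mirrors the idea of an "initialized" check keyed on the target. */
int vcpu_initialized(const struct vcpu *v)
{
	return v->target >= 0;
}
```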

Signed-off-by: Andrew Jones <drjones@redhat.com>
[maz: Simplified patch, completed commit message]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:27 +02:00
Kangjie Lu
5450811a02 net: ieee802154: fix missing checks for regmap_update_bits
[ Upstream commit 22e8860cf8 ]

regmap_update_bits could fail and deserves a check.

The patch adds the checks and if it fails, returns its error
code upstream.

Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:26 +02:00
Bhagavathi Perumal S
15f64f420b mac80211: Fix kernel panic due to use of txq after free
[ Upstream commit f1267cf3c0 ]

The txq of a vif is added to the active_txqs list for ATF TXQ scheduling
in ieee80211_queue_skb(), but it was not removed from that list before
the txq object was freed. This caused a use-after-free of txq objects
still on the active_txqs list, resulting in a kernel panic due to
invalid memory access.

Fix the invalid memory access by removing the txq object from the
active_txqs list before freeing it.

Signed-off-by: Bhagavathi Perumal S <bperumal@codeaurora.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:26 +02:00
Vitaly Kuznetsov
eff6d5429b x86: kvm: hyper-v: deal with buggy TLB flush requests from WS2012
[ Upstream commit da66761c2d ]

It was reported that with some special Multi Processor Group configuration,
e.g:
 bcdedit.exe /set groupsize 1
 bcdedit.exe /set maxgroup on
 bcdedit.exe /set groupaware on
for a 16-vCPU guest WS2012 shows BSOD on boot when PV TLB flush mechanism
is in use.

Tracing kvm_hv_flush_tlb immediately reveals the issue:

 kvm_hv_flush_tlb: processor_mask 0x0 address_space 0x0 flags 0x2

The only flag set in this request is HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES,
however, processor_mask is 0x0 and no HV_FLUSH_ALL_PROCESSORS is specified.
We don't flush anything and apparently it's not what Windows expects.

TLFS doesn't say anything about such requests and newer Windows versions
seem to be unaffected. This all feels like a WS2012 bug, which is, however,
easy to work around in KVM: let's flush everything when we see an empty
flush request, over-flushing doesn't hurt.
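
A minimal sketch of that decision, assuming a 64-bit vCPU bitmask. The function name, the all_vcpus parameter and the FLUSH_ALL_PROCESSORS bit value are illustrative assumptions, not the KVM implementation.

```c
#include <stdint.h>

#define FLUSH_ALL_PROCESSORS (1u << 0)	/* illustrative bit value (assumption) */

/* Returns the bitmask of vCPUs to flush for one request. */
uint64_t vcpus_to_flush(uint64_t processor_mask, uint32_t flags,
			uint64_t all_vcpus)
{
	/* An empty mask is treated like "all processors": flushing too
	 * much is harmless, flushing nothing is not what the guest expects. */
	if ((flags & FLUSH_ALL_PROCESSORS) || processor_mask == 0)
		return all_vcpus;

	return processor_mask & all_vcpus;
}
```

With the request from the trace above (processor_mask 0x0, flags 0x2) this returns every vCPU instead of an empty set.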

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:26 +02:00
Logan Gunthorpe
48be4d7ced PCI: Fix issue with "pci=disable_acs_redir" parameter being ignored
[ Upstream commit d5bc73f34c ]

In most cases, kmalloc() will not be available early in boot when
pci_setup() is called.  Thus, the kstrdup() call that was added to fix the
__initdata bug with the disable_acs_redir parameter usually returns NULL,
so the parameter is discarded and has no effect.

To fix this, store the string that's in initdata until an initcall function
can allocate the memory appropriately.  This way we don't need any
additional static memory.

Fixes: d2fd6e8191 ("PCI: Fix __initdata issue with "pci=disable_acs_redir" parameter")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:26 +02:00
Al Viro
fa42fde1f8 apparmorfs: fix use-after-free on symlink traversal
[ Upstream commit f51dcd0f62 ]

symlink body shouldn't be freed without an RCU delay.  Switch apparmorfs
to ->destroy_inode() and use of call_rcu(); free both the inode and symlink
body in the callback.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:25 +02:00
Al Viro
cf0259f766 securityfs: fix use-after-free on symlink traversal
[ Upstream commit 46c8744196 ]

symlink body shouldn't be freed without an RCU delay.  Switch securityfs
to ->destroy_inode() and use of call_rcu(); free both the inode and symlink
body in the callback.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:25 +02:00
Tony Lindgren
04aa8a51e7 power: supply: cpcap-battery: Fix division by zero
[ Upstream commit dbe7208c6c ]

If called fast enough so samples do not increment, we can get
division by zero in kernel:

__div0
cpcap_battery_cc_raw_div
cpcap_battery_get_property
power_supply_get_property.part.1
power_supply_get_property
power_supply_show_property
power_supply_uevent
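
A guard of the following kind avoids the division. This is a sketch under the assumption that the divisor is a sample-count delta; average_per_sample() is a hypothetical helper, not the driver's function.

```c
#include <stdint.h>

/* Average of an accumulator delta over a sample-count delta.
 * Returns 0 when no new samples arrived between two reads. */
int64_t average_per_sample(int64_t acc_delta, int32_t sample_delta)
{
	if (sample_delta == 0)
		return 0;	/* nothing to average yet */

	return acc_delta / sample_delta;
}
```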

Fixes: 874b2adbed ("power: supply: cpcap-battery: Add a battery driver")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:25 +02:00
Alexey Kardashevskiy
38a725dd0b KVM: PPC: Book3S: Protect memslots while validating user address
[ Upstream commit 345077c8e1 ]

Guest physical to user address translation uses KVM memslots and reading
these requires holding the kvm->srcu lock. However recently introduced
kvmppc_tce_validate() broke the rule (see the lockdep warning below).

This moves srcu_read_lock(&vcpu->kvm->srcu) earlier to protect
kvmppc_tce_validate() as well.

=============================
WARNING: suspicious RCU usage
5.1.0-rc2-le_nv2_aikATfstn1-p1 #380 Not tainted
-----------------------------
include/linux/kvm_host.h:605 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
1 lock held by qemu-system-ppc/8020:
 #0: 0000000094972fe9 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0xdc/0x850 [kvm]

stack backtrace:
CPU: 44 PID: 8020 Comm: qemu-system-ppc Not tainted 5.1.0-rc2-le_nv2_aikATfstn1-p1 #380
Call Trace:
[c000003fece8f740] [c000000000bcc134] dump_stack+0xe8/0x164 (unreliable)
[c000003fece8f790] [c000000000181be0] lockdep_rcu_suspicious+0x130/0x170
[c000003fece8f810] [c0000000000d5f50] kvmppc_tce_to_ua+0x280/0x290
[c000003fece8f870] [c00800001a7e2c78] kvmppc_tce_validate+0x80/0x1b0 [kvm]
[c000003fece8f8e0] [c00800001a7e3fac] kvmppc_h_put_tce+0x94/0x3e4 [kvm]
[c000003fece8f9a0] [c00800001a8baac4] kvmppc_pseries_do_hcall+0x30c/0xce0 [kvm_hv]
[c000003fece8fa10] [c00800001a8bd89c] kvmppc_vcpu_run_hv+0x694/0xec0 [kvm_hv]
[c000003fece8fae0] [c00800001a7d95dc] kvmppc_vcpu_run+0x34/0x48 [kvm]
[c000003fece8fb00] [c00800001a7d56bc] kvm_arch_vcpu_ioctl_run+0x2f4/0x400 [kvm]
[c000003fece8fb90] [c00800001a7c3618] kvm_vcpu_ioctl+0x460/0x850 [kvm]
[c000003fece8fd00] [c00000000041c4f4] do_vfs_ioctl+0xe4/0x930
[c000003fece8fdb0] [c00000000041ce04] ksys_ioctl+0xc4/0x110
[c000003fece8fe00] [c00000000041ce78] sys_ioctl+0x28/0x80
[c000003fece8fe20] [c00000000000b5a4] system_call+0x5c/0x70

Fixes: 42de7b9e21 ("KVM: PPC: Validate TCEs against preregistered memory page sizes", 2018-09-10)
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:24 +02:00
Suraj Jitindar Singh
eec0c74675 KVM: PPC: Book3S HV: Preserve PSSCR FAKE_SUSPEND bit on guest exit
[ Upstream commit 7cb9eb106d ]

There is a hardware bug in some POWER9 processors where a treclaim in
fake suspend mode can cause an inconsistency in the XER[SO] bit across
the threads of a core, the workaround being to force the core into SMT4
when doing the treclaim.

The FAKE_SUSPEND bit (bit 10) in the PSSCR is used to control whether a
thread is in fake suspend or real suspend. The important difference here
being that thread reconfiguration is blocked in real suspend but not
fake suspend mode.

When we exit a guest which was in fake suspend mode, we force the core
into SMT4 while we do the treclaim in kvmppc_save_tm_hv().
However on the new exit path introduced with the function
kvmhv_run_single_vcpu() we restore the host PSSCR before calling
kvmppc_save_tm_hv() which means that if we were in fake suspend mode we
put the thread into real suspend mode when we clear the
PSSCR[FAKE_SUSPEND] bit. This means that we block thread reconfiguration
and the thread which is trying to get the core into SMT4 before it can
do the treclaim spins forever since it itself is blocking thread
reconfiguration. The result is that that core is essentially lost.

This results in a trace such as:
[   93.512904] CPU: 7 PID: 13352 Comm: qemu-system-ppc Not tainted 5.0.0 #4
[   93.512905] NIP:  c000000000098a04 LR: c0000000000cc59c CTR: 0000000000000000
[   93.512908] REGS: c000003fffd2bd70 TRAP: 0100   Not tainted  (5.0.0)
[   93.512908] MSR:  9000000302883033 <SF,HV,VEC,VSX,FP,ME,IR,DR,RI,LE,TM[SE]>  CR: 22222444  XER: 00000000
[   93.512914] CFAR: c000000000098a5c IRQMASK: 3
[   93.512915] PACATMSCRATCH: 0000000000000001
[   93.512916] GPR00: 0000000000000001 c000003f6cc1b830 c000000001033100 0000000000000004
[   93.512928] GPR04: 0000000000000004 0000000000000002 0000000000000004 0000000000000007
[   93.512930] GPR08: 0000000000000000 0000000000000004 0000000000000000 0000000000000004
[   93.512932] GPR12: c000203fff7fc000 c000003fffff9500 0000000000000000 0000000000000000
[   93.512935] GPR16: 2000000000300375 000000000000059f 0000000000000000 0000000000000000
[   93.512951] GPR20: 0000000000000000 0000000000080053 004000000256f41f c000003f6aa88ef0
[   93.512953] GPR24: c000003f6aa89100 0000000000000010 0000000000000000 0000000000000000
[   93.512956] GPR28: c000003f9e9a0800 0000000000000000 0000000000000001 c000203fff7fc000
[   93.512959] NIP [c000000000098a04] pnv_power9_force_smt4_catch+0x1b4/0x2c0
[   93.512960] LR [c0000000000cc59c] kvmppc_save_tm_hv+0x40/0x88
[   93.512960] Call Trace:
[   93.512961] [c000003f6cc1b830] [0000000000080053] 0x80053 (unreliable)
[   93.512965] [c000003f6cc1b8a0] [c00800001e9cb030] kvmhv_p9_guest_entry+0x508/0x6b0 [kvm_hv]
[   93.512967] [c000003f6cc1b940] [c00800001e9cba44] kvmhv_run_single_vcpu+0x2dc/0xb90 [kvm_hv]
[   93.512968] [c000003f6cc1ba10] [c00800001e9cc948] kvmppc_vcpu_run_hv+0x650/0xb90 [kvm_hv]
[   93.512969] [c000003f6cc1bae0] [c00800001e8f620c] kvmppc_vcpu_run+0x34/0x48 [kvm]
[   93.512971] [c000003f6cc1bb00] [c00800001e8f2d4c] kvm_arch_vcpu_ioctl_run+0x2f4/0x400 [kvm]
[   93.512972] [c000003f6cc1bb90] [c00800001e8e3918] kvm_vcpu_ioctl+0x460/0x7d0 [kvm]
[   93.512974] [c000003f6cc1bd00] [c0000000003ae2c0] do_vfs_ioctl+0xe0/0x8e0
[   93.512975] [c000003f6cc1bdb0] [c0000000003aeb24] ksys_ioctl+0x64/0xe0
[   93.512978] [c000003f6cc1be00] [c0000000003aebc8] sys_ioctl+0x28/0x80
[   93.512981] [c000003f6cc1be20] [c00000000000b3a4] system_call+0x5c/0x70
[   93.512983] Instruction dump:
[   93.512986] 419dffbc e98c0000 2e8b0000 38000001 60000000 60000000 60000000 40950068
[   93.512993] 392bffff 39400000 79290020 39290001 <7d2903a6> 60000000 60000000 7d235214

To fix this we preserve the PSSCR[FAKE_SUSPEND] bit until we call
kvmppc_save_tm_hv() which will mean the core can get into SMT4 and
perform the treclaim. Note kvmppc_save_tm_hv() clears the
PSSCR[FAKE_SUSPEND] bit again so there is no need to explicitly do that.

Fixes: 95a6432ce9 ("KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests")

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:24 +02:00
Jernej Skrabec
f3adb80bb2 clk: sunxi-ng: nkmp: Avoid GENMASK(-1, 0)
[ Upstream commit 2abc330c51 ]

Sometimes one of the nkmp factors is unused. This means that the shift and
width values of that factor are set to 0. The current nkmp clock code
generates a mask for each factor with GENMASK(width + shift - 1, shift).
For an unused factor this translates to GENMASK(-1, 0). The C preprocessor
expands this to the final version:
(((~0UL) - (1UL << (0)) + 1) & (~0UL >> (BITS_PER_LONG - 1 - (-1))))
or a bit simplified:
(~0UL & (~0UL >> BITS_PER_LONG))

It turns out that the result of the second part (~0UL >> BITS_PER_LONG) is
actually undefined by the C standard, which clearly specifies:

"If the value of the right operand is negative or is greater than or
equal to the width of the promoted left operand, the behavior is
undefined."

Additionally, compiling the kernel with aarch64-linux-gnu-gcc 8.3.0 gave
different results depending on whether literals or variables holding the
same values were used. GENMASK with literals -1 and 0 gives zero, and with
variables gives 0xFFFFFFFFFFFFFFFF (~0UL). Because the nkmp driver uses
GENMASK with variables as parameters, the expression calculates the mask as
~0UL instead of 0. As a consequence, the LSB in the register is always set
to 1 (1 is the neutral value for a factor and the shift is 0).

For example, H6 pll-de clock is set to 600 MHz by sun4i-drm driver, but
due to this bug ends up being 300 MHz. Additionally, 300 MHz seems to be
too low because following warning can be found in dmesg:

[    1.752763] WARNING: CPU: 2 PID: 41 at drivers/clk/sunxi-ng/ccu_common.c:41 ccu_helper_wait_for_lock.part.0+0x6c/0x90
[    1.763378] Modules linked in:
[    1.766441] CPU: 2 PID: 41 Comm: kworker/2:1 Not tainted 5.1.0-rc2-next-20190401 #138
[    1.774269] Hardware name: Pine H64 (DT)
[    1.778200] Workqueue: events deferred_probe_work_func
[    1.783341] pstate: 40000005 (nZcv daif -PAN -UAO)
[    1.788135] pc : ccu_helper_wait_for_lock.part.0+0x6c/0x90
[    1.793623] lr : ccu_helper_wait_for_lock.part.0+0x48/0x90
[    1.799107] sp : ffff000010f93840
[    1.802422] x29: ffff000010f93840 x28: 0000000000000000
[    1.807735] x27: ffff800073ce9d80 x26: ffff000010afd1b8
[    1.813049] x25: ffffffffffffffff x24: 00000000ffffffff
[    1.818362] x23: 0000000000000001 x22: ffff000010abd5c8
[    1.823675] x21: 0000000010000000 x20: 00000000685f367e
[    1.828987] x19: 0000000000001801 x18: 0000000000000001
[    1.834300] x17: 0000000000000001 x16: 0000000000000000
[    1.839613] x15: 0000000000000000 x14: ffff000010789858
[    1.844926] x13: 0000000000000000 x12: 0000000000000001
[    1.850239] x11: 0000000000000000 x10: 0000000000000970
[    1.855551] x9 : ffff000010f936c0 x8 : ffff800074cec0d0
[    1.860864] x7 : 0000800067117000 x6 : 0000000115c30b41
[    1.866177] x5 : 00ffffffffffffff x4 : 002c959300bfe500
[    1.871490] x3 : 0000000000000018 x2 : 0000000029aaaaab
[    1.876802] x1 : 00000000000002e6 x0 : 00000000686072bc
[    1.882114] Call trace:
[    1.884565]  ccu_helper_wait_for_lock.part.0+0x6c/0x90
[    1.889705]  ccu_helper_wait_for_lock+0x10/0x20
[    1.894236]  ccu_nkmp_set_rate+0x244/0x2a8
[    1.898334]  clk_change_rate+0x144/0x290
[    1.902258]  clk_core_set_rate_nolock+0x180/0x1b8
[    1.906963]  clk_set_rate+0x34/0xa0
[    1.910455]  sun8i_mixer_bind+0x484/0x558
[    1.914466]  component_bind_all+0x10c/0x230
[    1.918651]  sun4i_drv_bind+0xc4/0x1a0
[    1.922401]  try_to_bring_up_master+0x164/0x1c0
[    1.926932]  __component_add+0xa0/0x168
[    1.930769]  component_add+0x10/0x18
[    1.934346]  sun8i_dw_hdmi_probe+0x18/0x20
[    1.938443]  platform_drv_probe+0x50/0xa0
[    1.942455]  really_probe+0xcc/0x280
[    1.946032]  driver_probe_device+0x54/0xe8
[    1.950130]  __device_attach_driver+0x80/0xb8
[    1.954488]  bus_for_each_drv+0x78/0xc8
[    1.958326]  __device_attach+0xd4/0x130
[    1.962163]  device_initial_probe+0x10/0x18
[    1.966348]  bus_probe_device+0x90/0x98
[    1.970185]  deferred_probe_work_func+0x6c/0xa0
[    1.974720]  process_one_work+0x1e0/0x320
[    1.978732]  worker_thread+0x228/0x428
[    1.982484]  kthread+0x120/0x128
[    1.985714]  ret_from_fork+0x10/0x18
[    1.989290] ---[ end trace 9babd42e1ca4b84f ]---

This commit solves the issue by first checking the value of the factor
width. If it is equal to 0 (unused factor), the mask is set to 0; otherwise
the GENMASK() macro is used as before.
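
The width check can be sketched in standalone C as below. factor_mask() and BITS_PER_ULONG are names made up for this illustration, and the mask is built with plain shifts rather than the kernel's GENMASK() so that the block compiles on its own.

```c
#include <limits.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Mask covering `width` bits starting at bit `shift`.
 * Returns 0 for an unused factor (width == 0).
 * Assumes shift + width <= BITS_PER_ULONG. */
unsigned long factor_mask(unsigned int shift, unsigned int width)
{
	if (width == 0)
		return 0UL;

	/* For width >= 1 the right-shift count stays in the range
	 * [0, BITS_PER_ULONG - 1], so the shift is well defined. */
	return (~0UL >> (BITS_PER_ULONG - width)) << shift;
}
```

Handling width == 0 up front means the shift count can never reach the full width of unsigned long, which is the case the quoted C standard text leaves undefined.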

Fixes: d897ef56fa ("clk: sunxi-ng: Mask nkmp factors when setting register")
Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:24 +02:00
Vineet Gupta
791746a758 ARC: PAE40: don't panic and instead turn off hw ioc
[ Upstream commit 99bd5fcc50 ]

HSDK currently panics when built for HIGHMEM/ARC_HAS_PAE40 because ioc
is enabled by default, which doesn't work for the 2 non-contiguous
memory nodes. So get PAE working by disabling ioc instead.

Tested with !PAE40 by forcing @ioc_enable=0 and running the glibc
testsuite over ssh

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:23 +02:00
Steffen Klassert
30bd4585bf xfrm4: Fix uninitialized memory read in _decode_session4
[ Upstream commit 8742dc86d0 ]

We currently don't reload pointers pointing into the skb header
after doing pskb_may_pull() in _decode_session4(). So in case
pskb_may_pull() changed the pointers, we read from random
memory. Fix this by putting all the needed information on the
stack, so that we don't need to access the header pointers
after doing pskb_may_pull().

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:23 +02:00
Martin Willi
79fad8fd2b xfrm: Honor original L3 slave device in xfrmi policy lookup
[ Upstream commit 025c65e119 ]

If an xfrmi is associated to a vrf layer 3 master device,
xfrm_policy_check() fails after traffic decapsulation. The input
interface is replaced by the layer 3 master device, and hence
xfrmi_decode_session() can't match the xfrmi anymore to satisfy
policy checking.

Extend ingress xfrmi lookup to honor the original layer 3 slave
device, allowing xfrm interfaces to operate within a vrf domain.

Fixes: f203b76d78 ("xfrm: Add virtual xfrm interfaces")
Signed-off-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:22 +02:00
Sabrina Dubroca
ff7fa2c801 esp4: add length check for UDP encapsulation
[ Upstream commit 8dfb4eba41 ]

esp_output_udp_encap can produce a length that doesn't fit in the 16
bits of a UDP header's length field. In that case, we'll send a
fragmented packet whose length is larger than IP_MAX_MTU (resulting in
"Oversized IP packet" warnings on receive) and with a bogus UDP
length.

To prevent this, add a length check to esp_output_udp_encap and return
 -EMSGSIZE on failure.
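
A length check of this kind could look like the sketch below. checked_udp_len(), MAX_IP_PACKET and the fixed 20-byte IPv4 header allowance are assumptions of this illustration, not the kernel's exact expression.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_IP_PACKET 0xFFFFU	/* largest value a 16-bit length field holds */
#define IPV4_HDR_LEN  20U	/* IPv4 header without options (assumption) */

/* Stores the UDP length and returns 0 if the packet still fits,
 * otherwise returns -EMSGSIZE and leaves *udp_len untouched. */
int checked_udp_len(size_t payload_len, uint16_t *udp_len)
{
	/* Compare against the remaining room to avoid overflowing
	 * payload_len + IPV4_HDR_LEN. */
	if (payload_len >= MAX_IP_PACKET - IPV4_HDR_LEN)
		return -EMSGSIZE;

	*udp_len = (uint16_t)payload_len;
	return 0;
}
```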

This seems to be older than git history.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:22 +02:00
Cong Wang
4e8ce26804 xfrm: clean up xfrm protocol checks
[ Upstream commit dbb2483b2a ]

In commit 6a53b75932 ("xfrm: check id proto in validate_tmpl()")
I introduced a check for xfrm protocol, but according to Herbert
IPSEC_PROTO_ANY should only be used as a wildcard for lookup, so
it should be removed from validate_tmpl().

Also, IPSEC_PROTO_ANY is expected to match only the 3 IPSec-specific
protocols, which is why xfrm_state_flush() could still miss
IPPROTO_ROUTING, leaving those entries in net->xfrm.state_all when the
net namespace exits. Fix this by replacing IPSEC_PROTO_ANY with zero.

This patch also extracts the check from validate_tmpl() to
xfrm_id_proto_valid() and uses it in parse_ipsecrequest().
With this, no other protocols should be added into xfrm.

Fixes: 6a53b75932 ("xfrm: check id proto in validate_tmpl()")
Reported-by: syzbot+0bf0519d6e0de15914fe@syzkaller.appspotmail.com
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:22 +02:00
Jeremy Sowden
6c0db1cbf7 vti4: ipip tunnel deregistration fixes.
[ Upstream commit 5483844c3f ]

If tunnel registration failed during module initialization, the module
would fail to deregister the IPPROTO_COMP protocol and would attempt to
deregister the tunnel.

The tunnel was not deregistered during module-exit.

Fixes: dd9ee34440 ("vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel")
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:22 +02:00
Su Yanjun
f8a427ca50 xfrm6_tunnel: Fix potential panic when unloading xfrm6_tunnel module
[ Upstream commit 6ee02a54ef ]

When unloading the xfrm6_tunnel module, xfrm6_tunnel_fini directly
frees the xfrm6_tunnel_spi_kmem. Someone may still have obtained an
xfrm6_tunnel_spi, so we need to wait for it first.

Fixes: 91cc3bb0b04ff ("xfrm6_tunnel: RCU conversion")
Signed-off-by: Su Yanjun <suyj.fnst@cn.fujitsu.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:21 +02:00
Myungho Jung
70a8732702 xfrm: Reset secpath in xfrm failure
[ Upstream commit 6ed69184ed ]

In esp4_gro_receive() and esp6_gro_receive(), secpath can be allocated
without adding xfrm state to xvec. Then, sp->xvec[sp->len - 1] would
fail and result in dereferencing invalid pointer in esp4_gso_segment()
and esp6_gso_segment(). Reset secpath if xfrm function returns error.

Fixes: 7785bba299 ("esp: Add a software GRO codepath")
Reported-by: syzbot+b69368fd933c6c592f4c@syzkaller.appspotmail.com
Signed-off-by: Myungho Jung <mhjungk@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:21 +02:00
YueHaibing
9531aac1ee xfrm: policy: Fix out-of-bound array accesses in __xfrm_policy_unlink
[ Upstream commit b805d78d30 ]

UBSAN reports this:

UBSAN: Undefined behaviour in net/xfrm/xfrm_policy.c:1289:24
index 6 is out of range for type 'unsigned int [6]'
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.4.162-514.55.6.9.x86_64+ #13
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
 0000000000000000 1466cf39b41b23c9 ffff8801f6b07a58 ffffffff81cb35f4
 0000000041b58ab3 ffffffff83230f9c ffffffff81cb34e0 ffff8801f6b07a80
 ffff8801f6b07a20 1466cf39b41b23c9 ffffffff851706e0 ffff8801f6b07ae8
Call Trace:
 <IRQ>  [<ffffffff81cb35f4>] __dump_stack lib/dump_stack.c:15 [inline]
 <IRQ>  [<ffffffff81cb35f4>] dump_stack+0x114/0x1a0 lib/dump_stack.c:51
 [<ffffffff81d94225>] ubsan_epilogue+0x12/0x8f lib/ubsan.c:164
 [<ffffffff81d954db>] __ubsan_handle_out_of_bounds+0x16e/0x1b2 lib/ubsan.c:382
 [<ffffffff82a25acd>] __xfrm_policy_unlink+0x3dd/0x5b0 net/xfrm/xfrm_policy.c:1289
 [<ffffffff82a2e572>] xfrm_policy_delete+0x52/0xb0 net/xfrm/xfrm_policy.c:1309
 [<ffffffff82a3319b>] xfrm_policy_timer+0x30b/0x590 net/xfrm/xfrm_policy.c:243
 [<ffffffff813d3927>] call_timer_fn+0x237/0x990 kernel/time/timer.c:1144
 [<ffffffff813d8e7e>] __run_timers kernel/time/timer.c:1218 [inline]
 [<ffffffff813d8e7e>] run_timer_softirq+0x6ce/0xb80 kernel/time/timer.c:1401
 [<ffffffff8120d6f9>] __do_softirq+0x299/0xe10 kernel/softirq.c:273
 [<ffffffff8120e676>] invoke_softirq kernel/softirq.c:350 [inline]
 [<ffffffff8120e676>] irq_exit+0x216/0x2c0 kernel/softirq.c:391
 [<ffffffff82c5edab>] exiting_irq arch/x86/include/asm/apic.h:652 [inline]
 [<ffffffff82c5edab>] smp_apic_timer_interrupt+0x8b/0xc0 arch/x86/kernel/apic/apic.c:926
 [<ffffffff82c5c985>] apic_timer_interrupt+0xa5/0xb0 arch/x86/entry/entry_64.S:735
 <EOI>  [<ffffffff81188096>] ? native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:52
 [<ffffffff810834d7>] arch_safe_halt arch/x86/include/asm/paravirt.h:111 [inline]
 [<ffffffff810834d7>] default_idle+0x27/0x430 arch/x86/kernel/process.c:446
 [<ffffffff81085f05>] arch_cpu_idle+0x15/0x20 arch/x86/kernel/process.c:437
 [<ffffffff8132abc3>] default_idle_call+0x53/0x90 kernel/sched/idle.c:92
 [<ffffffff8132b32d>] cpuidle_idle_call kernel/sched/idle.c:156 [inline]
 [<ffffffff8132b32d>] cpu_idle_loop kernel/sched/idle.c:251 [inline]
 [<ffffffff8132b32d>] cpu_startup_entry+0x60d/0x9a0 kernel/sched/idle.c:299
 [<ffffffff8113e119>] start_secondary+0x3c9/0x560 arch/x86/kernel/smpboot.c:245

The issue is triggered as follows:

xfrm_add_policy
    -->verify_newpolicy_info  //check the index provided by user with XFRM_POLICY_MAX
			      //In my case, the index is 0x6E6BB6, so it passes the check.
    -->xfrm_policy_construct  //copy the user's policy and set xfrm_policy_timer
    -->xfrm_policy_insert
	--> __xfrm_policy_link //use the original dir, in my case it is 2
	--> xfrm_gen_index   //generate policy index, here it is 0x6E6BB6

Then xfrm_policy_timer fires:

xfrm_policy_timer
   --> xfrm_policy_id2dir  //get dir from (policy index & 7), in my case it is 6
   --> xfrm_policy_delete
      --> __xfrm_policy_unlink //access policy_count[dir], triggering the out-of-range access

Fix the issue by adding an xfrm_policy_id2dir check to
verify_newpolicy_info, making sure the computed dir is valid.
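
The arithmetic in the call flow above can be checked with a small standalone sketch. POLICY_MAX = 3 is an assumed value (consistent with the 'unsigned int [6]' array in the report), and old_index_ok() is one reading of the earlier check, not a copy of the kernel code.

```c
#define POLICY_MAX 3	/* assumed value; 2 * 3 matches 'unsigned int [6]' */

/* Direction derived from a policy index, as in the flow above. */
static unsigned int id2dir(unsigned int index)
{
	return index & 7;
}

/* Earlier check (as read from the description): only the bits
 * covered by POLICY_MAX are compared with dir. */
int old_index_ok(unsigned int index, unsigned int dir)
{
	return index == 0 || (index & POLICY_MAX) == dir;
}

/* Stricter check: the direction that will later be derived from
 * the index must equal the direction the policy is linked under. */
int new_index_ok(unsigned int index, unsigned int dir)
{
	return index == 0 || id2dir(index) == dir;
}
```

For index 0x6E6BB6 and dir 2 the earlier check accepts the policy (0x6E6BB6 & 3 is 2), while id2dir() later yields 6, one past the end of a six-element array; the stricter check rejects it up front.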

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: e682adf021 ("xfrm: Try to honor policy index if it's supplied by user")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-25 18:22:21 +02:00
Kirill Smelkov
07a573c046 fuse: Add FOPEN_STREAM to use stream_open()
commit bbd84f3365 upstream.

Starting from commit 9c225f2655 ("vfs: atomic f_pos accesses as per
POSIX") files opened even via nonseekable_open gate read and write via lock
and do not allow them to be run simultaneously. This can create read vs
write deadlock if a filesystem is trying to implement a socket-like file
which is intended to be simultaneously used for both read and write from
filesystem client.  See commit 10dce8af34 ("fs: stream_open - opener for
stream-like files so that read and write can run simultaneously without
deadlock") for details and e.g. commit 581d21a2d0 ("xenbus: fix deadlock
on writes to /proc/xen/xenbus") for a similar deadlock example on
/proc/xen/xenbus.

To avoid such deadlock it was tempting to adjust fuse_finish_open to use
stream_open instead of nonseekable_open on just FOPEN_NONSEEKABLE flags,
but grepping through Debian codesearch shows users of FOPEN_NONSEEKABLE,
and in particular GVFS which actually uses offset in its read and write
handlers

	https://codesearch.debian.net/search?q=-%3Enonseekable+%3D
	https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1080
	https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1247-1346
	https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1399-1481

so such a change would break a real user.

Add another flag (FOPEN_STREAM) for filesystem servers to indicate that the
opened handle has stream-like semantics: it does not use file position,
and thus the kernel is free to issue simultaneous read and write requests
on the opened file handle.

This patch together with stream_open() should be added to stable kernels
starting from v3.14+. This will allow patching OSSPD and other FUSE
filesystems that provide stream-like files to return FOPEN_STREAM |
FOPEN_NONSEEKABLE in open handler and this way avoid the deadlock on all
kernel versions. This should work because fuse_finish_open ignores unknown
open flags returned from a filesystem and so passing FOPEN_STREAM to a
kernel that is not aware of this flag cannot hurt. In turn the kernel that
is not aware of FOPEN_STREAM will be < v3.14 where just FOPEN_NONSEEKABLE
is sufficient to implement streams without read vs write deadlock.

Cc: stable@vger.kernel.org # v3.14+
Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-25 18:22:20 +02:00
Martin Wilck
560c6fd312 dm mpath: always free attached_handler_name in parse_path()
commit 940bc47178 upstream.

Commit b592211c33 ("dm mpath: fix attached_handler_name leak and
dangling hw_handler_name pointer") fixed a memory leak for the case
where setup_scsi_dh() returns failure. But setup_scsi_dh may return
success and not "use" attached_handler_name if the
retain_attached_hwhandler flag is not set on the map. As setup_scsi_dh
properly "steals" the pointer by nullifying it, freeing it
unconditionally in parse_path() is safe.

Fixes: b592211c33 ("dm mpath: fix attached_handler_name leak and dangling hw_handler_name pointer")
Cc: stable@vger.kernel.org
Reported-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-25 18:22:20 +02:00
Mikulas Patocka
96ecf4c59f dm integrity: correctly calculate the size of metadata area
commit 30bba430dd upstream.

When we use separate devices for data and metadata, dm-integrity would
incorrectly calculate the size of the metadata device as if it had
512-byte block size - and it would refuse activation with larger block
size and smaller metadata device.

Fix this so that it takes actual block size into account, which fixes
the following reported issue:
https://gitlab.com/cryptsetup/cryptsetup/issues/450

Fixes: 356d9d52e1 ("dm integrity: allow separate metadata device")
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-25 18:22:20 +02:00
Milan Broz
ecff1441aa dm crypt: move detailed message into debug level
commit 7a1cd7238f upstream.

The information about tag size should not be printed without debug info
set. Also print device major:minor in the error message to identify the
device instance.

Also use rate limiting and debug level for info about used crypto API
implementation.  This is important because during online reencryption
the existing message saturates syslog (because we are moving hotzone
across the whole device).

Cc: stable@vger.kernel.org
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-25 18:22:19 +02:00
Mikulas Patocka
862a78341a dm delay: fix a crash when invalid device is specified
commit 81bc6d150a upstream.

When the target line contains an invalid device, delay_ctr() will call
delay_dtr() with NULL workqueue.  Attempting to destroy the NULL
workqueue causes a crash.
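
The defensive teardown can be sketched as follows. struct workqueue, struct delay_ctx and destroy_wq() are hypothetical userspace stand-ins used only to show the NULL check, not the dm-delay code.

```c
#include <stdlib.h>

struct workqueue { int pending; };	/* hypothetical stand-in */
struct delay_ctx { struct workqueue *wq; };

/* Dereferences wq, so calling it with NULL would crash. */
static void destroy_wq(struct workqueue *wq)
{
	wq->pending = 0;
	free(wq);
}

/* Destructor that is safe to call from a constructor error path,
 * where the workqueue may not have been allocated yet. */
void delay_ctx_destroy(struct delay_ctx *dc)
{
	if (dc->wq)
		destroy_wq(dc->wq);
	dc->wq = NULL;
}
```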

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-25 18:22:19 +02:00