commit 0b9132ee742999aee13e6b22ef7723b6d4a0eaca Author: Greg Kroah-Hartman Date: Wed Apr 17 08:39:54 2019 +0200 Linux 5.0.8 commit eaa06ac5d1c1691dc4e5958f6097d49aba9dabe3 Author: Gerd Hoffmann Date: Fri Feb 8 15:04:09 2019 +0100 drm/virtio: do NOT reuse resource ids commit 16065fcdd19ddb9e093192914ac863884f308766 upstream. Bisected guest kernel changes crashing qemu. Landed at "6c1cd97bda drm/virtio: fix resource id handling". Looked again, and noticed we where not only leaking *some* ids, but *all* ids. The old code never ever called virtio_gpu_resource_id_put(). So, commit 6c1cd97bda effectively makes the linux kernel starting re-using IDs after releasing them, and apparently virglrenderer can't deal with that. Oops. This patch puts a temporary stopgap into place for the 5.0 release. Signed-off-by: Gerd Hoffmann Reviewed-by: Dave Airlie Signed-off-by: Dave Airlie Link: https://patchwork.freedesktop.org/patch/msgid/20190208140409.15280-1-kraxel@redhat.com Signed-off-by: Greg Kroah-Hartman commit 7af79a36d1881bdf54284c1a87318d772ebe6636 Author: Marc Orr Date: Mon Apr 1 23:56:00 2019 -0700 KVM: x86: nVMX: fix x2APIC VTPR read intercept commit c73f4c998e1fd4249b9edfa39e23f4fda2b9b041 upstream. Referring to the "VIRTUALIZING MSR-BASED APIC ACCESSES" chapter of the SDM, when "virtualize x2APIC mode" is 1 and "APIC-register virtualization" is 0, a RDMSR of 808H should return the VTPR from the virtual APIC page. However, for nested, KVM currently fails to disable the read intercept for this MSR. This means that a RDMSR exit takes precedence over "virtualize x2APIC mode", and KVM passes through L1's TPR to L2, instead of sourcing the value from L2's virtual APIC page. This patch fixes the issue by disabling the read intercept, in VMCS02, for the VTPR when "APIC-register virtualization" is 0. The issue described above and fix prescribed here, were verified with a related patch in kvm-unit-tests titled "Test VMX's virtualize x2APIC mode w/ nested". Signed-off-by: Marc Orr Reviewed-by: Jim Mattson Fixes: c992384bde84f ("KVM: vmx: speed up MSR bitmap merge") Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit b564364f22ac56f8be0ed7367fbebbe9df6d1569 Author: Marc Orr Date: Mon Apr 1 23:55:59 2019 -0700 KVM: x86: nVMX: close leak of L0's x2APIC MSRs (CVE-2019-3887) commit acff78477b9b4f26ecdf65733a4ed77fe837e9dc upstream. The nested_vmx_prepare_msr_bitmap() function doesn't directly guard the x2APIC MSR intercepts with the "virtualize x2APIC mode" MSR. As a result, we discovered the potential for a buggy or malicious L1 to get access to L0's x2APIC MSRs, via an L2, as follows. 1. L1 executes WRMSR(IA32_SPEC_CTRL, 1). This causes the spec_ctrl variable, in nested_vmx_prepare_msr_bitmap() to become true. 2. L1 disables "virtualize x2APIC mode" in VMCS12. 3. L1 enables "APIC-register virtualization" in VMCS12. Now, KVM will set VMCS02's x2APIC MSR intercepts from VMCS12, and then set "virtualize x2APIC mode" to 0 in VMCS02. Oops. This patch closes the leak by explicitly guarding VMCS02's x2APIC MSR intercepts with VMCS12's "virtualize x2APIC mode" control. The scenario outlined above and fix prescribed here, were verified with a related patch in kvm-unit-tests titled "Add leak scenario to virt_x2apic_mode_test". Note, it looks like this issue may have been introduced inadvertently during a merge---see 15303ba5d1cd. Signed-off-by: Marc Orr Reviewed-by: Jim Mattson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 63bec9219c398a80a78608174225c16940fdf0d0 Author: Mikulas Patocka Date: Fri Apr 5 15:26:39 2019 -0400 dm integrity: fix deadlock with overlapping I/O commit 4ed319c6ac08e9a28fca7ac188181ac122f4de84 upstream. dm-integrity will deadlock if overlapping I/O is issued to it, the bug was introduced by commit 724376a04d1a ("dm integrity: implement fair range locks"). Users rarely use overlapping I/O so this bug went undetected until now. Fix this bug by correcting, likely cut-n-paste, typos in ranges_overlap() and also remove a flawed ranges_overlap() check in remove_range_unlocked(). This condition could leave unprocessed bios hanging on wait_list forever. Cc: stable@vger.kernel.org # v4.19+ Fixes: 724376a04d1a ("dm integrity: implement fair range locks") Signed-off-by: Mikulas Patocka Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit de022a3453e67105ce587f3971ecb5b880f4b609 Author: Mike Snitzer Date: Wed Apr 3 12:23:11 2019 -0400 dm: disable DISCARD if the underlying storage no longer supports it commit bcb44433bba5eaff293888ef22ffa07f1f0347d6 upstream. Storage devices which report supporting discard commands like WRITE_SAME_16 with unmap, but reject discard commands sent to the storage device. This is a clear storage firmware bug but it doesn't change the fact that should a program cause discards to be sent to a multipath device layered on this buggy storage, all paths can end up failed at the same time from the discards, causing possible I/O loss. The first discard to a path will fail with Illegal Request, Invalid field in cdb, e.g.: kernel: sd 8:0:8:19: [sdfn] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE kernel: sd 8:0:8:19: [sdfn] tag#0 Sense Key : Illegal Request [current] kernel: sd 8:0:8:19: [sdfn] tag#0 Add. Sense: Invalid field in cdb kernel: sd 8:0:8:19: [sdfn] tag#0 CDB: Write same(16) 93 08 00 00 00 00 00 a0 08 00 00 00 80 00 00 00 kernel: blk_update_request: critical target error, dev sdfn, sector 10487808 The SCSI layer converts this to the BLK_STS_TARGET error number, the sd device disables its support for discard on this path, and because of the BLK_STS_TARGET error multipath fails the discard without failing any path or retrying down a different path. But subsequent discards can cause path failures. Any discards sent to the path which already failed a discard ends up failing with EIO from blk_cloned_rq_check_limits with an "over max size limit" error since the discard limit was set to 0 by the sd driver for the path. As the error is EIO, this now fails the path and multipath tries to send the discard down the next path. This cycle continues as discards are sent until all paths fail. Fix this by training DM core to disable DISCARD if the underlying storage already did so. Also, fix branching in dm_done() and clone_endio() to reflect the mutually exclussive nature of the IO operations in question. Cc: stable@vger.kernel.org Reported-by: David Jeffery Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit ca7671084384cba2e27de2aaa8c4ea4aa1daa86d Author: Ilya Dryomov Date: Tue Mar 26 20:20:58 2019 +0100 dm table: propagate BDI_CAP_STABLE_WRITES to fix sporadic checksum errors commit eb40c0acdc342b815d4d03ae6abb09e80c0f2988 upstream. Some devices don't use blk_integrity but still want stable pages because they do their own checksumming. Examples include rbd and iSCSI when data digests are negotiated. Stacking DM (and thus LVM) on top of these devices results in sporadic checksum errors. Set BDI_CAP_STABLE_WRITES if any underlying device has it set. Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 50b2e5c3b0a8a45c93354da2563e4241938effc6 Author: Mikulas Patocka Date: Thu Mar 21 16:46:12 2019 -0400 dm: revert 8f50e358153d ("dm: limit the max bio size as BIO_MAX_PAGES * PAGE_SIZE") commit 75ae193626de3238ca5fb895868ec91c94e63b1b upstream. The limit was already incorporated to dm-crypt with commit 4e870e948fba ("dm crypt: fix error with too large bios"), so we don't need to apply it globally to all targets. The quantity BIO_MAX_PAGES * PAGE_SIZE is wrong anyway because the variable ti->max_io_len it is supposed to be in the units of 512-byte sectors not in bytes. Reduction of the limit to 1048576 sectors could even cause data corruption in rare cases - suppose that we have a dm-striped device with stripe size 768MiB. The target will call dm_set_target_max_io_len with the value 1572864. The buggy code would reduce it to 1048576. Now, the dm-core will errorneously split the bios on 1048576-sector boundary insetad of 1572864-sector boundary and pass these stripe-crossing bios to the striped target. Cc: stable@vger.kernel.org # v4.16+ Fixes: 8f50e358153d ("dm: limit the max bio size as BIO_MAX_PAGES * PAGE_SIZE") Signed-off-by: Mikulas Patocka Acked-by: Ming Lei Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 556b7d910d5de6b1608250c14f9f6694ec3e2628 Author: Mikulas Patocka Date: Wed Mar 13 07:56:02 2019 -0400 dm integrity: change memcmp to strncmp in dm_integrity_ctr commit 0d74e6a3b6421d98eeafbed26f29156d469bc0b5 upstream. If the string opt_string is small, the function memcmp can access bytes that are beyond the terminating nul character. In theory, it could cause segfault, if opt_string were located just below some unmapped memory. Change from memcmp to strncmp so that we don't read bytes beyond the end of the string. Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Mikulas Patocka Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 78dbc2482a7810c6db544c31ee1bc601b4a1cfc8 Author: Nicholas Piggin Date: Fri Mar 29 17:42:57 2019 +1000 powerpc/64s/radix: Fix radix segment exception handling commit 7100e8704b61247649c50551b965e71d168df30b upstream. Commit 48e7b76957 ("powerpc/64s/hash: Convert SLB miss handlers to C") broke the radix-mode segment exception handler. In radix mode, this is exception is not an SLB miss, rather it signals that the EA is outside the range translated by any page table. The commit lost the radix feature alternate code patch, which can cause faults to some EAs to kernel BUG at arch/powerpc/mm/slb.c:639! The original radix code would send faults to slb_miss_large_addr, which would end up faulting due to slb_addr_limit being 0. This patch sends radix directly to do_bad_slb_fault, which is a bit clearer. Fixes: 48e7b7695745 ("powerpc/64s/hash: Convert SLB miss handlers to C") Cc: stable@vger.kernel.org # v4.20+ Reported-by: Anton Blanchard Signed-off-by: Nicholas Piggin Reviewed-by: Aneesh Kumar K.V Signed-off-by: Michael Ellerman Signed-off-by: Greg Kroah-Hartman commit 49558542e0eb25cbd6467bb0aef12c474c145713 Author: Chuck Lever Date: Tue Apr 9 17:04:09 2019 -0400 xprtrdma: Fix helper that drains the transport commit e1ede312f17e96a9c5cda9aaa1cdcf442c1a5da8 upstream. We want to drain only the RQ first. Otherwise the transport can deadlock on ->close if there are outstanding Send completions. Fixes: 6d2d0ee27c7a ("xprtrdma: Replace rpcrdma_receive_wq ... ") Signed-off-by: Chuck Lever Cc: stable@vger.kernel.org # v5.0+ Signed-off-by: Trond Myklebust Signed-off-by: Greg Kroah-Hartman commit 8af91139a0a8297557375f08320a89c7d2c87310 Author: Sergey Miroshnichenko Date: Tue Mar 12 15:05:48 2019 +0300 PCI: pciehp: Ignore Link State Changes after powering off a slot commit 3943af9d01e94330d0cfac6fccdbc829aad50c92 upstream. During a safe hot remove, the OS powers off the slot, which may cause a Data Link Layer State Changed event. The slot has already been set to OFF_STATE, so that event results in re-enabling the device, making it impossible to safely remove it. Clear out the Presence Detect Changed and Data Link Layer State Changed events when the disabled slot has settled down. It is still possible to re-enable the device if it remains in the slot after pressing the Attention Button by pressing it again. Fixes the problem that Micah reported below: an NVMe drive power button may not actually turn off the drive. Link: https://bugzilla.kernel.org/show_bug.cgi?id=203237 Reported-by: Micah Parrish Tested-by: Micah Parrish Signed-off-by: Sergey Miroshnichenko [bhelgaas: changelog, add bugzilla URL] Signed-off-by: Bjorn Helgaas Reviewed-by: Lukas Wunner Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Greg Kroah-Hartman commit 9b63917c6a4c5576e5fc84aaf0a3d8915035cf95 Author: Andre Przywara Date: Fri Apr 5 16:20:47 2019 +0100 PCI: Add function 1 DMA alias quirk for Marvell 9170 SATA controller commit 9cde402a59770a0669d895399c13407f63d7d209 upstream. There is a Marvell 88SE9170 PCIe SATA controller I found on a board here. Some quick testing with the ARM SMMU enabled reveals that it suffers from the same requester ID mixup problems as the other Marvell chips listed already. Add the PCI vendor/device ID to the list of chips which need the workaround. Signed-off-by: Andre Przywara Signed-off-by: Bjorn Helgaas CC: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 025768c171b8a428362a51a437ac43daf16e6735 Author: Lendacky, Thomas Date: Tue Apr 2 15:21:18 2019 +0000 x86/perf/amd: Remove need to check "running" bit in NMI handler commit 3966c3feca3fd10b2935caa0b4a08c7dd59469e5 upstream. Spurious interrupt support was added to perf in the following commit, almost a decade ago: 63e6be6d98e1 ("perf, x86: Catch spurious interrupts after disabling counters") The two previous patches (resolving the race condition when disabling a PMC and NMI latency mitigation) allow for the removal of this older spurious interrupt support. Currently in x86_pmu_stop(), the bit for the PMC in the active_mask bitmap is cleared before disabling the PMC, which sets up a race condition. This race condition was mitigated by introducing the running bitmap. That race condition can be eliminated by first disabling the PMC, waiting for PMC reset on overflow and then clearing the bit for the PMC in the active_mask bitmap. The NMI handler will not re-enable a disabled counter. If x86_pmu_stop() is called from the perf NMI handler, the NMI latency mitigation support will guard against any unhandled NMI messages. Signed-off-by: Tom Lendacky Signed-off-by: Peter Zijlstra (Intel) Cc: # 4.14.x- Cc: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Arnaldo Carvalho de Melo Cc: Borislav Petkov Cc: Jiri Olsa Cc: Linus Torvalds Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Stephane Eranian Cc: Thomas Gleixner Cc: Vince Weaver Link: https://lkml.kernel.org/r/Message-ID: Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit ecb09f75a9b98ab26a91f5024ce45125c49daddb Author: Lendacky, Thomas Date: Tue Apr 2 15:21:16 2019 +0000 x86/perf/amd: Resolve NMI latency issues for active PMCs commit 6d3edaae16c6c7d238360f2841212c2b26774d5e upstream. On AMD processors, the detection of an overflowed PMC counter in the NMI handler relies on the current value of the PMC. So, for example, to check for overflow on a 48-bit counter, bit 47 is checked to see if it is 1 (not overflowed) or 0 (overflowed). When the perf NMI handler executes it does not know in advance which PMC counters have overflowed. As such, the NMI handler will process all active PMC counters that have overflowed. NMI latency in newer AMD processors can result in multiple overflowed PMC counters being processed in one NMI and then a subsequent NMI, that does not appear to be a back-to-back NMI, not finding any PMC counters that have overflowed. This may appear to be an unhandled NMI resulting in either a panic or a series of messages, depending on how the kernel was configured. To mitigate this issue, add an AMD handle_irq callback function, amd_pmu_handle_irq(), that will invoke the common x86_pmu_handle_irq() function and upon return perform some additional processing that will indicate if the NMI has been handled or would have been handled had an earlier NMI not handled the overflowed PMC. Using a per-CPU variable, a minimum value of the number of active PMCs or 2 will be set whenever a PMC is active. This is used to indicate the possible number of NMIs that can still occur. The value of 2 is used for when an NMI does not arrive at the LAPIC in time to be collapsed into an already pending NMI. Each time the function is called without having handled an overflowed counter, the per-CPU value is checked. If the value is non-zero, it is decremented and the NMI indicates that it handled the NMI. If the value is zero, then the NMI indicates that it did not handle the NMI. Signed-off-by: Tom Lendacky Signed-off-by: Peter Zijlstra (Intel) Cc: # 4.14.x- Cc: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Arnaldo Carvalho de Melo Cc: Borislav Petkov Cc: Jiri Olsa Cc: Linus Torvalds Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Stephane Eranian Cc: Thomas Gleixner Cc: Vince Weaver Link: https://lkml.kernel.org/r/Message-ID: Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit c583b4d08f3a7fc4aa49804c7943df0e8c0000e0 Author: Lendacky, Thomas Date: Tue Apr 2 15:21:14 2019 +0000 x86/perf/amd: Resolve race condition when disabling PMC commit 914123fa39042e651d79eaf86bbf63a1b938dddf upstream. On AMD processors, the detection of an overflowed counter in the NMI handler relies on the current value of the counter. So, for example, to check for overflow on a 48 bit counter, bit 47 is checked to see if it is 1 (not overflowed) or 0 (overflowed). There is currently a race condition present when disabling and then updating the PMC. Increased NMI latency in newer AMD processors makes this race condition more pronounced. If the counter value has overflowed, it is possible to update the PMC value before the NMI handler can run. The updated PMC value is not an overflowed value, so when the perf NMI handler does run, it will not find an overflowed counter. This may appear as an unknown NMI resulting in either a panic or a series of messages, depending on how the kernel is configured. To eliminate this race condition, the PMC value must be checked after disabling the counter. Add an AMD function, amd_pmu_disable_all(), that will wait for the NMI handler to reset any active and overflowed counter after calling x86_pmu_disable_all(). Signed-off-by: Tom Lendacky Signed-off-by: Peter Zijlstra (Intel) Cc: # 4.14.x- Cc: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Arnaldo Carvalho de Melo Cc: Borislav Petkov Cc: Jiri Olsa Cc: Linus Torvalds Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Stephane Eranian Cc: Thomas Gleixner Cc: Vince Weaver Link: https://lkml.kernel.org/r/Message-ID: Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 8b866ffe3d311fbb2241177a3e8fbfc020384e57 Author: Alexander Potapenko Date: Tue Apr 2 13:28:13 2019 +0200 x86/asm: Use stricter assembly constraints in bitops commit 5b77e95dd7790ff6c8fbf1cd8d0104ebed818a03 upstream. There's a number of problems with how arch/x86/include/asm/bitops.h is currently using assembly constraints for the memory region bitops are modifying: 1) Use memory clobber in bitops that touch arbitrary memory Certain bit operations that read/write bits take a base pointer and an arbitrarily large offset to address the bit relative to that base. Inline assembly constraints aren't expressive enough to tell the compiler that the assembly directive is going to touch a specific memory location of unknown size, therefore we have to use the "memory" clobber to indicate that the assembly is going to access memory locations other than those listed in the inputs/outputs. To indicate that BTR/BTS instructions don't necessarily touch the first sizeof(long) bytes of the argument, we also move the address to assembly inputs. This particular change leads to size increase of 124 kernel functions in a defconfig build. For some of them the diff is in NOP operations, other end up re-reading values from memory and may potentially slow down the execution. But without these clobbers the compiler is free to cache the contents of the bitmaps and use them as if they weren't changed by the inline assembly. 2) Use byte-sized arguments for operations touching single bytes. Passing a long value to ANDB/ORB/XORB instructions makes the compiler treat sizeof(long) bytes as being clobbered, which isn't the case. This may theoretically lead to worse code in the case of heavy optimization. Practical impact: I've built a defconfig kernel and looked through some of the functions generated by GCC 7.3.0 with and without this clobber, and didn't spot any miscompilations. However there is a (trivial) theoretical case where this code leads to miscompilation: https://lkml.org/lkml/2019/3/28/393 using just GCC 8.3.0 with -O2. It isn't hard to imagine someone writes such a function in the kernel someday. So the primary motivation is to fix an existing misuse of the asm directive, which happens to work in certain configurations now, but isn't guaranteed to work under different circumstances. [ --mingo: Added -stable tag because defconfig only builds a fraction of the kernel and the trivial testcase looks normal enough to be used in existing or in-development code. ] Signed-off-by: Alexander Potapenko Cc: Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Denys Vlasenko Cc: Dmitry Vyukov Cc: H. Peter Anvin Cc: James Y Knight Cc: Linus Torvalds Cc: Paul E. McKenney Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20190402112813.193378-1-glider@google.com [ Edited the changelog, tidied up one of the defines. ] Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 3783a3b1e2189efb442cb7da828ca1111bd392fc Author: Rasmus Villemoes Date: Fri Jan 11 09:49:30 2019 +0100 x86/asm: Remove dead __GNUC__ conditionals commit 88ca66d8540ca26119b1428cddb96b37925bdf01 upstream. The minimum supported gcc version is >= 4.6, so these can be removed. Signed-off-by: Rasmus Villemoes Signed-off-by: Borislav Petkov Cc: "H. Peter Anvin" Cc: Dan Williams Cc: Geert Uytterhoeven Cc: Ingo Molnar Cc: Matthew Wilcox Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: x86-ml Link: https://lkml.kernel.org/r/20190111084931.24601-1-linux@rasmusvillemoes.dk Signed-off-by: Greg Kroah-Hartman commit 5866b5fc2b8a8fbe59f720fe5f2124e8df977e91 Author: Dmitry V. Levin Date: Fri Mar 29 20:12:30 2019 +0300 csky: Fix syscall_get_arguments() and syscall_set_arguments() commit ed3bb007021b9bddb90afae28a19f08ed8890add upstream. C-SKY syscall arguments are located in orig_a0,a1,a2,a3,regs[0],regs[1] fields of struct pt_regs. Due to an off-by-one bug and a bug in pointer arithmetic syscall_get_arguments() was reading orig_a0,regs[1..5] fields instead. Likewise, syscall_set_arguments() was writing orig_a0,regs[1..5] fields instead. Link: http://lkml.kernel.org/r/20190329171230.GB32456@altlinux.org Fixes: 4859bfca11c7d ("csky: System Call") Cc: Ingo Molnar Cc: Kees Cook Cc: Andy Lutomirski Cc: Will Drewry Cc: stable@vger.kernel.org # v4.20+ Tested-by: Guo Ren Acked-by: Guo Ren Signed-off-by: Dmitry V. Levin Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit b66f9a1ea65a15a39f502d8d3220d2b8aea825cb Author: Max Filippov Date: Thu Apr 4 11:08:40 2019 -0700 xtensa: fix return_address commit ada770b1e74a77fff2d5f539bf6c42c25f4784db upstream. return_address returns the address that is one level higher in the call stack than requested in its argument, because level 0 corresponds to its caller's return address. Use requested level as the number of stack frames to skip. This fixes the address reported by might_sleep and friends. Cc: stable@vger.kernel.org Signed-off-by: Max Filippov Signed-off-by: Greg Kroah-Hartman commit e09deff8f8f4dc7227cac9a3d759e9806f8a0773 Author: Mel Gorman Date: Tue Mar 19 12:36:10 2019 +0000 sched/fair: Do not re-read ->h_load_next during hierarchical load calculation commit 0e9f02450da07fc7b1346c8c32c771555173e397 upstream. A NULL pointer dereference bug was reported on a distribution kernel but the same issue should be present on mainline kernel. It occured on s390 but should not be arch-specific. A partial oops looks like: Unable to handle kernel pointer dereference in virtual kernel address space ... Call Trace: ... try_to_wake_up+0xfc/0x450 vhost_poll_wakeup+0x3a/0x50 [vhost] __wake_up_common+0xbc/0x178 __wake_up_common_lock+0x9e/0x160 __wake_up_sync_key+0x4e/0x60 sock_def_readable+0x5e/0x98 The bug hits any time between 1 hour to 3 days. The dereference occurs in update_cfs_rq_h_load when accumulating h_load. The problem is that cfq_rq->h_load_next is not protected by any locking and can be updated by parallel calls to task_h_load. Depending on the compiler, code may be generated that re-reads cfq_rq->h_load_next after the check for NULL and then oops when reading se->avg.load_avg. The dissassembly showed that it was possible to reread h_load_next after the check for NULL. While this does not appear to be an issue for later compilers, it's still an accident if the correct code is generated. Full locking in this path would have high overhead so this patch uses READ_ONCE to read h_load_next only once and check for NULL before dereferencing. It was confirmed that there were no further oops after 10 days of testing. As Peter pointed out, it is also necessary to use WRITE_ONCE() to avoid any potential problems with store tearing. Signed-off-by: Mel Gorman Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Valentin Schneider Cc: Linus Torvalds Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Fixes: 685207963be9 ("sched: Move h_load calculation to task_h_load()") Link: https://lkml.kernel.org/r/20190319123610.nsivgf3mjbjjesxb@techsingularity.net Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 05acf6f5b81c11ba69764417d40f32e66d3a4eb1 Author: Dan Carpenter Date: Thu Apr 4 18:12:17 2019 +0300 xen: Prevent buffer overflow in privcmd ioctl commit 42d8644bd77dd2d747e004e367cb0c895a606f39 upstream. The "call" variable comes from the user in privcmd_ioctl_hypercall(). It's an offset into the hypercall_page[] which has (PAGE_SIZE / 32) elements. We need to put an upper bound on it to prevent an out of bounds access. Cc: stable@vger.kernel.org Fixes: 1246ae0bb992 ("xen: add variable hypercall caller") Signed-off-by: Dan Carpenter Reviewed-by: Boris Ostrovsky Signed-off-by: Juergen Gross Signed-off-by: Greg Kroah-Hartman commit e0e77b23a73d07d5cb9ad1ba970c420e7883e9d0 Author: Moni Shoua Date: Tue Mar 19 11:24:36 2019 +0200 IB/mlx5: Reset access mask when looping inside page fault handler commit 1abe186ed8a6593069bc122da55fc684383fdc1c upstream. If page-fault handler spans multiple MRs then the access mask needs to be reset before each MR handling or otherwise write access will be granted to mapped pages instead of read-only. Cc: # 3.19 Fixes: 7bdf65d411c1 ("IB/mlx5: Handle page faults") Reported-by: Jerome Glisse Signed-off-by: Moni Shoua Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit 6b3b09cd4daa81bab35375f54c4548dea930c238 Author: Ard Biesheuvel Date: Sun Apr 7 21:06:16 2019 +0200 arm64/ftrace: fix inadvertent BUG() in trampoline check commit 5a3ae7b314a2259b1188b22b392f5eba01e443ee upstream. The ftrace trampoline code (which deals with modules loaded out of BL range of the core kernel) uses plt_entries_equal() to check whether the per-module trampoline equals a zero buffer, to decide whether the trampoline has already been initialized. This triggers a BUG() in the opcode manipulation code, since we end up checking the ADRP offset of a 0x0 opcode, which is not an ADRP instruction. So instead, add a helper to check whether a PLT is initialized, and call that from the frace code. Cc: # v5.0 Fixes: bdb85cd1d206 ("arm64/module: switch to ADRP/ADD sequences for PLT entries") Acked-by: Mark Rutland Signed-off-by: Ard Biesheuvel Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit 36078cae3790ac638013a43a93542a729082272c Author: Will Deacon Date: Mon Apr 8 17:56:34 2019 +0100 arm64: backtrace: Don't bother trying to unwind the userspace stack commit 1e6f5440a6814d28c32d347f338bfef68bc3e69d upstream. Calling dump_backtrace() with a pt_regs argument corresponding to userspace doesn't make any sense and our unwinder will simply print "Call trace:" before unwinding the stack looking for user frames. Rather than go through this song and dance, just return early if we're passed a user register state. Cc: Fixes: 1149aad10b1e ("arm64: Add dump_backtrace() in show_regs") Reported-by: Kefeng Wang Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit de2e5ed04711e45fca6b9b24b0ad893e009692f0 Author: Peter Geis Date: Wed Mar 13 18:45:36 2019 +0000 arm64: dts: rockchip: fix rk3328 rgmii high tx error rate commit 6fd8b9780ec1a49ac46e0aaf8775247205e66231 upstream. Several rk3328 based boards experience high rgmii tx error rates. This is due to several pins in the rk3328.dtsi rgmii pinmux that are missing a defined pull strength setting. This causes the pinmux driver to default to 2ma (bit mask 00). These pins are only defined in the rk3328.dtsi, and are not listed in the rk3328 specification. The TRM only lists them as "Reserved" (RK3328 TRM V1.1, 3.3.3 Detail Register Description, GRF_GPIO0B_IOMUX, GRF_GPIO0C_IOMUX, GRF_GPIO0D_IOMUX). However, removal of these pins from the rgmii pinmux definition causes the interface to fail to transmit. Also, the rgmii tx and rx pins defined in the dtsi are not consistent with the rk3328 specification, with tx pins currently set to 12ma and rx pins set to 2ma. Fix this by setting tx pins to 8ma and the rx pins to 4ma, consistent with the specification. Defining the drive strength for the undefined pins eliminated the high tx packet error rate observed under heavy data transfers. Aligning the drive strength to the TRM values eliminated the occasional packet retry errors under iperf3 testing. This allows much higher data rates with no recorded tx errors. Tested on the rk3328-roc-cc board. Fixes: 52e02d377a72 ("arm64: dts: rockchip: add core dtsi file for RK3328 SoCs") Cc: stable@vger.kernel.org Signed-off-by: Peter Geis Signed-off-by: Heiko Stuebner Signed-off-by: Greg Kroah-Hartman commit 7fcf2d915fa0893c20d02fac224e77ec024a7ab5 Author: Tomohiro Mayama Date: Sun Mar 10 01:10:12 2019 +0900 arm64: dts: rockchip: Fix vcc_host1_5v GPIO polarity on rk3328-rock64 commit a8772e5d826d0f61f8aa9c284b3ab49035d5273d upstream. This patch makes USB ports functioning again. Fixes: 955bebde057e ("arm64: dts: rockchip: add rk3328-rock64 board") Cc: stable@vger.kernel.org Suggested-by: Robin Murphy Signed-off-by: Tomohiro Mayama Tested-by: Katsuhiro Suzuki Signed-off-by: Heiko Stuebner Signed-off-by: Greg Kroah-Hartman commit 68a6a619ebd7134030d59451aa83ee49b3f6e92b Author: Will Deacon Date: Mon Apr 8 12:45:09 2019 +0100 arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value commit 045afc24124d80c6998d9c770844c67912083506 upstream. Rather embarrassingly, our futex() FUTEX_WAKE_OP implementation doesn't explicitly set the return value on the non-faulting path and instead leaves it holding the result of the underlying atomic operation. This means that any FUTEX_WAKE_OP atomic operation which computes a non-zero value will be reported as having failed. Regrettably, I wrote the buggy code back in 2011 and it was upstreamed as part of the initial arm64 support in 2012. The reasons we appear to get away with this are: 1. FUTEX_WAKE_OP is rarely used and therefore doesn't appear to get exercised by futex() test applications 2. If the result of the atomic operation is zero, the system call behaves correctly 3. Prior to version 2.25, the only operation used by GLIBC set the futex to zero, and therefore worked as expected. From 2.25 onwards, FUTEX_WAKE_OP is not used by GLIBC at all. Fix the implementation by ensuring that the return value is either 0 to indicate that the atomic operation completed successfully, or -EFAULT if we encountered a fault when accessing the user mapping. Cc: Fixes: 6170a97460db ("arm64: Atomic operations") Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit b0266ece310dae6c92da549685c600de9dbd2aa5 Author: David Engraf Date: Mon Mar 11 08:57:42 2019 +0100 ARM: dts: at91: Fix typo in ISC_D0 on PC9 commit e7dfb6d04e4715be1f3eb2c60d97b753fd2e4516 upstream. The function argument for the ISC_D0 on PC9 was incorrect. According to the documentation it should be 'C' aka 3. Signed-off-by: David Engraf Reviewed-by: Nicolas Ferre Signed-off-by: Ludovic Desroches Fixes: 7f16cb676c00 ("ARM: at91/dt: add sama5d2 pinmux") Cc: # v4.4+ Signed-off-by: Greg Kroah-Hartman commit a005242834ac42eb8fe150deaae2ab984ac208c3 Author: David Summers Date: Sat Mar 9 15:39:21 2019 +0000 ARM: dts: rockchip: Fix SD card detection on rk3288-tinker commit 8dbc4d5ddb59f49cb3e85bccf42a4720b27a6576 upstream. The Problem: On ASUS Tinker Board S, when booting from the eMMC, and there is card in the sd slot, there are constant errors. Also when warm reboot, uboot can not access the sd slot Cause: Identified by Robin Murphy @ ARM. The Card Detect on rk3288 devices is pulled up by vccio-sd; so when the regulator powers this off, card detect gives spurious errors. A second problem, is during power down, vccio-sd apprears to be powered down. This causes a problem when warm rebooting from the sd card. This was identified by Jonas Karlman. History: A common fault on these rk3288 board, which impliment the reference design. When this arose before: http://lists.infradead.org/pipermail/linux-arm-kernel/2014-August/281153.html And Ulf and Jaehoon clearly said this was a broken card detect design, which should be solved via polling Solution: Hence broken-cd is set as a property. This cures the errors. The powering down of vccio-sd during reboot is cured by adding regulator-boot-on. This solutions has been fairly widely reviewed and tested. Fixes: e58c5e739d6f ("ARM: dts: rockchip: move shared tinker-board nodes to a common dtsi") Cc: stable@vger.kernel.org [Heiko: slightly inaccurate fixes but tinker is a sbc (aka like a Pi) where we can hopefully expect people not to rely on overly old stable kernels] Signed-off-by: David Summers Reviewed-by: Jonas Karlman Tested-by: Jonas Karlman Reviewed-by: Robin Murphy Signed-off-by: Heiko Stuebner Signed-off-by: Greg Kroah-Hartman commit e74aa76752f403a993cc5749d4d613aecf61bf42 Author: Peter Ujfalusi Date: Fri Mar 15 12:59:09 2019 +0200 ARM: dts: am335x-evm: Correct the regulators for the audio codec commit 4f96dc0a3e79ec257a2b082dab3ee694ff88c317 upstream. Correctly map the regulators used by tlv320aic3106. Both 1.8V and 3.3V for the codec is derived from VBAT via fixed regulators. Cc: # v4.14+ Signed-off-by: Peter Ujfalusi Signed-off-by: Tony Lindgren Signed-off-by: Greg Kroah-Hartman commit 724d26349abfa9f816a2a989359d3a8d64f2d000 Author: Peter Ujfalusi Date: Fri Mar 15 12:59:17 2019 +0200 ARM: dts: am335x-evmsk: Correct the regulators for the audio codec commit 6691370646e844be98bb6558c024269791d20bd7 upstream. Correctly map the regulators used by tlv320aic3106. Both 1.8V and 3.3V for the codec is derived from VBAT via fixed regulators. Cc: # v4.14+ Signed-off-by: Peter Ujfalusi Signed-off-by: Tony Lindgren Signed-off-by: Greg Kroah-Hartman commit 4e34e23d570835fe61f69f3bd573432de0365fce Author: Jonas Karlman Date: Sun Feb 24 21:51:22 2019 +0000 ARM: dts: rockchip: fix rk3288 cpu opp node reference commit 6b2fde3dbfab6ebc45b0cd605e17ca5057ff9a3b upstream. The following error can be seen during boot: of: /cpus/cpu@501: Couldn't find opp node Change cpu nodes to use operating-points-v2 in order to fix this. Fixes: ce76de984649 ("ARM: dts: rockchip: convert rk3288 to operating-points-v2") Cc: stable@vger.kernel.org Signed-off-by: Jonas Karlman Signed-off-by: Heiko Stuebner Signed-off-by: Greg Kroah-Hartman commit f04200259be82b8d06e2df91daa176227d6af424 Author: Janusz Krzysztofik Date: Tue Mar 19 21:19:52 2019 +0100 ARM: OMAP1: ams-delta: Fix broken GPIO ID allocation commit 3e2cf62efec52fb49daed437cc486c3cb9a0afa2 upstream. In order to request dynamic allocationn of GPIO IDs, a negative number should be passed as a base GPIO ID via platform data. Unfortuntely, commit 771e53c4d1a1 ("ARM: OMAP1: ams-delta: Drop board specific global GPIO numbers") didn't follow that rule while switching to dynamically allocated GPIO IDs for Amstrad Delta latches, making their IDs overlapping with those already assigned to OMAP GPIO devices. Fix it. Fixes: 771e53c4d1a1 ("ARM: OMAP1: ams-delta: Drop board specific global GPIO numbers") Signed-off-by: Janusz Krzysztofik Cc: stable@vger.kernel.org Acked-by: Aaro Koskinen Signed-off-by: Tony Lindgren Signed-off-by: Greg Kroah-Hartman commit 4e6f0d588cd2c46e1f14146434d7f9b6381aa425 Author: Jani Nikula Date: Fri Apr 5 10:52:20 2019 +0300 drm/i915/dp: revert back to max link rate and lane count on eDP commit 21635d7311734d2d1b177f8a95e2f9386174b76d upstream. Commit 7769db588384 ("drm/i915/dp: optimize eDP 1.4+ link config fast and narrow") started to optize the eDP 1.4+ link config, both per spec and as preparation for display stream compression support. Sadly, we again face panels that flat out fail with parameters they claim to support. Revert, and go back to the drawing board. v2: Actually revert to max params instead of just wide-and-slow. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109959 Fixes: 7769db588384 ("drm/i915/dp: optimize eDP 1.4+ link config fast and narrow") Cc: Ville Syrjälä Cc: Manasi Navare Cc: Rodrigo Vivi Cc: Matt Atwood Cc: "Lee, Shawn C" Cc: Dave Airlie Cc: intel-gfx@lists.freedesktop.org Cc: # v5.0+ Reviewed-by: Rodrigo Vivi Reviewed-by: Manasi Navare Tested-by: Albert Astals Cid # v5.0 backport Tested-by: Emanuele Panigati # v5.0 backport Tested-by: Matteo Iervasi # v5.0 backport Signed-off-by: Jani Nikula Link: https://patchwork.freedesktop.org/patch/msgid/20190405075220.9815-1-jani.nikula@intel.com (cherry picked from commit f11cb1c19ad0563b3c1ea5eb16a6bac0e401f428) Signed-off-by: Rodrigo Vivi Signed-off-by: Greg Kroah-Hartman commit 88fa815395e3e082be1961fdf6c606ba1e23b342 Author: Cornelia Huck Date: Mon Apr 8 14:33:22 2019 +0200 virtio: Honour 'may_reduce_num' in vring_create_virtqueue commit cf94db21905333e610e479688add629397a4b384 upstream. vring_create_virtqueue() allows the caller to specify via the may_reduce_num parameter whether the vring code is allowed to allocate a smaller ring than specified. However, the split ring allocation code tries to allocate a smaller ring on allocation failure regardless of what the caller specified. This may cause trouble for e.g. virtio-pci in legacy mode, which does not support ring resizing. (The packed ring code does not resize in any case.) Let's fix this by bailing out immediately in the split ring code if the requested size cannot be allocated and may_reduce_num has not been specified. While at it, fix a typo in the usage instructions. Fixes: 2a2d1382fe9d ("virtio: Add improved queue allocation API") Cc: stable@vger.kernel.org # v4.6+ Signed-off-by: Cornelia Huck Signed-off-by: Michael S. Tsirkin Reviewed-by: Halil Pasic Reviewed-by: Jens Freimann Signed-off-by: Greg Kroah-Hartman commit ec64558908d78a56422f72633151eb9a3b7953a9 Author: Kefeng Wang Date: Thu Apr 4 15:45:12 2019 +0800 genirq: Initialize request_mutex if CONFIG_SPARSE_IRQ=n commit e8458e7afa855317b14915d7b86ab3caceea7eb6 upstream. When CONFIG_SPARSE_IRQ is disable, the request_mutex in struct irq_desc is not initialized which causes malfunction. Fixes: 9114014cf4e6 ("genirq: Add mutex to irq desc to serialize request/free_irq()") Signed-off-by: Kefeng Wang Signed-off-by: Thomas Gleixner Reviewed-by: Mukesh Ojha Cc: Marc Zyngier Cc: Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190404074512.145533-1-wangkefeng.wang@huawei.com Signed-off-by: Greg Kroah-Hartman commit b8ad5278c4d3491935df978648f3002fe66c96bc Author: Stephen Boyd Date: Mon Mar 25 11:10:26 2019 -0700 genirq: Respect IRQCHIP_SKIP_SET_WAKE in irq_chip_set_wake_parent() commit 325aa19598e410672175ed50982f902d4e3f31c5 upstream. If a child irqchip calls irq_chip_set_wake_parent() but its parent irqchip has the IRQCHIP_SKIP_SET_WAKE flag set an error is returned. This is inconsistent behaviour vs. set_irq_wake_real() which returns 0 when the irqchip has the IRQCHIP_SKIP_SET_WAKE flag set. It doesn't attempt to walk the chain of parents and set irq wake on any chips that don't have the flag set either. If the intent is to call the .irq_set_wake() callback of the parent irqchip, then we expect irqchip implementations to omit the IRQCHIP_SKIP_SET_WAKE flag and implement an .irq_set_wake() function that calls irq_chip_set_wake_parent(). The problem has been observed on a Qualcomm sdm845 device where set wake fails on any GPIO interrupts after applying work in progress wakeup irq patches to the GPIO driver. The chain of chips looks like this: QCOM GPIO -> QCOM PDC (SKIP) -> ARM GIC (SKIP) The GPIO controllers parent is the QCOM PDC irqchip which in turn has ARM GIC as parent. The QCOM PDC irqchip has the IRQCHIP_SKIP_SET_WAKE flag set, and so does the grandparent ARM GIC. The GPIO driver doesn't know if the parent needs to set wake or not, so it unconditionally calls irq_chip_set_wake_parent() causing this function to return a failure because the parent irqchip (PDC) doesn't have the .irq_set_wake() callback set. Returning 0 instead makes everything work and irqs from the GPIO controller can be configured for wakeup. Make it consistent by returning 0 (success) from irq_chip_set_wake_parent() when a parent chip has IRQCHIP_SKIP_SET_WAKE set. [ tglx: Massaged changelog ] Fixes: 08b55e2a9208e ("genirq: Add irqchip_set_wake_parent") Signed-off-by: Stephen Boyd Signed-off-by: Thomas Gleixner Acked-by: Marc Zyngier Cc: linux-arm-kernel@lists.infradead.org Cc: linux-gpio@vger.kernel.org Cc: Lina Iyer Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190325181026.247796-1-swboyd@chromium.org Signed-off-by: Greg Kroah-Hartman commit fffb3e8b59202a123391db6a5fd8aecaf79e6bd5 Author: Jason Yan Date: Fri Apr 12 10:09:16 2019 +0800 block: fix the return errno for direct IO commit a89afe58f1a74aac768a5eb77af95ef4ee15beaa upstream. If the last bio returned is not dio->bio, the status of the bio will not assigned to dio->bio if it is error. This will cause the whole IO status wrong. ksoftirqd/21-117 [021] ..s. 4017.966090: 8,0 C N 4883648 [0] -0 [018] ..s. 4017.970888: 8,0 C WS 4924800 + 1024 [0] -0 [018] ..s. 4017.970909: 8,0 D WS 4935424 + 1024 [] -0 [018] ..s. 4017.970924: 8,0 D WS 4936448 + 321 [] ksoftirqd/21-117 [021] ..s. 4017.995033: 8,0 C R 4883648 + 336 [65475] ksoftirqd/21-117 [021] d.s. 4018.001988: myprobe1: (blkdev_bio_end_io+0x0/0x168) bi_status=7 ksoftirqd/21-117 [021] d.s. 4018.001992: myprobe: (aio_complete_rw+0x0/0x148) x0=0xffff802f2595ad80 res=0x12a000 res2=0x0 We always have to assign bio->bi_status to dio->bio.bi_status because we will only check dio->bio.bi_status when we return the whole IO to the upper layer. Fixes: 542ff7bf18c6 ("block: new direct I/O implementation") Cc: stable@vger.kernel.org Cc: Christoph Hellwig Cc: Jens Axboe Reviewed-by: Ming Lei Signed-off-by: Jason Yan Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit b4af1b3cb545961740e6e4ee6df17117a9262cef Author: Jérôme Glisse Date: Wed Apr 10 16:27:51 2019 -0400 block: do not leak memory in bio_copy_user_iov() commit a3761c3c91209b58b6f33bf69dd8bb8ec0c9d925 upstream. When bio_add_pc_page() fails in bio_copy_user_iov() we should free the page we just allocated otherwise we are leaking it. Cc: linux-block@vger.kernel.org Cc: Linus Torvalds Cc: stable@vger.kernel.org Reviewed-by: Chaitanya Kulkarni Signed-off-by: Jérôme Glisse Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 43d15c041ad7422880dfd29d6d5dc0a86c58c1ae Author: Bart Van Assche Date: Thu Apr 4 10:08:43 2019 -0700 block: Revert v5.0 blk_mq_request_issue_directly() changes commit fd9c40f64c514bdc585a21e2e33fa5f83ca8811b upstream. blk_mq_try_issue_directly() can return BLK_STS*_RESOURCE for requests that have been queued. If that happens when blk_mq_try_issue_directly() is called by the dm-mpath driver then dm-mpath will try to resubmit a request that is already queued and a kernel crash follows. Since it is nontrivial to fix blk_mq_request_issue_directly(), revert the blk_mq_request_issue_directly() changes that went into kernel v5.0. This patch reverts the following commits: * d6a51a97c0b2 ("blk-mq: replace and kill blk_mq_request_issue_directly") # v5.0. * 5b7a6f128aad ("blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests") # v5.0. * 7f556a44e61d ("blk-mq: refactor the code of issue request directly") # v5.0. Cc: Christoph Hellwig Cc: Ming Lei Cc: Jianchao Wang Cc: Hannes Reinecke Cc: Johannes Thumshirn Cc: James Smart Cc: Dongli Zhang Cc: Laurence Oberman Cc: Reported-by: Laurence Oberman Tested-by: Laurence Oberman Fixes: 7f556a44e61d ("blk-mq: refactor the code of issue request directly") # v5.0. Signed-off-by: Bart Van Assche Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 6a184be78d3fd819361ba7a42d2e00d8ecdf9d3c Author: Dmitry V. Levin Date: Fri Mar 29 20:12:21 2019 +0300 riscv: Fix syscall_get_arguments() and syscall_set_arguments() commit 10a16997db3d99fc02c026cf2c6e6c670acafab0 upstream. RISC-V syscall arguments are located in orig_a0,a1..a5 fields of struct pt_regs. Due to an off-by-one bug and a bug in pointer arithmetic syscall_get_arguments() was reading s3..s7 fields instead of a1..a5. Likewise, syscall_set_arguments() was writing s3..s7 fields instead of a1..a5. Link: http://lkml.kernel.org/r/20190329171221.GA32456@altlinux.org Fixes: e2c0cdfba7f69 ("RISC-V: User-facing API") Cc: Ingo Molnar Cc: Kees Cook Cc: Andy Lutomirski Cc: Will Drewry Cc: Albert Ou Cc: linux-riscv@lists.infradead.org Cc: stable@vger.kernel.org # v4.15+ Acked-by: Palmer Dabbelt Signed-off-by: Dmitry V. Levin Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit ee02ae76d06960117228eeff43b14e9d55a148d4 Author: Anand Jain Date: Tue Apr 2 18:07:40 2019 +0800 btrfs: prop: fix vanished compression property after failed set commit 272e5326c7837697882ce3162029ba893059b616 upstream. The compression property resets to NULL, instead of the old value if we fail to set the new compression parameter. $ btrfs prop get /btrfs compression compression=lzo $ btrfs prop set /btrfs compression zli ERROR: failed to set compression for /btrfs: Invalid argument $ btrfs prop get /btrfs compression This is because the compression property ->validate() is successful for 'zli' as the strncmp() used the length passed from the userspace. Fix it by using the expected string length in strncmp(). Fixes: 63541927c8d1 ("Btrfs: add support for inode properties") Fixes: 5c1aab1dd544 ("btrfs: Add zstd support") CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Nikolay Borisov Signed-off-by: Anand Jain Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 41cd8693bfcc248ea9503e6322c6523afed0db00 Author: Anand Jain Date: Tue Apr 2 18:07:38 2019 +0800 btrfs: prop: fix zstd compression parameter validation commit 50398fde997f6be8faebdb5f38e9c9c467370f51 upstream. We let pass zstd compression parameter even if it is not fully valid. For example: $ btrfs prop set /btrfs compression zst $ btrfs prop get /btrfs compression compression=zst zlib and lzo are fine. Fix it by checking the correct prefix length. Fixes: 5c1aab1dd544 ("btrfs: Add zstd support") CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Nikolay Borisov Signed-off-by: Anand Jain Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit ddb27d3b30f7b01b2286194dfc179b75fc9fccb8 Author: Filipe Manana Date: Tue Mar 26 10:49:56 2019 +0000 Btrfs: do not allow trimming when a fs is mounted with the nologreplay option commit f35f06c35560a86e841631f0243b83a984dc11a9 upstream. Whan a filesystem is mounted with the nologreplay mount option, which requires it to be mounted in RO mode as well, we can not allow discard on free space inside block groups, because log trees refer to extents that are not pinned in a block group's free space cache (pinning the extents is precisely the first phase of replaying a log tree). So do not allow the fitrim ioctl to do anything when the filesystem is mounted with the nologreplay option, because later it can be mounted RW without that option, which causes log replay to happen and result in either a failure to replay the log trees (leading to a mount failure), a crash or some silent corruption. Reported-by: Darrick J. Wong Fixes: 96da09192cda ("btrfs: Introduce new mount option to disable tree log replay") CC: stable@vger.kernel.org # 4.9+ Reviewed-by: Nikolay Borisov Signed-off-by: Filipe Manana Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 4badea79434df541edfe9e7beeadc4c664c5fdb7 Author: S.j. Wang Date: Wed Feb 27 06:31:12 2019 +0000 ASoC: fsl_esai: fix channel swap issue when stream starts commit 0ff4e8c61b794a4bf6c854ab071a1abaaa80f358 upstream. There is very low possibility ( < 0.1% ) that channel swap happened in beginning when multi output/input pin is enabled. The issue is that hardware can't send data to correct pin in the beginning with the normal enable flow. This is hardware issue, but there is no errata, the workaround flow is that: Each time playback/recording, firstly clear the xSMA/xSMB, then enable TE/RE, then enable xSMB and xSMA (xSMB must be enabled before xSMA). Which is to use the xSMA as the trigger start register, previously the xCR_TE or xCR_RE is the bit for starting. Fixes commit 43d24e76b698 ("ASoC: fsl_esai: Add ESAI CPU DAI driver") Cc: Reviewed-by: Fabio Estevam Acked-by: Nicolin Chen Signed-off-by: Shengjiu Wang Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 7c394c70a3945d31817826454a7a40a80ef2c0a0 Author: Guenter Roeck Date: Fri Mar 22 15:39:48 2019 -0700 ASoC: intel: Fix crash at suspend/resume after failed codec registration commit 8f71370f4b02730e8c27faf460af7a3586e24e1f upstream. If codec registration fails after the ASoC Intel SST driver has been probed, the kernel will Oops and crash at suspend/resume. general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 1 PID: 2811 Comm: cat Tainted: G W 4.19.30 #15 Hardware name: GOOGLE Clapper, BIOS Google_Clapper.5216.199.7 08/22/2014 RIP: 0010:snd_soc_suspend+0x5a/0xd21 Code: 03 80 3c 10 00 49 89 d7 74 0b 48 89 df e8 71 72 c4 fe 4c 89 fa 48 8b 03 48 89 45 d0 48 8d 98 a0 01 00 00 48 89 d8 48 c1 e8 03 <8a> 04 10 84 c0 0f 85 85 0c 00 00 80 3b 00 0f 84 6b 0c 00 00 48 8b RSP: 0018:ffff888035407750 EFLAGS: 00010202 RAX: 0000000000000034 RBX: 00000000000001a0 RCX: 0000000000000000 RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88805c417098 RBP: ffff8880354077b0 R08: dffffc0000000000 R09: ffffed100b975718 R10: 0000000000000001 R11: ffffffff949ea4a3 R12: 1ffff1100b975746 R13: dffffc0000000000 R14: ffff88805cba4588 R15: dffffc0000000000 FS: 0000794a78e91b80(0000) GS:ffff888068d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007bd5283ccf58 CR3: 000000004b7aa000 CR4: 00000000001006e0 Call Trace: ? dpm_complete+0x67b/0x67b ? i915_gem_suspend+0x14d/0x1ad sst_soc_prepare+0x91/0x1dd ? sst_be_hw_params+0x7e/0x7e dpm_prepare+0x39a/0x88b dpm_suspend_start+0x13/0x9d suspend_devices_and_enter+0x18f/0xbd7 ? arch_suspend_enable_irqs+0x11/0x11 ? printk+0xd9/0x12d ? lock_release+0x95f/0x95f ? log_buf_vmcoreinfo_setup+0x131/0x131 ? rcu_read_lock_sched_held+0x140/0x22a ? __bpf_trace_rcu_utilization+0xa/0xa ? __pm_pr_dbg+0x186/0x190 ? pm_notifier_call_chain+0x39/0x39 ? suspend_test+0x9d/0x9d pm_suspend+0x2f4/0x728 ? trace_suspend_resume+0x3da/0x3da ? lock_release+0x95f/0x95f ? kernfs_fop_write+0x19f/0x32d state_store+0xd8/0x147 ? sysfs_kf_read+0x155/0x155 kernfs_fop_write+0x23e/0x32d __vfs_write+0x108/0x608 ? vfs_read+0x2e9/0x2e9 ? rcu_read_lock_sched_held+0x140/0x22a ? __bpf_trace_rcu_utilization+0xa/0xa ? debug_smp_processor_id+0x10/0x10 ? selinux_file_permission+0x1c5/0x3c8 ? rcu_sync_lockdep_assert+0x6a/0xad ? __sb_start_write+0x129/0x2ac vfs_write+0x1aa/0x434 ksys_write+0xfe/0x1be ? __ia32_sys_read+0x82/0x82 do_syscall_64+0xcd/0x120 entry_SYSCALL_64_after_hwframe+0x49/0xbe In the observed situation, the problem is seen because the codec driver failed to probe due to a hardware problem. max98090 i2c-193C9890:00: Failed to read device revision: -1 max98090 i2c-193C9890:00: ASoC: failed to probe component -1 cht-bsw-max98090 cht-bsw-max98090: ASoC: failed to instantiate card -1 cht-bsw-max98090 cht-bsw-max98090: snd_soc_register_card failed -1 cht-bsw-max98090: probe of cht-bsw-max98090 failed with error -1 The problem is similar to the problem solved with commit 2fc995a87f2e ("ASoC: intel: Fix crash at suspend/resume without card registration"), but codec registration fails at a later point. At that time, the pointer checked with the above mentioned commit is already set, but it is not cleared if the device is subsequently removed. Adding a remove function to clear the pointer fixes the problem. Cc: stable@vger.kernel.org Cc: Jarkko Nikula Cc: Curtis Malainey Signed-off-by: Guenter Roeck Acked-by: Pierre-Louis Bossart Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 3859d8fae2d83df3c3d8b65ca90ab061a7f8ef6c Author: Greg Thelen Date: Fri Apr 5 18:39:18 2019 -0700 mm: writeback: use exact memcg dirty counts commit 0b3d6e6f2dd0a7b697b1aa8c167265908940624b upstream. Since commit a983b5ebee57 ("mm: memcontrol: fix excessive complexity in memory.stat reporting") memcg dirty and writeback counters are managed as: 1) per-memcg per-cpu values in range of [-32..32] 2) per-memcg atomic counter When a per-cpu counter cannot fit in [-32..32] it's flushed to the atomic. Stat readers only check the atomic. Thus readers such as balance_dirty_pages() may see a nontrivial error margin: 32 pages per cpu. Assuming 100 cpus: 4k x86 page_size: 13 MiB error per memcg 64k ppc page_size: 200 MiB error per memcg Considering that dirty+writeback are used together for some decisions the errors double. This inaccuracy can lead to undeserved oom kills. One nasty case is when all per-cpu counters hold positive values offsetting an atomic negative value (i.e. per_cpu[*]=32, atomic=n_cpu*-32). balance_dirty_pages() only consults the atomic and does not consider throttling the next n_cpu*32 dirty pages. If the file_lru is in the 13..200 MiB range then there's absolutely no dirty throttling, which burdens vmscan with only dirty+writeback pages thus resorting to oom kill. It could be argued that tiny containers are not supported, but it's more subtle. It's the amount the space available for file lru that matters. If a container has memory.max-200MiB of non reclaimable memory, then it will also suffer such oom kills on a 100 cpu machine. The following test reliably ooms without this patch. This patch avoids oom kills. $ cat test mount -t cgroup2 none /dev/cgroup cd /dev/cgroup echo +io +memory > cgroup.subtree_control mkdir test cd test echo 10M > memory.max (echo $BASHPID > cgroup.procs && exec /memcg-writeback-stress /foo) (echo $BASHPID > cgroup.procs && exec dd if=/dev/zero of=/foo bs=2M count=100) $ cat memcg-writeback-stress.c /* * Dirty pages from all but one cpu. * Clean pages from the non dirtying cpu. * This is to stress per cpu counter imbalance. * On a 100 cpu machine: * - per memcg per cpu dirty count is 32 pages for each of 99 cpus * - per memcg atomic is -99*32 pages * - thus the complete dirty limit: sum of all counters 0 * - balance_dirty_pages() only sees atomic count -99*32 pages, which * it max()s to 0. * - So a workload can dirty -99*32 pages before balance_dirty_pages() * cares. */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include static char *buf; static int bufSize; static void set_affinity(int cpu) { cpu_set_t affinity; CPU_ZERO(&affinity); CPU_SET(cpu, &affinity); if (sched_setaffinity(0, sizeof(affinity), &affinity)) err(1, "sched_setaffinity"); } static void dirty_on(int output_fd, int cpu) { int i, wrote; set_affinity(cpu); for (i = 0; i < 32; i++) { for (wrote = 0; wrote < bufSize; ) { int ret = write(output_fd, buf+wrote, bufSize-wrote); if (ret == -1) err(1, "write"); wrote += ret; } } } int main(int argc, char **argv) { int cpu, flush_cpu = 1, output_fd; const char *output; if (argc != 2) errx(1, "usage: output_file"); output = argv[1]; bufSize = getpagesize(); buf = malloc(getpagesize()); if (buf == NULL) errx(1, "malloc failed"); output_fd = open(output, O_CREAT|O_RDWR); if (output_fd == -1) err(1, "open(%s)", output); for (cpu = 0; cpu < get_nprocs(); cpu++) { if (cpu != flush_cpu) dirty_on(output_fd, cpu); } set_affinity(flush_cpu); if (fsync(output_fd)) err(1, "fsync(%s)", output); if (close(output_fd)) err(1, "close(%s)", output); free(buf); } Make balance_dirty_pages() and wb_over_bg_thresh() work harder to collect exact per memcg counters. This avoids the aforementioned oom kills. This does not affect the overhead of memory.stat, which still reads the single atomic counter. Why not use percpu_counter? memcg already handles cpus going offline, so no need for that overhead from percpu_counter. And the percpu_counter spinlocks are more heavyweight than is required. It probably also makes sense to use exact dirty and writeback counters in memcg oom reports. But that is saved for later. Link: http://lkml.kernel.org/r/20190329174609.164344-1-gthelen@google.com Signed-off-by: Greg Thelen Reviewed-by: Roman Gushchin Acked-by: Johannes Weiner Cc: Michal Hocko Cc: Vladimir Davydov Cc: Tejun Heo Cc: [4.16+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit c4727317b4e527b3720fb289d3f1eb139730083c Author: Arnd Bergmann Date: Fri Apr 5 18:38:53 2019 -0700 include/linux/bitrev.h: fix constant bitrev commit 6147e136ff5071609b54f18982dea87706288e21 upstream. clang points out with hundreds of warnings that the bitrev macros have a problem with constant input: drivers/hwmon/sht15.c:187:11: error: variable '__x' is uninitialized when used within its own initialization [-Werror,-Wuninitialized] u8 crc = bitrev8(data->val_status & 0x0F); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/bitrev.h:102:21: note: expanded from macro 'bitrev8' __constant_bitrev8(__x) : \ ~~~~~~~~~~~~~~~~~~~^~~~ include/linux/bitrev.h:67:11: note: expanded from macro '__constant_bitrev8' u8 __x = x; \ ~~~ ^ Both the bitrev and the __constant_bitrev macros use an internal variable named __x, which goes horribly wrong when passing one to the other. The obvious fix is to rename one of the variables, so this adds an extra '_'. It seems we got away with this because - there are only a few drivers using bitrev macros - usually there are no constant arguments to those - when they are constant, they tend to be either 0 or (unsigned)-1 (drivers/isdn/i4l/isdnhdlc.o, drivers/iio/amplifiers/ad8366.c) and give the correct result by pure chance. In fact, the only driver that I could find that gets different results with this is drivers/net/wan/slic_ds26522.c, which in turn is a driver for fairly rare hardware (adding the maintainer to Cc for testing). Link: http://lkml.kernel.org/r/20190322140503.123580-1-arnd@arndb.de Fixes: 556d2f055bf6 ("ARM: 8187/1: add CONFIG_HAVE_ARCH_BITREVERSE to support rbit instruction") Signed-off-by: Arnd Bergmann Reviewed-by: Nick Desaulniers Cc: Zhao Qiang Cc: Yalin Wang Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit a1428aee0c6976ce9322037168aa7a491110c1cf Author: David Rientjes Date: Tue Mar 19 15:19:56 2019 -0700 kvm: svm: fix potential get_num_contig_pages overflow commit ede885ecb2cdf8a8dd5367702e3d964ec846a2d5 upstream. get_num_contig_pages() could potentially overflow int so make its type consistent with its usage. Reported-by: Cfir Cohen Cc: stable@vger.kernel.org Signed-off-by: David Rientjes Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit d2c5c9ea9a09e9f4572c8bc9daa8aa08e39a5934 Author: Dave Airlie Date: Fri Apr 5 13:17:13 2019 +1000 drm/udl: add a release method and delay modeset teardown commit 9b39b013037fbfa8d4b999345d9e904d8a336fc2 upstream. If we unplug a udl device, the usb callback with deinit the mode_config struct, however userspace will still have an open file descriptor and a framebuffer on that device. When userspace closes the fd, we'll oops because it'll try and look stuff up in the object idr which we've destroyed. This punts destroying the mode objects until release time instead. Cc: stable@vger.kernel.org Reviewed-by: Daniel Vetter Signed-off-by: Dave Airlie Link: https://patchwork.freedesktop.org/patch/msgid/20190405031715.5959-2-airlied@gmail.com Signed-off-by: Greg Kroah-Hartman commit 7029188253fccc20e79e0218b7112032135de37b Author: Jernej Skrabec Date: Sun Mar 24 20:06:09 2019 +0100 drm/sun4i: DW HDMI: Lower max. supported rate for H6 commit cd9063757a227cf31ebf5391ccda2bf583b0806e upstream. Currently resolutions with pixel clock higher than 340 MHz don't work with H6 HDMI controller. They just produce a blank screen. Limit maximum pixel clock rate to 340 MHz until scrambling is supported. Cc: stable@vger.kernel.org # 5.0 Fixes: 40bb9d3147b2 ("drm/sun4i: Add support for H6 DW HDMI controller") Signed-off-by: Jernej Skrabec Signed-off-by: Maxime Ripard Link: https://patchwork.freedesktop.org/patch/msgid/20190324190609.32721-1-jernej.skrabec@siol.net Signed-off-by: Greg Kroah-Hartman commit 3e05b13e52e5e217a1d7767fe8122965fadf0605 Author: Yan Zhao Date: Wed Mar 27 00:54:51 2019 -0400 drm/i915/gvt: do not deliver a workload if its creation fails commit dade58ed5af6365ac50ff4259c2a0bf31219e285 upstream. in workload creation routine, if any failure occurs, do not queue this workload for delivery. if this failure is fatal, enter into failsafe mode. Fixes: 6d76303553ba ("drm/i915/gvt: Move common vGPU workload creation into scheduler.c") Cc: stable@vger.kernel.org #4.19+ Cc: zhenyuw@linux.intel.com Signed-off-by: Yan Zhao Signed-off-by: Zhenyu Wang Signed-off-by: Greg Kroah-Hartman commit 56487f7b83301c608ca77169abe74fc1a47ed94b Author: Andrei Vagin Date: Sun Apr 7 21:15:42 2019 -0700 alarmtimer: Return correct remaining time commit 07d7e12091f4ab869cc6a4bb276399057e73b0b3 upstream. To calculate a remaining time, it's required to subtract the current time from the expiration time. In alarm_timer_remaining() the arguments of ktime_sub are swapped. Fixes: d653d8457c76 ("alarmtimer: Implement remaining callback") Signed-off-by: Andrei Vagin Signed-off-by: Thomas Gleixner Reviewed-by: Mukesh Ojha Cc: Stephen Boyd Cc: John Stultz Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190408041542.26338-1-avagin@gmail.com Signed-off-by: Greg Kroah-Hartman commit b4dfbd47a494b634801f6186def12705926bce6b Author: Sven Schnelle Date: Thu Apr 4 18:16:04 2019 +0200 parisc: also set iaoq_b in instruction_pointer_set() commit f324fa58327791b2696628b31480e7e21c745706 upstream. When setting the instruction pointer on PA-RISC we also need to set the back of the instruction queue to the new offset, otherwise we will execute on instruction from the new location, and jumping back to the old location stored in iaoq_b. Signed-off-by: Sven Schnelle Signed-off-by: Helge Deller Fixes: 75ebedf1d263 ("parisc: Add HAVE_REGS_AND_STACK_ACCESS_API feature") Cc: stable@vger.kernel.org # 4.19+ Signed-off-by: Greg Kroah-Hartman commit 97ba69f226653e8818c424f9cbeaeae1bc58d332 Author: Sven Schnelle Date: Thu Apr 4 18:16:03 2019 +0200 parisc: regs_return_value() should return gpr28 commit 45efd871bf0a47648f119d1b41467f70484de5bc upstream. While working on kretprobes for PA-RISC I was wondering while the kprobes sanity test always fails on kretprobes. This is caused by returning gpr20 instead of gpr28. Signed-off-by: Sven Schnelle Signed-off-by: Helge Deller Cc: stable@vger.kernel.org # 4.14+ Signed-off-by: Greg Kroah-Hartman commit d347bbea06685c1ffb03bbec9efa3ffc3c0b9aae Author: Helge Deller Date: Tue Apr 2 12:13:27 2019 +0200 parisc: Detect QEMU earlier in boot process commit d006e95b5561f708d0385e9677ffe2c46f2ae345 upstream. While adding LASI support to QEMU, I noticed that the QEMU detection in the kernel happens much too late. For example, when a LASI chip is found by the kernel, it registers the LASI LED driver as well. But when we run on QEMU it makes sense to avoid spending unnecessary CPU cycles, so we need to access the running_on_QEMU flag earlier than before. This patch now makes the QEMU detection the fist task of the Linux kernel by moving it to where the kernel enters the C-coding. Fixes: 310d82784fb4 ("parisc: qemu idle sleep support") Signed-off-by: Helge Deller Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Greg Kroah-Hartman commit af2abcc62e64556cc76d0fef71a6b04c5f9d7f05 Author: Faiz Abbas Date: Thu Apr 11 14:29:37 2019 +0530 mmc: sdhci-omap: Don't finish_mrq() on a command error during tuning commit 5c41ea6d52003b5bc77c2a82fd5ca7d480237d89 upstream. commit 5b0d62108b46 ("mmc: sdhci-omap: Add platform specific reset callback") skips data resets during tuning operation. Because of this, a data error or data finish interrupt might still arrive after a command error has been handled and the mrq ended. This ends up with a "mmc0: Got data interrupt 0x00000002 even though no data operation was in progress" error message. Fix this by adding a platform specific callback for sdhci_irq. Mark the mrq as a failure but wait for a data interrupt instead of calling finish_mrq(). Fixes: 5b0d62108b46 ("mmc: sdhci-omap: Add platform specific reset callback") Signed-off-by: Faiz Abbas Acked-by: Adrian Hunter Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 13771e12fb7d82b725bd1e57220691cee75deaae Author: Daniel Drake Date: Tue Mar 26 15:04:14 2019 +0800 mmc: alcor: don't write data before command has completed commit 157c99c5a2956a9ab1ae12de0136a2d8a1b1a307 upstream. The alcor driver is setting up data transfer and submitting the associated MMC command at the same time. While this works most of the time, it occasionally causes problems upon write. In the working case, after setting up the data transfer and submitting the MMC command, an interrupt comes in a moment later with CMD_END and WRITE_BUF_RDY bits set. The data transfer then happens without problem. However, on occasion, the interrupt that arrives at that point only has WRITE_BUF_RDY set. The hardware notifies that it's ready to write data, but the associated MMC command is still running. Regardless, the driver was proceeding to write data immediately, and that would then cause another interrupt indicating data CRC error, and the write would fail. Additionally, the transfer setup function alcor_trigger_data_transfer() was being called 3 times for each write operation, which was confusing and may be contributing to this issue. Solve this by tweaking the driver behaviour to follow the sequence observed in the original ampe_stor vendor driver: 1. When starting request handling, write 0 to DATA_XFER_CTRL 2. Submit the command 3. Wait for CMD_END interrupt and then trigger data transfer 4. For the PIO case, trigger the next step of the data transfer only upon the following DATA_END interrupt, which occurs after the block has been written. I confirmed that the read path still works (DMA & PIO) and also now presents more consistency with the operations performed by ampe_stor. Signed-off-by: Daniel Drake Fixes: c5413ad815a6 ("mmc: add new Alcor Micro Cardreader SD/MMC driver") Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 8a3bb1553d5b96d13bd611715fde925b2820222a Author: Peter Geis Date: Wed Mar 13 19:02:30 2019 +0000 arm64: dts: rockchip: fix rk3328 sdmmc0 write errors commit 09f91381fa5de1d44bc323d8bf345f5d57b3d9b5 upstream. Various rk3328 based boards experience occasional sdmmc0 write errors. This is due to the rk3328.dtsi tx drive levels being set to 4ma, vs 8ma per the rk3328 datasheet default settings. Fix this by setting the tx signal pins to 8ma. Inspiration from tonymac32's patch, https://github.com/ayufan-rock64/linux-kernel/commit/dc1212b347e0da17c5460bcc0a56b07d02bac3f8 Fixes issues on the rk3328-roc-cc and the rk3328-rock64 (as per the above commit message). Tested on the rk3328-roc-cc board. Fixes: 52e02d377a72 ("arm64: dts: rockchip: add core dtsi file for RK3328 SoCs") Cc: stable@vger.kernel.org Signed-off-by: Peter Geis Signed-off-by: Heiko Stuebner Signed-off-by: Greg Kroah-Hartman commit 9e77cd4a9922eef12277c09dfa00d12eef56658b Author: Aneesh Kumar K.V Date: Fri Apr 5 18:39:10 2019 -0700 mm/huge_memory.c: fix modifying of page protection by insert_pfn_pmd() commit c6f3c5ee40c10bb65725047a220570f718507001 upstream. With some architectures like ppc64, set_pmd_at() cannot cope with a situation where there is already some (different) valid entry present. Use pmdp_set_access_flags() instead to modify the pfn which is built to deal with modifying existing PMD entries. This is similar to commit cae85cb8add3 ("mm/memory.c: fix modifying of page protection by insert_pfn()") We also do similar update w.r.t insert_pfn_pud eventhough ppc64 don't support pud pfn entries now. Without this patch we also see the below message in kernel log "BUG: non-zero pgtables_bytes on freeing mm:" Link: http://lkml.kernel.org/r/20190402115125.18803-1-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V Reported-by: Chandan Rajendra Reviewed-by: Jan Kara Cc: Dan Williams Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 89944d7e95e39ed78e807a8a6754d41b9695cbeb Author: Hui Wang Date: Mon Apr 8 15:58:11 2019 +0800 ALSA: hda - Add two more machines to the power_save_blacklist commit cae30527901d9590db0e12ace994c1d58bea87fd upstream. Recently we set CONFIG_SND_HDA_POWER_SAVE_DEFAULT to 1 when configuring the kernel, then two machines were reported to have noise after installing the new kernel. Put them in the blacklist, the noise disappears. https://bugs.launchpad.net/bugs/1821663 Cc: Signed-off-by: Hui Wang Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 3c20e6c50e03bfc78b4405e814a7016d5a13fcdc Author: Oleksandr Andrushchenko Date: Thu Apr 4 15:38:38 2019 +0300 ALSA: xen-front: Do not use stream buffer size before it is set commit 8b030a57e35a0efc1a8aa18bb10555bc5066ac40 upstream. This fixes the regression introduced while moving to Xen shared buffer implementation. Fixes: 58f9d806d16a ("ALSA: xen-front: Use Xen common shared buffer implementation") Reviewed-by: Juergen Gross Signed-off-by: Oleksandr Andrushchenko Cc: # v5.0+ Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 97bf098399643d277907e99e1b26ac2e95695db1 Author: Richard Sailer Date: Tue Apr 2 15:52:04 2019 +0200 ALSA: hda/realtek - Add quirk for Tuxedo XC 1509 commit 80690a276f444a68a332136d98bfea1c338bc263 upstream. This adds a SND_PCI_QUIRK(...) line for the Tuxedo XC 1509. The Tuxedo XC 1509 and the System76 oryp5 are the same barebone notebooks manufactured by Clevo. To name the fixups both use after the actual underlying hardware, this patch also changes System76_orpy5 to clevo_pb51ed in 2 enum symbols and one function name, matching the other pci_quirk entries which are also named after the device ODM. Fixes: 7f665b1c3283 ("ALSA: hda/realtek - Headset microphone and internal speaker support for System76 oryp5") Signed-off-by: Richard Sailer Cc: Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit acaf3a11200816f8251203eec7ffbaca3c17bdda Author: Jian-Hong Pan Date: Mon Apr 1 11:25:05 2019 +0800 ALSA: hda/realtek: Enable headset MIC of Acer TravelMate B114-21 with ALC233 commit ea5c7eba216e832906e594799b8670f1954a588c upstream. The Acer TravelMate B114-21 laptop cannot detect and record sound from headset MIC. This patch adds the ALC233_FIXUP_ACER_HEADSET_MIC HDA verb quirk chained with ALC233_FIXUP_ASUS_MIC_NO_PRESENCE pin quirk to fix this issue. [ fixed the missing brace and reordered the entry -- tiwai ] Signed-off-by: Jian-Hong Pan Signed-off-by: Daniel Drake Reviewed-by: Kailang Yang Cc: Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit df828c33163f2e5f651be384ccc0baf2236a95b5 Author: Zubin Mithra Date: Thu Apr 4 14:33:55 2019 -0700 ALSA: seq: Fix OOB-reads from strlcpy commit 212ac181c158c09038c474ba68068be49caecebb upstream. When ioctl calls are made with non-null-terminated userspace strings, strlcpy causes an OOB-read from within strlen. Fix by changing to use strscpy instead. Signed-off-by: Zubin Mithra Reviewed-by: Guenter Roeck Cc: Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit cb11af93e6264b334cac554e48db4f5d030069c0 Author: Erik Schmauss Date: Mon Apr 8 13:42:26 2019 -0700 ACPICA: Namespace: remove address node from global list after method termination commit c5781ffbbd4f742a58263458145fe7f0ac01d9e0 upstream. ACPICA commit b233720031a480abd438f2e9c643080929d144c3 ASL operation_regions declare a range of addresses that it uses. In a perfect world, the range of addresses should be used exclusively by the AML interpreter. The OS can use this information to decide which drivers to load so that the AML interpreter and device drivers use different regions of memory. During table load, the address information is added to a global address range list. Each node in this list contains an address range as well as a namespace node of the operation_region. This list is deleted at ACPI shutdown. Unfortunately, ASL operation_regions can be declared inside of control methods. Although this is not recommended, modern firmware contains such code. New module level code changes unintentionally removed the functionality of adding and removing nodes to the global address range list. A few months ago, support for adding addresses has been re- implemented. However, the removal of the address range list was missed and resulted in some systems to crash due to the address list containing bogus namespace nodes from operation_regions declared in control methods. In order to fix the crash, this change removes dynamic operation_regions after control method termination. Link: https://github.com/acpica/acpica/commit/b2337200 Link: https://bugzilla.kernel.org/show_bug.cgi?id=202475 Fixes: 4abb951b73ff ("ACPICA: AML interpreter: add region addresses in global list during initialization") Reported-by: Michael J Gruber Signed-off-by: Erik Schmauss Signed-off-by: Bob Moore Cc: 4.20+ # 4.20+ Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit da6a87fb0ad43ae811519d2e0aa325c7f792b13a Author: Furquan Shaikh Date: Wed Mar 20 15:28:44 2019 -0700 ACPICA: Clear status of GPEs before enabling them commit c8b1917c8987a6fa3695d479b4d60fbbbc3e537b upstream. Commit 18996f2db918 ("ACPICA: Events: Stop unconditionally clearing ACPI IRQs during suspend/resume") was added to stop clearing event status bits unconditionally in the system-wide suspend and resume paths. This was done because of an issue with a laptop lid appaering to be closed even when it was used to wake up the system from suspend (see https://bugzilla.kernel.org/show_bug.cgi?id=196249), which happened because event status bits were cleared unconditionally on system resume. Though this change fixed the issue in the resume path, it introduced regressions in a few suspend paths. First regression was reported and fixed in the S5 entry path by commit fa85015c0d95 ("ACPICA: Clear status of all events when entering S5"). Next regression was reported and fixed for all legacy sleep paths by commit f317c7dc12b7 ("ACPICA: Clear status of all events when entering sleep states"). However, there still is a suspend-to-idle regression, since suspend-to-idle does not follow the legacy sleep paths. In the suspend-to-idle case, wakeup is enabled as part of device suspend. If the status bits of wakeup GPEs are set when they are enabled, it causes a premature system wakeup to occur. To address that problem, partially revert commit 18996f2db918 to restore GPE status bits clearing before the GPE is enabled in acpi_ev_enable_gpe(). Fixes: 18996f2db918 ("ACPICA: Events: Stop unconditionally clearing ACPI IRQs during suspend/resume") Signed-off-by: Furquan Shaikh Cc: 4.17+ # 4.17+ [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit bee8b4b4c79b6776c2f9a275ec2c7552f11be9f9 Author: Peter Hutterer Date: Wed Mar 20 08:48:23 2019 +1000 HID: logitech: Handle 0 scroll events for the m560 commit fd35759ce32b60d3eb52436894bab996dbf8cffa upstream. hidpp_scroll_counter_handle_scroll() doesn't expect a 0-value scroll event, it gets interpreted as a negative scroll direction event. This can cause scroll direction resets and thus broken scrolling. Fixes: 4435ff2f09a2fc ("HID: logitech: Enable high-resolution scrolling on Logitech mice") Cc: stable@vger.kernel.org # v5.0 Reported-and-tested-by: Aimo Metsälä Signed-off-by: Peter Hutterer Signed-off-by: Benjamin Tissoires Signed-off-by: Greg Kroah-Hartman commit 0601ac3b4925b227d129a7f9822d94b7fd94fbb9 Author: Steve French Date: Fri Mar 29 16:31:07 2019 -0500 SMB3: Allow persistent handle timeout to be configurable on mount commit ca567eb2b3f014d5be0f44c6f68b01a522f15ca4 upstream. Reconnecting after server or network failure can be improved (to maintain availability and protect data integrity) by allowing the client to choose the default persistent (or resilient) handle timeout in some use cases. Today we default to 0 which lets the server pick the default timeout (usually 120 seconds) but this can be problematic for some workloads. Add the new mount parameter to cifs.ko for SMB3 mounts "handletimeout" which enables the user to override the default handle timeout for persistent (mount option "persistenthandles") or resilient handles (mount option "resilienthandles"). Maximum allowed is 16 minutes (960000 ms). Units for the timeout are expressed in milliseconds. See section 2.2.14.2.12 and 2.2.31.3 of the MS-SMB2 protocol specification for more information. Signed-off-by: Steve French Reviewed-by: Pavel Shilovsky Reviewed-by: Ronnie Sahlberg CC: Stable Signed-off-by: Greg Kroah-Hartman commit 4d4ec04ed77ed684f5db6eeb3850a5a25b3996d8 Author: Eddie James Date: Tue Mar 19 16:01:58 2019 -0500 hwmon: (occ) Fix power sensor indexing commit 8e6af454117a51dbf6c8a47c00180a0c235052fe upstream. In the case of power sensor version 0xA0, the sensor indexing overlapped with the "caps" power sensors, resulting in probe failure and kernel warnings. Fix this by specifying the next index for each power sensor version. Fixes: 54076cb3b5ff ("hwmon (occ): Add sensor attributes and register ...") Cc: stable@vger.kernel.org Signed-off-by: Eddie James Tested-by: Joel Stanley Signed-off-by: Guenter Roeck Signed-off-by: Greg Kroah-Hartman commit 026f98a1d51df0b2f3d102ad118d7cf61ef72bd0 Author: Axel Lin Date: Mon Mar 11 17:57:30 2019 +0800 hwmon: (w83773g) Select REGMAP_I2C to fix build error commit a165dcc923ada2ffdee1d4f41f12f81b66d04c55 upstream. Select REGMAP_I2C to avoid below build error: ERROR: "__devm_regmap_init_i2c" [drivers/hwmon/w83773g.ko] undefined! Fixes: ee249f271524 ("hwmon: Add W83773G driver") Cc: stable@vger.kernel.org Signed-off-by: Axel Lin Signed-off-by: Guenter Roeck Signed-off-by: Greg Kroah-Hartman commit c231b6b0064de12f19e9d19384b33acf14d43b94 Author: Greg Kroah-Hartman Date: Mon Jan 21 17:26:42 2019 +0100 tty: ldisc: add sysctl to prevent autoloading of ldiscs commit 7c0cca7c847e6e019d67b7d793efbbe3b947d004 upstream. By default, the kernel will automatically load the module of any line dicipline that is asked for. As this sometimes isn't the safest thing to do, provide a sysctl to disable this feature. By default, we set this to 'y' as that is the historical way that Linux has worked, and we do not want to break working systems. But in the future, perhaps this can default to 'n' to prevent this functionality. Signed-off-by: Greg Kroah-Hartman Reviewed-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit e4ebae16792ad736add8a40645ecdf42afaba85e Author: Greg Kroah-Hartman Date: Fri Apr 5 15:39:26 2019 +0200 tty: mark Siemens R3964 line discipline as BROKEN commit c7084edc3f6d67750f50d4183134c4fb5712a5c8 upstream. The n_r3964 line discipline driver was written in a different time, when SMP machines were rare, and users were trusted to do the right thing. Since then, the world has moved on but not this code, it has stayed rooted in the past with its lovely hand-crafted list structures and loads of "interesting" race conditions all over the place. After attempting to clean up most of the issues, I just gave up and am now marking the driver as BROKEN so that hopefully someone who has this hardware will show up out of the woodwork (I know you are out there!) and will help with debugging a raft of changes that I had laying around for the code, but was too afraid to commit as odds are they would break things. Many thanks to Jann and Linus for pointing out the initial problems in this codebase, as well as many reviews of my attempts to fix the issues. It was a case of whack-a-mole, and as you can see, the mole won. Reported-by: Jann Horn Signed-off-by: Greg Kroah-Hartman Signed-off-by: Linus Torvalds commit e2a0237494ce1181d2b5002280a06b553ad54fdf Author: Neil Armstrong Date: Thu Apr 11 12:11:32 2019 +0200 Revert "clk: meson: clean-up clock registration" This reverts commit 9b0f430450cf230e736bc40f95bf34fbdb99cead. This patch was not initially a fix and is dependent on other changes which are not fixes eithers. With this change, multiple Amlogic based boards fails to boot, as reported by kernelci. Cc: stable@vger.kernel.org # 5.0.7 Signed-off-by: Neil Armstrong Signed-off-by: Sasha Levin commit 62a23bbaee095e935af50c82ff23da7a57eefa5c Author: Nick Desaulniers Date: Fri Apr 5 18:38:45 2019 -0700 lib/string.c: implement a basic bcmp [ Upstream commit 5f074f3e192f10c9fade898b9b3b8812e3d83342 ] A recent optimization in Clang (r355672) lowers comparisons of the return value of memcmp against zero to comparisons of the return value of bcmp against zero. This helps some platforms that implement bcmp more efficiently than memcmp. glibc simply aliases bcmp to memcmp, but an optimized implementation is in the works. This results in linkage failures for all targets with Clang due to the undefined symbol. For now, just implement bcmp as a tailcail to memcmp to unbreak the build. This routine can be further optimized in the future. Other ideas discussed: * A weak alias was discussed, but breaks for architectures that define their own implementations of memcmp since aliases to declarations are not permitted (only definitions). Arch-specific memcmp implementations typically declare memcmp in C headers, but implement them in assembly. * -ffreestanding also is used sporadically throughout the kernel. * -fno-builtin-bcmp doesn't work when doing LTO. Link: https://bugs.llvm.org/show_bug.cgi?id=41035 Link: https://code.woboq.org/userspace/glibc/string/memcmp.c.html#bcmp Link: https://github.com/llvm/llvm-project/commit/8e16d73346f8091461319a7dfc4ddd18eedcff13 Link: https://github.com/ClangBuiltLinux/linux/issues/416 Link: http://lkml.kernel.org/r/20190313211335.165605-1-ndesaulniers@google.com Signed-off-by: Nick Desaulniers Reported-by: Nathan Chancellor Reported-by: Adhemerval Zanella Suggested-by: Arnd Bergmann Suggested-by: James Y Knight Suggested-by: Masahiro Yamada Suggested-by: Nathan Chancellor Suggested-by: Rasmus Villemoes Acked-by: Steven Rostedt (VMware) Reviewed-by: Nathan Chancellor Tested-by: Nathan Chancellor Reviewed-by: Masahiro Yamada Reviewed-by: Andy Shevchenko Cc: David Laight Cc: Rasmus Villemoes Cc: Namhyung Kim Cc: Greg Kroah-Hartman Cc: Alexander Shishkin Cc: Dan Williams Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 622902df9ebad35413067e19bd3099f53b203d66 Author: Nick Desaulniers Date: Mon Feb 11 11:30:04 2019 -0800 kbuild: clang: choose GCC_TOOLCHAIN_DIR not on LD [ Upstream commit ad15006cc78459d059af56729c4d9bed7c7fd860 ] This causes an issue when trying to build with `make LD=ld.lld` if ld.lld and the rest of your cross tools aren't in the same directory (ex. /usr/local/bin) (as is the case for Android's build system), as the GCC_TOOLCHAIN_DIR then gets set based on `which $(LD)` which will point where LLVM tools are, not GCC/binutils tools are located. Instead, select the GCC_TOOLCHAIN_DIR based on another tool provided by binutils for which LLVM does not provide a substitute for, such as elfedit. Fixes: 785f11aa595b ("kbuild: Add better clang cross build support") Link: https://github.com/ClangBuiltLinux/linux/issues/341 Suggested-by: Nathan Chancellor Reviewed-by: Nathan Chancellor Tested-by: Nathan Chancellor Signed-off-by: Nick Desaulniers Signed-off-by: Masahiro Yamada Signed-off-by: Sasha Levin commit 396f116f6d0a62c0d8f6f8442f062c690e27fd29 Author: Huy Nguyen Date: Thu Mar 7 14:07:32 2019 -0600 net/mlx5e: Update xon formula [ Upstream commit e28408e98bced123038857b6e3c81fa12a2e3e68 ] Set xon = xoff - netdev's max_mtu. netdev's max_mtu will give enough time for the pause frame to arrive at the sender. Fixes: 0696d60853d5 ("net/mlx5e: Receive buffer configuration") Signed-off-by: Huy Nguyen Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 29b4db4176dadc69e969ef957a636390453be4b9 Author: Huy Nguyen Date: Thu Mar 7 14:49:50 2019 -0600 net/mlx5e: Update xoff formula [ Upstream commit 5ec983e924c7978aaec3cf8679ece9436508bb20 ] Set minimum speed in xoff threshold formula to 40Gbps Fixes: 0696d60853d5 ("net/mlx5e: Receive buffer configuration") Signed-off-by: Huy Nguyen Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 68ef6f3e1a70e2e09119b0941551e197f502dd84 Author: Aditya Pakki Date: Tue Mar 19 16:42:40 2019 -0500 net: mlx5: Add a missing check on idr_find, free buf [ Upstream commit 8e949363f017e2011464812a714fb29710fb95b4 ] idr_find() can return a NULL value to 'flow' which is used without a check. The patch adds a check to avoid potential NULL pointer dereference. In case of mlx5_fpga_sbu_conn_sendmsg() failure, free buf allocated using kzalloc. Fixes: ab412e1dd7db ("net/mlx5: Accel, add TLS rx offload routines") Signed-off-by: Aditya Pakki Reviewed-by: Yuval Shaia Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 4fe853723d57be4dae870ba83002c587bad34cf3 Author: Heiner Kallweit Date: Sat Mar 30 17:13:24 2019 +0100 r8169: disable default rx interrupt coalescing on RTL8168 [ Upstream commit 288ac524cf70a8e7ed851a61ed2a9744039dae8d ] It was reported that re-introducing ASPM, in combination with RX interrupt coalescing, results in significantly increased packet latency, see [0]. Disabling ASPM or RX interrupt coalescing fixes the issue. Therefore change the driver's default to disable RX interrupt coalescing. Users still have the option to enable RX coalescing via ethtool. [0] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=925496 Fixes: a99790bf5c7f ("r8169: Reinstate ASPM Support") Reported-by: Mike Crowe Signed-off-by: Heiner Kallweit Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 4b780e0fc98615f919189fb3d2711ad26db2a7b8 Author: Alexander Lobakin Date: Thu Mar 28 18:23:04 2019 +0300 net: core: netif_receive_skb_list: unlist skb before passing to pt->func [ Upstream commit 9a5a90d167b0e5fe3d47af16b68fd09ce64085cd ] __netif_receive_skb_list_ptype() leaves skb->next poisoned before passing it to pt_prev->func handler, what may produce (in certain cases, e.g. DSA setup) crashes like: [ 88.606777] CPU 0 Unable to handle kernel paging request at virtual address 0000000e, epc == 80687078, ra == 8052cc7c [ 88.618666] Oops[#1]: [ 88.621196] CPU: 0 PID: 0 Comm: swapper Not tainted 5.1.0-rc2-dlink-00206-g4192a172-dirty #1473 [ 88.630885] $ 0 : 00000000 10000400 00000002 864d7850 [ 88.636709] $ 4 : 87c0ddf0 864d7800 87c0ddf0 00000000 [ 88.642526] $ 8 : 00000000 49600000 00000001 00000001 [ 88.648342] $12 : 00000000 c288617b dadbee27 25d17c41 [ 88.654159] $16 : 87c0ddf0 85cff080 80790000 fffffffd [ 88.659975] $20 : 80797b20 ffffffff 00000001 864d7800 [ 88.665793] $24 : 00000000 8011e658 [ 88.671609] $28 : 80790000 87c0dbc0 87cabf00 8052cc7c [ 88.677427] Hi : 00000003 [ 88.680622] Lo : 7b5b4220 [ 88.683840] epc : 80687078 vlan_dev_hard_start_xmit+0x1c/0x1a0 [ 88.690532] ra : 8052cc7c dev_hard_start_xmit+0xac/0x188 [ 88.696734] Status: 10000404 IEp [ 88.700422] Cause : 50000008 (ExcCode 02) [ 88.704874] BadVA : 0000000e [ 88.708069] PrId : 0001a120 (MIPS interAptiv (multi)) [ 88.713005] Modules linked in: [ 88.716407] Process swapper (pid: 0, threadinfo=(ptrval), task=(ptrval), tls=00000000) [ 88.725219] Stack : 85f61c28 00000000 0000000e 80780000 87c0ddf0 85cff080 80790000 8052cc7c [ 88.734529] 87cabf00 00000000 00000001 85f5fb40 807b0000 864d7850 87cabf00 807d0000 [ 88.743839] 864d7800 8655f600 00000000 85cff080 87c1c000 0000006a 00000000 8052d96c [ 88.753149] 807a0000 8057adb8 87c0dcc8 87c0dc50 85cfff08 00000558 87cabf00 85f58c50 [ 88.762460] 00000002 85f58c00 864d7800 80543308 fffffff4 00000001 85f58c00 864d7800 [ 88.771770] ... [ 88.774483] Call Trace: [ 88.777199] [<80687078>] vlan_dev_hard_start_xmit+0x1c/0x1a0 [ 88.783504] [<8052cc7c>] dev_hard_start_xmit+0xac/0x188 [ 88.789326] [<8052d96c>] __dev_queue_xmit+0x6e8/0x7d4 [ 88.794955] [<805a8640>] ip_finish_output2+0x238/0x4d0 [ 88.800677] [<805ab6a0>] ip_output+0xc8/0x140 [ 88.805526] [<805a68f4>] ip_forward+0x364/0x560 [ 88.810567] [<805a4ff8>] ip_rcv+0x48/0xe4 [ 88.815030] [<80528d44>] __netif_receive_skb_one_core+0x44/0x58 [ 88.821635] [<8067f220>] dsa_switch_rcv+0x108/0x1ac [ 88.827067] [<80528f80>] __netif_receive_skb_list_core+0x228/0x26c [ 88.833951] [<8052ed84>] netif_receive_skb_list+0x1d4/0x394 [ 88.840160] [<80355a88>] lunar_rx_poll+0x38c/0x828 [ 88.845496] [<8052fa78>] net_rx_action+0x14c/0x3cc [ 88.850835] [<806ad300>] __do_softirq+0x178/0x338 [ 88.856077] [<8012a2d4>] irq_exit+0xbc/0x100 [ 88.860846] [<802f8b70>] plat_irq_dispatch+0xc0/0x144 [ 88.866477] [<80105974>] handle_int+0x14c/0x158 [ 88.871516] [<806acfb0>] r4k_wait+0x30/0x40 [ 88.876462] Code: afb10014 8c8200a0 00803025 <9443000c> 94a20468 00000000 10620042 00a08025 9605046a [ 88.887332] [ 88.888982] ---[ end trace eb863d007da11cf1 ]--- [ 88.894122] Kernel panic - not syncing: Fatal exception in interrupt [ 88.901202] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- Fix this by pulling skb off the sublist and zeroing skb->next pointer before calling ptype callback. Fixes: 88eb1944e18c ("net: core: propagate SKB lists through packet_type lookup") Reviewed-by: Edward Cree Signed-off-by: Alexander Lobakin Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit b5f69a5648b01f609441e6acfb43520d82caefaa Author: Miaohe Lin Date: Mon Apr 8 10:04:20 2019 +0800 net: vrf: Fix ping failed when vrf mtu is set to 0 [ Upstream commit 5055376a3b44c4021de8830c9157f086a97731df ] When the mtu of a vrf device is set to 0, it would cause ping failed. So I think we should limit vrf mtu in a reasonable range to solve this problem. I set dev->min_mtu to IPV6_MIN_MTU, so it will works for both ipv4 and ipv6. And if dev->max_mtu still be 0 can be confusing, so I set dev->max_mtu to ETH_MAX_MTU. Here is the reproduce step: 1.Config vrf interface and set mtu to 0: 3: enp4s0: mtu 1500 qdisc fq_codel master vrf1 state UP mode DEFAULT group default qlen 1000 link/ether 52:54:00:9e:dd:c1 brd ff:ff:ff:ff:ff:ff 2.Ping peer: 3: enp4s0: mtu 1500 qdisc fq_codel master vrf1 state UP group default qlen 1000 link/ether 52:54:00:9e:dd:c1 brd ff:ff:ff:ff:ff:ff inet 10.0.0.1/16 scope global enp4s0 valid_lft forever preferred_lft forever connect: Network is unreachable 3.Set mtu to default value, ping works: PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data. 64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=1.88 ms Fixes: ad49bc6361ca2 ("net: vrf: remove MTU limits for vrf device") Signed-off-by: Miaohe Lin Reviewed-by: David Ahern Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit c834470963904ba7c1c3159c6576452185f83b3c Author: Lorenzo Bianconi Date: Thu Apr 4 12:16:27 2019 +0200 net: thunderx: fix NULL pointer dereference in nicvf_open/nicvf_stop [ Upstream commit 2ec1ed2aa68782b342458681aa4d16b65c9014d6 ] When a bpf program is uploaded, the driver computes the number of xdp tx queues resulting in the allocation of additional qsets. Starting from commit '2ecbe4f4a027 ("net: thunderx: replace global nicvf_rx_mode_wq work queue for all VFs to private for each of them")' the driver runs link state polling for each VF resulting in the following NULL pointer dereference: [ 56.169256] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020 [ 56.178032] Mem abort info: [ 56.180834] ESR = 0x96000005 [ 56.183877] Exception class = DABT (current EL), IL = 32 bits [ 56.189792] SET = 0, FnV = 0 [ 56.192834] EA = 0, S1PTW = 0 [ 56.195963] Data abort info: [ 56.198831] ISV = 0, ISS = 0x00000005 [ 56.202662] CM = 0, WnR = 0 [ 56.205619] user pgtable: 64k pages, 48-bit VAs, pgdp = 0000000021f0c7a0 [ 56.212315] [0000000000000020] pgd=0000000000000000, pud=0000000000000000 [ 56.219094] Internal error: Oops: 96000005 [#1] SMP [ 56.260459] CPU: 39 PID: 2034 Comm: ip Not tainted 5.1.0-rc3+ #3 [ 56.266452] Hardware name: GIGABYTE R120-T33/MT30-GS1, BIOS T49 02/02/2018 [ 56.273315] pstate: 80000005 (Nzcv daif -PAN -UAO) [ 56.278098] pc : __ll_sc___cmpxchg_case_acq_64+0x4/0x20 [ 56.283312] lr : mutex_lock+0x2c/0x50 [ 56.286962] sp : ffff0000219af1b0 [ 56.290264] x29: ffff0000219af1b0 x28: ffff800f64de49a0 [ 56.295565] x27: 0000000000000000 x26: 0000000000000015 [ 56.300865] x25: 0000000000000000 x24: 0000000000000000 [ 56.306165] x23: 0000000000000000 x22: ffff000011117000 [ 56.311465] x21: ffff800f64dfc080 x20: 0000000000000020 [ 56.316766] x19: 0000000000000020 x18: 0000000000000001 [ 56.322066] x17: 0000000000000000 x16: ffff800f2e077080 [ 56.327367] x15: 0000000000000004 x14: 0000000000000000 [ 56.332667] x13: ffff000010964438 x12: 0000000000000002 [ 56.337967] x11: 0000000000000000 x10: 0000000000000c70 [ 56.343268] x9 : ffff0000219af120 x8 : ffff800f2e077d50 [ 56.348568] x7 : 0000000000000027 x6 : 000000062a9d6a84 [ 56.353869] x5 : 0000000000000000 x4 : ffff800f2e077480 [ 56.359169] x3 : 0000000000000008 x2 : ffff800f2e077080 [ 56.364469] x1 : 0000000000000000 x0 : 0000000000000020 [ 56.369770] Process ip (pid: 2034, stack limit = 0x00000000c862da3a) [ 56.376110] Call trace: [ 56.378546] __ll_sc___cmpxchg_case_acq_64+0x4/0x20 [ 56.383414] drain_workqueue+0x34/0x198 [ 56.387247] nicvf_open+0x48/0x9e8 [nicvf] [ 56.391334] nicvf_open+0x898/0x9e8 [nicvf] [ 56.395507] nicvf_xdp+0x1bc/0x238 [nicvf] [ 56.399595] dev_xdp_install+0x68/0x90 [ 56.403333] dev_change_xdp_fd+0xc8/0x240 [ 56.407333] do_setlink+0x8e0/0xbe8 [ 56.410810] __rtnl_newlink+0x5b8/0x6d8 [ 56.414634] rtnl_newlink+0x54/0x80 [ 56.418112] rtnetlink_rcv_msg+0x22c/0x2f8 [ 56.422199] netlink_rcv_skb+0x60/0x120 [ 56.426023] rtnetlink_rcv+0x28/0x38 [ 56.429587] netlink_unicast+0x1c8/0x258 [ 56.433498] netlink_sendmsg+0x1b4/0x350 [ 56.437410] sock_sendmsg+0x4c/0x68 [ 56.440887] ___sys_sendmsg+0x240/0x280 [ 56.444711] __sys_sendmsg+0x68/0xb0 [ 56.448275] __arm64_sys_sendmsg+0x2c/0x38 [ 56.452361] el0_svc_handler+0x9c/0x128 [ 56.456186] el0_svc+0x8/0xc [ 56.459056] Code: 35ffff91 2a1003e0 d65f03c0 f9800011 (c85ffc10) [ 56.465166] ---[ end trace 4a57fdc27b0a572c ]--- [ 56.469772] Kernel panic - not syncing: Fatal exception Fix it by checking nicvf_rx_mode_wq pointer in nicvf_open and nicvf_stop Fixes: 2ecbe4f4a027 ("net: thunderx: replace global nicvf_rx_mode_wq work queue for all VFs to private for each of them") Fixes: 2c632ad8bc74 ("net: thunderx: move link state polling function to VF") Reported-by: Matteo Croce Signed-off-by: Lorenzo Bianconi Tested-by: Matteo Croce Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 502de75b3b3459ea918c7ef1710dd1b5d8203ce3 Author: Nikolay Aleksandrov Date: Wed Apr 3 23:27:24 2019 +0300 net: bridge: always clear mcast matching struct on reports and leaves [ Upstream commit 1515a63fc413f160d20574ab0894e7f1020c7be2 ] We need to be careful and always zero the whole br_ip struct when it is used for matching since the rhashtable change. This patch fixes all the places which didn't properly clear it which in turn might've caused mismatches. Thanks for the great bug report with reproducing steps and bisection. Steps to reproduce (from the bug report): ip link add br0 type bridge mcast_querier 1 ip link set br0 up ip link add v2 type veth peer name v3 ip link set v2 master br0 ip link set v2 up ip link set v3 up ip addr add 3.0.0.2/24 dev v3 ip netns add test ip link add v1 type veth peer name v1 netns test ip link set v1 master br0 ip link set v1 up ip -n test link set v1 up ip -n test addr add 3.0.0.1/24 dev v1 # Multicast receiver ip netns exec test socat UDP4-RECVFROM:5588,ip-add-membership=224.224.224.224:3.0.0.1,fork - # Multicast sender echo hello | nc -u -s 3.0.0.2 224.224.224.224 5588 Reported-by: liam.mcbirnie@boeing.com Fixes: 19e3a9c90c53 ("net: bridge: convert multicast to generic rhashtable") Signed-off-by: Nikolay Aleksandrov Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit f4473ccd883c89be4c31f3e8378dcb3ca2c868cd Author: Lorenzo Bianconi Date: Sat Apr 6 17:16:53 2019 +0200 net: ip6_gre: fix possible use-after-free in ip6erspan_rcv [ Upstream commit 2a3cabae4536edbcb21d344e7aa8be7a584d2afb ] erspan_v6 tunnels run __iptunnel_pull_header on received skbs to remove erspan header. This can determine a possible use-after-free accessing pkt_md pointer in ip6erspan_rcv since the packet will be 'uncloned' running pskb_expand_head if it is a cloned gso skb (e.g if the packet has been sent though a veth device). Fix it resetting pkt_md pointer after __iptunnel_pull_header Fixes: 1d7e2ed22f8d ("net: erspan: refactor existing erspan code") Signed-off-by: Lorenzo Bianconi Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit b49e1837b15eaf696e5f1f5c186e3f5ed0978580 Author: Lorenzo Bianconi Date: Sat Apr 6 17:16:52 2019 +0200 net: ip_gre: fix possible use-after-free in erspan_rcv [ Upstream commit 492b67e28ee5f2a2522fb72e3d3bcb990e461514 ] erspan tunnels run __iptunnel_pull_header on received skbs to remove gre and erspan headers. This can determine a possible use-after-free accessing pkt_md pointer in erspan_rcv since the packet will be 'uncloned' running pskb_expand_head if it is a cloned gso skb (e.g if the packet has been sent though a veth device). Fix it resetting pkt_md pointer after __iptunnel_pull_header Fixes: 1d7e2ed22f8d ("net: erspan: refactor existing erspan code") Signed-off-by: Lorenzo Bianconi Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit eefa6c2464c8503bf2004625ffb383e5c4c95497 Author: Michael Chan Date: Mon Apr 8 17:39:55 2019 -0400 bnxt_en: Reset device on RX buffer errors. [ Upstream commit 8e44e96c6c8e8fb80b84a2ca11798a8554f710f2 ] If the RX completion indicates RX buffers errors, the RX ring will be disabled by firmware and no packets will be received on that ring from that point on. Recover by resetting the device. Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.") Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit aecbbae850ede49183fa0b73c7445531a47669cf Author: Michael Chan Date: Mon Apr 8 17:39:54 2019 -0400 bnxt_en: Improve RX consumer index validity check. [ Upstream commit a1b0e4e684e9c300b9e759b46cb7a0147e61ddff ] There is logic to check that the RX/TPA consumer index is the expected index to work around a hardware problem. However, the potentially bad consumer index is first used to index into an array to reference an entry. This can potentially crash if the bad consumer index is beyond legal range. Improve the logic to use the consumer index for dereferencing after the validity check and log an error message. Fixes: fa7e28127a5a ("bnxt_en: Add workaround to detect bad opaque in rx completion (part 2)") Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit c43bbe6d49f48f318cd6a911d3c01d04bf2ed5bc Author: Jakub Kicinski Date: Wed Mar 27 11:38:39 2019 -0700 nfp: disable netpoll on representors [ Upstream commit c3e1f7fff69c78169c8ac40cc74ac4307f74e36d ] NFP reprs are software device on top of the PF's vNIC. The comment above __dev_queue_xmit() sayeth: When calling this method, interrupts MUST be enabled. This is because the BH enable code must have IRQs enabled so that it will not deadlock. For netconsole we can't guarantee IRQ state, let's just disable netpoll on representors to be on the safe side. When the initial implementation of NFP reprs was added by the commit 5de73ee46704 ("nfp: general representor implementation") .ndo_poll_controller was required for netpoll to be enabled. Fixes: ac3d9dd034e5 ("netpoll: make ndo_poll_controller() optional") Signed-off-by: Jakub Kicinski Reviewed-by: John Hurley Reviewed-by: Simon Horman Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit c974a681e99603fc68186a2d9a0df8b9c276fcf9 Author: Jakub Kicinski Date: Wed Mar 27 11:38:38 2019 -0700 nfp: validate the return code from dev_queue_xmit() [ Upstream commit c8ba5b91a04e3e2643e48501c114108802f21cda ] dev_queue_xmit() may return error codes as well as netdev_tx_t, and it always consumes the skb. Make sure we always return a correct netdev_tx_t value. Fixes: eadfa4c3be99 ("nfp: add stats and xmit helpers for representors") Signed-off-by: Jakub Kicinski Reviewed-by: John Hurley Reviewed-by: Simon Horman Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 6fc42565470ab355097ab7b5091da390468ba55b Author: Yuval Avnery Date: Mon Mar 11 06:18:24 2019 +0200 net/mlx5e: Add a lock on tir list [ Upstream commit 80a2a9026b24c6bd34b8d58256973e22270bedec ] Refresh tirs is looping over a global list of tirs while netdevs are adding and removing tirs from that list. That is why a lock is required. Fixes: 724b2aa15126 ("net/mlx5e: TIRs management refactoring") Signed-off-by: Yuval Avnery Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 44bd84f1b5a52df30428f8b3cf1f0d3bc6ae4a5a Author: Gavi Teitz Date: Mon Mar 11 11:56:34 2019 +0200 net/mlx5e: Fix error handling when refreshing TIRs [ Upstream commit bc87a0036826a37b43489b029af8143bd07c6cca ] Previously, a false positive would be caught if the TIRs list is empty, since the err value was initialized to -ENOMEM, and was only updated if a TIR is refreshed. This is resolved by initializing the err value to zero. Fixes: b676f653896a ("net/mlx5e: Refactor refresh TIRs") Signed-off-by: Gavi Teitz Reviewed-by: Roi Dayan Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 59c5f595a120808cf26f426af162fc9237b57f9b Author: Stephen Suryaputra Date: Mon Apr 1 09:17:32 2019 -0400 vrf: check accept_source_route on the original netdevice [ Upstream commit 8c83f2df9c6578ea4c5b940d8238ad8a41b87e9e ] Configuration check to accept source route IP options should be made on the incoming netdevice when the skb->dev is an l3mdev master. The route lookup for the source route next hop also needs the incoming netdev. v2->v3: - Simplify by passing the original netdevice down the stack (per David Ahern). Signed-off-by: Stephen Suryaputra Reviewed-by: David Ahern Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 71707cc55c39a2f22886c198ddaf788cb06d61f6 Author: Dust Li Date: Mon Apr 1 16:04:53 2019 +0800 tcp: fix a potential NULL pointer dereference in tcp_sk_exit [ Upstream commit b506bc975f60f06e13e74adb35e708a23dc4e87c ] When tcp_sk_init() failed in inet_ctl_sock_create(), 'net->ipv4.tcp_congestion_control' will be left uninitialized, but tcp_sk_exit() hasn't check for that. This patch add checking on 'net->ipv4.tcp_congestion_control' in tcp_sk_exit() to prevent NULL-ptr dereference. Fixes: 6670e1524477 ("tcp: Namespace-ify sysctl_tcp_default_congestion_control") Signed-off-by: Dust Li Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit bc7167651e3024361e74fe14d767f17d144f6c73 Author: Koen De Schepper Date: Thu Apr 4 12:24:02 2019 +0000 tcp: Ensure DCTCP reacts to losses [ Upstream commit aecfde23108b8e637d9f5c5e523b24fb97035dc3 ] RFC8257 §3.5 explicitly states that "A DCTCP sender MUST react to loss episodes in the same way as conventional TCP". Currently, Linux DCTCP performs no cwnd reduction when losses are encountered. Optionally, the dctcp_clamp_alpha_on_loss resets alpha to its maximal value if a RTO happens. This behavior is sub-optimal for at least two reasons: i) it ignores losses triggering fast retransmissions; and ii) it causes unnecessary large cwnd reduction in the future if the loss was isolated as it resets the historical term of DCTCP's alpha EWMA to its maximal value (i.e., denoting a total congestion). The second reason has an especially noticeable effect when using DCTCP in high BDP environments, where alpha normally stays at low values. This patch replace the clamping of alpha by setting ssthresh to half of cwnd for both fast retransmissions and RTOs, at most once per RTT. Consequently, the dctcp_clamp_alpha_on_loss module parameter has been removed. The table below shows experimental results where we measured the drop probability of a PIE AQM (not applying ECN marks) at a bottleneck in the presence of a single TCP flow with either the alpha-clamping option enabled or the cwnd halving proposed by this patch. Results using reno or cubic are given for comparison. | Link | RTT | Drop TCP CC | speed | base+AQM | probability ==================|=========|==========|============ CUBIC | 40Mbps | 7+20ms | 0.21% RENO | | | 0.19% DCTCP-CLAMP-ALPHA | | | 25.80% DCTCP-HALVE-CWND | | | 0.22% ------------------|---------|----------|------------ CUBIC | 100Mbps | 7+20ms | 0.03% RENO | | | 0.02% DCTCP-CLAMP-ALPHA | | | 23.30% DCTCP-HALVE-CWND | | | 0.04% ------------------|---------|----------|------------ CUBIC | 800Mbps | 1+1ms | 0.04% RENO | | | 0.05% DCTCP-CLAMP-ALPHA | | | 18.70% DCTCP-HALVE-CWND | | | 0.06% We see that, without halving its cwnd for all source of losses, DCTCP drives the AQM to large drop probabilities in order to keep the queue length under control (i.e., it repeatedly faces RTOs). Instead, if DCTCP reacts to all source of losses, it can then be controlled by the AQM using similar drop levels than cubic or reno. Signed-off-by: Koen De Schepper Signed-off-by: Olivier Tilmans Cc: Bob Briscoe Cc: Lawrence Brakmo Cc: Florian Westphal Cc: Daniel Borkmann Cc: Yuchung Cheng Cc: Neal Cardwell Cc: Eric Dumazet Cc: Andrew Shewmaker Cc: Glenn Judd Acked-by: Florian Westphal Acked-by: Neal Cardwell Acked-by: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit cd1b7376d8a395333bfbd0b649bbb99f1bef0ed8 Author: Xin Long Date: Sun Mar 31 16:58:15 2019 +0800 sctp: initialize _pad of sockaddr_in before copying to user memory [ Upstream commit 09279e615c81ce55e04835970601ae286e3facbe ] Syzbot report a kernel-infoleak: BUG: KMSAN: kernel-infoleak in _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32 Call Trace: _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32 copy_to_user include/linux/uaccess.h:174 [inline] sctp_getsockopt_peer_addrs net/sctp/socket.c:5911 [inline] sctp_getsockopt+0x1668e/0x17f70 net/sctp/socket.c:7562 ... Uninit was stored to memory at: sctp_transport_init net/sctp/transport.c:61 [inline] sctp_transport_new+0x16d/0x9a0 net/sctp/transport.c:115 sctp_assoc_add_peer+0x532/0x1f70 net/sctp/associola.c:637 sctp_process_param net/sctp/sm_make_chunk.c:2548 [inline] sctp_process_init+0x1a1b/0x3ed0 net/sctp/sm_make_chunk.c:2361 ... Bytes 8-15 of 16 are uninitialized It was caused by that th _pad field (the 8-15 bytes) of a v4 addr (saved in struct sockaddr_in) wasn't initialized, but directly copied to user memory in sctp_getsockopt_peer_addrs(). So fix it by calling memset(addr->v4.sin_zero, 0, 8) to initialize _pad of sockaddr_in before copying it to user memory in sctp_v4_addr_to_user(), as sctp_v6_addr_to_user() does. Reported-by: syzbot+86b5c7c236a22616a72f@syzkaller.appspotmail.com Signed-off-by: Xin Long Tested-by: Alexander Potapenko Acked-by: Neil Horman Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 3c36cc5bdf89d2f5e49d7c867700adf782b98d8e Author: Heiner Kallweit Date: Fri Apr 5 20:46:46 2019 +0200 r8169: disable ASPM again [ Upstream commit b75bb8a5b755d0c7bf1ac071e4df2349a7644a1e ] There's a significant number of reports that re-enabling ASPM causes different issues, ranging from decreased performance to system not booting at all. This affects only a minority of users, but the number of affected users is big enough that we better switch off ASPM again. This will hurt notebook users who are not affected by the issues, they may see decreased battery runtime w/o ASPM. With the PCI core folks is being discussed to add generic sysfs attributes to control ASPM. Once this is in place brave enough users can re-enable ASPM on their system. Fixes: a99790bf5c7f ("r8169: Reinstate ASPM Support") Signed-off-by: Heiner Kallweit Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit cdeed1e0f47ea642d916706a9c464fe997533c1d Author: Bjørn Mork Date: Wed Mar 27 15:26:01 2019 +0100 qmi_wwan: add Olicard 600 [ Upstream commit 6289d0facd9ebce4cc83e5da39e15643ee998dc5 ] This is a Qualcomm based device with a QMI function on interface 4. It is mode switched from 2020:2030 using a standard eject message. T: Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 6 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=2020 ProdID=2031 Rev= 2.32 S: Manufacturer=Mobile Connect S: Product=Mobile Connect S: SerialNumber=0123456789ABCDEF C:* #Ifs= 6 Cfg#= 1 Atr=80 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none) E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none) E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none) E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none) E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none) E: Ad=89(I) Atr=03(Int.) MxPS= 8 Ivl=32ms E: Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 5 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=(none) E: Ad=8a(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=125us Signed-off-by: Bjørn Mork Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 3bcad39f08caf8ce0881628d118fdc71c5d64da8 Author: Andrea Righi Date: Thu Mar 28 07:36:00 2019 +0100 openvswitch: fix flow actions reallocation [ Upstream commit f28cd2af22a0c134e4aa1c64a70f70d815d473fb ] The flow action buffer can be resized if it's not big enough to contain all the requested flow actions. However, this resize doesn't take into account the new requested size, the buffer is only increased by a factor of 2x. This might be not enough to contain the new data, causing a buffer overflow, for example: [ 42.044472] ============================================================================= [ 42.045608] BUG kmalloc-96 (Not tainted): Redzone overwritten [ 42.046415] ----------------------------------------------------------------------------- [ 42.047715] Disabling lock debugging due to kernel taint [ 42.047716] INFO: 0x8bf2c4a5-0x720c0928. First byte 0x0 instead of 0xcc [ 42.048677] INFO: Slab 0xbc6d2040 objects=29 used=18 fp=0xdc07dec4 flags=0x2808101 [ 42.049743] INFO: Object 0xd53a3464 @offset=2528 fp=0xccdcdebb [ 42.050747] Redzone 76f1b237: cc cc cc cc cc cc cc cc ........ [ 42.051839] Object d53a3464: 6b 6b 6b 6b 6b 6b 6b 6b 0c 00 00 00 6c 00 00 00 kkkkkkkk....l... [ 42.053015] Object f49a30cc: 6c 00 0c 00 00 00 00 00 00 00 00 03 78 a3 15 f6 l...........x... [ 42.054203] Object acfe4220: 20 00 02 00 ff ff ff ff 00 00 00 00 00 00 00 00 ............... [ 42.055370] Object 21024e91: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [ 42.056541] Object 070e04c3: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [ 42.057797] Object 948a777a: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [ 42.059061] Redzone 8bf2c4a5: 00 00 00 00 .... [ 42.060189] Padding a681b46e: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ Fix by making sure the new buffer is properly resized to contain all the requested data. BugLink: https://bugs.launchpad.net/bugs/1813244 Signed-off-by: Andrea Righi Acked-by: Pravin B Shelar Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 48a1cd79cc7e780d711e9e0320ac4258ce1219ba Author: Nicolas Dichtel Date: Thu Mar 28 10:35:06 2019 +0100 net/sched: fix ->get helper of the matchall cls [ Upstream commit 0db6f8befc32c68bb13d7ffbb2e563c79e913e13 ] It returned always NULL, thus it was never possible to get the filter. Example: $ ip link add foo type dummy $ ip link add bar type dummy $ tc qdisc add dev foo clsact $ tc filter add dev foo protocol all pref 1 ingress handle 1234 \ matchall action mirred ingress mirror dev bar Before the patch: $ tc filter get dev foo protocol all pref 1 ingress handle 1234 matchall Error: Specified filter handle not found. We have an error talking to the kernel After: $ tc filter get dev foo protocol all pref 1 ingress handle 1234 matchall filter ingress protocol all pref 1 matchall chain 0 handle 0x4d2 not_in_hw action order 1: mirred (Ingress Mirror to device bar) pipe index 1 ref 1 bind 1 CC: Yotam Gigi CC: Jiri Pirko Fixes: fd62d9f5c575 ("net/sched: matchall: Fix configuration race") Signed-off-by: Nicolas Dichtel Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 4edf174b821e85146d730b87fb71631b23f62240 Author: Davide Caratti Date: Thu Apr 4 12:31:35 2019 +0200 net/sched: act_sample: fix divide by zero in the traffic path [ Upstream commit fae2708174ae95d98d19f194e03d6e8f688ae195 ] the control path of 'sample' action does not validate the value of 'rate' provided by the user, but then it uses it as divisor in the traffic path. Validate it in tcf_sample_init(), and return -EINVAL with a proper extack message in case that value is zero, to fix a splat with the script below: # tc f a dev test0 egress matchall action sample rate 0 group 1 index 2 # tc -s a s action sample total acts 1 action order 0: sample rate 1/0 group 1 pipe index 2 ref 1 bind 1 installed 19 sec used 19 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 # ping 192.0.2.1 -I test0 -c1 -q divide error: 0000 [#1] SMP PTI CPU: 1 PID: 6192 Comm: ping Not tainted 5.1.0-rc2.diag2+ #591 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:tcf_sample_act+0x9e/0x1e0 [act_sample] Code: 6a f1 85 c0 74 0d 80 3d 83 1a 00 00 00 0f 84 9c 00 00 00 4d 85 e4 0f 84 85 00 00 00 e8 9b d7 9c f1 44 8b 8b e0 00 00 00 31 d2 <41> f7 f1 85 d2 75 70 f6 85 83 00 00 00 10 48 8b 45 10 8b 88 08 01 RSP: 0018:ffffae320190ba30 EFLAGS: 00010246 RAX: 00000000b0677d21 RBX: ffff8af1ed9ec000 RCX: 0000000059a9fe49 RDX: 0000000000000000 RSI: 000000000c7e33b7 RDI: ffff8af23daa0af0 RBP: ffff8af1ee11b200 R08: 0000000074fcaf7e R09: 0000000000000000 R10: 0000000000000050 R11: ffffffffb3088680 R12: ffff8af232307f80 R13: 0000000000000003 R14: ffff8af1ed9ec000 R15: 0000000000000000 FS: 00007fe9c6d2f740(0000) GS:ffff8af23da80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fff6772f000 CR3: 00000000746a2004 CR4: 00000000001606e0 Call Trace: tcf_action_exec+0x7c/0x1c0 tcf_classify+0x57/0x160 __dev_queue_xmit+0x3dc/0xd10 ip_finish_output2+0x257/0x6d0 ip_output+0x75/0x280 ip_send_skb+0x15/0x40 raw_sendmsg+0xae3/0x1410 sock_sendmsg+0x36/0x40 __sys_sendto+0x10e/0x140 __x64_sys_sendto+0x24/0x30 do_syscall_64+0x60/0x210 entry_SYSCALL_64_after_hwframe+0x49/0xbe [...] Kernel panic - not syncing: Fatal exception in interrupt Add a TDC selftest to document that 'rate' is now being validated. Reported-by: Matteo Croce Fixes: 5c5670fae430 ("net/sched: Introduce sample tc action") Signed-off-by: Davide Caratti Acked-by: Yotam Gigi Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 4e8d8e767f11aa420b6fef1f260d40010df77628 Author: Mao Wenan Date: Thu Mar 28 17:10:56 2019 +0800 net: rds: force to destroy connection if t_sock is NULL in rds_tcp_kill_sock(). [ Upstream commit cb66ddd156203daefb8d71158036b27b0e2caf63 ] When it is to cleanup net namespace, rds_tcp_exit_net() will call rds_tcp_kill_sock(), if t_sock is NULL, it will not call rds_conn_destroy(), rds_conn_path_destroy() and rds_tcp_conn_free() to free connection, and the worker cp_conn_w is not stopped, afterwards the net is freed in net_drop_ns(); While cp_conn_w rds_connect_worker() will call rds_tcp_conn_path_connect() and reference 'net' which has already been freed. In rds_tcp_conn_path_connect(), rds_tcp_set_callbacks() will set t_sock = sock before sock->ops->connect, but if connect() is failed, it will call rds_tcp_restore_callbacks() and set t_sock = NULL, if connect is always failed, rds_connect_worker() will try to reconnect all the time, so rds_tcp_kill_sock() will never to cancel worker cp_conn_w and free the connections. Therefore, the condition !tc->t_sock is not needed if it is going to do cleanup_net->rds_tcp_exit_net->rds_tcp_kill_sock, because tc->t_sock is always NULL, and there is on other path to cancel cp_conn_w and free connection. So this patch is to fix this. rds_tcp_kill_sock(): ... if (net != c_net || !tc->t_sock) ... Acked-by: Santosh Shilimkar ================================================================== BUG: KASAN: use-after-free in inet_create+0xbcc/0xd28 net/ipv4/af_inet.c:340 Read of size 4 at addr ffff8003496a4684 by task kworker/u8:4/3721 CPU: 3 PID: 3721 Comm: kworker/u8:4 Not tainted 5.1.0 #11 Hardware name: linux,dummy-virt (DT) Workqueue: krdsd rds_connect_worker Call trace: dump_backtrace+0x0/0x3c0 arch/arm64/kernel/time.c:53 show_stack+0x28/0x38 arch/arm64/kernel/traps.c:152 __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x120/0x188 lib/dump_stack.c:113 print_address_description+0x68/0x278 mm/kasan/report.c:253 kasan_report_error mm/kasan/report.c:351 [inline] kasan_report+0x21c/0x348 mm/kasan/report.c:409 __asan_report_load4_noabort+0x30/0x40 mm/kasan/report.c:429 inet_create+0xbcc/0xd28 net/ipv4/af_inet.c:340 __sock_create+0x4f8/0x770 net/socket.c:1276 sock_create_kern+0x50/0x68 net/socket.c:1322 rds_tcp_conn_path_connect+0x2b4/0x690 net/rds/tcp_connect.c:114 rds_connect_worker+0x108/0x1d0 net/rds/threads.c:175 process_one_work+0x6e8/0x1700 kernel/workqueue.c:2153 worker_thread+0x3b0/0xdd0 kernel/workqueue.c:2296 kthread+0x2f0/0x378 kernel/kthread.c:255 ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1117 Allocated by task 687: save_stack mm/kasan/kasan.c:448 [inline] set_track mm/kasan/kasan.c:460 [inline] kasan_kmalloc+0xd4/0x180 mm/kasan/kasan.c:553 kasan_slab_alloc+0x14/0x20 mm/kasan/kasan.c:490 slab_post_alloc_hook mm/slab.h:444 [inline] slab_alloc_node mm/slub.c:2705 [inline] slab_alloc mm/slub.c:2713 [inline] kmem_cache_alloc+0x14c/0x388 mm/slub.c:2718 kmem_cache_zalloc include/linux/slab.h:697 [inline] net_alloc net/core/net_namespace.c:384 [inline] copy_net_ns+0xc4/0x2d0 net/core/net_namespace.c:424 create_new_namespaces+0x300/0x658 kernel/nsproxy.c:107 unshare_nsproxy_namespaces+0xa0/0x198 kernel/nsproxy.c:206 ksys_unshare+0x340/0x628 kernel/fork.c:2577 __do_sys_unshare kernel/fork.c:2645 [inline] __se_sys_unshare kernel/fork.c:2643 [inline] __arm64_sys_unshare+0x38/0x58 kernel/fork.c:2643 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline] invoke_syscall arch/arm64/kernel/syscall.c:47 [inline] el0_svc_common+0x168/0x390 arch/arm64/kernel/syscall.c:83 el0_svc_handler+0x60/0xd0 arch/arm64/kernel/syscall.c:129 el0_svc+0x8/0xc arch/arm64/kernel/entry.S:960 Freed by task 264: save_stack mm/kasan/kasan.c:448 [inline] set_track mm/kasan/kasan.c:460 [inline] __kasan_slab_free+0x114/0x220 mm/kasan/kasan.c:521 kasan_slab_free+0x10/0x18 mm/kasan/kasan.c:528 slab_free_hook mm/slub.c:1370 [inline] slab_free_freelist_hook mm/slub.c:1397 [inline] slab_free mm/slub.c:2952 [inline] kmem_cache_free+0xb8/0x3a8 mm/slub.c:2968 net_free net/core/net_namespace.c:400 [inline] net_drop_ns.part.6+0x78/0x90 net/core/net_namespace.c:407 net_drop_ns net/core/net_namespace.c:406 [inline] cleanup_net+0x53c/0x6d8 net/core/net_namespace.c:569 process_one_work+0x6e8/0x1700 kernel/workqueue.c:2153 worker_thread+0x3b0/0xdd0 kernel/workqueue.c:2296 kthread+0x2f0/0x378 kernel/kthread.c:255 ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1117 The buggy address belongs to the object at ffff8003496a3f80 which belongs to the cache net_namespace of size 7872 The buggy address is located 1796 bytes inside of 7872-byte region [ffff8003496a3f80, ffff8003496a5e40) The buggy address belongs to the page: page:ffff7e000d25a800 count:1 mapcount:0 mapping:ffff80036ce4b000 index:0x0 compound_mapcount: 0 flags: 0xffffe0000008100(slab|head) raw: 0ffffe0000008100 dead000000000100 dead000000000200 ffff80036ce4b000 raw: 0000000000000000 0000000080040004 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8003496a4580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8003496a4600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff8003496a4680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff8003496a4700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8003496a4780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Fixes: 467fa15356ac("RDS-TCP: Support multiple RDS-TCP listen endpoints, one per netns.") Reported-by: Hulk Robot Signed-off-by: Mao Wenan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit ec7aeb6a0709855c6b76bc10e8a218e84d80fc68 Author: Eric Dumazet Date: Wed Mar 27 08:21:30 2019 -0700 netns: provide pure entropy for net_hash_mix() [ Upstream commit 355b98553789b646ed97ad801a619ff898471b92 ] net_hash_mix() currently uses kernel address of a struct net, and is used in many places that could be used to reveal this address to a patient attacker, thus defeating KASLR, for the typical case (initial net namespace, &init_net is not dynamically allocated) I believe the original implementation tried to avoid spending too many cycles in this function, but security comes first. Also provide entropy regardless of CONFIG_NET_NS. Fixes: 0b4419162aa6 ("netns: introduce the net_hash_mix "salt" for hashes") Signed-off-by: Eric Dumazet Reported-by: Amit Klein Reported-by: Benny Pinkas Cc: Pavel Emelyanov Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 6ced07adaead7f520ef39f555bc9bdddf6329c56 Author: Artemy Kovalyov Date: Tue Mar 19 11:24:38 2019 +0200 net/mlx5: Decrease default mr cache size [ Upstream commit e8b26b2135dedc0284490bfeac06dfc4418d0105 ] Delete initialization of high order entries in mr cache to decrease initial memory footprint. When required, the administrator can populate the entries with memory keys via the /sys interface. This approach is very helpful to significantly reduce the per HW function memory footprint in virtualization environments such as SRIOV. Fixes: 9603b61de1ee ("mlx5: Move pci device handling from mlx5_ib to mlx5_core") Signed-off-by: Artemy Kovalyov Signed-off-by: Moni Shoua Signed-off-by: Leon Romanovsky Reported-by: Shalom Toledo Acked-by: Or Gerlitz Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 67b0fbfaf828757aba6cab27b3fe7280ee2ad171 Author: Steffen Klassert Date: Tue Apr 2 08:16:03 2019 +0200 net-gro: Fix GRO flush when receiving a GSO packet. [ Upstream commit 0ab03f353d3613ea49d1f924faf98559003670a8 ] Currently we may merge incorrectly a received GSO packet or a packet with frag_list into a packet sitting in the gro_hash list. skb_segment() may crash case because the assumptions on the skb layout are not met. The correct behaviour would be to flush the packet in the gro_hash list and send the received GSO packet directly afterwards. Commit d61d072e87c8e ("net-gro: avoid reorders") sets NAPI_GRO_CB(skb)->flush in this case, but this is not checked before merging. This patch makes sure to check this flag and to not merge in that case. Fixes: d61d072e87c8e ("net-gro: avoid reorders") Signed-off-by: Steffen Klassert Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 88b9d6f6aff8acd2a613fe1c3245b84d2ff59fc1 Author: Li RongQing Date: Fri Mar 29 09:18:02 2019 +0800 net: ethtool: not call vzalloc for zero sized memory request [ Upstream commit 3d8830266ffc28c16032b859e38a0252e014b631 ] NULL or ZERO_SIZE_PTR will be returned for zero sized memory request, and derefencing them will lead to a segfault so it is unnecessory to call vzalloc for zero sized memory request and not call functions which maybe derefence the NULL allocated memory this also fixes a possible memory leak if phy_ethtool_get_stats returns error, memory should be freed before exit Signed-off-by: Li RongQing Reviewed-by: Wang Li Reviewed-by: Michal Kubecek Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 351ddbdf532c4f44a2b91cf8163f363fb632d2cf Author: Jiri Slaby Date: Fri Mar 29 12:19:46 2019 +0100 kcm: switch order of device registration to fix a crash [ Upstream commit 3c446e6f96997f2a95bf0037ef463802162d2323 ] When kcm is loaded while many processes try to create a KCM socket, a crash occurs: BUG: unable to handle kernel NULL pointer dereference at 000000000000000e IP: mutex_lock+0x27/0x40 kernel/locking/mutex.c:240 PGD 8000000016ef2067 P4D 8000000016ef2067 PUD 3d6e9067 PMD 0 Oops: 0002 [#1] SMP KASAN PTI CPU: 0 PID: 7005 Comm: syz-executor.5 Not tainted 4.12.14-396-default #1 SLE15-SP1 (unreleased) RIP: 0010:mutex_lock+0x27/0x40 kernel/locking/mutex.c:240 RSP: 0018:ffff88000d487a00 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 000000000000000e RCX: 1ffff100082b0719 ... CR2: 000000000000000e CR3: 000000004b1bc003 CR4: 0000000000060ef0 Call Trace: kcm_create+0x600/0xbf0 [kcm] __sock_create+0x324/0x750 net/socket.c:1272 ... This is due to race between sock_create and unfinished register_pernet_device. kcm_create tries to do "net_generic(net, kcm_net_id)". but kcm_net_id is not initialized yet. So switch the order of the two to close the race. This can be reproduced with mutiple processes doing socket(PF_KCM, ...) and one process doing module removal. Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Reviewed-by: Michal Kubecek Signed-off-by: Jiri Slaby Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 472a290314f0a1e6342c3cf252ff2b675c6ae09a Author: Lorenzo Bianconi Date: Thu Apr 4 16:37:53 2019 +0200 ipv6: sit: reset ip header pointer in ipip6_rcv [ Upstream commit bb9bd814ebf04f579be466ba61fc922625508807 ] ipip6 tunnels run iptunnel_pull_header on received skbs. This can determine the following use-after-free accessing iph pointer since the packet will be 'uncloned' running pskb_expand_head if it is a cloned gso skb (e.g if the packet has been sent though a veth device) [ 706.369655] BUG: KASAN: use-after-free in ipip6_rcv+0x1678/0x16e0 [sit] [ 706.449056] Read of size 1 at addr ffffe01b6bd855f5 by task ksoftirqd/1/= [ 706.669494] Hardware name: HPE ProLiant m400 Server/ProLiant m400 Server, BIOS U02 08/19/2016 [ 706.771839] Call trace: [ 706.801159] dump_backtrace+0x0/0x2f8 [ 706.845079] show_stack+0x24/0x30 [ 706.884833] dump_stack+0xe0/0x11c [ 706.925629] print_address_description+0x68/0x260 [ 706.982070] kasan_report+0x178/0x340 [ 707.025995] __asan_report_load1_noabort+0x30/0x40 [ 707.083481] ipip6_rcv+0x1678/0x16e0 [sit] [ 707.132623] tunnel64_rcv+0xd4/0x200 [tunnel4] [ 707.185940] ip_local_deliver_finish+0x3b8/0x988 [ 707.241338] ip_local_deliver+0x144/0x470 [ 707.289436] ip_rcv_finish+0x43c/0x14b0 [ 707.335447] ip_rcv+0x628/0x1138 [ 707.374151] __netif_receive_skb_core+0x1670/0x2600 [ 707.432680] __netif_receive_skb+0x28/0x190 [ 707.482859] process_backlog+0x1d0/0x610 [ 707.529913] net_rx_action+0x37c/0xf68 [ 707.574882] __do_softirq+0x288/0x1018 [ 707.619852] run_ksoftirqd+0x70/0xa8 [ 707.662734] smpboot_thread_fn+0x3a4/0x9e8 [ 707.711875] kthread+0x2c8/0x350 [ 707.750583] ret_from_fork+0x10/0x18 [ 707.811302] Allocated by task 16982: [ 707.854182] kasan_kmalloc.part.1+0x40/0x108 [ 707.905405] kasan_kmalloc+0xb4/0xc8 [ 707.948291] kasan_slab_alloc+0x14/0x20 [ 707.994309] __kmalloc_node_track_caller+0x158/0x5e0 [ 708.053902] __kmalloc_reserve.isra.8+0x54/0xe0 [ 708.108280] __alloc_skb+0xd8/0x400 [ 708.150139] sk_stream_alloc_skb+0xa4/0x638 [ 708.200346] tcp_sendmsg_locked+0x818/0x2b90 [ 708.251581] tcp_sendmsg+0x40/0x60 [ 708.292376] inet_sendmsg+0xf0/0x520 [ 708.335259] sock_sendmsg+0xac/0xf8 [ 708.377096] sock_write_iter+0x1c0/0x2c0 [ 708.424154] new_sync_write+0x358/0x4a8 [ 708.470162] __vfs_write+0xc4/0xf8 [ 708.510950] vfs_write+0x12c/0x3d0 [ 708.551739] ksys_write+0xcc/0x178 [ 708.592533] __arm64_sys_write+0x70/0xa0 [ 708.639593] el0_svc_handler+0x13c/0x298 [ 708.686646] el0_svc+0x8/0xc [ 708.739019] Freed by task 17: [ 708.774597] __kasan_slab_free+0x114/0x228 [ 708.823736] kasan_slab_free+0x10/0x18 [ 708.868703] kfree+0x100/0x3d8 [ 708.905320] skb_free_head+0x7c/0x98 [ 708.948204] skb_release_data+0x320/0x490 [ 708.996301] pskb_expand_head+0x60c/0x970 [ 709.044399] __iptunnel_pull_header+0x3b8/0x5d0 [ 709.098770] ipip6_rcv+0x41c/0x16e0 [sit] [ 709.146873] tunnel64_rcv+0xd4/0x200 [tunnel4] [ 709.200195] ip_local_deliver_finish+0x3b8/0x988 [ 709.255596] ip_local_deliver+0x144/0x470 [ 709.303692] ip_rcv_finish+0x43c/0x14b0 [ 709.349705] ip_rcv+0x628/0x1138 [ 709.388413] __netif_receive_skb_core+0x1670/0x2600 [ 709.446943] __netif_receive_skb+0x28/0x190 [ 709.497120] process_backlog+0x1d0/0x610 [ 709.544169] net_rx_action+0x37c/0xf68 [ 709.589131] __do_softirq+0x288/0x1018 [ 709.651938] The buggy address belongs to the object at ffffe01b6bd85580 which belongs to the cache kmalloc-1024 of size 1024 [ 709.804356] The buggy address is located 117 bytes inside of 1024-byte region [ffffe01b6bd85580, ffffe01b6bd85980) [ 709.946340] The buggy address belongs to the page: [ 710.003824] page:ffff7ff806daf600 count:1 mapcount:0 mapping:ffffe01c4001f600 index:0x0 [ 710.099914] flags: 0xfffff8000000100(slab) [ 710.149059] raw: 0fffff8000000100 dead000000000100 dead000000000200 ffffe01c4001f600 [ 710.242011] raw: 0000000000000000 0000000000380038 00000001ffffffff 0000000000000000 [ 710.334966] page dumped because: kasan: bad access detected Fix it resetting iph pointer after iptunnel_pull_header Fixes: a09a4c8dd1ec ("tunnels: Remove encapsulation offloads on decap") Tested-by: Jianlin Shi Signed-off-by: Lorenzo Bianconi Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit ccec3a09c429bdf3ab7e6df8850b0d505f606b59 Author: Junwei Hu Date: Tue Apr 2 19:38:04 2019 +0800 ipv6: Fix dangling pointer when ipv6 fragment [ Upstream commit ef0efcd3bd3fd0589732b67fb586ffd3c8705806 ] At the beginning of ip6_fragment func, the prevhdr pointer is obtained in the ip6_find_1stfragopt func. However, all the pointers pointing into skb header may change when calling skb_checksum_help func with skb->ip_summed = CHECKSUM_PARTIAL condition. The prevhdr pointe will be dangling if it is not reloaded after calling __skb_linearize func in skb_checksum_help func. Here, I add a variable, nexthdr_offset, to evaluate the offset, which does not changes even after calling __skb_linearize func. Fixes: 405c92f7a541 ("ipv6: add defensive check for CHECKSUM_PARTIAL skbs in ip_fragment") Signed-off-by: Junwei Hu Reported-by: Wenhao Zhang Reported-by: syzbot+e8ce541d095e486074fc@syzkaller.appspotmail.com Reviewed-by: Zhiqiang Liu Acked-by: Martin KaFai Lau Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit fd4ecb124730e5ed5f65d8102ab4b7703a989496 Author: Sheena Mira-ato Date: Mon Apr 1 13:04:42 2019 +1300 ip6_tunnel: Match to ARPHRD_TUNNEL6 for dev type [ Upstream commit b2e54b09a3d29c4db883b920274ca8dca4d9f04d ] The device type for ip6 tunnels is set to ARPHRD_TUNNEL6. However, the ip4ip6_err function is expecting the device type of the tunnel to be ARPHRD_TUNNEL. Since the device types do not match, the function exits and the ICMP error packet is not sent to the originating host. Note that the device type for IPv4 tunnels is set to ARPHRD_TUNNEL. Fix is to expect a tunnel device type of ARPHRD_TUNNEL6 instead. Now the tunnel device type matches and the ICMP error packet is sent to the originating host. Signed-off-by: Sheena Mira-ato Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 4f90b9ca3cb22daafd105694b12aad7d5c919a2d Author: Thomas Falcon Date: Thu Apr 4 18:58:26 2019 -0500 ibmvnic: Fix completion structure initialization [ Upstream commit bbd669a868bba591ffd38b7bc75a7b361bb54b04 ] Fix device initialization completion handling for vNIC adapters. Initialize the completion structure on probe and reinitialize when needed. This also fixes a race condition during kdump where the driver can attempt to access the completion struct before it is initialized: Unable to handle kernel paging request for data at address 0x00000000 Faulting instruction address: 0xc0000000081acbe0 Oops: Kernel access of bad area, sig: 11 [#1] LE SMP NR_CPUS=2048 NUMA pSeries Modules linked in: ibmvnic(+) ibmveth sunrpc overlay squashfs loop CPU: 19 PID: 301 Comm: systemd-udevd Not tainted 4.18.0-64.el8.ppc64le #1 NIP: c0000000081acbe0 LR: c0000000081ad964 CTR: c0000000081ad900 REGS: c000000027f3f990 TRAP: 0300 Not tainted (4.18.0-64.el8.ppc64le) MSR: 800000010280b033 CR: 28228288 XER: 00000006 CFAR: c000000008008934 DAR: 0000000000000000 DSISR: 40000000 IRQMASK: 1 GPR00: c0000000081ad964 c000000027f3fc10 c0000000095b5800 c0000000221b4e58 GPR04: 0000000000000003 0000000000000001 000049a086918581 00000000000000d4 GPR08: 0000000000000007 0000000000000000 ffffffffffffffe8 d0000000014dde28 GPR12: c0000000081ad900 c000000009a00c00 0000000000000001 0000000000000100 GPR16: 0000000000000038 0000000000000007 c0000000095e2230 0000000000000006 GPR20: 0000000000400140 0000000000000001 c00000000910c880 0000000000000000 GPR24: 0000000000000000 0000000000000006 0000000000000000 0000000000000003 GPR28: 0000000000000001 0000000000000001 c0000000221b4e60 c0000000221b4e58 NIP [c0000000081acbe0] __wake_up_locked+0x50/0x100 LR [c0000000081ad964] complete+0x64/0xa0 Call Trace: [c000000027f3fc10] [c000000027f3fc60] 0xc000000027f3fc60 (unreliable) [c000000027f3fc60] [c0000000081ad964] complete+0x64/0xa0 [c000000027f3fca0] [d0000000014dad58] ibmvnic_handle_crq+0xce0/0x1160 [ibmvnic] [c000000027f3fd50] [d0000000014db270] ibmvnic_tasklet+0x98/0x130 [ibmvnic] [c000000027f3fda0] [c00000000813f334] tasklet_action_common.isra.3+0xc4/0x1a0 [c000000027f3fe00] [c000000008cd13f4] __do_softirq+0x164/0x400 [c000000027f3fef0] [c00000000813ed64] irq_exit+0x184/0x1c0 [c000000027f3ff20] [c0000000080188e8] __do_irq+0xb8/0x210 [c000000027f3ff90] [c00000000802d0a4] call_do_irq+0x14/0x24 [c000000026a5b010] [c000000008018adc] do_IRQ+0x9c/0x130 [c000000026a5b060] [c000000008008ce4] hardware_interrupt_common+0x114/0x120 Signed-off-by: Thomas Falcon Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit b68b3427a2a20803996d0024c7f2073a885b11f7 Author: Haiyang Zhang Date: Thu Mar 28 19:40:36 2019 +0000 hv_netvsc: Fix unwanted wakeup after tx_disable [ Upstream commit 1b704c4a1ba95574832e730f23817b651db2aa59 ] After queue stopped, the wakeup mechanism may wake it up again when ring buffer usage is lower than a threshold. This may cause send path panic on NULL pointer when we stopped all tx queues in netvsc_detach and start removing the netvsc device. This patch fix it by adding a tx_disable flag to prevent unwanted queue wakeup. Fixes: 7b2ee50c0cd5 ("hv_netvsc: common detach logic") Reported-by: Mohammed Gamal Signed-off-by: Haiyang Zhang Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 5160fb3353f5c4aeee5a1c072ff4283d73130257 Author: Taehee Yoo Date: Tue Mar 19 13:22:41 2019 +0900 netfilter: nf_tables: add missing ->release_ops() in error path of newrule() [ Upstream commit b25a31bf0ca091aa8bdb9ab329b0226257568bbe ] ->release_ops() callback releases resources and this is used in error path. If nf_tables_newrule() fails after ->select_ops(), it should release resources. but it can not call ->destroy() because that should be called after ->init(). At this point, ->release_ops() should be used for releasing resources. Test commands: modprobe -rv xt_tcpudp iptables-nft -I INPUT -m tcp <-- error command lsmod Result: Module Size Used by xt_tcpudp 20480 2 <-- it should be 0 Fixes: b8e204006340 ("netfilter: nft_compat: use .release_ops and remove list of extension") Signed-off-by: Taehee Yoo Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 19589530ec6d22ff340d9a13d13f81ee813e79b5 Author: Pablo Neira Ayuso Date: Mon Mar 11 13:04:16 2019 +0100 netfilter: nf_tables: use-after-free in dynamic operations [ Upstream commit 3f3a390dbd59d236f62cff8e8b20355ef7069e3d ] Smatch reports: net/netfilter/nf_tables_api.c:2167 nf_tables_expr_destroy() error: dereferencing freed memory 'expr->ops' net/netfilter/nf_tables_api.c 2162 static void nf_tables_expr_destroy(const struct nft_ctx *ctx, 2163 struct nft_expr *expr) 2164 { 2165 if (expr->ops->destroy) 2166 expr->ops->destroy(ctx, expr); ^^^^ --> 2167 module_put(expr->ops->type->owner); ^^^^^^^^^ 2168 } Smatch says there are three functions which free expr->ops. Fixes: b8e204006340 ("netfilter: nft_compat: use .release_ops and remove list of extension") Reported-by: Dan Carpenter Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 43154d5c868cc9b2bdae8a55cdebcff3da0d34e3 Author: Pablo Neira Ayuso Date: Wed Feb 13 13:18:36 2019 +0100 netfilter: nft_compat: use .release_ops and remove list of extension [ Upstream commit b8e204006340b7aaf32bd2b9806c692f6e0cb38a ] Add .release_ops, that is called in case of error at a later stage in the expression initialization path, ie. .select_ops() has been already set up operations and that needs to be undone. This allows us to unwind .select_ops from the error path, ie. release the dynamic operations for this extension. Moreover, allocate one single operation instead of recycling them, this comes at the cost of consuming a bit more memory per rule, but it simplifies the infrastructure. Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 93f7f61ed079fe074fa5b9aadffa1aa048da527f Author: Masahiro Yamada Date: Thu Feb 14 12:05:14 2019 +0900 kbuild: pkg: use -f $(srctree)/Makefile to recurse to top Makefile [ Upstream commit 175209cce23d6b0669ed5366add2517e26cd75cd ] '$(MAKE) KBUILD_SRC=' changes the working directory back and forth between objtree and srctree. It is better to recurse to the top-level Makefile directly. Signed-off-by: Masahiro Yamada Signed-off-by: Sasha Levin commit 118003351916d63cca179fd5aa41aae4d6e816d3 Author: Yan Zhao Date: Wed Mar 27 00:55:45 2019 -0400 drm/i915/gvt: do not let pin count of shadow mm go negative [ Upstream commit 663a50ceac75c2208d2ad95365bc8382fd42f44d ] shadow mm's pin count got increased in workload preparation phase, which is after workload scanning. it will get decreased in complete_current_workload() anyway after workload completion. Sometimes, if a workload meets a scanning error, its shadow mm pin count will not get increased but will get decreased in the end. This patch lets shadow mm's pin count not go below 0. Fixes: 2707e4446688 ("drm/i915/gvt: vGPU graphics memory virtualization") Cc: zhenyuw@linux.intel.com Cc: stable@vger.kernel.org #4.14+ Signed-off-by: Yan Zhao Signed-off-by: Zhenyu Wang Signed-off-by: Sasha Levin