commit 0e3d5747a3ef9986d7617cec396850bf9d039309 Author: Willy Tarreau Date: Tue Jun 27 11:49:32 2017 +0200 Linux 3.10.107 commit 0fcba8f954460c56217897bdffbf62060815329a Author: Helge Deller Date: Mon Jun 19 17:34:05 2017 +0200 Allow stack to grow up to address space limit commit bd726c90b6b8ce87602208701b208a208e6d5600 upstream. Fix expand_upwards() on architectures with an upward-growing stack (parisc, metag and partly IA-64) to allow the stack to reliably grow exactly up to the address space limit given by TASK_SIZE. Signed-off-by: Helge Deller Acked-by: Hugh Dickins Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit 28ebf89579a055259280252f68f6c26d986e3ce3 Author: Hugh Dickins Date: Tue Jun 20 02:10:44 2017 -0700 mm: fix new crash in unmapped_area_topdown() commit f4cb767d76cf7ee72f97dd76f6cfa6c76a5edc89 upstream. Trinity gets kernel BUG at mm/mmap.c:1963! in about 3 minutes of mmap testing. That's the VM_BUG_ON(gap_end < gap_start) at the end of unmapped_area_topdown(). Linus points out how MAP_FIXED (which does not have to respect our stack guard gap intentions) could result in gap_end below gap_start there. Fix that, and the similar case in its alternative, unmapped_area(). Cc: stable@vger.kernel.org Fixes: 1be7107fbe18 ("mm: larger stack guard gap, between vmas") Reported-by: Dave Jones Debugged-by: Linus Torvalds Signed-off-by: Hugh Dickins Acked-by: Michal Hocko Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit 1ad9a25dd06fda9bdc27875e1cedb8277accb212 Author: Hugh Dickins Date: Mon Jun 19 04:03:24 2017 -0700 mm: larger stack guard gap, between vmas commit 1be7107fbe18eed3e319a6c3e83c78254b693acb upstream. Stack guard page is a useful feature to reduce a risk of stack smashing into a different mapping. We have been using a single page gap which is sufficient to prevent having stack adjacent to a different mapping. But this seems to be insufficient in the light of the stack usage in userspace. E.g. glibc uses as large as 64kB alloca() in many commonly used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX] which is 256kB or stack strings with MAX_ARG_STRLEN. This will become especially dangerous for suid binaries and the default no limit for the stack size limit because those applications can be tricked to consume a large portion of the stack and a single glibc call could jump over the guard page. These attacks are not theoretical, unfortunatelly. Make those attacks less probable by increasing the stack guard gap to 1MB (on systems with 4k pages; but make it depend on the page size because systems with larger base pages might cap stack allocations in the PAGE_SIZE units) which should cover larger alloca() and VLA stack allocations. It is obviously not a full fix because the problem is somehow inherent, but it should reduce attack space a lot. One could argue that the gap size should be configurable from userspace, but that can be done later when somebody finds that the new 1MB is wrong for some special case applications. For now, add a kernel command line option (stack_guard_gap) to specify the stack gap size (in page units). Implementation wise, first delete all the old code for stack guard page: because although we could get away with accounting one extra page in a stack vma, accounting a larger gap can break userspace - case in point, a program run with "ulimit -S -v 20000" failed when the 1MB gap was counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK and strict non-overcommit mode. Instead of keeping gap inside the stack vma, maintain the stack guard gap as a gap between vmas: using vm_start_gap() in place of vm_start (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few places which need to respect the gap - mainly arch_get_unmapped_area(), and and the vma tree's subtree_gap support for that. Original-patch-by: Oleg Nesterov Original-patch-by: Michal Hocko Signed-off-by: Hugh Dickins [wt: backport to 4.11: adjust context] [wt: backport to 4.9: adjust context ; kernel doc was not in admin-guide] [wt: backport to 4.4: adjust context ; drop ppc hugetlb_radix changes] [wt: backport to 3.18: adjust context ; no FOLL_POPULATE ; s390 uses generic arch_get_unmapped_area()] [wt: backport to 3.16: adjust context] [wt: backport to 3.10: adjust context ; code logic in PARISC's arch_get_unmapped_area() wasn't found ; code inserted into expand_upwards() and expand_downwards() runs under anon_vma lock; changes for gup.c:faultin_page go to memory.c:__get_user_pages(); included Hugh Dickins' fixes] Signed-off-by: Willy Tarreau commit a5eec86eff34e9a1e503612ebd64ef9bc44acc48 Author: Hector Marco-Gisbert Date: Thu Mar 10 20:51:00 2016 +0100 x86/mm/32: Enable full randomization on i386 and X86_32 commit 8b8addf891de8a00e4d39fc32f93f7c5eb8feceb upstream. Currently on i386 and on X86_64 when emulating X86_32 in legacy mode, only the stack and the executable are randomized but not other mmapped files (libraries, vDSO, etc.). This patch enables randomization for the libraries, vDSO and mmap requests on i386 and in X86_32 in legacy mode. By default on i386 there are 8 bits for the randomization of the libraries, vDSO and mmaps which only uses 1MB of VA. This patch preserves the original randomness, using 1MB of VA out of 3GB or 4GB. We think that 1MB out of 3GB is not a big cost for having the ASLR. The first obvious security benefit is that all objects are randomized (not only the stack and the executable) in legacy mode which highly increases the ASLR effectiveness, otherwise the attackers may use these non-randomized areas. But also sensitive setuid/setgid applications are more secure because currently, attackers can disable the randomization of these applications by setting the ulimit stack to "unlimited". This is a very old and widely known trick to disable the ASLR in i386 which has been allowed for too long. Another trick used to disable the ASLR was to set the ADDR_NO_RANDOMIZE personality flag, but fortunately this doesn't work on setuid/setgid applications because there is security checks which clear Security-relevant flags. This patch always randomizes the mmap_legacy_base address, removing the possibility to disable the ASLR by setting the stack to "unlimited". Signed-off-by: Hector Marco-Gisbert Acked-by: Ismael Ripoll Ripoll Acked-by: Kees Cook Acked-by: Arjan van de Ven Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: akpm@linux-foundation.org Cc: kees Cook Link: http://lkml.kernel.org/r/1457639460-5242-1-git-send-email-hecmargi@upv.es Signed-off-by: Ingo Molnar Signed-off-by: Ben Hutchings Signed-off-by: Willy Tarreau commit 2d49a81a8a89b2a398e358fad5bb819f9d1f2528 Author: Kees Cook Date: Tue Apr 14 15:47:45 2015 -0700 x86: standardize mmap_rnd() usage commit 82168140bc4cec7ec9bad39705518541149ff8b7 upstream. In preparation for splitting out ET_DYN ASLR, this refactors the use of mmap_rnd() to be used similarly to arm, and extracts the checking of PF_RANDOMIZE. Signed-off-by: Kees Cook Reviewed-by: Ingo Molnar Cc: Oleg Nesterov Cc: Andy Lutomirski Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings Signed-off-by: Willy Tarreau commit 36805c5166899a5a013821d3e15b84409ae1ceb6 Author: Jamie Bainbridge Date: Wed Apr 26 10:43:27 2017 +1000 ipv6: check raw payload size correctly in ioctl commit 105f5528b9bbaa08b526d3405a5bcd2ff0c953c8 upstream. In situations where an skb is paged, the transport header pointer and tail pointer can be the same because the skb contents are in frags. This results in ioctl(SIOCINQ/FIONREAD) incorrectly returning a length of 0 when the length to receive is actually greater than zero. skb->len is already correctly set in ip6_input_finish() with pskb_pull(), so use skb->len as it always returns the correct result for both linear and paged data. Signed-off-by: Jamie Bainbridge Signed-off-by: Willy Tarreau commit d46354fc25ca29a05ed43194b56d2b1f6816f934 Author: Sergey Senozhatsky Date: Sat Feb 18 03:42:54 2017 -0800 printk: use rcuidle console tracepoint commit fc98c3c8c9dcafd67adcce69e6ce3191d5306c9c upstream. Use rcuidle console tracepoint because, apparently, it may be issued from an idle CPU: hw-breakpoint: Failed to enable monitor mode on CPU 0. hw-breakpoint: CPU 0 failed to disable vector catch =============================== [ ERR: suspicious RCU usage. ] 4.10.0-rc8-next-20170215+ #119 Not tainted ------------------------------- ./include/trace/events/printk.h:32 suspicious rcu_dereference_check() usage! other info that might help us debug this: RCU used illegally from idle CPU! rcu_scheduler_active = 2, debug_locks = 0 RCU used illegally from extended quiescent state! 2 locks held by swapper/0/0: #0: (cpu_pm_notifier_lock){......}, at: [] cpu_pm_exit+0x10/0x54 #1: (console_lock){+.+.+.}, at: [] vprintk_emit+0x264/0x474 stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-rc8-next-20170215+ #119 Hardware name: Generic OMAP4 (Flattened Device Tree) console_unlock vprintk_emit vprintk_default printk reset_ctrl_regs dbg_cpu_pm_notify notifier_call_chain cpu_pm_exit omap_enter_idle_coupled cpuidle_enter_state cpuidle_enter_state_coupled do_idle cpu_startup_entry start_kernel This RCU warning, however, is suppressed by lockdep_off() in printk(). lockdep_off() increments the ->lockdep_recursion counter and thus disables RCU_LOCKDEP_WARN() and debug_lockdep_rcu_enabled(), which want lockdep to be enabled "current->lockdep_recursion == 0". Link: http://lkml.kernel.org/r/20170217015932.11898-1-sergey.senozhatsky@gmail.com Signed-off-by: Sergey Senozhatsky Reported-by: Tony Lindgren Tested-by: Tony Lindgren Acked-by: Paul E. McKenney Acked-by: Steven Rostedt (VMware) Cc: Petr Mladek Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Tony Lindgren Cc: Russell King Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds [wt: changes are in kernel/printk.c in 3.10] Signed-off-by: Willy Tarreau commit 302c74b132987d69752aedeb33b232076d2006df Author: Willem de Bruijn Date: Fri Feb 3 18:20:48 2017 -0500 tun: read vnet_hdr_sz once commit e1edab87faf6ca30cd137e0795bc73aa9a9a22ec upstream. When IFF_VNET_HDR is enabled, a virtio_net header must precede data. Data length is verified to be greater than or equal to expected header length tun->vnet_hdr_sz before copying. Read this value once and cache locally, as it can be updated between the test and use (TOCTOU). [js] we have TUN_VNET_HDR in 3.12 Signed-off-by: Willem de Bruijn Reported-by: Dmitry Vyukov CC: Eric Dumazet Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Jiri Slaby [wt: s/READ_ONCE/ACCESS_ONCE] Signed-off-by: Willy Tarreau commit 58e4633a0841c48ce32f14cd797ec5482ecfa83b Author: Jim Mattson Date: Mon Dec 12 11:01:37 2016 -0800 kvm: nVMX: Allow L1 to intercept software exceptions (#BP and #OF) commit ef85b67385436ddc1998f45f1d6a210f935b3388 upstream. When L2 exits to L0 due to "exception or NMI", software exceptions (#BP and #OF) for which L1 has requested an intercept should be handled by L1 rather than L0. Previously, only hardware exceptions were forwarded to L1. Signed-off-by: Jim Mattson Signed-off-by: Paolo Bonzini Signed-off-by: Willy Tarreau commit b3d2d8af5b4791f60a1ced720a251c7f89badff7 Author: Josh Poimboeuf Date: Thu Apr 13 17:53:55 2017 -0500 ftrace/x86: Fix triple fault with graph tracing and suspend-to-ram commit 34a477e5297cbaa6ecc6e17c042a866e1cbe80d6 upstream. On x86-32, with CONFIG_FIRMWARE and multiple CPUs, if you enable function graph tracing and then suspend to RAM, it will triple fault and reboot when it resumes. The first fault happens when booting a secondary CPU: startup_32_smp() load_ucode_ap() prepare_ftrace_return() ftrace_graph_is_dead() (accesses 'kill_ftrace_graph') The early head_32.S code calls into load_ucode_ap(), which has an an ftrace hook, so it calls prepare_ftrace_return(), which calls ftrace_graph_is_dead(), which tries to access the global 'kill_ftrace_graph' variable with a virtual address, causing a fault because the CPU is still in real mode. The fix is to add a check in prepare_ftrace_return() to make sure it's running in protected mode before continuing. The check makes sure the stack pointer is a virtual kernel address. It's a bit of a hack, but it's not very intrusive and it works well enough. For reference, here are a few other (more difficult) ways this could have potentially been fixed: - Move startup_32_smp()'s call to load_ucode_ap() down to *after* paging is enabled. (No idea what that would break.) - Track down load_ucode_ap()'s entire callee tree and mark all the functions 'notrace'. (Probably not realistic.) - Pause graph tracing in ftrace_suspend_notifier_call() or bringup_cpu() or __cpu_up(), and ensure that the pause facility can be queried from real mode. Reported-by: Paul Menzel Signed-off-by: Josh Poimboeuf Tested-by: Paul Menzel Reviewed-by: Steven Rostedt (VMware) Cc: "Rafael J . Wysocki" Cc: linux-acpi@vger.kernel.org Cc: Borislav Petkov Cc: Len Brown Link: http://lkml.kernel.org/r/5c1272269a580660703ed2eccf44308e790c7a98.1492123841.git.jpoimboe@redhat.com Signed-off-by: Thomas Gleixner Signed-off-by: Willy Tarreau commit 2103905323896cd854cd881676d1fe5f0bf48124 Author: J. Bruce Fields Date: Fri Apr 21 16:10:18 2017 -0400 nfsd: check for oversized NFSv2/v3 arguments commit e6838a29ecb484c97e4efef9429643b9851fba6e upstream. A client can append random data to the end of an NFSv2 or NFSv3 RPC call without our complaining; we'll just stop parsing at the end of the expected data and ignore the rest. Encoded arguments and replies are stored together in an array of pages, and if a call is too large it could leave inadequate space for the reply. This is normally OK because NFS RPC's typically have either short arguments and long replies (like READ) or long arguments and short replies (like WRITE). But a client that sends an incorrectly long reply can violate those assumptions. This was observed to cause crashes. Also, several operations increment rq_next_page in the decode routine before checking the argument size, which can leave rq_next_page pointing well past the end of the page array, causing trouble later in svc_free_pages. So, following a suggestion from Neil Brown, add a central check to enforce our expectation that no NFSv2/v3 call has both a large call and a large reply. As followup we may also want to rewrite the encoding routines to check more carefully that they aren't running off the end of the page array. We may also consider rejecting calls that have any extra garbage appended. That would be safer, and within our rights by spec, but given the age of our server and the NFS protocol, and the fact that we've never enforced this before, we may need to balance that against the possibility of breaking some oddball client. Reported-by: Tuomas Haanpää Reported-by: Ari Kauppi Reviewed-by: NeilBrown Signed-off-by: J. Bruce Fields Signed-off-by: Willy Tarreau commit 37a23f22c72f6c90fd438db1f2abca3b342850a2 Author: Al Viro Date: Fri Apr 14 17:22:18 2017 -0400 p9_client_readdir() fix commit 71d6ad08379304128e4bdfaf0b4185d54375423e upstream. Don't assume that server is sane and won't return more data than asked for. Signed-off-by: Al Viro Signed-off-by: Willy Tarreau commit 1e46601d54d1649e4cea55790f8ad2c64e7f94d5 Author: Stefano Stabellini Date: Fri Apr 15 18:23:00 2016 -0700 xen/x86: don't lose event interrupts commit c06b6d70feb32d28f04ba37aa3df17973fd37b6b upstream. On slow platforms with unreliable TSC, such as QEMU emulated machines, it is possible for the kernel to request the next event in the past. In that case, in the current implementation of xen_vcpuop_clockevent, we simply return -ETIME. To be precise the Xen returns -ETIME and we pass it on. However the result of this is a missed event, which simply causes the kernel to hang. Instead it is better to always ask the hypervisor for a timer event, even if the timeout is in the past. That way there are no lost interrupts and the kernel survives. To do that, remove the VCPU_SSHOTTMR_future flag. Signed-off-by: Stefano Stabellini Acked-by: Juergen Gross Cc: Julia Lawall Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 403642a01b6dceaff7054480d6a52f3b5e07cb8e Author: santosh.shilimkar@oracle.com Date: Thu Apr 14 10:43:27 2016 -0700 RDS: Fix the atomicity for congestion map update commit e47db94e10447fc467777a40302f2b393e9af2fa upstream. Two different threads with different rds sockets may be in rds_recv_rcvbuf_delta() via receive path. If their ports both map to the same word in the congestion map, then using non-atomic ops to update it could cause the map to be incorrect. Lets use atomics to avoid such an issue. Full credit to Wengang for finding the issue, analysing it and also pointing out to offending code with spin lock based fix. Reviewed-by: Leon Romanovsky Signed-off-by: Wengang Wang Signed-off-by: Santosh Shilimkar Signed-off-by: David S. Miller Cc: Julia Lawall Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 32b0616ee1b1d3d14c27475257f77f7d719d0f1a Author: Corey Minyard Date: Mon Apr 11 09:10:19 2016 -0500 MIPS: Fix crash registers on non-crashing CPUs commit c80e1b62ffca52e2d1d865ee58bc79c4c0c55005 upstream. As part of handling a crash on an SMP system, an IPI is send to all other CPUs to save their current registers and stop. It was using task_pt_regs(current) to get the registers, but that will only be accurate if the CPU was interrupted running in userland. Instead allow the architecture to pass in the registers (all pass NULL now, but allow for the future) and then use get_irq_regs() which should be accurate as we are in an interrupt. Fall back to task_pt_regs(current) if nothing else is available. Signed-off-by: Corey Minyard Cc: David Daney Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13050/ Signed-off-by: Ralf Baechle Cc: Julia Lawall Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit ab93db27afea344e37a1a28f3e067fd4dd5c996e Author: Nikolay Aleksandrov Date: Fri Apr 21 20:42:16 2017 +0300 ip6mr: fix notification device destruction commit 723b929ca0f79c0796f160c2eeda4597ee98d2b8 upstream. Andrey Konovalov reported a BUG caused by the ip6mr code which is caused because we call unregister_netdevice_many for a device that is already being destroyed. In IPv4's ipmr that has been resolved by two commits long time ago by introducing the "notify" parameter to the delete function and avoiding the unregister when called from a notifier, so let's do the same for ip6mr. The trace from Andrey: ------------[ cut here ]------------ kernel BUG at net/core/dev.c:6813! invalid opcode: 0000 [#1] SMP KASAN Modules linked in: CPU: 1 PID: 1165 Comm: kworker/u4:3 Not tainted 4.11.0-rc7+ #251 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Workqueue: netns cleanup_net task: ffff880069208000 task.stack: ffff8800692d8000 RIP: 0010:rollback_registered_many+0x348/0xeb0 net/core/dev.c:6813 RSP: 0018:ffff8800692de7f0 EFLAGS: 00010297 RAX: ffff880069208000 RBX: 0000000000000002 RCX: 0000000000000001 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88006af90569 RBP: ffff8800692de9f0 R08: ffff8800692dec60 R09: 0000000000000000 R10: 0000000000000006 R11: 0000000000000000 R12: ffff88006af90070 R13: ffff8800692debf0 R14: dffffc0000000000 R15: ffff88006af90000 FS: 0000000000000000(0000) GS:ffff88006cb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fe7e897d870 CR3: 00000000657e7000 CR4: 00000000000006e0 Call Trace: unregister_netdevice_many.part.105+0x87/0x440 net/core/dev.c:7881 unregister_netdevice_many+0xc8/0x120 net/core/dev.c:7880 ip6mr_device_event+0x362/0x3f0 net/ipv6/ip6mr.c:1346 notifier_call_chain+0x145/0x2f0 kernel/notifier.c:93 __raw_notifier_call_chain kernel/notifier.c:394 raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401 call_netdevice_notifiers_info+0x51/0x90 net/core/dev.c:1647 call_netdevice_notifiers net/core/dev.c:1663 rollback_registered_many+0x919/0xeb0 net/core/dev.c:6841 unregister_netdevice_many.part.105+0x87/0x440 net/core/dev.c:7881 unregister_netdevice_many net/core/dev.c:7880 default_device_exit_batch+0x4fa/0x640 net/core/dev.c:8333 ops_exit_list.isra.4+0x100/0x150 net/core/net_namespace.c:144 cleanup_net+0x5a8/0xb40 net/core/net_namespace.c:463 process_one_work+0xc04/0x1c10 kernel/workqueue.c:2097 worker_thread+0x223/0x19c0 kernel/workqueue.c:2231 kthread+0x35e/0x430 kernel/kthread.c:231 ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430 Code: 3c 32 00 0f 85 70 0b 00 00 48 b8 00 02 00 00 00 00 ad de 49 89 47 78 e9 93 fe ff ff 49 8d 57 70 49 8d 5f 78 eb 9e e8 88 7a 14 fe <0f> 0b 48 8b 9d 28 fe ff ff e8 7a 7a 14 fe 48 b8 00 00 00 00 00 RIP: rollback_registered_many+0x348/0xeb0 RSP: ffff8800692de7f0 ---[ end trace e0b29c57e9b3292c ]--- Reported-by: Andrey Konovalov Signed-off-by: Nikolay Aleksandrov Tested-by: Andrey Konovalov Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 075de215620bbd06c0aa075c6b983143352530ed Author: Xin Long Date: Thu Apr 6 13:10:52 2017 +0800 sctp: listen on the sock only when it's state is listening or closed commit 34b2789f1d9bf8dcca9b5cb553d076ca2cd898ee upstream. Now sctp doesn't check sock's state before listening on it. It could even cause changing a sock with any state to become a listening sock when doing sctp_listen. This patch is to fix it by checking sock's state in sctp_listen, so that it will listen on the sock with right state. Reported-by: Andrey Konovalov Tested-by: Andrey Konovalov Signed-off-by: Xin Long Acked-by: Marcelo Ricardo Leitner Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 4d1b81c2669bc6ab4ff24d649c876469aba412a9 Author: Eric Dumazet Date: Thu Mar 23 12:39:21 2017 -0700 net: neigh: guard against NULL solicit() method commit 48481c8fa16410ffa45939b13b6c53c2ca609e5f upstream. Dmitry posted a nice reproducer of a bug triggering in neigh_probe() when dereferencing a NULL neigh->ops->solicit method. This can happen for arp_direct_ops/ndisc_direct_ops and similar, which can be used for NUD_NOARP neighbours (created when dev->header_ops is NULL). Admin can then force changing nud_state to some other state that would fire neigh timer. Signed-off-by: Eric Dumazet Reported-by: Dmitry Vyukov Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit c1b42042a542f2493c7a1f0658c8cace120d7cff Author: Arnd Bergmann Date: Tue Jan 26 13:08:10 2016 -0500 gfs2: avoid uninitialized variable warning commit 67893f12e5374bbcaaffbc6e570acbc2714ea884 upstream. We get a bogus warning about a potential uninitialized variable use in gfs2, because the compiler does not figure out that we never use the leaf number if get_leaf_nr() returns an error: fs/gfs2/dir.c: In function 'get_first_leaf': fs/gfs2/dir.c:802:9: warning: 'leaf_no' may be used uninitialized in this function [-Wmaybe-uninitialized] fs/gfs2/dir.c: In function 'dir_split_leaf': fs/gfs2/dir.c:1021:8: warning: 'leaf_no' may be used uninitialized in this function [-Wmaybe-uninitialized] Changing the 'if (!error)' to 'if (!IS_ERR_VALUE(error))' is sufficient to let gcc understand that this is exactly the same condition as in IS_ERR() so it can optimize the code path enough to understand it. Signed-off-by: Arnd Bergmann Signed-off-by: Bob Peterson Signed-off-by: Willy Tarreau commit f6d81f27fcdcbe570dda26bff0ba8ae582f6f3a4 Author: Arnd Bergmann Date: Thu Jan 28 22:58:28 2016 +0100 hostap: avoid uninitialized variable use in hfa384x_get_rid commit 48dc5fb3ba53b20418de8514700f63d88c5de3a3 upstream. The driver reads a value from hfa384x_from_bap(), which may fail, and then assigns the value to a local variable. gcc detects that in in the failure case, the 'rlen' variable now contains uninitialized data: In file included from ../drivers/net/wireless/intersil/hostap/hostap_pci.c:220:0: drivers/net/wireless/intersil/hostap/hostap_hw.c: In function 'hfa384x_get_rid': drivers/net/wireless/intersil/hostap/hostap_hw.c:842:5: warning: 'rec' may be used uninitialized in this function [-Wmaybe-uninitialized] if (le16_to_cpu(rec.len) == 0) { This restructures the function as suggested by Russell King, to make it more readable and get more reliable error handling, by handling each failure mode using a goto. Signed-off-by: Arnd Bergmann Signed-off-by: Kalle Valo Signed-off-by: Willy Tarreau commit 899866934d21292b65884fd68c71efcc8c8a1148 Author: Arnd Bergmann Date: Mon Jan 25 22:54:56 2016 +0100 tty: nozomi: avoid a harmless gcc warning commit a4f642a8a3c2838ad09fe8313d45db46600e1478 upstream. The nozomi wireless data driver has its own helper function to transfer data from a FIFO, doing an extra byte swap on big-endian architectures, presumably to bring the data back into byte-serial order after readw() or readl() perform their implicit byteswap. This helper function is used in the receive_data() function to first read the length into a 32-bit variable, which causes a compile-time warning: drivers/tty/nozomi.c: In function 'receive_data': drivers/tty/nozomi.c:857:9: warning: 'size' may be used uninitialized in this function [-Wmaybe-uninitialized] The problem is that gcc is unsure whether the data was actually read or not. We know that it is at this point, so we can replace it with a single readl() to shut up that warning. I am leaving the byteswap in there, to preserve the existing behavior, even though this seems fishy: Reading the length of the data into a cpu-endian variable should normally not use a second byteswap on big-endian systems, unless the hardware is aware of the CPU endianess. There appears to be a lot more confusion about endianess in this driver, so it probably has not worked on big-endian systems in a long time, if ever, and I have no way to test it. It's well possible that this driver has not been used by anyone in a while, the last patch that looks like it was tested on the hardware is from 2008. Signed-off-by: Arnd Bergmann Signed-off-by: Willy Tarreau commit a8069aeebf93c1f3a397364a44cddaa90ff472f3 Author: Andrey Konovalov Date: Wed Mar 29 16:11:22 2017 +0200 net/packet: fix overflow in check for tp_reserve commit bcc5364bdcfe131e6379363f089e7b4108d35b70 upstream. When calculating po->tp_hdrlen + po->tp_reserve the result can overflow. Fix by checking that tp_reserve <= INT_MAX on assign. Signed-off-by: Andrey Konovalov Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 492980ca268c3a213a71f5c0aef116b58c665e16 Author: Andrey Konovalov Date: Wed Mar 29 16:11:21 2017 +0200 net/packet: fix overflow in check for tp_frame_nr commit 8f8d28e4d6d815a391285e121c3a53a0b6cb9e7b upstream. When calculating rb->frames_per_block * req->tp_block_nr the result can overflow. Add a check that tp_block_size * tp_block_nr <= UINT_MAX. Since frames_per_block <= tp_block_size, the expression would never overflow. Signed-off-by: Andrey Konovalov Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 187d3b3274a52c80c38dffbe033cbb20a76c7648 Author: Michael Ellerman Date: Thu Apr 23 17:27:12 2015 +1000 powerpc: Reject binutils 2.24 when building little endian commit 60e065f70bdb0b0e916389024922ad40f3270c96 upstream. There is a bug in binutils 2.24 which causes miscompilation if we're building little endian and using weak symbols (which the kernel does). It is fixed in binutils commit 57fa7b8c7e59 "Correct elf_merge_st_other arguments for weak symbols", which is in binutils 2.25 and has been backported to the binutils 2.24 branch and has been picked up by most distros it seems. However if we're running stock 2.24 (no extra version) then the bug is present, so check for that and bail. Signed-off-by: Michael Ellerman Signed-off-by: Willy Tarreau commit 2c6896abfa6e0396b96e08c6edcd3a29f02f655b Author: Yazen Ghannam Date: Thu Mar 30 13:17:14 2017 +0200 x86/mce/AMD: Give a name to MCA bank 3 when accessed with legacy MSRs commit 29f72ce3e4d18066ec75c79c857bee0618a3504b upstream. MCA bank 3 is reserved on systems pre-Fam17h, so it didn't have a name. However, MCA bank 3 is defined on Fam17h systems and can be accessed using legacy MSRs. Without a name we get a stack trace on Fam17h systems when trying to register sysfs files for bank 3 on kernels that don't recognize Scalable MCA. Call MCA bank 3 "decode_unit" since this is what it represents on Fam17h. This will allow kernels without SMCA support to see this bank on Fam17h+ and prevent the stack trace. This will not affect older systems since this bank is reserved on them, i.e. it'll be ignored. Tested on AMD Fam15h and Fam17h systems. WARNING: CPU: 26 PID: 1 at lib/kobject.c:210 kobject_add_internal kobject: (ffff88085bb256c0): attempted to be registered with empty name! ... Call Trace: kobject_add_internal kobject_add kobject_create_and_add threshold_create_device threshold_init_device Signed-off-by: Yazen Ghannam Signed-off-by: Borislav Petkov Link: http://lkml.kernel.org/r/1490102285-3659-1-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Thomas Gleixner Signed-off-by: Willy Tarreau commit e898332100f2b505a59229de796b99e67c979453 Author: Sebastian Siewior Date: Wed Feb 22 17:15:21 2017 +0100 ubi/upd: Always flush after prepared for an update commit 9cd9a21ce070be8a918ffd3381468315a7a76ba6 upstream. In commit 6afaf8a484cb ("UBI: flush wl before clearing update marker") I managed to trigger and fix a similar bug. Now here is another version of which I assumed it wouldn't matter back then but it turns out UBI has a check for it and will error out like this: |ubi0 warning: validate_vid_hdr: inconsistent used_ebs |ubi0 error: validate_vid_hdr: inconsistent VID header at PEB 592 All you need to trigger this is? "ubiupdatevol /dev/ubi0_0 file" + a powercut in the middle of the operation. ubi_start_update() sets the update-marker and puts all EBs on the erase list. After that userland can proceed to write new data while the old EB aren't erased completely. A powercut at this point is usually not that much of a tragedy. UBI won't give read access to the static volume because it has the update marker. It will most likely set the corrupted flag because it misses some EBs. So we are all good. Unless the size of the image that has been written differs from the old image in the magnitude of at least one EB. In that case UBI will find two different values for `used_ebs' and refuse to attach the image with the error message mentioned above. So in order not to get in the situation, the patch will ensure that we wait until everything is removed before it tries to write any data. The alternative would be to detect such a case and remove all EBs at the attached time after we processed the volume-table and see the update-marker set. The patch looks bigger and I doubt it is worth it since usually the write() will wait from time to time for a new EB since usually there not that many spare EB that can be used. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Richard Weinberger Signed-off-by: Willy Tarreau commit 0854b58f1f6c8d2540bffcbc1ed5b94f47b96a72 Author: Vitaly Kuznetsov Date: Thu Jun 9 17:08:56 2016 -0700 Drivers: hv: get rid of timeout in vmbus_open() commit 396e287fa2ff46e83ae016cdcb300c3faa3b02f6 upstream. vmbus_teardown_gpadl() can result in infinite wait when it is called on 5 second timeout in vmbus_open(). The issue is caused by the fact that gpadl teardown operation won't ever succeed for an opened channel and the timeout isn't always enough. As a guest, we can always trust the host to respond to our request (and there is nothing we can do if it doesn't). Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan Signed-off-by: Sumit Semwal Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 0c792ee191ef1297a37d9aadd3a4ffc6696d1429 Author: Vitaly Kuznetsov Date: Fri Jun 3 17:09:24 2016 -0700 Drivers: hv: don't leak memory in vmbus_establish_gpadl() commit 7cc80c98070ccc7940fc28811c92cca0a681015d upstream. In some cases create_gpadl_header() allocates submessages but we never free them. [sumits] Note for stable: Upstream commit 4d63763296ab7865a98bc29cc7d77145815ef89f: (Drivers: hv: get rid of redundant messagecount in create_gpadl_header()) changes the list usage to initialize list header in all cases; that patch isn't added to stable, so the current patch is modified a little bit from the upstream commit to check if the list is valid or not. Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan Signed-off-by: Sumit Semwal Signed-off-by: Willy Tarreau commit 3fa16561f160d629f79321616a3e0dab5434e8d3 Author: Mantas M Date: Fri Dec 16 10:30:59 2016 +0200 net: ipv6: check route protocol when deleting routes commit c2ed1880fd61a998e3ce40254a99a2ad000f1a7d upstream. The protocol field is checked when deleting IPv4 routes, but ignored for IPv6, which causes problems with routing daemons accidentally deleting externally set routes (observed by multiple bird6 users). This can be verified using `ip -6 route del proto something`. Signed-off-by: Mantas Mikulėnas Signed-off-by: David S. Miller Cc: Ben Hutchings Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 6b7c152578ab00cdb7da05481dafb4a99b1a316d Author: Ben Hutchings Date: Sat Feb 4 16:57:04 2017 +0000 catc: Use heap buffer for memory size test commit 2d6a0e9de03ee658a9adc3bfb2f0ca55dff1e478 upstream. Allocating USB buffers on the stack is not portable, and no longer works on x86_64 (with VMAP_STACK enabled as per default). Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Ben Hutchings Signed-off-by: David S. Miller Cc: Brad Spengler Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit f3487f846cf73c13adafe99ee3d078a18d189dae Author: Ben Hutchings Date: Sat Feb 4 16:56:56 2017 +0000 catc: Combine failure cleanup code in catc_probe() commit d41149145f98fe26dcd0bfd1d6cc095e6e041418 upstream. Signed-off-by: Ben Hutchings Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit fad5324503f33bf6547c7227de9de936e11f24a2 Author: Omar Sandoval Date: Wed Feb 1 00:02:27 2017 -0800 virtio-console: avoid DMA from stack commit c4baad50297d84bde1a7ad45e50c73adae4a2192 upstream. put_chars() stuffs the buffer it gets into an sg, but that buffer may be on the stack. This breaks with CONFIG_VMAP_STACK=y (for me, it manifested as printks getting turned into NUL bytes). Signed-off-by: Omar Sandoval Signed-off-by: Michael S. Tsirkin Reviewed-by: Amit Shah Cc: Ben Hutchings Cc: Brad Spengler Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 081fb3dc1515e39f57953e7f51aebe8c9de32fb5 Author: Kees Cook Date: Wed Apr 5 09:39:08 2017 -0700 mm: Tighten x86 /dev/mem with zeroing reads commit a4866aa812518ed1a37d8ea0c881dc946409de94 upstream. Under CONFIG_STRICT_DEVMEM, reading System RAM through /dev/mem is disallowed. However, on x86, the first 1MB was always allowed for BIOS and similar things, regardless of it actually being System RAM. It was possible for heap to end up getting allocated in low 1MB RAM, and then read by things like x86info or dd, which would trip hardened usercopy: usercopy: kernel memory exposure attempt detected from ffff880000090000 (dma-kmalloc-256) (4096 bytes) This changes the x86 exception for the low 1MB by reading back zeros for System RAM areas instead of blindly allowing them. More work is needed to extend this to mmap, but currently mmap doesn't go through usercopy, so hardened usercopy won't Oops the kernel. Reported-by: Tommi Rantala Tested-by: Tommi Rantala Signed-off-by: Kees Cook Cc: Brad Spengler Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit c9b40c2b9b2320c6d8f2fb5095b02ec50139f065 Author: Lee, Chun-Yi Date: Fri Apr 28 16:23:59 2017 +0800 platform/x86: acer-wmi: setup accelerometer when ACPI device was found commit f9ac89f5ad613b462339e845aeb8494646fd9be2 upstream. The 98d610c3739a patch was introduced since v4.11-rc1 that it causes that the accelerometer input device will not be created on workable machines because the HID string comparing logic is wrong. And, the patch doesn't prevent that the accelerometer input device be created on the machines that have no BST0001. That's because the acpi_get_devices() returns success even it didn't find any match device. This patch fixed the HID string comparing logic of BST0001 device. And, it also makes sure that the acpi_get_devices() returns acpi_handle for BST0001. Fixes: 98d610c3739a ("acer-wmi: setup accelerometer when machine has appropriate notify event") Reference: https://bugzilla.kernel.org/show_bug.cgi?id=193761 Reported-by: Samuel Sieb Signed-off-by: "Lee, Chun-Yi" Signed-off-by: Andy Shevchenko Signed-off-by: Willy Tarreau commit 9b2b8b099bbd744f19b031eb9f4e81e49a93ed04 Author: Chun-Yi Lee Date: Thu Nov 3 08:18:52 2016 +0800 platform/x86: acer-wmi: setup accelerometer when machine has appropriate notify event commit 98d610c3739ac354319a6590b915f4624d9151e6 upstream. The accelerometer event relies on the ACERWMID_EVENT_GUID notify. So, this patch changes the codes to setup accelerometer input device when detected ACERWMID_EVENT_GUID. It avoids that the accel input device created on every Acer machines. In addition, patch adds a clearly parsing logic of accelerometer hid to acer_wmi_get_handle_cb callback function. It is positive matching the "SENR" name with "BST0001" device to avoid non-supported hardware. Reported-by: Bjørn Mork Cc: Darren Hart Signed-off-by: Chun-Yi Lee [andy: slightly massage commit message] Signed-off-by: Andy Shevchenko Cc: Ben Hutchings Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit c0dd8fff25130e93d2a7c0bfe2bca2d4db60fe89 Author: Max Bires Date: Tue Jan 3 08:18:07 2017 -0800 char: lack of bool string made CONFIG_DEVPORT always on commit f2cfa58b136e4b06a9b9db7af5ef62fbb5992f62 upstream. Without a bool string present, using "# CONFIG_DEVPORT is not set" in defconfig files would not actually unset devport. This esnured that /dev/port was always on, but there are reasons a user may wish to disable it (smaller kernel, attack surface reduction) if it's not being used. Adding a message here in order to make this user visible. Signed-off-by: Max Bires Acked-by: Arnd Bergmann Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 3fa1aeb6d3bb3776ab57ae798be5c289b790b955 Author: Juergen Gross Date: Fri Apr 7 17:28:23 2017 +0200 xen, fbfront: fix connecting to backend commit 9121b15b5628b38b4695282dc18c553440e0f79b upstream. Connecting to the backend isn't working reliably in xen-fbfront: in case XenbusStateInitWait of the backend has been missed the backend transition to XenbusStateConnected will trigger the connected state only without doing the actions required when the backend has connected. Signed-off-by: Juergen Gross Reviewed-by: Boris Ostrovsky Signed-off-by: Bartlomiej Zolnierkiewicz Signed-off-by: Willy Tarreau commit 66940417cbafc16e90ab4a44615c9ea34abe2f38 Author: Nicholas Bellinger Date: Sun Apr 2 13:36:44 2017 -0700 iscsi-target: Drop work-around for legacy GlobalSAN initiator commit 1c99de981f30b3e7868b8d20ce5479fa1c0fea46 upstream. Once upon a time back in 2009, a work-around was added to support the GlobalSAN iSCSI initiator v3.3 for MacOSX, which during login did not propose nor respond to MaxBurstLength, FirstBurstLength, DefaultTime2Wait and DefaultTime2Retain keys. The work-around in iscsi_check_proposer_for_optional_reply() allowed the missing keys to be proposed, but did not require waiting for a response before moving to full feature phase operation. This allowed GlobalSAN v3.3 to work out-of-the box, and for many years we didn't run into login interopt issues with any other initiators.. Until recently, when Martin tried a QLogic 57840S iSCSI Offload HBA on Windows 2016 which completed login, but subsequently failed with: Got unknown iSCSI OpCode: 0x43 The issue was QLogic MSFT side did not propose DefaultTime2Wait + DefaultTime2Retain, so LIO proposes them itself, and immediately transitions to full feature phase because of the GlobalSAN hack. However, the QLogic MSFT side still attempts to respond to DefaultTime2Retain + DefaultTime2Wait, even though LIO has set ISCSI_FLAG_LOGIN_NEXT_STAGE3 + ISCSI_FLAG_LOGIN_TRANSIT in last login response. So while the QLogic MSFT side should have been proposing these two keys to start, it was doing the correct thing per RFC-3720 attempting to respond to proposed keys before transitioning to full feature phase. All that said, recent versions of GlobalSAN iSCSI (v5.3.0.541) does correctly propose the four keys during login, making the original work-around moot. So in order to allow QLogic MSFT to run unmodified as-is, go ahead and drop this long standing work-around. Reported-by: Martin Svec Cc: Martin Svec Cc: Himanshu Madhani Cc: Arun Easi Signed-off-by: Nicholas Bellinger Signed-off-by: Willy Tarreau commit e71bfbe70c000fe63e66e03c6f55cc608df89a09 Author: Nicholas Bellinger Date: Thu Mar 23 17:19:24 2017 -0700 iscsi-target: Fix TMR reference leak during session shutdown commit efb2ea770bb3b0f40007530bc8b0c22f36e1c5eb upstream. This patch fixes a iscsi-target specific TMR reference leak during session shutdown, that could occur when a TMR was quiesced before the hand-off back to iscsi-target code via transport_cmd_check_stop_to_fabric(). The reference leak happens because iscsit_free_cmd() was incorrectly skipping the final target_put_sess_cmd() for TMRs when transport_generic_free_cmd() returned zero because the se_cmd->cmd_kref did not reach zero, due to the missing se_cmd assignment in original code. The result was iscsi_cmd and it's associated se_cmd memory would be freed once se_sess->sess_cmd_map where released, but the associated se_tmr_req was leaked and remained part of se_device->dev_tmr_list. This bug would manfiest itself as kernel paging request OOPsen in core_tmr_lun_reset(), when a left-over se_tmr_req attempted to dereference it's se_cmd pointer that had already been released during normal session shutdown. To address this bug, go ahead and treat ISCSI_OP_SCSI_CMD and ISCSI_OP_SCSI_TMFUNC the same when there is an extra se_cmd->cmd_kref to drop in iscsit_free_cmd(), and use op_scsi to signal __iscsit_free_cmd() when the former needs to clear any further iscsi related I/O state. Reported-by: Rob Millner Cc: Rob Millner Reported-by: Chu Yuan Lin Cc: Chu Yuan Lin Tested-by: Chu Yuan Lin Signed-off-by: Nicholas Bellinger Signed-off-by: Willy Tarreau commit 5e25d2de95789d010837ca2e51b606db4f507fcd Author: Thomas Gleixner Date: Mon Apr 10 17:14:28 2017 +0200 x86/vdso: Plug race between mapping and ELF header setup commit 6fdc6dd90272ce7e75d744f71535cfbd8d77da81 upstream. The vsyscall32 sysctl can racy against a concurrent fork when it switches from disabled to enabled: arch_setup_additional_pages() if (vdso32_enabled) --> No mapping sysctl.vsysscall32() --> vdso32_enabled = true create_elf_tables() ARCH_DLINFO_IA32 if (vdso32_enabled) { --> Add VDSO entry with NULL pointer Make ARCH_DLINFO_IA32 check whether the VDSO mapping has been set up for the newly forked process or not. Signed-off-by: Thomas Gleixner Acked-by: Andy Lutomirski Cc: Peter Zijlstra Cc: Mathias Krause Link: http://lkml.kernel.org/r/20170410151723.602367196@linutronix.de Signed-off-by: Thomas Gleixner Signed-off-by: Willy Tarreau commit 56bc42d0db996003318b06d9c0033506da007c23 Author: Andrey Konovalov Date: Wed Mar 29 16:11:20 2017 +0200 net/packet: fix overflow in check for priv area size commit 2b6867c2ce76c596676bec7d2d525af525fdc6e2 upstream. Subtracting tp_sizeof_priv from tp_block_size and casting to int to check whether one is less then the other doesn't always work (both of them are unsigned ints). Compare them as is instead. Also cast tp_sizeof_priv to u64 before using BLK_PLUS_PRIV, as it can overflow inside BLK_PLUS_PRIV otherwise. Signed-off-by: Andrey Konovalov Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit db930d85d2d3b3ca25b8f36527c194132cd045b6 Author: Rafał Miłecki Date: Sun Nov 20 16:09:30 2016 +0100 mtd: bcm47xxpart: fix parsing first block after aligned TRX commit bd5d21310133921021d78995ad6346f908483124 upstream. After parsing TRX we should skip to the first block placed behind it. Our code was working only with TRX with length not aligned to the blocksize. In other cases (length aligned) it was missing the block places right after TRX. This fixes calculation and simplifies the comment. Signed-off-by: Rafał Miłecki Signed-off-by: Brian Norris Signed-off-by: Amit Pundir Signed-off-by: Willy Tarreau commit ef67ca9962ce88f252dff0fee1f68212a87fa366 Author: Chris Salls Date: Fri Apr 7 23:48:11 2017 -0700 mm/mempolicy.c: fix error handling in set_mempolicy and mbind. commit cf01fb9985e8deb25ccf0ea54d916b8871ae0e62 upstream. In the case that compat_get_bitmap fails we do not want to copy the bitmap to the user as it will contain uninitialized stack data and leak sensitive data. Signed-off-by: Chris Salls Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit 4261679ad36ca1703436d54914ab0505598bf706 Author: Paul Mackerras Date: Tue Apr 4 14:56:05 2017 +1000 powerpc: Don't try to fix up misaligned load-with-reservation instructions commit 48fe9e9488743eec9b7c1addd3c93f12f2123d54 upstream. In the past, there was only one load-with-reservation instruction, lwarx, and if a program attempted a lwarx on a misaligned address, it would take an alignment interrupt and the kernel handler would emulate it as though it was lwzx, which was not really correct, but benign since it is loading the right amount of data, and the lwarx should be paired with a stwcx. to the same address, which would also cause an alignment interrupt which would result in a SIGBUS being delivered to the process. We now have 5 different sizes of load-with-reservation instruction. Of those, lharx and ldarx cause an immediate SIGBUS by luck since their entries in aligninfo[] overlap instructions which were not fixed up, but lqarx overlaps with lhz and will be emulated as such. lbarx can never generate an alignment interrupt since it only operates on 1 byte. To straighten this out and fix the lqarx case, this adds code to detect the l[hwdq]arx instructions and return without fixing them up, resulting in a SIGBUS being delivered to the process. [js] include disassemble.h in 3.12 Signed-off-by: Paul Mackerras Signed-off-by: Michael Ellerman Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit aaf47bf25e589947f1db0758cbda6908ae108f49 Author: James Hogan Date: Fri Mar 31 11:14:02 2017 +0100 metag/usercopy: Zero rest of buffer from copy_from_user commit 563ddc1076109f2b3f88e6d355eab7b6fd4662cb upstream. Currently we try to zero the destination for a failed read from userland in fixup code in the usercopy.c macros. The rest of the destination buffer is then zeroed from __copy_user_zeroing(), which is used for both copy_from_user() and __copy_from_user(). Unfortunately we fail to zero in the fixup code as D1Ar1 is set to 0 before the fixup code entry labels, and __copy_from_user() shouldn't even be zeroing the rest of the buffer. Move the zeroing out into copy_from_user() and rename __copy_user_zeroing() to raw_copy_from_user() since it no longer does any zeroing. This also conveniently matches the name needed for RAW_COPY_USER support in a later patch. Fixes: 373cd784d0fc ("metag: Memory handling") Reported-by: Al Viro Signed-off-by: James Hogan Cc: linux-metag@vger.kernel.org Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 0f84dfc9b7b8114dbbe8e41f9bd1e4c8dbcdfacc Author: James Hogan Date: Fri Mar 31 10:37:44 2017 +0100 metag/usercopy: Drop unused macros commit ef62a2d81f73d9cddef14bc3d9097a57010d551c upstream. Metag's lib/usercopy.c has a bunch of copy_from_user macros for larger copies between 5 and 16 bytes which are completely unused. Before fixing zeroing lets drop these macros so there is less to fix. Signed-off-by: James Hogan Cc: Al Viro Cc: linux-metag@vger.kernel.org Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 903377e0196f22aa783a71d763c02ae164b29c60 Author: Jan-Marek Glogowski Date: Mon Feb 20 12:25:58 2017 +0100 Reset TreeId to zero on SMB2 TREE_CONNECT commit 806a28efe9b78ffae5e2757e1ee924b8e50c08ab upstream. Currently the cifs module breaks the CIFS specs on reconnect as described in http://msdn.microsoft.com/en-us/library/cc246529.aspx: "TreeId (4 bytes): Uniquely identifies the tree connect for the command. This MUST be 0 for the SMB2 TREE_CONNECT Request." Signed-off-by: Jan-Marek Glogowski Reviewed-by: Aurelien Aptel Tested-by: Aurelien Aptel Signed-off-by: Steve French Signed-off-by: Willy Tarreau commit 404f763e1b9f6de4db0edff73c18ef4683f67cd3 Author: Li Qiang Date: Mon Mar 27 20:10:53 2017 -0700 drm/vmwgfx: fix integer overflow in vmw_surface_define_ioctl() commit e7e11f99564222d82f0ce84bd521e57d78a6b678 upstream. In vmw_surface_define_ioctl(), the 'num_sizes' is the sum of the 'req->mip_levels' array. This array can be assigned any value from the user space. As both the 'num_sizes' and the array is uint32_t, it is easy to make 'num_sizes' overflow. The later 'mip_levels' is used as the loop count. This can lead an oob write. Add the check of 'req->mip_levels' to avoid this. Signed-off-by: Li Qiang Reviewed-by: Thomas Hellstrom Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 3a363044271d8007c711d8fad2ad86270c5dab50 Author: Thomas Hellstrom Date: Mon Mar 27 13:06:05 2017 +0200 drm/vmwgfx: Remove getparam error message commit 53e16798b0864464c5444a204e1bb93ae246c429 upstream. The mesa winsys sometimes uses unimplemented parameter requests to check for features. Remove the error message to avoid bloating the kernel log. Signed-off-by: Thomas Hellstrom Reviewed-by: Brian Paul Reviewed-by: Sinclair Yeh Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 354b332cb731caca4811707ae175d08a46cff28e Author: Murray McAllister Date: Mon Mar 27 11:15:12 2017 +0200 drm/vmwgfx: avoid calling vzalloc with a 0 size in vmw_get_cap_3d_ioctl() commit 63774069d9527a1aeaa4aa20e929ef5e8e9ecc38 upstream. In vmw_get_cap_3d_ioctl(), a user can supply 0 for a size that is used in vzalloc(). This eventually calls dump_stack() (in warn_alloc()), which can leak useful addresses to dmesg. Add check to avoid a size of 0. Signed-off-by: Murray McAllister Reviewed-by: Sinclair Yeh Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 30cd45a69bc69b191b0f6c14b317a6aeb4771439 Author: Murray McAllister Date: Mon Mar 27 11:12:53 2017 +0200 drm/vmwgfx: NULL pointer dereference in vmw_surface_define_ioctl() commit 36274ab8c596f1240c606bb514da329add2a1bcd upstream. Before memory allocations vmw_surface_define_ioctl() checks the upper-bounds of a user-supplied size, but does not check if the supplied size is 0. Add check to avoid NULL pointer dereferences. Signed-off-by: Murray McAllister Reviewed-by: Sinclair Yeh Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 02f1f3f6b8eea3f16c8ada4da8a89688bf4cd525 Author: Brendan McGrath Date: Sat Jan 7 08:01:38 2017 +1100 HID: i2c-hid: Add sleep between POWER ON and RESET commit a89af4abdf9b353cdd6f61afc0eaaac403304873 upstream. Support for the Asus Touchpad was recently added. It turns out this device can fail initialisation (and become unusable) when the RESET command is sent too soon after the POWER ON command. Unfortunately the i2c-hid specification does not specify the need for a delay between these two commands. But it was discovered the Windows driver has a 1ms delay. As a result, this patch modifies the i2c-hid module to add a sleep inbetween the POWER ON and RESET commands which lasts between 1ms and 5ms. See https://github.com/vlasenko/hid-asus-dkms/issues/24 for further details. Signed-off-by: Brendan McGrath Reviewed-by: Benjamin Tissoires Signed-off-by: Jiri Kosina Signed-off-by: Willy Tarreau commit 2ee4455f07c9e4fffe2ff126c003f31f55b26ff6 Author: Ardinartsev Nikita Date: Thu Jan 26 16:54:42 2017 +0300 HID: hid-lg: Fix immediate disconnection of Logitech Rumblepad 2 commit 877a021e08ccb6434718c0cc781fdf943c884cc0 upstream. With NOGET quirk Logitech F510 is now fully workable in dinput mode including rumble effects (according to fftest). Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=117091 [jkosina@suse.cz: fix patch format] Signed-off-by: Ardinartsev Nikita Acked-by: Benjamin Tissoires Signed-off-by: Jiri Kosina Signed-off-by: Willy Tarreau commit 7a76e42ecca6c06b1551120df6ea5c35f524f6ed Author: Jason A. Donenfeld Date: Thu Mar 23 12:24:43 2017 +0100 padata: avoid race in reordering commit de5540d088fe97ad583cc7d396586437b32149a5 upstream. Under extremely heavy uses of padata, crashes occur, and with list debugging turned on, this happens instead: [87487.298728] WARNING: CPU: 1 PID: 882 at lib/list_debug.c:33 __list_add+0xae/0x130 [87487.301868] list_add corruption. prev->next should be next (ffffb17abfc043d0), but was ffff8dba70872c80. (prev=ffff8dba70872b00). [87487.339011] [] dump_stack+0x68/0xa3 [87487.342198] [] ? console_unlock+0x281/0x6d0 [87487.345364] [] __warn+0xff/0x140 [87487.348513] [] warn_slowpath_fmt+0x4a/0x50 [87487.351659] [] __list_add+0xae/0x130 [87487.354772] [] ? _raw_spin_lock+0x64/0x70 [87487.357915] [] padata_reorder+0x1e6/0x420 [87487.361084] [] padata_do_serial+0xa5/0x120 padata_reorder calls list_add_tail with the list to which its adding locked, which seems correct: spin_lock(&squeue->serial.lock); list_add_tail(&padata->list, &squeue->serial.list); spin_unlock(&squeue->serial.lock); This therefore leaves only place where such inconsistency could occur: if padata->list is added at the same time on two different threads. This pdata pointer comes from the function call to padata_get_next(pd), which has in it the following block: next_queue = per_cpu_ptr(pd->pqueue, cpu); padata = NULL; reorder = &next_queue->reorder; if (!list_empty(&reorder->list)) { padata = list_entry(reorder->list.next, struct padata_priv, list); spin_lock(&reorder->lock); list_del_init(&padata->list); atomic_dec(&pd->reorder_objects); spin_unlock(&reorder->lock); pd->processed++; goto out; } out: return padata; I strongly suspect that the problem here is that two threads can race on reorder list. Even though the deletion is locked, call to list_entry is not locked, which means it's feasible that two threads pick up the same padata object and subsequently call list_add_tail on them at the same time. The fix is thus be hoist that lock outside of that block. Signed-off-by: Jason A. Donenfeld Acked-by: Steffen Klassert Signed-off-by: Herbert Xu Signed-off-by: Willy Tarreau commit 535c7d619017d41a03a5b8b475a8348a9646af2b Author: Uwe Kleine-König Date: Sat Jul 2 17:28:10 2016 +0200 rtc: s35390a: improve irq handling commit 3bd32722c827d00eafe8e6d5b83e9f3148ea7c7e upstream. On some QNAP NAS devices the rtc can wake the machine. Several people noticed that once the machine was woken this way it fails to shut down. That's because the driver fails to acknowledge the interrupt and so it keeps active and restarts the machine immediatly after shutdown. See https://bugs.debian.org/794266 for a bug report. Doing this correctly requires to interpret the INT2 flag of the first read of the STATUS1 register because this bit is cleared by read. Note this is not maximally robust though because a pending irq isn't detected when the STATUS1 register was already read (and so INT2 is not set) but the irq was not disabled. But that is a hardware imposed problem that cannot easily be fixed by software. Signed-off-by: Uwe Kleine-König Signed-off-by: Alexandre Belloni Signed-off-by: Willy Tarreau commit df4be7e66b1e801140e65b275bcc67809610ad7a Author: Uwe Kleine-König Date: Sat Jul 2 17:28:09 2016 +0200 rtc: s35390a: implement reset routine as suggested by the reference commit 8e6583f1b5d1f5f129b873f1428b7e414263d847 upstream. There were two deviations from the reference manual: you have to wait half a second when POC is active and you might have to repeat initialization when POC or BLD are still set after the sequence. Note however that as POC and BLD are cleared by read the driver might not be able to detect that a reset is necessary. I don't have a good idea how to fix this. Additionally report the value read from STATUS1 to the caller. This prepares the next patch. Signed-off-by: Uwe Kleine-König Signed-off-by: Alexandre Belloni Signed-off-by: Willy Tarreau commit 231b0f80d19c96a2083fc735a116882e8db92df5 Author: Uwe Kleine-König Date: Mon Apr 3 23:32:38 2017 +0200 rtc: s35390a: make sure all members in the output are set commit ac4d4f65bbcba478309de36929016d2618421ba1 upstream. The rtc core calls the .read_alarm with all fields initialized to 0. As the s35390a driver doesn't touch some fields the returned date is interpreted as a date in January 1900. So make sure all fields are set to -1; some of them are then overwritten with the right data depending on the hardware state. In mainline this is done by commit d68778b80dd7 ("rtc: initialize output parameter for read alarm to "uninitialized"") in the core. This is considered to dangerous for stable as it might have side effects for other rtc drivers that might for example rely on alarm->time.tm_sec being initialized to 0. Signed-off-by: Uwe Kleine-König Signed-off-by: Willy Tarreau commit 1812b5cac00ba52bca38743bedca01ad3c08d338 Author: Arnd Bergmann Date: Wed Apr 19 19:47:04 2017 +0200 ACPI / power: Avoid maybe-uninitialized warning commit fe8c470ab87d90e4b5115902dd94eced7e3305c3 upstream. gcc -O2 cannot always prove that the loop in acpi_power_get_inferred_state() is enterered at least once, so it assumes that cur_state might not get initialized: drivers/acpi/power.c: In function 'acpi_power_get_inferred_state': drivers/acpi/power.c:222:9: error: 'cur_state' may be used uninitialized in this function [-Werror=maybe-uninitialized] This sets the variable to zero at the start of the loop, to ensure that there is well-defined behavior even for an empty list. This gets rid of the warning. The warning first showed up when the -Os flag got removed in a bug fix patch in linux-4.11-rc5. I would suggest merging this addon patch on top of that bug fix to avoid introducing a new warning in the stable kernels. Fixes: 61b79e16c68d (ACPI: Fix incompatibility with mcount-based function graph tracing) Signed-off-by: Arnd Bergmann Signed-off-by: Rafael J. Wysocki Signed-off-by: Willy Tarreau commit 42fdd36a05233cadd17a24e530c527203c2f02a8 Author: Josh Poimboeuf Date: Thu Mar 16 08:56:28 2017 -0500 ACPI: Fix incompatibility with mcount-based function graph tracing commit 61b79e16c68d703dde58c25d3935d67210b7d71b upstream. Paul Menzel reported a warning: WARNING: CPU: 0 PID: 774 at /build/linux-ROBWaj/linux-4.9.13/kernel/trace/trace_functions_graph.c:233 ftrace_return_to_handler+0x1aa/0x1e0 Bad frame pointer: expected f6919d98, received f6919db0 from func acpi_pm_device_sleep_wake return to c43b6f9d The warning means that function graph tracing is broken for the acpi_pm_device_sleep_wake() function. That's because the ACPI Makefile unconditionally sets the '-Os' gcc flag to optimize for size. That's an issue because mcount-based function graph tracing is incompatible with '-Os' on x86, thanks to the following gcc bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42109 I have another patch pending which will ensure that mcount-based function graph tracing is never used with CONFIG_CC_OPTIMIZE_FOR_SIZE on x86. But this patch is needed in addition to that one because the ACPI Makefile overrides that config option for no apparent reason. It has had this flag since the beginning of git history, and there's no related comment, so I don't know why it's there. As far as I can tell, there's no reason for it to be there. The appropriate behavior is for it to honor CONFIG_CC_OPTIMIZE_FOR_{SIZE,PERFORMANCE} like the rest of the kernel. Reported-by: Paul Menzel Signed-off-by: Josh Poimboeuf Acked-by: Steven Rostedt (VMware) Signed-off-by: Rafael J. Wysocki Signed-off-by: Willy Tarreau commit edaad97b7b66222e51a2462f60d06fe3ebb55bca Author: Ilya Dryomov Date: Tue Mar 21 13:44:28 2017 +0100 libceph: force GFP_NOIO for socket allocations commit 633ee407b9d15a75ac9740ba9d3338815e1fcb95 upstream. sock_alloc_inode() allocates socket+inode and socket_wq with GFP_KERNEL, which is not allowed on the writeback path: Workqueue: ceph-msgr con_work [libceph] ffff8810871cb018 0000000000000046 0000000000000000 ffff881085d40000 0000000000012b00 ffff881025cad428 ffff8810871cbfd8 0000000000012b00 ffff880102fc1000 ffff881085d40000 ffff8810871cb038 ffff8810871cb148 Call Trace: [] schedule+0x29/0x70 [] schedule_timeout+0x1bd/0x200 [] ? ttwu_do_wakeup+0x2c/0x120 [] ? ttwu_do_activate.constprop.135+0x66/0x70 [] wait_for_completion+0xbf/0x180 [] ? try_to_wake_up+0x390/0x390 [] flush_work+0x165/0x250 [] ? worker_detach_from_pool+0xd0/0xd0 [] xlog_cil_force_lsn+0x81/0x200 [xfs] [] ? __slab_free+0xee/0x234 [] _xfs_log_force_lsn+0x4d/0x2c0 [xfs] [] ? lookup_page_cgroup_used+0xe/0x30 [] ? xfs_reclaim_inode+0xa3/0x330 [xfs] [] xfs_log_force_lsn+0x3f/0xf0 [xfs] [] ? xfs_reclaim_inode+0xa3/0x330 [xfs] [] xfs_iunpin_wait+0xc6/0x1a0 [xfs] [] ? wake_atomic_t_function+0x40/0x40 [] xfs_reclaim_inode+0xa3/0x330 [xfs] [] xfs_reclaim_inodes_ag+0x257/0x3d0 [xfs] [] xfs_reclaim_inodes_nr+0x33/0x40 [xfs] [] xfs_fs_free_cached_objects+0x15/0x20 [xfs] [] super_cache_scan+0x178/0x180 [] shrink_slab_node+0x14e/0x340 [] ? mem_cgroup_iter+0x16b/0x450 [] shrink_slab+0x100/0x140 [] do_try_to_free_pages+0x335/0x490 [] try_to_free_pages+0xb9/0x1f0 [] ? __alloc_pages_direct_compact+0x69/0x1be [] __alloc_pages_nodemask+0x69a/0xb40 [] alloc_pages_current+0x9e/0x110 [] new_slab+0x2c5/0x390 [] __slab_alloc+0x33b/0x459 [] ? sock_alloc_inode+0x2d/0xd0 [] ? inet_sendmsg+0x71/0xc0 [] ? sock_alloc_inode+0x2d/0xd0 [] kmem_cache_alloc+0x1a2/0x1b0 [] sock_alloc_inode+0x2d/0xd0 [] alloc_inode+0x26/0xa0 [] new_inode_pseudo+0x1a/0x70 [] sock_alloc+0x1e/0x80 [] __sock_create+0x95/0x220 [] sock_create_kern+0x24/0x30 [] con_work+0xef9/0x2050 [libceph] [] ? rbd_img_request_submit+0x4c/0x60 [rbd] [] process_one_work+0x159/0x4f0 [] worker_thread+0x11b/0x530 [] ? create_worker+0x1d0/0x1d0 [] kthread+0xc9/0xe0 [] ? flush_kthread_worker+0x90/0x90 [] ret_from_fork+0x58/0x90 [] ? flush_kthread_worker+0x90/0x90 Use memalloc_noio_{save,restore}() to temporarily force GFP_NOIO here. Link: http://tracker.ceph.com/issues/19309 Reported-by: Sergey Jerusalimov Signed-off-by: Ilya Dryomov Reviewed-by: Jeff Layton Signed-off-by: Greg Kroah-Hartman Signed-off-by: Willy Tarreau commit 80fb16c096858f6bb6aac539644bc474cf276281 Author: Dave Martin Date: Mon Mar 27 15:10:57 2017 +0100 metag/ptrace: Reject partial NT_METAG_RPIPE writes commit 7195ee3120d878259e8d94a5d9f808116f34d5ea upstream. It's not clear what behaviour is sensible when doing partial write of NT_METAG_RPIPE, so just don't bother. This patch assumes that userspace will never rely on a partial SETREGSET in this case, since it's not clear what should happen anyway. Signed-off-by: Dave Martin Acked-by: James Hogan Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit 137b9e272f4c4f93a7681c20b109663b45e22ada Author: Dave Martin Date: Mon Mar 27 15:10:56 2017 +0100 metag/ptrace: Provide default TXSTATUS for short NT_PRSTATUS commit 5fe81fe98123ce41265c65e95d34418d30d005d1 upstream. Ensure that if userspace supplies insufficient data to PTRACE_SETREGSET to fill TXSTATUS, a well-defined default value is used, based on the task's current value. Suggested-by: James Hogan Signed-off-by: Dave Martin Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit 1dc7c2d250c5d5900b9d3f40b2e6b2dc99bf0f63 Author: Dave Martin Date: Mon Mar 27 15:10:55 2017 +0100 metag/ptrace: Preserve previous registers for short regset write commit a78ce80d2c9178351b34d78fec805140c29c193e upstream. Ensure that if userspace supplies insufficient data to PTRACE_SETREGSET to fill all the registers, the thread's old registers are preserved. Signed-off-by: Dave Martin Acked-by: James Hogan Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit e1e277ff7334f0d9765c28cf2ff02da5b4d434ff Author: Dave Martin Date: Mon Mar 27 15:10:59 2017 +0100 sparc/ptrace: Preserve previous registers for short regset write commit d3805c546b275c8cc7d40f759d029ae92c7175f2 upstream. Ensure that if userspace supplies insufficient data to PTRACE_SETREGSET to fill all the registers, the thread's old registers are preserved. Signed-off-by: Dave Martin Acked-by: David S. Miller Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit 7d3e34e259aaaea83cfb73d545bd60c8f579463c Author: Dave Martin Date: Mon Mar 27 15:10:53 2017 +0100 c6x/ptrace: Remove useless PTRACE_SETREGSET implementation commit fb411b837b587a32046dc4f369acb93a10b1def8 upstream. gpr_set won't work correctly and can never have been tested, and the correct behaviour is not clear due to the endianness-dependent task layout. So, just remove it. The core code will now return -EOPNOTSUPPORT when trying to set NT_PRSTATUS on this architecture until/unless a correct implementation is supplied. Signed-off-by: Dave Martin Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit e2e1170de69a3d518eeb3339068b1ef75ffe0be6 Author: Ladi Prosek Date: Thu Mar 23 08:04:18 2017 +0100 virtio_balloon: init 1st buffer in stats vq commit fc8653228c8588a120f6b5dad6983b7b61ff669e upstream. When init_vqs runs, virtio_balloon.stats is either uninitialized or contains stale values. The host updates its state with garbage data because it has no way of knowing that this is just a marker buffer used for signaling. This patch updates the stats before pushing the initial buffer. Alternative fixes: * Push an empty buffer in init_vqs. Not easily done with the current virtio implementation and violates the spec "Driver MUST supply the same subset of statistics in all buffers submitted to the statsq". * Push a buffer with invalid tags in init_vqs. Violates the same spec clause, plus "invalid tag" is not really defined. Note: the spec says: When using the legacy interface, the device SHOULD ignore all values in the first buffer in the statsq supplied by the driver after device initialization. Note: Historically, drivers supplied an uninitialized buffer in the first buffer. Unfortunately QEMU does not seem to implement the recommendation even for the legacy interface. Signed-off-by: Ladi Prosek Signed-off-by: Michael S. Tsirkin Signed-off-by: Willy Tarreau commit 0907dbed07d2aeb1c729a61fe114dcf6b8bb85fb Author: Jiri Slaby Date: Thu Dec 15 14:31:01 2016 +0100 crypto: algif_hash - avoid zero-sized array commit 6207119444595d287b1e9e83a2066c17209698f3 upstream. With this reproducer: struct sockaddr_alg alg = { .salg_family = 0x26, .salg_type = "hash", .salg_feat = 0xf, .salg_mask = 0x5, .salg_name = "digest_null", }; int sock, sock2; sock = socket(AF_ALG, SOCK_SEQPACKET, 0); bind(sock, (struct sockaddr *)&alg, sizeof(alg)); sock2 = accept(sock, NULL, NULL); setsockopt(sock, SOL_ALG, ALG_SET_KEY, "\x9b\xca", 2); accept(sock2, NULL, NULL); ==== 8< ======== 8< ======== 8< ======== 8< ==== one can immediatelly see an UBSAN warning: UBSAN: Undefined behaviour in crypto/algif_hash.c:187:7 variable length array bound value 0 <= 0 CPU: 0 PID: 15949 Comm: syz-executor Tainted: G E 4.4.30-0-default #1 ... Call Trace: ... [] ? __ubsan_handle_vla_bound_not_positive+0x13d/0x188 [] ? __ubsan_handle_out_of_bounds+0x1bc/0x1bc [] ? hash_accept+0x5bd/0x7d0 [algif_hash] [] ? hash_accept_nokey+0x3f/0x51 [algif_hash] [] ? hash_accept_parent_nokey+0x4a0/0x4a0 [algif_hash] [] ? SyS_accept+0x2b/0x40 It is a correct warning, as hash state is propagated to accept as zero, but creating a zero-length variable array is not allowed in C. Fix this as proposed by Herbert -- do "?: 1" on that site. No sizeof or similar happens in the code there, so we just allocate one byte even though we do not use the array. Signed-off-by: Jiri Slaby Cc: Herbert Xu Cc: "David S. Miller" (maintainer:CRYPTO API) Reported-by: Sasha Levin Signed-off-by: Herbert Xu Cc: Arnd Bergmann Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 88085d855b3ed3e892e3ed072ae0784f12a3a91e Author: Takashi Iwai Date: Wed Jan 11 17:09:50 2017 +0100 fbcon: Fix vc attr at deinit commit 8aac7f34369726d1a158788ae8aff3002d5eb528 upstream. fbcon can deal with vc_hi_font_mask (the upper 256 chars) and adjust the vc attrs dynamically when vc_hi_font_mask is changed at fbcon_init(). When the vc_hi_font_mask is set, it remaps the attrs in the existing console buffer with one bit shift up (for 9 bits), while it remaps with one bit shift down (for 8 bits) when the value is cleared. It works fine as long as the font gets updated after fbcon was initialized. However, we hit a bizarre problem when the console is switched to another fb driver (typically from vesafb or efifb to drmfb). At switching to the new fb driver, we temporarily rebind the console to the dummy console, then rebind to the new driver. During the switching, we leave the modified attrs as is. Thus, the new fbcon takes over the old buffer as if it were to contain 8 bits chars (although the attrs are still shifted for 9 bits), and effectively this results in the yellow color texts instead of the original white color, as found in the bugzilla entry below. An easy fix for this is to re-adjust the attrs before leaving the fbcon at con_deinit callback. Since the code to adjust the attrs is already present in the current fbcon code, in this patch, we simply factor out the relevant code, and call it from fbcon_deinit(). Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1000619 Signed-off-by: Takashi Iwai Signed-off-by: Bartlomiej Zolnierkiewicz Cc: Arnd Bergmann Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 935896e82abfebe9786716f2fa45703752a099bf Author: Henrik Ingo Date: Sat Mar 25 21:48:16 2017 +0530 uvcvideo: uvc_scan_fallback() for webcams with broken chain commit e950267ab802c8558f1100eafd4087fd039ad634 upstream. Some devices have invalid baSourceID references, causing uvc_scan_chain() to fail, but if we just take the entities we can find and put them together in the most sensible chain we can think of, turns out they do work anyway. Note: This heuristic assumes there is a single chain. At the time of writing, devices known to have such a broken chain are - Acer Integrated Camera (5986:055a) - Realtek rtl157a7 (0bda:57a7) Signed-off-by: Henrik Ingo Signed-off-by: Laurent Pinchart Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sumit Semwal Signed-off-by: Willy Tarreau commit ecf229c3de27f918fede7409aaf4cbde471ecb27 Author: Adrian Hunter Date: Mon Mar 20 19:50:29 2017 +0200 mmc: sdhci: Do not disable interrupts while waiting for clock commit e2ebfb2142acefecc2496e71360f50d25726040b upstream. Disabling interrupts for even a millisecond can cause problems for some devices. That can happen when sdhci changes clock frequency because it waits for the clock to become stable under a spin lock. The spin lock is not necessary here. Anything that is racing with changes to the I/O state is already broken. The mmc core already provides synchronization via "claiming" the host. Although the spin lock probably should be removed from the code paths that lead to this point, such a patch would touch too much code to be suitable for stable trees. Consequently, for this patch, just drop the spin lock while waiting. Signed-off-by: Adrian Hunter Signed-off-by: Ulf Hansson Tested-by: Ludovic Desroches Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 7eb0374df6d09b065f7b8e5eeb6daea5001e56c9 Author: Oliver Neukum Date: Tue Mar 14 12:09:56 2017 +0100 ACM gadget: fix endianness in notifications commit cdd7928df0d2efaa3270d711963773a08a4cc8ab upstream. The gadget code exports the bitfield for serial status changes over the wire in its internal endianness. The fix is to convert to little endian before sending it over the wire. Signed-off-by: Oliver Neukum Tested-by: 家瑋 Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit f754ccc27550ca3e63ac7761a54874ad95044051 Author: Eric Dumazet Date: Wed Mar 22 08:10:21 2017 -0700 tcp: initialize icsk_ack.lrcvtime at session start time commit 15bb7745e94a665caf42bfaabf0ce062845b533b upstream. icsk_ack.lrcvtime has a 0 value at socket creation time. tcpi_last_data_recv can have bogus value if no payload is ever received. This patch initializes icsk_ack.lrcvtime for active sessions in tcp_finish_connect(), and for passive sessions in tcp_create_openreq_child() Signed-off-by: Eric Dumazet Acked-by: Neal Cardwell Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 7dbfa253c3832f4107ce019d026c7892b7d7ec1a Author: Eric Dumazet Date: Tue Mar 21 19:22:28 2017 -0700 ipv4: provide stronger user input validation in nl_fib_input() commit c64c0b3cac4c5b8cb093727d2c19743ea3965c0b upstream. Alexander reported a KMSAN splat caused by reads of uninitialized field (tb_id_in) from user provided struct fib_result_nl It turns out nl_fib_input() sanity tests on user input is a bit wrong : User can pretend nlh->nlmsg_len is big enough, but provide at sendmsg() time a too small buffer. Reported-by: Alexander Potapenko Signed-off-by: Eric Dumazet Signed-off-by: Willy Tarreau commit 6e37a689238c8bbb5e1de45588a00aa716967089 Author: Todd Fujinaka Date: Fri Mar 17 00:48:19 2017 +0000 igb: add i211 to i210 PHY workaround commit 5bc8c230e2a993b49244f9457499f17283da9ec7 upstream. i210 and i211 share the same PHY but have different PCI IDs. Don't forget i211 for any i210 workarounds. Signed-off-by: Todd Fujinaka Tested-by: Aaron Brown Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin Signed-off-by: Willy Tarreau commit cc7f31bf7726f63038c5e3caa014db9e383314df Author: Chris J Arges Date: Fri Mar 17 00:48:19 2017 +0000 igb: Workaround for igb i210 firmware issue commit 4e684f59d760a2c7c716bb60190783546e2d08a1 upstream. Sometimes firmware may not properly initialize I347AT4_PAGE_SELECT causing the probe of an igb i210 NIC to fail. This patch adds an addition zeroing of this register during igb_get_phy_id to workaround this issue. Thanks for Jochen Henneberg for the idea and original patch. Signed-off-by: Chris J Arges Tested-by: Aaron Brown Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin Signed-off-by: Willy Tarreau commit 741c8ac523d0f45031c07bd26a057af3d9b4e5a8 Author: Rafael J. Wysocki Date: Wed Mar 15 00:12:16 2017 +0100 cpufreq: Fix and clean up show_cpuinfo_cur_freq() commit 9b4f603e7a9f4282aec451063ffbbb8bb410dcd9 upstream. There is a missing newline in show_cpuinfo_cur_freq(), so add it, but while at it clean that function up somewhat too. Signed-off-by: Rafael J. Wysocki Acked-by: Viresh Kumar Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit ab6c86a0042d93b3ca15bbea29e22266a85a400f Author: Sebastian Ott Date: Fri Apr 15 09:41:35 2016 +0200 s390/pci: fix use after free in dma_init commit dba599091c191d209b1499511a524ad9657c0e5a upstream. After a failure during registration of the dma_table (because of the function being in error state) we free its memory but don't reset the associated pointer to zero. When we then receive a notification from firmware (about the function being in error state) we'll try to walk and free the dma_table again. Fix this by resetting the dma_table pointer. In addition to that make sure that we free the iommu_bitmap when appropriate. Signed-off-by: Sebastian Ott Reviewed-by: Gerald Schaefer Signed-off-by: Martin Schwidefsky Cc: Sumit Semwal Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 2b49b1938b5c74e90748d56999edf939e3251f24 Author: Vitaly Kuznetsov Date: Sat Apr 30 19:21:35 2016 -0700 Drivers: hv: balloon: don't crash when memory is added in non-sorted order commit 77c0c9735bc0ba5898e637a3a20d6bcb50e3f67d upstream. When we iterate through all HA regions in handle_pg_range() we have an assumption that all these regions are sorted in the list and the 'start_pfn >= has->end_pfn' check is enough to find the proper region. Unfortunately it's not the case with WS2016 where host can hot-add regions in a different order. We end up modifying the wrong HA region and crashing later on pages online. Modify the check to make sure we found the region we were searching for while iterating. Fix the same check in pfn_covered() as well. Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan Cc: Sumit Semwal Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 4bbe571cd9552b96303e0228818f9d002065e85b Author: Alex Hung Date: Fri May 27 15:47:06 2016 +0800 ACPI / video: skip evaluating _DOD when it does not exist commit e34fbbac669de0b7fb7803929d0477f35f6e2833 upstream. Some system supports hybrid graphics and its discrete VGA does not have any connectors and therefore has no _DOD method. Signed-off-by: Alex Hung Reviewed-by: Aaron Lu Signed-off-by: Rafael J. Wysocki Cc: Sumit Semwal Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit a4da2b00fdab6cb7d60c4e3d7907933a62a3a893 Author: Wang, Rui Y Date: Sun Nov 29 22:45:34 2015 +0800 crypto: cryptd - Assign statesize properly commit 1a07834024dfca5c4bed5de8f8714306e0a11836 upstream. cryptd_create_hash() fails by returning -EINVAL. It is because after 8996eafdc ("crypto: ahash - ensure statesize is non-zero") all ahash drivers must have a non-zero statesize. This patch fixes the problem by properly assigning the statesize. Signed-off-by: Rui Wang Signed-off-by: Herbert Xu Cc: Sumit Semwal Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 4324a47c8e6c5bd6d4672a29b6c541d4f516cc1c Author: Wang, Rui Y Date: Sun Nov 29 22:45:33 2015 +0800 crypto: ghash-clmulni - Fix load failure commit 3a020a723c65eb8ffa7c237faca26521a024e582 upstream. ghash_clmulni_intel fails to load on Linux 4.3+ with the following message: "modprobe: ERROR: could not insert 'ghash_clmulni_intel': Invalid argument" After 8996eafdc ("crypto: ahash - ensure statesize is non-zero") all ahash drivers are required to implement import()/export(), and must have a non- zero statesize. This patch has been tested with the algif_hash interface. The calculated digest values, after several rounds of import()s and export()s, match those calculated by tcrypt. Signed-off-by: Rui Wang Signed-off-by: Herbert Xu Cc: Sumit Semwal Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 42be60ff1c448307ba0fb8ac05eeab591796b3e6 Author: Zhaohongjiang Date: Mon Oct 12 15:28:39 2015 +1100 cancel the setfilesize transation when io error happen commit 510c971aeaaebf0dce7a45d16dc3eb9eab1c8340 upstream. Commit 5cb13dcd0fac071b45c4bebe1801a08ff0d89cad upstream. When I ran xfstest/073 case, the remount process was blocked to wait transactions to be zero. I found there was a io error happened, and the setfilesize transaction was not released properly. We should add the changes to cancel the io error in this case. Reproduction steps: 1. dd if=/dev/zero of=xfs1.img bs=1M count=2048 2. mkfs.xfs xfs1.img 3. losetup -f ./xfs1.img /dev/loop0 4. mount -t xfs /dev/loop0 /home/test_dir/ 5. mkdir /home/test_dir/test 6. mkfs.xfs -dfile,name=image,size=2g 7. mount -t xfs -o loop image /home/test_dir/test 8. cp a file bigger than 2g to /home/test_dir/test 9. mount -t xfs -o remount,ro /home/test_dir/test [ dchinner: moved io error detection to xfs_setfilesize_ioend() after transaction context restoration. ] [ nborisov: Adjusted context for 3.12 ] Signed-off-by: Zhao Hongjiang Signed-off-by: Dave Chinner Reviewed-by: Christoph Hellwig Signed-off-by: Nikolay Borisov Signed-off-by: Willy Tarreau commit 46d284b7ac7435478a7bc934256c736e4615885f Author: Linus Torvalds Date: Thu Mar 2 12:17:22 2017 -0800 give up on gcc ilog2() constant optimizations commit 474c90156c8dcc2fa815e6716cc9394d7930cb9c upstream. gcc-7 has an "optimization" pass that completely screws up, and generates the code expansion for the (impossible) case of calling ilog2() with a zero constant, even when the code gcc compiles does not actually have a zero constant. And we try to generate a compile-time error for anybody doing ilog2() on a constant where that doesn't make sense (be it zero or negative). So now gcc7 will fail the build due to our sanity checking, because it created that constant-zero case that didn't actually exist in the source code. There's a whole long discussion on the kernel mailing about how to work around this gcc bug. The gcc people themselevs have discussed their "feature" in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72785 but it's all water under the bridge, because while it looked at one point like it would be solved by the time gcc7 was released, that was not to be. So now we have to deal with this compiler braindamage. And the only simple approach seems to be to just delete the code that tries to warn about bad uses of ilog2(). So now "ilog2()" will just return 0 not just for the value 1, but for any non-positive value too. It's not like I can recall anybody having ever actually tried to use this function on any invalid value, but maybe the sanity check just meant that such code never made it out in public. [js] no tools/include/linux/log2.h copy of that yet Reported-by: Laura Abbott Cc: John Stultz , Cc: Thomas Gleixner Cc: Ard Biesheuvel Signed-off-by: Linus Torvalds Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 4401c71d0caf324b075cd0aa5c6f77e99d5a6ac7 Author: Peter Zijlstra Date: Sat Mar 4 10:27:19 2017 +0100 futex: Add missing error handling to FUTEX_REQUEUE_PI commit 9bbb25afeb182502ca4f2c4f3f88af0681b34cae upstream. Thomas spotted that fixup_pi_state_owner() can return errors and we fail to unlock the rt_mutex in that case. Reported-by: Thomas Gleixner Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Darren Hart Cc: juri.lelli@arm.com Cc: bigeasy@linutronix.de Cc: xlpang@redhat.com Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@efficios.com Cc: jdesfossez@efficios.com Cc: dvhart@infradead.org Cc: bristot@redhat.com Link: http://lkml.kernel.org/r/20170304093558.867401760@infradead.org Signed-off-by: Thomas Gleixner Signed-off-by: Willy Tarreau commit 8f4a52d25a0d395f94299707bc3198a3dd3acc4c Author: Peter Zijlstra Date: Sat Mar 4 10:27:18 2017 +0100 futex: Fix potential use-after-free in FUTEX_REQUEUE_PI commit c236c8e95a3d395b0494e7108f0d41cf36ec107c upstream. While working on the futex code, I stumbled over this potential use-after-free scenario. Dmitry triggered it later with syzkaller. pi_mutex is a pointer into pi_state, which we drop the reference on in unqueue_me_pi(). So any access to that pointer after that is bad. Since other sites already do rt_mutex_unlock() with hb->lock held, see for example futex_lock_pi(), simply move the unlock before unqueue_me_pi(). Reported-by: Dmitry Vyukov Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Darren Hart Cc: juri.lelli@arm.com Cc: bigeasy@linutronix.de Cc: xlpang@redhat.com Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@efficios.com Cc: jdesfossez@efficios.com Cc: dvhart@infradead.org Cc: bristot@redhat.com Link: http://lkml.kernel.org/r/20170304093558.801744246@infradead.org Signed-off-by: Thomas Gleixner Signed-off-by: Willy Tarreau commit 3710b15ed5301d3af5e274f244f135206b794100 Author: Hannes Frederic Sowa Date: Mon Mar 13 00:01:30 2017 +0100 dccp: fix memory leak during tear-down of unsuccessful connection request commit 72ef9c4125c7b257e3a714d62d778ab46583d6a3 upstream. This patch fixes a memory leak, which happens if the connection request is not fulfilled between parsing the DCCP options and handling the SYN (because e.g. the backlog is full), because we forgot to free the list of ack vectors. Reported-by: Jianwen Ji Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit cb93d3f48930b2e435614570613e7c1d7810a6d1 Author: Florian Westphal Date: Mon Mar 13 16:24:28 2017 +0100 ipv6: avoid write to a possibly cloned skb commit 79e49503efe53a8c51d8b695bedc8a346c5e4a87 upstream. ip6_fragment, in case skb has a fraglist, checks if the skb is cloned. If it is, it will move to the 'slow path' and allocates new skbs for each fragment. However, right before entering the slowpath loop, it updates the nexthdr value of the last ipv6 extension header to NEXTHDR_FRAGMENT, to account for the fragment header that will be inserted in the new ipv6-fragment skbs. In case original skb is cloned this munges nexthdr value of another skb. Avoid this by doing the nexthdr update for each of the new fragment skbs separately. This was observed with tcpdump on a bridge device where netfilter ipv6 reassembly is active: tcpdump shows malformed fragment headers as the l4 header (icmpv6, tcp, etc). is decoded as a fragment header. Cc: Hannes Frederic Sowa Reported-by: Andreas Karis Signed-off-by: Florian Westphal Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit ce01649fc5a0ed37320149e2e6810ff7fe94e170 Author: Dmitry V. Levin Date: Tue Mar 7 23:50:50 2017 +0300 uapi: fix linux/packet_diag.h userspace compilation error commit 745cb7f8a5de0805cade3de3991b7a95317c7c73 upstream. Replace MAX_ADDR_LEN with its numeric value to fix the following linux/packet_diag.h userspace compilation error: /usr/include/linux/packet_diag.h:67:17: error: 'MAX_ADDR_LEN' undeclared here (not in a function) __u8 pdmc_addr[MAX_ADDR_LEN]; This is not the first case in the UAPI where the numeric value of MAX_ADDR_LEN is used instead of symbolic one, uapi/linux/if_link.h already does the same: $ grep MAX_ADDR_LEN include/uapi/linux/if_link.h __u8 mac[32]; /* MAX_ADDR_LEN */ There are no UAPI headers besides these two that use MAX_ADDR_LEN. Signed-off-by: Dmitry V. Levin Acked-by: Pavel Emelyanov Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 9b4c2e72f845a640a8dea56b8feebcf426df140b Author: Eric Dumazet Date: Fri Mar 3 14:08:21 2017 -0800 tcp: fix various issues for sockets morphing to listen state commit 02b2faaf0af1d85585f6d6980e286d53612acfc2 upstream. Dmitry Vyukov reported a divide by 0 triggered by syzkaller, exploiting tcp_disconnect() path that was never really considered and/or used before syzkaller ;) I was not able to reproduce the bug, but it seems issues here are the three possible actions that assumed they would never trigger on a listener. 1) tcp_write_timer_handler 2) tcp_delack_timer_handler 3) MTU reduction Only IPv6 MTU reduction was properly testing TCP_CLOSE and TCP_LISTEN states from tcp_v6_mtu_reduced() Signed-off-by: Eric Dumazet Reported-by: Dmitry Vyukov Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 0363343e50f8aad7462cd0f2017d9eb4ae402138 Author: Arnaldo Carvalho de Melo Date: Wed Mar 1 16:35:07 2017 -0300 dccp: Unlock sock before calling sk_free() commit d5afb6f9b6bb2c57bd0c05e76e12489dc0d037d9 upstream. The code where sk_clone() came from created a new socket and locked it, but then, on the error path didn't unlock it. This problem stayed there for a long while, till b0691c8ee7c2 ("net: Unlock sock before calling sk_free()") fixed it, but unfortunately the callers of sk_clone() (now sk_clone_locked()) were not audited and the one in dccp_create_openreq_child() remained. Now in the age of the syskaller fuzzer, this was finally uncovered, as reported by Dmitry: ---- 8< ---- I've got the following report while running syzkaller fuzzer on 86292b33d4b7 ("Merge branch 'akpm' (patches from Andrew)") [ BUG: held lock freed! ] 4.10.0+ #234 Not tainted ------------------------- syz-executor6/6898 is freeing memory ffff88006286cac0-ffff88006286d3b7, with a lock still held there! (slock-AF_INET6){+.-...}, at: [] spin_lock include/linux/spinlock.h:299 [inline] (slock-AF_INET6){+.-...}, at: [] sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504 5 locks held by syz-executor6/6898: #0: (sk_lock-AF_INET6){+.+.+.}, at: [] lock_sock include/net/sock.h:1460 [inline] #0: (sk_lock-AF_INET6){+.+.+.}, at: [] inet_stream_connect+0x44/0xa0 net/ipv4/af_inet.c:681 #1: (rcu_read_lock){......}, at: [] inet6_csk_xmit+0x12a/0x5d0 net/ipv6/inet6_connection_sock.c:126 #2: (rcu_read_lock){......}, at: [] __skb_unlink include/linux/skbuff.h:1767 [inline] #2: (rcu_read_lock){......}, at: [] __skb_dequeue include/linux/skbuff.h:1783 [inline] #2: (rcu_read_lock){......}, at: [] process_backlog+0x264/0x730 net/core/dev.c:4835 #3: (rcu_read_lock){......}, at: [] ip6_input_finish+0x0/0x1700 net/ipv6/ip6_input.c:59 #4: (slock-AF_INET6){+.-...}, at: [] spin_lock include/linux/spinlock.h:299 [inline] #4: (slock-AF_INET6){+.-...}, at: [] sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504 Fix it just like was done by b0691c8ee7c2 ("net: Unlock sock before calling sk_free()"). Reported-by: Dmitry Vyukov Cc: Cong Wang Cc: Eric Dumazet Cc: Gerrit Renker Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20170301153510.GE15145@kernel.org Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit bc00602be132ecdedf68834e09fb14a76be73078 Author: Alexander Potapenko Date: Wed Mar 1 12:57:20 2017 +0100 net: don't call strlen() on the user buffer in packet_bind_spkt() commit 540e2894f7905538740aaf122bd8e0548e1c34a4 upstream. KMSAN (KernelMemorySanitizer, a new error detection tool) reports use of uninitialized memory in packet_bind_spkt(): Acked-by: Eric Dumazet ================================================================== BUG: KMSAN: use of unitialized memory CPU: 0 PID: 1074 Comm: packet Not tainted 4.8.0-rc6+ #1891 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 0000000000000000 ffff88006b6dfc08 ffffffff82559ae8 ffff88006b6dfb48 ffffffff818a7c91 ffffffff85b9c870 0000000000000092 ffffffff85b9c550 0000000000000000 0000000000000092 00000000ec400911 0000000000000002 Call Trace: [< inline >] __dump_stack lib/dump_stack.c:15 [] dump_stack+0x238/0x290 lib/dump_stack.c:51 [] kmsan_report+0x276/0x2e0 mm/kmsan/kmsan.c:1003 [] __msan_warning+0x5b/0xb0 mm/kmsan/kmsan_instr.c:424 [< inline >] strlen lib/string.c:484 [] strlcpy+0x9d/0x200 lib/string.c:144 [] packet_bind_spkt+0x144/0x230 net/packet/af_packet.c:3132 [] SYSC_bind+0x40d/0x5f0 net/socket.c:1370 [] SyS_bind+0x82/0xa0 net/socket.c:1356 [] entry_SYSCALL_64_fastpath+0x13/0x8f arch/x86/entry/entry_64.o:? chained origin: 00000000eba00911 [] save_stack_trace+0x27/0x50 arch/x86/kernel/stacktrace.c:67 [< inline >] kmsan_save_stack_with_flags mm/kmsan/kmsan.c:322 [< inline >] kmsan_save_stack mm/kmsan/kmsan.c:334 [] kmsan_internal_chain_origin+0x118/0x1e0 mm/kmsan/kmsan.c:527 [] __msan_set_alloca_origin4+0xc3/0x130 mm/kmsan/kmsan_instr.c:380 [] SYSC_bind+0x129/0x5f0 net/socket.c:1356 [] SyS_bind+0x82/0xa0 net/socket.c:1356 [] entry_SYSCALL_64_fastpath+0x13/0x8f arch/x86/entry/entry_64.o:? origin description: ----address@SYSC_bind (origin=00000000eb400911) ================================================================== (the line numbers are relative to 4.8-rc6, but the bug persists upstream) , when I run the following program as root: ===================================== #include #include #include #include int main() { struct sockaddr addr; memset(&addr, 0xff, sizeof(addr)); addr.sa_family = AF_PACKET; int fd = socket(PF_PACKET, SOCK_PACKET, htons(ETH_P_ALL)); bind(fd, &addr, sizeof(addr)); return 0; } ===================================== This happens because addr.sa_data copied from the userspace is not zero-terminated, and copying it with strlcpy() in packet_bind_spkt() results in calling strlen() on the kernel copy of that non-terminated buffer. Signed-off-by: Alexander Potapenko Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 3475371514112e50da257ed505a5012abfc55094 Author: Paul Hüber Date: Sun Feb 26 17:58:19 2017 +0100 l2tp: avoid use-after-free caused by l2tp_ip_backlog_recv commit 51fb60eb162ab84c5edf2ae9c63cf0b878e5547e upstream. l2tp_ip_backlog_recv may not return -1 if the packet gets dropped. The return value is passed up to ip_local_deliver_finish, which treats negative values as an IP protocol number for resubmission. Signed-off-by: Paul Hüber Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 57f208211a18c6f58ed3f69cd8b7edee77cac9a7 Author: Luis de Bethencourt Date: Mon Nov 30 14:32:17 2015 +0000 mvsas: fix misleading indentation commit 7789cd39274c51bf475411fe22a8ee7255082809 upstream. Fix a smatch warning: drivers/scsi/mvsas/mv_sas.c:740 mvs_task_prep() warn: curly braces intended? The code is correct, the indention is misleading. When the device is not ready we want to return SAS_PHY_DOWN. But current indentation makes it look like we only do so in the else branch of if (mvi_dev). Signed-off-by: Luis de Bethencourt Reviewed-by: Johannes Thumshirn Signed-off-by: Martin K. Petersen Signed-off-by: Willy Tarreau commit ab4de16c2dd1ecf1329dd40a9807d43240dbba87 Author: Arnd Bergmann Date: Mon Jan 16 14:20:54 2017 +0100 cpmac: remove hopeless #warning commit d43e6fb4ac4abfe4ef7c102833ed02330ad701e0 upstream. The #warning was present 10 years ago when the driver first got merged. As the platform is rather obsolete by now, it seems very unlikely that the warning will cause anyone to fix the code properly. kernelci.org reports the warning for every build in the meantime, so I think it's better to just turn it into a code comment to reduce noise. Signed-off-by: Arnd Bergmann Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit c4240148b4c6881ab6760af923910271eb17651b Author: Arnd Bergmann Date: Fri Feb 3 10:49:17 2017 +0100 mtd: pmcmsp: use kstrndup instead of kmalloc+strncpy commit 906b268477bc03daaa04f739844c120fe4dbc991 upstream. kernelci.org reports a warning for this driver, as it copies a local variable into a 'const char *' string: drivers/mtd/maps/pmcmsp-flash.c:149:30: warning: passing argument 1 of 'strncpy' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers] Using kstrndup() simplifies the code and avoids the warning. Signed-off-by: Arnd Bergmann Acked-by: Marek Vasut Signed-off-by: Brian Norris Signed-off-by: Willy Tarreau commit 4f9f8485123d8bf492a66cddd1e8d2c87e9dfeb2 Author: Arnd Bergmann Date: Fri Feb 3 23:33:23 2017 +0100 crypto: improve gcc optimization flags for serpent and wp512 commit 7d6e9105026788c497f0ab32fa16c82f4ab5ff61 upstream. An ancient gcc bug (first reported in 2003) has apparently resurfaced on MIPS, where kernelci.org reports an overly large stack frame in the whirlpool hash algorithm: crypto/wp512.c:987:1: warning: the frame size of 1112 bytes is larger than 1024 bytes [-Wframe-larger-than=] With some testing in different configurations, I'm seeing large variations in stack frames size up to 1500 bytes for what should have around 300 bytes at most. I also checked the reference implementation, which is essentially the same code but also comes with some test and benchmarking infrastructure. It seems that recent compiler versions on at least arm, arm64 and powerpc have a partial fix for this problem, but enabling "-fsched-pressure", but even with that fix they suffer from the issue to a certain degree. Some testing on arm64 shows that the time needed to hash a given amount of data is roughly proportional to the stack frame size here, which makes sense given that the wp512 implementation is doing lots of loads for table lookups, and the problem with the overly large stack is a result of doing a lot more loads and stores for spilled registers (as seen from inspecting the object code). Disabling -fschedule-insns consistently fixes the problem for wp512, in my collection of cross-compilers, the results are consistently better or identical when comparing the stack sizes in this function, though some architectures (notable x86) have schedule-insns disabled by default. The four columns are: default: -O2 press: -O2 -fsched-pressure nopress: -O2 -fschedule-insns -fno-sched-pressure nosched: -O2 -no-schedule-insns (disables sched-pressure) default press nopress nosched alpha-linux-gcc-4.9.3 1136 848 1136 176 am33_2.0-linux-gcc-4.9.3 2100 2076 2100 2104 arm-linux-gnueabi-gcc-4.9.3 848 848 1048 352 cris-linux-gcc-4.9.3 272 272 272 272 frv-linux-gcc-4.9.3 1128 1000 1128 280 hppa64-linux-gcc-4.9.3 1128 336 1128 184 hppa-linux-gcc-4.9.3 644 308 644 276 i386-linux-gcc-4.9.3 352 352 352 352 m32r-linux-gcc-4.9.3 720 656 720 268 microblaze-linux-gcc-4.9.3 1108 604 1108 256 mips64-linux-gcc-4.9.3 1328 592 1328 208 mips-linux-gcc-4.9.3 1096 624 1096 240 powerpc64-linux-gcc-4.9.3 1088 432 1088 160 powerpc-linux-gcc-4.9.3 1080 584 1080 224 s390-linux-gcc-4.9.3 456 456 624 360 sh3-linux-gcc-4.9.3 292 292 292 292 sparc64-linux-gcc-4.9.3 992 240 992 208 sparc-linux-gcc-4.9.3 680 592 680 312 x86_64-linux-gcc-4.9.3 224 240 272 224 xtensa-linux-gcc-4.9.3 1152 704 1152 304 aarch64-linux-gcc-7.0.0 224 224 1104 208 arm-linux-gnueabi-gcc-7.0.1 824 824 1048 352 mips-linux-gcc-7.0.0 1120 648 1120 272 x86_64-linux-gcc-7.0.1 240 240 304 240 arm-linux-gnueabi-gcc-4.4.7 840 392 arm-linux-gnueabi-gcc-4.5.4 784 728 784 320 arm-linux-gnueabi-gcc-4.6.4 736 728 736 304 arm-linux-gnueabi-gcc-4.7.4 944 784 944 352 arm-linux-gnueabi-gcc-4.8.5 464 464 760 352 arm-linux-gnueabi-gcc-4.9.3 848 848 1048 352 arm-linux-gnueabi-gcc-5.3.1 824 824 1064 336 arm-linux-gnueabi-gcc-6.1.1 808 808 1056 344 arm-linux-gnueabi-gcc-7.0.1 824 824 1048 352 Trying the same test for serpent-generic, the picture is a bit different, and while -fno-schedule-insns is generally better here than the default, -fsched-pressure wins overall, so I picked that instead. default press nopress nosched alpha-linux-gcc-4.9.3 1392 864 1392 960 am33_2.0-linux-gcc-4.9.3 536 524 536 528 arm-linux-gnueabi-gcc-4.9.3 552 552 776 536 cris-linux-gcc-4.9.3 528 528 528 528 frv-linux-gcc-4.9.3 536 400 536 504 hppa64-linux-gcc-4.9.3 524 208 524 480 hppa-linux-gcc-4.9.3 768 472 768 508 i386-linux-gcc-4.9.3 564 564 564 564 m32r-linux-gcc-4.9.3 712 576 712 532 microblaze-linux-gcc-4.9.3 724 392 724 512 mips64-linux-gcc-4.9.3 720 384 720 496 mips-linux-gcc-4.9.3 728 384 728 496 powerpc64-linux-gcc-4.9.3 704 304 704 480 powerpc-linux-gcc-4.9.3 704 296 704 480 s390-linux-gcc-4.9.3 560 560 592 536 sh3-linux-gcc-4.9.3 540 540 540 540 sparc64-linux-gcc-4.9.3 544 352 544 496 sparc-linux-gcc-4.9.3 544 344 544 496 x86_64-linux-gcc-4.9.3 528 536 576 528 xtensa-linux-gcc-4.9.3 752 544 752 544 aarch64-linux-gcc-7.0.0 432 432 656 480 arm-linux-gnueabi-gcc-7.0.1 616 616 808 536 mips-linux-gcc-7.0.0 720 464 720 488 x86_64-linux-gcc-7.0.1 536 528 600 536 arm-linux-gnueabi-gcc-4.4.7 592 440 arm-linux-gnueabi-gcc-4.5.4 776 448 776 544 arm-linux-gnueabi-gcc-4.6.4 776 448 776 544 arm-linux-gnueabi-gcc-4.7.4 768 448 768 544 arm-linux-gnueabi-gcc-4.8.5 488 488 776 544 arm-linux-gnueabi-gcc-4.9.3 552 552 776 536 arm-linux-gnueabi-gcc-5.3.1 552 552 776 536 arm-linux-gnueabi-gcc-6.1.1 560 560 776 536 arm-linux-gnueabi-gcc-7.0.1 616 616 808 536 I did not do any runtime tests with serpent, so it is possible that stack frame size does not directly correlate with runtime performance here and it actually makes things worse, but it's more likely to help here, and the reduced stack frame size is probably enough reason to apply the patch, especially given that the crypto code is often used in deep call chains. Link: https://kernelci.org/build/id/58797d7559b5149efdf6c3a9/logs/ Link: http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11488 Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149 Cc: Ralf Baechle Signed-off-by: Arnd Bergmann Signed-off-by: Herbert Xu Signed-off-by: Willy Tarreau commit 2ebbe4f6313f16f619369785d3d849318531ae10 Author: Mathias Nyman Date: Fri Apr 8 16:25:10 2016 +0300 xhci: fix 10 second timeout on removal of PCI hotpluggable xhci controllers commit 98d74f9ceaefc2b6c4a6440050163a83be0abede upstream. PCI hotpluggable xhci controllers such as some Alpine Ridge solutions will remove the xhci controller from the PCI bus when the last USB device is disconnected. Add a flag to indicate that the host is being removed to avoid queueing configure_endpoint commands for the dropped endpoints. For PCI hotplugged controllers this will prevent 5 second command timeouts For static xhci controllers the configure_endpoint command is not needed in the removal case as everything will be returned, freed, and the controller is reset. For now the flag is only set for PCI connected host controllers. Signed-off-by: Mathias Nyman Signed-off-by: Greg Kroah-Hartman Signed-off-by: Willy Tarreau commit 8d05f356cc2b32d16371037aa24dc6e0b8d31316 Author: K. Y. Srinivasan Date: Wed Feb 8 18:30:56 2017 -0700 drivers: hv: Turn off write permission on the hypercall page commit 372b1e91343e657a7cc5e2e2bcecd5140ac28119 upstream. The hypercall page only needs to be executable but currently it is setup to be writable as well. Fix the issue. Signed-off-by: K. Y. Srinivasan Acked-by: Kees Cook Reported-by: Stephen Hemminger Tested-by: Stephen Hemminger Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 496355ec8c2f26fa65a5650ce4a0d89a3fc2bf46 Author: OGAWA Hirofumi Date: Thu Mar 9 16:17:37 2017 -0800 fat: fix using uninitialized fields of fat_inode/fsinfo_inode commit c0d0e351285161a515396b7b1ee53ec9ffd97e3c upstream. Recently fallocate patch was merged and it uses MSDOS_I(inode)->mmu_private at fat_evict_inode(). However, fat_inode/fsinfo_inode that was introduced in past didn't initialize MSDOS_I(inode) properly. With those combinations, it became the cause of accessing random entry in FAT area. Link: http://lkml.kernel.org/r/87pohrj4i8.fsf@mail.parknet.co.jp Signed-off-by: OGAWA Hirofumi Reported-by: Moreno Bartalucci Tested-by: Moreno Bartalucci Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit 51f2df44a30f0d7c32d84d7dc570eda8829291b0 Author: Michel Dänzer Date: Wed Jan 25 17:21:31 2017 +0900 drm/ttm: Make sure BOs being swapped out are cacheable commit 239ac65fa5ffab71adf66e642750f940e7241d99 upstream. The current caching state may not be tt_cached, even though the placement contains TTM_PL_FLAG_CACHED, because placement can contain multiple caching flags. Trying to swap out such a BO would trip up the BUG_ON(ttm->caching_state != tt_cached); in ttm_tt_swapout. Signed-off-by: Michel Dänzer Reviewed-by: Thomas Hellstrom Reviewed-by: Christian König . Reviewed-by: Sinclair Yeh Signed-off-by: Christian König Signed-off-by: Willy Tarreau commit 6ac27411ce2cc9cb8e48df29d92992741414e1e4 Author: Y.C. Chen Date: Wed Feb 22 15:10:50 2017 +1100 drm/ast: Fix test for VGA enabled commit 905f21a49d388de3e99438235f3301cabf0c0ef4 upstream. The test to see if VGA was already enabled is doing an unnecessary second test from a register that may or may not have been initialized to a valid value. Remove it. Signed-off-by: Y.C. Chen Signed-off-by: Benjamin Herrenschmidt Acked-by: Joel Stanley Tested-by: Y.C. Chen Signed-off-by: Dave Airlie Signed-off-by: Willy Tarreau commit 6958f50ab442e17848b8949e8bc468d09b823c5a Author: Matt Chen Date: Sun Jan 22 02:16:58 2017 +0800 mac80211: flush delayed work when entering suspend commit a9e9200d8661c1a0be8c39f93deb383dc940de35 upstream. The issue was found when entering suspend and resume. It triggers a warning in: mac80211/key.c: ieee80211_enable_keys() ... WARN_ON_ONCE(sdata->crypto_tx_tailroom_needed_cnt || sdata->crypto_tx_tailroom_pending_dec); ... It points out sdata->crypto_tx_tailroom_pending_dec isn't cleaned up successfully in a delayed_work during suspend. Add a flush_delayed_work to fix it. Signed-off-by: Matt Chen Signed-off-by: Johannes Berg Signed-off-by: Willy Tarreau commit 0c51e5d46aeba36a10f0b7a71901bde4ff4bc76e Author: Max Filippov Date: Tue Jan 3 09:37:34 2017 -0800 xtensa: move parse_tag_fdt out of #ifdef CONFIG_BLK_DEV_INITRD commit 4ab18701c66552944188dbcd0ce0012729baab84 upstream. FDT tag parsing is not related to whether BLK_DEV_INITRD is configured or not, move it out of the corresponding #ifdef/#endif block. This fixes passing external FDT to the kernel configured w/o BLK_DEV_INITRD support. Signed-off-by: Max Filippov Signed-off-by: Willy Tarreau commit 71116d6a8d98c68f7561af3c641dd0764b69d73a Author: Martin Schwidefsky Date: Fri Feb 24 07:43:51 2017 +0100 s390: TASK_SIZE for kernel threads commit fb94a687d96c570d46332a4a890f1dcb7310e643 upstream. Return a sensible value if TASK_SIZE if called from a kernel thread. This gets us around an issue with copy_mount_options that does a magic size calculation "TASK_SIZE - (unsigned long)data" while in a kernel thread and data pointing to kernel space. Signed-off-by: Martin Schwidefsky Signed-off-by: Willy Tarreau commit f7bbcabc5e963cc49e3e1b4e82f80117487868e7 Author: Martin Schwidefsky Date: Fri Jul 26 15:04:03 2013 +0200 KVM: s390: fix task size check The gmap_map_segment function uses PGDIR_SIZE in the check for the maximum address in the tasks address space. This incorrectly limits the amount of memory usable for a kvm guest to 4TB. The correct limit is (1UL << 53). As the TASK_SIZE has different values (4TB vs 8PB) dependent on the existance of the fourth page table level, create a new define 'TASK_MAX_SIZE' for (1UL << 53). Signed-off-by: Martin Schwidefsky Signed-off-by: Christian Borntraeger Signed-off-by: Paolo Bonzini Signed-off-by: Willy Tarreau commit 9f3a56a47922f7652f6051a99747a8b445fd3600 Author: Thomas Huth Date: Wed May 18 21:01:20 2016 +0200 KVM: PPC: Book3S PR: Fix illegal opcode emulation commit 708e75a3ee750dce1072134e630d66c4e6eaf63c upstream. If kvmppc_handle_exit_pr() calls kvmppc_emulate_instruction() to emulate one instruction (in the BOOK3S_INTERRUPT_H_EMUL_ASSIST case), it calls kvmppc_core_queue_program() afterwards if kvmppc_emulate_instruction() returned EMULATE_FAIL, so the guest gets an program interrupt for the illegal opcode. However, the kvmppc_emulate_instruction() also tried to inject a program exception for this already, so the program interrupt gets injected twice and the return address in srr0 gets destroyed. All other callers of kvmppc_emulate_instruction() are also injecting a program interrupt, and since the callers have the right knowledge about the srr1 flags that should be used, it is the function kvmppc_emulate_instruction() that should _not_ inject program interrupts, so remove the kvmppc_core_queue_program() here. This fixes the issue discovered by Laurent Vivier with kvm-unit-tests where the logs are filled with these messages when the test tries to execute an illegal instruction: Couldn't emulate instruction 0x00000000 (op 0 xop 0) kvmppc_handle_exit_pr: emulation at 700 failed (00000000) Signed-off-by: Thomas Huth Reviewed-by: Alexander Graf Tested-by: Laurent Vivier Signed-off-by: Paul Mackerras Cc: Sumit Semwal Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit d1e71d5f34ccb5f172c26ee5bdfd19c414db387c Author: Chao Peng Date: Tue Feb 21 03:50:01 2017 -0500 KVM: VMX: use correct vmcs_read/write for guest segment selector/base commit 96794e4ed4d758272c486e1529e431efb7045265 upstream. Guest segment selector is 16 bit field and guest segment base is natural width field. Fix two incorrect invocations accordingly. Without this patch, build fails when aggressive inlining is used with ICC. [js] no vmx_dump_sel in 3.12 Signed-off-by: Chao Peng Signed-off-by: Paolo Bonzini Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 96a14fea86b2a4a05b31e90ccaae2b5fe87ceb11 Author: Ravi Bangoria Date: Tue Nov 22 14:55:59 2016 +0530 powerpc/xmon: Fix data-breakpoint commit c21a493a2b44650707d06741601894329486f2ad upstream. Currently xmon data-breakpoint feature is broken. Whenever there is a watchpoint match occurs, hw_breakpoint_handler will be called by do_break via notifier chains mechanism. If watchpoint is registered by xmon, hw_breakpoint_handler won't find any associated perf_event and returns immediately with NOTIFY_STOP. Similarly, do_break also returns without notifying to xmon. Solve this by returning NOTIFY_DONE when hw_breakpoint_handler does not find any perf_event associated with matched watchpoint, rather than NOTIFY_STOP, which tells the core code to continue calling the other breakpoint handlers including the xmon one. Signed-off-by: Ravi Bangoria Signed-off-by: Michael Ellerman Signed-off-by: Willy Tarreau commit 4289b9c0f9a6fca7eb96db454e0bad3abc0e6956 Author: Rafał Miłecki Date: Sat Jan 28 14:31:22 2017 +0100 bcma: use (get|put)_device when probing/removing device driver commit a971df0b9d04674e325346c17de9a895425ca5e1 upstream. This allows tracking device state and e.g. makes devm work as expected. Signed-off-by: Rafał Miłecki Signed-off-by: Kalle Valo Signed-off-by: Willy Tarreau commit 249eb349b072475604f6e4508890b75767b19848 Author: Weston Andros Adamson Date: Thu Feb 23 14:54:21 2017 -0500 NFSv4: fix getacl ERANGE for some ACL buffer sizes commit ed92d8c137b7794c2c2aa14479298b9885967607 upstream. We're not taking into account that the space needed for the (variable length) attr bitmap, with the result that we'd sometimes get a spurious ERANGE when the ACL data got close to the end of a page. Just add in an extra page to make sure. Signed-off-by: Weston Andros Adamson Signed-off-by: J. Bruce Fields Signed-off-by: Anna Schumaker Signed-off-by: Willy Tarreau commit 11f43b27cb874a0c791044d64a84096688ebc5f1 Author: Steve Wise Date: Tue Feb 21 11:21:57 2017 -0800 rdma_cm: fail iwarp accepts w/o connection params commit f2625f7db4dd0bbd16a9c7d2950e7621f9aa57ad upstream. cma_accept_iw() needs to return an error if conn_params is NULL. Since this is coming from user space, we can crash. Reported-by: Shaobo He Acked-by: Sean Hefty Signed-off-by: Steve Wise Signed-off-by: Doug Ledford Signed-off-by: Willy Tarreau commit f84c0647638e1a5b0950f79c8d43665deddf8ae8 Author: Felix Fietkau Date: Wed Jan 11 16:32:13 2017 +0200 ath5k: drop bogus warning on drv_set_key with unsupported cipher commit a70e1d6fd6b5e1a81fa6171600942bee34f5128f upstream. Simply return -EOPNOTSUPP instead. Signed-off-by: Felix Fietkau Signed-off-by: Kalle Valo Signed-off-by: Willy Tarreau commit 109b4210244d8e93c37735a2a0f5988bb0ccc6c6 Author: Mathias Svensson Date: Fri Jan 6 13:32:39 2017 -0800 samples/seccomp: fix 64-bit comparison macros commit 916cafdc95843fb9af5fd5f83ca499d75473d107 upstream. There were some bugs in the JNE64 and JLT64 comparision macros. This fixes them, improves comments, and cleans up the file while we are at it. Reported-by: Stephen Röttger Signed-off-by: Mathias Svensson Signed-off-by: Kees Cook Signed-off-by: James Morris Signed-off-by: Willy Tarreau commit 1ba4fc4b0772b130c3acc9cd377f3039b9b24f8f Author: Hannes Reinecke Date: Tue Apr 26 08:06:58 2016 +0200 sd: get disk reference in sd_check_events() commit eb72d0bb84eee5d0dc3044fd17b75e7101dabb57 upstream. sd_check_events() is called asynchronously, and might race with device removal. So always take a disk reference when processing the event to avoid the device being removed while the event is processed. Signed-off-by: Hannes Reinecke Reviewed-by: Ewan D. Milne Reviewed-by: Bart Van Assche Signed-off-by: Martin K. Petersen Cc: Jinpu Wang Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 20bfb0a079b45828a5937d1e3133234f06af9223 Author: Davidlohr Bueso Date: Mon Feb 27 14:28:24 2017 -0800 ipc/shm: Fix shmat mmap nil-page protection commit 95e91b831f87ac8e1f8ed50c14d709089b4e01b8 upstream. The issue is described here, with a nice testcase: https://bugzilla.kernel.org/show_bug.cgi?id=192931 The problem is that shmat() calls do_mmap_pgoff() with MAP_FIXED, and the address rounded down to 0. For the regular mmap case, the protection mentioned above is that the kernel gets to generate the address -- arch_get_unmapped_area() will always check for MAP_FIXED and return that address. So by the time we do security_mmap_addr(0) things get funky for shmat(). The testcase itself shows that while a regular user crashes, root will not have a problem attaching a nil-page. There are two possible fixes to this. The first, and which this patch does, is to simply allow root to crash as well -- this is also regular mmap behavior, ie when hacking up the testcase and adding mmap(... |MAP_FIXED). While this approach is the safer option, the second alternative is to ignore SHM_RND if the rounded address is 0, thus only having MAP_SHARED flags. This makes the behavior of shmat() identical to the mmap() case. The downside of this is obviously user visible, but does make sense in that it maintains semantics after the round-down wrt 0 address and mmap. Passes shm related ltp tests. Link: http://lkml.kernel.org/r/1486050195-18629-1-git-send-email-dave@stgolabs.net Signed-off-by: Davidlohr Bueso Reported-by: Gareth Evans Cc: Manfred Spraul Cc: Michael Kerrisk Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit 281c8f7385d876de909e6cd97bf8100e5246c95f Author: Vinayak Menon Date: Fri Feb 24 14:59:39 2017 -0800 mm: vmpressure: fix sending wrong events on underflow commit e1587a4945408faa58d0485002c110eb2454740c upstream. At the end of a window period, if the reclaimed pages is greater than scanned, an unsigned underflow can result in a huge pressure value and thus a critical event. Reclaimed pages is found to go higher than scanned because of the addition of reclaimed slab pages to reclaimed in shrink_node without a corresponding increment to scanned pages. Minchan Kim mentioned that this can also happen in the case of a THP page where the scanned is 1 and reclaimed could be 512. Link: http://lkml.kernel.org/r/1486641577-11685-1-git-send-email-vinmenon@codeaurora.org Signed-off-by: Vinayak Menon Acked-by: Minchan Kim Acked-by: Michal Hocko Cc: Johannes Weiner Cc: Mel Gorman Cc: Vlastimil Babka Cc: Rik van Riel Cc: Vladimir Davydov Cc: Anton Vorontsov Cc: Shiraz Hashim Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit 95589a7820c1fbc58882c13c71c182ed4dd82ce0 Author: Ralf Baechle Date: Thu Jan 26 02:16:47 2017 +0100 MIPS: Fix special case in 64 bit IP checksumming. commit 66fd848cadaa6be974a8c780fbeb328f0af4d3bd upstream. For certain arguments such as saddr = 0xc0a8fd60, daddr = 0xc0a8fda1, len = 80, proto = 17, sum = 0x7eae049d there will be a carry when folding the intermediate 64 bit checksum to 32 bit but the code doesn't add the carry back to the one's complement sum, thus an incorrect result will be generated. Reported-by: Mark Zhang Signed-off-by: Ralf Baechle Reviewed-by: James Hogan Signed-off-by: James Hogan Signed-off-by: Willy Tarreau commit c0c294a36ea7b8f5ade83caf8a43692086318afd Author: Dan Carpenter Date: Tue Feb 18 15:20:51 2014 +0300 af_packet: remove a stray tab in packet_set_ring() commit d7cf0c34af067555737193b6c1aa7abaa677f29c upstream. At first glance it looks like there is a missing curly brace but actually the code works the same either way. I have adjusted the indenting but left the code the same. Signed-off-by: Dan Carpenter Acked-by: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 5751935fcfe0adcedbc88f801ae777f76ec6efdf Author: Michael Schenk Date: Thu Jan 26 11:25:04 2017 -0600 rtlwifi: rtl_usb: Fix for URB leaking when doing ifconfig up/down commit 575ddce0507789bf9830d089557d2199d2f91865 upstream. In the function rtl_usb_start we pre-allocate a certain number of urbs for RX path but they will not be freed when calling rtl_usb_stop. This results in leaking urbs when doing ifconfig up and down. Eventually, the system has no available urbs. Signed-off-by: Michael Schenk Signed-off-by: Larry Finger Signed-off-by: Kalle Valo Signed-off-by: Willy Tarreau commit 298feed49af39a701a50e94d21ea674ae6daa680 Author: Javier Martinez Canillas Date: Mon Jan 2 11:57:20 2017 -0300 tty: serial: msm: Fix module autoload commit abe81f3b8ed2996e1712d26d38ff6b73f582c616 upstream. If the driver is built as a module, autoload won't work because the module alias information is not filled. So user-space can't match the registered device with the corresponding module. Export the module alias information using the MODULE_DEVICE_TABLE() macro. Before this patch: $ modinfo drivers/tty/serial/msm_serial.ko | grep alias $ After this patch: $ modinfo drivers/tty/serial/msm_serial.ko | grep alias alias: of:N*T*Cqcom,msm-uartdmC* alias: of:N*T*Cqcom,msm-uartdm alias: of:N*T*Cqcom,msm-uartC* alias: of:N*T*Cqcom,msm-uart Signed-off-by: Javier Martinez Canillas Acked-by: Bjorn Andersson Cc: stable Signed-off-by: Greg Kroah-Hartman Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 5988be39d31cb1ab9cc21d86254f2082ff8a9d25 Author: David S. Miller Date: Fri Feb 17 16:19:39 2017 -0500 irda: Fix lockdep annotations in hashbin_delete(). commit 4c03b862b12f980456f9de92db6d508a4999b788 upstream. A nested lock depth was added to the hasbin_delete() code but it doesn't actually work some well and results in tons of lockdep splats. Fix the code instead to properly drop the lock around the operation and just keep peeking the head of the hashbin queue. Reported-by: Dmitry Vyukov Tested-by: Dmitry Vyukov Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 296a1ab22e2a153f428d8b3c7c47fc4076c05760 Author: Colin Ian King Date: Mon May 16 17:22:54 2016 +0100 rtc: interface: ignore expired timers when enqueuing new timers commit 2b2f5ff00f63847d95adad6289bd8b05f5983dd5 upstream. This patch fixes a RTC wakealarm issue, namely, the event fires during hibernate and is not cleared from the list, causing hwclock to block. The current enqueuing does not trigger an alarm if any expired timers already exist on the timerqueue. This can occur when a RTC wake alarm is used to wake a machine out of hibernate and the resumed state has old expired timers that have not been removed from the timer queue. This fix skips over any expired timers and triggers an alarm if there are no pending timers on the timerqueue. Note that the skipped expired timer will get reaped later on, so there is no need to clean it up immediately. The issue can be reproduced by putting a machine into hibernate and waking it with the RTC wakealarm. Running the example RTC test program from tools/testing/selftests/timers/rtctest.c after the hibernate will block indefinitely. With the fix, it no longer blocks after the hibernate resume. BugLink: http://bugs.launchpad.net/bugs/1333569 Signed-off-by: Colin Ian King Signed-off-by: Alexandre Belloni Cc: Sumit Semwal Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit bbd20686c604f34caf8e77f7cde01a17042e92e4 Author: Yang Yang Date: Fri Dec 30 16:17:55 2016 +0800 futex: Move futex_init() to core_initcall commit 25f71d1c3e98ef0e52371746220d66458eac75bc upstream. The UEVENT user mode helper is enabled before the initcalls are executed and is available when the root filesystem has been mounted. The user mode helper is triggered by device init calls and the executable might use the futex syscall. futex_init() is marked __initcall which maps to device_initcall, but there is no guarantee that futex_init() is invoked _before_ the first device init call which triggers the UEVENT user mode helper. If the user mode helper uses the futex syscall before futex_init() then the syscall crashes with a NULL pointer dereference because the futex subsystem has not been initialized yet. Move futex_init() to core_initcall so futexes are initialized before the root filesystem is mounted and the usermode helper becomes available. [ tglx: Rewrote changelog ] Signed-off-by: Yang Yang Cc: jiang.biao2@zte.com.cn Cc: jiang.zhengxiong@zte.com.cn Cc: zhong.weidong@zte.com.cn Cc: deng.huali@zte.com.cn Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/1483085875-6130-1-git-send-email-yang.yang29@zte.com.cn Signed-off-by: Thomas Gleixner Signed-off-by: Willy Tarreau commit 7160035d0356504cc70ae030ad60648b61faafd0 Author: Mauro Carvalho Chehab Date: Tue Feb 14 17:47:57 2017 -0200 siano: make it work again with CONFIG_VMAP_STACK commit f9c85ee67164b37f9296eab3b754e543e4e96a1c upstream. Reported as a Kaffeine bug: https://bugs.kde.org/show_bug.cgi?id=375811 The USB control messages require DMA to work. We cannot pass a stack-allocated buffer, as it is not warranted that the stack would be into a DMA enabled area. On Kernel 4.9, the default is to not accept DMA on stack anymore on x86 architecture. On other architectures, this has been a requirement since Kernel 2.2. So, after this patch, this driver should likely work fine on all archs. Tested with USB ID 2040:5510: Hauppauge Windham Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Willy Tarreau commit fe75d1a6b0d2658a154e460cb00d390900c64e2a Author: Miklos Szeredi Date: Thu Feb 16 17:49:02 2017 +0100 vfs: fix uninitialized flags in splice_to_pipe() commit 5a81e6a171cdbd1fa8bc1fdd80c23d3d71816fac upstream. Flags (PIPE_BUF_FLAG_PACKET, PIPE_BUF_FLAG_GIFT) could remain on the unused part of the pipe ring buffer. Previously splice_to_pipe() left the flags value alone, which could result in incorrect behavior. Uninitialized flags appears to have been there from the introduction of the splice syscall. Signed-off-by: Miklos Szeredi Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit c6cc07d404144f75945b6f96ede9510c1468f4b8 Author: Willem de Bruijn Date: Tue Feb 7 15:57:21 2017 -0500 packet: round up linear to header len commit 57031eb794906eea4e1c7b31dc1e2429c0af0c66 upstream. Link layer protocols may unconditionally pull headers, as Ethernet does in eth_type_trans. Ensure that the entire link layer header always lies in the skb linear segment. tpacket_snd has such a check. Extend this to packet_snd. Variable length link layer headers complicate the computation somewhat. Here skb->len may be smaller than dev->hard_header_len. Round up the linear length to be at least as long as the smallest of the two. [js] no virtio helpers in 3.12 Reported-by: Dmitry Vyukov Signed-off-by: Willem de Bruijn Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit b0899f80092e17d83979cd6c2825a5d4ee2cdcf7 Author: Willem de Bruijn Date: Fri Feb 3 18:20:49 2017 -0500 macvtap: read vnet_hdr_size once commit 837585a5375c38d40361cfe64e6fd11e1addb936 upstream. When IFF_VNET_HDR is enabled, a virtio_net header must precede data. Data length is verified to be greater than or equal to expected header length tun->vnet_hdr_sz before copying. Macvtap functions read the value once, but unless READ_ONCE is used, the compiler may ignore this and read multiple times. Enforce a single read and locally cached value to avoid updates between test and use. Signed-off-by: Willem de Bruijn Suggested-by: Eric Dumazet Acked-by: Eric Dumazet Signed-off-by: David S. Miller [wt: s/READ_ONCE/ACCESS_ONCE] Signed-off-by: Willy Tarreau commit 5c235ec479dcdec8d7ba8f6126b6a92df3f56c32 Author: Eric Dumazet Date: Wed Feb 1 08:33:53 2017 -0800 tcp: fix 0 divide in __tcp_select_window() commit 06425c308b92eaf60767bc71d359f4cbc7a561f8 upstream. syszkaller fuzzer was able to trigger a divide by zero, when TCP window scaling is not enabled. SO_RCVBUF can be used not only to increase sk_rcvbuf, also to decrease it below current receive buffers utilization. If mss is negative or 0, just return a zero TCP window. Signed-off-by: Eric Dumazet Reported-by: Dmitry Vyukov Acked-by: Neal Cardwell Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 4718919e008905508b5cfa5c3531f6b3cc05cdf8 Author: Rabin Vincent Date: Mon Apr 4 15:42:02 2016 +0200 sched/debug: Don't dump sched debug info in SysRq-W commit fb90a6e93c0684ab2629a42462400603aa829b9c upstream. sysrq_sched_debug_show() can dump a lot of information. Don't print out all that if we're just trying to get a list of blocked tasks (SysRq-W). The information is still accessible with SysRq-T. Signed-off-by: Rabin Vincent Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Steven Rostedt Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/1459777322-30902-1-git-send-email-rabin.vincent@axis.com Signed-off-by: Ingo Molnar Cc: Nikolay Borisov Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit e2ec1495b015a5c13ce20c0a6137c49dd825d929 Author: Vineet Gupta Date: Tue Feb 7 09:44:58 2017 -0800 ARC: [arcompact] brown paper bag bug in unaligned access delay slot fixup commit a524c218bc94c705886a0e0fedeee45d1931da32 upstream. Reported-by: Jo-Philipp Wich Fixes: 9aed02feae57bf7 ("ARC: [arcompact] handle unaligned access delay slot") Cc: linux-kernel@vger.kernel.org Cc: linux-snps-arc@lists.infradead.org Signed-off-by: Vineet Gupta Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit 6ee1806c1f515818d4ff31ffff5b3ea376e954bc Author: Michal Hocko Date: Fri Feb 3 13:13:29 2017 -0800 mm, fs: check for fatal signals in do_generic_file_read() commit 5abf186a30a89d5b9c18a6bf93a2c192c9fd52f6 upstream. do_generic_file_read() can be told to perform a large request from userspace. If the system is under OOM and the reading task is the OOM victim then it has an access to memory reserves and finishing the full request can lead to the full memory depletion which is dangerous. Make sure we rather go with a short read and allow the killed task to terminate. Link: http://lkml.kernel.org/r/20170201092706.9966-3-mhocko@kernel.org Signed-off-by: Michal Hocko Reviewed-by: Christoph Hellwig Cc: Tetsuo Handa Cc: Al Viro Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit 18f626429a5a6a1b9a9742e944bdeb3536f030f5 Author: Toshi Kani Date: Fri Feb 3 13:13:20 2017 -0800 mm/memory_hotplug.c: check start_pfn in test_pages_in_a_zone() commit deb88a2a19e85842d79ba96b05031739ec327ff4 upstream. Patch series "fix a kernel oops when reading sysfs valid_zones", v2. A sysfs memory file is created for each 2GiB memory block on x86-64 when the system has 64GiB or more memory. [1] When the start address of a memory block is not backed by struct page, i.e. a memory range is not aligned by 2GiB, reading its 'valid_zones' attribute file leads to a kernel oops. This issue was observed on multiple x86-64 systems with more than 64GiB of memory. This patch-set fixes this issue. Patch 1 first fixes an issue in test_pages_in_a_zone(), which does not test the start section. Patch 2 then fixes the kernel oops by extending test_pages_in_a_zone() to return valid [start, end). Note for stable kernels: The memory block size change was made by commit bdee237c0343 ("x86: mm: Use 2GB memory block size on large-memory x86-64 systems"), which was accepted to 3.9. However, this patch-set depends on (and fixes) the change to test_pages_in_a_zone() made by commit 5f0f2887f4de ("mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone()"), which was accepted to 4.4. So, I recommend that we backport it up to 4.4. [1] 'Commit bdee237c0343 ("x86: mm: Use 2GB memory block size on large-memory x86-64 systems")' This patch (of 2): test_pages_in_a_zone() does not check 'start_pfn' when it is aligned by section since 'sec_end_pfn' is set equal to 'pfn'. Since this function is called for testing the range of a sysfs memory file, 'start_pfn' is always aligned by section. Fix it by properly setting 'sec_end_pfn' to the next section pfn. Also make sure that this function returns 1 only when the range belongs to a zone. Link: http://lkml.kernel.org/r/20170127222149.30893-2-toshi.kani@hpe.com Signed-off-by: Toshi Kani Cc: Andrew Banman Cc: Reza Arbab Cc: Greg KH Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit bb3d39c70fe12ab44c4dff3785c3ff322adde851 Author: Arvind Yadav Date: Mon Dec 12 23:13:27 2016 +0530 ata: sata_mv:- Handle return value of devm_ioremap. commit 064c3db9c564cc5be514ac21fb4aa26cc33db746 upstream. Here, If devm_ioremap will fail. It will return NULL. Then hpriv->base = NULL - 0x20000; Kernel can run into a NULL-pointer dereference. This error check will avoid NULL pointer dereference. Signed-off-by: Arvind Yadav Signed-off-by: Tejun Heo Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 055d0a8d61f484a2a95b3a02c3381c50333a690b Author: Salvatore Benedetto Date: Fri Jan 13 11:54:08 2017 +0000 crypto: api - Clear CRYPTO_ALG_DEAD bit before registering an alg commit d6040764adcb5cb6de1489422411d701c158bb69 upstream. Make sure CRYPTO_ALG_DEAD bit is cleared before proceeding with the algorithm registration. This fixes qat-dh registration when driver is restarted Signed-off-by: Salvatore Benedetto Signed-off-by: Herbert Xu Signed-off-by: Willy Tarreau commit bb47c5c64ff12025cc56190e7adb19fc44b8604c Author: Ilia Mirkin Date: Thu Jan 19 22:56:30 2017 -0500 drm/nouveau/nv1a,nv1f/disp: fix memory clock rate retrieval commit 24bf7ae359b8cca165bb30742d2b1c03a1eb23af upstream. Based on the xf86-video-nv code, NFORCE (NV1A) and NFORCE2 (NV1F) have a different way of retrieving clocks. See the nv_hw.c:nForceUpdateArbitrationSettings function in the original code for how these clocks were accessed. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54587 Signed-off-by: Ilia Mirkin Signed-off-by: Ben Skeggs Signed-off-by: Willy Tarreau commit ddee2228ae6cf9225b3ac78e687f40954470c141 Author: WANG Cong Date: Mon Jan 23 11:17:35 2017 -0800 af_unix: move unix_mknod() out of bindlock commit 0fb44559ffd67de8517098b81f675fa0210f13f0 upstream. Dmitry reported a deadlock scenario: unix_bind() path: u->bindlock ==> sb_writer do_splice() path: sb_writer ==> pipe->mutex ==> u->bindlock In the unix_bind() code path, unix_mknod() does not have to be done with u->bindlock held, since it is a pure fs operation, so we can just move unix_mknod() out. Reported-by: Dmitry Vyukov Tested-by: Dmitry Vyukov Cc: Rainer Weikusat Cc: Al Viro Signed-off-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit cba3e28f6cf5fced79dabb01f6075342af3f8a6d Author: Kefeng Wang Date: Thu Jan 19 16:26:21 2017 +0800 ipv6: addrconf: Avoid addrconf_disable_change() using RCU read-side lock commit 03e4deff4987f79c34112c5ba4eb195d4f9382b0 upstream. Just like commit 4acd4945cd1e ("ipv6: addrconf: Avoid calling netdevice notifiers with RCU read-side lock"), it is unnecessary to make addrconf_disable_change() use RCU iteration over the netdev list, since it already holds the RTNL lock, or we may meet Illegal context switch in RCU read-side critical section. Signed-off-by: Kefeng Wang Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 379fa3e80f2a79fcf9f5c7facc43620257491e68 Author: Chuck Lever Date: Sun Jan 22 14:04:29 2017 -0500 nfs: Don't increment lock sequence ID after NFS4ERR_MOVED commit 059aa734824165507c65fd30a55ff000afd14983 upstream. Xuan Qi reports that the Linux NFSv4 client failed to lock a file that was migrated. The steps he observed on the wire: 1. The client sent a LOCK request to the source server 2. The source server replied NFS4ERR_MOVED 3. The client switched to the destination server 4. The client sent the same LOCK request to the destination server with a bumped lock sequence ID 5. The destination server rejected the LOCK request with NFS4ERR_BAD_SEQID RFC 3530 section 8.1.5 provides a list of NFS errors which do not bump a lock sequence ID. However, RFC 3530 is now obsoleted by RFC 7530. In RFC 7530 section 9.1.7, this list has been updated by the addition of NFS4ERR_MOVED. Reported-by: Xuan Qi Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust Signed-off-by: Willy Tarreau commit 485f99073ea17248d3938859e7d87e59406dbb8a Author: Helge Deller Date: Sat Jan 28 11:52:02 2017 +0100 parisc: Don't use BITS_PER_LONG in userspace-exported swab.h header commit 2ad5d52d42810bed95100a3d912679d8864421ec upstream. In swab.h the "#if BITS_PER_LONG > 32" breaks compiling userspace programs if BITS_PER_LONG is #defined by userspace with the sizeof() compiler builtin. Solve this problem by using __BITS_PER_LONG instead. Since we now #include asm/bitsperlong.h avoid further potential userspace pollution by moving the #define of SHIFT_PER_LONG to bitops.h which is not exported to userspace. This patch unbreaks compiling qemu on hppa/parisc. Signed-off-by: Helge Deller Signed-off-by: Willy Tarreau commit b5882ec2fe3c937c863aab441e96b3f47f72f035 Author: Vineet Gupta Date: Fri Jan 27 10:45:27 2017 -0800 ARC: [arcompact] handle unaligned access delay slot corner case commit 9aed02feae57bf7a40cb04ea0e3017cb7a998db4 upstream. After emulating an unaligned access in delay slot of a branch, we pretend as the delay slot never happened - so return back to actual branch target (or next PC if branch was not taken). Curently we did this by handling STATUS32.DE, we also need to clear the BTA.T bit, which is disregarded when returning from original misaligned exception, but could cause weirdness if it took the interrupt return path (in case interrupt was acive too) One ARC700 customer ran into this when enabling unaligned access fixup for kernel mode accesses as well Signed-off-by: Vineet Gupta Signed-off-by: Willy Tarreau commit ddf2415ce73a4282359a1f40907cc03d2bb55cc1 Author: Arnd Bergmann Date: Fri Jan 27 13:32:14 2017 +0100 ISDN: eicon: silence misleading array-bounds warning commit 950eabbd6ddedc1b08350b9169a6a51b130ebaaf upstream. With some gcc versions, we get a warning about the eicon driver, and that currently shows up as the only remaining warning in one of the build bots: In file included from ../drivers/isdn/hardware/eicon/message.c:30:0: eicon/message.c: In function 'mixer_notify_update': eicon/platform.h:333:18: warning: array subscript is above array bounds [-Warray-bounds] The code is easily changed to open-code the unusual PUT_WORD() line causing this to avoid the warning. Link: http://arm-soc.lixom.net/buildlogs/stable-rc/v4.4.45/ Signed-off-by: Arnd Bergmann Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit db2b7cd8f8489e8c4a55d7556b61282c7b7cc98d Author: Eric Dumazet Date: Wed Jan 25 18:20:55 2017 -0800 sysctl: fix proc_doulongvec_ms_jiffies_minmax() commit ff9f8a7cf935468a94d9927c68b00daae701667e upstream. We perform the conversion between kernel jiffies and ms only when exporting kernel value to user space. We need to do the opposite operation when value is written by user. Only matters when HZ != 1000 Signed-off-by: Eric Dumazet Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit 3cc334c2acc9b4ff326da43fab18673803985edf Author: Dave Martin Date: Fri Jan 6 17:54:51 2017 +0000 tile/ptrace: Preserve previous registers for short regset write commit fd7c99142d77dc4a851879a66715abf12a3193fb upstream. Ensure that if userspace supplies insufficient data to PTRACE_SETREGSET to fill all the registers, the thread's old registers are preserved. Signed-off-by: Dave Martin Signed-off-by: Chris Metcalf Signed-off-by: Willy Tarreau commit baaee1b72396821153c6ffaedf04c3422420e0bb Author: Mintz, Yuval Date: Sun Dec 4 15:30:17 2016 +0200 bnx2x: Correct ringparam estimate when DOWN commit 65870fa77fd7f83d7be4ed924d47ed9e3831f434 upstream. Until interface is up [and assuming ringparams weren't explicitly configured] when queried for the size of its rings bnx2x would claim they're the maximal size by default. That is incorrect as by default the maximal number of buffers would be equally divided between the various rx rings. This prevents the user from actually setting the number of elements on each rx ring to be of maximal size prior to transitioning the interface into up state. To fix this, make a rough estimation about the number of buffers. It wouldn't always be accurate, but it would be much better than current estimation and would allow users to increase number of buffers during early initialization of the interface. Reported-by: Seymour, Shane Signed-off-by: Yuval Mintz Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 1d662dc0cc20dc54fd2b43291be66e51675dae8d Author: Gabriel Krisman Bertazi Date: Mon Nov 28 19:34:42 2016 -0200 serial: 8250_pci: Detach low-level driver during PCI error recovery commit f209fa03fc9d131b3108c2e4936181eabab87416 upstream. During a PCI error recovery, like the ones provoked by EEH in the ppc64 platform, all IO to the device must be blocked while the recovery is completed. Current 8250_pci implementation only suspends the port instead of detaching it, which doesn't prevent incoming accesses like TIOCMGET and TIOCMSET calls from reaching the device. Those end up racing with the EEH recovery, crashing it. Similar races were also observed when opening the device and when shutting it down during recovery. This patch implements a more robust IO blockage for the 8250_pci recovery by unregistering the port at the beginning of the procedure and re-adding it afterwards. Since the port is detached from the uart layer, we can be sure that no request will make through to the device during recovery. This is similar to the solution used by the JSM serial driver. I thank Peter Hurley for valuable input on this one over one year ago. Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Greg Kroah-Hartman Signed-off-by: Willy Tarreau commit 64df8d3b3fdd06a877ccac395f60dd54dd515897 Author: Al Viro Date: Thu Sep 11 18:55:50 2014 -0400 move the call of __d_drop(anon) into __d_materialise_unique(dentry, anon) commit 6f18493e541c690169c3b1479d47d95f624161cf upstream. and lock the right list there Signed-off-by: Al Viro Acked-by: NeilBrown Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit dda50d926e8a3f215857ab7091cd6b11d588881a Author: Calvin Owens Date: Fri Oct 30 16:57:00 2015 -0700 sg: Fix double-free when drives detach during SG_IO commit f3951a3709ff50990bf3e188c27d346792103432 upstream. In sg_common_write(), we free the block request and return -ENODEV if the device is detached in the middle of the SG_IO ioctl(). Unfortunately, sg_finish_rem_req() also tries to free srp->rq, so we end up freeing rq->cmd in the already free rq object, and then free the object itself out from under the current user. This ends up corrupting random memory via the list_head on the rq object. The most common crash trace I saw is this: ------------[ cut here ]------------ kernel BUG at block/blk-core.c:1420! Call Trace: [] blk_put_request+0x5b/0x80 [] sg_finish_rem_req+0x6b/0x120 [sg] [] sg_common_write.isra.14+0x459/0x5a0 [sg] [] ? selinux_file_alloc_security+0x48/0x70 [] sg_new_write.isra.17+0x195/0x2d0 [sg] [] sg_ioctl+0x644/0xdb0 [sg] [] do_vfs_ioctl+0x90/0x520 [] ? file_has_perm+0x97/0xb0 [] SyS_ioctl+0x91/0xb0 [] tracesys+0xdd/0xe2 RIP [] __blk_put_request+0x154/0x1a0 The solution is straightforward: just set srp->rq to NULL in the failure branch so that sg_finish_rem_req() doesn't attempt to re-free it. Additionally, since sg_rq_end_io() will never be called on the object when this happens, we need to free memory backing ->cmd if it isn't embedded in the object itself. KASAN was extremely helpful in finding the root cause of this bug. Signed-off-by: Calvin Owens Acked-by: Douglas Gilbert Signed-off-by: Martin K. Petersen Acked-by: Johannes Thumshirn Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 0570afff43287ff3ffff86731ced73748f74149f Author: Benjamin Poirier Date: Mon Nov 7 17:57:56 2016 +0800 bna: Add synchronization for tx ring. commit d667f78514c656a6a8bf0b3d6134a7fe5cd4d317 upstream. We received two reports of BUG_ON in bnad_txcmpl_process() where hw_consumer_index appeared to be ahead of producer_index. Out of order write/read of these variables could explain these reports. bnad_start_xmit(), as a producer of tx descriptors, has a few memory barriers sprinkled around writes to producer_index and the device's doorbell but they're not paired with anything in bnad_txcmpl_process(), a consumer. Since we are synchronizing with a device, we must use mandatory barriers, not smp_*. Also, I didn't see the purpose of the last smp_mb() in bnad_start_xmit(). Signed-off-by: Benjamin Poirier Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 37ccd2be0e783b1fb8860ccd8b168109c22655c6 Author: Vlad Tsyrklevich Date: Wed Oct 12 18:51:24 2016 +0200 vfio/pci: Fix integer overflows, bitmask check commit 05692d7005a364add85c6e25a6c4447ce08f913a upstream. The VFIO_DEVICE_SET_IRQS ioctl did not sufficiently sanitize user-supplied integers, potentially allowing memory corruption. This patch adds appropriate integer overflow checks, checks the range bounds for VFIO_IRQ_SET_DATA_NONE, and also verifies that only single element in the VFIO_IRQ_SET_DATA_TYPE_MASK bitmask is set. VFIO_IRQ_SET_ACTION_TYPE_MASK is already correctly checked later in vfio_pci_set_irqs_ioctl(). Furthermore, a kzalloc is changed to a kcalloc because the use of a kzalloc with an integer multiplication allowed an integer overflow condition to be reached without this patch. kcalloc checks for overflow and should prevent a similar occurrence. Signed-off-by: Vlad Tsyrklevich Signed-off-by: Alex Williamson Signed-off-by: Willy Tarreau commit f466ca6552401524c229af56d1d120a96fa9184a Author: Heinrich Schuchardt Date: Fri Jun 10 23:34:26 2016 +0200 apparmor: do not expose kernel stack commit f4ee2def2d70692ccff0d55353df4ee594fd0017 upstream. Do not copy uninitalized fields th.td_hilen, th.td_data. Signed-off-by: Heinrich Schuchardt Signed-off-by: John Johansen Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit a078d77fb083a364341b376c8cfec25c00f50ee2 Author: John Johansen Date: Wed Jun 22 18:01:08 2016 -0700 apparmor: fix module parameters can be changed after policy is locked commit 58acf9d911c8831156634a44d0b022d683e1e50c upstream. the policy_lock parameter is a one way switch that prevents policy from being further modified. Unfortunately some of the module parameters can effectively modify policy by turning off enforcement. split policy_admin_capable into a view check and a full admin check, and update the admin check to test the policy_lock parameter. Signed-off-by: John Johansen Signed-off-by: Willy Tarreau commit 1259d17c102f9a5b3a063d32c567b6bbeb0fa285 Author: John Johansen Date: Wed Jun 15 10:00:55 2016 +0300 apparmor: fix oops in profile_unpack() when policy_db is not present commit 5f20fdfed16bc599a325a145bf0123a8e1c9beea upstream. BugLink: http://bugs.launchpad.net/bugs/1592547 If unpack_dfa() returns NULL due to the dfa not being present, profile_unpack() is not checking if the dfa is not present (NULL). Signed-off-by: John Johansen Signed-off-by: Willy Tarreau commit 4b0f1ec2fa2d7dc4bd80cf8e8e10a27c5af3dc52 Author: John Johansen Date: Wed Jun 15 09:57:55 2016 +0300 apparmor: don't check for vmalloc_addr if kvzalloc() failed commit 3197f5adf539a3ee6331f433a51483f8c842f890 upstream. Signed-off-by: John Johansen Signed-off-by: Willy Tarreau commit efbb2d5b9dad87aea586b21ba861aba605d35bf7 Author: John Johansen Date: Thu Jun 2 02:37:02 2016 -0700 apparmor: add missing id bounds check on dfa verification commit 15756178c6a65b261a080e21af4766f59cafc112 upstream. Signed-off-by: John Johansen Signed-off-by: Willy Tarreau commit 57ad1701a19192ff71267818f980f20e0293e9cd Author: John Johansen Date: Thu Mar 17 12:02:54 2016 -0700 apparmor: check that xindex is in trans_table bounds commit 23ca7b640b4a55f8747301b6bd984dd05545f6a7 upstream. Signed-off-by: John Johansen Acked-by: Seth Arnold Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 8b201a9c604722c1f934ede797a020d3ae42159e Author: John Johansen Date: Fri Jul 25 04:02:10 2014 -0700 apparmor: internal paths should be treated as disconnected commit bd35db8b8ca6e27fc17a9057ef78e1ddfc0de351 upstream. Internal mounts are not mounted anywhere and as such should be treated as disconnected paths. Signed-off-by: John Johansen Acked-by: Seth Arnold Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 31d3307054aececcc1197eb119ebc35a5f4b60fc Author: John Johansen Date: Fri Jul 25 04:02:08 2014 -0700 apparmor: fix disconnected bind mnts reconnection commit f2e561d190da7ff5ee265fa460e2d7f753dddfda upstream. Bind mounts can fail to be properly reconnected when PATH_CONNECT is specified. Ensure that when PATH_CONNECT is specified the path has a root. BugLink: http://bugs.launchpad.net/bugs/1319984 Signed-off-by: John Johansen Acked-by: Seth Arnold Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 8d14bc992000791b414bd536fe3516303c5c039e Author: John Johansen Date: Fri Jul 25 04:02:03 2014 -0700 apparmor: exec should not be returning ENOENT when it denies commit 9049a7922124d843a2cd26a02b1d00a17596ec0c upstream. The current behavior is confusing as it causes exec failures to report the executable is missing instead of identifying that apparmor caused the failure. Signed-off-by: John Johansen Acked-by: Seth Arnold Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit e41dd348186f9fbbf799a944697c64b01a03fb04 Author: John Johansen Date: Sun Jun 8 11:20:54 2014 -0700 apparmor: fix uninitialized lsm_audit member commit b6b1b81b3afba922505b57f4c812bba022f7c4a9 upstream. BugLink: http://bugs.launchpad.net/bugs/1268727 The task field in the lsm_audit struct needs to be initialized if a change_hat fails, otherwise the following oops will occur BUG: unable to handle kernel paging request at 0000002fbead7d08 IP: [] _raw_spin_lock+0xe/0x50 PGD 1e3f35067 PUD 0 Oops: 0002 [#1] SMP Modules linked in: pppox crc_ccitt p8023 p8022 psnap llc ax25 btrfs raid6_pq xor xfs libcrc32c dm_multipath scsi_dh kvm_amd dcdbas kvm microcode amd64_edac_mod joydev edac_core psmouse edac_mce_amd serio_raw k10temp sp5100_tco i2c_piix4 ipmi_si ipmi_msghandler acpi_power_meter mac_hid lp parport hid_generic usbhid hid pata_acpi mpt2sas ahci raid_class pata_atiixp bnx2 libahci scsi_transport_sas [last unloaded: tipc] CPU: 2 PID: 699 Comm: changehat_twice Tainted: GF O 3.13.0-7-generic #25-Ubuntu Hardware name: Dell Inc. PowerEdge R415/08WNM9, BIOS 1.8.6 12/06/2011 task: ffff8802135c6000 ti: ffff880212986000 task.ti: ffff880212986000 RIP: 0010:[] [] _raw_spin_lock+0xe/0x50 RSP: 0018:ffff880212987b68 EFLAGS: 00010006 RAX: 0000000000020000 RBX: 0000002fbead7500 RCX: 0000000000000000 RDX: 0000000000000292 RSI: ffff880212987ba8 RDI: 0000002fbead7d08 RBP: ffff880212987b68 R08: 0000000000000246 R09: ffff880216e572a0 R10: ffffffff815fd677 R11: ffffea0008469580 R12: ffffffff8130966f R13: ffff880212987ba8 R14: 0000002fbead7d08 R15: ffff8800d8c6b830 FS: 00002b5e6c84e7c0(0000) GS:ffff880216e40000(0000) knlGS:0000000055731700 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000002fbead7d08 CR3: 000000021270f000 CR4: 00000000000006e0 Stack: ffff880212987b98 ffffffff81075f17 ffffffff8130966f 0000000000000009 0000000000000000 0000000000000000 ffff880212987bd0 ffffffff81075f7c 0000000000000292 ffff880212987c08 ffff8800d8c6b800 0000000000000026 Call Trace: [] __lock_task_sighand+0x47/0x80 [] ? apparmor_cred_prepare+0x2f/0x50 [] do_send_sig_info+0x2c/0x80 [] send_sig_info+0x1e/0x30 [] aa_audit+0x13d/0x190 [] aa_audit_file+0xbc/0x130 [] ? apparmor_cred_prepare+0x2f/0x50 [] aa_change_hat+0x202/0x530 [] aa_setprocattr_changehat+0x116/0x1d0 [] apparmor_setprocattr+0x25d/0x300 [] security_setprocattr+0x16/0x20 [] proc_pid_attr_write+0x107/0x130 [] vfs_write+0xb4/0x1f0 [] SyS_write+0x49/0xa0 [] tracesys+0xe1/0xe6 Signed-off-by: John Johansen Acked-by: Seth Arnold Acked-by: Jeff Mahoney Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 0e872a0d773a10de75253a54e512346c132098f9 Author: Sachin Prabhu Date: Thu Jan 26 14:28:49 2017 +0100 Fix regression which breaks DFS mounting commit d171356ff11ab1825e456dfb979755e01b3c54a1 upstream. Patch a6b5058 results in -EREMOTE returned by is_path_accessible() in cifs_mount() to be ignored which breaks DFS mounting. Signed-off-by: Sachin Prabhu Reviewed-by: Aurelien Aptel Signed-off-by: Steve French Signed-off-by: Willy Tarreau commit 685957ca255636425c6f6fe3e97aef0f227b8980 Author: Sachin Prabhu Date: Thu Jan 26 14:28:02 2017 +0100 Move check for prefix path to within cifs_get_root() commit 348c1bfa84dfc47da1f1234b7f2bf09fa798edea upstream. Signed-off-by: Sachin Prabhu Tested-by: Aurelien Aptel Signed-off-by: Steve French Acked-by: Aurelien Aptel Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 5c33dcb4788ac9cd1de9e340543badae18e0d1b3 Author: Sachin Prabhu Date: Thu Jan 26 14:27:27 2017 +0100 Compare prepaths when comparing superblocks commit c1d8b24d18192764fe82067ec6aa8d4c3bf094e0 upstream. The patch Fs/cifs: make share unaccessible at root level mountable makes use of prepaths when any component of the underlying path is inaccessible. When mounting 2 separate shares having different prepaths but are other wise similar in other respects, we end up sharing superblocks when we shouldn't be doing so. Signed-off-by: Sachin Prabhu Tested-by: Aurelien Aptel Signed-off-by: Steve French Acked-by: Aurelien Aptel Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 528d066d3ddd524e84ca037911cee45031b4e517 Author: Sachin Prabhu Date: Fri Jul 29 22:38:19 2016 +0100 Fix memory leaks in cifs_do_mount() commit 4214ebf4654798309364d0c678b799e402f38288 upstream. Fix memory leaks introduced by the patch Fs/cifs: make share unaccessible at root level mountable Also move allocation of cifs_sb->prepath to cifs_setup_cifs_sb(). Signed-off-by: Sachin Prabhu Tested-by: Aurelien Aptel Signed-off-by: Steve French Acked-by: Aurelien Aptel Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 40fb18bddcd4780ae8c106a81312bab9ac0bd28c Author: Benjamin Poirier Date: Mon Oct 3 10:47:50 2016 +0800 vmxnet3: Wake queue from reset work commit 277964e19e1416ca31301e113edb2580c81a8b66 upstream. vmxnet3_reset_work() expects tx queues to be stopped (via vmxnet3_quiesce_dev -> netif_tx_disable). However, this races with the netif_wake_queue() call in netif_tx_timeout() such that the driver's start_xmit routine may be called unexpectedly, triggering one of the BUG_ON in vmxnet3_map_pkt with a stack trace like this: RIP: 0010:[] vmxnet3_map_pkt+0x3ac/0x4c0 [vmxnet3] [] vmxnet3_tq_xmit+0x210/0x4e0 [vmxnet3] [] dev_hard_start_xmit+0x2e4/0x4c0 [] sch_direct_xmit+0x17e/0x1e0 [] __qdisc_run+0xd7/0x130 [] net_tx_action+0x10a/0x200 [] __do_softirq+0x11f/0x260 [] call_softirq+0x1c/0x30 [] do_softirq+0x65/0xa0 [] local_bh_enable_ip+0x99/0xa0 [] destroy_conntrack+0x96/0x110 [nf_conntrack] [] nf_conntrack_destroy+0x12/0x20 [] skb_release_head_state+0xb5/0xf0 [] skb_release_all+0x9/0x20 [] __kfree_skb+0x9/0x90 [] vmxnet3_quiesce_dev+0x209/0x340 [vmxnet3] [] vmxnet3_reset_work+0x6a/0xa0 [vmxnet3] [] process_one_work+0x16c/0x350 [] worker_thread+0x17a/0x410 [] kthread+0x96/0xa0 [] kernel_thread_helper+0x4/0x10 Signed-off-by: Benjamin Poirier Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit da39d112afd21b10d71dc09c6a4dd4dfd1f96dfe Author: Trond Myklebust Date: Thu Oct 23 19:33:14 2014 +0300 NFSv4: Ensure nfs_atomic_open set the dentry verifier on ENOENT commit 809fd143de8805970eec02c27c0bc2622a6ecbda upstream. If the OPEN rpc call to the server fails with an ENOENT call, nfs_atomic_open will create a negative dentry for that file, however it currently fails to call nfs_set_verifier(), thus causing the dentry to be immediately revalidated on the next call to nfs_lookup_revalidate() instead of following the usual lookup caching rules. Signed-off-by: Trond Myklebust Signed-off-by: Willy Tarreau commit 1c17617887d36f4db701de1dc2398a7ed649dd52 Author: Fabien Parent Date: Tue Jan 17 13:57:42 2017 +0100 ARM: dts: da850-evm: fix read access to SPI flash commit 43849785e1079f6606a31cb7fda92d1200849728 upstream. Read access to the SPI flash are broken on da850-evm, i.e. the data read is not what is actually programmed on the flash. According to the datasheet for the M25P64 part present on the da850-evm, if the SPI frequency is higher than 20MHz then the READ command is not usable anymore and only the FAST_READ command can be used to read data. This commit specifies in the DTS that we should use FAST_READ command instead of the READ command. Tested-by: Kevin Hilman Signed-off-by: Fabien Parent [nsekhar@ti.com: subject line adjustment] Signed-off-by: Sekhar Nori Signed-off-by: Jiri Slaby Signed-off-by: Olof Johansson Signed-off-by: Willy Tarreau commit d6797dd0b0e7f2d8a828617b8c6f9c66d72b9386 Author: Mark Rutland Date: Fri Jan 6 13:12:47 2017 +0100 ARM: 8634/1: hw_breakpoint: blacklist Scorpion CPUs commit ddc37832a1349f474c4532de381498020ed71d31 upstream. On APQ8060, the kernel crashes in arch_hw_breakpoint_init, taking an undefined instruction trap within write_wb_reg. This is because Scorpion CPUs erroneously appear to set DBGPRSR.SPD when WFI is issued, even if the core is not powered down. When DBGPRSR.SPD is set, breakpoint and watchpoint registers are treated as undefined. It's possible to trigger similar crashes later on from userspace, by requesting the kernel to install a breakpoint or watchpoint, as we can go idle at any point between the reset of the debug registers and their later use. This has always been the case. Given that this has always been broken, no-one has complained until now, and there is no clear workaround, disable hardware breakpoints and watchpoints on Scorpion to avoid these issues. Signed-off-by: Mark Rutland Reported-by: Linus Walleij Reviewed-by: Stephen Boyd Acked-by: Will Deacon Cc: Russell King Signed-off-by: Russell King Signed-off-by: Willy Tarreau commit 2044077298b44481e2961c5c1b8764ed96e8d279 Author: Quinn Tran Date: Fri Dec 23 18:06:10 2016 -0800 qla2xxx: Fix crash due to null pointer access commit fc1ffd6cb38a1c1af625b9833c41928039e733f5 upstream. During code inspection, while investigating following stack trace seen on one of the test setup, we found out there was possibility of memory leak becuase driver was not unwinding the stack properly. This issue has not been reproduced in a test environment or on a customer setup. Here's stack trace that was seen. [1469877.797315] Call Trace: [1469877.799940] [] qla2x00_mem_alloc+0xb09/0x10c0 [qla2xxx] [1469877.806980] [] qla2x00_probe_one+0x86a/0x1b50 [qla2xxx] [1469877.814013] [] ? __pm_runtime_resume+0x51/0xa0 [1469877.820265] [] ? _raw_spin_lock_irqsave+0x25/0x90 [1469877.826776] [] ? _raw_spin_unlock_irqrestore+0x6d/0x80 [1469877.833720] [] ? preempt_count_sub+0xb1/0x100 [1469877.839885] [] ? _raw_spin_unlock_irqrestore+0x4c/0x80 [1469877.846830] [] local_pci_probe+0x4c/0xb0 [1469877.852562] [] ? preempt_count_sub+0xb1/0x100 [1469877.858727] [] pci_call_probe+0x89/0xb0 Signed-off-by: Quinn Tran Signed-off-by: Himanshu Madhani Reviewed-by: Christoph Hellwig [ bvanassche: Fixed spelling in patch description ] Signed-off-by: Bart Van Assche Signed-off-by: Willy Tarreau commit 369ca806e103aeeacb154cde201a49f68049826e Author: Bjorn Helgaas Date: Wed Dec 28 14:55:16 2016 -0600 x86/PCI: Ignore _CRS on Supermicro X8DTH-i/6/iF/6F commit 89e9f7bcd8744ea25fcf0ac671b8d72c10d7d790 upstream. Martin reported that the Supermicro X8DTH-i/6/iF/6F advertises incorrect host bridge windows via _CRS: pci_root PNP0A08:00: host bridge window [io 0xf000-0xffff] pci_root PNP0A08:01: host bridge window [io 0xf000-0xffff] Both bridges advertise the 0xf000-0xffff window, which cannot be correct. Work around this by ignoring _CRS on this system. The downside is that we may not assign resources correctly to hot-added PCI devices (if they are possible on this system). Link: https://bugzilla.kernel.org/show_bug.cgi?id=42606 Reported-by: Martin Burnicki Signed-off-by: Bjorn Helgaas Signed-off-by: Willy Tarreau commit 9cbef9a41d483fd988089ae1bf6b800a709d6635 Author: Niklas Söderlund Date: Sat Nov 12 17:04:24 2016 +0100 pinctrl: sh-pfc: Do not unconditionally support PIN_CONFIG_BIAS_DISABLE commit 5d7400c4acbf7fe633a976a89ee845f7333de3e4 upstream. Always stating PIN_CONFIG_BIAS_DISABLE is supported gives untrue output when examining /sys/kernel/debug/pinctrl/e6060000.pfc/pinconf-pins if the operation get_bias() is implemented but the pin is not handled by the get_bias() implementation. In that case the output will state that "input bias disabled" indicating that this pin has bias control support. Make support for PIN_CONFIG_BIAS_DISABLE depend on that the pin either supports SH_PFC_PIN_CFG_PULL_UP or SH_PFC_PIN_CFG_PULL_DOWN. This also solves the issue where SoC specific implementations print error messages if their particular implementation of {set,get}_bias() is called with a pin it does not know about. Signed-off-by: Niklas Söderlund Acked-by: Laurent Pinchart Signed-off-by: Geert Uytterhoeven Signed-off-by: Willy Tarreau commit a5464db93860e420aede7804a9d65a38cc087e51 Author: Akinobu Mita Date: Fri Jan 6 02:14:16 2017 +0900 sysrq: attach sysrq handler correctly for 32-bit kernel commit 802c03881f29844af0252b6e22be5d2f65f93fd0 upstream. The sysrq input handler should be attached to the input device which has a left alt key. On 32-bit kernels, some input devices which has a left alt key cannot attach sysrq handler. Because the keybit bitmap in struct input_device_id for sysrq is not correctly initialized. KEY_LEFTALT is 56 which is greater than BITS_PER_LONG on 32-bit kernels. I found this problem when using a matrix keypad device which defines a KEY_LEFTALT (56) but doesn't have a KEY_O (24 == 56%32). Cc: Jiri Slaby Signed-off-by: Akinobu Mita Acked-by: Dmitry Torokhov Signed-off-by: Greg Kroah-Hartman Signed-off-by: Willy Tarreau commit 23913c7fb34cf707471d63311926ec6e68421ec7 Author: Augusto Mecking Caringi Date: Tue Jan 10 10:45:00 2017 +0000 vme: Fix wrong pointer utilization in ca91cx42_slave_get commit c8a6a09c1c617402cc9254b2bc8da359a0347d75 upstream. In ca91cx42_slave_get function, the value pointed by vme_base pointer is set through: *vme_base = ioread32(bridge->base + CA91CX42_VSI_BS[i]); So it must be dereferenced to be used in calculation of pci_base: *pci_base = (dma_addr_t)*vme_base + pci_offset; This bug was caught thanks to the following gcc warning: drivers/vme/bridges/vme_ca91cx42.c: In function ‘ca91cx42_slave_get’: drivers/vme/bridges/vme_ca91cx42.c:467:14: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] *pci_base = (dma_addr_t)vme_base + pci_offset; Signed-off-by: Augusto Mecking Caringi Acked-By: Martyn Welch Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit f8ddf9677f60e00b0137379ebf266b2809ba002e Author: Vlad Tsyrklevich Date: Mon Jan 9 22:53:36 2017 +0700 i2c: fix kernel memory disclosure in dev interface commit 30f939feaeee23e21391cfc7b484f012eb189c3c upstream. i2c_smbus_xfer() does not always fill an entire block, allowing kernel stack memory disclosure through the temp variable. Clear it before it's read to. Signed-off-by: Vlad Tsyrklevich Signed-off-by: Wolfram Sang Signed-off-by: Willy Tarreau commit b7f592498930c207185a81e93dc013dff301ffd4 Author: Dmitry Torokhov Date: Thu Apr 13 15:36:31 2017 -0700 Input: i8042 - add Clevo P650RS to the i8042 reset list commit 7c5bb4ac2b76d2a09256aec8a7d584bf3e2b0466 upstream. Clevo P650RS and other similar devices require i8042 to be reset in order to detect Synaptics touchpad. Reported-by: Paweł Bylica Tested-by: Ed Bordin Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=190301 Signed-off-by: Dmitry Torokhov Signed-off-by: Willy Tarreau commit 091db5264d8cd317a83a41ec6b4456a8cb687618 Author: Akinobu Mita Date: Sun Jan 15 14:44:05 2017 -0800 Input: mpr121 - set missing event capability commit 9723ddc8fe0d76ce41fe0dc16afb241ec7d0a29d upstream. This driver reports misc scan input events on the sensor's status register changes. But the event capability for them was not set in the device initialization, so these events were ignored. This change adds the missing event capability. Signed-off-by: Akinobu Mita Signed-off-by: Dmitry Torokhov Cc: Oliver Neukum Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 39106815f8ce7a47a3f6f4de499d891cc3f2a5cc Author: Akinobu Mita Date: Sun Jan 15 14:44:30 2017 -0800 Input: mpr121 - handle multiple bits change of status register commit 08fea55e37f58371bffc5336a59e55d1f155955a upstream. This driver reports input events on their interrupts which are triggered by the sensor's status register changes. But only single bit change is reported in the interrupt handler. So if there are multiple bits are changed at almost the same time, other press or release events are ignored. This fixes it by detecting all changed bits in the status register. Signed-off-by: Akinobu Mita Signed-off-by: Dmitry Torokhov Cc: Oliver Neukum Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 4a878a06235c1445fbd68c2219837ca7697c346b Author: Maxime Ripard Date: Tue Jan 17 13:24:22 2017 -0800 Input: tca8418 - use the interrupt trigger from the device tree commit 259b77ef853cc375a5c9198cf81f9b79fc19413c upstream. The TCA8418 might be used using different interrupt triggers on various boards. This is not working so far because the current code forces a falling edge trigger. The device tree already provides a trigger type, so let's use whatever it sets up, and since we can be loaded without DT, keep the old behaviour for the non-DT case. Signed-off-by: Maxime Ripard Signed-off-by: Dmitry Torokhov Cc: Oliver Neukum Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 840c242b63c8bbc156f6c954e826f71da07f24f7 Author: Raphael Assenat Date: Thu Dec 29 10:23:09 2016 -0800 Input: joydev - do not report stale values on first open commit 45536d373a21d441bd488f618b6e3e9bfae839f3 upstream. Postpone axis initialization to the first open instead of doing it in joydev_connect. This is to make sure the generated startup events are representative of the current joystick state rather than what it was when joydev_connect() was called, potentially much earlier. Once the first user is connected to joydev node we'll be updating joydev->abs[] values and subsequent clients will be getting correct initial states as well. This solves issues with joystick driven menus that start scrolling up each time they are started, until the user moves the joystick to generate events. In emulator menu setups where the menu program is restarted every time the game exits, the repeated need to move the joystick to stop the unintended scrolling gets old rather quickly... Signed-off-by: Raphael Assenat Signed-off-by: Dmitry Torokhov Cc: Oliver Neukum Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 3f25fb3f2e7e6cd5677bcf2f7363a9e7037b9997 Author: Johan Hovold Date: Thu Mar 16 11:41:55 2017 -0700 Input: kbtab - validate number of endpoints before using them commit cb1b494663e037253337623bf1ef2df727883cb7 upstream. Make sure to check the number of endpoints to avoid dereferencing a NULL-pointer should a malicious device lack endpoints. Signed-off-by: Johan Hovold Signed-off-by: Dmitry Torokhov Signed-off-by: Willy Tarreau commit f0819a498be8d63256dc176dc4de7b6286de7d47 Author: Johan Hovold Date: Thu Mar 16 11:34:02 2017 -0700 Input: iforce - validate number of endpoints before using them commit 59cf8bed44a79ec42303151dd014fdb6434254bb upstream. Make sure to check the number of endpoints to avoid dereferencing a NULL-pointer or accessing memory that lie beyond the end of the endpoint array should a malicious device lack the expected endpoints. Signed-off-by: Johan Hovold Signed-off-by: Dmitry Torokhov Signed-off-by: Willy Tarreau commit 4fccf698ed2aef12f4e150b6ee090b92fc3e7c64 Author: Kai-Heng Feng Date: Tue Mar 7 09:31:29 2017 -0800 Input: i8042 - add noloop quirk for Dell Embedded Box PC 3000 commit 45838660e34d90db8d4f7cbc8fd66e8aff79f4fe upstream. The aux port does not get detected without noloop quirk, so external PS/2 mouse cannot work as result. The PS/2 mouse can work with this quirk. BugLink: https://bugs.launchpad.net/bugs/1591053 Signed-off-by: Kai-Heng Feng Reviewed-by: Marcos Paulo de Souza Signed-off-by: Dmitry Torokhov Signed-off-by: Willy Tarreau commit e2c835ec044d59f9511f205cc358454fd6822d49 Author: Pavel Rojtberg Date: Tue Dec 27 11:44:51 2016 -0800 Input: xpad - use correct product id for x360w controllers commit b6fc513da50c5dbc457a8ad6b58b046a6a68fd9d upstream. currently the controllers get the same product id as the wireless receiver. However the controllers actually have their own product id. The patch makes the driver expose the same product id as the windows driver. This improves compatibility when running applications with WINE. see https://github.com/paroj/xpad/issues/54 Signed-off-by: Pavel Rojtberg Signed-off-by: Dmitry Torokhov Signed-off-by: Willy Tarreau commit a48d5b1570ac6f5fd02028e5a71431d1f7e778db Author: Greg Kroah-Hartman Date: Fri Jan 6 15:33:36 2017 +0100 HID: hid-cypress: validate length of report commit 1ebb71143758f45dc0fa76e2f48429e13b16d110 upstream. Make sure we have enough of a report structure to validate before looking at it. Reported-by: Benoit Camredon Tested-by: Benoit Camredon Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 461527f71c537ce9d972a116f08f2bbd0c5ac105 Author: Michal Tesar Date: Mon Jan 2 14:38:36 2017 +0100 igmp: Make igmp group member RFC 3376 compliant commit 7ababb782690e03b78657e27bd051e20163af2d6 upstream. 5.2. Action on Reception of a Query When a system receives a Query, it does not respond immediately. Instead, it delays its response by a random amount of time, bounded by the Max Resp Time value derived from the Max Resp Code in the received Query message. A system may receive a variety of Queries on different interfaces and of different kinds (e.g., General Queries, Group-Specific Queries, and Group-and-Source-Specific Queries), each of which may require its own delayed response. Before scheduling a response to a Query, the system must first consider previously scheduled pending responses and in many cases schedule a combined response. Therefore, the system must be able to maintain the following state: o A timer per interface for scheduling responses to General Queries. o A per-group and interface timer for scheduling responses to Group- Specific and Group-and-Source-Specific Queries. o A per-group and interface list of sources to be reported in the response to a Group-and-Source-Specific Query. When a new Query with the Router-Alert option arrives on an interface, provided the system has state to report, a delay for a response is randomly selected in the range (0, [Max Resp Time]) where Max Resp Time is derived from Max Resp Code in the received Query message. The following rules are then used to determine if a Report needs to be scheduled and the type of Report to schedule. The rules are considered in order and only the first matching rule is applied. 1. If there is a pending response to a previous General Query scheduled sooner than the selected delay, no additional response needs to be scheduled. 2. If the received Query is a General Query, the interface timer is used to schedule a response to the General Query after the selected delay. Any previously pending response to a General Query is canceled. --8<-- Currently the timer is rearmed with new random expiration time for every incoming query regardless of possibly already pending report. Which is not aligned with the above RFE. It also might happen that higher rate of incoming queries can postpone the report after the expiration time of the first query causing group membership loss. Now the per interface general query timer is rearmed only when there is no pending report already scheduled on that interface or the newly selected expiration time is before the already pending scheduled report. Signed-off-by: Michal Tesar Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit e88f37a15a772296ae3850ebe24b385ccd951dd5 Author: Reiter Wolfgang Date: Tue Jan 3 01:39:10 2017 +0100 drop_monitor: consider inserted data in genlmsg_end commit 3b48ab2248e61408910e792fe84d6ec466084c1a upstream. Final nlmsg_len field update must reflect inserted net_dm_drop_point data. This patch depends on previous patch: "drop_monitor: add missing call to genlmsg_end" Signed-off-by: Reiter Wolfgang Acked-by: Neil Horman Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 0cfc0624b1a8359ed127856bfc7e1118cf8eb5c9 Author: Reiter Wolfgang Date: Sat Dec 31 21:11:57 2016 +0100 drop_monitor: add missing call to genlmsg_end commit 4200462d88f47f3759bdf4705f87e207b0f5b2e4 upstream. Update nlmsg_len field with genlmsg_end to enable userspace processing using nlmsg_next helper. Also adds error handling. Signed-off-by: Reiter Wolfgang Acked-by: Neil Horman Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 437529843ff158e2f132b04c014abd2d0707bba1 Author: stephen hemminger Date: Tue Dec 6 13:43:54 2016 -0800 netvsc: reduce maximum GSO size commit a50af86dd49ee1851d1ccf06dd0019c05b95e297 upstream. Hyper-V (and Azure) support using NVGRE which requires some extra space for encapsulation headers. Because of this the largest allowed TSO packet is reduced. For older releases, hard code a fixed reduced value. For next release, there is a better solution which uses result of host offload negotiation. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 468316bf28200b76195c11f7f3a9693978484a99 Author: Thomas Gleixner Date: Thu Dec 15 12:10:37 2016 +0100 tick/broadcast: Prevent NULL pointer dereference commit c1a9eeb938b5433947e5ea22f89baff3182e7075 upstream. When a disfunctional timer, e.g. dummy timer, is installed, the tick core tries to setup the broadcast timer. If no broadcast device is installed, the kernel crashes with a NULL pointer dereference in tick_broadcast_setup_oneshot() because the function has no sanity check. Reported-by: Mason Signed-off-by: Thomas Gleixner Cc: Mark Rutland Cc: Anna-Maria Gleixner Cc: Richard Cochran Cc: Sebastian Andrzej Siewior Cc: Daniel Lezcano Cc: Peter Zijlstra , Cc: Sebastian Frias Cc: Thibaud Cornic Cc: Robin Murphy Link: http://lkml.kernel.org/r/1147ef90-7877-e4d2-bb2b-5c4fa8d3144b@free.fr Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 8f587770ccc4525fb026d1fe43aaac4d9ec70efb Author: Paul Burton Date: Fri Sep 2 15:22:48 2016 +0100 net: ti: cpmac: Fix compiler warning due to type confusion commit 2f5281ba2a8feaf6f0aee93356f350855bb530fc upstream. cpmac_start_xmit() used the max() macro on skb->len (an unsigned int) and ETH_ZLEN (a signed int literal). This led to the following compiler warning: In file included from include/linux/list.h:8:0, from include/linux/module.h:9, from drivers/net/ethernet/ti/cpmac.c:19: drivers/net/ethernet/ti/cpmac.c: In function 'cpmac_start_xmit': include/linux/kernel.h:748:17: warning: comparison of distinct pointer types lacks a cast (void) (&_max1 == &_max2); \ ^ drivers/net/ethernet/ti/cpmac.c:560:8: note: in expansion of macro 'max' len = max(skb->len, ETH_ZLEN); ^ On top of this, it assigned the result of the max() macro to a signed integer whilst all further uses of it result in it being cast to varying widths of unsigned integer. Fix this up by using max_t to ensure the comparison is performed as unsigned integers, and for consistency change the type of the len variable to unsigned int. Signed-off-by: Paul Burton Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit de2e2803e68126872b822a731ef6bcbeaa6553d1 Author: Arnd Bergmann Date: Tue Mar 22 14:27:11 2016 -0700 cred/userns: define current_user_ns() as a function commit 0335695dfa4df01edff5bb102b9a82a0668ee51e upstream. The current_user_ns() macro currently returns &init_user_ns when user namespaces are disabled, and that causes several warnings when building with gcc-6.0 in code that compares the result of the macro to &init_user_ns itself: fs/xfs/xfs_ioctl.c: In function 'xfs_ioctl_setattr_check_projid': fs/xfs/xfs_ioctl.c:1249:22: error: self-comparison always evaluates to true [-Werror=tautological-compare] if (current_user_ns() == &init_user_ns) This is a legitimate warning in principle, but here it isn't really helpful, so I'm reprasing the definition in a way that shuts up the warning. Apparently gcc only warns when comparing identical literals, but it can figure out that the result of an inline function can be identical to a constant expression in order to optimize a condition yet not warn about the fact that the condition is known at compile time. This is exactly what we want here, and it looks reasonable because we generally prefer inline functions over macros anyway. Signed-off-by: Arnd Bergmann Acked-by: Serge Hallyn Cc: David Howells Cc: Yaowei Bai Cc: James Morris Cc: "Paul E. McKenney" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit fa51c867f4d587ff6d9b75ee9e9cb9e107165e7a Author: Steven Rostedt Date: Mon May 16 23:00:35 2016 -0400 ftrace/x86: Set ftrace_stub to weak to prevent gcc from using short jumps to it commit 8329e818f14926a6040df86b2668568bde342ebf upstream. Matt Fleming reported seeing crashes when enabling and disabling function profiling which uses function graph tracer. Later Namhyung Kim hit a similar issue and he found that the issue was due to the jmp to ftrace_stub in ftrace_graph_call was only two bytes, and when it was changed to jump to the tracing code, it overwrote the ftrace_stub that was after it. Masami Hiramatsu bisected this down to a binutils change: 8dcea93252a9ea7dff57e85220a719e2a5e8ab41 is the first bad commit commit 8dcea93252a9ea7dff57e85220a719e2a5e8ab41 Author: H.J. Lu Date: Fri May 15 03:17:31 2015 -0700 Add -mshared option to x86 ELF assembler This patch adds -mshared option to x86 ELF assembler. By default, assembler will optimize out non-PLT relocations against defined non-weak global branch targets with default visibility. The -mshared option tells the assembler to generate code which may go into a shared library where all non-weak global branch targets with default visibility can be preempted. The resulting code is slightly bigger. This option only affects the handling of branch instructions. Declaring ftrace_stub as a weak call prevents gas from using two byte jumps to it, which would be converted to a jump to the function graph code. Link: http://lkml.kernel.org/r/20160516230035.1dbae571@gandalf.local.home Reported-by: Matt Fleming Reported-by: Namhyung Kim Tested-by: Matt Fleming Reviewed-by: Masami Hiramatsu Signed-off-by: Steven Rostedt Signed-off-by: Willy Tarreau commit 823a2a0330f25ccea8ef64ee4a378e37ca51361c Author: Al Viro Date: Fri Dec 16 13:42:06 2016 -0500 sg_write()/bsg_write() is not fit to be called under KERNEL_DS commit 128394eff343fc6d2f32172f03e24829539c5835 upstream. Both damn things interpret userland pointers embedded into the payload; worse, they are actually traversing those. Leaving aside the bad API design, this is very much _not_ safe to call with KERNEL_DS. Bail out early if that happens. Signed-off-by: Al Viro Signed-off-by: Willy Tarreau commit b1230ef86d56322ebc21b1dbd7fa91250ca5d266 Author: Geoff Levand Date: Tue Nov 29 10:47:32 2016 -0800 powerpc/ps3: Fix system hang with GCC 5 builds commit 6dff5b67054e17c91bd630bcdda17cfca5aa4215 upstream. GCC 5 generates different code for this bootwrapper null check that causes the PS3 to hang very early in its bootup. This check is of limited value, so just get rid of it. Signed-off-by: Geoff Levand Signed-off-by: Michael Ellerman Signed-off-by: Willy Tarreau commit 768ce7b007bcff58f2100123dc1ca1722f71d340 Author: Al Viro Date: Mon Sep 5 21:42:32 2016 -0400 nfs_write_end(): fix handling of short copies commit c0cf3ef5e0f47e385920450b245d22bead93e7ad upstream. What matters when deciding if we should make a page uptodate is not how much we _wanted_ to copy, but how much we actually have copied. As it is, on architectures that do not zero tail on short copy we can leave uninitialized data in page marked uptodate. Signed-off-by: Al Viro Signed-off-by: Willy Tarreau commit 93c83e37a5790c86626ef00ade1350770a904c9d Author: Ilya Dryomov Date: Fri Dec 2 16:35:09 2016 +0100 libceph: verify authorize reply on connect commit 5c056fdc5b474329037f2aa18401bd73033e0ce0 upstream. After sending an authorizer (ceph_x_authorize_a + ceph_x_authorize_b), the client gets back a ceph_x_authorize_reply, which it is supposed to verify to ensure the authenticity and protect against replay attacks. The code for doing this is there (ceph_x_verify_authorizer_reply(), ceph_auth_verify_authorizer_reply() + plumbing), but it is never invoked by the the messenger. AFAICT this goes back to 2009, when ceph authentication protocols support was added to the kernel client in 4e7a5dcd1bba ("ceph: negotiate authentication protocol; implement AUTH_NONE protocol"). The second param of ceph_connection_operations::verify_authorizer_reply is unused all the way down. Pass 0 to facilitate backporting, and kill it in the next commit. Signed-off-by: Ilya Dryomov Reviewed-by: Sage Weil Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 8cfc539f624a1c145ae237bfc187c66a89dc9316 Author: Gerald Schaefer Date: Mon Nov 21 12:13:58 2016 +0100 s390/vmlogrdr: fix IUCV buffer allocation commit 5457e03de918f7a3e294eb9d26a608ab8a579976 upstream. The buffer for iucv_message_receive() needs to be below 2 GB. In __iucv_message_receive(), the buffer address is casted to an u32, which would result in either memory corruption or an addressing exception when using addresses >= 2 GB. Fix this by using GFP_DMA for the buffer allocation. Signed-off-by: Gerald Schaefer Signed-off-by: Martin Schwidefsky Signed-off-by: Willy Tarreau commit 937f07614dcd8da04621e1e67589555a47a828d2 Author: Martin K. Petersen Date: Tue Apr 4 10:42:30 2017 -0400 scsi: sd: Fix capacity calculation with 32-bit sector_t commit 7c856152cb92f8eee2df29ef325a1b1f43161aff upstream. We previously made sure that the reported disk capacity was less than 0xffffffff blocks when the kernel was not compiled with large sector_t support (CONFIG_LBDAF). However, this check assumed that the capacity was reported in units of 512 bytes. Add a sanity check function to ensure that we only enable disks if the entire reported capacity can be expressed in terms of sector_t. Reported-by: Steve Magnani Cc: Bart Van Assche Reviewed-by: Bart Van Assche Signed-off-by: Martin K. Petersen Signed-off-by: Willy Tarreau commit d427ab53d49580344f2d515c0ffd261fc3b483df Author: Martin K. Petersen Date: Fri Mar 17 08:47:14 2017 -0400 scsi: sr: Sanity check returned mode data commit a00a7862513089f17209b732f230922f1942e0b9 upstream. Kefeng Wang discovered that old versions of the QEMU CD driver would return mangled mode data causing us to walk off the end of the buffer in an attempt to parse it. Sanity check the returned mode sense data. Reported-by: Kefeng Wang Tested-by: Kefeng Wang Signed-off-by: Martin K. Petersen Signed-off-by: Willy Tarreau commit dfda822cbb76b105d0094b122e4c025642e3b218 Author: Anton Blanchard Date: Mon Feb 13 08:49:20 2017 +1100 scsi: lpfc: Add shutdown method for kexec commit 85e8a23936ab3442de0c42da97d53b29f004ece1 upstream. We see lpfc devices regularly fail during kexec. Fix this by adding a shutdown method which mirrors the remove method. Signed-off-by: Anton Blanchard Reviewed-by: Mauricio Faria de Oliveira Tested-by: Mauricio Faria de Oliveira Signed-off-by: Martin K. Petersen Signed-off-by: Willy Tarreau commit 22e0bbbc6f30e61aa735f59f738caadf3c9784af Author: Nicholas Bellinger Date: Thu Nov 3 23:06:53 2016 -0700 target/pscsi: Fix TYPE_TAPE + TYPE_MEDIMUM_CHANGER export commit a04e54f2c35823ca32d56afcd5cea5b783e2f51a upstream. The following fixes a divide by zero OOPs with TYPE_TAPE due to pscsi_tape_read_blocksize() failing causing a zero sd->sector_size being propigated up via dev_attrib.hw_block_size. It also fixes another long-standing bug where TYPE_TAPE and TYPE_MEDIMUM_CHANGER where using pscsi_create_type_other(), which does not call scsi_device_get() to take the device reference. Instead, rename pscsi_create_type_rom() to pscsi_create_type_nondisk() and use it for all cases. Finally, also drop a dump_stack() in pscsi_get_blocks() for non TYPE_DISK, which in modern target-core can get invoked via target_sense_desc_format() during CHECK_CONDITION. [js] cast max_sectors to unsigned to avoid warnings Reported-by: Malcolm Haak Signed-off-by: Nicholas Bellinger Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit d49e93f927b14e1ac95aaa5847ee1175b177a07c Author: Long Li Date: Wed Dec 14 18:46:03 2016 -0800 scsi: storvsc: properly set residual data length on errors commit 40630f462824ee24bc00d692865c86c3828094e0 upstream. On I/O errors, the Windows driver doesn't set data_transfer_length on error conditions other than SRB_STATUS_DATA_OVERRUN. In these cases we need to set data_transfer_length to 0, indicating there is no data transferred. On SRB_STATUS_DATA_OVERRUN, data_transfer_length is set by the Windows driver to the actual data transferred. Reported-by: Shiva Krishna Signed-off-by: Long Li Reviewed-by: K. Y. Srinivasan Signed-off-by: K. Y. Srinivasan Signed-off-by: Martin K. Petersen Signed-off-by: Willy Tarreau commit d9b6a466adbf115d0428eff73ef78e7d01646396 Author: Long Li Date: Wed Dec 14 18:46:02 2016 -0800 scsi: storvsc: properly handle SRB_ERROR when sense message is present commit bba5dc332ec2d3a685cb4dae668c793f6a3713a3 upstream. When sense message is present on error, we should pass along to the upper layer to decide how to deal with the error. This patch fixes connectivity issues with Fiber Channel devices. Signed-off-by: Long Li Reviewed-by: K. Y. Srinivasan Signed-off-by: K. Y. Srinivasan Signed-off-by: Martin K. Petersen Signed-off-by: Willy Tarreau commit c243c6149a5c64efe60fb032f6ffdba8c408211f Author: Johannes Thumshirn Date: Tue Jan 31 10:16:00 2017 +0100 scsi: don't BUG_ON() empty DMA transfers commit fd3fc0b4d7305fa7246622dcc0dec69c42443f45 upstream. Don't crash the machine just because of an empty transfer. Use WARN_ON() combined with returning an error. Found by Dmitry Vyukov and syzkaller. [ Changed to "WARN_ON_ONCE()". Al has a patch that should fix the root cause, but a BUG_ON() is not acceptable in any case, and a WARN_ON() might still be a cause of excessive log spamming. NOTE! If this warning ever triggers, we may end up leaking resources, since this doesn't bother to try to clean the command up. So this WARN_ON_ONCE() triggering does imply real problems. But BUG_ON() is much worse. People really need to stop using BUG_ON() for "this shouldn't ever happen". It makes pretty much any bug worse. - Linus ] Signed-off-by: Johannes Thumshirn Reported-by: Dmitry Vyukov Cc: James Bottomley Cc: Al Viro Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit cf8ec2cf8d05900c8322942da7635e1386b502a3 Author: Christoph Hellwig Date: Sat Jun 28 11:51:01 2014 +0200 scsi: move the nr_phys_segments assert into scsi_init_io commit 635d98b1d0cfc2ba3426a701725d31a6102c059a upstream. scsi_init_io should only be called for requests that transfer data, so move the assert that a request has segments from the callers into scsi_init_io. Signed-off-by: Christoph Hellwig Reviewed-by: Martin K. Petersen Reviewed-by: Hannes Reinecke Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit b3ffb4cb387985c8572f49478a822abf8125f53c Author: Wei Fang Date: Tue Dec 13 09:25:21 2016 +0800 scsi: avoid a permanent stop of the scsi device's request queue commit d2a145252c52792bc59e4767b486b26c430af4bb upstream. A race between scanning and fc_remote_port_delete() may result in a permanent stop if the device gets blocked before scsi_sysfs_add_sdev() and unblocked after. The reason is that blocking a device sets both the SDEV_BLOCKED state and the QUEUE_FLAG_STOPPED. However, scsi_sysfs_add_sdev() unconditionally sets SDEV_RUNNING which causes the device to be ignored by scsi_target_unblock() and thus never have its QUEUE_FLAG_STOPPED cleared leading to a device which is apparently running but has a stopped queue. We actually have two places where SDEV_RUNNING is set: once in scsi_add_lun() which respects the blocked flag and once in scsi_sysfs_add_sdev() which doesn't. Since the second set is entirely spurious, simply remove it to fix the problem. Reported-by: Zengxi Chen Signed-off-by: Wei Fang Reviewed-by: Ewan D. Milne Signed-off-by: Martin K. Petersen Signed-off-by: Willy Tarreau commit 0175d92323beef4bfdb24addd53144a0f4f49373 Author: Russell Currey Date: Thu Dec 15 16:12:41 2016 +1100 drivers/gpu/drm/ast: Fix infinite loop if read fails commit 298360af3dab45659810fdc51aba0c9f4097e4f6 upstream. ast_get_dram_info() configures a window in order to access BMC memory. A BMC register can be configured to disallow this, and if so, causes an infinite loop in the ast driver which renders the system unusable. Fix this by erroring out if an error is detected. On powerpc systems with EEH, this leads to the device being fenced and the system continuing to operate. Signed-off-by: Russell Currey Reviewed-by: Joel Stanley Signed-off-by: Daniel Vetter Link: http://patchwork.freedesktop.org/patch/msgid/20161215051241.20815-1-ruscur@russell.cc Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 048eeebe865bd568d570e328ecc98fbd72489f4c Author: Larry Finger Date: Sat Nov 5 14:08:57 2016 -0500 ssb: Fix error routine when fallback SPROM fails commit 8052d7245b6089992343c80b38b14dbbd8354651 upstream. When there is a CRC error in the SPROM read from the device, the code attempts to handle a fallback SPROM. When this also fails, the driver returns zero rather than an error code. Signed-off-by: Larry Finger Signed-off-by: Kalle Valo Signed-off-by: Willy Tarreau commit 4e448d489e0d63267294541cef8f81d7f1b88095 Author: Darrick J. Wong Date: Wed Jan 25 20:24:57 2017 -0800 xfs: clear _XBF_PAGES from buffers when readahead page commit 2aa6ba7b5ad3189cc27f14540aa2f57f0ed8df4b upstream. If we try to allocate memory pages to back an xfs_buf that we're trying to read, it's possible that we'll be so short on memory that the page allocation fails. For a blocking read we'll just wait, but for readahead we simply dump all the pages we've collected so far. Unfortunately, after dumping the pages we neglect to clear the _XBF_PAGES state, which means that the subsequent call to xfs_buf_free thinks that b_pages still points to pages we own. It then double-frees the b_pages pages. This results in screaming about negative page refcounts from the memory manager, which xfs oughtn't be triggering. To reproduce this case, mount a filesystem where the size of the inodes far outweighs the availalble memory (a ~500M inode filesystem on a VM with 300MB memory did the trick here) and run bulkstat in parallel with other memory eating processes to put a huge load on the system. The "check summary" phase of xfs_scrub also works for this purpose. Signed-off-by: Darrick J. Wong Reviewed-by: Eric Sandeen Cc: Ivan Kozik Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit dfd97b52341c677e9767a17e3a7dbf0a3fd68601 Author: Eric Sandeen Date: Mon Dec 5 12:31:06 2016 +1100 xfs: set AGI buffer type in xlog_recover_clear_agi_bucket commit 6b10b23ca94451fae153a5cc8d62fd721bec2019 upstream. xlog_recover_clear_agi_bucket didn't set the type to XFS_BLFT_AGI_BUF, so we got a warning during log replay (or an ASSERT on a debug build). XFS (md0): Unknown buffer type 0! XFS (md0): _xfs_buf_ioapply: no ops on block 0xaea8802/0x1 Fix this, as was done in f19b872b for 2 other locations with the same problem. Signed-off-by: Eric Sandeen Reviewed-by: Brian Foster Reviewed-by: Christoph Hellwig Signed-off-by: Dave Chinner Signed-off-by: Willy Tarreau commit afeef476bbff8f8b8ff5cb129b72fbf544a51941 Author: Julien Grall Date: Wed Dec 7 12:24:40 2016 +0000 arm/xen: Use alloc_percpu rather than __alloc_percpu commit 24d5373dda7c00a438d26016bce140299fae675e upstream. The function xen_guest_init is using __alloc_percpu with an alignment which are not power of two. However, the percpu allocator never supported alignments which are not power of two and has always behaved incorectly in thise case. Commit 3ca45a4 "percpu: ensure requested alignment is power of two" introduced a check which trigger a warning [1] when booting linux-next on Xen. But in reality this bug was always present. This can be fixed by replacing the call to __alloc_percpu with alloc_percpu. The latter will use an alignment which are a power of two. [1] [ 0.023921] illegal size (48) or align (48) for percpu allocation [ 0.024167] ------------[ cut here ]------------ [ 0.024344] WARNING: CPU: 0 PID: 1 at linux/mm/percpu.c:892 pcpu_alloc+0x88/0x6c0 [ 0.024584] Modules linked in: [ 0.024708] [ 0.024804] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.0-rc7-next-20161128 #473 [ 0.025012] Hardware name: Foundation-v8A (DT) [ 0.025162] task: ffff80003d870000 task.stack: ffff80003d844000 [ 0.025351] PC is at pcpu_alloc+0x88/0x6c0 [ 0.025490] LR is at pcpu_alloc+0x88/0x6c0 [ 0.025624] pc : [] lr : [] pstate: 60000045 [ 0.025830] sp : ffff80003d847cd0 [ 0.025946] x29: ffff80003d847cd0 x28: 0000000000000000 [ 0.026147] x27: 0000000000000000 x26: 0000000000000000 [ 0.026348] x25: 0000000000000000 x24: 0000000000000000 [ 0.026549] x23: 0000000000000000 x22: 00000000024000c0 [ 0.026752] x21: ffff000008e97000 x20: 0000000000000000 [ 0.026953] x19: 0000000000000030 x18: 0000000000000010 [ 0.027155] x17: 0000000000000a3f x16: 00000000deadbeef [ 0.027357] x15: 0000000000000006 x14: ffff000088f79c3f [ 0.027573] x13: ffff000008f79c4d x12: 0000000000000041 [ 0.027782] x11: 0000000000000006 x10: 0000000000000042 [ 0.027995] x9 : ffff80003d847a40 x8 : 6f697461636f6c6c [ 0.028208] x7 : 6120757063726570 x6 : ffff000008f79c84 [ 0.028419] x5 : 0000000000000005 x4 : 0000000000000000 [ 0.028628] x3 : 0000000000000000 x2 : 000000000000017f [ 0.028840] x1 : ffff80003d870000 x0 : 0000000000000035 [ 0.029056] [ 0.029152] ---[ end trace 0000000000000000 ]--- [ 0.029297] Call trace: [ 0.029403] Exception stack(0xffff80003d847b00 to 0xffff80003d847c30) [ 0.029621] 7b00: 0000000000000030 0001000000000000 ffff80003d847cd0 ffff00000818e678 [ 0.029901] 7b20: 0000000000000002 0000000000000004 ffff000008f7c060 0000000000000035 [ 0.030153] 7b40: ffff000008f79000 ffff000008c4cd88 ffff80003d847bf0 ffff000008101778 [ 0.030402] 7b60: 0000000000000030 0000000000000000 ffff000008e97000 00000000024000c0 [ 0.030647] 7b80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 0.030895] 7ba0: 0000000000000035 ffff80003d870000 000000000000017f 0000000000000000 [ 0.031144] 7bc0: 0000000000000000 0000000000000005 ffff000008f79c84 6120757063726570 [ 0.031394] 7be0: 6f697461636f6c6c ffff80003d847a40 0000000000000042 0000000000000006 [ 0.031643] 7c00: 0000000000000041 ffff000008f79c4d ffff000088f79c3f 0000000000000006 [ 0.031877] 7c20: 00000000deadbeef 0000000000000a3f [ 0.032051] [] pcpu_alloc+0x88/0x6c0 [ 0.032229] [] __alloc_percpu+0x18/0x20 [ 0.032409] [] xen_guest_init+0x174/0x2f4 [ 0.032591] [] do_one_initcall+0x38/0x130 [ 0.032783] [] kernel_init_freeable+0xe0/0x248 [ 0.032995] [] kernel_init+0x10/0x100 [ 0.033172] [] ret_from_fork+0x10/0x50 Reported-by: Wei Chen Link: https://lkml.org/lkml/2016/11/28/669 Signed-off-by: Julien Grall Signed-off-by: Stefano Stabellini Reviewed-by: Stefano Stabellini Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 50da0b03025fe051b25501652b96664ecdfeda43 Author: Alan Stern Date: Fri Oct 21 16:49:07 2016 -0400 USB: UHCI: report non-PME wakeup signalling for Intel hardware commit ccdb6be9ec6580ef69f68949ebe26e0fb58a6fb0 upstream. The UHCI controllers in Intel chipsets rely on a platform-specific non-PME mechanism for wakeup signalling. They can generate wakeup signals even though they don't support PME. We need to let the USB core know this so that it will enable runtime suspend for UHCI controllers. Signed-off-by: Alan Stern Signed-off-by: Bjorn Helgaas Acked-by: Greg Kroah-Hartman Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit d4dfd03c3f10a885acd6c62344697f805a5b94ab Author: Felipe Balbi Date: Wed Sep 28 10:38:11 2016 +0300 usb: gadget: composite: correctly initialize ep->maxpacket commit e8f29bb719b47a234f33b0af62974d7a9521a52c upstream. usb_endpoint_maxp() returns wMaxPacketSize in its raw form. Without taking into consideration that it also contains other bits reserved for isochronous endpoints. This patch fixes one occasion where this is a problem by making sure that we initialize ep->maxpacket only with lower 10 bits of the value returned by usb_endpoint_maxp(). Note that seperate patches will be necessary to audit all call sites of usb_endpoint_maxp() and make sure that usb_endpoint_maxp() only returns lower 10 bits of wMaxPacketSize. Signed-off-by: Felipe Balbi Signed-off-by: Willy Tarreau commit 1ea3d6c8d5111691a678da52ea1c1809c3cedc1d Author: Guenter Roeck Date: Thu Dec 1 13:49:59 2016 -0800 usb: hub: Wait for connection to be reestablished after port reset commit 22547c4cc4fe20698a6a85a55b8788859134b8e4 upstream. On a system with a defective USB device connected to an USB hub, an endless sequence of port connect events was observed. The sequence of events as observed is as follows: - Port reports connected event (port status=USB_PORT_STAT_CONNECTION). - Event handler debounces port and resets it by calling hub_port_reset(). - hub_port_reset() calls hub_port_wait_reset() to wait for the reset to complete. - The reset completes, but USB_PORT_STAT_CONNECTION is not immediately set in the port status register. - hub_port_wait_reset() returns -ENOTCONN. - Port initialization sequence is aborted. - A few milliseconds later, the port again reports a connected event, and the sequence repeats. This continues either forever or, randomly, stops if the connection is already re-established when the port status is read. It results in a high rate of udev events. This in turn destabilizes userspace since the above sequence holds the device mutex pretty much continuously and prevents userspace from actually reading the device status. To prevent the problem from happening, let's wait for the connection to be re-established after a port reset. If the device was actually disconnected, the code will still return an error, but it will do so only after the long reset timeout. Cc: Douglas Anderson Signed-off-by: Guenter Roeck Acked-by: Alan Stern Signed-off-by: Sumit Semwal Signed-off-by: Willy Tarreau commit 91c86d706c740499ef68dbc0724347c12dc3ac76 Author: Janusz Dziedzic Date: Mon Mar 13 14:11:32 2017 +0200 usb: dwc3: gadget: delay unmap of bounced requests commit de288e36fe33f7e06fa272bc8e2f85aa386d99aa upstream. In the case of bounced ep0 requests, we must delay DMA operation until after ->complete() otherwise we might overwrite contents of req->buf. This caused problems with RNDIS gadget. Signed-off-by: Janusz Dziedzic Signed-off-by: Felipe Balbi Signed-off-by: Willy Tarreau commit a3fdfc6427e7e838223bc4077da66cc00f02451e Author: Guenter Roeck Date: Thu Mar 9 15:39:37 2017 +0200 usb: host: xhci-plat: Fix timeout on removal of hot pluggable xhci controllers commit dcc7620cad5ad1326a78f4031a7bf4f0e5b42984 upstream. Upstream commit 98d74f9ceaef ("xhci: fix 10 second timeout on removal of PCI hotpluggable xhci controllers") fixes a problem with hot pluggable PCI xhci controllers which can result in excessive timeouts, to the point where the system reports a deadlock. The same problem is seen with hot pluggable xhci controllers using the xhci-plat driver, such as the driver used for Type-C ports on rk3399. Similar to hot-pluggable PCI controllers, the driver for this chip removes the xhci controller from the system when the Type-C cable is disconnected. The solution for PCI devices works just as well for non-PCI devices and avoids the problem. Signed-off-by: Guenter Roeck Signed-off-by: Mathias Nyman Signed-off-by: Willy Tarreau commit 8cada854ab704ca486754043a42a7b7777506c2a Author: Felipe Balbi Date: Tue Jan 31 13:24:54 2017 +0200 usb: dwc3: gadget: make Set Endpoint Configuration macros safe commit 7369090a9fb57c3fc705ce355d2e4523a5a24716 upstream. Some gadget drivers are bad, bad boys. We notice that ADB was passing bad Burst Size which caused top bits of param0 to be overwritten which confused DWC3 when running this command. In order to avoid future issues, we're going to make sure values passed by macros are always safe for the controller. Note that ADB still needs a fix to *not* pass bad values. Reported-by: Mohamed Abbas Sugested-by: Adam Andruszak Signed-off-by: Felipe Balbi Signed-off-by: Willy Tarreau commit 817a471a34a6a31a2fed5b69bc920507f41f9969 Author: Johan Hovold Date: Mon May 26 19:23:43 2014 +0200 USB: cdc-acm: fix failed open not being detected commit 8727bf689a77a79816065e23a7a58a474ad544f9 upstream. Fix errors during open not being returned to userspace. Specifically, failed control-line manipulations or control or read urb submissions would not be detected. Fixes: 7fb57a019f94 ("USB: cdc-acm: Fix potential deadlock (lockdep warning)") Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman Signed-off-by: Willy Tarreau commit 256d49baf399fae3921083cfb6a8e59ad393fee8 Author: Johan Hovold Date: Mon May 26 19:23:42 2014 +0200 USB: cdc-acm: fix open and suspend race commit 703df3297fb1950b0aa53e656108eb936d3f21d9 upstream. We must not do the usb_autopm_put_interface() before submitting the read urbs or we might end up doing I/O to a suspended device. Fixes: 088c64f81284 ("USB: cdc-acm: re-write read processing") Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman Signed-off-by: Willy Tarreau commit f8dc9ab9dcf2133919763e00ba646fca891f248b Author: Alexey Khoroshilov Date: Sat Apr 12 02:10:45 2014 +0400 USB: cdc-acm: fix double usb_autopm_put_interface() in acm_port_activate() commit 070c0b17f6a1ba39dff9be112218127e7e8fd456 upstream. If acm_submit_read_urbs() fails in acm_port_activate(), error handling code calls usb_autopm_put_interface() while it is already called before acm_submit_read_urbs(). The patch reorganizes error handling code to avoid double decrement of USB interface's PM-usage counter. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov Acked-by: Oliver Neukum Signed-off-by: Greg Kroah-Hartman Signed-off-by: Willy Tarreau commit 7b1a80c8b132b2bebe9193d2a9325d1c20fe4063 Author: Felipe Balbi Date: Wed Sep 28 12:33:31 2016 +0300 usb: gadget: composite: always set ep->mult to a sensible value commit eaa496ffaaf19591fe471a36cef366146eeb9153 upstream. ep->mult is supposed to be set to Isochronous and Interrupt Endapoint's multiplier value. This value is computed from different places depending on the link speed. If we're dealing with HighSpeed, then it's part of bits [12:11] of wMaxPacketSize. This case wasn't taken into consideration before. While at that, also make sure the ep->mult defaults to one so drivers can use it unconditionally and assume they'll never multiply ep->maxpacket to zero. Signed-off-by: Felipe Balbi Signed-off-by: Willy Tarreau commit a1bf279fac23144553a60a4d30ae60fea718e07f Author: Johan Hovold Date: Tue Jan 3 16:39:46 2017 +0100 USB: serial: io_ti: bind to interface after fw download commit e35d6d7c4e6532a89732cf4bace0e910ee684c88 upstream. Bind to the interface, but do not register any ports, after having downloaded the firmware. The device will still disconnect and re-enumerate, but this way we avoid an error messages from being logged as part of the process: io_ti: probe of 1-1.3:1.0 failed with error -5 Signed-off-by: Johan Hovold Signed-off-by: Willy Tarreau commit e233c671b6a91ecf0149d8fec0c1baa6b8d628c6 Author: Mathias Nyman Date: Tue Jan 3 18:28:43 2017 +0200 xhci: free xhci virtual devices with leaf nodes first commit ee8665e28e8d90ce69d4abe5a469c14a8707ae0e upstream. the tt_info provided by a HS hub might be in use to by a child device Make sure we free the devices in the correct order. This is needed in special cases such as when xhci controller is reset when resuming from hibernate, and all virt_devices are freed. Also free the virt_devices starting from max slot_id as children more commonly have higher slot_id than parent. Reported-by: Guenter Roeck Tested-by: Guenter Roeck Signed-off-by: Mathias Nyman Signed-off-by: Willy Tarreau commit bde22e32dfe73c1463a43dceb66b1a72714e0e2d Author: Alan Stern Date: Fri Dec 9 15:24:24 2016 -0500 USB: gadgetfs: fix checks of wTotalLength in config descriptors commit 1c069b057dcf64fada952eaa868d35f02bb0cfc2 upstream. Andrey Konovalov's fuzz testing of gadgetfs showed that we should improve the driver's checks for valid configuration descriptors passed in by the user. In particular, the driver needs to verify that the wTotalLength value in the descriptor is not too short (smaller than USB_DT_CONFIG_SIZE). And the check for whether wTotalLength is too large has to be changed, because the driver assumes there is always enough room remaining in the buffer to hold a device descriptor (at least USB_DT_DEVICE_SIZE bytes). This patch adds the additional check and fixes the existing check. It may do a little more than strictly necessary, but one extra check won't hurt. Signed-off-by: Alan Stern CC: Andrey Konovalov Signed-off-by: Felipe Balbi Signed-off-by: Willy Tarreau commit 6ccec4b1ed141625123261ac2e7b9ce849698aac Author: Alan Stern Date: Fri Dec 9 15:18:43 2016 -0500 USB: gadgetfs: fix use-after-free bug commit add333a81a16abbd4f106266a2553677a165725f upstream. Andrey Konovalov reports that fuzz testing with syzkaller causes a KASAN use-after-free bug report in gadgetfs: BUG: KASAN: use-after-free in gadgetfs_setup+0x208a/0x20e0 at addr ffff88003dfe5bf2 Read of size 2 by task syz-executor0/22994 CPU: 3 PID: 22994 Comm: syz-executor0 Not tainted 4.9.0-rc7+ #16 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 ffff88006df06a18 ffffffff81f96aba ffffffffe0528500 1ffff1000dbe0cd6 ffffed000dbe0cce ffff88006df068f0 0000000041b58ab3 ffffffff8598b4c8 ffffffff81f96828 1ffff1000dbe0ccd ffff88006df06708 ffff88006df06748 Call Trace: [ 201.343209] [< inline >] __dump_stack lib/dump_stack.c:15 [ 201.343209] [] dump_stack+0x292/0x398 lib/dump_stack.c:51 [] kasan_object_err+0x1c/0x70 mm/kasan/report.c:159 [< inline >] print_address_description mm/kasan/report.c:197 [] kasan_report_error+0x1f0/0x4e0 mm/kasan/report.c:286 [< inline >] kasan_report mm/kasan/report.c:306 [] __asan_report_load_n_noabort+0x3a/0x40 mm/kasan/report.c:337 [< inline >] config_buf drivers/usb/gadget/legacy/inode.c:1298 [] gadgetfs_setup+0x208a/0x20e0 drivers/usb/gadget/legacy/inode.c:1368 [] dummy_timer+0x11f0/0x36d0 drivers/usb/gadget/udc/dummy_hcd.c:1858 [] call_timer_fn+0x241/0x800 kernel/time/timer.c:1308 [< inline >] expire_timers kernel/time/timer.c:1348 [] __run_timers+0xa06/0xec0 kernel/time/timer.c:1641 [] run_timer_softirq+0x21/0x80 kernel/time/timer.c:1654 [] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284 The cause of the bug is subtle. The dev_config() routine gets called twice by the fuzzer. The first time, the user data contains both a full-speed configuration descriptor and a high-speed config descriptor, causing dev->hs_config to be set. But it also contains an invalid device descriptor, so the buffer containing the descriptors is deallocated and dev_config() returns an error. The second time dev_config() is called, the user data contains only a full-speed config descriptor. But dev->hs_config still has the stale pointer remaining from the first call, causing the routine to think that there is a valid high-speed config. Later on, when the driver dereferences the stale pointer to copy that descriptor, we get a use-after-free access. The fix is simple: Clear dev->hs_config if the passed-in data does not contain a high-speed config descriptor. Signed-off-by: Alan Stern Reported-by: Andrey Konovalov Tested-by: Andrey Konovalov Signed-off-by: Felipe Balbi Signed-off-by: Willy Tarreau commit 66e87b46965c4208dbfa3404f507ea8db56e33ca Author: Alan Stern Date: Fri Dec 9 15:17:46 2016 -0500 USB: gadgetfs: fix unbounded memory allocation bug commit faab50984fe6636e616c7cc3d30308ba391d36fd upstream. Andrey Konovalov reports that fuzz testing with syzkaller causes a KASAN warning in gadgetfs: BUG: KASAN: slab-out-of-bounds in dev_config+0x86f/0x1190 at addr ffff88003c47e160 Write of size 65537 by task syz-executor0/6356 CPU: 3 PID: 6356 Comm: syz-executor0 Not tainted 4.9.0-rc7+ #19 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 ffff88003c107ad8 ffffffff81f96aba ffffffff3dc11ef0 1ffff10007820eee ffffed0007820ee6 ffff88003dc11f00 0000000041b58ab3 ffffffff8598b4c8 ffffffff81f96828 ffffffff813fb4a0 ffff88003b6eadc0 ffff88003c107738 Call Trace: [< inline >] __dump_stack lib/dump_stack.c:15 [] dump_stack+0x292/0x398 lib/dump_stack.c:51 [] kasan_object_err+0x1c/0x70 mm/kasan/report.c:159 [< inline >] print_address_description mm/kasan/report.c:197 [] kasan_report_error+0x1f0/0x4e0 mm/kasan/report.c:286 [] kasan_report+0x35/0x40 mm/kasan/report.c:306 [< inline >] check_memory_region_inline mm/kasan/kasan.c:308 [] check_memory_region+0x139/0x190 mm/kasan/kasan.c:315 [] kasan_check_write+0x14/0x20 mm/kasan/kasan.c:326 [< inline >] copy_from_user arch/x86/include/asm/uaccess.h:689 [< inline >] ep0_write drivers/usb/gadget/legacy/inode.c:1135 [] dev_config+0x86f/0x1190 drivers/usb/gadget/legacy/inode.c:1759 [] __vfs_write+0x5d5/0x760 fs/read_write.c:510 [] vfs_write+0x170/0x4e0 fs/read_write.c:560 [< inline >] SYSC_write fs/read_write.c:607 [] SyS_write+0xfb/0x230 fs/read_write.c:599 [] entry_SYSCALL_64_fastpath+0x1f/0xc2 Indeed, there is a comment saying that the value of len is restricted to a 16-bit integer, but the code doesn't actually do this. This patch fixes the warning. It replaces the comment with a computation that forces the amount of data copied from the user in ep0_write() to be no larger than the wLength size for the control transfer, which is a 16-bit quantity. Signed-off-by: Alan Stern Reported-by: Andrey Konovalov Tested-by: Andrey Konovalov Signed-off-by: Felipe Balbi Signed-off-by: Willy Tarreau commit 133caaf57f3b8b57db1aa3fca3d343f3672fdf1a Author: Greg Kroah-Hartman Date: Tue Dec 6 08:36:29 2016 +0100 usb: gadgetfs: restrict upper bound on device configuration size commit 0994b0a257557e18ee8f0b7c5f0f73fe2b54eec1 upstream. Andrey Konovalov reported that we were not properly checking the upper limit before of a device configuration size before calling memdup_user(), which could cause some problems. So set the upper limit to PAGE_SIZE * 4, which should be good enough for all devices. Reported-by: Andrey Konovalov Signed-off-by: Felipe Balbi Signed-off-by: Willy Tarreau commit 7563f274e645922cb7475fd7e15eb86d06026ef8 Author: Con Kolivas Date: Fri Dec 9 15:15:57 2016 +1100 ALSA: usb-audio: Add QuickCam Communicate Deluxe/S7500 to volume_control_quirks commit 82ffb6fc637150b279f49e174166d2aa3853eaf4 upstream. The Logitech QuickCam Communicate Deluxe/S7500 microphone fails with the following warning. [ 6.778995] usb 2-1.2.2.2: Warning! Unlikely big volume range (=3072), cval->res is probably wrong. [ 6.778996] usb 2-1.2.2.2: [5] FU [Mic Capture Volume] ch = 1, val = 4608/7680/1 Adding it to the list of devices in volume_control_quirks makes it work properly, fixing related typo. Signed-off-by: Con Kolivas Signed-off-by: Takashi Iwai Signed-off-by: Willy Tarreau commit 42860dabff821117bdb1716a938f4e6b0d424965 Author: Takashi Iwai Date: Sun Apr 9 10:41:27 2017 +0200 ALSA: seq: Don't break snd_use_lock_sync() loop by timeout commit 4e7655fd4f47c23e5249ea260dc802f909a64611 upstream. The snd_use_lock_sync() (thus its implementation snd_use_lock_sync_helper()) has the 5 seconds timeout to break out of the sync loop. It was introduced from the beginning, just to be "safer", in terms of avoiding the stupid bugs. However, as Ben Hutchings suggested, this timeout rather introduces a potential leak or use-after-free that was apparently fixed by the commit 2d7d54002e39 ("ALSA: seq: Fix race during FIFO resize"): for example, snd_seq_fifo_event_in() -> snd_seq_event_dup() -> copy_from_user() could block for a long time, and snd_use_lock_sync() goes timeout and still leaves the cell at releasing the pool. For fixing such a problem, we remove the break by the timeout while still keeping the warning. Suggested-by: Ben Hutchings Signed-off-by: Takashi Iwai Signed-off-by: Willy Tarreau commit 28e0ebdd57cd3287ac821068364b74c70af3861c Author: Takashi Iwai Date: Fri Mar 24 17:07:57 2017 +0100 ALSA: seq: Fix race during FIFO resize commit 2d7d54002e396c180db0c800c1046f0a3c471597 upstream. When a new event is queued while processing to resize the FIFO in snd_seq_fifo_clear(), it may lead to a use-after-free, as the old pool that is being queued gets removed. For avoiding this race, we need to close the pool to be deleted and sync its usage before actually deleting it. The issue was spotted by syzkaller. Reported-by: Dmitry Vyukov Signed-off-by: Takashi Iwai Signed-off-by: Willy Tarreau commit 2dbb155d5902591f3c87196645b4fd722e540809 Author: Takashi Iwai Date: Tue Mar 21 13:56:04 2017 +0100 ALSA: seq: Fix racy cell insertions during snd_seq_pool_done() commit c520ff3d03f0b5db7146d9beed6373ad5d2a5e0e upstream. When snd_seq_pool_done() is called, it marks the closing flag to refuse the further cell insertions. But snd_seq_pool_done() itself doesn't clear the cells but just waits until all cells are cleared by the caller side. That is, it's racy, and this leads to the endless stall as syzkaller spotted. This patch addresses the racy by splitting the setup of pool->closing flag out of snd_seq_pool_done(), and calling it properly before snd_seq_pool_done(). BugLink: http://lkml.kernel.org/r/CACT4Y+aqqy8bZA1fFieifNxR2fAfFQQABcBHj801+u5ePV0URw@mail.gmail.com Reported-and-tested-by: Dmitry Vyukov Signed-off-by: Takashi Iwai Signed-off-by: Willy Tarreau commit 2dbfb5cbe934be0354e8c726096199ba0b64c109 Author: Takashi Iwai Date: Tue Feb 28 22:15:51 2017 +0100 ALSA: seq: Fix link corruption by event error handling commit f3ac9f737603da80c2da3e84b89e74429836bb6d upstream. The sequencer FIFO management has a bug that may lead to a corruption (shortage) of the cell linked list. When a sequencer client faces an error at the event delivery, it tries to put back the dequeued cell. When the first queue was put back, this forgot the tail pointer tracking, and the link will be screwed up. Although there is no memory corruption, the sequencer client may stall forever at exit while flushing the pending FIFO cells in snd_seq_pool_done(), as spotted by syzkaller. This patch addresses the missing tail pointer tracking at snd_seq_fifo_cell_putback(). Also the patch makes sure to clear the cell->enxt pointer at snd_seq_fifo_event_in() for avoiding a similar mess-up of the FIFO linked list. Reported-by: Dmitry Vyukov Signed-off-by: Takashi Iwai Signed-off-by: Willy Tarreau commit 7a3085a36da3ee4b6b53c9f86d5186aec1caba64 Author: Takashi Iwai Date: Tue Feb 28 14:49:07 2017 +0100 ALSA: timer: Reject user params with too small ticks commit 71321eb3f2d0df4e6c327e0b936eec4458a12054 upstream. When a user sets a too small ticks with a fine-grained timer like hrtimer, the kernel tries to fire up the timer irq too frequently. This may lead to the condensed locks, eventually the kernel spinlock lockup with warnings. For avoiding such a situation, we define a lower limit of the resolution, namely 1ms. When the user passes a too small tick value that results in less than that, the kernel returns -EINVAL now. Reported-by: Dmitry Vyukov Signed-off-by: Takashi Iwai Signed-off-by: Willy Tarreau commit 28567fb4bc02fc0fdb7fd328cddfcd623f5dc65c Author: Takashi Iwai Date: Mon Feb 6 15:09:48 2017 +0100 ALSA: seq: Don't handle loop timeout at snd_seq_pool_done() commit 37a7ea4a9b81f6a864c10a7cb0b96458df5310a3 upstream. snd_seq_pool_done() syncs with closing of all opened threads, but it aborts the wait loop with a timeout, and proceeds to the release resource even if not all threads have been closed. The timeout was 5 seconds, and if you run a crazy stuff, it can exceed easily, and may result in the access of the invalid memory address -- this is what syzkaller detected in a bug report. As a fix, let the code graduate from naiveness, simply remove the loop timeout. BugLink: http://lkml.kernel.org/r/CACT4Y+YdhDV2H5LLzDTJDVF-qiYHUHhtRaW4rbb4gUhTCQB81w@mail.gmail.com Reported-by: Dmitry Vyukov Signed-off-by: Takashi Iwai Signed-off-by: Willy Tarreau commit 6dd5cf43dfa0353f9a1bd7b3e71f4c2703c7042f Author: Takashi Iwai Date: Wed Feb 8 12:35:39 2017 +0100 ALSA: seq: Fix race at creating a queue commit 4842e98f26dd80be3623c4714a244ba52ea096a8 upstream. When a sequencer queue is created in snd_seq_queue_alloc(),it adds the new queue element to the public list before referencing it. Thus the queue might be deleted before the call of snd_seq_queue_use(), and it results in the use-after-free error, as spotted by syzkaller. The fix is to reference the queue object at the right time. Reported-by: Dmitry Vyukov Signed-off-by: Takashi Iwai Signed-off-by: Willy Tarreau commit bbcdcb830eb0d37bcd2bc43c9f2eedcd4a134fce Author: Takashi Iwai Date: Tue Dec 6 16:20:36 2016 +0100 ALSA: hda - Fix up GPIO for ASUS ROG Ranger commit 85bcf96caba8b4a7c0805555638629ba3c67ea0c upstream. ASUS ROG Ranger VIII with ALC1150 codec requires the extra GPIO pin to up for the front panel. Just use the existing fixup for setting up the GPIO pins. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=189411 Signed-off-by: Takashi Iwai Signed-off-by: Willy Tarreau commit d2ab6e524ffb26caf0680aaf2963082d26d2b77d Author: Marc Kleine-Budde Date: Thu Mar 2 12:03:40 2017 +0100 can: usb_8dev: Fix memory leak of priv->cmd_msg_buffer commit 7c42631376306fb3f34d51fda546b50a9b6dd6ec upstream. The priv->cmd_msg_buffer is allocated in the probe function, but never kfree()ed. This patch converts the kzalloc() to resource-managed kzalloc. Signed-off-by: Marc Kleine-Budde Signed-off-by: Willy Tarreau commit d5361ee850ef3e69a92883e46ccbb5ed6ccb9323 Author: Oliver Hartkopp Date: Wed Jan 18 21:30:51 2017 +0100 can: bcm: fix hrtimer/tasklet termination in bcm op removal commit a06393ed03167771246c4c43192d9c264bc48412 upstream. When removing a bcm tx operation either a hrtimer or a tasklet might run. As the hrtimer triggers its associated tasklet and vice versa we need to take care to mutually terminate both handlers. Reported-by: Michael Josenhans Signed-off-by: Oliver Hartkopp Tested-by: Michael Josenhans Signed-off-by: Marc Kleine-Budde Signed-off-by: Willy Tarreau commit 2adddc0a8fc6bccaa475c2cb40c10f11adb80f35 Author: Yegor Yefremov Date: Wed Jan 18 11:35:57 2017 +0100 can: ti_hecc: add missing prepare and unprepare of the clock commit befa60113ce7ea270cb51eada28443ca2756f480 upstream. In order to make the driver work with the common clock framework, this patch converts the clk_enable()/clk_disable() to clk_prepare_enable()/clk_disable_unprepare(). Also add error checking for clk_prepare_enable(). Signed-off-by: Yegor Yefremov Signed-off-by: Marc Kleine-Budde Signed-off-by: Willy Tarreau commit 165cc0330b7cdb071ae8d652905cbc538bdc78ad Author: Einar Jón Date: Fri Aug 12 13:50:41 2016 +0200 can: c_can_pci: fix null-pointer-deref in c_can_start() - set device pointer commit c97c52be78b8463ac5407f1cf1f22f8f6cf93a37 upstream. The priv->device pointer for c_can_pci is never set, but it is used without a NULL check in c_can_start(). Setting it in c_can_pci_probe() like c_can_plat_probe() prevents c_can_pci.ko from crashing, with and without CONFIG_PM. This might also cause the pm_runtime_*() functions in c_can.c to actually be executed for c_can_pci devices - they are the only other place where priv->device is used, but they all contain a null check. Signed-off-by: Einar Jón Signed-off-by: Marc Kleine-Budde Signed-off-by: Willy Tarreau commit 3e7b58a3c45e296eefe6ae0b822c7c7ee95dfc78 Author: 추지호 Date: Thu Dec 8 12:01:13 2016 +0000 can: peak: fix bad memory access and free sequence commit b67d0dd7d0dc9e456825447bbeb935d8ef43ea7c upstream. Fix for bad memory access while disconnecting. netdev is freed before private data free, and dev is accessed after freeing netdev. This makes a slub problem, and it raise kernel oops with slub debugger config. Signed-off-by: Jiho Chu Signed-off-by: Marc Kleine-Budde Signed-off-by: Willy Tarreau commit cf7969002086e294d8b1b5377b00d1a8de98a3ab Author: Marc Kleine-Budde Date: Mon Dec 5 11:44:23 2016 +0100 can: raw: raw_setsockopt: limit number of can_filter that can be set commit 332b05ca7a438f857c61a3c21a88489a21532364 upstream. This patch adds a check to limit the number of can_filters that can be set via setsockopt on CAN_RAW sockets. Otherwise allocations > MAX_ORDER are not prevented resulting in a warning. Reference: https://lkml.org/lkml/2016/12/2/230 Reported-by: Andrey Konovalov Tested-by: Andrey Konovalov Signed-off-by: Marc Kleine-Budde Signed-off-by: Willy Tarreau commit 1918581a164b0e44ff403e7d7db073ef93fab1d2 Author: Tariq Saeed Date: Fri Sep 4 15:44:31 2015 -0700 ocfs2: fix BUG_ON() in ocfs2_ci_checkpointed() commit 3d46a44a0c01b15d385ccaae24b56f619613c256 upstream. PID: 614 TASK: ffff882a739da580 CPU: 3 COMMAND: "ocfs2dc" #0 [ffff882ecc3759b0] machine_kexec at ffffffff8103b35d #1 [ffff882ecc375a20] crash_kexec at ffffffff810b95b5 #2 [ffff882ecc375af0] oops_end at ffffffff815091d8 #3 [ffff882ecc375b20] die at ffffffff8101868b #4 [ffff882ecc375b50] do_trap at ffffffff81508bb0 #5 [ffff882ecc375ba0] do_invalid_op at ffffffff810165e5 #6 [ffff882ecc375c40] invalid_op at ffffffff815116fb [exception RIP: ocfs2_ci_checkpointed+208] RIP: ffffffffa0a7e940 RSP: ffff882ecc375cf0 RFLAGS: 00010002 RAX: 0000000000000001 RBX: 000000000000654b RCX: ffff8812dc83f1f8 RDX: 00000000000017d9 RSI: ffff8812dc83f1f8 RDI: ffffffffa0b2c318 RBP: ffff882ecc375d20 R8: ffff882ef6ecfa60 R9: ffff88301f272200 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffffffffff R13: ffff8812dc83f4f0 R14: 0000000000000000 R15: ffff8812dc83f1f8 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #7 [ffff882ecc375d28] ocfs2_check_meta_downconvert at ffffffffa0a7edbd [ocfs2] #8 [ffff882ecc375d38] ocfs2_unblock_lock at ffffffffa0a84af8 [ocfs2] #9 [ffff882ecc375dc8] ocfs2_process_blocked_lock at ffffffffa0a85285 [ocfs2] assert is tripped because the tran is not checkpointed and the lock level is PR. Some time ago, chmod command had been executed. As result, the following call chain left the inode cluster lock in PR state, latter on causing the assert. system_call_fastpath -> my_chmod -> sys_chmod -> sys_fchmodat -> notify_change -> ocfs2_setattr -> posix_acl_chmod -> ocfs2_iop_set_acl -> ocfs2_set_acl -> ocfs2_acl_set_mode Here is how. 1119 int ocfs2_setattr(struct dentry *dentry, struct iattr *attr) 1120 { 1247 ocfs2_inode_unlock(inode, 1); <<< WRONG thing to do. .. 1258 if (!status && attr->ia_valid & ATTR_MODE) { 1259 status = posix_acl_chmod(inode, inode->i_mode); 519 posix_acl_chmod(struct inode *inode, umode_t mode) 520 { .. 539 ret = inode->i_op->set_acl(inode, acl, ACL_TYPE_ACCESS); 287 int ocfs2_iop_set_acl(struct inode *inode, struct posix_acl *acl, ... 288 { 289 return ocfs2_set_acl(NULL, inode, NULL, type, acl, NULL, NULL); 224 int ocfs2_set_acl(handle_t *handle, 225 struct inode *inode, ... 231 { .. 252 ret = ocfs2_acl_set_mode(inode, di_bh, 253 handle, mode); 168 static int ocfs2_acl_set_mode(struct inode *inode, struct buffer_head ... 170 { 183 if (handle == NULL) { >>> BUG: inode lock not held in ex at this point <<< 184 handle = ocfs2_start_trans(OCFS2_SB(inode->i_sb), 185 OCFS2_INODE_UPDATE_CREDITS); ocfs2_setattr.#1247 we unlock and at #1259 call posix_acl_chmod. When we reach ocfs2_acl_set_mode.#181 and do trans, the inode cluster lock is not held in EX mode (it should be). How this could have happended? We are the lock master, were holding lock EX and have released it in ocfs2_setattr.#1247. Note that there are no holders of this lock at this point. Another node needs the lock in PR, and we downconvert from EX to PR. So the inode lock is PR when do the trans in ocfs2_acl_set_mode.#184. The trans stays in core (not flushed to disc). Now another node want the lock in EX, downconvert thread gets kicked (the one that tripped assert abovt), finds an unflushed trans but the lock is not EX (it is PR). If the lock was at EX, it would have flushed the trans ocfs2_ci_checkpointed -> ocfs2_start_checkpoint before downconverting (to NULL) for the request. ocfs2_setattr must not drop inode lock ex in this code path. If it does, takes it again before the trans, say in ocfs2_set_acl, another cluster node can get in between, execute another setattr, overwriting the one in progress on this node, resulting in a mode acl size combo that is a mix of the two. Orabug: 20189959 Signed-off-by: Tariq Saeed Reviewed-by: Mark Fasheh Cc: Joel Becker Cc: Joseph Qi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit 4b93c1da7b6bef1ad8f7ae9a3858bf46c9d2d417 Author: Eric Ren Date: Tue Jan 10 16:57:33 2017 -0800 ocfs2: fix crash caused by stale lvb with fsdlm plugin commit e7ee2c089e94067d68475990bdeed211c8852917 upstream. The crash happens rather often when we reset some cluster nodes while nodes contend fiercely to do truncate and append. The crash backtrace is below: dlm: C21CBDA5E0774F4BA5A9D4F317717495: dlm_recover_grant 1 locks on 971 resources dlm: C21CBDA5E0774F4BA5A9D4F317717495: dlm_recover 9 generation 5 done: 4 ms ocfs2: Begin replay journal (node 318952601, slot 2) on device (253,18) ocfs2: End replay journal (node 318952601, slot 2) on device (253,18) ocfs2: Beginning quota recovery on device (253,18) for slot 2 ocfs2: Finishing quota recovery on device (253,18) for slot 2 (truncate,30154,1):ocfs2_truncate_file:470 ERROR: bug expression: le64_to_cpu(fe->i_size) != i_size_read(inode) (truncate,30154,1):ocfs2_truncate_file:470 ERROR: Inode 290321, inode i_size = 732 != di i_size = 937, i_flags = 0x1 ------------[ cut here ]------------ kernel BUG at /usr/src/linux/fs/ocfs2/file.c:470! invalid opcode: 0000 [#1] SMP Modules linked in: ocfs2_stack_user(OEN) ocfs2(OEN) ocfs2_nodemanager ocfs2_stackglue(OEN) quota_tree dlm(OEN) configfs fuse sd_mod iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi af_packet iscsi_ibft iscsi_boot_sysfs softdog xfs libcrc32c ppdev parport_pc pcspkr parport joydev virtio_balloon virtio_net i2c_piix4 acpi_cpufreq button processor ext4 crc16 jbd2 mbcache ata_generic cirrus virtio_blk ata_piix drm_kms_helper ahci syscopyarea libahci sysfillrect sysimgblt fb_sys_fops ttm floppy libata drm virtio_pci virtio_ring uhci_hcd virtio ehci_hcd usbcore serio_raw usb_common sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4 Supported: No, Unsupported modules are loaded CPU: 1 PID: 30154 Comm: truncate Tainted: G OE N 4.4.21-69-default #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20151112_172657-sheep25 04/01/2014 task: ffff88004ff6d240 ti: ffff880074e68000 task.ti: ffff880074e68000 RIP: 0010:[] [] ocfs2_truncate_file+0x640/0x6c0 [ocfs2] RSP: 0018:ffff880074e6bd50 EFLAGS: 00010282 RAX: 0000000000000074 RBX: 000000000000029e RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000246 RDI: 0000000000000246 RBP: ffff880074e6bda8 R08: 000000003675dc7a R09: ffffffff82013414 R10: 0000000000034c50 R11: 0000000000000000 R12: ffff88003aab3448 R13: 00000000000002dc R14: 0000000000046e11 R15: 0000000000000020 FS: 00007f839f965700(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f839f97e000 CR3: 0000000036723000 CR4: 00000000000006e0 Call Trace: ocfs2_setattr+0x698/0xa90 [ocfs2] notify_change+0x1ae/0x380 do_truncate+0x5e/0x90 do_sys_ftruncate.constprop.11+0x108/0x160 entry_SYSCALL_64_fastpath+0x12/0x6d Code: 24 28 ba d6 01 00 00 48 c7 c6 30 43 62 a0 8b 41 2c 89 44 24 08 48 8b 41 20 48 c7 c1 78 a3 62 a0 48 89 04 24 31 c0 e8 a0 97 f9 ff <0f> 0b 3d 00 fe ff ff 0f 84 ab fd ff ff 83 f8 fc 0f 84 a2 fd ff RIP [] ocfs2_truncate_file+0x640/0x6c0 [ocfs2] It's because ocfs2_inode_lock() get us stale LVB in which the i_size is not equal to the disk i_size. We mistakenly trust the LVB because the underlaying fsdlm dlm_lock() doesn't set lkb_sbflags with DLM_SBF_VALNOTVALID properly for us. But, why? The current code tries to downconvert lock without DLM_LKF_VALBLK flag to tell o2cb don't update RSB's LVB if it's a PR->NULL conversion, even if the lock resource type needs LVB. This is not the right way for fsdlm. The fsdlm plugin behaves different on DLM_LKF_VALBLK, it depends on DLM_LKF_VALBLK to decide if we care about the LVB in the LKB. If DLM_LKF_VALBLK is not set, fsdlm will skip recovering RSB's LVB from this lkb and set the right DLM_SBF_VALNOTVALID appropriately when node failure happens. The following diagram briefly illustrates how this crash happens: RSB1 is inode metadata lock resource with LOCK_TYPE_USES_LVB; The 1st round: Node1 Node2 RSB1: PR RSB1(master): NULL->EX ocfs2_downconvert_lock(PR->NULL, set_lvb==0) ocfs2_dlm_lock(no DLM_LKF_VALBLK) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - dlm_lock(no DLM_LKF_VALBLK) convert_lock(overwrite lkb->lkb_exflags with no DLM_LKF_VALBLK) RSB1: NULL RSB1: EX reset Node2 dlm_recover_rsbs() recover_lvb() /* The LVB is not trustable if the node with EX fails and * no lock >= PR is left. We should set RSB_VALNOTVALID for RSB1. */ if(!(kb_exflags & DLM_LKF_VALBLK)) /* This means we miss the chance to return; * to invalid the LVB here. */ The 2nd round: Node 1 Node2 RSB1(become master from recovery) ocfs2_setattr() ocfs2_inode_lock(NULL->EX) /* dlm_lock() return the stale lvb without setting DLM_SBF_VALNOTVALID */ ocfs2_meta_lvb_is_trustable() return 1 /* so we don't refresh inode from disk */ ocfs2_truncate_file() mlog_bug_on_msg(disk isize != i_size_read(inode)) /* crash! */ The fix is quite straightforward. We keep to set DLM_LKF_VALBLK flag for dlm_lock() if the lock resource type needs LVB and the fsdlm plugin is uesed. Link: http://lkml.kernel.org/r/1481275846-6604-1-git-send-email-zren@suse.com Signed-off-by: Eric Ren Reviewed-by: Joseph Qi Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit 4549e4af6df0499ee53cc873f3de93751902d125 Author: Sachin Prabhu Date: Sun Apr 16 20:37:24 2017 +0100 cifs: Do not send echoes before Negotiate is complete commit 62a6cfddcc0a5313e7da3e8311ba16226fe0ac10 upstream. commit 4fcd1813e640 ("Fix reconnect to not defer smb3 session reconnect long after socket reconnect") added support for Negotiate requests to be initiated by echo calls. To avoid delays in calling echo after a reconnect, I added the patch introduced by the commit b8c600120fc8 ("Call echo service immediately after socket reconnect"). This has however caused a regression with cifs shares which do not have support for echo calls to trigger Negotiate requests. On connections which need to call Negotiation, the echo calls trigger an error which triggers a reconnect which in turn triggers another echo call. This results in a loop which is only broken when an operation is performed on the cifs share. For an idle share, it can DOS a server. The patch uses the smb_operation can_echo() for cifs so that it is called only if connection has been already been setup. kernel bz: 194531 Signed-off-by: Sachin Prabhu Tested-by: Jonathan Liu Acked-by: Pavel Shilovsky Signed-off-by: Steve French Signed-off-by: Willy Tarreau commit a9baa44162bf847ea97ea179e9b37d25a75a8c99 Author: Aurelien Aptel Date: Thu Jan 26 14:25:49 2017 +0100 fs/cifs: make share unaccessible at root level mountable commit a6b5058fafdf508904bbf16c29b24042cef3c496 upstream. if, when mounting //HOST/share/sub/dir/foo we can query /sub/dir/foo but not any of the path components above: - store the /sub/dir/foo prefix in the cifs super_block info - in the superblock, set root dentry to the subpath dentry (instead of the share root) - set a flag in the superblock to remember it - use prefixpath when building path from a dentry fixes bso#8950 Signed-off-by: Aurelien Aptel Reviewed-by: Pavel Shilovsky Signed-off-by: Steve French Signed-off-by: Willy Tarreau commit 60f2c2fa044c5e127bfa01b2c0bae00d7711acc1 Author: Germano Percossi Date: Fri Apr 7 12:29:37 2017 +0100 CIFS: remove bad_network_name flag commit a0918f1ce6a43ac980b42b300ec443c154970979 upstream. STATUS_BAD_NETWORK_NAME can be received during node failover, causing the flag to be set and making the reconnect thread always unsuccessful, thereafter. Once the only place where it is set is removed, the remaining bits are rendered moot. Removing it does not prevent "mount" from failing when a non existent share is passed. What happens when the share really ceases to exist while the share is mounted is undefined now as much as it was before. Signed-off-by: Germano Percossi Reviewed-by: Pavel Shilovsky Signed-off-by: Steve French Signed-off-by: Willy Tarreau commit f9e74c2ad12ad9b0f4135046c3d835de835843ef Author: Pavel Shilovsky Date: Tue Nov 29 16:14:43 2016 -0800 CIFS: Fix a possible memory corruption in push locks commit e3d240e9d505fc67f8f8735836df97a794bbd946 upstream. If maxBuf is not 0 but less than a size of SMB2 lock structure we can end up with a memory corruption. Signed-off-by: Pavel Shilovsky Signed-off-by: Willy Tarreau commit 85beff453d806fa682f4d54430fc0fa96f6d4cd9 Author: Pavel Shilovsky Date: Tue Nov 29 11:30:58 2016 -0800 CIFS: Fix missing nls unload in smb2_reconnect() commit 4772c79599564bd08ee6682715a7d3516f67433f upstream. Acked-by: Sachin Prabhu Signed-off-by: Pavel Shilovsky Signed-off-by: Willy Tarreau commit e008a962311a875a828cbae54b43285858aaa6c8 Author: Pavel Shilovsky Date: Fri Nov 4 11:50:31 2016 -0700 CIFS: Fix a possible memory corruption during reconnect commit 53e0e11efe9289535b060a51d4cf37c25e0d0f2b upstream. We can not unlock/lock cifs_tcp_ses_lock while walking through ses and tcon lists because it can corrupt list iterator pointers and a tcon structure can be released if we don't hold an extra reference. Fix it by moving a reconnect process to a separate delayed work and acquiring a reference to every tcon that needs to be reconnected. Also do not send an echo request on newly established connections. Signed-off-by: Pavel Shilovsky Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit d45256ff3ed6edf2d61caf25b37643eed9001908 Author: colyli@suse.de Date: Sat Jan 28 21:11:49 2017 +0800 md linear: fix a race between linear_add() and linear_congested() commit 03a9e24ef2aaa5f1f9837356aed79c860521407a upstream. Recently I receive a bug report that on Linux v3.0 based kerenl, hot add disk to a md linear device causes kernel crash at linear_congested(). From the crash image analysis, I find in linear_congested(), mddev->raid_disks contains value N, but conf->disks[] only has N-1 pointers available. Then a NULL pointer deference crashes the kernel. There is a race between linear_add() and linear_congested(), RCU stuffs used in these two functions cannot avoid the race. Since Linuv v4.0 RCU code is replaced by introducing mddev_suspend(). After checking the upstream code, it seems linear_congested() is not called in generic_make_request() code patch, so mddev_suspend() cannot provent it from being called. The possible race still exists. Here I explain how the race still exists in current code. For a machine has many CPUs, on one CPU, linear_add() is called to add a hard disk to a md linear device; at the same time on other CPU, linear_congested() is called to detect whether this md linear device is congested before issuing an I/O request onto it. Now I use a possible code execution time sequence to demo how the possible race happens, seq linear_add() linear_congested() 0 conf=mddev->private 1 oldconf=mddev->private 2 mddev->raid_disks++ 3 for (i=0; iraid_disks;i++) 4 bdev_get_queue(conf->disks[i].rdev->bdev) 5 mddev->private=newconf In linear_add() mddev->raid_disks is increased in time seq 2, and on another CPU in linear_congested() the for-loop iterates conf->disks[i] by the increased mddev->raid_disks in time seq 3,4. But conf with one more element (which is a pointer to struct dev_info type) to conf->disks[] is not updated yet, accessing its structure member in time seq 4 will cause a NULL pointer deference fault. To fix this race, there are 2 parts of modification in the patch, 1) Add 'int raid_disks' in struct linear_conf, as a copy of mddev->raid_disks. It is initialized in linear_conf(), always being consistent with pointers number of 'struct dev_info disks[]'. When iterating conf->disks[] in linear_congested(), use conf->raid_disks to replace mddev->raid_disks in the for-loop, then NULL pointer deference will not happen again. 2) RCU stuffs are back again, and use kfree_rcu() in linear_add() to free oldconf memory. Because oldconf may be referenced as mddev->private in linear_congested(), kfree_rcu() makes sure that its memory will not be released until no one uses it any more. Also some code comments are added in this patch, to make this modification to be easier understandable. This patch can be applied for kernels since v4.0 after commit: 3be260cc18f8 ("md/linear: remove rcu protections in favour of suspend/resume"). But this bug is reported on Linux v3.0 based kernel, for people who maintain kernels before Linux v4.0, they need to do some back back port to this patch. Changelog: - V3: add 'int raid_disks' in struct linear_conf, and use kfree_rcu() to replace rcu_call() in linear_add(). - v2: add RCU stuffs by suggestion from Shaohua and Neil. - v1: initial effort. Signed-off-by: Coly Li Cc: Shaohua Li Cc: Neil Brown Signed-off-by: Shaohua Li Signed-off-by: Willy Tarreau commit 7a56cc091820dc84e1260a74ec316d4a4b12d372 Author: Wei Fang Date: Mon Mar 21 19:18:32 2016 +0800 md:raid1: fix a dead loop when read from a WriteMostly disk commit 816b0acf3deb6d6be5d0519b286fdd4bafade905 upstream. If first_bad == this_sector when we get the WriteMostly disk in read_balance(), valid disk will be returned with zero max_sectors. It'll lead to a dead loop in make_request(), and OOM will happen because of endless allocation of struct bio. Since we can't get data from this disk in this case, so continue for another disk. Signed-off-by: Wei Fang Signed-off-by: Shaohua Li Cc: Julia Lawall Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 523d7696aa00e1a7417a425b0c8f53b22c4d65cd Author: Konstantin Khlebnikov Date: Sun Nov 27 19:32:32 2016 +0300 md/raid5: limit request size according to implementation limits commit e8d7c33232e5fdfa761c3416539bc5b4acd12db5 upstream. Current implementation employ 16bit counter of active stripes in lower bits of bio->bi_phys_segments. If request is big enough to overflow this counter bio will be completed and freed too early. Fortunately this not happens in default configuration because several other limits prevent that: stripe_cache_size * nr_disks effectively limits count of active stripes. And small max_sectors_kb at lower disks prevent that during normal read/write operations. Overflow easily happens in discard if it's enabled by module parameter "devices_handle_discard_safely" and stripe_cache_size is set big enough. This patch limits requests size with 256Mb - 8Kb to prevent overflows. Signed-off-by: Konstantin Khlebnikov Cc: Shaohua Li Cc: Neil Brown Signed-off-by: Shaohua Li Signed-off-by: Willy Tarreau commit 8255960354cc202bf6bc714ccd7879a202d468d5 Author: Benjamin Marzinski Date: Wed Nov 30 17:56:14 2016 -0600 dm space map metadata: fix 'struct sm_metadata' leak on failed create commit 314c25c56c1ee5026cf99c570bdfe01847927acb upstream. In dm_sm_metadata_create() we temporarily change the dm_space_map operations from 'ops' (whose .destroy function deallocates the sm_metadata) to 'bootstrap_ops' (whose .destroy function doesn't). If dm_sm_metadata_create() fails in sm_ll_new_metadata() or sm_ll_extend(), it exits back to dm_tm_create_internal(), which calls dm_sm_destroy() with the intention of freeing the sm_metadata, but it doesn't (because the dm_space_map operations is still set to 'bootstrap_ops'). Fix this by setting the dm_space_map operations back to 'ops' if dm_sm_metadata_create() fails when it is set to 'bootstrap_ops'. [js] no nr_blocks test in 3.12 yet Signed-off-by: Benjamin Marzinski Acked-by: Joe Thornber Signed-off-by: Mike Snitzer Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 3044e1958de35175526af69f0bba20d415229744 Author: Ondrej Kozina Date: Wed Nov 2 15:02:08 2016 +0100 dm crypt: mark key as invalid until properly loaded commit 265e9098bac02bc5e36cda21fdbad34cb5b2f48d upstream. In crypt_set_key(), if a failure occurs while replacing the old key (e.g. tfm->setkey() fails) the key must not have DM_CRYPT_KEY_VALID flag set. Otherwise, the crypto layer would have an invalid key that still has DM_CRYPT_KEY_VALID flag set. Signed-off-by: Ondrej Kozina Reviewed-by: Mikulas Patocka Signed-off-by: Mike Snitzer Signed-off-by: Willy Tarreau commit 1dd3d3e635c3b9538d693a616b150ddeb3662963 Author: Dan Williams Date: Tue Dec 29 14:02:29 2015 -0800 block: fix del_gendisk() vs blkdev_ioctl crash commit ac34f15e0c6d2fd58480052b6985f6991fb53bcc upstream. When tearing down a block device early in its lifetime, userspace may still be performing discovery actions like blkdev_ioctl() to re-read partitions. The nvdimm_revalidate_disk() implementation depends on disk->driverfs_dev to be valid at entry. However, it is set to NULL in del_gendisk() and fatally this is happening *before* the disk device is deleted from userspace view. There's no reason for del_gendisk() to clear ->driverfs_dev. That device is the parent of the disk. It is guaranteed to not be freed until the disk, as a child, drops its ->parent reference. We could also fix this issue locally in nvdimm_revalidate_disk() by using disk_to_dev(disk)->parent, but lets fix it globally since ->driverfs_dev follows the lifetime of the parent. Longer term we should probably just add a @parent parameter to add_disk(), and stop carrying this pointer in the gendisk. BUG: unable to handle kernel NULL pointer dereference at (null) IP: [] nvdimm_revalidate_disk+0x18/0x90 [libnvdimm] CPU: 2 PID: 538 Comm: systemd-udevd Tainted: G O 4.4.0-rc5 #2257 [..] Call Trace: [] rescan_partitions+0x87/0x2c0 [] ? __lock_is_held+0x49/0x70 [] __blkdev_reread_part+0x72/0xb0 [] blkdev_reread_part+0x25/0x40 [] blkdev_ioctl+0x4fd/0x9c0 [] ? current_kernel_time64+0x69/0xd0 [] block_ioctl+0x3d/0x50 [] do_vfs_ioctl+0x308/0x560 [] ? __audit_syscall_entry+0xb1/0x100 [] ? do_audit_syscall_entry+0x66/0x70 [] SyS_ioctl+0x79/0x90 [] entry_SYSCALL_64_fastpath+0x12/0x76 Cc: Jan Kara Cc: Jens Axboe Reported-by: Robert Hu Signed-off-by: Dan Williams Signed-off-by: Willy Tarreau commit 5cb0174119e4743c9f4b8f17115a8a2386c0f4a2 Author: Mauricio Faria de Oliveira Date: Sat Mar 25 21:48:14 2017 +0530 block: allow WRITE_SAME commands with the SG_IO ioctl commit 25cdb64510644f3e854d502d69c73f21c6df88a9 upstream. The WRITE_SAME commands are not present in the blk_default_cmd_filter write_ok list, and thus are failed with -EPERM when the SG_IO ioctl() is executed without CAP_SYS_RAWIO capability (e.g., unprivileged users). [ sg_io() -> blk_fill_sghdr_rq() > blk_verify_command() -> -EPERM ] The problem can be reproduced with the sg_write_same command # sg_write_same --num 1 --xferlen 512 /dev/sda # # capsh --drop=cap_sys_rawio -- -c \ 'sg_write_same --num 1 --xferlen 512 /dev/sda' Write same: pass through os error: Operation not permitted # For comparison, the WRITE_VERIFY command does not observe this problem, since it is in that list: # capsh --drop=cap_sys_rawio -- -c \ 'sg_write_verify --num 1 --ilen 512 --lba 0 /dev/sda' # So, this patch adds the WRITE_SAME commands to the list, in order for the SG_IO ioctl to finish successfully: # capsh --drop=cap_sys_rawio -- -c \ 'sg_write_same --num 1 --xferlen 512 /dev/sda' # That case happens to be exercised by QEMU KVM guests with 'scsi-block' devices (qemu "-device scsi-block" [1], libvirt "" [2]), which employs the SG_IO ioctl() and runs as an unprivileged user (libvirt-qemu). In that scenario, when a filesystem (e.g., ext4) performs its zero-out calls, which are translated to write-same calls in the guest kernel, and then into SG_IO ioctls to the host kernel, SCSI I/O errors may be observed in the guest: [...] sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [...] sd 0:0:0:0: [sda] tag#0 Sense Key : Aborted Command [current] [...] sd 0:0:0:0: [sda] tag#0 Add. Sense: I/O process terminated [...] sd 0:0:0:0: [sda] tag#0 CDB: Write Same(10) 41 00 01 04 e0 78 00 00 08 00 [...] blk_update_request: I/O error, dev sda, sector 17096824 Links: [1] http://git.qemu.org/?p=qemu.git;a=commit;h=336a6915bc7089fb20fea4ba99972ad9a97c5f52 [2] https://libvirt.org/formatdomain.html#elementsDisks (see 'disk' -> 'device') Signed-off-by: Mauricio Faria de Oliveira Signed-off-by: Brahadambal Srinivasan Reported-by: Manjunatha H R Reviewed-by: Christoph Hellwig Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sumit Semwal Signed-off-by: Willy Tarreau commit 0f3a4aaafa82dfbc98606e56453fabbd6c27a3ce Author: Omar Sandoval Date: Fri Jul 1 00:39:35 2016 -0700 block: fix use-after-free in sys_ioprio_get() commit 8ba8682107ee2ca3347354e018865d8e1967c5f4 upstream. get_task_ioprio() accesses the task->io_context without holding the task lock and thus can race with exit_io_context(), leading to a use-after-free. The reproducer below hits this within a few seconds on my 4-core QEMU VM: int main(int argc, char **argv) { pid_t pid, child; long nproc, i; /* ioprio_set(IOPRIO_WHO_PROCESS, 0, IOPRIO_PRIO_VALUE(IOPRIO_CLASS_IDLE, 0)); */ syscall(SYS_ioprio_set, 1, 0, 0x6000); nproc = sysconf(_SC_NPROCESSORS_ONLN); for (i = 0; i < nproc; i++) { pid = fork(); assert(pid != -1); if (pid == 0) { for (;;) { pid = fork(); assert(pid != -1); if (pid == 0) { _exit(0); } else { child = wait(NULL); assert(child == pid); } } } pid = fork(); assert(pid != -1); if (pid == 0) { for (;;) { /* ioprio_get(IOPRIO_WHO_PGRP, 0); */ syscall(SYS_ioprio_get, 2, 0); } } } for (;;) { /* ioprio_get(IOPRIO_WHO_PGRP, 0); */ syscall(SYS_ioprio_get, 2, 0); } return 0; } This gets us KASAN dumps like this: [ 35.526914] ================================================================== [ 35.530009] BUG: KASAN: out-of-bounds in get_task_ioprio+0x7b/0x90 at addr ffff880066f34e6c [ 35.530009] Read of size 2 by task ioprio-gpf/363 [ 35.530009] ============================================================================= [ 35.530009] BUG blkdev_ioc (Not tainted): kasan: bad access detected [ 35.530009] ----------------------------------------------------------------------------- [ 35.530009] Disabling lock debugging due to kernel taint [ 35.530009] INFO: Allocated in create_task_io_context+0x2b/0x370 age=0 cpu=0 pid=360 [ 35.530009] ___slab_alloc+0x55d/0x5a0 [ 35.530009] __slab_alloc.isra.20+0x2b/0x40 [ 35.530009] kmem_cache_alloc_node+0x84/0x200 [ 35.530009] create_task_io_context+0x2b/0x370 [ 35.530009] get_task_io_context+0x92/0xb0 [ 35.530009] copy_process.part.8+0x5029/0x5660 [ 35.530009] _do_fork+0x155/0x7e0 [ 35.530009] SyS_clone+0x19/0x20 [ 35.530009] do_syscall_64+0x195/0x3a0 [ 35.530009] return_from_SYSCALL_64+0x0/0x6a [ 35.530009] INFO: Freed in put_io_context+0xe7/0x120 age=0 cpu=0 pid=1060 [ 35.530009] __slab_free+0x27b/0x3d0 [ 35.530009] kmem_cache_free+0x1fb/0x220 [ 35.530009] put_io_context+0xe7/0x120 [ 35.530009] put_io_context_active+0x238/0x380 [ 35.530009] exit_io_context+0x66/0x80 [ 35.530009] do_exit+0x158e/0x2b90 [ 35.530009] do_group_exit+0xe5/0x2b0 [ 35.530009] SyS_exit_group+0x1d/0x20 [ 35.530009] entry_SYSCALL_64_fastpath+0x1a/0xa4 [ 35.530009] INFO: Slab 0xffffea00019bcd00 objects=20 used=4 fp=0xffff880066f34ff0 flags=0x1fffe0000004080 [ 35.530009] INFO: Object 0xffff880066f34e58 @offset=3672 fp=0x0000000000000001 [ 35.530009] ================================================================== Fix it by grabbing the task lock while we poke at the io_context. Reported-by: Dmitry Vyukov Signed-off-by: Omar Sandoval Signed-off-by: Jens Axboe Acked-by: Johannes Thumshirn Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit cd0d925440c191543502b5fa5cc6c6243e601aca Author: Daeho Jeong Date: Thu Dec 1 11:49:12 2016 -0500 ext4: fix inode checksum calculation problem if i_extra_size is small commit 05ac5aa18abd7db341e54df4ae2b4c98ea0e43b7 upstream. We've fixed the race condition problem in calculating ext4 checksum value in commit b47820edd163 ("ext4: avoid modifying checksum fields directly during checksum veficationon"). However, by this change, when calculating the checksum value of inode whose i_extra_size is less than 4, we couldn't calculate the checksum value in a proper way. This problem was found and reported by Nix, Thank you. Reported-by: Nix Signed-off-by: Daeho Jeong Signed-off-by: Youngjin Gil Signed-off-by: Darrick J. Wong Signed-off-by: Theodore Ts'o Signed-off-by: Willy Tarreau commit 48a5889bfdf956a7426df6c07f6eda8602e6a918 Author: Theodore Ts'o Date: Sun Feb 5 01:26:48 2017 -0500 ext4: return EROFS if device is r/o and journal replay is needed commit 4753d8a24d4588657bc0a4cd66d4e282dff15c8c upstream. If the file system requires journal recovery, and the device is read-ony, return EROFS to the mount system call. This allows xfstests generic/050 to pass. Signed-off-by: Theodore Ts'o Signed-off-by: Willy Tarreau commit 399562b67718b4679e0f228131811b1c1725cf10 Author: Theodore Ts'o Date: Sat Feb 4 23:38:06 2017 -0500 ext4: preserve the needs_recovery flag when the journal is aborted commit 97abd7d4b5d9c48ec15c425485f054e1c15e591b upstream. If the journal is aborted, the needs_recovery feature flag should not be removed. Otherwise, it's the journal might not get replayed and this could lead to more data getting lost. Signed-off-by: Theodore Ts'o Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit 98f58e05231f835dfb09359e3b5e3a886fe8f189 Author: Jan Kara Date: Fri Jan 27 14:34:30 2017 -0500 ext4: trim allocation requests to group size commit cd648b8a8fd5071d232242d5ee7ee3c0815776af upstream. If filesystem groups are artifically small (using parameter -g to mkfs.ext4), ext4_mb_normalize_request() can result in a request that is larger than a block group. Trim the request size to not confuse allocation code. Reported-by: "Kirill A. Shutemov" Signed-off-by: Jan Kara Signed-off-by: Theodore Ts'o Signed-off-by: Willy Tarreau commit 77bd57e66e071d866323132e1250350d7c90159f Author: Theodore Ts'o Date: Wed Feb 15 01:26:39 2017 -0500 ext4: fix fencepost in s_first_meta_bg validation commit 2ba3e6e8afc9b6188b471f27cf2b5e3cf34e7af2 upstream. It is OK for s_first_meta_bg to be equal to the number of block group descriptor blocks. (It rarely happens, but it shouldn't cause any problems.) https://bugzilla.kernel.org/show_bug.cgi?id=194567 Fixes: 3a4b77cd47bb837b8557595ec7425f281f2ca1fe Signed-off-by: Theodore Ts'o Signed-off-by: Willy Tarreau commit 45f1a95e8e169b8fc67d77b603ed2615301626f3 Author: Theodore Ts'o Date: Sat Feb 4 23:14:19 2017 -0500 jbd2: don't leak modified metadata buffers on an aborted journal commit e112666b4959b25a8552d63bc564e1059be703e8 upstream. If the journal has been aborted, we shouldn't mark the underlying buffer head as dirty, since that will cause the metadata block to get modified. And if the journal has been aborted, we shouldn't allow this since it will almost certainly lead to a corrupted file system. Signed-off-by: Theodore Ts'o Signed-off-by: Willy Tarreau commit 188b2ebb367591a1841825294f21f30186743186 Author: Eryu Guan Date: Thu Dec 1 15:08:37 2016 -0500 ext4: validate s_first_meta_bg at mount time commit 3a4b77cd47bb837b8557595ec7425f281f2ca1fe upstream. Ralf Spenneberg reported that he hit a kernel crash when mounting a modified ext4 image. And it turns out that kernel crashed when calculating fs overhead (ext4_calculate_overhead()), this is because the image has very large s_first_meta_bg (debug code shows it's 842150400), and ext4 overruns the memory in count_overhead() when setting bitmap buffer, which is PAGE_SIZE. ext4_calculate_overhead(): buf = get_zeroed_page(GFP_NOFS); <=== PAGE_SIZE buffer blks = count_overhead(sb, i, buf); count_overhead(): for (j = ext4_bg_num_gdb(sb, grp); j > 0; j--) { <=== j = 842150400 ext4_set_bit(EXT4_B2C(sbi, s++), buf); <=== buffer overrun count++; } This can be reproduced easily for me by this script: #!/bin/bash rm -f fs.img mkdir -p /mnt/ext4 fallocate -l 16M fs.img mke2fs -t ext4 -O bigalloc,meta_bg,^resize_inode -F fs.img debugfs -w -R "ssv first_meta_bg 842150400" fs.img mount -o loop fs.img /mnt/ext4 Fix it by validating s_first_meta_bg first at mount time, and refusing to mount if its value exceeds the largest possible meta_bg number. [js] use EXT4_HAS_INCOMPAT_FEATURE instead of new ext4_has_feature_meta_bg Reported-by: Ralf Spenneberg Signed-off-by: Eryu Guan Signed-off-by: Theodore Ts'o Reviewed-by: Andreas Dilger Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit d61f4e22b4d2946ffb60e2ea61716681f247076c Author: Theodore Ts'o Date: Fri Nov 18 13:37:47 2016 -0500 ext4: add sanity checking to count_overhead() commit c48ae41bafe31e9a66d8be2ced4e42a6b57fa814 upstream. The commit "ext4: sanity check the block and cluster size at mount time" should prevent any problems, but in case the superblock is modified while the file system is mounted, add an extra safety check to make sure we won't overrun the allocated buffer. Signed-off-by: Theodore Ts'o Signed-off-by: Willy Tarreau commit bd652ad14e9b9839f05f95117a00906af97b833f Author: Theodore Ts'o Date: Fri Nov 18 13:24:26 2016 -0500 ext4: fix in-superblock mount options processing commit 5aee0f8a3f42c94c5012f1673420aee96315925a upstream. Fix a large number of problems with how we handle mount options in the superblock. For one, if the string in the superblock is long enough that it is not null terminated, we could run off the end of the string and try to interpret superblocks fields as characters. It's unlikely this will cause a security problem, but it could result in an invalid parse. Also, parse_options is destructive to the string, so in some cases if there is a comma-separated string, it would be modified in the superblock. (Fortunately it only happens on file systems with a 1k block size.) Signed-off-by: Theodore Ts'o Signed-off-by: Willy Tarreau commit 408d8245b80b5426d670de4b4ec60f24b05eb0ef Author: Theodore Ts'o Date: Fri Nov 18 13:28:30 2016 -0500 ext4: use more strict checks for inodes_per_block on mount commit cd6bb35bf7f6d7d922509bf50265383a0ceabe96 upstream. Centralize the checks for inodes_per_block and be more strict to make sure the inodes_per_block_group can't end up being zero. Signed-off-by: Theodore Ts'o Reviewed-by: Andreas Dilger Signed-off-by: Jiri Slaby Signed-off-by: Willy Tarreau commit de714a8a75b6985452fe6e5d3f6393eb1da2eb3d Author: Liu Bo Date: Wed Aug 3 12:33:01 2016 -0700 Btrfs: fix memory leak in reading btree blocks commit 2571e739677f1e4c0c63f5ed49adcc0857923625 upstream. So we can read a btree block via readahead or intentional read, and we can end up with a memory leak when something happens as follows, 1) readahead starts to read block A but does not wait for read completion, 2) btree_readpage_end_io_hook finds that block A is corrupted, and it needs to clear all block A's pages' uptodate bit. 3) meanwhile an intentional read kicks in and checks block A's pages' uptodate to decide which page needs to be read. 4) when some pages have the uptodate bit during 3)'s check so 3) doesn't count them for eb->io_pages, but they are later cleared by 2) so we has to readpage on the page, we get the wrong eb->io_pages which results in a memory leak of this block. This fixes the problem by firstly getting all pages's locking and then checking pages' uptodate bit. t1(readahead) t2(readahead endio) t3(the following read) read_extent_buffer_pages end_bio_extent_readpage for pg in eb: for page 0,1,2 in eb: if pg is uptodate: btree_readpage_end_io_hook(pg) num_reads++ if uptodate: eb->io_pages = num_reads SetPageUptodate(pg) _______________ for pg in eb: for page 3 in eb: read_extent_buffer_pages if pg is NOT uptodate: btree_readpage_end_io_hook(pg) for pg in eb: __extent_read_full_page(pg) sanity check reports something wrong if pg is uptodate: clear_extent_buffer_uptodate(eb) num_reads++ for pg in eb: eb->io_pages = num_reads ClearPageUptodate(page) _______________ for pg in eb: if pg is NOT uptodate: __extent_read_full_page(pg) So t3's eb->io_pages is not consistent with the number of pages it's reading, and during endio(), atomic_dec_and_test(&eb->io_pages) will get a negative number so that we're not able to free the eb. Signed-off-by: Liu Bo Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Willy Tarreau commit 0ee88216358c4a1821022763cb84d2b0550bf1e2 Author: Jeff Mahoney Date: Fri Dec 2 22:21:55 2016 -0500 Revert "Btrfs: don't delay inode ref updates during log, replay" commit 081fafddc3ff1e86e36024b0177c08e340b19a12 upstream. This reverts commit 644d10716875b24388680925d6c7502420987bfe, upstream commit 6f8960541b1eb6054a642da48daae2320fddba93. The original patch for mainline, 6f8960541b1 (Btrfs: don't delay inode ref updates during log replay) lists 1d52c78afbb (Btrfs: try not to ENOSPC on log replay) as the only pre-3.18 dependency, but it also depends on 67de11769bd (Btrfs: introduce the delayed inode ref deletion for the single link inode), which was introduced in 3.14 and isn't in 3.12.y. The -stable commit added the check to btrfs_delayed_update_inode, which may look similar to btrfs_delayed_delete_inode_ref, but it's only superficial. The tops of both functions handle typical delayed node boilerplate. The upshot is that the patch is harmless since the caller already checks to see if we're doing log recovery, so we're not breaking anything. It should be reverted because it makes it appear as if this issue was fixed for users who did backport 67de11769bd, when it is not. Signed-off-by: Jeff Mahoney Signed-off-by: Willy Tarreau