summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2018-04-03thread: move thread bits accessors to separated fileYury Norov
They may be accessed from low-level code, so isolating is a measure to avoid circular dependencies in header files. The exact reason for circular dependency is WARN_ON() macro added in patch edd63a27 "set_restore_sigmask() is never called without SIGPENDING (and never should be)" Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
2018-04-03asm-generic: Drop getrlimit and setrlimit syscalls from default listYury Norov
The newer prlimit64 syscall provides all the functionality provided by the getrlimit and setrlimit syscalls and adds the pid of target process, so future architectures won't need to include getrlimit and setrlimit. Therefore drop getrlimit and setrlimit syscalls from the generic syscall list unless __ARCH_WANT_SET_GET_RLIMIT is defined by the architecture's unistd.h prior to including asm-generic/unistd.h, and adjust all architectures using the generic syscall list to define it so that no in-tree architectures are affected. Cc: Arnd Bergmann <arnd@arndb.de> Cc: James Hogan <james.hogan@imgtec.com> Cc: linux-arch@vger.kernel.org Cc: linux-snps-arc@lists.infradead.org Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Mark Salter <msalter@redhat.com> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: linux-c6x-dev@linux-c6x.org Cc: Richard Kuo <rkuo@codeaurora.org> Cc: linux-hexagon@vger.kernel.org Cc: linux-metag@vger.kernel.org Cc: Jonas Bonn <jonas@southpole.se> Cc: linux@lists.openrisc.net Cc: Chen Liqin <liqin.linux@gmail.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ley Foon Tan <lftan@altera.com> Cc: nios2-dev@lists.rocketboards.org Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: uclinux-h8-devel@lists.sourceforge.jp Signed-off-by: Yury Norov <ynorov@caviumnetworks.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Mark Salter <msalter@redhat.com> [c6x] Acked-by: James Hogan <james.hogan@imgtec.com> [metag] Acked-by: Ley Foon Tan <lftan@altera.com> [nios2] Acked-by: Stafford Horne <shorne@gmail.com> [openrisc] Acked-by: Will Deacon <will.deacon@arm.com> [arm64] Acked-by: Vineet Gupta <vgupta@synopsys.com> #arch/arc bits
2018-04-0332-bit userspace ABI: introduce ARCH_32BIT_OFF_T config optionYury Norov
All new 32-bit architectures should have 64-bit userspace off_t type, but existing architectures has 32-bit ones. To handle it, new config option is added to arch/Kconfig that defaults ARCH_32BIT_OFF_T to be disabled for non-64 bit architectures. All existing 32-bit architectures enable it explicitly here. New option affects force_o_largefile() behaviour. Namely, if userspace off_t is 64-bits long, we have no reason to reject user to open big files. Note that even if architectures has only 64-bit off_t in the kernel (arc, c6x, h8300, hexagon, metag, nios2, openrisc, tile32 and unicore32), a libc may use 32-bit off_t, and therefore want to limit the file size to 4GB unless specified differently in the open flags. Signed-off-by: Yury Norov <ynorov@caviumnetworks.com> Acked-by: Arnd Bergmann <arnd@arndb.de>
2018-04-03compat ABI: use non-compat openat and open_by_handle_at variantsYury Norov
The only difference is that non-compat version forces O_LARGEFILE, and it should be the default behaviour for all architectures, as we are going to drop the support of 32-bit userspace off_t. The exception is tile32 that continues with compat version of syscalls. Signed-off-by: Yury Norov <ynorov@caviumnetworks.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Chris Metcalf <cmetcalf@ezchip.com> [for tile]
2018-03-31net: use skb_to_full_sk() in skb_update_prio()Eric Dumazet
[ Upstream commit 4dcb31d4649df36297296b819437709f5407059c ] Andrei Vagin reported a KASAN: slab-out-of-bounds error in skb_update_prio() Since SYNACK might be attached to a request socket, we need to get back to the listener socket. Since this listener is manipulated without locks, add const qualifiers to sock_cgroup_prioidx() so that the const can also be used in skb_update_prio() Also add the const qualifier to sock_cgroup_classid() for consistency. Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-31sch_netem: fix skb leak in netem_enqueue()Alexey Kodanev
[ Upstream commit 35d889d10b649fda66121891ec05eca88150059d ] When we exceed current packets limit and we have more than one segment in the list returned by skb_gso_segment(), netem drops only the first one, skipping the rest, hence kmemleak reports: unreferenced object 0xffff880b5d23b600 (size 1024): comm "softirq", pid 0, jiffies 4384527763 (age 2770.629s) hex dump (first 32 bytes): 00 80 23 5d 0b 88 ff ff 00 00 00 00 00 00 00 00 ..#]............ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000d8a19b9d>] __alloc_skb+0xc9/0x520 [<000000001709b32f>] skb_segment+0x8c8/0x3710 [<00000000c7b9bb88>] tcp_gso_segment+0x331/0x1830 [<00000000c921cba1>] inet_gso_segment+0x476/0x1370 [<000000008b762dd4>] skb_mac_gso_segment+0x1f9/0x510 [<000000002182660a>] __skb_gso_segment+0x1dd/0x620 [<00000000412651b9>] netem_enqueue+0x1536/0x2590 [sch_netem] [<0000000005d3b2a9>] __dev_queue_xmit+0x1167/0x2120 [<00000000fc5f7327>] ip_finish_output2+0x998/0xf00 [<00000000d309e9d3>] ip_output+0x1aa/0x2c0 [<000000007ecbd3a4>] tcp_transmit_skb+0x18db/0x3670 [<0000000042d2a45f>] tcp_write_xmit+0x4d4/0x58c0 [<0000000056a44199>] tcp_tasklet_func+0x3d9/0x540 [<0000000013d06d02>] tasklet_action+0x1ca/0x250 [<00000000fcde0b8b>] __do_softirq+0x1b4/0x5a3 [<00000000e7ed027c>] irq_exit+0x1e2/0x210 Fix it by adding the rest of the segments, if any, to skb 'to_free' list. Add new __qdisc_drop_all() and qdisc_drop_all() functions because they can be useful in the future if we need to drop segmented GSO packets in other places. Fixes: 6071bd1aa13e ("netem: Segment GSO packets on enqueue") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-31rhashtable: Fix rhlist duplicates insertionPaul Blakey
[ Upstream commit d3dcf8eb615537526bd42ff27a081d46d337816e ] When inserting duplicate objects (those with the same key), current rhlist implementation messes up the chain pointers by updating the bucket pointer instead of prev next pointer to the newly inserted node. This causes missing elements on removal and travesal. Fix that by properly updating pprev pointer to point to the correct rhash_head next pointer. Issue: 1241076 Change-Id: I86b2c140bcb4aeb10b70a72a267ff590bb2b17e7 Fixes: ca26893f05e8 ('rhashtable: Add rhlist interface') Signed-off-by: Paul Blakey <paulb@mellanox.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-31net: phy: Tell caller result of phy_change()Brad Mouring
[ Upstream commit a2c054a896b8ac794ddcfc7c92e2dc7ec4ed4ed5 ] In 664fcf123a30e (net: phy: Threaded interrupts allow some simplification) the phy_interrupt system was changed to use a traditional threaded interrupt scheme instead of a workqueue approach. With this change, the phy status check moved into phy_change, which did not report back to the caller whether or not the interrupt was handled. This means that, in the case of a shared phy interrupt, only the first phydev's interrupt registers are checked (since phy_interrupt() would always return IRQ_HANDLED). This leads to interrupt storms when it is a secondary device that's actually the interrupt source. Signed-off-by: Brad Mouring <brad.mouring@ni.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-28mtd: nand: fsl_ifc: Read ECCSTAT0 and ECCSTAT1 registers for IFC 2.0Jagdish Gediya
commit 6b00c35138b404be98b85f4a703be594cbed501c upstream. Due to missing information in Hardware manual, current implementation doesn't read ECCSTAT0 and ECCSTAT1 registers for IFC 2.0. Add support to read ECCSTAT0 and ECCSTAT1 registers during ecccheck for IFC 2.0. Fixes: 656441478ed5 ("mtd: nand: ifc: Fix location of eccstat registers for IFC V1.0") Cc: stable@vger.kernel.org # v3.18+ Signed-off-by: Jagdish Gediya <jagdish.gediya@nxp.com> Reviewed-by: Prabhakar Kushwaha <prabhakar.kushwaha@nxp.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-28Revert "mm: page_alloc: skip over regions of invalid pfns where possible"Daniel Vacek
commit f59f1caf72ba00d519c793c3deb32cd3be32edc2 upstream. This reverts commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible"). The commit is meant to be a boot init speed up skipping the loop in memmap_init_zone() for invalid pfns. But given some specific memory mapping on x86_64 (or more generally theoretically anywhere but on arm with CONFIG_HAVE_ARCH_PFN_VALID) the implementation also skips valid pfns which is plain wrong and causes 'kernel BUG at mm/page_alloc.c:1389!' crash> log | grep -e BUG -e RIP -e Call.Trace -e move_freepages_block -e rmqueue -e freelist -A1 kernel BUG at mm/page_alloc.c:1389! invalid opcode: 0000 [#1] SMP -- RIP: 0010: move_freepages+0x15e/0x160 -- Call Trace: move_freepages_block+0x73/0x80 __rmqueue+0x263/0x460 get_page_from_freelist+0x7e1/0x9e0 __alloc_pages_nodemask+0x176/0x420 -- crash> page_init_bug -v | grep RAM <struct resource 0xffff88067fffd2f8> 1000 - 9bfff System RAM (620.00 KiB) <struct resource 0xffff88067fffd3a0> 100000 - 430bffff System RAM ( 1.05 GiB = 1071.75 MiB = 1097472.00 KiB) <struct resource 0xffff88067fffd410> 4b0c8000 - 4bf9cfff System RAM ( 14.83 MiB = 15188.00 KiB) <struct resource 0xffff88067fffd480> 4bfac000 - 646b1fff System RAM (391.02 MiB = 400408.00 KiB) <struct resource 0xffff88067fffd560> 7b788000 - 7b7fffff System RAM (480.00 KiB) <struct resource 0xffff88067fffd640> 100000000 - 67fffffff System RAM ( 22.00 GiB) crash> page_init_bug | head -6 <struct resource 0xffff88067fffd560> 7b788000 - 7b7fffff System RAM (480.00 KiB) <struct page 0xffffea0001ede200> 1fffff00000000 0 <struct pglist_data 0xffff88047ffd9000> 1 <struct zone 0xffff88047ffd9800> DMA32 4096 1048575 <struct page 0xffffea0001ede200> 505736 505344 <struct page 0xffffea0001ed8000> 505855 <struct page 0xffffea0001edffc0> <struct page 0xffffea0001ed8000> 0 0 <struct pglist_data 0xffff88047ffd9000> 0 <struct zone 0xffff88047ffd9000> DMA 1 4095 <struct page 0xffffea0001edffc0> 1fffff00000400 0 <struct pglist_data 0xffff88047ffd9000> 1 <struct zone 0xffff88047ffd9800> DMA32 4096 1048575 BUG, zones differ! crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b787000 7b788000 PAGE PHYSICAL MAPPING INDEX CNT FLAGS ffffea0001e00000 78000000 0 0 0 0 ffffea0001ed7fc0 7b5ff000 0 0 0 0 ffffea0001ed8000 7b600000 0 0 0 0 <<<< ffffea0001ede1c0 7b787000 0 0 0 0 ffffea0001ede200 7b788000 0 0 1 1fffff00000000 Link: http://lkml.kernel.org/r/20180316143855.29838-1-neelx@redhat.com Fixes: b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible") Signed-off-by: Daniel Vacek <neelx@redhat.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-28mm/vmalloc: add interfaces to free unmapped page tableToshi Kani
commit b6bdb7517c3d3f41f20e5c2948d6bc3f8897394e upstream. On architectures with CONFIG_HAVE_ARCH_HUGE_VMAP set, ioremap() may create pud/pmd mappings. A kernel panic was observed on arm64 systems with Cortex-A75 in the following steps as described by Hanjun Guo. 1. ioremap a 4K size, valid page table will build, 2. iounmap it, pte0 will set to 0; 3. ioremap the same address with 2M size, pgd/pmd is unchanged, then set the a new value for pmd; 4. pte0 is leaked; 5. CPU may meet exception because the old pmd is still in TLB, which will lead to kernel panic. This panic is not reproducible on x86. INVLPG, called from iounmap, purges all levels of entries associated with purged address on x86. x86 still has memory leak. The patch changes the ioremap path to free unmapped page table(s) since doing so in the unmap path has the following issues: - The iounmap() path is shared with vunmap(). Since vmap() only supports pte mappings, making vunmap() to free a pte page is an overhead for regular vmap users as they do not need a pte page freed up. - Checking if all entries in a pte page are cleared in the unmap path is racy, and serializing this check is expensive. - The unmap path calls free_vmap_area_noflush() to do lazy TLB purges. Clearing a pud/pmd entry before the lazy TLB purges needs extra TLB purge. Add two interfaces, pud_free_pmd_page() and pmd_free_pte_page(), which clear a given pud/pmd entry and free up a page for the lower level entries. This patch implements their stub functions on x86 and arm64, which work as workaround. [akpm@linux-foundation.org: fix typo in pmd_free_pte_page() stub] Link: http://lkml.kernel.org/r/20180314180155.19492-2-toshi.kani@hpe.com Fixes: e61ce6ade404e ("mm: change ioremap to set up huge I/O mappings") Reported-by: Lei Li <lious.lilei@hisilicon.com> Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Wang Xuefeng <wxf.wang@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Chintan Pandya <cpandya@codeaurora.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-28mmc: core: Fix tracepoint print of blk_addr and blkszAdrian Hunter
commit c658dc58c7eaa8569ceb0edd1ddbdfda84fe8aa5 upstream. Swap the positions of blk_addr and blksz in the tracepoint print arguments so that they match the print format. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Fixes: d2f82254e4e8 ("mmc: core: Add members to mmc_request and mmc_data for CQE's") Cc: <stable@vger.kernel.org> # 4.14+ Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-28ALSA: usb-audio: Fix parsing descriptor of UAC2 processing unitKirill Marinushkin
commit a6618f4aedb2b60932d766bd82ae7ce866e842aa upstream. Currently, the offsets in the UAC2 processing unit descriptor are calculated incorrectly. It causes an issue when connecting the device which provides such a feature: ~~~~ [84126.724420] usb 1-1.3.1: invalid Processing Unit descriptor (id 18) ~~~~ After this patch is applied, the UAC2 processing unit inits w/o this error. Fixes: 23caaf19b11e ("ALSA: usb-mixer: Add support for Audio Class v2.0") Signed-off-by: Kirill Marinushkin <k.marinushkin@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-24IB/mlx5: Fix integer overflows in mlx5_ib_create_srqBoris Pismenny
commit c2b37f76485f073f020e60b5954b6dc4e55f693c upstream. This patch validates user provided input to prevent integer overflow due to integer manipulation in the mlx5_ib_create_srq function. Cc: syzkaller <syzkaller@googlegroups.com> Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-24bpf/cgroup: fix a verification error for a CGROUP_DEVICE type progYonghong Song
[ Upstream commit 06ef0ccb5a36e1feba9b413ff59a04ecc4407c1c ] The tools/testing/selftests/bpf test program test_dev_cgroup fails with the following error when compiled with llvm 6.0. (I did not try with earlier versions.) libbpf: load bpf program failed: Permission denied libbpf: -- BEGIN DUMP LOG --- libbpf: 0: (61) r2 = *(u32 *)(r1 +4) 1: (b7) r0 = 0 2: (55) if r2 != 0x1 goto pc+8 R0=inv0 R1=ctx(id=0,off=0,imm=0) R2=inv1 R10=fp0 3: (69) r2 = *(u16 *)(r1 +0) invalid bpf_context access off=0 size=2 ... The culprit is the following statement in dev_cgroup.c: short type = ctx->access_type & 0xFFFF; This code is typical as the ctx->access_type is assigned as below in kernel/bpf/cgroup.c: struct bpf_cgroup_dev_ctx ctx = { .access_type = (access << 16) | dev_type, .major = major, .minor = minor, }; The compiler converts it to u16 access while the verifier cgroup_dev_is_valid_access rejects any non u32 access. This patch permits the field access_type to be accessible with type u16 and u8 as well. Signed-off-by: Yonghong Song <yhs@fb.com> Tested-by: Roman Gushchin <guro@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-21KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintidMarc Zyngier
commit 16ca6a607d84bef0129698d8d808f501afd08d43 upstream. The vgic code is trying to be clever when injecting GICv2 SGIs, and will happily populate LRs with the same interrupt number if they come from multiple vcpus (after all, they are distinct interrupt sources). Unfortunately, this is against the letter of the architecture, and the GICv2 architecture spec says "Each valid interrupt stored in the List registers must have a unique VirtualID for that virtual CPU interface.". GICv3 has similar (although slightly ambiguous) restrictions. This results in guests locking up when using GICv2-on-GICv3, for example. The obvious fix is to stop trying so hard, and inject a single vcpu per SGI per guest entry. After all, pending SGIs with multiple source vcpus are pretty rare, and are mostly seen in scenario where the physical CPUs are severely overcomitted. But as we now only inject a single instance of a multi-source SGI per vcpu entry, we may delay those interrupts for longer than strictly necessary, and run the risk of injecting lower priority interrupts in the meantime. In order to address this, we adopt a three stage strategy: - If we encounter a multi-source SGI in the AP list while computing its depth, we force the list to be sorted - When populating the LRs, we prevent the injection of any interrupt of lower priority than that of the first multi-source SGI we've injected. - Finally, the injection of a multi-source SGI triggers the request of a maintenance interrupt when there will be no pending interrupt in the LRs (HCR_NPIE). At the point where the last pending interrupt in the LRs switches from Pending to Active, the maintenance interrupt will be delivered, allowing us to add the remaining SGIs using the same process. Cc: stable@vger.kernel.org Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework") Acked-by: Christoffer Dall <cdall@kernel.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-21KVM: arm/arm64: Reset mapped IRQs on VM resetChristoffer Dall
commit 413aa807ae39fed7e387c175d2d0ae9fcf6c0c9d upstream. We currently don't allow resetting mapped IRQs from userspace, because their state is controlled by the hardware. But we do need to reset the state when the VM is reset, so we provide a function for the 'owner' of the mapped interrupt to reset the interrupt state. Currently only the timer uses mapped interrupts, so we call this function from the timer reset logic. Cc: stable@vger.kernel.org Fixes: 4c60e360d6df ("KVM: arm/arm64: Provide a get_input_level for the arch timer") Signed-off-by: Christoffer Dall <cdall@kernel.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-21fs: Teach path_connected to handle nfs filesystems with multiple roots.Eric W. Biederman
commit 95dd77580ccd66a0da96e6d4696945b8cea39431 upstream. On nfsv2 and nfsv3 the nfs server can export subsets of the same filesystem and report the same filesystem identifier, so that the nfs client can know they are the same filesystem. The subsets can be from disjoint directory trees. The nfsv2 and nfsv3 filesystems provides no way to find the common root of all directory trees exported form the server with the same filesystem identifier. The practical result is that in struct super s_root for nfs s_root is not necessarily the root of the filesystem. The nfs mount code sets s_root to the root of the first subset of the nfs filesystem that the kernel mounts. This effects the dcache invalidation code in generic_shutdown_super currently called shrunk_dcache_for_umount and that code for years has gone through an additional list of dentries that might be dentry trees that need to be freed to accomodate nfs. When I wrote path_connected I did not realize nfs was so special, and it's hueristic for avoiding calling is_subdir can fail. The practical case where this fails is when there is a move of a directory from the subtree exposed by one nfs mount to the subtree exposed by another nfs mount. This move can happen either locally or remotely. With the remote case requiring that the move directory be cached before the move and that after the move someone walks the path to where the move directory now exists and in so doing causes the already cached directory to be moved in the dcache through the magic of d_splice_alias. If someone whose working directory is in the move directory or a subdirectory and now starts calling .. from the initial mount of nfs (where s_root == mnt_root), then path_connected as a heuristic will not bother with the is_subdir check. As s_root really is not the root of the nfs filesystem this heuristic is wrong, and the path may actually not be connected and path_connected can fail. The is_subdir function might be cheap enough that we can call it unconditionally. Verifying that will take some benchmarking and the result may not be the same on all kernels this fix needs to be backported to. So I am avoiding that for now. Filesystems with snapshots such as nilfs and btrfs do something similar. But as the directory tree of the snapshots are disjoint from one another and from the main directory tree rename won't move things between them and this problem will not occur. Cc: stable@vger.kernel.org Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Fixes: 397d425dc26d ("vfs: Test for and handle paths that are unreachable from their mnt_root") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-19dma-buf/fence: Fix lock inversion within dma-fence-arrayChris Wilson
[ Upstream commit 03e4e0a9e02cf703da331ff6cfd57d0be9bf5692 ] Ages ago Rob Clark noted, "Currently with fence-array, we have a potential deadlock situation. If we fence_add_callback() on an array-fence, the array-fence's lock is acquired first, and in it's ->enable_signaling() callback, it will install cbs on it's array-member fences, so the array-member's lock is acquired second. But in the signal path, the array-member's lock is acquired first, and the array-fence's lock acquired second." Rob proposed either extensive changes to dma-fence to unnest the fence-array signaling, or to defer the signaling onto a workqueue. This is a more refined version of the later, that should keep the latency of the fence signaling to a minimum by using an irq-work, which is executed asap. Reported-by: Rob Clark <robdclark@gmail.com> Suggested-by: Rob Clark <robdclark@gmail.com> References: 1476635975-21981-1-git-send-email-robdclark@gmail.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Rob Clark <robdclark@gmail.com> Cc: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20171114162719.30958-1-chris@chris-wilson.co.uk Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-19usb: quirks: add control message delay for 1b1c:1b20Danilo Krummrich
commit cb88a0588717ba6c756cb5972d75766b273a6817 upstream. Corsair Strafe RGB keyboard does not respond to usb control messages sometimes and hence generates timeouts. Commit de3af5bf259d ("usb: quirks: add delay init quirk for Corsair Strafe RGB keyboard") tried to fix those timeouts by adding USB_QUIRK_DELAY_INIT. Unfortunately, even with this quirk timeouts of usb_control_msg() can still be seen, but with a lower frequency (approx. 1 out of 15): [ 29.103520] usb 1-8: string descriptor 0 read error: -110 [ 34.363097] usb 1-8: can't set config #1, error -110 Adding further delays to different locations where usb control messages are issued just moves the timeouts to other locations, e.g.: [ 35.400533] usbhid 1-8:1.0: can't add hid device: -110 [ 35.401014] usbhid: probe of 1-8:1.0 failed with error -110 The only way to reliably avoid those issues is having a pause after each usb control message. In approx. 200 boot cycles no more timeouts were seen. Addionaly, keep USB_QUIRK_DELAY_INIT as it turned out to be necessary to have the delay in hub_port_connect() after hub_port_init(). The overall boot time seems not to be influenced by these additional delays, even on fast machines and lightweight distributions. Fixes: de3af5bf259d ("usb: quirks: add delay init quirk for Corsair Strafe RGB keyboard") Cc: stable@vger.kernel.org Signed-off-by: Danilo Krummrich <danilokrummrich@dk-develop.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-15tpm: Keep CLKRUN enabled throughout the duration of transmit_cmd()Azhar Shaikh
commit b3e958ce4c585bf666de249dc794971ebc62d2d3 upstream. Commit 5e572cab92f0bb5 ("tpm: Enable CLKRUN protocol for Braswell systems") disabled CLKRUN protocol during TPM transactions and re-enabled once the transaction is completed. But there were still some corner cases observed where, reading of TPM header failed for savestate command while going to suspend, which resulted in suspend failure. To fix this issue keep the CLKRUN protocol disabled for the entire duration of a single TPM command and not disabling and re-enabling again for every TPM transaction. For the other TPM accesses outside TPM command flow, add a higher level of disabling and re-enabling the CLKRUN protocol, instead of doing for every TPM transaction. Fixes: 5e572cab92f0bb5 ("tpm: Enable CLKRUN protocol for Braswell systems") Signed-off-by: Azhar Shaikh <azhar.shaikh@intel.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-15x86/retpoline: Support retpoline builds with ClangDavid Woodhouse
commit 87358710c1fb4f1bf96bbe2349975ff9953fc9b2 upstream. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: arjan.van.de.ven@intel.com Cc: bp@alien8.de Cc: dave.hansen@intel.com Cc: jmattson@google.com Cc: karahmed@amazon.de Cc: kvm@vger.kernel.org Cc: pbonzini@redhat.com Cc: rkrcmar@redhat.com Link: http://lkml.kernel.org/r/1519037457-7643-5-git-send-email-dwmw@amazon.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-15nospec: Include <asm/barrier.h> dependencyDan Williams
commit eb6174f6d1be16b19cfa43dac296bfed003ce1a6 upstream. The nospec.h header expects the per-architecture header file <asm/barrier.h> to optionally define array_index_mask_nospec(). Include that dependency to prevent inadvertent fallback to the default array_index_mask_nospec() implementation. The default implementation may not provide a full mitigation on architectures that perform data value speculation. Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/151881605404.17395.1341935530792574707.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-15nospec: Kill array_index_nospec_mask_check()Dan Williams
commit 1d91c1d2c80cb70e2e553845e278b87a960c04da upstream. There are multiple problems with the dynamic sanity checking in array_index_nospec_mask_check(): * It causes unnecessary overhead in the 32-bit case since integer sized @index values will no longer cause the check to be compiled away like in the 64-bit case. * In the 32-bit case it may trigger with user controllable input when the expectation is that should only trigger during development of new kernel enabling. * The macro reuses the input parameter in multiple locations which is broken if someone passes an expression like 'index++' to array_index_nospec(). Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/151881604278.17395.6605847763178076520.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-15drm/nouveau: prefer XBGR2101010 for addfb ioctlIlia Mirkin
commit c20bb155c2c5acb775f68be5d84fe679687c3c1e upstream. Nouveau only exposes support for XBGR2101010. Prior to the atomic conversion, drm would pass in the wrong format in the framebuffer, but it was always ignored -- both userspace (xf86-video-nouveau) and the kernel driver agreed on the layout, so the fact that the format was wrong didn't matter. With the atomic conversion, nouveau all of a sudden started caring about the exact format, and so the previously-working code in xf86-video-nouveau no longer functioned since the (internally-assigned) format from the addfb ioctl was wrong. This change adds infrastructure to allow a drm driver to specify that it prefers the XBGR format variant for the addfb ioctl, and makes nouveau's nv50 display driver set it. (Prior gens had no support for 30bpp at all.) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: stable@vger.kernel.org # v4.10+ Acked-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20180203191123.31507-1-imirkin@alum.mit.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-15drm: Allow determining if current task is output poll workerLukas Wunner
commit 25c058ccaf2ebbc3e250ec1e199e161f91fe27d4 upstream. Introduce a helper to determine if the current task is an output poll worker. This allows us to fix a long-standing deadlock in several DRM drivers wherein the ->runtime_suspend callback waits for the output poll worker to finish and the worker in turn calls a ->detect callback which waits for runtime suspend to finish. The ->detect callback is invoked from multiple call sites and waiting for runtime suspend to finish is the correct thing to do except if it's executing in the context of the worker. v2: Expand kerneldoc to specifically mention deadlock between output poll worker and autosuspend worker as use case. (Lyude) Cc: Dave Airlie <airlied@redhat.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: https://patchwork.freedesktop.org/patch/msgid/3549ce32e7f1467102e70d3e9cbf70c46bfe108e.1518593424.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-15workqueue: Allow retrieval of current task's work structLukas Wunner
commit 27d4ee03078aba88c5e07dcc4917e8d01d046f38 upstream. Introduce a helper to retrieve the current task's work struct if it is a workqueue worker. This allows us to fix a long-standing deadlock in several DRM drivers wherein the ->runtime_suspend callback waits for a specific worker to finish and that worker in turn calls a function which waits for runtime suspend to finish. That function is invoked from multiple call sites and waiting for runtime suspend to finish is the correct thing to do except if it's executing in the context of the worker. Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: https://patchwork.freedesktop.org/patch/msgid/2d8f603074131eb87e588d2b803a71765bd3a2fd.1518338788.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-15scsi: core: Avoid that ATA error handling can trigger a kernel hang or oopsBart Van Assche
commit 3be8828fc507cdafe7040a3dcf361a2bcd8e305b upstream. Avoid that the recently introduced call_rcu() call in the SCSI core triggers a double call_rcu() call. Reported-by: Natanael Copa <ncopa@alpinelinux.org> Reported-by: Damien Le Moal <damien.lemoal@wdc.com> References: https://bugzilla.kernel.org/show_bug.cgi?id=198861 Fixes: 3bd6f43f5cb3 ("scsi: core: Ensure that the SCSI error handler gets woken up") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Tested-by: Damien Le Moal <damien.lemoal@wdc.com> Cc: Natanael Copa <ncopa@alpinelinux.org> Cc: Damien Le Moal <damien.lemoal@wdc.com> Cc: Alexandre Oliva <oliva@gnu.org> Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-08nospec: Allow index argument to have const-qualified typeRasmus Villemoes
commit b98c6a160a057d5686a8c54c79cc6c8c94a7d0c8 upstream. The last expression in a statement expression need not be a bare variable, quoting gcc docs The last thing in the compound statement should be an expression followed by a semicolon; the value of this subexpression serves as the value of the entire construct. and we already use that in e.g. the min/max macros which end with a ternary expression. This way, we can allow index to have const-qualified type, which will in some cases avoid the need for introducing a local copy of index of non-const qualified type. That, in turn, can prevent readers not familiar with the internals of array_index_nospec from wondering about the seemingly redundant extra variable, and I think that's worthwhile considering how confusing the whole _nospec business is. The expression _i&_mask has type unsigned long (since that is the type of _mask, and the BUILD_BUG_ONs guarantee that _i will get promoted to that), so in order not to change the type of the whole expression, add a cast back to typeof(_i). Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arch@vger.kernel.org Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/151881604837.17395.10812767547837568328.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-08net: phy: Restore phy_resume() locking assumptionAndrew Lunn
[ Upstream commit 9c2c2e62df3fa30fb13fbeb7512a4eede729383b ] commit f5e64032a799 ("net: phy: fix resume handling") changes the locking semantics for phy_resume() such that the caller now needs to hold the phy mutex. Not all call sites were adopted to this new semantic, resulting in warnings from the added WARN_ON(!mutex_is_locked(&phydev->lock)). Rather than change the semantics, add a __phy_resume() and restore the old behavior of phy_resume(). Reported-by: Heiner Kallweit <hkallweit1@gmail.com> Fixes: f5e64032a799 ("net: phy: fix resume handling") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-08udplite: fix partial checksum initializationAlexey Kodanev
[ Upstream commit 15f35d49c93f4fa9875235e7bf3e3783d2dd7a1b ] Since UDP-Lite is always using checksum, the following path is triggered when calculating pseudo header for it: udp4_csum_init() or udp6_csum_init() skb_checksum_init_zero_check() __skb_checksum_validate_complete() The problem can appear if skb->len is less than CHECKSUM_BREAK. In this particular case __skb_checksum_validate_complete() also invokes __skb_checksum_complete(skb). If UDP-Lite is using partial checksum that covers only part of a packet, the function will return bad checksum and the packet will be dropped. It can be fixed if we skip skb_checksum_init_zero_check() and only set the required pseudo header checksum for UDP-Lite with partial checksum before udp4_csum_init()/udp6_csum_init() functions return. Fixes: ed70fcfcee95 ("net: Call skb_checksum_init in IPv4") Fixes: e4f45b7f40bd ("net: Call skb_checksum_init in IPv6") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-08dax: fix vma_is_fsdax() helperDan Williams
commit 230f5a8969d8345fc9bbe3683f068246cf1be4b8 upstream. Gerd reports that ->i_mode may contain other bits besides S_IFCHR. Use S_ISCHR() instead. Otherwise, get_user_pages_longterm() may fail on device-dax instances when those are meant to be explicitly allowed. Fixes: 2bb6d2837083 ("mm: introduce get_user_pages_longterm") Cc: <stable@vger.kernel.org> Reported-by: Gerd Rausch <gerd.rausch@oracle.com> Acked-by: Jane Chu <jane.chu@oracle.com> Reported-by: Haozhong Zhang <haozhong.zhang@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-28drm/atomic: Fix memleak on ERESTARTSYS during non-blocking commitsLeo (Sunpeng) Li
commit 54f809cfbd6b4a43959039f5d33596ed3297ce16 upstream. During a non-blocking commit, it is possible to return before the commit_tail work is queued (-ERESTARTSYS, for example). Since a reference on the crtc commit object is obtained for the pending vblank event when preparing the commit, the above situation will leave us with an extra reference. Therefore, if the commit_tail worker has not consumed the event at the end of a commit, release it's reference. Changes since v1: - Also check for state->event->base.completion being set, to handle the case where stall_checks() fails in setup_crtc_commit(). Changes since v2: - Add a flag to drm_crtc_commit, to prevent dereferencing a freed event. i915 may unreference the state in a worker. Fixes: 24835e442f28 ("drm: reference count event->completion") Cc: <stable@vger.kernel.org> # v4.11+ Signed-off-by: Leo (Sunpeng) Li <sunpeng.li@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> #v1 Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180117115108.29608-1-maarten.lankhorst@linux.intel.com Reviewed-by: Sean Paul <seanpaul@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-28uapi/if_ether.h: move __UAPI_DEF_ETHHDR libc defineHauke Mehrtens
commit da360299b6734135a5f66d7db458dcc7801c826a upstream. This fixes a compile problem of some user space applications by not including linux/libc-compat.h in uapi/if_ether.h. linux/libc-compat.h checks which "features" the header files, included from the libc, provide to make the Linux kernel uapi header files only provide no conflicting structures and enums. If a user application mixes kernel headers and libc headers it could happen that linux/libc-compat.h gets included too early where not all other libc headers are included yet. Then the linux/libc-compat.h would not prevent all the redefinitions and we run into compile problems. This patch removes the include of linux/libc-compat.h from uapi/if_ether.h to fix the recently introduced case, but not all as this is more or less impossible. It is no problem to do the check directly in the if_ether.h file and not in libc-compat.h as this does not need any fancy glibc header detection as glibc never provided struct ethhdr and should define __UAPI_DEF_ETHHDR by them self when they will provide this. The following test program did not compile correctly any more: #include <linux/if_ether.h> #include <netinet/in.h> #include <linux/in.h> int main(void) { return 0; } Fixes: 6926e041a892 ("uapi/if_ether.h: prevent redefinition of struct ethhdr") Reported-by: Guillaume Nault <g.nault@alphalink.fr> Cc: <stable@vger.kernel.org> # 4.15 Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-28Kbuild: always define endianess in kconfig.hArnd Bergmann
commit 101110f6271ce956a049250c907bc960030577f8 upstream. Build testing with LTO found a couple of files that get compiled differently depending on whether asm/byteorder.h gets included early enough or not. In particular, include/asm-generic/qrwlock_types.h is affected by this, but there are probably others as well. The symptom is a series of LTO link time warnings, including these: net/netlabel/netlabel_unlabeled.h:223: error: type of 'netlbl_unlhsh_add' does not match original declaration [-Werror=lto-type-mismatch] int netlbl_unlhsh_add(struct net *net, net/netlabel/netlabel_unlabeled.c:377: note: 'netlbl_unlhsh_add' was previously declared here include/net/ipv6.h:360: error: type of 'ipv6_renew_options_kern' does not match original declaration [-Werror=lto-type-mismatch] ipv6_renew_options_kern(struct sock *sk, net/ipv6/exthdrs.c:1162: note: 'ipv6_renew_options_kern' was previously declared here net/core/dev.c:761: note: 'dev_get_by_name_rcu' was previously declared here struct net_device *dev_get_by_name_rcu(struct net *net, const char *name) net/core/dev.c:761: note: code may be misoptimized unless -fno-strict-aliasing is used drivers/gpu/drm/i915/i915_drv.h:3377: error: type of 'i915_gem_object_set_to_wc_domain' does not match original declaration [-Werror=lto-type-mismatch] i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write); drivers/gpu/drm/i915/i915_gem.c:3639: note: 'i915_gem_object_set_to_wc_domain' was previously declared here include/linux/debugfs.h:92:9: error: type of 'debugfs_attr_read' does not match original declaration [-Werror=lto-type-mismatch] ssize_t debugfs_attr_read(struct file *file, char __user *buf, fs/debugfs/file.c:318: note: 'debugfs_attr_read' was previously declared here include/linux/rwlock_api_smp.h:30: error: type of '_raw_read_unlock' does not match original declaration [-Werror=lto-type-mismatch] void __lockfunc _raw_read_unlock(rwlock_t *lock) __releases(lock); kernel/locking/spinlock.c:246:26: note: '_raw_read_unlock' was previously declared here include/linux/fs.h:3308:5: error: type of 'simple_attr_open' does not match original declaration [-Werror=lto-type-mismatch] int simple_attr_open(struct inode *inode, struct file *file, fs/libfs.c:795: note: 'simple_attr_open' was previously declared here All of the above are caused by include/asm-generic/qrwlock_types.h failing to include asm/byteorder.h after commit e0d02285f16e ("locking/qrwlock: Use 'struct qrwlock' instead of 'struct __qrwlock'") in linux-4.15. Similar bugs may or may not exist in older kernels as well, but there is no easy way to test those with link-time optimizations, and kernels before 4.14 are harder to fix because they don't have Babu's patch series We had similar issues with CONFIG_ symbols in the past and ended up always including the configuration headers though linux/kconfig.h. This works around the issue through that same file, defining either __BIG_ENDIAN or __LITTLE_ENDIAN depending on CONFIG_CPU_BIG_ENDIAN, which is now always set on all architectures since commit 4c97a0c8fee3 ("arch: define CPU_BIG_ENDIAN for all fixed big endian archs"). Link: http://lkml.kernel.org/r/20180202154104.1522809-2-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Babu Moger <babu.moger@amd.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Nicolas Pitre <nico@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-28kconfig.h: Include compiler types to avoid missed struct attributesKees Cook
commit 28128c61e08eaeced9cc8ec0e6b5d677b5b94690 upstream. The header files for some structures could get included in such a way that struct attributes (specifically __randomize_layout from path.h) would be parsed as variable names instead of attributes. This could lead to some instances of a structure being unrandomized, causing nasty GPFs, etc. This patch makes sure the compiler_types.h header is included in kconfig.h so that we've always got types and struct attributes defined, since kconfig.h is included from the compiler command line. Reported-by: Patrick McLean <chutzpah@gentoo.org> Root-caused-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Fixes: 3859a271a003 ("randstruct: Mark various structs for randomization") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25ptr_ring: try vmalloc() when kmalloc() failsJason Wang
commit 0bf7800f1799b5b1fd7d4f024e9ece53ac489011 upstream. This patch switch to use kvmalloc_array() for using a vmalloc() fallback to help in case kmalloc() fails. Reported-by: syzbot+e4d4f9ddd4295539735d@syzkaller.appspotmail.com Fixes: 2e0ab8ca83c12 ("ptr_ring: array based FIFO for pointers") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25ptr_ring: fail early if queue occupies more than KMALLOC_MAX_SIZEJason Wang
commit 6e6e41c3112276288ccaf80c70916779b84bb276 upstream. To avoid slab to warn about exceeded size, fail early if queue occupies more than KMALLOC_MAX_SIZE. Reported-by: syzbot+e4d4f9ddd4295539735d@syzkaller.appspotmail.com Fixes: 2e0ab8ca83c12 ("ptr_ring: array based FIFO for pointers") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-22x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pagesTony Luck
commit fd0e786d9d09024f67bd71ec094b110237dc3840 upstream. In the following commit: ce0fa3e56ad2 ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages") ... we added code to memory_failure() to unmap the page from the kernel 1:1 virtual address space to avoid speculative access to the page logging additional errors. But memory_failure() may not always succeed in taking the page offline, especially if the page belongs to the kernel. This can happen if there are too many corrected errors on a page and either mcelog(8) or drivers/ras/cec.c asks to take a page offline. Since we remove the 1:1 mapping early in memory_failure(), we can end up with the page unmapped, but still in use. On the next access the kernel crashes :-( There are also various debug paths that call memory_failure() to simulate occurrence of an error. Since there is no actual error in memory, we don't need to map out the page for those cases. Revert most of the previous attempt and keep the solution local to arch/x86/kernel/cpu/mcheck/mce.c. Unmap the page only when: 1) there is a real error 2) memory_failure() succeeds. All of this only applies to 64-bit systems. 32-bit kernel doesn't map all of memory into kernel space. It isn't worth adding the code to unmap the piece that is mapped because nobody would run a 32-bit kernel on a machine that has recoverable machine checks. Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert (Persistent Memory) <elliott@hpe.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Cc: stable@vger.kernel.org #v4.14 Fixes: ce0fa3e56ad2 ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages") Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-22jbd2: fix sphinx kernel-doc build warningsTobin C. Harding
commit f69120ce6c024aa634a8fc25787205e42f0ccbe6 upstream. Sphinx emits various (26) warnings when building make target 'htmldocs'. Currently struct definitions contain duplicate documentation, some as kernel-docs and some as standard c89 comments. We can reduce duplication while cleaning up the kernel docs. Move all kernel-docs to right above each struct member. Use the set of all existing comments (kernel-doc and c89). Add documentation for missing struct members and function arguments. Signed-off-by: Tobin C. Harding <me@tobin.cc> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-22mlx5: fix mlx5_get_vector_affinity to start from completion vector 0Sagi Grimberg
commit 2572cf57d75a7f91835d9a38771e9e76d575d122 upstream. The consumers of this routine expects the affinity map of of vector index relative to the first completion vector. The upper layers are not aware of internal/private completion vectors that mlx5 allocates for its own usage. Hence, return the affinity map of vector index relative to the first completion vector. Fixes: 05e0cc84e00c ("net/mlx5: Fix get vector affinity helper function") Reported-by: Logan Gunthorpe <logang@deltatee.com> Tested-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Cc: <stable@vger.kernel.org> # v4.15 Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-22x86/mm: Rename flush_tlb_single() and flush_tlb_one() to ↵Andy Lutomirski
__flush_tlb_one_[user|kernel]() commit 1299ef1d8870d2d9f09a5aadf2f8b2c887c2d033 upstream. flush_tlb_single() and flush_tlb_one() sound almost identical, but they really mean "flush one user translation" and "flush one kernel translation". Rename them to flush_tlb_one_user() and flush_tlb_one_kernel() to make the semantics more obvious. [ I was looking at some PTI-related code, and the flush-one-address code is unnecessarily hard to understand because the names of the helpers are uninformative. This came up during PTI review, but no one got around to doing it. ] Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Hugh Dickins <hughd@google.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linux-MM <linux-mm@kvack.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/3303b02e3c3d049dc5235d5651e0ae6d29a34354.1517414378.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-22nospec: Move array_index_nospec() parameter checking into separate macroWill Deacon
commit 8fa80c503b484ddc1abbd10c7cb2ab81f3824a50 upstream. For architectures providing their own implementation of array_index_mask_nospec() in asm/barrier.h, attempting to use WARN_ONCE() to complain about out-of-range parameters using WARN_ON() results in a mess of mutually-dependent include files. Rather than unpick the dependencies, simply have the core code in nospec.h perform the checking for us. Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1517840166-15399-1-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-22PM: cpuidle: Fix cpuidle_poll_state_init() prototypeRafael J. Wysocki
commit d7212cfb05ba802bea4dd6c90d61cfe6366ea224 upstream. Commit f85942207516 (x86: PM: Make APM idle driver initialize polling state) made apm_init() call cpuidle_poll_state_init(), but that only is defined for CONFIG_CPU_IDLE set, so make the empty stub of it available for CONFIG_CPU_IDLE unset too to fix the resulting build issue. Fixes: f85942207516 (x86: PM: Make APM idle driver initialize polling state) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-22compiler-gcc.h: __nostackprotector needs gcc-4.4 and upGeert Uytterhoeven
commit d9afaaa4ff7af8b87d4a205e48cb8a6f666d7f01 upstream. Gcc versions before 4.4 do not recognize the __optimize__ compiler attribute: warning: ‘__optimize__’ attribute directive ignored Fixes: 7375ae3a0b79ea07 ("compiler-gcc.h: Introduce __nostackprotector function attribute") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-22compiler-gcc.h: Introduce __optimize function attributeGeert Uytterhoeven
commit df5d45aa08f848b79caf395211b222790534ccc7 upstream. Create a new function attribute __optimize, which allows to specify an optimization level on a per-function basis. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-22x86/gpu: add CFL to early quirksLucas De Marchi
commit 33aa69ed8aacd92dea12671e52eb3ca6ac2d7a49 upstream. CFL was missing from intel_early_ids[]. The PCI ID needs to be there to allow the memory region to be stolen, otherwise we could have RAM being arbitrarily overwritten if for example we keep using the UEFI framebuffer, depending on how BIOS has set up the e820 map. Fixes: b056f8f3d6b9 ("drm/i915/cfl: Add Coffee Lake PCI IDs for S Skus.") Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: David Airlie <airlied@linux.ie> Cc: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Cc: <stable@vger.kernel.org> # v4.13+ 0890540e21cf drm/i915: add GT number to intel_device_info Cc: <stable@vger.kernel.org> # v4.13+ 41693fd52373 drm/i915/kbl: Change a KBL pci id to GT2 from GT1.5 Cc: <stable@vger.kernel.org> # v4.13+ Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171213200425.2954-1-lucas.demarchi@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-22IB/core: Fix ib_wc structure size to remain in 64 bytes boundaryBodong Wang
commit cd2a6e7d384b043d5d029e39663061cebc949385 upstream. The change of slid from u16 to u32 results in sizeof(struct ib_wc) cross 64B boundary, which causes more cache misses. This patch rearranges the fields and remain the size to 64B. Pahole output before this change: struct ib_wc { union { u64 wr_id; /* 8 */ struct ib_cqe * wr_cqe; /* 8 */ }; /* 0 8 */ enum ib_wc_status status; /* 8 4 */ enum ib_wc_opcode opcode; /* 12 4 */ u32 vendor_err; /* 16 4 */ u32 byte_len; /* 20 4 */ struct ib_qp * qp; /* 24 8 */ union { __be32 imm_data; /* 4 */ u32 invalidate_rkey; /* 4 */ } ex; /* 32 4 */ u32 src_qp; /* 36 4 */ int wc_flags; /* 40 4 */ u16 pkey_index; /* 44 2 */ /* XXX 2 bytes hole, try to pack */ u32 slid; /* 48 4 */ u8 sl; /* 52 1 */ u8 dlid_path_bits; /* 53 1 */ u8 port_num; /* 54 1 */ u8 smac[6]; /* 55 6 */ /* XXX 1 byte hole, try to pack */ u16 vlan_id; /* 62 2 */ /* --- cacheline 1 boundary (64 bytes) --- */ u8 network_hdr_type; /* 64 1 */ /* size: 72, cachelines: 2, members: 17 */ /* sum members: 62, holes: 2, sum holes: 3 */ /* padding: 7 */ /* last cacheline: 8 bytes */ }; Pahole output after this change: struct ib_wc { union { u64 wr_id; /* 8 */ struct ib_cqe * wr_cqe; /* 8 */ }; /* 0 8 */ enum ib_wc_status status; /* 8 4 */ enum ib_wc_opcode opcode; /* 12 4 */ u32 vendor_err; /* 16 4 */ u32 byte_len; /* 20 4 */ struct ib_qp * qp; /* 24 8 */ union { __be32 imm_data; /* 4 */ u32 invalidate_rkey; /* 4 */ } ex; /* 32 4 */ u32 src_qp; /* 36 4 */ u32 slid; /* 40 4 */ int wc_flags; /* 44 4 */ u16 pkey_index; /* 48 2 */ u8 sl; /* 50 1 */ u8 dlid_path_bits; /* 51 1 */ u8 port_num; /* 52 1 */ u8 smac[6]; /* 53 6 */ /* XXX 1 byte hole, try to pack */ u16 vlan_id; /* 60 2 */ u8 network_hdr_type; /* 62 1 */ /* size: 64, cachelines: 1, members: 17 */ /* sum members: 62, holes: 1, sum holes: 1 */ /* padding: 1 */ }; Fixes: 7db20ecd1d97 ("IB/core: Change wc.slid from 16 to 32 bits") Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16scsi: core: Ensure that the SCSI error handler gets woken upBart Van Assche
commit 3bd6f43f5cb3714f70c591514f344389df593501 upstream. If scsi_eh_scmd_add() is called concurrently with scsi_host_queue_ready() while shost->host_blocked > 0 then it can happen that neither function wakes up the SCSI error handler. Fix this by making every function that decreases the host_busy counter wake up the error handler if necessary and by protecting the host_failed checks with the SCSI host lock. Reported-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> References: https://marc.info/?l=linux-kernel&m=150461610630736 Fixes: commit 746650160866 ("scsi: convert host_busy to atomic_t") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Tested-by: Stuart Hayes <stuart.w.hayes@gmail.com> Cc: Konstantin Khorenko <khorenko@virtuozzo.com> Cc: Stuart Hayes <stuart.w.hayes@gmail.com> Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16crypto: hash - prevent using keyed hashes without setting keyEric Biggers
commit 9fa68f620041be04720d0cbfb1bd3ddfc6310b24 upstream. Currently, almost none of the keyed hash algorithms check whether a key has been set before proceeding. Some algorithms are okay with this and will effectively just use a key of all 0's or some other bogus default. However, others will severely break, as demonstrated using "hmac(sha3-512-generic)", the unkeyed use of which causes a kernel crash via a (potentially exploitable) stack buffer overflow. A while ago, this problem was solved for AF_ALG by pairing each hash transform with a 'has_key' bool. However, there are still other places in the kernel where userspace can specify an arbitrary hash algorithm by name, and the kernel uses it as unkeyed hash without checking whether it is really unkeyed. Examples of this include: - KEYCTL_DH_COMPUTE, via the KDF extension - dm-verity - dm-crypt, via the ESSIV support - dm-integrity, via the "internal hash" mode with no key given - drbd (Distributed Replicated Block Device) This bug is especially bad for KEYCTL_DH_COMPUTE as that requires no privileges to call. Fix the bug for all users by adding a flag CRYPTO_TFM_NEED_KEY to the ->crt_flags of each hash transform that indicates whether the transform still needs to be keyed or not. Then, make the hash init, import, and digest functions return -ENOKEY if the key is still needed. The new flag also replaces the 'has_key' bool which algif_hash was previously using, thereby simplifying the algif_hash implementation. Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>