summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2016-11-18ACPI/PCI: pci_link: penalize SCI correctlySinan Kaya
commit f1caa61df2a3dc4c58316295c5dc5edba4c68d85 upstream. Ondrej reported that IRQs stopped working in v4.7 on several platforms. A typical scenario, from Ondrej's VT82C694X/694X, is: ACPI: Using PIC for interrupt routing ACPI: PCI Interrupt Link [LNKA] (IRQs 1 3 4 5 6 7 10 *11 12 14 15) ACPI: No IRQ available for PCI Interrupt Link [LNKA] 8139too 0000:00:0f.0: PCI INT A: no GSI We're using PIC routing, so acpi_irq_balance == 0, and LNKA is already active at IRQ 11. In that case, acpi_pci_link_allocate() only tries to use the active IRQ (IRQ 11) which also happens to be the SCI. We should penalize the SCI by PIRQ_PENALTY_PCI_USING, but irq_get_trigger_type(11) returns something other than IRQ_TYPE_LEVEL_LOW, so we penalize it by PIRQ_PENALTY_ISA_ALWAYS instead, which makes acpi_pci_link_allocate() assume the IRQ isn't available and give up. Add acpi_penalize_sci_irq() so platforms can tell us the SCI IRQ, trigger, and polarity directly and we don't have to depend on irq_get_trigger_type(). Fixes: 103544d86976 (ACPI,PCI,IRQ: reduce resource requirements) Link: http://lkml.kernel.org/r/201609251512.05657.linux@rainbow-software.org Reported-by: Ondrej Zary <linux@rainbow-software.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Tested-by: Jonathan Liu <net147@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-18s390/dumpstack: restore reliable indicator for call tracesHeiko Carstens
commit d0208639dbc6fe97a25054df44faa2d19aca9380 upstream. Before merging all different stack tracers the call traces printed had an indicator if an entry can be considered reliable or not. Unreliable entries were put in braces, reliable not. Currently all lines contain these extra braces. This patch restores the old behaviour by adding an extra "reliable" parameter to the callback functions. Only show_trace makes currently use of it. Before: [ 0.804751] Call Trace: [ 0.804753] ([<000000000017d0e0>] try_to_wake_up+0x318/0x5e0) [ 0.804756] ([<0000000000161d64>] create_worker+0x174/0x1c0) After: [ 0.804751] Call Trace: [ 0.804753] ([<000000000017d0e0>] try_to_wake_up+0x318/0x5e0) [ 0.804756] [<0000000000161d64>] create_worker+0x174/0x1c0 Fixes: 758d39ebd3d5 ("s390/dumpstack: merge all four stack tracers") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-18x86/build: Fix build with older GCC versionsJan Beulich
commit a2209b742e6cf978b85d4f31a25a269c3d3b062b upstream. Older GCC (observed with 4.1.x) doesn't support -Wno-override-init and also doesn't ignore unknown -Wno-* options. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu> Cc: Valdis.Kletnieks@vt.edu Fixes: 5e44258d16 "x86/build: Reduce the W=1 warnings noise when compiling x86 syscall tables" Link: http://lkml.kernel.org/r/580E3E1C02000078001191C4@prv-mh.provo.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-18arc: Implement arch-specific dma_map_ops.mmapAlexey Brodkin
commit a79a812131b07254c09cf325ec68c0d05aaed0b5 upstream. We used to use generic implementation of dma_map_ops.mmap which is dma_common_mmap() but that only worked for simpler cached mappings when vaddr = paddr. If a driver requests uncached DMA buffer kernel maps it to virtual address so that MMU gets involved and page uncached status takes into account. In that case usage of dma_common_mmap() lead to mapping of vaddr to vaddr for user-space which is obviously wrong. For more detals please refer to verbose explanation here [1]. So here we implement our own version of mmap() which always deals with dma_addr and maps underlying memory to user-space properly (note that DMA buffer mapped to user-space is always uncached because there's no way to properly manage cache from user-space). [1] https://lkml.org/lkml/2016/10/26/973 Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-18ARC: timer: rtc: implement read loop in "C" vs. inline asmVineet Gupta
commit 922cc171998ac3dbe74d57011ef7ed57e9b0d7df upstream. The current code doesn't even compile as somehow the inline assembly can't see the register names defined as ARC_RTC_* I'm pretty sure It worked when I first got it merged, but the tools were definitely different then. So better to write this in "C" anyways. Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-18s390/hypfs: Use get_free_page() instead of kmalloc to ensure page alignmentMichael Holzheu
commit 237d6e6884136923b6bd26d5141ebe1d065960c9 upstream. Since commit d86bd1bece6f ("mm/slub: support left redzone") it is no longer guaranteed that kmalloc(PAGE_SIZE) returns page aligned memory. After the above commit we get an error for diag224 because aligned memory is required. This leads to the following user visible error: # mount none -t s390_hypfs /sys/hypervisor/ mount: unknown filesystem type 's390_hypfs' # dmesg | grep hypfs hypfs.cccfb8: The hardware system does not provide all functions required by hypfs hypfs.7a79f0: Initialization of hypfs failed with rc=-61 Fix this problem and use get_free_page() instead of kmalloc() to get correctly aligned memory. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-15arch/powerpc: Update parameters for csum_tcpudp_magic & csum_tcpudp_nofoldIvan Vecera
[ Upstream commit f9d4286b9516b02e795214412d36885f572b57ad ] Commit 01cfbad "ipv4: Update parameters for csum_tcpudp_magic to their original types" changed parameters for csum_tcpudp_magic and csum_tcpudp_nofold for many platforms but not for PowerPC. Fixes: 01cfbad "ipv4: Update parameters for csum_tcpudp_magic to their original types" Cc: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10kvm: x86: Check memopp before dereference (CVE-2016-8630)Owen Hofmann
commit d9092f52d7e61dd1557f2db2400ddb430e85937e upstream. Commit 41061cdb98 ("KVM: emulate: do not initialize memopp") removes a check for non-NULL under incorrect assumptions. An undefined instruction with a ModR/M byte with Mod=0 and R/M-5 (e.g. 0xc7 0x15) will attempt to dereference a null pointer here. Fixes: 41061cdb98a0bec464278b4db8e894a3121671f5 Message-Id: <1477592752-126650-2-git-send-email-osh@google.com> Signed-off-by: Owen Hofmann <osh@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10ARM: fix oops when using older ARMv4T CPUsRussell King
commit 04946fb60fb157faafa01658dff3131d49f49ccb upstream. Alexander Shiyan reports that CLPS711x fails at boot time in the data exception handler due to a NULL pointer dereference. This is caused by the late-v4t abort handler overwriting R9 (which becomes zero). Fix this by making the abort handler save and restore R9. Unable to handle kernel NULL pointer dereference at virtual address 00000008 pgd = c3b58000 [00000008] *pgd=800000000, *pte=00000000, *ppte=feff4140 Internal error: Oops: 63c11817 [#1] PREEMPT ARM CPU: 0 PID: 448 Comm: ash Not tainted 4.8.1+ #1 Hardware name: Cirrus Logic CLPS711X (Device Tree Support) task: c39e03a0 ti: c3b4e000 task.ti: c3b4e000 PC is at __dabt_svc+0x4c/0x60 LR is at do_page_fault+0x144/0x2ac pc : [<c000d3ac>] lr : [<c000fcec>] psr: 60000093 sp : c3b4fe6c ip : 00000001 fp : b6f1bf88 r10: c387a5a0 r9 : 00000000 r8 : e4e0e001 r7 : bee3ef83 r6 : 00100000 r5 : 80000013 r4 : c022fcf8 r3 : 00000000 r2 : 00000008 r1 : bf000000 r0 : 00000000 Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: 0000217f Table: c3b58055 DAC: 00000055 Process ash (pid: 448, stack limit = 0xc3b4e190) Stack: (0xc3b4fe6c to 0xc3b50000) fe60: bee3ef83 c05168d1 ffffffff 00000000 c3adfe80 fe80: c3a03300 00000000 c3b4fed0 c3a03400 bee3ef83 c387a5a0 b6f1bf88 00000001 fea0: c3b4febc 00000076 c022fcf8 80000013 ffffffff 0000003f bf000000 bee3ef83 fec0: 00000004 00000000 c3adfe80 c00e432c 00000812 00000005 00000001 00000006 fee0: b6f1b000 00000000 00010000 0003c944 0004d000 0004d439 00010000 b6f1b000 ff00: 00000005 00000000 00015ecc c3b4fed0 0000000a 00000000 00000000 c00a1dc0 ff20: befff000 c3a03300 c3b4e000 c0507cd8 c0508024 fffffff8 c3a03300 00000000 ff40: c0516a58 c00a35bc c39e03a0 000001c0 bea84ce8 0004e008 c3b3a000 c00a3ac0 ff60: c3b40374 c3b3a000 bea84d11 00000000 c0500188 bea84d11 bea84ce8 00000001 ff80: 0000000b c000a304 c3b4e000 00000000 bea84ce4 c00a3cd0 00000000 bea84d11 ffa0: bea84ce8 c000a160 bea84d11 bea84ce8 bea84d11 bea84ce8 0004e008 0004d450 ffc0: bea84d11 bea84ce8 00000001 0000000b b6f45ee4 00000000 b6f5ff70 bea84ce4 ffe0: b6f2f130 bea84cb0 b6f2f194 b6ef29f4 a0000010 bea84d11 02c7cffa 02c7cffd [<c000d3ac>] (__dabt_svc) from [<c022fcf8>] (__copy_to_user_std+0xf8/0x330) [<c022fcf8>] (__copy_to_user_std) from [<c00e432c>] +(load_elf_binary+0x920/0x107c) [<c00e432c>] (load_elf_binary) from [<c00a35bc>] +(search_binary_handler+0x80/0x16c) [<c00a35bc>] (search_binary_handler) from [<c00a3ac0>] +(do_execveat_common+0x418/0x600) [<c00a3ac0>] (do_execveat_common) from [<c00a3cd0>] (do_execve+0x28/0x30) [<c00a3cd0>] (do_execve) from [<c000a160>] (ret_fast_syscall+0x0/0x30) Code: e1a0200d eb00136b e321f093 e59d104c (e5891008) ---[ end trace 4b4f8086ebef98c5 ]--- Fixes: e6978e4bf181 ("ARM: save and reset the address limit when entering an exception") Reported-by: Alexander Shiyan <shc_work@mail.ru> Tested-by: Alexander Shiyan <shc_work@mail.ru> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10parisc: Ensure consistent state when switching to kernel stack at syscall entryJohn David Anglin
commit 6ed518328d0189e0fdf1bb7c73290d546143ea66 upstream. We have one critical section in the syscall entry path in which we switch from the userspace stack to kernel stack. In the event of an external interrupt, the interrupt code distinguishes between those two states by analyzing the value of sr7. If sr7 is zero, it uses the kernel stack. Therefore it's important, that the value of sr7 is in sync with the currently enabled stack. This patch now disables interrupts while executing the critical section. This prevents the interrupt handler to possibly see an inconsistent state which in the worst case can lead to crashes. Interestingly, in the syscall exit path interrupts were already disabled in the critical section which switches back to the userspace stack. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10MIPS: KASLR: Fix handling of NULL FDTMatt Redfearn
commit 4736697963385e6257ee8e260e97347e858cd962 upstream. If platform code returns a NULL pointer to the FDT, initial_boot_params will not get set to a valid pointer and attempting to find the /chosen node in it will cause a NULL pointer dereference and the kernel to crash immediately on startup - with no output to the console. Fix this by checking that initial_boot_params is valid before using it. Fixes: 405bc8fd12f5 ("MIPS: Kernel: Implement KASLR using CONFIG_RELOCATABLE") Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/14414/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10ARM: dts: fix the SD card on the SnowballLinus Walleij
commit 1b283eea6228880b765bc40fe4e555416437ce58 upstream. This fixes a very annoying regression on the Snowball SD card that has been around for a while. It turns out that the device tree does not configure the direction pins properly, nor sets up the pins for the voltage converter properly at boot. Unless all things are correctly set up, the feedback clock will not work, and makes the driver spew messages in the console (but it works, very slowly): root@Ux500:/ mount /dev/mmcblk0p2 /mnt/ [ 9.953460] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer! [ 9.960296] mmcblk0: error -110 sending status command, retrying [ 9.966461] mmcblk0: error -110 sending status command, retrying [ 9.972534] mmcblk0: error -110 sending status command, aborting Fix this by rectifying the device tree to correspond to that of the Ux500 HREF boards plus the DAT31DIR setting that is unique for the Snowball, and things start working smoothly. Add in the SDR12 and SDR25 modes which this host can do without any problems. I don't know if this has ever been correct, sadly. It works after this patch. Reported-by: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10ARM: mvebu: Select corediv clk for all mvebu v7 SoCGregory CLEMENT
commit 33c45ef8adc8a7cf781b2566d50e6ea8e97b3596 upstream. Since the commit bd3677ff31a3 ("clk: mvebu: Remove corediv clock from Armada XP"), the corediv clk is no more selected for Armada XP, however this clock is used for Armada XP using the compatible armada-370-corediv-clock. While since commit 1594d568c6e3 ("clk: mvebu: Move corediv config to mvebu config") Armada 38x and Armada 375 got corediv support again, not only Armada XP was missed but also Armada 39x. Actually all the SoC selecting MVEBU_V7 config need this clock: git grep "\-corediv-clock" arch/arm/boot/dts arch/arm/boot/dts/armada-370-xp.dtsi: compatible = "marvell,armada-370-corediv-clock"; arch/arm/boot/dts/armada-375.dtsi: compatible = "marvell,armada-375-corediv-clock"; arch/arm/boot/dts/armada-38x.dtsi: compatible = "marvell,armada-380-corediv-clock"; arch/arm/boot/dts/armada-39x.dtsi: compatible = "marvell,armada-390-corediv-clock" This commit now fixes this behavior by letting MVEBU_V7 select MVEBU_CLK_COREDIV. Fixes: bd3677ff31a3 ("clk: mvebu: Remove corediv clock from Armada XP") Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10KVM: MIPS: Precalculate MMIO load resume PCJames Hogan
commit e1e575f6b026734be3b1f075e780e91ab08ca541 upstream. The advancing of the PC when completing an MMIO load is done before re-entering the guest, i.e. before restoring the guest ASID. However if the load is in a branch delay slot it may need to access guest code to read the prior branch instruction. This isn't safe in TLB mapped code at the moment, nor in the future when we'll access unmapped guest segments using direct user accessors too, as it could read the branch from host user memory instead. Therefore calculate the resume PC in advance while we're still in the right context and save it in the new vcpu->arch.io_pc (replacing the no longer needed vcpu->arch.pending_load_cause), and restore it on MMIO completion. Fixes: e685c689f3a8 ("KVM/MIPS32: Privileged instruction/target branch emulation.") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10KVM: MIPS: Make ERET handle ERL before EXLJames Hogan
commit ede5f3e7b54a4347be4d8525269eae50902bd7cd upstream. The ERET instruction to return from exception is used for returning from exception level (Status.EXL) and error level (Status.ERL). If both bits are set however we should be returning from ERL first, as ERL can interrupt EXL, for example when an NMI is taken. KVM however checks EXL first. Fix the order of the checks to match the pseudocode in the instruction set manual. Fixes: e685c689f3a8 ("KVM/MIPS32: Privileged instruction/target branch emulation.") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10KVM: s390: Fix STHYI buffer alignment for diag224Janosch Frank
commit 45c7ee43a5184ddbff652ee0d2e826f86f1b616b upstream. Diag224 requires a page-aligned 4k buffer to store the name table into. kmalloc does not guarantee page alignment, hence we replace it with __get_free_page for the buffer allocation. Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10KVM: x86: fix wbinvd_dirty_mask use-after-freeIdo Yariv
commit bd768e146624cbec7122ed15dead8daa137d909d upstream. vcpu->arch.wbinvd_dirty_mask may still be used after freeing it, corrupting memory. For example, the following call trace may set a bit in an already freed cpu mask: kvm_arch_vcpu_load vcpu_load vmx_free_vcpu_nested vmx_free_vcpu kvm_arch_vcpu_free Fix this by deferring freeing of wbinvd_dirty_mask. Signed-off-by: Ido Yariv <ido@wizery.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10arm64: dts: marvell: fix clocksource for CP110 master SPI0Marcin Wojtas
commit 51227bf52008bd4c4c50da4b749bbc6e7bbbca52 upstream. I2C and SPI interfaces share common clock trees within the CP110 HW block. It occurred that SPI0 interface has wrong clock assignment in the device tree, which is fixed in this commit to a proper value. Fixes: 728dacc7f4dd ("arm64: dts: marvell: initial DT description of ...") Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10x86/smpboot: Init apic mapping before usageThomas Gleixner
commit 1e90a13d0c3dc94512af1ccb2b6563e8297838fa upstream. The recent changes, which forced the registration of the boot cpu on UP systems, which do not have ACPI tables, have been fixed for systems w/o local APIC, but left a wreckage for systems which have neither ACPI nor mptables, but the CPU has an APIC, e.g. virtualbox. The boot process crashes in prefill_possible_map() as it wants to register the boot cpu, which needs to access the local apic, but the local APIC is not yet mapped. There is no reason why init_apic_mapping() can't be invoked before prefill_possible_map(). So instead of playing another silly early mapping game, as the ACPI/mptables code does, we just move init_apic_mapping() before the call to prefill_possible_map(). In hindsight, I should have noticed that combination earlier. Sorry for the churn (also in stable)! Fixes: ff8560512b8d ("x86/boot/smp: Don't try to poke disabled/non-existent APIC") Reported-and-debugged-by: Michal Necasek <michal.necasek@oracle.com> Reported-and-tested-by: Wolfgang Bauer <wbauer@tmo.at> Cc: prarit@redhat.com Cc: ville.syrjala@linux.intel.com Cc: michael.thayer@oracle.com Cc: knut.osmundsen@oracle.com Cc: frank.mehnert@oracle.com Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610282114380.5053@nanos Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10x86/microcode/AMD: Fix more fallout from CONFIG_RANDOMIZE_MEMORY=yBorislav Petkov
commit 1c27f646b18fb56308dff82784ca61951bad0b48 upstream. We needed the physical address of the container in order to compute the offset within the relocated ramdisk. And we did this by doing __pa() on the virtual address. However, __pa() does checks whether the physical address is within PAGE_OFFSET and __START_KERNEL_map - see __phys_addr() - which fail if we have CONFIG_RANDOMIZE_MEMORY enabled: we feed a virtual address which *doesn't* have the randomization offset into a function which uses PAGE_OFFSET which *does* have that offset. This makes this check fire: VIRTUAL_BUG_ON((x > y) || !phys_addr_valid(x)); ^^^^^^ due to the randomization offset. The fix is as simple as using __pa_nodebug() because we do that randomization offset accounting later in that function ourselves. Reported-by: Bob Peterson <rpeterso@redhat.com> Tested-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm <linux-mm@kvack.org> Link: http://lkml.kernel.org/r/20161027123623.j2jri5bandimboff@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10powerpc/64: Fix race condition in setting lock bit in idle/wakeup codePaul Mackerras
commit 09b7e37b18eecc1e347f4b1a3bc863f32801f634 upstream. This fixes a race condition where one thread that is entering or leaving a power-saving state can inadvertently ignore the lock bit that was set by another thread, and potentially also clear it. The core_idle_lock_held function is called when the lock bit is seen to be set. It polls the lock bit until it is clear, then does a lwarx to load the word containing the lock bit and thread idle bits so it can be updated. However, it is possible that the value loaded with the lwarx has the lock bit set, even though an immediately preceding lwz loaded a value with the lock bit clear. If this happens then we go ahead and update the word despite the lock bit being set, and when called from pnv_enter_arch207_idle_mode, we will subsequently clear the lock bit. No identifiable misbehaviour has been attributed to this race. This fixes it by checking the lock bit in the value loaded by the lwarx. If it is set then we just go back and keep on polling. Fixes: b32aadc1a8ed ("powerpc/powernv: Fix race in updating core_idle_state") Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10powerpc/64: Re-fix race condition between going idle and entering guestPaul Mackerras
commit 56c46222af0d09149fadec2a3ce9d4889de01cc6 upstream. Commit 8117ac6a6c2f ("powerpc/powernv: Switch off MMU before entering nap/sleep/rvwinkle mode", 2014-12-10) fixed a race condition where one thread entering a KVM guest could switch the MMU context to the guest while another thread was still in host kernel context with the MMU on. That commit moved the point where a thread entering a power-saving mode set its kvm_hstate.hwthread_state field in its PACA to KVM_HWTHREAD_IN_IDLE from a point where the MMU was on to after the MMU had been switched off. That commit also added a comment explaining that we have to switch to real mode before setting hwthread_state to avoid this race. Nevertheless, commit 4eae2c9ae54a ("powerpc/powernv: Make pnv_powersave_common more generic", 2016-07-08) subsequently moved the setting of hwthread_state back to a point where the MMU is on, thus reintroducing the race, despite the comment saying that this should not be done being included in full in the context lines of the patch that did it. This fixes the race again and adds a bigger and shoutier comment explaining the potential race condition. Fixes: 4eae2c9ae54a ("powerpc/powernv: Make pnv_powersave_common more generic") Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Shreyas B. Prabhu <shreyasbp@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10powerpc/mm/radix: Use tlbiel only if we ever ran on the current cpuAneesh Kumar K.V
commit bd77c4498616e27d5725b5959d880ce2272fefa9 upstream. Before this patch, we used tlbiel, if we ever ran only on this core. That was mostly derived from the nohash usage of the same. But is incorrect, the ISA 3.0 clarifies tlbiel such that: "All TLB entries that have all of the following properties are made invalid on the thread executing the tlbiel instruction" ie. tlbiel only invalidates TLB entries on the current thread. So if the mm has been used on any other thread (aka. cpu) then we must broadcast the invalidate. This bug could lead to invalid TLB entries if a program runs on multiple threads of a core. Hence use tlbiel, if we only ever ran on only the current cpu. Fixes: 1a472c9dba6b ("powerpc/mm/radix: Add tlbflush routines") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10powerpc: Convert cmp to cmpd in idle enter sequenceSegher Boessenkool
commit 80f23935cadb1c654e81951f5a8b7ceae0acc1b4 upstream. PowerPC's "cmp" instruction has four operands. Normally people write "cmpw" or "cmpd" for the second cmp operand 0 or 1. But, frequently people forget, and write "cmp" with just three operands. With older binutils this is silently accepted as if this was "cmpw", while often "cmpd" is wanted. With newer binutils GAS will complain about this for 64-bit code. For 32-bit code it still silently assumes "cmpw" is what is meant. In this instance the code comes directly from ISA v2.07, including the cmp, but cmpd is correct. Backport to stable so that new toolchains can build old kernels. Fixes: 948cf67c4726 ("powerpc: Add NAP mode support on Power7 in HV mode") Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10h8300: fix syscall restartingMark Rutland
commit 21753583056d48a5fad964d6f272e28168426845 upstream. Back in commit f56141e3e2d9 ("all arches, signal: move restart_block to struct task_struct"), all architectures and core code were changed to use task_struct::restart_block. However, when h8300 support was subsequently restored in v4.2, it was not updated to account for this, and maintains thread_info::restart_block, which is not kept in sync. This patch drops the redundant restart_block from thread_info, and moves h8300 to the common one in task_struct, ensuring that syscall restarting always works as expected. Fixes: f56141e3e2d9 ("all arches, signal: move restart_block to struct task_struct") Link: http://lkml.kernel.org/r/1476714934-11635-1-git-send-email-mark.rutland@arm.com Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: uclinux-h8-devel@lists.sourceforge.jp Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31ARM: dts: omap3: overo: add missing unit name for lcd35 displayJavier Martinez Canillas
commit 0b965a13ad81fa895e534d1f50b355ff8b0b3ed3 upstream. Commit b8d368caa8dc ("ARM: dts: omap3: overo: remove unneded unit names in display nodes") removed the unit names for all Overo display nodes that didn't have a reg property. But the display in arch/arm/boot/dts/omap3-overo-common-lcd35.dtsi does have a reg property so the correct fix was to make the unit name match the value of the reg property, instead of removing it. This patch fixes the following DTC warning for boards using this dtsi: "ocp/spi@48098000/display has a reg or ranges property, but no unit name" Fixes: b8d368caa8dc ("ARM: dts: omap3: overo: remove unneded unit names in display nodes") Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31ARM: dts: fix RealView EB SMSC ethernet versionLinus Walleij
commit c4ad72560df11961d3e57fb0fadfe88a9863c9ad upstream. The ethernet version in the earlier RealView EB variants is LAN91C111 and not LAN9118 according to ARM DUI 0303E "RealView Emulation Baseboard User Guide" page 3-57. Make sure that this is used for the base variant of the board. As the DT bindings for LAN91C111 does not specify any power supplies, these need to be deleted from the DTS file. Fixes: 2440d29d2ae2 ("ARM: dts: realview: support all the RealView EB board variants") Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31ARM: dts: NSP: Correct RAM amount for BCM958625HR boardJon Mason
commit c53beb47f621e4a56f31af9f86470041655516c7 upstream. The BCM958625HR board has 2GB of RAM available. Increase the amount from 512MB to 2GB and add the device type to the memory entry. Fixes: 9a4865d42fe5 ("ARM: dts: NSP: Specify RAM amount for BCM958625HR board") Signed-off-by: Jon Mason <jon.mason@broadcom.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31ARM: pxa: fix GPIO double shiftsRobert Jarzmik
commit ca26475bf02ed8562b9b46f91d3e8b52ec312541 upstream. The commit 9bf448c66d4b ("ARM: pxa: use generic gpio operation instead of gpio register") from Oct 17, 2011, leads to the following static checker warning: arch/arm/mach-pxa/spitz_pm.c:172 spitz_charger_wakeup() warn: double left shift '!gpio_get_value(SPITZ_GPIO_KEY_INT) << (1 << ((SPITZ_GPIO_KEY_INT) & 31))' As Dan reported, the value is shifted three times : - once by gpio_get_value(), which returns either 0 or BIT(gpio) - once by the shift operation '<<' - a last time by GPIO_bit(gpio) which is BIT(gpio) Therefore the calculation lead to a chained or operator of : - (1 << gpio) << (1 << gpio) = (2^gpio)^gpio = 2 ^ (gpio * gpio) It is be sheer luck the former statement works, only because each gpio used is strictly smaller than 6, and therefore 2^(gpio^2) never overflows a 32 bits value, and because it is used as a boolean value to check a gpio activation. As the xxx_charger_wakeup() functions are used as a true/false detection mechanism, take that opportunity to change their prototypes from integer return value to boolean one. Fixes: 9bf448c66d4b ("ARM: pxa: use generic gpio operation instead of gpio register") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31ARM: pxa: pxa_cplds: fix interrupt handlingRobert Jarzmik
commit 9ba63e3cc849cdaf3b675c47cc51fe35419e5117 upstream. Since its initial commit, the driver is buggy for multiple interrupts handling. The translation from the former lubbock.c file was not complete, and might stall all interrupt handling when multiple interrupts occur. This is especially true when inside the interrupt handler and if a new interrupt comes and is not handled, leaving the output line still held, and not creating a transition as the GPIO block behind would expect to trigger another cplds_irq_handler() call. For the record, the hardware is working as follows. The interrupt mechanism relies on : - one status register - one mask register Let's suppose the input irq lines are called : - i_sa1111 - i_lan91x - i_mmc_cd Let's suppose the status register for each irq line is called : - status_sa1111 - status_lan91x - status_mmc_cd Let's suppose the interrupt mask for each irq line is called : - irqen_sa1111 - irqen_lan91x - irqen_mmc_cd Let's suppose the output irq line, connected to GPIO0 is called : - o_gpio0 The behavior is as follows : - o_gpio0 = not((status_sa1111 & irqen_sa1111) | (status_lan91x & irqen_lan91x) | (status_mmc_cd & irqen_mmc_cd)) => this is a N-to-1 NOR gate and multiple AND gates - irqen_* is exactly as programmed by a write to the FPGA - status_* behavior is governed by a bi-stable D flip-flop => on next FPGA clock : - if i_xxx is high, status_xxx becomes 1 - if i_xxx is low, status_xxx remains as it is - if software sets status_xxx to 0, the D flip-flop is reset => status_xxx becomes 0 => on next FPGA clock cycle, if i_xxx is high, status_xxx becomes 1 again Fixes: fc9e38c0f4d3 ("ARM: pxa: lubbock: use new pxa_cplds driver") Reported-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31powerpc: Fix usage of _PAGE_RO in hugepageChristophe Leroy
commit 6b8cb66a6a7cc182b47da6a0a1d4e5da324c0695 upstream. On some CPUs like the 8xx, _PAGE_RW hence _PAGE_WRITE is defined as 0 and _PAGE_RO has to be set when a page is not writable _PAGE_RO is defined by default in pte-common.h, however BOOK3S/64 doesn't include that file so _PAGE_RO has to be defined explicitly in book3s/64/pgtable.h Fixes: a7b9f671f2d14 ("powerpc32: adds handling of _PAGE_RO") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31powerpc/nvram: Fix an incorrect partition mergePan Xinhui
commit 11b7e154b132232535befe51c55db048069c8461 upstream. When we merge two contiguous partitions whose signatures are marked NVRAM_SIG_FREE, We need update prev's length and checksum, then write it to nvram, not cur's. So lets fix this mistake now. Also use memset instead of strncpy to set the partition's name. It's more readable if we want to fill up with duplicate chars . Fixes: fa2b4e54d41f ("powerpc/nvram: Improve partition removal") Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31powerpc: Add check_if_tm_restore_required() to giveup_all()Cyril Bur
commit b0f16b46988fde02a1e32078f66a3059d7e53bfc upstream. giveup_all() causes FPU/VMX/VSX facilities to be disabled in a threads MSR. If the thread performing the giveup was transactional, the kernel must record which facilities were in use before the giveup as the thread must have these facilities re-enabled on return to userspace. >From process.c: /* * This is called if we are on the way out to userspace and the * TIF_RESTORE_TM flag is set. It checks if we need to reload * FP and/or vector state and does so if necessary. * If userspace is inside a transaction (whether active or * suspended) and FP/VMX/VSX instructions have ever been enabled * inside that transaction, then we have to keep them enabled * and keep the FP/VMX/VSX state loaded while ever the transaction * continues. The reason is that if we didn't, and subsequently * got a FP/VMX/VSX unavailable interrupt inside a transaction, * we don't know whether it's the same transaction, and thus we * don't know which of the checkpointed state and the transactional * state to use. */ Calling check_if_tm_restore_required() will set TIF_RESTORE_TM and save the MSR if needed. Fixes: c208505 ("powerpc: create giveup_all()") Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31powerpc: Always restore FPU/VEC/VSX if hardware transactional memory in useCyril Bur
commit dc16b553c949e81f37555777dc7bab66d78285a7 upstream. Comment from arch/powerpc/kernel/process.c:967: If userspace is inside a transaction (whether active or suspended) and FP/VMX/VSX instructions have ever been enabled inside that transaction, then we have to keep them enabled and keep the FP/VMX/VSX state loaded while ever the transaction continues. The reason is that if we didn't, and subsequently got a FP/VMX/VSX unavailable interrupt inside a transaction, we don't know whether it's the same transaction, and thus we don't know which of the checkpointed state and the ransactional state to use. restore_math() restore_fp() and restore_altivec() currently may not restore the registers. It doesn't appear that this is more serious than a performance penalty. If the math registers aren't restored the userspace thread will still be run with the facility disabled. Userspace will not be able to read invalid values. On the first access it will take an facility unavailable exception and the kernel will detected an active transaction, at which point it will abort the transaction. There is the possibility for a pathological case preventing any progress by transactions, however, transactions are never guaranteed to make progress. Fixes: 70fe3d9 ("powerpc: Restore FPU/VEC/VSX if previously used") Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31ARM: dts: sun9i: Add missing #interrupt-cells to R_PIO pinctrl device nodeChen-Yu Tsai
commit 06ad11be7a9e13499ff8e55e46f09d22f9ee6fc0 upstream. The R_PIO device node is missing #interrupt-cells, which causes interrupt parsing to fail to match it as a valid interrupt controller. Add #interrupt-cells to it. Also remove the unnecesary #address-cells and #size-cells. Fixes: 1ac56a6da9e1 ("ARM: dts: sun9i: Add A80 R_PIO pin controller device node") Signed-off-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31crypto: arm/ghash-ce - add missing async import/exportArd Biesheuvel
commit ed4767d612fd2c39e2c4c69eba484c1219dcddb6 upstream. Since commit 8996eafdcbad ("crypto: ahash - ensure statesize is non-zero"), all ahash drivers are required to implement import()/export(), and must have a non-zero statesize. Fix this for the ARM Crypto Extensions GHASH implementation. Fixes: 8996eafdcbad ("crypto: ahash - ensure statesize is non-zero") Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31drm/i915: Account for TSEG size when determining 865G stolen baseVille Syrjälä
commit d721b02fd00bf133580f431b82ef37f3b746dfb2 upstream. Looks like the TSEG lives just above TOUD, stolen comes after TSEG. The spec seems somewhat self-contradictory in places, in the ESMRAMC register desctription it says: TSEG Size: 10=(TOUD + 512 KB) to TOUD 11 =(TOUD + 1 MB) to TOUD so that agrees with TSEG being at TOUD. But the example given elsehwere in the spec says: TOUD equals 62.5 MB = 03E7FFFFh TSEG selected as 512 KB in size, Graphics local memory selected as 1 MB in size General System RAM available in system = 62.5 MB General system RAM range00000000h to 03E7FFFFh TSEG address range03F80000h to 03FFFFFFh TSEG pre-allocated from03F80000h to 03FFFFFFh Graphics local memory pre-allocated from03E80000h to 03F7FFFFh so here we have TSEG above stolen. Real world evidence agrees with the TOUD->TSEG->stolen order however, so let's fix up the code to account for the TSEG size. Cc: Taketo Kabe <fdporg@vega.pgw.jp> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Fixes: 0ad98c74e093 ("drm/i915: Determine the stolen memory base address on gen2") Fixes: a4dff76924fe ("x86/gpu: Add Intel graphics stolen memory quirk for gen2 platforms") Reported-by: Taketo Kabe <fdporg@vega.pgw.jp> Tested-by: Taketo Kabe <fdporg@vega.pgw.jp> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96473 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470653919-27251-1-git-send-email-ville.syrjala@linux.intel.com Link: http://download.intel.com/design/chipsets/datashts/25251405.pdf Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28KVM: s390: reject invalid modes for runtime instrumentationChristian Borntraeger
commit a5efb6b6c99a3a6dc4330f51d8066f638bdea0ac upstream. Usually a validity intercept is a programming error of the host because of invalid entries in the state description. We can get a validity intercept if the mode of the runtime instrumentation control block is wrong. As the host does not know which modes are valid, this can be used by userspace to trigger a WARN. Instead of printing a WARN let's return an error to userspace as this can only happen if userspace provides a malformed initial value (e.g. on migration). The kernel should never warn on bogus input. Instead let's log it into the s390 debug feature. While at it, let's return -EINVAL for all validity intercepts as this will trigger an error in QEMU like error: kvm run failed Invalid argument PSW=mask 0404c00180000000 addr 000000000063c226 cc 00 R00=000000000000004f R01=0000000000000004 R02=0000000000760005 R03=000000007fe0a000 R04=000000000064ba2a R05=000000049db73dd0 R06=000000000082c4b0 R07=0000000000000041 R08=0000000000000002 R09=000003e0804042a8 R10=0000000496152c42 R11=000000007fe0afb0 [...] This will avoid an endless loop of validity intercepts. Fixes: c6e5f166373a ("KVM: s390: implement the RI support of guest") Acked-by: Fan Zhang <zhangfan@linux.vnet.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28powerpc/mm: Prevent unlikely crash in copro_calculate_slb()Frederic Barrat
commit d2cf909cda5f8c5609cb7ed6cda816c3e15528c7 upstream. If a cxl adapter faults on an invalid address for a kernel context, we may enter copro_calculate_slb() with a NULL mm pointer (kernel context) and an effective address which looks like a user address. Which will cause a crash when dereferencing mm. It is clearly an AFU bug, but there's no reason to crash either. So return an error, so that cxl can ack the interrupt with an address error. Fixes: 73d16a6e0e51 ("powerpc/cell: Move data segment faulting code out of cell platform") Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28arm64: KVM: Take S1 walks into account when determining S2 write faultsWill Deacon
commit 60e21a0ef54cd836b9eb22c7cb396989b5b11648 upstream. The WnR bit in the HSR/ESR_EL2 indicates whether a data abort was generated by a read or a write instruction. For stage 2 data aborts generated by a stage 1 translation table walk (i.e. the actual page table access faults at EL2), the WnR bit therefore reports whether the instruction generating the walk was a load or a store, *not* whether the page table walker was reading or writing the entry. For page tables marked as read-only at stage 2 (e.g. due to KSM merging them with the tables from another guest), this could result in livelock, where a page table walk generated by a load instruction attempts to set the access flag in the stage 1 descriptor, but fails to trigger CoW in the host since only a read fault is reported. This patch modifies the arm64 kvm_vcpu_dabt_iswrite function to take into account stage 2 faults in stage 1 walks. Since DBM cannot be disabled at EL2 for CPUs that implement it, we assume that these faults are always causes by writes, avoiding the livelock situation at the expense of occasional, spurious CoWs. We could, in theory, do a bit better by checking the guest TCR configuration and inspecting the page table to see why the PTE faulted. However, I doubt this is measurable in practice, and the threat of livelock is real. Cc: Julien Grall <julien.grall@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28arm64: Cortex-A53 errata workaround: check for kernel addressesAndre Przywara
commit 87261d19046aeaeed8eb3d2793fde850ae1b5c9e upstream. Commit 7dd01aef0557 ("arm64: trap userspace "dc cvau" cache operation on errata-affected core") adds code to execute cache maintenance instructions in the kernel on behalf of userland on CPUs with certain ARM CPU errata. It turns out that the address hasn't been checked to be a valid user space address, allowing userland to clean cache lines in kernel space. Fix this by introducing an address check before executing the instructions on behalf of userland. Since the address doesn't come via a syscall parameter, we can't just reject tagged pointers and instead have to remove the tag when checking against the user address limit. Fixes: 7dd01aef0557 ("arm64: trap userspace "dc cvau" cache operation on errata-affected core") Reported-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Andre Przywara <andre.przywara@arm.com> [will: rework commit message + replace access_ok with max_user_addr()] Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28arm64: kernel: Init MDCR_EL2 even in the absence of a PMUMarc Zyngier
commit 850540351bb1a4fa5f192e5ce55b89928cc57f42 upstream. Commit f436b2ac90a0 ("arm64: kernel: fix architected PMU registers unconditional access") made sure we wouldn't access unimplemented PMU registers, but also left MDCR_EL2 uninitialized in that case, leading to trap bits being potentially left set. Make sure we always write something in that register. Fixes: f436b2ac90a0 ("arm64: kernel: fix architected PMU registers unconditional access") Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28arm64: percpu: rewrite ll/sc loops in assemblyWill Deacon
commit 1e6e57d9b34a9075d5f9e2048ea7b09756590d11 upstream. Writing the outer loop of an LL/SC sequence using do {...} while constructs potentially allows the compiler to hoist memory accesses between the STXR and the branch back to the LDXR. On CPUs that do not guarantee forward progress of LL/SC loops when faced with memory accesses to the same ERG (up to 2k) between the failed STXR and the branch back, we may end up livelocking. This patch avoids this issue in our percpu atomics by rewriting the outer loop as part of the LL/SC inline assembly block. Fixes: f97fc810798c ("arm64: percpu: Implement this_cpu operations") Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28arm64: kaslr: fix breakage with CONFIG_MODVERSIONS=yArd Biesheuvel
commit 9c0e83c371cf4696926c95f9c8c77cd6ea803426 upstream. As it turns out, the KASLR code breaks CONFIG_MODVERSIONS, since the kcrctab has an absolute address field that is relocated at runtime when the kernel offset is randomized. This has been fixed already for PowerPC in the past, so simply wire up the existing code dealing with this issue. Fixes: f80fb3a3d508 ("arm64: add support for kernel ASLR") Tested-by: Timur Tabi <timur@codeaurora.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28arm64: swp emulation: bound LL/SC retries before reschedulingWill Deacon
commit 1c5b51dfb7b4564008e0cadec5381a69e88b0d21 upstream. If a CPU does not implement a global monitor for certain memory types, then userspace can attempt a kernel DoS by issuing SWP instructions targetting the problematic memory (for example, a framebuffer mapped with non-cacheable attributes). The SWP emulation code protects against these sorts of attacks by checking for pending signals and potentially rescheduling when the STXR instruction fails during the emulation. Whilst this is good for avoiding livelock, it harms emulation of legitimate SWP instructions on CPUs where forward progress is not guaranteed if there are memory accesses to the same reservation granule (up to 2k) between the failing STXR and the retry of the LDXR. This patch solves the problem by retrying the STXR a bounded number of times (4) before breaking out of the LL/SC loop and looking for something else to do. Fixes: bd35a4adc413 ("arm64: Port SWP/SWPB emulation support from arm") Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28x86/boot/smp: Don't try to poke disabled/non-existent APICVille Syrjälä
commit ff8560512b8d4b7ca3ef4fd69166634ac30b2525 upstream. Apparently trying to poke a disabled or non-existent APIC leads to a box that doesn't even boot. Let's not do that. No real clue if this is the right fix, but at least my P3 machine boots again. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Len Brown <len.brown@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yinghai Lu <yinghai@kernel.org> Cc: dyoung@redhat.com Cc: kexec@lists.infradead.org Fixes: 2a51fe083eba ("arch/x86: Handle non enumerated CPU after physical hotplug") Link: http://lkml.kernel.org/r/1477102684-5092-1-git-send-email-ville.syrjala@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28x86/platform/UV: Fix support for EFI_OLD_MEMMAP after BIOS callback updatesAlex Thorlton
commit caef78b6cdeddf4ad364f95910bba6b43b8eb9bf upstream. Some time ago, we brought our UV BIOS callback code up to speed with the new EFI memory mapping scheme, in commit: d1be84a232e3 ("x86/uv: Update uv_bios_call() to use efi_call_virt_pointer()") By leveraging some changes that I made to a few of the EFI runtime callback mechanisms, in commit: 80e75596079f ("efi: Convert efi_call_virt() to efi_call_virt_pointer()") This got everything running smoothly on UV, with the new EFI mapping code. However, this left one, small loose end, in that EFI_OLD_MEMMAP (a.k.a. efi=old_map) will no longer work on UV, on kernels that include the aforementioned changes. At the time this was not a major issue (in fact, it still really isn't), but there's no reason that EFI_OLD_MEMMAP *shouldn't* work on our systems. This commit adds a check into uv_bios_call(), to see if we have the EFI_OLD_MEMMAP bit set in efi.flags. If it is set, we fall back to using our old callback method, which uses efi_call() directly on the __va() of our function pointer. Signed-off-by: Alex Thorlton <athorlton@sgi.com> Acked-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dimitri Sivanich <sivanich@sgi.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Mike Travis <travis@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russ Anderson <rja@sgi.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1476928131-170101-1-git-send-email-athorlton@sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28kvm: x86: memset whole irq_eoiJiri Slaby
commit 8678654e3c7ad7b0f4beb03fa89691279cba71f9 upstream. gcc 7 warns: arch/x86/kvm/ioapic.c: In function 'kvm_ioapic_reset': arch/x86/kvm/ioapic.c:597:2: warning: 'memset' used with length equal to number of elements without multiplication by element size [-Wmemset-elt-size] And it is right. Memset whole array using sizeof operator. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> [Added x86 subject tag] Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28x86/e820: Don't merge consecutive E820_PRAM rangesDan Williams
commit 23446cb66c073b827779e5eb3dec301623299b32 upstream. Commit: 917db484dc6a ("x86/boot: Fix kdump, cleanup aborted E820_PRAM max_pfn manipulation") ... fixed up the broken manipulations of max_pfn in the presence of E820_PRAM ranges. However, it also broke the sanitize_e820_map() support for not merging E820_PRAM ranges. Re-introduce the enabling to keep resource boundaries between consecutive defined ranges. Otherwise, for example, an environment that boots with memmap=2G!8G,2G!10G will end up with a single 4G /dev/pmem0 device instead of a /dev/pmem0 and /dev/pmem1 device 2G in size. Reported-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Zhang Yi <yizhan@redhat.com> Cc: linux-nvdimm@lists.01.org Fixes: 917db484dc6a ("x86/boot: Fix kdump, cleanup aborted E820_PRAM max_pfn manipulation") Link: http://lkml.kernel.org/r/147629530854.10618.10383744751594021268.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28arc: don't leak bits of kernel stack into coredumpAl Viro
commit 7798bf2140ebcc36eafec6a4194fffd8d585d471 upstream. On faulting sigreturn we do get SIGSEGV, all right, but anything we'd put into pt_regs could end up in the coredump. And since __copy_from_user() never zeroed on arc, we'd better bugger off on its failure without copying random uninitialized bits of kernel stack into pt_regs... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>