diff options
author | Boqun Feng <boqun.feng@gmail.com> | 2017-10-03 21:36:51 +0800 |
---|---|---|
committer | Ben Hutchings <ben@decadent.org.uk> | 2017-11-26 13:50:41 +0000 |
commit | 9c32447e8a0e624fdc3c01bb013b73d49b11809b (patch) | |
tree | a952fbc3926ba014a932755b72426b1196a67d2e /arch/x86/kernel/kvm.c | |
parent | 6e461f01995fd592bbbd82fe1b71e2e2ca5a8baf (diff) |
kvm/x86: Avoid async PF preempting the kernel incorrectly
commit a2b7861bb33b2538420bb5d8554153484d3f961f upstream.
Currently, in PREEMPT_COUNT=n kernel, kvm_async_pf_task_wait() could call
schedule() to reschedule in some cases. This could result in
accidentally ending the current RCU read-side critical section early,
causing random memory corruption in the guest, or otherwise preempting
the currently running task inside between preempt_disable and
preempt_enable.
The difficulty to handle this well is because we don't know whether an
async PF delivered in a preemptible section or RCU read-side critical section
for PREEMPT_COUNT=n, since preempt_disable()/enable() and rcu_read_lock/unlock()
are both no-ops in that case.
To cure this, we treat any async PF interrupting a kernel context as one
that cannot be preempted, preventing kvm_async_pf_task_wait() from choosing
the schedule() path in that case.
To do so, a second parameter for kvm_async_pf_task_wait() is introduced,
so that we know whether it's called from a context interrupting the
kernel, and the parameter is set properly in all the callsites.
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
[bwh: Backported to 3.16:
- Use user_mode_vm() as equivalent to upstream user_mode()
- Adjust filename, context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Diffstat (limited to 'arch/x86/kernel/kvm.c')
-rw-r--r-- | arch/x86/kernel/kvm.c | 14 |
1 files changed, 10 insertions, 4 deletions
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 3ec44283b71d..26e265e3e8d7 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -116,7 +116,11 @@ static struct kvm_task_sleep_node *_find_apf_task(struct kvm_task_sleep_head *b, return NULL; } -void kvm_async_pf_task_wait(u32 token) +/* + * @interrupt_kernel: Is this called from a routine which interrupts the kernel + * (other than user space)? + */ +void kvm_async_pf_task_wait(u32 token, int interrupt_kernel) { u32 key = hash_32(token, KVM_TASK_SLEEP_HASHBITS); struct kvm_task_sleep_head *b = &async_pf_sleepers[key]; @@ -139,8 +143,10 @@ void kvm_async_pf_task_wait(u32 token) n.token = token; n.cpu = smp_processor_id(); - n.halted = is_idle_task(current) || preempt_count() > 1 || - rcu_preempt_depth(); + n.halted = is_idle_task(current) || + (IS_ENABLED(CONFIG_PREEMPT_COUNT) + ? preempt_count() > 1 || rcu_preempt_depth() + : interrupt_kernel); init_waitqueue_head(&n.wq); hlist_add_head(&n.link, &b->list); spin_unlock(&b->lock); @@ -269,7 +275,7 @@ do_async_page_fault(struct pt_regs *regs, unsigned long error_code) /* page is swapped out by the host. */ prev_state = exception_enter(); exit_idle(); - kvm_async_pf_task_wait((u32)read_cr2()); + kvm_async_pf_task_wait((u32)read_cr2(), !user_mode_vm(regs)); exception_exit(prev_state); break; case KVM_PV_REASON_PAGE_READY: |