10 * (such as the example in Documentation/virtual/lguest/lguest.c) is called the
20 * This file consists of all the replacements for such low-level native
31 * This program is free software; you can redistribute it and/or modify
36 * This program is distributed in the hope that it will be useful, but
79 * The Guest in our tale is a simple creature: identical to the Host but
80 * behaving in simplified but equivalent ways. In particular, the Guest is the
85 .hcall_status = { [0 ... LHCALL_RING_SIZE-1] = 0xFF },
94 * async_hcall() is pretty simple: I'm quite proud of it really. We have a
97 * arguments, and a "hcall_status" word which is 0 if the call is ready to go,
100 * If we come around to a slot which hasn't been finished, then the table is
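A minimal user-space model of that ring may make the shape clearer. The ring size, slot layout, and the counter standing in for the synchronous fallback are all illustrative (the real code issues a blocking hypercall to kick the Host); only the status convention — 0xFF for a free slot, 0 for a queued call — follows the text above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 64            /* hypothetical stand-in for LHCALL_RING_SIZE */

struct hcall_slot { unsigned long call, arg1, arg2, arg3; };

static struct hcall_slot ring[RING_SIZE];
static uint8_t status[RING_SIZE];   /* 0xFF = slot free, 0 = queued for the Host */
static unsigned int next_call;
static unsigned int sync_fallbacks; /* model only: count of slots-full fallbacks */

static void ring_init(void)
{
        memset(status, 0xFF, sizeof(status));   /* every slot starts free */
        next_call = 0;
        sync_fallbacks = 0;
}

/* Queue a call if the next slot is free; otherwise fall back to a
 * synchronous call (modelled here as a counter bump). */
static void async_hcall(unsigned long call, unsigned long a1,
                        unsigned long a2, unsigned long a3)
{
        if (status[next_call] != 0xFF) {
                sync_fallbacks++;       /* table full: the Host hasn't caught up */
                return;
        }
        ring[next_call] = (struct hcall_slot){ call, a1, a2, a3 };
        status[next_call] = 0;          /* hand the slot to the Host */
        next_call = (next_call + 1) % RING_SIZE;
}
```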
138 * Notice the lazy_hcall() above, rather than hcall(). This is our first real
141 * When lazy_mode is set, it means we're allowed to defer all hypercalls and do
142 * them as a batch when lazy_mode is eventually turned off. Because hypercalls
196 * When lazy mode is turned off, we issue the do-nothing hypercall to
198 * per-cpu lazy mode variable.
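The batching idea can be sketched as a toy model: while "lazy", calls are merely counted as pending, and leaving lazy mode flushes them with one trap. The names and the flush mechanism here are illustrative, not lguest's (the real code pushes deferred calls through the async ring and issues a do-nothing hypercall to flush).

```c
#include <assert.h>

static int lazy;                /* are we batching? */
static unsigned pending;        /* calls deferred while lazy */
static unsigned host_traps;     /* how many times we actually trapped to the Host */

static void hcall(void) { host_traps++; }

static void lazy_hcall(void)
{
        if (!lazy)
                hcall();        /* not batching: trap immediately */
        else
                pending++;      /* batching: defer */
}

static void lazy_mode_on(void)  { lazy = 1; }

static void lazy_mode_off(void)
{
        lazy = 0;
        if (pending) {          /* one "do-nothing" trap flushes the whole batch */
                pending = 0;
                hcall();
        }
}
```

The payoff is visible in the counts: three deferred calls cost one trap instead of three.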
210 * (Technically, this is lazy CPU mode, and normally we're in lazy MMU
220 * After that diversion we return to our first native-instruction
224 * off" and "turn interrupts on" hypercalls. Unfortunately, this is too slow:
233 * save_flags() is expected to return the processor state (ie. "flags"). The
268 * page by itself, and have the Host write-protect it when an interrupt comes
270 * interrupts are re-enabled.
272 * A better method is to implement soft interrupt disable generally for x86:
274 * in, we then disable them for real. This is uncommon, so we could simply use
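The soft-disable scheme described above can be modelled in a few lines: a flag replaces the hypercall, an interrupt arriving while the flag is set is merely remembered, and re-enabling replays it. All names are illustrative; this is a sketch of the proposed technique, not lguest code.

```c
#include <assert.h>
#include <stdbool.h>

static bool irqs_soft_disabled;
static bool irq_pending;
static unsigned irqs_delivered;

static void deliver_irq(void) { irqs_delivered++; }

static void irq_arrived(void)
{
        if (irqs_soft_disabled)
                irq_pending = true;   /* uncommon case: note it for later */
        else
                deliver_irq();
}

static void soft_irq_disable(void) { irqs_soft_disabled = true; }

static void soft_irq_enable(void)
{
        irqs_soft_disabled = false;
        if (irq_pending) {
                irq_pending = false;
                deliver_irq();        /* replay the interrupt we held back */
        }
}
```

The common case (no interrupt arrives while disabled) costs only a flag write, which is the whole point.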
282 * entry in the table is a 64-bit descriptor: this holds the privilege level,
290 * The gate_desc structure is 8 bytes long: we hand it to the Host in
291 * two 32-bit chunks. The whole 32-bit kernel used to hand descriptors
303 * Changing to a different IDT is very rare: we keep the IDT up-to-date every
304 * time it is written, so we can simply loop through all entries and tell the
310 struct desc_struct *idt = (void *)desc->address;
312 for (i = 0; i < (desc->size+1)/8; i++)
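The loop bound `(desc->size+1)/8` works because the `lidt`/`lgdt` pointer format stores a *limit* — the table's size in bytes minus one. A small sketch of that arithmetic (the struct layout mirrors the kernel's `desc_ptr`):

```c
#include <assert.h>

struct desc_ptr {
        unsigned short size;    /* limit = table bytes - 1 */
        unsigned long address;  /* linear address of the table */
};

/* With 8-byte descriptors, (size + 1) / 8 recovers the entry count,
 * exactly as the loops in lguest_load_idt()/lguest_load_gdt() do. */
static unsigned int desc_entries(const struct desc_ptr *d)
{
        return (d->size + 1) / 8;
}
```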
320 * Table (GDT). You tell the CPU where it is (and its size) using the "lgdt"
326 * This is exactly like the IDT code.
331 struct desc_struct *gdt = (void *)desc->address;
333 for (i = 0; i < (desc->size+1)/8; i++)
367 lazy_hcall2(LHCALL_LOAD_TLS, __pa(&t->tls_array), cpu);
374 * This is the Local Descriptor Table, another weird Intel thingy. Linux only
391 * override the native version with a do-nothing version.
398 * The "cpuid" instruction is a way of querying both the CPU identity
401 * As you might imagine, after a decade and a half of this treatment, it is now a
410 * lguest sales!) Shut up, inner voice! (Hey, just pointing out that this is
414 * Replacing the cpuid so we can turn features off is great for the kernel, but
436 * CPUID 1 is a basic feature request.
447 * PAGE_OFFSET is set to) haven't changed. But Linux calls
449 * the Page Global Enable (PGE) feature bit is set.
455 * Family ID is returned as bits 8-12 in ax.
462 * This is used to detect if we're running under KVM. We might be,
479 * PAE systems can mark pages as non-executable. Linux calls this the
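A sketch of the filtering idea for leaf 1: clear the PGE feature bit (bit 13 of EDX in CPUID leaf 1) so the kernel never tries to use global pages. The function name and structure here are illustrative, not the actual `lguest_cpuid()` body.

```c
#include <assert.h>

#define PGE_BIT 13      /* Page Global Enable, EDX of CPUID leaf 1 */

/* Model of CPUID filtering: for the basic feature leaf, mask out the
 * feature we can't honour; leave other leaves alone in this sketch. */
static void filter_cpuid(unsigned int function, unsigned int *dx)
{
        if (function == 1)
                *dx &= ~(1u << PGE_BIT);
}
```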
497 * features, but Linux only really cares about one: the horrifically-named Task
501 * the floating point unit is used. Which allows us to restore FPU state
522 * to use write_cr0() to do it. This "clts" instruction is faster, because all
532 * cr2 is the virtual address of the last page fault, which the Guest only ever
546 * cr3 is the current toplevel pagetable page: the principle is the same as
564 /* cr4 is used to enable and disable PGE, but we don't care. */
581 * Quick refresher: memory is divided into "pages" of 4096 bytes each. The CPU
583 * use one huge index of 1 million entries: each address is 4 bytes, so that's
587 * contains physical addresses of up to 1024 second-level pages. Each of these
593 * cr3 ---> +---------+
594 * | --------->+---------+
596 * Mid-level | | PADDR2 |
598 * | | Lower-level |
606 * level entry was not present, then the virtual address is invalid (we
609 * Put another way, a 32-bit virtual address is divided up like so:
612 * |<---- 10 bits ---->|<---- 10 bits ---->|<------ 12 bits ------>|
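The 10/10/12 split above can be checked with a few shifts: the top 10 bits index the page directory, the next 10 the page table, and the low 12 are the offset within the 4096-byte page. A small sketch (helper names are illustrative):

```c
#include <assert.h>
#include <stdint.h>

static unsigned pgd_index(uint32_t va)   { return va >> 22; }          /* top 10 bits */
static unsigned pte_index(uint32_t va)   { return (va >> 12) & 0x3FF; }/* next 10 bits */
static unsigned page_offset(uint32_t va) { return va & 0xFFF; }        /* low 12 bits */
```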
618 * These are held in 64-bit page table entries, so we can now only fit 512
619 * entries in a page, and the neat three-level tree breaks down.
621 * The result is a four level page table:
623 * cr3 --> [ 4 Upper ]
626 * [(PUD Page)]---> +---------+
627 * | --------->+---------+
629 * Mid-level | | PADDR2 |
631 * | | Lower-level |
637 * And the virtual address is decoded as:
640 * |<-2->|<--- 9 bits ---->|<---- 9 bits --->|<------ 12 bits ------>|
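The same exercise for the PAE split: 2 bits select one of the four top-level entries, 9 bits each index the middle and lower levels (512 eight-byte entries fit in a page), and the page offset is still 12 bits. Helper names again illustrative:

```c
#include <assert.h>
#include <stdint.h>

static unsigned pae_pgd_index(uint32_t va) { return va >> 30; }          /* 2 bits */
static unsigned pae_pmd_index(uint32_t va) { return (va >> 21) & 0x1FF; }/* 9 bits */
static unsigned pae_pte_index(uint32_t va) { return (va >> 12) & 0x1FF; }/* 9 bits */
static unsigned pae_offset(uint32_t va)    { return va & 0xFFF; }        /* 12 bits */
```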
645 * supports one or the other depending on whether CONFIG_X86_PAE is set. Many
655 * top-level page directory and lower-level pagetable pages. The Guest doesn't
662 * The Guest calls this after it has set a second-level entry (pte), ie. to map
665 * we need to tell the Host which one we're changing (mm->pgd).
672 lazy_hcall4(LHCALL_SET_PTE, __pa(mm->pgd), addr,
673 ptep->pte_low, ptep->pte_high);
675 lazy_hcall3(LHCALL_SET_PTE, __pa(mm->pgd), addr, ptep->pte_low);
679 /* This is the "set and update" combo-meal-deal version. */
688 * The Guest calls lguest_set_pud to set a top-level entry and lguest_set_pmd
689 * to set a middle-level entry when PAE is activated.
708 (__pa(pmdp) & (PAGE_SIZE - 1)) / sizeof(pmd_t));
712 /* The Guest calls lguest_set_pmd to set a top-level entry when !PAE. */
717 (__pa(pmdp) & (PAGE_SIZE - 1)) / sizeof(pmd_t));
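The expression `(__pa(pmdp) & (PAGE_SIZE - 1)) / sizeof(pmd_t)` turns a pointer to one entry into that entry's index within its page: mask down to the offset inside the page, then divide by the entry size. A standalone sketch of the arithmetic, assuming 8-byte PAE entries:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096
typedef uint64_t pmd_t;         /* middle-level entries are 8 bytes under PAE */

/* Given the physical address of one pmd entry, recover its index within
 * its page: the byte offset into the page divided by the entry size. */
static unsigned pmd_index_from_addr(uintptr_t pa)
{
        return (pa & (PAGE_SIZE - 1)) / sizeof(pmd_t);
}
```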
723 * don't know the top level any more. This is useless for us, since we don't
724 * know which pagetable is changing or what address, so we just tell the Host
725 * to forget all of them. Fortunately, this is very rare.
741 * With 64-bit PTE values, we need to be careful setting them: if we set 32
742 * bits at a time, the hardware could see a weird half-set entry. These
769 * a TLB flush (a TLB is a little cache of page table entries kept by the CPU).
772 * called when a valid entry is written, not when it's removed (ie. marked not
773 * present). Instead, this is where we come when the Guest wants to remove a
775 * bit is zero).
784 * This is what happens after the Guest has removed a large number of entries.
794 * This is called when the kernel page tables have changed. That's not very
795 * common (unless the Guest is using highmem, which makes the Guest extremely
806 * This is an attempt to implement the simplest possible interrupt controller.
809 * I *think* this is as simple as it gets.
812 * lguest_data.interrupts bitmap, so disabling (aka "masking") them is as
818 set_bit(data->irq, lguest_data.blocked_interrupts);
823 clear_bit(data->irq, lguest_data.blocked_interrupts);
836 * interrupt (except 128, which is used for system calls), and then tells the
837 * Linux infrastructure that each interrupt is controlled by our level-based
846 __this_cpu_write(vector_irq[i], i - FIRST_EXTERNAL_VECTOR);
848 set_intr_gate(i, interrupt[i - FIRST_EXTERNAL_VECTOR]);
852 * This call is required to set up for 4k stacks, where we have
859 * Interrupt descriptors are allocated as-needed, but low-numbered ones are
861 * tells us the irq is already used: other errors (ie. ENOMEM) we take
868 /* Returns -ve error or vector number. */
870 if (err < 0 && err != -EEXIST)
890 * The TSC is an Intel thing called the Time Stamp Counter. The Host tells us
901 * If we can't use the TSC, the kernel falls back to our lower-priority
904 static cycle_t lguest_clock_read(struct clocksource *cs)
909 * Since the time is in two parts (seconds and nanoseconds), we risk
930 /* Our lguest clock is in real nanoseconds. */
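Because the seconds/nanoseconds pair can change between the two loads, the usual fix — and what the risk mentioned above points at — is to re-read until the seconds value is stable. A single-threaded model of that retry loop (the struct and names are illustrative):

```c
#include <assert.h>

struct two_part_time {
        volatile long sec;
        volatile long nsec;
};

static long long read_time_ns(const struct two_part_time *t)
{
        long s, n;

        do {
                s = t->sec;
                n = t->nsec;            /* might be torn relative to s... */
        } while (s != t->sec);          /* ...so retry if sec moved on */

        return (long long)s * 1000000000LL + n;
}
```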
934 /* This is the fallback clocksource: lower priority than the TSC clocksource. */
957 return -ETIME; in lguest_clockevent_set_next_event()
975 /* This is what we expect. */
998 * This is the Guest timer interrupt handler (hardware interrupt 0). We just
1005 /* Don't interrupt us while this is running. */
1037 * Here is an oddball collection of functions which the Guest needs for things
1043 * native hardware, this is part of the Task State Segment mentioned above in
1046 * We tell the Host the segment we want to use (__KERNEL_DS is the kernel data
1047 * segment), the privilege level (we're privilege level 1, the Host is 0 and
1054 lazy_hcall3(LHCALL_SET_STACK, __KERNEL_DS | 0x1, thread->sp0,
1070 * (clflush) instruction is available and the kernel uses that. Otherwise, it
1119 apic->read = lguest_apic_read;
1120 apic->write = lguest_apic_write;
1121 apic->icr_read = lguest_apic_icr_read;
1122 apic->icr_write = lguest_apic_icr_write;
1123 apic->wait_icr_idle = lguest_apic_wait_icr_idle;
1124 apic->safe_wait_icr_idle = lguest_apic_safe_wait_icr_idle;
1150 * Don't. But if you did, this is what happens.
1163 /* Setting up memory is fairly easy. */
1174 /* This string is for the boot messages. */
1180 * but before that is set up we use LHCALL_NOTIFY on normal memory to produce
1188 /* We use a nul-terminated string, so we make a copy. Icky, huh? */
1189 if (len > sizeof(scratch) - 1)
1190 len = sizeof(scratch) - 1;
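The truncating copy above clamps the length so the scratch buffer keeps room for its terminating nul. A self-contained model (the buffer size and function name are illustrative):

```c
#include <assert.h>
#include <string.h>

#define SCRATCH_LEN 8           /* tiny, to make truncation visible */

/* Clamp to the scratch buffer, copy, and nul-terminate; return how
 * many characters actually made it into the buffer. */
static size_t copy_for_console(char *scratch, const char *s, size_t len)
{
        if (len > SCRATCH_LEN - 1)
                len = SCRATCH_LEN - 1;
        memcpy(scratch, s, len);
        scratch[len] = '\0';
        return len;
}
```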
1217 * be solved with another layer of indirection"? The rest of that quote is
1218 * "... But that usually will create another problem." This is the first of
1221 * Our current solution is to allow the paravirt back end to optionally patch
1242 * Now our patch routine is fairly simple (based on the native one in
1255 insn_len = lguest_insns[type].end - lguest_insns[type].start;
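The patch step reduces to: the replacement's length is `end - start`; if it fits in the room the caller offers, copy it over the call site and report how many bytes were written, otherwise leave the indirect call alone. A sketch with illustrative names:

```c
#include <assert.h>
#include <string.h>

struct insn_range {
        const unsigned char *start, *end;   /* replacement instruction bytes */
};

static unsigned patch_site(const struct insn_range *r,
                           unsigned char *site, unsigned room)
{
        unsigned insn_len = r->end - r->start;

        if (insn_len > room)
                return 0;               /* doesn't fit: keep the indirect call */
        memcpy(site, r->start, insn_len);
        return insn_len;
}
```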
1275 /* Paravirt is enabled. */
1287 /* Interrupt-related operations */
1351 * Now is a good time to look at the implementations of these functions
1362 * The stack protector is a weird thing where gcc places a canary
1363 * value on the stack and then checks it on return. This file is
1364 * compiled with -fno-stack-protector, so we got this far without
1365 * problems. The value of the canary is kept at offset 20 from the
1374 * per-cpu segment descriptor register %fs as well.
1379 * The Host<->Guest Switcher lives at the top of our address space, and
1380 * the Host told us how big it is when we made the LGUEST_INIT hypercall:
1403 * This is messy CPU setup stuff which the native boot code does before
1410 /* Math is always hard! */
1422 * We set the preferred console to "hvc". This is the "hypervisor
1448 * It is now time for us to explore the layer of virtual drivers and complete