Note: the original fujitsu implementation didn't rename the various
save_fs function/desc to save_tls for some reason, might as well go all
the way though...
Change-Id: Ic362c15c8b320c4d258d2ead8c5fd4eafd9d0ae9
Fujitsu: POSTK_DEBUG_ARCH_DEP_91
arm64 performs context-switch in kernel space instead of user space as in
x86_64.
Change-Id: Ib119b9ff014effb970183ee86cfac67fab773cba
Futjitsu: POSTK_DEBUG_ARCH_DEP_99
mcoverlayfs code is now unused (technically should work on top of the
soft emulation but not well tested, and untested unused code is bad).
Remove it.
Left the unshare/bind_mount_recursive code in mcexec in a new
MCEXEC_BIND_MOUNT ifdef (only in config.h.in directly to discourage use.
it disables the ioctl as well, but the main code is still compiled to
keep up to date with linux api changes... although it's using kallsyms
lookup so it does not validate much more than "the symbol still exists")
I honestly think this should go as well (people who would want to use it
are root and could do it manually), but will give up for now.
Change-Id: I832b6a8ab19e24ed67a1a5044b1c6c32381ae0aa
In order to speed up test bot work it would be helpful to check for
identical build outputs and skip tests if required.
This removes most use of the install path in c code:
- ql_mpi uses /proc/self/exe and looks for talker/server in same
directory as itself
- mcexec looks for libihk.so in /proc/self/maps and use that path for
LD_PRELOAD prefix path
- rootfsdir is not used right now but until a better fix happens just
hardcode it, someone who wants to change it can set it through cmake
There is one last occurence of the install directory, MCEXEC_PATH in
mcctrl's binfmt code, for which the build system will just overwrite it
to a constant string at build time instead of trying to remove it too
hard. It would be possible to pass it as a kernel parameter or look for
mcexec in PATH but this is too much work for now.
Change-Id: I5d1352bc5748a1ea10dcae4be630f30a07609296
There are valid use cases where a remote page fault has no available
thread data/packet available to use, e.g. when device driver threads
need to access the data (BXI).
Do the per thread data lookup to use the right channel/tid if available,
and use mcctrl_ikc_send_wait with a new message number directly.
The fault is no longer handled in mckernel syscall forwarding code but
in the ikc handler directly in irq, this should be ok because page
faults are interrupts anyway so the code should be irq-safe.
Change-Id: Ie60f413cdaee6c1a824b4a2c93637899cb9bf9c9
- arm64: Get TSC corresponding to boot time from IHK.
- x86_64: Calculate the current time using vdso.
Refs: #1186
Fujitsu: POSTK_DEBUG_ARCH_DEP_52
Change-Id: I293ba4bbe5390d50dea44b8a5b7471f59237daff
vmf_insert_pfn got added as a wrapper around vm_insert_pfn in 4.17
1c8f422059ae5da ("mm: change return type to vm_fault_t") and totally
replaced the later in 4.20 ae2b01f37044c ("mm: remove vm_insert_pfn()")
Compare with 4.18 here specifically to avoid troubles when rhel
backports this change later, and avoid adding a rhel version check down
the road.
Change-Id: Ibf108e2fb6f1199f89cde6a7973f4eb55447260b
This fix is rejected because it only makes the setfsuid test in ostest
pass and doesn't fix the other issues including the one in which file
I/O could be done with the old fsuid because an mcexec thread with an
arbitrary tid could handle the system-call offload request.
Explanation of the rejected fix:
setfsuid() proceeds as follows:
1. McKernel asks mcexec for __NR_setfsuid (set)
2. mcexec calls setfsuid, reports the id to McKernel
3. McKernel asks mcexec for __NR_setfsuid (get)
4. mcexec calls mcexec_getcred(), reports the id to Mckernel
5. McKernel sets proc->fsuid to the obtained value
tid of mcexec on the 2nd and 4th step could be different. So this
fix lets mcexec report its tid on the 2nd step and McKernel specify
it in the 3rd step.
Change-Id: Id5cfeed18c64430d576a56e961bbca1ecb2e39ad
Fujitsu: POSTK_DEBUG_TEMP_FIX_45
Fix that process will remain even if signal is received between PPD
registration and release_handler registration.
Refs: #1201
Fujitsu: POSTK_DEBUG_TEMP_FIX_64
Change-Id: I571781963578df8cedb327f19298f595cfb137a3
Add the following patterns of symlinks:
- /sys/bus/cpu/drivers/processor/cpu*
- /sys/bus/node/devices/node*
And slightly change how /sys/devices/system/cpu/cpu*/node* are created
to avoid duplicate lookups
Change-Id: Id94a4d157da06d75f6bd450d5bd9a9e7709a1414
zap_vma_ptes no longer returns an error code as of Linux's
27d036e33237e4 ("mm: Remove return value of zap_vma_ptes()"),
where they decided nobody is interested in it....
Just copy the check out of the function.
Change-Id: I2eda0f91ec55a34bba96f45cc3d887bc80132a82
Originally-by: Kagawa Kodai <fj1731iw@aa.jp.fujitsu.com>
Separate copyright bumps in a different commit.
A lot of files only had the copyright change at this point; these
were probably changes I added separatly in other patches but just
split these in a different commit instead to simplify git stats
Change-Id: I93cf3fc1c0fa04ee743a79c3fe9768933e6bd0d2
arch perf counters are placed at start, so offset all
other counters (because placing arch perf counters at the end
wouldn't have been intrusive enough?)
Change-Id: Ifab1047872384927d9cfa0a0212327ee73545c29
Fujitsu: POSTK_DEBUG_ARCH_DEP_86
ihk_ikc_release_packet takes the channel and puts the packet into its
free-list. This fix makes it easy and safe to identify the proper
channel.
Change-Id: I5584b1e8a3ed675c2f9d68f0b5ed331b909197f6
Fujitsu: POSTK_DEBUG_TEMP_FIX_89
Fixed an issue where errors generated in arch_cpu_read_write_register()
are not transmitted to the caller.
Change-Id: I05d7d872eab834918220cf18f628aee37208a156
Fujitsu: POSTK_DEBUG_TEMP_FIX_94
Since 4.17.0, kernel cannot call syscalls directly because the calling
convention can be different on x86_64, as explained in this email:
https://lore.kernel.org/lkml/20180325162527.GA17492@light.dominikbrodowski.net
Use the ksys_* alternatives instead when possible, or for readlink use
do_readlinkat (and use readlinkat all the time to simplify ifdefs)
It might be possible to change some of these without ifdefs, but for
example ksys_unshare only got introduced in 4.17 so we need to keep some
syscall calling...
Change-Id: Ic47e184b29ef8b21731b2eae6193b0af2548b872
it was meant for 3.10 kernels, so the regular < 4.0.0 check
will work for el7 and older kernels as well
Change-Id: I807f030f6303c9c3d17b0d80de55c256a3479486
newer version of this function no longer return an error on the basis
that "no-one checks what it returns anyway"........
See linux 4.18's 27d036e33237e ("mm: Remove return value of zap_vma_ptes()")
Change-Id: I8fb9f060e3e145cc2db21738585c9ee7f1445f74
newer linux got a 5 level page table now, try to handle that.
Some of the macros will be no-op (e.g. loop only on one iteration) on
architecture/kernels with only 4 levels but the code needs to be there
to compile
Change-Id: Ifc6304cbb066dce7d4e30962687ae05d7e034730
The headers defines __task_cred and other macroes we use, and always
existed; we must have gotten it indirectly on older kernels, it doesn't
hurt to always include
Change-Id: Iacfff0365e7a21e6247eea42606bbbf1dfccc077
on newer x64 kernels (config option?), syscalls can be renamed to allow
both x64 and ia32 versions to coexist. Lookup either names
Change-Id: I2f55cc804d3eee948ee1ed6d18c69c75bd2f652c
The symbol appears in some header in some linux version,
it's still not exported so we need our own lookup anyway; just rename it.
Change-Id: Ia4bce85988641c96fa3f5a0ae1d42c25c713b6c2
In addition to that, mcctrl_perf_set is modified so that it updates
usrdata->perf_event_num with number of registered events.
Change-Id: I3f343176f55b06d3baab0b0fe34e240f39706cf6
Fujitsu: POSTK_DEBUG_TEMP_FIX_80
This reverts commit b70d470e20.
That commit had been landed too fast after a mistake during migration
from old to new gerrit that didn't keep -1 vote ; it needs some fix
Change-Id: Ifc8a23e42449dfe471049270b4706e9b137e096e
Hugetlbfs file mappings are handled differently than regular files:
- pager_req_create will tell us the file is in a hugetlbfs
- allocate memory upfront, we need to fail if not enough memory
- the memory needs to be given again if another process maps the same
file
This implementation still has some hacks, in particular, the memory
needs to be freed when all mappings are done and the file has been
deleted/closed by all processes.
We cannot know when the file is closed/unlinked easily, so clean up
memory when all processes have exited.
To test, install libhugetlbfs and link a program with the additional
LDFLAGS += -B /usr/share/libhugetlbfs -Wl,--hugetlbfs-align
Then run with HUGETLB_ELFMAP=RW set, you can check this works with
HUGETLB_DEBUG=1 HUGETLB_VERBOSE=2
Change-Id: I327920ff06efd82e91b319b27319f41912169af1
- remove unused MF_END (that only makes sense for enums without holes,
this one is a set of bits masks)
- remove useless goto in pager_req_create()
- init maxprot to 0 from the start, it's not used in the error cases
(except for debug print)
Change-Id: Ic56c0754824b99f8a7e45fa8e99b8fe3e7c7e592