arm64 performs context-switch in kernel space instead of user space as in
x86_64.
Change-Id: Ib119b9ff014effb970183ee86cfac67fab773cba
Futjitsu: POSTK_DEBUG_ARCH_DEP_99
mcoverlayfs code is now unused (technically should work on top of the
soft emulation but not well tested, and untested unused code is bad).
Remove it.
Left the unshare/bind_mount_recursive code in mcexec in a new
MCEXEC_BIND_MOUNT ifdef (only in config.h.in directly to discourage use.
it disables the ioctl as well, but the main code is still compiled to
keep up to date with linux api changes... although it's using kallsyms
lookup so it does not validate much more than "the symbol still exists")
I honestly think this should go as well (people who would want to use it
are root and could do it manually), but will give up for now.
Change-Id: I832b6a8ab19e24ed67a1a5044b1c6c32381ae0aa
In order to speed up test bot work it would be helpful to check for
identical build outputs and skip tests if required.
This removes most use of the install path in c code:
- ql_mpi uses /proc/self/exe and looks for talker/server in same
directory as itself
- mcexec looks for libihk.so in /proc/self/maps and use that path for
LD_PRELOAD prefix path
- rootfsdir is not used right now but until a better fix happens just
hardcode it, someone who wants to change it can set it through cmake
There is one last occurence of the install directory, MCEXEC_PATH in
mcctrl's binfmt code, for which the build system will just overwrite it
to a constant string at build time instead of trying to remove it too
hard. It would be possible to pass it as a kernel parameter or look for
mcexec in PATH but this is too much work for now.
Change-Id: I5d1352bc5748a1ea10dcae4be630f30a07609296
There are valid use cases where a remote page fault has no available
thread data/packet available to use, e.g. when device driver threads
need to access the data (BXI).
Do the per thread data lookup to use the right channel/tid if available,
and use mcctrl_ikc_send_wait with a new message number directly.
The fault is no longer handled in mckernel syscall forwarding code but
in the ikc handler directly in irq, this should be ok because page
faults are interrupts anyway so the code should be irq-safe.
Change-Id: Ie60f413cdaee6c1a824b4a2c93637899cb9bf9c9
- arm64: Get TSC corresponding to boot time from IHK.
- x86_64: Calculate the current time using vdso.
Refs: #1186
Fujitsu: POSTK_DEBUG_ARCH_DEP_52
Change-Id: I293ba4bbe5390d50dea44b8a5b7471f59237daff
vmf_insert_pfn got added as a wrapper around vm_insert_pfn in 4.17
1c8f422059ae5da ("mm: change return type to vm_fault_t") and totally
replaced the later in 4.20 ae2b01f37044c ("mm: remove vm_insert_pfn()")
Compare with 4.18 here specifically to avoid troubles when rhel
backports this change later, and avoid adding a rhel version check down
the road.
Change-Id: Ibf108e2fb6f1199f89cde6a7973f4eb55447260b
This fix is rejected because it only makes the setfsuid test in ostest
pass and doesn't fix the other issues including the one in which file
I/O could be done with the old fsuid because an mcexec thread with an
arbitrary tid could handle the system-call offload request.
Explanation of the rejected fix:
setfsuid() proceeds as follows:
1. McKernel asks mcexec for __NR_setfsuid (set)
2. mcexec calls setfsuid, reports the id to McKernel
3. McKernel asks mcexec for __NR_setfsuid (get)
4. mcexec calls mcexec_getcred(), reports the id to Mckernel
5. McKernel sets proc->fsuid to the obtained value
tid of mcexec on the 2nd and 4th step could be different. So this
fix lets mcexec report its tid on the 2nd step and McKernel specify
it in the 3rd step.
Change-Id: Id5cfeed18c64430d576a56e961bbca1ecb2e39ad
Fujitsu: POSTK_DEBUG_TEMP_FIX_45
Fix that process will remain even if signal is received between PPD
registration and release_handler registration.
Refs: #1201
Fujitsu: POSTK_DEBUG_TEMP_FIX_64
Change-Id: I571781963578df8cedb327f19298f595cfb137a3
Add the following patterns of symlinks:
- /sys/bus/cpu/drivers/processor/cpu*
- /sys/bus/node/devices/node*
And slightly change how /sys/devices/system/cpu/cpu*/node* are created
to avoid duplicate lookups
Change-Id: Id94a4d157da06d75f6bd450d5bd9a9e7709a1414
Apparently /proc needs it; it's normally implemented using get_link if
readlink isn't implemented but proc's get_link crashes the kernel in
this case (because nameidata is only defined for open* paths)
Change-Id: I1864d6c948db879d33ea29b1b281bf84ff8eeec6
zap_vma_ptes no longer returns an error code as of Linux's
27d036e33237e4 ("mm: Remove return value of zap_vma_ptes()"),
where they decided nobody is interested in it....
Just copy the check out of the function.
Change-Id: I2eda0f91ec55a34bba96f45cc3d887bc80132a82
Originally-by: Kagawa Kodai <fj1731iw@aa.jp.fujitsu.com>
Separate copyright bumps in a different commit.
A lot of files only had the copyright change at this point; these
were probably changes I added separatly in other patches but just
split these in a different commit instead to simplify git stats
Change-Id: I93cf3fc1c0fa04ee743a79c3fe9768933e6bd0d2
arch perf counters are placed at start, so offset all
other counters (because placing arch perf counters at the end
wouldn't have been intrusive enough?)
Change-Id: Ifab1047872384927d9cfa0a0212327ee73545c29
Fujitsu: POSTK_DEBUG_ARCH_DEP_86
While we are here:
- fix uname -r (single quote?!)
- add compat for rhel8 (el kernel and version is 4.18)
- also remove linux version check in mcreboot.sh, trust configure check
Change-Id: I14726d4374b0dfd941640096044ea1d5d88bfcb8
ihk_ikc_release_packet takes the channel and puts the packet into its
free-list. This fix makes it easy and safe to identify the proper
channel.
Change-Id: I5584b1e8a3ed675c2f9d68f0b5ed331b909197f6
Fujitsu: POSTK_DEBUG_TEMP_FIX_89
Fixed an issue where errors generated in arch_cpu_read_write_register()
are not transmitted to the caller.
Change-Id: I05d7d872eab834918220cf18f628aee37208a156
Fujitsu: POSTK_DEBUG_TEMP_FIX_94
Since 4.17.0, kernel cannot call syscalls directly because the calling
convention can be different on x86_64, as explained in this email:
https://lore.kernel.org/lkml/20180325162527.GA17492@light.dominikbrodowski.net
Use the ksys_* alternatives instead when possible, or for readlink use
do_readlinkat (and use readlinkat all the time to simplify ifdefs)
It might be possible to change some of these without ifdefs, but for
example ksys_unshare only got introduced in 4.17 so we need to keep some
syscall calling...
Change-Id: Ic47e184b29ef8b21731b2eae6193b0af2548b872
it was meant for 3.10 kernels, so the regular < 4.0.0 check
will work for el7 and older kernels as well
Change-Id: I807f030f6303c9c3d17b0d80de55c256a3479486
rhel8 is a 4.18 kernel but they've already backported some later fixes.
Instead of relying on the kernel version, the changes removed some defines so
we can check for the define presence to make the code more robust to kernel
version wilderness instead
Change-Id: I6cf5548a7b73a7394405daf850f715a1e20ab0b4
This newer version is much simpler than the old ones:
- the options are noop, this lets the code simplify all the allocating
of a new option struct and passing it around
- ovl_reset_ovl_entry was added and called all the time, but the
mechanism that made this required is gone in this kernel version
On the other hand, one new thing in this version:
- newer kernel check the stacking depth of filesystems now, and we are
reaching the default limit of two with our setup. Bump it to three here.
Also, while we are here, make make fail if requested directory does not
exist, instead of infinitely recurse into make modules in the mcoverlayfs
directory...
Change-Id: I45050d693a0aa6fd3027deaf417c29876ef6a1ea
newer version of this function no longer return an error on the basis
that "no-one checks what it returns anyway"........
See linux 4.18's 27d036e33237e ("mm: Remove return value of zap_vma_ptes()")
Change-Id: I8fb9f060e3e145cc2db21738585c9ee7f1445f74
newer linux got a 5 level page table now, try to handle that.
Some of the macros will be no-op (e.g. loop only on one iteration) on
architecture/kernels with only 4 levels but the code needs to be there
to compile
Change-Id: Ifc6304cbb066dce7d4e30962687ae05d7e034730
The headers defines __task_cred and other macroes we use, and always
existed; we must have gotten it indirectly on older kernels, it doesn't
hurt to always include
Change-Id: Iacfff0365e7a21e6247eea42606bbbf1dfccc077
on newer x64 kernels (config option?), syscalls can be renamed to allow
both x64 and ia32 versions to coexist. Lookup either names
Change-Id: I2f55cc804d3eee948ee1ed6d18c69c75bd2f652c
The symbol appears in some header in some linux version,
it's still not exported so we need our own lookup anyway; just rename it.
Change-Id: Ia4bce85988641c96fa3f5a0ae1d42c25c713b6c2
In addition to that, mcctrl_perf_set is modified so that it updates
usrdata->perf_event_num with number of registered events.
Change-Id: I3f343176f55b06d3baab0b0fe34e240f39706cf6
Fujitsu: POSTK_DEBUG_TEMP_FIX_80
This reverts commit b70d470e20.
That commit had been landed too fast after a mistake during migration
from old to new gerrit that didn't keep -1 vote ; it needs some fix
Change-Id: Ifc8a23e42449dfe471049270b4706e9b137e096e