While we are here:
- fix uname -r (single quote?!)
- add compat for rhel8 (el kernel and version is 4.18)
- also remove linux version check in mcreboot.sh, trust configure check
Change-Id: I14726d4374b0dfd941640096044ea1d5d88bfcb8
ihk_ikc_release_packet takes the channel and puts the packet into its
free-list. This fix makes it easy and safe to identify the proper
channel.
Change-Id: I5584b1e8a3ed675c2f9d68f0b5ed331b909197f6
Fujitsu: POSTK_DEBUG_TEMP_FIX_89
Fixed an issue where errors generated in arch_cpu_read_write_register()
are not transmitted to the caller.
Change-Id: I05d7d872eab834918220cf18f628aee37208a156
Fujitsu: POSTK_DEBUG_TEMP_FIX_94
Since 4.17.0, kernel code cannot call syscalls directly because the calling
convention can be different on x86_64, as explained in this email:
https://lore.kernel.org/lkml/20180325162527.GA17492@light.dominikbrodowski.net
Use the ksys_* alternatives instead when possible, or for readlink use
do_readlinkat (and use readlinkat all the time to simplify ifdefs)
It might be possible to change some of these without ifdefs, but for
example ksys_unshare only got introduced in 4.17 so we need to keep some
syscall calling...
Change-Id: Ic47e184b29ef8b21731b2eae6193b0af2548b872
envs are stuck after args which are now possibly unaligned, and used
from a non-aligned pointer in prepare_process_ranges_args_envs (env)
The memory immediately after args/envs is copied anyway with memcpy_long,
so make sure the bits are initialized and realign env correctly
Fixes: 70e52faf36 ("flatten_strings: do not return unused trailing bits")
Change-Id: Ic747e947d151c0eea65dec36bc9c888cf6e0c394
it was meant for 3.10 kernels, so the regular < 4.0.0 check
will work for el7 and older kernels as well
Change-Id: I807f030f6303c9c3d17b0d80de55c256a3479486
the libc takes care of trying execve as many times as needed for
execvp, it's not a kernel call.
Also, sneak in a double-free fix (desc was not reset properly in case
load_elf_desc_shebang failed)
Fixes: b1681f4a3affff ("mcexec/execve: fix shebangs handling")
Change-Id: If8e3d7ae53acdeffc0331ae8621e0832fcfa406f
Running "mcexec dfsafds" did not print any message in normal use.
Rather than looking for which message shows in debug and turning it into
eprintf, add a single coherent message (more shell-like) at the end and
turn other messages off.
There is a small loss of information but this is equivalent to what
shells give (a single errno value with no details), and it is now easy
to add --debug to mcexec to see more information if required
Change-Id: Id2c3a47880b7d1d7467883351e6e7af561f91bbf
rhel8 ships a 4.18 kernel, but with some later fixes already backported.
The upstream changes in question removed some defines, so instead of relying
on the kernel version we check for the presence of those defines. This makes
the code more robust against kernel version differences.
Change-Id: I6cf5548a7b73a7394405daf850f715a1e20ab0b4
This newer version is much simpler than the old ones:
- the options are noop, this lets the code simplify all the allocating
of a new option struct and passing it around
- ovl_reset_ovl_entry was added and called all the time, but the
mechanism that made this required is gone in this kernel version
On the other hand, one new thing in this version:
- newer kernels check the stacking depth of filesystems now, and we are
reaching the default limit of two with our setup. Bump it to three here.
Also, while we are here, make the build fail if the requested directory does
not exist, instead of infinitely recursing into make modules in the
mcoverlayfs directory...
Change-Id: I45050d693a0aa6fd3027deaf417c29876ef6a1ea
Newer versions of this function no longer return an error, on the basis
that "no-one checks what it returns anyway"...
See linux 4.18's 27d036e33237e ("mm: Remove return value of zap_vma_ptes()")
Change-Id: I8fb9f060e3e145cc2db21738585c9ee7f1445f74
strncat must not look at the appendee's length, but at how much
is left where we're appending.
This API is stupid anyway, where is strlcat when we need it...
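A minimal sketch of the bounded append described above. The helper name
append_bounded() is hypothetical and not taken from this tree; the point is
only that the size passed to strncat is the room left in the destination,
not the length of the string being appended:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: append src to the NUL-terminated string held in
 * dst, a buffer of dst_size bytes. Returns 0 on success, -1 if src had
 * to be truncated.
 */
static int append_bounded(char *dst, size_t dst_size, const char *src)
{
	size_t used = strlen(dst);
	size_t room;

	assert(used < dst_size);
	/* Room left for characters, keeping one byte for the NUL. */
	room = dst_size - used - 1;

	/* strncat copies at most 'room' characters, then adds the NUL. */
	strncat(dst, src, room);
	return strlen(src) > room ? -1 : 0;
}
```

Passing strlen(src) as the third argument instead would bound nothing, since
strncat stops at the end of src anyway.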
Change-Id: Icdf418083146420a06f8ba5ffdf882982610d39b
Newer Linux kernels have 5-level page tables now; try to handle that.
Some of the macros will be no-ops (e.g. loop for only one iteration) on
architectures/kernels with only 4 levels, but the code needs to be there
to compile.
Change-Id: Ifc6304cbb066dce7d4e30962687ae05d7e034730
The header defines __task_cred and other macros we use, and has always
existed; we must have gotten it indirectly on older kernels. It doesn't
hurt to always include it.
Change-Id: Iacfff0365e7a21e6247eea42606bbbf1dfccc077
On newer x64 kernels (config option?), syscalls can be renamed to allow
both x64 and ia32 versions to coexist. Look up either name.
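A rough sketch of the dual lookup. The resolver callback type (lookup_fn)
and the mock symbol table are hypothetical stand-ins for the real kernel
symbol lookup; the __x64_sys_ prefix is the one used by x86_64 kernels
built with syscall wrappers:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical resolver type: returns the symbol address or NULL. */
typedef void *(*lookup_fn)(const char *name);

/* Try the classic name first, then the wrapped x86_64 name. */
static void *lookup_syscall(lookup_fn lookup, const char *name)
{
	char sym[128];
	void *addr;
	int len;

	assert(lookup && name);

	len = snprintf(sym, sizeof(sym), "sys_%s", name);
	if (len < 0 || (size_t)len >= sizeof(sym))
		return NULL;
	addr = lookup(sym);
	if (addr)
		return addr;

	len = snprintf(sym, sizeof(sym), "__x64_sys_%s", name);
	if (len < 0 || (size_t)len >= sizeof(sym))
		return NULL;
	return lookup(sym);
}

/* Mock symbol table, illustration only: only the wrapped name exists. */
static void *mock_lookup(const char *name)
{
	static int dummy;

	return strcmp(name, "__x64_sys_readlinkat") == 0 ? &dummy : NULL;
}
```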
Change-Id: I2f55cc804d3eee948ee1ed6d18c69c75bd2f652c
The symbol appears in some header in some linux version,
it's still not exported so we need our own lookup anyway; just rename it.
Change-Id: Ia4bce85988641c96fa3f5a0ae1d42c25c713b6c2
In addition to that, mcctrl_perf_set is modified so that it updates
usrdata->perf_event_num with number of registered events.
Change-Id: I3f343176f55b06d3baab0b0fe34e240f39706cf6
Fujitsu: POSTK_DEBUG_TEMP_FIX_80
Trailing bits were displayed in proc->saved_cmdline, displaying
uninitialized data to the user in /proc/<pid>/cmdline
Change-Id: I74831c8c68dd2f2197b35e9b49aaaae29c4c1dd5
This would incorrectly make "mcexec sh -c './script.sh'" run with
/bin/bash instead of /bin/sh (which is important, because bash behaviour
changes depending on how it is invoked)
Change-Id: I80610cf442c6c3ecacfa23e8ed15652bc8d4e3f7
This reverts commit b70d470e20.
That commit landed too fast after a mistake during the migration from the
old to the new gerrit, which did not keep the -1 vote; it needs some fixes.
Change-Id: Ifc8a23e42449dfe471049270b4706e9b137e096e
Hugetlbfs file mappings are handled differently than regular files:
- pager_req_create will tell us the file is in a hugetlbfs
- allocate memory upfront, we need to fail if not enough memory
- the memory needs to be given again if another process maps the same
file
This implementation still has some hacks, in particular, the memory
needs to be freed when all mappings are done and the file has been
deleted/closed by all processes.
We cannot know when the file is closed/unlinked easily, so clean up
memory when all processes have exited.
To test, install libhugetlbfs and link a program with the additional
LDFLAGS += -B /usr/share/libhugetlbfs -Wl,--hugetlbfs-align
Then run with HUGETLB_ELFMAP=RW set; you can check this works with
HUGETLB_DEBUG=1 HUGETLB_VERBOSE=2
Change-Id: I327920ff06efd82e91b319b27319f41912169af1
- remove unused MF_END (that only makes sense for enums without holes,
this one is a set of bit masks)
- remove useless goto in pager_req_create()
- init maxprot to 0 from the start, it's not used in the error cases
(except for debug print)
Change-Id: Ic56c0754824b99f8a7e45fa8e99b8fe3e7c7e592
There were mainly two problems with shebangs:
- Suffix arguments handling e.g. '#!/bin/sh -x'
- Recursive handling e.g. script1 fetches '#!/path/to/script2'
and script2 itself has a shebang
- (did I say two?) running shebang would replace argv[optind] instead
of appending e.g. script with '#!/bin/sh' and running './script -c'
would run '/bin/sh -c' instead of '/bin/sh ./script -c'
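The first and third points can be sketched as follows. Both helpers are
hypothetical illustrations, not the mcexec code: parse_shebang() splits the
'#!' line into the interpreter and one optional argument (everything after
the interpreter is kept as a single argument, as Linux does), and
build_shebang_argv() inserts the script path instead of replacing an
existing element:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split a "#!" line in place. Returns 0 on success,
 * -1 if the line is not a usable shebang. *arg is NULL if there is no
 * argument after the interpreter.
 */
static int parse_shebang(char *line, char **interp, char **arg)
{
	size_t len;
	char *p;

	assert(line && interp && arg);
	if (strncmp(line, "#!", 2) != 0)
		return -1;

	/* Cut at the newline, then drop trailing blanks. */
	line[strcspn(line, "\n")] = '\0';
	len = strlen(line);
	while (len > 2 && (line[len - 1] == ' ' || line[len - 1] == '\t'))
		line[--len] = '\0';

	/* Skip blanks between "#!" and the interpreter path. */
	p = line + 2 + strspn(line + 2, " \t");
	if (*p == '\0')
		return -1;
	*interp = p;
	*arg = NULL;

	/* Terminate the interpreter; the remainder is one argument. */
	p += strcspn(p, " \t");
	if (*p != '\0') {
		*p++ = '\0';
		p += strspn(p, " \t");
		if (*p != '\0')
			*arg = p;
	}
	return 0;
}

/*
 * Build "interp [arg] script rest..." so the script path is inserted
 * rather than overwritten. out must have room for nrest + 4 entries.
 */
static int build_shebang_argv(char *interp, char *arg, char *script,
			      char **rest, int nrest, char **out)
{
	int i, n = 0;

	out[n++] = interp;
	if (arg)
		out[n++] = arg;
	out[n++] = script;
	for (i = 0; i < nrest; i++)
		out[n++] = rest[i];
	out[n] = NULL;
	return n;
}
```

For the recursive case, the same two steps would be repeated on the
interpreter for as long as it is itself a script.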
There also are two places where this needs parsing:
- starting a fresh program from mcexec
- starting a new program from execve in mcexec
The first was easy to fix as we already had argv around, but the latter
required a new way to transfer the 'new argv elements from the script'
to mckernel to append before its argv -- it used to be 'desc->shell_path'
but that was no longer used at some point and just one keyword is not
enough to handle this properly.
This commit does:
- Refactors the lookup_path + load_elf_desc sequence, which was only done
at most twice, into its own function that loops without a fixed limit,
and uses that in both situations described above
- Transmits the argv addition in the transfer to mckernel after the
desc; mckernel allocates 4 pages (hardcoded) for the descs and we will
hopefully have room for the script arguments on top of that... (there is
no guard!!!)
- Changes flatten_strings to allow prepending a flattened string instead
of a single string.
Note that the flatten_strings change also brought in a difference in the
format: to have the full length embedded within the string, the last
slot, which used to be zero, now contains the position of the end of the
buffer (where the last+1 string would be if there had been one).
This required a trivial change in mckernel prepare args function that
used this property for no real reason.
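To make the format change concrete, here is a sketch under an assumed
layout; the real flatten_strings layout may differ in its details, and
flatten_sketch() is a hypothetical name. The assumption is an array of long
slots (string count, one byte offset per string, then the end position)
followed by the string bytes:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Assumed layout (illustration only):
 *   slot[0]      number of strings n
 *   slot[1..n]   byte offset of each string from the buffer start
 *   slot[n+1]    byte offset of the end of the used data
 * Returns the allocated size in bytes, or -1 on allocation failure.
 */
static long flatten_sketch(int n, const char *const *strs, char **out)
{
	size_t header, total, alloc, pos;
	long *slot;
	char *buf;
	int i;

	assert(n >= 0 && strs && out);
	header = (size_t)(n + 2) * sizeof(long);
	total = header;
	for (i = 0; i < n; i++)
		total += strlen(strs[i]) + 1;

	/* Round up to whole longs; calloc zeroes the trailing padding. */
	alloc = (total + sizeof(long) - 1) & ~(sizeof(long) - 1);
	buf = calloc(1, alloc);
	if (!buf)
		return -1;

	slot = (long *)buf;
	slot[0] = n;
	pos = header;
	for (i = 0; i < n; i++) {
		size_t len = strlen(strs[i]) + 1;

		slot[i + 1] = (long)pos;
		memcpy(buf + pos, strs[i], len);
		pos += len;
	}
	/* Last slot: where the next string would start. */
	slot[n + 1] = (long)pos;
	*out = buf;
	return (long)alloc;
}
```

With the end position stored in the last slot, a consumer can find the
length of the flattened data from the buffer itself, without being given
a separate length.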
Hopefully things work™, this probably warrants adding a couple of new
ostests...
- create a couple of scripts with recursive invocation/arguments and
check their own argv.
- execute "mcexec script args" and "mcexec sh -c 'script args'"
Change-Id: I2cf9cde5c07c9293f730de89c9731bd93dbfa789
Refs: #1115
Do not return from fork() until mcctrl side has created mckernel's
procfs entries for the child PID.
This fixes programs that call fork() and immediately open
/proc/<child pid>/something, which would get an error.
Refs: #1189
Change-Id: Ie10ea56b65c55f59e96a1ab6ef83a1070e36048d
For get_user_pages_remote in binfmt_mcexec.c:
In 4.10 with 5b56d49fc31d ("mm: add locked parameter to
get_user_pages_remote()")
In 4.9 with 9beae1ea8930 ("mm: replace get_user_pages_remote()
write/force parameters with gup_flags")
For vmf in syscall.c, these two patches in 4.10:
82b0f8c39a38 ("mm: join struct fault_env and vm_fault")
1a29d85eb0f1 ("mm: use vmf->address instead of
vmf->virtual_address")
Fujitsu: POSTK_DEBUG_ARCH_DEP_41
Change-Id: I89a02d03169a2162ea186da1804bf48910446d11
This includes the following fix:
send_syscall, do_syscall: remove argument pid
Fujitsu: POSTK_TEMP_FIX_26
Refs: #1165
Change-Id: I702362c07a28f507a5e43dd751949aefa24bc8c0
We had a deadlock between:
- free_process_memory_range (take lock) -> ihk_mc_pt_free_range ->
... -> remote_flush_tlb_array_cpumask -> "/* Wait for all cores */"
and
- obj_list_lookup() under fileobj_list_lock that disabled irqs
and thus never ack'd the remote flush
The rework is quite big but removes the need for the big lock,
although devobj and shmobj needed a new smaller lock to be
introduced - the new locks are used much more locally and
should not cause problems.
On the bright side, moving refcounting to the memobj level means
we could remove the refcounting implemented separately in all object
types, which simplifies the code a bit.
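As an illustration of refcounting kept in one common place, here is a
small sketch using C11 atomics. The struct and function names are
hypothetical and do not reflect the actual memobj definitions; the
per-type part is reduced to a release callback:

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical common object header with a shared refcount. */
struct obj_sketch {
	atomic_int refcnt;
	void (*release)(struct obj_sketch *obj); /* per-type cleanup */
};

static void obj_ref(struct obj_sketch *obj)
{
	int old = atomic_fetch_add(&obj->refcnt, 1);

	/* Taking a reference on an already released object is a bug. */
	assert(old > 0);
	(void)old;
}

/* Returns 1 if this call dropped the last reference. */
static int obj_unref(struct obj_sketch *obj)
{
	if (atomic_fetch_sub(&obj->refcnt, 1) == 1) {
		obj->release(obj);
		return 1;
	}
	return 0;
}

/* Illustration only: a release callback that just counts calls. */
static int released_count;

static void count_release(struct obj_sketch *obj)
{
	(void)obj;
	released_count++;
}
```

Because the count is dropped atomically, no list-wide lock is needed just
to decide when an object can be released.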
Change-Id: I6bc8438a98b1d8edddc91c4ac33c11b88e097ebb