294 Commits

Author SHA1 Message Date
8cf70900e7 mcexec: Fix LD_PRELOAD string manipulation
To suppress compiler warnings.

Change-Id: I4d6b5ce2d2a8fca3f2675a7fc309df40cfe3c04b
2020-04-01 01:18:10 -04:00
d82ac31bc6 faccessat: Specify AT_SYMLINK_NOFOLLOW only when necessary.
- Specify AT_SYMLINK_NOFOLLOW in faccessat only when
   the symbolic-link is analyzed by overlay_path().

Change-Id: Ie3b1f7fedef7441fd4b39c5c8b2ef0f73cba770e
Refs: #1370
2020-03-20 00:22:50 +00:00
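Note on d82ac31bc6: a minimal C sketch of passing AT_SYMLINK_NOFOLLOW to faccessat() only when needed. The helper name `check_exists` and its `link_itself` parameter are hypothetical; they stand in for the "symbolic link was analyzed by overlay_path()" condition, whose actual implementation is not shown in this log.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/*
 * Check that path exists. link_itself is a hypothetical flag standing in
 * for "this symbolic link was analyzed by overlay_path()": only then is
 * AT_SYMLINK_NOFOLLOW passed, so the link itself is checked instead of
 * its target.
 * Returns 0 if the path is accessible, -1 otherwise (errno is set).
 */
static int check_exists(const char *path, int link_itself)
{
	int flags = link_itself ? AT_SYMLINK_NOFOLLOW : 0;

	return faccessat(AT_FDCWD, path, F_OK, flags);
}
```

With a dangling symbolic link, the call that sets the flag reports that the link exists, while the call without it fails because the target is missing.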
72af689e69 mcexec: detect mismatch of mcexec -n and mpirun -ppn
Change-Id: Iaf5cfb11c37bea6957b77a2114f783e9a46a48f2
Refs: #929
2020-02-05 06:39:57 +00:00
911b07f507 fix: fork's race-condition caused by child and grand-child
Refs: #1329
Change-Id: Ia2d7641d1203f40155fef5db718d1bb2c583c1c5
2020-01-09 06:33:13 +00:00
3c256e1a6c overlay: getdents: support lseek
Refs: #1421
Change-Id: Ife7ab1b50159a5897552ff695bb001ada27ec934
2019-12-13 03:49:20 +00:00
7fc4272b89 handle execveat system call on McKernel
Refs: #1366
Change-Id: I921e04a0df8d0d798fc94f675e5112dd2fec190a
2019-12-06 09:33:13 +09:00
b3b7801d51 overlay: fix /proc/PID/task/ corner cases
Change-Id: I17086c684af4c665d0c228b4a65cdb232eccf602
2019-06-07 01:48:10 +00:00
2c5c47344d x86_64: mcexec: Remove "#include <asm/prctl.h>"
Change-Id: I441f7a1c2e23b927fcd065fefba3ef3617356c18
Fujitsu: POSTK_DEBUG_ARCH_DEP_77
2019-04-25 10:14:19 +09:00
c32edff2bb uti: rename x86-specific 'fs' to 'tls' + arm implem
Note: the original fujitsu implementation didn't rename the various
save_fs function/desc to save_tls for some reason, might as well go all
the way though...

Change-Id: Ic362c15c8b320c4d258d2ead8c5fd4eafd9d0ae9
Fujitsu: POSTK_DEBUG_ARCH_DEP_91
2019-03-22 16:38:29 +09:00
8356ef6c96 arm64: uti: Add arch-dependent helper for context switch
arm64 performs context-switch in kernel space instead of user space as in
x86_64.

Change-Id: Ib119b9ff014effb970183ee86cfac67fab773cba
Fujitsu: POSTK_DEBUG_ARCH_DEP_99
2019-03-22 06:52:21 +00:00
63d500515a mcexec: fix printf format warning
Some old commit before -Werror was enabled got merged,
blocking other builds. Quickly fix before anyone notices

Change-Id: I5a034cef6f79e3e99b381bb1a5d97088e33a6718
2019-03-22 05:25:34 +00:00
791e8c2114 Remove mcoverlayfs code
mcoverlayfs code is now unused (technically should work on top of the
soft emulation but not well tested, and untested unused code is bad).
Remove it.

Left the unshare/bind_mount_recursive code in mcexec in a new
MCEXEC_BIND_MOUNT ifdef (only in config.h.in directly to discourage use.
it disables the ioctl as well, but the main code is still compiled to
keep up to date with linux api changes... although it's using kallsyms
lookup so it does not validate much more than "the symbol still exists")

I honestly think this should go as well (people who would want to use it
are root and could do it manually), but will give up for now.

Change-Id: I832b6a8ab19e24ed67a1a5044b1c6c32381ae0aa
2019-03-22 05:18:43 +00:00
b87ac8b8c0 reproducible builds: remove most install paths in c code
In order to speed up test bot work it would be helpful to check for
identical build outputs and skip tests if required.

This removes most use of the install path in c code:
 - ql_mpi uses /proc/self/exe and looks for talker/server in same
directory as itself
 - mcexec looks for libihk.so in /proc/self/maps and use that path for
LD_PRELOAD prefix path
 - rootfsdir is not used right now but until a better fix happens just
hardcode it, someone who wants to change it can set it through cmake

There is one last occurrence of the install directory, MCEXEC_PATH in
mcctrl's binfmt code, for which the build system will just overwrite it
to a constant string at build time instead of trying to remove it too
hard. It would be possible to pass it as a kernel parameter or look for
mcexec in PATH but this is too much work for now.

Change-Id: I5d1352bc5748a1ea10dcae4be630f30a07609296
2019-03-22 05:01:32 +00:00
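Note on b87ac8b8c0: a minimal C sketch of locating a companion program in the same directory as the running executable through /proc/self/exe, in the spirit of what the message describes for ql_mpi. The function name `sibling_path` is hypothetical and this is not the ql_mpi code itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Write "<directory of the running executable>/<name>" into out.
 * Returns 0 on success, -1 on error or if out is too small.
 */
static int sibling_path(const char *name, char *out, size_t out_size)
{
	char exe[PATH_MAX];
	ssize_t len;
	char *slash;
	int n;

	/* readlink() does not terminate the buffer, so keep one byte free */
	len = readlink("/proc/self/exe", exe, sizeof(exe) - 1);
	if (len < 0)
		return -1;
	exe[len] = '\0';

	slash = strrchr(exe, '/');
	if (!slash)
		return -1;
	*slash = '\0';	/* keep only the directory part */

	n = snprintf(out, out_size, "%s/%s", exe, name);
	return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Because the path is derived at run time, no install prefix needs to be compiled into the binary, which is what keeps build outputs comparable.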
0cc3496747 warnings: fix missing field in initializer
use generic struct zero initializer instead.
Older gcc used on arm also seems to have trouble with '{}',
so use '{ 0 }' instead

Change-Id: I83d43b05f8d1d44e1dd86502b48e28fe242e1db2
2019-03-06 06:30:30 +00:00
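Note on 0cc3496747: a minimal C sketch of the generic zero initializer the message refers to. The struct `sample_desc` and the function `make_zeroed_desc` are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Hypothetical struct, used only for illustration. */
struct sample_desc {
	int   id;
	long  flags;
	char  name[16];
	void *next;
};

static struct sample_desc make_zeroed_desc(void)
{
	/*
	 * '{ 0 }' initializes the first member explicitly and all
	 * remaining members implicitly to zero. The empty form '{}' is
	 * a GNU extension before C23, so some compilers reject it.
	 */
	struct sample_desc d = { 0 };

	return d;
}
```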
06e96005a6 mcexec: restore --enable-vdso/disable-vdso for x86
Fujitsu added this ifdef together with ifndef __arch64__ and thus disabled
the option for both archs in practice; it probably does not hurt to restore...

I'm not sure I see the point of disabling the option at mcexec level though,
but who am I to care.

Change-Id: I0d4bffb6ed325edac8ae577773e19c0fff6ca2ed
Fujitsu: POSTK_DEBUG_ARCH_DEP_53
2019-03-01 05:08:45 +00:00
2a63c962fc build system switch to cmake
Remove old build system at the same time

Change-Id: Ifdffe1fcd4cfece05f036d8de6e7cb74aca65f62
2019-02-14 16:44:09 +09:00
950ea678dd Reject "setfsuid: Specify mcexec tid when asking mcexec for fsuid"
This fix is rejected because it only makes the setfsuid test in ostest
pass and doesn't fix the other issues including the one in which file
I/O could be done with the old fsuid because an mcexec thread with an
arbitrary tid could handle the system-call offload request.

Explanation of the rejected fix:

  setfsuid() proceeds as follows:

  1. McKernel asks mcexec for __NR_setfsuid (set)
  2. mcexec calls setfsuid, reports the id to McKernel
  3. McKernel asks mcexec for __NR_setfsuid (get)
  4. mcexec calls mcexec_getcred(), reports the id to McKernel
  5. McKernel sets proc->fsuid to the obtained value

  tid of mcexec on the 2nd and 4th step could be different. So this
  fix lets mcexec report its tid on the 2nd step and McKernel specify
  it in the 3rd step.

Change-Id: Id5cfeed18c64430d576a56e961bbca1ecb2e39ad
Fujitsu: POSTK_DEBUG_TEMP_FIX_45
2019-02-14 04:42:32 +00:00
ff0395581c Register PPD and release_handler at the same time.
Fix the process being left behind when a signal is received between PPD
registration and release_handler registration.

Refs: #1201
Fujitsu: POSTK_DEBUG_TEMP_FIX_64
Change-Id: I571781963578df8cedb327f19298f595cfb137a3
2019-02-08 10:20:58 +09:00
9bf225d193 mckernel overlay: replace mcoverlayfs with a soft userspace overlay
mcoverlayfs has a high maintenance burden and does not work on rhel8's 4.18
kernel (while it works on vanilla 4.18...); instead of debugging this further
time is better spent making it independent from overlayfs.

Change-Id: I7454ae95b0fbb3373c256aa2fd83cdfec466c009
2019-02-06 08:27:25 +00:00
516ab87ab9 Copyrights: fujitsu 2018 bump
Separate copyright bumps in a different commit.
A lot of files only had the copyright change at this point; these
were probably changes I added separately in other patches but just
split these in a different commit instead to simplify git stats

Change-Id: I93cf3fc1c0fa04ee743a79c3fe9768933e6bd0d2
2019-02-01 13:18:52 +09:00
ca34154a43 mcexec: lookup page_size with sysconf
page size is not defined in sys/user.h on aarch64

Change-Id: Idbdaef2519792eeb1e1a2794be0a34d67e87907e
Fujitsu: POSTK_DEBUG_ARCH_DEP_35
2019-02-01 13:16:40 +09:00
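Note on ca34154a43: a minimal C sketch of querying the page size at run time with sysconf() instead of relying on a compile-time constant. The helper names `runtime_page_size` and `page_align_up` are hypothetical, not taken from mcexec.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <unistd.h>

/* Query the page size at run time; returns 0 if sysconf() fails. */
static size_t runtime_page_size(void)
{
	long sz = sysconf(_SC_PAGESIZE);

	return sz > 0 ? (size_t)sz : 0;
}

/*
 * Round len up to a multiple of page_size.
 * page_size must be a power of two, as page sizes are.
 */
static size_t page_align_up(size_t len, size_t page_size)
{
	return (len + page_size - 1) & ~(page_size - 1);
}
```

The same run-time value can serve as a default wherever a page-sized quantity is needed, so one binary works on systems with different page sizes.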
960a6f5f90 prepare process: add magic header in program_load_desc
Check we mapped the correct region with a magic header in the struct

Original commit: d246b93a3bced92d0ac2a4a337118091b010658a

Fujitsu: POSTK_DEBUG_TEMP_FIX_76
Change-Id: If848be64af5d76844ba65b48493021637c8114f4
2019-02-01 13:16:25 +09:00
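Note on 960a6f5f90: a minimal C sketch of the magic-header technique. The struct layout, the constant `DESC_MAGIC`, and both function names are hypothetical; the real program_load_desc layout and magic value are not shown in this log.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical magic value, chosen only for this example. */
#define DESC_MAGIC 0x4d434b4445534331ULL

/* Hypothetical descriptor; the magic is the first field. */
struct load_desc {
	uint64_t magic;
	int num_sections;
};

/* The producer stamps the descriptor before handing it over. */
static void desc_init(struct load_desc *d)
{
	d->magic = DESC_MAGIC;
	d->num_sections = 0;
}

/*
 * The consumer checks the stamp after mapping the region.
 * Returns 0 if the header matches, -1 if the wrong region was mapped
 * or the contents were overwritten.
 */
static int desc_check(const struct load_desc *d)
{
	return (d != NULL && d->magic == DESC_MAGIC) ? 0 : -1;
}
```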
dfd23c3ebe prctl: Add support for PR_SET_THP_DISABLE and PR_GET_THP_DISABLE
Change-Id: I04c5568a9eb78bcac632b734f34bba49cf602c4d
Refs: #1181
2019-01-22 05:40:56 +00:00
ef9fda23a9 mcexec: Set default heap extension amount to sysconf(_SC_PAGESIZE)
Change-Id: I3ac660d33918c1fa28093ab59f3a7ead65d337d7
2018-12-12 00:38:10 +00:00
da02f76a25 mcexec: Fix error handling of init_worker_threads
Refs: #1233
Change-Id: Icce49c996d69b3cf64a71e7bd470421f329c881f
2018-12-04 09:40:24 +00:00
c585a37440 move mcoverlayfs kernel version check from mcexec.c to configure
While we are here:
 - fix uname -r (single quote?!)
 - add compat for rhel8 (el kernel and version is 4.18)
 - also remove linux version check in mcreboot.sh, trust configure check

Change-Id: I14726d4374b0dfd941640096044ea1d5d88bfcb8
2018-11-26 12:09:00 +00:00
1253f4d18c mcexec shebang: delete spaces *before* path as well
Apparently, a shebang '#! /bin/sh' should work.
Will add some ostests for these...

Change-Id: Iab8ba8e3cc7e434c98742f71fe7db3c425f08278
2018-11-21 07:39:51 +00:00
525b90d028 flatten_string/process env: realign env and clear trailing bits
envs are stuck after args which are now possibly unaligned, and used
from a non-aligned pointer in prepare_process_ranges_args_envs (env)

The memory immediately after args/envs is copied anyway with memcpy_long,
so make sure the bits are initialized and realign env correctly

Fixes: 70e52faf36 ("flatten_strings: do not return unused trailing bits")
Change-Id: Ic747e947d151c0eea65dec36bc9c888cf6e0c394
2018-11-21 07:39:16 +00:00
1a5b10277f mcexec: load_elf: disable execvp for within-mckernel execs
the libc takes care of trying execve as many times as needed for
execvp, it's not a kernel call.

Also, sneak a double-free fix (desc was not reset properly in case
load_elf_desc_shebang failed)

Fixes: b1681f4a3affff ("mcexec/execve: fix shebangs handling")
Change-Id: If8e3d7ae53acdeffc0331ae8621e0832fcfa406f
2018-11-21 16:17:58 +09:00
a59c55c188 mcexec load_elf_desc: print error after returning
Running "mcexec dfsafds" did not print any message in normal use.
Rather than looking for which message shows in debug and turning it into
eprintf, add a single coherent message (more shell-like) at the end and
turn other messages off.

There is a small loss of information but this is equivalent to what
shells give (a single errno value with no details), and it is now easy
to add --debug to mcexec to see more information if required

Change-Id: Id2c3a47880b7d1d7467883351e6e7af561f91bbf
2018-11-21 16:17:58 +09:00
1d6a078afa mcexec: add --debug-mcexec
We already have debug statements compiled in, add a toggle for it
Also fix case indent for 's'

Change-Id: I1104ee57d571b82ec5e061f22cd44033a5c7fc39
2018-11-21 07:16:54 +00:00
3a90521489 mcexec: fix strncat bounding
strncat must not look at the appendee's length, but at how much
is left where we're appending.
This API is stupid anyway, where is strlcat when we need it...

Change-Id: Icdf418083146420a06f8ba5ffdf882982610d39b
2018-11-21 16:06:31 +09:00
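Note on 3a90521489: a minimal C sketch of the bounding rule described above, assuming a destination buffer of known total size. `append_bounded` is a hypothetical helper, not a function from mcexec.

```c
#include <stddef.h>
#include <string.h>

/*
 * Append src to the NUL-terminated string in dst, whose total capacity
 * is dst_size bytes. The strncat() bound is the space still free in dst
 * (minus one byte for the terminator), not strlen(src).
 * Returns 0 when src fit completely, -1 when it was truncated.
 */
static int append_bounded(char *dst, size_t dst_size, const char *src)
{
	size_t used = strlen(dst);
	size_t room;

	if (used + 1 > dst_size)
		return -1;	/* no room left, not even for the terminator */
	room = dst_size - used - 1;
	strncat(dst, src, room);
	return strlen(src) <= room ? 0 : -1;
}
```

Passing strlen(src) as the bound would let strncat() write past the end of dst whenever src is longer than the remaining space.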
85c936a6cb mcexec: fix terminating zero after readlink()
Change-Id: Icb5432f157ceb2182d93e2d327cfa63ad02a8c0e
2018-11-08 17:01:22 +09:00
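Note on 85c936a6cb: readlink() does not append a terminating NUL byte, so the caller has to add it. A minimal C sketch follows; `read_link_terminated` is a hypothetical helper name, not the code changed by this commit.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <unistd.h>

/*
 * Read the target of the symbolic link at path into buf.
 * At most size - 1 bytes are read so that one byte always remains
 * for the terminator, which is written by hand.
 * Returns the number of bytes stored (excluding the NUL), or -1 on error.
 */
static ssize_t read_link_terminated(const char *path, char *buf, size_t size)
{
	ssize_t len;

	if (size == 0)
		return -1;
	len = readlink(path, buf, size - 1);
	if (len < 0)
		return -1;
	buf[len] = '\0';
	return len;
}
```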
70e52faf36 flatten_strings: do not return unused trailing bits
Trailing bits were displayed in proc->saved_cmdline, displaying
uninitialized data to the user in /proc/<pid>/cmdline

Change-Id: I74831c8c68dd2f2197b35e9b49aaaae29c4c1dd5
2018-10-15 08:35:50 +00:00
8db36c3828 mcexec: do not resolve links in lookup_exec_path
This would incorrectly make "mcexec sh -c './script.sh'" run with
/bin/bash instead of /bin/sh (which is important, because bash behaviour
changes depending on how it is invoked)

Change-Id: I80610cf442c6c3ecacfa23e8ed15652bc8d4e3f7
2018-10-15 08:35:41 +00:00
b1681f4a3a mcexec/execve: fix shebangs handling
There were mainly two problems with shebangs:
 - Suffix arguments handling e.g. '#!/bin/sh -x'
 - Recursive handling e.g. script1 fetches '#!/path/to/script2'
and script2 itself has a shebang
 - (did I say two?) running shebang would replace argv[optind] instead
of appending e.g. script with '#!/bin/sh' and running './script -c'
would run '/bin/sh -c' instead of '/bin/sh ./script -c'

There also are two places where this needs parsing:
 - starting a fresh program from mcexec
 - starting a new program from execve in mcexec

The first was easy to fix as we already had argv around, but the latter
required a new way to transfer the 'new argv elements from the script'
to mckernel to append before its argv -- it used to be 'desc->shell_path'
but that was no longer used at some point and just one keyword is not
enough to handle this properly.

This commit does:
 - Refactors the lookup_path + load_elf_desc sequence, which used to be
done at most twice, into its own function that loops indefinitely, and
uses that in both situations described above
 - Transmits the argv addition in the transfer to mckernel after the
desc; mckernel allocates 4 pages (hardcoded) for the descs and we will
hopefully have room for the script arguments on top of that... (there is
no guard!!!)
 - Change flatten_strings to allow prepending a flattened string instead
of a single string.
Note that the flatten_string change also changed the format: to have the
full length embedded within the string, the last slot, which used to be
zeroes, now contains the position of the end of the buffer (where the
last+1 string would be if there had been one)
This required a trivial change in mckernel prepare args function that
used this property for no real reason.

Hopefully things work™, this probably warrants adding a couple of new
ostests...
 - create a couple of scripts with recursive invocation/arguments and
check their own argv.
 - execute "mcexec script args" and "mcexec sh -c 'script args'"

Change-Id: I2cf9cde5c07c9293f730de89c9731bd93dbfa789
Refs: #1115
2018-10-04 14:31:02 +09:00
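Note on b1681f4a3a: a minimal C sketch of the argv handling the message describes: skip blanks before the interpreter path, keep a suffix argument, and insert the interpreter in front of the script path instead of replacing it. Both function names are hypothetical. Treating everything after the interpreter as a single argument follows the Linux convention and is an assumption here; the splitting rule mcexec actually uses is not stated in the message.

```c
#include <string.h>

/*
 * Parse a "#!" line in place. On success *interp points at the
 * interpreter path and *arg at the optional argument (NULL if absent).
 * Returns 0 on success, -1 if the line is not a usable shebang.
 */
static int parse_shebang(char *line, char **interp, char **arg)
{
	char *p, *end;

	if (strncmp(line, "#!", 2) != 0)
		return -1;
	line[strcspn(line, "\n")] = '\0';	/* drop the newline */

	p = line + 2;
	p += strspn(p, " \t");		/* skip blanks before the path */
	if (*p == '\0')
		return -1;
	*interp = p;

	p += strcspn(p, " \t");		/* end of the interpreter path */
	*arg = NULL;
	if (*p != '\0') {
		*p++ = '\0';
		p += strspn(p, " \t");
		if (*p != '\0') {
			/* trim trailing blanks from the argument */
			end = p + strlen(p);
			while (end > p && (end[-1] == ' ' || end[-1] == '\t'))
				*--end = '\0';
			*arg = p;
		}
	}
	return 0;
}

/*
 * Build the new argv: interpreter, optional argument, script path,
 * then the script's own arguments (argv[1] onwards).
 * out must have room for argc + 3 entries. Returns the new argc.
 */
static int build_script_argv(char *interp, char *arg, char *script,
			     int argc, char **argv, char **out)
{
	int n = 0, i;

	out[n++] = interp;
	if (arg)
		out[n++] = arg;
	out[n++] = script;
	for (i = 1; i < argc; i++)
		out[n++] = argv[i];
	out[n] = NULL;
	return n;
}
```

For a script starting with '#!/bin/sh' and invoked as './script -c', this yields '/bin/sh ./script -c'. A recursive shebang would repeat the same two steps on the interpreter until an ELF binary is reached.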
9b77630c8b mcexec: readlink and use full path for reexec
This fixes comm on linux side, showing mcexec instead of 'exe'

Change-Id: I9345d7a23dccb36b3a1e17fd3e7491eaeca54e5b
2018-10-04 01:03:10 +00:00
7e342751a2 do_syscall: Delegate system calls to the mcexec with the same pid
This includes the following fix:
send_syscall, do_syscall: remove argument pid

Fujitsu: POSTK_TEMP_FIX_26
Refs: #1165
Change-Id: I702362c07a28f507a5e43dd751949aefa24bc8c0
2018-09-13 16:59:47 +09:00
b51886421e uti: Don't compile syscall_intercept related stuff when not specified with configure option
Change-Id: I9be8cb9b3fcae78d33a33b057c43caee23a81fc1
2018-09-05 16:29:20 +09:00
00a34a8ba3 uti: util_thread: Hoist uti_desc check
Change-Id: I8c4b75140df2fe149dfe20e0a8f0bf323b5f1763
2018-09-04 19:53:03 +09:00
8900c2cec5 uti: mcexec_uti_attr: Fix CPU binding decision
Change-Id: I4047858895503ae912e5575bb232dbbb2f915722
2018-09-04 19:53:03 +09:00
781a69617b uti: Replace data types represented as arrays with C structures
Defining C structures for the following objects:
(1) Remote and local context
(2) Stack of system call arguments / return values

Change-Id: Iafbb6c795bd765e3c78c54a255d8a1e4d4536288
2018-09-04 19:53:03 +09:00
04d4145b3e uti: Replace dead uti thread with new mcexec thread in proc->tids
Change-Id: Ic6e906dd1bfac1b07f1317732cbe0a5191831cd8
2018-09-04 19:53:03 +09:00
6b031c5472 uti: Fix condition for pthread_join of mcexec threads
Change-Id: Iaeee91c197b84436f84ce4380768aa79e7f9419e
2018-09-04 19:53:02 +09:00
e42c414454 uti: Hook system calls by binary-patching glibc
(1) Add --enable-uti option. The binary-patch library is
    preloaded with this option.
(2) Binary-patching is done by syscall_intercept developed by Intel

This commit includes the following fixes:

(1) Fix do_exit() and terminate() handling
(2) Fix timing of killing mcexec threads when McKernel thread calls terminate()

Change-Id: Iad885e1e5540ed79f0808debd372463e3b8fecea
2018-09-04 19:53:02 +09:00
e613483bee uti: Add system call profile
2018-09-04 19:53:02 +09:00
4969762f15 uti: Add usage of uti specific options to mcexec
2018-09-04 19:53:02 +09:00
8c11daf726 uti: Fix signal relay from mcexec to McKernel
Change-Id: I2ffd8049a0fb1637cfc6bab7fe24c6a85e5e53fc
2018-09-04 19:53:01 +09:00
5cb8a1f10f uti: Workaround not to share CPU with OpenMP threads
* Assign uti thread to the last idle CPU so that it's not shared with
  an OpenMP thread

Change-Id: Ia42cae056ce81fde9b6dab6286b39a52f3c9e172
2018-09-04 19:53:01 +09:00
b6ab5911b7 uti: Identify uti thread by clone count
--uti-thread-count <count> is added to mcexec.

Change-Id: Id9ec464412a5bb71e4d9e87d05f79de22d35b067
2018-09-04 19:53:01 +09:00