Commit Graph

821 Commits

Author SHA1 Message Date
3a90521489 mcexec: fix strncat bounding
strncat must not look at the appendee's length, but at how much
is left where we're appending.
This API is stupid anyway, where is strlcat when we need it...

Change-Id: Icdf418083146420a06f8ba5ffdf882982610d39b
2018-11-21 16:06:31 +09:00
03802052ed mcctrl: add handling for one more level of page tables
newer linux got a 5 level page table now, try to handle that.

Some of the macros will be no-op (e.g. loop only on one iteration) on
architecture/kernels with only 4 levels but the code needs to be there
to compile

Change-Id: Ifc6304cbb066dce7d4e30962687ae05d7e034730
2018-11-21 07:03:24 +00:00
c21485d427 mcctrl: include linux/cred.h
The headers defines __task_cred and other macroes we use, and always
existed; we must have gotten it indirectly on older kernels, it doesn't
hurt to always include

Change-Id: Iacfff0365e7a21e6247eea42606bbbf1dfccc077
2018-11-21 06:38:08 +00:00
18d50e48dc mcctrl: lookup for alternate syscall names
on newer x64 kernels (config option?), syscalls can be renamed to allow
both x64 and ia32 versions to coexist. Lookup either names

Change-Id: I2f55cc804d3eee948ee1ed6d18c69c75bd2f652c
2018-11-21 06:38:08 +00:00
a2be475ae4 mcctrl control: replace cpu_isset by cpumask_test_cpu for new kernels
Change-Id: I60635118e5ce7281de97e024c626ac40d1a4aa36
Fujitsu: POSTK_DEBUG_ARCH_DEP_54
2018-11-21 06:38:08 +00:00
38f683d1d0 mcctrl control: task start_time changed to u64 nsec
Change-Id: I1128c20cf836d20b6e84d7ec58cf8dfb075297da
Fujitsu: POSTK_DEBUG_ARCH_DEP_74
2018-11-21 06:38:08 +00:00
59828db5c9 mcctrl archdeps: rename vdso_image_64 to _vdso_image_64
The symbol appears in some header in some linux version,
it's still not exported so we need our own lookup anyway; just rename it.

Change-Id: Ia4bce85988641c96fa3f5a0ae1d42c25c713b6c2
2018-11-21 06:38:08 +00:00
85c936a6cb mcexec: fix terminating zero after readlink()
Change-Id: Icb5432f157ceb2182d93e2d327cfa63ad02a8c0e
2018-11-08 17:01:22 +09:00
6f9fef2b13 procfs: Make /proc/<PID>/mem unwritable
refs: #1177
Change-Id: Ibb319221155547febf9126e05a9e322bd9f140cc
2018-10-26 08:58:31 +00:00
cc1d39e55d mcctrl_perf_enable: Fix type of integer constant
Change-Id: Ib98eca85a9962520dafdd08b8fc223a6a83bafd0
2018-10-24 14:56:26 +09:00
fd8bed670e ihk_os_setperfevent: Return number of registered events
In addition to that, mcctrl_perf_set is modified so that it updates
usrdata->perf_event_num with number of registered events.

Change-Id: I3f343176f55b06d3baab0b0fe34e240f39706cf6
Fujitsu: POSTK_DEBUG_TEMP_FIX_80
2018-10-24 06:16:41 +00:00
70e52faf36 flatten_strings: do not return unused trailing bits
Trailing bits were displayed in proc->saved_cmdline, displaying
uninitialized data to the user in /proc/<pid>/cmdline

Change-Id: I74831c8c68dd2f2197b35e9b49aaaae29c4c1dd5
2018-10-15 08:35:50 +00:00
8db36c3828 mcexec: do not resolve links in lookup_exec_path
This would incorrectly make "mcexec sh -c './script.sh'" run with
/bin/bash instead of /bin/sh (which is important, because bash behaviour
changes depending on how it is invoked)

Change-Id: I80610cf442c6c3ecacfa23e8ed15652bc8d4e3f7
2018-10-15 08:35:41 +00:00
06dd71a7e0 Revert "procfs: add '/proc/pid/stat' to mckernel side and fix its comm"
This reverts commit b70d470e20.

That commit had been landed too fast after a mistake during migration
from old to new gerrit that didn't keep -1 vote ; it needs some fix

Change-Id: Ifc8a23e42449dfe471049270b4706e9b137e096e
2018-10-12 10:54:14 +09:00
c86d168165 procfs: handle 'comm' on mckernel side
Change-Id: Ie68514ba3e5161b931b88eeee9e8a2267ee69354
2018-10-11 09:19:42 +00:00
1e1fa4f70d trivial warnings fixes (unused variable/function)
Change-Id: I71cedd2c09eeb5d2c2fd2e988dfdde0877627abc
2018-10-11 09:03:53 +00:00
39f9d7fdff Handle hugetlbfs file mapping
Hugetlbfs file mappings are handled differently than regular files:
 - pager_req_create will tell us the file is in a hugetlbfs
 - allocate memory upfront, we need to fail if not enough memory
 - the memory needs to be given again if another process maps the same
   file

This implementation still has some hacks, in particular, the memory
needs to be freed when all mappings are done and the file has been
deleted/closed by all processes.
We cannot know when the file is closed/unlinked easily, so clean up
memory when all processes have exited.

To test, install libhugetlbfs and link a program with the additional
LDFLAGS += -B /usr/share/libhugetlbfs -Wl,--hugetlbfs-align

Then run with HUGETLB_ELFMAP=RW set, you can check this works with
HUGETLB_DEBUG=1 HUGETLB_VERBOSE=2

Change-Id: I327920ff06efd82e91b319b27319f41912169af1
2018-10-11 08:54:13 +00:00
13e71ac9dc pager: minor cleanups
- remove unused MF_END (that only makes sense for enums without holes,
  this one is a set of bits masks)
- remove useless goto in pager_req_create()
- init maxprot to 0 from the start, it's not used in the error cases
  (except for debug print)

Change-Id: Ic56c0754824b99f8a7e45fa8e99b8fe3e7c7e592
2018-10-11 08:54:13 +00:00
b1681f4a3a mcexec/execve: fix shebangs handling
There were mainly two problems with shebangs:
 - Suffix arguments handling e.g. '#!/bin/sh -x'
 - Recursive handling e.g. script1 fetchs '#!/path/to/script2'
and script2 itself has a shebang
 - (did I say two?) running shebang would replace argv[optind] instead
of appending e.g. script with '#!/bin/sh' and running './script -c'
would run '/bin/sh -c' instead of '/bin/sh ./script -c'

There also are two places where this needs parsing:
 - starting a fresh program from mcexec
 - starting a new program from execve in mcexec

The first was easy to fix as we already had argv around, but the later
required a new way to transfer the 'new argv elements from the script'
to mckernel to append before its argv -- it used to be 'desc->shell_path'
but that was no longer used at some point and just one keyword is not
enough to handle this properly.

This commit does:
 - Refactors the lookup_path + load_elf_desc that was only done at most
twice in its own function that loops indefinitely and use that in both
situations described above
 - Transmits the argv addition in the transfer to mckernel after the
desc; mckernel allocates 4 pages (hardcoded) for the descs and we will
hopefully have room for the script arguments on top of that... (there is
no guard!!!)
 - Change flatten_strings to allow prepending a flattened string instead
of a single string.
Note that the flatten_string change also brought in a difference in the
format, to have the full length embedded within the string, the latest
slot that used to be zeroes now contains the position of the end of the
buffer (where the last+1 string would be if there had been one)
This required a trivial change in mckernel prepare args function that
used this property for no real reason.

Hopefully things work™, this probably warrants adding a couple of new
ostests...
 - create a couple of scripts with recursive invocation/arguments and
check their own argv.
 - execute "mcexec script args" and "mcexec sh -c 'script args'"

Change-Id: I2cf9cde5c07c9293f730de89c9731bd93dbfa789
Refs: #1115
2018-10-04 14:31:02 +09:00
73ea4b1ce9 ihk_os_getperfevent,setperfevent: Return -ETIME when IKC timeouts
Change the return value from -EINVAL to -ETIME.

Refs: #1167
Change-Id: I87fa57bb45d0036b7e4b25366aa7b7ce6fb2c764
2018-10-04 02:44:22 +00:00
09f663c246 mcctrl procfs: check entry was returned before using it
Change-Id: If66e95d217d1045e2e65bc5978bba020e3fa7c0d
Refs: #1116
2018-10-04 02:41:16 +00:00
9b77630c8b mcexec: readlink and use full path for reexec
This fixes comm on linux side, showing mcexec instead of 'exe'

Change-Id: I9345d7a23dccb36b3a1e17fd3e7491eaeca54e5b
2018-10-04 01:03:10 +00:00
b70d470e20 procfs: add '/proc/pid/stat' to mckernel side and fix its comm
This lets ps show the proper executable name instead of mcexec's comm
on linux side

Change-Id: I62732037451f129fc2e905357ebdc351bf7f6d2d
Refs: #1114
2018-10-04 01:01:19 +00:00
ecc850dfef procfs/do_fork: wait until procfs entries are registered
Do not return from fork() until mcctrl side has created mckernel's
procfs entries for the child PID.

This fixes programs doing fork() immediately followed by opening
/proc/<child pid>/something, and would get some error

Refs: #1189
Change-Id: Ie10ea56b65c55f59e96a1ab6ef83a1070e36048d
2018-10-04 01:00:52 +00:00
daa234d8b9 mcexec_create_per_process_data: use copy_from_user
Refs: #1205
Change-Id: Idced73a7f88aada5fc2462b490d56603f8fe2472
2018-09-27 15:42:01 +00:00
dd58d366c3 procfs: Fix pread/pwrite to procfs fail when specified size is bigger than 4MB
Fujitsu: POSTK_DEBUG_TEMP_FIX_43
Refs: #1018
Change-Id: I736ac69885695ef8eeababc3fcfe69a6258b4e16
2018-09-20 02:06:17 +00:00
42b9b31606 mcctrl: Propagate writecore()'s return value to caller
Fujitsu: POSTK_DEBUG_TEMP_FIX_62
Change-Id: I847dd520187cbf66fbad8140f79f62c6d5d9d5fc
2018-09-20 11:01:22 +09:00
29c5c68761 coredump: Change type of coretable.len to loff_t from int
Fujitsu: POSTK_DEBUG_TEMP_FIX_61
Change-Id: I6a27a8d477c3b3dcc12be772a15dfcff370bd2a8
2018-09-20 11:01:22 +09:00
38c08a6663 coredump: Add O_TRUNC to flags opening corefile
Fujitsu: POSTK_DEBUG_TEMP_FIX_59
Change-Id: I36c89fa894dfc0cdd170781e8ca4aab6149d4928
2018-09-20 11:01:20 +09:00
8c33c92720 mcctrl: Switch Linux functions/structures according to the version
For get_user_pages_remote in binfmt_mcexec.c:
In 4.10 with 5b56d49fc31d ("mm: add locked parameter to
get_user_pages_remote()")
In 4.9 with 9beae1ea8930 ("mm: replace get_user_pages_remote()
write/force parameters with gup_flags")

For vmf in syscall.c, these two patches in 4.10:
82b0f8c39a38 ("mm: join struct fault_env and vm_fault")
1a29d85eb0f1 ("mm: use vmf->address instead of
vmf->virtual_address")

Fujitsu: POSTK_DEBUG_ARCH_DEP_41
Change-Id: I89a02d03169a2162ea186da1804bf48910446d11
2018-09-20 01:50:04 +00:00
a269d96978 coredump: Exclude special areas
Fujitsu: POSTK_DEBUG_TEMP_FIX_38
Refs: #1005
Change-Id: I8934d2aecf06a09469afe131347e42b48b6f67f6
2018-09-20 01:48:17 +00:00
7e342751a2 do_syscall: Delegate system calls to the mcexec with the same pid
This includes the following fix:
send_syscall, do_syscall: remove argument pid

Fujitsu: POSTK_TEMP_FIX_26
Refs: #1165
Change-Id: I702362c07a28f507a5e43dd751949aefa24bc8c0
2018-09-13 16:59:47 +09:00
c25fb2aa39 memobj: transform memobj lock to refcounting
We had a deadlock between:
 - free_process_memory_range (take lock) -> ihk_mc_pt_free_range ->
... -> remote_flush_tlb_array_cpumask -> "/* Wait for all cores */"
and
 - obj_list_lookup() under fileobj_list_lock that disabled irqs
and thus never ack'd the remote flush

The rework is quite big but removes the need for the big lock,
although devobj and shmobj needed a new smaller lock to be
introduced - the new locks are used much more locally and
should not cause problems.

On the bright side, refcounting being moved to memobj level means
we could remove refcounting implemented separately in all object
types and simplifies code a bit.

Change-Id: I6bc8438a98b1d8edddc91c4ac33c11b88e097ebb
2018-09-12 18:03:25 +09:00
b51886421e uti: Don't compile syscall_intercept related stuff when not specified with configure option
Change-Id: I9be8cb9b3fcae78d33a33b057c43caee23a81fc1
2018-09-05 16:29:20 +09:00
00a34a8ba3 uti: util_thread: Hoist uti_desc check
Change-Id: I8c4b75140df2fe149dfe20e0a8f0bf323b5f1763
2018-09-04 19:53:03 +09:00
8900c2cec5 uti: mcexec_uti_attr: Fix CPU binding decision
Change-Id: I4047858895503ae912e5575bb232dbbb2f915722
2018-09-04 19:53:03 +09:00
781a69617b uti: Replace data types represented as arrays with C structures
Defining C structures for the following objects:
(1) Remote and local context
(2) Stack of system call arguments / return values

Change-Id: Iafbb6c795bd765e3c78c54a255d8a1e4d4536288
2018-09-04 19:53:03 +09:00
04d4145b3e uti: Replace dead uti thread with new mcexec thread in proc->tids
Change-Id: Ic6e906dd1bfac1b07f1317732cbe0a5191831cd8
2018-09-04 19:53:03 +09:00
98ee584ab6 uti: Change field name of release_user_space_desc
Change-Id: I18ada86ec3835198c1a947d8ceb36075d6ff2e94
2018-09-04 19:53:02 +09:00
6b031c5472 uti: Fix condition for pthread_join of mcexec threads
Change-Id: Iaeee91c197b84436f84ce4380768aa79e7f9419e
2018-09-04 19:53:02 +09:00
e42c414454 uti: Hook system calls by binary-patching glibc
(1) Add --enable-uti option. The binary-patch library is
    preloaded with this option.
(2) Binary-patching is done by syscall_intercept developed by Intel

This commit includes the following fixes:

(1) Fix do_exit() and terminate() handling
(2) Fix timing of killing mcexec threads when McKernel thread calls terminate()

Change-Id: Iad885e1e5540ed79f0808debd372463e3b8fecea
2018-09-04 19:53:02 +09:00
e613483bee uti: Add system call profile 2018-09-04 19:53:02 +09:00
c0271f4727 Add debug messages for per-process data 2018-09-04 19:53:02 +09:00
4969762f15 uti: Add usage of uti specific options to mcexec 2018-09-04 19:53:02 +09:00
8c11daf726 uti: Fix signal relay from mcexec to McKernel
Change-Id: I2ffd8049a0fb1637cfc6bab7fe24c6a85e5e53fc
2018-09-04 19:53:01 +09:00
5cb8a1f10f uti: Workaround not to share CPU with OpenMP threads
* Assign uti thread to the last idle CPU so that it's not shared with
  an OpenMP thread

Change-Id: Ia42cae056ce81fde9b6dab6286b39a52f3c9e172
2018-09-04 19:53:01 +09:00
b6ab5911b7 uti: Identify uti thread by clone count
--uti-thread-count <count> is added to mcexec.

Change-Id: Id9ec464412a5bb71e4d9e87d05f79de22d35b067
2018-09-04 19:53:01 +09:00
b0d7f890d0 uti: Reverse-offload msync() 2018-09-04 19:53:01 +09:00
b9c0cdddab uti: Cosmetic change 2018-09-04 19:52:14 +09:00
7ee7dd5e2c uti: Allow tracer to call release_handler() for the main process
Change-Id: I934a6eefbcb87473e87c109d6b4d32c7ab486894
2018-09-04 19:52:14 +09:00