Commit Graph

400 Commits

Author SHA1 Message Date
99fba2df1c mem: per-CPU allocator cache (ThunderX2 workaround)
Change-Id: I7694524c5e9674a6f7bfcd911f8b0dbbead7df5a
2019-06-03 01:22:03 +00:00
0cf89c5682 Linux lockless linked list implementation
Change-Id: I8bd6ee989cecac269b55b3a0ff10cf8543629001
2019-04-09 01:52:49 +00:00
ea7f517e3d arm64: ptrace: Fix overwriting 1st argument with return value
Since arm64 shares the return value with the area of
the first argument, rewriting the return value before
the system call execution completes destroys the first argument.

Change-Id: I959944879254d8dd3a29489a65d8f274d45338e6
Fujitsu: POSTK_DEBUG_ARCH_DEP_110
2019-03-08 08:06:19 +00:00
9ec0aeeab5 debug.h: merge both instances into ihk/debug.h
We do not need two debug.h files.

Take Fujitsu's STATIC_ASSERT over BUILD_BUG_ON because it is more used

Change-Id: If04c17fbb7406ab15fe86267fed8d6da460cec62
Fujitsu: POSTK_DEBUG_ARCH_DEP_9
2019-03-01 05:10:35 +00:00
a5d5baf8a8 rus_vm_fault: always use a packet on the stack
There are valid use cases where a remote page fault has no available
thread data/packet available to use, e.g. when device driver threads
need to access the data (BXI).

Do the per thread data lookup to use the right channel/tid if available,
and use mcctrl_ikc_send_wait with a new message number directly.

The fault is no longer handled in mckernel syscall forwarding code but
in the ikc handler directly in irq, this should be ok because page
faults are interrupts anyway so the code should be irq-safe.

Change-Id: Ie60f413cdaee6c1a824b4a2c93637899cb9bf9c9
2019-03-01 05:08:03 +00:00
6810506c3d rusage: Fix available page sizes
Change-Id: I418075ff4b5341e0f5c7ff317e96461879a60f87
2019-02-22 14:08:18 +09:00
5bc54a3bbe Fixed time processing.
- arm64: Get TSC corresponding to boot time from IHK.

- x86_64: Calculate the current time using vdso.

Refs: #1186
Fujitsu: POSTK_DEBUG_ARCH_DEP_52
Change-Id: I293ba4bbe5390d50dea44b8a5b7471f59237daff
2019-02-22 04:13:13 +00:00
07aa96ef95 arm64: Scalable Vector Extension (SVE) support.
Change-Id: I3568687913f583edfaa297d5cf5ac91d319d97e9
2019-02-22 04:07:29 +00:00
e828398c8b do_mmap: don't pre-populate the whole file when asked for smaller segment
The linker maps parts of libs with different access flags,
so we cannot prepopulate the whole file.

[dominique.martinet@cea.fr: moved min and friends in compiler.h]
Change-Id: Ifbeddc0908699099cfae5ce9cc2adc578221db31
2019-02-14 16:26:15 +09:00
641d9f1b39 clear_range_l1, clear_range_middle: Fix handling contiguous PTE
Change-Id: I2609c94d7f9342fe25aa9a5cfc208375274d46fa
2019-02-14 16:26:14 +09:00
9cfc373538 Refactor "do write back only MAP_SHARED pages"
* free_process_memory_range() always passes memobj to
  ihk_mc_pt_free_range()
* clear_range_*() don't flush page in fileobj with MF_PRIVATE flag

Fujitsu: POSTK_DEBUG_TEMP_FIX_87
Change-Id: I8d46d029b3fc51ca6f0e59d748a2fe93e324a374
2019-02-14 16:25:58 +09:00
60dcd0e798 move rusage into kernel ELF image (avoid dynamic alloc before NUMA init)
Change-Id: I7fe86244c8707694b379e567b31de65ee2c56887
2019-02-07 10:43:47 +09:00
6fc9ec1c92 gencore: finish reintegration into arch-independent code
Change-Id: Ic2fc935aeec17c54931817bf43f67ef6da78adc8
Fujitsu: POSTK_DEBUG_ARCH_DEP_18
2019-02-06 17:23:54 +09:00
d4d78e9c61 Following arm64-support to development branch
This includes the following fixes:
* fix build of arch/arm64/kernel/vdso

Change-Id: I73b05034d29f7f8731ac17f9736edbba4fb2c639
2019-02-01 15:14:45 +09:00
516ab87ab9 Copyrights: fujitsu 2018 bump
Separate copyright bumps in a different commit.
A lot of files only had the copyright change at this point; these
were probably changes I added separatly in other patches but just
split these in a different commit instead to simplify git stats

Change-Id: I93cf3fc1c0fa04ee743a79c3fe9768933e6bd0d2
2019-02-01 13:18:52 +09:00
f299fff266 stack: add hwcap auxval
Fix the AUXV_LEN to account for hwcap and remove the ifdefs

Change-Id: I303fc2c5fa4c8cea7ec9823f8580b8a66de2f58f
Fujitsu: POSTK_DEBUG_ARCH_DEP_65
2019-02-01 13:17:58 +09:00
960a6f5f90 prepare process: add magic header in program_load_desc
Check we mapped the correct region with a magic header in the struct

Original commit: d246b93a3bced92d0ac2a4a337118091b010658a

Fujitsu: POSTK_DEBUG_TEMP_FIX_76
Change-Id: If848be64af5d76844ba65b48493021637c8114f4
2019-02-01 13:16:25 +09:00
dfd23c3ebe prctl: Add support for PR_SET_THP_DISABLE and PR_GET_THP_DISABLE
Change-Id: I04c5568a9eb78bcac632b734f34bba49cf602c4d
Refs: #1181
2019-01-22 05:40:56 +00:00
dc1f96fee3 Add set_cputime() kernel to kernel case and mode enum.
Change-Id: Id4584389f39f255335d3bf7b5606f054f108ad51
Fujitsu: POSTK_DEBUG_TEMP_FIX_84
2018-11-27 05:03:39 +00:00
ae9a1f39df ihk_ikc_recv: Record channel to packet for release
ihk_ikc_release_packet takes the channel and puts the packet into its
free-list.  This fix makes it easy and safe to identify the proper
channel.

Change-Id: I5584b1e8a3ed675c2f9d68f0b5ed331b909197f6
Fujitsu: POSTK_DEBUG_TEMP_FIX_89
2018-11-21 17:01:58 +09:00
04c11f35e9 xpmem: Add xpmem_openat
In arm64, glibc-open of /dev/xpmem is hooked in sys_openat. This
commit adds xpmem_openat which is called by sys_openat.
This commit silently applies copy_from_user fix to sys_open as well.

Change-Id: I3b4f7bf0e152c359250bb2b56910db9192390cb1
Fujitsu: POSTK_DEBUG_ARCH_DEP_46, POSTK_DEBUG_ARCH_DEP_62
2018-11-21 07:39:56 +00:00
38e68f358a Add kernel argument to turn on/off time sharing
Add "-T 0" to mcreboot.sh if you want to turn off time sharing.  When
it's turned off, McKernel doesn't activate interval timer when the
length of per-CPU run-queue is larger than one.

Change-Id: I2cedc1b30a9cd9a0f4608a32ecec0a0d58c6225e
2018-11-21 07:37:01 +00:00
fd8bed670e ihk_os_setperfevent: Return number of registered events
In addition to that, mcctrl_perf_set is modified so that it updates
usrdata->perf_event_num with number of registered events.

Change-Id: I3f343176f55b06d3baab0b0fe34e240f39706cf6
Fujitsu: POSTK_DEBUG_TEMP_FIX_80
2018-10-24 06:16:41 +00:00
06dd71a7e0 Revert "procfs: add '/proc/pid/stat' to mckernel side and fix its comm"
This reverts commit b70d470e20.

That commit had been landed too fast after a mistake during migration
from old to new gerrit that didn't keep -1 vote ; it needs some fix

Change-Id: Ifc8a23e42449dfe471049270b4706e9b137e096e
2018-10-12 10:54:14 +09:00
01fe83dcb3 do_mmap: change addr to uintptr_t
Change-Id: I7df45e125387083aef7e62b046c20b7422f60f22
2018-10-11 09:24:23 +00:00
c3bfa3f6a9 move BUG_ON, panic and kprintf define to debug.h; add BUILD_BUG_ON
these functions are more logical to keep together there as they depend
on each other.

Also add a comment about the __printf attribute, if we have a quiet
period it would be useful to enable and clear the thousands of
warnings...

Change-Id: I47d3891c9cd87da28b2883c29384959f5abd1459
2018-10-11 09:03:53 +00:00
39f9d7fdff Handle hugetlbfs file mapping
Hugetlbfs file mappings are handled differently than regular files:
 - pager_req_create will tell us the file is in a hugetlbfs
 - allocate memory upfront, we need to fail if not enough memory
 - the memory needs to be given again if another process maps the same
   file

This implementation still has some hacks, in particular, the memory
needs to be freed when all mappings are done and the file has been
deleted/closed by all processes.
We cannot know when the file is closed/unlinked easily, so clean up
memory when all processes have exited.

To test, install libhugetlbfs and link a program with the additional
LDFLAGS += -B /usr/share/libhugetlbfs -Wl,--hugetlbfs-align

Then run with HUGETLB_ELFMAP=RW set, you can check this works with
HUGETLB_DEBUG=1 HUGETLB_VERBOSE=2

Change-Id: I327920ff06efd82e91b319b27319f41912169af1
2018-10-11 08:54:13 +00:00
3e3ccf377c compiler.h: add READ_ONCE/WRITE_ONCE macro
These macros are needed to make sure the compiler does not optimize away
atomic constructs such as "while (!READ_ONCE(foo))" loops that do not
modify foo within the loop

Also move the barrier() define where it belongs while we are here, it is
needed for READ_ONCE/WRITE_ONCE and including ihk/cpu.h here causes
include loops

Change-Id: Ia533a849ed674719ccbc0495be47d22a3c47b8f8
2018-10-11 08:54:13 +00:00
13e71ac9dc pager: minor cleanups
- remove unused MF_END (that only makes sense for enums without holes,
  this one is a set of bits masks)
- remove useless goto in pager_req_create()
- init maxprot to 0 from the start, it's not used in the error cases
  (except for debug print)

Change-Id: Ic56c0754824b99f8a7e45fa8e99b8fe3e7c7e592
2018-10-11 08:54:13 +00:00
b1681f4a3a mcexec/execve: fix shebangs handling
There were mainly two problems with shebangs:
 - Suffix arguments handling e.g. '#!/bin/sh -x'
 - Recursive handling e.g. script1 fetchs '#!/path/to/script2'
and script2 itself has a shebang
 - (did I say two?) running shebang would replace argv[optind] instead
of appending e.g. script with '#!/bin/sh' and running './script -c'
would run '/bin/sh -c' instead of '/bin/sh ./script -c'

There also are two places where this needs parsing:
 - starting a fresh program from mcexec
 - starting a new program from execve in mcexec

The first was easy to fix as we already had argv around, but the later
required a new way to transfer the 'new argv elements from the script'
to mckernel to append before its argv -- it used to be 'desc->shell_path'
but that was no longer used at some point and just one keyword is not
enough to handle this properly.

This commit does:
 - Refactors the lookup_path + load_elf_desc that was only done at most
twice in its own function that loops indefinitely and use that in both
situations described above
 - Transmits the argv addition in the transfer to mckernel after the
desc; mckernel allocates 4 pages (hardcoded) for the descs and we will
hopefully have room for the script arguments on top of that... (there is
no guard!!!)
 - Change flatten_strings to allow prepending a flattened string instead
of a single string.
Note that the flatten_string change also brought in a difference in the
format, to have the full length embedded within the string, the latest
slot that used to be zeroes now contains the position of the end of the
buffer (where the last+1 string would be if there had been one)
This required a trivial change in mckernel prepare args function that
used this property for no real reason.

Hopefully things work™, this probably warrants adding a couple of new
ostests...
 - create a couple of scripts with recursive invocation/arguments and
check their own argv.
 - execute "mcexec script args" and "mcexec sh -c 'script args'"

Change-Id: I2cf9cde5c07c9293f730de89c9731bd93dbfa789
Refs: #1115
2018-10-04 14:31:02 +09:00
b70d470e20 procfs: add '/proc/pid/stat' to mckernel side and fix its comm
This lets ps show the proper executable name instead of mcexec's comm
on linux side

Change-Id: I62732037451f129fc2e905357ebdc351bf7f6d2d
Refs: #1114
2018-10-04 01:01:19 +00:00
ed1edb152b ptrace supports threads
Fujitsu: POSTK_DEBUG_TEMP_FIX_53, POSTK_DEBUG_ARCH_DEP_44
Refs: #771, #1179, #1143
Change-Id: Ie17ece6864f0eeb0c0e550f4e369abb77980a0d0
2018-10-01 03:57:16 +00:00
29c5c68761 coredump: Change type of coretable.len to loff_t from int
Fujitsu: POSTK_DEBUG_TEMP_FIX_61
Change-Id: I6a27a8d477c3b3dcc12be772a15dfcff370bd2a8
2018-09-20 11:01:22 +09:00
7e342751a2 do_syscall: Delegate system calls to the mcexec with the same pid
This includes the following fix:
send_syscall, do_syscall: remove argument pid

Fujitsu: POSTK_TEMP_FIX_26
Refs: #1165
Change-Id: I702362c07a28f507a5e43dd751949aefa24bc8c0
2018-09-13 16:59:47 +09:00
c23bc8d401 syscall_time: Handle by McKernel
refs: #1036
Change-Id: Ifa81b613c7ee8d95ae7cdf3dd54643f60526fa73
2018-09-13 07:44:02 +00:00
e4da71010c check_signal: system call restart is done only once
Fujitsu: POSTK_TEMP_FIX_66
Refs: #1009
Change-Id: Ic0f04ac6b7f6c6bb01b55fb389bf9befd56b1dd9
2018-09-13 07:00:49 +00:00
c25fb2aa39 memobj: transform memobj lock to refcounting
We had a deadlock between:
 - free_process_memory_range (take lock) -> ihk_mc_pt_free_range ->
... -> remote_flush_tlb_array_cpumask -> "/* Wait for all cores */"
and
 - obj_list_lookup() under fileobj_list_lock that disabled irqs
and thus never ack'd the remote flush

The rework is quite big but removes the need for the big lock,
although devobj and shmobj needed a new smaller lock to be
introduced - the new locks are used much more locally and
should not cause problems.

On the bright side, refcounting being moved to memobj level means
we could remove refcounting implemented separately in all object
types and simplifies code a bit.

Change-Id: I6bc8438a98b1d8edddc91c4ac33c11b88e097ebb
2018-09-12 18:03:25 +09:00
cd00fc3a78 set_timer: Start timer when runnable thread count is bigger than one
Change-Id: Ie32799fff2936ffc057f166db5681edccdbf5920
2018-09-04 19:53:03 +09:00
781a69617b uti: Replace data types represented as arrays with C structures
Defining C structures for the following objects:
(1) Remote and local context
(2) Stack of system call arguments / return values

Change-Id: Iafbb6c795bd765e3c78c54a255d8a1e4d4536288
2018-09-04 19:53:03 +09:00
04d4145b3e uti: Replace dead uti thread with new mcexec thread in proc->tids
Change-Id: Ic6e906dd1bfac1b07f1317732cbe0a5191831cd8
2018-09-04 19:53:03 +09:00
5cb8a1f10f uti: Workaround not to share CPU with OpenMP threads
* Assign uti thread to the last idle CPU so that it's not shared with
  an OpenMP thread

Change-Id: Ia42cae056ce81fde9b6dab6286b39a52f3c9e172
2018-09-04 19:53:01 +09:00
b6ab5911b7 uti: Identify uti thread by clone count
--uti-thread-count <count> is added to mcexec.

Change-Id: Id9ec464412a5bb71e4d9e87d05f79de22d35b067
2018-09-04 19:53:01 +09:00
52afbbbc98 uti: Call into McKernel futex()
(1) Masquerade clv
(2) Fix timeout
(3) Let mcexec thread with the same tid as McKernel thread migrating
    to Linux handles the migration request
(4) Call create_tracer() before creating proxy related objects

Change-Id: I6b2689b70db49827f10aa7d5a4c581aa81319b55
2018-09-04 19:52:10 +09:00
d7b882855a Correct comments in declaration of struct ikc_scd_packet 2018-09-04 19:52:10 +09:00
0b0b7b03d7 Prevent one CPU from getting chosen by concurrent forks
One CPU could be chosen by concurrent forks because CPU selection and
runq addition are not done atomicly. So this fix makes the two steps
atomic.

Change-Id: Ib6b75ad655789385d13207e0a47fa4717dec854a
2018-09-04 19:51:11 +09:00
567dcd3846 Fix deadlock involving mmap_sem and memory_range_lock
Change-Id: I187246271163e708af6542c057d0a8dfde5b211e
Fujitsu: TEMP_FIX_1
Refs: #986
2018-09-04 19:51:10 +09:00
2db69d0f24 process/vm: implement access_ok() 2018-09-04 19:51:10 +09:00
a697f5e98d partitioned execution: pass process rank to LWK
Cherry-pick of d2d134d5e6a4b16a34d55d31b14614a2a91ecf47

Conflicts:
	kernel/include/process.h
2018-09-04 19:51:10 +09:00
4246d41007 kmalloc_header: use signed integer for target CPU id
Cherry-pick of bdb2d4d8fa94f9c0268cdfdb21af1a2a5c2bcae5
2018-09-04 19:51:09 +09:00
f57b0c5d4f wait: Delay wake-up parent within switch context
Fujitsu: POSTK_DEBUG_TEMP_FIX_41
Refs: #1006
Change-Id: Ia98e896505ad0f6549766604ade84550eee8bd2d
2018-08-30 02:13:51 +00:00