Commit Graph

666 Commits

Author SHA1 Message Date
366e95856c Null-check ihk_os_t and mcctrl_usrdata pointers
Change-Id: I941c58d4ab6a0c1ce6bd53c24b552218a1716750
Refs: #1216
2019-02-14 16:26:19 +09:00
9cfc373538 Refactor "do write back only MAP_SHARED pages"
* free_process_memory_range() always passes memobj to
  ihk_mc_pt_free_range()
* clear_range_*() don't flush page in fileobj with MF_PRIVATE flag

Fujitsu: POSTK_DEBUG_TEMP_FIX_87
Change-Id: I8d46d029b3fc51ca6f0e59d748a2fe93e324a374
2019-02-14 16:25:58 +09:00
207d653b41 mcctrl: use vmf_insert_pfn for kernel >= 4.18
vmf_insert_pfn got added as a wrapper around vm_insert_pfn in 4.17
1c8f422059ae5da ("mm: change return type to vm_fault_t") and totally
replaced the latter in 4.20 ae2b01f37044c ("mm: remove vm_insert_pfn()")

Compare with 4.18 here specifically to avoid troubles when rhel
backports this change later, and avoid adding a rhel version check down
the road.

Change-Id: Ibf108e2fb6f1199f89cde6a7973f4eb55447260b
2019-02-14 16:25:49 +09:00
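
The version comparison described in 207d653b41 can be sketched as a compile-time switch. This is a minimal userspace illustration, not the mcctrl code: the KERNEL_VERSION and LINUX_VERSION_CODE definitions below are stand-ins for what <linux/version.h> provides in a real module, the demo value of 4.18.0 is an assumption, and insert_pfn_helper_name() is a hypothetical helper that only reports which kernel function would be chosen.

```c
#include <string.h>

/* Stand-ins for <linux/version.h>; a real module gets these from the kernel headers. */
#ifndef KERNEL_VERSION
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#endif
#ifndef LINUX_VERSION_CODE
#define LINUX_VERSION_CODE KERNEL_VERSION(4, 18, 0) /* assumed value for this demo */
#endif

/* Hypothetical helper: names the function a compat wrapper would call. */
static const char *insert_pfn_helper_name(void)
{
#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 18, 0)
    /* 4.18 and later (including distribution kernels based on 4.18) */
    return "vmf_insert_pfn";
#else
    /* older kernels only have the int-returning variant */
    return "vm_insert_pfn";
#endif
}
```

Comparing against 4.18 rather than 4.20 means a kernel that backports the removal of vm_insert_pfn() still compiles without an extra distribution-specific check.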
950ea678dd Reject "setfsuid: Specify mcexec tid when asking mcexec for fsuid"
This fix is rejected because it only makes the setfsuid test in ostest
pass and doesn't fix the other issues including the one in which file
I/O could be done with the old fsuid because an mcexec thread with an
arbitrary tid could handle the system-call offload request.

Explanation of the rejected fix:

  setfsuid() proceeds as follows:

  1. McKernel asks mcexec for __NR_setfsuid (set)
  2. mcexec calls setfsuid, reports the id to McKernel
  3. McKernel asks mcexec for __NR_setfsuid (get)
  4. mcexec calls mcexec_getcred(), reports the id to McKernel
  5. McKernel sets proc->fsuid to the obtained value

  tid of mcexec on the 2nd and 4th step could be different. So this
  fix lets mcexec report its tid on the 2nd step and McKernel specify
  it in the 3rd step.

Change-Id: Id5cfeed18c64430d576a56e961bbca1ecb2e39ad
Fujitsu: POSTK_DEBUG_TEMP_FIX_45
2019-02-14 04:42:32 +00:00
cd42d186b7 uti: Report error of offloading ioctl if any
Change-Id: If4218b9fb89f34728c4aaf81bccab2dfbb0d4a87
2019-02-14 04:15:44 +00:00
ff0395581c Register PPD and release_handler at the same time.
Fix a process remaining alive when a signal is received between PPD
registration and release_handler registration.

Refs: #1201
Fujitsu: POSTK_DEBUG_TEMP_FIX_64
Change-Id: I571781963578df8cedb327f19298f595cfb137a3
2019-02-08 10:20:58 +09:00
97e0219f50 Make Linux handler run when mmap to procfs.
Change-Id: I98a3d098c5c676f33c83fa4354c623988ee591f2
Refs: #1222
2019-02-06 11:54:50 +00:00
f9d8d98af1 sysfs: add missing symlinks for cpu/node
Add the following patterns of symlinks:
 - /sys/bus/cpu/drivers/processor/cpu*
 - /sys/bus/node/devices/node*

And slightly change how /sys/devices/system/cpu/cpu*/node* are created
to avoid duplicate lookups

Change-Id: Id94a4d157da06d75f6bd450d5bd9a9e7709a1414
2019-02-06 09:55:54 +00:00
9bf225d193 mckernel overlay: replace mcoverlayfs with a soft userspace overlay
mcoverlayfs has a high maintenance burden and does not work on rhel8's 4.18
kernel (while it works on vanilla 4.18...); instead of debugging this further
time is better spent making it independent from overlayfs.

Change-Id: I7454ae95b0fbb3373c256aa2fd83cdfec466c009
2019-02-06 08:27:25 +00:00
6ed2e5ffc1 Fix ThunderX2 write-combined PTE flag insanity
Change-Id: I59999a680b556acf3e22ac516f4758e3aee7f355
2019-02-01 21:03:19 +09:00
25ef4e9261 Merge branch 'postk_master' into development
* Merge 53e436ae7db1ed457692dbe16ccb15511aa6bc64
* Only arm64 stuff is left

Change-Id: I6b79de1f659fa61e75f44811b639d41f9a37d6cc
2019-02-01 15:14:58 +09:00
d4d78e9c61 Following arm64-support to development branch
This includes the following fixes:
* fix build of arch/arm64/kernel/vdso

Change-Id: I73b05034d29f7f8731ac17f9736edbba4fb2c639
2019-02-01 15:14:45 +09:00
e52d748744 new_mcos_handler_info: Propagate kmalloc failure
Change-Id: If484cf32cd0bf096ffd712561dd1f73046c60cd8
Fujitsu: POSTK_TEMP_FIX_64
2019-02-01 15:11:36 +09:00
8db2d3beec sysfs: use nr_cpu_ids for cpumasks (fixes libnuma parsing error on ARM)
Change-Id: I466ffbaf38fe5fd2b1ca0439fa7ea4a813e226ca
2019-02-01 15:08:49 +09:00
f5320fc2b4 overlayfs: make mcoverlayfs compile for 4.14.0-115 (el7 arm64)
Use the 4.18 module as a base

Change-Id: I6c9ef66399800828e1932573da5a97573545c5da
2019-02-01 15:08:47 +09:00
0fbdcc44b9 mcoverlayfs 4.18: re-define ovl_readlink
Apparently /proc needs it; it's normally implemented using get_link if
readlink isn't implemented but proc's get_link crashes the kernel in
this case (because nameidata is only defined for open* paths)

Change-Id: I1864d6c948db879d33ea29b1b281bf84ff8eeec6
2019-02-01 15:08:45 +09:00
452d93f14d mcctrl_clear_pte_range: fix zap_page for kernel >= 4.18
zap_vma_ptes no longer returns an error code as of Linux's
27d036e33237e4 ("mm: Remove return value of zap_vma_ptes()"),
where they decided nobody is interested in it....

Just copy the check out of the function.

Change-Id: I2eda0f91ec55a34bba96f45cc3d887bc80132a82
Originally-by: Kagawa Kodai <fj1731iw@aa.jp.fujitsu.com>
2019-02-01 13:18:58 +09:00
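
The idea in 452d93f14d of copying the check out of zap_vma_ptes() can be sketched in userspace as follows. This is an illustration under stated assumptions: struct vma_stub stands in for the kernel's struct vm_area_struct, VM_PFNMAP is redefined locally for the demo, and zap_range_is_valid() is a hypothetical name, not the function used in mcctrl. The condition mirrors the range and flag check that the older zap_vma_ptes() performed before returning an error.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the kernel flag, defined only so the demo compiles. */
#define VM_PFNMAP 0x00000400UL

/* Stand-in for the fields of struct vm_area_struct used by the check. */
struct vma_stub {
    unsigned long vm_start;
    unsigned long vm_end;
    unsigned long vm_flags;
};

/* Hypothetical helper: returns 1 if the range may be zapped, 0 otherwise. */
static int zap_range_is_valid(const struct vma_stub *vma,
                              unsigned long address, unsigned long size)
{
    assert(vma != NULL);
    if (address < vma->vm_start ||
        address + size > vma->vm_end ||
        !(vma->vm_flags & VM_PFNMAP))
        return 0;
    return 1;
}
```

A caller would run this check first and only then call the void-returning zap_vma_ptes(), which keeps the error path that existed before 4.18.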
516ab87ab9 Copyrights: fujitsu 2018 bump
Separate copyright bumps in a different commit.
A lot of files only had the copyright change at this point; these
were probably changes I added separately in other patches but just
split these in a different commit instead to simplify git stats

Change-Id: I93cf3fc1c0fa04ee743a79c3fe9768933e6bd0d2
2019-02-01 13:18:52 +09:00
a9884453e2 vmcore2mckdump: make arm-compatible, 'fix' timeout
Change-Id: Icdb42ff47d9dff5c6a818cb8c9ae94d183b19569
Fujitsu: POSTK_DEBUG_ARCH_DEP_93
Fujitsu: POSTK_DEBUG_ARCH_DEP_102
2019-02-01 13:18:12 +09:00
fb9832af6d perf counters: add arch-specific perf counters
arch perf counters are placed at start, so offset all
other counters (because placing arch perf counters at the end
wouldn't have been intrusive enough?)

Change-Id: Ifab1047872384927d9cfa0a0212327ee73545c29
Fujitsu: POSTK_DEBUG_ARCH_DEP_86
2019-02-01 13:18:09 +09:00
0e895478a1 mcctrl rus_mmap: make vma->vm_flags arch-dependent
[Dominique: renamed arch_vm_flags to arch_rus_vm_flags]
Change-Id: I5ec89b3ff80af6bf0ede342eb5816df8c78de348
Fujitsu: POSTK_DEBUG_ARCH_DEP_100
2019-02-01 13:18:07 +09:00
19659aa908 mcctrl: move translate_rva_to_rpa to archdep
Change-Id: I0efa51468a7ff4d776d8340a612e6f44eac2ed53
Fujitsu: POSTK_DEBUG_ARCH_DEP_83
2019-02-01 13:18:06 +09:00
e5de0b81ca ldump2mcdump: move PAGE_SHIFT to arch-dependent includes
Change-Id: I42e49db87e375f2dc094926e21dfc00e50484855
Fujitsu: POSTK_DEBUG_ARCH_DEP_94
2019-02-01 13:18:04 +09:00
ca34154a43 mcexec: lookup page_size with sysconf
page size is not defined in sys/user.h on aarch64

Change-Id: Idbdaef2519792eeb1e1a2794be0a34d67e87907e
Fujitsu: POSTK_DEBUG_ARCH_DEP_35
2019-02-01 13:16:40 +09:00
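
A run-time page size lookup like the one ca34154a43 describes can be sketched with sysconf(). The helper name lookup_page_size() is hypothetical and the error convention (returning -1) is an assumption of this sketch, not taken from mcexec.

```c
#include <unistd.h>

/* Hypothetical helper: query the page size at run time instead of
 * relying on a compile-time constant from sys/user.h. */
static long lookup_page_size(void)
{
    long sz = sysconf(_SC_PAGESIZE);

    /* sysconf() returns -1 on failure; pass that on to the caller. */
    return sz > 0 ? sz : -1;
}
```

Querying at run time also covers systems where the same architecture can be configured with different page sizes.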
a10f4b861c do_pageout: fix direct kernel-user access
Change-Id: Ie02faca93fdb0d52d72e1f2aa1384a214c84ebff
Fujitsu: POSTK_DEBUG_ARCH_DEP_46
2019-02-01 13:16:32 +09:00
960a6f5f90 prepare process: add magic header in program_load_desc
Check we mapped the correct region with a magic header in the struct

Original commit: d246b93a3bced92d0ac2a4a337118091b010658a

Fujitsu: POSTK_DEBUG_TEMP_FIX_76
Change-Id: If848be64af5d76844ba65b48493021637c8114f4
2019-02-01 13:16:25 +09:00
dfd23c3ebe prctl: Add support for PR_SET_THP_DISABLE and PR_GET_THP_DISABLE
Change-Id: I04c5568a9eb78bcac632b734f34bba49cf602c4d
Refs: #1181
2019-01-22 05:40:56 +00:00
13e29c0da5 mcoverlayfs: fix disabled build
Change-Id: Ia40853432547084329fc034e3942e51954e1ddf5
2019-01-22 02:15:43 +00:00
ef9fda23a9 mcexec: Set default heap extension amount to sysconf(_SC_PAGESIZE)
Change-Id: I3ac660d33918c1fa28093ab59f3a7ead65d337d7
2018-12-12 00:38:10 +00:00
da02f76a25 mcexec: Fix error handling of init_worker_threads
Refs: #1233
Change-Id: Icce49c996d69b3cf64a71e7bd470421f329c881f
2018-12-04 09:40:24 +00:00
c585a37440 move mcoverlayfs kernel version check from mcexec.c to configure
While we are here:
 - fix uname -r (single quote?!)
 - add compat for rhel8 (el kernel and version is 4.18)
 - also remove linux version check in mcreboot.sh, trust configure check

Change-Id: I14726d4374b0dfd941640096044ea1d5d88bfcb8
2018-11-26 12:09:00 +00:00
ae9a1f39df ihk_ikc_recv: Record channel to packet for release
ihk_ikc_release_packet takes the channel and puts the packet into its
free-list.  This fix makes it easy and safe to identify the proper
channel.

Change-Id: I5584b1e8a3ed675c2f9d68f0b5ed331b909197f6
Fujitsu: POSTK_DEBUG_TEMP_FIX_89
2018-11-21 17:01:58 +09:00
190039f5d9 arch_cpu_read_write_register: error return fix.
Fixed an issue where errors generated in arch_cpu_read_write_register()
are not transmitted to the caller.

Change-Id: I05d7d872eab834918220cf18f628aee37208a156
Fujitsu: POSTK_DEBUG_TEMP_FIX_94
2018-11-21 16:49:21 +09:00
583cb94667 mcctrl: remove in-kernel calls to syscalls
Since 4.17.0, kernel cannot call syscalls directly because the calling
convention can be different on x86_64, as explained in this email:
https://lore.kernel.org/lkml/20180325162527.GA17492@light.dominikbrodowski.net

Use the ksys_* alternatives instead when possible, or for readlink use
do_readlinkat (and use readlinkat all the time to simplify ifdefs)

It might be possible to change some of these without ifdefs, but for
example ksys_unshare only got introduced in 4.17 so we need to keep some
syscall calling...

Change-Id: Ic47e184b29ef8b21731b2eae6193b0af2548b872
2018-11-21 16:42:26 +09:00
e12d5ed341 Expose McKernel version in /proc/mckernel
Change-Id: Ica0fbb0ff70a4ff2559e92738926279a3ae78a21
2018-11-21 07:39:54 +00:00
1253f4d18c mcexec shebang: delete spaces *before* path as well
Apparently, a shebang '#! /bin/sh' should work.
Will add some ostests for these...

Change-Id: Iab8ba8e3cc7e434c98742f71fe7db3c425f08278
2018-11-21 07:39:51 +00:00
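
The shebang handling mentioned in 1253f4d18c can be illustrated with a small parser that skips blanks between "#!" and the interpreter path. This is a sketch only: shebang_interp() is a hypothetical helper, not the mcexec function, and it ignores any interpreter argument that follows the path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the interpreter path of a shebang line into out.
 * Returns 0 on success, -1 if the line is not a shebang or does not fit. */
static int shebang_interp(const char *line, char *out, size_t outsz)
{
    const char *p;
    size_t n;

    assert(line != NULL && out != NULL);
    if (strncmp(line, "#!", 2) != 0)
        return -1;

    /* Skip spaces and tabs *before* the path, so "#! /bin/sh" works. */
    p = line + 2;
    while (*p == ' ' || *p == '\t')
        p++;

    /* The path ends at the next blank or end of line. */
    n = strcspn(p, " \t\r\n");
    if (n == 0 || n >= outsz)
        return -1;

    memcpy(out, p, n);
    out[n] = '\0';
    return 0;
}
```

With this helper, both "#!/bin/sh" and "#! /bin/sh" yield the same path.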
525b90d028 flatten_string/process env: realign env and clear trailing bits
envs are stuck after args which are now possibly unaligned, and used
from a non-aligned pointer in prepare_process_ranges_args_envs (env)

The memory immediately after args/envs is copied anyway with memcpy_long,
so make sure the bits are initialized and realign env correctly

Fixes: 70e52faf36 ("flatten_strings: do not return unused trailing bits")
Change-Id: Ic747e947d151c0eea65dec36bc9c888cf6e0c394
2018-11-21 07:39:16 +00:00
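
The realignment and trailing-bit clearing described in 525b90d028 can be sketched as follows. The names align_up() and append_aligned() are hypothetical, and the layout (each string padded with zeros up to the next sizeof(long) boundary) is an assumption chosen to illustrate the idea; the actual flatten_strings layout may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Round n up to a multiple of a, where a is a power of two. */
static size_t align_up(size_t n, size_t a)
{
    return (n + a - 1) & ~(a - 1);
}

/* Hypothetical helper: append s at *off, then zero-fill up to the next
 * long-aligned offset so that a later long-wise copy never reads
 * uninitialized bytes and the next item starts aligned.
 * Returns 0 on success, -1 if the buffer is too small. */
static int append_aligned(char *buf, size_t bufsz, size_t *off, const char *s)
{
    size_t len, end;

    assert(buf != NULL && off != NULL && s != NULL);
    len = strlen(s) + 1;                       /* include the NUL */
    end = align_up(*off + len, sizeof(long));
    if (end > bufsz)
        return -1;

    memcpy(buf + *off, s, len);
    memset(buf + *off + len, 0, end - (*off + len)); /* clear trailing bits */
    *off = end;
    return 0;
}
```

Calling append_aligned() once for the argument block and again for the environment block leaves the environment starting on an aligned offset.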
7a3f4d7501 mcctrl rhel8 compat: remove unneeded RHEL_RELEASE_CODE check
it was meant for 3.10 kernels, so the regular < 4.0.0 check
will work for el7 and older kernels as well

Change-Id: I807f030f6303c9c3d17b0d80de55c256a3479486
2018-11-21 07:36:50 +00:00
1a5b10277f mcexec: load_elf: disable execvp for within-mckernel execs
the libc takes care of trying execve as many times as needed for
execvp, it's not a kernel call.

Also, sneak a double-free fix (desc was not reset properly in case
load_elf_desc_shebang failed)

Fixes: b1681f4a3affff ("mcexec/execve: fix shebangs handling")
Change-Id: If8e3d7ae53acdeffc0331ae8621e0832fcfa406f
2018-11-21 16:17:58 +09:00
a59c55c188 mcexec load_elf_desc: print error after returning
Running "mcexec dfsafds" did not print any message in normal use.
Rather than looking for which message shows in debug and turning it into
eprintf, add a single coherent message (more shell-like) at the end and
turn other messages off.

There is a small loss of information but this is equivalent to what
shells give (a single errno value with no details), and it is now easy
to add --debug to mcexec to see more information if required

Change-Id: Id2c3a47880b7d1d7467883351e6e7af561f91bbf
2018-11-21 16:17:58 +09:00
1d6a078afa mcexec: add --debug-mcexec
We already have debug statements compiled in, add a toggle for it
Also fix case indent for 's'

Change-Id: I1104ee57d571b82ec5e061f22cd44033a5c7fc39
2018-11-21 07:16:54 +00:00
9db8d115d9 overlayfs: rhel8 compat for the 4.18 version
rhel8 is a 4.18 kernel but they've already backported some later fixes.
The backported changes removed some defines, so instead of relying on the
kernel version we check for the presence of those defines, which makes the
code more robust to kernel version wilderness

Change-Id: I6cf5548a7b73a7394405daf850f715a1e20ab0b4
2018-11-21 16:06:31 +09:00
e26e693e58 mcoverlayfs: update and compile new overlayfs for 4.18 kernels
This newer version is much simpler than the old ones:
 - the options are noop, this lets the code simplify all the allocating
of a new option struct and passing it around
 - ovl_reset_ovl_entry was added and called all the time, but the
mechanism that made this required is gone in this kernel version

On the other hand, one new thing in this version:
 - newer kernels now check the stacking depth of filesystems, and we are
reaching the default limit of two with our setup. Bump it to three here.

Also, while we are here, make make fail if requested directory does not
exist, instead of infinitely recurse into make modules in the mcoverlayfs
directory...

Change-Id: I45050d693a0aa6fd3027deaf417c29876ef6a1ea
2018-11-21 16:06:31 +09:00
fc2775c932 mcoverlayfs: add new base from 4.18.14
This just lays out new files so the next commit is easier to review;
nothing changes here

Change-Id: I66669877d2d10632f5436c0eeb32248cd4c8b996
2018-11-21 16:06:31 +09:00
6581f9b4b2 mcctrl syscall: compat for newer zap_vma_ptes
newer versions of this function no longer return an error on the basis
that "no-one checks what it returns anyway"........

See linux 4.18's 27d036e33237e ("mm: Remove return value of zap_vma_ptes()")

Change-Id: I8fb9f060e3e145cc2db21738585c9ee7f1445f74
2018-11-21 16:06:31 +09:00
3a90521489 mcexec: fix strncat bounding
strncat must not look at the appendee's length, but at how much
is left where we're appending.
This API is stupid anyway, where is strlcat when we need it...

Change-Id: Icdf418083146420a06f8ba5ffdf882982610d39b
2018-11-21 16:06:31 +09:00
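
The strncat bounding rule from 3a90521489 can be shown with a small wrapper. The third argument of strncat() limits how many bytes are taken from the source, and a terminating NUL is always written afterwards, so the bound has to be the space left in the destination minus one. append_bounded() is a hypothetical helper for illustration, not the mcexec code, and its return convention is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append src to the NUL-terminated string in dst,
 * where dstsz is the total size of the dst buffer.
 * Returns 0 if everything fit, 1 if src was truncated, -1 on bad input. */
static int append_bounded(char *dst, size_t dstsz, const char *src)
{
    size_t used, room;

    assert(dst != NULL && src != NULL);
    used = strlen(dst);
    if (used >= dstsz)
        return -1;                 /* no room even for the terminator */

    room = dstsz - used - 1;       /* bytes left, excluding the NUL */
    strncat(dst, src, room);       /* bound by what is left, not by strlen(src) */
    return strlen(src) > room ? 1 : 0;
}
```

Passing strlen(src) or sizeof(dst) as the bound would allow a write past the end of dst whenever dst is already partly filled.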
03802052ed mcctrl: add handling for one more level of page tables
newer Linux has 5-level page tables now; try to handle that.

Some of the macros will be no-op (e.g. loop only on one iteration) on
architecture/kernels with only 4 levels but the code needs to be there
to compile

Change-Id: Ifc6304cbb066dce7d4e30962687ae05d7e034730
2018-11-21 07:03:24 +00:00
c21485d427 mcctrl: include linux/cred.h
The header defines __task_cred and other macros we use, and always
existed; we must have gotten it indirectly on older kernels, it doesn't
hurt to always include

Change-Id: Iacfff0365e7a21e6247eea42606bbbf1dfccc077
2018-11-21 06:38:08 +00:00
18d50e48dc mcctrl: lookup for alternate syscall names
on newer x64 kernels (config option?), syscalls can be renamed to allow
both x64 and ia32 versions to coexist. Look up either name

Change-Id: I2f55cc804d3eee948ee1ed6d18c69c75bd2f652c
2018-11-21 06:38:08 +00:00
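
The alternate-name lookup in 18d50e48dc can be sketched as trying several symbol prefixes in turn. Everything named here is an assumption for illustration: resolve_fn stands in for whatever symbol resolver the module uses, lookup_syscall() and fake_resolve() are hypothetical, and the prefix list assumes the "__x64_sys_" naming used by x86_64 kernels with syscall wrappers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for a kernel symbol resolver; returns 0 when the symbol is unknown. */
typedef unsigned long (*resolve_fn)(const char *symbol);

/* Hypothetical helper: try the classic name first, then the wrapped one. */
static unsigned long lookup_syscall(resolve_fn resolve, const char *name)
{
    static const char *const prefixes[] = { "sys_", "__x64_sys_" };
    char sym[128];
    size_t i;

    assert(resolve != NULL && name != NULL);
    for (i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++) {
        int n = snprintf(sym, sizeof(sym), "%s%s", prefixes[i], name);
        unsigned long addr;

        if (n < 0 || (size_t)n >= sizeof(sym))
            continue;              /* name too long for this prefix */
        addr = resolve(sym);
        if (addr)
            return addr;
    }
    return 0;                      /* neither name exists */
}

/* Demo resolver that only knows the wrapped name of one syscall. */
static unsigned long fake_resolve(const char *symbol)
{
    return strcmp(symbol, "__x64_sys_umount") == 0 ? 0x1000UL : 0;
}
```

Here lookup_syscall(fake_resolve, "umount") falls through "sys_umount" and resolves via the second prefix, which is the behaviour the commit message describes.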
a2be475ae4 mcctrl control: replace cpu_isset by cpumask_test_cpu for new kernels
Change-Id: I60635118e5ce7281de97e024c626ac40d1a4aa36
Fujitsu: POSTK_DEBUG_ARCH_DEP_54
2018-11-21 06:38:08 +00:00