Compare commits

45 Commits

Author SHA1 Message Date
2585c8afaa prerelease: 0.95: add ihk_*_str() functions
Change-Id: I0dc2ff3c8a2b21d167cfff04ccf6d1533555ee1c
2021-02-26 11:24:48 +09:00
82056961cd uti: integrate libuti and redirect to mck/libuti.so
Change-Id: I74e0f677ea8e1cd06e8ab05d92f1d38f9be8fd7a
2021-02-26 11:03:16 +09:00
0848b64c1d uti: integrate syscall_intercept
Change-Id: Ide14341acdca1450b0ad4f8a16cc078d0743afc8
2021-02-26 10:37:56 +09:00
8a9b43fee0 cmake: add -Wno-stringop-truncation
Change-Id: I43d9ba731d0feaf8934d2724ff98072df88a902d
2021-02-26 10:37:56 +09:00
19cb302d5f uti: util_indicate_clone: check --enable-uti mcexec option
Change-Id: Ic7474d01c18acd1edbc07844d7a7b010b2175f71
2021-02-26 10:37:56 +09:00
90895cfb1f test: uti: add tofu examples
Change-Id: I1c55c872d125201e60b4fe744af74106e1c5d3a4
2021-02-26 10:37:55 +09:00
32afa80718 uti: fix handling UTI_CPU_SET env
Change-Id: Icbf8dc7e82bd6983374aefdd0d5b89ad4152c9aa
2021-02-26 10:24:19 +09:00
e3927a0b95 uti: futex: McKernel waker sends IPI to Linux waiter CPU
Change-Id: I6f725b3a6b1b26b9f553d8c58132c0c0a4416683
2021-02-26 10:24:19 +09:00
adc5b7102f uti: futex: cache remote va to remote pa result
Change-Id: Idbbb3f2981b76a0235615fceaa6281d2c7134ca2
2021-02-26 10:24:19 +09:00
5d16ce9dcc uti: identify UTI thread by thread local variable
Change-Id: I64372a932378e4ead09ea27fbf5b52062a109756
2021-02-26 10:24:19 +09:00
a9973e913d uti: futex call function in mcctrl
Previously, the futex code of McKernel was called by mcctrl,
but there were some problems with this method
(mainly, the location of the McKernel image in memory).

Call futex code in mcctrl instead of the one in McKernel image,
giving the following benefits:
1. Not relying on shared kernel virtual address space with Linux any more
2. The cpu id store / retrieve is not needed and resulting in the code

Change-Id: Ic40929b64a655b270c435859fa287fedb713ee5c
Refs: #1428
2021-02-26 10:24:19 +09:00
35296c8210 uti: fix syscall response is mis-consumed by __do_in_kernel_irq_syscall
Refs: #1617
Change-Id: Iddd8ccd81d7f692f1f45ec888d31c2a87ec521ce
2021-02-25 01:42:29 +00:00
afea6af667 Send a signal to mcexec after switching to that process.
Change-Id: Ia882ef5027931009ee65febd0cbe22022a755c4a
Refs: #1505
2021-02-19 02:28:29 +00:00
b0bd1feefb remap_file_pages: check file mapping
Change-Id: Ibf145a20181938a9825214253337a423fcd53064
Refs: #1521
2021-02-19 02:23:39 +00:00
e6e66e0392 shmget: make small free numbers reusable.
Change-Id: Ic6670214fa31a309e96794361e3ec2dcc6375f4a
Refs: #1531
2021-02-19 02:22:50 +00:00
b3ddd60277 shmget: don't update refcount when shmid is found.
Change-Id: I3eac47cd67d27efd838190f5a4c21b5d682c5fe9
Refs: #1379
2021-02-19 02:22:33 +00:00
6dce9a2bf9 add_process_memory_range: Change order of update page and insert range.
An unintended page update occurred when inserting the range failed.

Change-Id: I3d117b8613c5fbb64463c759b5fcc81db22bd624
refs: #1512
2021-02-18 16:02:30 +09:00
93dafc5f79 migrate: Don't migrate on in-kernel interrupt
Change-Id: I9c07e0d633687ce232ec3cd0c80439ca2e856293
Refs: #1555
2021-02-18 15:30:22 +09:00
583319125a prerelease: 0.94: fix __mcctrl_os_read_write_cpu_register
Change-Id: Ibcfbe7796347cc9c2148cdea2519fe6c7ca9e97e
2021-02-18 15:23:01 +09:00
9f39d1cd88 move_pages: Fix and support some specs for LTP.
1. When the nodes array is NULL, move_pages doesn't move any pages;
 instead, it returns the node where each page
 currently resides in the status array.
2. Check whether all specified nodes are online.

Change-Id: Ie3534997833d797e2a9f595d1107b07d46e1c6cf
Refs: #1523
2021-02-18 06:16:17 +00:00
a0d446b27f smp: make smp_call_func() arch independent
Change-Id: Ib60604ceb3274b173bd7f96cf57c8c35c1889e44
2021-02-18 06:16:17 +00:00
f3c875b8e6 mbind: Use range_policy's numamask as priority on MPOL_BIND
Change-Id: Iaaa7998945c6e2b42d91d34a2f7b05db1f4d696d
2021-02-18 06:16:17 +00:00
9f1e6d707c get_mempolicy: Support (MPOL_F_NODE | MPOL_F_ADDR) specified
If flags specifies both MPOL_F_NODE and MPOL_F_ADDR,
get_mempolicy() will return the node ID of the node on
which the address addr is allocated into the location pointed to by mode.

Change-Id: Id485e3f4838e3679d877a95e53b21e3421cac88a
2021-02-18 06:16:17 +00:00
aef50d710c mempolicy: Support MPOL_INTERLEAVE
Change-Id: I6357892d792b2de8ea859a0a6799250f05066713
Refs: #959
2021-02-18 06:16:17 +00:00
7f0594d784 TO RESET: mbind: do nothing
Fixes: 00007daf ("mbind: do nothing (workaround for Fugaku)")

Change-Id: Id41940bebd2cbcc3e8637eadd4847984627b1c72
2021-02-18 06:16:17 +00:00
866f5c51a0 docs: add limitation of system calls that call copy_to_user()
Change-Id: If449c73f8d5949ab5526ea598b0f713ed4431157
Refs: #1514
2021-02-18 13:04:53 +09:00
48b1d548f2 __mcctrl_os_read_write_cpu_register: fix timeout
Change-Id: Id5a7d316d793bd535f24fd353b214aa12af1dab4
2021-02-15 08:56:04 +00:00
822b64b03c docs: add limitation related to Fujitsu TCS xos_hwb
Change-Id: I83a1ecd7a0b6d3bcde2b902cd526dfd4feb9e23a
2021-02-15 16:03:52 +09:00
aca83bcd3d Tofu: fault stack area if VM range doesn't exist in STAG registration
Change-Id: I407a8954ccaf22019b3082fd6eee68e772d1cb26
2021-02-15 14:46:58 +09:00
c7145c4b38 xpmem: fault stack area of remote process if VM range doesn't yet exist
Change-Id: I2bbb745cc9b79ab4f9ea81b242f35f1b88ad531e
2021-02-15 14:46:58 +09:00
a82d161be8 prerelease: 0.93: investigate smp_ihk_os_panic_notifier
Change-Id: I997b41f80038603261de2e8232b6b8ca200cd8cd
2021-02-09 21:39:49 -05:00
7152269a59 spec: create one rpm including .ko and binaries
Don't use kernel_module_package, so that a separate
kmod-mckernel-*.rpm containing .ko files is not created.

Change-Id: I25b7ff662476bfc735d319b57cdf2da82f2c6aa7
2021-02-09 20:55:38 -05:00
31c08bcb7d spec, docs: update cmake options
Change-Id: Ib8277413a413b5ce956a48f7e3d9922311937ea8
2021-02-09 20:55:38 -05:00
dffb0918a2 docs: add capstone installation options
Change-Id: I96aa9a6405c17f8d9653f3d3894f0e71a57ab460
2021-02-09 06:10:32 +00:00
23cd14af7d __mcctrl_os_read_write_cpu_register: timeout in 1 sec for when McKernel can't respond
Change-Id: Ia2d5f64e107697dda1f3bae499eb3afb8a7aedba
2021-02-09 06:09:11 +00:00
a5cf2019bc cmake: fix detection of Fugaku native compilation
Change-Id: I4210e9b57223c3869464caea10c2d414e9484e14
2021-02-09 06:06:13 +00:00
11b9fe0377 page_fault_handler: fix missing increment of in_page_fault on SEGV
This integrates some of the changes of the following commit:
1cf0bd5a ("TO RESET: add debug instruments, map Linux areas for tofu")

Change-Id: Iffd8432d5a7b35f20bd45829a125583a0363dbf0
2021-02-09 00:56:15 -05:00
4905c8e638 mcexec: propagate error in __NR_gettid handler
Change-Id: I0e0f06199970fe839065567dcd5418d017b6ec00
2021-02-03 18:53:33 -05:00
3d71c6a8eb mcexec_transfer_image(): map exact size of remote memory (instead of forcing PAGE_SIZE)
Change-Id: Ic66770af6cdb15b7a2e18a08cbcd1736e5558bdf
2021-02-03 18:53:33 -05:00
1cea75dd51 mcexec: fix strncat missing NULL and pclose of uninitialized
Change-Id: I9ce4004580845a983949caa5668b2f950880cd24
2021-02-02 01:51:57 +00:00
661ba0ce4a docs: add editing spec file when building rpm
Change-Id: Ic8dc9d8c6aef6d2180844891d743a09f4a3bdd9d
2021-01-29 01:23:35 +00:00
7e82adc761 prerelease: 0.92: fix uninitialized usrdata->cpu_topology_list
Change-Id: Ia12970bda1225898823a67c2d0461144fc62ebb9
2021-01-29 09:50:53 +09:00
1f9fbe82db mcctrl: fix access to uninitialized usrdata->cpu_topology_list
Change-Id: I25a9182b9b470bb069f4f755a67fb50b88817cd2
2021-01-29 09:34:24 +09:00
aa3d4ba7bd spec: prerelease 0.91 for 4.18.0-240.8.1.el8_3.aarch64 support
Change-Id: I8b33714157b1c68c1fc1eadf0b9d072a3ee59608
2021-01-26 02:34:35 -05:00
c89ac042f9 spec: prerelease 0.9 for testing hidos and cgroup check
Change-Id: I3b04fbf3a1ffa10df9c76da7b2730b9a2521bf98
2021-01-20 13:03:16 +09:00
147 changed files with 11746 additions and 1543 deletions

6
.gitmodules vendored
View File

@ -4,3 +4,9 @@
[submodule "executer/user/lib/libdwarf/libdwarf"]
path = executer/user/lib/libdwarf/libdwarf
url = https://github.com/bgerofi/libdwarf.git
[submodule "executer/user/lib/syscall_intercept"]
path = executer/user/lib/syscall_intercept
url = https://github.com/RIKEN-SysSoft/syscall_intercept.git
[submodule "executer/user/lib/uti"]
path = executer/user/lib/uti
url = https://github.com/RIKEN-SysSoft/uti.git

View File

@ -10,7 +10,7 @@ project(mckernel C ASM)
set(MCKERNEL_VERSION "1.7.1")
# See "Fedora Packaging Guidelines -- Versioning"
set(MCKERNEL_RELEASE "0.8")
set(MCKERNEL_RELEASE "0.95")
set(CMAKE_MODULE_PATH ${CMAKE_SOURCE_DIR}/cmake/modules)
# for rpmbuild
@ -41,6 +41,11 @@ if(IMPLICIT_FALLTHROUGH)
set(EXTRA_WARNINGS "-Wno-implicit-fallthrough")
endif(IMPLICIT_FALLTHROUGH)
CHECK_C_COMPILER_FLAG(-Wno-stringop-truncation STRINGOP_TRUNCATION)
if(STRINGOP_TRUNCATION)
list(APPEND EXTRA_WARNINGS "-Wno-stringop-truncation")
endif(STRINGOP_TRUNCATION)
# build options
set(CFLAGS_WARNING "-Wall" "-Wextra" "-Wno-unused-parameter" "-Wno-sign-compare" "-Wno-unused-function" ${EXTRA_WARNINGS} CACHE STRING "Warning flags")
add_compile_options(${CFLAGS_WARNING})
@ -65,7 +70,7 @@ if(ENABLE_TOFU)
endif()
# when compiling on a compute-node
execute_process(COMMAND bash -c "grep $(hostname) /etc/opt/FJSVfefs/config/fefs_node1.csv 2>/dev/null | cut -d, -f2"
execute_process(COMMAND bash -c "grep $(hostname) /etc/opt/FJSVfefs/config/fefs_node1.csv 2>/dev/null | cut -d, -f2 | grep -o CN"
OUTPUT_VARIABLE FUGAKU_NODE_TYPE OUTPUT_STRIP_TRAILING_WHITESPACE)
if(FUGAKU_NODE_TYPE STREQUAL "CN")
option(ENABLE_FUGAKU_HACKS "Fugaku hacks" ON)
@ -213,11 +218,6 @@ if (ENABLE_QLMPI)
find_package(MPI REQUIRED)
endif()
if (ENABLE_UTI)
pkg_check_modules(LIBSYSCALL_INTERCEPT REQUIRED libsyscall_intercept)
link_directories(${LIBSYSCALL_INTERCEPT_LIBRARY_DIRS})
endif()
string(REGEX REPLACE "^([0-9]+)\\.([0-9]+)\\.([0-9]+)(-([0-9]+)(.*))?" "\\1;\\2;\\3;\\5;\\6" LINUX_VERSION ${UNAME_R})
list(GET LINUX_VERSION 0 LINUX_VERSION_MAJOR)
list(GET LINUX_VERSION 1 LINUX_VERSION_MINOR)

View File

@ -1524,6 +1524,11 @@ int ihk_mc_arch_get_special_register(enum ihk_asr_type type,
return -1;
}
int ihk_mc_get_interrupt_id(int cpu)
{
return cpu;
}
/*@
@ requires \valid_cpuid(cpu); // valid CPU logical ID
@ ensures \result == 0
@ -1972,15 +1977,15 @@ int arch_cpu_read_write_register(
return ret;
}
int smp_call_func(cpu_set_t *__cpu_set, smp_func_t __func, void *__arg)
{
/* TODO: skeleton for smp_call_func */
return -1;
}
void arch_flush_icache_all(void)
{
asm("ic ialluis");
dsb(ish);
}
int ihk_mc_get_smp_handler_irq(void)
{
return LOCAL_SMP_FUNC_CALL_VECTOR;
}
/*** end of file ***/

View File

@ -89,9 +89,6 @@
mov x2, #0
bl check_signal_irq_disabled // check whether the signal is delivered(for kernel_exit)
.endif
.if \el == 1
bl check_sig_pending
.endif
disable_irq x1 // disable interrupts
.if \need_enable_step == 1
ldr x1, [tsk, #TI_FLAGS]

View File

@ -7,7 +7,8 @@
* @ref.impl
* linux-linaro/arch/arm64/include/asm/futex.h:__futex_atomic_op
*/
#define __futex_atomic_op(insn, ret, oldval, uaddr, tmp, oparg) \
#define ___futex_atomic_op(insn, ret, oldval, uaddr, tmp, oparg) \
do { \
asm volatile( \
"1: ldxr %w1, %2\n" \
insn "\n" \
@ -26,7 +27,24 @@
" .popsection\n" \
: "=&r" (ret), "=&r" (oldval), "+Q" (*uaddr), "=&r" (tmp) \
: "r" (oparg), "Ir" (-EFAULT) \
: "memory")
: "memory"); \
} while (0);
#ifndef IHK_OS_MANYCORE
#include <linux/uaccess.h>
#define __futex_atomic_op(insn, ret, oldval, uaddr, tmp, oparg) \
do { \
uaccess_enable(); \
___futex_atomic_op(insn, ret, oldval, uaddr, tmp, oparg) \
uaccess_disable(); \
} while (0);
#else
#define __futex_atomic_op(insn, ret, oldval, uaddr, tmp, oparg) \
___futex_atomic_op(insn, ret, oldval, uaddr, tmp, oparg) \
#endif
/*
* @ref.impl
@ -135,12 +153,4 @@ futex_atomic_cmpxchg_inatomic(int __user *uaddr, int oldval, int newval)
return ret;
}
static inline int get_futex_value_locked(uint32_t *dest, uint32_t *from)
{
*dest = *(volatile uint32_t *)from;
return 0;
}
#endif /* !__HEADER_ARM64_COMMON_ARCH_FUTEX_H */
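On the Linux side (`IHK_OS_MANYCORE` undefined), the new `__futex_atomic_op` brackets the inner operation with `uaccess_enable()` and `uaccess_disable()`. Below is a minimal user-space sketch of that wrapping pattern. Every name in it (`fake_uaccess_enable`, `INNER_OP`, `WRAPPED_OP`, `add_with_window`) is a stand-in invented for illustration, not McKernel or Linux code. Unlike the hunk above, the sketch leaves the semicolon off `while (0)`, so each macro expands to exactly one statement and can sit in an unbraced `if`/`else`.

```c
#include <assert.h>

/* Stand-ins for uaccess_enable()/uaccess_disable(); they only track
 * nesting so the bracketing can be observed in user space. */
int access_depth;
int ops_inside_window;

void fake_uaccess_enable(void)  { access_depth++; }
void fake_uaccess_disable(void) { access_depth--; }

/* Inner operation: a plain add, counted when it runs inside the window. */
#define INNER_OP(val, arg)			\
do {						\
	if (access_depth > 0)			\
		ops_inside_window++;		\
	(val) += (arg);				\
} while (0)

/* Outer wrapper: open the window, run the inner operation, close it. */
#define WRAPPED_OP(val, arg)			\
do {						\
	fake_uaccess_enable();			\
	INNER_OP(val, arg);			\
	fake_uaccess_disable();			\
} while (0)

/* Hypothetical driver: returns val + arg, using the wrapper when there
 * is work to do. The unbraced if/else only compiles because the macro
 * is a single statement. */
int add_with_window(int val, int arg)
{
	if (arg != 0)
		WRAPPED_OP(val, arg);
	else
		assert(access_depth == 0); /* window stays closed */
	return val;
}
```

Calling `add_with_window(1, 2)` runs the add inside the simulated access window and leaves `access_depth` back at zero afterwards.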

View File

@ -17,6 +17,7 @@
#define INTRID_STACK_TRACE 5
#define INTRID_MULTI_INTR 6
#define INTRID_MULTI_NMI 7
#define LOCAL_SMP_FUNC_CALL_VECTOR 1 /* same as IKC */
/* use PPI interrupt number */
#define INTRID_PERF_OVF 23

View File

@ -344,10 +344,13 @@ void handle_interrupt_gicv3(struct pt_regs *regs)
//irqflags = ihk_mc_spinlock_lock(&v->runq_lock);
/* For migration by IPI or by timesharing */
if (v->flags &
(CPU_FLAG_NEED_MIGRATE | CPU_FLAG_NEED_RESCHED)) {
v->flags &= ~CPU_FLAG_NEED_RESCHED;
do_check = 1;
if (v->flags & CPU_FLAG_NEED_RESCHED) {
if (v->flags & CPU_FLAG_NEED_MIGRATE && !from_user) {
// Don't migrate on K2K schedule
} else {
v->flags &= ~CPU_FLAG_NEED_RESCHED;
do_check = 1;
}
}
//ihk_mc_spinlock_unlock(&v->runq_lock, irqflags);
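The new condition in this hunk can be read as a small decision function: reschedule only when `CPU_FLAG_NEED_RESCHED` is set, and skip it when a migration is pending but the interrupt arrived in kernel mode. A minimal sketch of that logic follows. The flag values and the helper name `should_resched_on_irq` are assumptions made for illustration; the real `CPU_FLAG_*` constants live in the McKernel headers and may differ.

```c
#include <assert.h>

/* Illustrative flag values only; not the real McKernel definitions. */
#define CPU_FLAG_NEED_RESCHED 0x1U
#define CPU_FLAG_NEED_MIGRATE 0x2U

/* Hypothetical helper mirroring the new condition. Returns 1 when the
 * interrupt handler should clear NEED_RESCHED and run the scheduler
 * check, 0 when it should leave the flags untouched. */
int should_resched_on_irq(unsigned int flags, int from_user)
{
	/* from_user is expected to be a boolean */
	assert(from_user == 0 || from_user == 1);

	if (!(flags & CPU_FLAG_NEED_RESCHED))
		return 0;

	/* A pending migration is deferred while the interrupt was taken
	 * in kernel mode (the "K2K" case in the hunk's comment). */
	if ((flags & CPU_FLAG_NEED_MIGRATE) && !from_user)
		return 0;

	return 1;
}
```

Note that, as in the hunk, `CPU_FLAG_NEED_MIGRATE` on its own no longer triggers the check; the old code tested for either flag.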

View File

@ -16,6 +16,7 @@
#include <uio.h>
#include <syscall.h>
#include <rusage_private.h>
#include <memory.h>
#include <ihk/debug.h>
void terminate_mcexec(int, int);
@ -2250,8 +2251,10 @@ int move_pages_smp_handler(int cpu_index, int nr_cpus, void *arg)
case 0:
memcpy(mpsr->virt_addr, mpsr->user_virt_addr,
sizeof(void *) * count);
memcpy(mpsr->nodes, mpsr->user_nodes,
sizeof(int) * count);
if (mpsr->user_nodes) {
memcpy(mpsr->nodes, mpsr->user_nodes,
sizeof(int) * count);
}
memset(mpsr->ptep, 0, sizeof(pte_t) * count);
memset(mpsr->status, 0, sizeof(int) * count);
memset(mpsr->nr_pages, 0, sizeof(int) * count);
@ -2269,8 +2272,10 @@ int move_pages_smp_handler(int cpu_index, int nr_cpus, void *arg)
case 0:
memcpy(mpsr->virt_addr, mpsr->user_virt_addr,
sizeof(void *) * count);
memcpy(mpsr->nodes, mpsr->user_nodes,
sizeof(int) * count);
if (mpsr->user_nodes) {
memcpy(mpsr->nodes, mpsr->user_nodes,
sizeof(int) * count);
}
mpsr->nodes_ready = 1;
break;
case 1:
@ -2292,8 +2297,10 @@ int move_pages_smp_handler(int cpu_index, int nr_cpus, void *arg)
sizeof(void *) * count);
break;
case 1:
memcpy(mpsr->nodes, mpsr->user_nodes,
sizeof(int) * count);
if (mpsr->user_nodes) {
memcpy(mpsr->nodes, mpsr->user_nodes,
sizeof(int) * count);
}
mpsr->nodes_ready = 1;
break;
case 2:
@ -2322,8 +2329,10 @@ int move_pages_smp_handler(int cpu_index, int nr_cpus, void *arg)
sizeof(void *) * (count / 2));
break;
case 2:
memcpy(mpsr->nodes, mpsr->user_nodes,
sizeof(int) * count);
if (mpsr->user_nodes) {
memcpy(mpsr->nodes, mpsr->user_nodes,
sizeof(int) * count);
}
mpsr->nodes_ready = 1;
break;
case 3:
@ -2349,13 +2358,15 @@ int move_pages_smp_handler(int cpu_index, int nr_cpus, void *arg)
}
/* NUMA verification in parallel */
for (i = i_s; i < i_e; i++) {
if (mpsr->nodes[i] < 0 ||
mpsr->nodes[i] >= ihk_mc_get_nr_numa_nodes() ||
!test_bit(mpsr->nodes[i],
mpsr->proc->vm->numa_mask)) {
mpsr->phase_ret = -EINVAL;
break;
if (mpsr->user_nodes) {
for (i = i_s; i < i_e; i++) {
if (mpsr->nodes[i] < 0 ||
mpsr->nodes[i] >= ihk_mc_get_nr_numa_nodes() ||
!test_bit(mpsr->nodes[i],
mpsr->proc->vm->numa_mask)) {
mpsr->phase_ret = -EINVAL;
break;
}
}
}
@ -2387,7 +2398,7 @@ int move_pages_smp_handler(int cpu_index, int nr_cpus, void *arg)
/* PTE valid? */
if (!mpsr->ptep[i] || !pte_is_present(mpsr->ptep[i])) {
mpsr->status[i] = -ENOENT;
mpsr->status[i] = -EFAULT;
mpsr->ptep[i] = NULL;
continue;
}
@ -2451,6 +2462,26 @@ pte_out:
dkprintf("%s: phase %d done\n", __FUNCTION__, phase);
++phase;
/*
* When nodes array is NULL, move_pages doesn't move any pages,
* instead will return the node where each page
* currently resides by status array.
*/
if (!mpsr->user_nodes) {
/* get nid in parallel */
for (i = i_s; i < i_e; i++) {
if (mpsr->status[i] < 0) {
continue;
}
mpsr->status[i] = phys_to_nid(
pte_get_phys(mpsr->ptep[i]));
}
mpsr->phase_ret = 0;
goto out; // return node information
}
/* Processing of move pages */
if (cpu_index == 0) {
/* Allocate new pages on target NUMA nodes */
for (i = 0; i < count; i++) {
@ -2463,8 +2494,11 @@ pte_out:
/* TODO: store pgalign info in an array as well? */
if (mpsr->nr_pages[i] > 1) {
if (mpsr->nr_pages[i] * PAGE_SIZE == PTL2_SIZE)
pgalign = PTL2_SHIFT - PTL1_SHIFT;
int nr_pages;
for (pgalign = 0, nr_pages = mpsr->nr_pages[i];
nr_pages != 1; pgalign++, nr_pages >>= 1) {
}
}
dst = ihk_mc_alloc_aligned_pages_node(mpsr->nr_pages[i],
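The replacement loop in this last hunk derives the alignment order from the page count instead of special-casing `PTL2_SIZE`. A minimal standalone sketch of that computation, assuming the count is a power of two, is shown below. The helper name `pages_to_order` is hypothetical and not part of McKernel.

```c
#include <assert.h>

/* Hypothetical helper: returns log2(nr_pages), i.e. the alignment order
 * relative to the base page size. It mirrors the shift loop in the hunk
 * above and assumes nr_pages is a power of two and at least 1. */
int pages_to_order(int nr_pages)
{
	int order;

	assert(nr_pages >= 1 && (nr_pages & (nr_pages - 1)) == 0);

	for (order = 0; nr_pages != 1; order++, nr_pages >>= 1) {
		/* each halving of the page count adds one bit of alignment */
	}
	return order;
}
```

With 4 KiB base pages, for example, a 2 MiB mapping spans 512 pages and yields an order of 9.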

View File

@ -80,7 +80,11 @@ static void (*lapic_icr_write)(unsigned int h, unsigned int l);
static void (*lapic_wait_icr_idle)(void);
void (*x86_issue_ipi)(unsigned int apicid, unsigned int low);
int running_on_kvm(void);
static void smp_func_call_handler(void);
void smp_func_call_handler(void);
int ihk_mc_get_smp_handler_irq(void)
{
return LOCAL_SMP_FUNC_CALL_VECTOR;
}
void init_processors_local(int max_id);
void assign_processor_id(void);
@ -919,20 +923,18 @@ void interrupt_exit(struct x86_user_context *regs)
check_need_resched();
check_signal(0, regs, -1);
}
else {
check_sig_pending();
}
}
void handle_interrupt(int vector, struct x86_user_context *regs)
{
struct ihk_mc_interrupt_handler *h;
struct cpu_local_var *v = get_this_cpu_local_var();
int from_user = interrupt_from_user(regs);
lapic_ack();
++v->in_interrupt;
set_cputime(interrupt_from_user(regs) ?
set_cputime(from_user ?
CPUTIME_MODE_U2K : CPUTIME_MODE_K2K_IN);
dkprintf("CPU[%d] got interrupt, vector: %d, RIP: 0x%lX\n",
@ -1050,15 +1052,18 @@ void handle_interrupt(int vector, struct x86_user_context *regs)
}
interrupt_exit(regs);
set_cputime(interrupt_from_user(regs) ?
set_cputime(from_user ?
CPUTIME_MODE_K2U : CPUTIME_MODE_K2K_OUT);
--v->in_interrupt;
/* for migration by IPI */
if (v->flags & CPU_FLAG_NEED_MIGRATE) {
schedule();
check_signal(0, regs, 0);
// Don't migrate on K2K schedule
if (from_user) {
schedule();
check_signal(0, regs, 0);
}
}
}
@ -1673,6 +1678,11 @@ int ihk_mc_arch_get_special_register(enum ihk_asr_type type,
}
}
int ihk_mc_get_interrupt_id(int cpu)
{
return get_x86_cpu_local_variable(cpu)->apic_id;
}
/*@
@ requires \valid_cpuid(cpu); // valid CPU logical ID
@ ensures \result == 0
@ -2170,144 +2180,6 @@ int arch_cpu_read_write_register(
return 0;
}
/*
* Generic remote CPU function invocation facility.
*/
static void smp_func_call_handler(void)
{
int irq_flags;
struct smp_func_call_request *req;
int reqs_left;
reiterate:
req = NULL;
reqs_left = 0;
irq_flags = ihk_mc_spinlock_lock(
&cpu_local_var(smp_func_req_lock));
/* Take requests one-by-one */
if (!list_empty(&cpu_local_var(smp_func_req_list))) {
req = list_first_entry(&cpu_local_var(smp_func_req_list),
struct smp_func_call_request, list);
list_del(&req->list);
reqs_left = !list_empty(&cpu_local_var(smp_func_req_list));
}
ihk_mc_spinlock_unlock(&cpu_local_var(smp_func_req_lock),
irq_flags);
if (req) {
req->ret = req->sfcd->func(req->cpu_index,
req->sfcd->nr_cpus, req->sfcd->arg);
ihk_atomic_dec(&req->sfcd->cpus_left);
}
if (reqs_left)
goto reiterate;
}
int smp_call_func(cpu_set_t *__cpu_set, smp_func_t __func, void *__arg)
{
int cpu, nr_cpus = 0;
int cpu_index = 0;
int this_cpu_index = 0;
struct smp_func_call_data sfcd;
struct smp_func_call_request *reqs;
int ret = 0;
int call_on_this_cpu = 0;
cpu_set_t cpu_set;
/* Sanity checks */
if (!__cpu_set || !__func) {
return -EINVAL;
}
/* Make sure it won't change in between */
cpu_set = *__cpu_set;
for_each_set_bit(cpu, (unsigned long *)&cpu_set,
sizeof(cpu_set) * BITS_PER_BYTE) {
if (cpu == ihk_mc_get_processor_id()) {
call_on_this_cpu = 1;
}
++nr_cpus;
}
if (!nr_cpus) {
return -EINVAL;
}
reqs = kmalloc(sizeof(*reqs) * nr_cpus, IHK_MC_AP_NOWAIT);
if (!reqs) {
ret = -ENOMEM;
goto free_out;
}
sfcd.nr_cpus = nr_cpus;
sfcd.func = __func;
sfcd.arg = __arg;
ihk_atomic_set(&sfcd.cpus_left,
call_on_this_cpu ? nr_cpus - 1 : nr_cpus);
/* Add requests and send IPIs */
cpu_index = 0;
for_each_set_bit(cpu, (unsigned long *)&cpu_set,
sizeof(cpu_set) * BITS_PER_BYTE) {
unsigned long irq_flags;
reqs[cpu_index].cpu_index = cpu_index;
reqs[cpu_index].ret = 0;
if (cpu == ihk_mc_get_processor_id()) {
this_cpu_index = cpu_index;
++cpu_index;
continue;
}
reqs[cpu_index].sfcd = &sfcd;
irq_flags =
ihk_mc_spinlock_lock(&get_cpu_local_var(cpu)->smp_func_req_lock);
list_add_tail(&reqs[cpu_index].list,
&get_cpu_local_var(cpu)->smp_func_req_list);
ihk_mc_spinlock_unlock(&get_cpu_local_var(cpu)->smp_func_req_lock,
irq_flags);
ihk_mc_interrupt_cpu(cpu, LOCAL_SMP_FUNC_CALL_VECTOR);
++cpu_index;
}
/* Is this CPU involved? */
if (call_on_this_cpu) {
reqs[this_cpu_index].ret =
__func(this_cpu_index, nr_cpus, __arg);
}
/* Wait for the rest of the CPUs */
while (ihk_atomic_read(&sfcd.cpus_left) > 0) {
cpu_pause();
}
/* Check return values, if error, report the first non-zero */
for (cpu_index = 0; cpu_index < nr_cpus; ++cpu_index) {
if (reqs[cpu_index].ret != 0) {
ret = reqs[cpu_index].ret;
goto free_out;
}
}
ret = 0;
free_out:
kfree(reqs);
return ret;
}
extern int nmi_mode;
extern long freeze_thaw(void *nmi_ctx);

View File

@ -129,12 +129,4 @@ static inline int futex_atomic_op_inuser(int encoded_op,
return ret;
}
static inline int get_futex_value_locked(uint32_t *dest, uint32_t *from)
{
*dest = *(volatile uint32_t *)from;
return 0;
}
#endif

View File

@ -32,6 +32,7 @@
#include <limits.h>
#include <syscall.h>
#include <rusage_private.h>
#include <memory.h>
#include <ihk/debug.h>
void terminate_mcexec(int, int);
@ -2302,8 +2303,10 @@ int move_pages_smp_handler(int cpu_index, int nr_cpus, void *arg)
case 0:
memcpy(mpsr->virt_addr, mpsr->user_virt_addr,
sizeof(void *) * count);
memcpy(mpsr->nodes, mpsr->user_nodes,
sizeof(int) * count);
if (mpsr->user_nodes) {
memcpy(mpsr->nodes, mpsr->user_nodes,
sizeof(int) * count);
}
memset(mpsr->ptep, 0, sizeof(pte_t) * count);
memset(mpsr->status, 0, sizeof(int) * count);
memset(mpsr->nr_pages, 0, sizeof(int) * count);
@ -2321,8 +2324,10 @@ int move_pages_smp_handler(int cpu_index, int nr_cpus, void *arg)
case 0:
memcpy(mpsr->virt_addr, mpsr->user_virt_addr,
sizeof(void *) * count);
memcpy(mpsr->nodes, mpsr->user_nodes,
sizeof(int) * count);
if (mpsr->user_nodes) {
memcpy(mpsr->nodes, mpsr->user_nodes,
sizeof(int) * count);
}
mpsr->nodes_ready = 1;
break;
case 1:
@ -2344,8 +2349,10 @@ int move_pages_smp_handler(int cpu_index, int nr_cpus, void *arg)
sizeof(void *) * count);
break;
case 1:
memcpy(mpsr->nodes, mpsr->user_nodes,
sizeof(int) * count);
if (mpsr->user_nodes) {
memcpy(mpsr->nodes, mpsr->user_nodes,
sizeof(int) * count);
}
mpsr->nodes_ready = 1;
break;
case 2:
@ -2374,8 +2381,10 @@ int move_pages_smp_handler(int cpu_index, int nr_cpus, void *arg)
sizeof(void *) * (count / 2));
break;
case 2:
memcpy(mpsr->nodes, mpsr->user_nodes,
sizeof(int) * count);
if (mpsr->user_nodes) {
memcpy(mpsr->nodes, mpsr->user_nodes,
sizeof(int) * count);
}
mpsr->nodes_ready = 1;
break;
case 3:
@ -2401,13 +2410,15 @@ int move_pages_smp_handler(int cpu_index, int nr_cpus, void *arg)
}
/* NUMA verification in parallel */
for (i = i_s; i < i_e; i++) {
if (mpsr->nodes[i] < 0 ||
mpsr->nodes[i] >= ihk_mc_get_nr_numa_nodes() ||
!test_bit(mpsr->nodes[i],
mpsr->proc->vm->numa_mask)) {
mpsr->phase_ret = -EINVAL;
break;
if (mpsr->user_nodes) {
for (i = i_s; i < i_e; i++) {
if (mpsr->nodes[i] < 0 ||
mpsr->nodes[i] >= ihk_mc_get_nr_numa_nodes() ||
!test_bit(mpsr->nodes[i],
mpsr->proc->vm->numa_mask)) {
mpsr->phase_ret = -EINVAL;
break;
}
}
}
@ -2503,6 +2514,26 @@ pte_out:
dkprintf("%s: phase %d done\n", __FUNCTION__, phase);
++phase;
/*
* When nodes array is NULL, move_pages doesn't move any pages,
* instead will return the node where each page
* currently resides by status array.
*/
if (!mpsr->user_nodes) {
/* get nid in parallel */
for (i = i_s; i < i_e; i++) {
if (mpsr->status[i] < 0) {
continue;
}
mpsr->status[i] = phys_to_nid(
pte_get_phys(mpsr->ptep[i]));
}
mpsr->phase_ret = 0;
goto out; // return node information
}
/* Processing of move pages */
if (cpu_index == 0) {
/* Allocate new pages on target NUMA nodes */
for (i = 0; i < count; i++) {

View File

@ -129,11 +129,29 @@ Create the tarball and the spec file:
make dist
cp mckernel-<version>.tar.gz <rpmbuild>/SOURCES
(optional) Edit the following line in ``scripts/mckernel.spec`` to change
cmake options. For example:
::
%cmake -DCMAKE_BUILD_TYPE=Release \
-DUNAME_R=%{kernel_version} \
-DKERNEL_DIR=%{kernel_dir} \
%{?cmake_libdir:-DCMAKE_INSTALL_LIBDIR=%{cmake_libdir}} \
%{?build_target:-DBUILD_TARGET=%{build_target}} \
%{?toolchain_file:-DCMAKE_TOOLCHAIN_FILE=%{toolchain_file}} \
-DENABLE_TOFU=ON -DENABLE_FUGAKU_HACKS=ON \
-DENABLE_KRM_WORKAROUND=OFF -DWITH_KRM=ON \
-DENABLE_FUGAKU_DEBUG=OFF \
.
Create the rpm package:
When not cross-compiling:
"""""""""""""""""""""""""
Then build the rpm:
::
rpmbuild -ba scripts/mckernel.spec

View File

@ -202,3 +202,21 @@ Limitations
28. munlockall() is not supported and returns zero.
29. scheduling behavior is not Linux compatible. For example, sometimes one of the two processes on the same CPU continues to run after yielding.
30. (Fujitsu TCS-only) A job following the one in which __mcctrl_os_read_write_cpu_register() returns ``-ETIME`` fails because xos_hwb related CPU state isn't finalized. You can tell if the function returned ``-ETIME`` by checking if the following line appeared in the Linux kernel message:
::
__mcctrl_os_read_write_cpu_register: ERROR sending IKC msg: -62
You can re-initialize xos_hwb related CPU state by the following command:
::
sudo systemctl restart xos_hwb
31. System calls can write the mcexec VMAs with PROT_WRITE flag not
set. This is because we never turn off PROT_WRITE of the mcexec
VMAs to circumvent the issue "set_host_vma(): do NOT read protect
Linux VMA".

View File

@ -4,69 +4,66 @@ Advanced: Enable Utility Thread offloading Interface (UTI)
UTI enables a runtime such as MPI runtime to spawn utility threads such
as MPI asynchronous progress threads to Linux cores.
Install capstone
~~~~~~~~~~~~~~~~~~~~
Install ``capstone`` and ``capstone-devel``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When compute nodes don't have access to repositories
""""""""""""""""""""""""""""""""""""""""""""""""""""
When compute nodes have access to EPEL repository
"""""""""""""""""""""""""""""""""""""""""""""""""
Install EPEL capstone-devel:
Install EPEL ``capstone`` and ``capstone-devel``:
::
sudo yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
sudo yum install capstone-devel
sudo yum install capstone capstone-devel
When compute nodes don't have access to repositories
""""""""""""""""""""""""""""""""""""""""""""""""""""
When compute nodes don't have access to EPEL repository
"""""""""""""""""""""""""""""""""""""""""""""""""""""""
Ask the system administrator to install ``capstone-devel``. Note that it is in the EPEL repository.
A. Ask the system administrator to install ``capstone`` and ``capstone-devel``. Note that they are in the EPEL repository.
Install syscall_intercept
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
B. Download the rpms on a machine where you have administrator privileges:
::
git clone https://github.com/RIKEN-SysSoft/syscall_intercept.git
mkdir build && cd build
cmake <syscall_intercept>/arch/aarch64 -DCMAKE_INSTALL_PREFIX=<syscall-intercept-install> -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=gcc -DTREAT_WARNINGS_AS_ERRORS=OFF
sudo yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
sudo yum install yum-utils
yumdownloader capstone capstone-devel
Install UTI for McKernel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
and then install them under your home directory on the login node:
Install:
::
.. code-block:: none
cd $HOME/$(uname -p)
rpm2cpio capstone-4.0.1-9.el8.aarch64.rpm | cpio -idv
rpm2cpio capstone-devel-4.0.1-9.el8.aarch64.rpm | cpio -idv
sed -i 's#/usr/#'"$HOME"'/'"$(uname -p)"'/usr/#' $HOME/$(uname -p)/usr/lib64/pkgconfig/capstone.pc
git clone https://github.com/RIKEN-SysSoft/uti.git
mkdir build && cd build
../uti/configure --prefix=<mckernel-install> --with-rm=mckernel
make && make install
Install McKernel
~~~~~~~~~~~~~~~~~~~~
Add ``-DENABLE_UTI=ON`` option to ``cmake``:
``cmake`` with the additional options:
::
CMAKE_PREFIX_PATH=<syscall-intercept-install> cmake -DCMAKE_INSTALL_PREFIX=${HOME}/ihk+mckernel -DENABLE_UTI=ON $HOME/src/ihk+mckernel/mckernel
cmake -DCMAKE_INSTALL_PREFIX=${HOME}/ihk+mckernel -DENABLE_UTI=ON $HOME/src/ihk+mckernel/mckernel
make -j install
Run programs
~~~~~~~~~~~~~~~~
~~~~~~~~~~~~
Add ``--enable-uti`` option to ``mcexec``:
``mcexec`` with ``--enable-uti`` option:
::
mcexec --enable-uti <command>
Install UTI for Linux
~~~~~~~~~~~~~~~~~~~~~~~~~
(Optional) Install UTI for Linux
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You should skip this step if it's already installed as with, for example, Fujitsu Technical Computing Suite.
You can skip this step if you don't want to develop a run-time using UTI, or if it's already installed with, for example, Fujitsu Technical Computing Suite.
Install by make
"""""""""""""""
@ -89,3 +86,9 @@ Install by rpm
rm -f ~/rpmbuild/SOURCES/<version>.tar.gz
rpmbuild -ba ./scripts/uti.spec
rpm -Uvh uti-<version>-<release>-<arch>.rpm
(Optional) Install UTI for McKernel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You can skip this step if you don't want to develop a run-time using UTI.
Execute the commands above for installing UTI for Linux, with ``--with-rm=linux`` replaced with ``--with-rm=mckernel``.

View File

@ -164,6 +164,7 @@ struct program_load_desc {
(sizeof(unsigned long) * 8)];
int thp_disable;
int enable_uti;
int uti_thread_rank; /* N-th clone() spawns a thread on Linux CPU */
int uti_use_last_cpu; /* Work-around not to share CPU with OpenMP thread */
int straight_map;

View File

@ -5,7 +5,7 @@ struct syscall_struct {
int number;
unsigned long args[6];
unsigned long ret;
unsigned long uti_clv; /* copy of a clv in McKernel */
unsigned long uti_info; /* reference to data in McKernel */
};
#define UTI_SZ_SYSCALL_STACK 16
@ -17,7 +17,7 @@ struct uti_desc {
int mck_tid; /* TODO: Move this out for multiple migrated-to-Linux threads */
unsigned long key; /* struct task_struct* of mcexec thread, used to search struct host_thread */
int pid, tid; /* Used as the id of tracee when issuing MCEXEC_UP_TERMINATE_THREAD */
unsigned long uti_clv; /* copy of McKernel clv */
unsigned long uti_info; /* reference to data in McKernel */
int fd; /* /dev/mcosX */
struct syscall_struct syscall_stack[UTI_SZ_SYSCALL_STACK]; /* stack of system call arguments and return values */
@ -26,6 +26,36 @@ struct uti_desc {
int start_syscall_intercept; /* Used to sync between mcexec.c and syscall_intercept.c */
};
/* Reference to McKernel variables accessed by mcctrl */
struct uti_info {
/* clv info */
unsigned long thread_va;
void *uti_futex_resp;
void *ikc2linux;
unsigned long uti_futex_resp_pa;
unsigned long ikc2linux_pa;
/* thread info */
int tid;
int cpu;
void *status;
void *spin_sleep_lock;
void *spin_sleep;
void *vm;
void *futex_q;
unsigned long status_pa;
unsigned long spin_sleep_lock_pa;
unsigned long spin_sleep_pa;
unsigned long vm_pa;
unsigned long futex_q_pa;
/* global info */
int mc_idle_halt;
void *futex_queue;
void *os; // set by mcctrl
unsigned long futex_queue_pa;
};
#endif
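`struct uti_info` pairs most pointers with a matching `*_pa` field, which fits the commit "uti: futex: cache remote va to remote pa result". The sketch below shows one way such a cache could work: translate once, then reuse the stored physical address. It is an assumption about the intent, not the mcctrl implementation. `struct va_pa_cache`, `fake_translate` and `cached_va_to_pa` are invented for illustration, and the translation is a fixed offset rather than a real page-table walk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pair: a remote virtual address and its cached physical
 * address, in the spirit of the status / status_pa fields above. */
struct va_pa_cache {
	unsigned long va;
	unsigned long pa;
	int valid;
};

/* Demo instance, zero-initialized (so it starts out invalid). */
struct va_pa_cache demo_cache;

/* Counts how often the (expensive) translation was really performed. */
int translate_calls;

/* Stand-in translation: a fixed offset, purely for illustration.
 * A real implementation would walk the remote page tables. */
unsigned long fake_translate(unsigned long va)
{
	translate_calls++;
	return va + 0x1000UL;
}

/* Return the physical address for va, translating only on a miss. */
unsigned long cached_va_to_pa(struct va_pa_cache *c, unsigned long va)
{
	assert(c != NULL);

	if (!c->valid || c->va != va) {
		c->pa = fake_translate(va);
		c->va = va;
		c->valid = 1;
	}
	return c->pa;
}
```

Repeated lookups of the same address hit the cache, so `translate_calls` grows only when the virtual address changes.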

View File

@ -16,13 +16,15 @@ kmod(mcctrl
-I${IHK_FULL_SOURCE_DIR}/include/arch/${ARCH}
-I${PROJECT_SOURCE_DIR}/executer/include
-I${CMAKE_CURRENT_SOURCE_DIR}/arch/${ARCH}/include
-I${CMAKE_CURRENT_SOURCE_DIR}/include
-I${PROJECT_BINARY_DIR}
-I${PROJECT_SOURCE_DIR}/kernel/include
-I${PROJECT_SOURCE_DIR}/arch/${ARCH}/kernel/include
-DMCEXEC_PATH=\\"${MCEXEC_PATH}\\"
${ARCH_C_FLAGS}
SOURCES
driver.c control.c ikc.c syscall.c procfs.c binfmt_mcexec.c
sysfs.c sysfs_files.c arch/${ARCH}/archdeps.c
sysfs.c sysfs_files.c mc_plist.c futex.c arch/${ARCH}/archdeps.c arch/${ARCH}/cpu.c
EXTRA_SYMBOLS
${PROJECT_BINARY_DIR}/ihk/linux/core/Module.symvers
DEPENDS

View File

@ -0,0 +1,96 @@
/* cpu.c COPYRIGHT FUJITSU LIMITED 2015-2019 */
#include <cpu.h>
/* arm64 has no "pause" instruction; use "yield" instead */
void cpu_pause(void)
{
asm volatile("yield" ::: "memory");
}
#if defined(CONFIG_HAS_NMI)
#include <arm-gic-v3.h>
/* restore interrupt (ICC_PMR_EL1 <= flags) */
void cpu_restore_interrupt(unsigned long flags)
{
asm volatile(
"msr_s " __stringify(ICC_PMR_EL1) ",%0"
:
: "r" (flags)
: "memory");
}
/* save ICC_PMR_EL1 & disable interrupt (ICC_PMR_EL1 <= ICC_PMR_EL1_MASKED) */
unsigned long cpu_disable_interrupt_save(void)
{
unsigned long flags;
unsigned long masked = ICC_PMR_EL1_MASKED;
asm volatile(
"mrs_s %0, " __stringify(ICC_PMR_EL1) "\n"
"msr_s " __stringify(ICC_PMR_EL1) ",%1"
: "=&r" (flags)
: "r" (masked)
: "memory");
return flags;
}
/* save ICC_PMR_EL1 & enable interrupt (ICC_PMR_EL1 <= ICC_PMR_EL1_UNMASKED) */
unsigned long cpu_enable_interrupt_save(void)
{
unsigned long flags;
unsigned long masked = ICC_PMR_EL1_UNMASKED;
asm volatile(
"mrs_s %0, " __stringify(ICC_PMR_EL1) "\n"
"msr_s " __stringify(ICC_PMR_EL1) ",%1"
: "=&r" (flags)
: "r" (masked)
: "memory");
return flags;
}
#else /* defined(CONFIG_HAS_NMI) */
/* @ref.impl arch/arm64/include/asm/irqflags.h::arch_local_irq_restore */
/* restore interrupt (PSTATE.DAIF = flags restore) */
void cpu_restore_interrupt(unsigned long flags)
{
asm volatile(
"msr daif, %0 // arch_local_irq_restore"
:
: "r" (flags)
: "memory");
}
/* @ref.impl arch/arm64/include/asm/irqflags.h::arch_local_irq_save */
/* save PSTATE.DAIF & disable interrupt (PSTATE.DAIF I bit set) */
unsigned long cpu_disable_interrupt_save(void)
{
unsigned long flags;
asm volatile(
"mrs %0, daif // arch_local_irq_save\n"
"msr daifset, #2"
: "=r" (flags)
:
: "memory");
return flags;
}
/* save PSTATE.DAIF & enable interrupt (PSTATE.DAIF I bit clear) */
unsigned long cpu_enable_interrupt_save(void)
{
unsigned long flags;
asm volatile(
"mrs %0, daif // arch_local_irq_save\n"
"msr daifclr, #2"
: "=r" (flags)
:
: "memory");
return flags;
}
#endif /* defined(CONFIG_HAS_NMI) */

View File

@ -0,0 +1,142 @@
/* This is a copy of the necessary parts of McKernel, for uti-futex */
/* arch-lock.h COPYRIGHT FUJITSU LIMITED 2015-2018 */
#ifndef __HEADER_ARM64_COMMON_ARCH_LOCK_H
#define __HEADER_ARM64_COMMON_ARCH_LOCK_H
#include <linux/preempt.h>
#include <cpu.h>
#define ihk_mc_spinlock_lock __ihk_mc_spinlock_lock
#define ihk_mc_spinlock_unlock __ihk_mc_spinlock_unlock
#define ihk_mc_spinlock_lock_noirq __ihk_mc_spinlock_lock_noirq
#define ihk_mc_spinlock_unlock_noirq __ihk_mc_spinlock_unlock_noirq
/* @ref.impl arch/arm64/include/asm/spinlock_types.h::TICKET_SHIFT */
#define TICKET_SHIFT 16
/* @ref.impl ./arch/arm64/include/asm/lse.h::ARM64_LSE_ATOMIC_INSN */
/* else defined(CONFIG_AS_LSE) && defined(CONFIG_ARM64_LSE_ATOMICS) */
#define _ARM64_LSE_ATOMIC_INSN(llsc, lse) llsc
/* @ref.impl arch/arm64/include/asm/spinlock_types.h::arch_spinlock_t */
typedef struct {
#ifdef __AARCH64EB__
uint16_t next;
uint16_t owner;
#else /* __AARCH64EB__ */
uint16_t owner;
uint16_t next;
#endif /* __AARCH64EB__ */
} __attribute__((aligned(4))) _ihk_spinlock_t;
/* @ref.impl arch/arm64/include/asm/spinlock.h::arch_spin_lock */
/* spinlock lock */
static inline void
__ihk_mc_spinlock_lock_noirq(_ihk_spinlock_t *lock)
{
unsigned int tmp;
_ihk_spinlock_t lockval, newval;
preempt_disable();
asm volatile(
/* Atomically increment the next ticket. */
_ARM64_LSE_ATOMIC_INSN(
/* LL/SC */
" prfm pstl1strm, %3\n"
"1: ldaxr %w0, %3\n"
" add %w1, %w0, %w5\n"
" stxr %w2, %w1, %3\n"
" cbnz %w2, 1b\n",
/* LSE atomics */
" mov %w2, %w5\n"
" ldadda %w2, %w0, %3\n"
__nops(3)
)
/* Did we get the lock? */
" eor %w1, %w0, %w0, ror #16\n"
" cbz %w1, 3f\n"
/*
* No: spin on the owner. Send a local event to avoid missing an
* unlock before the exclusive load.
*/
" sevl\n"
"2: wfe\n"
" ldaxrh %w2, %4\n"
" eor %w1, %w2, %w0, lsr #16\n"
" cbnz %w1, 2b\n"
/* We got the lock. Critical section starts here. */
"3:"
: "=&r" (lockval), "=&r" (newval), "=&r" (tmp), "+Q" (*lock)
: "Q" (lock->owner), "I" (1 << TICKET_SHIFT)
: "memory");
}
/* spinlock lock & interrupt disable & PSTATE.DAIF save */
static inline unsigned long
__ihk_mc_spinlock_lock(_ihk_spinlock_t *lock)
{
unsigned long flags;
flags = cpu_disable_interrupt_save();
__ihk_mc_spinlock_lock_noirq(lock);
return flags;
}
/* @ref.impl arch/arm64/include/asm/spinlock.h::arch_spin_unlock */
/* spinlock unlock */
static inline void
__ihk_mc_spinlock_unlock_noirq(_ihk_spinlock_t *lock)
{
unsigned long tmp;
asm volatile(_ARM64_LSE_ATOMIC_INSN(
/* LL/SC */
" ldrh %w1, %0\n"
" add %w1, %w1, #1\n"
" stlrh %w1, %0",
/* LSE atomics */
" mov %w1, #1\n"
" staddlh %w1, %0\n"
__nops(1))
: "=Q" (lock->owner), "=&r" (tmp)
:
: "memory");
preempt_enable();
}
static inline void
__ihk_mc_spinlock_unlock(_ihk_spinlock_t *lock, unsigned long flags)
{
__ihk_mc_spinlock_unlock_noirq(lock);
cpu_restore_interrupt(flags);
}
typedef struct mcs_rwlock_lock {
_ihk_spinlock_t slock;
#ifndef ENABLE_UBSAN
} __aligned(64) mcs_rwlock_lock_t;
#else
} mcs_rwlock_lock_t;
#endif
static inline void
mcs_rwlock_writer_lock_noirq(struct mcs_rwlock_lock *lock)
{
ihk_mc_spinlock_lock_noirq(&lock->slock);
}
static inline void
mcs_rwlock_writer_unlock_noirq(struct mcs_rwlock_lock *lock)
{
ihk_mc_spinlock_unlock_noirq(&lock->slock);
}
#endif /* !__HEADER_ARM64_COMMON_ARCH_LOCK_H */

View File

@ -38,4 +38,26 @@ static const unsigned long arch_rus_vm_flags = VM_RESERVED | VM_MIXEDMAP | VM_EX
#else
static const unsigned long arch_rus_vm_flags = VM_DONTDUMP | VM_MIXEDMAP | VM_EXEC;
#endif
#define _xchg(ptr, x) \
({ \
__typeof__(*(ptr)) __ret; \
__ret = (__typeof__(*(ptr))) \
__xchg((unsigned long)(x), (ptr), sizeof(*(ptr))); \
__ret; \
})
#define xchg4(ptr, x) _xchg(ptr, x)
#define xchg8(ptr, x) _xchg(ptr, x)
enum arm64_pf_error_code {
PF_PROT = 1 << 0,
PF_WRITE = 1 << 1,
PF_USER = 1 << 2,
PF_RSVD = 1 << 3,
PF_INSTR = 1 << 4,
PF_PATCH = 1 << 29,
PF_POPULATE = 1 << 30,
};
#endif /* __HEADER_MCCTRL_ARM64_ARCHDEPS_H */

View File

@ -0,0 +1,51 @@
/* This is a copy of the necessary parts of McKernel, for uti-futex */
#include <cpu.h>
/*@
@ assigns \nothing;
@ behavior to_enabled:
@ assumes flags & RFLAGS_IF;
@ ensures \interrupt_disabled == 0;
@ behavior to_disabled:
@ assumes !(flags & RFLAGS_IF);
@ ensures \interrupt_disabled > 0;
@*/
void cpu_restore_interrupt(unsigned long flags)
{
asm volatile("push %0; popf" : : "g"(flags) : "memory", "cc");
}
void cpu_pause(void)
{
asm volatile("pause" ::: "memory");
}
/*@
@ assigns \nothing;
@ ensures \interrupt_disabled > 0;
@ behavior from_enabled:
@ assumes \interrupt_disabled == 0;
@ ensures \result & RFLAGS_IF;
@ behavior from_disabled:
@ assumes \interrupt_disabled > 0;
@ ensures !(\result & RFLAGS_IF);
@*/
unsigned long cpu_disable_interrupt_save(void)
{
unsigned long flags;
asm volatile("pushf; pop %0; cli" : "=r"(flags) : : "memory", "cc");
return flags;
}
unsigned long cpu_enable_interrupt_save(void)
{
unsigned long flags;
asm volatile("pushf; pop %0; sti" : "=r"(flags) : : "memory", "cc");
return flags;
}

View File

@ -0,0 +1,106 @@
/* This is a copy of the necessary parts of McKernel, for uti-futex */
#ifndef __HEADER_X86_COMMON_ARCH_LOCK
#define __HEADER_X86_COMMON_ARCH_LOCK
#include <linux/preempt.h>
#include <cpu.h>
#define ihk_mc_spinlock_lock __ihk_mc_spinlock_lock
#define ihk_mc_spinlock_unlock __ihk_mc_spinlock_unlock
#define ihk_mc_spinlock_lock_noirq __ihk_mc_spinlock_lock_noirq
#define ihk_mc_spinlock_unlock_noirq __ihk_mc_spinlock_unlock_noirq
typedef unsigned short __ticket_t;
typedef unsigned int __ticketpair_t;
/* arch/x86/include/asm/spinlock_types.h defines struct __raw_tickets */
typedef struct ihk_spinlock {
union {
__ticketpair_t head_tail;
struct ihk__raw_tickets {
__ticket_t head, tail;
} tickets;
};
} _ihk_spinlock_t;
static inline void ihk_mc_spinlock_init(_ihk_spinlock_t *lock)
{
lock->head_tail = 0;
}
static inline void __ihk_mc_spinlock_lock_noirq(_ihk_spinlock_t *lock)
{
register struct ihk__raw_tickets inc = { .tail = 0x0002 };
preempt_disable();
asm volatile ("lock xaddl %0, %1\n"
: "+r" (inc), "+m" (*(lock)) : : "memory", "cc");
if (inc.head == inc.tail)
goto out;
for (;;) {
if (*((volatile __ticket_t *)&lock->tickets.head) == inc.tail)
goto out;
cpu_pause();
}
out:
barrier(); /* make sure nothing creeps before the lock is taken */
}
static inline void __ihk_mc_spinlock_unlock_noirq(_ihk_spinlock_t *lock)
{
__ticket_t inc = 0x0002;
asm volatile ("lock addw %1, %0\n"
: "+m" (lock->tickets.head)
: "ri" (inc) : "memory", "cc");
preempt_enable();
}
static inline unsigned long __ihk_mc_spinlock_lock(_ihk_spinlock_t *lock)
{
unsigned long flags;
flags = cpu_disable_interrupt_save();
__ihk_mc_spinlock_lock_noirq(lock);
return flags;
}
static inline void __ihk_mc_spinlock_unlock(_ihk_spinlock_t *lock,
unsigned long flags)
{
__ihk_mc_spinlock_unlock_noirq(lock);
cpu_restore_interrupt(flags);
}
typedef struct mcs_rwlock_lock {
_ihk_spinlock_t slock;
#ifndef ENABLE_UBSAN
} __aligned(64) mcs_rwlock_lock_t;
#else
} mcs_rwlock_lock_t;
#endif
static inline void
mcs_rwlock_writer_lock_noirq(struct mcs_rwlock_lock *lock)
{
ihk_mc_spinlock_lock_noirq(&lock->slock);
}
static inline void
mcs_rwlock_writer_unlock_noirq(struct mcs_rwlock_lock *lock)
{
ihk_mc_spinlock_unlock_noirq(&lock->slock);
}
#endif
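
The ticket-lock idea behind `__ihk_mc_spinlock_lock_noirq()` and `__ihk_mc_spinlock_unlock_noirq()` can be sketched in portable C11 atomics. This is a minimal user-space illustration, not the mcctrl code: the names `ticket_lock`, `ticket_lock_init`, `ticket_lock_acquire` and `ticket_lock_release` are hypothetical, the ticket step is 1 (the x86 code above uses 0x0002), and preemption and interrupt handling are left out.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical user-space ticket lock; illustrative only. */
struct ticket_lock {
	atomic_ushort next;	/* next ticket to hand out (like "tail"/"next") */
	atomic_ushort owner;	/* ticket being served (like "head"/"owner") */
};

static void ticket_lock_init(struct ticket_lock *l)
{
	atomic_init(&l->next, 0);
	atomic_init(&l->owner, 0);
}

static void ticket_lock_acquire(struct ticket_lock *l)
{
	/* Atomically take a ticket; waiters are served in ticket order. */
	unsigned short me =
		atomic_fetch_add_explicit(&l->next, 1, memory_order_relaxed);

	/* Spin until the owner counter reaches our ticket. */
	while (atomic_load_explicit(&l->owner, memory_order_acquire) != me)
		;
}

static void ticket_lock_release(struct ticket_lock *l)
{
	/* Hand the lock to the next ticket holder. */
	atomic_fetch_add_explicit(&l->owner, 1, memory_order_release);
}
```

A caller would bracket its critical section with `ticket_lock_acquire(&l)` and `ticket_lock_release(&l)`. The kernel versions above additionally disable preemption and, in the non-`_noirq` variants, save and restore the interrupt state around the lock.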

View File

@ -23,4 +23,26 @@ static const unsigned long arch_rus_vm_flags = VM_RESERVED | VM_MIXEDMAP;
#else
static const unsigned long arch_rus_vm_flags = VM_DONTDUMP | VM_MIXEDMAP;
#endif
#define xchg4(ptr, x) \
({ \
int __x = (x); \
asm volatile("xchgl %k0,%1" \
: "=r" (__x) \
: "m" (*ptr), "0" (__x) \
: "memory"); \
__x; \
})
enum x86_pf_error_code {
PF_PROT = 1 << 0,
PF_WRITE = 1 << 1,
PF_USER = 1 << 2,
PF_RSVD = 1 << 3,
PF_INSTR = 1 << 4,
PF_PATCH = 1 << 29,
PF_POPULATE = 1 << 30,
};
#endif /* __HEADER_MCCTRL_X86_64_ARCHDEPS_H */

View File

@ -50,6 +50,8 @@
#include <uapi/linux/sched/types.h>
#endif
#include <archdeps.h>
#include <uti.h>
#include <futex.h>
//#define DEBUG
@ -270,16 +272,17 @@ int mcexec_transfer_image(ihk_os_t os, struct remote_transfer *__user upt)
return -EFAULT;
}
#ifdef CONFIG_MIC
if (pt.size > PAGE_SIZE) {
printk("mcexec_transfer_image(): ERROR: size exceeds PAGE_SIZE\n");
return -EFAULT;
}
phys = ihk_device_map_memory(ihk_os_to_dev(os), pt.rphys, PAGE_SIZE);
#ifdef CONFIG_MIC
rpm = ioremap_wc(phys, PAGE_SIZE);
#else
rpm = ihk_device_map_virtual(ihk_os_to_dev(os), phys, PAGE_SIZE, NULL, 0);
phys = ihk_device_map_memory(ihk_os_to_dev(os), pt.rphys, pt.size);
rpm = ihk_device_map_virtual(ihk_os_to_dev(os), phys, pt.size, NULL, 0);
#endif
if (!rpm) {
@ -304,10 +307,11 @@ int mcexec_transfer_image(ihk_os_t os, struct remote_transfer *__user upt)
#ifdef CONFIG_MIC
iounmap(rpm);
ihk_device_unmap_memory(ihk_os_to_dev(os), phys, PAGE_SIZE);
#else
ihk_device_unmap_virtual(ihk_os_to_dev(os), rpm, PAGE_SIZE);
ihk_device_unmap_virtual(ihk_os_to_dev(os), rpm, pt.size);
ihk_device_unmap_memory(ihk_os_to_dev(os), phys, pt.size);
#endif
ihk_device_unmap_memory(ihk_os_to_dev(os), phys, PAGE_SIZE);
return ret;
@ -1297,6 +1301,7 @@ void mcctrl_put_per_proc_data(struct mcctrl_per_proc_data *ppd)
ihk_ikc_spinlock_unlock(&ppd->wq_list_lock, flags);
pager_remove_process(ppd);
futex_remove_process(ppd);
kfree(ppd);
}
@ -1897,6 +1902,7 @@ int mcexec_create_per_process_data(ihk_os_t os,
spin_lock_init(&ppd->wq_list_lock);
memset(&ppd->cpu_set, 0, sizeof(cpumask_t));
ppd->ikc_target_cpu = 0;
ppd->rva_to_rpa_cache = RB_ROOT;
/* Final ref will be dropped in release_handler() through
* mcexec_destroy_per_process_data() */
atomic_set(&ppd->refcount, 1);
@ -2889,57 +2895,28 @@ static long mcexec_release_user_space(struct release_user_space_desc *__user arg
#endif
}
static long (*mckernel_do_futex)(int n, unsigned long arg0, unsigned long arg1,
unsigned long arg2, unsigned long arg3,
unsigned long arg4, unsigned long arg5,
unsigned long _uti_clv,
void *uti_futex_resp,
void *_linux_wait_event,
void *_linux_printk,
void *_linux_clock_gettime);
/* Convert phys_addr to virt_addr on Linux */
static void
uti_info_p2v(struct uti_info *info)
{
info->uti_futex_resp =
(void *)phys_to_virt(info->uti_futex_resp_pa);
info->ikc2linux =
(void *)phys_to_virt(info->ikc2linux_pa);
long uti_wait_event(void *_resp, unsigned long nsec_timeout) {
struct uti_futex_resp *resp = _resp;
if (nsec_timeout) {
return wait_event_interruptible_timeout(resp->wq, resp->done, nsecs_to_jiffies(nsec_timeout));
} else {
return wait_event_interruptible(resp->wq, resp->done);
}
}
info->status =
(void *)phys_to_virt(info->status_pa);
info->spin_sleep_lock =
(void *)phys_to_virt(info->spin_sleep_lock_pa);
info->spin_sleep =
(void *)phys_to_virt(info->spin_sleep_pa);
info->vm =
(void *)phys_to_virt(info->vm_pa);
info->futex_q =
(void *)phys_to_virt(info->futex_q_pa);
int uti_printk(const char *fmt, ...) {
int sum = 0, nwritten;
va_list args;
va_start(args, fmt);
nwritten = vprintk(fmt, args);
sum += nwritten;
va_end(args);
return sum;
}
int uti_clock_gettime(clockid_t clk_id, struct timespec *tp) {
int ret = 0;
struct timespec64 ts64;
dprintk("%s: clk_id=%x,REALTIME=%x,MONOTONIC=%x\n", __FUNCTION__, clk_id, CLOCK_REALTIME, CLOCK_MONOTONIC);
switch(clk_id) {
case CLOCK_REALTIME:
getnstimeofday64(&ts64);
tp->tv_sec = ts64.tv_sec;
tp->tv_nsec = ts64.tv_nsec;
dprintk("%s: CLOCK_REALTIME,%ld.%09ld\n", __FUNCTION__, tp->tv_sec, tp->tv_nsec);
break;
case CLOCK_MONOTONIC: {
/* Do not use getrawmonotonic() because it returns different value than clock_gettime() */
ktime_get_ts64(&ts64);
tp->tv_sec = ts64.tv_sec;
tp->tv_nsec = ts64.tv_nsec;
dprintk("%s: CLOCK_MONOTONIC,%ld.%09ld\n", __FUNCTION__, tp->tv_sec, tp->tv_nsec);
break; }
default:
ret = -EINVAL;
break;
}
return ret;
info->futex_queue =
(void *)phys_to_virt(info->futex_queue_pa);
}
long mcexec_syscall_thread(ihk_os_t os, unsigned long arg, struct file *file)
@ -2948,36 +2925,38 @@ long mcexec_syscall_thread(ihk_os_t os, unsigned long arg, struct file *file)
int number;
unsigned long args[6];
unsigned long ret;
unsigned long uti_clv; /* copy of a clv in McKernel */
unsigned long uti_info; /* reference to data in McKernel */
};
struct syscall_struct param;
struct syscall_struct __user *uparam =
(struct syscall_struct __user *)arg;
long rc;
if (copy_from_user(&param, uparam, sizeof param)) {
return -EFAULT;
}
if (param.number == __NR_futex) {
struct uti_futex_resp resp = {
.done = 0
};
init_waitqueue_head(&resp.wq);
if (!mckernel_do_futex) {
if (ihk_os_get_special_address(os, IHK_SPADDR_MCKERNEL_DO_FUTEX,
(unsigned long *)&mckernel_do_futex,
NULL)) {
kprintf("%s: ihk_os_get_special_address failed\n", __FUNCTION__);
return -EINVAL;
}
dprintk("%s: mckernel_do_futex=%p\n", __FUNCTION__, mckernel_do_futex);
}
struct uti_info *_uti_info = NULL;
init_waitqueue_head(&resp.wq);
_uti_info = (struct uti_info *)param.uti_info;
/* Convert phys_addr to virt_addr on Linux */
uti_info_p2v(_uti_info);
_uti_info->os = (void *)os;
rc = do_futex(param.number, param.args[0],
param.args[1], param.args[2],
param.args[3], param.args[4], param.args[5],
(struct uti_info *)param.uti_info,
(void *)&resp);
rc = (*mckernel_do_futex)(param.number, param.args[0], param.args[1], param.args[2],
param.args[3], param.args[4], param.args[5], param.uti_clv, (void *)&resp, (void *)uti_wait_event, (void *)uti_printk, (void *)uti_clock_gettime);
param.ret = rc;
} else {
struct mcctrl_usrdata *usrdata = ihk_host_os_get_usrdata(os);
@ -3037,6 +3016,8 @@ void mcctrl_futex_wake(struct ikc_scd_packet *pisp)
}
resp->done = 1;
dprintk("%s: cpu: %d\n", __func__, ihk_ikc_get_processor_id());
wake_up_interruptible(&resp->wq);
}
@ -3151,7 +3132,7 @@ static long
mcexec_uti_attr(ihk_os_t os, struct uti_attr_desc __user *_desc)
{
struct uti_attr_desc desc;
char *uti_cpu_set_str;
char *uti_cpu_set_str = NULL;
struct kuti_attr *kattr;
cpumask_t *cpuset = NULL, *env_cpuset = NULL;
struct mcctrl_usrdata *ud = ihk_host_os_get_usrdata(os);
@ -3186,22 +3167,33 @@ mcexec_uti_attr(ihk_os_t os, struct uti_attr_desc __user *_desc)
goto out;
}
if (!(uti_cpu_set_str = kmalloc(desc.uti_cpu_set_len, GFP_KERNEL))) {
pr_err("%s: error: allocating uti_cpu_set_str\n",
__func__);
rc = -ENOMEM;
goto out;
}
if (desc.uti_cpu_set_str) {
if (!(uti_cpu_set_str = kmalloc(desc.uti_cpu_set_len, GFP_KERNEL))) {
pr_err("%s: error: allocating uti_cpu_set_str\n",
__func__);
rc = -ENOMEM;
goto out;
}
if ((rc = copy_from_user(uti_cpu_set_str, desc.uti_cpu_set_str, desc.uti_cpu_set_len))) {
pr_err("%s: error: copy_from_user\n",
__func__);
rc = -EFAULT;
goto out;
if ((rc = copy_from_user(uti_cpu_set_str, desc.uti_cpu_set_str, desc.uti_cpu_set_len))) {
pr_err("%s: error: copy_from_user\n",
__func__);
rc = -EFAULT;
goto out;
}
}
kattr = phys_to_virt(desc.phys_attr);
{
int i;
pr_info("%s: flag: %lx\n", __func__, (unsigned long)kattr->attr.flags);
for (i = 0; i < UTI_MAX_NUMA_DOMAINS; i+= 64) {
kprintf("%s: numa_set[%d]: %lx\n", __func__, i, (unsigned long)kattr->attr.numa_set[i / 64]);
}
}
/* Find caller cpu for later resolution of subgroups */
list_for_each_entry(cpu_topo, &ud->cpu_topology_list, chain) {
if (cpu_topo->mckernel_cpu_id == kattr->parent_cpuid) {
@ -3644,7 +3636,8 @@ int __mcctrl_os_read_write_cpu_register(ihk_os_t os, int cpu,
isp.op = op;
isp.pdesc = virt_to_phys(ldesc);
ret = mcctrl_ikc_send_wait(os, cpu, &isp, 0, NULL, &do_free, 1, ldesc);
/* 1 sec timeout for the case where McKernel can't respond */
ret = mcctrl_ikc_send_wait(os, cpu, &isp, -1000, NULL, &do_free, 1, ldesc);
if (ret != 0) {
printk("%s: ERROR sending IKC msg: %d\n", __FUNCTION__, ret);
goto out;

File diff suppressed because it is too large

View File

@ -536,9 +536,6 @@ int prepare_ikc_channels(ihk_os_t os)
usrdata->os = os;
ihk_host_os_set_usrdata(os, usrdata);
ihk_ikc_listen_port(os, &lp_ikc2linux);
ihk_ikc_listen_port(os, &lp_ikc2mckernel);
init_waitqueue_head(&usrdata->wq_procfs);
mutex_init(&usrdata->reserve_lock);
mutex_init(&usrdata->part_exec_lock);
@ -555,6 +552,16 @@ int prepare_ikc_channels(ihk_os_t os)
INIT_LIST_HEAD(&usrdata->wakeup_descs_list);
spin_lock_init(&usrdata->wakeup_descs_lock);
/* ihk_ikc_listen_port should be performed after
* usrdata->cpu_topology_list is initialized because the
* function enables syscall_packet_handler which accesses
* the list (the call path is sysfsm_packet_handler -->
* sysfsm_work_main --> sysfsm_setup --> setup_sysfs_files
* --> setup_cpus_sysfs_files).
*/
ihk_ikc_listen_port(os, &lp_ikc2linux);
ihk_ikc_listen_port(os, &lp_ikc2mckernel);
return 0;
error:

View File

@ -0,0 +1,10 @@
/* This is a copy of the necessary parts of McKernel, for uti-futex */
#ifndef MC_CPU_H
#define MC_CPU_H
void cpu_restore_interrupt(unsigned long flags);
void cpu_pause(void);
unsigned long cpu_disable_interrupt_save(void);
unsigned long cpu_enable_interrupt_save(void);
#endif

View File

@ -0,0 +1,174 @@
/* This is a copy of the necessary parts of McKernel, for uti-futex */
#ifndef _FUTEX_H
#define _FUTEX_H
#include <mc_plist.h>
#include <arch-lock.h>
#include <linux/uaccess.h>
/** \name Futex Commands
* @{
*/
#define FUTEX_WAIT 0
#define FUTEX_WAKE 1
#define FUTEX_FD 2
#define FUTEX_REQUEUE 3
#define FUTEX_CMP_REQUEUE 4
#define FUTEX_WAKE_OP 5
#define FUTEX_LOCK_PI 6
#define FUTEX_UNLOCK_PI 7
#define FUTEX_TRYLOCK_PI 8
#define FUTEX_WAIT_BITSET 9
#define FUTEX_WAKE_BITSET 10
#define FUTEX_WAIT_REQUEUE_PI 11
#define FUTEX_CMP_REQUEUE_PI 12
// @}
#define FUTEX_PRIVATE_FLAG 128
#define FUTEX_CLOCK_REALTIME 256
#define FUTEX_CMD_MASK ~(FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME)
#define FUTEX_WAIT_PRIVATE (FUTEX_WAIT | FUTEX_PRIVATE_FLAG)
#define FUTEX_WAKE_PRIVATE (FUTEX_WAKE | FUTEX_PRIVATE_FLAG)
#define FUTEX_REQUEUE_PRIVATE (FUTEX_REQUEUE | FUTEX_PRIVATE_FLAG)
#define FUTEX_CMP_REQUEUE_PRIVATE (FUTEX_CMP_REQUEUE | FUTEX_PRIVATE_FLAG)
#define FUTEX_WAKE_OP_PRIVATE (FUTEX_WAKE_OP | FUTEX_PRIVATE_FLAG)
#define FUTEX_LOCK_PI_PRIVATE (FUTEX_LOCK_PI | FUTEX_PRIVATE_FLAG)
#define FUTEX_UNLOCK_PI_PRIVATE (FUTEX_UNLOCK_PI | FUTEX_PRIVATE_FLAG)
#define FUTEX_TRYLOCK_PI_PRIVATE (FUTEX_TRYLOCK_PI | FUTEX_PRIVATE_FLAG)
#define FUTEX_WAIT_BITSET_PRIVATE (FUTEX_WAIT_BITSET | FUTEX_PRIVATE_FLAG)
#define FUTEX_WAKE_BITSET_PRIVATE (FUTEX_WAKE_BITSET | FUTEX_PRIVATE_FLAG)
#define FUTEX_WAIT_REQUEUE_PI_PRIVATE (FUTEX_WAIT_REQUEUE_PI | \
FUTEX_PRIVATE_FLAG)
#define FUTEX_CMP_REQUEUE_PI_PRIVATE (FUTEX_CMP_REQUEUE_PI | \
FUTEX_PRIVATE_FLAG)
/** \name Futex Operations, used for FUTEX_WAKE_OP
* @{
*/
#define FUTEX_OP_SET 0 /* *(int *)UADDR2 = OPARG; */
#define FUTEX_OP_ADD 1 /* *(int *)UADDR2 += OPARG; */
#define FUTEX_OP_OR 2 /* *(int *)UADDR2 |= OPARG; */
#define FUTEX_OP_ANDN 3 /* *(int *)UADDR2 &= ~OPARG; */
#define FUTEX_OP_XOR 4 /* *(int *)UADDR2 ^= OPARG; */
#define FUTEX_OP_OPARG_SHIFT 8U /* Use (1 << OPARG) instead of OPARG. */
#define FUTEX_OP_CMP_EQ 0 /* if (oldval == CMPARG) wake */
#define FUTEX_OP_CMP_NE 1 /* if (oldval != CMPARG) wake */
#define FUTEX_OP_CMP_LT 2 /* if (oldval < CMPARG) wake */
#define FUTEX_OP_CMP_LE 3 /* if (oldval <= CMPARG) wake */
#define FUTEX_OP_CMP_GT 4 /* if (oldval > CMPARG) wake */
#define FUTEX_OP_CMP_GE 5 /* if (oldval >= CMPARG) wake */
// @}
#define FUT_OFF_INODE 1 /* We set bit 0 if key has a reference on inode */
#define FUT_OFF_MMSHARED 2 /* We set bit 1 if key has a reference on mm */
#define FUTEX_HASHBITS 8 /* 256 entries in each futex hash tbl */
#define PS_RUNNING 0x1
#define PS_INTERRUPTIBLE 0x2
#define PS_UNINTERRUPTIBLE 0x4
#define PS_ZOMBIE 0x8
#define PS_EXITED 0x10
#define PS_STOPPED 0x20
static inline int get_futex_value_locked(uint32_t *dest, uint32_t *from)
{
int ret;
pagefault_disable();
ret = __get_user(*dest, from);
pagefault_enable();
return ret ? -EFAULT : 0;
}
union futex_key {
struct {
unsigned long pgoff;
void *phys;
int offset;
} shared;
struct {
unsigned long address;
void *mm; // Actually, process_vm
int offset;
} private;
struct {
unsigned long word;
void *ptr;
int offset;
} both;
};
#define FUTEX_KEY_INIT ((union futex_key) { .both = { .ptr = NULL } })
#define FUTEX_BITSET_MATCH_ANY 0xffffffff
/**
* struct futex_q - The hashed futex queue entry, one per waiting task
* @task: the task waiting on the futex
* @lock_ptr: the hash bucket lock
* @key: the key the futex is hashed on
* @requeue_pi_key: the requeue_pi target futex key
* @bitset: bitset for the optional bitmasked wakeup
*
* We use this hashed waitqueue, instead of a normal wait_queue_t, so
* we can wake only the relevant ones (hashed queues may be shared).
*
* A futex_q has a woken state, just like tasks have TASK_RUNNING.
* It is considered woken when plist_node_empty(&q->list) || q->lock_ptr == 0.
* The order of wakeup is always to make the first condition true, then
* the second.
*
* PI futexes are typically woken before they are removed from the hash list via
* the rt_mutex code. See unqueue_me_pi().
*/
struct futex_q {
struct mc_plist_node list;
void *task; // Actually, struct thread
_ihk_spinlock_t *lock_ptr;
union futex_key key;
union futex_key *requeue_pi_key;
uint32_t bitset;
/* Used to wake-up a thread running on a Linux CPU */
void *uti_futex_resp;
/* Used to send IPI directly to the waiter CPU */
int linux_cpu;
/* Used to wake-up a thread running on a McKernel from Linux */
void *th_spin_sleep;
void *th_status;
void *th_spin_sleep_lock;
void *proc_status;
void *proc_update_lock;
void *runq_lock;
void *clv_flags;
int intr_id;
int intr_vector;
unsigned long th_spin_sleep_pa;
unsigned long th_status_pa;
unsigned long th_spin_sleep_lock_pa;
unsigned long proc_status_pa;
unsigned long proc_update_lock_pa;
unsigned long runq_lock_pa;
unsigned long clv_flags_pa;
};
long do_futex(int n, unsigned long arg0, unsigned long arg1,
unsigned long arg2, unsigned long arg3,
unsigned long arg4, unsigned long arg5,
struct uti_info *uti_info,
void *uti_futex_resp);
void futex_remove_process(struct mcctrl_per_proc_data *ppd);
#endif
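
As a rough illustration of how the command constants in this header are meant to be combined, the sketch below splits a futex `op` argument into its command and its flags. The constants are copied from the header; the helper names `futex_cmd`, `futex_is_private` and `futex_uses_realtime` are hypothetical and do not appear in the patch.

```c
#include <assert.h>

/* Values copied from the header above. */
#define FUTEX_WAIT		0
#define FUTEX_WAKE		1
#define FUTEX_WAIT_BITSET	9
#define FUTEX_PRIVATE_FLAG	128
#define FUTEX_CLOCK_REALTIME	256
#define FUTEX_CMD_MASK		~(FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME)

/* Hypothetical helper: strip the flag bits, leaving the command. */
static int futex_cmd(int op)
{
	return op & FUTEX_CMD_MASK;
}

/* Hypothetical helper: non-zero if the futex is process-private. */
static int futex_is_private(int op)
{
	return (op & FUTEX_PRIVATE_FLAG) != 0;
}

/* Hypothetical helper: non-zero if the timeout uses CLOCK_REALTIME. */
static int futex_uses_realtime(int op)
{
	return (op & FUTEX_CLOCK_REALTIME) != 0;
}
```

For example, `futex_cmd(FUTEX_WAIT_BITSET | FUTEX_PRIVATE_FLAG)` yields `FUTEX_WAIT_BITSET`, which is the value a dispatcher would switch on.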

View File

@ -0,0 +1,277 @@
/* This is a copy of the necessary parts of McKernel, for uti-futex */
/*
* Descending-priority-sorted double-linked list
*
* (C) 2002-2003 Intel Corp
* Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>.
*
* 2001-2005 (c) MontaVista Software, Inc.
* Daniel Walker <dwalker@mvista.com>
*
* (C) 2005 Thomas Gleixner <tglx@linutronix.de>
*
* Simplifications of the original code by
* Oleg Nesterov <oleg@tv-sign.ru>
*
* Licensed under the FSF's GNU Public License v2 or later.
*
* Based on simple lists (include/linux/list.h).
*
* This is a priority-sorted list of nodes; each node has a
* priority from INT_MIN (highest) to INT_MAX (lowest).
*
* Addition is O(K), removal is O(1), change of priority of a node is
* O(K) and K is the number of RT priority levels used in the system.
* (1 <= K <= 99)
*
* This list is really a list of lists:
*
* - The tier 1 list is the prio_list, different priority nodes.
*
* - The tier 2 list is the node_list, serialized nodes.
*
* Simple ASCII art explanation:
*
* |HEAD |
* | |
* |prio_list.prev|<------------------------------------|
* |prio_list.next|<->|pl|<->|pl|<--------------->|pl|<-|
* |10 | |10| |21| |21| |21| |40| (prio)
* | | | | | | | | | | | |
* | | | | | | | | | | | |
* |node_list.next|<->|nl|<->|nl|<->|nl|<->|nl|<->|nl|<-|
* |node_list.prev|<------------------------------------|
*
* The nodes on the prio_list list are sorted by priority to simplify
* the insertion of new nodes. There are no nodes with duplicate
* priorities on the list.
*
* The nodes on the node_list are ordered by priority and can contain
* entries which have the same priority. Those entries are ordered
* FIFO
*
* Addition means: look for the prio_list node in the prio_list
* for the priority of the node and insert it before the node_list
* entry of the next prio_list node. If it is the first node of
* that priority, add it to the prio_list in the right position and
* insert it into the serialized node_list list
*
* Removal means remove it from the node_list and remove it from
* the prio_list if the node_list list_head is non empty. In case
* of removal from the prio_list it must be checked whether other
* entries of the same priority are on the list or not. If there
* is another entry of the same priority then this entry has to
* replace the removed entry on the prio_list. If the entry which
* is removed is the only entry of this priority then a simple
* remove from both list is sufficient.
*
* INT_MIN is the highest priority, 0 is the medium highest, INT_MAX
* is lowest priority.
*
* No locking is done, up to the caller.
*
*/
#ifndef _MC_PLIST_H_
#define _MC_PLIST_H_
#include <arch-lock.h>
struct mc_plist_head {
struct list_head prio_list;
struct list_head node_list;
#ifdef CONFIG_DEBUG_PI_LIST
raw_spinlock_t *rawlock;
spinlock_t *spinlock;
#endif
};
struct mc_plist_node {
int prio;
struct mc_plist_head plist;
};
#ifdef CONFIG_DEBUG_PI_LIST
# define PLIST_HEAD_LOCK_INIT(_lock) .spinlock = _lock
# define PLIST_HEAD_LOCK_INIT_RAW(_lock) .rawlock = _lock
#else
# define PLIST_HEAD_LOCK_INIT(_lock)
# define PLIST_HEAD_LOCK_INIT_RAW(_lock)
#endif
#define _MCK_PLIST_HEAD_INIT(head) \
.prio_list = LIST_HEAD_INIT((head).prio_list), \
.node_list = LIST_HEAD_INIT((head).node_list)
/**
* PLIST_HEAD_INIT - static struct plist_head initializer
* @head: struct plist_head variable name
* @_lock: lock to initialize for this list
*/
#define MCK_PLIST_HEAD_INIT(head, _lock) \
{ \
_MCK_PLIST_HEAD_INIT(head), \
MCK_PLIST_HEAD_LOCK_INIT(&(_lock)) \
}
/**
* PLIST_HEAD_INIT_RAW - static struct plist_head initializer
* @head: struct plist_head variable name
* @_lock: lock to initialize for this list
*/
#define MCK_PLIST_HEAD_INIT_RAW(head, _lock) \
{ \
_MCK_PLIST_HEAD_INIT(head), \
MCK_PLIST_HEAD_LOCK_INIT_RAW(&(_lock)) \
}
/**
* PLIST_NODE_INIT - static struct plist_node initializer
* @node: struct plist_node variable name
* @__prio: initial node priority
*/
#define MCK_PLIST_NODE_INIT(node, __prio) \
{ \
.prio = (__prio), \
.plist = { _MCK_PLIST_HEAD_INIT((node).plist) }, \
}
/**
* plist_head_init - dynamic struct plist_head initializer
* @head: &struct plist_head pointer
* @lock: spinlock protecting the list (debugging)
*/
static inline void
mc_plist_head_init(struct mc_plist_head *head, _ihk_spinlock_t *lock)
{
INIT_LIST_HEAD(&head->prio_list);
INIT_LIST_HEAD(&head->node_list);
#ifdef CONFIG_DEBUG_PI_LIST
head->spinlock = lock;
head->rawlock = NULL;
#endif
}
/**
* plist_head_init_raw - dynamic struct plist_head initializer
* @head: &struct plist_head pointer
* @lock: raw_spinlock protecting the list (debugging)
*/
static inline void
mc_plist_head_init_raw(struct mc_plist_head *head, _ihk_spinlock_t *lock)
{
INIT_LIST_HEAD(&head->prio_list);
INIT_LIST_HEAD(&head->node_list);
#ifdef CONFIG_DEBUG_PI_LIST
head->rawlock = lock;
head->spinlock = NULL;
#endif
}
/**
* plist_node_init - Dynamic struct plist_node initializer
* @node: &struct plist_node pointer
* @prio: initial node priority
*/
static inline void mc_plist_node_init(struct mc_plist_node *node, int prio)
{
node->prio = prio;
mc_plist_head_init(&node->plist, NULL);
}
extern void mc_plist_add(struct mc_plist_node *node,
struct mc_plist_head *head);
extern void mc_plist_del(struct mc_plist_node *node,
struct mc_plist_head *head);
/**
* plist_for_each - iterate over the plist
* @pos: the type * to use as a loop counter
* @head: the head for your list
*/
#define mc_plist_for_each(pos, head) \
list_for_each_entry(pos, &(head)->node_list, plist.node_list)
/**
* plist_for_each_safe - iterate safely over a plist of given type
* @pos: the type * to use as a loop counter
* @n: another type * to use as temporary storage
* @head: the head for your list
*
* Iterate over a plist of given type, safe against removal of list entry.
*/
#define mc_plist_for_each_safe(pos, n, head) \
list_for_each_entry_safe(pos, n, &(head)->node_list, plist.node_list)
/**
* plist_for_each_entry - iterate over list of given type
* @pos: the type * to use as a loop counter
* @head: the head for your list
* @mem: the name of the list_struct within the struct
*/
#define mc_plist_for_each_entry(pos, head, mem) \
list_for_each_entry(pos, &(head)->node_list, mem.plist.node_list)
/**
* plist_for_each_entry_safe - iterate safely over list of given type
* @pos: the type * to use as a loop counter
* @n: another type * to use as temporary storage
* @head: the head for your list
* @m: the name of the list_struct within the struct
*
* Iterate over list of given type, safe against removal of list entry.
*/
#define mc_plist_for_each_entry_safe(pos, n, head, m) \
list_for_each_entry_safe(pos, n, &(head)->node_list, m.plist.node_list)
/**
* plist_head_empty - return !0 if a plist_head is empty
* @head: &struct plist_head pointer
*/
static inline int mc_plist_head_empty(const struct mc_plist_head *head)
{
return list_empty(&head->node_list);
}
/**
* plist_node_empty - return !0 if plist_node is not on a list
* @node: &struct plist_node pointer
*/
static inline int mc_plist_node_empty(const struct mc_plist_node *node)
{
return mc_plist_head_empty(&node->plist);
}
/* All functions below assume the plist_head is not empty. */
/**
* plist_first_entry - get the struct for the first entry
* @head: the &struct plist_head pointer
* @type: the type of the struct this is embedded in
* @member: the name of the list_struct within the struct
*/
#ifdef CONFIG_DEBUG_PI_LIST
# define mc_plist_first_entry(head, type, member) \
({ \
WARN_ON(mc_plist_head_empty(head)); \
container_of(mc_plist_first(head), type, member); \
})
#else
# define mc_plist_first_entry(head, type, member) \
container_of(mc_plist_first(head), type, member)
#endif
/**
* plist_first - return the first node (and thus, highest priority)
* @head: the &struct plist_head pointer
*
* Assumes the plist is _not_ empty.
*/
static inline struct mc_plist_node *mc_plist_first(
const struct mc_plist_head *head)
{
return list_entry(head->node_list.next,
struct mc_plist_node, plist.node_list);
}
#endif

View File

@ -0,0 +1,100 @@
/* This is a copy of the necessary parts of McKernel, for uti-futex */
#include <mc_plist.h>
#include <arch-lock.h>
#ifdef CONFIG_DEBUG_PI_LIST
static void mc_plist_check_prev_next(struct list_head *t, struct list_head *p,
struct list_head *n)
{
WARN(n->prev != p || p->next != n,
"top: %p, n: %p, p: %p\n"
"prev: %p, n: %p, p: %p\n"
"next: %p, n: %p, p: %p\n",
t, t->next, t->prev,
p, p->next, p->prev,
n, n->next, n->prev);
}
static void mc_plist_check_list(struct list_head *top)
{
struct list_head *prev = top, *next = top->next;
mc_plist_check_prev_next(top, prev, next);
while (next != top) {
prev = next;
next = prev->next;
mc_plist_check_prev_next(top, prev, next);
}
}
static void mc_plist_check_head(struct mc_plist_head *head)
{
WARN_ON(!head->rawlock && !head->spinlock);
if (head->rawlock)
WARN_ON_SMP(!raw_spin_is_locked(head->rawlock));
if (head->spinlock)
WARN_ON_SMP(!spin_is_locked(head->spinlock));
mc_plist_check_list(&head->prio_list);
mc_plist_check_list(&head->node_list);
}
#else
# define mc_plist_check_head(h) do { } while (0)
#endif
/**
* plist_add - add @node to @head
*
* @node: &struct plist_node pointer
* @head: &struct plist_head pointer
*/
void mc_plist_add(struct mc_plist_node *node, struct mc_plist_head *head)
{
struct mc_plist_node *iter;
mc_plist_check_head(head);
#if 0
WARN_ON(!plist_node_empty(node));
#endif
list_for_each_entry(iter, &head->prio_list, plist.prio_list) {
if (node->prio < iter->prio)
goto lt_prio;
else if (node->prio == iter->prio) {
iter = list_entry(iter->plist.prio_list.next,
struct mc_plist_node, plist.prio_list);
goto eq_prio;
}
}
lt_prio:
list_add_tail(&node->plist.prio_list, &iter->plist.prio_list);
eq_prio:
list_add_tail(&node->plist.node_list, &iter->plist.node_list);
mc_plist_check_head(head);
}
/**
* plist_del - Remove a @node from plist.
*
* @node: &struct plist_node pointer - entry to be removed
* @head: &struct plist_head pointer - list head
*/
void mc_plist_del(struct mc_plist_node *node, struct mc_plist_head *head)
{
mc_plist_check_head(head);
if (!list_empty(&node->plist.prio_list)) {
struct mc_plist_node *next = mc_plist_first(&node->plist);
list_move_tail(&next->plist.prio_list, &node->plist.prio_list);
list_del_init(&node->plist.prio_list);
}
list_del_init(&node->plist.node_list);
mc_plist_check_head(head);
}
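
The ordering that mc_plist_add() maintains (ascending priority value, with nodes of equal priority kept in insertion order) can be illustrated with a much smaller structure. The following is only a sketch under stated assumptions: `demo_node`, `demo_add` and `demo_first` are hypothetical names, and a single singly linked list stands in for the prio_list/node_list pair used by the real code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified node: one singly linked list instead of the
 * prio_list/node_list pair used by mc_plist. */
struct demo_node {
	int prio;               /* lower value = higher priority */
	int id;                 /* only used to tell nodes apart */
	struct demo_node *next;
};

/* Insert node so that the list stays sorted by ascending prio. A node whose
 * priority is already present is placed after the existing ones (FIFO). */
void demo_add(struct demo_node **head, struct demo_node *node)
{
	struct demo_node **pos = head;

	/* Skip every node with a priority value <= the new one. */
	while (*pos && (*pos)->prio <= node->prio)
		pos = &(*pos)->next;
	node->next = *pos;
	*pos = node;
}

/* Return the first (highest-priority) node, or NULL if the list is empty. */
struct demo_node *demo_first(struct demo_node *head)
{
	return head;
}
```

Unlike mc_plist_first(), which assumes a non-empty list, this sketch returns NULL for an empty list. The real plist additionally keeps one entry per distinct priority on prio_list so that insertion can skip over runs of equal-priority nodes.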


@ -264,6 +264,8 @@ struct mcctrl_per_proc_data {
struct list_head devobj_pager_list;
struct semaphore devobj_pager_lock;
int enable_tofu;
struct rb_root rva_to_rpa_cache;
};
struct sysfsm_req {


@ -366,6 +366,7 @@ retry_alloc:
#define STATUS_IN_PROGRESS 0
#define STATUS_SYSCALL 4
#define __NR_syscall_response 8001
req->valid = 0;
if (__notify_syscall_requester(usrdata->os, packet, resp) < 0) {
@ -440,7 +441,7 @@ retry_alloc:
req->valid = 0;
/* check result */
if (req->number != __NR_mmap) {
if (req->number != __NR_syscall_response) {
printk("%s:unexpected response. %lx %lx\n",
__FUNCTION__, req->number, req->args[0]);
syscall_ret = -EIO;


@ -69,13 +69,18 @@ if (ENABLE_QLMPI)
endif()
if (ENABLE_UTI)
link_directories("${CMAKE_CURRENT_BINARY_DIR}/lib/syscall_intercept")
add_library(mck_syscall_intercept SHARED syscall_intercept.c arch/${ARCH}/archdep_c.c)
# target name is defined by add_library(), not project() or add_subdirectory()
add_dependencies(mck_syscall_intercept syscall_intercept_shared)
if (${ARCH} STREQUAL "arm64")
set_source_files_properties(syscall_intercept.c PROPERTIES COMPILE_FLAGS -mgeneral-regs-only)
endif()
target_link_libraries(mck_syscall_intercept ${LIBSYSCALL_INTERCEPT_LIBRARIES})
target_include_directories(mck_syscall_intercept PRIVATE ${LIBSYSCALL_INTERCEPT_INCLUDE_DIRS})
set_target_properties(mck_syscall_intercept PROPERTIES INSTALL_RPATH_USE_LINK_PATH TRUE)
target_link_libraries(mck_syscall_intercept syscall_intercept)
target_include_directories(mck_syscall_intercept PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/lib/syscall_intercept/include)
set_target_properties(mck_syscall_intercept PROPERTIES INSTALL_RPATH ${CMAKE_INSTALL_PREFIX}/lib64)
install(TARGETS mck_syscall_intercept
DESTINATION "${CMAKE_INSTALL_LIBDIR}")


@ -1,3 +1,49 @@
if (NOT LIBDWARF)
add_subdirectory(libdwarf)
endif()
if (ENABLE_UTI)
if (${ARCH} STREQUAL "arm64")
set(SYSCALL_INTERCEPT_SOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}/syscall_intercept/arch/aarch64" CACHE STRINGS "relative path to syscall_intercept source directory")
elseif (${ARCH} STREQUAL "x86_64")
set(SYSCALL_INTERCEPT_SOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}/syscall_intercept" CACHE STRINGS "relative path to syscall_intercept source directory")
endif()
# syscall_intercept
# change cmake options only in this directory
SET(CMAKE_BUILD_TYPE Release CACHE STRING "release build" FORCE)
SET(TREAT_WARNINGS_AS_ERRORS OFF CACHE BOOL "ignore warnings" FORCE)
add_subdirectory(${SYSCALL_INTERCEPT_SOURCE_DIR} syscall_intercept)
# libuti
find_path(LIBCAP_INCLUDE_DIRS
capability.h
PATHS /usr/include/sys
NO_DEFAULT_PATH)
find_library(LIBCAP_LIBRARIES
NAME cap
PATHS /usr/lib64
NO_DEFAULT_PATH)
if (NOT LIBCAP_INCLUDE_DIRS OR NOT LIBCAP_LIBRARIES)
message(FATAL_ERROR "error: couldn't find libcap")
endif()
include(ExternalProject)
# Install libuti.so.* into <prefix>/mck/ so that mcexec can
# redirect ld*.so's access to it. In this way, a.out created
# by Fujitsu MPI and linked to libuti.so in the standard path
# can use the McKernel version when invoked through mcexec.
ExternalProject_Add(libuti
SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR}/uti
BINARY_DIR ${CMAKE_CURRENT_BINARY_DIR}/uti
INSTALL_DIR ${prefix}
CONFIGURE_COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/uti/configure --prefix=<INSTALL_DIR> --libdir=<INSTALL_DIR>/lib64 --disable-static --with-rm=mckernel
BUILD_COMMAND ${MAKE}
BUILD_IN_SOURCE FALSE
INSTALL_COMMAND ${MAKE} install && bash -c "rm ${prefix}/include/uti.h ${prefix}/lib64/libuti.la && [[ -d ${prefix}/lib64/mck ]] || mkdir ${prefix}/lib64/mck && mv ${prefix}/lib64/libuti.* ${prefix}/lib64/mck"
)
endif()

executer/user/lib/uti Submodule

Submodule executer/user/lib/uti added at 8c5a556814


@ -1957,14 +1957,14 @@ opendev()
fprintf(stderr, "%s: warning: LD_PRELOAD line is too long\n", __FUNCTION__); \
return; \
} \
strncat(envbuf, elembuf, remainder); \
strncat(envbuf, elembuf, remainder - 1); \
remainder = PATH_MAX - (strlen(envbuf) + 1); \
nelem++; \
} while (0)
static ssize_t find_libdir(char *libdir, size_t len)
{
FILE *filep;
FILE *filep = NULL;
ssize_t rc;
size_t linelen = 0;
char *line = NULL;
@ -2020,7 +2020,9 @@ static ssize_t find_libdir(char *libdir, size_t len)
}
out:
pclose(filep);
if (filep) {
pclose(filep);
}
free(line);
return rc;
}
@ -2814,6 +2816,7 @@ int main(int argc, char **argv)
desc->mpol_mode, desc->mpol_nodemask[0]);
}
desc->enable_uti = enable_uti;
desc->uti_thread_rank = uti_thread_rank;
desc->uti_use_last_cpu = uti_use_last_cpu;
desc->thp_disable = get_thp_disable();
@ -2960,7 +2963,9 @@ static void kill_thread(unsigned long tid, int sig,
}
}
static long util_thread(struct thread_data_s *my_thread, unsigned long rp_rctx, int remote_tid, unsigned long pattr, unsigned long uti_clv, unsigned long _uti_desc)
static long util_thread(struct thread_data_s *my_thread,
unsigned long rp_rctx, int remote_tid, unsigned long pattr,
unsigned long uti_info, unsigned long _uti_desc)
{
struct uti_get_ctx_desc get_ctx_desc;
struct uti_switch_ctx_desc switch_ctx_desc;
@ -3012,7 +3017,7 @@ static long util_thread(struct thread_data_s *my_thread, unsigned long rp_rctx,
uti_desc->key = get_ctx_desc.key;
uti_desc->pid = getpid();
uti_desc->tid = gettid();
uti_desc->uti_clv = uti_clv;
uti_desc->uti_info = uti_info;
/* Initialize list of syscall arguments for syscall_intercept */
if (sizeof(struct syscall_struct) * 11 > page_size) {
@ -3026,7 +3031,9 @@ static long util_thread(struct thread_data_s *my_thread, unsigned long rp_rctx,
desc.phys_attr = pattr;
desc.uti_cpu_set_str = getenv("UTI_CPU_SET");
desc.uti_cpu_set_len = strlen(desc.uti_cpu_set_str) + 1;
if (desc.uti_cpu_set_str) {
desc.uti_cpu_set_len = strlen(desc.uti_cpu_set_str) + 1;
}
if ((rc = ioctl(fd, MCEXEC_UP_UTI_ATTR, &desc))) {
fprintf(stderr, "%s: error: MCEXEC_UP_UTI_ATTR: %s\n",
@ -3388,6 +3395,29 @@ overlay_path(int dirfd, const char *in, char *buf, int *resolvelinks)
if (!strcmp(path, "/dev/xpmem"))
return "/dev/null";
if (enable_uti && strstr(path, "libuti.so")) {
char libdir[PATH_MAX];
char *basename;
basename = strrchr(path, '/');
if (basename == NULL) {
basename = (char *)path;
} else {
basename++;
}
if (find_libdir(libdir, sizeof(libdir)) < 0) {
fprintf(stderr, "error: failed to find library directory\n");
return in;
}
n = snprintf(buf, PATH_MAX, "%s/mck/%s",
libdir, basename);
__dprintf("%s: %s replaced with %s\n",
__func__, path, buf);
goto checkexist;
}
if (!strncmp(path, "/proc/self", 10) &&
(path[10] == '/' || path[10] == '\0')) {
n = snprintf(buf, PATH_MAX, "/proc/mcos%d/%d%s",
@ -4121,6 +4151,7 @@ int main_loop(struct thread_data_s *my_thread)
#endif
case __NR_gettid:{
int rc = 0;
/*
* Number of TIDs and the remote physical address where TIDs are
* expected are passed in arg 4 and 5, respectively.
@ -4132,6 +4163,7 @@ int main_loop(struct thread_data_s *my_thread)
int *tids = malloc(sizeof(int) * w.sr.args[4]);
if (!tids) {
fprintf(stderr, "__NR_gettid(): error allocating TIDs\n");
rc = -ENOMEM;
goto gettid_out;
}
@ -4152,13 +4184,14 @@ int main_loop(struct thread_data_s *my_thread)
trans.direction = MCEXEC_UP_TRANSFER_TO_REMOTE;
if (ioctl(fd, MCEXEC_UP_TRANSFER, &trans) != 0) {
rc = -EFAULT;
fprintf(stderr, "__NR_gettid(): error transferring TIDs\n");
}
free(tids);
}
gettid_out:
do_syscall_return(fd, cpu, 0, 0, 0, 0, 0);
do_syscall_return(fd, cpu, rc, 0, 0, 0, 0);
break;
}
@ -4818,7 +4851,8 @@ return_execve2:
case __NR_sched_setaffinity:
if (w.sr.args[0] == 0) {
ret = util_thread(my_thread, w.sr.args[1], w.sr.rtid,
w.sr.args[2], w.sr.args[3], w.sr.args[4]);
w.sr.args[2], w.sr.args[3],
w.sr.args[4]);
}
else {
__eprintf("__NR_sched_setaffinity: invalid argument (%lx)\n", w.sr.args[0]);
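
The `libuti.so` branch added to overlay_path() in this file rewrites a library path to `<libdir>/mck/<basename>`. The following stand-alone sketch shows that rewrite in isolation. The name `demo_redirect_libuti` and its return codes are hypothetical, `libdir` is passed in instead of being looked up with find_libdir(), and the truncation check is an addition that is not present in the diff.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper mirroring the overlay_path() branch: rewrite any path
 * containing "libuti.so" to <libdir>/mck/<basename>.
 * Returns 0 on success, 1 if path does not refer to libuti, and -1 if the
 * result does not fit into buf (this check is an assumption, not in the diff). */
int demo_redirect_libuti(const char *path, const char *libdir,
			 char *buf, size_t len)
{
	const char *base;
	int n;

	if (!strstr(path, "libuti.so"))
		return 1;

	/* Take the component after the last '/', or the whole path. */
	base = strrchr(path, '/');
	base = base ? base + 1 : path;

	n = snprintf(buf, len, "%s/mck/%s", libdir, base);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

With these assumptions, `/usr/lib64/libuti.so.0` and a library directory of `/opt/mck/lib64` give `/opt/mck/lib64/mck/libuti.so.0`, which matches the install location chosen by the ExternalProject_Add() step for libuti.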


@ -1,3 +1,4 @@
#define _GNU_SOURCE
#include <libsyscall_intercept_hook_point.h>
#include <errno.h>
#include <stdio.h>
@ -5,13 +6,16 @@
#include <syscall.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/types.h> /* for pid_t in uprotocol.h */
#include "../include/uprotocol.h"
#include "../include/uti.h"
#include "./archdep_uti.h"
#define DEBUG_UTI
static struct uti_desc uti_desc;
#define DEBUG_UTI
static __thread int on_linux = -1;
static int
hook(long syscall_number,
@ -20,22 +24,29 @@ hook(long syscall_number,
long arg4, long arg5,
long *result)
{
//return 1; /* debug */
int tid = uti_syscall0(__NR_gettid);
struct terminate_thread_desc term_desc;
unsigned long code;
int stack_top;
long ret;
if (!uti_desc.start_syscall_intercept) {
return 1; /* System call isn't taken over */
}
if (tid != uti_desc.mck_tid) {
/* new thread */
if (on_linux == -1) {
int tid = uti_syscall0(__NR_gettid);
on_linux = (tid == uti_desc.mck_tid) ? 1 : 0;
}
if (on_linux == 0) {
if (uti_desc.syscalls2 && syscall_number >= 0 && syscall_number < 512) {
uti_desc.syscalls2[syscall_number]++;
}
return 1;
}
#ifdef DEBUG_UTI
if (uti_desc.syscalls && syscall_number >= 0 && syscall_number < 512) {
uti_desc.syscalls[syscall_number]++;
@ -76,7 +87,7 @@ hook(long syscall_number,
uti_desc.syscall_stack[stack_top].args[3] = arg3;
uti_desc.syscall_stack[stack_top].args[4] = arg4;
uti_desc.syscall_stack[stack_top].args[5] = arg5;
uti_desc.syscall_stack[stack_top].uti_clv = uti_desc.uti_clv;
uti_desc.syscall_stack[stack_top].uti_info = uti_desc.uti_info;
uti_desc.syscall_stack[stack_top].ret = -EINVAL;
ret = uti_syscall3(__NR_ioctl, uti_desc.fd,

ihk

Submodule ihk updated: 8830360da3...a98a13ef5f


@ -267,3 +267,154 @@ cpu_sysfs_setup(void)
return;
} /* cpu_sysfs_setup() */
/*
* Generic remote CPU function invocation facility.
*/
void smp_func_call_handler(void)
{
unsigned long irq_flags;
struct smp_func_call_request *req;
int reqs_left;
reiterate:
req = NULL;
reqs_left = 0;
irq_flags = ihk_mc_spinlock_lock(
&cpu_local_var(smp_func_req_lock));
/* Take requests one-by-one */
if (!list_empty(&cpu_local_var(smp_func_req_list))) {
req = list_first_entry(&cpu_local_var(smp_func_req_list),
struct smp_func_call_request, list);
list_del(&req->list);
reqs_left = !list_empty(&cpu_local_var(smp_func_req_list));
}
ihk_mc_spinlock_unlock(&cpu_local_var(smp_func_req_lock),
irq_flags);
if (req) {
req->ret = req->sfcd->func(req->cpu_index,
req->sfcd->nr_cpus, req->sfcd->arg);
ihk_atomic_dec(&req->sfcd->cpus_left);
}
if (reqs_left)
goto reiterate;
}
int smp_call_func(cpu_set_t *__cpu_set, smp_func_t __func, void *__arg)
{
int cpu, nr_cpus = 0;
int cpu_index = 0;
int this_cpu_index = 0;
struct smp_func_call_data sfcd;
struct smp_func_call_request *reqs;
int ret = 0;
int call_on_this_cpu = 0;
cpu_set_t cpu_set;
int max_nr_cpus = 4;
/* Sanity checks */
if (!__cpu_set || !__func) {
return -EINVAL;
}
/* Make sure it won't change in between */
cpu_set = *__cpu_set;
for_each_set_bit(cpu, (unsigned long *)&cpu_set,
sizeof(cpu_set) * BITS_PER_BYTE) {
if (cpu == ihk_mc_get_processor_id()) {
call_on_this_cpu = 1;
}
++nr_cpus;
if (nr_cpus == max_nr_cpus)
break;
}
if (!nr_cpus) {
return -EINVAL;
}
reqs = kmalloc(sizeof(*reqs) * nr_cpus, IHK_MC_AP_NOWAIT);
if (!reqs) {
ret = -ENOMEM;
goto free_out;
}
kprintf("%s: interrupting %d CPUs for SMP call..\n", __func__, nr_cpus);
sfcd.nr_cpus = nr_cpus;
sfcd.func = __func;
sfcd.arg = __arg;
ihk_atomic_set(&sfcd.cpus_left,
call_on_this_cpu ? nr_cpus - 1 : nr_cpus);
smp_wmb();
/* Add requests and send IPIs */
cpu_index = 0;
for_each_set_bit(cpu, (unsigned long *)&cpu_set,
sizeof(cpu_set) * BITS_PER_BYTE) {
unsigned long irq_flags;
reqs[cpu_index].cpu_index = cpu_index;
reqs[cpu_index].ret = 0;
if (cpu == ihk_mc_get_processor_id()) {
this_cpu_index = cpu_index;
++cpu_index;
continue;
}
reqs[cpu_index].sfcd = &sfcd;
irq_flags =
ihk_mc_spinlock_lock(&get_cpu_local_var(cpu)->smp_func_req_lock);
list_add_tail(&reqs[cpu_index].list,
&get_cpu_local_var(cpu)->smp_func_req_list);
ihk_mc_spinlock_unlock(&get_cpu_local_var(cpu)->smp_func_req_lock,
irq_flags);
dkprintf("%s: interrupting IRQ: %d -> CPU: %d\n", __func__,
ihk_mc_get_smp_handler_irq(), cpu);
ihk_mc_interrupt_cpu(cpu, ihk_mc_get_smp_handler_irq());
++cpu_index;
if (cpu_index == max_nr_cpus)
break;
}
/* Is this CPU involved? */
if (call_on_this_cpu) {
reqs[this_cpu_index].ret =
__func(this_cpu_index, nr_cpus, __arg);
}
dkprintf("%s: waiting for remote CPUs..\n", __func__);
/* Wait for the rest of the CPUs */
while (smp_load_acquire(&sfcd.cpus_left.counter) > 0) {
cpu_pause();
}
/* Check return values, if error, report the first non-zero */
for (cpu_index = 0; cpu_index < nr_cpus; ++cpu_index) {
if (reqs[cpu_index].ret != 0) {
ret = reqs[cpu_index].ret;
goto free_out;
}
}
kprintf("%s: all CPUs finished SMP call successfully\n", __func__);
ret = 0;
free_out:
kfree(reqs);
return ret;
}


@ -113,7 +113,7 @@ int devobj_create(int fd, size_t len, off_t off, struct memobj **objp, int *maxp
__FUNCTION__, fd, len, off, result.handle, result.maxprot);
obj->memobj.ops = &devobj_ops;
obj->memobj.flags = MF_HAS_PAGER | MF_DEV_FILE;
obj->memobj.flags = MF_HAS_PAGER | MF_REMAP_FILE_PAGES | MF_DEV_FILE;
obj->memobj.size = len;
ihk_atomic_set(&obj->memobj.refcnt, 1);
obj->handle = result.handle;


@ -236,6 +236,7 @@ int fileobj_create(int fd, struct memobj **objp, int *maxprotp, int flags,
memset(newobj, 0, sizeof(*newobj));
newobj->memobj.ops = &fileobj_ops;
newobj->memobj.flags = MF_HAS_PAGER | MF_REG_FILE |
MF_REMAP_FILE_PAGES |
((flags & MAP_PRIVATE) ? MF_PRIVATE : 0);
newobj->handle = result.handle;


@ -62,7 +62,7 @@
#include <process.h>
#include <futex.h>
#include <jhash.h>
#include <mc_jhash.h>
#include <ihk/lock.h>
#include <ihk/atomic.h>
#include <list.h>
@ -72,39 +72,27 @@
#include <timer.h>
#include <ihk/debug.h>
#include <syscall.h>
//#define DEBUG_PRINT_FUTEX
#ifdef DEBUG_PRINT_FUTEX
#undef DDEBUG_DEFAULT
#define DDEBUG_DEFAULT DDEBUG_PRINT
#define uti_dkprintf(...) do { ((clv_override && linux_printk) ? (*linux_printk) : kprintf)(__VA_ARGS__); } while (0)
#else
#define uti_dkprintf(...) do { } while (0)
#endif
#define uti_kprintf(...) do { ((clv_override && linux_printk) ? (*linux_printk) : kprintf)(__VA_ARGS__); } while (0)
#include <kmalloc.h>
#include <ikc/queue.h>
unsigned long ihk_mc_get_ns_per_tsc(void);
/*
* Hash buckets are shared by all the futex_keys that hash to the same
* location. Each key may have multiple futex_q structures, one for each task
* waiting on a futex.
*/
struct futex_hash_bucket {
ihk_spinlock_t lock;
struct plist_head chain;
};
struct futex_hash_bucket *futex_queues;
static struct futex_hash_bucket futex_queues[1<<FUTEX_HASHBITS];
extern struct ihk_ikc_channel_desc **ikc2linuxs;
struct futex_hash_bucket *get_futex_queues(void)
{
return futex_queues;
}
/*
* We hash on the keys returned from get_futex_key (see below).
*/
static struct futex_hash_bucket *hash_futex(union futex_key *key)
{
uint32_t hash = jhash2((uint32_t*)&key->both.word,
uint32_t hash = mc_jhash2((uint32_t *)&key->both.word,
(sizeof(key->both.word)+sizeof(key->both.ptr))/4,
key->both.offset);
return &futex_queues[hash & ((1 << FUTEX_HASHBITS)-1)];
@ -157,11 +145,11 @@ static void drop_futex_key_refs(union futex_key *key)
* lock_page() might sleep, the caller should not hold a spinlock.
*/
static int
get_futex_key(uint32_t *uaddr, int fshared, union futex_key *key, struct cpu_local_var *clv_override)
get_futex_key(uint32_t *uaddr, int fshared, union futex_key *key)
{
unsigned long address = (unsigned long)uaddr;
unsigned long phys;
struct thread *thread = cpu_local_var_with_override(current, clv_override);
struct thread *thread = cpu_local_var(current);
struct process_vm *mm = thread->vm;
/*
@ -228,7 +216,7 @@ static int cmpxchg_futex_value_locked(uint32_t __user *uaddr, uint32_t uval, uin
* The hash bucket lock must be held when this is called.
* Afterwards, the futex_q must not be accessed.
*/
static void wake_futex(struct futex_q *q, struct cpu_local_var *clv_override)
static void wake_futex(struct futex_q *q)
{
struct thread *p = q->task;
@ -253,26 +241,30 @@ static void wake_futex(struct futex_q *q, struct cpu_local_var *clv_override)
if (q->uti_futex_resp) {
int rc;
uti_dkprintf("wake_futex(): waking up migrated-to-Linux thread (tid %d),uti_futex_resp=%p\n", p->tid, q->uti_futex_resp);
/* TODO: Add the case when a Linux thread waking up another Linux thread */
if (clv_override) {
uti_dkprintf("%s: ERROR: A Linux thread is waking up migrated-to-Linux thread\n", __FUNCTION__);
}
if (p->spin_sleep == 0) {
uti_dkprintf("%s: INFO: woken up by someone else\n", __FUNCTION__);
struct ikc_scd_packet pckt;
struct ihk_ikc_channel_desc *resp_channel;
dkprintf("%s: waking up migrated-to-Linux thread (tid %d),uti_futex_resp=%p,linux_cpu: %d\n",
__func__, p->tid, q->uti_futex_resp, q->linux_cpu);
/* does this Linux CPU have a connected channel? */
if (ikc2linuxs[q->linux_cpu]) {
resp_channel = ikc2linuxs[q->linux_cpu];
} else {
resp_channel = cpu_local_var(ikc2linux);
}
struct ikc_scd_packet pckt;
struct ihk_ikc_channel_desc *resp_channel = cpu_local_var_with_override(ikc2linux, clv_override);
pckt.msg = SCD_MSG_FUTEX_WAKE;
pckt.futex.resp = q->uti_futex_resp;
pckt.futex.spin_sleep = &p->spin_sleep;
rc = ihk_ikc_send(resp_channel, &pckt, 0);
if (rc) {
uti_dkprintf("%s: ERROR: ihk_ikc_send returned %d, resp_channel=%p\n", __FUNCTION__, rc, resp_channel);
dkprintf("%s: ERROR: ihk_ikc_send returned %d, resp_channel=%p\n",
__func__, rc, resp_channel);
}
} else {
uti_dkprintf("wake_futex(): waking up McKernel thread (tid %d)\n", p->tid);
dkprintf("%s: waking up McKernel thread (tid %d)\n",
__func__, p->tid);
sched_wakeup_thread(p, PS_NORMAL);
}
}
@ -304,7 +296,8 @@ double_unlock_hb(struct futex_hash_bucket *hb1, struct futex_hash_bucket *hb2)
/*
* Wake up waiters matching bitset queued on this futex (uaddr).
*/
static int futex_wake(uint32_t *uaddr, int fshared, int nr_wake, uint32_t bitset, struct cpu_local_var *clv_override)
static int futex_wake(uint32_t *uaddr, int fshared, int nr_wake,
uint32_t bitset)
{
struct futex_hash_bucket *hb;
struct futex_q *this, *next;
@ -316,7 +309,7 @@ static int futex_wake(uint32_t *uaddr, int fshared, int nr_wake, uint32_t bitset
if (!bitset)
return -EINVAL;
ret = get_futex_key(uaddr, fshared, &key, clv_override);
ret = get_futex_key(uaddr, fshared, &key);
if ((ret != 0))
goto out;
@ -332,7 +325,7 @@ static int futex_wake(uint32_t *uaddr, int fshared, int nr_wake, uint32_t bitset
if (!(this->bitset & bitset))
continue;
wake_futex(this, clv_override);
wake_futex(this);
if (++ret >= nr_wake)
break;
}
@ -350,8 +343,7 @@ out:
*/
static int
futex_wake_op(uint32_t *uaddr1, int fshared, uint32_t *uaddr2,
int nr_wake, int nr_wake2, int op,
struct cpu_local_var *clv_override)
int nr_wake, int nr_wake2, int op)
{
union futex_key key1 = FUTEX_KEY_INIT, key2 = FUTEX_KEY_INIT;
struct futex_hash_bucket *hb1, *hb2;
@ -360,10 +352,10 @@ futex_wake_op(uint32_t *uaddr1, int fshared, uint32_t *uaddr2,
int ret, op_ret;
retry:
ret = get_futex_key(uaddr1, fshared, &key1, clv_override);
ret = get_futex_key(uaddr1, fshared, &key1);
if ((ret != 0))
goto out;
ret = get_futex_key(uaddr2, fshared, &key2, clv_override);
ret = get_futex_key(uaddr2, fshared, &key2);
if ((ret != 0))
goto out_put_key1;
@ -397,7 +389,7 @@ retry_private:
plist_for_each_entry_safe(this, next, head, list) {
if (match_futex (&this->key, &key1)) {
wake_futex(this, clv_override);
wake_futex(this);
if (++ret >= nr_wake)
break;
}
@ -409,7 +401,7 @@ retry_private:
op_ret = 0;
plist_for_each_entry_safe(this, next, head, list) {
if (match_futex (&this->key, &key2)) {
wake_futex(this, clv_override);
wake_futex(this);
if (++op_ret >= nr_wake2)
break;
}
@ -471,8 +463,8 @@ void requeue_futex(struct futex_q *q, struct futex_hash_bucket *hb1,
* <0 - on error
*/
static int futex_requeue(uint32_t *uaddr1, int fshared, uint32_t *uaddr2,
int nr_wake, int nr_requeue, uint32_t *cmpval,
int requeue_pi, struct cpu_local_var *clv_override)
int nr_wake, int nr_requeue, uint32_t *cmpval,
int requeue_pi)
{
union futex_key key1 = FUTEX_KEY_INIT, key2 = FUTEX_KEY_INIT;
int drop_count = 0, task_count = 0, ret;
@ -480,10 +472,10 @@ static int futex_requeue(uint32_t *uaddr1, int fshared, uint32_t *uaddr2,
struct plist_head *head1;
struct futex_q *this, *next;
ret = get_futex_key(uaddr1, fshared, &key1, clv_override);
ret = get_futex_key(uaddr1, fshared, &key1);
if ((ret != 0))
goto out;
ret = get_futex_key(uaddr2, fshared, &key2, clv_override);
ret = get_futex_key(uaddr2, fshared, &key2);
if ((ret != 0))
goto out_put_key1;
@ -518,7 +510,7 @@ static int futex_requeue(uint32_t *uaddr1, int fshared, uint32_t *uaddr2,
*/
/* RIKEN: no requeue_pi at this moment */
if (++task_count <= nr_wake) {
wake_futex(this, clv_override);
wake_futex(this);
continue;
}
@ -577,9 +569,12 @@ queue_unlock(struct futex_q *q, struct futex_hash_bucket *hb)
* state is implicit in the state of woken task (see futex_wait_requeue_pi() for
* an example).
*/
static inline void queue_me(struct futex_q *q, struct futex_hash_bucket *hb, struct cpu_local_var *clv_override)
static inline void queue_me(struct futex_q *q, struct futex_hash_bucket *hb)
{
int prio;
struct thread *thread = cpu_local_var(current);
ihk_spinlock_t *_runq_lock = &cpu_local_var(runq_lock);
unsigned int *_flags = &cpu_local_var(flags);
/*
* The priority used to register this element is
@ -598,7 +593,19 @@ static inline void queue_me(struct futex_q *q, struct futex_hash_bucket *hb, str
q->list.plist.spinlock = &hb->lock;
#endif
plist_add(&q->list, &hb->chain);
q->task = cpu_local_var_with_override(current, clv_override);
/* Store information about the waiting thread for uti-futex */
q->task = thread;
q->th_spin_sleep_pa = virt_to_phys((void *)&thread->spin_sleep);
q->th_status_pa = virt_to_phys((void *)&thread->status);
q->th_spin_sleep_lock_pa = virt_to_phys((void *)&thread->spin_sleep_lock);
q->proc_status_pa = virt_to_phys((void *)&thread->proc->status);
q->proc_update_lock_pa = virt_to_phys((void *)&thread->proc->update_lock);
q->runq_lock_pa = virt_to_phys((void *)_runq_lock);
q->clv_flags_pa = virt_to_phys((void *)_flags);
q->intr_id = ihk_mc_get_interrupt_id(thread->cpu_id);
q->intr_vector = ihk_mc_get_vector(IHK_GV_IKC);
ihk_mc_spinlock_unlock_noirq(&hb->lock);
}
@ -661,12 +668,12 @@ retry:
/* RIKEN: this function has been rewritten so that it returns the remaining
* time in case we are woken.
*/
static int64_t futex_wait_queue_me(struct futex_hash_bucket *hb, struct futex_q *q,
uint64_t timeout, struct cpu_local_var *clv_override)
static int64_t futex_wait_queue_me(struct futex_hash_bucket *hb,
struct futex_q *q, uint64_t timeout)
{
int64_t time_remain = 0;
unsigned long irqstate;
struct thread *thread = cpu_local_var_with_override(current, clv_override);
struct thread *thread = cpu_local_var(current);
/*
* The task state is guaranteed to be set before another task can
* wake it.
@ -685,25 +692,9 @@ static int64_t futex_wait_queue_me(struct futex_hash_bucket *hb, struct futex_q
ihk_mc_spinlock_unlock(&thread->spin_sleep_lock, irqstate);
}
queue_me(q, hb, clv_override);
queue_me(q, hb);
if (!plist_node_empty(&q->list)) {
if (clv_override) {
uti_dkprintf("%s: tid: %d is trying to sleep\n", __FUNCTION__, thread->tid);
/* Note that the unit of timeout is nsec */
time_remain = (*linux_wait_event)(q->uti_futex_resp, timeout);
/* Note that time_remain == 0 indicates that the condition evaluated to false after the timeout elapsed */
if (time_remain < 0) {
if (time_remain == -ERESTARTSYS) { /* Interrupted by signal */
uti_dkprintf("%s: DEBUG: wait_event returned -ERESTARTSYS\n", __FUNCTION__);
} else {
uti_kprintf("%s: ERROR: wait_event returned %d\n", __FUNCTION__, time_remain);
}
}
uti_dkprintf("%s: tid: %d woken up\n", __FUNCTION__, thread->tid);
} else {
if (timeout) {
dkprintf("futex_wait_queue_me(): tid: %d schedule_timeout()\n", thread->tid);
time_remain = schedule_timeout(timeout);
@ -714,7 +705,6 @@ static int64_t futex_wait_queue_me(struct futex_hash_bucket *hb, struct futex_q
time_remain = 0;
}
dkprintf("futex_wait_queue_me(): tid: %d woken up\n", thread->tid);
}
}
/* This does not need to be serialized */
@ -742,8 +732,7 @@ static int64_t futex_wait_queue_me(struct futex_hash_bucket *hb, struct futex_q
* <1 - -EFAULT or -EWOULDBLOCK (uaddr does not contain val) and hb is unlocked
*/
static int futex_wait_setup(uint32_t __user *uaddr, uint32_t val, int fshared,
struct futex_q *q, struct futex_hash_bucket **hb,
struct cpu_local_var *clv_override)
struct futex_q *q, struct futex_hash_bucket **hb)
{
uint32_t uval;
int ret;
@ -766,7 +755,7 @@ static int futex_wait_setup(uint32_t __user *uaddr, uint32_t val, int fshared,
* rare, but normal.
*/
q->key = FUTEX_KEY_INIT;
ret = get_futex_key(uaddr, fshared, &q->key, clv_override);
ret = get_futex_key(uaddr, fshared, &q->key);
if (ret != 0)
return ret;
@ -790,8 +779,7 @@ static int futex_wait_setup(uint32_t __user *uaddr, uint32_t val, int fshared,
}
static int futex_wait(uint32_t __user *uaddr, int fshared,
uint32_t val, uint64_t timeout, uint32_t bitset, int clockrt,
struct cpu_local_var *clv_override)
uint32_t val, uint64_t timeout, uint32_t bitset, int clockrt)
{
struct futex_hash_bucket *hb;
int64_t time_remain;
@ -802,57 +790,55 @@ static int futex_wait(uint32_t __user *uaddr, int fshared,
if (!bitset)
return -EINVAL;
if (!clv_override) {
q = &lq;
}
else {
q = &cpu_local_var_with_override(current,
clv_override)->futex_q;
}
q = &lq;
#ifdef PROFILE_ENABLE
if (cpu_local_var_with_override(current, clv_override)->profile &&
cpu_local_var_with_override(current, clv_override)->profile_start_ts) {
cpu_local_var_with_override(current, clv_override)->profile_elapsed_ts +=
(rdtsc() - cpu_local_var_with_override(current, clv_override)->profile_start_ts);
cpu_local_var_with_override(current, clv_override)->profile_start_ts = 0;
if (cpu_local_var(current)->profile &&
cpu_local_var(current)->profile_start_ts) {
cpu_local_var(current)->profile_elapsed_ts +=
(rdtsc() - cpu_local_var(current)->profile_start_ts);
cpu_local_var(current)->profile_start_ts = 0;
}
#endif
q->bitset = bitset;
q->requeue_pi_key = NULL;
q->uti_futex_resp = cpu_local_var_with_override(uti_futex_resp,
clv_override);
q->uti_futex_resp = cpu_local_var(uti_futex_resp);
retry:
/* Prepare to wait on uaddr. */
ret = futex_wait_setup(uaddr, val, fshared, q, &hb, clv_override);
ret = futex_wait_setup(uaddr, val, fshared, q, &hb);
if (ret) {
uti_dkprintf("%s: tid=%d futex_wait_setup returned non-zero, no need to sleep\n", __FUNCTION__, cpu_local_var_with_override(current, clv_override)->tid);
dkprintf("%s: tid=%d futex_wait_setup returned non-zero, no need to sleep\n",
__func__, cpu_local_var(current)->tid);
goto out;
}
/* queue_me and wait for wakeup, timeout, or a signal. */
time_remain = futex_wait_queue_me(hb, q, timeout, clv_override);
time_remain = futex_wait_queue_me(hb, q, timeout);
/* If we were woken (and unqueued), we succeeded, whatever. */
ret = 0;
if (!unqueue_me(q)) {
uti_dkprintf("%s: tid=%d unqueued\n", __FUNCTION__, cpu_local_var_with_override(current, clv_override)->tid);
dkprintf("%s: tid=%d unqueued\n",
__func__, cpu_local_var(current)->tid);
goto out_put_key;
}
ret = -ETIMEDOUT;
/* RIKEN: timer expired case (indicated by !time_remain) */
if (timeout && !time_remain) {
uti_dkprintf("%s: tid=%d timer expired\n", __FUNCTION__, cpu_local_var_with_override(current, clv_override)->tid);
dkprintf("%s: tid=%d timer expired\n",
__func__, cpu_local_var(current)->tid);
goto out_put_key;
}
/* RIKEN: futex_wait_queue_me() returns -ERESTARTSYS when waiting on Linux CPU and woken up by signal */
if (hassigpending(cpu_local_var_with_override(current, clv_override)) || time_remain == -ERESTARTSYS) {
if (hassigpending(cpu_local_var(current)) ||
time_remain == -ERESTARTSYS) {
ret = -EINTR;
uti_dkprintf("%s: tid=%d woken up by signal\n", __FUNCTION__, cpu_local_var_with_override(current, clv_override)->tid);
dkprintf("%s: tid=%d woken up by signal\n",
__func__, cpu_local_var(current)->tid);
goto out_put_key;
}
@ -864,21 +850,22 @@ out_put_key:
put_futex_key(fshared, &q->key);
out:
#ifdef PROFILE_ENABLE
if (cpu_local_var_with_override(current, clv_override)->profile) {
cpu_local_var_with_override(current, clv_override)->profile_start_ts = rdtsc();
if (cpu_local_var(current)->profile) {
cpu_local_var(current)->profile_start_ts = rdtsc();
}
#endif
return ret;
}
int futex(uint32_t *uaddr, int op, uint32_t val, uint64_t timeout,
uint32_t *uaddr2, uint32_t val2, uint32_t val3, int fshared,
struct cpu_local_var *clv_override)
uint32_t *uaddr2, uint32_t val2, uint32_t val3, int fshared)
{
int clockrt, ret = -ENOSYS;
int cmd = op & FUTEX_CMD_MASK;
uti_dkprintf("%s: uaddr=%p, op=%x, val=%x, timeout=%ld, uaddr2=%p, val2=%x, val3=%x, fshared=%d, clv=%p\n", __FUNCTION__, uaddr, op, val, timeout, uaddr2, val2, val3, fshared, clv_override);
dkprintf("%s: uaddr=%p, op=%x, val=%x, timeout=%ld, uaddr2=%p, val2=%x, val3=%x, fshared=%d\n",
__func__, uaddr, op, val, timeout, uaddr2,
val2, val3, fshared);
clockrt = op & FUTEX_CLOCK_REALTIME;
if (clockrt && cmd != FUTEX_WAIT_BITSET && cmd != FUTEX_WAIT_REQUEUE_PI)
@ -888,21 +875,23 @@ int futex(uint32_t *uaddr, int op, uint32_t val, uint64_t timeout,
case FUTEX_WAIT:
val3 = FUTEX_BITSET_MATCH_ANY;
case FUTEX_WAIT_BITSET:
ret = futex_wait(uaddr, fshared, val, timeout, val3, clockrt, clv_override);
ret = futex_wait(uaddr, fshared, val, timeout, val3, clockrt);
break;
case FUTEX_WAKE:
val3 = FUTEX_BITSET_MATCH_ANY;
case FUTEX_WAKE_BITSET:
ret = futex_wake(uaddr, fshared, val, val3, clv_override);
ret = futex_wake(uaddr, fshared, val, val3);
break;
case FUTEX_REQUEUE:
ret = futex_requeue(uaddr, fshared, uaddr2, val, val2, NULL, 0, clv_override);
ret = futex_requeue(uaddr, fshared, uaddr2,
val, val2, NULL, 0);
break;
case FUTEX_CMP_REQUEUE:
ret = futex_requeue(uaddr, fshared, uaddr2, val, val2, &val3, 0, clv_override);
ret = futex_requeue(uaddr, fshared, uaddr2,
val, val2, &val3, 0);
break;
case FUTEX_WAKE_OP:
ret = futex_wake_op(uaddr, fshared, uaddr2, val, val2, val3, clv_override);
ret = futex_wake_op(uaddr, fshared, uaddr2, val, val2, val3);
break;
/* RIKEN: these calls are not supported for now.
case FUTEX_LOCK_PI:
@ -942,7 +931,9 @@ int futex_init(void)
{
int i;
for (i = 0; i < ARRAY_SIZE(futex_queues); i++) {
futex_queues = kmalloc(sizeof(struct futex_hash_bucket) *
(1 << FUTEX_HASHBITS), IHK_MC_AP_NOWAIT);
for (i = 0; i < (1 << FUTEX_HASHBITS); i++) {
plist_head_init(&futex_queues[i].chain, &futex_queues[i].lock);
ihk_mc_spinlock_init(&futex_queues[i].lock);
}
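
Reviewer note: the dispatch in futex() above relies on case fall-through, so a plain FUTEX_WAIT or FUTEX_WAKE is handled as the bitset variant with val3 forced to FUTEX_BITSET_MATCH_ANY. The following is a minimal, standalone sketch of that idea only. The DEMO_ constants are local copies of the Linux uapi futex values, and demo_effective_bitset() is a hypothetical helper, not part of this change; unlike the real code it folds wait and wake into one path.

```c
#include <stdint.h>

/* Local copies of the Linux uapi futex constants (DEMO_ prefix = illustration only). */
#define DEMO_FUTEX_WAIT            0
#define DEMO_FUTEX_WAKE            1
#define DEMO_FUTEX_WAIT_BITSET     9
#define DEMO_FUTEX_WAKE_BITSET     10
#define DEMO_FUTEX_PRIVATE_FLAG    128
#define DEMO_FUTEX_CLOCK_REALTIME  256
#define DEMO_FUTEX_CMD_MASK \
	(~(DEMO_FUTEX_PRIVATE_FLAG | DEMO_FUTEX_CLOCK_REALTIME))
#define DEMO_BITSET_MATCH_ANY      0xffffffffu

/*
 * Hypothetical helper: returns the bitset a wait/wake operation would
 * end up using, or 0 for any other command.
 */
uint32_t demo_effective_bitset(int op, uint32_t val3)
{
	int cmd = op & DEMO_FUTEX_CMD_MASK;   /* strip PRIVATE/CLOCK_REALTIME flags */

	switch (cmd) {
	case DEMO_FUTEX_WAIT:
	case DEMO_FUTEX_WAKE:
		/* Non-bitset variants match every waiter. */
		val3 = DEMO_BITSET_MATCH_ANY;
		/* fall through */
	case DEMO_FUTEX_WAIT_BITSET:
	case DEMO_FUTEX_WAKE_BITSET:
		return val3;
	default:
		return 0;
	}
}
```

For example, demo_effective_bitset(DEMO_FUTEX_WAIT | DEMO_FUTEX_PRIVATE_FLAG, 0x5) ignores the caller's 0x5 and yields the match-any mask, while the _BITSET commands keep the caller's value.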


@ -43,7 +43,7 @@
#endif
/* Linux channel table, indexed by Linux CPU id */
static struct ihk_ikc_channel_desc **ikc2linuxs = NULL;
struct ihk_ikc_channel_desc **ikc2linuxs;
void check_mapping_for_proc(struct thread *thread, unsigned long addr)
{
@ -563,6 +563,7 @@ static int process_msg_prepare_process(unsigned long rphys)
__func__, vm->numa_mem_policy, vm->numa_mask[0]);
}
proc->enable_uti = pn->enable_uti;
proc->uti_thread_rank = pn->uti_thread_rank;
proc->uti_use_last_cpu = pn->uti_use_last_cpu;


@ -106,9 +106,7 @@ struct cpu_local_var {
ihk_spinlock_t migq_lock;
struct list_head migq;
int in_interrupt;
#ifdef ENABLE_FUGAKU_HACKS
int in_page_fault;
#endif
int no_preempt;
int timer_enabled;
unsigned long nr_ctx_switches;


@ -128,6 +128,26 @@
struct process_vm;
static inline int get_futex_value_locked(uint32_t *dest, uint32_t *from)
{
*dest = *(volatile uint32_t *)from;
return 0;
}
/*
* Hash buckets are shared by all the futex_keys that hash to the same
* location. Each key may have multiple futex_q structures, one for each task
* waiting on a futex.
*/
struct futex_hash_bucket {
ihk_spinlock_t lock;
struct plist_head chain;
};
struct futex_hash_bucket *get_futex_queues(void);
union futex_key {
struct {
unsigned long pgoff;
@ -161,8 +181,7 @@ futex(
uint32_t __user * uaddr2,
uint32_t val2,
uint32_t val3,
int fshared,
struct cpu_local_var *clv_override
int fshared
);
@ -196,6 +215,28 @@ struct futex_q {
/* Used to wake-up a thread running on a Linux CPU */
void *uti_futex_resp;
/* Used to send IPI directly to the waiter CPU */
int linux_cpu;
/* Used to wake-up a thread running on a McKernel from Linux */
void *th_spin_sleep;
void *th_status;
void *th_spin_sleep_lock;
void *proc_status;
void *proc_update_lock;
void *runq_lock;
void *clv_flags;
int intr_id;
int intr_vector;
unsigned long th_spin_sleep_pa;
unsigned long th_status_pa;
unsigned long th_spin_sleep_lock_pa;
unsigned long proc_status_pa;
unsigned long proc_update_lock_pa;
unsigned long runq_lock_pa;
unsigned long clv_flags_pa;
};
#endif


@ -1,158 +0,0 @@
#ifndef _LINUX_JHASH_H
#define _LINUX_JHASH_H
/**
* \file futex.c
* Licence details are found in the file LICENSE.
*
* \brief
* Adaptation to McKernel
*
* \author Balazs Gerofi <bgerofi@riken.jp> \par
* Copyright (C) 2012 RIKEN AICS
*
*
* HISTORY:
*/
/*
* jhash.h: Jenkins hash support.
*
* Copyright (C) 1996 Bob Jenkins (bob_jenkins@burtleburtle.net)
*
* http://burtleburtle.net/bob/hash/
*
* These are the credits from Bob's sources:
*
* lookup2.c, by Bob Jenkins, December 1996, Public Domain.
* hash(), hash2(), hash3, and mix() are externally useful functions.
* Routines to test the hash are included if SELF_TEST is defined.
* You can use this free for any purpose. It has no warranty.
*
* Copyright (C) 2003 David S. Miller (davem@redhat.com)
*
* I've modified Bob's hash to be useful in the Linux kernel, and
* any bugs present are surely my fault. -DaveM
*
*/
/* NOTE: Arguments are modified. */
#define __jhash_mix(a, b, c) \
{ \
a -= b; a -= c; a ^= (c>>13); \
b -= c; b -= a; b ^= (a<<8); \
c -= a; c -= b; c ^= (b>>13); \
a -= b; a -= c; a ^= (c>>12); \
b -= c; b -= a; b ^= (a<<16); \
c -= a; c -= b; c ^= (b>>5); \
a -= b; a -= c; a ^= (c>>3); \
b -= c; b -= a; b ^= (a<<10); \
c -= a; c -= b; c ^= (b>>15); \
}
/* The golden ration: an arbitrary value */
#define JHASH_GOLDEN_RATIO 0x9e3779b9
/* The most generic version, hashes an arbitrary sequence
* of bytes. No alignment or length assumptions are made about
* the input key.
*/
static inline uint32_t jhash(const void *key, uint32_t length, uint32_t initval)
{
uint32_t a, b, c, len;
const uint8_t *k = key;
len = length;
a = b = JHASH_GOLDEN_RATIO;
c = initval;
while (len >= 12) {
a += (k[0] +((uint32_t)k[1]<<8) +((uint32_t)k[2]<<16) +((uint32_t)k[3]<<24));
b += (k[4] +((uint32_t)k[5]<<8) +((uint32_t)k[6]<<16) +((uint32_t)k[7]<<24));
c += (k[8] +((uint32_t)k[9]<<8) +((uint32_t)k[10]<<16)+((uint32_t)k[11]<<24));
__jhash_mix(a,b,c);
k += 12;
len -= 12;
}
c += length;
switch (len) {
case 11: c += ((uint32_t)k[10]<<24);
case 10: c += ((uint32_t)k[9]<<16);
case 9 : c += ((uint32_t)k[8]<<8);
case 8 : b += ((uint32_t)k[7]<<24);
case 7 : b += ((uint32_t)k[6]<<16);
case 6 : b += ((uint32_t)k[5]<<8);
case 5 : b += k[4];
case 4 : a += ((uint32_t)k[3]<<24);
case 3 : a += ((uint32_t)k[2]<<16);
case 2 : a += ((uint32_t)k[1]<<8);
case 1 : a += k[0];
};
__jhash_mix(a,b,c);
return c;
}
/* A special optimized version that handles 1 or more of uint32_ts.
* The length parameter here is the number of uint32_ts in the key.
*/
static inline uint32_t jhash2(const uint32_t *k, uint32_t length, uint32_t initval)
{
uint32_t a, b, c, len;
a = b = JHASH_GOLDEN_RATIO;
c = initval;
len = length;
while (len >= 3) {
a += k[0];
b += k[1];
c += k[2];
__jhash_mix(a, b, c);
k += 3; len -= 3;
}
c += length * 4;
switch (len) {
case 2 : b += k[1];
case 1 : a += k[0];
};
__jhash_mix(a,b,c);
return c;
}
/* A special ultra-optimized versions that knows they are hashing exactly
* 3, 2 or 1 word(s).
*
* NOTE: In partilar the "c += length; __jhash_mix(a,b,c);" normally
* done at the end is not done here.
*/
static inline uint32_t jhash_3words(uint32_t a, uint32_t b, uint32_t c, uint32_t initval)
{
a += JHASH_GOLDEN_RATIO;
b += JHASH_GOLDEN_RATIO;
c += initval;
__jhash_mix(a, b, c);
return c;
}
static inline uint32_t jhash_2words(uint32_t a, uint32_t b, uint32_t initval)
{
return jhash_3words(a, b, 0, initval);
}
static inline uint32_t jhash_1word(uint32_t a, uint32_t initval)
{
return jhash_3words(a, 0, 0, initval);
}
#endif /* _LINUX_JHASH_H */

kernel/include/mc_jhash.h Normal file

@ -0,0 +1,88 @@
#ifndef _MC_JHASH_H
#define _MC_JHASH_H
/**
* \file mc_jhash.h
* Licence details are found in the file LICENSE.
*
* \brief
* Adaptation to McKernel
*
* \author Balazs Gerofi <bgerofi@riken.jp> \par
* Copyright (C) 2012 RIKEN AICS
*
*
* HISTORY:
*/
/*
* jhash.h: Jenkins hash support.
*
* Copyright (C) 1996 Bob Jenkins (bob_jenkins@burtleburtle.net)
*
* http://burtleburtle.net/bob/hash/
*
* These are the credits from Bob's sources:
*
* lookup2.c, by Bob Jenkins, December 1996, Public Domain.
* hash(), hash2(), hash3, and mix() are externally useful functions.
* Routines to test the hash are included if SELF_TEST is defined.
* You can use this free for any purpose. It has no warranty.
*
* Copyright (C) 2003 David S. Miller (davem@redhat.com)
*
* I've modified Bob's hash to be useful in the Linux kernel, and
* any bugs present are surely my fault. -DaveM
*
*/
/* NOTE: Arguments are modified. */
#define __mc_jhash_mix(a, b, c) \
{ \
a -= b; a -= c; a ^= (c>>13); \
b -= c; b -= a; b ^= (a<<8); \
c -= a; c -= b; c ^= (b>>13); \
a -= b; a -= c; a ^= (c>>12); \
b -= c; b -= a; b ^= (a<<16); \
c -= a; c -= b; c ^= (b>>5); \
a -= b; a -= c; a ^= (c>>3); \
b -= c; b -= a; b ^= (a<<10); \
c -= a; c -= b; c ^= (b>>15); \
}
/* The golden ration: an arbitrary value */
#define JHASH_GOLDEN_RATIO 0x9e3779b9
/* A special optimized version that handles 1 or more of uint32_ts.
* The length parameter here is the number of uint32_ts in the key.
*/
static inline uint32_t mc_jhash2(const uint32_t *k, uint32_t length, uint32_t initval)
{
uint32_t a, b, c, len;
a = b = JHASH_GOLDEN_RATIO;
c = initval;
len = length;
while (len >= 3) {
a += k[0];
b += k[1];
c += k[2];
__mc_jhash_mix(a, b, c);
k += 3; len -= 3;
}
c += length * 4;
switch (len) {
case 2:
b += k[1];
case 1:
a += k[0];
};
__mc_jhash_mix(a, b, c);
return c;
}
#endif /* _MC_JHASH_H */
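
Reviewer note: a word-oriented Jenkins hash like mc_jhash2() is typically used to map a key to a hash bucket. The sketch below restates the same mixing steps under local names and masks the result down to a bucket index. DEMO_HASHBITS is an assumed value (the real FUTEX_HASHBITS is not shown in this diff), and demo_bucket_index() is a hypothetical helper that is not part of this change.

```c
#include <stdint.h>

#define DEMO_GOLDEN_RATIO 0x9e3779b9u
#define DEMO_HASHBITS     8	/* assumption: stands in for FUTEX_HASHBITS */

/* Same mixing steps as __mc_jhash_mix(); the arguments are modified. */
#define DEMO_MIX(a, b, c) do {			\
	a -= b; a -= c; a ^= (c >> 13);		\
	b -= c; b -= a; b ^= (a << 8);		\
	c -= a; c -= b; c ^= (b >> 13);		\
	a -= b; a -= c; a ^= (c >> 12);		\
	b -= c; b -= a; b ^= (a << 16);		\
	c -= a; c -= b; c ^= (b >> 5);		\
	a -= b; a -= c; a ^= (c >> 3);		\
	b -= c; b -= a; b ^= (a << 10);		\
	c -= a; c -= b; c ^= (b >> 15);		\
} while (0)

/* Hashes `length` 32-bit words, three at a time, then the 0-2 leftover words. */
uint32_t demo_jhash2(const uint32_t *k, uint32_t length, uint32_t initval)
{
	uint32_t a = DEMO_GOLDEN_RATIO, b = DEMO_GOLDEN_RATIO;
	uint32_t c = initval, len = length;

	while (len >= 3) {
		a += k[0];
		b += k[1];
		c += k[2];
		DEMO_MIX(a, b, c);
		k += 3;
		len -= 3;
	}
	c += length * 4;
	switch (len) {
	case 2:
		b += k[1];
		/* fall through */
	case 1:
		a += k[0];
	}
	DEMO_MIX(a, b, c);
	return c;
}

/* Hypothetical helper: reduce the hash to an index into 2^DEMO_HASHBITS buckets. */
uint32_t demo_bucket_index(const uint32_t *key, uint32_t nwords, uint32_t seed)
{
	return demo_jhash2(key, nwords, seed) & ((1u << DEMO_HASHBITS) - 1);
}
```

Masking with (1 << bits) - 1 only works because the bucket count is a power of two, which matches how futex_init() sizes the table as 1 << FUTEX_HASHBITS.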


@ -37,6 +37,7 @@ enum {
MF_SHM = 0x40000,
MF_HUGETLBFS = 0x100000,
MF_PRIVATE = 0x200000, /* To prevent flush in clear_range_* */
MF_REMAP_FILE_PAGES = 0x400000, /* remap_file_pages possible */
};
#define MEMOBJ_READY 0
@ -181,4 +182,11 @@ static inline int is_freeable(struct memobj *memobj)
return 1;
}
static inline int is_callable_remap_file_pages(struct memobj *memobj)
{
if (!memobj || !(memobj->flags & MF_REMAP_FILE_PAGES))
return 0;
return 1;
}
#endif /* HEADER_MEMOBJ_H */


@ -406,6 +406,7 @@ struct vm_range_numa_policy {
unsigned long start, end;
DECLARE_BITMAP(numa_mask, PROCESS_NUMA_MASK_BITS);
int numa_mem_policy;
int il_prev;
};
struct vm_regions {
@ -564,6 +565,7 @@ struct process {
unsigned long mpol_bind_mask;
int mpol_mode;
int enable_uti;
int uti_thread_rank; /* Spawn on Linux CPU when clone_count reaches this */
int uti_use_last_cpu; /* Work-around not to share CPU with OpenMP thread */
int clone_count;
@ -797,6 +799,7 @@ struct process_vm {
long currss;
DECLARE_BITMAP(numa_mask, PROCESS_NUMA_MASK_BITS);
int numa_mem_policy;
int il_prev;
/* Protected by memory_range_lock */
struct rb_root vm_range_numa_policy_tree;
struct vm_range *range_cache[VM_RANGE_CACHE_SIZE];


@ -235,6 +235,7 @@ struct program_load_desc {
(sizeof(unsigned long) * 8)];
int thp_disable;
int enable_uti;
int uti_thread_rank; /* N-th clone() spawns a thread on Linux CPU */
int uti_use_last_cpu; /* Work-around not to share CPU with OpenMP thread */
int straight_map;
@ -679,4 +680,5 @@ extern int (*linux_clock_gettime)(clockid_t clk_id, struct timespec *tp);
extern void terminate_host(int pid, struct thread *thread);
struct sig_pending *getsigpending(struct thread *thread, int delflag);
int interrupt_from_user(void *regs0);
extern unsigned long shmid_index[];
#endif


@ -252,7 +252,6 @@ static void nmi_init()
static void uti_init()
{
ihk_set_mckernel_do_futex((unsigned long)do_futex);
}
static void rest_init(void)


@ -523,6 +523,18 @@ static void reserve_pages(struct ihk_page_allocator_desc *pa_allocator,
ihk_pagealloc_reserve(pa_allocator, start, end);
}
static int interleave_nodes(int off, unsigned long *numa_mask)
{
int next;
next = find_next_bit(numa_mask, PROCESS_NUMA_MASK_BITS, off + 1);
if (next >= PROCESS_NUMA_MASK_BITS) {
next = find_first_bit(numa_mask, PROCESS_NUMA_MASK_BITS);
}
return next;
}
extern int cpu_local_var_initialized;
static void *mckernel_allocate_aligned_pages_node(int npages, int p2align,
ihk_mc_ap_flag flag, int pref_node, int is_user, uintptr_t virt_addr)
@ -538,7 +550,9 @@ static void *mckernel_allocate_aligned_pages_node(int npages, int p2align,
int numa_mem_policy = -1;
struct process_vm *vm;
struct vm_range *range = NULL;
int chk_shm = 0;
int chk_shm = 0, il_start, looping;
int *il_prev = NULL;
unsigned long *numa_mask = NULL;
if(npages <= 0)
return NULL;
@ -549,31 +563,39 @@ static void *mckernel_allocate_aligned_pages_node(int npages, int p2align,
!cpu_local_var(current)->vm)
goto distance_based;
/* No explicitly requested NUMA or user policy? */
if ((pref_node == -1) && (!(flag & IHK_MC_AP_USER) ||
cpu_local_var(current)->vm->numa_mem_policy == MPOL_DEFAULT)) {
vm = cpu_local_var(current)->vm;
node = ihk_mc_get_numa_id();
if (virt_addr != -1) {
vm = cpu_local_var(current)->vm;
range_policy_iter = vm_range_policy_search(vm, virt_addr);
if (range_policy_iter) {
range = lookup_process_memory_range(vm, (uintptr_t)virt_addr, ((uintptr_t)virt_addr) + 1);
if (range) {
if( (range->memobj) && (range->memobj->flags == MF_SHM)) {
chk_shm = 1;
}
}
/* Get mempolicy user requested */
if (virt_addr != -1) {
range_policy_iter = vm_range_policy_search(vm, virt_addr);
if (range_policy_iter) {
range = lookup_process_memory_range(vm,
(uintptr_t)virt_addr,
((uintptr_t)virt_addr) + 1);
if ((range && (range->memobj->flags == MF_SHM))) {
chk_shm = 1;
}
/* Use range policy */
numa_mem_policy = range_policy_iter->numa_mem_policy;
numa_mask = range_policy_iter->numa_mask;
il_prev = &range_policy_iter->il_prev;
} else {
/* Use process policy */
numa_mem_policy = vm->numa_mem_policy;
numa_mask = vm->numa_mask;
il_prev = &vm->il_prev;
}
if ((!((range_policy_iter) && (range_policy_iter->numa_mem_policy != MPOL_DEFAULT))) && (chk_shm == 0))
goto distance_based;
}
node = ihk_mc_get_numa_id();
if (!memory_nodes[node].nodes_by_distance)
goto order_based;
/* No explicitly requested NUMA or user policy? */
if ((pref_node == -1) && !(flag & IHK_MC_AP_USER)) {
if ((numa_mem_policy == MPOL_DEFAULT) && (chk_shm == 0)) {
goto distance_based;
}
}
/* Explicit valid node? */
if (pref_node > -1 && pref_node < ihk_mc_get_nr_numa_nodes()) {
@ -615,27 +637,6 @@ static void *mckernel_allocate_aligned_pages_node(int npages, int p2align,
}
}
if ((virt_addr != -1) && (chk_shm == 0)) {
vm = cpu_local_var(current)->vm;
if (!(range_policy_iter)) {
range_policy_iter = vm_range_policy_search(vm, virt_addr);
}
if (range_policy_iter) {
range = lookup_process_memory_range(vm, (uintptr_t)virt_addr, ((uintptr_t)virt_addr) + 1);
if ((range && (range->memobj->flags == MF_SHM))) {
chk_shm = 1;
} else {
numa_mem_policy = range_policy_iter->numa_mem_policy;
}
}
}
if (numa_mem_policy == -1)
numa_mem_policy = cpu_local_var(current)->vm->numa_mem_policy;
switch (numa_mem_policy) {
case MPOL_BIND:
case MPOL_PREFERRED:
@ -644,9 +645,8 @@ static void *mckernel_allocate_aligned_pages_node(int npages, int p2align,
* only the ones requested in user policy */
for (i = 0; i < ihk_mc_get_nr_numa_nodes(); ++i) {
/* Not part of user requested policy? */
if (!test_bit(memory_nodes[node].nodes_by_distance[i].id,
cpu_local_var(current)->proc->vm->numa_mask)) {
numa_mask)) {
continue;
}
@ -687,7 +687,55 @@ static void *mckernel_allocate_aligned_pages_node(int npages, int p2align,
break;
case MPOL_INTERLEAVE:
/* TODO: */
/* Initialize interleave */
il_start = *il_prev;
looping = 0;
retry_interleave:
/* Find next node */
numa_id = interleave_nodes(*il_prev, numa_mask);
*il_prev = numa_id;
if (il_start == *il_prev && looping) {
/* All interleave nodes are full */
pa = 0;
break;
}
looping = 1;
#ifdef IHK_RBTREE_ALLOCATOR
{
if (rusage_check_oom(numa_id, npages, is_user)
== -ENOMEM) {
goto retry_interleave;
} else {
pa = ihk_numa_alloc_pages(
&memory_nodes[numa_id],
npages, p2align);
}
#else
list_for_each_entry(pa_allocator,
&memory_nodes[numa_id].allocators,
list) {
if (rusage_check_oom(numa_id, npages, is_user)
== -ENOMEM) {
goto retry_interleave;
} else {
pa = ihk_pagealloc_alloc(pa_allocator,
npages, p2align);
}
#endif
if (pa) {
rusage_page_add(numa_id, npages,
is_user);
dkprintf("%s: policy: CPU @ node %d allocated "
"%d pages from node %d\n",
__func__,
ihk_mc_get_numa_id(),
npages, node);
}
}
break;
default:
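
Reviewer note: the MPOL_INTERLEAVE path above picks the next node round-robin from the policy mask and retries when a node cannot satisfy the request. Below is a minimal standalone sketch of that pattern over a single 64-bit mask. All demo_ names are hypothetical, the free_pages array stands in for the rusage/allocator checks, and the retry is bounded by the number of nodes in the mask, which is a simplification and not the il_start/looping bookkeeping used in this change.

```c
#include <stdint.h>

#define DEMO_MASK_BITS 64	/* assumption: one 64-bit word of NUMA nodes */

/* Returns the first set bit at position >= start, or nbits if there is none. */
static int demo_find_next_bit(uint64_t mask, int nbits, int start)
{
	for (int i = start; i < nbits; i++) {
		if (mask & (UINT64_C(1) << i))
			return i;
	}
	return nbits;
}

static int demo_count_bits(uint64_t mask)
{
	int n = 0;

	for (; mask; mask &= mask - 1)	/* clear the lowest set bit each round */
		n++;
	return n;
}

/* Next node after `prev` in the mask, wrapping to the first set bit. */
int demo_interleave_next(int prev, uint64_t mask)
{
	int next = demo_find_next_bit(mask, DEMO_MASK_BITS, prev + 1);

	if (next >= DEMO_MASK_BITS)
		next = demo_find_next_bit(mask, DEMO_MASK_BITS, 0);
	return next;
}

/*
 * Tries each node of the mask at most once, starting after *il_prev.
 * Returns the node that supplied the pages, or -1 if none could.
 */
int demo_interleave_alloc(int *il_prev, uint64_t mask,
			  int *free_pages, int npages)
{
	int tries = demo_count_bits(mask);

	while (tries-- > 0) {
		int node = demo_interleave_next(*il_prev, mask);

		*il_prev = node;
		if (free_pages[node] >= npages) {
			free_pages[node] -= npages;
			return node;
		}
	}
	return -1;
}
```

Starting with *il_prev set to DEMO_MASK_BITS - 1 mirrors how set_mempolicy and mbind seed il_prev with PROCESS_NUMA_MASK_BITS - 1, so the first allocation wraps to the lowest node in the mask.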
@ -1395,7 +1443,6 @@ static void page_fault_handler(void *fault_addr, uint64_t reason, void *regs)
__FUNCTION__, fault_addr, reason, regs);
preempt_disable();
#ifdef ENABLE_FUGAKU_HACKS
++cpu_local_var(in_page_fault);
if (cpu_local_var(in_page_fault) > 1) {
kprintf("%s: PF in PF??\n", __func__);
@ -1408,7 +1455,6 @@ static void page_fault_handler(void *fault_addr, uint64_t reason, void *regs)
panic("PANIC");
}
}
#endif
cpu_enable_interrupt();
@ -1475,6 +1521,7 @@ out_linux:
__func__, thread ? thread->tid : -1, fault_addr,
reason, error);
unhandled_page_fault(thread, fault_addr, reason, regs);
--cpu_local_var(in_page_fault);
preempt_enable();
#ifdef ENABLE_FUGAKU_DEBUG
@ -1511,9 +1558,7 @@ out_linux:
out_ok:
#endif
error = 0;
#ifdef ENABLE_FUGAKU_HACKS
--cpu_local_var(in_page_fault);
#endif
preempt_enable();
out:
dkprintf("%s: addr: %p, reason: %lx, regs: %p -> error: %d\n",
@ -2885,3 +2930,44 @@ retry:
return ptep;
}
int phys_to_nid(unsigned long p)
{
int i, numa_id = -1, _numa_id;
unsigned long _start, _end;
for (i = 0; i < ihk_mc_get_nr_memory_chunks(); i++) {
ihk_mc_get_memory_chunk(i, &_start, &_end, &_numa_id);
if (p >= _start && p < _end) {
numa_id = _numa_id;
goto out;
}
}
out:
return numa_id;
}
int lookup_node(struct process_vm *vm, void *addr)
{
int node, err, reason = PF_POPULATE | PF_USER;
pte_t *ptep;
err = page_fault_process_vm(vm, (void *)addr, reason);
if (err) {
node = err;
goto out;
}
ptep = ihk_mc_pt_lookup_pte(vm->address_space->page_table,
(void *)addr, 0, NULL, NULL, NULL);
if (!ptep || !pte_is_present(ptep)) {
node = -ENOENT;
goto out;
}
node = phys_to_nid(pte_get_phys(ptep));
out:
return node;
}
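
Reviewer note: phys_to_nid() above is a linear search over half-open [start, end) memory chunks. A minimal standalone sketch of the same lookup follows. struct demo_chunk and demo_phys_to_nid() are hypothetical stand-ins for the ihk_mc_get_memory_chunk() interface, which is not reproduced here.

```c
/* Hypothetical chunk descriptor: physical range [start, end) on one NUMA node. */
struct demo_chunk {
	unsigned long start;
	unsigned long end;
	int numa_id;
};

/* Returns the NUMA node owning physical address p, or -1 if no chunk contains it. */
int demo_phys_to_nid(const struct demo_chunk *chunks, int nr_chunks,
		     unsigned long p)
{
	for (int i = 0; i < nr_chunks; i++) {
		if (p >= chunks[i].start && p < chunks[i].end)
			return chunks[i].numa_id;
	}
	return -1;
}
```

Because the ranges are half-open, an address equal to a chunk's end belongs to the following chunk (if any), not to the chunk that ends there.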


@ -122,6 +122,7 @@ init_process(struct process *proc, struct process *parent)
sizeof(struct rlimit) * MCK_RLIM_MAX);
memcpy(&proc->cpu_set, &parent->cpu_set,
sizeof(proc->cpu_set));
proc->enable_uti = parent->enable_uti;
}
INIT_LIST_HEAD(&proc->hash_list);
@ -1498,6 +1499,14 @@ int add_process_memory_range(struct process_vm *vm,
INIT_LIST_HEAD(&range->tofu_stag_list);
#endif
rc = vm_range_insert(vm, range);
if (rc) {
kprintf("%s: ERROR: could not insert range: %d\n",
__func__, rc);
kfree(range);
return rc;
}
rc = 0;
if (phys == NOPHYS) {
/* Nothing to map */
@ -1523,17 +1532,12 @@ int add_process_memory_range(struct process_vm *vm,
if (rc != 0) {
kprintf("%s: ERROR: preparing page tables\n", __FUNCTION__);
remove_process_memory_range(vm,
range->start, range->end, NULL);
kfree(range);
return rc;
}
rc = vm_range_insert(vm, range);
if (rc) {
kprintf("%s: ERROR: could not insert range: %d\n",
__FUNCTION__, rc);
return rc;
}
/* Clear content! */
if (phys != NOPHYS && !(flag & (VR_REMOTE | VR_DEMAND_PAGING))
&& ((flag & VR_PROT_MASK) != VR_PROT_NONE)) {
@ -2739,6 +2743,7 @@ unsigned long extend_process_region(struct process_vm *vm,
int rc;
size_t len;
int npages;
struct vm_range *range;
size_t align_size = vm->proc->heap_extension > PAGE_SIZE ?
LARGE_PAGE_SIZE : PAGE_SIZE;
@ -2755,6 +2760,15 @@ unsigned long extend_process_region(struct process_vm *vm,
(align_size - 1)) & align_mask;
}
/* Check if the range to be extended already exists */
range = lookup_process_memory_range(vm,
end_allocated, new_end_allocated);
if (range) {
dkprintf("%s: warning: vm_range for extension already exists\n",
__func__);
return end_allocated;
}
len = new_end_allocated - end_allocated;
npages = len >> PAGE_SHIFT;
@ -2782,7 +2796,7 @@ unsigned long extend_process_region(struct process_vm *vm,
if ((rc = add_process_memory_range(vm, end_allocated, new_end_allocated,
(p == 0 ? 0 : virt_to_phys(p)), flag, NULL, 0,
align_shift, NULL)) != 0) {
ihk_mc_free_pages_user(p, (new_end_allocated - end_allocated) >> PAGE_SHIFT);
ihk_mc_free_pages_user(p, npages);
return end_allocated;
}
// memory_stat_rss_add() is called in add_process_memory_range()
@ -3636,6 +3650,8 @@ void spin_sleep_or_schedule(void)
}
if (woken) {
dkprintf("%s: woken while spinning, cpu: %d, do_schedule: %d\n",
__func__, ihk_ikc_get_processor_id(), do_schedule);
if (do_schedule) {
irqstate = ihk_mc_spinlock_lock(&v->runq_lock);
v->flags |= CPU_FLAG_NEED_RESCHED;
@ -3654,6 +3670,8 @@ void spin_sleep_or_schedule(void)
out_schedule:
schedule();
dkprintf("%s: woken while sleeping, cpu: %d\n",
__func__, ihk_ikc_get_processor_id());
}
void schedule(void)
@ -3667,7 +3685,7 @@ void schedule(void)
if (cpu_local_var(no_preempt)) {
kprintf("%s: WARNING can't schedule() while no preemption, cnt: %d\n",
__FUNCTION__, cpu_local_var(no_preempt));
__func__, cpu_local_var(no_preempt));
irqstate = cpu_disable_interrupt_save();
ihk_mc_spinlock_lock_noirq(


@ -225,8 +225,6 @@ int shmobj_create_indexed(struct shmid_ds *ds, struct shmobj **objp)
static void shmobj_destroy(struct shmobj *obj)
{
extern struct shm_info the_shm_info;
extern struct list_head kds_free_list;
extern int the_maxi;
struct shmlock_user *user;
size_t size;
int npages;
@ -306,27 +304,13 @@ static void shmobj_destroy(struct shmobj *obj)
kfree(obj);
}
else {
int i = obj->index / 64;
unsigned long x = 1UL << (obj->index % 64);
list_del(&obj->chain);
--the_shm_info.used_ids;
list_add(&obj->chain, &kds_free_list);
/* For index reuse, release in descending order of index. */
for (;;) {
struct shmobj *p;
list_for_each_entry(p, &kds_free_list, chain) {
if (p->index == the_maxi) {
break;
}
}
if (&p->chain == &kds_free_list) {
break;
}
list_del(&p->chain);
kfree(p);
--the_maxi;
}
shmid_index[i] &= ~x;
kfree(obj);
}
return;
}


@ -74,13 +74,6 @@
#define DDEBUG_DEFAULT DDEBUG_PRINT
#endif
//#define DEBUG_UTI
#ifdef DEBUG_UTI
#define uti_dkprintf(...) do { ((uti_clv && linux_printk) ? (*linux_printk) : kprintf)(__VA_ARGS__); } while (0)
#else
#define uti_dkprintf(...) do { } while (0)
#endif
//static ihk_atomic_t pid_cnt = IHK_ATOMIC_INIT(1024);
/* generate system call handler's prototypes */
@ -244,6 +237,7 @@ long do_syscall(struct syscall_request *req, int cpu)
#define STATUS_COMPLETED 1
#define STATUS_PAGE_FAULT 3
#define STATUS_SYSCALL 4
#define __NR_syscall_response 8001
while (smp_load_acquire(&res.status) != STATUS_COMPLETED) {
while (smp_load_acquire(&res.status) == STATUS_IN_PROGRESS) {
struct cpu_local_var *v;
@ -252,6 +246,7 @@ long do_syscall(struct syscall_request *req, int cpu)
unsigned long flags;
DECLARE_WAITQ_ENTRY(scd_wq_entry, cpu_local_var(current));
check_sig_pending();
cpu_pause();
/* Spin if not preemptable */
@ -371,7 +366,7 @@ long do_syscall(struct syscall_request *req, int cpu)
}
/* send result */
req2.number = __NR_mmap;
req2.number = __NR_syscall_response;
req2.args[1] = syscall_ret;
/* The current thread is the requester and only the waiting thread
* may serve the request */
@ -2735,7 +2730,7 @@ SYSCALL_DECLARE(brk)
}
/* If already allocated, just expand and return */
if (address < region->brk_end_allocated) {
if (address <= region->brk_end_allocated) {
region->brk_end = address;
r = region->brk_end;
goto out;
@ -6161,9 +6156,40 @@ struct kshmid_ds {
struct list_head chain;
};
int the_maxi = -1;
unsigned long shmid_index[512];
static int get_shmid_max_index(void)
{
int i;
int index = -1;
for (i = 511; i >= 0; i--) {
if (shmid_index[i]) {
index = i * 64 + 63 - __builtin_clzl(shmid_index[i]);
break;
}
}
return index;
}
static int get_shmid_index(void)
{
int index = get_shmid_max_index();
int i;
unsigned long x;
for (index = 0;; index++) {
i = index / 64;
x = 1UL << (index % 64);
if (!(shmid_index[i] & x)) {
shmid_index[i] |= x;
break;
}
}
return index;
}
LIST_HEAD(kds_list);
LIST_HEAD(kds_free_list);
struct shminfo the_shminfo = {
.shmmax = 64L * 1024 * 1024 * 1024,
.shmmin = 1,
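
Reviewer note: the shmid_index bitmap above replaces the the_maxi counter with a lowest-free-index allocator plus a highest-used-index query. The sketch below shows that scheme in isolation. All demo_ names are hypothetical, the table is shrunk to 4 words (the change uses 512), __builtin_clzll assumes GCC or Clang, and the -1 return on exhaustion is an addition of this sketch since the loop in get_shmid_index() has no upper bound.

```c
#include <stdint.h>

#define DEMO_WORDS 4	/* assumption: small table for illustration */

static uint64_t demo_index_map[DEMO_WORDS];

/* Claims and returns the lowest free index, or -1 when every bit is taken. */
int demo_alloc_index(void)
{
	for (int index = 0; index < DEMO_WORDS * 64; index++) {
		uint64_t bit = UINT64_C(1) << (index % 64);

		if (!(demo_index_map[index / 64] & bit)) {
			demo_index_map[index / 64] |= bit;
			return index;
		}
	}
	return -1;
}

/* Releases an index so a later allocation can reuse it. */
void demo_free_index(int index)
{
	demo_index_map[index / 64] &= ~(UINT64_C(1) << (index % 64));
}

/* Returns the highest index currently in use, or -1 if none is. */
int demo_max_index(void)
{
	for (int i = DEMO_WORDS - 1; i >= 0; i--) {
		if (demo_index_map[i])
			return i * 64 + 63 - __builtin_clzll(demo_index_map[i]);
	}
	return -1;
}
```

With this layout a freed index in the middle is reused by the next allocation, while the reported maximum only drops once the highest bit itself is cleared.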
@ -6327,6 +6353,7 @@ int do_shmget(const key_t key, const size_t size, const int shmflg)
}
shmid = make_shmid(obj);
shmobj_list_unlock();
memobj_unref(&obj->memobj);
dkprintf("do_shmget(%#lx,%#lx,%#x): %d\n", key, size, shmflg, shmid);
return shmid;
}
@ -6383,7 +6410,7 @@ int do_shmget(const key_t key, const size_t size, const int shmflg)
return error;
}
obj->index = ++the_maxi;
obj->index = get_shmid_index();
list_add(&obj->chain, &kds_list);
++the_shm_info.used_ids;
@ -6669,7 +6696,7 @@ SYSCALL_DECLARE(shmctl)
return error;
}
maxi = the_maxi;
maxi = get_shmid_max_index();
if (maxi < 0) {
maxi = 0;
}
@ -6776,7 +6803,7 @@ SYSCALL_DECLARE(shmctl)
return error;
}
maxi = the_maxi;
maxi = get_shmid_max_index();
if (maxi < 0) {
maxi = 0;
}
@ -6873,7 +6900,7 @@ long do_futex(int n, unsigned long arg0, unsigned long arg1,
}
op = (op & FUTEX_CMD_MASK);
uti_dkprintf("futex op=[%x, %s],uaddr=%lx, val=%x, utime=%lx, uaddr2=%lx, val3=%x, []=%x, shared: %d\n",
dkprintf("futex op=[%x, %s],uaddr=%lx, val=%x, utime=%lx, uaddr2=%lx, val3=%x, []=%x, shared: %d\n",
flags,
(op == FUTEX_WAIT) ? "FUTEX_WAIT" :
(op == FUTEX_WAIT_BITSET) ? "FUTEX_WAIT_BITSET" :
@ -6885,7 +6912,8 @@ long do_futex(int n, unsigned long arg0, unsigned long arg1,
(unsigned long)uaddr, val, utime, uaddr2, val3, *uaddr, fshared);
if ((op == FUTEX_WAIT || op == FUTEX_WAIT_BITSET) && utime) {
uti_dkprintf("%s: utime=%ld.%09ld\n", __FUNCTION__, utime->tv_sec, utime->tv_nsec);
dkprintf("%s: utime=%ld.%09ld\n",
__func__, utime->tv_sec, utime->tv_nsec);
}
if (utime && (op == FUTEX_WAIT_BITSET || op == FUTEX_WAIT)) {
unsigned long nsec_timeout;
@ -6943,7 +6971,8 @@ long do_futex(int n, unsigned long arg0, unsigned long arg1,
if (ret) {
return ret;
}
uti_dkprintf("%s: ats=%ld.%09ld\n", __FUNCTION__, ats.tv_sec, ats.tv_nsec);
dkprintf("%s: ats=%ld.%09ld\n",
__func__, ats.tv_sec, ats.tv_nsec);
/* Use nsec for UTI case */
timeout = (utime->tv_sec * NS_PER_SEC + utime->tv_nsec) -
(ats.tv_sec * NS_PER_SEC + ats.tv_nsec);
@ -6959,9 +6988,9 @@ long do_futex(int n, unsigned long arg0, unsigned long arg1,
if (op == FUTEX_CMP_REQUEUE || op == FUTEX_WAKE_OP)
val2 = (uint32_t) (unsigned long) arg3;
ret = futex(uaddr, op, val, timeout, uaddr2, val2, val3, fshared, uti_clv);
ret = futex(uaddr, op, val, timeout, uaddr2, val2, val3, fshared);
uti_dkprintf("futex op=[%x, %s],uaddr=%lx, val=%x, utime=%lx, uaddr2=%lx, val3=%x, []=%x, shared: %d, ret: %d\n",
dkprintf("futex op=[%x, %s],uaddr=%lx, val=%x, utime=%lx, uaddr2=%lx, val3=%x, []=%x, shared: %d, ret: %d\n",
op,
(op == FUTEX_WAIT) ? "FUTEX_WAIT" :
(op == FUTEX_WAIT_BITSET) ? "FUTEX_WAIT_BITSET" :
@ -7009,7 +7038,7 @@ do_exit(int code)
setint_user((int*)thread->clear_child_tid, 0);
barrier();
futex((uint32_t *)thread->clear_child_tid,
FUTEX_WAKE, 1, 0, NULL, 0, 0, 1, NULL);
FUTEX_WAKE, 1, 0, NULL, 0, 0, 1);
thread->clear_child_tid = NULL;
}
@ -9181,7 +9210,7 @@ SYSCALL_DECLARE(remap_file_pages)
if (!range || (start < range->start) || (range->end < end)
|| (range->flag & VR_PRIVATE)
|| (range->flag & (VR_REMOTE|VR_IO_NOCACHE|VR_RESERVED))
|| !range->memobj) {
|| !is_callable_remap_file_pages(range->memobj)) {
ekprintf("sys_remap_file_pages(%#lx,%#lx,%#x,%#lx,%#x):"
"invalid VMR:[%#lx-%#lx) %#lx %p\n",
start0, size, prot, pgoff, flags,
@ -9676,7 +9705,9 @@ SYSCALL_DECLARE(mbind)
return -EINVAL;
}
#ifdef ENABLE_FUGAKU_HACKS
return 0;
#endif
memset(numa_mask, 0, sizeof(numa_mask));
@ -9921,6 +9952,10 @@ mbind_update_only:
sizeof(numa_mask));
}
range_policy->numa_mem_policy = mode;
if (mode == MPOL_INTERLEAVE) {
range_policy->il_prev =
PROCESS_NUMA_MASK_BITS - 1;
}
break;
@ -10082,6 +10117,9 @@ SYSCALL_DECLARE(set_mempolicy)
}
vm->numa_mem_policy = mode;
if (mode == MPOL_INTERLEAVE) {
vm->il_prev = PROCESS_NUMA_MASK_BITS - 1;
}
error = 0;
break;
@ -10144,6 +10182,20 @@ SYSCALL_DECLARE(get_mempolicy)
}
}
/* case of MPOL_F_NODE and MPOL_F_ADDR are specified */
if (flags & MPOL_F_NODE && flags & MPOL_F_ADDR) {
/* return the node ID which addr is allocated by mode */
int nid;
nid = lookup_node(vm, (void *)addr);
error = copy_to_user(mode, &nid, sizeof(int));
if (error) {
error = -EFAULT;
goto out;
}
goto out;
}
/* Special case of MPOL_F_MEMS_ALLOWED */
if (flags == MPOL_F_MEMS_ALLOWED) {
if (nodemask) {
@ -10227,7 +10279,7 @@ SYSCALL_DECLARE(move_pages)
struct move_pages_smp_req mpsr;
struct process_vm *vm = cpu_local_var(current)->vm;
int ret = 0;
int i, ret = 0;
unsigned long t_s, t_e;
@ -10237,18 +10289,20 @@ SYSCALL_DECLARE(move_pages)
if (pid) {
kprintf("%s: ERROR: only self (pid == 0)"
" is supported\n", __FUNCTION__);
return -EINVAL;
ret = -EINVAL;
goto out;
}
switch (flags) {
case MPOL_MF_MOVE_ALL:
/* Check flags */
if (flags & ~(MPOL_MF_MOVE|MPOL_MF_MOVE_ALL)) {
ret = -EINVAL;
goto out;
}
if (flags & MPOL_MF_MOVE_ALL) {
kprintf("%s: ERROR: MPOL_MF_MOVE_ALL"
" not supported\n", __func__);
return -EINVAL;
case MPOL_MF_MOVE:
break;
default:
return -EINVAL;
ret = -EINVAL;
goto out;
}
/* Allocate kernel arrays */
@ -10296,7 +10350,7 @@ t_e = rdtsc(); kprintf("%s: init malloc: %lu \n", __FUNCTION__, t_e - t_s); t_s
goto dealloc_out;
}
if (verify_process_vm(cpu_local_var(current)->vm,
if (user_nodes && verify_process_vm(cpu_local_var(current)->vm,
user_nodes, sizeof(int) * count)) {
ret = -EFAULT;
goto dealloc_out;
@ -10307,6 +10361,18 @@ t_e = rdtsc(); kprintf("%s: init malloc: %lu \n", __FUNCTION__, t_e - t_s); t_s
ret = -EFAULT;
goto dealloc_out;
}
/* Check node ID */
if (user_nodes) {
copy_from_user(nodes, user_nodes, sizeof(int) * count);
for (i = 0; i < count; i++) {
if (nodes[i] < 0 || nodes[i] >= ihk_mc_get_nr_numa_nodes()) {
ret = -ENODEV;
goto dealloc_out;
}
}
}
t_e = rdtsc(); kprintf("%s: init verify: %lu \n", __FUNCTION__, t_e - t_s); t_s = t_e;
#if 0
@ -10399,6 +10465,7 @@ dealloc_out:
kfree(ptep);
kfree(dst_phys);
out:
return ret;
}
@ -10524,7 +10591,7 @@ int util_thread(struct uti_attr *arg)
{
struct uti_ctx *rctx = NULL;
unsigned long rp_rctx;
struct cpu_local_var *uti_clv = NULL;
struct uti_info *uti_info = NULL;
struct syscall_request request IHK_DMA_ALIGN;
long rc;
struct thread *thread = cpu_local_var(current);
@ -10543,13 +10610,29 @@ int util_thread(struct uti_attr *arg)
rp_rctx = virt_to_phys((void *)rctx);
save_uctx((void *)rctx->ctx, NULL);
/* Create a copy of clv and replace clv with it when the Linux thread calls in a McKernel function */
uti_clv = kmalloc(sizeof(struct cpu_local_var), IHK_MC_AP_NOWAIT);
if (!uti_clv) {
/* Create information for the Linux thread */
uti_info = kmalloc(sizeof(struct uti_info), IHK_MC_AP_NOWAIT);
if (!uti_info) {
rc = -ENOMEM;
goto out;
}
memcpy(uti_clv, get_this_cpu_local_var(), sizeof(struct cpu_local_var));
/* clv info */
uti_info->thread_va = (unsigned long)cpu_local_var(current);
uti_info->uti_futex_resp_pa = virt_to_phys((void *)cpu_local_var(uti_futex_resp));
uti_info->ikc2linux_pa = virt_to_phys((void *)cpu_local_var(ikc2linux));
/* thread info */
uti_info->tid = thread->tid;
uti_info->cpu = ihk_mc_get_processor_id();
uti_info->status_pa = virt_to_phys((void *)&thread->status);
uti_info->spin_sleep_lock_pa = virt_to_phys((void *)&thread->spin_sleep_lock);
uti_info->spin_sleep_pa = virt_to_phys((void *)&thread->spin_sleep);
uti_info->vm_pa = virt_to_phys((void *)thread->vm);
uti_info->futex_q_pa = virt_to_phys((void *)&thread->futex_q);
/* global info */
uti_info->mc_idle_halt = idle_halt;
uti_info->futex_queue_pa = virt_to_phys((void *)get_futex_queues());
request.number = __NR_sched_setaffinity;
request.args[0] = 0;
@ -10560,7 +10643,7 @@ int util_thread(struct uti_attr *arg)
kattr.parent_cpuid = thread->parent_cpuid;
request.args[2] = virt_to_phys(&kattr);
}
request.args[3] = (unsigned long)uti_clv;
request.args[3] = (unsigned long)uti_info;
request.args[4] = uti_desc;
thread->uti_state = UTI_STATE_RUNNING_IN_LINUX;
rc = do_syscall(&request, ihk_mc_get_processor_id());
@ -10577,8 +10660,8 @@ int util_thread(struct uti_attr *arg)
kfree(rctx);
rctx = NULL;
kfree(uti_clv);
uti_clv = NULL;
kfree(uti_info);
uti_info = NULL;
if (rc >= 0) {
if (rc & 0x100000000) { /* exit_group */
@ -10601,7 +10684,7 @@ int util_thread(struct uti_attr *arg)
out:
kfree(rctx);
kfree(uti_clv);
kfree(uti_info);
return rc;
}
@ -10639,6 +10722,12 @@ SYSCALL_DECLARE(util_indicate_clone)
struct thread *thread = cpu_local_var(current);
struct uti_attr *kattr = NULL;
if (!thread->proc->enable_uti) {
kprintf("%s: error: --enable-uti mcexec option not specified\n",
__func__);
return -EINVAL;
}
if (mod != SPAWN_TO_LOCAL &&
mod != SPAWN_TO_REMOTE)
return -EINVAL;
@ -11040,8 +11129,8 @@ long syscall(int num, ihk_mc_user_context_t *ctx)
return l;
}
static int
check_sig_pending_thread(struct thread *thread)
void
check_sig_pending()
{
int found = 0;
struct list_head *head;
@ -11053,9 +11142,22 @@ check_sig_pending_thread(struct thread *thread)
__sigset_t x;
int sig = 0;
struct k_sigaction *k;
struct cpu_local_var *v;
struct thread *thread;
if (clv == NULL)
return;
thread = cpu_local_var(current);
if (thread == NULL || thread == &cpu_local_var(idle)) {
return;
}
if (thread->in_syscall_offload == 0) {
return;
}
if (thread->proc->group_exit_status & 0x0000000100000000L) {
return;
}
v = get_this_cpu_local_var();
w = thread->sigmask.__val[0];
lock = &thread->sigcommon->lock;
@ -11104,16 +11206,14 @@ check_sig_pending_thread(struct thread *thread)
}
if (found == 2) {
ihk_mc_spinlock_unlock(&v->runq_lock, v->runq_irqstate);
terminate_mcexec(0, sig);
return 1;
return;
}
else if (found == 1) {
ihk_mc_spinlock_unlock(&v->runq_lock, v->runq_irqstate);
interrupt_syscall(thread, 0);
return 1;
return;
}
return 0;
return;
}
struct sig_pending *
@ -11200,38 +11300,6 @@ hassigpending(struct thread *thread)
return getsigpending(thread, 0);
}
void
check_sig_pending(void)
{
struct thread *thread;
struct cpu_local_var *v;
if (clv == NULL)
return;
v = get_this_cpu_local_var();
repeat:
v->runq_irqstate = ihk_mc_spinlock_lock(&v->runq_lock);
list_for_each_entry(thread, &(v->runq), sched_list) {
if (thread == NULL || thread == &cpu_local_var(idle)) {
continue;
}
if (thread->in_syscall_offload == 0) {
continue;
}
if (thread->proc->group_exit_status & 0x0000000100000000L) {
continue;
}
if (check_sig_pending_thread(thread))
goto repeat;
}
ihk_mc_spinlock_unlock(&v->runq_lock, v->runq_irqstate);
}
static void
__check_signal(unsigned long rc, void *regs0, int num, int irq_disabled)
{


@ -1236,6 +1236,7 @@ static int tof_utofu_ioctl_alloc_stag(struct tof_utofu_device *dev, unsigned lon
readonly = (req.flags & 1) != 0;
retry:
ihk_rwspinlock_read_lock_noirq(&vm->memory_range_lock);
/* Assume smallest page size at first */
@ -1271,6 +1272,20 @@ static int tof_utofu_ioctl_alloc_stag(struct tof_utofu_device *dev, unsigned lon
}
if (!range) {
if (vm->region.stack_start <= start &&
vm->region.stack_end > end) {
ihk_rwspinlock_read_unlock_noirq(&vm->memory_range_lock);
if (page_fault_process_vm(vm, (void *)start,
PF_POPULATE | PF_WRITE | PF_USER) < 0) {
ret = -EINVAL;
goto out;
}
goto retry;
}
ret = -EINVAL;
goto unlock_out;
}
@ -1358,6 +1373,7 @@ static int tof_utofu_ioctl_alloc_stag(struct tof_utofu_device *dev, unsigned lon
unlock_out:
ihk_rwspinlock_read_unlock_noirq(&vm->memory_range_lock);
out:
if(ret == 0){
if(copy_to_user((void *)arg, &req, sizeof(req)) != 0){
kprintf("%s: ret: %d\n", __func__, -EFAULT);


@ -2056,6 +2056,7 @@ static int xpmem_pin_page(
XPMEM_DEBUG("call: tgid=%d, vaddr=0x%lx", tg->tgid, vaddr);
retry:
ihk_rwspinlock_read_lock_noirq(&src_vm->memory_range_lock);
range = lookup_process_memory_range(src_vm, vaddr, vaddr + 1);
@ -2063,6 +2064,20 @@ static int xpmem_pin_page(
ihk_rwspinlock_read_unlock_noirq(&src_vm->memory_range_lock);
if (!range || range->start > vaddr) {
/*
* Grow the stack if address falls into stack region
* so that we can lookup range successfully.
*/
if (src_vm->region.stack_start <= vaddr &&
src_vm->region.stack_end > vaddr) {
if (page_fault_process_vm(src_vm, (void *)vaddr,
PF_POPULATE | PF_WRITE | PF_USER) < 0) {
return -ENOENT;
}
goto retry;
}
return -ENOENT;
}


@ -109,6 +109,7 @@ static inline int is_sampling_event(struct mc_perf_event *event)
void *ihk_mc_switch_context(ihk_mc_kernel_context_t *old_ctx,
ihk_mc_kernel_context_t *new_ctx,
void *prev);
int ihk_mc_get_interrupt_id(int cpu);
int ihk_mc_interrupt_cpu(int cpu, int vector);
void ihk_mc_init_user_process(ihk_mc_kernel_context_t *ctx,
@ -173,4 +174,7 @@ struct cpu_mapping;
int arch_get_cpu_mapping(struct cpu_mapping **buf, int *nelemsp);
int ihk_mc_ikc_arch_issue_host_ipi(int cpu, int vector);
void smp_func_call_handler(void);
int ihk_mc_get_smp_handler_irq(void);
#endif


@ -247,7 +247,6 @@ int ihk_set_monitor(unsigned long addr, unsigned long size);
int ihk_set_rusage(unsigned long addr, unsigned long size);
int ihk_set_multi_intr_mode_addr(unsigned long addr);
int ihk_set_nmi_mode_addr(unsigned long addr);
int ihk_set_mckernel_do_futex(unsigned long addr);
extern void (*__tlb_flush_handler)(int vector);


@ -20,6 +20,8 @@ struct process_vm;
unsigned long virt_to_phys(void *v);
void *phys_to_virt(unsigned long p);
int phys_to_nid(unsigned long p);
int lookup_node(struct process_vm *vm, void *addr);
int copy_from_user(void *dst, const void *src, size_t siz);
int strlen_user(const char *s);
int strcpy_from_user(char *dst, const char *src);


@ -23,11 +23,19 @@ Summary: IHK/McKernel
License: GPLv2
Source0: mckernel-%{version}.tar.gz
Requires: systemd-libs numactl-libs libdwarf
Requires: systemd-libs numactl-libs libdwarf capstone
# kernel_module_package macro does not handle cross build...
# don't use kernel_module_package so that one rpm including .ko and binaries are created
%if "%{?_host_cpu}" == "x86_64" && "%{?_target_cpu}" == "aarch64"
%define cross_compile 1
%else
BuildRequires: systemd-devel numactl-devel binutils-devel kernel-devel libdwarf-devel capstone-devel
# Friendly reminder of the fact that kernel-rpm-macros is no longer included in kernel-devel
%if 0%{?rhel} >= 8
BuildRequires: redhat-rpm-config kernel-rpm-macros elfutils-libelf-devel
%endif
%endif
%if 0%{?rhel} >= 8
Requires: kernel >= %{krequires}
%else
@ -35,17 +43,6 @@ Requires: kernel = %{krequires}
%endif
Requires(post): /usr/sbin/depmod
Requires(postun): /usr/sbin/depmod
%else
BuildRequires: systemd-devel numactl-devel binutils-devel kernel-devel libdwarf-devel
# Friendly reminder of the fact that kernel-rpm-macros is no longer included in kernel-devel
%if 0%{?rhel} >= 8
BuildRequires: redhat-rpm-config kernel-rpm-macros elfutils-libelf-devel kmod
%endif
%if %{defined kernel_module_package_buildreqs}
BuildRequires: %kernel_module_package_buildreqs
%kernel_module_package %{?kmod_flavors}
%endif
%endif
%description
Interface for Heterogeneous Kernels and McKernel.
@ -78,6 +75,9 @@ This package contains headers and libraries required for build apps using IHK/Mc
%{?cmake_libdir:-DCMAKE_INSTALL_LIBDIR=%{cmake_libdir}} \
%{?build_target:-DBUILD_TARGET=%{build_target}} \
%{?toolchain_file:-DCMAKE_TOOLCHAIN_FILE=%{toolchain_file}} \
-DENABLE_TOFU=ON -DENABLE_FUGAKU_HACKS=ON \
-DENABLE_KRM_WORKAROUND=OFF -DWITH_KRM=ON \
-DENABLE_FUGAKU_DEBUG=OFF \
.
%make_build
@ -107,13 +107,19 @@ This package contains headers and libraries required for build apps using IHK/Mc
%{_libdir}/libsched_yield.so.1.0.0
%{_libdir}/libsched_yield.so
%{_libdir}/libldump2mcdump.so
%{_libdir}/libmck_syscall_intercept.so
%{_libdir}/libsyscall_intercept.so.0.1.0
%{_libdir}/libsyscall_intercept.so.0
%{_libdir}/libsyscall_intercept.so
%{_libdir}/mck/libuti.so.1.0.0
%{_libdir}/mck/libuti.so.1
%{_libdir}/mck/libuti.so
%{_sysconfdir}/irqbalance_mck.in
%{_mandir}/man1/mcreboot.1.gz
%{_mandir}/man1/ihkconfig.1.gz
%{_mandir}/man1/ihkosctl.1.gz
%{_mandir}/man1/mcexec.1.gz
%if 0%{?cross_compile}
/lib/modules/%{kernel_version}/extra/mckernel/ihk.ko
/lib/modules/%{kernel_version}/extra/mckernel/mcctrl.ko
%ifarch x86_64
@ -122,7 +128,6 @@ This package contains headers and libraries required for build apps using IHK/Mc
%ifarch aarch64
/lib/modules/%{kernel_version}/extra/mckernel/ihk-smp-arm64.ko
%endif
%endif
%files devel
%{_includedir}/ihklib.h
@ -133,10 +138,12 @@ This package contains headers and libraries required for build apps using IHK/Mc
%{_includedir}/ihk/ihk_monitor.h
%{_includedir}/ihk/ihk_debug.h
%{_includedir}/ihk/ihk_host_driver.h
%{_includedir}/libsyscall_intercept_hook_point.h
%{_libdir}/pkgconfig/libsyscall_intercept.pc
%{_mandir}/man3/libsyscall_intercept.3
/lib/modules/%{kernel_version}/extra/mckernel/ihk/linux/core/Module.symvers
%if 0%{?cross_compile}
# scripts from /usr/lib/rpm/redhat/kmodtool (kernel_module_package) as well
# taken from /usr/lib/rpm/redhat/kmodtool (kernel_module_package)
%post
if [ -e "/boot/System.map-%{kernel_version}" ]; then
/usr/sbin/depmod -aeF "/boot/System.map-%{kernel_version}" "%{kernel_version}" > /dev/null || :
@ -162,7 +169,6 @@ if [ -x "/sbin/weak-modules" ]; then
printf '%s\n' "${modules[@]}" \
| /sbin/weak-modules --remove-modules
fi
%endif
%changelog
* Tue Feb 12 2019 Dominique Martinet <dominique.martinet@cea.fr> - 1.6.0-0


@ -0,0 +1,24 @@
#!/bin/sh
USELTP=1
USEOSTEST=0
. ../../common.sh
################################################################################
for i in shmctl05:01 shmctl01:02 shmctl02:03 shmctl03:04 shmctl04:05 \
remap_file_pages01:06 remap_file_pages02:07; do
tp=`echo $i|sed 's/:.*//'`
id=`echo $i|sed 's/.*://'`
sudo PATH=$PATH:$LTPBIN $MCEXEC $LTPBIN/$tp 2>&1 | tee $tp.txt
ok=`grep TPASS $tp.txt | wc -l`
ng=`grep TFAIL $tp.txt | wc -l`
if [ $ok = 0 -a $ng = 0 ]; then
ok=`awk '/^passed/{print $2}' $tp.txt`
ng=`awk '/^failed/{print $2}' $tp.txt`
fi
if [ $ng = 0 ]; then
echo "*** C1379T$id: $tp PASS ($ok)"
else
echo "*** C1379T$id: $tp FAIL (ok=$ok ng=$ng)"
fi
done


@ -0,0 +1,74 @@
Script started on Tue 01 Sep 2020 07:13:12 AM JST
[shirasawa@apollo15 1379+1521+1531]$ uname -m
aarch64
[shirasawa@apollo15 1379+1521+1531]$ make test
sh ./C1379.sh
mcstop+release.sh ... done
mcreboot.sh -c 2-31 -m 2G@0,2G@1 -O ... done
tst_test.c:1096: INFO: Timeout per run is 0h 00m 20s
../../../../../include/tst_fuzzy_sync.h:477: INFO: Minimum sampling period ended
../../../../../include/tst_fuzzy_sync.h:301: INFO: loop = 1024, delay_bias = 0
../../../../../include/tst_fuzzy_sync.h:290: INFO: start_a - start_b: { avg = 21ns, avg_dev = 9ns, dev_ratio = 0.45 }
../../../../../include/tst_fuzzy_sync.h:290: INFO: end_a - start_a : { avg = 403181ns, avg_dev = 63ns, dev_ratio = 0.00 }
../../../../../include/tst_fuzzy_sync.h:290: INFO: end_b - start_b : { avg = 2429ns, avg_dev = 24ns, dev_ratio = 0.01 }
../../../../../include/tst_fuzzy_sync.h:290: INFO: end_a - end_b : { avg = 400772ns, avg_dev = 78ns, dev_ratio = 0.00 }
../../../../../include/tst_fuzzy_sync.h:290: INFO: spins : { avg = 61836 , avg_dev = 15 , dev_ratio = 0.00 }
../../../../../include/tst_fuzzy_sync.h:489: INFO: Reached deviation ratios < 0.10, introducing randomness
../../../../../include/tst_fuzzy_sync.h:492: INFO: Delay range is [-462, 62221]
../../../../../include/tst_fuzzy_sync.h:301: INFO: loop = 4616, delay_bias = 0
../../../../../include/tst_fuzzy_sync.h:290: INFO: start_a - start_b: { avg = 32ns, avg_dev = 3ns, dev_ratio = 0.10 }
../../../../../include/tst_fuzzy_sync.h:290: INFO: end_a - start_a : { avg = 402983ns, avg_dev = 38ns, dev_ratio = 0.00 }
../../../../../include/tst_fuzzy_sync.h:290: INFO: end_b - start_b : { avg = 2998ns, avg_dev = 30ns, dev_ratio = 0.01 }
../../../../../include/tst_fuzzy_sync.h:290: INFO: end_a - end_b : { avg = 400017ns, avg_dev = 27ns, dev_ratio = 0.00 }
../../../../../include/tst_fuzzy_sync.h:290: INFO: spins : { avg = 61763 , avg_dev = 50 , dev_ratio = 0.00 }
../../../../../include/tst_fuzzy_sync.h:606: INFO: Exceeded execution time, requesting exit
shmctl05.c:97: PASS: didn't crash
Summary:
passed 1
failed 0
skipped 0
warnings 0
*** C1379T01: shmctl05 PASS (1)
shmctl01 1 TPASS : pid, size, # of attaches and mode are correct - pass #1
shmctl01 2 TPASS : pid, size, # of attaches and mode are correct - pass #2
shmctl01 3 TPASS : new mode and change time are correct
shmctl01 4 TPASS : get correct shared memory limits
shmctl01 5 TPASS : get correct shared memory id
shmctl01 6 TPASS : SHM_LOCK is set
shmctl01 7 TPASS : SHM_LOCK is cleared
shmctl01 8 TPASS : shared memory appears to be removed
*** C1379T02: shmctl01 PASS (8)
shmctl02 1 TPASS : expected failure - errno = 13 : Permission denied
shmctl02 2 TPASS : expected failure - errno = 14 : Bad address
shmctl02 3 TPASS : expected failure - errno = 14 : Bad address
shmctl02 4 TPASS : expected failure - errno = 22 : Invalid argument
shmctl02 5 TPASS : expected failure - errno = 22 : Invalid argument
shmctl02 6 TCONF : shmctl02.c:138: shmctl() did not fail for non-root user.This may be okay for your distribution.
shmctl02 7 TCONF : shmctl02.c:138: shmctl() did not fail for non-root user.This may be okay for your distribution.
*** C1379T03: shmctl02 PASS (5)
shmctl03 1 TPASS : expected failure - errno = 13 : Permission denied
shmctl03 2 TPASS : expected failure - errno = 1 : Operation not permitted
shmctl03 3 TPASS : expected failure - errno = 1 : Operation not permitted
*** C1379T04: shmctl03 PASS (3)
shmctl04 1 TPASS : SHM_INFO call succeeded
*** C1379T05: shmctl04 PASS (1)
remap_file_pages01 1 TPASS : Non-Linear shm file OK
remap_file_pages01 2 TPASS : Non-Linear /tmp/ file OK
*** C1379T06: remap_file_pages01 PASS (2)
tst_test.c:1096: INFO: Timeout per run is 0h 05m 00s
remap_file_pages02.c:86: PASS: remap_file_pages(2) start is not valid MAP_SHARED mapping: EINVAL
remap_file_pages02.c:86: PASS: remap_file_pages(2) start is invalid: EINVAL
remap_file_pages02.c:86: PASS: remap_file_pages(2) size is invalid: EINVAL
remap_file_pages02.c:86: PASS: remap_file_pages(2) prot is invalid: EINVAL
Summary:
passed 4
failed 0
skipped 0
warnings 0
*** C1379T07: remap_file_pages02 PASS (4)
[shirasawa@apollo15 1379+1521+1531]$ exit
exit
Script done on Tue 01 Sep 2020 07:14:15 AM JST


@ -0,0 +1,66 @@
Script started on Tue Sep 1 06:30:33 2020
bash-4.2$ uname -m
x86_64
bash-4.2$ make test
sh ./C1379.sh
mcstop+release.sh ... done
mcreboot.sh -c 1-7,9-15,17-23,25-31 -m 10G@0,10G@1 -r 1-7:0+9-15:8+17-23:16+25-31:24 ... done
tst_test.c:1096: INFO: Timeout per run is 0h 00m 20s
../../../../../include/tst_fuzzy_sync.h:477: INFO: Minimum sampling period ended
../../../../../include/tst_fuzzy_sync.h:301: INFO: loop = 1024, delay_bias = 0
../../../../../include/tst_fuzzy_sync.h:290: INFO: start_a - start_b: { avg = -341ns, avg_dev = 260ns, dev_ratio = 0.76 }
../../../../../include/tst_fuzzy_sync.h:290: INFO: end_a - start_a : { avg = 61009ns, avg_dev = 434ns, dev_ratio = 0.01 }
../../../../../include/tst_fuzzy_sync.h:290: INFO: end_b - start_b : { avg = 12223ns, avg_dev = 596ns, dev_ratio = 0.05 }
../../../../../include/tst_fuzzy_sync.h:290: INFO: end_a - end_b : { avg = 48445ns, avg_dev = 377ns, dev_ratio = 0.01 }
../../../../../include/tst_fuzzy_sync.h:290: INFO: spins : { avg = 18529 , avg_dev = 1533 , dev_ratio = 0.08 }
../../../../../include/tst_fuzzy_sync.h:606: INFO: Exceeded execution time, requesting exit
shmctl05.c:97: PASS: didn't crash
Summary:
passed 1
failed 0
skipped 0
warnings 0
*** C1379T01: shmctl05 PASS (1)
shmctl01 1 TPASS : pid, size, # of attaches and mode are correct - pass #1
shmctl01 2 TPASS : pid, size, # of attaches and mode are correct - pass #2
shmctl01 3 TPASS : new mode and change time are correct
shmctl01 4 TPASS : get correct shared memory limits
shmctl01 5 TPASS : get correct shared memory id
shmctl01 6 TPASS : SHM_LOCK is set
shmctl01 7 TPASS : SHM_LOCK is cleared
shmctl01 8 TPASS : shared memory appears to be removed
*** C1379T02: shmctl01 PASS (8)
shmctl02 1 TPASS : expected failure - errno = 13 : Permission denied
shmctl02 2 TPASS : expected failure - errno = 14 : Bad address
shmctl02 3 TPASS : expected failure - errno = 14 : Bad address
shmctl02 4 TPASS : expected failure - errno = 22 : Invalid argument
shmctl02 5 TPASS : expected failure - errno = 22 : Invalid argument
shmctl02 6 TCONF : shmctl02.c:138: shmctl() did not fail for non-root user.This may be okay for your distribution.
shmctl02 7 TCONF : shmctl02.c:138: shmctl() did not fail for non-root user.This may be okay for your distribution.
*** C1379T03: shmctl02 PASS (5)
shmctl03 1 TPASS : expected failure - errno = 13 : Permission denied
shmctl03 2 TPASS : expected failure - errno = 1 : Operation not permitted
shmctl03 3 TPASS : expected failure - errno = 1 : Operation not permitted
*** C1379T04: shmctl03 PASS (3)
shmctl04 1 TPASS : SHM_INFO call succeeded
*** C1379T05: shmctl04 PASS (1)
remap_file_pages01 1 TPASS : Non-Linear shm file OK
remap_file_pages01 2 TPASS : Non-Linear /tmp/ file OK
*** C1379T06: remap_file_pages01 PASS (2)
tst_test.c:1096: INFO: Timeout per run is 0h 05m 00s
remap_file_pages02.c:86: PASS: remap_file_pages(2) start is not valid MAP_SHARED mapping: EINVAL
remap_file_pages02.c:86: PASS: remap_file_pages(2) start is invalid: EINVAL
remap_file_pages02.c:86: PASS: remap_file_pages(2) size is invalid: EINVAL
remap_file_pages02.c:86: PASS: remap_file_pages(2) prot is invalid: EINVAL
Summary:
passed 4
failed 0
skipped 0
warnings 0
*** C1379T07: remap_file_pages02 PASS (4)
bash-4.2$ exit
exit
Script done on Tue Sep 1 06:31:10 2020


@ -0,0 +1,10 @@
CC = gcc
TARGET =
all:: $(TARGET)
test:: all
sh ./C1379.sh
clean::
rm -f $(TARGET) *.o


@ -0,0 +1,25 @@
[Issue#1379, Issue#1521, Issue#1531 verification]
□ Test contents
1. Reproduction check for the problem reported in the issues
C1379T01 Run LTP shmctl05 and confirm that all test cases PASS.
2. Confirm with LTP that existing behavior is not affected
Because the shmctl and remap_file_pages handling was changed, the LTP tests
that exercise them were selected. All of them must PASS.
C1379T02 shmctl01: basic shmctl functionality
C1379T03 shmctl02: basic shmctl functionality
C1379T04 shmctl03: basic shmctl functionality
C1379T05 shmctl04: basic shmctl functionality
C1379T06 remap_file_pages01: basic remap_file_pages functionality
C1379T07 remap_file_pages02: basic remap_file_pages functionality
□ How to run
$ make test
The McKernel install location and the LTP location are read from
$HOME/.mck_test_config. Copy the mck_test_config.sample file generated when
McKernel is built to $HOME as .mck_test_config and edit it as needed.
□ Results
See C1379_x86_64.txt (x86_64 results) and C1379_arm64.txt (arm64 results).
All items were confirmed to PASS.
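
The pass/fail judgment used for these LTP runs can be sketched as below. This is a minimal illustration modeled on C1379.sh, not the script itself: `count_results` is a hypothetical helper name, and the sample input lines are made up for the demonstration. It assumes the two LTP output styles visible in the result logs, one `TPASS`/`TFAIL` line per check for older tests and a `passed N` / `failed N` summary for newer ones.

```sh
#!/bin/sh
# Hypothetical helper (not part of C1379.sh): print "<passes> <failures>"
# for an LTP output file, handling both output styles seen in the logs.
count_results() {
    file=$1
    # Old-style LTP tests print one TPASS/TFAIL line per check.
    # grep -c exits non-zero on a zero count, hence the "|| true".
    ok=$(grep -c TPASS "$file" || true)
    ng=$(grep -c TFAIL "$file" || true)
    if [ "$ok" -eq 0 ] && [ "$ng" -eq 0 ]; then
        # New-style tests print a summary with "passed N" / "failed N".
        ok=$(awk '/^passed/{print $2}' "$file")
        ng=$(awk '/^failed/{print $2}' "$file")
    fi
    echo "${ok:-0} ${ng:-0}"
}

# Demonstration with made-up sample output in a temporary file.
tmp=$(mktemp)
printf 'shmctl04 1 TPASS : SHM_INFO call succeeded\n' > "$tmp"
count_results "$tmp"
printf 'Summary:\npassed 4\nfailed 0\n' > "$tmp"
count_results "$tmp"
rm -f "$tmp"
```

A test case is then reported as PASS when the second number is 0. Note that `TCONF` lines count as neither a pass nor a failure under this scheme.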

89
test/issues/1428/C1428.sh Executable file

@ -0,0 +1,89 @@
#!/bin/sh
USELTP=1
USEOSTEST=0
MCREBOOT=0
. ../../common.sh
issue="1428"
tid=01
arch="`uname -p`"
if [ "${arch}" == "x86_64" ]; then
UTI_TEST_DIR="../../uti"
elif [ "${arch}" == "aarch64" ]; then
UTI_TEST_DIR="../../uti/arm64"
else
echo "Error: ${arch} is unexpected arch"
exit 1
fi
# make uti test
pushd ${UTI_TEST_DIR}
make
popd
mcreboot
for tno in `seq 12 20`
do
tname=`printf "C${issue}T%02d" ${tid}`
echo "*** ${tname} start *******************************"
sudo ${MCEXEC} --enable-uti ${UTI_TEST_DIR}/CT${tno} 2>&1 | tee ./${tname}.txt
rc=$?
ngs=`grep "NG" ./${tname}.txt | wc -l`
if [ ${ngs} -eq 0 ]; then
echo "*** ${tname} PASSED ******************************"
else
echo "*** ${tname} FAILED ******************************"
fi
let tid++
echo ""
done
echo "*** Stop mckernel to exec CT31-34 on Linux"
mcstop
for tno in `seq 31 34`
do
sudo ${UTI_TEST_DIR}/CT${tno} -l &> ./lnx_CT${tno}.txt
done
echo "*** Boot mckernel"
mcreboot
echo ""
for tno in `seq 31 34`
do
tname=`printf "C${issue}T%02d" ${tid}`
echo "*** ${tname} start *******************************"
sudo ${MCEXEC} --enable-uti ${UTI_TEST_DIR}/CT${tno} 2>&1 | tee ./${tname}.txt
rc=$?
ngs=`grep "NG" ./${tname}.txt | wc -l`
echo "** Result on Linux **"
grep "waiter" ./lnx_CT${tno}.txt
if [ ${ngs} -eq 0 ]; then
echo "*** ${tname} PASSED ******************************"
else
echo "*** ${tname} FAILED ******************************"
fi
let tid++
echo ""
done
for tp in futex_wait01 futex_wait02 futex_wait03 futex_wait04 futex_wait_bitset01 futex_wait_bitset02 futex_wake01 futex_wake02 futex_wake03
do
tname=`printf "C${issue}T%02d" ${tid}`
echo "*** ${tname} start *******************************"
sudo $MCEXEC $LTPBIN/$tp 2>&1 | tee $tp.txt
ok=`grep PASS $tp.txt | wc -l`
ng=`grep FAIL $tp.txt | wc -l`
if [ $ng = 0 ]; then
echo "*** ${tname} PASSED ($ok)"
else
echo "*** ${tname} FAILED (ok=$ok ng=$ng)"
fi
let tid++
echo ""
done

11
test/issues/1428/Makefile Normal file

@ -0,0 +1,11 @@
CFLAGS=-g
LDFLAGS=
TARGET=
all: $(TARGET)
test: all
./C1428.sh
clean:
rm -f $(TARGET) *.o *.txt

32
test/issues/1428/README Normal file

@ -0,0 +1,32 @@
[Issue#1428 verification]
□ Test contents
1. Run the McKernel uti test programs that use futex, and confirm that
futex works correctly when used with uti threads.
The tests run are CT12 through CT20 and CT31 through CT34 in test/uti.
See test/uti/README for the details of each test.
2. Confirm with the following LTP tests that existing futex functionality is not affected
- futex_wait01
- futex_wait02
- futex_wait03
- futex_wait04
- futex_wait_bitset01
- futex_wait_bitset02
- futex_wake01
- futex_wake02
- futex_wake03
□ How to run
Edit UTI_DIR in test/uti/Makefile or test/uti/arm64/Makefile
to match your environment.
$ make test
The McKernel install location and the OSTEST and LTP locations are read from
$HOME/.mck_test_config.
Copy the mck_test_config.sample file generated when McKernel is built to
$HOME as .mck_test_config and edit it as needed.
□ Results
See x86_64_result.log and aarch64_result.log.
All items were confirmed to PASS.
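
Since the uti tests live in a different directory on each architecture, the directory selection can be sketched as follows. This is an illustration modeled on the architecture check in C1428.sh; `uti_test_dir` is a hypothetical helper name, and the use of `uname -m` is an assumption (the script itself calls `uname -p`).

```sh
#!/bin/sh
# Hypothetical helper, modeled on the architecture check in C1428.sh.
# Prints the uti test directory for the given architecture name, or
# returns non-zero for an architecture the tests do not cover.
uti_test_dir() {
    case "$1" in
        x86_64)  echo "../../uti" ;;
        aarch64) echo "../../uti/arm64" ;;
        *)       echo "Error: $1 is unexpected arch" >&2; return 1 ;;
    esac
}

# Assumption: uname -m reports the machine type on the test host.
if dir=$(uti_test_dir "$(uname -m)"); then
    echo "uti tests: ${dir}"
fi
```

Using `case` keeps the check within POSIX sh, so it does not depend on the `==` operator inside `[ ]`, which some `/bin/sh` implementations reject.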


@ -0,0 +1,248 @@
./C1428.sh
mcstop+release.sh ... done
mcreboot.sh -c 37-43,49-55 -m 2G@2,2G@3 -r 37-43:36+49-55:48 -O ... done
~/src/mckernel/test/uti/arm64 ~/src/mckernel/test/issues/1428
~/src/mckernel/test/issues/1428
*** C1428T01 start *******************************
CT12001 futex START
CT12002 pthread_create OK
CT12100 running on Linux CPU OK
CT12003 FUTEX_WAKE OK
CT12101 FUTEX_WAIT OK
CT12004 pthread_join OK
CT12005 END
*** C1428T01 PASSED ******************************
*** C1428T02 start *******************************
CT13001 futex START
CT13002 pthread_create OK
CT13100 running on Linux CPU OK
CT13101 FUTEX_WAKE OK
CT13003 FUTEX_WAIT OK
CT13004 pthread_join OK
CT13005 END
*** C1428T02 PASSED ******************************
*** C1428T03 start *******************************
CT14001 futex START
CT14002 util_indicate_clone OK
CT14003 pthread_create OK
CT14004 lock first OK
CT14100 running on Linux OK
CT14101 lock second OK
CT14005 pthread_join OK
CT14006 END
nsec=94214570, nspw=9.421457
*** C1428T03 PASSED ******************************
*** C1428T04 start *******************************
CT15001 futex START
CT15002 util_indicate_clone OK
CT15003 pthread_create OK
CT15100 running on Linux OK
CT15101 lock first OK
CT15004 lock second OK
CT15005 pthread_join OK
CT15006 END
nsec=94214620, nspw=9.421462
*** C1428T04 PASSED ******************************
*** C1428T05 start *******************************
CT16001 futex START
CT16002 util_indicate_clone OK
CT16003 pthread_create OK
CT16101 running on Linux OK
CT16102 return from pthread_cond_wait() OK
CT16004 pthread_join OK
CT16005 END
*** C1428T05 PASSED ******************************
*** C1428T06 start *******************************
CT17001 futex START
CT17002 util_indicate_clone OK
CT17003 pthread_create OK
CT17004 lock on 0x4200a8 OK
CT17100 running on Linux OK
CT17005 wake on 0x4200e0 OK
CT17006 pthread_join OK
CT17007 END
*** C1428T06 PASSED ******************************
*** C1428T07 start *******************************
CT18001 futex START
CT18002 pthread_create OK
CT18101 running on Linux CPU OK
start=1613528413.931088714
op=109
end=1613528414.759485821
CT18102 FUTEX_WAIT OK
CT18103 timeout OK
CT18003 FUTEX_WAKE missing the waiter OK
CT18004 pthread_join OK
CT18005 END
*** C1428T07 PASSED ******************************
*** C1428T08 start *******************************
CT19001 futex START
CT19002 pthread_create OK
CT19100 running on Linux CPU OK
start=7347844.370216897
op=9
end=7347845.190062937
CT19101 FUTEX_WAIT OK
CT19102 timeout OK
CT19003 FUTEX_WAKE missing the waiter OK
CT19004 pthread_join OK
CT19005 END
*** C1428T08 PASSED ******************************
*** C1428T09 start *******************************
CT20001 futex START
CT20002 pthread_create OK
CT20100 running on Linux CPU OK
start=1613528425.067456921
end=1613528425.879490654
CT20101 FUTEX_WAIT OK
CT20102 timeout OK
CT20003 FUTEX_WAKE missing the waiter OK
CT20004 pthread_join OK
CT20005 END
*** C1428T09 PASSED ******************************
*** Stop mckernel to exec CT31-34 on Linux
mcstop+release.sh ... done
*** Boot mckernel
mcreboot.sh -c 37-43,49-55 -m 2G@2,2G@3 -r 37-43:36+49-55:48 -O ... done
*** C1428T10 start *******************************
[INFO] nloop=1000,blocktime=10000000
[INFO] Master thread (tid: 54422) is running on 00,00
nsec=94970130, nspw=9.497013
[ OK ] Master thread is running on McKernel
[ OK ] util_indicate_clone
[INFO] Utility thread (tid: 54440) is running on 29,29
[ OK ] Utility thread is running on Linux
[INFO] waker: 9756747550 nsec, waiter: 9762950100 nsec, (waiter - waker) / nloop: 6202 nsec
** Result on Linux **
[INFO] waker: 9984565840 nsec, waiter: 9988427640 nsec, (waiter - waker) / nloop: 3861 nsec
*** C1428T10 PASSED ******************************
*** C1428T11 start *******************************
[INFO] nloop=1000,blocktime=10000000
[INFO] Master thread (tid: 54449) is running on 00,00
nsec=95183170, nspw=9.518317
[ OK ] Master thread is running on McKernel
[ OK ] util_indicate_clone
[INFO] Utility thread (tid: 54467) is running on 27,27
[ OK ] Utility thread is running on Linux
[INFO] waker: 9724199060 nsec, waiter: 9730702360 nsec, (waiter - waker) / nloop: 6503 nsec
** Result on Linux **
[INFO] waker: 9987888970 nsec, waiter: 9991459180 nsec, (waiter - waker) / nloop: 3570 nsec
*** C1428T11 PASSED ******************************
*** C1428T12 start *******************************
[INFO] nloop=1000,blocktime=10000000
[INFO] Master thread (tid: 54476) is running on 00,00
nsec=96968310, nspw=9.696831
[ OK ] Master thread is running on McKernel
[ OK ] util_indicate_clone
[INFO] Utility thread (tid: 54494) is running on 27,27
[ OK ] Utility thread is running on Linux
[INFO] waiter: 9747346620 nsec, waker: 9736919490 nsec, (waiter - waker) / nloop: 10427 nsec
** Result on Linux **
[INFO] waiter: 9922548360 nsec, waker: 9918225010 nsec, (waiter - waker) / nloop: 4323 nsec
*** C1428T12 PASSED ******************************
*** C1428T13 start *******************************
[INFO] nloop=1000,blocktime=10000000
[INFO] Master thread (tid: 54503) is running on 01,01
nsec=94160460, nspw=9.416046
[ OK ] Master thread is running on McKernel
[ OK ] util_indicate_clone
[INFO] Utility thread (tid: 54521) is running on 01,01
[ OK ] Utility thread is running on Linux
[INFO] waiter: 10112660440 nsec, waker: 10105975190 nsec, (waiter - waker) / nloop: 6685 nsec
** Result on Linux **
[INFO] waiter: 10082423010 nsec, waker: 10078381240 nsec, (waiter - waker) / nloop: 4041 nsec
*** C1428T13 PASSED ******************************
*** C1428T14 start *******************************
futex_wait01 1 TPASS : futex_wait(): errno=ETIMEDOUT(110): Connection timed out
futex_wait01 2 TPASS : futex_wait(): errno=EAGAIN/EWOULDBLOCK(11): Resource temporarily unavailable
futex_wait01 3 TPASS : futex_wait(): errno=ETIMEDOUT(110): Connection timed out
futex_wait01 4 TPASS : futex_wait(): errno=EAGAIN/EWOULDBLOCK(11): Resource temporarily unavailable
*** C1428T14 PASSED (4)
*** C1428T15 start *******************************
futex_wait02 1 TPASS : futex_wait() woken up
*** C1428T15 PASSED (1)
*** C1428T16 start *******************************
futex_wait03 1 TPASS : futex_wait() woken up
*** C1428T16 PASSED (1)
*** C1428T17 start *******************************
futex_wait04 1 TPASS : futex_wait() returned -1: errno=EAGAIN/EWOULDBLOCK(11): Resource temporarily unavailable
*** C1428T17 PASSED (1)
*** C1428T18 start *******************************
tst_test.c:1096: INFO: Timeout per run is 0h 05m 00s
futex_wait_bitset.h:17: INFO: testing futex_wait_bitset() timeout with CLOCK_MONOTONIC
futex_wait_bitset.h:59: PASS: futex_wait_bitset() waited 102024us, expected 100010us
Summary:
passed 1
failed 0
skipped 0
warnings 0
*** C1428T18 PASSED (1)
*** C1428T19 start *******************************
tst_test.c:1096: INFO: Timeout per run is 0h 05m 00s
futex_wait_bitset.h:17: INFO: testing futex_wait_bitset() timeout with CLOCK_REALTIME
futex_wait_bitset.h:59: PASS: futex_wait_bitset() waited 101995us, expected 100010us
Summary:
passed 1
failed 0
skipped 0
warnings 0
*** C1428T19 PASSED (1)
*** C1428T20 start *******************************
futex_wake01 1 TPASS : futex_wake() returned 0
futex_wake01 2 TPASS : futex_wake() returned 0
futex_wake01 3 TPASS : futex_wake() returned 0
futex_wake01 4 TPASS : futex_wake() returned 0
futex_wake01 5 TPASS : futex_wake() returned 0
futex_wake01 6 TPASS : futex_wake() returned 0
*** C1428T20 PASSED (6)
*** C1428T21 start *******************************
futex_wake02 1 TPASS : futex_wake() woken up 1 threads
futex_wake02 2 TPASS : futex_wake() woken up 2 threads
futex_wake02 3 TPASS : futex_wake() woken up 3 threads
futex_wake02 4 TPASS : futex_wake() woken up 4 threads
futex_wake02 5 TPASS : futex_wake() woken up 5 threads
futex_wake02 6 TPASS : futex_wake() woken up 6 threads
futex_wake02 7 TPASS : futex_wake() woken up 7 threads
futex_wake02 8 TPASS : futex_wake() woken up 8 threads
futex_wake02 9 TPASS : futex_wake() woken up 9 threads
futex_wake02 10 TPASS : futex_wake() woken up 10 threads
futex_wake02 11 TPASS : futex_wake() woken up 0 threads
futex_wake02 0 TINFO : Child process returned TPASS
*** C1428T21 PASSED (12)
*** C1428T22 start *******************************
futex_wake03 1 TPASS : futex_wake() woken up 1 childs
futex_wake03 2 TPASS : futex_wake() woken up 2 childs
futex_wake03 3 TPASS : futex_wake() woken up 3 childs
futex_wake03 4 TPASS : futex_wake() woken up 4 childs
futex_wake03 5 TPASS : futex_wake() woken up 5 childs
futex_wake03 6 TPASS : futex_wake() woken up 6 childs
futex_wake03 7 TPASS : futex_wake() woken up 7 childs
futex_wake03 8 TPASS : futex_wake() woken up 8 childs
futex_wake03 9 TPASS : futex_wake() woken up 9 childs
futex_wake03 10 TPASS : futex_wake() woken up 10 childs
futex_wake03 11 TPASS : futex_wake() woken up 0 children
*** C1428T22 PASSED (11)
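
The per-iteration figures in the `[INFO] ... (waiter - waker) / nloop` lines of this log can be reproduced with integer arithmetic. The sketch below assumes a shell with 64-bit arithmetic; `per_loop_latency` is a hypothetical helper name, and the input values are copied from the C1428T10 lines above.

```sh
#!/bin/sh
# Hypothetical helper: average wake-up latency per loop iteration, computed
# the way the "(waiter - waker) / nloop" log lines report it.
# Arguments: waker total, waiter total, loop count (all integers).
per_loop_latency() {
    echo $(( ($2 - $1) / $3 ))
}

# Values copied from the C1428T10 lines of the aarch64 log.
mck=$(per_loop_latency 9756747550 9762950100 1000)   # run under McKernel
lnx=$(per_loop_latency 9984565840 9988427640 1000)   # run on Linux only
echo "McKernel: ${mck} nsec, Linux: ${lnx} nsec, difference: $(( mck - lnx )) nsec"
```

The division truncates, which matches the logged values of 6202 nsec and 3861 nsec for this test case.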


@ -0,0 +1,131 @@
[m-takagi@wallaby14 1428]$ make test
./C1428.sh
mcstop+release.sh ... done
mcreboot.sh -k 0 -f LOG_LOCAL6 -c 1-7,9-15,17-23,25-31 -m 10G@0,10G@1 -r 1-7:0+9-15:8+17-23:16+25-31:24 -O ... done
~/project/os/mckernel/test/uti ~/project/os/mckernel/test/issues/1428
make[1]: Entering directory `/home/m-takagi/project/os/mckernel/test/uti'
dd bs=4096 count=1000 if=/dev/zero of=./file
1000+0 records in
1000+0 records out
4096000 bytes (4.1 MB) copied, 0.0398667 s, 103 MB/s
make[1]: Leaving directory `/home/m-takagi/project/os/mckernel/test/uti'
~/project/os/mckernel/test/issues/1428
*** Stop mckernel to exec CT31-34 on Linux
mcstop+release.sh ... done
*** Boot mckernel
mcreboot.sh -k 0 -f LOG_LOCAL6 -c 1-7,9-15,17-23,25-31 -m 10G@0,10G@1 -r 1-7:0+9-15:8+17-23:16+25-31:24 -O ... done
*** C1428T01 start *******************************
[INFO] nloop=1000,blocktime=10000000
[INFO] Master thread (tid: 1518) is running on 00,00
[ OK ] Master thread is running on McKernel
[ OK ] util_indicate_clone
[INFO] Utility thread (tid: 1551) is running on 00,00
[ OK ] Utility thread is running on Linux
[INFO] waker: 26037705112 cycles, waiter: 26042430924 cycles, (waiter - waker) / nloop: 4725 cycles
** Result on Linux **
[INFO] waker: 19797701232 cycles, waiter: 19799301694 cycles, (waiter - waker) / nloop: 1600 cycles
*** C1428T01 PASSED ******************************
*** C1428T02 start *******************************
[INFO] nloop=1000,blocktime=10000000
[INFO] Master thread (tid: 1568) is running on 00,00
[ OK ] Master thread is running on McKernel
[ OK ] util_indicate_clone
[INFO] Utility thread (tid: 1600) is running on 00,00
[ OK ] Utility thread is running on Linux
[INFO] waker: 26064839352 cycles, waiter: 26070575240 cycles, (waiter - waker) / nloop: 5735 cycles
** Result on Linux **
[INFO] waker: 24762320086 cycles, waiter: 24764268665 cycles, (waiter - waker) / nloop: 1948 cycles
*** C1428T02 PASSED ******************************
*** C1428T03 start *******************************
[INFO] nloop=1000,blocktime=10000000
[INFO] Master thread (tid: 1609) is running on 00,00
[ OK ] Master thread is running on McKernel
[ OK ] util_indicate_clone
[INFO] Utility thread (tid: 1641) is running on 00,00
[ OK ] Utility thread is running on Linux
[INFO] waiter: 26042752992 cycles, waker: 26037367808 cycles, (waiter - waker) / nloop: 5385 cycles
** Result on Linux **
[INFO] waiter: 25124067612 cycles, waker: 25122513727 cycles, (waiter - waker) / nloop: 1553 cycles
*** C1428T03 PASSED ******************************
*** C1428T04 start *******************************
[INFO] nloop=1000,blocktime=10000000
[INFO] Master thread (tid: 1651) is running on 01,01
[ OK ] Master thread is running on McKernel
[ OK ] util_indicate_clone
[INFO] Utility thread (tid: 1684) is running on 00,00
[ OK ] Utility thread is running on Linux
[INFO] waiter: 26004096360 cycles, waker: 25998796808 cycles, (waiter - waker) / nloop: 5299 cycles
** Result on Linux **
[INFO] waiter: 26289569877 cycles, waker: 26287829592 cycles, (waiter - waker) / nloop: 1740 cycles
*** C1428T04 PASSED ******************************
*** C1428T05 start *******************************
futex_wait01 1 TPASS : futex_wait(): errno=ETIMEDOUT(110): Connection timed out
futex_wait01 2 TPASS : futex_wait(): errno=EAGAIN/EWOULDBLOCK(11): Resource temporarily unavailable
futex_wait01 3 TPASS : futex_wait(): errno=ETIMEDOUT(110): Connection timed out
futex_wait01 4 TPASS : futex_wait(): errno=EAGAIN/EWOULDBLOCK(11): Resource temporarily unavailable
*** C1428T05 PASSED (4)
*** C1428T06 start *******************************
futex_wait02 1 TPASS : futex_wait() woken up
*** C1428T06 PASSED (1)
*** C1428T07 start *******************************
futex_wait03 1 TPASS : futex_wait() woken up
*** C1428T07 PASSED (1)
*** C1428T08 start *******************************
futex_wait04 1 TPASS : futex_wait() returned -1: errno=EAGAIN/EWOULDBLOCK(11): Resource temporarily unavailable
*** C1428T08 PASSED (1)
*** C1428T09 start *******************************
futex_wait_bitset01 0 TINFO : testing futex_wait_bitset() timeout with CLOCK_MONOTONIC
futex_wait_bitset01 1 TPASS : futex_wait_bitset() waited 146706us, expected 100010us
*** C1428T09 PASSED (1)
*** C1428T10 start *******************************
futex_wait_bitset02 0 TINFO : testing futex_wait_bitset() timeout with CLOCK_REALTIME
futex_wait_bitset02 1 TPASS : futex_wait_bitset() waited 146709us, expected 100010us
*** C1428T10 PASSED (1)
*** C1428T11 start *******************************
futex_wake01 1 TPASS : futex_wake() returned 0
futex_wake01 2 TPASS : futex_wake() returned 0
futex_wake01 3 TPASS : futex_wake() returned 0
futex_wake01 4 TPASS : futex_wake() returned 0
futex_wake01 5 TPASS : futex_wake() returned 0
futex_wake01 6 TPASS : futex_wake() returned 0
*** C1428T11 PASSED (6)
*** C1428T12 start *******************************
futex_wake02 1 TPASS : futex_wake() woken up 1 threads
futex_wake02 2 TPASS : futex_wake() woken up 2 threads
futex_wake02 3 TPASS : futex_wake() woken up 3 threads
futex_wake02 4 TPASS : futex_wake() woken up 4 threads
futex_wake02 5 TPASS : futex_wake() woken up 5 threads
futex_wake02 6 TPASS : futex_wake() woken up 6 threads
futex_wake02 7 TPASS : futex_wake() woken up 7 threads
futex_wake02 8 TPASS : futex_wake() woken up 8 threads
futex_wake02 9 TPASS : futex_wake() woken up 9 threads
futex_wake02 10 TPASS : futex_wake() woken up 10 threads
futex_wake02 11 TPASS : futex_wake() woken up 0 threads
futex_wake02 0 TINFO : Child process returned TPASS
*** C1428T12 PASSED (12)
*** C1428T13 start *******************************
futex_wake03 1 TPASS : futex_wake() woken up 1 childs
futex_wake03 2 TPASS : futex_wake() woken up 2 childs
futex_wake03 3 TPASS : futex_wake() woken up 3 childs
futex_wake03 4 TPASS : futex_wake() woken up 4 childs
futex_wake03 5 TPASS : futex_wake() woken up 5 childs
futex_wake03 6 TPASS : futex_wake() woken up 6 childs
futex_wake03 7 TPASS : futex_wake() woken up 7 childs
futex_wake03 8 TPASS : futex_wake() woken up 8 childs
futex_wake03 9 TPASS : futex_wake() woken up 9 childs
futex_wake03 10 TPASS : futex_wake() woken up 10 childs
futex_wake03 11 TPASS : futex_wake() woken up 0 children
*** C1428T13 PASSED (11)
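
The per-iteration figure printed by C1428T04 above appears to be the waiter/waker cycle difference divided by `nloop`, truncated to an integer. A minimal shell sketch of that arithmetic, using the values logged for C1428T04 (`nloop=1000`); `per_loop_delta` is a hypothetical helper name and not part of the test suite:

```sh
#!/bin/sh
# Hypothetical helper: (waiter - waker) / nloop with integer division.
# $1 = waiter cycles, $2 = waker cycles, $3 = nloop
per_loop_delta() {
	echo $(( ($1 - $2) / $3 ))
}

# Values copied from the C1428T04 log lines (McKernel run, then Linux run).
per_loop_delta 26004096360 25998796808 1000
per_loop_delta 26289569877 26287829592 1000
```

The two calls reproduce the 5299 and 1740 cycle figures shown in the log.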

25
test/issues/1505/C1505.sh Normal file

@ -0,0 +1,25 @@
#!/bin/sh
USELTP=1
USEOSTEST=0
. ../../common.sh
################################################################################
uname -m
for i in msgrcv05:01 msgsnd05:02 semctl01:03 semop05:04 kill01:05 \
kill02:06 kill06:07 kill07:08 kill08:09 kill09:10; do
tp=`echo $i|sed 's/:.*//'`
id=`echo $i|sed 's/.*://'`
sudo PATH=$PATH:$LTPBIN $MCEXEC $LTPBIN/$tp 2>&1 | tee $tp.txt
ok=`grep TPASS $tp.txt | wc -l`
ng=`grep TFAIL $tp.txt | wc -l`
if [ $ok = 0 -a $ng = 0 ]; then
ok=`awk '/^passed/{print $2}' $tp.txt`
ng=`awk '/^failed/{print $2}' $tp.txt`
fi
if [ $ng = 0 ]; then
echo "*** C1505T$id: $tp PASS ($ok)"
else
echo "*** C1505T$id: $tp FAIL (ok=$ok ng=$ng)"
fi
done
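
The loop above has to cope with two LTP output styles: older tests print one `TPASS`/`TFAIL` line per check, while newer ones end with a `passed N` / `failed N` summary. A minimal sketch of that counting logic as a standalone function; `count_ltp_results` is a hypothetical name, and the sample log lines are taken from the result files in this change:

```sh
#!/bin/sh
# Hypothetical helper: print "ok ng" for an LTP log in either output style.
count_ltp_results() {
	# grep -c exits non-zero when the count is 0, so guard it with "|| true".
	ok=$(grep -c TPASS "$1" || true)
	ng=$(grep -c TFAIL "$1" || true)
	if [ "$ok" -eq 0 ] && [ "$ng" -eq 0 ]; then
		# New-style LTP: read the counts from the summary block instead.
		ok=$(awk '/^passed/{print $2}' "$1")
		ng=$(awk '/^failed/{print $2}' "$1")
	fi
	echo "${ok:-0} ${ng:-0}"
}

log=$(mktemp)
printf '%s\n' 'Summary:' 'passed 2' 'failed 0' > "$log"
count_ltp_results "$log"
rm -f "$log"
```

Quoting the counts and using `-eq` avoids the unquoted `[ $ok = 0 -a $ng = 0 ]` form failing when a variable is empty.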

@ -0,0 +1,55 @@
Script started on Tue 22 Dec 2020 08:24:38 AM JST
[shirasawa@apollo16 1505]$ make test
sh ./C1505.sh
mcstop+release.sh ... done
mcreboot.sh -c 2-31 -m 2G@0,2G@1 -O ... done
aarch64
msgrcv05 1 TPASS : got EINTR as expected
*** C1505T01: msgrcv05 PASS (1)
tst_test.c:1096: INFO: Timeout per run is 0h 05m 00s
msgsnd05.c:63: PASS: msgsnd() failed as expected: EAGAIN
msgsnd05.c:63: PASS: msgsnd() failed as expected: EINTR
Summary:
passed 2
failed 0
skipped 0
warnings 0
*** C1505T02: msgsnd05 PASS (2)
semctl01 1 TPASS : buf.sem_nsems and buf.sem_perm.mode are correct
semctl01 2 TPASS : buf.sem_perm.mode is correct
semctl01 3 TPASS : semaphores have expected values
semctl01 4 TPASS : number of sleeping processes is correct
semctl01 5 TPASS : last pid value is correct
semctl01 6 TPASS : semaphore value is correct
semctl01 7 TPASS : number of sleeping processes is correct
semctl01 8 TPASS : semaphore values are correct
semctl01 9 TPASS : semaphore value is correct
semctl01 10 TPASS : the highest index is correct
semctl01 11 TPASS : number of semaphore sets is correct
semctl01 12 TPASS : id of the semaphore set is correct
semctl01 13 TPASS : semaphore appears to be removed
*** C1505T03: semctl01 PASS (13)
semop05 1 TPASS : expected failure - errno = 43 : Identifier removed
semop05 1 TPASS : expected failure - errno = 43 : Identifier removed
semop05 1 TPASS : expected failure - errno = 4 : Interrupted system call
semop05 1 TPASS : expected failure - errno = 4 : Interrupted system call
*** C1505T04: semop05 PASS (4)
kill01 1 TPASS : received expected signal 9
*** C1505T05: kill01 PASS (1)
kill02 1 TPASS : The signal was sent to all processes in the process group.
kill02 2 TPASS : The signal was not sent to selective processes that were not in the process group.
*** C1505T06: kill02 PASS (2)
kill06 1 TPASS : received expected signal 9
*** C1505T07: kill06 PASS (1)
kill07 0 TINFO : received expected signal 9
kill07 1 TPASS : Did not catch signal as expected
*** C1505T08: kill07 PASS (1)
kill08 1 TPASS : received expected signal 9
*** C1505T09: kill08 PASS (1)
kill09 1 TPASS : kill(83510, SIGKILL) returned 0
*** C1505T10: kill09 PASS (1)
[shirasawa@apollo16 1505]$ exit
exit
Script done on Tue 22 Dec 2020 08:25:23 AM JST

@ -0,0 +1,55 @@
Script started on Tue Dec 22 07:58:45 2020
bash-4.2$ make test
sh ./C1505.sh
mcstop+release.sh ... done
mcreboot.sh -c 1-7,9-15,17-23,25-31 -m 10G@0,10G@1 -r 1-7:0+9-15:8+17-23:16+25-31:24 ... done
x86_64
msgrcv05 1 TPASS : got EINTR as expected
*** C1505T01: msgrcv05 PASS (1)
tst_test.c:1096: INFO: Timeout per run is 0h 05m 00s
msgsnd05.c:63: PASS: msgsnd() failed as expected: EAGAIN
msgsnd05.c:63: PASS: msgsnd() failed as expected: EINTR
Summary:
passed 2
failed 0
skipped 0
warnings 0
*** C1505T02: msgsnd05 PASS (2)
semctl01 1 TPASS : buf.sem_nsems and buf.sem_perm.mode are correct
semctl01 2 TPASS : buf.sem_perm.mode is correct
semctl01 3 TPASS : semaphores have expected values
semctl01 4 TPASS : number of sleeping processes is correct
semctl01 5 TPASS : last pid value is correct
semctl01 6 TPASS : semaphore value is correct
semctl01 7 TPASS : number of sleeping processes is correct
semctl01 8 TPASS : semaphore values are correct
semctl01 9 TPASS : semaphore value is correct
semctl01 10 TPASS : the highest index is correct
semctl01 11 TPASS : number of semaphore sets is correct
semctl01 12 TPASS : id of the semaphore set is correct
semctl01 13 TPASS : semaphore appears to be removed
*** C1505T03: semctl01 PASS (13)
semop05 1 TPASS : expected failure - errno = 43 : Identifier removed
semop05 1 TPASS : expected failure - errno = 43 : Identifier removed
semop05 1 TPASS : expected failure - errno = 4 : Interrupted system call
semop05 1 TPASS : expected failure - errno = 4 : Interrupted system call
*** C1505T04: semop05 PASS (4)
kill01 1 TPASS : received expected signal 9
*** C1505T05: kill01 PASS (1)
kill02 1 TPASS : The signal was sent to all processes in the process group.
kill02 2 TPASS : The signal was not sent to selective processes that were not in the process group.
*** C1505T06: kill02 PASS (2)
kill06 1 TPASS : received expected signal 9
*** C1505T07: kill06 PASS (1)
kill07 0 TINFO : received expected signal 9
kill07 1 TPASS : Did not catch signal as expected
*** C1505T08: kill07 PASS (1)
kill08 1 TPASS : received expected signal 9
*** C1505T09: kill08 PASS (1)
kill09 1 TPASS : kill(19542, SIGKILL) returned 0
*** C1505T10: kill09 PASS (1)
bash-4.2$ exit
exit
Script done on Tue Dec 22 07:59:12 2020

@ -0,0 +1,5 @@
test::
sh ./C1505.sh
clean::
rm -f $(TARGET) *.o

28
test/issues/1505/README Normal file

@ -0,0 +1,28 @@
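[Issue#1505 verification]
□ Test contents
1. Reproduction check for the reported issue
Confirm that the following LTP tests, which used to fail, now PASS with the fix.
C1505T01 msgrcv05: interrupt the system call with a signal while msgrcv is in progress
C1505T02 msgsnd05: interrupt the system call with a signal while msgsnd is in progress
C1505T03 semctl01: interrupt the system call with a signal while semctl is in progress
C1505T04 semop05: interrupt the system call with a signal while semop is in progress
2. Regression check of existing behavior using LTP
Confirm that the behavior of signal-related test programs is not affected.
C1505T05 kill01: basic kill functionality
C1505T06 kill02: basic kill functionality
C1505T07 kill06: basic kill functionality
C1505T08 kill07: basic kill functionality
C1505T09 kill08: basic kill functionality
C1505T10 kill09: basic kill functionality
□ How to run
$ make test
The McKernel install location and the LTP location are read from $HOME/.mck_test_config.
To create .mck_test_config, copy the mck_test_config.sample file generated when
McKernel is built to $HOME and edit it as needed.
□ Results
See C1505_x86_64.txt (x86_64 results) and C1505_arm64.txt (arm64 results).
Confirmed that all items PASS.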

60
test/issues/1512/C1512.sh Executable file

@ -0,0 +1,60 @@
#!/bin/sh
USELTP=1
USEOSTEST=0
. ../../common.sh
issue="1512"
tid=01
arch=`uname -p`
tname=`printf "C${issue}T%02d" ${tid}`
echo "*** ${tname} start *******************************"
tp="shmt09"
fail_flag=0
for rep in `seq 1 10`
do
sudo $MCEXEC $LTPBIN/$tp > $tp.txt 2>&1
ok=`grep PASS $tp.txt | wc -l`
ng=`grep FAIL $tp.txt | wc -l`
echo "shmt09 rep $rep done. (ok=$ok ng=$ng)"
if [ $ng -ne 0 ]; then
if [ "${arch}" == "x86_64" ]; then
echo "OK: Expected fail on ${arch}"
else
echo "NG: Unexpected fail on ${arch}"
fail_flag=1
fi
else
echo "OK: shmt09 PASS"
fi
done
if [ ${fail_flag} -eq 0 ]; then
echo "*** ${tname} PASSED"
else
echo "*** ${tname} FAILED"
fi
echo ""
let tid++
while read tp
do
tname=`printf "C${issue}T%02d" ${tid}`
echo "*** ${tname} start *******************************"
args=""
if [ "$tp" == "mmapstress06" ]; then
args="1"
fi
sudo $MCEXEC $LTPBIN/$tp $args 2>&1 | tee $tp.txt
ok=`grep PASS $tp.txt | wc -l`
ng=`grep FAIL $tp.txt | wc -l`
if [ $ng = 0 ]; then
echo "*** ${tname} PASSED ($ok)"
else
echo "*** ${tname} FAILED (ok=$ok ng=$ng)"
fi
let tid++
echo ""
done < ./ltp_list
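
Redirection order matters when a test's output is captured to a file, as in the shmt09 loop of this script: redirections are processed left to right, so `2>&1` only follows stdout into the file if the file redirection comes first. A self-contained sketch of the difference; `emit` and `capture_lines` are hypothetical names used only for illustration:

```sh
#!/bin/sh
# Emits one line on stdout and one on stderr.
emit() {
	echo "to stdout"
	echo "to stderr" >&2
}

# Hypothetical helper: prints how many lines end up in the capture file.
# $1 = "file-first" (> file 2>&1) or "dup-first" (2>&1 > file)
capture_lines() {
	tmp=$(mktemp)
	if [ "$1" = "file-first" ]; then
		emit > "$tmp" 2>&1
	else
		# stderr is duplicated onto the current stdout before stdout is
		# pointed at the file, so the stderr line never reaches the file.
		( emit 2>&1 > "$tmp" ) > /dev/null
	fi
	grep -c '' "$tmp"
	rm -f "$tmp"
}

echo "file-first: $(capture_lines file-first) line(s) captured"
echo "dup-first:  $(capture_lines dup-first) line(s) captured"
```

With `> file 2>&1` both lines land in the file; with `2>&1 > file` only the stdout line does.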

11
test/issues/1512/Makefile Normal file

@ -0,0 +1,11 @@
CFLAGS=-g
LDFLAGS=
TARGET=
all: $(TARGET)
test: all
./C1512.sh
clean:
rm -f $(TARGET) *.o *.txt

21
test/issues/1512/README Normal file

@ -0,0 +1,21 @@
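[Issue#1512 verification]
□ Test contents
1. Run shmt09 10 times in a row and confirm that every run terminates.
Note that on x86_64, McKernel by design does not shrink the brk position,
so the test behaves differently and shmt09 results in FAIL.
2. Confirm with the following LTP tests, which use sbrk(), that the existing brk functionality is not affected
- sbrk01,02
- mmapstress02,05,06
□ How to run
$ make test
The McKernel install location and the OSTEST and LTP locations are
read from $HOME/.mck_test_config.
To create .mck_test_config, copy the mck_test_config.sample file generated when McKernel is built
to $HOME and edit it as needed.
□ Results
See x86_64_result.log and aarch64_result.log.
Confirmed that all items PASS.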

@ -0,0 +1,50 @@
./C1512.sh
mcstop+release.sh ... done
mcreboot.sh -c 37-43,49-55 -m 2G@2,2G@3 -r 37-43:36+49-55:48 -O ... done
*** C1512T01 start *******************************
shmt09 rep 1 done. (ok=4 ng=0)
OK: shmt09 PASS
shmt09 rep 2 done. (ok=4 ng=0)
OK: shmt09 PASS
shmt09 rep 3 done. (ok=4 ng=0)
OK: shmt09 PASS
shmt09 rep 4 done. (ok=4 ng=0)
OK: shmt09 PASS
shmt09 rep 5 done. (ok=4 ng=0)
OK: shmt09 PASS
shmt09 rep 6 done. (ok=4 ng=0)
OK: shmt09 PASS
shmt09 rep 7 done. (ok=4 ng=0)
OK: shmt09 PASS
shmt09 rep 8 done. (ok=4 ng=0)
OK: shmt09 PASS
shmt09 rep 9 done. (ok=4 ng=0)
OK: shmt09 PASS
shmt09 rep 10 done. (ok=4 ng=0)
OK: shmt09 PASS
*** C1512T01 PASSED
*** C1512T02 start *******************************
sbrk01 1 TPASS : sbrk - Increase by 8192 bytes returned 0x20030000
sbrk01 2 TPASS : sbrk - Increase by -8192 bytes returned 0x20032000
*** C1512T02 PASSED (2)
*** C1512T03 start *******************************
sbrk02 0 TINFO : setup() bailing inc: 17196646400, ret: 0xffffffffffffffff, sbrk: 0x7fe20030000: errno=ENOMEM(12): Cannot allocate memory
sbrk02 1 TPASS : sbrk(17196646400) failed as expected: TEST_ERRNO=ENOMEM(12): Cannot allocate memory
*** C1512T03 PASSED (1)
*** C1512T04 start *******************************
mmapstress02 1 TPASS : Test passed
*** C1512T04 PASSED (1)
*** C1512T05 start *******************************
mmapstress05 1 TPASS : Test passed
*** C1512T05 PASSED (1)
*** C1512T06 start *******************************
mmapstress06 1 TPASS : Test passed
*** C1512T06 PASSED (1)

@ -0,0 +1,5 @@
sbrk01
sbrk02
mmapstress02
mmapstress05
mmapstress06

@ -0,0 +1,50 @@
./C1512.sh
mcstop+release.sh ... done
mcreboot.sh -c 1-7,9-15,17-23,25-31 -m 10G@0,10G@1 -r 1-7:0+9-15:8+17-23:16+25-31:24 -O ... done
*** C1512T01 start *******************************
shmt09 rep 1 done. (ok=3 ng=1)
OK: Expected fail on x86_64
shmt09 rep 2 done. (ok=3 ng=1)
OK: Expected fail on x86_64
shmt09 rep 3 done. (ok=3 ng=1)
OK: Expected fail on x86_64
shmt09 rep 4 done. (ok=3 ng=1)
OK: Expected fail on x86_64
shmt09 rep 5 done. (ok=3 ng=1)
OK: Expected fail on x86_64
shmt09 rep 6 done. (ok=3 ng=1)
OK: Expected fail on x86_64
shmt09 rep 7 done. (ok=3 ng=1)
OK: Expected fail on x86_64
shmt09 rep 8 done. (ok=3 ng=1)
OK: Expected fail on x86_64
shmt09 rep 9 done. (ok=3 ng=1)
OK: Expected fail on x86_64
shmt09 rep 10 done. (ok=3 ng=1)
OK: Expected fail on x86_64
*** C1512T01 PASSED
*** C1512T02 start *******************************
sbrk01 1 TPASS : sbrk - Increase by 8192 bytes returned 0x821000
sbrk01 2 TPASS : sbrk - Increase by -8192 bytes returned 0x823000
*** C1512T02 PASSED (2)
*** C1512T03 start *******************************
sbrk02 0 TINFO : setup() bailing inc: 28068282368, ret: 0xffffffffffffffff, sbrk: 0x1550dc821000: errno=ENOMEM(12): Cannot allocate memory
sbrk02 1 TPASS : sbrk(28068282368) failed as expected: TEST_ERRNO=ENOMEM(12): Cannot allocate memory
*** C1512T03 PASSED (1)
*** C1512T04 start *******************************
mmapstress02 1 TPASS : Test passed
*** C1512T04 PASSED (1)
*** C1512T05 start *******************************
mmapstress05 1 TPASS : Test passed
*** C1512T05 PASSED (1)
*** C1512T06 start *******************************
mmapstress06 1 TPASS : Test passed
*** C1512T06 PASSED (1)

30
test/issues/1523/C1523.sh Executable file

@ -0,0 +1,30 @@
#!/bin/sh
USELTP=1
USEOSTEST=0
MCREBOOT=0
. ../../common.sh
BOOTPARAM="${BOOTPARAM} -e anon_on_demand"
mcreboot
issue="1523"
tid=01
for tp in move_pages01 move_pages02 move_pages04 move_pages06 move_pages09 move_pages10
do
tname=`printf "C${issue}T%02d" ${tid}`
echo "*** ${tname} start *******************************"
sudo $MCEXEC $LTPBIN/$tp 2>&1 | tee $tp.txt
ok=`grep PASS $tp.txt | wc -l`
ng=`grep FAIL $tp.txt | wc -l`
if [ $ng = 0 ]; then
echo "*** ${tname} PASSED ($ok)"
else
echo "*** ${tname} FAILED (ok=$ok ng=$ng)"
fi
let tid++
echo ""
done
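
The test IDs in this script are built with `printf` and advanced with `let`, which is a bash builtin rather than POSIX sh. A minimal sketch of the same naming scheme using POSIX arithmetic expansion instead; `next_tname` is a hypothetical helper name, and the counter starts from `1` rather than `01` so that `printf %02d` never sees a leading-zero value such as `08`:

```sh
#!/bin/sh
# Hypothetical helper: format a test name such as C1523T01.
# $1 = issue number, $2 = numeric test id
next_tname() {
	printf 'C%sT%02d\n' "$1" "$2"
}

tid=1
while [ "$tid" -le 3 ]; do
	next_tname 1523 "$tid"
	tid=$((tid + 1))	# POSIX replacement for "let tid++"
done
```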

11
test/issues/1523/Makefile Normal file

@ -0,0 +1,11 @@
CFLAGS=-g
LDFLAGS=
TARGET=
all: $(TARGET)
test: all
./C1523.sh
clean:
rm -f $(TARGET) *.o *.txt

21
test/issues/1523/README Normal file

@ -0,0 +1,21 @@
[Issue#1523 verification]
□ Test contents
1. Confirm that the following LTP tests PASS
- move_pages01
- move_pages02
- move_pages04
- move_pages06
- move_pages09
- move_pages10
□ How to run
$ make test
The McKernel install location and the OSTEST and LTP locations are
read from $HOME/.mck_test_config.
To create .mck_test_config, copy the mck_test_config.sample file generated when McKernel is built
to $HOME and edit it as needed.
□ Results
See x86_64_result.log and aarch64_result.log.
Confirmed that all items PASS.

@ -0,0 +1,25 @@
mcstop+release.sh ... done
mcreboot.sh -c 37-43,49-55 -m 2G@2,2G@3 -r 37-43:36+49-55:48 -O -e anon_on_demand ... done
*** C1523T01 start *******************************
move_pages01 1 TPASS : pages are present in expected nodes
*** C1523T01 PASSED (1)
*** C1523T02 start *******************************
move_pages02 1 TPASS : pages are present in expected nodes
*** C1523T02 PASSED (1)
*** C1523T03 start *******************************
move_pages04 1 TPASS : status[1] has expected value
*** C1523T03 PASSED (1)
*** C1523T04 start *******************************
move_pages06 1 TPASS : move_pages failed with ENODEV as expected
*** C1523T04 PASSED (1)
*** C1523T05 start *******************************
move_pages09 1 TPASS : move_pages succeeded
*** C1523T05 PASSED (1)
*** C1523T06 start *******************************
move_pages10 1 TPASS : move_pages failed with EINVAL as expected
*** C1523T06 PASSED (1)

@ -0,0 +1,25 @@
mcstop+release.sh ... done
mcreboot.sh -c 1-7,9-15,17-23,25-31 -m 10G@0,10G@1 -r 1-7:0+9-15:8+17-23:16+25-31:24 -O -e anon_on_demand ... done
*** C1523T01 start *******************************
move_pages01 1 TPASS : pages are present in expected nodes
*** C1523T01 PASSED (1)
*** C1523T02 start *******************************
move_pages02 1 TPASS : pages are present in expected nodes
*** C1523T02 PASSED (1)
*** C1523T03 start *******************************
move_pages04 1 TPASS : status[1] has expected value
*** C1523T03 PASSED (1)
*** C1523T04 start *******************************
move_pages06 1 TPASS : move_pages failed with ENODEV as expected
*** C1523T04 PASSED (1)
*** C1523T05 start *******************************
move_pages09 1 TPASS : move_pages succeeded
*** C1523T05 PASSED (1)
*** C1523T06 start *******************************
move_pages10 1 TPASS : move_pages failed with EINVAL as expected
*** C1523T06 PASSED (1)

271
test/issues/1555/C1555.c Normal file

@ -0,0 +1,271 @@
/* 1400_arm64.c COPYRIGHT FUJITSU LIMITED 2020 */
#define _GNU_SOURCE
#include <stdio.h>
#include <sched.h>
#include <unistd.h>
#include <errno.h>
#include <sys/wait.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/syscall.h>
#define POINT_ORDER_NUM 2
#define SLEEP_SEC 2
#ifdef __x86_64__
#define PAUSE_INST "pause"
#elif defined(__aarch64__)
#define PAUSE_INST "yield"
#else
#error "unexpected architecture."
#endif
#define cpu_pause() \
({ \
__asm__ __volatile__(PAUSE_INST ::: "memory"); \
})
static int *sync1 = MAP_FAILED;
static int *parent_core = MAP_FAILED;
static int *point_order = MAP_FAILED;
static int *od = MAP_FAILED;
int main(int argc, char *argv[])
{
pid_t pid = -1;
pid_t ret_pid = -1;
int status = 0;
int i = 0, rc;
int *resp;
int result = -1;
int ret = -1;
int failed = 0;
/* get shared memory */
sync1 = (int *)mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
parent_core = (int *)mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
point_order = (int *)mmap(NULL, sizeof(int) * POINT_ORDER_NUM,
PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
od = (int *)mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
/* mmap check */
if (sync1 == MAP_FAILED ||
parent_core == MAP_FAILED ||
point_order == MAP_FAILED ||
od == MAP_FAILED) {
printf("mmap() Failed.\n");
goto out;
}
for (i = 0; i < POINT_ORDER_NUM; i++) {
point_order[i] = 0;
}
*od = 0;
*sync1 = 0;
/* create child process */
pid = fork();
switch (pid) {
case -1:
/* error */
printf("fork() Failed.\n");
goto out;
case 0: {
/* child */
/* before migrate, get cpunum */
int old_mycore = sched_getcpu();
printf("[child:%d] running core %d\n", getpid(), old_mycore);
/* sync parent */
*sync1 = 1;
#ifdef MIGRATE_ON_OFFLOAD
int sec = SLEEP_SEC;
resp = (int *)mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
/* debug syscall */
rc = syscall(888, sec, resp, 0, 0, 0, 0);
if (rc != sec || *resp != sec) {
printf("[child:%d] debug_syscall failed\n", getpid());
_exit(-1);
}
#endif
/* wait until migrated */
while (sched_getcpu() == old_mycore) {
cpu_pause();
}
point_order[0] = ++(*od);
_exit(0);
break;
}
default: {
/* parent */
cpu_set_t cpuset;
/* sync child */
while (*sync1 != 1) {
cpu_pause();
}
/* parent corenum get */
*parent_core = sched_getcpu();
/* child process to migrate parent core */
printf("[parent:%d] running core %d\n", getpid(), *parent_core);
printf("[parent] child process(pid=%d) "
"migrate/bind to core %d\n",
pid, *parent_core);
CPU_ZERO(&cpuset);
CPU_SET(*parent_core, &cpuset);
/* sched_setaffinity interval */
usleep(10000);
result = sched_setaffinity(pid, sizeof(cpuset), &cpuset);
if (result == -1) {
printf("errno = %d\n", errno);
printf("child migrate/bind "
"sched_setaffinity failed.\n");
}
/* parent core bind */
printf("[parent] parent process bind to core %d\n",
*parent_core);
result = sched_setaffinity(0, sizeof(cpuset), &cpuset);
if (result == -1) {
printf("errno = %d\n", errno);
printf("parent bind sched_setaffinity failed.\n");
}
#ifdef MIGRATE_ON_OFFLOAD
/* wait for child woken up */
usleep(SLEEP_SEC * 1000000 + 1000);
#endif
/* sync child, switch to child process */
printf("[parent] send sched_yield.\n");
result = 0;
result = sched_yield();
point_order[1] = ++(*od);
break;
}
}
if (result == -1) {
printf("sched_yield failed.\n");
}
/* child process status check. */
ret_pid = wait(&status);
if (ret_pid == pid) {
if (WIFEXITED(status)) {
if (WEXITSTATUS(status)) {
printf("TP failed, child migrate fail.\n");
}
else {
goto wait_ok;
}
}
else {
printf("TP failed, child is not exited.\n");
}
if (WIFSIGNALED(status)) {
printf("TP failed, child signaled by %d.\n",
WTERMSIG(status));
if (WCOREDUMP(status)) {
printf("coredumped.\n");
}
}
else {
printf("TP failed, child is not signaled.\n");
}
if (WIFSTOPPED(status)) {
printf("TP failed, child is stopped by signal %d.\n",
WSTOPSIG(status));
}
else {
printf("TP failed, child is not stopped.\n");
}
if (WIFCONTINUED(status)) {
printf("TP failed, child is continued.\n");
}
else {
printf("TP failed, child is not continued.\n");
}
for (i = 0; i < POINT_ORDER_NUM; i++) {
printf("point_order[%d] = %d\n", i, point_order[i]);
}
goto out;
}
else {
printf("TP failed, child process wait() fail.\n");
for (i = 0; i < POINT_ORDER_NUM; i++) {
printf("point_order[%d] = %d\n", i, point_order[i]);
}
goto out;
}
wait_ok:
for (i = 0; i < POINT_ORDER_NUM; i++) {
printf("point_order[%d] = %d\n", i, point_order[i]);
if (point_order[i] == 0) {
failed = 1;
}
}
if (failed != 0) {
printf("TP failed, parent or child process is not running.\n");
goto out;
}
if (result != -1) {
if (point_order[0] < point_order[1]) {
ret = 0;
}
else {
printf("TP failed, out of order.\n");
}
}
out:
/* unmap semaphore memory */
if (od != MAP_FAILED) {
munmap(od, sizeof(int));
}
if (point_order != MAP_FAILED) {
munmap(point_order, sizeof(int) * POINT_ORDER_NUM);
}
if (parent_core != MAP_FAILED) {
munmap(parent_core, sizeof(int));
}
if (sync1 != MAP_FAILED) {
munmap(sync1, sizeof(int));
}
return ret;
}

103
test/issues/1555/C1555.sh Normal file

@ -0,0 +1,103 @@
#!/bin/sh
USELTP=1
USEOSTEST=0
MCREBOOT=0
. ../../common.sh
BOOTPARAM="${BOOTPARAM} -e anon_on_demand"
mcreboot
PWD=`pwd`
STOPFILE="./1555_stop"
LOGFILE="${PWD}/1555_log"
LTPLIST="${PWD}/ltp_list"
TESTTIME=43200 # 12 hours (in seconds)
issue="1555"
echo "start-time: `date`"
stime=`date "+%s"`
failed=0
loops=0
while :
do
sudo ${MCEXEC} ./C1555T01 > ${LOGFILE} 2>&1
if [ $? -ne 0 ]; then
echo "C1555T01 failed."
failed=1
break
fi
${IHKOSCTL} 0 clear_kmsg
sudo ${MCEXEC} ./C1555T02 > ${LOGFILE} 2>&1
if [ $? -ne 0 ]; then
echo "C1555T02 failed."
failed=1
break
fi
dbg_cnt=`${IHKOSCTL} 0 kmsg | grep "ISSUE_1555" | wc -l`
if [ ${dbg_cnt} -eq 0 ]; then
echo "C1555T02 failed. Did not migrate in offload."
failed=1
break
fi
pushd ${LTPBIN} > /dev/null
while read line
do
${MCEXEC} ./${line} > ${LOGFILE} 2>&1
if [ $? -ne 0 ]; then
echo "${line} failed."
failed=1
break
fi
ng=`grep FAIL ${LOGFILE} | wc -l`
if [ $ng -ne 0 ]; then
echo "${line} failed."
cat ${LOGFILE}
failed=1
break
fi
done < ${LTPLIST}
popd > /dev/null
let loops++
if [ -e ${STOPFILE} ]; then
rm -f ${STOPFILE}
break
fi
etime=`date "+%s"`
run_time=$((${etime} - ${stime}))
if [ ${TESTTIME} -le ${run_time} ]; then
break;
fi
if [ ${failed} -eq 1 ]; then
break
fi
done
echo "end-time: `date`"
etime=`date "+%s"`
run_time=$((${etime} - ${stime}))
if [ ${TESTTIME} -le ${run_time} ]; then
if [ ${failed} -eq 0 ]; then
echo "Issue#${issue} test OK."
echo "Test cases run ${loops} times."
rm -f ${LOGFILE}
else
echo "Issue#${issue} test NG."
echo "Test cases run ${loops} times."
fi
else
echo "Issue#${issue} test NG."
echo "Test cases run ${loops} times."
fi
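
The script compares `TESTTIME` against elapsed seconds computed from `date "+%s"`. A small sketch for sanity-checking such second counts against wall-clock durations; `secs_to_hms` is a hypothetical helper and not part of the test suite:

```sh
#!/bin/sh
# Hypothetical helper: convert a number of seconds to HH:MM:SS.
secs_to_hms() {
	printf '%02d:%02d:%02d\n' $(( $1 / 3600 )) $(( $1 % 3600 / 60 )) $(( $1 % 60 ))
}

secs_to_hms 43200	# the TESTTIME value used by C1555.sh
```

The call prints `12:00:00`, which is consistent with the roughly twelve hours between the start-time and end-time lines in the result logs.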

18
test/issues/1555/Makefile Normal file

@ -0,0 +1,18 @@
CC=gcc
CFLAGS=-g
LDFLAGS=
TARGET=C1555T01 C1555T02
all: $(TARGET)
C1555T01: C1555.c
$(CC) -o $@ $^
C1555T02: C1555.c
$(CC) -DMIGRATE_ON_OFFLOAD -o $@ $^
test: all
sh ./C1555.sh
clean:
rm -f $(TARGET) *.o *.txt

38
test/issues/1555/README Normal file

@ -0,0 +1,38 @@
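[Issue#1555 verification]
□ Test contents
Whether the symptom reported in the issue occurs depends on the timing of the migrate request,
so the tests below are run continuously for 12 hours (TESTTIME=43200 seconds in C1555.sh) to confirm that it does not occur.
1. Run the following test programs and confirm that the symptom does not occur
C1555T01: (reuses 1400_arm64.c, the test program of Issue#1400)
When the parent process binds the child process and itself to the same CPU and calls sched_yield(),
confirm that execution proceeds in the order child process, then parent process.
C1555T02:
In the C1555T01 test case, confirm that the order child process, then parent process also holds when
the child process receives the migrate request while it is inside a system call offload
and a remote page fault occurs during that offload.
2. Confirm with the following LTP tests that existing functionality is not affected
- sched_yield01
- signal01,02,03,04,05
- rt_sigaction01,02,03
- rt_sigprocmask01,02
- rt_sigsuspend01
- rt_tgsigqueueinfo01
- futex_wait01,02,03,04
- futex_wake01
- futex_wait_bitset01
- execveat02
□ How to run
$ make test
The McKernel install location and the OSTEST and LTP locations are
read from $HOME/.mck_test_config.
To create .mck_test_config, copy the mck_test_config.sample file generated when McKernel is built
to $HOME and edit it as needed.
□ Results
See x86_64_result.log and aarch64_result.log.
Confirmed that all items PASS.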

@ -0,0 +1,7 @@
sh ./C1555.sh
mcstop+release.sh ... done
mcreboot.sh -c 37-43,49-55 -m 2G@2,2G@3 -r 37-43:36+49-55:48 -O -e anon_on_demand ... done
start-time: Wed Jan 27 14:41:56 JST 2021
end-time: Thu Jan 28 02:42:01 JST 2021
Issue#1555 test OK.
Test cases run 4426 times.

21
test/issues/1555/ltp_list Normal file

@ -0,0 +1,21 @@
sched_yield01
signal01
signal02
signal03
signal04
signal05
rt_sigaction01
rt_sigaction02
rt_sigaction03
rt_sigprocmask01
rt_sigprocmask02
rt_sigsuspend01
rt_tgsigqueueinfo01
futex_wait01
futex_wait02
futex_wait03
futex_wait04
futex_wake01
futex_wait_bitset01
execveat02
sigsuspend01

@ -0,0 +1,74 @@
diff --git a/arch/arm64/kernel/include/syscall_list.h b/arch/arm64/kernel/include/syscall_list.h
index 28e99eb..1d9f052 100644
--- a/arch/arm64/kernel/include/syscall_list.h
+++ b/arch/arm64/kernel/include/syscall_list.h
@@ -137,6 +137,7 @@ SYSCALL_HANDLED(802, linux_mlock)
SYSCALL_HANDLED(803, suspend_threads)
SYSCALL_HANDLED(804, resume_threads)
SYSCALL_HANDLED(811, linux_spawn)
+SYSCALL_DELEGATED(888, dbg_sleep)
SYSCALL_DELEGATED(1024, open)
SYSCALL_DELEGATED(1035, readlink)
diff --git a/arch/x86_64/kernel/include/syscall_list.h b/arch/x86_64/kernel/include/syscall_list.h
index 17a1d65..8010d3e 100644
--- a/arch/x86_64/kernel/include/syscall_list.h
+++ b/arch/x86_64/kernel/include/syscall_list.h
@@ -181,6 +181,7 @@ SYSCALL_HANDLED(802, linux_mlock)
SYSCALL_HANDLED(803, suspend_threads)
SYSCALL_HANDLED(804, resume_threads)
SYSCALL_HANDLED(811, linux_spawn)
+SYSCALL_DELEGATED(888, dbg_sleep)
/* Do not edit the lines including this comment and
* EOF just after it because those are used as a
diff --git a/executer/user/mcexec.c b/executer/user/mcexec.c
index c48e245..118de75 100644
--- a/executer/user/mcexec.c
+++ b/executer/user/mcexec.c
@@ -5039,6 +5039,15 @@ return_linux_spawn:
break;
#endif
+ case 888: { // dbg_sleep
+ int sec = (int)w.sr.args[0];
+ int *resp = (int *)w.sr.args[1];
+ sleep(sec);
+ *resp = sec;
+ do_syscall_return(fd, cpu, sec, 0, 0, 0, 0);
+ break;
+ }
+
default:
ret = do_generic_syscall(&w);
do_syscall_return(fd, cpu, ret, 0, 0, 0, 0);
diff --git a/kernel/syscall.c b/kernel/syscall.c
index 8a919e1..0b0fbc3 100644
--- a/kernel/syscall.c
+++ b/kernel/syscall.c
@@ -181,6 +181,7 @@ long do_syscall(struct syscall_request *req, int cpu)
struct thread *thread = cpu_local_var(current);
struct ihk_os_cpu_monitor *monitor = cpu_local_var(monitor);
int mstatus = 0;
+ int orig_cpu;
#ifdef PROFILE_ENABLE
/* We cannot use thread->profile_start_ts here because the
@@ -231,6 +232,7 @@ long do_syscall(struct syscall_request *req, int cpu)
#ifdef ENABLE_TOFU
res.pde_data = NULL;
#endif
+ orig_cpu = ihk_mc_get_processor_id();
send_syscall(req, cpu, &res);
if (req->rtid == -1) {
@@ -386,6 +388,9 @@ long do_syscall(struct syscall_request *req, int cpu)
preempt_enable();
}
+ if (orig_cpu != ihk_mc_get_processor_id()) {
+ kprintf("ISSUE_1555 migrated during syscall_offload\n");
+ }
dkprintf("%s: syscall num: %d got host reply: %d \n",
__FUNCTION__, req->number, res.ret);

@ -0,0 +1,8 @@
sh ./C1555.sh
mcstop+release.sh ... done
mcreboot.sh -c 1-7,9-15,17-23,25-31 -m 10G@0,10G@1 -r 1-7:0+9-15:8+17-23:16+25-31:24 -O -e anon_on_demand ... done
start-time: Wed Jan 27 14:48:14 JST 2021
execveat02 failed.
end-time: Thu Jan 28 02:48:18 JST 2021
Issue#1555 test NG.
Test cases run 4855 times.

125
test/issues/959/C959.sh Executable file

@ -0,0 +1,125 @@
#!/bin/sh
USELTP=1
USEOSTEST=1
LTP_LIST="mbind01 get_mempolicy01"
OSTEST_MBIND_LIST="1 3 5 9 12 14 15 16 20 24 26 28 30"
BOOTPARAM="-c 1-7 -m 10G@0,10G@1 -O -e anon_on_demand"
. ../../common.sh
issue="959"
tid=01
arch=`uname -p`
tname=`printf "C${issue}T%02d" ${tid}`
echo "*** ${tname} start *******************************"
sudo ${MCEXEC} ./check_mempol_il 1 30 6 3 3 3
if [ $? -eq 0 ]; then
echo "*** ${tname} PASSED ******************************"
else
echo "*** ${tname} FAILED ******************************"
fi
let tid++
echo ""
tname=`printf "C${issue}T%02d" ${tid}`
echo "*** ${tname} start *******************************"
sudo ${MCEXEC} ./check_mempol_il 2 30 6 3 3 3
if [ $? -eq 0 ]; then
echo "*** ${tname} PASSED ******************************"
else
echo "*** ${tname} FAILED ******************************"
fi
let tid++
echo ""
tname=`printf "C${issue}T%02d" ${tid}`
echo "*** ${tname} start *******************************"
sudo ${MCEXEC} ./check_mempol_il 1 30 6 2 0 6
if [ $? -eq 0 ]; then
echo "*** ${tname} PASSED ******************************"
else
echo "*** ${tname} FAILED ******************************"
fi
let tid++
echo ""
tname=`printf "C${issue}T%02d" ${tid}`
echo "*** ${tname} start *******************************"
sudo ${MCEXEC} ./check_mempol_il 2 30 6 2 0 6
if [ $? -eq 0 ]; then
echo "*** ${tname} PASSED ******************************"
else
echo "*** ${tname} FAILED ******************************"
fi
let tid++
echo ""
BOOTPARAM="-c 1-7 -m 10G@0,2G@1 -O -e anon_on_demand"
mcstop
mcreboot
tname=`printf "C${issue}T%02d" ${tid}`
echo "*** ${tname} start *******************************"
${IHKOSCTL} 0 clear_kmsg
sudo ${MCEXEC} ./check_mempol_il 1 30 6 2 4 2
ret=$?
dbg_prints=`${IHKOSCTL} 0 kmsg | grep "TEST_959" | wc -l`
if [ ${ret} -eq 0 -a ${dbg_prints} -gt 0 ]; then
echo "*** ${tname} PASSED ******************************"
else
echo "*** ${tname} FAILED ******************************"
fi
let tid++
echo ""
tname=`printf "C${issue}T%02d" ${tid}`
echo "*** ${tname} start *******************************"
sudo ${MCEXEC} ./check_mempol_il 2 30 6 2 4 2
ret=$?
dbg_prints=`${IHKOSCTL} 0 kmsg | grep "TEST_959" | wc -l`
if [ ${ret} -eq 0 -a ${dbg_prints} -gt 0 ]; then
echo "*** ${tname} PASSED ******************************"
else
echo "*** ${tname} FAILED ******************************"
fi
let tid++
echo ""
for tp in ${LTP_LIST}
do
tname=`printf "C${issue}T%02d" ${tid}`
echo "*** ${tname} start *******************************"
sudo $MCEXEC $LTPBIN/$tp 2>&1 | tee $tp.txt
ok=`grep PASS $tp.txt | wc -l`
ng=`grep FAIL $tp.txt | wc -l`
if [ $ng = 0 ]; then
echo "*** ${tname} PASSED ($ok)"
else
echo "*** ${tname} FAILED (ok=$ok ng=$ng)"
fi
let tid++
echo ""
done
for tno in ${OSTEST_MBIND_LIST}
do
tname=`printf "C${issue}T%02d" ${tid}`
echo "*** ${tname} start *******************************"
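# Capture to a file first: with "cmd | tee", $? would be tee's exit status.
${MCEXEC} ${TESTMCK} -s mbind -n ${tno} -- -n 2 > test_mck-mbind${tno}.txt 2>&1
ret=$?
cat test_mck-mbind${tno}.txt
if [ $ret = 0 ]; then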
echo "*** ${tname} PASSED"
else
echo "*** ${tname} FAILED"
fi
let tid++
echo ""
done

14
test/issues/959/Makefile Normal file

@ -0,0 +1,14 @@
include $(HOME)/.mck_test_config.mk
CFLAGS=-g -O0 -Wall -I$(MCK_DIR)/include
LDFLAGS=-L$(MCK_DIR)/lib64 -lihk -lnuma -Wl,-rpath=$(MCK_DIR)/lib64
TARGET=check_mempol_il
all: $(TARGET)
test: all
./C959.sh
clean:
rm -f $(TARGET) *.o *.txt

87
test/issues/959/README Normal file

@ -0,0 +1,87 @@
【Issue#959 動作確認】
□ テスト内容
本テストは2つのNUMAード(node0, node1)を使用してMPOL_INTERLEAVEの動作を確認するテストである。
2つ以上のNUMAードを持つ環境で実行すること。
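1. Behavior when the interleaved node set has sufficient memory
C959T01: Behavior when the mempolicy is set with set_mempolicy (2 nodes)
  With 10GB of memory from each of node0 and node1 assigned to McKernel,
  verify the following steps
  (1) Use set_mempolicy() to set the process mempolicy to INTERLEAVE over node0 and node1
  (2) Allocate 6GB of memory and write to it
  (3) Confirm that memory is used evenly from the two McKernel NUMA nodes
C959T02: Behavior when the mempolicy is set with mbind (2 nodes)
  With 10GB of memory from each of node0 and node1 assigned to McKernel,
  verify the following steps
  (1) Use set_mempolicy() to set the process mempolicy to BIND on node0
  (2) Allocate 6GB of memory
  (3) Use mbind() to set the mempolicy of the region allocated in (2) to INTERLEAVE over node0 and node1
  (4) Confirm that memory is used evenly from the two McKernel NUMA nodes
C959T03: Behavior when the mempolicy is set with set_mempolicy (1 node)
  With 10GB of memory from each of node0 and node1 assigned to McKernel,
  verify the following steps
  (1) Use set_mempolicy() to set the process mempolicy to INTERLEAVE on node1
  (2) Allocate 6GB of memory and write to it
  (3) Confirm that 6GB is used from McKernel's node1
C959T04: Behavior when the mempolicy is set with mbind (1 node)
  With 10GB of memory from each of node0 and node1 assigned to McKernel,
  verify the following steps
  (1) Use set_mempolicy() to set the process mempolicy to BIND on node0
  (2) Allocate 6GB of memory
  (3) Use mbind() to set the mempolicy of the region allocated in (2) to INTERLEAVE on node1
  (4) Confirm that 6GB is used from McKernel's node1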
2. Behavior when the interleaved node set does not have enough memory
C959T05: Behavior when the mempolicy is set with set_mempolicy
  With 10GB on node0 and 2GB on node1 assigned to McKernel,
  verify the following steps
  (1) Use set_mempolicy() to set the process mempolicy to INTERLEAVE on node1
  (2) Allocate 6GB of memory and write to it
  (3) Confirm that 4GB is used from McKernel's node0 and 2GB from node1
C959T06: Behavior when the mempolicy is set with mbind
  With 10GB on node0 and 2GB on node1 assigned to McKernel,
  verify the following steps
  (1) Use set_mempolicy() to set the process mempolicy to BIND on node0
  (2) Allocate 6GB of memory
  (3) Use mbind() to set the mempolicy of the region allocated in (2) to INTERLEAVE on node1
  (4) Confirm that 4GB is used from McKernel's node0 and 2GB from node1
3. Confirm with the following LTP tests that the existing mbind functionality is not affected
- mbind01
- get_mempolicy01
4. Confirm with the following OSTEST tests that the existing mbind functionality is not affected
- ostest-mbind.000
- ostest-mbind.001
- ostest-mbind.002
- ostest-mbind.003
- ostest-mbind.004
- ostest-mbind.005
- ostest-mbind.006
- ostest-mbind.007
- ostest-mbind.008
- ostest-mbind.009
- ostest-mbind.010
- ostest-mbind.011
- ostest-mbind.012
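
For the MPOL_INTERLEAVE cases C959T01 to C959T06 above, the expected per-node usage can be worked out by hand. The following is a minimal sketch of that arithmetic, not the logic of check_mempol_il itself: the helper names `gib_to_hex` and `expected_usage` are hypothetical, GB is read as GiB, and the model simply assumes that pages are split evenly across the nodes in the mask and that whatever does not fit spills to the other node.

```sh
#!/bin/sh
# Hypothetical helper: print a size given in GiB as a hex byte count,
# in the same form as the "NUMA[n] 0x..." lines of the result log.
gib_to_hex() {
    printf '0x%x\n' $(( $1 << 30 ))
}

# Hypothetical model of the expected usage on a two-node system.
#   $1: allocation size in GiB
#   $2: interleave nodemask (1 = node0, 2 = node1, 3 = both)
#   $3: GiB available on node0
#   $4: GiB available on node1
# Assumes the allocation fits into the two nodes combined.
expected_usage() {
    total=$1
    mask=$2
    cap0=$3
    cap1=$4
    use0=0
    use1=0
    case ${mask} in
    1) use0=${total} ;;
    2) use1=${total} ;;
    3) use0=$(( total / 2 )); use1=$(( total - use0 )) ;;
    *) echo "unsupported mask: ${mask}" >&2; return 1 ;;
    esac
    # Spill the part that does not fit on a node over to the other node.
    if [ "${use0}" -gt "${cap0}" ]; then
        use1=$(( use1 + use0 - cap0 ))
        use0=${cap0}
    fi
    if [ "${use1}" -gt "${cap1}" ]; then
        use0=$(( use0 + use1 - cap1 ))
        use1=${cap1}
    fi
    echo "NUMA[0] $(gib_to_hex "${use0}")"
    echo "NUMA[1] $(gib_to_hex "${use1}")"
}
```

For example, `expected_usage 6 3 10 10` models C959T01 and C959T02 (3GiB per node), and `expected_usage 6 2 10 2` models C959T05 and C959T06 (4GiB on node0, 2GiB on node1).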
□ Procedure
- Run the tests with the following steps
$ cd <mckernel>
$ patch -p0 < test/issues/959/test_print.patch
(build mckernel)
$ cd test/issues/959
$ make test
The McKernel install location and the locations of OSTEST and LTP are
taken from $HOME/.mck_test_config.
To create .mck_test_config, copy the mck_test_config.sample file generated when McKernel is built
to $HOME and edit it as needed.
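
Since the test script depends on values from this configuration, a small pre-flight check can catch an incomplete setup before `make test` runs. The sketch below is an illustration only: `require_vars` is a hypothetical helper, and the variable names in the usage note are the ones the test script and Makefile in this diff reference. Whether each of them is set directly by the config file or derived by a helper script is not shown here.

```sh
#!/bin/sh
# Hypothetical pre-flight check: report every named variable that is
# unset or empty, and return non-zero if any of them is missing.
require_vars() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is in ${name}.
        eval "value=\${${name}:-}"
        if [ -z "${value}" ]; then
            echo "missing: ${name}" >&2
            missing=1
        fi
    done
    return ${missing}
}
```

A possible use, after the configuration has been loaded by whatever means the test script uses, would be `require_vars MCEXEC IHKOSCTL LTPBIN TESTMCK || exit 1`.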
□ Results
See x86_64result.log and aarch64_result.log.
All items were confirmed to PASS.
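
Checking a result log by eye is error-prone, so the PASS confirmation can also be scripted. This is a minimal sketch under the assumption that each verdict line begins with `*** C<issue>T<nn> PASSED` or `*** C<issue>T<nn> FAILED`, as the test script prints them; `all_passed` is a hypothetical helper, not part of the test suite.

```sh
#!/bin/sh
# Hypothetical helper: count verdict lines in a result log and succeed
# only if at least one item passed and none failed.
all_passed() {
    log=$1
    passed=$(grep -c '^\*\*\* C[0-9]*T[0-9]* PASSED' "${log}")
    failed=$(grep -c '^\*\*\* C[0-9]*T[0-9]* FAILED' "${log}")
    echo "passed=${passed} failed=${failed}"
    [ "${passed}" -gt 0 ] && [ "${failed}" -eq 0 ]
}
```

It could be run as `all_passed x86_64result.log && echo OK`, using whichever log file name applies.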


@@ -0,0 +1,314 @@
mcstop+release.sh ... done
mcreboot.sh -c 1-7 -m 10G@0,10G@1 -O -e anon_on_demand ... done
*** C959T01 start *******************************
INTERLEAVE BIT_MASK: 0x3
set_mempolicy: INTERLEAVE mask 0x3
** Difference of numa_stat **
[OK] NUMA[0] 0xc0000000
[OK] NUMA[1] 0xc0000000
*** C959T01 PASSED ******************************
*** C959T02 start *******************************
INTERLEAVE BIT_MASK: 0x3
set_mempolicy: BIND mask 0x1
mbind : INTERLEAVE mask 0x3
** Difference of numa_stat **
[OK] NUMA[0] 0xc0000000
[OK] NUMA[1] 0xc0000000
*** C959T02 PASSED ******************************
*** C959T03 start *******************************
INTERLEAVE BIT_MASK: 0x2
set_mempolicy: INTERLEAVE mask 0x2
** Difference of numa_stat **
[OK] NUMA[0] 0x0
[OK] NUMA[1] 0x180000000
*** C959T03 PASSED ******************************
*** C959T04 start *******************************
INTERLEAVE BIT_MASK: 0x2
set_mempolicy: BIND mask 0x1
mbind : INTERLEAVE mask 0x2
** Difference of numa_stat **
[OK] NUMA[0] 0x0
[OK] NUMA[1] 0x180000000
*** C959T04 PASSED ******************************
mcstop+release.sh ... done
mcreboot.sh -c 1-7 -m 10G@0,2G@1 -O -e anon_on_demand ... done
*** C959T05 start *******************************
INTERLEAVE BIT_MASK: 0x2
set_mempolicy: INTERLEAVE mask 0x2
** Difference of numa_stat **
[OK] NUMA[0] 0x100000000
[OK] NUMA[1] 0x80000000
*** C959T05 PASSED ******************************
*** C959T06 start *******************************
INTERLEAVE BIT_MASK: 0x2
set_mempolicy: BIND mask 0x1
mbind : INTERLEAVE mask 0x2
** Difference of numa_stat **
[OK] NUMA[0] 0x100000000
[OK] NUMA[1] 0x80000000
*** C959T06 PASSED ******************************
*** C959T07 start *******************************
tst_test.c:1096: INFO: Timeout per run is 0h 05m 00s
mbind01.c:181: INFO: case MPOL_DEFAULT
mbind01.c:230: PASS: Test passed
mbind01.c:181: INFO: case MPOL_DEFAULT (target exists)
mbind01.c:230: PASS: Test passed
mbind01.c:181: INFO: case MPOL_BIND (no target)
mbind01.c:230: PASS: Test passed
mbind01.c:181: INFO: case MPOL_BIND
mbind01.c:230: PASS: Test passed
mbind01.c:181: INFO: case MPOL_INTERLEAVE (no target)
mbind01.c:230: PASS: Test passed
mbind01.c:181: INFO: case MPOL_INTERLEAVE
mbind01.c:230: PASS: Test passed
mbind01.c:181: INFO: case MPOL_PREFERRED (no target)
mbind01.c:230: PASS: Test passed
mbind01.c:181: INFO: case MPOL_PREFERRED
mbind01.c:230: PASS: Test passed
mbind01.c:181: INFO: case UNKNOWN_POLICY
mbind01.c:230: PASS: Test passed
mbind01.c:181: INFO: case MPOL_DEFAULT (invalid flags)
mbind01.c:230: PASS: Test passed
mbind01.c:181: INFO: case MPOL_PREFERRED (invalid nodemask)
mbind01.c:230: PASS: Test passed
Summary:
passed 11
failed 0
skipped 0
warnings 0
*** C959T07 PASSED (11)
*** C959T08 start *******************************
EXPECT: return value(ret)=0 errno=0 (Success)
RESULT: return value(ret)=0 errno=0 (Success)
EXPECT: return value(ret)=0 errno=0 (Success)
RESULT: return value(ret)=0 errno=0 (Success)
EXPECT: return value(ret)=0 errno=0 (Success)
RESULT: return value(ret)=0 errno=0 (Success)
EXPECT: return value(ret)=0 errno=0 (Success)
RESULT: return value(ret)=0 errno=0 (Success)
EXPECT: return value(ret)=0 errno=0 (Success)
RESULT: return value(ret)=0 errno=0 (Success)
EXPECT: return value(ret)=0 errno=0 (Success)
RESULT: return value(ret)=0 errno=0 (Success)
EXPECT: return value(ret)=0 errno=0 (Success)
RESULT: return value(ret)=0 errno=0 (Success)
EXPECT: return value(ret)=0 errno=0 (Success)
RESULT: return value(ret)=0 errno=0 (Success)
EXPECT: return value(ret)=0 errno=0 (Success)
RESULT: return value(ret)=0 errno=0 (Success)
EXPECT: return value(ret)=0 errno=0 (Success)
RESULT: return value(ret)=0 errno=0 (Success)
EXPECT: return value(ret)=-1 errno=14 (Bad address)
RESULT: return value(ret)=-1 errno=14 (Bad address)
EXPECT: return value(ret)=-1 errno=22 (Invalid argument)
RESULT: return value(ret)=-1 errno=22 (Invalid argument)
get_mempolicy01 0 TINFO : (case00) START
get_mempolicy01 1 TPASS : (case00) END
get_mempolicy01 0 TINFO : (case01) START
get_mempolicy01 2 TPASS : (case01) END
get_mempolicy01 0 TINFO : (case02) START
get_mempolicy01 3 TPASS : (case02) END
get_mempolicy01 0 TINFO : (case03) START
get_mempolicy01 4 TPASS : (case03) END
get_mempolicy01 0 TINFO : (case04) START
get_mempolicy01 5 TPASS : (case04) END
get_mempolicy01 0 TINFO : (case05) START
get_mempolicy01 6 TPASS : (case05) END
get_mempolicy01 0 TINFO : (case06) START
get_mempolicy01 7 TPASS : (case06) END
get_mempolicy01 0 TINFO : (case07) START
get_mempolicy01 8 TPASS : (case07) END
get_mempolicy01 0 TINFO : (case08) START
get_mempolicy01 9 TPASS : (case08) END
get_mempolicy01 0 TINFO : (case09) START
get_mempolicy01 10 TPASS : (case09) END
get_mempolicy01 0 TINFO : (case10) START
get_mempolicy01 11 TPASS : (case10) END
get_mempolicy01 0 TINFO : (case11) START
get_mempolicy01 12 TPASS : (case11) END
*** C959T08 PASSED (12)
*** C959T09 start *******************************
TEST_SUITE: mbind
TEST_NUMBER: 1
ARGS: -n 2
RESULT: ok
*** C959T09 PASSED
*** C959T10 start *******************************
TEST_SUITE: mbind
TEST_NUMBER: 3
ARGS: -n 2
RESULT: ok
*** C959T10 PASSED
*** C959T11 start *******************************
TEST_SUITE: mbind
TEST_NUMBER: 5
ARGS: -n 2
RESULT: ok
*** C959T11 PASSED
*** C959T12 start *******************************
TEST_SUITE: mbind
TEST_NUMBER: 9
ARGS: -n 2
RESULT: ok
*** C959T12 PASSED
*** C959T13 start *******************************
TEST_SUITE: mbind
TEST_NUMBER: 12
ARGS: -n 2
RESULT: ok
*** C959T13 PASSED
*** C959T14 start *******************************
TEST_SUITE: mbind
TEST_NUMBER: 14
ARGS: -n 2
RESULT: ok
*** C959T14 PASSED
*** C959T15 start *******************************
TEST_SUITE: mbind
TEST_NUMBER: 15
ARGS: -n 2
region 0
get : mode = 2, node_mask = 1
m_expect : mode = 2, node_mask = 1
region 1
get : mode = 0, node_mask = 0
m_expect : mode = 0, node_mask = 0
region 2
get : mode = 3, node_mask = 3
m_expect : mode = 3, node_mask = 3
region 3
get : mode = 3, node_mask = 3
m_expect : mode = 3, node_mask = 3
region 4
get : mode = 3, node_mask = 3
m_expect : mode = 3, node_mask = 3
region 5
get : mode = 0, node_mask = 0
m_expect : mode = 0, node_mask = 0
region 6
get : mode = 0, node_mask = 0
m_expect : mode = 0, node_mask = 0
RESULT: ok
*** C959T15 PASSED
*** C959T16 start *******************************
TEST_SUITE: mbind
TEST_NUMBER: 16
ARGS: -n 2
region 0
get : mode = 0, node_mask = 0
m_expect : mode = 0, node_mask = 0
region 1
get : mode = 2, node_mask = 1
m_expect : mode = 2, node_mask = 1
region 2
get : mode = 2, node_mask = 1
m_expect : mode = 2, node_mask = 1
region 3
get : mode = 2, node_mask = 1
m_expect : mode = 2, node_mask = 1
region 4
get : mode = 0, node_mask = 0
m_expect : mode = 0, node_mask = 0
region 5
get : mode = 0, node_mask = 0
m_expect : mode = 0, node_mask = 0
region 6
get : mode = 0, node_mask = 0
m_expect : mode = 0, node_mask = 0
RESULT: ok
*** C959T16 PASSED
*** C959T17 start *******************************
TEST_SUITE: mbind
TEST_NUMBER: 20
ARGS: -n 2
region 0
get : mode = 0, node_mask = 0
m_expect : mode = 0, node_mask = 0
region 1
get : mode = 3, node_mask = 3
m_expect : mode = 3, node_mask = 3
region 2
get : mode = 3, node_mask = 3
m_expect : mode = 3, node_mask = 3
region 3
get : mode = 2, node_mask = 1
m_expect : mode = 2, node_mask = 1
region 4
get : mode = 2, node_mask = 1
m_expect : mode = 2, node_mask = 1
region 5
get : mode = 0, node_mask = 0
m_expect : mode = 0, node_mask = 0
region 6
get : mode = 0, node_mask = 0
m_expect : mode = 0, node_mask = 0
RESULT: ok
*** C959T17 PASSED
*** C959T18 start *******************************
TEST_SUITE: mbind
TEST_NUMBER: 24
ARGS: -n 2
RESULT: ok
*** C959T18 PASSED
*** C959T19 start *******************************
TEST_SUITE: mbind
TEST_NUMBER: 26
ARGS: -n 2
nodemask = 0
RESULT: ok
*** C959T19 PASSED
*** C959T20 start *******************************
TEST_SUITE: mbind
TEST_NUMBER: 28
ARGS: -n 2
RESULT: ok
*** C959T20 PASSED
*** C959T21 start *******************************
TEST_SUITE: mbind
TEST_NUMBER: 30
ARGS: -n 2
RESULT: ok
*** C959T21 PASSED

Some files were not shown because too many files have changed in this diff