Compare commits

114 Commits
1.4.0 ... 1.5.0

Author SHA1 Message Date
641dfed37e configure.ac: Update version number 2018-04-06 09:14:27 +09:00
4572e6be3f fix mcctrl SMAP - everyone needs copy_to_user 2018-04-03 10:38:44 +09:00
12e44050c9 mcexec: drop READ_IMPLIES_EXEC from personality to avoid device file mapping failure 2018-04-02 20:12:54 +09:00
d5190990f5 mcreboot.sh,mcstop+release.sh: rm -rf /tmp/mcreboot when it's done 2018-03-27 23:25:44 +09:00
82822b1f16 mcreboot.sh: Fix error cases
(1) Restart irqbalance when error occurs after it's stopped
(2) Restore /proc/irq/*/smp_affinity when error occurs after
    they're modified
2018-03-27 22:20:25 +09:00
7f02889f76 mcreboot.sh,mcstop+release.sh: Save /proc/irq/*/smp_affinity to /tmp/mcreboot 2018-03-27 22:01:55 +09:00
9dc86869d8 test: Modify mng_mod/{863,870}/README 2018-03-27 19:36:07 +09:00
02bb127007 test: Modify mng_mod/*/README 2018-03-27 14:53:29 +09:00
c26c4aba4f test: Modify mng_mod/{863,870} 2018-03-13 10:24:52 +09:00
e8d8ad60c2 Modify README files of test/mng_mod/{863,870,882} 2018-03-13 05:04:06 +09:00
a7f645f7df terminate(): fix update_lock and threads_lock order to avoid deadlock 2018-03-25 08:29:53 +09:00
73731d2a0d ihk_mc_map/unmap_virtual(): do proper TLB invalidation 2018-03-24 07:58:08 +09:00
0f049c5ed7 Modify README of #863 and #870 2018-03-12 17:13:16 +09:00
8d5f95de04 schedule: Add comment on #1029
refs #1029
2018-03-12 17:11:20 +09:00
88fca2c0df issue/{863, 870}/README: update test items 2018-03-23 16:08:17 +09:00
81d18e35dd rename files 2018-03-23 15:35:24 +09:00
309da8fc53 issue/863: add 8 testcases 2018-03-23 14:48:18 +09:00
535e3f3af6 issue/863/CT300x: add timestamp and check 2018-03-23 13:28:19 +09:00
4c80dca479 issue/863/README: add how to execute stress_test 2018-03-23 12:26:13 +09:00
7bef1f5117 Remove debug-print from do_syscall() 2018-03-12 02:07:12 +09:00
bb8c8355c2 small fix: testcases for #1032, #1033, #1034 2018-03-19 16:28:18 +09:00
fab0641813 prepare_process_ranges_args_envs(): fix generating saved_cmdline to avoid PF in strlen() 2018-03-19 13:56:04 +09:00
ce3af4734a fix: dual hold_thread() in do_kill() 2018-03-19 11:12:50 +09:00
e2dea4e9f8 mcexec_start_image(): handle IKC send timeout 2018-03-17 21:33:17 +09:00
0d9c1df75a update: testcases and result for #1032, #1033, #1034 2018-03-16 11:14:29 +09:00
6a979cf4b8 add: testcases for #1032, #1033, #1034 2018-03-15 14:31:29 +09:00
c107d1fdf9 fix: Bug for measuring rss in fork()
refs: #1032
2018-03-15 14:29:16 +09:00
bc89a51e00 fix: getrusage's u|stime race-condition caused by release_thread() and getrusage() 2018-03-15 14:26:39 +09:00
9da9e755fa Issue#923: add test cases 2018-03-15 10:13:16 +09:00
fe42481d6f Add allow_oversubscribe kernel argument
It's not allowed in the default setting.
Execute mcreboot.sh with -O option to allow it.

refs #1072
2018-03-10 13:08:38 +09:00
b1ea6eb82a procfs: Show Linux /proc/self/cgroup
Support the case where McKernel process retrieves its job-id when running under
the Fujitsu TCS suite.
2018-03-10 11:58:45 +09:00
8c2e20c3aa uti: Fix uti thread on the McKernel side blocks others in do_syscall()
It could block other threads on the same CPU in do_syscall() since it busy-waits after woken up
because it's not allowed to sleep again.
2018-03-09 18:02:45 +09:00
65667709a8 Fix thread status race-condition caused by hold_thread() in do_kill() and terminate()
Conflicts:
	arch/x86_64/kernel/syscall.c
	kernel/syscall.c
2018-03-09 17:53:17 +09:00
51bc5fd61f uti: Fix wrong argument passed to ihk_ikc_release_packet() in mcexec_terminate_thread()
Conflicts:
	executer/kernel/mcctrl/control.c
2018-03-09 17:44:30 +09:00
3b277b2354 uti: Fix dead-lock of calling terminate() from terminate()
Conflicts:
	arch/x86_64/kernel/syscall.c
	kernel/syscall.c
2018-03-09 17:38:55 +09:00
3e4c9bdd90 Fix lock of struct wait_queue_head_list_node 2018-03-09 17:31:10 +09:00
06b1b4f8ab Fix deadlock on thread->times_update in getrusage()
Set thread->in_kernel properly on exiting interrupt handler when entering
it from kernel mode.

Conflicts:
	arch/x86_64/kernel/cpu.c
	kernel/mem.c
2018-03-09 17:26:31 +09:00
7b4de6e6c2 mcstat: Clean-up Makefile.in 2018-03-09 14:36:01 +09:00
1c266f4849 mcstat: Fix build error 2018-03-09 14:31:07 +09:00
b7a7281195 fix: Bug for getrusage often return incorrect ru_stime
refs #1034
2018-03-07 13:11:37 +09:00
b77732fb4f fix: Bug for getrusage(RUSAGE_CHILDREN) return parent info (POSTK_DEBUG_TEMP_FIX_72)
refs #1033
2018-03-07 13:10:45 +09:00
a224bf648a fix: Bug for getrusage return incorrect ru_maxrss
refs #1032
2018-03-07 13:09:24 +09:00
642520f80c rus_vm_fault: If page fault occurs in a thread that has not processed system call offloading, incorrectly return to normal.
refs #923
2018-03-07 10:22:47 +09:00
5cb75b00c7 mcexec_destroy_per_process_data: System calls delegation can not be terminated in error when the last process that closed /dev/mcos0 is a child process.
refs #882
2018-03-07 09:11:37 +09:00
7dd0d1137f revert for fix git message
This reverts commit 840acd6021.
2018-03-07 09:09:28 +09:00
cb2fe29f06 fix build error 2018-03-05 10:57:10 +09:00
3432f46d8b fix & add: testcases for refs #885, refs #1031 2018-03-01 15:41:58 +09:00
afcf1a24aa add: testcases for refs #885, refs #1031 2018-03-01 10:24:21 +09:00
140f813d77 fix: differences in behavior of sigaction between Linux and McKernel 2018-03-01 09:44:44 +09:00
7ad6f9595c fix: bug for ptrace_attach self pid 2018-03-01 09:37:12 +09:00
1796c20b88 A bug for not installing mcstat is fixed. 2018-02-25 11:46:16 +09:00
0da5b76916 Merge branch 'development' of postpeta.pccluster.org:mckernel into development 2018-02-25 11:03:13 +09:00
4ac1efae6c - mcstat is a tool to report McKernel statistics from Linux side.
This is a response to a CEA's request.
	- The tools directory is created under the mckernel directory.
	- Some include files are now installed in the install directory,
	  but we should rethink of it.
2018-02-25 10:57:28 +09:00
523a066245 sigaction: support for SA_RESETHAND on x86_64
refs #1031
2018-02-22 11:55:32 +09:00
98df469d29 Issue#882: add test cases 2018-02-22 11:42:43 +09:00
f46287a711 ptrace: support for attaching child_process to parent
refs #885
2018-02-22 09:47:59 +09:00
c260b5c6f3 xpmem: support for fork()
refs #925
2018-02-22 09:37:48 +09:00
c9157f273f do_fork: If mcexec succeeds for fork and McKernel fails fork, the child process of mcexec will remain. 2018-02-14 16:37:38 +09:00
840acd6021 mcexec_destroy_per_process_data: System calls delegation can not be terminated in error when the last process that closed /dev/mcos0 is a child process.
refs #822
2018-02-14 16:34:08 +09:00
c949a894c6 Remove unnecessary files committed by mistake. 2018-02-06 10:43:21 +09:00
228f8f8533 Wait for LWK to run at shutdown.
refs #898
refs #928
2018-02-06 10:40:12 +09:00
8ee9eca74e issue 863: add test cases and test evidences 2018-02-05 16:07:00 +09:00
748429fc92 do_generic_syscall: Even if the system call is normal, if errno is not zero, it returns an error. (TEMP_FIX_75) 2018-02-03 21:37:12 +09:00
a9dfcd9a89 translate_rva_to_rpa(): use 2MB blocks in 1GB pages on x86 2018-01-31 11:16:44 +09:00
559fc9746c signal: check_signal must be called after check_need_resched. 2018-01-28 13:38:51 +09:00
54169bc3ea procfs: indicate heap in /proc/maps 2018-01-26 16:22:43 +09:00
142e923222 procfs: indicate VDSO, vsyscall and stack in /proc/maps 2018-01-26 16:02:32 +09:00
86efc86945 save_syscall_return_value(): separate from check_signal() and call from syscall() (for ARM64) 2018-01-26 14:43:18 +09:00
ebaafa95d8 settid(): clear syscall offload request before populating 2018-01-26 13:54:34 +09:00
b8ee144e67 do_fork(): return -ENOMEM when no more TIDs available 2018-01-26 13:53:05 +09:00
722ae0e7d5 ARM64 arch_clone_thread(): eliminate extra save_fp_regs() 2018-01-26 13:51:38 +09:00
f56e087208 init_process_stack(): fix stack alignment (align to 64 bytes) 2018-01-26 13:43:23 +09:00
f55f01cc11 signal: If the thread receiving the signal is not current, the signal is not processed. 2018-01-25 22:27:34 +09:00
1fa398cfab do_kill: fix to initialization leakage 2018-01-24 23:11:18 +09:00
8123cc413e Use version string in configure.ac when git repo is not found 2018-01-24 00:52:18 +09:00
d4459cf9f3 Add check to confirm IHK and McKernel with the same version are used 2018-01-24 00:20:57 +09:00
4bb65494e9 signal: When the process receives a termination signal, it first terminates mcexec.
refs #863
refs #870
2018-01-23 14:40:38 +09:00
2f2b3cdc6f signal: interrupt_syscall is called by the core executing the thread that received the signal.
refs #999
2018-01-23 14:31:04 +09:00
1e9f9d9809 update Test for Issue#1029 2018-01-14 14:58:19 +09:00
1b25379c02 small fix: reset switch_ctx flag in schedule() for redo 2018-01-14 14:50:31 +09:00
38bbb4e390 add Test programs for Issue#1029 2018-01-10 11:22:05 +09:00
0fa88f513f fix broken files 2017-12-27 15:28:13 +09:00
cd54c5983a fix openat 2017-12-27 14:59:13 +09:00
6084faeecd make McKernel's execve behave same as Linux when argv or envp is set to NULL (fix for TEMP_FIX_21) 2017-12-26 17:43:17 +09:00
d209c00a30 part of Issue#994
mcexec: open syscall moves to arch_dep
do_fork: don't use __NR_fork. use __NR_clone
vfork: moves to arch_dep
2017-12-26 10:30:33 +09:00
9a5d5feb9c time(): Split into architecture dependent functions
This fixes the bug reported as POSTK_ARCH_DEP_13 and POSTK_DEBUG_ARCH_DEP_13.
2017-12-23 11:36:52 +09:00
0cda763f95 fix /proc/*/pagemap
refs #387
2017-12-25 16:08:51 +09:00
cc7be46b7d make sure to context-switch to idle thread when thread's status is PS_EXITED
refs #1029
2017-12-25 13:32:42 +09:00
589504dc33 mcreboot: -h to indicate halting CPU in idle threads (e.g., in futex_wait()) 2017-12-18 11:22:15 +09:00
bf2f38051b mcreboot-smp: offline/online MCDRAM in one go 2017-12-06 14:41:25 +09:00
2d2d0af6fb add test for Issue#873, 1011 2017-11-29 12:23:20 +09:00
7f47dc78a1 add Issue#727 test cases 2017-11-29 11:32:40 +09:00
c3c9187ed5 add test for portability (kahansei_kojo in dev_V) 2017-11-28 17:55:23 +09:00
aebacb243e User Space:swapout (this is a rebase commit to merge into development) 2017-11-28 09:16:00 +09:00
5a8d1f09e8 add test/dump/README 2017-11-27 19:39:16 +09:00
0e10b6d1ee test/strace: Fix permission 2017-11-22 06:31:32 +09:00
d649d6fc2d Include mbind support (this is a rebase commit to merge into development) 2017-11-27 11:16:53 +09:00
bad487cc07 add regression test result for strace 2017-11-25 18:30:51 +09:00
3b6056fb1a add strace test cases and test result 2017-11-25 17:37:10 +09:00
5cc738d6bd add test programs for strace 2017-11-25 14:35:17 +09:00
c9fa445f54 Merge branch 'development' of pccluster.org:mckernel into development 2017-11-22 10:53:33 +09:00
d273a2f58b add strace bundled test cases 2017-11-22 10:52:30 +09:00
4e7069d499 add: proc|sys fs format_checker (tool) 2017-11-22 09:39:48 +09:00
66f44e77af mcstop+release.sh: Allow ihkmond to flush kmsg buffer 2017-11-20 18:28:48 +09:00
35f908b75c mcexec: protect against incorrect partitioned execution argument (-n) using timeouts 2017-11-20 17:06:01 +09:00
2f0089dfb9 mcstop+release: use ihkconfig release mem all 2017-11-20 17:06:01 +09:00
2af6d5115a fix: depending arch futex_atomic_op_inuser() (a part of ARCH_DEP_8) 2017-11-20 16:42:47 +09:00
ac25c5e1e7 fix: depending arch in Makefile (POSTK_DEBUG_ARCH_DEP_1) 2017-11-20 14:45:18 +09:00
90c0355d90 add setting process of pgshift to remap_process_memory_range
refs #955
2017-11-20 14:17:03 +09:00
43230eb623 fix: checking the return code of fork() in Linux.
refs #906
2017-11-15 15:46:47 +09:00
f18dc8428d fix: error code of perf_event_open, when unsupported event is specified.
refs #1030
2017-11-15 12:49:56 +09:00
ab53c8e0a4 execve: fix memory leak
refs #727
2017-11-09 16:44:31 +09:00
6c33e236d7 mcreboot: Fix umask for /proc and /sys files 2017-10-27 04:57:44 +09:00
85d36f1469 mcexec: check kernel version <= 3.10 for RHEL mcoverlayfs 2017-10-31 13:39:31 +09:00
470 changed files with 23311 additions and 786 deletions

View File

@ -4,7 +4,7 @@ INCDIR = @INCDIR@
ETCDIR = @ETCDIR@
MANDIR = @MANDIR@
all: executer-mcctrl executer-mcoverlayfs executer-user mckernel
all: executer-mcctrl executer-mcoverlayfs executer-user mckernel mck-tools
executer-mcctrl:
+@(cd executer/kernel/mcctrl; $(MAKE) modules)
@ -26,6 +26,9 @@ mckernel:
;; \
esac
mck-tools:
+@(cd tools/mcstat; $(MAKE))
install:
@(cd executer/kernel/mcctrl; $(MAKE) install)
@(cd executer/kernel/mcoverlayfs; $(MAKE) install)
@ -60,6 +63,7 @@ install:
exit 1 \
;; \
esac
@(cd tools/mcstat/; $(MAKE) install)
clean:
@(cd executer/kernel/mcctrl; $(MAKE) clean)
@ -74,3 +78,4 @@ clean:
exit 1 \
;; \
esac
@(cd tools/mcstat; $(MAKE) clean)

View File

@ -1172,8 +1172,6 @@ void arch_clone_thread(struct thread *othread, unsigned long pc,
asm("mrs %0, tpidr_el0" : "=r" (tls));
othread->tlsblock_base = nthread->tlsblock_base = tls;
/* copy fp_regs values from parent. */
save_fp_regs(othread);
if ((othread->fp_regs != NULL) && (check_and_allocate_fp_regs(nthread) == 0)) {
memcpy(nthread->fp_regs, othread->fp_regs, sizeof(fp_regs_struct));
}

View File

@ -144,5 +144,3 @@ SYSCALL_HANDLED(1045, signalfd)
SYSCALL_DELEGATED(1049, stat)
SYSCALL_DELEGATED(1060, getpgrp)
SYSCALL_DELEGATED(1062, time)
SYSCALL_HANDLED(1071, vfork)
SYSCALL_DELEGATED(1079, fork)

View File

@ -1807,7 +1807,6 @@ static int clear_range_l1(void *args0, pte_t *ptep, uint64_t base,
ihk_mc_free_pages_user(phys_to_virt(phys), npages);
dkprintf("%s: freeing regular page at 0x%lx\n", __FUNCTION__, base);
}
args->vm->currss -= PTL1_SIZE;
}
return 0;
@ -1887,7 +1886,6 @@ static int clear_range_middle(void *args0, pte_t *ptep, uint64_t base,
ihk_mc_free_pages_user(phys_to_virt(phys), npages);
dkprintf("%s(level=%d): freeing large page at 0x%lx\n", __FUNCTION__, level, base);
}
args->vm->currss -= tbl.pgsize;
}
return 0;

View File

@ -13,6 +13,7 @@
#include <hwcap.h>
#include <prctl.h>
#include <limits.h>
#include <syscall.h>
extern void ptrace_report_signal(struct thread *thread, int sig);
extern void clear_single_step(struct thread *thread);
@ -1322,6 +1323,17 @@ interrupt_from_user(void *regs0)
return((regs->pstate & PSR_MODE_MASK) == PSR_MODE_EL0t);
}
void save_syscall_return_value(int num, unsigned long rc)
{
/*
* Save syscall return value.
*/
if (cpu_local_var(current) && cpu_local_var(current)->uctx &&
num != __NR_rt_sigsuspend) {
ihk_mc_syscall_arg0(cpu_local_var(current)->uctx) = rc;
}
}
void
check_signal(unsigned long rc, void *regs0, int num)
{
@ -1346,16 +1358,6 @@ __check_signal(unsigned long rc, void *regs0, int num, int irq_disabled)
return;
thread = cpu_local_var(current);
/**
* If check_signal is called from syscall(),
* then save syscall return value.
*/
if((regs == NULL)&&(num != __NR_rt_sigsuspend)){ /* It's call from syscall! */
// Get user context through current thread
// and update syscall return.
ihk_mc_syscall_arg0(thread->uctx) = rc;
}
if(thread == NULL || thread->proc->pid == 0){
struct thread *t;
irqstate = ihk_mc_spinlock_lock(&(cpu_local_var(runq_lock)));
@ -2497,4 +2499,15 @@ out:
return mpsr->phase_ret;
}
time_t time(void) {
struct timespec ats;
if (gettime_local_support) {
calculate_time_from_tsc(&ats);
return ats.tv_sec;
}
return (time_t)0;
}
/*** End of File ***/

View File

@ -849,6 +849,7 @@ void setup_x86_ap(void (*next_func)(void))
void arch_show_interrupt_context(const void *reg);
void set_signal(int sig, void *regs, struct siginfo *info);
void check_signal(unsigned long, void *, int);
void check_sig_pending();
extern void tlb_flush_handler(int vector);
void __show_stack(uintptr_t *sp) {
@ -870,6 +871,19 @@ void show_context_stack(uintptr_t *rbp) {
return;
}
void interrupt_exit(struct x86_user_context *regs)
{
if (interrupt_from_user(regs)) {
cpu_enable_interrupt();
check_sig_pending();
check_need_resched();
check_signal(0, regs, 0);
}
else {
check_sig_pending();
}
}
void handle_interrupt(int vector, struct x86_user_context *regs)
{
struct ihk_mc_interrupt_handler *h;
@ -992,12 +1006,8 @@ void handle_interrupt(int vector, struct x86_user_context *regs)
}
}
if(interrupt_from_user(regs)){
cpu_enable_interrupt();
check_signal(0, regs, 0);
check_need_resched();
}
set_cputime(0);
interrupt_exit(regs);
set_cputime(interrupt_from_user(regs)? 0: 1);
--v->in_interrupt;
}
@ -1012,13 +1022,9 @@ void gpe_handler(struct x86_user_context *regs)
panic("gpe_handler");
}
set_signal(SIGSEGV, regs, NULL);
if(interrupt_from_user(regs)){
cpu_enable_interrupt();
check_signal(0, regs, 0);
check_need_resched();
}
set_cputime(0);
// panic("GPF");
interrupt_exit(regs);
set_cputime(interrupt_from_user(regs)? 0: 1);
panic("GPF");
}
void debug_handler(struct x86_user_context *regs)
@ -1045,12 +1051,8 @@ void debug_handler(struct x86_user_context *regs)
memset(&info, '\0', sizeof info);
info.si_code = si_code;
set_signal(SIGTRAP, regs, &info);
if(interrupt_from_user(regs)){
cpu_enable_interrupt();
check_signal(0, regs, 0);
check_need_resched();
}
set_cputime(0);
interrupt_exit(regs);
set_cputime(interrupt_from_user(regs)? 0: 1);
}
void int3_handler(struct x86_user_context *regs)
@ -1067,12 +1069,8 @@ void int3_handler(struct x86_user_context *regs)
memset(&info, '\0', sizeof info);
info.si_code = TRAP_BRKPT;
set_signal(SIGTRAP, regs, &info);
if(interrupt_from_user(regs)){
cpu_enable_interrupt();
check_signal(0, regs, 0);
check_need_resched();
}
set_cputime(0);
interrupt_exit(regs);
set_cputime(interrupt_from_user(regs)? 0: 1);
}
void
@ -1582,6 +1580,8 @@ void arch_show_interrupt_context(const void *reg)
__kprintf("%16lx %16lx %16lx %16lx\n",
regs->cs, regs->ss, regs->rflags, regs->error);
kprintf_unlock(irqflags);
return;
arch_show_extended_context();
arch_print_pre_interrupt_stack(regs);

View File

@ -64,7 +64,6 @@ static inline int futex_atomic_cmpxchg_inatomic(int __user *uaddr, int oldval,
return oldval;
}
#ifdef POSTK_DEBUG_ARCH_DEP_8 /* arch depend hide */
static inline int futex_atomic_op_inuser(int encoded_op, int __user *uaddr)
{
int op = (encoded_op >> 28) & 7;
@ -128,7 +127,6 @@ static inline int futex_atomic_op_inuser(int encoded_op, int __user *uaddr)
}
return ret;
}
#endif /* !POSTK_DEBUG_ARCH_DEP_8 */
static inline int get_futex_value_locked(uint32_t *dest, uint32_t *from)
{

View File

@ -18,6 +18,11 @@
#define _NSIG_BPW 64
#define _NSIG_WORDS (_NSIG / _NSIG_BPW)
static inline int valid_signal(unsigned long sig)
{
return sig <= _NSIG ? 1 : 0;
}
typedef unsigned long int __sigset_t;
#define __sigmask(sig) (((__sigset_t) 1) << ((sig) - 1))
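
The `__sigmask(sig)` macro above maps signal number `sig` to bit `sig - 1` of a one-word signal set, and the new `valid_signal()` bounds-checks a signal number against `_NSIG`. The sketch below restates both in standalone C and adds the inverse mapping, which has the same shape as the shift-and-count loop in `check_sig_pending_thread()` elsewhere in this comparison. It is an illustration only: `NSIG_SKETCH`, `sigset_word_t`, `sig_to_mask()`, `mask_to_sig()` and `valid_signal_sketch()` are hypothetical names, and the value 64 for `_NSIG` is an assumption based on the `sig > 64` check that appears in `rt_sigaction`.

```c
#include <assert.h>
#include <stdint.h>

#define NSIG_SKETCH 64              /* assumption: stands in for _NSIG */

typedef uint64_t sigset_word_t;     /* stand-in for the one-word __sigset_t */

/* Same mapping as __sigmask(sig): signal N lives in bit N - 1. */
sigset_word_t sig_to_mask(int sig)
{
    assert(sig >= 1 && sig <= NSIG_SKETCH);
    return (sigset_word_t)1 << (sig - 1);
}

/*
 * Inverse mapping: shift right until the word is empty and count the
 * shifts. For a single-bit mask this yields the signal number; for a
 * multi-bit mask it yields the highest signal present; for 0 it yields 0.
 */
int mask_to_sig(sigset_word_t mask)
{
    int sig = 0;

    while (mask) {
        sig++;
        mask >>= 1;
    }
    return sig;
}

/* Same comparison as valid_signal(): note that 0 is accepted. */
int valid_signal_sketch(unsigned long sig)
{
    return sig <= NSIG_SKETCH ? 1 : 0;
}
```

Because the comparison is `sig <= _NSIG` on an unsigned value, signal number 0 passes the check. Whether callers treat 0 as "no signal" is not visible in this diff.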

View File

@ -56,7 +56,7 @@ SYSCALL_HANDLED(36, getitimer)
SYSCALL_HANDLED(38, setitimer)
SYSCALL_HANDLED(39, getpid)
SYSCALL_HANDLED(56, clone)
SYSCALL_DELEGATED(57, fork)
SYSCALL_HANDLED(57, fork)
SYSCALL_HANDLED(58, vfork)
SYSCALL_HANDLED(59, execve)
SYSCALL_HANDLED(60, exit)

View File

@ -145,6 +145,8 @@ nmi:
movq %rsp,%gs:PANIC_REGS+0x08
movl nmi_mode(%rip),%eax
cmp $3,%rax
je 4f
cmp $1,%rax
je 1f
cmp $2,%rax
@ -199,9 +201,9 @@ nmi:
movl %eax,%gs:PANIC_REGS+0xA0
movq $1,%gs:PANICED
call ihk_mc_query_mem_areas
1:
4:
hlt
jmp 1b
jmp 4b
.globl x86_syscall
x86_syscall:

View File

@ -493,7 +493,7 @@ uint64_t ihk_mc_pt_virt_to_pagemap(struct page_table *pt, unsigned long virt)
error = ihk_mc_pt_virt_to_phys(pt, (void *)virt, &phys);
if (error) {
return 0;
return PM_PSHIFT(PAGE_SHIFT);
}
pagemap = PM_PFRAME(phys >> PAGE_SHIFT);
@ -1542,7 +1542,6 @@ static int clear_range_l1(void *args0, pte_t *ptep, uint64_t base,
dkprintf("%lx-,%s: calling memory_stat_rss_sub(),phys=%lx,size=%ld,pgsize=%ld\n", pte_get_phys(&old), __FUNCTION__, pte_get_phys(&old), PTL1_SIZE, PTL1_SIZE);
rusage_memory_stat_sub(args->memobj, PTL1_SIZE, PTL1_SIZE);
}
args->vm->currss -= PTL1_SIZE;
} else {
dkprintf("%s: !calling memory_stat_rss_sub(),virt=%lx,phys=%lx\n", __FUNCTION__, base, pte_get_phys(&old));
}
@ -1611,7 +1610,6 @@ static int clear_range_l2(void *args0, pte_t *ptep, uint64_t base,
dkprintf("%lx-,%s: calling memory_stat_rss_sub(),phys=%lx,size=%ld,pgsize=%ld\n", pte_get_phys(&old), __FUNCTION__, pte_get_phys(&old), PTL2_SIZE, PTL2_SIZE);
rusage_memory_stat_sub(args->memobj, PTL2_SIZE, PTL2_SIZE);
}
args->vm->currss -= PTL2_SIZE;
}
}
@ -1693,7 +1691,6 @@ static int clear_range_l3(void *args0, pte_t *ptep, uint64_t base,
dkprintf("%lx-,%s: calling memory_stat_rss_sub(),phys=%lx,size=%ld,pgsize=%ld\n", pte_get_phys(&old), __FUNCTION__, pte_get_phys(&old), PTL3_SIZE, PTL3_SIZE);
rusage_memory_stat_sub(args->memobj, PTL3_SIZE, PTL3_SIZE);
}
args->vm->currss -= PTL3_SIZE;
}
}
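
The first hunk of this file changes `ihk_mc_pt_virt_to_pagemap()` so that a failed virtual-to-physical lookup no longer returns a bare 0 but an entry that still carries the page shift. The sketch below shows how such an entry could be packed and unpacked. The bit layout is an assumption taken from the older Linux `/proc/<pid>/pagemap` format (PFN in bits 0-54, page shift in bits 55-60, present flag in bit 63); the McKernel definitions of `PM_PFRAME` and `PM_PSHIFT` are not shown in this diff and may differ. All names ending in `_SKETCH` and the `pm_*` helpers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout (older Linux pagemap format), not taken from this diff. */
#define PM_PFRAME_MASK_SKETCH   ((UINT64_C(1) << 55) - 1)  /* bits 0-54  */
#define PM_PSHIFT_OFFSET_SKETCH 55                         /* bits 55-60 */
#define PM_PSHIFT_MASK_SKETCH   UINT64_C(0x3f)
#define PM_PRESENT_SKETCH       (UINT64_C(1) << 63)        /* bit 63     */

/* Entry for an unmapped address: page shift only, present bit clear. */
uint64_t pm_encode_absent(unsigned int page_shift)
{
    assert(page_shift <= PM_PSHIFT_MASK_SKETCH);
    return (uint64_t)page_shift << PM_PSHIFT_OFFSET_SKETCH;
}

/* Entry for a mapped address: PFN, page shift and present bit. */
uint64_t pm_encode_present(uint64_t pfn, unsigned int page_shift)
{
    return (pfn & PM_PFRAME_MASK_SKETCH) |
           pm_encode_absent(page_shift) |
           PM_PRESENT_SKETCH;
}

int pm_is_present(uint64_t entry)
{
    return (entry & PM_PRESENT_SKETCH) != 0;
}

unsigned int pm_page_shift(uint64_t entry)
{
    return (unsigned int)((entry >> PM_PSHIFT_OFFSET_SKETCH) &
                          PM_PSHIFT_MASK_SKETCH);
}

uint64_t pm_pfn(uint64_t entry)
{
    return entry & PM_PFRAME_MASK_SKETCH;
}
```

Under this layout a reader of the pagemap file can still learn the page size of an unmapped address, which a plain 0 entry would not convey.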

View File

@ -30,8 +30,9 @@
#include <ihk/ikc.h>
#include <page.h>
#include <limits.h>
#include <syscall.h>
void terminate(int, int);
void terminate_mcexec(int, int);
extern long do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact);
long syscall(int num, ihk_mc_user_context_t *ctx);
void set_signal(int sig, void *regs0, siginfo_t *info);
@ -142,8 +143,6 @@ SYSCALL_DECLARE(rt_sigaction)
struct k_sigaction new_sa, old_sa;
int rc;
if(sig == SIGKILL || sig == SIGSTOP || sig <= 0 || sig > 64)
return -EINVAL;
if (sigsetsize != sizeof(sigset_t))
return -EINVAL;
@ -251,8 +250,8 @@ SYSCALL_DECLARE(rt_sigreturn)
regs->gpr.rflags &= ~RFLAGS_TF;
info.si_code = TRAP_TRACE;
set_signal(SIGTRAP, regs, &info);
check_signal(0, regs, 0);
check_need_resched();
check_signal(0, regs, 0);
}
if(ksigsp.fpregs && xsavesize){
@ -279,6 +278,7 @@ SYSCALL_DECLARE(rt_sigreturn)
extern struct cpu_local_var *clv;
extern unsigned long do_kill(struct thread *thread, int pid, int tid, int sig, struct siginfo *info, int ptracecont);
extern void interrupt_syscall(struct thread *, int sig);
extern void terminate(int, int);
extern int num_processors;
#define RFLAGS_MASK (RFLAGS_CF | RFLAGS_PF | RFLAGS_AF | RFLAGS_ZF | \
@ -807,6 +807,11 @@ do_signal(unsigned long rc, void *regs0, struct thread *thread, struct sig_pendi
regs->gpr.rip = (unsigned long)k->sa.sa_handler;
regs->gpr.rsp = (unsigned long)usp;
// check signal handler is ONESHOT
if (k->sa.sa_flags & SA_RESETHAND) {
k->sa.sa_handler = SIG_DFL;
}
if(!(k->sa.sa_flags & SA_NODEFER))
thread->sigmask.__val[0] |= pending->sigmask.__val[0];
kfree(pending);
@ -818,8 +823,8 @@ do_signal(unsigned long rc, void *regs0, struct thread *thread, struct sig_pendi
regs->gpr.rflags &= ~RFLAGS_TF;
info.si_code = TRAP_TRACE;
set_signal(SIGTRAP, regs, &info);
check_signal(0, regs, 0);
check_need_resched();
check_signal(0, regs, 0);
}
}
else {
@ -1006,6 +1011,12 @@ interrupt_from_user(void *regs0)
return !(regs->gpr.rsp & 0x8000000000000000);
}
void save_syscall_return_value(int num, unsigned long rc)
{
/* Empty on x86 */
return;
}
void
check_signal(unsigned long rc, void *regs0, int num)
{
@ -1053,6 +1064,110 @@ out:
return;
}
static int
check_sig_pending_thread(struct thread *thread)
{
int found = 0;
struct list_head *head;
mcs_rwlock_lock_t *lock;
struct mcs_rwlock_node_irqsave mcs_rw_node;
struct sig_pending *next;
struct sig_pending *pending;
__sigset_t w;
__sigset_t x;
int sig = 0;
struct k_sigaction *k;
struct cpu_local_var *v;
v = get_this_cpu_local_var();
w = thread->sigmask.__val[0];
lock = &thread->sigcommon->lock;
head = &thread->sigcommon->sigpending;
for (;;) {
mcs_rwlock_reader_lock(lock, &mcs_rw_node);
list_for_each_entry_safe(pending, next, head, list){
for (x = pending->sigmask.__val[0], sig = 0; x;
sig++, x >>= 1);
k = thread->sigcommon->action + sig - 1;
if ((sig != SIGCHLD && sig != SIGURG) ||
(k->sa.sa_handler != (void *)1 &&
k->sa.sa_handler != NULL)) {
if (!(pending->sigmask.__val[0] & w)) {
if (pending->interrupted == 0) {
pending->interrupted = 1;
found = 1;
if (sig != SIGCHLD &&
sig != SIGURG &&
!k->sa.sa_handler) {
found = 2;
break;
}
}
}
}
}
mcs_rwlock_reader_unlock(lock, &mcs_rw_node);
if (found == 2) {
break;
}
if (lock == &thread->sigpendinglock) {
break;
}
lock = &thread->sigpendinglock;
head = &thread->sigpending;
}
if (found == 2) {
ihk_mc_spinlock_unlock(&v->runq_lock, v->runq_irqstate);
terminate_mcexec(0, sig);
return 1;
}
else if (found == 1) {
ihk_mc_spinlock_unlock(&v->runq_lock, v->runq_irqstate);
interrupt_syscall(thread, 0);
return 1;
}
return 0;
}
void
check_sig_pending()
{
struct thread *thread;
struct cpu_local_var *v;
if (clv == NULL)
return;
v = get_this_cpu_local_var();
repeat:
v->runq_irqstate = ihk_mc_spinlock_lock(&v->runq_lock);
list_for_each_entry(thread, &(v->runq), sched_list) {
if (thread == NULL || thread == &cpu_local_var(idle)) {
continue;
}
if (thread->in_syscall_offload == 0) {
continue;
}
if (thread->proc->exit_status & 0x0000000100000000L) {
continue;
}
if (check_sig_pending_thread(thread))
goto repeat;
}
ihk_mc_spinlock_unlock(&v->runq_lock, v->runq_irqstate);
}
unsigned long
do_kill(struct thread *thread, int pid, int tid, int sig, siginfo_t *info,
int ptracecont)
@ -1214,15 +1329,19 @@ done:
mcs_rwlock_reader_lock_noirq(&tproc->update_lock, &updatelock);
savelock = &tthread->sigpendinglock;
head = &tthread->sigpending;
if(sig == SIGKILL ||
(tproc->status != PS_EXITED &&
tproc->status != PS_ZOMBIE &&
tthread->status != PS_EXITED)){
hold_thread(tthread);
mcs_rwlock_reader_lock_noirq(&tproc->threads_lock, &lock);
if (tthread->status != PS_EXITED &&
(sig == SIGKILL ||
(tproc->status != PS_EXITED && tproc->status != PS_ZOMBIE))) {
if ((rc = hold_thread(tthread))) {
kprintf("%s: ERROR hold_thread returned %d,tid=%d\n", __FUNCTION__, rc, tthread->tid);
tthread = NULL;
}
}
else{
tthread = NULL;
}
mcs_rwlock_reader_unlock_noirq(&tproc->threads_lock, &lock);
mcs_rwlock_reader_unlock_noirq(&tproc->update_lock, &updatelock);
mcs_rwlock_reader_unlock_noirq(&thash->lock[hash], &lock);
}
@ -1249,7 +1368,9 @@ done:
}
if (tthread->thread_offloaded) {
interrupt_syscall(tthread, sig);
if (!tthread->proc->nohost) {
interrupt_syscall(tthread, sig);
}
release_thread(tthread);
return 0;
}
@ -1284,6 +1405,7 @@ done:
rc = -ENOMEM;
}
else{
memset(pending, 0, sizeof(struct sig_pending));
pending->sigmask.__val[0] = mask;
memcpy(&pending->info, info, sizeof(siginfo_t));
pending->ptracecont = ptracecont;
@ -1307,9 +1429,6 @@ done:
ihk_mc_interrupt_cpu(get_x86_cpu_local_variable(tthread->cpu_id)->apic_id, 0xd0);
}
if(!tthread->proc->nohost)
interrupt_syscall(tthread, 0);
if (status != PS_RUNNING) {
if(sig == SIGKILL){
/* Wake up the target only when stopped by ptrace-reporting */
@ -1486,6 +1605,16 @@ SYSCALL_DECLARE(clone)
ihk_mc_syscall_sp(ctx));
}
SYSCALL_DECLARE(fork)
{
return do_fork(SIGCHLD, 0, 0, 0, 0, ihk_mc_syscall_pc(ctx), ihk_mc_syscall_sp(ctx));
}
SYSCALL_DECLARE(vfork)
{
return do_fork(CLONE_VFORK|SIGCHLD, 0, 0, 0, 0, ihk_mc_syscall_pc(ctx), ihk_mc_syscall_sp(ctx));
}
SYSCALL_DECLARE(shmget)
{
const key_t key = ihk_mc_syscall_arg0(ctx);
@ -2540,4 +2669,14 @@ out:
return mpsr->phase_ret;
}
time_t time(void) {
struct syscall_request sreq IHK_DMA_ALIGN;
struct thread *thread = cpu_local_var(current);
time_t ret;
sreq.number = __NR_time;
sreq.args[0] = (uintptr_t)NULL;
ret = (time_t)do_syscall(&sreq, ihk_mc_get_processor_id(), thread->proc->pid);
return ret;
}
/*** End of File ***/
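
The `do_signal()` hunk in this file resets the handler to `SIG_DFL` when `SA_RESETHAND` is set, which is the one-shot behavior POSIX specifies for that flag. The following user-space sketch shows the behavior that change is meant to match. It is ordinary POSIX code, not McKernel code, and `on_signal()` and `resethand_resets_handler()` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t handler_calls;

static void on_signal(int sig)
{
    (void)sig;
    handler_calls++;
}

/*
 * Install a one-shot handler, deliver the signal once, then query the
 * current disposition. Returns 1 if the handler ran exactly once and the
 * disposition is back to SIG_DFL, 0 if not, -1 on a system call error.
 */
int resethand_resets_handler(int signo)
{
    struct sigaction sa, cur;

    /* The sketch only exercises the two user-defined signals. */
    assert(signo == SIGUSR1 || signo == SIGUSR2);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_signal;
    sa.sa_flags = SA_RESETHAND;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, NULL) != 0)
        return -1;

    handler_calls = 0;
    if (raise(signo) != 0)          /* handler runs before raise() returns */
        return -1;

    /* Query only: a second raise() would now take the default action. */
    if (sigaction(signo, NULL, &cur) != 0)
        return -1;

    return handler_calls == 1 && cur.sa_handler == SIG_DFL;
}
```

The function deliberately does not raise the signal a second time, because the default action for `SIGUSR1` and `SIGUSR2` terminates the process.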

View File

@ -35,7 +35,6 @@ error_exit() {
;&
tmp_mcos_created)
if [ "$enable_mcoverlay" == "yes" ]; then
umask $umask_old
rm -rf /tmp/mcos
fi
;&
@ -48,10 +47,6 @@ error_exit() {
}
fi
# Change umask for /proc and /sys files
umask_dec=$(( 8#${umask_old} & 8#0002 ))
umask 0`printf "%o" ${umask_dec}`
if [ ! -e /tmp/mcos ]; then
mkdir -p /tmp/mcos;
fi
@ -149,7 +144,3 @@ for cpuid in `find /sys/bus/cpu/devices/* -maxdepth 0 -name "cpu[0123456789]*" -
rm -rf /tmp/mcos/mcos0_sys/bus/cpu/devices/$cpuid
fi
done
# Restore umask
umask ${umask_old}

View File

@ -19,6 +19,7 @@ ETCDIR=@ETCDIR@
KMODDIR="${prefix}/kmod"
KERNDIR="${prefix}/@TARGET@/kernel"
ENABLE_MCOVERLAYFS="@ENABLE_MCOVERLAYFS@"
MCK_BUILDID=@BUILDID@
mem="512M@0"
cpus=""
@ -44,8 +45,10 @@ fi
turbo=""
ihk_irq=""
umask_old=`umask`
idle_halt=""
allow_oversubscribe=""
while getopts :tk:c:m:o:f:r:q:i:d:e: OPT
while getopts :tk:c:m:o:f:r:q:i:d:e:hO OPT
do
case ${OPT} in
f) facility=${OPTARG}
@ -70,6 +73,10 @@ do
;;
i) mon_interval=${OPTARG}
;;
h) idle_halt="idle_halt"
;;
O) allow_oversubscribe="allow_oversubscribe"
;;
*) echo "invalid option -${OPT}" >&2
exit 1
esac
@ -90,6 +97,18 @@ error_exit() {
local status=$1
case $status in
irqbalance_mck_started)
if [ "${irqbalance_used}" == "yes" ]; then
if [ "`systemctl status irqbalance_mck.service 2> /dev/null |grep -E 'Active: active'`" != "" ]; then
if ! systemctl stop irqbalance_mck.service 2>/dev/null; then
echo "warning: failed to stop irqbalance_mck" >&2
fi
if ! systemctl disable irqbalance_mck.service >/dev/null 2>/dev/null; then
echo "warning: failed to disable irqbalance_mck" >&2
fi
fi
fi
;&
mcos_sys_mounted)
if [ "$enable_mcoverlay" == "yes" ]; then
umount /tmp/mcos/mcos0_sys
@ -117,7 +136,6 @@ error_exit() {
;&
tmp_mcos_created)
if [ "$enable_mcoverlay" == "yes" ]; then
umask $umask_old
rm -rf /tmp/mcos
fi
;&
@ -157,20 +175,20 @@ error_exit() {
ihk_loaded)
rmmod ihk 2>/dev/null || echo "warning: failed to remove ihk" >&2
;&
smp_affinity_modified)
umask $umask_old
if [ "${irqbalance_used}" == "yes" ]; then
if ! perl -e '$tmpdir="/tmp/mcreboot"; @files = grep { -f } glob "$tmpdir/proc/irq/*/smp_affinity"; foreach $file (@files) { $dest = substr($file, length($tmpdir)); if (0) {print "cp $file $dest\n";} system("cp $file $dest 2>/dev/null"); }'; then
echo "warning: failed to restore /proc/irq/*/smp_affinity" >&2
fi
if [ -e /tmp/mcreboot ]; then rm -rf /tmp/mcreboot; fi
fi
;&
irqbalance_stopped)
if [ "`systemctl status irqbalance_mck.service 2> /dev/null |grep -E 'Active: active'`" != "" ]; then
if ! systemctl stop irqbalance_mck.service 2>/dev/null; then
echo "warning: failed to stop irqbalance_mck" >&2
fi
if ! systemctl disable irqbalance_mck.service >/dev/null 2>/dev/null; then
echo "warning: failed to disable irqbalance_mck" >&2
fi
if ! etcdir=@ETCDIR@ perl -e '$etcdir=$ENV{'etcdir'}; @files = grep { -f } glob "$etcdir/proc/irq/*/smp_affinity"; foreach $file (@files) { $dest = substr($file, length($etcdir)); if(0) {print "cp $file $dest\n";} system("cp $file $dest 2>/dev/null"); }'; then
echo "warning: failed to restore /proc/irq/*/smp_affinity" >&2
fi
if ! systemctl start irqbalance.service; then
echo "warning: failed to start irqbalance" >&2;
fi
if [ "${irqbalance_used}" == "yes" ]; then
if ! systemctl start irqbalance.service; then
echo "warning: failed to start irqbalance" >&2;
fi
fi
;&
initial)
@ -240,9 +258,9 @@ if [ "${irqbalance_used}" == "yes" ]; then
exit 1
fi;
if ! etcdir=@ETCDIR@ perl -e 'use File::Copy qw(copy); $etcdir=$ENV{'etcdir'}; @files = grep { -f } glob "/proc/irq/*/smp_affinity"; foreach $file (@files) { $rel = substr($file, 1); $dir=substr($rel, 0, length($rel)-length("/smp_affinity")); if(0) { print "cp $file $etcdir/$rel\n";} if(system("mkdir -p $etcdir/$dir")){ exit 1;} if(!copy($file,"$etcdir/$rel")){ exit 1;} }'; then
if ! perl -e 'use File::Copy qw(copy); $tmpdir="/tmp/mcreboot"; @files = grep { -f } glob "/proc/irq/*/smp_affinity"; foreach $file (@files) { $rel = substr($file, 1); $dir = substr($rel, 0, length($rel) - length("/smp_affinity")); if (system("mkdir -p $tmpdir/$dir")) { exit 1; } if (0) { print "cp $file $tmpdir/$rel\n"; } if (!copy($file,"$tmpdir/$rel")) { exit 1; } }'; then
echo "error: saving /proc/irq/*/smp_affinity" >&2
error_exit "mcos_sys_mounted"
error_exit "irqbalance_stopped"
fi;
# Prevent /proc/irq/*/smp_affinity from getting zero after offlining
@ -256,16 +274,20 @@ if [ "${irqbalance_used}" == "yes" ]; then
if ! ncpus=$ncpus smp_affinity_mask=$smp_affinity_mask perl -e '@dirs = grep { -d } glob "/proc/irq/*"; foreach $dir (@dirs) { $hit = 0; $affinity_str = `cat $dir/smp_affinity`; chomp $affinity_str; @int32strs = split /,/, $affinity_str; @int32strs_mask=split /,/, $ENV{'smp_affinity_mask'}; for($i=0;$i <= $#int32strs_mask; $i++) { $int32strs_inv[$i] = sprintf("%08x",hex($int32strs_mask[$i])^0xffffffff); if($i == 0) { $len = int((($ENV{'ncpus'}%32)+3)/4); if($len != 0) { $int32strs_inv[$i] = substr($int32strs_inv[$i], -$len, $len); } } } $inv = join(",", @int32strs_inv); $nint32s = int(($ENV{'ncpus'}+31)/32); for($j = $nint32s - 1; $j >= 0; $j--) { if(hex($int32strs[$nint32s - 1 - $j]) & hex($int32strs_mask[$nint32s - 1 - $j])) { $hit = 1; }} if($hit == 1) { $cmd = "echo $inv > $dir/smp_affinity 2>/dev/null"; system $cmd;}}'; then
echo "error: modifying /proc/irq/*/smp_affinity" >&2
error_exit "mcos_sys_mounted"
error_exit "irqbalance_stopped"
fi
fi
# Set umask so that proc/sys files/directories created by
# mcctrl.ko and mcreboot.sh have appropriate permission bits
umask_dec=$(( 8#${umask_old} & 8#0002 ))
umask 0`printf "%o" ${umask_dec}`
# Load IHK if not loaded
if ! grep -E 'ihk\s' /proc/modules &>/dev/null; then
if ! taskset -c 0 insmod ${KMODDIR}/ihk.ko 2>/dev/null; then
echo "error: loading ihk" >&2
error_exit "irqbalance_stopped"
error_exit "smp_affinity_modified"
fi
fi
@ -299,7 +321,7 @@ if ! grep ihk_smp_@ARCH@ /proc/modules &>/dev/null; then
error_exit "ihk_loaded"
fi
# Offline-reonline RAM (special case for OFP SNC-4 mode)
# Offline-reonline RAM (special case for OFP SNC-4 flat mode)
if [ "`hostname | grep "c[0-9][0-9][0-9][0-9].ofp"`" != "" ] && [ "`cat /sys/devices/system/node/online`" == "0-7" ]; then
for i in 0 1 2 3; do
find /sys/devices/system/node/node$i/memory*/ -name "online" | while read f; do
@ -313,6 +335,22 @@ if ! grep ihk_smp_@ARCH@ /proc/modules &>/dev/null; then
find /sys/devices/system/node/node$i/memory*/ -name "online" | while read f; do
echo 0 > $f 2>&1 > /dev/null;
done
done
for i in 4 5 6 7; do
find /sys/devices/system/node/node$i/memory*/ -name "online" | while read f; do
echo 1 > $f 2>&1 > /dev/null;
done
done
fi
# Offline-reonline RAM (special case for OFP Quadrant flat mode)
if [ "`hostname | grep "c[0-9][0-9][0-9][0-9].ofp"`" != "" ] && [ "`cat /sys/devices/system/node/online`" == "0-1" ]; then
for i in 1; do
find /sys/devices/system/node/node$i/memory*/ -name "online" | while read f; do
echo 0 > $f 2>&1 > /dev/null;
done
done
for i in 1; do
find /sys/devices/system/node/node$i/memory*/ -name "online" | while read f; do
echo 1 > $f 2>&1 > /dev/null;
done
@ -337,6 +375,13 @@ if ! grep mcctrl /proc/modules &>/dev/null; then
fi
fi
# Check that different versions of binaries/scripts are not mixed
IHK_BUILDID=`${SBINDIR}/ihkconfig 0 get buildid`
if [ "${IHK_BUILDID}" != "${MCK_BUILDID}" ]; then
echo "IHK build-id (${IHK_BUILDID}) didn't match McKernel build-id (${MCK_BUILDID})." >&2
exit 1
fi
# Destroy all LWK instances
if ls /dev/mcos* 1>/dev/null 2>&1; then
for i in /dev/mcos*; do
@ -387,7 +432,7 @@ if ! ${SBINDIR}/ihkosctl 0 load ${KERNDIR}/mckernel.img; then
fi
# Set kernel arguments
if ! ${SBINDIR}/ihkosctl 0 kargs "hidos $turbo dump_level=${DUMP_LEVEL} $extra_kopts"; then
if ! ${SBINDIR}/ihkosctl 0 kargs "hidos $turbo $idle_halt dump_level=${DUMP_LEVEL} $extra_kopts $allow_oversubscribe"; then
echo "error: setting kernel arguments" >&2
error_exit "os_created"
fi
@ -426,4 +471,7 @@ if [ "${irqbalance_used}" == "yes" ]; then
# echo cpus=$cpus ncpus=$ncpus banirq=$banirq
fi
# Restore umask
umask ${umask_old}
exit 0


@ -48,6 +48,9 @@ if ls /dev/mcos* 1>/dev/null 2>&1; then
done
fi
# Allow ihkmond to flush kmsg buffer
sleep 2.0
# Query IHK-SMP resources and release them
if ! ${SBINDIR}/ihkconfig 0 query cpu > /dev/null; then
echo "error: querying cpus" >&2
@ -62,17 +65,23 @@ if [ "${cpus}" != "" ]; then
fi
fi
if ! ${SBINDIR}/ihkconfig 0 query mem > /dev/null; then
echo "error: querying memory" >&2
exit 1
fi
#if ! ${SBINDIR}/ihkconfig 0 query mem > /dev/null; then
# echo "error: querying memory" >&2
# exit 1
#fi
#
#mem=`${SBINDIR}/ihkconfig 0 query mem`
#if [ "${mem}" != "" ]; then
# if ! ${SBINDIR}/ihkconfig 0 release mem $mem > /dev/null; then
# echo "error: releasing memory" >&2
# exit 1
# fi
#fi
mem=`${SBINDIR}/ihkconfig 0 query mem`
if [ "${mem}" != "" ]; then
if ! ${SBINDIR}/ihkconfig 0 release mem $mem > /dev/null; then
echo "error: releasing memory" >&2
exit 1
fi
# Release all memory
if ! ${SBINDIR}/ihkconfig 0 release mem "all" > /dev/null; then
echo "error: releasing memory" >&2
exit 1
fi
# Remove delegator if loaded
@ -110,9 +119,10 @@ fi
# Start irqbalance with the original settings
if [ "${irqbalance_used}" != "" ]; then
if ! etcdir=@ETCDIR@ perl -e '$etcdir=$ENV{'etcdir'}; @files = grep { -f } glob "$etcdir/proc/irq/*/smp_affinity"; foreach $file (@files) { $dest = substr($file, length($etcdir)); if(0) {print "cp $file $dest\n";} system("cp $file $dest 2>/dev/null"); }'; then
if ! perl -e '$tmpdir="/tmp/mcreboot"; @files = grep { -f } glob "$tmpdir/proc/irq/*/smp_affinity"; foreach $file (@files) { $dest = substr($file, length($tmpdir)); if (0) {print "cp $file $dest\n";} system("cp $file $dest 2>/dev/null"); }'; then
echo "warning: failed to restore /proc/irq/*/smp_affinity" >&2
fi
if [ -e /tmp/mcreboot ]; then rm -rf /tmp/mcreboot; fi
if ! systemctl start irqbalance.service; then
echo "warning: failed to start irqbalance" >&2;
fi


@ -3,21 +3,24 @@
/* Path of install directory for binary */
#undef BINDIR
/* IHK build-id to confirm IHK and McKernel built at the same time are used */
#undef BUILDID
/* whether mcoverlayfs is enabled */
#undef ENABLE_MCOVERLAYFS
/* whether memdump feature is enabled */
#undef ENABLE_MEMDUMP
/* whether perf is enabled */
#undef ENABLE_PERF
/* whether qlmpi is enabled */
#undef ENABLE_QLMPI
/* whether rusage is enabled */
#undef ENABLE_RUSAGE
/* whether perf is enabled */
#undef ENABLE_PERF
/* Define to 1 if you have the <inttypes.h> header file. */
#undef HAVE_INTTYPES_H

48
configure vendored

@ -1,6 +1,6 @@
#! /bin/sh
# Guess values for system-dependent variables and create Makefiles.
# Generated by GNU Autoconf 2.69 for mckernel 0.9.0.
# Generated by GNU Autoconf 2.69 for mckernel 1.5.0.
#
#
# Copyright (C) 1992-1996, 1998-2012 Free Software Foundation, Inc.
@ -577,8 +577,8 @@ MAKEFLAGS=
# Identity of this package.
PACKAGE_NAME='mckernel'
PACKAGE_TARNAME='mckernel'
PACKAGE_VERSION='0.9.0'
PACKAGE_STRING='mckernel 0.9.0'
PACKAGE_VERSION='1.5.0'
PACKAGE_STRING='mckernel 1.5.0'
PACKAGE_BUGREPORT=''
PACKAGE_URL=''
@ -645,6 +645,7 @@ TARGET
UNAME_R
KDIR
ARCH
BUILDID
XCC
FGREP
EGREP
@ -1261,7 +1262,7 @@ if test "$ac_init_help" = "long"; then
# Omit some internal or obsolete options to make the list less imposing.
# This message is too long to be a string in the A/UX 3.1 sh.
cat <<_ACEOF
\`configure' configures mckernel 0.9.0 to adapt to many kinds of systems.
\`configure' configures mckernel 1.5.0 to adapt to many kinds of systems.
Usage: $0 [OPTION]... [VAR=VALUE]...
@ -1322,7 +1323,7 @@ fi
if test -n "$ac_init_help"; then
case $ac_init_help in
short | recursive ) echo "Configuration of mckernel 0.9.0:";;
short | recursive ) echo "Configuration of mckernel 1.5.0:";;
esac
cat <<\_ACEOF
@ -1430,7 +1431,7 @@ fi
test -n "$ac_init_help" && exit $ac_status
if $ac_init_version; then
cat <<\_ACEOF
mckernel configure 0.9.0
mckernel configure 1.5.0
generated by GNU Autoconf 2.69
Copyright (C) 2012 Free Software Foundation, Inc.
@ -1728,7 +1729,7 @@ cat >config.log <<_ACEOF
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.
It was created by mckernel $as_me 0.9.0, which was
It was created by mckernel $as_me 1.5.0, which was
generated by GNU Autoconf 2.69. Invocation command line was
$ $0 $@
@ -2081,12 +2082,12 @@ ac_compiler_gnu=$ac_cv_c_compiler_gnu
IHK_VERSION=0.9.0
MCKERNEL_VERSION=0.9.0
DCFA_VERSION=0.9.0
IHK_RELEASE_DATE=2013-11-18
MCKERNEL_RELEASE_DATE=2013-11-18
DCFA_RELEASE_DATE=2013-11-18
IHK_VERSION=1.5.0
MCKERNEL_VERSION=1.5.0
DCFA_VERSION=DCFA_VERSION_m4
IHK_RELEASE_DATE=2018-04-05
MCKERNEL_RELEASE_DATE=2018-04-05
DCFA_RELEASE_DATE=DCFA_RELEASE_DATE_m4
@ -5012,6 +5013,20 @@ cat >>confdefs.h <<_ACEOF
_ACEOF
ABS_SRCDIR=$( cd $( dirname $0 ); pwd )
IHK_ABS_SRCDIR=${ABS_SRCDIR}/../ihk
BUILDID=$( cd $IHK_ABS_SRCDIR; if [ ! -d .git ]; then echo $IHK_VERSION; else bash -c 'git rev-list -1 HEAD | cut -c1-8'; fi )
{ $as_echo "$as_me:${as_lineno-$LINENO}: BUILDID=$BUILDID" >&5
$as_echo "$as_me: BUILDID=$BUILDID" >&6;}
if test "x$BUILDID" != "x" ; then
cat >>confdefs.h <<_ACEOF
#define BUILDID "$BUILDID"
_ACEOF
fi
@ -5045,7 +5060,7 @@ ac_config_headers="$ac_config_headers config.h"
# POSTK_DEBUG_ARCH_DEP_37
# AC_CONFIG_FILES arch dependfiles separate
ac_config_files="$ac_config_files Makefile executer/user/Makefile executer/user/mcexec.1:executer/user/mcexec.1in executer/user/vmcore2mckdump executer/user/arch/$ARCH/Makefile executer/user/arch/x86_64/Makefile executer/kernel/mcctrl/Makefile executer/kernel/mcctrl/arch/$ARCH/Makefile executer/kernel/mcoverlayfs/Makefile executer/kernel/mcoverlayfs/linux-3.10.0-327.36.1.el7/Makefile executer/kernel/mcoverlayfs/linux-4.0.9/Makefile executer/kernel/mcoverlayfs/linux-4.6.7/Makefile executer/include/qlmpilib.h kernel/Makefile kernel/Makefile.build kernel/include/swapfmt.h arch/x86_64/tools/mcreboot-attached-mic.sh arch/x86_64/tools/mcshutdown-attached-mic.sh arch/x86_64/tools/mcreboot-builtin-x86.sh arch/x86_64/tools/mcreboot-smp-x86.sh arch/x86_64/tools/mcstop+release-smp-x86.sh arch/x86_64/tools/mcoverlay-destroy-smp-x86.sh arch/x86_64/tools/mcoverlay-create-smp-x86.sh arch/x86_64/tools/eclair-dump-backtrace.exp arch/x86_64/tools/mcshutdown-builtin-x86.sh arch/x86_64/tools/mcreboot.1:arch/x86_64/tools/mcreboot.1in arch/x86_64/tools/irqbalance_mck.service arch/x86_64/tools/irqbalance_mck.in"
ac_config_files="$ac_config_files Makefile executer/user/Makefile executer/user/mcexec.1:executer/user/mcexec.1in executer/user/vmcore2mckdump executer/user/arch/$ARCH/Makefile executer/user/arch/x86_64/Makefile executer/kernel/mcctrl/Makefile executer/kernel/mcctrl/arch/$ARCH/Makefile executer/kernel/mcoverlayfs/Makefile executer/kernel/mcoverlayfs/linux-3.10.0-327.36.1.el7/Makefile executer/kernel/mcoverlayfs/linux-4.0.9/Makefile executer/kernel/mcoverlayfs/linux-4.6.7/Makefile executer/include/qlmpilib.h kernel/Makefile kernel/Makefile.build kernel/include/swapfmt.h arch/x86_64/tools/mcreboot-attached-mic.sh arch/x86_64/tools/mcshutdown-attached-mic.sh arch/x86_64/tools/mcreboot-builtin-x86.sh arch/x86_64/tools/mcreboot-smp-x86.sh arch/x86_64/tools/mcstop+release-smp-x86.sh arch/x86_64/tools/mcoverlay-destroy-smp-x86.sh arch/x86_64/tools/mcoverlay-create-smp-x86.sh arch/x86_64/tools/eclair-dump-backtrace.exp arch/x86_64/tools/mcshutdown-builtin-x86.sh arch/x86_64/tools/mcreboot.1:arch/x86_64/tools/mcreboot.1in arch/x86_64/tools/irqbalance_mck.service arch/x86_64/tools/irqbalance_mck.in tools/mcstat/Makefile"
if test "$TARGET" = "smp-x86"; then
@ -5570,7 +5585,7 @@ cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
# report actual input values of CONFIG_FILES etc. instead of their
# values after options handling.
ac_log="
This file was extended by mckernel $as_me 0.9.0, which was
This file was extended by mckernel $as_me 1.5.0, which was
generated by GNU Autoconf 2.69. Invocation command line was
CONFIG_FILES = $CONFIG_FILES
@ -5632,7 +5647,7 @@ _ACEOF
cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`"
ac_cs_version="\\
mckernel config.status 0.9.0
mckernel config.status 1.5.0
configured by $0, generated by GNU Autoconf 2.69,
with options \\"\$ac_cs_config\\"
@ -5782,6 +5797,7 @@ do
"arch/x86_64/tools/mcreboot.1") CONFIG_FILES="$CONFIG_FILES arch/x86_64/tools/mcreboot.1:arch/x86_64/tools/mcreboot.1in" ;;
"arch/x86_64/tools/irqbalance_mck.service") CONFIG_FILES="$CONFIG_FILES arch/x86_64/tools/irqbalance_mck.service" ;;
"arch/x86_64/tools/irqbalance_mck.in") CONFIG_FILES="$CONFIG_FILES arch/x86_64/tools/irqbalance_mck.in" ;;
"tools/mcstat/Makefile") CONFIG_FILES="$CONFIG_FILES tools/mcstat/Makefile" ;;
"arch/x86_64/kernel/Makefile.arch") CONFIG_FILES="$CONFIG_FILES arch/x86_64/kernel/Makefile.arch" ;;
"kernel/config/config.smp-arm64") CONFIG_FILES="$CONFIG_FILES kernel/config/config.smp-arm64" ;;
"arch/arm64/kernel/vdso/Makefile") CONFIG_FILES="$CONFIG_FILES arch/arm64/kernel/vdso/Makefile" ;;


@ -1,11 +1,9 @@
# configure.ac COPYRIGHT FUJITSU LIMITED 2015-2016
AC_PREREQ(2.63)
m4_define([IHK_VERSION_m4],[0.9.0])dnl
m4_define([MCKERNEL_VERSION_m4],[0.9.0])dnl
m4_define([DCFA_VERSION_m4],[0.9.0])dnl
m4_define([IHK_RELEASE_DATE_m4],[2013-11-18])dnl
m4_define([MCKERNEL_RELEASE_DATE_m4],[2013-11-18])dnl
m4_define([DCFA_RELEASE_DATE_m4],[2013-11-18])dnl
m4_define([IHK_VERSION_m4],[1.5.0])dnl
m4_define([MCKERNEL_VERSION_m4],[1.5.0])dnl
m4_define([IHK_RELEASE_DATE_m4],[2018-04-05])dnl
m4_define([MCKERNEL_RELEASE_DATE_m4],[2018-04-05])dnl
AC_INIT([mckernel], MCKERNEL_VERSION_m4)
@ -502,6 +500,15 @@ fi
AC_DEFINE_UNQUOTED(BINDIR,"$BINDIR",[Path of install directory for binary])
AC_DEFINE_UNQUOTED(SBINDIR,"$SBINDIR",[Path of install directory for system binary])
ABS_SRCDIR=$( cd $( dirname $0 ); pwd )
IHK_ABS_SRCDIR=${ABS_SRCDIR}/../ihk
BUILDID=$( cd $IHK_ABS_SRCDIR; if @<:@ ! -d .git @:>@; then echo $IHK_VERSION; else bash -c 'git rev-list -1 HEAD | cut -c1-8'; fi )
AC_MSG_NOTICE([BUILDID=$BUILDID])
if test "x$BUILDID" != "x" ; then
AC_DEFINE_UNQUOTED(BUILDID,"$BUILDID",[IHK build-id to confirm IHK and McKernel built at the same time are used])
fi
AC_SUBST(BUILDID)
AC_SUBST(CC)
AC_SUBST(XCC)
AC_SUBST(ARCH)
@ -563,6 +570,7 @@ AC_CONFIG_FILES([
arch/x86_64/tools/mcreboot.1:arch/x86_64/tools/mcreboot.1in
arch/x86_64/tools/irqbalance_mck.service
arch/x86_64/tools/irqbalance_mck.in
tools/mcstat/Makefile
])
if test "$TARGET" = "smp-x86"; then


@ -9,7 +9,6 @@ IHK_BASE=$(src)/../../../../ihk
obj-m += mcctrl.o
# POSTK_DEBUG_ARCH_DEP_1, arch depend "-mcmodel"
# POSTK_DEBUG_ARCH_DEP_83, arch depend translate_rva_to_rpa() move
ccflags-y := -I$(IHK_BASE)/linux/include \
-I$(IHK_BASE)/linux/include/ihk/arch/$(ARCH) \
@ -24,9 +23,8 @@ ccflags-y := -I$(IHK_BASE)/linux/include \
-I$(src)/../../../kernel/include \
-DMCEXEC_PATH=\"$(BINDIR)/mcexec\"
ifneq ($(ARCH), arm64)
ccflags-y += -mno-red-zone -mcmodel=kernel
endif
# depending arch
include @abs_builddir@/arch/$(ARCH)/Makefile
mcctrl-y := driver.o control.o ikc.o syscall.o procfs.o binfmt_mcexec.o
mcctrl-y += sysfs.o sysfs_files.o arch/$(ARCH)/archdeps.o


@ -1 +1 @@
# dummy file
ccflags-y += -mno-red-zone -mcmodel=kernel


@ -327,6 +327,14 @@ int translate_rva_to_rpa(ihk_os_t os, unsigned long rpt, unsigned long rva,
pgsize = 1UL << offsh;
rpa = pt[ix] & ((1UL << 52) - 1) & ~(pgsize - 1);
rpa |= rva & (pgsize - 1);
/* For GB pages, just report regular 2MB page */
if (offsh == 30) {
pgsize = 1UL << 21;
dprintk("%s: GB page translated 0x%lx -> 0x%lx, pgsize: %lu\n",
__FUNCTION__, rva, rpa, pgsize);
}
ihk_device_unmap_virtual(ihk_os_to_dev(os), pt, PAGE_SIZE);
ihk_device_unmap_memory(ihk_os_to_dev(os), phys, PAGE_SIZE);
error = 0;
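
The hunk above derives the physical address from a page-table entry and, for 1 GB pages, reports a 2 MB page size to the caller. The arithmetic can be sketched in isolation as below. This is only an illustration: `struct rpa_result` and `translate_entry()` are hypothetical names that do not appear in the patch, and the page-table walk that produces `pte` and `offsh` is assumed to have happened already.

```c
#include <stdint.h>

/* Hypothetical container for the result; not a type from the patch. */
struct rpa_result {
	uint64_t rpa;     /* translated physical address */
	uint64_t pgsize;  /* page size reported to the caller */
};

/*
 * pte:   raw page-table entry (address bits plus flag bits)
 * rva:   virtual address being translated
 * offsh: log2 of the page size at this level (12, 21 or 30 on x86_64)
 */
struct rpa_result translate_entry(uint64_t pte, uint64_t rva,
				  unsigned int offsh)
{
	struct rpa_result r;
	uint64_t pgsize = UINT64_C(1) << offsh;

	/* Keep the low 52 address bits and drop the bits below the page size. */
	r.rpa = pte & ((UINT64_C(1) << 52) - 1) & ~(pgsize - 1);
	/* Add the offset of rva inside the page. */
	r.rpa |= rva & (pgsize - 1);

	/* For 1 GB pages, report a regular 2 MB page size instead. */
	if (offsh == 30)
		pgsize = UINT64_C(1) << 21;

	r.pgsize = pgsize;
	return r;
}
```

With `pte = 0x40000083`, `rva = 0x12345678` and `offsh = 30`, the sketch yields `rpa = 0x52345678` and a reported page size of `0x200000`.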


@ -367,7 +367,7 @@ static long mcexec_debug_log(ihk_os_t os, unsigned long arg)
}
int mcexec_close_exec(ihk_os_t os);
int mcexec_destroy_per_process_data(ihk_os_t os);
int mcexec_destroy_per_process_data(ihk_os_t os, int pid);
static void release_handler(ihk_os_t os, void *param)
{
@ -387,7 +387,7 @@ static void release_handler(ihk_os_t os, void *param)
mcexec_close_exec(os);
mcexec_destroy_per_process_data(os);
mcexec_destroy_per_process_data(os, info->pid);
memset(&isp, '\0', sizeof isp);
isp.msg = SCD_MSG_CLEANUP_PROCESS;
@ -435,6 +435,7 @@ static long mcexec_start_image(ihk_os_t os,
struct mcctrl_channel *c;
struct mcctrl_usrdata *usrdata = ihk_host_os_get_usrdata(os);
struct mcos_handler_info *info;
int ret = 0;
desc = kmalloc(sizeof(*desc), GFP_KERNEL);
if (!desc) {
@ -445,17 +446,18 @@ static long mcexec_start_image(ihk_os_t os,
if (copy_from_user(desc, udesc,
sizeof(struct program_load_desc))) {
kfree(desc);
return -EFAULT;
ret = -EFAULT;
goto out;
}
info = new_mcos_handler_info(os, file);
#ifdef POSTK_DEBUG_TEMP_FIX_64 /* host process is SIGKILLed fix. */
if (info == NULL) {
kfree(desc);
return -ENOMEM;
ret = -ENOMEM;
goto out;
}
#endif /* POSTK_DEBUG_TEMP_FIX_64 */
info->pid = desc->pid;
info->cpu = desc->cpu;
ihk_os_register_release_handler(file, release_handler, info);
@ -471,10 +473,14 @@ static long mcexec_start_image(ihk_os_t os,
isp.ref = desc->cpu;
isp.arg = desc->rprocess;
mcctrl_ikc_send(os, desc->cpu, &isp);
ret = mcctrl_ikc_send(os, desc->cpu, &isp);
if (ret < 0) {
printk("%s: error: sending IKC msg\n", __FUNCTION__);
}
out:
kfree(desc);
return 0;
return ret;
}
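
The hunk above routes the error paths of `mcexec_start_image()` through a single `out:` label and returns the status of `mcctrl_ikc_send()` instead of a constant 0. A minimal user-space sketch of that shape follows. `struct load_desc`, `start_image_sketch()` and its `fail_copy` / `send_result` knobs are hypothetical stand-ins, and `malloc`/`free` replace `kmalloc`/`kfree`. In this sketch the buffer is released only at the label, so no error path frees it twice.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct program_load_desc. */
struct load_desc {
	int pid;
	int cpu;
};

/*
 * Copy a descriptor, "send" it, and release it on one exit path.
 * fail_copy and send_result are test knobs, not parameters from the patch.
 */
int start_image_sketch(const struct load_desc *src, int fail_copy,
		       int send_result)
{
	struct load_desc *desc;
	int ret = 0;

	desc = malloc(sizeof(*desc));
	if (!desc)
		return -ENOMEM;		/* nothing to release yet */

	if (fail_copy) {
		ret = -EFAULT;		/* do not free here: "out" does it */
		goto out;
	}
	memcpy(desc, src, sizeof(*desc));

	/* Propagate the send status instead of always returning 0. */
	ret = send_result;

out:
	free(desc);			/* the only release of desc */
	return ret;
}
```

A caller can then tell a failed send from a successful one, for example `start_image_sketch(&d, 0, -EIO)` returns `-EIO`.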
static DECLARE_WAIT_QUEUE_HEAD(signalq);
@ -632,6 +638,7 @@ static long mcexec_get_cpuset(ihk_os_t os, unsigned long arg)
pli->task = current;
pli->ready = 0;
pli->timeout = 0;
init_waitqueue_head(&pli->pli_wq);
pli_next = NULL;
@ -693,11 +700,50 @@ static long mcexec_get_cpuset(ihk_os_t os, unsigned long arg)
dprintk("%s: pid: %d, waiting in list\n",
__FUNCTION__, task_tgid_vnr(current));
mutex_unlock(&pe->lock);
ret = wait_event_interruptible(pli->pli_wq, pli->ready);
/* Timeout period: 10 secs + (#procs * 0.1sec) */
ret = wait_event_interruptible_timeout(pli->pli_wq,
pli->ready,
msecs_to_jiffies(10000 + req.nr_processes * 100));
mutex_lock(&pe->lock);
if (ret != 0) {
/* First timeout task? Wake up everyone else,
* but tell them we timed out */
if (ret == 0) {
printk("%s: error: pid: %d, timed out, waking everyone\n",
__FUNCTION__, task_tgid_vnr(current));
while (!list_empty(&pe->pli_list)) {
pli_next = list_first_entry(&pe->pli_list,
struct process_list_item, list);
list_del(&pli_next->list);
pli_next->ready = 1;
pli_next->timeout = 1;
wake_up_interruptible(&pli_next->pli_wq);
}
/* Reset process counter to start state */
pe->nr_processes = -1;
ret = -ETIMEDOUT;
goto put_and_unlock_out;
}
/* Interrupted or woken up by someone else due to time out? */
if (ret < 0 || pli->timeout) {
if (ret > 0) {
printk("%s: error: pid: %d, job startup timed out\n",
__FUNCTION__, task_tgid_vnr(current));
ret = -ETIMEDOUT;
}
goto put_and_unlock_out;
}
/* Incorrect wakeup state? */
if (!pli->ready) {
printk("%s: error: pid: %d, not ready but woken?\n",
__FUNCTION__, task_tgid_vnr(current));
ret = -EINVAL;
goto put_and_unlock_out;
}
dprintk("%s: pid: %d, woken up\n",
__FUNCTION__, task_tgid_vnr(current));
}
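
The waiting logic above depends on the return value of `wait_event_interruptible_timeout()`: 0 means the timeout elapsed with the condition still false, a negative value means the sleep was interrupted, and a positive value means the condition became true. The sketch below restates the timeout formula and the resulting decision as plain functions. `startup_timeout_ms()` and `classify_wakeup()` are hypothetical names used only for illustration, and the list handling and locking of the real code are left out.

```c
#include <errno.h>

/* Timeout in milliseconds: 10 s plus 0.1 s per process in the job. */
unsigned int startup_timeout_ms(unsigned int nr_processes)
{
	return 10000u + nr_processes * 100u;
}

/*
 * wait_ret:  value returned by wait_event_interruptible_timeout()
 *            (0: timed out, <0: interrupted, >0: condition became true)
 * ready:     the waiter's ready flag
 * timed_out: set by the first waiter that timed out when it wakes the others
 * Returns 0 to continue or a negative errno-style value.
 */
int classify_wakeup(long wait_ret, int ready, int timed_out)
{
	if (wait_ret == 0)
		return -ETIMEDOUT;	/* this waiter timed out first */
	if (wait_ret < 0)
		return (int)wait_ret;	/* interrupted: pass the error through */
	if (timed_out)
		return -ETIMEDOUT;	/* woken because a peer timed out */
	if (!ready)
		return -EINVAL;		/* woken without being ready */
	return 0;
}
```

For a job of 64 processes the formula gives 16400 ms.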
@ -1201,8 +1247,8 @@ retry_alloc:
wqhln->packet = packet;
wqhln->req = 1;
ihk_ikc_spinlock_unlock(&ppd->wq_list_lock, flags);
wake_up(&wqhln->wq_syscall);
ihk_ikc_spinlock_unlock(&ppd->wq_list_lock, flags);
mcctrl_put_per_proc_data(ppd);
@ -1344,7 +1390,6 @@ retry_alloc:
goto put_ppd_out;
}
#ifdef POSTK_DEBUG_ARCH_DEP_46 /* user area direct access fix. */
if (copy_to_user(&req->cpu, &packet->ref, sizeof(req->cpu))) {
if (mcctrl_delete_per_thread_data(ppd, current) < 0) {
kprintf("%s: error deleting per-thread data\n", __FUNCTION__);
@ -1352,9 +1397,6 @@ retry_alloc:
ret = -EINVAL;
goto put_ppd_out;
}
#else /* POSTK_DEBUG_ARCH_DEP_46 */
req->cpu = packet->ref;
#endif /* POSTK_DEBUG_ARCH_DEP_46 */
ret = 0;
goto put_ppd_out;
@ -1680,12 +1722,12 @@ int mcexec_create_per_process_data(ihk_os_t os)
return 0;
}
int mcexec_destroy_per_process_data(ihk_os_t os)
int mcexec_destroy_per_process_data(ihk_os_t os, int pid)
{
struct mcctrl_usrdata *usrdata = ihk_host_os_get_usrdata(os);
struct mcctrl_per_proc_data *ppd = NULL;
ppd = mcctrl_get_per_proc_data(usrdata, task_tgid_vnr(current));
ppd = mcctrl_get_per_proc_data(usrdata, pid);
if (ppd) {
/* One for the reference and one for deallocation.
@ -2431,7 +2473,9 @@ mcexec_terminate_thread(ihk_os_t os, unsigned long *param, struct file *file)
mcctrl_delete_per_thread_data(ppd, tsk);
__return_syscall(usrdata->os, packet, param[2], tid);
ihk_ikc_release_packet((struct ihk_ikc_free_packet *)packet,
(usrdata->channels + packet->ref)->c);
(usrdata->ikc2linux[smp_processor_id()] ?
usrdata->ikc2linux[smp_processor_id()] :
usrdata->ikc2linux[0]));
err:
if(ppd)
mcctrl_put_per_proc_data(ppd);


@ -27,6 +27,7 @@
#include <linux/miscdevice.h>
#include <linux/slab.h>
#include <linux/device.h>
#include <linux/delay.h>
#include "mcctrl.h"
#include <ihk/ihk_host_user.h>
@ -169,6 +170,14 @@ error_cleanup_channels:
int mcctrl_os_shutdown_notifier(int os_index)
{
if (os[os_index]) {
/* Wait for os running */
if (ihk_os_wait_for_status(os[os_index], IHK_OS_STATUS_RUNNING, 0, 200) != 0) {
printk("IHK: OS does not become RUNNING in shutdown. Force shutdown.\n");
/* send nmi to force shutdown */
ihk_os_send_nmi(os[os_index], 3);
mdelay(200);
}
sysfsm_cleanup(os[os_index]);
free_topology_info(os[os_index]);
ihk_os_unregister_user_call_handlers(os[os_index], mcctrl_uc + os_index);


@ -304,6 +304,7 @@ struct node_topology {
struct process_list_item {
int ready;
int timeout;
struct task_struct *task;
struct list_head list;
wait_queue_head_t pli_wq;


@ -1019,7 +1019,8 @@ static const struct procfs_entry tid_entry_stuff[] = {
static const struct procfs_entry pid_entry_stuff[] = {
PROC_REG("auxv", S_IRUSR, NULL),
PROC_REG("cgroup", S_IXUSR, NULL),
/* Support the case where McKernel process retrieves its job-id under the Fujitsu TCS suite. */
// PROC_REG("cgroup", S_IXUSR, NULL),
// PROC_REG("clear_refs", S_IWUSR, NULL),
PROC_REG("cmdline", S_IRUGO, NULL),
// PROC_REG("comm", S_IRUGO|S_IWUSR, NULL),


@ -222,6 +222,14 @@ int translate_rva_to_rpa(ihk_os_t os, unsigned long rpt, unsigned long rva,
pgsize = 1UL << offsh;
rpa = pt[ix] & ((1UL << 52) - 1) & ~(pgsize - 1);
rpa |= rva & (pgsize - 1);
/* For GB pages, just report regular 2MB page */
if (offsh == 30) {
pgsize = 1UL << 21;
dprintk("%s: GB page translated 0x%lx -> 0x%lx, pgsize: %lu\n",
__FUNCTION__, rva, rpa, pgsize);
}
ihk_device_unmap_virtual(ihk_os_to_dev(os), pt, PAGE_SIZE);
ihk_device_unmap_memory(ihk_os_to_dev(os), phys, PAGE_SIZE);
error = 0;
@ -799,7 +807,7 @@ static int rus_vm_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
packet = (struct ikc_scd_packet *)mcctrl_get_per_thread_data(ppd, current);
if (!packet) {
error = -ENOENT;
ret = VM_FAULT_SIGBUS;
printk("%s: no packet registered for TID %d\n",
__FUNCTION__, task_pid_vnr(current));
goto put_and_out;


@ -11,7 +11,7 @@ MCKERNEL_INCDIR=@MCKERNEL_INCDIR@
MCKERNEL_LIBDIR=@MCKERNEL_LIBDIR@
KDIR ?= @KDIR@
ARCH=@ARCH@
CFLAGS=-Wall -O -I. -I$(VPATH)/arch/${ARCH}
CFLAGS=-Wall -O -I. -I$(VPATH)/arch/${ARCH} -I${IHKDIR}
LDFLAGS=@LDFLAGS@
RPATH=$(shell echo $(LDFLAGS)|awk '{for(i=1;i<=NF;i++){if($$i~/^-L/){w=$$i;sub(/^-L/,"-Wl,-rpath,",w);print w}}}')
VPATH=@abs_srcdir@
@ -40,7 +40,7 @@ mcexec: mcexec.c libmcexec.a
# POSTK_DEBUG_ARCH_DEP_34, eclair arch depend separate.
ifeq ($(ARCH), arm64)
eclair: eclair.c arch/$(ARCH)/arch-eclair.c
$(CC) -I.. -I. -I./arch/$(ARCH)/include -I$(VPATH)/.. -I$(VPATH) -I$(VPATH)/arch/$(ARCH)/include -I${IHKDIR} $(CFLAGS) -o $@ $^ $(LIBS)
$(CC) -I.. -I. -I./arch/$(ARCH)/include -I$(VPATH)/.. -I$(VPATH) -I$(VPATH)/arch/$(ARCH)/include $(CFLAGS) -o $@ $^ $(LIBS)
else
eclair: eclair.c
$(CC) $(CFLAGS) -I${IHKDIR} -o $@ $^ $(LIBS)


@ -9,12 +9,15 @@ LIBS=@LIBS@
all: $(TARGET)
../../libmcexec.a: archdep.o
$(AR) cr ../../libmcexec.a archdep.o
../../libmcexec.a: archdep.o arch_syscall.o
$(AR) cr ../../libmcexec.a archdep.o arch_syscall.o
archdep.o: archdep.S
$(CC) -c -I${KDIR} $(CFLAGS) $(EXTRA_CFLAGS) -fPIE -pie -pthread $<
arch_syscall.o: arch_syscall.c
$(CC) -c -I${KDIR} $(CFLAGS) $(EXTRA_CFLAGS) -fPIE -pie -pthread $<
clean:
$(RM) $(TARGET) *.o


@ -0,0 +1,7 @@
struct syscall_wait_desc;
int
archdep_syscall(struct syscall_wait_desc *w, long *ret)
{
return -1;
}


@ -9,12 +9,15 @@ LIBS=@LIBS@
all: $(TARGET)
../../libmcexec.a: archdep.o
$(AR) cr ../../libmcexec.a archdep.o
../../libmcexec.a: archdep.o arch_syscall.o
$(AR) cr ../../libmcexec.a archdep.o arch_syscall.o
archdep.o: archdep.S
$(CC) -c -I${KDIR} $(CFLAGS) $(EXTRA_CFLAGS) -fPIE -pie -pthread $<
arch_syscall.o: arch_syscall.c
$(CC) -c -I${KDIR} $(CFLAGS) $(EXTRA_CFLAGS) -fPIE -pie -pthread $<
clean:
$(RM) $(TARGET) *.o


@ -0,0 +1,63 @@
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <elf.h>
#include <dirent.h>
#include "../../../include/uprotocol.h"
#include "../../archdep.h"
//#define DEBUG
#ifndef DEBUG
#define __dprint(msg, ...)
#define __dprintf(arg, ...)
#define __eprint(msg, ...)
#define __eprintf(format, ...)
#else
#define __dprint(msg, ...) {printf("%s: " msg, __FUNCTION__);fflush(stdout);}
#define __dprintf(format, ...) {printf("%s: " format, __FUNCTION__, \
__VA_ARGS__);fflush(stdout);}
#define __eprint(msg, ...) {fprintf(stderr, "%s: " msg, __FUNCTION__);\
fflush(stderr);}
#define __eprintf(format, ...) {fprintf(stderr, "%s: " format, __FUNCTION__, \
__VA_ARGS__);fflush(stderr);}
#endif
extern char *chgpath(char *, char *);
extern long do_strncpy_from_user(int, void *, void *, unsigned long);
extern int fd;
#define SET_ERR(ret) if (ret == -1) ret = -errno
int
archdep_syscall(struct syscall_wait_desc *w, long *ret)
{
char *fn;
char pathbuf[PATH_MAX];
char tmpbuf[PATH_MAX];
switch (w->sr.number) {
case __NR_open:
*ret = do_strncpy_from_user(fd, pathbuf,
(void *)w->sr.args[0], PATH_MAX);
if (*ret >= PATH_MAX) {
*ret = -ENAMETOOLONG;
}
if (*ret < 0) {
return 0;
}
__dprintf("open: %s\n", pathbuf);
fn = chgpath(pathbuf, tmpbuf);
*ret = open(fn, w->sr.args[1], w->sr.args[2]);
SET_ERR(*ret);
return 0;
}
return -1;
}
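
`archdep_syscall()` follows a hook convention: it returns 0 after storing a result through `ret` when it handled the request, and -1 when it did not. The caller is not shown in this hunk, so the dispatch below is an assumption about how such a hook is typically consumed. All names (`sketch_request`, `SKETCH_NR_SPECIAL`, `sketch_archdep_syscall`, `sketch_dispatch`) are hypothetical, and `-ENOSYS` merely stands in for the generic handler.

```c
#include <errno.h>

/* Hypothetical, reduced request type; the real one is struct syscall_wait_desc. */
struct sketch_request {
	long number;
	long arg0;
};

#define SKETCH_NR_SPECIAL 2	/* illustrative syscall number */

/* Arch hook: returns 0 and fills *ret if it handled the call, -1 otherwise. */
int sketch_archdep_syscall(const struct sketch_request *w, long *ret)
{
	switch (w->number) {
	case SKETCH_NR_SPECIAL:
		*ret = (w->arg0 < 0) ? -EINVAL : w->arg0 + 1;
		return 0;
	}
	return -1;
}

/* Caller: try the arch hook first, fall back to a generic path. */
long sketch_dispatch(const struct sketch_request *w)
{
	long ret;

	if (sketch_archdep_syscall(w, &ret) == 0)
		return ret;
	return -ENOSYS;		/* placeholder for the generic handler */
}
```

Keeping "handled or not" in the return value and the syscall result in `*ret` lets an architecture with no special cases provide a stub that always returns -1.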


@ -1,3 +1,4 @@
extern int switch_ctx(int fd, unsigned long cmd, void **param, void *lctx, void *rctx);
extern unsigned long compare_and_swap(unsigned long *addr, unsigned long old, unsigned long new);
extern unsigned int compare_and_swap_int(unsigned int *addr, unsigned int old, unsigned int new);
extern int archdep_syscall(struct syscall_wait_desc *w, long *ret);


@ -76,6 +76,7 @@
#include <asm/prctl.h>
#endif /* !POSTK_DEBUG_ARCH_DEP_77 */
#include "../include/uprotocol.h"
#include <ihk/ihk_host_user.h>
#include <getopt.h>
#include "archdep.h"
#include "arch_args.h"
@ -148,17 +149,25 @@ char **__glob_argv = 0;
#ifdef ENABLE_MCOVERLAYFS
#undef ENABLE_MCOVERLAYFS
#ifndef RHEL_RELEASE_CODE
// RedHat?
#ifdef RHEL_RELEASE_CODE
#if LINUX_VERSION_CODE <= KERNEL_VERSION(3,10,0)
#define ENABLE_MCOVERLAYFS 1
#else
#error "ERROR: your Linux kernel version on RHEL is not supported"
#endif // LINUX_VERSION_CODE <= KERNEL_VERSION(3,10,0)
// Other distro?
#else
#if LINUX_VERSION_CODE >= KERNEL_VERSION(4,0,0) && LINUX_VERSION_CODE < KERNEL_VERSION(4,1,0)
#define ENABLE_MCOVERLAYFS 1
#endif // LINUX_VERSION_CODE == 4.0
#if LINUX_VERSION_CODE >= KERNEL_VERSION(4,6,0) && LINUX_VERSION_CODE < KERNEL_VERSION(4,7,0)
#define ENABLE_MCOVERLAYFS 1
#endif // LINUX_VERSION_CODE == 4.6
#else
#if RHEL_RELEASE_CODE <= RHEL_RELEASE_VERSION(7,3)
#define ENABLE_MCOVERLAYFS 1
#endif // RHEL_RELEASE_CODE <= 7.3
#endif // RHEL_RELEASE_CODE
#endif // ENABLE_MCOVERLAYFS
@ -198,7 +207,7 @@ struct thread_data_s;
int main_loop(struct thread_data_s *);
static int mcosid;
static int fd;
int fd;
static char *exec_path = NULL;
static char *altroot;
static const char rlimit_stack_envname[] = "MCKERNEL_RLIMIT_STACK";
@ -224,12 +233,13 @@ static int nr_processes = 0;
static int nr_threads = -1;
struct fork_sync {
pid_t pid;
int status;
volatile int success;
sem_t sem;
};
struct fork_sync_container {
pid_t pid;
struct fork_sync_container *next;
struct fork_sync *fs;
};
@ -1746,6 +1756,8 @@ static int
opendev()
{
int f;
char buildid[] = BUILDID;
char query_result[sizeof(BUILDID)];
sprintf(dev, "/dev/mcos%d", mcosid);
@ -1757,6 +1769,18 @@ opendev()
}
fd = f;
if (ioctl(fd, IHK_OS_GET_BUILDID, query_result)) {
fprintf(stderr, "Error: IHK_OS_GET_BUILDID failed\n");
close(fd);
return -1;
}
if (strncmp(buildid, query_result, sizeof(buildid))) {
fprintf(stderr, "Error: build-id of mcexec (%s) didn't match that of IHK (%s)\n", buildid, query_result);
close(fd);
return -1;
}
return fd;
}
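
The check above compares the build-id compiled into `mcexec` with the one returned by the `IHK_OS_GET_BUILDID` ioctl, using `strncmp()` bounded by `sizeof(buildid)`. Because that size includes the terminating NUL, a reported id that merely starts with the compiled-in one does not match. The helper below isolates the comparison. `buildid_matches()` is a hypothetical name, and the id strings in the usage note are made up.

```c
#include <string.h>

/*
 * Compare the build-id compiled into a tool with the one reported by the
 * driver. n is the size of the compiled-in buffer including its NUL, so the
 * terminator takes part in the comparison and a longer reported id fails.
 */
int buildid_matches(const char *mine, const char *reported, size_t n)
{
	return strncmp(mine, reported, n) == 0;
}
```

With `char mine[] = "abcd1234";`, the call `buildid_matches(mine, "abcd12345", sizeof(mine))` returns 0 because the ninth byte differs.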
@ -1832,8 +1856,14 @@ int main(int argc, char **argv)
altroot = "/usr/linux-k1om-4.7/linux-k1om";
}
/* Disable address space layout randomization */
/* Disable READ_IMPLIES_EXEC */
persona = personality(0xffffffff);
if (persona & READ_IMPLIES_EXEC) {
persona &= ~READ_IMPLIES_EXEC;
persona = personality(persona);
}
/* Disable address space layout randomization */
__dprintf("persona=%08x\n", persona);
if ((persona & (PER_LINUX | ADDR_NO_RANDOMIZE)) == 0) {
CHKANDJUMP(getenv("MCEXEC_ADDR_NO_RANDOMIZE"), 1, "personality() and then execv() failed\n");
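
The hunk above queries the current persona with `personality(0xffffffff)` and clears `READ_IMPLIES_EXEC` when it is set. A Linux-only sketch of that step is given below. `without_read_implies_exec()` and `drop_read_implies_exec()` are hypothetical helper names, and the error handling is an assumption rather than what `mcexec` does.

```c
#include <sys/personality.h>

/* Pure helper: the persona value with READ_IMPLIES_EXEC cleared. */
unsigned long without_read_implies_exec(unsigned long persona)
{
	return persona & ~(unsigned long)READ_IMPLIES_EXEC;
}

/*
 * Query the current persona and clear READ_IMPLIES_EXEC if it is set.
 * Returns 0 on success and -1 if personality() fails.
 */
int drop_read_implies_exec(void)
{
	int persona = personality(0xffffffff);	/* query only, no change */

	if (persona == -1)
		return -1;
	if (persona & READ_IMPLIES_EXEC) {
		if (personality(without_read_implies_exec(persona)) == -1)
			return -1;
	}
	return 0;
}
```

Note that a successful `personality()` call returns the previous persona, not the new one, so the value it returns after a change still carries the old flags.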
@ -2510,20 +2540,11 @@ do_generic_syscall(
__dprintf("do_generic_syscall(%ld)\n", w->sr.number);
#ifdef POSTK_DEBUG_TEMP_FIX_75 /* syscall return value check add. */
ret = syscall(w->sr.number, w->sr.args[0], w->sr.args[1], w->sr.args[2],
w->sr.args[3], w->sr.args[4], w->sr.args[5]);
if (ret == -1) {
ret = -errno;
}
#else /* POSTK_DEBUG_TEMP_FIX_75 */
errno = 0;
ret = syscall(w->sr.number, w->sr.args[0], w->sr.args[1], w->sr.args[2],
w->sr.args[3], w->sr.args[4], w->sr.args[5]);
if (errno != 0) {
ret = -errno;
}
#endif /* POSTK_DEBUG_TEMP_FIX_75 */
/* Overlayfs /sys/X directory lseek() problem work around */
if (w->sr.number == __NR_lseek && ret == -EINVAL) {
@ -2985,7 +3006,7 @@ out:
return rc;
}
static long do_strncpy_from_user(int fd, void *dest, void *src, unsigned long n)
long do_strncpy_from_user(int fd, void *dest, void *src, unsigned long n)
{
struct strncpy_from_user_desc desc;
int ret;
@ -3199,45 +3220,60 @@ int main_loop(struct thread_data_s *my_thread)
my_thread->remote_cpu = w.cpu;
switch (w.sr.number) {
#ifdef POSTK_DEBUG_ARCH_DEP_13 /* arch depend hide */
#ifdef __aarch64__
case __NR_openat:
/* initialize buffer */
memset(tmpbuf, '\0', sizeof(tmpbuf));
memset(pathbuf, '\0', sizeof(pathbuf));
/* check argument 1 dirfd */
if ((int)w.sr.args[0] != AT_FDCWD) {
ret = do_strncpy_from_user(fd, pathbuf,
(void *)w.sr.args[1],
PATH_MAX);
__dprintf("openat(dirfd == AT_FDCWD)\n");
if (ret >= PATH_MAX) {
ret = -ENAMETOOLONG;
}
if (ret < 0) {
do_syscall_return(fd, cpu, ret, 0, 0, 0, 0);
break;
}
if ((int)w.sr.args[0] != AT_FDCWD &&
pathbuf[0] != '/') {
/* dirfd != AT_FDCWD */
__dprintf("openat(dirfd != AT_FDCWD)\n");
snprintf(tmpbuf, sizeof(tmpbuf), "/proc/self/fd/%d", (int)w.sr.args[0]);
ret = readlink(tmpbuf, pathbuf, sizeof(pathbuf) - 1);
snprintf(tmpbuf, sizeof(tmpbuf),
"/proc/self/fd/%d", (int)w.sr.args[0]);
ret = readlink(tmpbuf, pathbuf,
sizeof(pathbuf) - 1);
if (ret == -1 &&
(errno == ENOENT ||
errno == EINVAL)) {
do_syscall_return(fd, cpu, -EBADF, 0, 0,
0, 0);
break;
}
if (ret < 0) {
do_syscall_return(fd, cpu, -errno, 0, 0, 0, 0);
do_syscall_return(fd, cpu, -errno, 0, 0,
0, 0);
break;
}
__dprintf(" %s -> %s\n", tmpbuf, pathbuf);
ret = do_strncpy_from_user(fd, tmpbuf, (void *)w.sr.args[1], PATH_MAX);
ret = do_strncpy_from_user(fd, tmpbuf,
(void *)w.sr.args[1],
PATH_MAX);
if (ret >= PATH_MAX) {
ret = -ENAMETOOLONG;
}
if (ret < 0) {
do_syscall_return(fd, cpu, ret, 0, 0, 0, 0);
do_syscall_return(fd, cpu, ret, 0, 0, 0,
0);
break;
}
strncat(pathbuf, "/", 1);
strncat(pathbuf, tmpbuf, strlen(tmpbuf) + 1);
} else {
/* dirfd == AT_FDCWD */
__dprintf("openat(dirfd == AT_FDCWD)\n");
ret = do_strncpy_from_user(fd, pathbuf, (void *)w.sr.args[1], PATH_MAX);
if (ret >= PATH_MAX) {
ret = -ENAMETOOLONG;
}
if (ret < 0) {
do_syscall_return(fd, cpu, ret, 0, 0, 0, 0);
break;
}
}
else {
}
__dprintf("openat: %s\n", pathbuf);
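
The `openat` handling above turns a (dirfd, relative path) pair into a full path by reading the `/proc/self/fd/<dirfd>` symlink and appending the path. Absolute paths and `AT_FDCWD` skip the lookup. The sketch below shows the same idea as a standalone, Linux-only function. `resolve_at_path()` is a hypothetical name, the path is taken from a local string rather than copied from the remote process, and the bounds checks are this sketch's own.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Build a full path for (dirfd, path) as openat() would interpret it.
 * Returns 0 on success or a negative errno value.
 * out must point to a buffer of out_len bytes, with out_len > 0.
 */
int resolve_at_path(int dirfd, const char *path, char *out, size_t out_len)
{
	char link[64];
	ssize_t n;

	/* Absolute paths and AT_FDCWD need no directory lookup. */
	if (dirfd == AT_FDCWD || path[0] == '/') {
		if (strlen(path) >= out_len)
			return -ENAMETOOLONG;
		strcpy(out, path);
		return 0;
	}

	snprintf(link, sizeof(link), "/proc/self/fd/%d", dirfd);
	n = readlink(link, out, out_len - 1);
	if (n == -1)
		return (errno == ENOENT || errno == EINVAL) ? -EBADF : -errno;
	out[n] = '\0';		/* readlink() does not terminate the string */

	if ((size_t)n + 1 + strlen(path) >= out_len)
		return -ENAMETOOLONG;
	strcat(out, "/");
	strcat(out, path);
	return 0;
}
```

Mapping `ENOENT` and `EINVAL` from `readlink()` to `-EBADF` mirrors the hunk: a missing or non-symlink entry under `/proc/self/fd` means the descriptor is not open.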
@ -3247,44 +3283,6 @@ int main_loop(struct thread_data_s *my_thread)
SET_ERR(ret);
do_syscall_return(fd, cpu, ret, 0, 0, 0, 0);
break;
#else /* __aarch64__ */
case __NR_open:
ret = do_strncpy_from_user(fd, pathbuf, (void *)w.sr.args[0], PATH_MAX);
if (ret >= PATH_MAX) {
ret = -ENAMETOOLONG;
}
if (ret < 0) {
do_syscall_return(fd, cpu, ret, 0, 0, 0, 0);
break;
}
__dprintf("open: %s\n", pathbuf);
fn = chgpath(pathbuf, tmpbuf);
ret = open(fn, w.sr.args[1], w.sr.args[2]);
SET_ERR(ret);
do_syscall_return(fd, cpu, ret, 0, 0, 0, 0);
break;
#endif /* __aarch64__ */
#else /* POSTK_DEBUG_ARCH_DEP_13 */
case __NR_open:
ret = do_strncpy_from_user(fd, pathbuf, (void *)w.sr.args[0], PATH_MAX);
if (ret >= PATH_MAX) {
ret = -ENAMETOOLONG;
}
if (ret < 0) {
do_syscall_return(fd, cpu, ret, 0, 0, 0, 0);
break;
}
__dprintf("open: %s\n", pathbuf);
fn = chgpath(pathbuf, tmpbuf);
ret = open(fn, w.sr.args[1], w.sr.args[2]);
SET_ERR(ret);
do_syscall_return(fd, cpu, ret, 0, 0, 0, 0);
break;
#endif /* POSTK_DEBUG_ARCH_DEP_13 */
case __NR_futex:
ret = clock_gettime(w.sr.args[1], &tv);
@ -3405,74 +3403,58 @@ gettid_out:
break;
}
#ifdef POSTK_DEBUG_ARCH_DEP_13 /* arch depend hide */
case 1079: {
#else /* POSTK_DEBUG_ARCH_DEP_13 */
case __NR_fork: {
#endif /* POSTK_DEBUG_ARCH_DEP_13 */
case __NR_clone: {
struct fork_sync *fs;
struct fork_sync_container *fsc;
struct fork_sync_container *fsc = NULL;
struct fork_sync_container *fp;
struct fork_sync_container *fb;
int flag = w.sr.args[0];
int rc = -1;
pid_t pid;
if (flag == 1) {
pid = w.sr.args[1];
rc = 0;
pthread_mutex_lock(&fork_sync_mutex);
for (fp = fork_sync_top, fb = NULL; fp; fb = fp, fp = fp->next)
if (fp->pid == pid)
break;
if (fp) {
fs = fp->fs;
if (fb)
fb->next = fp->next;
else
fork_sync_top = fp->next;
fs->success = 1;
munmap(fs, sizeof(struct fork_sync));
free(fp);
}
pthread_mutex_unlock(&fork_sync_mutex);
do_syscall_return(fd, cpu, rc, 0, 0, 0, 0);
break;
}
fs = mmap(NULL, sizeof(struct fork_sync),
PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
if (fs == (void *)-1) {
goto fork_err;
}
memset(fs, '\0', sizeof(struct fork_sync));
sem_init(&fs->sem, 1, 0);
fsc = malloc(sizeof(struct fork_sync_container));
if (!fsc) {
goto fork_err;
}
memset(fsc, '\0', sizeof(struct fork_sync_container));
pthread_mutex_lock(&fork_sync_mutex);
fsc->next = fork_sync_top;
fork_sync_top = fsc;
pthread_mutex_unlock(&fork_sync_mutex);
fsc->fs = fs = mmap(NULL, sizeof(struct fork_sync),
PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
if(fs == (void *)-1){
goto fork_err;
}
fsc->fs = fs;
memset(fs, '\0', sizeof(struct fork_sync));
sem_init(&fs->sem, 1, 0);
if(flag){
int pipefds[2];
if(pipe(pipefds) == -1){
rc = -errno;
sem_destroy(&fs->sem);
goto fork_err;
}
pid = fork();
if(pid == 0){
close(pipefds[0]);
pid = fork();
if(pid != 0){
if (write(pipefds[1], &pid, sizeof pid) != sizeof(pid)) {
fprintf(stderr, "error: writing pipefds\n");
}
exit(0);
}
}
else if(pid != -1){
int npid;
int st;
close(pipefds[1]);
if (read(pipefds[0], &npid, sizeof npid) != sizeof(npid)) {
fprintf(stderr, "error: reading pipefds\n");
}
close(pipefds[0]);
waitpid(pid, &st, 0);
pid = npid;
}
else{
rc = -errno;
sem_destroy(&fs->sem);
goto fork_err;
}
}
else
pid = fork();
fsc->pid = pid = fork();
switch (pid) {
/* Error */
@ -3538,12 +3520,13 @@ gettid_out:
fork_child_sync_pipe:
sem_post(&fs->sem);
sem_destroy(&fs->sem);
if (fs->status)
exit(1);
for (fp = fork_sync_top; fp;) {
fb = fp->next;
if (fp->fs)
if (fp->fs && fp->fs != fs)
munmap(fp->fs, sizeof(struct fork_sync));
free(fp);
fp = fb;
@ -3555,6 +3538,16 @@ fork_child_sync_pipe:
ioctl(fd, MCEXEC_UP_NEW_PROCESS, &npdesc);
/* TODO: does the forked thread run in a pthread context? */
while (getppid() != 1 &&
fs->success == 0) {
sched_yield();
}
if (fs->success == 0) {
exit(1);
}
munmap(fs, sizeof(struct fork_sync));
join_all_threads();
return ret;
@ -3562,7 +3555,6 @@ fork_child_sync_pipe:
/* Parent */
default:
fs->pid = pid;
while ((rc = sem_trywait(&fs->sem)) == -1 && (errno == EAGAIN || errno == EINTR)) {
int st;
int wrc;
@ -3585,20 +3577,25 @@ fork_child_sync_pipe:
break;
}
sem_destroy(&fs->sem);
munmap(fs, sizeof(struct fork_sync));
fork_err:
pthread_mutex_lock(&fork_sync_mutex);
for (fp = fork_sync_top, fb = NULL; fp; fb = fp, fp = fp->next)
if (fp == fsc)
break;
if (fp) {
if (fb)
fb->next = fsc->next;
else
fork_sync_top = fsc->next;
if (fs) {
sem_destroy(&fs->sem);
if (rc < 0) {
munmap(fs, sizeof(struct fork_sync));
pthread_mutex_lock(&fork_sync_mutex);
for (fp = fork_sync_top, fb = NULL; fp; fb = fp, fp = fp->next)
if (fp == fsc)
break;
if (fp) {
if (fb)
fb->next = fsc->next;
else
fork_sync_top = fsc->next;
free(fp);
}
pthread_mutex_unlock(&fork_sync_mutex);
}
}
pthread_mutex_unlock(&fork_sync_mutex);
do_syscall_return(fd, cpu, rc, 0, 0, 0, 0);
break;
}
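Review note: both the `flag == 1` branch and the `fork_err` path in this hunk unlink an entry from the singly linked `fork_sync_top` list while holding `fork_sync_mutex`. The pattern can be sketched in isolation as follows, with a simplified, hypothetical `struct node` standing in for `struct fork_sync_container` (the real structure also carries the shared `fs` mapping):

```c
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical stand-in for struct fork_sync_container. */
struct node {
	pid_t pid;
	struct node *next;
};

struct node *list_top;
pthread_mutex_t list_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Push a new entry at the head; returns NULL on allocation failure. */
struct node *list_push(pid_t pid)
{
	struct node *n = calloc(1, sizeof(*n));

	if (!n)
		return NULL;
	n->pid = pid;
	pthread_mutex_lock(&list_mutex);
	n->next = list_top;
	list_top = n;
	pthread_mutex_unlock(&list_mutex);
	return n;
}

/* Unlink and free the entry for pid; returns 1 if found, 0 otherwise. */
int list_remove(pid_t pid)
{
	struct node *fp, *fb;
	int found = 0;

	pthread_mutex_lock(&list_mutex);
	/* fb trails fp by one node so the predecessor can be patched. */
	for (fp = list_top, fb = NULL; fp; fb = fp, fp = fp->next)
		if (fp->pid == pid)
			break;
	if (fp) {
		if (fb)
			fb->next = fp->next;
		else
			list_top = fp->next;
		free(fp);
		found = 1;
	}
	pthread_mutex_unlock(&list_mutex);
	return found;
}
```

Keeping the search, the unlink and the `free()` inside one critical section means no other thread can observe a half-removed entry.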
@ -4277,7 +4274,9 @@ return_linux_spawn:
}
default:
ret = do_generic_syscall(&w);
if (archdep_syscall(&w, &ret)) {
ret = do_generic_syscall(&w);
}
do_syscall_return(fd, cpu, ret, 0, 0, 0, 0);
break;


@ -225,7 +225,7 @@ static void devobj_release(struct memobj *memobj)
return;
}
static int devobj_get_page(struct memobj *memobj, off_t off, int p2align, uintptr_t *physp, unsigned long *flag)
static int devobj_get_page(struct memobj *memobj, off_t off, int p2align, uintptr_t *physp, unsigned long *flag, uintptr_t virt_addr)
{
const off_t pgoff = off / PAGE_SIZE;
struct devobj *obj = to_devobj(memobj);


@ -191,7 +191,7 @@ static struct fileobj *obj_list_lookup(uintptr_t handle)
/***********************************************************************
* fileobj
*/
int fileobj_create(int fd, struct memobj **objp, int *maxprotp)
int fileobj_create(int fd, struct memobj **objp, int *maxprotp, uintptr_t virt_addr)
{
ihk_mc_user_context_t ctx;
struct pager_create_result result __attribute__((aligned(64)));
@ -265,7 +265,7 @@ int fileobj_create(int fd, struct memobj **objp, int *maxprotp)
/* Get the actual pages NUMA interleaved */
for (j = 0; j < nr_pages; ++j) {
mo->pages[j] = ihk_mc_alloc_aligned_pages_node_user(1,
PAGE_P2ALIGN, IHK_MC_AP_NOWAIT, node);
PAGE_P2ALIGN, IHK_MC_AP_NOWAIT, node, virt_addr);
if (!mo->pages[j]) {
kprintf("%s: ERROR: allocating pages[%d]\n",
__FUNCTION__, j);
@ -558,7 +558,7 @@ out:
}
static int fileobj_get_page(struct memobj *memobj, off_t off,
int p2align, uintptr_t *physp, unsigned long *pflag)
int p2align, uintptr_t *physp, unsigned long *pflag, uintptr_t virt_addr)
{
struct thread *proc = cpu_local_var(current);
struct fileobj *obj = to_fileobj(memobj);
@ -571,7 +571,7 @@ static int fileobj_get_page(struct memobj *memobj, off_t off,
struct mcs_rwlock_node mcs_node;
int hash = (off >> PAGE_SHIFT) & FILEOBJ_PAGE_HASH_MASK;
dkprintf("fileobj_get_page(%p,%lx,%x,%p)\n", obj, off, p2align, physp);
dkprintf("fileobj_get_page(%p,%lx,%x,%x,%p)\n", obj, off, p2align, virt_addr, physp);
if (p2align != PAGE_P2ALIGN) {
return -ENOMEM;
}
@ -584,13 +584,13 @@ static int fileobj_get_page(struct memobj *memobj, off_t off,
int page_ind = off >> PAGE_SHIFT;
if (!memobj->pages[page_ind]) {
virt = ihk_mc_alloc_pages_user(1, IHK_MC_AP_NOWAIT | IHK_MC_AP_USER);
virt = ihk_mc_alloc_pages_user(1, IHK_MC_AP_NOWAIT | IHK_MC_AP_USER, virt_addr);
if (!virt) {
error = -ENOMEM;
kprintf("fileobj_get_page(%p,%lx,%x,%p):"
kprintf("fileobj_get_page(%p,%lx,%x,%x,%x,%p):"
"alloc failed. %d\n",
obj, off, p2align, physp,
obj, off, p2align, virt_addr, physp,
error);
goto out_nolock;
}
@ -627,22 +627,22 @@ static int fileobj_get_page(struct memobj *memobj, off_t off,
args = kmalloc(sizeof(*args), IHK_MC_AP_NOWAIT);
if (!args) {
error = -ENOMEM;
kprintf("fileobj_get_page(%p,%lx,%x,%p):"
kprintf("fileobj_get_page(%p,%lx,%x,%x,%p):"
"kmalloc failed. %d\n",
obj, off, p2align, physp, error);
obj, off, p2align, virt_addr, physp, error);
goto out;
}
if (!page) {
npages = 1 << p2align;
virt = ihk_mc_alloc_pages_user(npages, IHK_MC_AP_NOWAIT |
(to_memobj(obj)->flags & MF_ZEROFILL) ? IHK_MC_AP_USER : 0);
virt = ihk_mc_alloc_pages_user(npages, (IHK_MC_AP_NOWAIT |
(to_memobj(obj)->flags & MF_ZEROFILL) ? IHK_MC_AP_USER : 0), virt_addr);
if (!virt) {
error = -ENOMEM;
kprintf("fileobj_get_page(%p,%lx,%x,%p):"
kprintf("fileobj_get_page(%p,%lx,%x,%x,%p):"
"alloc failed. %d\n",
obj, off, p2align, physp,
obj, off, p2align, virt_addr, physp,
error);
goto out;
}
@ -707,8 +707,8 @@ out_nolock:
if (args) {
kfree(args);
}
dkprintf("fileobj_get_page(%p,%lx,%x,%p): %d %lx\n",
obj, off, p2align, physp, error, phys);
dkprintf("fileobj_get_page(%p,%lx,%x,%x,%p): %d %lx\n",
obj, off, p2align, virt_addr, physp, error, phys);
return error;
}
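Review note: the `npages = 1 << p2align` allocation above combines `IHK_MC_AP_NOWAIT |` with a `? :` expression. In C, `|` binds tighter than `?:`, so `A | cond ? B : 0` parses as `(A | cond) ? B : 0`, and wrapping the whole expression in one pair of parentheses does not change that. A small sketch with hypothetical flag values (the real `IHK_MC_AP_*` and `MF_ZEROFILL` values are not shown in this diff, and which behaviour is intended is an assumption):

```c
/* Hypothetical flag values, chosen only for illustration. */
#define AP_NOWAIT   0x1
#define AP_USER     0x2
#define MF_ZEROFILL 0x4

/*
 * Same shape as the expression in the diff. It parses as
 * (AP_NOWAIT | (flags & MF_ZEROFILL)) ? AP_USER : 0,
 * so AP_NOWAIT never reaches the result.
 */
int flags_as_written(int memobj_flags)
{
	return (AP_NOWAIT | (memobj_flags & MF_ZEROFILL) ? AP_USER : 0);
}

/* Parenthesizing the conditional keeps AP_NOWAIT in the result. */
int flags_parenthesized(int memobj_flags)
{
	return AP_NOWAIT | ((memobj_flags & MF_ZEROFILL) ? AP_USER : 0);
}
```

With these values the first form yields `AP_USER` regardless of `MF_ZEROFILL`, while the second yields `AP_NOWAIT` alone or `AP_NOWAIT | AP_USER`.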


@ -673,9 +673,11 @@ static uint64_t futex_wait_queue_me(struct futex_hash_bucket *hb, struct futex_q
xchg4(&(cpu_local_var(current)->status), PS_INTERRUPTIBLE);
/* Indicate spin sleep */
irqstate = ihk_mc_spinlock_lock(&thread->spin_sleep_lock);
thread->spin_sleep = 1;
ihk_mc_spinlock_unlock(&thread->spin_sleep_lock, irqstate);
if (!idle_halt) {
irqstate = ihk_mc_spinlock_lock(&thread->spin_sleep_lock);
thread->spin_sleep = 1;
ihk_mc_spinlock_unlock(&thread->spin_sleep_lock, irqstate);
}
queue_me(q, hb);


@ -79,7 +79,7 @@ int prepare_process_ranges_args_envs(struct thread *thread,
unsigned long s, e, up;
char **argv;
char **a;
int i, n, argc, envc, args_envs_npages, l;
int i, n, argc, envc, args_envs_npages;
char **env;
int range_npages;
void *up_v;
@ -148,7 +148,7 @@ int prepare_process_ranges_args_envs(struct thread *thread,
}
if ((up_v = ihk_mc_alloc_pages_user(range_npages,
IHK_MC_AP_NOWAIT | ap_flags)) == NULL) {
IHK_MC_AP_NOWAIT | ap_flags, s)) == NULL) {
kprintf("ERROR: alloc pages for ELF section %i\n", i);
goto err;
}
@ -259,7 +259,7 @@ int prepare_process_ranges_args_envs(struct thread *thread,
e = addr + PAGE_SIZE * ARGENV_PAGE_COUNT;
if((args_envs = ihk_mc_alloc_pages_user(ARGENV_PAGE_COUNT,
IHK_MC_AP_NOWAIT)) == NULL){
IHK_MC_AP_NOWAIT, -1)) == NULL){
kprintf("ERROR: allocating pages for args/envs\n");
goto err;
}
@ -349,21 +349,25 @@ int prepare_process_ranges_args_envs(struct thread *thread,
// Update variables
argc = *((long *)(args_envs));
dkprintf("argc: %d\n", argc);
argv = (char **)(args_envs + (sizeof(long)));
if(proc->saved_cmdline){
if (proc->saved_cmdline) {
kfree(proc->saved_cmdline);
proc->saved_cmdline = NULL;
proc->saved_cmdline_len = 0;
}
for(a = argv, l = 0; *a; a++)
l += strlen(args_envs + (unsigned long)*a) + 1;
proc->saved_cmdline = kmalloc(p->args_len, IHK_MC_AP_NOWAIT);
if(!proc->saved_cmdline)
if (!proc->saved_cmdline) {
goto err;
proc->saved_cmdline_len = l;
for(a = argv, l = 0; *a; a++){
strcpy(proc->saved_cmdline + l, args_envs + (unsigned long)*a);
l += strlen(args_envs + (unsigned long)*a) + 1;
}
proc->saved_cmdline_len = p->args_len - ((argc + 1) * sizeof(char **));
memcpy(proc->saved_cmdline,
(char *)args_envs + ((argc + 1) * sizeof(char **)),
proc->saved_cmdline_len);
for (a = argv; *a; a++) {
*a = (char *)addr + (unsigned long)*a; // Process' address space!
}


@ -97,6 +97,8 @@ struct cpu_local_var {
ihk_spinlock_t smp_func_req_lock;
struct list_head smp_func_req_list;
struct process_vm *on_fork_vm;
} __attribute__((aligned(64)));


@ -117,78 +117,9 @@
#include <arch/system.h>
#endif
#ifdef POSTK_DEBUG_ARCH_DEP_8 /* arch depend hide */
#else
static inline int futex_atomic_op_inuser(int encoded_op, int __user *uaddr)
{
int op = (encoded_op >> 28) & 7;
int cmp = (encoded_op >> 24) & 15;
int oparg = (encoded_op << 8) >> 20;
int cmparg = (encoded_op << 20) >> 20;
int oldval = 0, ret, tem;
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
oparg = 1 << oparg;
#ifdef __UACCESS__
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(int)))
return -EFAULT;
#endif
switch (op) {
case FUTEX_OP_SET:
__futex_atomic_op1("xchgl %0, %2", ret, oldval, uaddr, oparg);
break;
case FUTEX_OP_ADD:
__futex_atomic_op1("lock; xaddl %0, %2", ret, oldval,
uaddr, oparg);
break;
case FUTEX_OP_OR:
__futex_atomic_op2("orl %4, %3", ret, oldval, uaddr, oparg);
break;
case FUTEX_OP_ANDN:
__futex_atomic_op2("andl %4, %3", ret, oldval, uaddr, ~oparg);
break;
case FUTEX_OP_XOR:
__futex_atomic_op2("xorl %4, %3", ret, oldval, uaddr, oparg);
break;
default:
ret = -ENOSYS;
}
if (!ret) {
switch (cmp) {
case FUTEX_OP_CMP_EQ:
ret = (oldval == cmparg);
break;
case FUTEX_OP_CMP_NE:
ret = (oldval != cmparg);
break;
case FUTEX_OP_CMP_LT:
ret = (oldval < cmparg);
break;
case FUTEX_OP_CMP_GE:
ret = (oldval >= cmparg);
break;
case FUTEX_OP_CMP_LE:
ret = (oldval <= cmparg);
break;
case FUTEX_OP_CMP_GT:
ret = (oldval > cmparg);
break;
default:
ret = -ENOSYS;
}
}
return ret;
}
#endif /* arch depend hide */
#endif // __KERNEL__
#endif // _ASM_X86_FUTEX_H
#define FUTEX_HASHBITS 8 /* 256 entries in each futex hash tbl */
#define FUT_OFF_INODE 1 /* We set bit 0 if key has a reference on inode */


@ -18,6 +18,7 @@ extern void mem_init(void);
extern void ihk_ikc_master_init(void);
extern void ap_init(void);
extern void arch_ready(void);
extern void done_init(void);
extern void mc_ikc_test_init(void);
extern void cpu_local_var_init(void);
extern void kmalloc_init(void);


@ -65,7 +65,7 @@ struct memobj {
typedef void memobj_release_func_t(struct memobj *obj);
typedef void memobj_ref_func_t(struct memobj *obj);
typedef int memobj_get_page_func_t(struct memobj *obj, off_t off, int p2align, uintptr_t *physp, unsigned long *flag);
typedef int memobj_get_page_func_t(struct memobj *obj, off_t off, int p2align, uintptr_t *physp, unsigned long *flag, uintptr_t virt_addr);
typedef uintptr_t memobj_copy_page_func_t(struct memobj *obj, uintptr_t orgphys, int p2align);
typedef int memobj_flush_page_func_t(struct memobj *obj, uintptr_t phys, size_t pgsize);
typedef int memobj_invalidate_page_func_t(struct memobj *obj, uintptr_t phys, size_t pgsize);
@ -96,10 +96,10 @@ static inline void memobj_ref(struct memobj *obj)
}
static inline int memobj_get_page(struct memobj *obj, off_t off,
int p2align, uintptr_t *physp, unsigned long *pflag)
int p2align, uintptr_t *physp, unsigned long *pflag, uintptr_t virt_addr)
{
if (obj->ops->get_page) {
return (*obj->ops->get_page)(obj, off, p2align, physp, pflag);
return (*obj->ops->get_page)(obj, off, p2align, physp, pflag, virt_addr);
}
return -ENXIO;
}
@ -159,7 +159,7 @@ static inline int memobj_is_removable(struct memobj *obj)
return !!(obj->flags & MF_IS_REMOVABLE);
}
int fileobj_create(int fd, struct memobj **objp, int *maxprotp);
int fileobj_create(int fd, struct memobj **objp, int *maxprotp, uintptr_t virt_addr);
struct shmid_ds;
int shmobj_create(struct shmid_ds *ds, struct memobj **objp);
int zeroobj_create(struct memobj **objp);


@ -277,6 +277,8 @@ struct resource_set {
extern struct list_head resource_set_list;
extern mcs_rwlock_lock_t resource_set_lock;
extern int idle_halt;
extern int allow_oversubscribe;
struct process_hash {
struct list_head list[HASH_SIZE];
@ -390,7 +392,7 @@ struct vm_range {
};
struct vm_range_numa_policy {
struct list_head list;
struct rb_node policy_rb_node;
unsigned long start, end;
DECLARE_BITMAP(numa_mask, PROCESS_NUMA_MASK_BITS);
int numa_mem_policy;
@ -419,6 +421,7 @@ struct mckfd {
long (*mmap_cb)(struct mckfd *, ihk_mc_user_context_t *);
int (*close_cb)(struct mckfd *, ihk_mc_user_context_t *);
int (*fcntl_cb)(struct mckfd *, ihk_mc_user_context_t *);
int (*dup_cb)(struct mckfd *, ihk_mc_user_context_t *);
};
#define SFD_CLOEXEC 02000000
@ -436,6 +439,7 @@ struct sig_pending {
sigset_t sigmask;
siginfo_t info;
int ptracecont;
int interrupted;
};
typedef void pgio_func_t(void *arg);
@ -486,7 +490,7 @@ struct process {
// V +---- |
// PS_STOPPED -----+
// (PS_TRACED)
int exit_status; // only for zombie
unsigned long exit_status; // only for zombie
/* Store exit_status for a group of threads when stopped by SIGSTOP.
exit_status can't be used because values of exit_status of threads
@ -572,9 +576,6 @@ struct process {
int nr_processes; /* For partitioned execution */
};
void hold_thread(struct thread *ftn);
void release_thread(struct thread *ftn);
/*
* Scheduling policies
*/
@ -725,7 +726,7 @@ struct process_vm {
DECLARE_BITMAP(numa_mask, PROCESS_NUMA_MASK_BITS);
int numa_mem_policy;
/* Protected by memory_range_lock */
struct list_head vm_range_numa_policy_list;
struct rb_root vm_range_numa_policy_tree;
struct vm_range *range_cache[VM_RANGE_CACHE_SIZE];
int range_cache_ind;
struct swapinfo *swapinfo;
@ -750,7 +751,7 @@ struct thread *create_thread(unsigned long user_pc,
struct thread *clone_thread(struct thread *org, unsigned long pc,
unsigned long sp, int clone_flags);
void destroy_thread(struct thread *thread);
void hold_thread(struct thread *thread);
int hold_thread(struct thread *thread);
void release_thread(struct thread *thread);
void flush_process_memory(struct process_vm *vm);
void hold_process_vm(struct process_vm *vm);


@ -35,6 +35,7 @@ rusage_rss_add(unsigned long size)
unsigned long newval;
unsigned long oldval;
unsigned long retval;
struct process_vm *vm;
newval = __sync_add_and_fetch(&rusage->rss_current, size);
oldval = rusage->memory_max_usage;
@ -46,12 +47,28 @@ rusage_rss_add(unsigned long size)
}
oldval = retval;
}
/* process rss */
vm = cpu_local_var(on_fork_vm);
if (!vm) {
vm = cpu_local_var(current)->vm;
}
vm->currss += size;
if (vm->currss > vm->proc->maxrss) {
vm->proc->maxrss = vm->currss;
}
}
static inline void
rusage_rss_sub(unsigned long size)
{
struct process_vm *vm = cpu_local_var(current)->vm;
__sync_sub_and_fetch(&rusage->rss_current, size);
/* process rss */
vm->currss -= size;
}
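Review note: the per-process accounting added to `rusage_rss_add()` and `rusage_rss_sub()` is a running counter plus a high-water mark, charged to `on_fork_vm` when that CPU-local pointer is set and to the current thread's `vm` otherwise. A minimal sketch of that logic, using hypothetical, simplified structures in place of `struct process_vm` and `struct process`:

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct process and struct process_vm. */
struct proc_stats {
	unsigned long maxrss;	/* peak resident size seen so far */
};

struct vm_stats {
	unsigned long currss;	/* current resident size */
	struct proc_stats *proc;
};

/* Charge the VM being forked if there is one, else the current VM. */
struct vm_stats *charge_target(struct vm_stats *on_fork_vm,
			       struct vm_stats *current_vm)
{
	return on_fork_vm ? on_fork_vm : current_vm;
}

/* Add to the running total and raise the peak if it was exceeded. */
void vm_rss_add(struct vm_stats *vm, unsigned long size)
{
	vm->currss += size;
	if (vm->currss > vm->proc->maxrss)
		vm->proc->maxrss = vm->currss;
}

/* Subtract from the running total; the peak is left untouched. */
void vm_rss_sub(struct vm_stats *vm, unsigned long size)
{
	vm->currss -= size;
}
```

The sketch leaves out locking and atomics; it only shows how the counter and the peak relate.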
static inline void memory_stat_rss_add(unsigned long size, int pgsize)


@ -471,6 +471,8 @@ int arch_map_vdso(struct process_vm *vm); /* arch dependent */
int arch_setup_vdso(void);
int arch_cpu_read_write_register(struct ihk_os_cpu_register *desc,
enum mcctrl_os_cpu_operation op);
struct vm_range_numa_policy *vm_range_policy_search(struct process_vm *vm, uintptr_t addr);
time_t time(void);
#ifndef POSTK_DEBUG_ARCH_DEP_52
#define VDSO_MAXPAGES 2


@ -252,6 +252,7 @@ extern struct xpmem_partition *xpmem_my_part;
static int xpmem_ioctl(struct mckfd *mckfd, ihk_mc_user_context_t *ctx);
static int xpmem_close(struct mckfd *mckfd, ihk_mc_user_context_t *ctx);
static int xpmem_dup(struct mckfd *mckfd, ihk_mc_user_context_t *ctx);
static int xpmem_init(void);
static void xpmem_exit(void);
@ -275,6 +276,7 @@ static int xpmem_release(xpmem_apid_t);
static void xpmem_release_ap(struct xpmem_thread_group *,
struct xpmem_access_permit *);
static void xpmem_release_aps_of_tg(struct xpmem_thread_group *ap_tg);
static void xpmem_flush(struct mckfd *);
static int xpmem_attach(struct mckfd *, xpmem_apid_t, off_t, size_t,
unsigned long, int, int, unsigned long *);


@ -144,6 +144,18 @@ static void parse_kargs(void)
}
}
ihk_mc_set_dump_level(dump_level);
/* idle_halt option */
ptr = find_command_line("idle_halt");
if (ptr) {
idle_halt = 1;
}
/* allow_oversubscribe option */
ptr = find_command_line("allow_oversubscribe");
if (ptr) {
allow_oversubscribe = 1;
}
}
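Review note: `idle_halt` and `allow_oversubscribe` are enabled purely by their presence on the kernel command line. How `find_command_line()` matches is not visible in this diff; if it is a plain substring search, an option such as `no_idle_halt` would also enable `idle_halt`. The sketch below shows one way to match whole tokens only. The function `has_option()` is hypothetical and is not the McKernel implementation:

```c
#include <string.h>

/*
 * Hypothetical helper: returns 1 if name appears in cmdline as a whole
 * space-separated token (optionally followed by "=value"), else 0.
 */
int has_option(const char *cmdline, const char *name)
{
	size_t len = strlen(name);
	const char *p = cmdline;

	if (len == 0)
		return 0;
	while ((p = strstr(p, name)) != NULL) {
		int starts = (p == cmdline || p[-1] == ' ');
		int ends = (p[len] == '\0' || p[len] == ' ' || p[len] == '=');

		if (starts && ends)
			return 1;
		p += len;	/* keep scanning after this partial match */
	}
	return 0;
}
```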
extern void ihk_mc_get_boot_time(unsigned long *tv_sec, unsigned long *tv_nsec);
@ -381,6 +393,7 @@ int main(void)
futex_init();
done_init();
kputs("IHK/McKernel booted.\n");
#ifdef DCFA_KMOD


@ -76,7 +76,7 @@ static void *___kmalloc(int size, ihk_mc_ap_flag flag);
static void ___kfree(void *ptr);
static void *___ihk_mc_alloc_aligned_pages_node(int npages,
int p2align, ihk_mc_ap_flag flag, int node, int is_user);
int p2align, ihk_mc_ap_flag flag, int node, int is_user, uintptr_t virt_addr);
static void *___ihk_mc_alloc_pages(int npages, ihk_mc_ap_flag flag, int is_user);
static void ___ihk_mc_free_pages(void *p, int npages, int is_user);
@ -193,7 +193,7 @@ struct pagealloc_track_entry *__pagealloc_track_find_entry(
/* Top level routines called from macros */
void *_ihk_mc_alloc_aligned_pages_node(int npages, int p2align,
ihk_mc_ap_flag flag, int node, int is_user,
ihk_mc_ap_flag flag, int node, int is_user, uintptr_t virt_addr,
char *file, int line)
{
unsigned long irqflags;
@ -201,7 +201,7 @@ void *_ihk_mc_alloc_aligned_pages_node(int npages, int p2align,
struct pagealloc_track_addr_entry *addr_entry;
int hash, addr_hash;
void *r = ___ihk_mc_alloc_aligned_pages_node(npages,
p2align, flag, node, is_user);
p2align, flag, node, is_user, virt_addr);
if (!memdebug || !pagealloc_track_initialized)
return r;
@ -497,10 +497,10 @@ void pagealloc_memcheck(void)
/* Actual allocation routines */
static void *___ihk_mc_alloc_aligned_pages_node(int npages, int p2align,
ihk_mc_ap_flag flag, int node, int is_user)
ihk_mc_ap_flag flag, int node, int is_user, uintptr_t virt_addr)
{
if (pa_ops)
return pa_ops->alloc_page(npages, p2align, flag, node, is_user);
return pa_ops->alloc_page(npages, p2align, flag, node, is_user, virt_addr);
else
return early_alloc_pages(npages);
}
@ -508,7 +508,7 @@ static void *___ihk_mc_alloc_aligned_pages_node(int npages, int p2align,
static void *___ihk_mc_alloc_pages(int npages, ihk_mc_ap_flag flag,
int is_user)
{
return ___ihk_mc_alloc_aligned_pages_node(npages, PAGE_P2ALIGN, flag, -1, is_user);
return ___ihk_mc_alloc_aligned_pages_node(npages, PAGE_P2ALIGN, flag, -1, is_user, -1);
}
static void ___ihk_mc_free_pages(void *p, int npages, int is_user)
@ -544,7 +544,7 @@ static void reserve_pages(struct ihk_page_allocator_desc *pa_allocator,
extern int cpu_local_var_initialized;
static void *mckernel_allocate_aligned_pages_node(int npages, int p2align,
ihk_mc_ap_flag flag, int pref_node, int is_user)
ihk_mc_ap_flag flag, int pref_node, int is_user, uintptr_t virt_addr)
{
unsigned long pa = 0;
int i, node;
@ -553,6 +553,12 @@ static void *mckernel_allocate_aligned_pages_node(int npages, int p2align,
#endif
int numa_id;
struct vm_range_numa_policy *range_policy_iter = NULL;
int numa_mem_policy = -1;
struct process_vm *vm;
struct vm_range *range = NULL;
int chk_shm = 0;
if(npages <= 0)
return NULL;
@ -565,7 +571,23 @@ static void *mckernel_allocate_aligned_pages_node(int npages, int p2align,
/* No explicitly requested NUMA or user policy? */
if ((pref_node == -1) && (!(flag & IHK_MC_AP_USER) ||
cpu_local_var(current)->vm->numa_mem_policy == MPOL_DEFAULT)) {
goto distance_based;
if (virt_addr != -1) {
vm = cpu_local_var(current)->vm;
range_policy_iter = vm_range_policy_search(vm, virt_addr);
if (range_policy_iter) {
range = lookup_process_memory_range(vm, (uintptr_t)virt_addr, ((uintptr_t)virt_addr) + 1);
if (range) {
if( (range->memobj) && (range->memobj->flags == MF_SHM)) {
chk_shm = 1;
}
}
}
}
if ((!((range_policy_iter) && (range_policy_iter->numa_mem_policy != MPOL_DEFAULT))) && (chk_shm == 0))
goto distance_based;
}
node = ihk_mc_get_numa_id();
@ -611,7 +633,28 @@ static void *mckernel_allocate_aligned_pages_node(int npages, int p2align,
}
}
switch (cpu_local_var(current)->vm->numa_mem_policy) {
if ((virt_addr != -1) && (chk_shm == 0)) {
vm = cpu_local_var(current)->vm;
if (!(range_policy_iter)) {
range_policy_iter = vm_range_policy_search(vm, virt_addr);
}
if (range_policy_iter) {
range = lookup_process_memory_range(vm, (uintptr_t)virt_addr, ((uintptr_t)virt_addr) + 1);
if ((range && (range->memobj->flags == MF_SHM))) {
chk_shm = 1;
} else {
numa_mem_policy = range_policy_iter->numa_mem_policy;
}
}
}
if (numa_mem_policy == -1)
numa_mem_policy = cpu_local_var(current)->vm->numa_mem_policy;
switch (numa_mem_policy) {
case MPOL_BIND:
case MPOL_PREFERRED:
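Review note: the hunk above picks the NUMA policy for an allocation in two steps: a per-range policy found via `vm_range_policy_search()` wins unless the range is backed by shared memory, and the process-wide `numa_mem_policy` is the fallback. A condensed sketch of that decision, with hypothetical constants and scalar inputs in place of the kernel structures:

```c
/* Hypothetical sentinel mirroring the -1 initial value in the diff. */
#define POLICY_UNSET (-1)

/* Illustrative policy numbers; the kernel's MPOL_* values may differ. */
enum { POL_DEFAULT = 0, POL_PREFERRED = 1, POL_BIND = 2, POL_INTERLEAVE = 3 };

/*
 * has_range_policy: a per-range policy covers the faulting address
 * range_policy:     that policy's mode (ignored if none was found)
 * is_shm:           the covering range is backed by shared memory
 * process_policy:   the process-wide default mode
 */
int effective_policy(int has_range_policy, int range_policy,
		     int is_shm, int process_policy)
{
	int policy = POLICY_UNSET;

	if (has_range_policy && !is_shm)
		policy = range_policy;
	if (policy == POLICY_UNSET)
		policy = process_policy;
	return policy;
}
```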
@ -1169,10 +1212,6 @@ static void page_fault_handler(void *fault_addr, uint64_t reason, void *regs)
info._sifields._sigfault.si_addr = fault_addr;
set_signal(SIGSEGV, regs, &info);
}
if(interrupt_from_user(regs)){
cpu_enable_interrupt();
check_signal(0, regs, 0);
}
goto out;
}
@ -1181,8 +1220,12 @@ static void page_fault_handler(void *fault_addr, uint64_t reason, void *regs)
out:
dkprintf("%s: addr: %p, reason: %lx, regs: %p -> error: %d\n",
__FUNCTION__, fault_addr, reason, regs, error);
check_need_resched();
set_cputime(0);
if(interrupt_from_user(regs)){
cpu_enable_interrupt();
check_need_resched();
check_signal(0, regs, 0);
}
set_cputime(interrupt_from_user(regs)? 0: 1);
#ifdef PROFILE_ENABLE
if (thread->profile)
profile_event_add(PROFILE_page_fault, (rdtsc() - t_s));
@ -1621,6 +1664,8 @@ void *ihk_mc_map_virtual(unsigned long phys, int npages,
ihk_pagealloc_free(vmap_allocator, virt_to_phys(p), npages);
return NULL;
}
flush_tlb_single((unsigned long)(p + (i << PAGE_SHIFT)));
}
barrier();
return (char *)p + offset;
@ -1633,17 +1678,14 @@ void ihk_mc_unmap_virtual(void *va, int npages, int free_physical)
va = (void *)((unsigned long)va & PAGE_MASK);
for (i = 0; i < npages; i++) {
ihk_mc_pt_clear_page(NULL, (char *)va + (i << PAGE_SHIFT));
flush_tlb_single((unsigned long)(va + (i << PAGE_SHIFT)));
}
#ifdef POSTK_DEBUG_TEMP_FIX_42 /* add unmap virtual tlb flush. */
flush_tlb();
#endif /* POSTK_DEBUG_TEMP_FIX_42 */
#ifdef POSTK_DEBUG_TEMP_FIX_51 /* ihk_mc_unmap_virtual() free_physical disabled */
ihk_pagealloc_free(vmap_allocator, (unsigned long)va, npages);
#else /* POSTK_DEBUG_TEMP_FIX_51 */
if (free_physical) {
ihk_pagealloc_free(vmap_allocator, (unsigned long)va, npages);
flush_tlb_single((unsigned long)va);
}
#endif /* POSTK_DEBUG_TEMP_FIX_51 */
}


@ -96,7 +96,7 @@ struct swapinfo {
struct arealist swap_area;
struct arealist mlock_area;
struct mlockcntnr mlock_container;
#define UDATA_BUFSIZE (8*1024)
#define UDATA_BUFSIZE PAGE_SIZE
char *swapfname;
char *udata_buf; /* To read-store data from Linux to user space */
@ -295,7 +295,7 @@ static int
pager_open(struct swapinfo *si, char *fname, int flag, int mode)
{
int fd;
strcpy(si->udata_buf, fname);
copy_to_user(si->udata_buf, fname, strlen(fname) + 1);
fd = linux_open(si->udata_buf, flag, mode);
return fd;
}
@ -303,22 +303,69 @@ pager_open(struct swapinfo *si, char *fname, int flag, int mode)
static int
pager_unlink(struct swapinfo *si, char *fname)
{
strcpy(si->udata_buf, fname);
copy_to_user(si->udata_buf, fname, strlen(fname) + 1);
return linux_unlink(si->udata_buf);
}
static int
pager_copy_from_user(void * dst, void * from, size_t size, struct process_vm *vm)
{
int ret;
void *virt;
unsigned long psize;
unsigned long rphys;
int faulted = 0;
if (size > PAGE_SIZE) {
ret = -EFAULT;
return ret;
}
retry_lookup:
/* remember page */
ret = ihk_mc_pt_virt_to_phys_size(vm->address_space->page_table,
dst, &rphys, &psize);
if (ret) {
uint64_t reason = PF_POPULATE | PF_WRITE | PF_USER;
void *addr= (void *)(((unsigned long)dst)& PAGE_MASK);
if (faulted) {
ret = -EFAULT;
return ret;
}
ret = page_fault_process_vm(vm, addr, reason);
if (ret) {
ret = -EFAULT;
return ret;
}
faulted = 1;
goto retry_lookup;
}
virt = phys_to_virt(rphys);
ret = copy_from_user(virt, from, size);
return ret;
}
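Review note: the new `pager_copy_from_user()` looks up the destination page, and if the lookup fails it triggers one page fault and retries, giving up with `-EFAULT` if either the fault or the second lookup fails. The control flow can be sketched on its own as below; `struct fake_pt`, `fake_lookup()` and `fake_fault()` are hypothetical stand-ins for the page table, `ihk_mc_pt_virt_to_phys_size()` and `page_fault_process_vm()`:

```c
#include <errno.h>

/* Hypothetical one-page "page table" used only for this sketch. */
struct fake_pt {
	int mapped;		/* nonzero once the page is present */
	int can_fault_in;	/* nonzero if a fault is able to map it */
	unsigned long phys;	/* address reported by a successful lookup */
	int lookups;		/* number of lookups performed */
};

int fake_lookup(struct fake_pt *pt, unsigned long *phys)
{
	pt->lookups++;
	if (!pt->mapped)
		return -1;
	*phys = pt->phys;
	return 0;
}

int fake_fault(struct fake_pt *pt)
{
	if (!pt->can_fault_in)
		return -1;
	pt->mapped = 1;
	return 0;
}

/* Look up the page; on a miss, fault it in once and look again. */
int lookup_with_one_fault(struct fake_pt *pt, unsigned long *phys)
{
	int faulted = 0;

	for (;;) {
		if (fake_lookup(pt, phys) == 0)
			return 0;
		if (faulted)
			return -EFAULT;	/* still missing after the fault */
		if (fake_fault(pt) != 0)
			return -EFAULT;	/* the fault itself failed */
		faulted = 1;
	}
}
```

The `faulted` flag bounds the loop to at most two lookups, which is the same guarantee the `goto retry_lookup` form gives.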
static ssize_t
pager_read(struct swapinfo *si, int fd, void *start, size_t size)
pager_read(struct swapinfo *si, int fd, void *start, size_t size,struct process_vm *vm)
{
ssize_t off, sz, rs;
kprintf("pager_read: %lx (%lx)\n", start, size);
for (off = 0; off < size; off += sz) {
sz = size - off;
sz = (sz > UDATA_BUFSIZE) ? UDATA_BUFSIZE : sz;
rs = linux_read(fd, si->udata_buf, sz);
if (rs != sz) return rs;
copy_to_user(start + off, si->udata_buf, sz);
rs = pager_copy_from_user(start + off, si->udata_buf, sz, vm);
if (rs != 0) return rs;
}
return off;
}
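Review note: `pager_read()` moves data through `si->udata_buf` in pieces of at most `UDATA_BUFSIZE` bytes, which this commit sets to `PAGE_SIZE` so that each piece fits the single-page limit in `pager_copy_from_user()`. The chunking loop alone looks roughly like this; `BOUNCE_SIZE` and `chunked_copy()` are hypothetical names, and plain `memcpy()` stands in for `linux_read()` and the user-space copy:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bounce-buffer size, playing the role of UDATA_BUFSIZE. */
#define BOUNCE_SIZE 4096

/* Copy size bytes from src to dst through a fixed-size bounce buffer. */
size_t chunked_copy(char *dst, const char *src, size_t size)
{
	char bounce[BOUNCE_SIZE];
	size_t off, sz;

	for (off = 0; off < size; off += sz) {
		sz = size - off;
		if (sz > BOUNCE_SIZE)
			sz = BOUNCE_SIZE;	/* never exceed one chunk */
		memcpy(bounce, src + off, sz);	/* stands in for linux_read() */
		memcpy(dst + off, bounce, sz);	/* stands in for the user copy */
	}
	return off;
}
```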
@ -354,23 +401,26 @@ mlocklist_req(unsigned long start, unsigned long end, struct addrpair *addr, int
static int
mlocklist_morereq(struct swapinfo *si, unsigned long *start)
{
struct areaent *ent = si->mlock_area.tail;
struct areaent went,*ent = si->mlock_area.tail;
copy_from_user(&went, ent, sizeof(struct areaent));
dkprintf("mlocklist_morereq: start = %ld and = %ld\n",
ent->pair[ent->count].start, ent->pair[ent->count].end);
if (ent->pair[ent->count].start != (unsigned long) -1) {
went.pair[went.count].start, went.pair[went.count].end);
if (went.pair[went.count].start != (unsigned long) -1) {
return 0;
}
*start = ent->pair[ent->count].end;
*start = went.pair[went.count].end;
return 1;
}
static int
arealist_alloc(struct swapinfo *si, struct arealist *areap)
{
struct areaent went;
areap->head = areap->tail = myalloc(si, sizeof(struct areaent));
if (areap->head == NULL) return -ENOMEM;
memset(areap->head, 0, sizeof(struct areaent));
memset(&went, 0, sizeof(struct areaent));
copy_to_user(areap->head, &went, sizeof(struct areaent));
return 0;
}
@ -402,7 +452,7 @@ arealist_free(struct arealist *area)
static int
arealist_get(struct swapinfo *si, struct addrpair **pair, struct arealist *area)
{
struct areaent *tmp;
struct areaent *tmp,wtmp;
struct areaent *tail = area->tail;
if (tail->count < MLOCKADDRS_SIZE - 1) { /* at least two entries are needed */
if (pair) *pair = &tail->pair[tail->count];
@ -412,8 +462,10 @@ arealist_get(struct swapinfo *si, struct addrpair **pair, struct arealist *area)
if (tmp == NULL) {
return -1;
}
memset(tmp, 0, sizeof(struct areaent));
area->tail->next = tmp;
memset(&wtmp, 0, sizeof(struct areaent));
copy_to_user(tmp, &wtmp, sizeof(struct areaent));
copy_to_user(&(area->tail->next), &tmp, sizeof(struct areaent *));
area->tail = tmp;
if (pair) *pair = area->tail->pair;
return MLOCKADDRS_SIZE;
@ -422,7 +474,10 @@ arealist_get(struct swapinfo *si, struct addrpair **pair, struct arealist *area)
static void
arealist_update(int cnt, struct arealist *area)
{
area->tail->count += cnt;
int i;
copy_from_user(&i, &(area->tail->count), sizeof(int));
i += cnt;
copy_to_user(&(area->tail->count), &i, sizeof(int));
area->count += cnt;
}
@ -431,11 +486,13 @@ arealist_add(struct swapinfo *si, unsigned long start, unsigned long end,
unsigned long flag, struct arealist *area)
{
int cc;
struct addrpair *addr;
struct addrpair *addr,waddr;
cc = arealist_get(si, &addr, area);
if (cc < 0) return -1;
addr->start = start; addr->end = end; addr->flag = flag;
waddr.start = start; waddr.end = end; waddr.flag = flag;
copy_to_user(addr, &waddr, sizeof(struct addrpair));
arealist_update(1, area);
return 0;
}
@ -444,27 +501,31 @@ static int
arealist_preparewrite(struct arealist *areap, struct swap_areainfo *info,
ssize_t off, struct process_vm *vm, int flag)
{
struct areaent *ent;
struct areaent *ent,went;
int count = 0;
ssize_t totsz = 0;
unsigned long pos;
struct page_table *pt = vm->address_space->page_table;
for (ent = areap->head; ent != NULL; ent = ent->next) {
int i;
for (i = 0; i < ent->count; i++, count++) {
ssize_t sz = ent->pair[i].end - ent->pair[i].start;
info[count].start = ent->pair[i].start;
info[count].end = ent->pair[i].end;
info[count].flag = ent->pair[i].flag;
copy_from_user(&went, ent, sizeof(struct areaent));
for (i = 0; i < went.count; i++, count++) {
ssize_t sz = went.pair[i].end - went.pair[i].start;
copy_to_user(&(info[count].start), &(went.pair[i].start), sizeof(unsigned long));
copy_to_user(&(info[count].end), &(went.pair[i].end), sizeof(unsigned long));
copy_to_user(&(info[count].flag), &(went.pair[i].flag), sizeof(unsigned long));
if (flag) { /* position in file */
info[count].pos = off + totsz;
pos = off + totsz;
} else { /* physical memory */
if (ihk_mc_pt_virt_to_phys(pt,
(void*) ent->pair[i].start,
&info[count].pos)) {
&pos)) {
kprintf("Cannot get phys\n");
}
}
copy_to_user(&(info[count].pos), &pos, sizeof(unsigned long));
totsz += sz;
}
}
@ -489,6 +550,7 @@ arealist_print(char *msg, struct arealist *areap, int count)
for (ent = areap->head; ent != NULL; ent = ent->next) {
int i;
for (i = 0; i < ent->count; i++) {
kprintf("\t%p -- %p\n",
(void*) ent->pair[i].start, (void*) ent->pair[i].end);
}
@ -608,13 +670,13 @@ do_pagein(int flag)
extern int ihk_mc_pt_print_pte(struct page_table *pt, void *virt);
sz = si->swap_info[i].end - si->swap_info[i].start;
dkprintf("pagein: %016lx:%016lx sz(%lx)\n", si->swap_info[i].start, si->swap_info[i].end, sz);
rs = pager_read(si, fd, (void*) si->swap_info[i].start, sz);
rs = pager_read(si, fd, (void*) si->swap_info[i].start, sz, vm);
if (rs != sz) goto err;
// ihk_mc_pt_print_pte(vm->address_space->page_table, (void*) si->swap_info[i].start);
}
linux_close(fd);
print_region("after pagin", vm);
kprintf("do_pagein: done, currss(%lx)\n", vm->currss);
dkprintf("do_pagein: done, currss(%lx)\n", vm->currss);
vm->swapinfo = NULL;
kfree(si->swapfname);
kfree(si);
@ -679,7 +741,8 @@ do_pageout(char *fname, void *buf, size_t size, int flag)
goto err;
}
fd = linux_open(fname, O_RDWR|O_CREAT|O_TRUNC, 0600);
copy_to_user(si->udata_buf, si->swapfname, strlen(si->swapfname) + 1);
fd = linux_open(si->udata_buf, O_RDWR|O_CREAT|O_TRUNC, 0600);
if (fd < 0) {
ekprintf("do_pageout: Cannot open/create file: %s\n", fname);
cc = fd;
@ -752,10 +815,10 @@ do_pageout(char *fname, void *buf, size_t size, int flag)
/* preparing page store */
si->swphdr = myalloc(si, sizeof(struct swap_header));
strncpy(si->swphdr->magic, MCKERNEL_SWAP, SWAP_HLEN);
strncpy(si->swphdr->version, MCKERNEL_SWAP_VERSION, SWAP_HLEN);
si->swphdr->count_sarea = si->swap_area.count;
si->swphdr->count_marea = si->mlock_area.count;
copy_to_user(&(si->swphdr->magic), MCKERNEL_SWAP, SWAP_HLEN);
copy_to_user(&(si->swphdr->version), MCKERNEL_SWAP_VERSION, SWAP_HLEN);
copy_to_user(&(si->swphdr->count_sarea), &(si->swap_area.count), sizeof(unsigned int));
copy_to_user(&(si->swphdr->count_marea), &(si->mlock_area.count), sizeof(unsigned int));
if ((cc = pager_write(fd, si->swphdr, sizeof(struct swap_header)))
!= sizeof(struct swap_header)) {
if (cc >= 0)
@ -779,8 +842,10 @@ do_pageout(char *fname, void *buf, size_t size, int flag)
if ((cc = arealist_write(fd, si->mlock_info, si->mlock_area.count)) < 0) goto err;
/* now pages are stored */
for (i = 0; i < si->swap_area.count; i++) {
sz = si->swap_info[i].end - si->swap_info[i].start;
if ((cc = pager_write(fd, (void*) si->swap_info[i].start, sz)) != sz) {
struct swap_areainfo sw_info;
copy_from_user(&sw_info, &(si->swap_info[i]), sizeof(struct swap_areainfo));
sz = sw_info.end - sw_info.start;
if ((cc = pager_write(fd, (void*) sw_info.start, sz)) != sz) {
if (cc >= 0)
cc = -EIO;
goto err;
@ -792,10 +857,12 @@ do_pageout(char *fname, void *buf, size_t size, int flag)
}
kprintf("removing physical memory\n");
for (i = 0; i < si->swap_area.count; i++) {
struct swap_areainfo sw_info;
copy_from_user(&sw_info, &(si->swap_info[i]), sizeof(struct swap_areainfo));
cc = ihk_mc_pt_free_range(vm->address_space->page_table,
vm,
(void*) si->swap_info[i].start,
(void*) si->swap_info[i].end, NULL);
(void*) sw_info.start,
(void*) sw_info.end, NULL);
if (cc < 0) {
kprintf("ihk_mc_pt_clear_range returns: %d\n", cc);
}
@ -811,8 +878,10 @@ do_pageout(char *fname, void *buf, size_t size, int flag)
* except TEXT, STACK, readonly pages, are not invalid.
*/
for (i = 0; i < si->swap_area.count; i++) {
sz = si->swap_info[i].end - si->swap_info[i].start;
cc = linux_munmap((void*) si->swap_info[i].start, sz, 0);
struct swap_areainfo sw_info;
copy_from_user(&sw_info, &(si->swap_info[i]), sizeof(struct swap_areainfo));
sz = sw_info.end - sw_info.start;
cc = linux_munmap((void*) sw_info.start, sz, 0);
if (cc < 0) {
kprintf("do_pageout: Cannot munmap: %lx len(%lx)\n",
si->swap_info[i].start, sz);

View File

@ -100,6 +100,9 @@ extern void perf_reset(struct mc_perf_event *event);
struct list_head resource_set_list;
mcs_rwlock_lock_t resource_set_lock;
int idle_halt = 0;
int allow_oversubscribe = 0;
void
init_process(struct process *proc, struct process *parent)
{
@ -256,7 +259,7 @@ init_process_vm(struct process *owner, struct address_space *asp, struct process
ihk_atomic_set(&vm->refcount, 1);
vm->vm_range_tree = RB_ROOT;
INIT_LIST_HEAD(&vm->vm_range_numa_policy_list);
vm->vm_range_numa_policy_tree = RB_ROOT;
vm->address_space = asp;
vm->proc = owner;
vm->exiting = 0;
@ -395,6 +398,7 @@ clone_thread(struct thread *org, unsigned long pc, unsigned long sp,
int termsig = clone_flags & 0xff;
struct process *proc = NULL;
struct address_space *asp = NULL;
struct cpu_local_var *v = get_this_cpu_local_var();
if ((thread = ihk_mc_alloc_pages(KERNEL_STACK_NR_PAGES,
IHK_MC_AP_NOWAIT)) == NULL) {
@ -405,6 +409,9 @@ clone_thread(struct thread *org, unsigned long pc, unsigned long sp,
ihk_atomic_set(&thread->refcount, 2);
memcpy(&thread->cpu_set, &org->cpu_set, sizeof(thread->cpu_set));
/* New thread is in kernel until jumping to enter_user_mode */
thread->in_kernel = org->in_kernel;
/* NOTE: sp is the user mode stack! */
ihk_mc_init_user_process(&thread->ctx, &thread->uctx, ((char *)thread) +
KERNEL_STACK_NR_PAGES * PAGE_SIZE, pc, sp);
@ -478,12 +485,15 @@ clone_thread(struct thread *org, unsigned long pc, unsigned long sp,
dkprintf("fork(): copy_user_ranges()\n");
/* Copy user-space mappings.
* TODO: do this with COW later? */
v->on_fork_vm = proc->vm;
if (copy_user_ranges(proc->vm, org->vm) != 0) {
release_address_space(asp);
v->on_fork_vm = NULL;
kfree(proc->vm);
kfree(proc);
goto err_free_proc;
}
v->on_fork_vm = NULL;
/* Copy mckfd list
FIXME: Replace list manipulation with list_add() etc. */
@ -507,13 +517,15 @@ clone_thread(struct thread *org, unsigned long pc, unsigned long sp,
mckfd->next = proc->mckfd;
proc->mckfd = mckfd;
}
if (mckfd->dup_cb) {
mckfd->dup_cb(mckfd, NULL);
}
}
ihk_mc_spinlock_unlock(&proc->mckfd_lock, irqstate);
thread->vm->vdso_addr = org->vm->vdso_addr;
thread->vm->vvar_addr = org->vm->vvar_addr;
thread->proc->maxrss = org->proc->maxrss;
thread->vm->currss = org->vm->currss;
thread->sigstack.ss_sp = org->sigstack.ss_sp;
thread->sigstack.ss_flags = org->sigstack.ss_flags;
@ -646,7 +658,7 @@ static int copy_user_pte(void *arg0, page_table_t src_pt, pte_t *src_ptep, void
npages = pgsize / PAGE_SIZE;
virt = ihk_mc_alloc_aligned_pages_user(npages, pgalign,
IHK_MC_AP_NOWAIT);
IHK_MC_AP_NOWAIT, (uintptr_t)pgaddr);
if (!virt) {
kprintf("ERROR: copy_user_pte() allocating new page\n");
error = -ENOMEM;
@ -1466,14 +1478,6 @@ static int remap_one_page(void *arg0, page_table_t pt, pte_t *ptep,
dkprintf("remap_one_page(%p,%p,%p %#lx,%p,%d)\n",
arg0, pt, ptep, *ptep, pgaddr, pgshift);
/* XXX: NYI: large pages */
if (pgsize != PAGE_SIZE) {
error = -E2BIG;
ekprintf("remap_one_page(%p,%p,%p %#lx,%p,%d):%d\n",
arg0, pt, ptep, *ptep, pgaddr, pgshift, error);
goto out;
}
off = args->off + ((uintptr_t)pgaddr - args->start);
pte_make_fileoff(off, 0, pgsize, &apte);
@ -1509,6 +1513,7 @@ int remap_process_memory_range(struct process_vm *vm, struct vm_range *range,
{
struct rfp_args args;
int error;
unsigned int retval;
dkprintf("remap_process_memory_range(%p,%p,%#lx,%#lx,%#lx)\n",
vm, range, start, end, off);
@ -1519,6 +1524,13 @@ int remap_process_memory_range(struct process_vm *vm, struct vm_range *range,
args.off = off;
args.memobj = range->memobj;
retval = __sync_val_compare_and_swap(&range->pgshift, 0, PAGE_SHIFT);
if (retval != 0 && retval != PAGE_SHIFT) {
error = -E2BIG;
ekprintf("%s: pgshift is too big (%d) failed:%d\n", __func__, retval, error);
goto out;
}
error = visit_pte_range(vm->address_space->page_table, (void *)start,
(void *)end, range->pgshift, VPTEF_DEFAULT,
&remap_one_page, &args);
@ -1765,7 +1777,7 @@ static int page_fault_process_memory_range(struct process_vm *vm, struct vm_rang
off = pte_get_off(ptep, pgsize);
}
error = memobj_get_page(range->memobj, off, p2align,
&phys, &memobj_flag);
&phys, &memobj_flag, fault_addr);
if (error) {
struct memobj *obj;
@ -1787,7 +1799,7 @@ retry:
npages = pgsize / PAGE_SIZE;
virt = ihk_mc_alloc_aligned_pages_user(npages, p2align,
IHK_MC_AP_NOWAIT |
(range->flag & VR_AP_USER) ? IHK_MC_AP_USER : 0);
((range->flag & VR_AP_USER) ? IHK_MC_AP_USER : 0), fault_addr);
if (!virt && !range->pgshift && (pgsize != PAGE_SIZE)) {
error = arch_get_smaller_page_size(NULL, pgsize, &pgsize, &p2align);
if (error) {
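Note on the flag change in this hunk: it reads as an operator-precedence fix. In C, `|` binds tighter than `?:`, so the old unparenthesized expression was evaluated as `(IHK_MC_AP_NOWAIT | (range->flag & VR_AP_USER)) ? IHK_MC_AP_USER : 0`, which drops `IHK_MC_AP_NOWAIT` from the result. A minimal sketch of the difference, assuming made-up flag values (the real constants are not shown in this diff):

```c
/* Hypothetical flag values, for illustration only. */
enum { AP_NOWAIT = 0x1, AP_USER = 0x2, VR_AP_USER_BIT = 0x100 };

/*
 * How the old, unparenthesized expression is parsed: the whole OR is the
 * condition, so the result is only ever AP_USER or 0.
 */
static int flags_old(unsigned long flag)
{
	return (AP_NOWAIT | (flag & VR_AP_USER_BIT)) ? AP_USER : 0;
}

/* New form: the conditional is parenthesized, so AP_NOWAIT is always kept. */
static int flags_new(unsigned long flag)
{
	return AP_NOWAIT | ((flag & VR_AP_USER_BIT) ? AP_USER : 0);
}
```

With these values, `flags_old()` yields `AP_USER` for any input because the condition always contains the non-zero `AP_NOWAIT` bit, while `flags_new()` yields `AP_NOWAIT` alone or `AP_NOWAIT | AP_USER`.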
@ -1847,7 +1859,7 @@ retry:
npages = pgsize / PAGE_SIZE;
virt = ihk_mc_alloc_aligned_pages_user(npages, p2align,
IHK_MC_AP_NOWAIT);
IHK_MC_AP_NOWAIT, fault_addr);
if (!virt) {
error = -ENOMEM;
kprintf("page_fault_process_memory_range(%p,%lx-%lx %lx,%lx,%lx):cannot allocate copy page. %d\n", vm, range->start, range->end, range->flag, fault_addr, reason, error);
@ -1936,9 +1948,6 @@ retry:
// memory_stat_rss_add() is called in downstream with !memobj check
}
flush_tlb_single(fault_addr);
vm->currss += pgsize;
if(vm->currss > vm->proc->maxrss)
vm->proc->maxrss = vm->currss;
error = 0;
page = NULL;
@ -2123,6 +2132,8 @@ int init_process_stack(struct thread *thread, struct program_load_desc *pn,
struct process *proc = thread->proc;
unsigned long ap_flag;
struct vm_range *range;
int stack_populated_size = 0;
int stack_align_padding = 0;
/* Create stack range */
end = STACK_TOP(&thread->vm->region) & LARGE_PAGE_MASK;
@ -2156,7 +2167,7 @@ int init_process_stack(struct thread *thread, struct program_load_desc *pn,
ap_flag ? "(IHK_MC_AP_USER)" : "");
stack = ihk_mc_alloc_aligned_pages_user(minsz >> PAGE_SHIFT,
LARGE_PAGE_P2ALIGN, IHK_MC_AP_NOWAIT | ap_flag);
LARGE_PAGE_P2ALIGN, IHK_MC_AP_NOWAIT | ap_flag, start);
if (!stack) {
kprintf("%s: error: couldn't allocate initial stack\n",
@ -2194,22 +2205,29 @@ int init_process_stack(struct thread *thread, struct program_load_desc *pn,
return error;
}
// memory_stat_rss_add() is called in ihk_mc_pt_set_range();
/* Pre-compute populated size so that we can align stack
* and verify the size at the end */
stack_align_padding = 0;
stack_populated_size = 16 /* Random */ +
AUXV_LEN * sizeof(unsigned long) /* AUXV */ +
(argc + 2) * sizeof(unsigned long) /* args + term NULL + argc */ +
(envc + 1) * sizeof(unsigned long); /* envs + term NULL */
/* set up initial stack frame */
p = (unsigned long *)(stack + minsz);
s_ind = -1;
#ifdef POSTK_DEBUG_ARCH_DEP_15 /* userstack 16byte align */
if(!((envc + argc) % 2)){
p[s_ind--] = 0;
/* Align stack to 64 bytes */
while ((unsigned long)(stack + minsz -
stack_populated_size - stack_align_padding) & (0x40L - 1)) {
s_ind--;
stack_align_padding += sizeof(unsigned long);
}
#endif /* POSTK_DEBUG_ARCH_DEP_15 */
/* "random" 16 bytes on the very top */
p[s_ind--] = 0x010101011;
p[s_ind--] = 0x010101011;
at_rand = end + sizeof(unsigned long) * s_ind;
at_rand = end + (s_ind + 1) * sizeof(unsigned long);
/* auxiliary vector */
/* If you add/delete entries, please increase/decrease
@ -2261,6 +2279,20 @@ int init_process_stack(struct thread *thread, struct program_load_desc *pn,
/* argc */
p[s_ind] = argc;
if (((void *)&p[s_ind] != (void *)stack + minsz -
stack_populated_size - stack_align_padding)) {
kprintf("%s: WARNING: stack_populated_size mismatch (is AUXV_LEN up-to-date?): "
"&p[s_ind]: %lu, computed: %lu\n",
__FUNCTION__,
(unsigned long)&p[s_ind],
(unsigned long)stack + minsz -
stack_populated_size - stack_align_padding);
}
if ((unsigned long)&p[s_ind] & (0x40L - 1)) {
kprintf("%s: WARNING: stack alignment mismatch\n", __FUNCTION__);
}
ihk_mc_modify_user_context(thread->uctx, IHK_UCR_STACK_POINTER,
end + sizeof(unsigned long) * s_ind);
thread->vm->region.stack_end = end;
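Note on the stack alignment in init_process_stack(): the new code pre-computes how many bytes will be populated, then adds word-sized padding until the final stack pointer lands on a 64-byte boundary. The arithmetic can be sketched on its own as below. The helper name `stack_align_padding_for` is hypothetical, and the sketch assumes that `top - populated` is a multiple of `sizeof(unsigned long)`, which holds in the diff because every term of `stack_populated_size` is a multiple of the word size.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Returns the number of padding bytes needed so that
 * (top - populated - padding) is 64-byte aligned. Padding grows one
 * machine word at a time, mirroring the s_ind-- loop in the diff.
 */
static size_t stack_align_padding_for(uintptr_t top, size_t populated)
{
	size_t padding = 0;

	while ((top - populated - padding) & (0x40UL - 1))
		padding += sizeof(unsigned long);
	return padding;
}
```

For example, with `top = 0x10000` and `populated = 200`, the unpadded pointer is 56 bytes past a 64-byte boundary, so the helper returns 56.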
@ -2300,7 +2332,7 @@ unsigned long extend_process_region(struct process_vm *vm,
p = ihk_mc_alloc_aligned_pages_user(
(new_end_allocated - end_allocated) >> PAGE_SHIFT,
align_p2align, IHK_MC_AP_NOWAIT |
(!(vm->proc->mpol_flags & MPOL_NO_HEAP) ? IHK_MC_AP_USER : 0));
(!(vm->proc->mpol_flags & MPOL_NO_HEAP) ? IHK_MC_AP_USER : 0), end_allocated);
if (!p) {
return end_allocated;
@ -2542,14 +2574,15 @@ out:
return error;
}
void hold_thread(struct thread *thread)
int hold_thread(struct thread *thread)
{
if (thread->status == PS_EXITED) {
panic("hold_thread: already exited process");
kprintf("hold_thread: ERROR: already exited process,tid=%d\n", thread->tid);
return -ESRCH;
}
ihk_atomic_inc(&thread->refcount);
return;
return 0;
}
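Note on hold_thread(): the function now reports an already-exited thread with `-ESRCH` instead of calling panic(), so callers are expected to check the return value before using the reference. A minimal standalone sketch of that pattern follows; the `struct obj` type, the `ST_*` states, and the use of `__sync_fetch_and_add` (a GCC/Clang builtin standing in for `ihk_atomic_inc`) are assumptions for illustration, not the kernel's definitions.

```c
#include <errno.h>

/* Hypothetical stand-ins for thread state and refcount. */
enum { ST_RUNNING, ST_EXITED };

struct obj {
	int status;
	int refcount;
};

/* Takes a reference unless the object has already exited. */
static int hold(struct obj *o)
{
	if (o->status == ST_EXITED)
		return -ESRCH;	/* caller must not touch the object further */
	__sync_fetch_and_add(&o->refcount, 1);
	return 0;
}
```

A caller would typically write `if (hold(o) != 0) return ...;` and only proceed when a reference was actually taken.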
void
@ -2596,21 +2629,30 @@ void destroy_thread(struct thread *thread)
{
struct sig_pending *pending;
struct sig_pending *signext;
struct mcs_rwlock_node_irqsave lock;
struct mcs_rwlock_node_irqsave lock, updatelock;
struct process *proc = thread->proc;
struct resource_set *resource_set = cpu_local_var(resource_set);
int hash;
struct timespec ats;
hash = thread_hash(thread->tid);
mcs_rwlock_writer_lock(&resource_set->thread_hash->lock[hash], &lock);
list_del(&thread->hash_list);
mcs_rwlock_writer_unlock(&resource_set->thread_hash->lock[hash], &lock);
mcs_rwlock_writer_lock(&proc->update_lock, &updatelock);
tsc_to_ts(thread->system_tsc, &ats);
ts_add(&thread->proc->stime, &ats);
tsc_to_ts(thread->user_tsc, &ats);
ts_add(&thread->proc->utime, &ats);
mcs_rwlock_writer_lock(&proc->threads_lock, &lock);
list_del(&thread->siblings_list);
__release_tid(proc, thread);
mcs_rwlock_writer_unlock(&proc->threads_lock, &lock);
mcs_rwlock_writer_unlock(&proc->update_lock, &updatelock);
cpu_clear(thread->cpu_id, &thread->vm->address_space->cpu_set,
&thread->vm->address_space->cpu_set_lock);
list_for_each_entry_safe(pending, signext, &thread->sigpending, list){
@ -2639,20 +2681,11 @@ void destroy_thread(struct thread *thread)
void release_thread(struct thread *thread)
{
struct process_vm *vm;
struct mcs_rwlock_node_irqsave lock;
struct timespec ats;
if (!ihk_atomic_dec_and_test(&thread->refcount)) {
return;
}
mcs_rwlock_writer_lock(&thread->proc->update_lock, &lock);
tsc_to_ts(thread->system_tsc, &ats);
ts_add(&thread->proc->stime, &ats);
tsc_to_ts(thread->user_tsc, &ats);
ts_add(&thread->proc->utime, &ats);
mcs_rwlock_writer_unlock(&thread->proc->update_lock, &lock);
vm = thread->vm;
#ifdef PROFILE_ENABLE
@ -2891,7 +2924,7 @@ void sched_init(void)
ihk_mc_init_context(&idle_thread->ctx, NULL, idle);
ihk_mc_spinlock_init(&idle_thread->vm->memory_range_lock);
idle_thread->vm->vm_range_tree = RB_ROOT;
INIT_LIST_HEAD(&idle_thread->vm->vm_range_numa_policy_list);
idle_thread->vm->vm_range_numa_policy_tree = RB_ROOT;
idle_thread->proc->pid = 0;
idle_thread->tid = ihk_mc_get_processor_id();
@ -3044,6 +3077,12 @@ void spin_sleep_or_schedule(void)
int woken = 0;
long irqstate;
/* Spinning disabled explicitly */
if (idle_halt) {
dkprintf("%s: idle_halt -> schedule()\n", __FUNCTION__);
goto out_schedule;
}
/* Try to spin sleep */
irqstate = ihk_mc_spinlock_lock(&thread->spin_sleep_lock);
if (thread->spin_sleep == 0) {
@ -3092,6 +3131,7 @@ void spin_sleep_or_schedule(void)
cpu_pause();
}
out_schedule:
schedule();
}
@ -3109,6 +3149,9 @@ void schedule(void)
}
redo:
/* Reset for redo */
switch_ctx = 0;
cpu_local_var(runq_irqstate) =
ihk_mc_spinlock_lock(&(get_this_cpu_local_var()->runq_lock));
v = get_this_cpu_local_var();
@ -3130,7 +3173,10 @@ redo:
}
}
if (v->flags & CPU_FLAG_NEED_MIGRATE) {
/* Switch to idle() when prev is PS_EXITED: idle() always resumes just after the
ihk_mc_switch_context() call and therefore always reaches release_thread(). See #1029 */
if (v->flags & CPU_FLAG_NEED_MIGRATE ||
prev->status == PS_EXITED) {
next = &cpu_local_var(idle);
} else {
/* Pick a new running process or one that has a pending signal */

View File

@ -358,7 +358,7 @@ void process_procfs_request(struct ikc_scd_packet *rpacket)
* 08048000-08056000 r-xp 00000000 03:0c 64593 /usr/sbin/gpm
*/
written_now = snprintf(_buf, left,
"%lx-%lx %s%s%s%s %lx %lx:%lx %d %s\n",
"%012lx-%012lx %s%s%s%s %lx %lx:%lx %d\t\t\t%s\n",
range->start, range->end,
range->flag & VR_PROT_READ ? "r" : "-",
range->flag & VR_PROT_WRITE ? "w" : "-",
@ -369,6 +369,14 @@ void process_procfs_request(struct ikc_scd_packet *rpacket)
0UL,
0UL,
0,
range->start ==
(unsigned long)vm->vdso_addr ? "[vdso]" :
range->start ==
(unsigned long)vm->vvar_addr ? "[vsyscall]" :
range->flag & VR_STACK ? "[stack]" :
range->start >= vm->region.brk_start &&
range->end <= vm->region.brk_end_allocated ?
"[heap]" :
""
);
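Note on the maps output format: the new format string zero-pads both addresses to 12 hex digits and separates the name column with three tabs. The sketch below reproduces only the formatting so the resulting line shape is easy to see. The helper name `format_maps_line` is hypothetical, and the four permission characters are passed as one string for brevity rather than as four separate `%s` arguments.

```c
#include <stdio.h>

/* Formats one maps-style line into buf; returns snprintf()'s result. */
static int format_maps_line(char *buf, size_t left,
		unsigned long start, unsigned long end,
		const char *perms, const char *name)
{
	return snprintf(buf, left,
			"%012lx-%012lx %s %lx %lx:%lx %d\t\t\t%s\n",
			start, end, perms, 0UL, 0UL, 0UL, 0, name);
}
```

Called with `0x7fff0000`, `0x7fff2000`, `"rw-p"` and `"[stack]"`, this produces a line that starts with `00007fff0000-00007fff2000 rw-p` and ends with the `[stack]` label after the tabs.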
@ -422,12 +430,6 @@ void process_procfs_request(struct ikc_scd_packet *rpacket)
if (strcmp(p, "pagemap") == 0) {
uint64_t *_buf = (uint64_t *)buf;
uint64_t start, end;
if (offset < PAGE_SIZE) {
kprintf("WARNING: /proc/pagemap queried for NULL page\n");
ans = 0;
goto end;
}
/* Check alignment */
if ((offset % sizeof(uint64_t) != 0) ||

View File

@ -374,7 +374,7 @@ static void shmobj_ref(struct memobj *memobj)
}
static int shmobj_get_page(struct memobj *memobj, off_t off, int p2align,
uintptr_t *physp, unsigned long *pflag)
uintptr_t *physp, unsigned long *pflag, uintptr_t virt_addr)
{
struct shmobj *obj = to_shmobj(memobj);
int error;
@ -415,7 +415,7 @@ static int shmobj_get_page(struct memobj *memobj, off_t off, int p2align,
if (!page) {
npages = 1 << p2align;
virt = ihk_mc_alloc_aligned_pages_user(npages, p2align,
IHK_MC_AP_NOWAIT);
IHK_MC_AP_NOWAIT, virt_addr);
if (!virt) {
error = -ENOMEM;
ekprintf("shmobj_get_page(%p,%#lx,%d,%p):"

View File

@ -113,6 +113,7 @@ static ihk_spinlock_t tod_data_lock = SPIN_LOCK_UNLOCKED;
static void calculate_time_from_tsc(struct timespec *ts);
void check_signal(unsigned long, void *, int);
void save_syscall_return_value(int num, unsigned long rc);
void do_signal(long rc, void *regs, struct thread *thread, struct sig_pending *pending, int num);
extern unsigned long do_kill(struct thread *thread, int pid, int tid, int sig, struct siginfo *info, int ptracecont);
extern long alloc_debugreg(struct thread *thread);
@ -319,10 +320,10 @@ long do_syscall(struct syscall_request *req, int cpu, int pid)
flags = cpu_disable_interrupt_save();
/* Try to sleep until notified */
if (__sync_bool_compare_and_swap(&res.req_thread_status,
IHK_SCD_REQ_THREAD_SPINNING,
IHK_SCD_REQ_THREAD_DESCHEDULED)) {
if (res.req_thread_status == IHK_SCD_REQ_THREAD_DESCHEDULED ||
__sync_bool_compare_and_swap(&res.req_thread_status,
IHK_SCD_REQ_THREAD_SPINNING,
IHK_SCD_REQ_THREAD_DESCHEDULED)) {
dkprintf("%s: tid %d waiting for syscall reply...\n",
__FUNCTION__, thread->tid);
waitq_init(&thread->scd_wq);
@ -459,16 +460,10 @@ long do_syscall(struct syscall_request *req, int cpu, int pid)
/* -ERESTARTSYS indicates that the proxy process is gone
* and the application should be terminated */
#ifdef POSTK_DEBUG_TEMP_FIX_70 /* interrupt_syscall returned -ERESTARTSYS fix */
if (rc == -ERESTARTSYS && req->number != __NR_exit_group
&& req->number != __NR_kill) {
#else /* POSTK_DEBUG_TEMP_FIX_70 */
if (rc == -ERESTARTSYS && req->number != __NR_exit_group) {
#endif /* POSTK_DEBUG_TEMP_FIX_70 */
if (rc == -ERESTARTSYS) {
kprintf("%s: proxy PID %d is dead, terminate()\n",
__FUNCTION__, thread->proc->pid);
thread->proc->nohost = 1;
terminate(0, SIGKILL);
}
#ifdef PROFILE_ENABLE
@ -963,6 +958,28 @@ SYSCALL_DECLARE(waitid)
return 0;
}
void terminate_mcexec(int rc, int sig)
{
unsigned long old_exit_status;
unsigned long exit_status;
struct thread *mythread = cpu_local_var(current);
struct process *proc = mythread->proc;
struct syscall_request request IHK_DMA_ALIGN;
if ((old_exit_status = proc->exit_status) & 0x0000000100000000L)
return;
exit_status = 0x0000000100000000L | ((rc & 0x00ff) << 8) | (sig & 0xff);
if (!__sync_bool_compare_and_swap(&proc->exit_status,
old_exit_status, exit_status))
return;
if (!proc->nohost) {
request.number = __NR_exit_group;
request.args[0] = proc->exit_status;
proc->nohost = 1;
do_syscall(&request, ihk_mc_get_processor_id(), proc->pid);
}
}
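Note on terminate_mcexec(): bit 32 of `proc->exit_status` acts as a "host already notified" marker, and the compare-and-swap ensures that only one caller wins the right to send `__NR_exit_group` to the host. The claim step can be sketched in isolation as follows. The names `EXIT_NOTIFIED` and `claim_exit` are hypothetical; `__sync_bool_compare_and_swap` is the same GCC/Clang builtin used in the diff.

```c
#include <stdint.h>

#define EXIT_NOTIFIED 0x0000000100000000ULL	/* hypothetical name for bit 32 */

/*
 * Tries to record the exit status exactly once.
 * Returns non-zero for the single caller that wins, 0 for everyone else.
 */
static int claim_exit(uint64_t *status, int rc, int sig)
{
	uint64_t old = *status;
	uint64_t new_status;

	if (old & EXIT_NOTIFIED)
		return 0;	/* someone already claimed it */
	new_status = EXIT_NOTIFIED |
		((uint64_t)(rc & 0xff) << 8) | (uint64_t)(sig & 0xff);
	/* Fails if another CPU changed *status since it was read. */
	return __sync_bool_compare_and_swap(status, old, new_status);
}
```

Only the winner would go on to issue the host syscall; a second call observes the marker bit and returns without side effects.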
void terminate(int rc, int sig)
{
struct resource_set *resource_set = cpu_local_var(resource_set);
@ -980,7 +997,6 @@ void terminate(int rc, int sig)
int i;
int n;
int *ids = NULL;
struct syscall_request request IHK_DMA_ALIGN;
int exit_status;
// sync perf info
@ -988,13 +1004,15 @@ void terminate(int rc, int sig)
sync_child_event(proc->monitoring_event);
// clean up threads
mcs_rwlock_reader_lock(&proc->threads_lock, &lock); // conflict clone
mcs_rwlock_writer_lock_noirq(&proc->update_lock, &updatelock);
mcs_rwlock_writer_lock(&proc->threads_lock, &lock); // conflict clone
if (proc->status == PS_EXITED) {
mcs_rwlock_writer_unlock_noirq(&proc->update_lock, &updatelock);
mcs_rwlock_reader_unlock(&proc->threads_lock, &lock);
dkprintf("%s: PID: %d, TID: %d PS_EXITED already\n",
__FUNCTION__, proc->pid, mythread->tid);
preempt_disable();
mythread->status = PS_EXITED;
mcs_rwlock_writer_unlock(&proc->threads_lock, &lock);
mcs_rwlock_writer_unlock_noirq(&proc->update_lock, &updatelock);
release_thread(mythread);
preempt_enable();
schedule();
@ -1002,10 +1020,15 @@ void terminate(int rc, int sig)
return;
}
exit_status = mythread->exit_status = ((rc & 0x00ff) << 8) | (sig & 0xff);
dkprintf("%s: PID: %d, TID: %d setting PS_EXITED\n",
__FUNCTION__, proc->pid, mythread->tid);
exit_status = ((rc & 0x00ff) << 8) | (sig & 0xff);
mythread->exit_status = exit_status;
proc->status = PS_EXITED;
mcs_rwlock_writer_unlock(&proc->threads_lock, &lock);
mcs_rwlock_writer_unlock_noirq(&proc->update_lock, &updatelock);
mcs_rwlock_reader_unlock(&proc->threads_lock, &lock);
terminate_mcexec(rc, sig);
mcs_rwlock_writer_lock(&proc->threads_lock, &lock);
list_del(&mythread->siblings_list);
@ -1151,23 +1174,13 @@ void terminate(int rc, int sig)
#endif
// clean up memory
if (!proc->nohost) {
request.number = __NR_exit_group;
request.args[0] = exit_status;
#ifdef POSTK_DEBUG_TEMP_FIX_48 /* nohost flag missed fix */
proc->nohost = 1;
do_syscall(&request, ihk_mc_get_processor_id(), proc->pid);
#else /* POSTK_DEBUG_TEMP_FIX_48 */
do_syscall(&request, ihk_mc_get_processor_id(), proc->pid);
proc->nohost = 1;
#endif /* POSTK_DEBUG_TEMP_FIX_48 */
}
proc->exit_status = exit_status;
finalize_process(proc);
preempt_disable();
mcs_rwlock_writer_lock(&proc->threads_lock, &lock);
mythread->status = PS_EXITED;
mcs_rwlock_writer_unlock(&proc->threads_lock, &lock);
release_thread(mythread);
release_process_vm(vm);
preempt_enable();
@ -1479,7 +1492,7 @@ do_mmap(const intptr_t addr0, const size_t len0, const int prot,
maxprot = PROT_READ | PROT_WRITE | PROT_EXEC;
if (!(flags & MAP_ANONYMOUS)) {
off = off0;
error = fileobj_create(fd, &memobj, &maxprot);
error = fileobj_create(fd, &memobj, &maxprot, addr0);
#ifdef ATTACHED_MIC
/*
* XXX: refuse device mapping in attached-mic now:
@ -1542,7 +1555,7 @@ do_mmap(const intptr_t addr0, const size_t len0, const int prot,
}
p = ihk_mc_alloc_aligned_pages_user(npages, p2align,
IHK_MC_AP_NOWAIT | ap_flag);
IHK_MC_AP_NOWAIT | ap_flag, addr0);
if (p == NULL) {
dkprintf("%s: warning: failed to allocate %d contiguous pages "
" (bytes: %lu, pgshift: %d), enabling demand paging\n",
@ -1966,9 +1979,7 @@ static void settid(struct thread *thread, int nr_tids, int *tids)
int ret;
struct syscall_request request IHK_DMA_ALIGN;
#ifdef POSTK_DEBUG_ARCH_DEP_58 /* settid() arguments area 0 clear add. */
memset(&request, 0, sizeof(request));
#endif /* POSTK_DEBUG_ARCH_DEP_58 */
request.number = __NR_gettid;
/*
@ -2152,11 +2163,9 @@ SYSCALL_DECLARE(execve)
{
int error;
long ret;
char *empty_envp[1] = {NULL};
const char *filename = (const char *)ihk_mc_syscall_arg0(ctx);
char **argv = (char **)ihk_mc_syscall_arg1(ctx);
char **envp = (char **)ihk_mc_syscall_arg2(ctx) ?
(char **)ihk_mc_syscall_arg2(ctx) : empty_envp;
char **envp = (char **)ihk_mc_syscall_arg2(ctx);
char *argv_flat = NULL;
int argv_flat_len = 0;
@ -2202,13 +2211,8 @@ SYSCALL_DECLARE(execve)
if (ret != 0) {
dkprintf("execve(): ERROR: host failed to load elf header, errno: %d\n",
ret);
#ifdef POSTK_DEBUG_TEMP_FIX_10 /* sys_execve() memleak fix. */
ret = -ret;
goto desc_free;
#else /* POSTK_DEBUG_TEMP_FIX_10 */
ihk_mc_free_pages(desc, 4);
return -ret;
#endif /* POSTK_DEBUG_TEMP_FIX_10 */
goto end;
}
dkprintf("execve(): ELF desc received, num sections: %d\n",
@ -2221,7 +2225,7 @@ SYSCALL_DECLARE(execve)
/* Flatten argv and envp into kernel-space buffers */
argv_flat_len = flatten_strings_from_user(-1, (desc->shell_path[0] ?
desc->shell_path : NULL), argv, &argv_flat);
if (argv_flat_len == 0) {
if (argv_flat_len < 0) {
char *kfilename;
int len = strlen_user(filename);
@ -2231,17 +2235,12 @@ SYSCALL_DECLARE(execve)
kprintf("ERROR: no argv for executable: %s?\n", kfilename? kfilename: "");
if(kfilename)
kfree(kfilename);
#ifdef POSTK_DEBUG_TEMP_FIX_10 /* sys_execve() memleak fix. */
ret = -EINVAL;
goto desc_free;
#else /* POSTK_DEBUG_TEMP_FIX_10 */
ihk_mc_free_pages(desc, 4);
return -EINVAL;
#endif /* POSTK_DEBUG_TEMP_FIX_10 */
ret = argv_flat_len;
goto end;
}
envp_flat_len = flatten_strings_from_user(-1, NULL, envp, &envp_flat);
if (envp_flat_len == 0) {
if (envp_flat_len < 0) {
char *kfilename;
int len = strlen_user(filename);
@ -2251,12 +2250,8 @@ SYSCALL_DECLARE(execve)
kprintf("ERROR: no envp for executable: %s?\n", kfilename? kfilename: "");
if(kfilename)
kfree(kfilename);
#ifdef POSTK_DEBUG_TEMP_FIX_10 /* sys_execve() memleak fix. */
ret = -EINVAL;
goto argv_free;
#else /* POSTK_DEBUG_TEMP_FIX_10 */
return -EINVAL;
#endif /* POSTK_DEBUG_TEMP_FIX_10 */
ret = envp_flat_len;
goto end;
}
if (cpu_local_var(current)->proc->ptrace) {
@ -2293,11 +2288,8 @@ SYSCALL_DECLARE(execve)
request.args[2] = sizeof(struct program_load_desc) +
sizeof(struct program_image_section) * desc->num_sections;
ret = do_syscall(&request, ihk_mc_get_processor_id(), 0);
if (ret != 0) {
kprintf("execve(): PANIC: host failed to load elf image\n");
panic("");
if ((ret = do_syscall(&request, ihk_mc_get_processor_id(), 0)) != 0) {
goto end;
}
for(i = 0; i < _NSIG; i++){
@ -2320,18 +2312,14 @@ SYSCALL_DECLARE(execve)
dkprintf("execve(): switching to new process\n");
proc->execed = 1;
#ifdef POSTK_DEBUG_TEMP_FIX_10 /* sys_execve() memleak fix. */
ret = 0;
end:
if (envp_flat) {
kfree(envp_flat);
}
argv_free:
if (argv_flat) {
kfree(argv_flat);
}
desc_free:
ihk_mc_free_pages(desc, 4);
if (!ret) {
@ -2343,21 +2331,6 @@ desc_free:
cpu_local_var(current));
}
return ret;
#else /* POSTK_DEBUG_TEMP_FIX_10 */
ihk_mc_free_pages(desc, 4);
kfree(argv_flat);
kfree(envp_flat);
/* Lock run queue because enter_user_mode expects to release it */
cpu_local_var(runq_irqstate) =
ihk_mc_spinlock_lock(&(get_this_cpu_local_var()->runq_lock));
ihk_mc_switch_context(NULL, &cpu_local_var(current)->ctx,
cpu_local_var(current));
/* Never reach here */
return 0;
#endif /* POSTK_DEBUG_TEMP_FIX_10 */
}
unsigned long do_fork(int clone_flags, unsigned long newsp,
@ -2374,6 +2347,7 @@ unsigned long do_fork(int clone_flags, unsigned long newsp,
struct syscall_request request1 IHK_DMA_ALIGN;
int ptrace_event = 0;
int termsig = clone_flags & 0x000000ff;
const struct ihk_mc_cpu_info *cpu_info = ihk_mc_get_cpu_info();
dkprintf("do_fork,flags=%08x,newsp=%lx,ptidptr=%lx,ctidptr=%lx,tls=%lx,curpc=%lx,cursp=%lx",
clone_flags, newsp, parent_tidptr, child_tidptr, tlsblock_base, curpc, cursp);
@ -2413,6 +2387,11 @@ unsigned long do_fork(int clone_flags, unsigned long newsp,
return -EINVAL;
}
if (!allow_oversubscribe && rusage->num_threads >= cpu_info->ncpus) {
kprintf("%s: ERROR: CPU oversubscription is not allowed. Specify -O option in mcreboot.sh to allow it.\n", __FUNCTION__);
return -EINVAL;
}
cpuid = obtain_clone_cpuid(&old->cpu_set);
if (cpuid == -1) {
kprintf("do_fork,core not available\n");
@ -2482,28 +2461,25 @@ retry_tid:
}
}
mcs_rwlock_writer_unlock(&newproc->threads_lock, &lock);
/* TODO: spawn more mcexec threads */
if (!new->tid) {
release_cpuid(cpuid);
kprintf("%s: no more TIDs available\n", __FUNCTION__);
panic("");
return -ENOMEM;
}
mcs_rwlock_writer_unlock(&newproc->threads_lock, &lock);
}
/* fork() a new process on the host */
else {
request1.number = __NR_fork;
request1.number = __NR_clone;
request1.args[0] = 0;
if(clone_flags & CLONE_PARENT){
if(oldproc->ppid_parent->pid != 1)
request1.args[0] = clone_flags;
}
newproc->pid = do_syscall(&request1, ihk_mc_get_processor_id(), 0);
#ifdef POSTK_DEBUG_TEMP_FIX_12 /* __NR_fork retval check fix. */
if (newproc->pid < 0) {
#else /* POSTK_DEBUG_TEMP_FIX_12 */
if (newproc->pid == -1) {
#endif /* POSTK_DEBUG_TEMP_FIX_12 */
kprintf("ERROR: forking host process\n");
/* TODO: clean-up new */
@ -2651,6 +2627,13 @@ retry_tid:
old->tid,
new->tid);
if (!(clone_flags & CLONE_VM)) {
request1.number = __NR_clone;
request1.args[0] = 1;
request1.args[1] = new->tid;
do_syscall(&request1, ihk_mc_get_processor_id(), 0);
}
runq_add_thread(new, cpuid);
if (ptrace_event) {
@ -2660,11 +2643,6 @@ retry_tid:
return new->tid;
}
SYSCALL_DECLARE(vfork)
{
return do_fork(CLONE_VFORK|SIGCHLD, 0, 0, 0, 0, ihk_mc_syscall_pc(ctx), ihk_mc_syscall_sp(ctx));
}
SYSCALL_DECLARE(set_tid_address)
{
cpu_local_var(current)->clear_child_tid =
@ -3093,6 +3071,13 @@ do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact)
struct mcs_rwlock_node_irqsave mcs_rw_node;
ihk_mc_user_context_t ctx0;
if (!valid_signal(sig) || sig < 1) {
return -EINVAL;
}
if (act && (sig == SIGKILL || sig == SIGSTOP)) {
return -EINVAL;
}
mcs_rwlock_writer_lock(&thread->sigcommon->lock, &mcs_rw_node);
k = thread->sigcommon->action + sig - 1;
if(oact)
@ -3817,6 +3802,57 @@ perf_mmap(struct mckfd *sfd, ihk_mc_user_context_t *ctx)
return rc;
}
struct vm_range_numa_policy *vm_range_policy_search(struct process_vm *vm, uintptr_t addr)
{
struct rb_root *root = &vm->vm_range_numa_policy_tree;
struct rb_node *node = root->rb_node;
struct vm_range_numa_policy *numa_policy = NULL;
while (node) {
numa_policy = rb_entry(node, struct vm_range_numa_policy, policy_rb_node);
if (addr < numa_policy->start) {
node = node->rb_left;
} else if (addr >= numa_policy->end) {
node = node->rb_right;
} else {
return numa_policy;
}
}
return NULL;
}
static int vm_policy_insert(struct process_vm *vm, struct vm_range_numa_policy *newrange)
{
struct rb_root *root = &vm->vm_range_numa_policy_tree;
struct rb_node **new = &(root->rb_node), *parent = NULL;
struct vm_range_numa_policy *range;
while (*new) {
range = rb_entry(*new, struct vm_range_numa_policy, policy_rb_node);
parent = *new;
if (newrange->end <= range->start) {
new = &((*new)->rb_left);
} else if (newrange->start >= range->end) {
new = &((*new)->rb_right);
} else {
ekprintf("vm_range_insert(%p,%lx-%lx (nodemask)%lx (policy)%d): overlap %lx-%lx (nodemask)%lx (policy)%d\n",
vm, newrange->start, newrange->end, newrange->numa_mask, newrange->numa_mem_policy,
range->start, range->end, range->numa_mask, range->numa_mem_policy);
return -EFAULT;
}
}
dkprintf("vm_range_insert: %p,%p: %lx-%lx (nodemask)%lx (policy)%d\n",
vm, newrange, newrange->start, newrange->end, newrange->numa_mask, newrange->numa_mem_policy);
rb_link_node(&newrange->policy_rb_node, parent, new);
rb_insert_color(&newrange->policy_rb_node, root);
return 0;
}
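Note on the NUMA policy tree: vm_range_policy_search() and vm_policy_insert() treat each policy as a half-open range `[start, end)` and walk the tree by comparing against those bounds, rejecting any insert that overlaps an existing range. The same ordering logic is sketched below on a plain, unbalanced binary search tree. The `struct policy_node` type and both function names are hypothetical, and the red-black rebalancing done by `rb_link_node()` and `rb_insert_color()` in the diff is deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node type: a half-open address range [start, end). */
struct policy_node {
	uintptr_t start, end;
	struct policy_node *left, *right;
};

/* Returns the node whose range contains addr, or NULL if none does. */
static struct policy_node *policy_search(struct policy_node *node,
		uintptr_t addr)
{
	while (node) {
		if (addr < node->start)
			node = node->left;
		else if (addr >= node->end)
			node = node->right;
		else
			return node;
	}
	return NULL;
}

/* Inserts n; returns 0 on success or -1 if n overlaps an existing range. */
static int policy_insert(struct policy_node **root, struct policy_node *n)
{
	struct policy_node **link = root;

	while (*link) {
		struct policy_node *cur = *link;

		if (n->end <= cur->start)
			link = &cur->left;
		else if (n->start >= cur->end)
			link = &cur->right;
		else
			return -1;	/* overlap, as in the -EFAULT path */
	}
	n->left = n->right = NULL;
	*link = n;
	return 0;
}
```

Because ranges never overlap, a point lookup visits at most one path from the root, which is what lets mbind() replace the linear list walk with a single search.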
struct mc_perf_event*
mc_perf_event_alloc(struct perf_event_attr *attr)
{
@ -3926,7 +3962,7 @@ SYSCALL_DECLARE(perf_event_open)
}
if (not_supported_flag) {
return -1;
return -ENOENT;
}
event = mc_perf_event_alloc((struct perf_event_attr*)attr);
@ -4684,30 +4720,6 @@ struct shminfo the_shminfo = {
};
struct shm_info the_shm_info = { 0, };
time_t time(void) {
#ifndef POSTK_DEBUG_ARCH_DEP_13 /* arch depend tmp hide */
struct syscall_request sreq IHK_DMA_ALIGN;
struct thread *thread = cpu_local_var(current);
#endif /* POSTK_DEBUG_ARCH_DEP_13 */
#ifdef POSTK_DEBUG_ARCH_DEP_49 /* time() local calculate added. */
struct timespec ats;
if (gettime_local_support) {
calculate_time_from_tsc(&ats);
return ats.tv_sec;
}
#endif /* POSTK_DEBUG_ARCH_DEP_49 */
#ifdef POSTK_DEBUG_ARCH_DEP_13 /* arch depend tmp hide */
return (time_t)0;
#else /* POSTK_DEBUG_ARCH_DEP_13 */
sreq.number = __NR_time;
sreq.args[0] = (uintptr_t)NULL;
return (time_t)do_syscall(&sreq, ihk_mc_get_processor_id(), thread->proc->pid);
#endif /* POSTK_DEBUG_ARCH_DEP_13 */
}
static int make_shmid(struct shmobj *obj)
{
return ((int)obj->index << 16) | obj->ds.shm_perm.seq;
@ -5475,16 +5487,16 @@ do_exit(int code)
FUTEX_WAKE, 1, 0, NULL, 0, 0, 1);
}
mcs_rwlock_reader_lock(&proc->threads_lock, &lock);
mcs_rwlock_writer_lock(&proc->threads_lock, &lock);
if(proc->status == PS_EXITED){
mcs_rwlock_reader_unlock(&proc->threads_lock, &lock);
mcs_rwlock_writer_unlock(&proc->threads_lock, &lock);
terminate(exit_status, 0);
return;
}
preempt_disable();
thread->status = PS_EXITED;
sync_child_event(thread->proc->monitoring_event);
mcs_rwlock_reader_unlock(&proc->threads_lock, &lock);
mcs_rwlock_writer_unlock(&proc->threads_lock, &lock);
release_thread(thread);
preempt_enable();
@ -5681,15 +5693,8 @@ SYSCALL_DECLARE(getrusage)
kusage.ru_maxrss = proc->maxrss / 1024;
}
else if(who == RUSAGE_CHILDREN){
#ifdef POSTK_DEBUG_TEMP_FIX_72 /* fix RUSAGE_CHILDREN time */
ts_to_tv(&kusage.ru_utime, &proc->utime_children);
ts_to_tv(&kusage.ru_stime, &proc->stime_children);
#else /* POSTK_DEBUG_TEMP_FIX_72 */
tsc_to_ts(thread->user_tsc, &ats);
ts_to_tv(&kusage.ru_utime, &ats);
tsc_to_ts(thread->system_tsc, &ats);
ts_to_tv(&kusage.ru_stime, &ats);
#endif /* POSTK_DEBUG_TEMP_FIX_72 */
kusage.ru_maxrss = proc->maxrss_children / 1024;
}
@ -6095,6 +6100,13 @@ static int ptrace_attach(int pid)
error = -ESRCH;
goto out;
}
if (proc->pid == pid) {
thread_unlock(thread, &lock);
error = -EPERM;
goto out;
}
child = thread->proc;
dkprintf("ptrace_attach(): pid requested:%d, thread->tid:%d, thread->proc->pid=%d, thread->proc->parent=%p\n", pid, thread->tid, thread->proc->pid, thread->proc->parent);
@ -6112,7 +6124,6 @@ static int ptrace_attach(int pid)
}
parent = child->parent;
#ifdef POSTK_DEBUG_TEMP_FIX_53 /* attach for child-process fix. */
dkprintf("ptrace_attach() parent->pid=%d\n", parent->pid);
mcs_rwlock_writer_lock_noirq(&parent->children_lock, &childlock);
@ -6124,23 +6135,6 @@ static int ptrace_attach(int pid)
list_add_tail(&child->siblings_list, &proc->children_list);
child->parent = proc;
mcs_rwlock_writer_unlock_noirq(&proc->children_lock, &childlock);
#else /* POSTK_DEBUG_TEMP_FIX_53 */
/* XXX: tmp */
if (parent != proc) {
dkprintf("ptrace_attach() parent->pid=%d\n", parent->pid);
mcs_rwlock_writer_lock_noirq(&parent->children_lock, &childlock);
list_del(&child->siblings_list);
list_add_tail(&child->ptraced_siblings_list, &parent->ptraced_children_list);
mcs_rwlock_writer_unlock_noirq(&parent->children_lock, &childlock);
mcs_rwlock_writer_lock_noirq(&proc->children_lock, &childlock);
list_add_tail(&child->siblings_list, &proc->children_list);
child->parent = proc;
mcs_rwlock_writer_unlock_noirq(&proc->children_lock, &childlock);
}
#endif /* POSTK_DEBUG_TEMP_FIX_53 */
child->ptrace = PT_TRACED | PT_TRACE_EXEC;
@ -8109,8 +8103,7 @@ SYSCALL_DECLARE(mbind)
int error = 0;
int bit;
struct vm_range *range;
struct vm_range_numa_policy *range_policy, *range_policy_iter;
struct vm_range_numa_policy *range_policy_next = NULL;
struct vm_range_numa_policy *range_policy, *range_policy_iter = NULL;
DECLARE_BITMAP(numa_mask, PROCESS_NUMA_MASK_BITS);
dkprintf("%s: addr: 0x%lx, len: %lu, mode: 0x%x, "
@ -8321,17 +8314,12 @@ SYSCALL_DECLARE(mbind)
case MPOL_INTERLEAVE:
case MPOL_PREFERRED:
/* Adjust any overlapping range settings and add new one */
range_policy_next = NULL;
list_for_each_entry(range_policy_iter,
&vm->vm_range_numa_policy_list, list) {
range_policy_iter = vm_range_policy_search(vm, addr);
if (range_policy_iter) {
int adjusted = 0;
unsigned long orig_end = range_policy_iter->end;
if (range_policy_iter->end < addr)
continue;
/* Special case of entirely overlapping */
if (range_policy_iter->start == addr &&
if (range_policy_iter->start == addr &&
range_policy_iter->end == addr + len) {
range_policy = range_policy_iter;
goto mbind_update_only;
@ -8348,7 +8336,7 @@ SYSCALL_DECLARE(mbind)
if (orig_end > addr + len) {
if (adjusted) {
/* Add a new entry after */
range_policy = kmalloc(sizeof(*range_policy),
range_policy = kmalloc(sizeof(struct vm_range_numa_policy),
IHK_MC_AP_NOWAIT);
if (!range_policy) {
dkprintf("%s: error allocating range_policy\n",
@ -8357,31 +8345,24 @@ SYSCALL_DECLARE(mbind)
goto unlock_out;
}
memcpy(range_policy, range_policy_iter,
sizeof(*range_policy));
RB_CLEAR_NODE(&range_policy->policy_rb_node);
range_policy->start = addr + len;
range_policy->end = orig_end;
list_add(&range_policy->list,
&range_policy_iter->list);
range_policy_next = range_policy;
break;
error = vm_policy_insert(vm, range_policy);
if (error) {
kprintf("%s: ERROR: could not insert range: %d\n",__FUNCTION__, error);
kfree(range_policy);
goto unlock_out;
}
}
else {
range_policy_iter->start = addr + len;
range_policy_next = range_policy_iter;
break;
}
}
/* Next one in ascending address order? */
if (range_policy_iter->start >= addr + len) {
range_policy_next = range_policy_iter;
break;
}
}
/* Add a new entry */
range_policy = kmalloc(sizeof(*range_policy),
range_policy = kmalloc(sizeof(struct vm_range_numa_policy),
IHK_MC_AP_NOWAIT);
if (!range_policy) {
dkprintf("%s: error allocating range_policy\n",
@ -8390,17 +8371,14 @@ SYSCALL_DECLARE(mbind)
goto unlock_out;
}
memset(range_policy, 0, sizeof(*range_policy));
RB_CLEAR_NODE(&range_policy->policy_rb_node);
range_policy->start = addr;
range_policy->end = addr + len;
if (range_policy_next) {
list_add_tail(&range_policy->list,
&range_policy_next->list);
}
else {
list_add_tail(&range_policy->list,
&vm->vm_range_numa_policy_list);
error = vm_policy_insert(vm, range_policy);
if (error) {
kprintf("%s: ERROR: could not insert range: %d\n",__FUNCTION__, error);
kfree(range_policy);
goto unlock_out;
}
mbind_update_only:
@ -8441,8 +8419,6 @@ SYSCALL_DECLARE(set_mempolicy)
struct process_vm *vm = cpu_local_var(current)->vm;
int error = 0;
int bit, valid_mask;
struct vm_range_numa_policy *range_policy_iter;
struct vm_range_numa_policy *range_policy_next = NULL;
DECLARE_BITMAP(numa_mask, PROCESS_NUMA_MASK_BITS);
memset(numa_mask, 0, sizeof(numa_mask));
@ -8463,6 +8439,8 @@ SYSCALL_DECLARE(set_mempolicy)
}
}
mode &= ~MPOL_MODE_FLAGS;
switch (mode) {
case MPOL_DEFAULT:
if (nodemask && nodemask_bits) {
@ -8486,6 +8464,13 @@ SYSCALL_DECLARE(set_mempolicy)
set_bit(bit, vm->numa_mask);
}
#if 0
/* According to the man page, MPOL_DEFAULT removes any non-default
process memory policy so that the memory policy falls back to the
system default policy; it does not say that the per-range NUMA
memory policies are deleted. Linux's set_mempolicy() does no such
deletion either. */
/* Delete all range settings */
ihk_mc_spinlock_lock_noirq(&vm->memory_range_lock);
list_for_each_entry_safe(range_policy_iter, range_policy_next,
@ -8494,6 +8479,7 @@ SYSCALL_DECLARE(set_mempolicy)
kfree(range_policy_iter);
}
ihk_mc_spinlock_unlock_noirq(&vm->memory_range_lock);
#endif
vm->numa_mem_policy = mode;
error = 0;
@ -8641,7 +8627,6 @@ SYSCALL_DECLARE(get_mempolicy)
/* Address range specific? */
if (flags & MPOL_F_ADDR) {
struct vm_range_numa_policy *range_policy_iter;
struct vm_range *range;
ihk_mc_spinlock_lock_noirq(&vm->memory_range_lock);
@ -8653,16 +8638,8 @@ SYSCALL_DECLARE(get_mempolicy)
goto out;
}
list_for_each_entry(range_policy_iter,
&vm->vm_range_numa_policy_list, list) {
if (range_policy_iter->start > addr ||
range_policy_iter->end <= addr) {
continue;
}
range_policy = vm_range_policy_search(vm, addr);
range_policy = range_policy_iter;
break;
}
ihk_mc_spinlock_unlock_noirq(&vm->memory_range_lock);
}
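The hunks above replace a linear walk over `vm_range_numa_policy_list` with a call to `vm_range_policy_search(vm, addr)`. The removed loop skipped entries with `start > addr || end <= addr`, so a match is an entry with `start <= addr < end`. The `policy_rb_node` field suggests the real lookup uses a red-black tree. The sketch below is not that implementation: it swaps in binary search over a sorted array to show the same containment test, and `struct range_policy` and `range_policy_search` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct vm_range_numa_policy. */
struct range_policy {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
	int mode;
};

/*
 * Return the entry whose [start, end) interval contains addr, or NULL.
 * tab must be sorted by start and must not contain overlapping ranges.
 */
static const struct range_policy *
range_policy_search(const struct range_policy *tab, size_t n,
		    unsigned long addr)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addr < tab[mid].start)
			hi = mid;		/* addr lies below this range */
		else if (addr >= tab[mid].end)
			lo = mid + 1;		/* addr lies above this range */
		else
			return &tab[mid];	/* start <= addr < end */
	}
	return NULL;
}
```

Either structure gives a logarithmic lookup; a tree additionally supports the insertions that `mbind` performs without shifting elements.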
@ -9288,6 +9265,7 @@ set_cputime(int mode)
return;
}
cpu_disable_interrupt();
tsc = rdtsc();
if(thread->base_tsc != 0){
unsigned long dtsc = tsc - thread->base_tsc;
@ -9368,6 +9346,7 @@ set_cputime(int mode)
}
}
}
cpu_enable_interrupt();
}
long syscall(int num, ihk_mc_user_context_t *ctx)
@ -9386,6 +9365,7 @@ long syscall(int num, ihk_mc_user_context_t *ctx)
#endif // DISABLE_SCHED_YIELD
set_cputime(1);
//kprintf("syscall=%d\n", num);
#ifdef PROFILE_ENABLE
if (thread->profile && thread->profile_start_ts) {
unsigned long ts = rdtsc();
@ -9396,6 +9376,7 @@ long syscall(int num, ihk_mc_user_context_t *ctx)
if(cpu_local_var(current)->proc->status == PS_EXITED &&
(num != __NR_exit && num != __NR_exit_group)){
save_syscall_return_value(num, -EINVAL);
check_signal(-EINVAL, NULL, 0);
set_cputime(0);
return -EINVAL;
@ -9456,22 +9437,7 @@ long syscall(int num, ihk_mc_user_context_t *ctx)
l = ihk_mc_syscall_ret(ctx);
}
#if defined(POSTK_DEBUG_TEMP_FIX_60) && defined(POSTK_DEBUG_TEMP_FIX_56)
check_signal(l, NULL, num);
#elif defined(POSTK_DEBUG_TEMP_FIX_60) /* sched_yield called check_signal fix. */
if (num != __NR_futex) {
check_signal(l, NULL, num);
}
#elif defined(POSTK_DEBUG_TEMP_FIX_56) /* in futex_wait() signal handling fix. */
if (num != __NR_sched_yield) {
check_signal(l, NULL, num);
}
#else /* POSTK_DEBUG_TEMP_FIX_60 && POSTK_DEBUG_TEMP_FIX_56 */
if (!list_empty(&thread->sigpending) ||
!list_empty(&thread->sigcommon->sigpending)) {
check_signal(l, NULL, num);
}
#endif /* POSTK_DEBUG_TEMP_FIX_60 && POSTK_DEBUG_TEMP_FIX_56 */
save_syscall_return_value(num, l);
#ifdef PROFILE_ENABLE
{
@ -9511,10 +9477,20 @@ long syscall(int num, ihk_mc_user_context_t *ctx)
}
#endif /* POSTK_DEBUG_TEMP_FIX_60 && POSTK_DEBUG_TEMP_FIX_56 */
if (!list_empty(&thread->sigpending) ||
!list_empty(&thread->sigcommon->sigpending)) {
check_signal(l, NULL, num);
}
#ifdef DISABLE_SCHED_YIELD
if (num != __NR_sched_yield)
#endif // DISABLE_SCHED_YIELD
set_cputime(0);
if (thread->proc->nohost) { // mcexec termination was detected
terminate(0, SIGKILL);
}
//kprintf("syscall=%d returns %lx(%ld)\n", num, l, l);
return l;
}


@ -34,7 +34,6 @@
#include <ihk/mm.h>
#include <xpmem_private.h>
struct xpmem_partition *xpmem_my_part = NULL; /* pointer to this partition */
@ -107,6 +106,7 @@ int xpmem_open(
mckfd->sig_no = -1;
mckfd->ioctl_cb = xpmem_ioctl;
mckfd->close_cb = xpmem_close;
mckfd->dup_cb = xpmem_dup;
mckfd->data = (long)proc;
irqstate = ihk_mc_spinlock_lock(&proc->mckfd_lock);
@ -276,16 +276,11 @@ static int xpmem_ioctl(
return -EINVAL;
}
static int xpmem_close(
struct mckfd *mckfd,
ihk_mc_user_context_t *ctx)
{
int n_opened;
struct process *proc = (struct process *)mckfd->data;
struct xpmem_thread_group *tg;
int index;
struct mcs_rwlock_node_irqsave lock;
XPMEM_DEBUG("call: fd=%d, pid=%d, rgid=%d",
mckfd->fd, proc->pid, proc->rgid);
@ -293,37 +288,11 @@ static int xpmem_close(
n_opened = ihk_atomic_dec_return(&xpmem_my_part->n_opened);
XPMEM_DEBUG("n_opened=%d", n_opened);
index = xpmem_tg_hashtable_index(proc->pid);
mcs_rwlock_writer_lock(&xpmem_my_part->tg_hashtable[index].lock, &lock);
tg = xpmem_tg_ref_by_tgid_all_nolock(proc->pid);
if (IS_ERR(tg)) {
mcs_rwlock_writer_unlock(
&xpmem_my_part->tg_hashtable[index].lock, &lock);
return 0;
if (mckfd->data) {
/* release my xpmem-objects */
xpmem_flush(mckfd);
}
list_del_init(&tg->tg_hashlist);
mcs_rwlock_writer_unlock(&xpmem_my_part->tg_hashtable[index].lock,
&lock);
XPMEM_DEBUG("tg->vm=0x%p", tg->vm);
ihk_mc_spinlock_lock_noirq(&tg->lock);
tg->flags |= XPMEM_FLAG_DESTROYING;
ihk_mc_spinlock_unlock_noirq(&tg->lock);
xpmem_release_aps_of_tg(tg);
xpmem_remove_segs_of_tg(tg);
ihk_mc_spinlock_lock_noirq(&tg->lock);
tg->flags |= XPMEM_FLAG_DESTROYED;
ihk_mc_spinlock_unlock_noirq(&tg->lock);
xpmem_destroy_tg(tg);
if (!n_opened) {
xpmem_exit();
}
@ -333,6 +302,15 @@ static int xpmem_close(
return 0;
}
static int xpmem_dup(
struct mckfd *mckfd,
ihk_mc_user_context_t *ctx)
{
mckfd->data = 0;
ihk_atomic_inc_return(&xpmem_my_part->n_opened);
return 0;
}
static int xpmem_init(void)
{
@ -987,6 +965,44 @@ static void xpmem_release_aps_of_tg(
XPMEM_DEBUG("return: ");
}
static void xpmem_flush(struct mckfd *mckfd)
{
struct process *proc = (struct process *)mckfd->data;
struct xpmem_thread_group *tg;
int index;
struct mcs_rwlock_node_irqsave lock;
index = xpmem_tg_hashtable_index(proc->pid);
mcs_rwlock_writer_lock(&xpmem_my_part->tg_hashtable[index].lock, &lock);
tg = xpmem_tg_ref_by_tgid_all_nolock(proc->pid);
if (IS_ERR(tg)) {
mcs_rwlock_writer_unlock(
&xpmem_my_part->tg_hashtable[index].lock, &lock);
return;
}
list_del_init(&tg->tg_hashlist);
mcs_rwlock_writer_unlock(&xpmem_my_part->tg_hashtable[index].lock,
&lock);
XPMEM_DEBUG("tg->vm=0x%p", tg->vm);
ihk_mc_spinlock_lock_noirq(&tg->lock);
tg->flags |= XPMEM_FLAG_DESTROYING;
ihk_mc_spinlock_unlock_noirq(&tg->lock);
xpmem_release_aps_of_tg(tg);
xpmem_remove_segs_of_tg(tg);
ihk_mc_spinlock_lock_noirq(&tg->lock);
tg->flags |= XPMEM_FLAG_DESTROYED;
ihk_mc_spinlock_unlock_noirq(&tg->lock);
xpmem_destroy_tg(tg);
}
static int xpmem_attach(
struct mckfd *mckfd,
@ -1431,11 +1447,12 @@ static void xpmem_detach_att(
att->at_vaddr, att->at_vaddr + 1);
if (!range || range->start > att->at_vaddr) {
DBUG_ON(1);
ihk_mc_spinlock_lock_noirq(&ap->lock);
list_del_init(&att->att_list);
ihk_mc_spinlock_unlock_noirq(&ap->lock);
mcs_rwlock_writer_unlock(&att->at_lock, &at_lock);
ihk_mc_spinlock_unlock_noirq(&vm->memory_range_lock);
ekprintf("%s: ERROR: lookup_process_memory_range() failed\n",
__FUNCTION__);
xpmem_att_destroyable(att);
XPMEM_DEBUG("return: range=%p", range);
return;
}
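The xpmem changes above make `xpmem_close()` release the thread group's objects only when `mckfd->data` is set, while `xpmem_dup()` clears `data` and increments `n_opened`. The toy model below sketches that bookkeeping under the assumption that the dup callback receives the duplicated descriptor. All names (`fd_model`, `part_model`, `model_*`) are hypothetical and nothing here touches real xpmem state.

```c
#include <stddef.h>

/* Hypothetical miniature of the descriptor state touched by the diff. */
struct fd_model {
	void *owner;		/* plays the role of mckfd->data */
};

struct part_model {
	int n_opened;		/* plays the role of xpmem_my_part->n_opened */
	int flushed;		/* counts simulated xpmem_flush() calls */
	int exited;		/* set once the last descriptor is closed */
};

static void model_open(struct part_model *p, struct fd_model *fd, void *owner)
{
	fd->owner = owner;
	p->n_opened++;
}

/* Mirrors xpmem_dup(): the descriptor passed in gives up ownership. */
static void model_dup(struct part_model *p, struct fd_model *fd)
{
	fd->owner = NULL;
	p->n_opened++;
}

/* Mirrors the new xpmem_close(): flush only if this descriptor owns objects. */
static void model_close(struct part_model *p, struct fd_model *fd)
{
	int n = --p->n_opened;

	if (fd->owner)
		p->flushed++;
	if (n == 0)
		p->exited = 1;
}
```

In this model, closing a duplicate leaves the owner's objects alone, and teardown happens only when the open count reaches zero.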


@ -169,7 +169,7 @@ out:
}
static int zeroobj_get_page(struct memobj *memobj, off_t off, int p2align,
uintptr_t *physp, unsigned long *pflag)
uintptr_t *physp, unsigned long *pflag, uintptr_t virt_addr)
{
int error;
struct zeroobj *obj = to_zeroobj(memobj);


@ -90,7 +90,7 @@ void ihk_mc_reserve_arch_pages(struct ihk_page_allocator_desc *pa_allocator,
unsigned long, unsigned long, int));
struct ihk_mc_pa_ops {
void *(*alloc_page)(int, int, ihk_mc_ap_flag, int node, int is_user);
void *(*alloc_page)(int, int, ihk_mc_ap_flag, int node, int is_user, uintptr_t virt_addr);
void (*free_page)(void *, int, int is_user);
void *(*alloc)(int, ihk_mc_ap_flag);
@ -115,33 +115,33 @@ int ihk_mc_free_micpa(unsigned long mic_pa);
void ihk_mc_clean_micpa(void);
void *_ihk_mc_alloc_aligned_pages_node(int npages, int p2align,
ihk_mc_ap_flag flag, int node, int is_user, char *file, int line);
ihk_mc_ap_flag flag, int node, int is_user, uintptr_t virt_addr, char *file, int line);
#define ihk_mc_alloc_aligned_pages_node(npages, p2align, flag, node) ({\
void *r = _ihk_mc_alloc_aligned_pages_node(npages, p2align, flag, node, IHK_MC_PG_KERNEL, __FILE__, __LINE__);\
void *r = _ihk_mc_alloc_aligned_pages_node(npages, p2align, flag, node, IHK_MC_PG_KERNEL, -1, __FILE__, __LINE__);\
r;\
})
#define ihk_mc_alloc_aligned_pages_node_user(npages, p2align, flag, node) ({\
void *r = _ihk_mc_alloc_aligned_pages_node(npages, p2align, flag, node, IHK_MC_PG_USER, __FILE__, __LINE__);\
#define ihk_mc_alloc_aligned_pages_node_user(npages, p2align, flag, node, virt_addr) ({\
void *r = _ihk_mc_alloc_aligned_pages_node(npages, p2align, flag, node, IHK_MC_PG_USER, virt_addr, __FILE__, __LINE__);\
r;\
})
#define ihk_mc_alloc_aligned_pages(npages, p2align, flag) ({\
void *r = _ihk_mc_alloc_aligned_pages_node(npages, p2align, flag, -1, IHK_MC_PG_KERNEL, __FILE__, __LINE__);\
void *r = _ihk_mc_alloc_aligned_pages_node(npages, p2align, flag, -1, IHK_MC_PG_KERNEL, -1, __FILE__, __LINE__);\
r;\
})
#define ihk_mc_alloc_aligned_pages_user(npages, p2align, flag) ({\
void *r = _ihk_mc_alloc_aligned_pages_node(npages, p2align, flag, -1, IHK_MC_PG_USER, __FILE__, __LINE__);\
#define ihk_mc_alloc_aligned_pages_user(npages, p2align, flag, virt_addr) ({\
void *r = _ihk_mc_alloc_aligned_pages_node(npages, p2align, flag, -1, IHK_MC_PG_USER, virt_addr, __FILE__, __LINE__);\
r;\
})
#define ihk_mc_alloc_pages(npages, flag) ({\
void *r = _ihk_mc_alloc_aligned_pages_node(npages, PAGE_P2ALIGN, flag, -1, IHK_MC_PG_KERNEL, __FILE__, __LINE__);\
void *r = _ihk_mc_alloc_aligned_pages_node(npages, PAGE_P2ALIGN, flag, -1, IHK_MC_PG_KERNEL, -1, __FILE__, __LINE__);\
r;\
})
#define ihk_mc_alloc_pages_user(npages, flag) ({\
void *r = _ihk_mc_alloc_aligned_pages_node(npages, PAGE_P2ALIGN, flag, -1, IHK_MC_PG_USER, __FILE__, __LINE__);\
#define ihk_mc_alloc_pages_user(npages, flag, virt_addr) ({\
void *r = _ihk_mc_alloc_aligned_pages_node(npages, PAGE_P2ALIGN, flag, -1, IHK_MC_PG_USER, virt_addr, __FILE__, __LINE__);\
r;\
})
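The macros above thread a new `virt_addr` argument through to `_ihk_mc_alloc_aligned_pages_node()`: kernel-side wrappers pass `-1`, user-side wrappers forward the caller's address, and every wrapper appends `__FILE__` and `__LINE__`. The sketch below illustrates that wrapper pattern with hypothetical names (`alloc_sketch`, `ALLOC_KERNEL`, `ALLOC_USER`). It uses plain expression macros rather than the GNU statement expressions in the header, and the allocator is a stub.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical record of the most recent call, for illustration only. */
struct alloc_call {
	int npages;
	int is_user;
	uintptr_t virt_addr;
	const char *file;
	int line;
};
static struct alloc_call last_call;
static char dummy_page[4096];

/* Stub allocator: records its arguments and hands back a dummy buffer. */
static void *alloc_sketch(int npages, int is_user, uintptr_t virt_addr,
			  const char *file, int line)
{
	last_call = (struct alloc_call){ npages, is_user, virt_addr, file, line };
	return dummy_page;
}

/* Kernel allocations have no user virtual address, so pass a sentinel. */
#define ALLOC_KERNEL(npages) \
	alloc_sketch((npages), 0, (uintptr_t)-1, __FILE__, __LINE__)
/* User allocations forward the faulting or target virtual address. */
#define ALLOC_USER(npages, virt_addr) \
	alloc_sketch((npages), 1, (virt_addr), __FILE__, __LINE__)
```

Because the parameter is a `uintptr_t`, the `-1` sentinel arrives as `UINTPTR_MAX`, which a callee can test for before consulting any per-address policy.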


@ -280,6 +280,18 @@ int flatten_strings_from_user(int nr_strings, char *first, char **strings, char
long r;
int n, ret;
/* When strings is NULL, return a flattened array holding only a NULL entry */
if (!strings) {
full_len = sizeof(long) + sizeof(char *);
_flat = kmalloc(full_len, IHK_MC_AP_NOWAIT);
if (!_flat) {
return -ENOMEM;
}
memset(_flat, 0, full_len);
*flat = (char *)_flat;
return full_len;
}
/* How many strings do we have? */
if (nr_strings == -1) {
nr_strings = 0;
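The new early return in `flatten_strings_from_user()` handles a NULL `strings` argument by producing a zeroed buffer of `sizeof(long) + sizeof(char *)` bytes. A user-space sketch of that case follows. `flatten_empty` is a hypothetical name, `malloc` stands in for `kmalloc`, and reading the leading `long` as a string count of zero is an assumption about the flattened layout.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Build the flattened form of an empty string vector: a zeroed buffer
 * holding one long followed by one pointer. Returns the buffer length,
 * or -1 on allocation failure. The caller frees *flat.
 */
static long flatten_empty(char **flat)
{
	size_t full_len = sizeof(long) + sizeof(char *);
	char *buf = malloc(full_len);

	if (!buf)
		return -1;
	memset(buf, 0, full_len);
	*flat = buf;
	return (long)full_len;
}
```

A consumer that walks the pointer slots until it meets NULL would stop immediately on this buffer, which is presumably why a single zeroed slot is enough.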

test/dump/README Normal file

@ -0,0 +1,23 @@
===================
Advance preparation
===================
1) Apply destroy_mem.patch
cd mckernel
patch -p0 < destroy_mem.patch
make
make install
2) Build the test programs run via mcexec
cd mckernel/test/dump/mcexec_test_proc/
make
==========
How to run
==========
# Run the McKernel-initiated dump test
./go_mck_dump_test.sh
# Run the Linux-initiated dump test
./go_linux_dump_test.sh


@ -0,0 +1,13 @@
diff --git a/executer/user/mcexec.c b/executer/user/mcexec.c
index 2d0d50e..9f109aa 100644
--- a/executer/user/mcexec.c
+++ b/executer/user/mcexec.c
@@ -470,7 +470,7 @@ struct program_load_desc *load_interp(struct program_load_desc *desc0, FILE *fp)
j++;
}
}
- desc->num_sections = j;
+ desc->num_sections = -1; // Test for num_sections check Issue#1011
desc->entry = hdr.e_entry;
desc->interp_align = align;


@ -0,0 +1,17 @@
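■ Issue#1011 verification
1. File description
1011.patch Patch that always sets num_sections of the program_load_desc structure,
which mcexec passes to mcctrl.ko, to -1
2. Verification procedure
1. Boot McKernel without the above patch applied
2. Run mcexec hostname and confirm that the host name is printed
3. Apply the above patch to McKernel, then rebuild and boot it
4. Run mcexec hostname and confirm that the host name is not printed and that
"prepare: Invalid argument" is printed to the console
5. Confirm that "kernel: mcexec_prepare_image: ERROR: # of sections: -1" appears
in /var/log/messages
3. Result
The expected behavior was observed with the procedure above, so there is no problem.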
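The procedure above exercises a sanity check on `num_sections` that rejects the image with "Invalid argument". A minimal sketch of such a check follows. `check_num_sections` and `SKETCH_MAX_SECTIONS` are hypothetical, and the upper bound is an assumption: only the rejection of -1 is documented by this test.

```c
#include <errno.h>

/* Hypothetical upper bound; the real limit, if any, is not shown here. */
#define SKETCH_MAX_SECTIONS 16

/* Return 0 for a plausible section count, -EINVAL otherwise. */
static int check_num_sections(int num_sections)
{
	if (num_sections <= 0 || num_sections > SKETCH_MAX_SECTIONS)
		return -EINVAL;
	return 0;
}
```

Rejecting the value before it is used as a loop bound or allocation size is what keeps a corrupted descriptor from reaching the loader.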


@ -0,0 +1,15 @@
diff --git a/executer/user/mcexec.c b/executer/user/mcexec.c
index 2d0d50e..70856e7 100644
--- a/executer/user/mcexec.c
+++ b/executer/user/mcexec.c
@@ -3732,7 +3732,9 @@ return_execve1:
/* Copy program image phase */
case 2:
-
+ fprintf(stderr, "execve killed\n");
+ fflush(stderr);
+ kill(getpid(), SIGKILL);
ret = -1;
/* Alloc descriptor */
desc = malloc(w.sr.args[2]);


@ -0,0 +1,17 @@
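■ Issue#727 verification
1. File description
727.patch Reproduction patch: makes mcexec terminate itself with SIGKILL
in execve phase 2
exec.c Test program: execs ls
patch-off.log Result without the patch applied; OK if the output of ls is shown
patch-on.log Result with the patch applied; OK if mcexec is killed and McKernel
does not PANIC
2. How to compile the test program
Run gcc -o exec exec.c to obtain the executable exec
3. Result
In patch-on.log, mcexec is killed and the McKernel log shows no PANIC,
so the result is OK.
Without the patch (patch-off.log), the output of ls is shown normally, so that is OK as well.
From the above, the results show no problem.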


@ -0,0 +1,10 @@
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <errno.h>
int
main(int argc, char **argv)
{
execlp("ls", "ls", (char *)NULL);
/* execlp() returns only on failure */
perror("execlp");
return 1;
}


@ -0,0 +1,33 @@
Script started on Wed Nov 29 11:22:32 2017
bash-4.2$ ../../../../mic/mcexec ./exec
727.patch exec exec.c patch-off.log
bash-4.2$ ../../../../mic/ihkosctl 0kmsg kmsg
IHK/McKernel started.
[ -1]: no_execute_available: 1
[ -1]: X86_IA32_NUM_PERF_COUNTERS: 4, X86_IA32_NUM_FIXED_PERF_COUNTERS: 3
[ -1]: Invariant TSC supported.
[ -1]: setup_x86 done.
[ -1]: ns_per_tsc: 384
[ -1]: KCommand Line: hidos dump_level=24
[ -1]: Physical memory: 0x1002c7000 - 0x140000000, 1070829568 bytes, 261433 pages available @ NUMA: 0
[ -1]: Physical memory: 0x880000000 - 0x8c0000000, 1073741824 bytes, 262144 pages available @ NUMA: 1
[ -1]: NUMA: 0, Linux NUMA: 0, type: 1, available bytes: 1070829568, pages: 261433
[ -1]: NUMA: 1, Linux NUMA: 1, type: 1, available bytes: 1073741824, pages: 262144
[ -1]: NUMA 0 distances: 0 (10), 1 (21),
[ -1]: NUMA 1 distances: 1 (10), 0 (21),
[ -1]: map_fixed: phys: 0x90000 => 0xffffffff70015000 (2 pages)
[ -1]: Trampoline area: 0x90000
[ -1]: map_fixed: phys: 0x0 => 0xffffffff70017000 (1 pages)
[ -1]: # of cpus : 7
[ -1]: locals = ffff8001002eb000
[ 0]: BSP: 0 (HW ID: 2 @ NUMA 0)
[ 0]: BSP: booted 6 AP CPUs
[ 0]: Master channel init acked.
[ 0]: vdso is enabled
IHK/McKernel booted.
bash-4.2$ Use "exit" to leave the shell.
bash-4.2$ exit
Script done on Wed Nov 29 11:22:57 2017


@ -0,0 +1,35 @@
Script started on Wed Nov 29 11:25:01 2017
bash-4.2$ ../../../../mic/mcexec ./exec
execve killed
Killed
bash-4.2$ ../../../../mic/ihkosctl 0 kmsg
IHK/McKernel started.
[ -1]: no_execute_available: 1
[ -1]: X86_IA32_NUM_PERF_COUNTERS: 4, X86_IA32_NUM_FIXED_PERF_COUNTERS: 3
[ -1]: Invariant TSC supported.
[ -1]: setup_x86 done.
[ -1]: ns_per_tsc: 384
[ -1]: KCommand Line: hidos dump_level=24
[ -1]: Physical memory: 0x1002c7000 - 0x140000000, 1070829568 bytes, 261433 pages available @ NUMA: 0
[ -1]: Physical memory: 0x880000000 - 0x8c0000000, 1073741824 bytes, 262144 pages available @ NUMA: 1
[ -1]: NUMA: 0, Linux NUMA: 0, type: 1, available bytes: 1070829568, pages: 261433
[ -1]: NUMA: 1, Linux NUMA: 1, type: 1, available bytes: 1073741824, pages: 262144
[ -1]: NUMA 0 distances: 0 (10), 1 (21),
[ -1]: NUMA 1 distances: 1 (10), 0 (21),
[ -1]: map_fixed: phys: 0x90000 => 0xffffffff70015000 (2 pages)
[ -1]: Trampoline area: 0x90000
[ -1]: map_fixed: phys: 0x0 => 0xffffffff70017000 (1 pages)
[ -1]: # of cpus : 7
[ -1]: locals = ffff8001002eb000
[ 0]: BSP: 0 (HW ID: 2 @ NUMA 0)
[ 0]: BSP: booted 6 AP CPUs
[ 0]: Master channel init acked.
[ 0]: vdso is enabled
IHK/McKernel booted.
[ 0]: do_syscall: proxy PID 14276 is dead, terminate()
bash-4.2$ Use "exit" to leave the shell.
bash-4.2$ exit
Script done on Wed Nov 29 11:25:33 2017


@ -0,0 +1,12 @@
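■ Issue#873 verification
1. File description
mck_boot_test.sh Script that repeats stopping and then booting McKernel 100 times
Prints an [OK] message if every iteration succeeds, or an [NG] message if one fails along the way
2. Verification procedure
1. Set MCK_DIR in mck_boot_test.sh to the McKernel installation directory
2. Run sh ./mck_boot_test.sh
3. Confirm that the [OK] message is printed to the console
3. Result
The expected behavior was observed with the procedure above, so there is no problem.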


@ -0,0 +1,18 @@
#!/bin/sh
MCK_DIR=/home/satoken/ppos
REP_NUM=100
for i in `seq 1 ${REP_NUM}`
do
sudo ${MCK_DIR}/sbin/mcstop+release.sh
sleep 1
sudo ${MCK_DIR}/sbin/mcreboot.sh
if [ $? -ne 0 ]; then
echo "[NG] failed to boot McKernel: ${i}"
exit 1
fi
done
echo "[OK] succeeded in booting McKernel ${i} times"

test/mbind/README Normal file

@ -0,0 +1,19 @@
===================
Advance preparation
===================
1) Apply test_trace_mem.patch
cd mckernel
patch -p0 < test_trace_mem.patch
make
make install
2) Build the test programs run via mcexec
cd mckernel/test/mbind/mcexec_test_proc/
make
==========
How to run
==========
./go_mbind_test.sh

test/mbind/chk_mbind_result.sh Executable file

@ -0,0 +1,58 @@
#!/bin/bash
DEFAULT_POLICY_KIND="<default policy>"
#SHARED_POLICY_KIND="<default policy:Mapping of MAP_SHARED>"
NUMA_NODE_POLICY_KIND="<NUMA node policy>"
FILE_NAME=$1
CHK_LOG_FILE="./result/${FILE_NAME}.log"
source "./testcases/${FILE_NAME}.txt"
CHK_POLICY_KIND=${POLICY_KIND}
SET_MEM_POLICY=`grep "OK:set_mempolicy" $CHK_LOG_FILE | grep -o '(MPOL.*)'`
SET_POLICY_NUM=`grep -c1 "OK:mbind" $CHK_LOG_FILE`
for exec_num in `seq 0 $((SET_POLICY_NUM - 1))`
do
if [ $exec_num -lt 10 ]; then
NUMA_NODE_ADDR=`grep "OK:mbind" $CHK_LOG_FILE | grep -e "0$exec_num]" | grep -o '(0x.*000)'`
NUMA_NODE_POLICY=`grep "OK:mbind" $CHK_LOG_FILE | grep -e "0$exec_num]" | grep -o '(MPOL.*)'`
else
NUMA_NODE_ADDR=`grep "OK:mbind" $CHK_LOG_FILE | grep -e "$exec_num]" | grep -o '(0x.*000)'`
NUMA_NODE_POLICY=`grep "OK:mbind" $CHK_LOG_FILE | grep -e "$exec_num]" | grep -o '(MPOL.*)'`
fi
if [ "$CHK_POLICY_KIND" = "$DEFAULT_POLICY_KIND" ]; then
SET_MEM_POLICY_NUM=`grep -v $NUMA_NODE_ADDR $CHK_LOG_FILE | grep -e "$CHK_POLICY_KIND" | grep -ce "$SET_MEM_POLICY"`
if [ $SET_MEM_POLICY_NUM -gt 0 ]; then
echo "OK:" $exec_num $CHK_POLICY_KIND" - not address" $NUMA_NODE_ADDR "test policy" $SET_MEM_POLICY "allocate num:" $SET_MEM_POLICY_NUM
exit 0
else
echo "NG:" $exec_num $CHK_POLICY_KIND" - not address" $NUMA_NODE_ADDR "test policy" $SET_MEM_POLICY "allocate num:" $SET_MEM_POLICY_NUM
exit 1
fi
fi
ALLOCATE_POLICY=`grep "mckernel_allocate_aligned_pages_node" $CHK_LOG_FILE | grep -e $NUMA_NODE_ADDR | grep -e "$CHK_POLICY_KIND" | grep -o '(MPOL.*)'`
if [ "$CHK_POLICY_KIND" = "$NUMA_NODE_POLICY_KIND" ]; then
if [ "$NUMA_NODE_POLICY" != "$ALLOCATE_POLICY" ]; then
echo "NG:" $exec_num $CHK_POLICY_KIND" - address" $NUMA_NODE_ADDR "test policy" $NUMA_NODE_POLICY "allocate policy" $ALLOCATE_POLICY
exit 1
else
echo "OK:" $exec_num $CHK_POLICY_KIND" - address" $NUMA_NODE_ADDR "test policy" $NUMA_NODE_POLICY "allocate policy" $ALLOCATE_POLICY
fi
else
if [ "$SET_MEM_POLICY" != "$ALLOCATE_POLICY" ]; then
echo "NG:" $exec_num $CHK_POLICY_KIND" - address" $NUMA_NODE_ADDR "test policy" $SET_MEM_POLICY "allocate policy" $ALLOCATE_POLICY
exit 1
else
echo "OK:" $exec_num $CHK_POLICY_KIND" - address" $NUMA_NODE_ADDR "test policy" $SET_MEM_POLICY "allocate policy" $ALLOCATE_POLICY
fi
fi
done
exit 0

test/mbind/config Executable file

@ -0,0 +1,3 @@
MCMOD_DIR=$HOME/ppos
export MCMOD_DIR

test/mbind/go_mbind_test.sh Executable file

@ -0,0 +1,26 @@
#!/bin/bash
START_NG_TEST_NO=0085
for test_case in `ls -1 ./testcases/*.txt`
do
case_name=`basename ${test_case} .txt`
logfile="./result/${case_name}.log"
./mbind_test.sh ${test_case} &> ${logfile}
if [ $? -eq 0 ]; then
./chk_mbind_result.sh ${case_name}
if [ $? -eq 0 ]; then
echo "[OK] ${case_name} is done."
else
echo "[NG] failed to test ${case_name}. Please check ${logfile}"
fi
else
test_number=`basename ${test_case} _mbind.txt`
if [ $test_number -ge $START_NG_TEST_NO ]; then
echo "[OK] ${case_name} is done(NG test case)."
else
echo "[NG] failed to test ${case_name}. Please check ${logfile}"
fi
fi
done

test/mbind/mbind_test.sh Executable file

@ -0,0 +1,47 @@
#!/bin/bash
if [ $# -lt 1 ]; then
echo "Error: too few arguments."
echo "usage: `basename $0` <param_file>"
exit 1
fi
# read config
source ./config
# read testcase param
source $1
# mcexec processのkill
./utils/kill_mcexec.sh &> /dev/null
# stop mckernel
sudo ${MCMOD_DIR}/sbin/mcstop+release.sh
sleep 1
# boot mckernel
echo "${MCMOD_DIR}/sbin/mcreboot.sh ${MCRBT_OPT%,}"
sudo ${MCMOD_DIR}/sbin/mcreboot.sh ${MCRBT_OPT%,}
sleep 1
if [ ! -e "/dev/mcos0" ]; then
echo "Error: failed to mcreboot"
exit 1
fi
# exec mckernel test program
echo "${MCMOD_DIR}/bin/mcexec ${USR_PROC}"
${MCMOD_DIR}/bin/mcexec ${USR_PROC}
if [ $? -eq 0 ]; then
sleep 1
echo "${MCMOD_DIR}/sbin/ihkosctl ${OS_IDX} kmsg"
sudo ${MCMOD_DIR}/sbin/ihkosctl ${OS_IDX} kmsg
exit 0
else
echo "Error: failed to mcexec"
exit 1
fi


@ -0,0 +1,7 @@
OBJS = exec_setmempolicy_mbind exec_setmempolicy_mbind_shared
# Libraries go in LDLIBS so that they follow the sources on the link line
LDLIBS = -lnuma
.PHONY: all clean
all: $(OBJS)
clean:
	rm -f $(OBJS)


@ -0,0 +1,208 @@
#include <stdio.h>
#include <stdlib.h> /* strtol(), strtoul() */
#include <assert.h>
#include <sys/mman.h>
#include <numa.h>
#include <numaif.h>
#define PAGE_SIZE (4096)
typedef struct func_setmem_para {
int set_mode;
int dummy;
unsigned long set_nodemask;
unsigned long set_maxnode;
} set_mem_para;
typedef struct func_mbind_para {
int set_mode;
int loop_cnt;
unsigned long set_nodemask;
unsigned long set_maxnode;
unsigned flags;
} mbind_para;
typedef struct func_para {
set_mem_para para1;
mbind_para para2;
} main_para;
char *mempolicy [] = {
"MPOL_DEFAULT",
"MPOL_PREFERRED",
"MPOL_BIND",
"MPOL_INTERLEAVE"
};
int func_set_mempolicy(set_mem_para* inpara)
{
int rst = -1;
int set_mode = inpara->set_mode;
unsigned long set_nodemask = inpara->set_nodemask;
unsigned long set_maxnode = inpara->set_maxnode;
int mode = set_mode & 0x00000003;
rst = set_mempolicy(set_mode, &set_nodemask, set_maxnode);
printf("-----\n");
if (rst < 0) {
printf("NG:set_mempolicy - mode:(%s) nodemask:0x%lx maxnode:%lu rst:%d\n"
,mempolicy[mode] ,set_nodemask ,set_maxnode, rst);
//assert(0 && "set_mempolicy() failed");
} else {
printf("OK:set_mempolicy - mode:(%s) nodemask:0x%lx maxnode:%lu\n"
,mempolicy[mode] ,set_nodemask ,set_maxnode);
}
printf("-----\n");
return rst;
}
int func_mbind(mbind_para* inpara)
{
int rst = -1;
unsigned char *addr = NULL;
int get_mode = 0;
int i = 0;
unsigned long mem_len = PAGE_SIZE;
int set_mode = inpara->set_mode;
unsigned long set_nodemask = inpara->set_nodemask;
unsigned long set_maxnode = inpara->set_maxnode;
unsigned flags = inpara->flags;
int loop_cnt = inpara->loop_cnt;
int mode = set_mode & 0x00000003;
for (i = 0; i < loop_cnt; i++) {
addr = mmap(0, mem_len, (PROT_READ | PROT_WRITE),
(MAP_ANONYMOUS | MAP_PRIVATE), 0, 0);
if (addr == (void *) -1) {
printf("[%02d] NG:mmap - len:%d prot:0x%x flags:0x%x\n"
,i ,mem_len ,(PROT_READ | PROT_WRITE) ,(MAP_ANONYMOUS | MAP_PRIVATE));
//assert(0 && "mmap() failed");
return -1;
} else {
// printf("[%02d] OK:mmap - addr:(0x%016lx) len:%d prot:0x%x flags:0x%x\n"
// ,i ,addr ,mem_len ,(PROT_READ | PROT_WRITE) ,(MAP_ANONYMOUS | MAP_PRIVATE));
}
if ((inpara->set_mode & 0x000000ff) == 0xff) {
switch ((i & 0x3)) {
case MPOL_PREFERRED:
set_mode = ((set_mode & 0xffffff00) | MPOL_PREFERRED);
set_nodemask = inpara->set_nodemask;
flags = 0;
mode = MPOL_PREFERRED;
break;
case MPOL_BIND:
set_mode = ((set_mode & 0xffffff00) | MPOL_BIND);
set_nodemask = inpara->set_nodemask;
flags = 0;
mode = MPOL_BIND;
break;
case MPOL_INTERLEAVE:
set_mode = ((set_mode & 0xffffff00) | MPOL_INTERLEAVE);
set_nodemask = inpara->set_nodemask;
flags = 0;
mode = MPOL_INTERLEAVE;
break;
case MPOL_DEFAULT:
default:
set_mode = ((set_mode & 0xffffff00) | MPOL_DEFAULT);
set_nodemask = 0;
flags = MPOL_MF_STRICT;
mode = MPOL_DEFAULT;
break;
}
}
rst = mbind(addr, mem_len, set_mode, &set_nodemask, set_maxnode, flags);
if (rst < 0) {
printf("[%02d] NG:mbind - addr:(0x%016lx) len:%lu mode:(%s) nodemask:0x%lx maxnode:%lu flags:%u rst:%d\n"
,i ,(unsigned long)addr ,mem_len ,mempolicy[mode] ,set_nodemask ,set_maxnode ,flags ,rst);
//assert(0 && "mbind() failed");
return -1;
} else {
printf("[%02d] OK:mbind - addr:(0x%016lx) len:%lu mode:(%s) nodemask:0x%lx maxnode:%lu flags:%u\n"
,i ,(unsigned long)addr ,mem_len ,mempolicy[mode] ,set_nodemask ,set_maxnode ,flags);
}
rst = get_mempolicy(&get_mode, NULL, 0, addr, MPOL_F_ADDR);
if(rst < 0) {
printf("[%02d] NG:get_mempolicy - addr:(0x%016lx) rst:%d\n"
,i ,(unsigned long)addr , rst);
//assert(0 && "get_mempolicy failed");
return -1;
} else {
printf("[%02d] OK:get_mempolicy - addr:(0x%016lx) mode:(%s)\n"
,i ,(unsigned long)addr ,mempolicy[get_mode]);
}
rst = munmap(addr, mem_len);
if (rst < 0) {
printf("[%02d] NG:munmap - addr:(0x%016lx) len:%d\n"
,i ,addr ,mem_len);
} else {
// printf("[%02d] OK:munmap - addr:(0x%016lx) len:%d\n"
// ,i ,addr ,mem_len);
}
addr = mmap(addr, mem_len, (PROT_READ | PROT_WRITE),
(MAP_FIXED | MAP_ANONYMOUS | MAP_PRIVATE), 0, 0);
if (addr == (void *) -1) {
printf("[%02d] NG:mmap - len:%d prot:0x%x flags:0x%x\n"
,i ,mem_len ,(PROT_READ | PROT_WRITE) ,(MAP_FIXED | MAP_ANONYMOUS | MAP_PRIVATE));
//assert(0 && "mmap() failed");
return -1;
} else {
// printf("[%02d] OK:mmap - addr:(0x%016lx) len:%d prot:0x%x flags:0x%x\n"
// ,i ,addr ,mem_len ,(PROT_READ | PROT_WRITE) ,(MAP_FIXED | MAP_ANONYMOUS | MAP_PRIVATE));
}
printf("-----\n");
}
return 0;
}
int main(int argc, char *argv[])
{
main_para inpara;
int rst = -1;
if (argc == 9 ) {
inpara.para1.set_mode = strtol(argv[1], NULL, 16);
inpara.para1.set_nodemask = strtoul(argv[2], NULL, 16);
inpara.para1.set_maxnode = strtol(argv[3], NULL, 10);
rst = func_set_mempolicy(&inpara.para1);
if (rst == 0) {
inpara.para2.set_mode = strtol(argv[4], NULL, 16);
inpara.para2.set_nodemask = strtoul(argv[5], NULL, 16);
inpara.para2.set_maxnode = strtoul(argv[6], NULL, 10);
inpara.para2.flags = strtoul(argv[7], NULL, 16);
inpara.para2.loop_cnt = strtol(argv[8], NULL, 10);
rst = func_mbind(&inpara.para2);
}
} else {
printf("NG: Invalid number of parameters(%d)\n",(argc-1));
printf(" parameter 1 : set_mempolicy(mode)\n");
printf(" parameter 2 : set_mempolicy(nodemask)\n");
printf(" parameter 3 : set_mempolicy(maxnode)\n");
printf(" parameter 4 : mbind(mode) 0xff - all mode\n");
printf(" parameter 5 : mbind(nodemask)\n");
printf(" parameter 6 : mbind(maxnode)\n");
printf(" parameter 7 : mbind(flags)\n");
printf(" parameter 8 : Number of mbind executed\n");
printf(" example) ./exec_setmempolicy_mbind 0x1 0x1 2 0x2 0x1 2 0x0 1\n");
}
return rst;
}


@ -0,0 +1,208 @@
#include <stdio.h>
#include <stdlib.h> /* strtol(), strtoul() */
#include <assert.h>
#include <sys/mman.h>
#include <numa.h>
#include <numaif.h>
#define PAGE_SIZE (4096)
typedef struct func_setmem_para {
int set_mode;
int dummy;
unsigned long set_nodemask;
unsigned long set_maxnode;
} set_mem_para;
typedef struct func_mbind_para {
int set_mode;
int loop_cnt;
unsigned long set_nodemask;
unsigned long set_maxnode;
unsigned flags;
} mbind_para;
typedef struct func_para {
set_mem_para para1;
mbind_para para2;
} main_para;
char *mempolicy [] = {
"MPOL_DEFAULT",
"MPOL_PREFERRED",
"MPOL_BIND",
"MPOL_INTERLEAVE"
};
int func_set_mempolicy(set_mem_para* inpara)
{
int rst = -1;
int set_mode = inpara->set_mode;
unsigned long set_nodemask = inpara->set_nodemask;
unsigned long set_maxnode = inpara->set_maxnode;
int mode = set_mode & 0x00000003;
rst = set_mempolicy(set_mode, &set_nodemask, set_maxnode);
printf("-----\n");
if (rst < 0) {
printf("NG:set_mempolicy - mode:(%s) nodemask:0x%lx maxnode:%lu rst:%d\n"
,mempolicy[mode] ,set_nodemask ,set_maxnode, rst);
//assert(0 && "set_mempolicy() failed");
} else {
printf("OK:set_mempolicy - mode:(%s) nodemask:0x%lx maxnode:%lu\n"
,mempolicy[mode] ,set_nodemask ,set_maxnode);
}
printf("-----\n");
return rst;
}
int func_mbind(mbind_para* inpara)
{
int rst = -1;
unsigned char *addr = NULL;
int get_mode = 0;
int i = 0;
unsigned long mem_len = PAGE_SIZE;
int set_mode = inpara->set_mode;
unsigned long set_nodemask = inpara->set_nodemask;
unsigned long set_maxnode = inpara->set_maxnode;
unsigned flags = inpara->flags;
int loop_cnt = inpara->loop_cnt;
int mode = set_mode & 0x00000003;
for (i = 0; i < loop_cnt; i++) {
addr = mmap(0, mem_len, (PROT_READ | PROT_WRITE),
(MAP_ANONYMOUS | MAP_SHARED | MAP_POPULATE), 0, 0);
if (addr == (void *) -1) {
printf("[%02d] NG:mmap - len:%d prot:0x%x flags:0x%x\n"
,i ,mem_len ,(PROT_READ | PROT_WRITE) ,(MAP_ANONYMOUS | MAP_SHARED | MAP_POPULATE));
//assert(0 && "mmap() failed");
return -1;
} else {
// printf("[%02d] OK:mmap - addr:(0x%016lx) len:%d prot:0x%x flags:0x%x\n"
// ,i ,addr ,mem_len ,(PROT_READ | PROT_WRITE) ,(MAP_ANONYMOUS | MAP_SHARED | MAP_POPULATE));
}
if ((inpara->set_mode & 0x000000ff) == 0xff) {
switch ((i & 0x3)) {
case MPOL_PREFERRED:
set_mode = ((set_mode & 0xffffff00) | MPOL_PREFERRED);
set_nodemask = inpara->set_nodemask;
flags = 0;
mode = MPOL_PREFERRED;
break;
case MPOL_BIND:
set_mode = ((set_mode & 0xffffff00) | MPOL_BIND);
set_nodemask = inpara->set_nodemask;
flags = 0;
mode = MPOL_BIND;
break;
case MPOL_INTERLEAVE:
set_mode = ((set_mode & 0xffffff00) | MPOL_INTERLEAVE);
set_nodemask = inpara->set_nodemask;
flags = 0;
mode = MPOL_INTERLEAVE;
break;
case MPOL_DEFAULT:
default:
set_mode = ((set_mode & 0xffffff00) | MPOL_DEFAULT);
set_nodemask = 0;
flags = MPOL_MF_STRICT;
mode = MPOL_DEFAULT;
break;
}
}
rst = mbind(addr, mem_len, set_mode, &set_nodemask, set_maxnode, flags);
if (rst < 0) {
printf("[%02d] NG:mbind - addr:(0x%016lx) len:%lu mode:(%s) nodemask:0x%lx maxnode:%lu flags:%u rst:%d\n"
,i ,(unsigned long)addr ,mem_len ,mempolicy[mode] ,set_nodemask ,set_maxnode ,flags ,rst);
//assert(0 && "mbind() failed");
return -1;
} else {
printf("[%02d] OK:mbind - addr:(0x%016lx) len:%lu mode:(%s) nodemask:0x%lx maxnode:%lu flags:%u\n"
,i ,(unsigned long)addr ,mem_len ,mempolicy[mode] ,set_nodemask ,set_maxnode ,flags);
}
rst = get_mempolicy(&get_mode, NULL, 0, addr, MPOL_F_ADDR);
if(rst < 0) {
printf("[%02d] NG:get_mempolicy - addr:(0x%016lx) rst:%d\n"
,i ,(unsigned long)addr , rst);
//assert(0 && "get_mempolicy failed");
return -1;
} else {
/* The returned mode may carry mode flags; mask them off before indexing. */
printf("[%02d] OK:get_mempolicy - addr:(0x%016lx) mode:(%s)\n"
,i ,(unsigned long)addr ,mempolicy[get_mode & 0x3]);
}
rst = munmap(addr, mem_len);
if (rst < 0) {
printf("[%02d] NG:munmap - addr:(0x%016lx) len:%lu\n"
,i ,(unsigned long)addr ,mem_len);
} else {
// printf("[%02d] OK:munmap - addr:(0x%016lx) len:%d\n"
// ,i ,addr ,mem_len);
}
addr = mmap(addr, mem_len, (PROT_READ | PROT_WRITE),
(MAP_FIXED | MAP_ANONYMOUS | MAP_SHARED | MAP_POPULATE), -1, 0);
if (addr == MAP_FAILED) {
printf("[%02d] NG:mmap - len:%lu prot:0x%x flags:0x%x\n"
,i ,mem_len ,(PROT_READ | PROT_WRITE) ,(MAP_FIXED | MAP_ANONYMOUS | MAP_SHARED | MAP_POPULATE));
//assert(0 && "mmap() failed");
return -1;
} else {
// printf("[%02d] OK:mmap - addr:(0x%016lx) len:%d prot:0x%x flags:0x%x\n"
// ,i ,addr ,mem_len ,(PROT_READ | PROT_WRITE) ,(MAP_FIXED | MAP_ANONYMOUS | MAP_SHARED | MAP_POPULATE));
}
printf("-----\n");
}
return 0;
}
int main(int argc, char *argv[])
{
main_para inpara;
int rst = -1;
if (argc == 9 ) {
inpara.para1.set_mode = strtol(argv[1], NULL, 16);
inpara.para1.set_nodemask = strtoul(argv[2], NULL, 16);
inpara.para1.set_maxnode = strtoul(argv[3], NULL, 10);
rst = func_set_mempolicy(&inpara.para1);
if (rst == 0) {
inpara.para2.set_mode = strtol(argv[4], NULL, 16);
inpara.para2.set_nodemask = strtoul(argv[5], NULL, 16);
inpara.para2.set_maxnode = strtoul(argv[6], NULL, 10);
inpara.para2.flags = strtoul(argv[7], NULL, 16);
inpara.para2.loop_cnt = strtol(argv[8], NULL, 10);
rst = func_mbind(&inpara.para2);
}
} else {
printf("NG: Invalid number of parameters(%d)\n",(argc-1));
printf(" parameter 1 : set_mempolicy(mode)\n");
printf(" parameter 2 : set_mempolicy(nodemask)\n");
printf(" parameter 3 : set_mempolicy(maxnode)\n");
printf(" parameter 4 : mbind(mode) low byte 0xff - cycle through all modes\n");
printf(" parameter 5 : mbind(nodemask)\n");
printf(" parameter 6 : mbind(maxnode)\n");
printf(" parameter 7 : mbind(flags)\n");
printf(" parameter 8 : number of mbind() iterations\n");
printf(" example) ./exec_setmempolicy_mbind 0x1 0x1 2 0x2 0x1 2 0x0 1\n");
}
return rst;
}

test/mbind/result/.gitignore vendored Normal file

@ -0,0 +1,75 @@
diff --git kernel/mem.c kernel/mem.c
index 62cb206..5bfb6d6 100644
--- kernel/mem.c
+++ kernel/mem.c
@@ -542,6 +542,15 @@ static void reserve_pages(struct ihk_page_allocator_desc *pa_allocator,
ihk_pagealloc_reserve(pa_allocator, start, end);
}
+#if 1 /* Trace for DEBUG */
+char *mempolicy [] = {
+ "MPOL_DEFAULT",
+ "MPOL_PREFERRED",
+ "MPOL_BIND",
+ "MPOL_INTERLEAVE"
+};
+#endif
+
extern int cpu_local_var_initialized;
static void *mckernel_allocate_aligned_pages_node(int npages, int p2align,
ihk_mc_ap_flag flag, int pref_node, int is_user, uintptr_t virt_addr)
@@ -585,6 +594,23 @@ static void *mckernel_allocate_aligned_pages_node(int npages, int p2align,
}
}
+#if 1 /* Trace for DEBUG */
+ if (!((range_policy_iter) && (range_policy_iter->numa_mem_policy != MPOL_DEFAULT))) {
+ if ((range_policy_iter) && (range_policy_iter->numa_mem_policy == MPOL_DEFAULT)) {
+ if (chk_shm == 0) {
+ kprintf("%s[%d]: addr(0x%016lx) policy(%s) <NUMA node policy>\n"
+ ,__FUNCTION__ ,__LINE__ ,virt_addr ,mempolicy[(range_policy_iter->numa_mem_policy & 0x3)]);
+ }
+ } else {
+ if ((cpu_local_var(current)->vm->numa_mem_policy == MPOL_DEFAULT) && (virt_addr != -1)) {
+ if (virt_addr) {
+ kprintf("%s[%d]: addr(0x%016lx) policy(%s) <default policy>\n"
+ ,__FUNCTION__ ,__LINE__ ,virt_addr ,mempolicy[(cpu_local_var(current)->vm->numa_mem_policy & 0x3)]);
+ }
+ }
+ }
+ }
+#endif
if ((!((range_policy_iter) && (range_policy_iter->numa_mem_policy != MPOL_DEFAULT))) && (chk_shm == 0))
goto distance_based;
@@ -647,10 +673,30 @@ static void *mckernel_allocate_aligned_pages_node(int npages, int p2align,
chk_shm = 1;
} else {
numa_mem_policy = range_policy_iter->numa_mem_policy;
+
+#if 1 /* Trace for DEBUG */
+ kprintf("%s[%d]: addr(0x%016lx) policy(%s) <NUMA node policy>\n"
+ ,__FUNCTION__ ,__LINE__ ,virt_addr ,mempolicy[(numa_mem_policy & 0x3)]);
+#endif
+
}
}
}
+#if 1 /* Trace for DEBUG */
+ if (numa_mem_policy == -1) {
+ if (chk_shm == 1) {
+ kprintf("%s[%d]: addr(0x%016lx) policy(%s) <default policy:Mapping of MAP_SHARED>\n"
+ ,__FUNCTION__ ,__LINE__ ,virt_addr ,mempolicy[(cpu_local_var(current)->vm->numa_mem_policy & 0x3)]);
+ } else {
+ if (virt_addr) {
+ kprintf("%s[%d]: addr(0x%016lx) policy(%s) <default policy>\n"
+ ,__FUNCTION__ ,__LINE__ ,virt_addr ,mempolicy[(cpu_local_var(current)->vm->numa_mem_policy & 0x3)]);
+ }
+ }
+ }
+#endif
+
if (numa_mem_policy == -1)
numa_mem_policy = cpu_local_var(current)->vm->numa_mem_policy;

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind 0x8000 0x0 1 0x8000 0x0 1 1 1"
OS_IDX=0
POLICY_KIND="<NUMA node policy>"
#POLICY_KIND="<default policy>"
#POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind 0x8000 0x0 1 0x8001 0x1 1 0 1"
OS_IDX=0
POLICY_KIND="<NUMA node policy>"
#POLICY_KIND="<default policy>"
#POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind 0x8000 0x0 1 0x8002 0x1 1 0 1"
OS_IDX=0
POLICY_KIND="<NUMA node policy>"
#POLICY_KIND="<default policy>"
#POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind 0x8000 0x0 1 0x8003 0x1 1 0 1"
OS_IDX=0
POLICY_KIND="<NUMA node policy>"
#POLICY_KIND="<default policy>"
#POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind 0x8000 0x0 1 0x80ff 0x1 1 0 4"
OS_IDX=0
POLICY_KIND="<NUMA node policy>"
#POLICY_KIND="<default policy>"
#POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind 0x8000 0x0 1 0x80ff 0x1 1 0 40"
OS_IDX=0
POLICY_KIND="<NUMA node policy>"
#POLICY_KIND="<default policy>"
#POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind 0x8000 0x0 1 0x8000 0x0 1 1 1"
OS_IDX=0
#POLICY_KIND="<NUMA node policy>"
POLICY_KIND="<default policy>"
#POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind 0x8001 0x1 1 0x8000 0x0 1 1 1"
OS_IDX=0
#POLICY_KIND="<NUMA node policy>"
POLICY_KIND="<default policy>"
#POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind 0x8002 0x1 1 0x8000 0x0 1 1 1"
OS_IDX=0
#POLICY_KIND="<NUMA node policy>"
POLICY_KIND="<default policy>"
#POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind 0x8003 0x1 1 0x8000 0x0 1 1 1"
OS_IDX=0
#POLICY_KIND="<NUMA node policy>"
POLICY_KIND="<default policy>"
#POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind_shared 0x8000 0x0 1 0x8003 0x1 1 0 1"
OS_IDX=0
#POLICY_KIND="<NUMA node policy>"
#POLICY_KIND="<default policy>"
POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind_shared 0x8001 0x1 1 0x8002 0x1 1 0 1"
OS_IDX=0
#POLICY_KIND="<NUMA node policy>"
#POLICY_KIND="<default policy>"
POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind_shared 0x8002 0x1 1 0x8001 0x1 1 0 1"
OS_IDX=0
#POLICY_KIND="<NUMA node policy>"
#POLICY_KIND="<default policy>"
POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind_shared 0x8003 0x1 1 0x8000 0x0 1 0 1"
OS_IDX=0
#POLICY_KIND="<NUMA node policy>"
#POLICY_KIND="<default policy>"
POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0 1" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind 0x8000 0x0 2 0x8000 0x0 2 1 1"
OS_IDX=0
POLICY_KIND="<NUMA node policy>"
#POLICY_KIND="<default policy>"
#POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0 1" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind 0x8000 0x0 2 0x8001 0x3 2 0 1"
OS_IDX=0
POLICY_KIND="<NUMA node policy>"
#POLICY_KIND="<default policy>"
#POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0 1" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind 0x8000 0x0 2 0x8002 0x3 2 0 1"
OS_IDX=0
POLICY_KIND="<NUMA node policy>"
#POLICY_KIND="<default policy>"
#POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0 1" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind 0x8000 0x0 2 0x8003 0x3 2 0 1"
OS_IDX=0
POLICY_KIND="<NUMA node policy>"
#POLICY_KIND="<default policy>"
#POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0 1" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind 0x8000 0x0 2 0x80ff 0x3 2 0 4"
OS_IDX=0
POLICY_KIND="<NUMA node policy>"
#POLICY_KIND="<default policy>"
#POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0 1" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind 0x8000 0x0 2 0x80ff 0x3 2 0 40"
OS_IDX=0
POLICY_KIND="<NUMA node policy>"
#POLICY_KIND="<default policy>"
#POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0 1" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind 0x8000 0x0 2 0x8000 0x0 2 1 1"
OS_IDX=0
#POLICY_KIND="<NUMA node policy>"
POLICY_KIND="<default policy>"
#POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0 1" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind 0x8001 0x3 2 0x8000 0x0 2 1 1"
OS_IDX=0
#POLICY_KIND="<NUMA node policy>"
POLICY_KIND="<default policy>"
#POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0 1" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind 0x8002 0x3 2 0x8000 0x0 2 1 1"
OS_IDX=0
#POLICY_KIND="<NUMA node policy>"
POLICY_KIND="<default policy>"
#POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

@ -0,0 +1,6 @@
MCRBT_OPT="-m `./utils/gen_mem_chunks.sh "0 1" 32M 1`"
USR_PROC="mcexec_test_proc/exec_setmempolicy_mbind 0x8003 0x3 2 0x8000 0x0 2 1 1"
OS_IDX=0
#POLICY_KIND="<NUMA node policy>"
POLICY_KIND="<default policy>"
#POLICY_KIND="<default policy:Mapping of MAP_SHARED>"

Some files were not shown because too many files have changed in this diff.