=============================================
Version 1.8.0 (Mar 23, 2021)
=============================================

----------------------
IHK major updates
----------------------
N/A

------------------------
IHK major bug fixes
------------------------
N/A

----------------------
McKernel major updates
----------------------
N/A

------------------------
McKernel major bug fixes
------------------------
#. profile: fix infinite recursion for allocation miss event
#. Fugaku: MAP_LOCKED and pre-populate PMIx shared memory PFNs

=============================================
Version 1.7.10 (Mar 18, 2021)
=============================================

----------------------
IHK major updates
----------------------
N/A

------------------------
IHK major bug fixes
------------------------
#. __ihk_device_detect_hungup: detect hungup via device-ioctl

----------------------
McKernel major updates
----------------------
N/A

------------------------
McKernel major bug fixes
------------------------
N/A

=============================================
Version 1.7.9 (Mar 17, 2021)
=============================================

----------------------
IHK major updates
----------------------
N/A

------------------------
IHK major bug fixes
------------------------
#. ihklib: ihk_reserve_mem_conf*: fix default values
#. smp_ihk_os_shutdown: fix memory leak
#. smp_ihk_os_shutdown: prevent double free
#. __ihk_os_shutdown: fix smp_ihk_os_shutdown()-related double free
#. smp_ihk_os_panic_notifier: exclude memory from Linux dump with default setting
#. smp_ihk_os_panic_notifier: exclude memory from Linux dump while booting, on timeout

----------------------
McKernel major updates
----------------------
N/A

------------------------
McKernel major bug fixes
------------------------
#. mcctrl_wakeup_desc: refcount and fix timeouts

=============================================
Version 1.7.8 (Mar 12, 2021)
=============================================

----------------------
IHK major updates
----------------------
N/A

------------------------
IHK major bug fixes
------------------------
#. ihklib: ihk_reserve_cpu: fix job cpu check when using krm

----------------------
McKernel major updates
----------------------
N/A

------------------------
McKernel major bug fixes
------------------------
N/A

=============================================
Version 1.7.7 (Mar 11, 2021)
=============================================

----------------------
IHK major updates
----------------------
N/A

------------------------
IHK major bug fixes
------------------------
N/A

----------------------
McKernel major updates
----------------------
N/A

------------------------
McKernel major bug fixes
------------------------
#. mcexec: fput executable just after its contents are transferred
#. spec: cmake-config cmake parameters

=============================================
Version 1.7.6 (Mar 11, 2021)
=============================================

----------------------
IHK major updates
----------------------
N/A

------------------------
IHK major bug fixes
------------------------
#. ihklib: ihk_reserve_mem_conf*: apply change only to the next reservation

----------------------
McKernel major updates
----------------------
N/A

------------------------
McKernel major bug fixes
------------------------
N/A

=============================================
Version 1.7.5 (Mar 11, 2021)
=============================================

----------------------
IHK major updates
----------------------
N/A

------------------------
IHK major bug fixes
------------------------
#. ihklib: fix cgroup cpuset.cpus/mems check when using krm
#. ihklib: ihk_reserve_mem_conf_str: set default values to those not specified

----------------------
McKernel major updates
----------------------
N/A

------------------------
McKernel major bug fixes
------------------------
N/A

=============================================
Version 1.7.4 (Mar 7, 2021)
=============================================

----------------------
IHK major updates
----------------------
N/A

------------------------
IHK major bug fixes
------------------------
N/A

----------------------
McKernel major updates
----------------------
N/A

------------------------
McKernel major bug fixes
------------------------
N/A

=============================================
Version 1.7.3 (Mar 5, 2021)
=============================================

----------------------
IHK major updates
----------------------
N/A

------------------------
IHK major bug fixes
------------------------
N/A

----------------------
McKernel major updates
----------------------
N/A

------------------------
McKernel major bug fixes
------------------------
N/A

=============================================
Version 1.7.2 (Mar 5, 2021)
=============================================

----------------------
IHK major updates
----------------------
#. ihklib: add ``*_str()`` functions for reserve, assign, IKC-map, kargs
#. smp: make smp_call_func() arch independent

------------------------
IHK major bug fixes
------------------------
#. ihklib: ihk_reserve_mem: fix capped best-effort
#. TO RESET: fake missing NUMA node pieces, 90% memory limit
#. ihklib: ihk_reserve_mem_conf: range-check for IHK_RESERVE_MEM_MAX_SIZE_RATIO_ALL
#. ihklib: ihk_os_kargs: check if "hidos" is included
#. SMP: omit slab/slub shrink, use 95% limit by default
#. check cpu / numa cgroup set by krm
#. SMP: __ihk_smp_reserve_mem: add __GFP_COMP to __GFP_ATOMIC allocation
#. ihk_register_device: record minor to IHK device object

----------------------
McKernel major updates
----------------------
#. mcexec: memory policy control by environment variable
#. mempolicy: Support MPOL_INTERLEAVE
#. uti: futex call function in mcctrl
#. uti: integrate libuti and redirect to mck/libuti.so
#. uti: integrate syscall_intercept
#. shmobj: support large page
#. xpmem: support large page
#. MM: handle zero_at_free in page faults

------------------------
McKernel major bug fixes
------------------------
#. TO RESET: stack changes
#. Tofu: keep track of stags per memory range
#. Tofu: match page sizes to MBPT and fault PTEs if not present
#. Tofu: fix phys addr calculation for contiguous pages in MBPT/BCH update
#. rus_vm_fault: vmf_insert_pfn: treat VM_FAULT_NOPAGE as success
#. Tofu: mcctrl side MMU notifier and CQ/BCH cleanup
#. copy_user_ranges: copy straight_start of struct vm_range
#. mcctrl: abort on invalid addr in mcexec_transfer_image()
#. mcctrl: fix access to uninitialized usrdata->cpu_topology_list
#. mcexec: propagate error in __NR_gettid handler
#. mcexec_transfer_image(): map exact size of remote memory (instead of forcing PAGE_SIZE)
#. xpmem: fault stack area of remote process if VM range doesn't yet exist
#. Tofu: fault stack area if VM range doesn't exist in STAG registration
#. __mcctrl_os_read_write_cpu_register: fix timeout
#. mbind: Use range_policy's numamask as priority on MPOL_BIND
#. migrate: Don't migrate on in-kernel interrupt
#. Send a signal to mcexec after switching to that process.
#. uti: fix syscall response being mis-consumed by __do_in_kernel_irq_syscall
#. uti: fix handling UTI_CPU_SET env
#. do_execveat: kill instead of panic when init_process_stack fails
#. remote_page_fault is handled by the offloaded thread.
#. coredump: fix behavior when gencore fails
#. xpmem: truncate the size of xpmem_attach at the page boundary (workaround for fjmpi)
#. __mcctrl_os_read_write_cpu_register: spin timeout in mcctrl_ikc_send_wait()

=============================================
Version 1.7.1 (Dec 23, 2020)
=============================================

----------------------
IHK major updates
----------------------
#. d5d5c23 Tofu: support for barrier gate
#. Tofu: proper cleanup of premapped DMA regions
#. Tofu: initial version
#. SMP: try with GFP_ATOMIC as well in mem reserve

------------------------
IHK major bug fixes
------------------------
#. ihklib: ihk(_os)_query_{cpu,mem}: allow to pass empty array
#. SMP: non compound page free and GFP_ATOMIC
#. ihk_get_num_os_instances: don't open /dev/mcdN
#. ihklib: ihk_create_os_str: fix variable prefix

----------------------
McKernel major updates
----------------------
#. straight map: creates a straight mapping that covers the whole physical memory and hands out virtual address ranges from it to mappings whose physical pages are allocated at map time
#. free-time, lazy, potentially Linux-side page-zeroing
#. Tofu built-in driver: supports memory registration and barrier gate setup
#. kmalloc cache

------------------------
McKernel major bug fixes
------------------------
#. mmap: return -EINVAL for non-anonymous, MAP_HUGETLB map
#. kernel: increase stack size
#. Tofu: proper cleanup of device files when mcexec gets killed

=============================================
Version 1.7.0 (Nov 25, 2020)
=============================================

----------------------
IHK major updates
----------------------
#. ihklib: add ihk_create_os_str
#. ihklib: ihk_reserve_mem: add capped best effort to balanced

------------------------
IHK major bug fixes
------------------------
#. make /dev/mcdN sharable
#. acpi: compat: RHEL-8.2
#. gic_chip_data: compat: RHEL-8.3

----------------------
McKernel major updates
----------------------
#. arm64: Contiguous PTE support
#. arm64: Scalable Vector Extension (SVE) support
#. arm64: PMU overflow interrupt support
#. arm64 port: Direct access to McKernel memory from Linux
#. arm64 port: utility thread offloading, which spawns a thread on a Linux CPU
#. eclair: support for live debug
#. Crash utility extension
#. Replace mcoverlayfs with a soft userspace overlay
#. Build system is switched to cmake
#. Core dump includes thread information
#. mcinspect and mcps: DWARF based LWK inspection

------------------------
McKernel major bug fixes
------------------------
#. shmobj: Fix rusage counting for large page
#. mcctrl control: task start_time changed to u64 nsec
#. mcctrl: add handling for one more level of page tables
#. Add kernel argument to turn on/off time sharing
#. flatten_string / process env: realign env and clear trailing bits
#. madvise: Add MADV_HUGEPAGE support
#. mcctrl: remove in-kernel calls to syscalls
#. arch_cpu_read_write_register: error return fix.
#. set_cputime(): interrupt enable/disable fix.
#. set_mempolicy(): Add mode check.
#. mbind(): Fix memory_range_lock deadlock.
#. ihk_ikc_recv: Record channel to packet for release
#. Add set_cputime() kernel to kernel case and mode enum.
#. execve: Call preempt_enable() before error-exit
#. memory/x86_64: fix linux safe_kernel_map
#. do_kill(): fix pids table when nr of threads is larger than num_processors
#. shmget: Use transparent huge pages when page size isn't specified
#. prctl: Add support for PR_SET_THP_DISABLE and PR_GET_THP_DISABLE
#. monitor_init: fix undetected hang on highest numbered core
#. init_process_stack: change premapped stack size based on arch
#. x86 syscalls: add a bunch of XXat() delegated syscalls
#. do_pageout: fix direct kernel-user access
#. stack: add hwcap auxval
#. perf counters: add arch-specific perf counters
#. Added check of nohost to terminate_host().
#. kmalloc: Fix address order in free list
#. sysfs: use nr_cpu_ids for cpumasks (fixes libnuma parsing error on ARM)
#. monitor_init: Use ihk_mc_cpu_info()
#. Fix ThunderX2 write-combined PTE flag insanity
#. ARM: eliminate zero page mapping (i.e., init_low_area())
#. eliminate futex_cmpxchg_enabled check (not used and dereffed a NULL pointer)
#. page_table: Fix return value of lookup_pte when ptl4 is blank
#. sysfs: add missing symlinks for cpu/node
#. Make Linux handler run when mmap to procfs.
#. Separate mmap area from program loading (relocation) area
#. move rusage into kernel ELF image (avoid dynamic alloc before NUMA init)
#. arm: turn off cpu on panic
#. page fault handler: protect thread accesses
#. Register PPD and release_handler at the same time.
#. fix missing mutual exclusion between terminate() and finalize_process().
#. perfctr_stop: add flags to no 'disable_intens'
#. fileobj, shmobj: free pages in object destructor (as opposed to page_unmap())
#. clear_range_l1, clear_range_middle: Fix handling contiguous PTE
#. do_mmap: don't pre-populate the whole file when asked for smaller segment
#. invalidate_one_page: Support shmobj and contiguous PTE
#. ubsan: fix undefined shifts
#. x86: disable zero mapping and add a boot pt for ap trampoline
#. rusage: Don't count PF_PATCH change
#. Fixed time processing.
#. copy_user_pte: vmap area not owned by McKernel
#. gencore: Zero-clear ELF header and memory range table
#. rpm: ignore CMakeCache.txt in dist and relax BuildRequires on cross build
#. gencore: Allocate ELF header to heap instead of stack
#. nanosleep: add cpu_pause() in spinwait loop
#. init_process: add missing initializations to proc struct
#. rus_vm_fault: always use a packet on the stack
#. process stack: use PAGE_SIZE in aux vector
#. copy_user_pte: base memobj copy on range & VR_PRIVATE
#. arm64: ptrace: Fix overwriting 1st argument with return value
#. page fault: use cow for private device mappings
#. reproducible builds: remove most install paths in c code
#. page fault: clear writable bit for non-dirtying access to shared ranges
#. mcreboot/mcstop+release: support for regular user execution
#. irqbalance_mck: replace extra service with service drop-in
#. do_mmap: give addr argument a chance even if not MAP_FIXED
#. x86: fix xchg() and cmpxchg() macros
#. IHK: support for using Linux work IRQ as IKC interrupt (optional)
#. MCS: fix ARM64 issue by using smp_XXX() functions (i.e., barrier()s)
#. procfs: add number of threads to stat and status
#. memory_range_lock: Fix deadlock in procfs/sysfs handler
#. flush instruction cache at context switch time if necessary
#. arm64: Fix PMU related functions
#. page_fault_process_memory_range: Disable COW for VM region with zeroobj
#. extend_process_region: Fall back to demand paging when not contiguous
#. munmap: fix deadlock with remote pagefault on vm range lock
#. procfs: if memory_range_lock fails, process later
#. migrate-cpu: Prevent migration target from calling schedule() twice
#. sched_request_migrate(): fix race condition between migration req and IRQs
#. get_one_cpu_topology: Renumber core_id (physical core id)
#. bb7e140 procfs cpuinfo: use sequence number as processor
#. set_host_vma(): do NOT read protect Linux VMA
#. hugefileobj: rewrite page allocation/handling
#. VM: use RW spinlock for vm_range_lock
#. /dev/shm: use Linux PFNs and populate mappings
#. Make struct ihk_os_rusage compatible with mckernel_rusage (workaround for Fugaku)
#. Record pthread routine address in clone(), keep helper threads on caller CPU core (workaround for Fugaku)
#. struct process: fix type of group_exit_status
#. tgkill: Fix argument validation
#. set_robust_list: Add error check
#. mcexec: Don't forward SIGTSTP SIGTTIN SIGTTOU to mckernel
#. syscall: add prlimit64
#. stack: grow on page fault
#. mcexec: use FLIB_NUM_PROCESS_ON_NODE when -n not specified (Fugaku specific)

===========================================
Version 1.6.0 (Nov 11, 2018)
===========================================

-----------------------------------------------
McKernel major updates
-----------------------------------------------
#. McKernel and Linux share one unified kernel virtual address space.
   That is, the McKernel sections reside in the Linux sections set aside
   for modules. In this way, Linux can access the McKernel kernel memory area.
#. hugetlbfs support
#. IHK is now included as a git submodule
#. Debug messages are turned on/off on a per-source-file basis at run time.
#. McKernel is prohibited from accessing physical memory ranges that Linux did not give to it.
#. UTI (capability to spawn a thread on Linux CPU) improvement:

   * System calls issued from the thread are hooked by modifying binary in memory.

---------------------------
McKernel major bug fixes
---------------------------
#<digits> below denotes the redmine issue number (https://postpeta.pccluster.org/redmine/).

1. #926: shmget: Hide object with IPC_RMID from shmget
2. #1028: init_process: Inherit parent cpu_set
3. #995: Fix shebang recorded in argv[0]
4. #1024: Fix VMAP virtual address leak
5. #1109: init_process_stack: Support "ulimit -s unlimited"
6. x86 mem init: do not map identity mapping
7. mcexec_wait_syscall: requeue potential request on interrupted wait
8. mcctrl_ikc_send_wait: fix interrupt with do_frees == NULL
9. pager_req_read: handle short read
10. kprintf: only call eventfd() if it is safe to interrupt
11. process_procfs_request: Add Pid to /proc/<PID>/status
12. terminate: fix oversubscribe hang when waiting for other threads on same CPU to die
13. mcexec: Do not close fd returned to mckernel side
14. #976: execve: Clear sigaltstack and fp_regs
15. #1002: perf_event: Specify counter by bit_mask on start/stop
16. #1027: schedule: Don't reschedule immediately when wake up on migrate
17. mcctrl: lookup unexported symbols at runtime
18. __sched_wakeup_thread: Notify interrupt_exit() of re-schedule
19. futex_wait_queue_me: Spin-sleep when timeout and idle_halt is specified
20. #1167: ihk_os_getperfevent,setperfevent: Timeout IKC sent by mcctrl
21. devobj: fix object size (POSTK_DEBUG_TEMP_FIX_36)
22. mcctrl: remove rus page cache
23. #1021: procfs: Support multiple reads of e.g. ``/proc/*/maps``
24. #1006: wait: Delay wake-up parent within switch context
25. #1164: mem: Check if phys-mem is within the range of McKernel memory
26. #1039: page_fault_process_memory_range: Remove ihk_mc_map_virtual for CoW of device map
27. partitioned execution: pass process rank to LWK
28. process/vm: implement access_ok()
29. spinlock: rewrite spinlock to use Linux ticket head/tail format
30. #986: Fix deadlock involving mmap_sem and memory_range_lock
31. Prevent one CPU from getting chosen by concurrent forks
32. #1009: check_signal: system call restart is done only once
33. #1176: syscall: the signal received during system call processing is not processed.
34. #1036: syscall_time: Handle by McKernel
35. #1165: do_syscall: Delegate system calls to the mcexec with the same pid
36. #1194: execve: Fix calling ptrace_report_signal after preemption is disabled
37. #1005: coredump: Exclude special areas
38. #1018: procfs: Fix pread/pwrite to procfs failing when the specified size is larger than 4MB
39. #1180: sched_setaffinity: Check migration after decrementing in_interrupt
40. #771, #1179, #1143: ptrace supports threads
41. #1189: procfs/do_fork: wait until procfs entries are registered
42. #1114: procfs: add '/proc/pid/stat' to mckernel side and fix its comm
43. #1116: mcctrl procfs: check entry was returned before using it
44. #1167: ihk_os_getperfevent,setperfevent: Return -ETIME when IKC times out
45. mcexec/execve: fix shebangs handling
46. procfs: handle 'comm' on mckernel side
47. ihk_os_setperfevent: Return number of registered events
48. mcexec: fix terminating zero after readlink()

===========================================
Version 1.5.1 (July 9, 2018)
===========================================

-----------------------------------------------
McKernel major updates
-----------------------------------------------

Watchdog timer to detect hang of McKernel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

mcexec prints out the following line to its stderr when a hang of McKernel is detected.

::

   mcexec detected hang of McKernel

The watchdog timer is enabled by passing the ``-i <timeout_in_sec>`` option to mcreboot.sh. ``<timeout_in_sec>`` specifies the interval, in seconds, at which McKernel is checked for liveness.

For example, specify ``-i 600`` to check for a hang every 10 minutes:

::

   mcreboot.sh -i 600

Hang detection proceeds as follows.

#. mcexec acquires an eventfd for notification from IHK and performs epoll() on it.
#. A daemon called ihkmond monitors the state of McKernel periodically, at the interval specified by the -i option. If the state has not changed since the last check, it judges that McKernel is hanging and notifies mcexec through the eventfd.
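
The two steps above can be sketched in C as follows. This is only a minimal illustration of the eventfd and epoll pattern, not the mcexec or ihkmond source: the function names ``notify_hang()`` and ``wait_for_hang()`` are hypothetical, and the caller is assumed to create the eventfd itself.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Monitor side (stand-in for ihkmond): report a hang by adding 1 to the
 * eventfd counter. Returns 0 on success, -1 on error. */
int notify_hang(int efd)
{
    uint64_t one = 1;

    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Waiter side (stand-in for mcexec): block in epoll until the eventfd
 * becomes readable or the timeout expires.
 * Returns 1 if a hang was reported, 0 on timeout, -1 on error. */
int wait_for_hang(int efd, int timeout_ms)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = efd };
    struct epoll_event out;
    uint64_t count;
    int ret = -1;
    int epfd = epoll_create1(0);

    if (epfd < 0)
        return -1;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev) == 0) {
        int n = epoll_wait(epfd, &out, 1, timeout_ms);

        if (n == 0)
            ret = 0;   /* nothing reported within the timeout */
        else if (n == 1 &&
                 read(efd, &count, sizeof(count)) == (ssize_t)sizeof(count))
            ret = 1;   /* reading also resets the eventfd counter */
    }
    close(epfd);
    return ret;
}
```

In this sketch a caller would create the descriptor with ``eventfd(0, 0)``, hand it to the monitoring side, and call ``wait_for_hang()`` from the thread that should print the hang message.
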

---------------------------
McKernel major bug fixes
---------------------------
1. #1146: pager_req_map(): do not take mmap_sem if not needed
2. #1135: prepare_process_ranges_args_envs(): fix saving cmdline
3. #1144: fileobj/devobj: record path name
4. #1145: fileobj: use MCS locks for per-file page hash
5. #1076: mcctrl: refactor prepare_image into new generic ikc send&wait
6. #1072: execve: fix execve with oversubscribing
7. #1132: execve: use thread variable instead of cpu_local_var(current)
8. #1117: mprotect: do not set page table writable for cow pages
9. #1143: syscall wait4: add _WALL (POSTK_DEBUG_ARCH_DEP_44)
10. #1064: rusage: Fix initialization of rusage->num_processors
11. #1133: pager_req_unmap: Put per-process data at exit
12. #731: do_fork: Propagate error code returned by mcexec
13. #1149: execve: Reinitialize vm_regions's map area on execve
14. #1065: procfs: Show file names in /proc/<PID>/maps
15. #1112: mremap: Fix type of size arguments (from ssize_t to size_t)
16. #1121: sched_getaffinity: Check arguments in the same order as in Linux
17. #1137: mmap, mremap: Check arguments in the same order as in Linux
18. #1122: fix return value of sched_getaffinity
19. #732: fix: /proc/<PID>/maps outputs an unnecessary NUL character

===================================
Version 1.5.0 (Apr 5, 2018)
===================================

--------------------------------------
McKernel major updates
--------------------------------------
1. Aid for Linux version migration: Detect /proc, /sys format change
   between two kernel versions
2. Swap out

   * Only swap-out anonymous pages for now

3. Improve support of /proc/maps
4. mcstat: Linux tool to show resource usage

---------------------------
McKernel major bug fixes
---------------------------
#. #727: execve: Fix memory leak when receiving SIGKILL
#. #829: perf_event_open: Support PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE
#. #906: mcexec: Check return code of fork()
#. #1038: mcexec: Timeout when incorrect value is given to -n option
#. #943 #945 #946 #960 #961: mcexec: Support strace
#. #1029: struct thread is not released with stress-test involving signal and futex
#. #863 #870: Respond immediately to terminating signal when offloading system call
#. #1119: translate_rva_to_rpa(): use 2MB blocks in 1GB pages on x86
#. #898: Shutdown OS only after no in-flight IKC exist
#. #882: release_handler: Destroy objects as the process which opened it
#. #882: mcexec: Make child process exit if the parent is killed during fork()
#. #925: XPMEM: Don't destroy per-process object of the parent
#. #885: ptrace: Support the case where a process attaches its child
#. #1031: sigaction: Support SA_RESETHAND
#. #923: rus_vm_fault: Return error when a thread not performing system call offloading causes remote page fault
#. #1032 #1033 #1034: getrusage: Fix ru_maxrss, RUSAGE_CHILDREN, ru_stime related bugs
#. #1120: getrusage: Fix deadlock on thread->times_update
#. #1123: Fix deadlock related to wait_queue_head_list_node
#. #1124: Fix deadlock of calling terminate() from terminate()
#. #1125: Fix deadlock related to thread status

   * Related functions are: hold_thread(), do_kill() and terminate()

#. #1126: uti: Fix uti thread on the McKernel side blocks others in do_syscall()
#. #1066: procfs: Show Linux /proc/self/cgroup
#. #1127: prepare_process_ranges_args_envs(): fix generating saved_cmdline to avoid PF in strlen()
#. #1128: ihk_mc_map/unmap_virtual(): do proper TLB invalidation
#. #1043: terminate(): fix update_lock and threads_lock order to avoid deadlock
#. #1129: mcreboot.sh: Save ``/proc/irq/*/smp_affinity`` to ``/tmp/mcreboot``
#. #1130: mcexec: drop READ_IMPLIES_EXEC from personality

--------------------
McKernel workarounds
--------------------
#. Forbid CPU oversubscription

   * It can be turned on by the ``-O`` option of mcreboot.sh

===================================
Version 1.4.0 (Oct 30, 2017)
===================================

-----------------------------------------------------------
Abstracted event type support in perf_event_open()
-----------------------------------------------------------

The PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE event types are supported.
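
As a rough illustration, the sketch below fills a ``perf_event_attr`` for one abstracted hardware event and opens it through the standard Linux ``perf_event_open(2)`` system call. The helper names ``init_hw_attr()`` and ``open_counter()`` are hypothetical, the ``exclude_*`` settings are assumptions chosen for the example, and which attribute fields McKernel honours is not stated in these notes.

```c
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fill a perf_event_attr for an abstracted (generic) hardware event. */
void init_hw_attr(struct perf_event_attr *attr, unsigned long long config)
{
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_HARDWARE;
    attr->size = sizeof(*attr);
    attr->config = config;      /* e.g. PERF_COUNT_HW_CPU_CYCLES */
    attr->disabled = 1;         /* start stopped; enable later via ioctl() */
    attr->exclude_kernel = 1;   /* assumption: count user-space only */
    attr->exclude_hv = 1;
}

/* glibc provides no wrapper for perf_event_open(), so use syscall().
 * Returns a file descriptor, or -1 with errno set. */
int open_counter(struct perf_event_attr *attr)
{
    return (int)syscall(SYS_perf_event_open, attr,
                        0 /* calling thread */, -1 /* any CPU */,
                        -1 /* no group leader */, 0UL /* no flags */);
}
```

A caller would typically pass ``PERF_COUNT_HW_CPU_CYCLES`` or ``PERF_COUNT_HW_INSTRUCTIONS`` as ``config``, then enable the counter with ``ioctl(fd, PERF_EVENT_IOC_ENABLE, 0)`` and read the 64-bit count with ``read()``.
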

----------------------------------
Direct user-space access
----------------------------------
Code lines using direct user-space access (e.g. passing a user-space
pointer to memcpy()) have been made more portable across processor
architectures. The modifications follow these rules:

1. Move the code section as is to the architecture-dependent
   directory if it is part of the critical path.
2. Otherwise, rewrite the code section using the portable methods.
   The methods include copy_from_user(), copy_to_user(),
   pte_get_phys() and phys_to_virt().

--------------------------------
MPI and OpenMP micro-bench tests
--------------------------------
The performance figures of MPI and OpenMP primitives are compared with
those of Linux by using Intel MPI Benchmarks and EPCC OpenMP Micro
Benchmark.

===================================
Version 1.3.0 (Sep 30, 2017)
===================================

--------------------
Kernel dump
--------------------
#. A dump level of "only kernel memory" is added.

   The following two levels are available now:

   +--+-----------------------+
   | 0|Dump all               |
   +--+-----------------------+
   |24|Dump only kernel memory|
   +--+-----------------------+

   The dump level can be set by the -d option of ihkosctl or by the argument
   for ihk_os_makedumpfile(), as shown in the following examples:

   ::

      Command:        ihkosctl 0 dump -d 24
      Function call:  ihk_os_makedumpfile(0, NULL, 24, 0);

#. Dump file is created when Linux panics.

   The dump level can be set by the dump_level kernel argument, as shown in the
   following example:

   ::

      ihkosctl 0 kargs "hidos dump_level=24"

   The IHK dump function is registered to panic_notifier_list when creating /dev/mcdX and called when Linux panics.

-----------------------------
Quick Process Launch
-----------------------------

MPI process launch time and some of the initialization time can be
reduced in applications consisting of multiple MPI programs which are
launched in turn in the job script.

The following two steps should be performed to use this feature:

#. Replace mpiexec with ql_mpiexec_start and add some lines for ql_mpiexec_finalize in the job script
#. Modify the app so that it can repeat calculations and wait for the instructions from ql_mpiexec_{start,finalize} at the end of the loop

The first step is explained using an example. Assume the original job script looks like this:

.. code-block:: none

   # Execute ensemble simulation and then data assimilation, and repeat this ten times
   for i in {1..10}; do

       # Each ensemble simulation execution uses 100 nodes, launch ten of them in parallel
       for j in {1..10}; do
           mpiexec -n 100 -machinefile ./list1_$j p1.out a1 & pids[$j]=$!;
       done

       # Wait until the ten ensemble simulation programs finish
       for j in {1..10}; do wait ${pids[$j]}; done

       # Launch one data assimilation program using 1000 nodes
       mpiexec -n 1000 -machinefile ./list2 p2.out a2
   done

The job script should be modified like this:

.. code-block:: none

   for i in {1..10}; do
       for j in {1..10}; do
           # Replace mpiexec with ql_mpiexec_start
           ql_mpiexec_start -n 100 -machinefile ./list1_$j p1.out a1 & pids[$j]=$!;
       done

       for j in {1..10}; do wait ${pids[$j]}; done

       ql_mpiexec_start -n 1000 -machinefile ./list2 p2.out a2
   done

   # p1.out and p2.out don't exit but are waiting for the next calculation. So tell them to exit
   for j in {1..10}; do
       ql_mpiexec_finalize -machinefile ./list1_$j p1.out a1;
   done

   ql_mpiexec_finalize -machinefile ./list2 p2.out a2;

The second step is explained using pseudo-code.

.. code-block:: none

   MPI_Init();
   Prepare data exchange with preceding / following MPI programs
   loop:
       foreach Fortran module
           Initialize data using command-line arguments, parameter files, environment variables
       Input data from preceding MPI programs / Read snap-shot
       Perform main calculation
       Output data to following MPI programs / Write snap-shot
       /* ql_client() waits for command of ql_mpiexec_{start,finalize} */
       if (ql_client() == QL_CONTINUE) { goto loop; }
   MPI_Finalize();

qlmpilib.h should be included in the code and libql{mpi,fort}.so should be linked to the executable file.
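
The control flow of the second step can be sketched in plain C as below. This is a stand-alone illustration that does not link against the real library: ``ql_client_stub()``, ``QL_EXIT``, ``run_once()`` and ``resident_main_loop()`` are hypothetical stand-ins, the numeric values of the constants are invented, and the actual ``ql_client()`` declaration should be taken from qlmpilib.h.

```c
#include <stdio.h>

/* Hypothetical stand-ins; the real constants come from qlmpilib.h. */
enum { QL_EXIT = 0, QL_CONTINUE = 1 };

/* Pretend ql_mpiexec_start is issued twice more before ql_mpiexec_finalize. */
static int remaining_runs = 2;

/* Stub for ql_client(): the real call blocks until the next command. */
int ql_client_stub(void)
{
    return remaining_runs-- > 0 ? QL_CONTINUE : QL_EXIT;
}

/* One pass of the application: initialize, read input, compute, write output. */
void run_once(int iteration)
{
    printf("calculation %d\n", iteration);
}

/* Repeat the calculation until the launcher tells the program to exit.
 * Returns the number of calculations performed. */
int resident_main_loop(void)
{
    int iterations = 0;

    /* MPI_Init() and one-time setup would go here. */
    do {
        run_once(iterations);
        iterations++;
    } while (ql_client_stub() == QL_CONTINUE);
    /* MPI_Finalize() would go here. */

    return iterations;
}
```

The point of the structure is that MPI initialization happens once, outside the loop, so each later ql_mpiexec_start only pays for the calculation itself.
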