03fed4d1c8
automatically generate hfi structs from dwarf info
2018-06-13 00:31:38 +09:00
6279f69f5c
compiler.h: take in recent linux updates for newer gcc support
...
Had to remove from original compiler-gcc:
- things that deal with types, e.g. READ_ONCE macro and friends;
- #define barrier(). This one would be better there at some point.
hfi1: remove ACCESS_ONCE from hfi1 header
2018-06-13 00:31:38 +09:00
6959d5ead4
HFI: port to SFI driver version 10.5.1.0.2
2018-06-13 00:31:38 +09:00
a5aa68744f
hfi1: use kmalloc_cache for tid_rb_node allocations
2018-06-13 00:31:38 +09:00
89c5aaa9e9
hfi1_user_exp_rcv_setup(): rewrite main loop
2018-06-13 00:31:37 +09:00
15422d886f
hfi1_file_ioctl(): use dkprintf()
2018-06-13 00:31:37 +09:00
f139bef0cb
mmap(): remove force large page extension (meant to be RESET)
2018-06-13 00:31:37 +09:00
de82cf8779
hfi1/user_exp_rcv/setup: keep track of position within page
...
ihk_mc_pt_lookup_pte + pte_get_phys will get us the physical address
for the start of the page we're looking at.
Re-offset it by position within buffer.
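
The re-offsetting step can be sketched as follows. This is only an illustration: `phys_at` and its parameters are hypothetical names, and `page_phys` stands in for whatever the PTE lookup returned for the start of the page.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the driver.
 * page_phys: physical address of the start of the page containing vaddr.
 * pgsize:    size of that page, assumed to be a power of two.
 * Returns the physical address that corresponds to vaddr itself.
 */
static uint64_t phys_at(uint64_t page_phys, uint64_t vaddr, uint64_t pgsize)
{
    assert(pgsize != 0 && (pgsize & (pgsize - 1)) == 0);
    /* Offset of vaddr within its page, added back onto the page start. */
    return page_phys + (vaddr & (pgsize - 1));
}
```

For a 4k page the offset is the low 12 bits of the virtual address; for a 2MB page it is the low 21 bits.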
2018-06-13 00:31:37 +09:00
662895c020
hfi1/user_exp_rcv: explicitly call hfi1_map_device_addresses
...
There were cases where nobody else did this mapping for us
2018-06-13 00:31:37 +09:00
d23939da8c
process/vm: fix lookup_process_memory_range (again)
...
Optimistically going left was a more serious bug than a problem with
the last iteration only: we could pass by a match and continue down
the tree if the match was not a leaf.
Fix the actual algorithm issue.
Conflicts:
kernel/process.c
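
A minimal sketch of a lookup that does not lose a non-leaf match, assuming a binary search tree of non-overlapping half-open ranges and assuming the caller wants the lowest-addressed range that intersects the query. The `range_node` type and `lookup_range` function are hypothetical and are not the code from kernel/process.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type: a half-open range [start, end) in a search tree. */
struct range_node {
    unsigned long start, end;
    struct range_node *left, *right;
};

/*
 * Return the lowest-addressed node intersecting [start, end), or NULL.
 * A matching node is remembered before descending left, so continuing
 * down the tree can only replace it with an earlier match, never lose it.
 */
static struct range_node *lookup_range(struct range_node *root,
                                       unsigned long start, unsigned long end)
{
    struct range_node *match = NULL;
    struct range_node *n = root;

    while (n != NULL) {
        if (end <= n->start) {
            n = n->left;            /* query lies entirely below this node */
        } else if (start >= n->end) {
            n = n->right;           /* query lies entirely above this node */
        } else {
            match = n;              /* overlap: keep it */
            n = n->left;            /* an earlier overlap may still exist */
        }
    }
    return match;
}
```

The important case is a query that hits an inner node: the walk may keep going left, but the remembered match is still returned if nothing earlier overlaps.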
2018-06-13 00:31:37 +09:00
67529f21ff
hfi1: replace true/false defines by stddef include
2018-06-13 00:31:37 +09:00
5c11ff0950
process/vm: fix lookup_process_memory_range with small start address
...
Cherry-picked from 6370520e
Conflicts:
kernel/process.c
2018-06-13 00:31:37 +09:00
ce4eb0d409
hfi1/user_exp_rcv/setup: add access_ok check
2018-06-13 00:31:36 +09:00
04434320fc
hfi1/user_exp_rcv/setup: do not skip over pages
...
If the vaddr we consider is not at the start of a page, we could skip
over (smaller, not contiguous) areas.
For example consider this segment of virtual memory:
[ 2MB | 4k | 4k | ... ]
Starting at 1MB offset, we would get a pgsize of 2MB so would skip
straight over 1MB worth of 4k pages.
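
The step size that avoids the skip can be sketched like this. `bytes_left_in_page` is a hypothetical helper, and the page size is assumed to be a power of two: the walk advances only to the end of the current page instead of by the full page size.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the driver.
 * Number of bytes from vaddr up to the end of the page that contains it.
 * Advancing by this amount lands exactly on the next page boundary.
 */
static uint64_t bytes_left_in_page(uint64_t vaddr, uint64_t pgsize)
{
    assert(pgsize != 0 && (pgsize & (pgsize - 1)) == 0);
    return pgsize - (vaddr & (pgsize - 1));
}
```

In the example above, starting 1MB into the 2MB page gives a step of 1MB, so the next iteration begins at the first 4k page rather than 1MB past it.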
2018-06-13 00:31:36 +09:00
50fafa6d71
hfi1/user_exp_rcv/setup: use cache_alloc for tidlist
2018-06-13 00:31:36 +09:00
f5ced648ef
hfi1/user_exp_rcv: rework main loop
...
New loop now takes into account pages not physically contiguous.
Also some minor improvements, e.g. make the spin_lock used more locally,
reuse a group we had if we had one, etc.
2018-06-13 00:31:36 +09:00
0f8f88ca46
hfi1/user_exp_rcv/invalid: Remove function
...
user_exp_rcv_invalid is only used together with the mmu cache
(its purpose is the delayed freeing of tids that were invalidated in cache)
Since we do not use that cache, the function can go
2018-06-13 00:31:36 +09:00
e99f19e812
hfi1/user_exp_rcv/setup: set length in tidinfo
...
This was dropped early on by mistake/excessive haste, it's actually
pretty useful.
2018-06-13 00:31:36 +09:00
9a36e5d213
hfi1/user_exp_rcv/setup: increment phys appropriately
...
Old code was always registering the same section with different size,
instead of properly covering the requested map
2018-06-13 00:31:36 +09:00
4816f27639
hfi1/user_exp_rcv/setup: split into multiple tids
...
Do not round up to next power of two, but issue multiple requests
if necessary (e.g. 260k would be 256k + 4k in two registrations)
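
One way to picture the split, as a sketch only: take the largest power of two that fits, register it, and repeat on the remainder. Both helpers below are hypothetical, and any hardware limit on the size of a single registration is ignored here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: largest power of two that is <= n (n must be > 0). */
static size_t largest_pow2_le(size_t n)
{
    size_t p = 1;

    assert(n > 0);
    while (p <= n / 2)
        p *= 2;
    return p;
}

/*
 * Split len bytes into power-of-two chunks, largest first.
 * Stores the chunk sizes in out[] and returns how many were stored,
 * or 0 if more than max chunks would be needed.
 */
static size_t split_pow2(size_t len, size_t *out, size_t max)
{
    size_t n = 0;

    assert(out != NULL);
    while (len > 0) {
        size_t chunk = largest_pow2_le(len);

        if (n == max)
            return 0;
        out[n++] = chunk;
        len -= chunk;
    }
    return n;
}
```

For 260k this yields two chunks, 256k and 4k, where rounding up to the next power of two would have registered 512k.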
2018-06-13 00:31:36 +09:00
9c0b8aa812
mcctrl/control.c: fix debug print types
2018-06-13 00:31:36 +09:00
23f178d718
hfi1/user_exp_rcv/clear: implement TID_FREE ioctl
2018-06-13 00:31:36 +09:00
159c18b98b
hfi1/ioctl: only forward ioctl if hfi1_file_ioctl didn't handle it
...
Conflicts:
kernel/syscall.c
2018-06-13 00:31:35 +09:00
1847a3ac11
hfi1/user_exp_rcv/setup: cleanup locks/groups usage
2018-06-13 00:31:35 +09:00
15b16ffbbb
hfi1/user_exp_rcv/setup: map is noop, skip it
...
In the original driver's dma.c hfi1_dma_map_single just passes
the physical address back, so directly use that.
2018-06-13 00:31:35 +09:00
e64d89cd48
hfi: bases for user_exp_rcv
...
This implements a skeleton setup function and calls it on ioctl.
Many missing points:
- missing pci mapping to make setup work
- no clear (passed to linux, so will likely bug out)
- missing locks/safe-guards
Conflicts:
kernel/Makefile.build.in
2018-06-13 00:31:35 +09:00
7366da4390
Fix other warnings
...
Most were harmless, but the change to ACCESS_ONCE from volatile
cast is probably useful.
Expanding macro, we basically went from:
m = (volatile struct sdma_vl_map *)dd->sdma_map;
to
m = *(volatile struct sdma_vl_map **)&(dd->sdma_map);
i.e. the explicit lookup is at a different level.
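
The difference between the two forms can be tried out in isolation. The sketch below uses the ACCESS_ONCE definition found in older Linux kernels (it relies on the GNU typeof extension); the two structures are minimal stand-ins with an illustrative field, not the driver's real layouts.

```c
#include <assert.h>
#include <stddef.h>

/* Definition as in older Linux kernels; needs GCC/Clang typeof support. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Stand-in structures for illustration only. */
struct sdma_vl_map { int mask; };
struct hfi1_devdata { struct sdma_vl_map *sdma_map; };

/*
 * Cast applied to the loaded value: dd->sdma_map is read with an
 * ordinary load, and only the pointed-to object is treated as volatile.
 */
static struct sdma_vl_map *read_plain(struct hfi1_devdata *dd)
{
    volatile struct sdma_vl_map *m = (volatile struct sdma_vl_map *)dd->sdma_map;

    return (struct sdma_vl_map *)m;
}

/*
 * Cast applied to the address of the field: the load of dd->sdma_map
 * itself goes through a volatile lvalue, so it cannot be merged or repeated.
 */
static struct sdma_vl_map *read_once(struct hfi1_devdata *dd)
{
    return ACCESS_ONCE(dd->sdma_map);
}
```

Both functions return the same pointer; what changes is which memory access the compiler must treat as volatile. With this typeof-based definition the qualifier ends up on the pointer field itself, which is the level that matters when another CPU may replace the map.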
2018-06-13 00:31:35 +09:00
2dc85ee417
user_sdma: fix use of uninitialized variable (vl)
...
This defines a single field in hfi1_pportdata, getting offset
from dwarf headers -- need to compute that at configure time
2018-06-13 00:31:35 +09:00
73cc07f98e
ioctl() investigation - TO RESET
2018-06-13 00:31:35 +09:00
815e2244ca
HFI1: minor change of declarations
2018-06-13 00:31:34 +09:00
163af73554
HFI1: properly iterate iovecs according to underlying page sizes
2018-06-13 00:31:34 +09:00
fd316f3ca3
HFI1: pass per-CPU txreq_cache to user_sdma_send_pkts()
2018-06-13 00:31:34 +09:00
122588bc4d
mcexec: --enable-hfi1 to runtime enable/disable HFI1 driver
...
Conflicts:
executer/user/mcexec.c
2018-06-13 00:31:34 +09:00
70238982c2
HFI1: use embedded kmalloc cache for req->tids (fixes AllReduce hang)
2018-06-13 00:31:34 +09:00
5b5191ef64
HFI1: move txreq kmalloc cache header into CPU local variable
2018-06-13 00:31:34 +09:00
a65faeaed4
kmalloc cache: embed cache pointer into kmalloc_header
...
Conflicts:
kernel/mem.c
2018-06-13 00:31:34 +09:00
4dea1842e0
kmalloc cache: embed cache pointer into kmalloc_header
...
Conflicts:
kernel/mem.c
2018-06-13 00:31:34 +09:00
5353b11f90
HFI1: disable kmalloc cache for req->tids (AllReduce fails otherwise)
2018-06-13 00:31:34 +09:00
abdbf96254
HFI1: use process rank for SDMA engine selection
2018-06-13 00:31:33 +09:00
bd170e63ba
kmalloc cache refactor and pre-alloc in HFI1 open()
2018-06-13 00:31:33 +09:00
d35fa16417
HFI1: more detailed profiling (disabled by default)
2018-06-13 00:31:33 +09:00
6406a0df6b
HFI1: compute SDMA pkt length taking large pages into account
2018-06-13 00:31:33 +09:00
52e8f03b4b
HFI1: store base physical address in iovec if physically contiguous
2018-06-13 00:31:33 +09:00
b071a3f32c
HFI1: use fast_memcpy() in header fillings
...
Conflicts:
kernel/user_sdma.c
2018-06-13 00:31:33 +09:00
90258f00bd
HFI1: use generic kmalloc cache for user_sdma_txreqs and req tids
2018-06-13 00:31:33 +09:00
28eb649056
Generic lock-free kmalloc cache implementation
...
Conflicts:
kernel/mem.c
2018-06-13 00:31:33 +09:00
744ebacf65
HFI1: more pre-allocation in txreq cache
2018-06-13 00:31:33 +09:00
62e438a0aa
HFI1: do device ioremap() mappings in per-process fashion
2018-06-13 00:31:32 +09:00
5ac582a678
user_sdma_send_pkts(): unlikely() around slow path condition
2018-06-13 00:31:32 +09:00
51bc28acca
sdma_select_user_engine(): hash on CPU number
2018-06-13 00:31:32 +09:00