Summary: 93 instances, 63 unique

Count  Text
-----  ----
    1  // TODO just do a qucikselect
    1  // FIXME: convert to int32_t everywhere?
    3  // FIXME: we should adjust queryTileSize to deal with this, since
    1  # TODO: once deprecated classes are removed, remove the dict and just use .lower() below
    1  /// FIXME when MSB of uint16 is set
    1  // FIXME: maybe also consider offset in bytes? multiply by sizeof(T)?
    1  // FIXME: speed up
    4  // FIXME: this is a non-coalesced, unaligned, non-vectorized load
    1  // FIXME: what to use for managed memory?
    1  // TODO: check as thoroughfully for other index types
    1  * TODO: in this file, the read functions that encouter errors may
    1  // FIXME: type-specific abs
    2  // FIXME: we might ultimately be calling this function with inputs
    2  // TODO: parallelize?
    1  // TODO shrink global storage if needed
    4  // FIXME: compiler doesn't like this expression? compiler bug?
    3  // TODO find a better name
    1  if (with_id_map) { // FIXME test on q_map instead
    1  # FIXME: no rev_swig_ptr equivalent for torch.Tensor, just convert
    2  // FIXME: is this a CUDA 9 compiler bug?
    1  // FIXME: inherit our same device
    1  // FIXME: make sure there are no outstanding memory allocations?
    1  // FIXME jhj convert to _n version
    3  // FIXME: if we have exact multiples, don't need this
    1  : dis > radius; // TODO templatize to remove this test
    1  // TODO: Support big endian (currently supporting only little endian)
    1  // FIXME: probably impractical for large # of dims?
    2  // FIXME: 2 different float16 options?
    1  // FIXME: really this can be into pinned memory and a true async
    1  // FIXME: tune
    1  nbit -= 8; // TODO remove nbit
    1  // FIXME: there could be overflow here, but where should we check this?
    4  // FIXME: this is a non-coalesced load
    1  // FIXME: faiss CPU uses +/-FLT_MAX instead of +/-infinity
    1  // TODO check nb of bytes written
    1  // FIXME: stride with threads instead of single thread
    1  // FIXME: some weird CUDA 11 bug? where cublasSgemmEx on
    1  // FIXME: we cannot use temporary memory for new requests because
    1  /// FIXME: the output distances must fit in GPU memory
    4  // FIXME: some issue with getLaneId() and CUDA 10.1 and P4 GPUs?
    1  // TODO should not need stable
    2  // FIXME: assumes that nothing is currently running on the sub-indexes, which is
    1  // FIXME: why are we doing this?
    2  // FIXME: as of CUDA 11, a memory allocation error appears to be
    1  # TODO: all result manipulations are in python, should move to C++ if perf
    3  // FIXME: this is a non-coalesced, unaligned, 2-vectorized load
    1  // FIXME: try always making this centroid id 0 so we can
    1  // TODO parallelize
    1  # TODO check class name
    1  // FIXME: optimize with a dedicated kernel
    1  // FIXME: inherit our same device
    2  // FIXME: is this a CUDA 9 compiler bug?
    1  // FIXME: why does getLaneId() not work when we write out below!?!?!
    2  // FIXME jhj: kernel for copy
    3  // FIXME: GPUize more of this
    1  // FIXME: type-specific abs()
    1  // FIXME: this is also slow, since we have to recover the
    1  // FIXME: investigate loading separately, so we don't need this
    1  // TODO: make tree of partial sums
    1  // FIXME: make a SSE version
    1  // TODO find a way to provide the nprobes together to do a matmul
    1  // FIXME avoid a second pass over the array to sample the threshold
    1  // FIXME: parameterize based on algorithm need