facebookresearch / faiss
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim to keep duplicated code as low as possible (below 5%), since a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
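The rule described above (clean the code, then look for runs of 6 or more identical lines) can be sketched as a sliding-window comparison. This is a minimal illustration, not the analysis tool's actual implementation: the function names `clean` and `duplicated_line_count` and the prefix-based cleaning rules are assumptions made for the example.

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines

# Hypothetical cleaning rules (assumption): the report does not list the
# exact constructs the real tool strips, only "comments" and "imports".
SKIP_PREFIXES = ("#", "//", "import ", "from ")


def clean(source: str) -> list[str]:
    """Strip whitespace and drop empty lines, comment lines, and imports."""
    lines = (line.strip() for line in source.splitlines())
    return [ln for ln in lines if ln and not ln.startswith(SKIP_PREFIXES)]


def duplicated_line_count(files: dict[str, str],
                          min_block: int = MIN_BLOCK) -> tuple[int, int]:
    """Return (duplicated cleaned lines, total cleaned lines) across files."""
    cleaned = {name: clean(text) for name, text in files.items()}

    # Map every window of `min_block` consecutive cleaned lines to the
    # places where it occurs.
    windows = defaultdict(list)
    for name, lines in cleaned.items():
        for i in range(len(lines) - min_block + 1):
            windows[tuple(lines[i:i + min_block])].append((name, i))

    # A line counts as duplicated if any window covering it occurs twice.
    duplicated = set()
    for positions in windows.values():
        if len(positions) > 1:
            for name, start in positions:
                duplicated.update((name, start + k) for k in range(min_block))

    total = sum(len(lines) for lines in cleaned.values())
    return len(duplicated), total
```

Because overlapping windows are merged through the set of line positions, a 70-line duplicate is counted as 70 duplicated lines per copy rather than as 65 separate 6-line matches.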
Duplication Overall
  • 21% duplication:
    • 63,922 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 13,747 duplicated lines
  • 11,179 duplicates
system: 21% (13,747 lines)
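The overall figure follows directly from the two line counts reported above. The short calculation below is only a worked check; the helper name `duplication_percent` is made up for the example, and the observation that the report seems to drop the fractional part when displaying 21% is an inference from the numbers, not something the report states.

```python
def duplication_percent(duplicated_lines: int, cleaned_lines: int) -> float:
    """Share of cleaned lines that belong to at least one duplicate, in percent."""
    return 100.0 * duplicated_lines / cleaned_lines


# Figures taken from the report: 13,747 duplicated lines out of 63,922 cleaned lines.
pct = duplication_percent(13_747, 63_922)
print(f"{pct:.1f}%")  # prints 21.5%
```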
Duplication per Extension
cuh: 38% (4,258 lines)
cpp: 13% (3,622 lines)
cu: 32% (3,233 lines)
h: 15% (1,310 lines)
py: 16% (1,050 lines)
bash: 17% (104 lines)
c: 69% (92 lines)
yaml: 50% (78 lines)
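As a consistency check, the per-extension duplicated line counts listed above can be summed and compared with the system-wide total of 13,747. The dictionary below simply restates the report's figures; the variable names are illustrative.

```python
# Duplicated lines per file extension, copied from the report.
duplicated_by_extension = {
    "cuh": 4_258,
    "cpp": 3_622,
    "cu": 3_233,
    "h": 1_310,
    "py": 1_050,
    "bash": 104,
    "c": 92,
    "yaml": 78,
}

total = sum(duplicated_by_extension.values())
print(total)  # prints 13747, matching the system-wide duplicated line count

# Share of all duplicated lines that sit in GPU sources (.cu and .cuh).
gpu_share = (duplicated_by_extension["cu"] + duplicated_by_extension["cuh"]) / total
print(f"{gpu_share:.0%}")
```

The breakdown partitions the total exactly, so each duplicated line is attributed to one extension only.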
Duplication per Component (primary)
faiss/gpu: 32% (7,713 lines)
faiss: 15% (2,075 lines)
faiss/utils: 21% (1,273 lines)
faiss/impl: 9% (919 lines)
contrib: 21% (307 lines)
benchs/distributed_ondisk: 18% (198 lines)
tutorial/cpp: 74% (196 lines)
faiss/python: 16% (178 lines)
benchs: 11% (175 lines)
benchs/bench_all_ivf: 10% (156 lines)
c_api: 8% (135 lines)
benchs/link_and_code: 22% (116 lines)
faiss/invlists: 7% (109 lines)
c_api/gpu: 10% (46 lines)
tutorial/python: 37% (42 lines)
conda/faiss-gpu: 50% (39 lines)
conda/faiss: 53% (39 lines)
c_api/utils: 31% (31 lines)
cmake: 0% (0 lines)
conda: 0% (0 lines)
c_api/impl: 0% (0 lines)

Duplication Between Components (50+ lines)

Component pairs and the number of duplicated lines they share:
  • benchs/distributed_ondisk -- contrib: 306
  • benchs/distributed_ondisk -- benchs/link_and_code: 86
  • contrib -- faiss/python: 64
  • faiss -- faiss/impl: 286
  • faiss -- faiss/utils: 157
  • faiss/impl -- faiss/utils: 92
  • benchs -- benchs/distributed_ondisk: 62
  • benchs -- benchs/link_and_code: 90
  • benchs/link_and_code -- contrib: 50
  • c_api -- faiss: 70
  • c_api -- c_api/gpu: 82
  • conda/faiss-gpu -- conda/faiss: 78
  • benchs/bench_all_ivf -- benchs/link_and_code: 76

Longest Duplicates
The list of 20 longest duplicates.
Size | # | Folders | Files | Lines
70 | x 2 | benchs/distributed_ondisk; contrib | rpc.py; rpc.py | 19:146 (48%); 20:145 (47%)
62 | x 2 | faiss/gpu/impl; faiss/gpu/impl | IVFFlatScan.cu; PQScanMultiPassPrecomputed.cu | 405:476 (13%); 613:684 (9%)
62 | x 2 | faiss/gpu/impl; faiss/gpu/impl | PQScanMultiPassNoPrecomputed-inl.cuh; PQScanMultiPassPrecomputed.cu | 423:486 (9%); 439:502 (9%)
57 | x 2 | faiss/gpu/impl; faiss/gpu/impl | Distance.cu; GeneralDistance.cuh | 185:252 (9%); 294:361 (15%)
56 | x 2 | faiss/gpu/utils; faiss/gpu/utils | Tensor.cuh; Tensor.cuh | 421:486 (9%); 557:622 (9%)
54 | x 2 | faiss/gpu/perf; faiss/gpu/perf | PerfIVFFlat.cu; PerfIVFPQ.cu | 87:156 (43%); 97:166 (41%)
50 | x 2 | faiss/gpu/impl; faiss/gpu/impl | IVFUtilsSelect1.cu; IVFUtilsSelect2.cu | 113:168 (33%); 179:234 (24%)
38 | x 2 | faiss/gpu/impl; faiss/gpu/impl | IVFFlatScan.cu; PQScanMultiPassNoPrecomputed-inl.cuh | 433:476 (8%); 650:693 (5%)
38 | x 2 | faiss/gpu/impl; faiss/gpu/impl | PQScanMultiPassNoPrecomputed-inl.cuh; PQScanMultiPassPrecomputed.cu | 650:693 (5%); 641:684 (5%)
35 | x 2 | faiss/gpu/impl; faiss/gpu/impl | IVFFlatScan.cu; PQScanMultiPassNoPrecomputed-inl.cuh | 391:431 (7%); 582:622 (5%)
31 | x 2 | faiss/gpu/impl; faiss/gpu/impl | IVFFlatScan.cu; PQScanMultiPassPrecomputed.cu | 365:403 (6%); 573:611 (4%)
26 | x 2 | faiss/gpu/impl; faiss/gpu/impl | GpuScalarQuantizer.cuh; GpuScalarQuantizer.cuh | 301:332 (3%); 439:470 (3%)
25 | x 2 | benchs/link_and_code; contrib | datasets.py; vecs_io.py | 29:65 (16%); 14:50 (100%)
25 | x 2 | faiss/gpu/utils; faiss/gpu/utils | Select.cuh; Select.cuh | 190:219 (4%); 467:496 (4%)
25 | x 2 | faiss/gpu/utils; faiss/gpu/utils | MergeNetworkWarp.cuh; MergeNetworkWarp.cuh | 272:300 (5%); 359:387 (5%)
24 | x 2 | faiss/utils; faiss/utils | distances.cpp; distances.cpp | 177:204 (6%); 246:273 (6%)
24 | x 2 | faiss/gpu/impl; faiss/gpu/impl | PQScanMultiPassNoPrecomputed-inl.cuh; PQScanMultiPassNoPrecomputed.cuh | 523:546 (3%); 21:44 (55%)
24 | x 2 | faiss/gpu/impl; faiss/gpu/impl | PQScanMultiPassNoPrecomputed-inl.cuh; PQScanMultiPassPrecomputed.cu | 596:622 (3%); 613:639 (3%)
23 | x 2 | faiss/gpu/impl; faiss/gpu/impl | BroadcastSum.cu; BroadcastSum.cu | 21:48 (7%); 130:157 (7%)
23 | x 2 | faiss/utils; faiss/utils | partitioning.cpp; partitioning.cpp | 951:973 (2%); 959:981 (2%)
Duplicated Units
The list of top 20 duplicated units.
Size | # | Folders | Files | Lines
14 | x 2 | benchs/link_and_code; benchs/bench_all_ivf | datasets.py; datasets.py | 0:0; 0:0
12 | x 2 | faiss; faiss | IndexPQFastScan.cpp; IndexIVFPQFastScan.cpp | 175:188; 655:668
11 | x 3 | benchs; benchs/distributed_ondisk; benchs/link_and_code | bench_for_interrupt.py; search_server.py; datasets.py | 0:0; 0:0; 0:0
12 | x 2 | contrib; benchs/distributed_ondisk | rpc.py; rpc.py | 0:0; 0:0
9 | x 2 | contrib; benchs/distributed_ondisk | rpc.py; rpc.py | 0:0; 0:0
9 | x 2 | faiss/impl; faiss/impl | simd_result_handlers.h; ResultHandler.h | 380:389; 152:161
8 | x 2 | faiss/impl; faiss/utils | simd_result_handlers.h; partitioning.cpp | 347:355; 389:397
8 | x 2 | faiss/utils; faiss/utils | Heap.h; Heap.h | 220:228; 231:239
8 | x 2 | faiss/utils; faiss/utils | Heap.h; Heap.h | 272:280; 283:291
7 | x 2 | faiss/utils; faiss/utils | distances_simd.cpp; distances_simd.cpp | 641:648; 690:697
7 | x 2 | faiss/utils; faiss/utils | distances_simd.cpp; distances_simd.cpp | 658:665; 699:706
7 | x 2 | faiss/utils; faiss/utils | Heap.h; Heap.h | 143:150; 168:175
7 | x 2 | faiss/utils; faiss/utils | Heap.h; Heap.h | 153:160; 178:185
7 | x 2 | faiss/invlists; faiss/invlists | BlockInvertedLists.cpp; InvertedLists.cpp | 83:96; 194:201
7 | x 2 | benchs/link_and_code; benchs/bench_all_ivf | datasets.py; datasets.py | 0:0; 0:0
8 | x 2 | contrib; benchs/distributed_ondisk | rpc.py; rpc.py | 0:0; 0:0
6 | x 2 | faiss; faiss | IndexHNSW.cpp; IndexNSG.cpp | 273:280; 67:74
6 | x 2 | faiss; faiss | IndexIVF.cpp; IndexBinaryIVF.cpp | 285:291; 111:117
6 | x 3 | benchs; benchs/distributed_ondisk; benchs/link_and_code | bench_for_interrupt.py; search_server.py; datasets.py | 0:0; 0:0; 0:0
8 | x 2 | contrib; benchs/distributed_ondisk | rpc.py; rpc.py | 0:0; 0:0