microsoft / LightGBM
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim to keep duplicated code as low as possible (<5%), since a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
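The cleaning and matching steps described above can be sketched roughly as follows. This is an illustrative approximation, not the report tool's actual algorithm: the function names `clean` and `duplicated_line_indices` are hypothetical, the cleaning rules assume Python-style `#` comments and `import`/`from` statements, and only the 6-line threshold is taken from the report.

```python
from collections import defaultdict
from typing import List, Set

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean(source: str) -> List[str]:
    """Drop blank lines, '#' comments, and import statements.

    Assumption: Python-style syntax; a real tool would use per-language rules.
    """
    kept = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith(("import ", "from ")):
            continue
        kept.append(line)
    return kept


def duplicated_line_indices(lines: List[str], min_block: int = MIN_BLOCK) -> Set[int]:
    """Return indices of cleaned lines covered by a repeated window of min_block lines."""
    windows = defaultdict(list)
    # Record the start index of every run of min_block consecutive cleaned lines.
    for start in range(len(lines) - min_block + 1):
        windows[tuple(lines[start:start + min_block])].append(start)
    covered = set()
    # A window seen at more than one position marks all of its lines as duplicated.
    for starts in windows.values():
        if len(starts) > 1:
            for start in starts:
                covered.update(range(start, start + min_block))
    return covered
```

Calling `duplicated_line_indices(clean(text))` on a file's contents gives the set of duplicated cleaned lines; dividing its size by the number of cleaned lines gives a duplication ratio comparable in spirit to the percentages in this report.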
Duplication Overall
  • 17% duplication:
    • 39,979 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 6,893 duplicated lines
  • 521 duplicates
system: 17% (6,893 lines)
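The overall figure follows directly from the two line counts above. A minimal sketch of the arithmetic, assuming the percentage is simply duplicated lines divided by cleaned lines (the helper name `duplication_percent` is hypothetical; the 5% target is the one quoted in the intro):

```python
def duplication_percent(duplicated_lines: int, cleaned_lines: int) -> float:
    """Share of cleaned lines that belong to a duplicate, in percent."""
    if cleaned_lines <= 0:
        raise ValueError("cleaned_lines must be positive")
    return 100.0 * duplicated_lines / cleaned_lines


TARGET_PERCENT = 5.0  # recommended upper bound from the report intro

# Counts reported for the whole repository.
overall = duplication_percent(6_893, 39_979)
exceeds_target = overall >= TARGET_PERCENT
```

With the reported counts this comes to roughly 17%, which is above the recommended 5% target.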
Duplication per Extension
cpp: 11% (1,546 lines)
cl: 60% (1,324 lines)
hpp: 17% (1,153 lines)
cu: 83% (833 lines)
py: 14% (812 lines)
h: 10% (586 lines)
R: 10% (389 lines)
in: 59% (88 lines)
i: 13% (78 lines)
vcxproj: 24% (72 lines)
cmake: 2% (12 lines)
Duplication per Component (primary)
src/treelearner: 32% (2,820 lines)
src/io: 13% (964 lines)
python-package/lightgbm: 15% (782 lines)
include/LightGBM: 10% (482 lines)
src: 21% (417 lines)
R-package/R: 10% (323 lines)
src/metric: 27% (317 lines)
R-package/src: 21% (278 lines)
src/objective: 11% (131 lines)
src/boosting: 6% (118 lines)
swig: 12% (78 lines)
windows: 14% (72 lines)
ROOT: 9% (33 lines)
python-package: 8% (30 lines)
src/application: 5% (18 lines)
src/network: 2% (18 lines)
cmake/modules: 5% (12 lines)
R-package/inst: 0% (0 lines)
R-package/pkgdown: 0% (0 lines)
cmake: 0% (0 lines)
helpers: 0% (0 lines)

Duplication Between Components (50+ lines)

include/LightGBM -- src: 300 lines
R-package/src -- ROOT: 66 lines

Longest Duplicates
The list of 20 longest duplicates.
Size | # | Folder (both copies) | File 1 | Lines 1 | File 2 | Lines 2
106 | x 2 | src/treelearner/kernels | histogram_16_64_256.cu | 201:320 (12%) | histogram_16_64_256.cu | 512:630 (12%)
93 | x 2 | src/treelearner/kernels | histogram_16_64_256.cu | 154:265 (11%) | histogram_16_64_256.cu | 777:886 (11%)
93 | x 2 | src/treelearner/kernels | histogram_16_64_256.cu | 37:137 (11%) | histogram_16_64_256.cu | 349:449 (11%)
88 | x 2 | src/treelearner/kernels | histogram_16_64_256.cu | 349:443 (10%) | histogram_16_64_256.cu | 659:754 (10%)
88 | x 2 | src/treelearner/kernels | histogram_16_64_256.cu | 37:131 (10%) | histogram_16_64_256.cu | 659:754 (10%)
69 | x 2 | R-package/R | lgb.cv.R | 106:199 (17%) | lgb.train.R | 74:167 (29%)
65 | x 2 | src/io | dense_bin.hpp | 236:321 (17%) | sparse_bin.hpp | 281:365 (13%)
63 | x 2 | src/treelearner/ocl | histogram256.cl | 67:136 (8%) | histogram64.cl | 85:154 (8%)
62 | x 2 | src/treelearner/ocl | histogram16.cl | 76:145 (8%) | histogram64.cl | 68:137 (8%)
53 | x 2 | src/treelearner/kernels | histogram_16_64_256.cu | 512:575 (6%) | histogram_16_64_256.cu | 823:886 (6%)
52 | x 2 | src/treelearner/kernels | histogram_16_64_256.cu | 268:320 (6%) | histogram_16_64_256.cu | 889:941 (6%)
52 | x 2 | src/treelearner/kernels | histogram_16_64_256.cu | 578:630 (6%) | histogram_16_64_256.cu | 889:941 (6%)
49 | x 2 | src/treelearner/kernels | histogram_16_64_256.cu | 452:510 (5%) | histogram_16_64_256.cu | 763:821 (5%)
48 | x 2 | src/treelearner/ocl | histogram16.cl | 93:145 (6%) | histogram256.cl | 67:120 (6%)
47 | x 2 | python-package/lightgbm | dask.py | 1267:1314 (4%) | dask.py | 1420:1467 (4%)
47 | x 2 | python-package/lightgbm | dask.py | 1095:1142 (4%) | dask.py | 1267:1314 (4%)
47 | x 2 | python-package/lightgbm | dask.py | 1095:1142 (4%) | dask.py | 1420:1467 (4%)
46 | x 2 | src/treelearner/ocl | histogram256.cl | 425:473 (6%) | histogram64.cl | 363:410 (6%)
39 | x 2 | src/treelearner/kernels | histogram_16_64_256.cu | 154:199 (4%) | histogram_16_64_256.cu | 466:510 (4%)
34 | x 2 | R-package/src | Makevars.in | 23:56 (68%) | Makevars.win.in | 24:57 (66%)
Duplicated Units
The list of top 14 duplicated units.
Size | # | Folder (both copies) | File 1 | Lines 1 | File 2 | Lines 2
25 | x 2 | src/io | dense_bin.hpp | 256:281 | sparse_bin.hpp | 301:326
25 | x 2 | src/io | dense_bin.hpp | 283:308 | sparse_bin.hpp | 328:353
22 | x 2 | python-package/lightgbm | dask.py | 0:0 | dask.py | 0:0
13 | x 2 | src/metric | binary_metric.hpp | 175:192 | binary_metric.hpp | 286:303
11 | x 2 | include/LightGBM/utils | common.h | 492:503 | common.h | 1147:1158
11 | x 2 | src/io | parser.cpp | 44:55 | parser.cpp | 57:68
9 | x 2 | src/io | dense_bin.hpp | 345:354 | sparse_bin.hpp | 389:398
9 | x 2 | src/treelearner | cuda_tree_learner.cpp | 83:92 | gpu_tree_learner.cpp | 51:60
9 | x 2 | src/treelearner | cuda_tree_learner.h | 49:58 | gpu_tree_learner.h | 53:64
8 | x 2 | src/io | dense_bin.hpp | 356:364 | sparse_bin.hpp | 400:408
7 | x 2 | include/LightGBM/utils | common.h | 81:88 | common.h | 90:97
7 | x 2 | src/io | multi_val_dense_bin.hpp | 117:124 | multi_val_sparse_bin.hpp | 173:180
6 | x 2 | src/treelearner | parallel_tree_learner.h | 67:73 | parallel_tree_learner.h | 122:128
11 | x 2 | python-package/lightgbm | basic.py | 0:0 | basic.py | 0:0