facebookresearch / NeuralDB
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim to keep duplicated code as low as possible (<5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
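The rules above can be sketched in a few lines of Python. This is a minimal illustration, not the report tool's actual implementation: the function names `clean` and `duplicated_line_count` are hypothetical, the cleaning step only handles `#` comments and `import`/`from` lines, and lines are compared after stripping surrounding whitespace, which is an assumption.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean(source: str) -> List[str]:
    """Drop empty lines, '#' comments, and import statements (a simplification)."""
    kept = []
    for raw in source.splitlines():
        line = raw.strip()  # assumption: compare lines without indentation
        if not line or line.startswith("#"):
            continue
        if line.startswith("import ") or line.startswith("from "):
            continue
        kept.append(line)
    return kept


def duplicated_line_count(files: Dict[str, str],
                          min_block: int = MIN_BLOCK) -> Tuple[int, int]:
    """Return (duplicated_lines, total_cleaned_lines) across all files."""
    cleaned = {name: clean(text) for name, text in files.items()}

    # Index every window of `min_block` consecutive cleaned lines.
    windows = defaultdict(list)
    for name, lines in cleaned.items():
        for start in range(len(lines) - min_block + 1):
            key = tuple(lines[start:start + min_block])
            windows[key].append((name, start))

    # A line counts as duplicated if it lies in a window seen in 2+ places.
    duplicated = set()
    for places in windows.values():
        if len(places) > 1:
            for name, start in places:
                duplicated.update((name, i) for i in range(start, start + min_block))

    total = sum(len(lines) for lines in cleaned.values())
    return len(duplicated), total
```

Given two files that share the same six statements, where one file has one extra statement of its own, this sketch reports 12 duplicated lines out of 13 cleaned lines. Longer duplicates fall out of overlapping windows, so no separate merge step is needed to count lines.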
Duplication Overall
  • 34% duplication:
    • 5,839 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 2,009 duplicated lines
  • 170 duplicates
system: 34% (2,009 lines)
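As a quick check of the headline figure, the percentage follows directly from the two line counts reported above. The variable names are illustrative only.

```python
cleaned_lines = 5839     # lines of cleaned code, from the report
duplicated_lines = 2009  # duplicated lines, from the report

duplication_pct = 100 * duplicated_lines / cleaned_lines
print(f"{duplication_pct:.1f}%")  # 34.4%, which the report rounds to 34%

# Compare against the <5% target mentioned in the intro.
target_pct = 5
print(duplication_pct > target_pct)  # True: well above the target
```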
Duplication per Extension
py: 34% (2,009 lines)
Duplication per Component (primary)
dataset-construction/src/ndb_data: 93% (718 lines)
modelling/src/neuraldb: 39% (514 lines)
dataset-construction/src/ndb_data/generation: 24% (281 lines)
modelling/src/neuraldb/evaluation: 64% (172 lines)
modelling/src/neuraldb/dataset: 24% (144 lines)
dataset-construction/src/ndb_data/construction: 13% (100 lines)
modelling/src/neuraldb/modelling: 14% (42 lines)
ssg: 7% (26 lines)
modelling/src/neuraldb/retriever: 13% (12 lines)
modelling: 0% (0 lines)
modelling/src/neuraldb/util: 0% (0 lines)
modelling/src: 0% (0 lines)
dataset-construction/src/ndb_data/util: 0% (0 lines)
dataset-construction/src/ndb_data/data_import: 0% (0 lines)
dataset-construction/src/ndb_data/wikidata_common: 0% (0 lines)
ROOT: 0% (0 lines)
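The per-component figures can be cross-checked against the system total. The short sketch below uses only the numbers listed above (components with 0 duplicated lines are omitted) and shows each component's share of all duplicated lines; the dictionary name is illustrative.

```python
# Duplicated lines per component, copied from the list above.
component_duplicated_lines = {
    "dataset-construction/src/ndb_data": 718,
    "modelling/src/neuraldb": 514,
    "dataset-construction/src/ndb_data/generation": 281,
    "modelling/src/neuraldb/evaluation": 172,
    "modelling/src/neuraldb/dataset": 144,
    "dataset-construction/src/ndb_data/construction": 100,
    "modelling/src/neuraldb/modelling": 42,
    "ssg": 26,
    "modelling/src/neuraldb/retriever": 12,
}

total = sum(component_duplicated_lines.values())
print(total)  # 2009, matching the system-wide duplicated line count

# Share of all duplicated lines contributed by each component.
for name, lines in component_duplicated_lines.items():
    print(f"{name}: {100 * lines / total:.1f}% of duplicated lines")
```

The top two components together account for 1,232 of the 2,009 duplicated lines, so they are the natural place to start any cleanup.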

Duplication Between Components (50+ lines)

dataset-construction/src/ndb_data/generation -- modelling/src/neuraldb: 350 lines

Longest Duplicates
The list of 20 longest duplicates.
Size | # | Folders | Files | Lines
124 | x 2 | dataset-construction/src/ndb_data; dataset-construction/src/ndb_data | sample_questions_100.py; sample_questions_250.py | 27:250 (100%); 27:250 (100%)
72 | x 2 | dataset-construction/src/ndb_data/generation; modelling/src/neuraldb | question_to_db.py; convert_spj_to_predictions.py | 159:242 (13%); 122:205 (20%)
65 | x 2 | dataset-construction/src/ndb_data; dataset-construction/src/ndb_data | sample_questions.py; sample_questions_50.py | 27:117 (55%); 27:117 (54%)
63 | x 2 | modelling/src/neuraldb/evaluation; modelling/src/neuraldb/evaluation | postprocess_baselines.py; postprocess_spj.py | 37:107 (53%); 36:106 (65%)
62 | x 2 | dataset-construction/src/ndb_data; dataset-construction/src/ndb_data | sample_questions_1000.py; sample_questions_500.py | 27:115 (50%); 27:115 (50%)
61 | x 2 | dataset-construction/src/ndb_data; dataset-construction/src/ndb_data | sample_questions.py; sample_questions_500.py | 27:113 (52%); 27:113 (49%)
61 | x 2 | dataset-construction/src/ndb_data; dataset-construction/src/ndb_data | sample_questions.py; sample_questions_1000.py | 27:113 (52%); 27:113 (49%)
61 | x 2 | dataset-construction/src/ndb_data; dataset-construction/src/ndb_data | sample_questions_1000.py; sample_questions_500.py | 117:251 (49%); 117:251 (49%)
61 | x 2 | dataset-construction/src/ndb_data; dataset-construction/src/ndb_data | sample_questions_1000.py; sample_questions_50.py | 27:113 (49%); 27:113 (50%)
61 | x 2 | dataset-construction/src/ndb_data; dataset-construction/src/ndb_data | sample_questions_50.py; sample_questions_500.py | 27:113 (50%); 27:113 (49%)
58 | x 2 | dataset-construction/src/ndb_data/generation; modelling/src/neuraldb | question_to_db.py; convert_spj_to_predictions.py | 245:327 (11%); 208:290 (16%)
54 | x 2 | modelling/src/neuraldb; modelling/src/neuraldb | final_scoring_with_dbsize.py; final_scoring_with_dbsize_sweep.py | 32:101 (33%); 32:101 (25%)
35 | x 2 | dataset-construction/src/ndb_data; dataset-construction/src/ndb_data | sample_questions_250.py; sample_questions_50.py | 27:72 (28%); 27:72 (29%)
35 | x 2 | dataset-construction/src/ndb_data; dataset-construction/src/ndb_data | sample_questions.py; sample_questions_250.py | 27:72 (29%); 27:72 (28%)
35 | x 2 | dataset-construction/src/ndb_data; dataset-construction/src/ndb_data | sample_questions_100.py; sample_questions_500.py | 27:72 (28%); 27:72 (28%)
35 | x 2 | dataset-construction/src/ndb_data; dataset-construction/src/ndb_data | sample_questions_100.py; sample_questions_50.py | 27:72 (28%); 27:72 (29%)
35 | x 2 | dataset-construction/src/ndb_data; dataset-construction/src/ndb_data | sample_questions.py; sample_questions_100.py | 27:72 (29%); 27:72 (28%)
35 | x 2 | dataset-construction/src/ndb_data; dataset-construction/src/ndb_data | sample_questions_250.py; sample_questions_500.py | 27:72 (28%); 27:72 (28%)
35 | x 2 | dataset-construction/src/ndb_data; dataset-construction/src/ndb_data | sample_questions_100.py; sample_questions_1000.py | 27:72 (28%); 27:72 (28%)
35 | x 2 | dataset-construction/src/ndb_data; dataset-construction/src/ndb_data | sample_questions_1000.py; sample_questions_250.py | 27:72 (28%); 27:72 (28%)
Duplicated Units
The list of top 5 duplicated units.
Size | # | Folders | Files | Lines
56 | x 2 | modelling/src/neuraldb; modelling/src/neuraldb | final_scoring_with_dbsize_sweep.py; final_scoring_with_dbsize.py | 0:0; 0:0
22 | x 2 | dataset-construction/src/ndb_data/construction; dataset-construction/src/ndb_data/construction | make_database_initial_cache.py; make_database_initial.py | 0:0; 0:0
14 | x 2 | modelling/src/neuraldb; dataset-construction/src/ndb_data/generation | convert_spj_to_predictions.py; question_to_db.py | 0:0; 0:0
7 | x 2 | modelling/src/neuraldb; dataset-construction/src/ndb_data/generation | convert_spj_to_predictions.py; question_to_db.py | 0:0; 0:0
6 | x 2 | modelling/src/neuraldb; dataset-construction/src/ndb_data/generation | convert_spj_to_predictions.py; question_to_db.py | 0:0; 0:0