apache / datafusion
Source Code Overview

Analysis scope, overview of main, test, generated, deployment, build, and other code.

Source Code Analysis Scope
Files included in and excluded from analyses
Overview of Analyzed Files
Basic stats on analyzed files
Intro
For analysis purposes we separate files in scope into several categories: main, test, generated, deployment and build, and other.

  • The main category contains all manually created source code files that are used in production.
  • Files in the main category are used as input for other analyses: logical decomposition, concerns, duplication, file size, unit size, and conditional complexity.
  • Test source code files are used only for testing of the product. These files are normally not deployed to production.
  • Build and deployment source code files are used to configure or support the build and deployment process.
  • Generated source code files are automatically generated files that have not been manually changed after generation.
  • While a source code folder may contain a number of files, we are primarily interested in the source code files that are being written and maintained by developers.
  • Files containing binaries, documentation, or third-party libraries, for instance, are excluded from analysis. The exception is third-party libraries that have been changed by developers.

main: 286,606 LOC (70%), 1,027 files
test: 64,100 LOC (15%), 267 files
generated: 19,376 LOC (4%), 7 files
build and deployment: 1,513 LOC (<1%), 20 files
other: 33,896 LOC (8%), 236 files
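The two kinds of percentages used in this report can be reproduced from the LOC counts above. The following is a minimal sketch, not the analyzer's own code: the function names are hypothetical, and the observation that the per-category shares look truncated rather than rounded is an inference from the numbers.

```python
# LOC per category, copied from the summary above.
loc = {
    "main": 286_606,
    "test": 64_100,
    "generated": 19_376,
    "build and deployment": 1_513,
    "other": 33_896,
}

total = sum(loc.values())  # all analyzed LOC across the five categories


def share_of_total(category: str) -> float:
    """Percentage of all analyzed LOC that falls into the category."""
    return 100 * loc[category] / total


def ratio_vs_main(category: str) -> float:
    """Category size as a percentage of main code (the 'vs. main code' figure)."""
    return 100 * loc[category] / loc["main"]


for name in loc:
    print(f"{name}: {share_of_total(name):.2f}% of total, "
          f"{ratio_vs_main(name):.1f}% vs. main code")
```

Under these assumptions, the whole-number shares in the summary match the computed values with the fractional part dropped, and the "vs. main code" figures in the sections below match the second ratio rounded to one decimal place.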
Main Code
All manually created or maintained source code that defines logic of the product that is run in a production environment.
  • The following criteria are used to filter files:
    • files with paths like ".*".
  • 1027 files match defined criteria (286,606 LOC, 100.0% vs. main code):
    • 858 *.rs files (280,712 LOC)
    • 49 *.toml files (2,901 LOC)
    • 2 *.proto files (1,568 LOC)
    • 107 *.sql files (750 LOC)
    • 8 *.py files (623 LOC)
    • 1 *.rdf file (37 LOC)
    • 1 *.html file (12 LOC)
    • 1 *.js file (3 LOC)
  • "*.rs" is the biggest, containing 97.94% of LOC.
  • "*.js" is the smallest, containing less than 0.01% of LOC.


*.rs: 280,712 LOC (97%), 858 files
*.toml: 2,901 LOC (1%), 49 files
*.proto: 1,568 LOC (<1%), 2 files
*.sql: 750 LOC (<1%), 107 files
*.py: 623 LOC (<1%), 8 files
*.rdf: 37 LOC (<1%), 1 file
*.html: 12 LOC (<1%), 1 file
*.js: 3 LOC (<1%), 1 file
Test Code
Used only for testing of the product. Normally not deployed in a production environment.
  • The following criteria are used to filter files:
    • files with paths like ".*/[Tt]ests/.*".
    • files with paths like ".*_test[.].*".
    • files with paths like ".*_tests[.].*".
    • files with paths like ".*/test_.*".
    • files with paths like ".*/[Tt]est/.*".
    • files with paths like ".*/test[.].*".
    • files with paths like ".*/testing[.].*".
    • files with paths like ".*[.]snap".
  • 267 files match defined criteria (64,100 LOC, 22.4% vs. main code):
    • 145 *.rs files (59,131 LOC)
    • 104 *.sql files (4,582 LOC)
    • 18 *.snap files (387 LOC)
  • "*.rs" is the biggest, containing 92.25% of LOC.
  • "*.snap" is the smallest, containing 0.6% of LOC.


*.rs: 59,131 LOC (92%), 145 files
*.sql: 4,582 LOC (7%), 104 files
*.snap: 387 LOC (<1%), 18 files
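The test filter criteria listed above are regular expressions over file paths. The sketch below shows how such a filter could be applied. It assumes each pattern must match the whole path, which the report does not state, and the sample paths are hypothetical illustrations rather than files confirmed to exist in the repository.

```python
import re

# Test-code path patterns, copied from the filter criteria above.
TEST_PATTERNS = [
    r".*/[Tt]ests/.*",
    r".*_test[.].*",
    r".*_tests[.].*",
    r".*/test_.*",
    r".*/[Tt]est/.*",
    r".*/test[.].*",
    r".*/testing[.].*",
    r".*[.]snap",
]


def is_test_file(path: str) -> bool:
    """Return True if the whole path matches any test pattern (assumed semantics)."""
    return any(re.fullmatch(pattern, path) for pattern in TEST_PATTERNS)


# Hypothetical example paths, for illustration only.
for path in [
    "datafusion/core/tests/sql/mod.rs",
    "datafusion/common/src/test_util.rs",
    "snapshots/plan.snap",
    "datafusion/core/src/lib.rs",
]:
    print(path, "->", "test" if is_test_file(path) else "not test")
```

Because classification depends only on the path, a file is counted as test code whenever it sits under a `tests/` or `test/` directory or follows one of the listed naming conventions, regardless of its contents.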
Generated Code
Automatically generated files, not manually changed after generation.
  • The following criteria are used to filter files:
    • files with paths like ".*/generated/.*".
    • files with paths like ".*/package[-]lock[.]json".
  • 7 files match defined criteria (19,376 LOC, 6.8% vs. main code):
    • 6 *.rs files (12,365 LOC)
    • 1 *.json file (7,011 LOC)
  • "*.rs" is the biggest, containing 63.82% of LOC.
  • "*.json" is the smallest, containing 36.18% of LOC.


*.rs: 12,365 LOC (63%), 6 files
*.json: 7,011 LOC (36%), 1 file
Build and Deployment Code
Source code used to configure or support build and deployment process.
  • The following criteria are used to filter files:
    • files with paths like ".*[.]sh".
    • files with paths like ".*[.]git[a-z]+".
    • files with paths like ".*/[.]gitignore".
    • files with paths like ".*/[.]gitmodules".
    • files with paths like ".*/package[-]lock[.]json".
    • files with paths like ".*/package[.]json".
    • files with paths like ".*([.]|/)webpack([.]|/).*".
    • files with paths like ".*/[.]gitattributes".
    • files with paths like ".*[.]bat".
  • 20 files match defined criteria (1,513 LOC, 0.5% vs. main code):
    • 19 *.sh files (1,485 LOC)
    • 1 *.js file (28 LOC)
  • "*.sh" is the biggest, containing 98.15% of LOC.
  • "*.js" is the smallest, containing 1.85% of LOC.


*.sh: 1,485 LOC (98%), 19 files
*.js: 28 LOC (1%), 1 file
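Some filter patterns appear in more than one category: the `package-lock.json` pattern is listed under both generated code and build and deployment code. The sketch below lists every category whose patterns match a given path. It again assumes whole-path matching, the function name and sample paths are hypothetical, and the report does not say how the analyzer resolves such overlaps.

```python
import re

# Patterns copied from the generated and build-and-deployment filter criteria.
FILTERS = {
    "generated": [
        r".*/generated/.*",
        r".*/package[-]lock[.]json",
    ],
    "build and deployment": [
        r".*[.]sh",
        r".*[.]git[a-z]+",
        r".*/[.]gitignore",
        r".*/[.]gitmodules",
        r".*/package[-]lock[.]json",
        r".*/package[.]json",
        r".*([.]|/)webpack([.]|/).*",
        r".*/[.]gitattributes",
        r".*[.]bat",
    ],
}


def matching_categories(path: str) -> list[str]:
    """Return every category with at least one pattern matching the whole path."""
    return [
        category
        for category, patterns in FILTERS.items()
        if any(re.fullmatch(pattern, path) for pattern in patterns)
    ]


# Hypothetical paths, for illustration only.
print(matching_categories("web/package-lock.json"))  # matches both categories
print(matching_categories("ci/build.sh"))            # build and deployment only
print(matching_categories("src/lib.rs"))             # no match
```

The file counts above list a single `*.json` file under generated code and none under build and deployment, which is consistent with each file being counted in only one category, though the precedence rule itself is not documented here.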
Other Code
  • The following criteria are used to filter files:
    • files with paths like ".*[.]md".
    • files with paths like ".*/README[.][a-z0-9]+".
    • files with paths like ".*[.]txt".
    • files with paths like ".*/[.]gitignore".
    • files with paths like ".*/[.]dockerignore".
    • files with paths like ".*[.]json".
    • files with paths like ".*/LICENSE[.][a-z0-9]+".
    • files with paths like ".*/[Ee]xamples/.*".
    • files with paths like ".*[.](rst|rest|resttxt|rsttxt)".
    • files with paths like ".*[.]editorconfig".
  • 236 files match defined criteria (33,896 LOC, 11.8% vs. main code):
    • 102 *.md files (13,703 LOC)
    • 12 *.json files (7,093 LOC)
    • 46 *.rs files (6,670 LOC)
    • 73 *.txt files (6,349 LOC)
    • 3 *.toml files (81 LOC)
  • "*.md" is the biggest, containing 40.43% of LOC.
  • "*.toml" is the smallest, containing 0.24% of LOC.


*.md: 13,703 LOC (40%), 102 files
*.json: 7,093 LOC (20%), 12 files
*.rs: 6,670 LOC (19%), 46 files
*.txt: 6,349 LOC (18%), 73 files
*.toml: 81 LOC (<1%), 3 files
Analyzers
Info about analyzers used for source code examinations.


2025-05-07 20:26