facebook / fbthrift
Source Code Overview

Analysis scope, overview of main, test, generated, deployment, build, and other code.

Source Code Analysis Scope
Files included in and excluded from analyses
  • 42 extensions are included in analyses: h, java, cpp, py, mustache, thrift, pxd, pyx, pyi, php, go, tcc, cc, hs, rs, json, cmake, txt, md, sh, m, cs, yml, erl, gitignore, toml, ml, clang-format, clang-tidy, xml, vim, yy, in, tex, el, rb, csproj, proto, hsc, c, svg, pl
  • 2 criteria are used to exclude files from analysis:
    • exclude files with path like ".*/[.][a-zA-Z0-9_]+.*" (Hidden files and folders) (13 files).
    • exclude files with path like ".*/deps/.*" (Dependencies) (2 files).
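The two exclusion criteria above can be sketched as regular-expression filters. This is a minimal illustration, assuming each pattern must match the whole path (the report does not say whether matching is full or partial). The `is_excluded` helper and the sample paths are hypothetical and not taken from the repository.

```python
import re

# Exclusion patterns copied from the report.
# Assumption: a pattern must match the full path.
EXCLUDE_PATTERNS = [
    re.compile(r".*/[.][a-zA-Z0-9_]+.*"),  # hidden files and folders
    re.compile(r".*/deps/.*"),             # dependencies
]


def is_excluded(path: str) -> bool:
    """Return True if the path matches any exclusion pattern."""
    return any(p.fullmatch(path) for p in EXCLUDE_PATTERNS)


# Hypothetical paths, for illustration only.
sample_paths = [
    "repo/.github/workflows/build.yml",
    "repo/build/deps/lib/util.h",
    "repo/src/server.cpp",
]
kept = [p for p in sample_paths if not is_excluded(p)]
print(kept)
```

Only the last sample path survives both filters; the first is hidden and the second sits under a `deps` folder.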
Overview of Analyzed Files
Basic stats on analyzed files
Intro
For analysis purposes we separate files in scope into several categories: main, test, generated, deployment and build, and other.

  • The main category contains all manually created source code files that are used in production.
  • Files in the main category are used as input for other analyses: logical decomposition, concerns, duplication, file size, unit size, and conditional complexity.
  • Test source code files are used only for testing of the product. These files are normally not deployed to production.
  • Build and deployment source code files are used to configure or support the build and deployment process.
  • Generated source code files are automatically generated files that have not been manually changed after generation.
  • While a source code folder may contain a number of files, we are primarily interested in the source code files that are being written and maintained by developers.
  • Files containing binaries, documentation, or third-party libraries, for instance, are excluded from analysis. The exception is third-party libraries that have been changed by developers.

main: 249,991 LOC (19%), 2,081 files
test: 996,554 LOC (78%), 4,968 files
generated: 0 LOC (0%), 0 files
build and deployment: 269 LOC (<1%), 5 files
other: 18,351 LOC (1%), 100 files
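The percentages in the category summary above can be reproduced from the LOC counts. The sketch below assumes the report truncates each share to a whole percent and prints "<1%" for non-zero shares below one percent. That rounding rule is inferred from the figures, not documented, and `share_label` is a hypothetical helper.

```python
# LOC per category, copied from the summary above.
category_loc = {
    "main": 249991,
    "test": 996554,
    "generated": 0,
    "build and deployment": 269,
    "other": 18351,
}

total_loc = sum(category_loc.values())


def share_label(loc: int, total: int) -> str:
    """Format a share as a whole percent, truncating the fraction (assumed)."""
    pct = 100 * loc // total  # integer division truncates
    if loc > 0 and pct == 0:
        return "<1%"
    return f"{pct}%"


for name, loc in category_loc.items():
    print(f"{name}: {loc} LOC ({share_label(loc, total_loc)})")
```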
Main Code
All manually created or maintained source code that defines the logic of the product and runs in a production environment.
  • The following criteria are used to filter files:
    • files with paths like ".*".
  • 2081 files match defined criteria (249,991 lines of code, 100.0% vs. main code):
    • 547 *.h files (70,189 lines of code)
    • 243 *.cpp files (39,461 lines of code)
    • 65 *.cc files (37,511 lines of code)
    • 478 *.mustache files (26,593 lines of code)
    • 314 *.java files (25,421 lines of code)
    • 92 *.py files (15,771 lines of code)
    • 53 *.go files (7,277 lines of code)
    • 81 *.php files (6,839 lines of code)
    • 32 *.rs files (5,101 lines of code)
    • 30 *.pyx files (4,102 lines of code)
    • 26 *.cmake files (2,558 lines of code)
    • 19 *.hs files (2,202 lines of code)
    • 34 *.pyi files (1,528 lines of code)
    • 31 *.thrift files (1,379 lines of code)
    • 27 *.pxd files (1,352 lines of code)
    • 1 *.yy file (1,119 lines of code)
    • 1 *.tex file (865 lines of code)
    • 1 *.el file (348 lines of code)
    • 1 *.rb file (194 lines of code)
    • 1 *.c file (74 lines of code)
    • 3 *.toml files (68 lines of code)
    • 1 *.in file (39 lines of code)
  • "*.h" is the largest group, containing 28.08% of the code.
  • "*.in" is the smallest group, containing 0.02% of the code.


*.h: 70,189 LOC (28%), 547 files
*.cpp: 39,461 LOC (15%), 243 files
*.cc: 37,511 LOC (15%), 65 files
*.mustache: 26,593 LOC (10%), 478 files
*.java: 25,421 LOC (10%), 314 files
*.py: 15,771 LOC (6%), 92 files
*.go: 7,277 LOC (2%), 53 files
*.php: 6,839 LOC (2%), 81 files
*.rs: 5,101 LOC (2%), 32 files
*.pyx: 4,102 LOC (1%), 30 files
*.cmake: 2,558 LOC (1%), 26 files
*.hs: 2,202 LOC (<1%), 19 files
*.pyi: 1,528 LOC (<1%), 34 files
*.thrift: 1,379 LOC (<1%), 31 files
*.pxd: 1,352 LOC (<1%), 27 files
*.yy: 1,119 LOC (<1%), 1 file
*.tex: 865 LOC (<1%), 1 file
*.el: 348 LOC (<1%), 1 file
*.rb: 194 LOC (<1%), 1 file
*.c: 74 LOC (<1%), 1 file
*.toml: 68 LOC (<1%), 3 files
*.in: 39 LOC (<1%), 1 file
Test Code
Used only for testing of the product. Normally not deployed in a production environment.
  • The following criteria are used to filter files:
    • files with paths like ".*/[Tt]est/.*".
    • files with paths like ".*_test[.].*".
    • files with paths like ".*_tests[.].*".
    • files with paths like ".*/[Tt]ests/.*".
    • files with paths like ".*/test_.*".
    • files with paths like ".*/[Ss]pecs/.*".
  • 4968 files match defined criteria (996,554 lines of code, 398.6% vs. main code):
    • 1,048 *.java files (216,751 lines of code)
    • 829 *.cpp files (179,436 lines of code)
    • 964 *.h files (115,061 lines of code)
    • 105 *.tcc files (95,833 lines of code)
    • 110 *.php files (80,341 lines of code)
    • 284 *.pyx files (77,047 lines of code)
    • 389 *.py files (64,807 lines of code)
    • 124 *.go files (53,832 lines of code)
    • 32 *.rs files (49,534 lines of code)
    • 269 *.pyi files (19,249 lines of code)
    • 373 *.thrift files (17,018 lines of code)
    • 319 *.pxd files (14,941 lines of code)
    • 60 *.hs files (4,669 lines of code)
    • 32 *.cc files (3,904 lines of code)
    • 7 *.m files (1,835 lines of code)
    • 6 *.cs files (825 lines of code)
    • 5 *.erl files (475 lines of code)
    • 1 *.hsc file (229 lines of code)
    • 1 *.pl file (207 lines of code)
    • 2 *.ml files (200 lines of code)
    • 1 *.proto file (189 lines of code)
    • 1 *.csproj file (100 lines of code)
    • 6 *.sh files (71 lines of code)
  • "*.java" is the largest group, containing 21.75% of the code.
  • "*.sh" is the smallest group, containing 0.01% of the code.


*.java: 216,751 LOC (21%), 1,048 files
*.cpp: 179,436 LOC (18%), 829 files
*.h: 115,061 LOC (11%), 964 files
*.tcc: 95,833 LOC (9%), 105 files
*.php: 80,341 LOC (8%), 110 files
*.pyx: 77,047 LOC (7%), 284 files
*.py: 64,807 LOC (6%), 389 files
*.go: 53,832 LOC (5%), 124 files
*.rs: 49,534 LOC (4%), 32 files
*.pyi: 19,249 LOC (1%), 269 files
*.thrift: 17,018 LOC (1%), 373 files
*.pxd: 14,941 LOC (1%), 319 files
*.hs: 4,669 LOC (<1%), 60 files
*.cc: 3,904 LOC (<1%), 32 files
*.m: 1,835 LOC (<1%), 7 files
*.cs: 825 LOC (<1%), 6 files
*.erl: 475 LOC (<1%), 5 files
*.hsc: 229 LOC (<1%), 1 file
*.pl: 207 LOC (<1%), 1 file
*.ml: 200 LOC (<1%), 2 files
*.proto: 189 LOC (<1%), 1 file
*.csproj: 100 LOC (<1%), 1 file
*.sh: 71 LOC (<1%), 6 files
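The test-code criteria listed in this section amount to a set of path patterns, any one of which marks a file as test code. The sketch below is one way to apply them, again assuming full-path matching. The `is_test_file` helper and the sample paths are hypothetical.

```python
import re

# Test-code path patterns copied from the criteria in this section.
TEST_PATTERNS = [re.compile(p) for p in (
    r".*/[Tt]est/.*",
    r".*_test[.].*",
    r".*_tests[.].*",
    r".*/[Tt]ests/.*",
    r".*/test_.*",
    r".*/[Ss]pecs/.*",
)]


def is_test_file(path: str) -> bool:
    """Return True if the full path matches any test pattern (assumed semantics)."""
    return any(p.fullmatch(path) for p in TEST_PATTERNS)


# Hypothetical paths, for illustration only.
samples = [
    "lib/cpp/test/SerializerTest.cpp",  # under a test/ folder
    "lib/py/util_test.py",              # _test suffix before the extension
    "lib/py/test_client.py",            # test_ prefix on the file name
    "lib/go/client.go",                 # no pattern applies
]
for path in samples:
    print(path, "->", "test" if is_test_file(path) else "not test")
```

Note that a folder such as `contest/` would not match, because the folder patterns require `test` or `tests` to be the complete folder name.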
Build and Deployment Code
Source code used to configure or support the build and deployment process.
  • The following criteria are used to filter files:
    • files with paths like ".*[.]sh".
    • files with paths like ".*[.]csproj".
    • files with paths like ".*[.]git[a-z]+".
    • files with paths like ".*/[.]gitignore".
    • files with paths like ".*/build[.]xml".
  • 5 files match defined criteria (269 lines of code, 0.1% vs. main code):
    • 4 *.sh files (194 lines of code)
    • 1 *.xml file (75 lines of code)
  • "*.sh" is the largest group, containing 72.12% of the code.
  • "*.xml" is the smallest group, containing 27.88% of the code.


*.sh: 194 LOC (72%), 4 files
*.xml: 75 LOC (27%), 1 file
Other Code
  • The following criteria are used to filter files:
    • files with paths like ".*[.]md".
    • files with paths like ".*[.]txt".
    • files with paths like ".*[.]json".
    • files with paths like ".*/README[.][a-z0-9]+".
    • files with paths like ".*[.]vim".
    • files with paths like ".*/[.]gitignore".
    • files with paths like ".*/[Dd]emos?/.*".
    • files with paths like ".*[.]svg".
  • 100 files match defined criteria (18,351 lines of code, 7.3% vs. main code):
    • 39 *.json files (14,034 lines of code)
    • 20 *.txt files (1,652 lines of code)
    • 19 *.md files (1,305 lines of code)
    • 12 *.cpp files (894 lines of code)
    • 7 *.thrift files (188 lines of code)
    • 1 *.h file (123 lines of code)
    • 1 *.vim file (80 lines of code)
    • 1 *.svg file (75 lines of code)
  • "*.json" is the largest group, containing 76.48% of the code.
  • "*.svg" is the smallest group, containing 0.41% of the code.


*.json: 14,034 LOC (76%), 39 files
*.txt: 1,652 LOC (9%), 20 files
*.md: 1,305 LOC (7%), 19 files
*.cpp: 894 LOC (4%), 12 files
*.thrift: 188 LOC (1%), 7 files
*.h: 123 LOC (<1%), 1 file
*.vim: 80 LOC (<1%), 1 file
*.svg: 75 LOC (<1%), 1 file
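The summary figures for this category (largest and smallest extension, and the share relative to main code) follow directly from the per-extension counts. The sketch below recomputes them from the numbers in this section. The `pct` helper is hypothetical, and the rounding to two and one decimal places is inferred from the reported values.

```python
# LOC per extension in the "other" category, copied from this section.
other_loc = {
    "json": 14034, "txt": 1652, "md": 1305, "cpp": 894,
    "thrift": 188, "h": 123, "vim": 80, "svg": 75,
}
MAIN_LOC = 249991  # main-code total reported earlier in the document


def pct(part: int, whole: int, digits: int) -> float:
    """Share of part in whole, as a percentage rounded to the given digits."""
    return round(100 * part / whole, digits)


total = sum(other_loc.values())
biggest = max(other_loc, key=other_loc.get)
smallest = min(other_loc, key=other_loc.get)

print(f"total: {total} LOC ({pct(total, MAIN_LOC, 1)}% vs. main code)")
print(f"biggest: *.{biggest} ({pct(other_loc[biggest], total, 2)}%)")
print(f"smallest: *.{smallest} ({pct(other_loc[smallest], total, 2)}%)")
```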
Analyzers
Information about the analyzers used to examine the source code.
  • *.h files are analyzed with CppAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Advanced heuristic dependency analysis
  • *.cpp files are analyzed with CppAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Advanced heuristic dependency analysis
  • *.cc files are analyzed with CppAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Advanced heuristic dependency analysis
  • *.mustache files are analyzed with HtmlAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Advanced heuristic dependency analysis
  • *.java files are analyzed with JavaAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Advanced heuristic dependency analysis (based on package names)
  • *.py files are analyzed with PythonAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Basic heuristic dependency analysis
  • *.go files are analyzed with GoLangAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Basic heuristic dependency analysis
  • *.php files are analyzed with PhpAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Basic heuristic dependency analysis
  • *.rs files are analyzed with RustAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • No dependency analysis
  • *.pyx files are analyzed with PythonAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Basic heuristic dependency analysis
  • *.cmake files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.hs files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.pyi files are analyzed with PythonAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Basic heuristic dependency analysis
  • *.thrift files are analyzed with ThriftAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.pxd files are analyzed with PythonAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Basic heuristic dependency analysis
  • *.yy files are analyzed with JsonAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.tex files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.el files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.rb files are analyzed with RubyAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • Basic heuristic dependency analysis
  • *.c files are analyzed with CStyleAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • No dependency analysis
  • *.toml files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.in files are analyzed with RustAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • No dependency analysis
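The analyzer list distinguishes basic code cleaning (empty lines removed) from advanced code cleaning (empty lines and comments removed) before lines of code are counted. The sketch below illustrates the difference under simplifying assumptions: it treats only whole-line comments with a single prefix as comments, and it is not the analyzers' actual implementation. The names `basic_loc` and `advanced_loc` are hypothetical.

```python
def basic_loc(source: str) -> int:
    """Count lines after dropping empty (whitespace-only) lines."""
    return sum(1 for line in source.splitlines() if line.strip())


def advanced_loc(source: str, comment_prefix: str = "#") -> int:
    """Count lines after dropping empty lines and whole-line comments.

    Simplification: trailing comments and block comments are not handled.
    """
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_prefix):
            count += 1
    return count


sample = """\
# compute a sum

def add(a, b):
    # add two numbers
    return a + b
"""

print("basic:", basic_loc(sample))
print("advanced:", advanced_loc(sample))
```

For the sample, basic cleaning leaves four lines and advanced cleaning leaves two, which is why the same file can report different LOC depending on the analyzer assigned to its extension.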


2022-04-14 22:51