facebook / hhvm
Source Code Overview

Analysis scope, overview of main, test, generated, deployment, build, and other code.

Source Code Analysis Scope
Files included in and excluded from analyses
  • 49 extensions are included in analyses: php, h, cpp, ml, rs, ini, inc, txt, mli, toml, hhi, cmake, py, md, in, sh, c, json, wsdl, hack, gitignore, diff, patch, xml, cc, xsl, s, po, mo, html, xsd, gitattributes, pl, ts, clang-format, phpt, yml, css, bzl, gdb, profile, hpp, y, ll, mll, gitmodules, mly, clang-tidy, awk
  • 5 criteria are used to exclude files from analysis:
    • exclude files with path like ".*/[.][a-zA-Z0-9_]+.*" (Hidden files and folders) (40 files).
    • exclude files with path like ".*/(3rd|[Tt]hird)[-_]?[Pp]arty/.*" (Dependencies) (80 files).
    • exclude files with path like ".*/cache/.*" (Caches) (10 files).
    • exclude files with path like ".*/dependencies/.*" (Dependencies) (10 files).
    • exclude files with path like ".*/deps/.*" (Dependencies) (22 files).
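The exclusion criteria above are plain regular expressions over file paths. A minimal sketch of how such a filter could be applied is shown below. It assumes that a pattern must match the whole path and that paths are compared with a leading "/"; neither detail is stated in the report. The function name `exclusion_reason` is hypothetical.

```python
import re

# Patterns and labels copied from the exclusion criteria listed above.
EXCLUDE_PATTERNS = [
    (r".*/[.][a-zA-Z0-9_]+.*", "Hidden files and folders"),
    (r".*/(3rd|[Tt]hird)[-_]?[Pp]arty/.*", "Dependencies"),
    (r".*/cache/.*", "Caches"),
    (r".*/dependencies/.*", "Dependencies"),
    (r".*/deps/.*", "Dependencies"),
]


def exclusion_reason(path):
    """Return the label of the first matching exclusion pattern, or None."""
    # Assumption: whole-path matching, with a leading "/" added so that
    # top-level folders can satisfy the ".*/" prefix of each pattern.
    candidate = "/" + path.lstrip("/")
    for pattern, label in EXCLUDE_PATTERNS:
        if re.fullmatch(pattern, candidate):
            return label
    return None


# Hypothetical paths, used only to illustrate the matching.
for example in ["hphp/.clang-format", "third-party/folly/x.cpp", "hphp/util/a.cpp"]:
    print(example, "->", exclusion_reason(example))
```

A path that returns `None` stays in scope, provided its extension is one of the included ones.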
Overview of Analyzed Files
Basic stats on analyzed files
Intro
For analysis purposes we separate files in scope into several categories: main, test, generated, deployment and build, and other.

  • The main category contains all manually created source code files that are used in production.
  • Files in the main category are used as input for other analyses: logical decomposition, concerns, duplication, file size, unit size, and conditional complexity.
  • Test source code files are used only for testing of the product. These files are normally not deployed to production.
  • Build and deployment source code files are used to configure or support build and deployment process.
  • Generated source code files are automatically generated files that have not been manually changed after generation.
  • While a source code folder may contain a number of files, we are primarily interested in the source code files that are being written and maintained by developers.
  • Files containing binaries, documentation, or third-party libraries, for instance, are excluded from analysis. The exceptions are third-party libraries that have been changed by developers.

main: 1,050,636 LOC (63%), 5,244 files
test: 580,369 LOC (35%), 22,598 files
generated: 0 LOC (0%), 0 files
build and deployment: 1,036 LOC (<1%), 28 files
other: 22,705 LOC (1%), 949 files
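The percentages in the summary above follow directly from the LOC counts, as do the "vs. main code" ratios quoted in the category sections. The sketch below reproduces that arithmetic. It assumes that shares are truncated to whole percents and that small non-zero shares are shown as "<1%", which is how the figures appear to be rendered; the function names are hypothetical.

```python
import math

# Lines of code per category, as reported in the summary above.
LOC = {
    "main": 1050636,
    "test": 580369,
    "generated": 0,
    "build and deployment": 1036,
    "other": 22705,
}
TOTAL = sum(LOC.values())


def share_label(loc, total):
    """Format a share of the total: truncated, with '<1%' for small non-zero values."""
    if loc == 0:
        return "0%"
    percent = 100 * loc / total
    if percent < 1:
        return "<1%"
    return f"{math.floor(percent)}%"


def ratio_vs_main(loc):
    """Size of a category relative to main code, in percent with one decimal."""
    return round(100 * loc / LOC["main"], 1)


for category, loc in LOC.items():
    print(f"{category}: {loc:,} LOC ({share_label(loc, TOTAL)}), "
          f"{ratio_vs_main(loc)}% vs. main code")
```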
Main Code
All manually created or maintained source code that defines logic of the product that is run in a production environment.
  • The following criteria are used to filter files:
    • files with paths like ".*".
  • 5244 files match defined criteria (1,050,636 lines of code, 100.0% vs. main code):
    • 1,046 *.cpp files (453,006 lines of code)
    • 915 *.ml files (217,585 lines of code)
    • 626 *.rs files (138,845 lines of code)
    • 1,180 *.h files (127,846 lines of code)
    • 415 *.php files (32,180 lines of code)
    • 165 *.hhi files (23,541 lines of code)
    • 447 *.mli files (20,959 lines of code)
    • 148 *.cmake files (11,502 lines of code)
    • 10 *.cc files (7,971 lines of code)
    • 38 *.c files (5,571 lines of code)
    • 33 *.py files (4,254 lines of code)
    • 193 *.toml files (3,037 lines of code)
    • 1 *.mll file (1,301 lines of code)
    • 1 *.mly file (1,136 lines of code)
    • 7 *.hack files (404 lines of code)
    • 1 *.ll file (392 lines of code)
    • 2 *.ts files (376 lines of code)
    • 7 *.in files (301 lines of code)
    • 1 *.y file (127 lines of code)
    • 1 *.css file (76 lines of code)
    • 2 *.pl files (73 lines of code)
    • 1 *.hpp file (47 lines of code)
    • 1 *.gdb file (43 lines of code)
    • 1 *.profile file (35 lines of code)
    • 1 *.bzl file (17 lines of code)
    • 1 *.yml file (11 lines of code)
  • "*.cpp" is the biggest, containing 43.12% of the code.
  • "*.yml" is the smallest, containing less than 0.01% of the code.
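A per-extension breakdown like the one above can be derived from a flat list of files and their line counts. The following is a minimal sketch under that assumption. The function name `breakdown_by_extension` and the sample paths and counts are hypothetical, and the report's actual implementation is not shown in the source.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def breakdown_by_extension(files):
    """Aggregate (path, loc) pairs into (extension, file_count, total_loc) rows."""
    totals = defaultdict(lambda: [0, 0])
    for path, loc in files:
        ext = PurePosixPath(path).suffix.lstrip(".")
        totals[ext][0] += 1      # number of files with this extension
        totals[ext][1] += loc    # lines of code in those files
    rows = [(ext, count, loc) for ext, (count, loc) in totals.items()]
    # Largest extension by lines of code first, matching the order of the list above.
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hypothetical input, for illustration only.
sample = [("a/x.cpp", 100), ("a/y.cpp", 50), ("b/z.ml", 120)]
for ext, count, loc in breakdown_by_extension(sample):
    print(f"{count} *.{ext} files ({loc:,} lines of code)")
```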


*.cpp: 453,006 LOC (43%), 1,046 files
*.ml: 217,585 LOC (20%), 915 files
*.rs: 138,845 LOC (13%), 626 files
*.h: 127,846 LOC (12%), 1,180 files
*.php: 32,180 LOC (3%), 415 files
*.hhi: 23,541 LOC (2%), 165 files
*.mli: 20,959 LOC (1%), 447 files
*.cmake: 11,502 LOC (1%), 148 files
*.cc: 7,971 LOC (<1%), 10 files
*.c: 5,571 LOC (<1%), 38 files
*.py: 4,254 LOC (<1%), 33 files
*.toml: 3,037 LOC (<1%), 193 files
*.mll: 1,301 LOC (<1%), 1 file
*.mly: 1,136 LOC (<1%), 1 file
*.hack: 404 LOC (<1%), 7 files
*.ll: 392 LOC (<1%), 1 file
*.ts: 376 LOC (<1%), 2 files
*.in: 301 LOC (<1%), 7 files
*.y: 127 LOC (<1%), 1 file
*.css: 76 LOC (<1%), 1 file
*.pl: 73 LOC (<1%), 2 files
*.hpp: 47 LOC (<1%), 1 file
*.gdb: 43 LOC (<1%), 1 file
*.profile: 35 LOC (<1%), 1 file
*.bzl: 17 LOC (<1%), 1 file
*.yml: 11 LOC (<1%), 1 file
Test Code
Used only for testing of the product. Normally not deployed in a production environment.
  • The following criteria are used to filter files:
    • files with paths like ".*/[Tt]est/.*".
    • files with paths like ".*_test[.].*".
    • files with paths like ".*/[Tt]ests/.*".
    • files with paths like ".*/test_.*".
    • files with paths like ".*_tests[.].*".
    • files with paths like ".*[-]test[-].*".
  • 22598 files match defined criteria (580,369 lines of code, 55.2% vs. main code):
    • 21,489 *.php files (495,867 lines of code)
    • 172 *.ml files (21,683 lines of code)
    • 90 *.cpp files (21,427 lines of code)
    • 43 *.py files (14,975 lines of code)
    • 568 *.inc files (10,076 lines of code)
    • 4 *.cc files (7,507 lines of code)
    • 34 *.wsdl files (3,475 lines of code)
    • 17 *.rs files (1,972 lines of code)
    • 43 *.in files (838 lines of code)
    • 16 *.sh files (618 lines of code)
    • 15 *.h files (596 lines of code)
    • 8 *.hhi files (316 lines of code)
    • 23 *.xml files (235 lines of code)
    • 12 *.xsl files (160 lines of code)
    • 2 *.mli files (154 lines of code)
    • 11 *.toml files (140 lines of code)
    • 21 *.hack files (64 lines of code)
    • 2 *.cmake files (55 lines of code)
    • 9 *.po files (54 lines of code)
    • 3 *.xsd files (36 lines of code)
    • 2 *.phpt files (35 lines of code)
    • 4 *.html files (33 lines of code)
    • 1 *.pl file (29 lines of code)
    • 8 *.mo files (23 lines of code)
    • 1 *.awk file (1 line of code)
  • "*.php" is the biggest, containing 85.44% of the code.
  • "*.awk" is the smallest, containing less than 0.01% of the code.
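The test criteria listed above can be checked against individual paths in the same way as any other path filter. The sketch below is illustrative only: it assumes whole-path matching with a leading "/", which the report does not state, and the helper name `is_test_file` and the sample paths are hypothetical.

```python
import re

# Patterns copied from the test-code criteria listed above.
TEST_PATTERNS = [
    r".*/[Tt]est/.*",
    r".*_test[.].*",
    r".*/[Tt]ests/.*",
    r".*/test_.*",
    r".*_tests[.].*",
    r".*[-]test[-].*",
]


def is_test_file(path):
    """Return True if any test-code pattern matches the path."""
    # Assumption: whole-path matching, with a leading "/" added so that
    # a top-level "test/" folder can satisfy the ".*/" prefix.
    candidate = "/" + path.lstrip("/")
    return any(re.fullmatch(pattern, candidate) for pattern in TEST_PATTERNS)


# Hypothetical paths, for illustration only.
for example in ["hphp/test/slow/a.php", "hphp/util/foo_test.cpp", "hphp/runtime/vm/unit.cpp"]:
    print(example, "->", is_test_file(example))
```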


*.php: 495,867 LOC (85%), 21,489 files
*.ml: 21,683 LOC (3%), 172 files
*.cpp: 21,427 LOC (3%), 90 files
*.py: 14,975 LOC (2%), 43 files
*.inc: 10,076 LOC (1%), 568 files
*.cc: 7,507 LOC (1%), 4 files
*.wsdl: 3,475 LOC (<1%), 34 files
*.rs: 1,972 LOC (<1%), 17 files
*.in: 838 LOC (<1%), 43 files
*.sh: 618 LOC (<1%), 16 files
*.h: 596 LOC (<1%), 15 files
*.hhi: 316 LOC (<1%), 8 files
*.xml: 235 LOC (<1%), 23 files
*.xsl: 160 LOC (<1%), 12 files
*.mli: 154 LOC (<1%), 2 files
*.toml: 140 LOC (<1%), 11 files
*.hack: 64 LOC (<1%), 21 files
*.cmake: 55 LOC (<1%), 2 files
*.po: 54 LOC (<1%), 9 files
*.xsd: 36 LOC (<1%), 3 files
*.phpt: 35 LOC (<1%), 2 files
*.html: 33 LOC (<1%), 4 files
*.pl: 29 LOC (<1%), 1 file
*.mo: 23 LOC (<1%), 8 files
*.awk: 1 LOC (<1%), 1 file
Build and Deployment Code
Source code used to configure or support build and deployment process.
  • The following criteria are used to filter files:
    • files with paths like ".*[.]git[a-z]+".
    • files with paths like ".*/[.]gitmodules".
    • files with paths like ".*/[.]gitignore".
    • files with paths like ".*[.]sh".
    • files with paths like ".*/[.]gitattributes".
  • 28 files match defined criteria (1,036 lines of code, 0.1% vs. main code). All matches are in *.sh files.


*.sh: 1,036 LOC (100%), 28 files
Other Code
  • The following criteria are used to filter files:
    • files with paths like ".*[.]md".
    • files with paths like ".*[.]txt".
    • files with paths like ".*[.]json".
    • files with paths like ".*/README[.][a-z0-9]+".
    • files with paths like ".*/[.]gitignore".
    • files with paths like ".*[.]patch".
    • files with paths like ".*[.]ini".
    • files with paths like ".*/[Ee]xamples/.*".
    • files with paths like ".*[.]diff".
  • 949 files match defined criteria (22,705 lines of code, 2.2% vs. main code):
    • 46 *.md files (8,374 lines of code)
    • 234 *.txt files (7,422 lines of code)
    • 43 *.json files (3,641 lines of code)
    • 594 *.ini files (1,759 lines of code)
    • 1 *.patch file (829 lines of code)
    • 26 *.diff files (393 lines of code)
    • 3 *.py files (247 lines of code)
    • 2 *.c files (40 lines of code)
  • "*.md" is the biggest, containing 36.88% of the code.
  • "*.c" is the smallest, containing 0.18% of the code.


*.md: 8,374 LOC (36%), 46 files
*.txt: 7,422 LOC (32%), 234 files
*.json: 3,641 LOC (16%), 43 files
*.ini: 1,759 LOC (7%), 594 files
*.patch: 829 LOC (3%), 1 file
*.diff: 393 LOC (1%), 26 files
*.py: 247 LOC (1%), 3 files
*.c: 40 LOC (<1%), 2 files
Analyzers
Info about analyzers used for source code examinations.
  • *.cpp files are analyzed with CppAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Advanced heuristic dependency analysis
  • *.ml files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.rs files are analyzed with RustAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • No dependency analysis
  • *.h files are analyzed with CppAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Advanced heuristic dependency analysis
  • *.php files are analyzed with PhpAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Basic heuristic dependency analysis
  • *.hhi files are analyzed with HtmlAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Advanced heuristic dependency analysis
  • *.mli files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.cmake files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.cc files are analyzed with CppAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Advanced heuristic dependency analysis
  • *.c files are analyzed with CStyleAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • No dependency analysis
  • *.py files are analyzed with PythonAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Basic heuristic dependency analysis
  • *.toml files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.mll files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.mly files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.hack files are analyzed with HackAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • No dependency analysis
  • *.ll files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.ts files are analyzed with TypeScriptAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • No dependency analysis
  • *.in files are analyzed with RustAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • No dependency analysis
  • *.y files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.css files are analyzed with CssAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.pl files are analyzed with PerlAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Basic heuristic dependency analysis
  • *.hpp files are analyzed with CppAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Advanced heuristic dependency analysis
  • *.gdb files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.profile files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.bzl files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.yml files are analyzed with YamlAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
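The difference between the "basic" and "advanced" code cleaning described above can be illustrated with a small line counter. This is a deliberately naive sketch, not the analyzers' actual logic: it handles only C-style comments and ignores comment markers that appear inside string literals. The function names and the sample source are hypothetical.

```python
import re


def basic_loc(source):
    """Basic cleaning: count lines that are not empty or whitespace-only."""
    return sum(1 for line in source.splitlines() if line.strip())


def advanced_loc(source):
    """Advanced cleaning: also drop C-style comments before counting.

    Naive on purpose: comment markers inside string literals are not handled.
    """
    without_block = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    without_line = re.sub(r"//[^\n]*", "", without_block)
    return basic_loc(without_line)


# Hypothetical sample, for illustration only.
SAMPLE = """\
// entry point
int main() {

    /* nothing
       to do */
    return 0;  // done
}
"""

print("basic:", basic_loc(SAMPLE))
print("advanced:", advanced_loc(SAMPLE))
```

For languages handled by DefaultLanguageAnalyzer, the reported LOC would correspond to the basic count, so comment-heavy files in those languages contribute more lines than they would under advanced cleaning.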


2022-04-14 22:35