microsoft / omi
Source Code Overview

Analysis scope, overview of main, test, generated, deployment, build, and other code.

Source Code Analysis Scope
Files included in and excluded from analyses
  • 22 extensions are included in analyses: h, c, cpp, txt, xml, reg, cmd, sh, inc, md, sed, py, mak, y, yml, texi, json, html, in, gitignore, l, s
  • 5 criteria are used to exclude files from analysis:
    • exclude files with path like ".*/[.][a-zA-Z0-9_]+.*" (Hidden files and folders) (2 files).
    • exclude files with path like ".*/git[-]history[.]txt" (Git history) (1 file).
    • exclude files with path like ".*/git[-][a-zA-Z0-9_]+[.]txt" (Git data exports for sokrates analyses) (0 files).
    • exclude files with path like ".*/sokrates_conventions[.]json" (Sokrates scoping conventions) (1 file).
    • exclude files with path like ".*[.]txt" (Text files) (73 files).
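The exclusion criteria above are ordinary regular expressions applied to file paths. A minimal sketch in Python, assuming each pattern must match the whole path and that the criteria are checked in the listed order; the function name `exclusion_reason` and the sample paths are hypothetical and are not taken from the analyzed repository or from the analysis tool:

```python
import re

# Patterns and labels copied from the exclusion list above.
EXCLUSIONS = [
    (r".*/[.][a-zA-Z0-9_]+.*", "Hidden files and folders"),
    (r".*/git[-]history[.]txt", "Git history"),
    (r".*/git[-][a-zA-Z0-9_]+[.]txt", "Git data exports for sokrates analyses"),
    (r".*/sokrates_conventions[.]json", "Sokrates scoping conventions"),
    (r".*[.]txt", "Text files"),
]


def exclusion_reason(path):
    """Return the label of the first matching criterion, or None if the path stays in scope.

    Assumption: a criterion applies only when the pattern matches the entire path.
    """
    for pattern, label in EXCLUSIONS:
        if re.fullmatch(pattern, path):
            return label
    return None


# Hypothetical paths, for illustration only.
for sample in ["repo/.github/ci.yml", "repo/git-history.txt", "repo/docs/notes.txt", "repo/src/base.c"]:
    print(sample, "->", exclusion_reason(sample))
```

Under these assumptions a path is attributed to one criterion only, the first that matches, so the per-criterion file counts do not overlap.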
Overview of Analyzed Files
Basic stats on analyzed files
Intro
For analysis purposes we separate files in scope into several categories: main, test, generated, deployment and build, and other.

  • The main category contains all manually created source code files that are used in production.
  • Files in the main category are used as input for other analyses: logical decomposition, concerns, duplication, file size, unit size, and conditional complexity.
  • Test source code files are used only for testing of the product. These files are normally not deployed to production.
  • Build and deployment source code files are used to configure or support the build and deployment process.
  • Generated source code files are automatically generated files that have not been manually changed after generation.
  • While a source code folder may contain a number of files, we are primarily interested in the source code files that are being written and maintained by developers.
  • Files containing binaries, documentation, or third-party libraries, for instance, are excluded from analysis. The exception is third-party libraries that have been changed by developers.

main: 188,752 LOC (19%), 584 files
test: 59,216 LOC (6%), 153 files
generated: 0 LOC (0%), 0 files
build and deployment: 1,305 LOC (<1%), 13 files
other: 713,646 LOC (74%), 1,166 files
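The category percentages are consistent with each category's share of the total lines of code in scope, apparently truncated to a whole percent (main is about 19.6% and is shown as 19%). A short check of that arithmetic, using the figures from the summary; the variable and function names are illustrative:

```python
# Lines of code per category, as reported in the summary.
LOC_BY_CATEGORY = {
    "main": 188752,
    "test": 59216,
    "generated": 0,
    "build and deployment": 1305,
    "other": 713646,
}

TOTAL_LOC = sum(LOC_BY_CATEGORY.values())


def share_percent(category):
    """Share of a category in the total lines of code, as a percentage."""
    return 100.0 * LOC_BY_CATEGORY[category] / TOTAL_LOC


for name in LOC_BY_CATEGORY:
    # int() truncates, which reproduces the whole-percent figures in the summary.
    print(f"{name}: {share_percent(name):.2f}% (shown as {int(share_percent(name))}%)")
```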
Main Code
All manually created or maintained source code that defines the logic of the product and runs in a production environment.
  • The following criteria are used to filter files:
    • files with paths like ".*".
  • 584 files match defined criteria (188,752 lines of code, 100.0% vs. main code):
    • 205 *.c files (108,626 lines of code)
    • 255 *.h files (47,474 lines of code)
    • 58 *.cpp files (21,305 lines of code)
    • 1 *.html file (4,738 lines of code)
    • 3 *.y files (3,641 lines of code)
    • 7 *.inc files (1,210 lines of code)
    • 1 *.l file (858 lines of code)
    • 11 *.cmd files (372 lines of code)
    • 36 *.reg files (286 lines of code)
    • 4 *.py files (178 lines of code)
    • 1 *.s file (51 lines of code)
    • 1 *.yml file (10 lines of code)
    • 1 *.in file (3 lines of code)
  • "*.c" is the biggest, containing 57.55% of the main code.
  • "*.in" is the smallest, containing less than 0.01% of the main code.


*.c: 108,626 LOC (57%), 205 files
*.h: 47,474 LOC (25%), 255 files
*.cpp: 21,305 LOC (11%), 58 files
*.html: 4,738 LOC (2%), 1 file
*.y: 3,641 LOC (1%), 3 files
*.inc: 1,210 LOC (<1%), 7 files
*.l: 858 LOC (<1%), 1 file
*.cmd: 372 LOC (<1%), 11 files
*.reg: 286 LOC (<1%), 36 files
*.py: 178 LOC (<1%), 4 files
*.s: 51 LOC (<1%), 1 file
*.yml: 10 LOC (<1%), 1 file
*.in: 3 LOC (<1%), 1 file
Test Code
Used only for testing of the product. Normally not deployed in a production environment.
  • The following criteria are used to filter files:
    • files with paths like ".*/test_.*".
    • files with paths like ".*/[Tt]ests/.*".
    • files with paths like ".*[-]test[-].*".
    • files with paths like ".*_tests[.].*".
    • files with paths like ".*_test[.].*".
    • files with any line of content like ".*/simpletest/.*".
  • 153 files match defined criteria (59,216 lines of code, 31.4% vs. main code):
    • 59 *.cpp files (40,432 lines of code)
    • 18 *.c files (9,737 lines of code)
    • 46 *.h files (8,440 lines of code)
    • 3 *.inc files (329 lines of code)
    • 17 *.reg files (140 lines of code)
    • 6 *.sed files (52 lines of code)
    • 2 *.cmd files (47 lines of code)
    • 2 *.sh files (39 lines of code)
  • "*.cpp" is the biggest, containing 68.28% of the test code.
  • "*.sh" is the smallest, containing 0.07% of the test code.


*.cpp: 40,432 LOC (68%), 59 files
*.c: 9,737 LOC (16%), 18 files
*.h: 8,440 LOC (14%), 46 files
*.inc: 329 LOC (<1%), 3 files
*.reg: 140 LOC (<1%), 17 files
*.sed: 52 LOC (<1%), 6 files
*.cmd: 47 LOC (<1%), 2 files
*.sh: 39 LOC (<1%), 2 files
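The test-code criteria combine five path patterns with one content pattern. The sketch below shows one way such a filter could be evaluated in Python. It assumes whole-string matching for both paths and content lines; the function name `is_test_file` and the sample inputs are hypothetical and are not taken from the analyzed repository:

```python
import re

# Path patterns copied from the test-code criteria above.
TEST_PATH_PATTERNS = [
    r".*/test_.*",
    r".*/[Tt]ests/.*",
    r".*[-]test[-].*",
    r".*_tests[.].*",
    r".*_test[.].*",
]

# Content pattern copied from the criteria: any single matching line is enough.
TEST_CONTENT_PATTERN = r".*/simpletest/.*"


def is_test_file(path, lines=()):
    """Return True if the path, or any line of the file's content, matches a test criterion."""
    if any(re.fullmatch(pattern, path) for pattern in TEST_PATH_PATTERNS):
        return True
    return any(re.fullmatch(TEST_CONTENT_PATTERN, line) for line in lines)


# Hypothetical inputs, for illustration only.
print(is_test_file("repo/tests/base/check.cpp"))                       # matched by path
print(is_test_file("repo/base/helper.c", ["#include <x/simpletest/y.h>"]))  # matched by content
print(is_test_file("repo/base/base.c", ["int main(void) { return 0; }"]))
```

The content criterion means a file can be classified as test code even when nothing in its path suggests it.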
Build and Deployment Code
Source code used to configure or support the build and deployment process.
  • The following criteria are used to filter files:
    • files with paths like ".*[.]git[a-z]+".
    • files with paths like ".*/[.]gitignore".
    • files with paths like ".*[.]mak".
    • files with paths like ".*[.]sh".
  • 13 files match defined criteria (1,305 lines of code, 0.7% vs. main code):
    • 3 *.mak files (776 lines of code)
    • 10 *.sh files (529 lines of code)
  • "*.mak" is the biggest, containing 59.46% of the build and deployment code.
  • "*.sh" is the smallest, containing 40.54% of the build and deployment code.


*.mak: 776 LOC (59%), 3 files
*.sh: 529 LOC (40%), 10 files
Other Code
  • The following criteria are used to filter files:
    • files with paths like ".*[.]json".
    • files with paths like ".*/[.]gitignore".
    • files with paths like ".*[.]md".
    • files with paths like ".*/README[.][a-z0-9]+".
    • files with paths like ".*[.]txt".
    • files with paths like ".*[.]texi".
    • files with paths like ".*/[Ss]amples/.*".
    • files with paths like ".*[.](xml|xsd|robot|sql|pgsql|dashboard|profile|ipynb|raml|avsc|al)".
  • 1166 files match defined criteria (713,646 lines of code, 378.1% vs. main code):
    • 617 *.h files (423,327 lines of code)
    • 391 *.c files (263,175 lines of code)
    • 95 *.cpp files (20,195 lines of code)
    • 1 *.texi file (4,684 lines of code)
    • 8 *.md files (1,317 lines of code)
    • 53 *.xml files (921 lines of code)
    • 1 *.mak file (27 lines of code)
  • "*.h" is the biggest, containing 59.32% of the other code.
  • "*.mak" is the smallest, containing less than 0.01% of the other code.


*.h: 423,327 LOC (59%), 617 files
*.c: 263,175 LOC (36%), 391 files
*.cpp: 20,195 LOC (2%), 95 files
*.texi: 4,684 LOC (<1%), 1 file
*.md: 1,317 LOC (<1%), 8 files
*.xml: 921 LOC (<1%), 53 files
*.mak: 27 LOC (<1%), 1 file
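The "vs. main code" percentages reported for each category are consistent with dividing that category's lines of code by the 188,752 lines of main code. A small check of the reported values, with `vs_main` as a hypothetical helper name:

```python
MAIN_LOC = 188752  # lines of main code, as reported above


def vs_main(category_loc):
    """Size of a category relative to main code, as a percentage rounded to one decimal."""
    return round(100.0 * category_loc / MAIN_LOC, 1)


print("test:", vs_main(59216))                 # test code
print("build and deployment:", vs_main(1305))  # build and deployment code
print("other:", vs_main(713646))               # other code
```

A value above 100%, as for the other category, means that category holds more lines of code than the main category itself.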
Analyzers
Info about analyzers used for source code examinations.
  • *.c files are analyzed with CStyleAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • No dependency analysis
  • *.h files are analyzed with CppAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Advanced heuristic dependency analysis
  • *.cpp files are analyzed with CppAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Advanced heuristic dependency analysis
  • *.html files are analyzed with HtmlAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Advanced heuristic dependency analysis
  • *.y files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.inc files are analyzed with PhpAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Basic heuristic dependency analysis
  • *.l files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.cmd files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.reg files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.py files are analyzed with PythonAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • Basic heuristic dependency analysis
  • *.s files are analyzed with DefaultLanguageAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Basic code cleaning (empty lines removed for LOC calculations and duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.yml files are analyzed with YamlAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • No unit size analysis
    • No conditional complexity analysis
    • No dependency analysis
  • *.in files are analyzed with RustAnalyzer:
    • All basic standard analyses supported (source code overview, duplication, file size, concerns, findings, metrics, controls)
    • Advanced code cleaning (empty lines and comments removed for LOC calculations, additional cleaning for duplication calculations)
    • Unit size analysis
    • Conditional complexity analysis
    • No dependency analysis
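
The analyzer assignments listed above amount to a lookup from file extension to analyzer. A minimal sketch of that lookup in Python, with the mapping transcribed from this list; the function names are hypothetical, and returning `None` for an extension that is not listed is an assumption, since the list does not say how other extensions are handled:

```python
import os

# Extension-to-analyzer assignments transcribed from the list above.
ANALYZER_BY_EXTENSION = {
    "c": "CStyleAnalyzer",
    "h": "CppAnalyzer",
    "cpp": "CppAnalyzer",
    "html": "HtmlAnalyzer",
    "y": "DefaultLanguageAnalyzer",
    "inc": "PhpAnalyzer",
    "l": "DefaultLanguageAnalyzer",
    "cmd": "DefaultLanguageAnalyzer",
    "reg": "DefaultLanguageAnalyzer",
    "py": "PythonAnalyzer",
    "s": "DefaultLanguageAnalyzer",
    "yml": "YamlAnalyzer",
    "in": "RustAnalyzer",
}

# Analyzers listed above with unit size and conditional complexity analysis.
UNIT_SIZE_ANALYZERS = {
    "CStyleAnalyzer", "CppAnalyzer", "HtmlAnalyzer",
    "PhpAnalyzer", "PythonAnalyzer", "RustAnalyzer",
}


def analyzer_for(path):
    """Return the analyzer name for a path, or None if its extension is not listed."""
    extension = os.path.splitext(path)[1].lstrip(".")
    return ANALYZER_BY_EXTENSION.get(extension)


def has_unit_size_analysis(path):
    """True if the file's analyzer is listed with unit size analysis."""
    return analyzer_for(path) in UNIT_SIZE_ANALYZERS


print(analyzer_for("base.c"), has_unit_size_analysis("base.c"))
print(analyzer_for("parser.y"), has_unit_size_analysis("parser.y"))
```

Because the assignment depends only on the extension, *.inc files are handled by PhpAnalyzer and *.in files by RustAnalyzer regardless of what language they actually contain, which is worth keeping in mind when reading unit size and complexity results for those files.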


2022-01-30 11:30