x = input()
y = "Hello "
z = y + x
print(x)  # <- Data Flow Analysis (data comes directly from input)
print(y)  # Static string
print(z)  # <- Taint Flow Analysis (taint propagates because x is concatenated into z)
Taint Analysis
Sources (untrustworthy application inputs)
Sinks (methods / assignments of interest)
Sanitizers (secure the user data)
Passthroughs / Taint steps (functions that propagate tainted data)
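A minimal sketch of how these four roles fit together, assuming a toy analyzer that only understands top-level assignments and calls. The names `SOURCES`, `SINKS`, `SANITIZERS`, `is_tainted`, `find_tainted_sinks`, and the `escape` sanitizer are hypothetical and not taken from any real tool.

```python
import ast

# Hypothetical rule sets for this sketch; real tools ship far larger models.
SOURCES = {"input"}       # calls that return untrusted data
SINKS = {"print"}         # calls that tainted data should not reach
SANITIZERS = {"escape"}   # calls whose return value is treated as clean


def is_tainted(node, tainted):
    """Return True if the expression `node` may carry tainted data."""
    if isinstance(node, ast.Name):
        return node.id in tainted
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        if node.func.id in SOURCES:
            return True
        if node.func.id in SANITIZERS:
            return False
        # Passthrough: any other call propagates taint from its arguments.
        return any(is_tainted(arg, tainted) for arg in node.args)
    # Generic taint step: an expression is tainted if any child is tainted
    # (this covers string concatenation such as `y + x`).
    return any(is_tainted(child, tainted) for child in ast.iter_child_nodes(node))


def find_tainted_sinks(code):
    """Return the line numbers where tainted data reaches a sink."""
    tainted, findings = set(), []
    for stmt in ast.parse(code).body:
        if isinstance(stmt, ast.Assign):
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    if is_tainted(stmt.value, tainted):
                        tainted.add(target.id)
                    else:
                        tainted.discard(target.id)
        elif isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call):
            call = stmt.value
            if isinstance(call.func, ast.Name) and call.func.id in SINKS:
                if any(is_tainted(arg, tainted) for arg in call.args):
                    findings.append(stmt.lineno)
    return findings


EXAMPLE = '''x = input()
y = "Hello "
z = y + x
print(x)
print(y)
print(z)
safe = escape(z)
print(safe)
'''
print(find_tainted_sinks(EXAMPLE))  # lines 4 and 6 are reported
```

The code is only parsed, never executed, so `input()` is not actually called. Branches, loops, and function bodies are ignored here, which is exactly where real tools need control flow and data flow graphs.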
Patterns - Rules & Queries
Define a Security Pattern that you want to detect
SQL Injection, Cross-Site Scripting, etc.
All static code analysis tools have rules and/or queries
Hardcoded or Customisable
Open or Closed source
Configuration Rules or Dynamic Queries
False Positives & False Negatives
Configuration Rules or Dynamic Queries
Configuration Rules (yaml, json, data structure...)
Simpler to write
Complex flows can be very hard to declare
Dynamic Queries (programming-like language)
Harder to learn and write
Complex flows are easier to express
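The trade-off can be illustrated with a small sketch. It assumes a hypothetical rule format (`RULE`) consumed by a generic engine (`run_rule`), next to a query written as ordinary code (`query_eval_of_non_constant`). None of these names come from a real tool.

```python
import ast

# Configuration rule: plain data, easy to write, but limited to what the
# engine understands (here: "flag calls with these names").
RULE = {"name": "Dangerous call", "calls": ["eval", "exec"]}


def run_rule(rule, tree):
    """Generic engine: flag every call whose name is listed in the rule."""
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id in rule["calls"]
    )


def query_eval_of_non_constant(tree):
    """Dynamic query: arbitrary logic, here 'eval() whose argument is not a literal'."""
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "eval"
        and node.args
        and not isinstance(node.args[0], ast.Constant)
    )


CODE = 'a = eval("1 + 1")\nb = eval(user_data)\n'
TREE = ast.parse(CODE)
print(run_rule(RULE, TREE))               # both calls are flagged
print(query_eval_of_non_constant(TREE))   # only the non-literal call is flagged
```

The rule is shorter, but it cannot express the extra condition without the engine growing a new feature. The query can, at the cost of having to learn the tree structure.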
Just use Regex!?
Jamie Zawinski (early Netscape engineer):
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
Example - Detecting Simple Configuration Problems
from flask import Flask, render_template
app = Flask("MyApp")
@app.route("/")
def index():
    return render_template("index.html")

if __name__ == "__main__":
    app.run("0.0.0.0", 80, debug=True)
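One way a tool could detect this misconfiguration is to walk the AST and look for a `.run(...)` call with `debug=True`. This is a minimal sketch. `find_debug_enabled` is a hypothetical helper, and it does not check that the receiver is really a Flask app.

```python
import ast


def find_debug_enabled(code):
    """Return line numbers of `<something>.run(..., debug=True)` calls."""
    findings = []
    for node in ast.walk(ast.parse(code)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        if node.func.attr != "run":
            continue
        for keyword in node.keywords:
            if (
                keyword.arg == "debug"
                and isinstance(keyword.value, ast.Constant)
                and keyword.value.value is True
            ):
                findings.append(node.lineno)
    return findings


FLASK_APP = '''from flask import Flask, render_template
app = Flask("MyApp")

@app.route("/")
def index():
    return render_template("index.html")

if __name__ == "__main__":
    app.run("0.0.0.0", 80, debug=True)
'''
print(find_debug_enabled(FLASK_APP))  # reports line 9
```

The source is only parsed, so Flask does not need to be installed. Unlike a regex, this does not fire on `debug=True` inside a comment or a string.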
name: "Cross Site Scripting"
sources:
  flask:
    - "flask.request.args[]"
    - "flask.request.args.get()"
sinks:
  flask:
    - "flask.make_response([0])"
    - "flask.Response([0]){}"
    - "flask.render_template_string([0])"
    - "flask.abort([2])"
from sca import dfg, results
from sca.flask import flask_sources, flask_sinks_xss, flask_sanitizers_xss
# Or even easier...
from sca.web import web_sources, web_sinks_xss, sanitizers_xss
# XSS Query in a couple of lines
results = dfg.taint(web_sources, web_sinks_xss, sanitizers_xss)
Sanitizers
Functions or checks that cause the input to be used securely
Escaping or Encoding before the sink
Context is extremely important
Inline, Direct, and Indirect are... extremely complicated!
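A short illustration of why context matters, using only the standard library's `html.escape`. The payload and the HTML snippets are made up for this example.

```python
import html

payload = "x' onmouseover='alert(1)"

# HTML body context: escaping <, >, and & is enough to stop tag injection.
body = "<p>" + html.escape("<script>alert(1)</script>") + "</p>"

# Attribute context: with quote=False the quotes are left intact, so the
# payload breaks out of the single-quoted attribute and adds an event handler.
unsafe_attr = "<input value='" + html.escape(payload, quote=False) + "'>"

# The same function with quote=True also escapes quotes and closes the gap.
safe_attr = "<input value='" + html.escape(payload, quote=True) + "'>"

print(body)
print(unsafe_attr)
print(safe_attr)
```

A taint analysis that treats every `html.escape` call as a sanitizer would miss the second case, which is why the sink's context has to be part of the model.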
**Title:**
> Introduction to Static Code Analysis
**Description:**
> This talk will give an introduction into what static code analysis is, go into a deeper dive into how it's done today, and finally discuss the impact & complications around using static analysis.
**Slides:**
https://presentations.geekmasher.dev/2021-09-Defcon44131
This is not a complete list, but a general one:
- AST: Tree representation of the parsed code
- CFG: Directed graph of the control flows in the application
- DFG: Directed graph of the data flows in the application
- TA: Taint analysis, tracking untrusted data from sources to sinks
You can build a static code analysis tool at any of these levels
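To make the first level concrete, Python's built-in `ast` module can show the tree for a single statement. This is only an illustration; other languages and tools use their own parsers and node types.

```python
import ast

# Parse one assignment and inspect its abstract syntax tree.
tree = ast.parse("z = y + x")
assign = tree.body[0]

# `indent` requires Python 3.9 or newer.
print(ast.dump(assign, indent=2))

# The same structure can be navigated programmatically.
print(type(assign).__name__)        # the statement node
print(type(assign.value).__name__)  # the expression on the right-hand side
print(assign.value.left.id, assign.value.right.id)
```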
Source: https://en.wikipedia.org/wiki/Control-flow_graph
(a) an if-then-else
(b) a while loop
(c) a natural loop with two exits, e.g. while with an if...break in the middle; non-structured but reducible
(d) an irreducible CFG: a loop with two entry points, e.g. goto into a while or for loop
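Shapes (a) and (b) can be sketched as adjacency dictionaries, with a depth-first search to tell them apart. The block names and the `has_loop` helper are hypothetical, chosen only for this illustration.

```python
# Each key is a basic block; each value lists its successor blocks.
IF_THEN_ELSE = {"cond": ["then", "else"], "then": ["exit"], "else": ["exit"], "exit": []}
WHILE_LOOP = {"cond": ["body", "exit"], "body": ["cond"], "exit": []}


def has_loop(cfg, entry):
    """Detect a cycle reachable from `entry` with a depth-first search."""
    on_stack, done = set(), set()

    def visit(block):
        if block in on_stack:
            return True   # back edge: we returned to a block still being explored
        if block in done:
            return False
        on_stack.add(block)
        found = any(visit(succ) for succ in cfg[block])
        on_stack.discard(block)
        done.add(block)
        return found

    return visit(entry)


print(has_loop(IF_THEN_ELSE, "cond"))
print(has_loop(WHILE_LOOP, "cond"))
```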
CodeQL Docs:
Data flow analysis is used to compute the possible values that a variable can hold at various points in a program, determining how those values propagate through the program and where they are used.
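A minimal sketch of that idea, assuming a toy analysis that only handles constant assignments, copies, and `if`/`else`. `possible_values` is a hypothetical helper and far simpler than what CodeQL or any production tool does.

```python
import ast


def possible_values(code):
    """Map each variable to the set of constant values it may hold at the end."""

    def run(stmts, env):
        for stmt in stmts:
            if isinstance(stmt, ast.Assign) and isinstance(stmt.targets[0], ast.Name):
                name = stmt.targets[0].id
                if isinstance(stmt.value, ast.Constant):
                    env[name] = {stmt.value.value}
                elif isinstance(stmt.value, ast.Name):
                    # A copy inherits every value the source variable may hold.
                    env[name] = set(env.get(stmt.value.id, {"<unknown>"}))
                else:
                    env[name] = {"<unknown>"}
            elif isinstance(stmt, ast.If):
                # Analyze both branches from the same starting state, then merge.
                then_env = run(stmt.body, {k: set(v) for k, v in env.items()})
                else_env = run(stmt.orelse, {k: set(v) for k, v in env.items()})
                env = {
                    k: then_env.get(k, set()) | else_env.get(k, set())
                    for k in then_env.keys() | else_env.keys()
                }
        return env

    return run(ast.parse(code).body, {})


CODE = '''mode = "safe"
if flag:
    mode = "debug"
copy = mode
'''
print(possible_values(CODE))
```

Because the condition `flag` is not evaluated, both branches are considered possible, so `mode` and `copy` may each hold either `"safe"` or `"debug"`. Loops and variables defined in only one branch are not handled precisely in this sketch.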
Sources:
- [How does JavaScript and JavaScript engine work in the browser and node?](https://medium.com/jspoint/how-javascript-works-in-browser-and-node-ab7d0d09ac2f)
- [Firing up the Ignition interpreter](https://v8.dev/blog/ignition-interpreter)
- [Carnegie Mellon University - Taint Analysis](https://www.cs.cmu.edu/~ckaestne/15313/2018/20181023-taint-analysis.pdf)
- [Northwestern - Static Analysis](https://users.cs.northwestern.edu/~ychen/classes/cs450-f16/lectures/10.10_Static%20Analysis.pdf)
- https://labs.f-secure.com/assets/BlogFiles/mwri-Static-Analysis-for-Code-and-Infrastructure-final-DevSecCon2016-2016-24-10.pdf