VIBRANCE: Securing Java Applications with Advanced Techniques


The VIBRANCE project aims to eliminate vulnerabilities in Java applications by analyzing bytecode and applying nuanced confinement and diversification strategies. By addressing common weaknesses such as SQL injection, OS command injection, and tainted loop bounds, VIBRANCE hardens an application while allowing it to continue executing. The work is funded by IARPA's STONESOUP program.


Uploaded on Oct 10, 2024


Presentation Transcript

VIBRANCE: Vulnerabilities in Bytecode Removed by Analysis and Nuanced Confinement and Diversification

Alessandro Coglio (PI), Stephen Fitzpatrick, Limei Gilham, Cordell Green
KESTREL INSTITUTE
Henny Sipma (PI), Matthew Barry, Anca Browne, Doug Smith, Arnaud Venet
Martin Rinard (PI), Jeff Perkins (PM), Jordan Eikenberry, Douglas Kramm, Paolo Piselli, Daniel Willenson, Sasa Misailovic, Fan Long, DOLL Inc.

Funded by IARPA's STONESOUP Program
May 9th, 2012, HCSS

Problem: normal run
- The application receives external input (from network, files, user, ...)
- The application acts on external resources (e.g. DB, files, settings, ...), often accessed using APIs via unstructured strings that encode powerful languages (e.g. SQL, Unix shells)
- Inputs affect the number of loop iterations and the flow of data

Problem: attack
- Unexpected external input (from network, files, user, ...) has an unexpected effect on external resources (e.g. DB, files, settings, ...): application data lost/stolen, arbitrary commands executed
- Unexpected number of loop iterations: denial of service
- Problem: missing/faulty validation
  - checks are missing
  - checks are incomplete
  - checks occur at the wrong time (e.g. before decoding)

SQL Example
Code: String.format("select ... where user='%s' and passwd='%s'", user, passwd)
Inputs: user = John' or 1=1 --
        passwd = (empty)
Executed SQL: select ... where user='John' or 1=1 --' and passwd=''
Result: Attacker can access information without knowing the password
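The string construction on this slide can be reproduced in a few lines. The sketch below is illustrative only: the table name and column list (`select * from users`) are assumptions, since the slide elides that part of the query, and the class and method names are hypothetical.

```java
class SqlInjectionDemo {
    // Hypothetical table and column list: the slide elides this part of the query.
    static final String TEMPLATE =
        "select * from users where user='%s' and passwd='%s'";

    // Vulnerable on purpose: untrusted values are pasted into the SQL text as-is.
    static String buildQuery(String user, String passwd) {
        return String.format(TEMPLATE, user, passwd);
    }

    public static void main(String[] args) {
        // Benign input: the quotes delimit the two values as intended.
        System.out.println(buildQuery("John", "secret"));
        // Attack input: the embedded quote closes the string early, and "--"
        // comments out the password check.
        System.out.println(buildQuery("John' or 1=1 --", ""));
    }
}
```

The second call yields a query whose `where` clause is always true, which is the effect the slide describes.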

Command Example
Code: String.format("grep '%s' names.txt", name)
Inputs: name = x' /dev/null; cat /etc/passwd; echo '
Executed Command: grep 'x' /dev/null; cat /etc/passwd; echo '' names.txt
Result: Attacker can execute an arbitrary command (such as reading /etc/passwd)

Loop Example
Code: n = Integer.parseInt(data); for (int i=0; i<n; i++) ;
Inputs: data = -1
Loop bound: n = -1
Result: Attacker can force loop to execute a very large number of times: denial of service
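A minimal sketch of an input-controlled loop bound follows. One caveat: with the `i < n` comparison as transcribed, a bound of -1 produces zero iterations, so the sketch uses a large positive value to show the effect; the original slide may have used a different comparison or input. The class and method names are hypothetical.

```java
class LoopBoundDemo {
    // Runs the loop from the slide and reports how many iterations executed.
    static long runLoop(String data) {
        int n = Integer.parseInt(data); // loop bound taken directly from external input
        long iterations = 0;
        for (int i = 0; i < n; i++) {
            iterations++;               // stands in for the real loop body
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(runLoop("10"));      // expected workload
        System.out.println(runLoop("1000000")); // attacker-chosen workload
    }
}
```

Because nothing limits `n`, the amount of work is chosen entirely by whoever supplies `data`.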

Goal of VIBRANCE: Eliminate Vulnerabilities & Enable Continued Execution
original Java application -> VIBRANCE -> hardened Java application

VIBRANCE Approach
Diagram: external input (from network, files, user, ...) flows into the application, which holds internal data and acts on external resources (e.g. DB, files, settings, ...); inputs affect the number of loop iterations.
- Track data provenance (trusted, untrusted, mixed), statically and at run time
- Confine dangerous untrusted data as it is about to be used (untrusted data may be dangerous or not)
- Randomize trusted keywords so that the attacker cannot create valid syntax

VIBRANCE Architecture
original Java app. -> VIBRANCE (static analysis, randomization, run-time tracking, run-time confinement) -> hardened Java app.

Status
- Comprehensive and precise protection from injection and other tainted-data attacks
  - SQL, OS commands, LDAP, XPath, XQuery, file path traversal, tainted loop bounds, ...
  - Note that SQL injection is the #1 or #2 vulnerability
- Safe continued execution (beyond fail-safe and failure-oblivious)
- Can conservatively handle unforeseen weaknesses
- Second layer of protection via automatic keyword randomization (previous approaches were manual)
- Automatic server protection
- Successfully scales to 200K LOC applications
- Robust, ready for field test

VIBRANCE Operates on Java Bytecode (not Java Source)
- Bytecode is always available, source may not be
- Bytecode is much easier to manipulate than source
- Little information loss from source to bytecode
- More general, e.g. non-Java languages (like JRuby and Scala) compile to bytecode
- More easily extensible to similar languages, e.g. .NET


(1) Tracking
- Internal data is always trusted
- External data comes from API calls
  - Configurable policy distinguishes trusted and untrusted API calls (e.g. local file may be trusted in server program but not in end-user program)
- Trusted and untrusted data is tracked through
  - Java bytecode instructions (e.g. store value in object field)
  - API calls (e.g. string operations, including complex operations such as regex substitution)
- This tracking is not specific to particular weaknesses, but applies to all injection and other tainted-data weaknesses
- We use a combination of static and dynamic tracking
  - static tracking reduces runtime overhead of dynamic tracking
  - dynamic tracking reduces false alarms of static tracking
- Dynamic tracking is character by character (vs. whole string), array element by array element (vs. whole array), etc.
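To make character-by-character tracking concrete, here is a minimal sketch of a string paired with a per-character taint mask. It is not VIBRANCE's mechanism, which instruments bytecode rather than introducing a wrapper type; `TaintedString` and its methods are hypothetical names used only for illustration.

```java
import java.util.Arrays;

class TaintedString {
    final String value;
    final boolean[] mask; // mask[i] is true when value.charAt(i) is untrusted

    private TaintedString(String value, boolean[] mask) {
        this.value = value;
        this.mask = mask;
    }

    // Data created inside the program (e.g. a string constant) is trusted.
    static TaintedString trusted(String s) {
        return new TaintedString(s, new boolean[s.length()]);
    }

    // Data returned by an API call that the policy marks as untrusted.
    static TaintedString untrusted(String s) {
        boolean[] mask = new boolean[s.length()];
        Arrays.fill(mask, true);
        return new TaintedString(s, mask);
    }

    // String operations carry the mask along with the characters.
    TaintedString concat(TaintedString other) {
        boolean[] merged = Arrays.copyOf(mask, value.length() + other.value.length());
        System.arraycopy(other.mask, 0, merged, value.length(), other.value.length());
        return new TaintedString(value + other.value, merged);
    }

    TaintedString substring(int begin, int end) {
        return new TaintedString(value.substring(begin, end),
                                 Arrays.copyOfRange(mask, begin, end));
    }

    public static void main(String[] args) {
        TaintedString t = trusted("user='").concat(untrusted("John")).concat(trusted("'"));
        System.out.println(t.value + " " + Arrays.toString(t.mask));
    }
}
```

A whole-string flag would mark the entire result as untrusted; the mask records that only the four characters of `John` came from outside, which is what allows precise checks at the point of use.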

Static Analysis
- Front ends
  - Java byte code front end (.class, .jar, .war): sound abstraction from Java byte code into CHIF
  - C source code front end (.c, CIL): sound abstraction from preprocessed CIL code into CHIF
  - x86 binary front end (.exe): disassembly, abstraction from x86 binary code into CHIF
- Abstract interpretation engine
  - Iterators
  - Abstract domains: constants, intervals, strided intervals, linear equalities, polyhedra, symbolic sets, value sets, taint


(2) Confinement
- Confine untrusted data at point of use
  - API call (e.g. java.sql.Statement.execute(), Runtime.exec())
  - Language construct (e.g. loop bound check, array allocation)
- Confinement is use-specific
  - Type of use (SQL, LDAP, exec, filename, etc.)
  - Specific run-time values at this point of use
    - Is the untrusted data properly quoted?
    - What command is being executed?
- Configurable use-specific rules
  - Checks to perform on untrusted data
  - Attack responses (terminate, fix & continue, ...)

SQL Confinement Example
Code: sql = String.format("select ... where user='%s' and passwd='%s'", user, passwd); stmt.execute(sql);
Inputs: user = John' or 1=1 --
        passwd = (empty)
Unconfined SQL: select ... where user='John' or 1=1 --' and passwd=''
SQL-specific check: If a trusted string contains an untrusted substring inside quotes, then any quotes inside the untrusted string must be escaped
Confined SQL (embedded quote escaped): select ... where user='John'' or 1=1 --' and passwd=''
Result: Attempt to access private information is thwarted
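The SQL-specific check can be sketched as a pass over the query and a parallel taint mask. This is a simplification under stated assumptions, not the tool's implementation: it doubles every untrusted single quote without verifying that the untrusted run actually sits inside trusted quotes, it treats format arguments as the only untrusted data, and the `select * from users` prefix is a hypothetical stand-in for the part of the query the slide elides.

```java
import java.util.Arrays;

class SqlConfinement {
    // Doubles each single quote that lies at an untrusted position, so it is
    // read as a literal quote character instead of a string terminator.
    static String confine(String sql, boolean[] untrusted) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < sql.length(); i++) {
            char c = sql.charAt(i);
            out.append(c);
            if (c == '\'' && untrusted[i]) {
                out.append('\'');
            }
        }
        return out.toString();
    }

    // Builds the query and its taint mask: template pieces (even indices) are
    // trusted, the inserted values (odd indices) are untrusted.
    static String confinedQuery(String user, String passwd) {
        String[] parts = {
            "select * from users where user='", user, "' and passwd='", passwd, "'"
        };
        StringBuilder sql = new StringBuilder();
        for (String part : parts) {
            sql.append(part);
        }
        boolean[] untrusted = new boolean[sql.length()];
        int pos = 0;
        for (int k = 0; k < parts.length; k++) {
            if (k % 2 == 1) {
                Arrays.fill(untrusted, pos, pos + parts[k].length(), true);
            }
            pos += parts[k].length();
        }
        return confine(sql.toString(), untrusted);
    }

    public static void main(String[] args) {
        System.out.println(confinedQuery("John' or 1=1 --", ""));
    }
}
```

With the attack input, the whole of `John'' or 1=1 --` stays inside one string literal, so the query compares `user` against that odd name and continues to execute, which corresponds to the "fix & continue" response.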

Use-Specific Confinement Rules
- Rules are specified by API, e.g.
  - Statement.execute(String sql)
  - Runtime.exec(String[] cmdarray)
- Rules enforce normal use of untrusted input
  - Generally, input should specify a single data value
    - Number
    - Name (alphanumeric)
    - Quoted string
  - Details are specific to each API
- Rules are not provably correct
  - Program intent is unknown
  - Enforce best practices
- Rules are initially conservative and can be configured to allow more risky practices if necessary

Default Backstop Confinement Rules
- Program may
  - Use an API for which our tool lacks specific knowledge (dangerous APIs use native code)
  - Send data over a socket connected to some unknown correspondent
- We perform default, conservative checks on sockets and natives
  - Untrusted data is limited to alphanumeric tokens
- Effective at protecting against new/unknown weaknesses, at the cost of false alarms
- Easy to add more appropriate checks when necessary, eliminating false alarms
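One way to picture the interplay of use-specific rules and the backstop is a lookup table keyed by API with a conservative default. The sketch below is an assumption about structure, not the tool's configuration format; the API keys are plain illustrative strings, and the relaxed rule for file names is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

class ConfinementRules {
    // Backstop from the slide: untrusted data must be an alphanumeric token.
    static final Predicate<String> BACKSTOP = s -> s.matches("[A-Za-z0-9]+");

    private final Map<String, Predicate<String>> rulesByApi = new HashMap<>();

    // Registers a more appropriate check for an API the tool now knows about.
    void addRule(String api, Predicate<String> rule) {
        rulesByApi.put(api, rule);
    }

    // Applies the API-specific rule if one exists, the backstop otherwise.
    boolean allows(String api, String untrustedData) {
        return rulesByApi.getOrDefault(api, BACKSTOP).test(untrustedData);
    }

    public static void main(String[] args) {
        ConfinementRules rules = new ConfinementRules();
        // Unknown API: "names.txt" is rejected because of the dot (a false alarm).
        System.out.println(rules.allows("someNativeCall", "names.txt"));
        // Hypothetical relaxed rule: letters, digits and dots, no path separators.
        rules.addRule("someNativeCall", s -> s.matches("[A-Za-z0-9.]+"));
        System.out.println(rules.allows("someNativeCall", "names.txt"));
    }
}
```

The default errs on the side of rejecting, and adding a rule is a local change that removes a specific false alarm without loosening checks for other APIs.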


(3) Randomization
- Injection attacks insert language-specific keywords (e.g. OR) into strings
  - If the keywords are unknown, the attack will fail (similar to ASLR)
- Replace keywords in string constants with randomized versions, e.g.
  - Turn OR into OR64738
- Protect API call via a parser that uses the randomized keywords
  - Standard (non-randomized) keywords will cause a parse error
- De-randomize on all other uses (file output, string comparisons, etc.)
  - Turn OR64738 into OR
  - Particularly important that messages visible to the attacker do not contain the randomizing number
- We have implemented this for SQL
- This randomization mechanism is independent from tracking and confinement, thus providing an independent line of defense
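The three operations on this slide (randomize constants, reject standard keywords at the API, de-randomize elsewhere) can be sketched with regular expressions. This is a rough illustration under stated assumptions: a real implementation would use a SQL parser and would not flag keywords inside quoted data, the keyword list is a small hypothetical subset, and the fixed suffix `64738` comes from the slide's example where a real system would choose it randomly.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeywordRandomizer {
    private static final String KEYWORDS = "select|from|where|and|or"; // illustrative subset
    private static final Pattern STANDARD =
        Pattern.compile("\\b(?<kw>" + KEYWORDS + ")\\b", Pattern.CASE_INSENSITIVE);

    private final String suffix;
    private final Pattern randomized;

    KeywordRandomizer(String suffix) {
        this.suffix = suffix;
        this.randomized = Pattern.compile(
            "\\b(?<kw>" + KEYWORDS + ")" + Pattern.quote(suffix) + "\\b",
            Pattern.CASE_INSENSITIVE);
    }

    // Applied to trusted string constants: OR becomes OR64738, and so on.
    String randomize(String trustedConstant) {
        return STANDARD.matcher(trustedConstant)
            .replaceAll("${kw}" + Matcher.quoteReplacement(suffix));
    }

    // Stand-in for the protecting parser: any standard keyword is an error.
    boolean accepts(String query) {
        return !STANDARD.matcher(query).find();
    }

    // Applied on every other use, so the suffix never reaches the attacker.
    String derandomize(String text) {
        return randomized.matcher(text).replaceAll("${kw}");
    }

    public static void main(String[] args) {
        KeywordRandomizer r = new KeywordRandomizer("64738");
        String template = r.randomize("select name from users where user='%s'");
        String attack = String.format(template, "John' or 1=1 --");
        System.out.println(attack + " accepted: " + r.accepts(attack));
    }
}
```

The injected `or` lacks the suffix, so the query is rejected even if tracking and confinement missed it; that independence is the point made in the last bullet of the slide.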

Internal Testing
- We have performed extensive internal testing
  - Juliet Test Suite (details in the next slides)
  - Daikon, ~200K LOC: we never lose trusted/untrusted information
  - Apache Tomcat, ~200K LOC: we never lose trusted/untrusted information
  - Other
    - Wide variety of tests
    - Used fuzzers
    - Tested multiple SQL servers

Juliet Test Suite: CWE-89 & CWE-78
- We have augmented the Juliet tests
  - Provided attack inputs
  - Each test represents an exploitable vulnerability
- SQL Injection Tests (CWE-89)
  - Ran 1316 tests
  - These tests use a variety of input sources
  - 6 benign inputs against each test
  - 12 attack inputs against each test
  - Detected 100% of attacks with no false alarms
- OS Command Injection Tests (CWE-78)
  - Ran 329 tests
  - Same input sources as SQL
  - 6 benign inputs against each test
  - 5 attack inputs against each test
  - Detected 100% of attacks with no false alarms

Juliet Test Suite: CWE-606 (Loop Bounds)
- Tests for detection of taint and safe loop bounds in a variety of control and data flow cases
- Contains 2492 loops: 417 unsafe, 1035 safe, 1040 not reachable
- Unsafe = loop bound is determined by untrusted input and is not limited by trusted data
- Safe = loop bound is either (1) determined by trusted input or (2) determined by untrusted input but limited by trusted data
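The safe/unsafe distinction above comes down to whether a trusted value caps the untrusted bound. The sketch below shows case (2) of the "safe" definition; it illustrates the pattern the analysis looks for rather than a check VIBRANCE inserts, and the names and the limit are hypothetical.

```java
class SafeLoopBound {
    // The bound comes from untrusted input but is limited by trusted data,
    // so the loop is "safe" in the sense of the slide.
    static long runLimited(String untrustedData, int trustedLimit) {
        int n = Integer.parseInt(untrustedData);
        int bound = Math.min(n, trustedLimit); // trusted constant caps the work
        long iterations = 0;
        for (int i = 0; i < bound; i++) {
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(runLimited("5", 1000));
        System.out.println(runLimited("2147483647", 1000));
    }
}
```

Without the `Math.min` line the same loop would fall in the unsafe category, because the attacker alone would decide how many iterations run.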

Static Analysis Results on CWE-606
- We correctly detect
  - ~97% of safe loops (1002 of 1035; 33 false positives)
  - 100% of unreachable loops (1040 of 1040)
  - 100% of unsafe loops (417 of 417)

Static Analysis Results on CWE-606
- Static analysis reduces the 2492 loops to consider down to 450 that need a runtime check (2042 are eliminated)
- Of those 450, 417 are actually unsafe

External Testing
- Performed by MITRE on April 23-25, 2012
- Preliminary results
  - Injection: 42 test cases, 39 rendered unexploitable (92.9%)
  - Tainted data: 42 test cases, 33 rendered unexploitable (78.6%)

Future Work
- Address more weakness classes
  - Error handling
  - Resource handling
  - Number handling
  - Concurrency handling
- Improve treatment of current weakness classes (injection and tainted data)
- Pilot: anyone interested?
