class: big, middle

# ECE 7420 / ENGI 9807: Security

.title[
  .lecture[Lecture 4:]
  .title[Memory (un)safety]
]

---

# Last time

### Code injection

--

1. Inject code (e.g., copying payload into buffers)

--

2. Hijack control flow (e.g., stack smashing)

--

### Mitigations

---

# Mitigations

### How can we prevent/reduce stack smashing?

* non-executable stacks ([we needed `-z execstack` to demo!](Makefile))
* `W^X`: memory regions writable **or** executable (limitations?)
* stack canaries: `-fstack-protector`
* ASLR: address space layout randomization (more later)

### ... and more to follow

---

# The attacker strikes back

### Guessing precise addresses is hard

NOP sleds, relative addressing

### Shellcode authors avoid zeroes (why?)

--

### Is shellcode easy to spot?

--

See: [English shellcode](https://www.cs.jhu.edu/~sam/ccs243-mason.pdf)*

.footnote[
* "English Shellcode", Mason, Small, Monrose and MacManus, in
  _CCS '09: Proceedings of the 16th ACM conference on Computer and
  communications security_, 2009.
  DOI: [10.1145/1653662.1653725](https://dx.doi.org/10.1145/1653662.1653725)
]

---

# Today

--

### Mitigation details

--

### Counter-mitigation attacks

--

### Counter-counter-mitigation mitigations

---

# Higher-level languages?

### One mitigation: no stack access

--

### Alternative technique: _heap spraying_

--

* Create lots of shellcode strings

--

* Just need _one_ control-flow hack to trigger

---

# Stages of code injection

### 1. Inject code

### 2. Hijack control flow

---

# Code injection

--

### Writable buffers

--

* any memory region: heap, stack or BSS

--

### User-driven memory allocation

* user is _supposed_ to be able to request allocation

--

* e.g., untrusted JavaScript allocates strings

---

# Control-flow hijacking

--

### Targets

???

Return addresses (last class), function pointers, vtables, conditions...

--

### Buffer overflow

* as demonstrated last class!
--

### Integer under/over-flow

--

### Format string vulnerabilities

---

# Integer overflow

### See [demo code](integers.c)

--

### Lesson: the details matter!

--

* don't assume that integers behave like, well, integers

--

* don't trust user input

--

* use safe integer arithmetic
  ([US-CERT](https://www.us-cert.gov/bsi/articles/knowledge/coding-practices/safe-integer-operations),
  [Microsoft](https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/ntintsafe-design-guide))

---

# Integer overflow...

--

still???

???

Integer overflow is _still_ very much a going concern!

--

* OpenSSL: https://nvd.nist.gov/vuln/detail/CVE-2021-23840

--

* Linux: https://nvd.nist.gov/vuln/detail/CVE-2021-3490

--

* Windows: https://www.fortinet.com/blog/threat-research/microsoft-kernel-integer-overflow-vulnerability.html

--

* probably: https://arstechnica.com/information-technology/2021/04/in-epic-hack-signal-developer-turns-the-tables-on-forensics-firm-cellebrite

???

Another great read about this hack:
https://cyberlaw.stanford.edu/blog/2021/05/i-have-lot-say-about-signal’s-cellebrite-hack

---

# Format string vulnerabilities

### See [demo code](format-strings.c)

--

### Lesson: the details matter!

--

* don't trust user input

--

* put user strings in _values_, sure

--

* do **not** put user strings in _format_

--

* also important for higher-level languages (e.g., [Ruby](https://nvd.nist.gov/vuln/detail/CVE-2008-2664))

---

# Notes about code injection

--

### Modern MMUs and DEP

--

### `W^X` policy

---

# Stages of code injection

### 1. Inject code

### 2. Hijack control flow

--

## But step 1 is getting harder!

???

Policies such as `W^X` make it much tougher to inject attacker-controlled
code into memory that can actually be executed. However, that doesn't mean
that attackers just gave up! Instead, they did what attackers do: they
thought creatively, out of the box, not limited by the constraints that
defenders impose on them.

--

## What if...

---

# What if...

### ~~0. Inject code~~

### 1. Hijack control flow

???

Is it possible to attack running software _without_ injecting code? If we
could still hijack the control flow of a program (which seems to often be
the case!) and put non-executable data in memory (e.g., on the stack), how
could we still have a viable attack?

--

## What code do we execute?

???

What code would we even execute?

---

# Return to libc

--

### Uses existing code from `libc`

???

If you can't add code to memory, you'll just have to use what's already
there! This kind of "living off the land" is possible because there is
already quite a lot of code lying around in memory. For example, there is
_lots_ of code in the standard C library, which gets loaded into just about
every process running on your system.

--

### e.g., return to `system()`

???

One common thing we'd like to be able to do when we attack a program is...
anything! We'd like a general-purpose tool for letting us execute arbitrary
commands once we've broken into a process, and `libc` provides us with just
such a tool: the `system(3)` library function. This will allow us to execute
any program we like, and if that program is a shell, we can execute _more_
arbitrary actions.

--

### Especially easy on 32b x86

---

# ROP

### _Return-oriented programming_*

.footnote[
* See, e.g., Roemer et al., "Return-Oriented Programming: Systems,
  Languages, and Applications", ACM TISSEC 15(1), 2012.
  DOI: [10.1145/2133375.2133377](https://doi.org/10.1145/2133375.2133377)
]

--

### Generalization of return-to-libc attack

--

### Relies on existing "gadgets" (instruction + `ret`)

--

### Can be automated (e.g., [ROPC](https://github.com/pakt/ropc), [Ropper](https://github.com/sashs/Ropper))

???

For fun, try out the tutorials at https://ropemporium.com !

---

# ASLR

### _Address Space Layout Randomization_

???

Defenders can make the attacker's life harder by ensuring that `libc` (and
other code) isn't loaded at the same location every time.

--

### Not super-helpful on 32b platforms

???

On a 32b machine, however, we might only have 16b or even 8b available for
randomization. A lack of randomness _seems_ bad in a defensive technique
called "randomization", but why? What would more randomness give us?

--

### Increases "work factor"

???

ASLR **doesn't provide definitive protection**. Unlike other security
techniques, it won't always say "no" to an attack. What it will do is make
an attacker have to do **additional work**. For example, on a 32b system, an
attacker might have to **try their attack 128 or 32,768 times** in order to
succeed.

--

### But maybe not by as much as you think!*

.footnote[
* "ASLR on the Line: Practical Cache Attacks on the MMU", Gras, Razavi,
  Bosman, Bos and Giuffrida, _Proceedings of the 2017 Network and
  Distributed System Security Symposium_, 2017.
  DOI: https://dx.doi.org/10.14722/ndss.2017.23271.
]

???

Practical attacks exist that use low-level properties of things like memory
management units (MMUs) to break ASLR, even from JavaScript code!

---

# Code reuse attacks

### ~~0. Inject code~~

### 1. Hijack control flow

--

## How do we stop the hijacking?

---

# Stopping hijacking

--

### Stack protection

Non-executable memory

Stack canaries (`-fstack-protector`)

--

### CFI: control flow integrity

Static analysis, dynamic enforcement

--

### Full _memory safety_

--

(next time!)

---

class: big, middle

The End.