class: big, middle

# ECE 7420 / ENGI 9823: Security

.title[
  .lecture[Lecture 27:]
  .title[SQL injection]
  
  
]

---

# Last time

### SOP

### XSS

### CSRF

## Don't trust user input!

---
layout: true
# SQL injection

### Another example of input validation failure

---

### Why is this important?

---

### Why is this important?

???

Remember this simple (but **incomplete**) model of a computer?
Let's take a look simple model of a program.

---
layout: false

# MVC pattern

.centered[
<img src="../mvc/general.svg" height="475"/>
]

???

You've probably seen the **model-view-controller** (MVC) pattern in a
Software Design course, as it's a useful way of structuring programs that users
interact with.
As a side note: you might not like the name, but it's better than the original
name for Prof. Reenskaug's architecture:
["thing-model-view-editor"](
https://folk.universitetetioslo.no/trygver/themes/mvc/mvc-index.html)!

---

# Desktop MVC

.centered[
<img src="../mvc/desktop-ui.svg" height="475"/>
]

???

The MVC pattern was first designed for desktop computers, and it is very
commonly used there.
It also continues to appear in the cycle of technology trends in which
everything centralized becomes distributed and everything distributed becomes
centralized:

* remote access to applications via [X terminal](https://en.wikipedia.org/wiki/X_terminal)
 * local ("native") applications (desktop, phones pre-iPhone)
 * remote applications via the Web (["there's no SDK!"](https://www.youtube.com/watch?v=p1nwLilQy64))
 * native applications on phones (["what we call the App Store"](https://www.youtube.com/watch?v=xo9cKe_Fch8))
 * Web applications with native-like components (who needs native SDKs when you have React?)
 * native applications based on Web technologies (React Native!)
 * Web applications based on native application technologies that adapt Web
   application technology (React Native for Web)

---

# Static Web application

.centered[
<img src="../mvc/traditional-web.svg" height="475"/>
]

???

When the Web was first created, content that could be rendered by a browser was
_static_: HTML, images and even video files that represent motion
don't change after they've been downloaded over HTTP.

---

# Dynamic Web application

.centered[
<img src="../mvc/fullstack.svg" height="475"/>
]

???

This is a more practical view of a real Web application.
Within the browser, a rich application runs in JavaScript — and possibly
WebAssembly — with code to control the UI and dynamic DOM-based view components
that may talk independently to the backend on a web server.
On that server, parts of the API may exist purely to serve view components,
while other parts control what data is allowed to flow where.

At the back end of the backend, there is almost always some kind of persistent
store: a database.
It may be a traditional database with **ACID properties**, or it may be an
**eventually-consistent** database.
It may be **distributed**, it may have **caching**, but somewhere,
data is stored.
That data is often — not always, but often — accessed using the
_lingua franca_ of databases, the _Structured Query Language_ (SQL).

---

# Structured query language

### Language for interacting with databases

### ISO/IEC 9075

* [SQL-92](http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt)
  ([grammar](https://jakewheat.github.io/sql-overview/sql-92-grammar.html))
--

* SQL:1999, SQL:2003, SQL:2008, SQL:2011, SQL:2016, SQL:2019, SQL:2020+...

### Not the only DB language ([Cypher](https://neo4j.com/developer/cypher-query-language), [GraphQL](https://graphql.org)...)
???

SQL isn't the only language that can be used for accessing databases.
[Cypher](https://neo4j.com/developer/cypher-query-language) is a query language
used for accessing graph databases stored in the
[Neo4j](https://neo4j.com) database; it includes language syntax for dealing
with nodes and edges and their relationship.
[GraphQL](https://graphql.org) isn't, strictly speaking, a database query
language any more than a REST API or an HTTP `POST` handler are.
However, it is a language that a front-end can use to make queries about data
stored in a back-end, so many of the same principles will apply!

--
, but...

---

# SQL example

```sql
select * from student
	where
		discipline=(select id from discipline where name='ONAE')
		and student_id > 200800000
	order_by name
	;
```

.caption[
(probably better to do this with an `INNER JOIN`, but this isn't a DB course...)
]

#### Admin varies (MSSQL, MySQL, Oracle, PostgreSQL, SQLite...)

???

The way that you perform administrative actions like creating users or viewing
a database's scheme can vary from database to database.

#### Core language should be consistent

???

The fundamentals of how we perform the basic CRUD operations
(Create, Read, Update, Delete) should be the same across all databases.
Sometimes the performance of one approach vs another
(e.g., a subquery vs an inner join) can be different on different platforms,
but the semantics should be the same.

---

# So what?

### Example: Engineering match

* Need to track students, preferences, disciplines, capacities...

> Hey, ECE is super-popular, let's add some more seats!

---

# Updating ECE capacity

```http
POST /admin/discipline/Computer HTTP/1.1
Host: localhost:5000
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:68.0) Gecko/20100101 Firefox/68.0
[...]
Content-Type: multipart/form-data; boundary=---------------------------14849285061285938555493870477
Content-Length: 177
Cookie: mjx.menu=renderer%3ACommonHTML
-----------------------------14849285061285938555493870477
Content-Disposition: form-data; name="capacity"

50
-----------------------------14849285061285938555493870477--
```

???

Here's an example of how a UI element can drive an API request
(in this case, an HTTP `POST` request) to cause an effect on the backend.

Q: what's missing from (the old version of) this request?

A: **CSRF protection!**

```SQL
update discipline set capacity=50 where name='Computer';
```

???

At the backend, that request is parsed and turned into a SQL statement that,
when executed, updates the database.

---

# Web + SQL = all the things

### Very common anti-pattern:

???

This is a very common pattern in Web develoment: we want to accept data from
a user (or, more specifically, their user agent) and translate it into a
SQL (or other) query to send to the backend database.
However, it's actually an
[anti-pattern](https://en.wikipedia.org/wiki/Anti-pattern),
for reasons that we will describe starting on the next slide...

```sql
insert into readings(temp, humid) values (28.7, 83.9);
```

```python
sql(f'insert into readings(temp, humid) values {temp}, {humid};')
```

```sql
update employees set name='Jonathan Anderson' where id=1234567;
```

```python
sql(f"update employees set name='{new_name}' where id={employee.id};")
```

---

# The problem

```python
f"update employees set name='{new_name}' where id={employee_id};"
```

???

Let's say that we use the technique at the top of this slide to build a SQL
query that updates an employee ID in a database.

Q: What's the problem with constructing SQL queries in this way?

???

Hint: in [the final part of Lab 5](/lab/5#_password_fuzzing),
you fuzzed a user's password, but you would also have observed something else:
that there were some passwords which would cause the server to respond with
neither an "access denied" message nor an "access granted" message, but
something else entirely: a partial HTML page.
You may have also noticed that some of these funny passwords contained a
single-quote character (`'`).
Why would that be?

```http
POST /employee/1234567 HTTP/1.1
[...]
-----------------------------14849285061285938555493870477
Content-Disposition: form-data; name="name"

h4ckz0rd'; select * from employees; update employees set name='h4ckz0rd
-----------------------------14849285061285938555493870477--
```

???

Hint: What's the problem if we receive the `POST` request shown in the middle
      of this slide?

```sql
update employees set name='h4ckz0rd';
select * from employees where name like '%';
update employees set name='h4ckz0rd' where id=1234567;
```

---

# Untrusted input strikes again

.centered[
<img src="https://imgs.xkcd.com/comics/exploits_of_a_mom.png"
     class="fullwidth"/>
]

???

This is much like a program accepting input from a user (or a file) and passing
it straight to a shell program.

### Darn those humans...
--
 computing would be easier without them
--
 (???)

---

# What to do?