class: big, middle

# ECE 7420 / ENGI 9823: Security

.title[
	.lecture[Lecture 25:]
	.title[Web security]
]

---

# The story so far

### ~~Introduction~~
### ~~Software security~~
### ~~Host security~~
### ~~Network security~~
### Web security

---

# Today

### Web model
### Web history
### Key Web security concepts

---

# What's this Web thing?

--

### Documents: text, images, etc.

--

.floatright[
<img src="https://static01.nyt.com/images/2012/03/13/business/13google/13google-jumbo.jpg?quality=75&auto=webp" alt="People working in the Universal Bibliographic Repertory, which would eventually become the Mundaneum" width="350"/>

.caption[Source: [Zinneke](https://commons.wikimedia.org/wiki/User:Zinneke)]
via [Wikimedia](https://commons.wikimedia.org/wiki/File:Mundaneum_Tir%C3%A4ng_Karteikaarten.jpg)
]

### _Hypertext:_

--

documents+edges

--

Some examples: SGML, GNU `info`, Mundaneum, Project Xanadu...

???

This image shows people working in the Universal Bibliographic Repertory, which would eventually become the Mundaneum.
This system of cross-referenced index cards was meant to catalogue all of the world's information, which is now pretty close to [Google's mission statement](https://about.google).

[Project Xanadu](https://en.wikipedia.org/wiki/Project_Xanadu) is particularly wild to read about!

--

#### ... and HTML!

---

# Web formats and protocols

<img src="http-stack.svg" align="right" width="650"/>

### HTTP

???

The _hypertext transfer protocol_ (HTTP, see [RFC 2616](https://datatracker.ietf.org/doc/html/rfc2616)) describes how we transfer Web content from servers to clients.
We can use `telnet`, `nc` (netcat), etc., to talk to an HTTP server and act as our own Web browser:

```
GET / HTTP/1.0

HTTP/1.1 301 Moved Permanently
Date: Mon, 25 Jul 2022 14:08:02 GMT
Server: Apache
X-Frame-Options: SAMEORIGIN
X-Content-Type-Options: nosniff
Content-Security-Policy: frame-ancestors 'self'; ...
```

Or, even better:

```
GET / HTTP/1.1
Host: www.engr.mun.ca

HTTP/1.1 301 Moved Permanently
Date: Mon, 25 Jul 2022 14:08:02 GMT
Server: Apache
...
```

--

### HTML

--

### ... other stuff

???

This "other stuff" includes static-ish content like images, declarative content like CSS and dynamic software in languages like JavaScript and WebAssembly!

---

# Web history

.floatright[
<img src="https://upload.wikimedia.org/wikipedia/commons/2/2f/NeXTcube_first_webserver.JPG" alt="The first Web server" align="right" width="550"/>

.caption[The first Web server: a NeXT computer at CERN]
]

### 1980s

--

Tim Berners-Lee @ CERN

???

Now _Sir_ Tim Berners-Lee, I suppose!

Much like the ARPAnet was developed to address a real-world problem, the Web was originally created to address a problem: making it easier for scientists at CERN to share information with each other without having to all be running exactly the same software.

One lesson that we should take from this history is that the most interesting and impactful projects come from **places you might not expect**.
Another good lesson is probably that engineering is never done for its own sake: we build things **for a purpose**!

--

http://info.cern.ch/

???

In the early days, one could visit websites like info.cern.ch (the first Web server) using tools like the [line mode browser](https://en.wikipedia.org/wiki/Line_Mode_Browser), a very bare-bones command-line tool that could run on almost any operating system with a network stack.

--

### 1990s

--

NCSA Mosaic

???

Tim Berners-Lee initially created a graphical browser for the NeXT computer, but they weren't able to port it to other operating systems.
Instead, the graphical browser that really helped the Web become popular was Mosaic from the NCSA (National Center for Supercomputing Applications).
Mosaic only lived for 4–5 years: it was developed in 1992, released in 1993 and discontinued in 1997.

--

Netscape

???
Mosaic was rapidly overtaken by Netscape, a browser started by one of Mosaic's original authors.

--

Everything else

???

Netscape was itself rapidly overtaken by Internet Explorer, partially due to what a US judge would later find to be an abuse of Microsoft's OS monopoly power (the [whole thing](https://en.wikipedia.org/wiki/United_States_v._Microsoft_Corp.) makes for a great read... did you know that the judge originally ordered Microsoft to be broken up into two companies?).

---

.floatright[
<img src="ncsa-mosaic.png" width="375"/>

.caption[NCSA Mosaic browser*]
]

# The original Web

.footnote[
* Andreessen, "NCSA Mosaic Technical Summary", _National Center for Supercomputing Applications_, 20 Feb 1993.
Available: http://web.archive.org/web/19991009182307/http://cbl.leeds.ac.uk/WWW/ps/ghindex.html
]

--

### Scientists sharing data

--

Collaboration without (onerous) standardization

--

A few lowest-common-denominator formats: HTML, GIF, JPEG, MPEG...

???

This "lowest common denominator" approach allowed collaboration via text and images, which could be produced from almost any software and which could end up in scientific publications.
It also allowed videos, which are essential for communicating some scientific results.

Original HTML was an elegant approach for a more civilized age (as, if you don't mind quite a lot of profanity, [this website expresses](https://motherfuckingwebsite.com)).

--

### Trust assumptions?

???

Once again, the assumptions of a system's designers have a huge effect on how that system is used in perpetuity.
What assumptions might the scientists building this data-sharing system have had, and what might the implications for the Web's TCB have been?

---

# Commercialization

### Netscape, then Internet Explorer

???

Netscape started from a dominant position in user share, but Microsoft used code from NCSA Mosaic and the dominance of its operating system to challenge (and eventually overtake) Netscape.

--

### Dramatic acceleration through competition

???
Every new major version of Netscape and IE saw major new features added as proprietary HTML extensions which would then need to be quickly copied by the other.
Fonts, background colours, frames, tables... [it was all a bit chaotic](https://www.w3.org/People/Raggett/book4/ch02.html), with different browsers supporting different features and Web developers (now starting to become "a thing") needing to support many versions of unevenly-extended HTML.

--

### Security... what's that?

???

In the race to implement the Next Big Feature, security wasn't a first-class consideration.

---

class: big

.fullwidth[
<img src="https://www.webdesignmuseum.org/uploaded/old-software/web-browsers/netscape-navigator/mosaic-netscape-0-9-01.png" alt="Mosaic Netscape v0.9-beta" class="fullwidth"/>
]

???

Netscape 0.9 (beta), back when it was still called Mosaic Netscape.
The owners of the Mosaic trademark weren't pleased about that; subsequent versions of Netscape dropped the "Mosaic" part.

---

# Cookies

--

### Origin

--

### Kinds

Session cookies, persistent cookies, "secure" cookies, third-party cookies...

--

### Privacy

What's that?

--

(user tracking, FTC hearings...)

???

This might be the first time that the US Federal Trade Commission had to discuss the word "cookies" so publicly!
Since companies didn't seem very interested in user privacy, it became a matter of regulation, with [reports to the US Congress on online privacy](https://www.ftc.gov/sites/default/files/documents/reports/privacy-online-report-congress/priv-23a.pdf) and ever-increasing awareness of online privacy issues by regulators around the world.

The FTC is still watching how companies use cookies, in some cases settling for [tens of millions of dollars](https://www.ftc.gov/business-guidance/blog/2012/08/milking-cookies-ftcs-225-million-settlement-google) with Web giants.
Also today, the FTC has a [very long webpage](https://www.ftc.gov/policy-notices/privacy-policy/internet-cookies) explaining how _they_ use cookies on their site... it's a pretty good example of real transparency on cookies.

Now, the EU has taken the lead on regulating cookies via the [General Data Protection Regulation (GDPR)](https://gdpr.eu/cookies).
However, forcing people to click on "yes, fine, whatever" isn't exactly the user-respecting new world we hoped it would be...

---

class: big

.fullwidth[
<img src="http://adrianroselli.com/wp-content/uploads/2014/12/Netscape1-about-1024x656.gif" alt="Netscape Navigator 1.0" class="fullwidth"/>
]

???

Netscape 1.0!

---

# SSL

### _Secure sockets layer_ (later _Transport Layer Security_)

* Diffie-Hellman key exchange
* RSA, DES, MD5...

???

We've looked at TLS in the lab, so we've seen most of these things before.
One difference is that instead of AES and SHA-384, we see older algorithms like DES and MD5.
Another key difference is that these versions of SSL didn't have the same level of formal-methods rigour applied to them, which led to lots of vulnerabilities at the protocol level:

* [BEAST](https://www.invicti.com/blog/web-security/how-the-beast-attack-works)
* [FREAK](https://www.cisa.gov/uscert/ncas/current-activity/2015/03/06/FREAK-SSLTLS-Vulnerability)
* [Logjam](https://en.wikipedia.org/wiki/Logjam_%28computer_security%29)
* [POODLE](https://www.cisa.gov/uscert/ncas/alerts/TA14-290A)
* [CRIME](https://www.acunetix.com/blog/articles/tls-vulnerabilities-attacks-final-part/#CRIME%20%28CVE-2012-4929%29)
* ... and many, many more

---

# SSL

.floatright[
<img src="https://upload.wikimedia.org/wikipedia/commons/9/96/Munitions_T-shirt_%28front%29.jpg" alt="A shirt protesting US Export Controls over cryptography" width="350"/>

.caption[A protest T-shirt]
]

### _Secure sockets layer_ (later _Transport Layer Security_)

* Diffie-Hellman key exchange
* RSA, DES, MD5...

### Export controls

???
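As a reminder of what the Diffie-Hellman bullet on this slide refers to, here is a minimal Python sketch of the exchange.
The parameters `P` and `G` and the helper names `keypair` and `shared_secret` are assumptions chosen for illustration; they are not taken from SSL itself, and real deployments use much larger, standardized groups.

```python
import secrets

# Toy public parameters (assumptions for illustration only).
P = 2**127 - 1   # a Mersenne prime; far too small and unvetted for real use
G = 3            # hypothetical generator choice


def keypair(p=P, g=G):
    """Return (private exponent, public value g^private mod p)."""
    private = secrets.randbelow(p - 3) + 2   # uniform in [2, p - 2]
    return private, pow(g, private, p)


def shared_secret(their_public, my_private, p=P):
    """Combine the peer's public value with our own private exponent."""
    return pow(their_public, my_private, p)


# Each side generates a key pair and sends only the public half.
a_priv, a_pub = keypair()
b_priv, b_pub = keypair()

# Both sides derive the same value without ever sending a private exponent.
assert shared_secret(b_pub, a_priv) == shared_secret(a_pub, b_priv)
```

Note that this sketch has no authentication at all: on its own, the exchange cannot tell a legitimate peer from an attacker in the middle, which is one reason SSL pairs it with algorithms like RSA.
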
We've looked at TLS in the lab, so we've seen most of these things before.
One difference is that instead of AES and SHA-384, we see older algorithms like DES and MD5.
Another key difference is that these versions of SSL didn't have the same level of formal-methods rigour applied to them, which led to lots of vulnerabilities at the protocol level:

* [BEAST](https://www.invicti.com/blog/web-security/how-the-beast-attack-works)
* [FREAK](https://www.cisa.gov/uscert/ncas/current-activity/2015/03/06/FREAK-SSLTLS-Vulnerability)
* [Logjam](https://en.wikipedia.org/wiki/Logjam_%28computer_security%29)
* [POODLE](https://www.cisa.gov/uscert/ncas/alerts/TA14-290A)
* [CRIME](https://www.acunetix.com/blog/articles/tls-vulnerabilities-attacks-final-part/#CRIME%20%28CVE-2012-4929%29)
* ... and many, many more

Another key problem with early SSL was the "crypto wars" that we've previously discussed.
Cryptography was export-controlled, so only crippled algorithms with small keys could be used in software exported from the US to other countries.
After much public brouhaha, the Clinton administration eventually relented and allowed "proper" crypto to be exported.
It's since paid off pretty well in terms of economic activity.

---

class: big

.fullwidth[
<img src="https://upload.wikimedia.org/wikipedia/commons/6/69/Netscape_Navigator_2_Screenshot.png" alt="Netscape Navigator 2" class="fullwidth"/>
]

???

This image isn't quite Netscape 2.0: it's actually v2.02.
That's pretty close, though.

---

# Netscape Navigator 2.0

<img src="https://upload.wikimedia.org/wikipedia/en/c/c8/Netscape_Now%21_2.0.gif" alt="Netscape Now! badge" align="right" width="300"/>

--

### Java!

--

* Web pages could run code via _applets_

--

* applets could be stitched into Web pages using...

--

### JavaScript!

???

JavaScript was originally meant to be a sideshow language, something with just enough functionality to tie "real" programs into HTML pages.
Now, Java applet support is deprecated and JavaScript (standardized as ECMAScript) continues to develop as a first-class programming language... though unfortunately its lack of original "real language" intent sometimes shines through (see https://www.youtube.com/watch?v=et8xNAc2ic8).

The ability to use JavaScript led to the ability to _load_ JavaScript from other servers, which led to...

---

# Same Origin Policy

--

* code should only have access to things from the "same origin"

--

* not really a _policy_... organically-grown _convention_*

.footnote[
* Barth, "The Web Origin Concept", <a href="https://tools.ietf.org/html/rfc6454">RFC 6454</a>, 2011.
]

---

# Same Origin Policy

* code should only have access to things from the "same origin"
* not really a _policy_... organically-grown _convention_*
* not always applied consistently!†

.footnote[
* Barth, "The Web Origin Concept", <a href="https://tools.ietf.org/html/rfc6454">RFC 6454</a>, 2011.

† Schwenk, Niemietz and Mainka, "Same-Origin Policy: Evaluation in Modern Browsers", in _Proceedings of the 26th USENIX Security Symposium_, 2017.
Available: https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/schwenk
]

???

We'll spend next class talking about some of the implications of the Same Origin Policy.
Specifically, we'll talk about **cross-site scripting** and what we can do about it.

---

# Summary

### Web model
### Web history
### Key Web security concepts

---

class: big, middle

The End.
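
???

To make the "same origin" idea from the Same Origin Policy slides concrete: RFC 6454 treats an origin as, roughly, the combination of a URL's scheme, host and port.
The following is a minimal Python sketch of that comparison.
The helper names `origin` and `same_origin` and the `DEFAULT_PORTS` table are assumptions made for illustration, and the sketch ignores the many corner cases that real browsers handle (which is part of why the convention isn't always applied consistently).

```python
from urllib.parse import urlsplit

# Default ports for the two schemes this sketch understands (an assumption:
# real browsers know about more schemes and more special cases).
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fill in the scheme's default port when the URL doesn't name one.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return scheme, host, port


def same_origin(a, b):
    """Two URLs share an origin only if all three components match."""
    return origin(a) == origin(b)


print(same_origin("https://example.org/a", "https://example.org:443/b"))  # True
print(same_origin("https://example.org/", "http://example.org/"))         # False
print(same_origin("https://example.org/", "https://www.example.org/"))    # False
```

The path plays no part in the comparison, so two pages on the same server share an origin, while a change of scheme, host name or port is enough to cross the boundary.
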