class: big, middle

# ECE 7420 / ENGI 9823: Security

.title[
.lecture[Lecture 29:]
.title[Private browsing]
]

---

# Today

<img src="https://pbs.twimg.com/media/F0q0eTpaYAMEKRY?format=webp&name=medium" align="right" width="475"/>

### Online tracking and surveillance

### VPNs

### Tor

---

# Online tracking & surveillance

--

### Advertising & marketing

--

### Data brokerage

--

### Politics

--

### Censorship

--

### Repression

---

# Tracking & surveillance

--

### Cookies

???

Websites can set both **first-party** and **third-party** cookies: a site can ask your browser to store a cookie itself, or it can include content from a third-party site (e.g., Google Analytics) that does so. Browsers will commonly block third-party cookies in "Private Browsing" mode, but you might need to tell your browser explicitly if you want these cookies blocked all of the time.

Just try inspecting the cookies that your browser holds about you on behalf of a site that shows ads... you might be a little surprised at what you find!

--

### Images

???

Images can also be used to track you: it's very common for websites (even HTML emails!) to include a 1px × 1px transparent image that loads from, e.g., https://tracker.example.com/tracker.png?userid=abd64cd8a0df.

--

### JavaScript

???

JavaScript can be used to load all kinds of code that does all kinds of things, including...

--

### Browser fingerprinting

???

**browser fingerprinting**. There are a lot of things that you can learn about a user's browser without JavaScript (IP, User-Agent, Language, etc.), but JavaScript can access other details, e.g., screen height/width, installed fonts, installed plugins. Sure, lots of people run Firefox v90 on macOS, but how many have exactly my set of fonts, languages, timezone, etc.? The intersection of these various sets of people can be used to identify people surprisingly well... just try visiting https://amiunique.org ! Also maybe try https://coveryourtracks.eff.org just for fun.

--

### DPI

???
Finally, we know that deep packet inspection is a real issue in some environments. Companies might want to see what you're doing on your work computer, but in many parts of the world, governments want to see what you're saying to others!

---

layout: true

# Postal analogy

---

### Letters:

---

.floatright[
<img src="https://upload.wikimedia.org/wikipedia/commons/8/89/Room_641A_exterior.jpg" alt="Room 641A in the SBC Communications building" width="400"/>

.caption[Source: [JoshuaKGarner via Wikipedia](https://en.wikipedia.org/wiki/File:Room_641A_exterior.jpg)]
]

### Letters:

* _Cabinet noir_*

.footnote[
* Chisholm, Hugh, ed. (1911). "Cabinet Noir". Encyclopædia Britannica (11th ed.). Cambridge University Press.
]

???

The _cabinet noir_ was a feature of the French postal system from the 17th century, later replicated by many other postal systems. Its job was to open letters that the government wanted to read, copy them and re-seal them before delivering them. This is analogous to deep-packet inspection. In fact, the analogy is very strong to rooms that contain fibre-splitting equipment, as is alleged to be kept in places like [Room 641A of the SBC Communications Building in San Francisco](https://en.wikipedia.org/wiki/Room_641A).

---

.floatright[
<img src="https://www.stampboards.com/images/samartos/Peg1-1.png" alt="Mail sorting machines" width="600"/>
]

### Letters:

* _Cabinet noir_*
* USPS _Mail Isolation and Tracking_†

.footnote[
* Chisholm, Hugh, ed. (1911). "Cabinet Noir". Encyclopædia Britannica (11th ed.). Cambridge University Press.

† "Postal Service Confirms Photographing All U.S. Mail", Nixon, <a href="https://www.nytimes.com/2013/08/03/us/postal-service-confirms-photographing-all-us-mail.html">_The New York Times_</a>, 2 Aug 2013.
]

???

The _cabinet noir_ was a feature of the French postal system from the 17th century, later replicated by many other postal systems.
Its job was to open letters that the government wanted to read, copy them and re-seal them before delivering them. This is analogous to deep-packet inspection. In fact, the analogy is very strong to rooms that contain fibre-splitting equipment, as is alleged to be kept in places like [Room 641A of the SBC Communications Building in San Francisco](https://en.wikipedia.org/wiki/Room_641A).

A less-invasive program for mail does exist in the form of the US Postal Service's (publicly acknowledged!) _Mail Isolation and Tracking_ program. In this program, the _outside_ of every piece of mail in the US may be scanned for later use by law enforcement.

---

### Letters:

<img src="rapid-remailer.png" width="600" align="right"/>

* _Cabinet noir_*
* USPS _Mail Isolation and Tracking_†
* Remailing

.footnote[
* Chisholm, Hugh, ed. (1911). "Cabinet Noir". Encyclopædia Britannica (11th ed.). Cambridge University Press.

† "Postal Service Confirms Photographing All U.S. Mail", Nixon, <a href="https://www.nytimes.com/2013/08/03/us/postal-service-confirms-photographing-all-us-mail.html">_The New York Times_</a>, 2 Aug 2013.
]

???

The _cabinet noir_ was a feature of the French postal system from the 17th century, later replicated by many other postal systems. Its job was to open letters that the government wanted to read, copy them and re-seal them before delivering them. This is analogous to deep-packet inspection. In fact, the analogy is very strong to rooms that contain fibre-splitting equipment, as is alleged to be kept in places like [Room 641A of the SBC Communications Building in San Francisco](https://en.wikipedia.org/wiki/Room_641A).

A less-invasive program for mail does exist in the form of the US Postal Service's (publicly acknowledged!) _Mail Isolation and Tracking_ program. In this program, the _outside_ of every piece of mail in the US may be scanned for later use by law enforcement.
One way to try to hide metadata about who is mailing whom is to use a _remailer_ service, which will receive your mail, open it and send its contents to someone else under new cover. However, if you're trying to hide criminal or other activity, you might want to consider that not very many people use such services, so merely using one might make you stand out a bit...

---

layout: false

# Put that box in another box...

--

.centered[
<img src="https://upload.wikimedia.org/wikipedia/commons/3/3d/First_matryoshka_museum_doll_open.jpg" alt="The very first set of matryoshka dolls"/>
]

???

A remailer network can work like a matryoshka doll set, putting packages inside of packages inside of packages, with each hop through the network removing one layer of packaging.

---

# Private email

.footnote[
Chaum, "Untraceable electronic mail, return addresses, and digital pseudonyms", _Communications of the ACM_, 24(2), pp84‒90, 1981. DOI: <a href="https://doi.org/10.1145/358549.358563">10.1145/358549.358563</a>
]

--

<img src="https://upload.wikimedia.org/wikipedia/commons/4/4f/Red_de_mezcla.png" alt="A mix network" width="650" align="right"/>

### Network of _remailers_

???

In the 1980s, some crypto researchers brought this concept of remailers into the world of electronic mail. They proposed sending encrypted email to a remailer, called a **mix**, that would decrypt one layer of encryption and then send its contents on to the next mix, which would do the same, etc. Eventually, mail would go from senders to recipients, but an outside observer wouldn't be able to tell who was emailing whom.

--

### Double-blind

???

One neat property of this system was that, even if an attacker controlled one of the mix nodes, they _still_ wouldn't be able to see who was emailing whom. Each step in the process was (or, technically, still is) double-blind.

--

### Attacks

Timing, $n-1$...

???

There are problems with the mix concept, however.
Firstly, if a _global passive adversary_ can see that Alice emails a mix, which emails a mix, which emails a mix, which emails Bob, they can make a pretty good guess about who's emailing whom. Thus, mixes have to **add latency** to email in order to conceal identities. In particular, mixes would typically wait either a fixed amount of time or until they'd gathered up a batch of emails before sending them all on at the same time.

Even _then_, however, if an attacker knew that a mix would wait for 50 emails, they could wait for you to send one, then send 49 themselves, then see where all 50 go (the $n-1$ attack). They already know where 49 are going, so the remaining one must be yours!

---

# Mixing lessons

<img src="https://i.dailymail.co.uk/i/pix/2015/08/06/13/2B20EABC00000578-3186947-image-m-6_1438864066303.jpg" width="400" alt="Mixing on the Great British Bake-off" align="right"/>

--

### _Anonymity set_

???

People who used anonymous remailers were very excited about the concept of an _anonymity set_, the set of **users who might've sent a message**. The theory was, if you can only prove that I _might've_ been the one to send that email, but it _might've_ been any of these other 49 people, I have reasonable doubt!

--

* size often unimportant!

???

The problem with this thinking is twofold. Firstly, there are lots of circumstances in which the size of the anonymity set isn't terribly important. A criminal prosecutor might need to prove something beyond a reasonable doubt to put you in prison, but simply using a remailer at around the right time could give the police "reasonable suspicion" to have a conversation with you, or for your boss to have your work computer examined. In some circumstances, the adversary might not care much at all about the size of an anonymity set: they'd just as happily kick down 50 doors as one.

--

* probable cause, reasonable doubt or mere suspicion?

???

Secondly, there ain't that many people using anonymous remailers to begin with.
If your adversary's threshold for action is mere suspicion, they can simply act against **everyone who uses a remailer**!

--

### Latency matters

???

Another key lesson from remailers is that latency is important. Nobody wants to use a high-latency service, which means that using the service makes you stand out, which makes even fewer people want to use the service, etc. If, on the other hand, a low-latency service can provide privacy properties for "regular" people as well as those seeking to evade censorship, it could be popular, which will provide even better privacy properties!

--

### The perfect vs the good

---

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/1/15/Tor-logo-2011-flat.svg/1280px-Tor-logo-2011-flat.svg.png" alt="Tor logo" width="300" align="right"/>

# Tor

--

### Dominant tool for:

--

* censorship resistance

???

Tor no longer uses the language of "anonymity". Anonymity is really hard, and it's also a word that feels devious ("what, do you have something to hide?"). Instead, Tor is typically described as a censorship resistance tool (a high-importance use case) as well as a tool for online privacy (a high-volume use case).

--

* online privacy

--

### Imperfect but usually "good enough"

???

Owing to its low-latency operation, Tor **does not protect against a global passive adversary**. If there's a true Dolev-Yao attacker, or even just a passive attacker, who can see every message in the world, they can use correlation to figure out who's talking to whom. However, if the network can be large and popular enough, it becomes very difficult to actually become a global passive adversary.

--

...

## even against some strong adversaries!

---

# Tor mechanics

--

### _Tor: The Onion Router_* (not called that any more)

.footnote[
* "Tor: The Second-Generation Onion Router", Dingledine, Mathewson and Syverson, in _Proceedings of the 13th USENIX Security Symposium_, 2004.
Available: <a href="https://www.usenix.org/legacy/events/sec04/tech/full_papers/dingledine/dingledine.pdf">usenix.org</a>
]

---

# Tor mechanics

.floatright[
<img src="https://3.bp.blogspot.com/-LWw56vjVcWg/WfJsrmFVJQI/AAAAAAAAJQI/tukKZ9cazy8O1O0Ubl_gzEHU5vr0G_OgACLcBGAs/s1600/TOR.jpg" alt="Tor routing" width="500"/>

.caption[
Source: <a href="https://www.kitploit.com/2017/10/exitmap-fast-and-modular-scanner-for.html">KitSploit</a>
]
]

### _Tor: The Onion Router_* (not called that any more)

.footnote[
* "Tor: The Second-Generation Onion Router", Dingledine, Mathewson and Syverson, in _Proceedings of the 13th USENIX Security Symposium_, 2004.
Available: <a href="https://www.usenix.org/legacy/events/sec04/tech/full_papers/dingledine/dingledine.pdf">usenix.org</a>
]

### _Telescoping_ routing

--

Client builds _circuit_ from _guard_, _relay_ and _exit_ nodes

???

Tor is **source-routed**: the **client** decides which nodes it wants to use. This is different from typical IP routing, where each router can decide which path a packet ought to take. That said, Tor is an **overlay network** that runs **on top of IP**. Tor clients choose guard, relay and exit nodes from a directory of publicly visible Tor nodes.

---

# Tor nodes

<img src="tor-circuit.png" align="right" width="500"/>

--

### Directory nodes

--

### Guard nodes

--

### Relay nodes

--

### Exit nodes

---

# Ethical considerations

--

### Dual-use technology

???

Tor is yet another of these dual-use technologies. It is used by people living under authoritarian governments who want to share uncensored news about the world, organize protests, etc. It is _also_ used by people who want to share child sexual abuse imagery, organize terrorism, etc.

---

# Ethical considerations

<img src="tor-exit-node-map.png" align="right" width="500" alt="Exit nodes"/>

### Dual-use technology

### Exit nodes

???

Tor is yet another of these dual-use technologies.
It is used by people living under authoritarian governments who want to share uncensored news about the world, organize protests, etc. It is _also_ used by people who want to share child sexual abuse imagery, organize terrorism, etc.

This dual-use nature of the technology can come home pretty quickly if you choose to run a Tor exit node. Running a Tor guard or relay node is a fairly safe thing to do: people open encrypted tunnels to you and you open encrypted tunnels to other Tor nodes. If you run an exit node, however, whatever stuff people want to do via Tor is exposed in the connections that you make to real web servers. If someone is retrieving uncensored news, it looks like you're retrieving it. If someone is sharing images, which can include awful things like child sexual abuse material (CSAM), it looks like you're sharing them. Thus, running a Tor exit node can be a risky thing to do.

--

### Social contract

https://blog.torproject.org/tor-social-contract

???

Tor is largely run by a community of people (including lots of academics) who have stated objectives around advancing human rights, advocacy, research and other principles. Like other forms of technology, Tor reflects the goals of the people who make it; unlike many forms of technology, Tor's social contract is explicitly stated.

--

### What is a "bad day" for your users?

???

Also unlike many forms of technology, research studies performed on a live network can have **very serious repercussions for real users**. Thus, the _Privacy Enhancing Technologies_ community thinks about research ethics much more explicitly than many people in computer science / engineering. In fact, other domains are only now starting to catch up.

---

# Using Tor

--

### What does telescoping routing buy you?

???

What telescoping routing _does_ buy you is reduced visibility from all but global adversaries (i.e., probably almost any adversary you might care about).
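
To make "reduced visibility" a bit more concrete, here is a minimal sketch of the layering on an already-built three-hop circuit. It is a toy under loud assumptions: the hash-based XOR "cipher", the JSON framing, the hard-coded keys and the names `wrap` and `peel` are all invented for illustration and are not Tor's actual cell format, key negotiation or cryptography. It only shows the shape of the idea: each hop can remove exactly one layer and learns only where to forward the rest.

```python
import hashlib
import json


def toy_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (NOT secure, NOT what Tor uses): XOR with a hash-derived keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(circuit, destination: str, payload: bytes) -> bytes:
    """Client side: add one layer per hop, innermost layer for the exit node.

    `circuit` is a list of (name, key) pairs ordered guard, relay, exit.
    The keys stand in for whatever each hop shares with the client.
    """
    next_hop, blob = destination, payload
    for name, key in reversed(circuit):
        layer = json.dumps({"next": next_hop, "data": blob.hex()}).encode()
        blob = toy_xor(key, layer)
        next_hop = name
    return blob  # this is what the client hands to the first hop


def peel(key: bytes, blob: bytes):
    """Node side: remove one layer, learning only the next hop and an opaque blob."""
    layer = json.loads(toy_xor(key, blob))
    return layer["next"], bytes.fromhex(layer["data"])


if __name__ == "__main__":
    circuit = [("guard", b"key-g"), ("relay", b"key-r"), ("exit", b"key-e")]
    blob = wrap(circuit, "news.example.org:443", b"GET /")
    for name, key in circuit:
        next_hop, blob = peel(key, blob)
        print(f"{name} forwards to {next_hop}")
```

In this sketch the guard knows who the client is but sees only "send this blob to the relay"; the exit sees the destination and payload but only knows that they came from the relay.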
--

### Proxy usage

* usability
* tracking vs surveillance

???

Counterintuitively, however, using a privacy-enhancing proxy often means _not_ using features like TLS! The proxied mode of Tor needs to see your browser's traffic so that it can strip out lots of identifying information, etc. Thus, the current recommendation is to **not use Tor as a proxy**.

--

### Tor Browser

???

Instead, the Tor Browser integrates Tor within its own version of Firefox. This allows you to avoid the mistake of **failing to use Tor for some connection** (something the Dread Pirate Roberts could've used!). It also does other fingerprint-reducing things like **limiting screen size granularities**; it's recommended to not install any plugins beyond the included blockers.

---

# Hidden services

<img src="tor-hidden-circuit-cropped.png" align="right" width="500" alt="A hidden service being accessed through Tor"/>

### Rendezvous at a relay

--

### Client, server _both_ hiding

--

### a.k.a., "onion" services

e.g., <a href="https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/">duckduckgo[...]wzczad.onion</a>

--

### a.k.a., "dark web"

---

# "Onion" services

???

Onion services historically had shorter names like <a href="https://sml5wmpuq7ifq2mh.onion">sml5wmpuq7ifq2mh.onion</a>, but as time marched on, v3 onion services were required to use better cryptographic algorithms with longer data lengths (e.g., SHA-1 became SHA-3). Consequently, they now have names like <a href="http://a4zum5ydurvljrohxqp2rjjal5kro4ge2q2qizuonf2jubkhcr627gad.onion">a4zum5ydurvljrohxqp2rjjal5kro4ge2q2qizuonf2jubkhcr627gad.onion</a>.
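
As a rough illustration of why v3 names are 56 characters long, here is a sketch of the address encoding as I understand it from Tor's v3 rendezvous specification: the name is the base32 encoding of a 32-byte Ed25519 public key, a 2-byte SHA-3 checksum and a 1-byte version. The function names are my own, and the key in the usage example is a placeholder byte string rather than a real service key.

```python
import base64
import hashlib

ONION_VERSION = b"\x03"  # v3 onion services


def onion_checksum(pubkey: bytes) -> bytes:
    """First two bytes of SHA3-256 over a fixed prefix, the public key and the version."""
    return hashlib.sha3_256(b".onion checksum" + pubkey + ONION_VERSION).digest()[:2]


def encode_onion_address(pubkey: bytes) -> str:
    """Build a v3 name from 32 public-key bytes: base32(pubkey | checksum | version)."""
    if len(pubkey) != 32:
        raise ValueError("expected a 32-byte Ed25519 public key")
    raw = pubkey + onion_checksum(pubkey) + ONION_VERSION
    return base64.b32encode(raw).decode("ascii").lower() + ".onion"


def is_valid_onion_address(address: str) -> bool:
    """Check length, version byte and checksum of a v3 name (it does not contact the service)."""
    label = address[: -len(".onion")] if address.endswith(".onion") else address
    if len(label) != 56:
        return False
    try:
        raw = base64.b32decode(label.upper())
    except ValueError:
        return False
    pubkey, checksum, version = raw[:32], raw[32:34], raw[34:]
    return version == ONION_VERSION and checksum == onion_checksum(pubkey)


if __name__ == "__main__":
    placeholder_key = bytes(range(32))  # hypothetical key bytes, for illustration only
    print(encode_onion_address(placeholder_key))
    print(is_valid_onion_address(
        "duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion"))
```

Since 35 bytes is exactly 280 bits, the base32 form needs no padding, and the trailing version byte is why every v3 name ends in `d`. The checksum only catches typos: the name is self-authenticating because it contains the service's public key.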
--

.floatleft.column[
### Web services:

* <a href="https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/">DuckDuckGo</a>
* <a href="https://protonmailrmez3lotccipshtkleegetolb73fuirgj7r4o4vfu7ozyd.onion/">ProtonMail</a>
]

--

.floatleft.column[
### SecureDrop:

* <a href="http://gppg43zz5d2yfuom3yfmxnnokn3zj4mekt55onlng3zs653ty4fio6qd.onion/">CBC</a>
* <a href="http://a4zum5ydurvljrohxqp2rjjal5kro4ge2q2qizuonf2jubkhcr627gad.onion">Globe and Mail</a>
]

???

These names are a bit awkward to remember. As an alternative, the [Freedom of the Press Foundation](https://freedom.press/) (also: <a href="http://fpfjxcrmw437h6z2xl3w4czl55kvkmxpapg37bbopsafdu7q454byxid.onion/">http://fpfjxcrmw437h6z2xl3w4czl55kvkmxpapg37bbopsafdu7q454byxid.onion</a>), which is the organization behind SecureDrop, maintains a set of "onion names" for SecureDrop sites. Unlike DNS, which requires you to tell somebody what site you want to visit, onion names are distributed to everyone's Tor Browser ahead of time (though only to the desktop version at present). This allows you to access the above-named onion service at the much-easier-to-remember <a href="http://theglobeandmail.securedrop.tor.onion">theglobeandmail.securedrop.tor.onion</a>.

--

.floatleft.column[
### Others:
]

<a href="http://zqktlwiuavvvqqt4ybvgvi7tyo4hjl5xgfuvpdf6otjiycgwqbym2qad.onion/wiki/index.php/Main_Page">Hidden Wiki</a>

--

<a href="http://scihub22266oqcxt.onion">Sci-Hub</a> (now dead)

--

<a href="http://ciadotgov4sjwlzihbbgxnqg3xiyrg7so2r2o3lt5wz5ypk4sxyjstad.onion">CIA</a> (yes, _that_ CIA)

--

<a href="https://www.facebookwkhpilnemxj7asaniu7vnjjbiltxjqhye3mhbshg7kx5tfyd.onion/">Facebook</a> (!?)

--

.floatleft.column[
...
and untold other places, many of which are **not good** (the <a href="https://www.theguardian.com/technology/2013/oct/03/five-stupid-things-dread-pirate-roberts-did-to-get-arrested">now-defunct Silk Road</a> being just the tip of the iceberg)
]

---

# Cats and mice

--

### Blocking and Bridge nodes

???

If a government doesn't want people to use the uncensored Internet, they probably also don't want people to use Tor. Finding a Tor entry node relies on a [directory authority run by one of a small number of Tor volunteers](https://blog.torproject.org/introducing-bastet-our-new-directory-authority), but given that anyone can access any directory authority, censors can also block them. That's why Tor incorporates [bridge nodes](https://tb-manual.torproject.org/bridges) for use in places (companies or countries) that block Tor directory authorities. You can request a bridge node through Tor, over HTTP or even via email, and the full list isn't published.

--

### Traffic analysis*†

.footnote[
* "Low-cost traffic analysis of Tor", Murdoch and Danezis, in _Proceedings of the 2005 IEEE Symposium on Security and Privacy (S&P'05)_, 2005. DOI: <a href="https://doi.org/10.1109/SP.2005.12">10.1109/SP.2005.12</a>

† "Users get routed: traffic correlation on tor by realistic adversaries", Johnson, Wacek, Jansen, Sherr and Syverson, in _CCS '13: Proceedings of the 2013 ACM SIGSAC conference on Computer & Communications Security_, 2013. DOI: <a href="https://doi.org/10.1145/2508859.2516651">10.1145/2508859.2516651</a>
]

--

### Pluggable transport

???

Tor's [pluggable transport](https://tb-manual.torproject.org/circumvention) allows Tor to make its traffic profile look like another type of traffic. Does your country block Tor? Tor can make the traffic look like WebRTC (a videoconferencing protocol). Is WebRTC blocked? Tor can make it look like you're using a Microsoft website!
---

# The game is afoot

<img src="https://ichef.bbci.co.uk/news/660/cpsprodpb/D918/production/_84167555_84167554.jpg" alt="Sherlock" align="right" width="400"/>

### Cats and mice continue

--

### The story unfolds...

see, e.g., [PETS Symposium](https://www.petsymposium.org)

---

# Summary

### Online tracking and surveillance

### Remailing

### Tor

---

class: big, middle

The End.