Security and Privacy Analysis of Web Browsing
Web privacy
In an online world driven by digital advertising, web tracking has become pervasive and harmful to users’ privacy. As individuals browse the web, an intricate combination of cookies, hidden pixels, fingerprinting scripts, and sophisticated algorithms silently monitors their every click. As a result, web privacy technologies and web tracking techniques are locked in an ongoing tug-of-war.
OmniCrawl: Comprehensive Measurement of Web Tracking With Real Desktop and Mobile Browsers
The OmniCrawl project focuses on web tracking in the mobile environment. Most prior research has concentrated on desktop browsers or emulated mobile environments, potentially overlooking the nuances of mobile tracking. To address this gap, we built OmniCrawl, a web measurement infrastructure that drives real desktop and mobile browsers, and used it to expose the limitations of emulated mobile browsers in research. Using OmniCrawl, we found that the third-party advertising and tracking ecosystem in mobile browsers is more similar to that of desktop browsers than previously thought. We also show that common methodological choices in web measurement studies, such as the use of emulated mobile browsers and Selenium, can lead to website behavior that differs from what users actually experience.
Fig 1. OmniCrawl overview and workflow
Investigating Advertisers’ Domain-changing Behaviors and Their Impacts on Ad-blocker Filter Lists
The second study shifts the focus to ad blockers and their susceptibility to replica ad domains (RAD domains). Ad blockers traditionally rely on filter lists to block ads and trackers, but a rising trend of registering new domains that resemble the original ones has raised concerns about the efficacy of filter lists.
Fig 2. The percentage of ad domains and RAD domains, by purpose. The populations are ad domains (5,133) and RAD domains (420) with identifiable purposes, respectively.
We conducted an in-depth investigation of RAD domains to quantify their prevalence and impact. From a crawl of 50,000 websites, we identified 1,748 unique RAD domains, 1,096 of which survived for an average of 410.5 days before they were blocked. Additionally, we discovered that RAD domains affected 10.2% of the websites we crawled, and that 23.7% of the RAD domains exhibited privacy-intrusive behaviors, undermining ad blockers’ privacy protection.
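To illustrate why a newly registered domain can slip past a filter list, here is a minimal sketch of domain-based blocking. It assumes a small subset of filter syntax (rules of the form `||domain^`), and the rule set, domain names, and helper functions are hypothetical; it is not the matching logic of any particular ad blocker or of the study itself.

```python
from urllib.parse import urlsplit


def parse_domain_rules(lines):
    """Collect domains from rules shaped like ``||domain^`` (simplified subset)."""
    domains = set()
    for line in lines:
        line = line.strip()
        if line.startswith("||") and line.endswith("^"):
            domains.add(line[2:-1].lower())
    return domains


def is_blocked(url, blocked_domains):
    """Return True if the URL's host, or any parent domain of it, is listed."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    return any(".".join(labels[i:]) in blocked_domains for i in range(len(labels)))


# Hypothetical filter list: one known ad domain, one comment line.
rules = ["||tracker.example^", "! comment line"]
blocked = parse_domain_rules(rules)

original_url = "https://cdn.tracker.example/pixel.gif"
replica_url = "https://cdn.tracker-replica.example/pixel.gif"  # hypothetical replica domain

original_blocked = is_blocked(original_url, blocked)
replica_blocked = is_blocked(replica_url, blocked)
```

In this sketch the request to the listed domain is blocked, while the request to the replica domain passes until a list maintainer adds a rule for it.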
Website Fingerprinting
Website fingerprinting (WF) attacks allow an adversary who can observe a victim's traffic patterns to predict which website the victim is visiting. On the negative side, website fingerprinting can erode users’ privacy even when they use Tor. On the positive side, it can help detect abnormal websites based solely on their traffic patterns so that they can be blocked.
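The idea behind a WF attack can be sketched with a toy nearest-neighbor classifier. This is an illustration only: the trace encoding (signed packet sizes, positive for outgoing and negative for incoming), the four summary features, the site labels, and the traffic values are all assumptions made for the example, not the features or models used in the projects below.

```python
import math


def features(trace):
    """Summarize a trace as (outgoing count, incoming count, outgoing bytes, incoming bytes)."""
    outgoing = [size for size in trace if size > 0]
    incoming = [-size for size in trace if size < 0]
    return (len(outgoing), len(incoming), sum(outgoing), sum(incoming))


def predict(trace, labeled_traces):
    """Return the label of the training trace whose features are closest (Euclidean)."""
    x = features(trace)
    best_label, _ = min(
        ((label, math.dist(x, features(t))) for label, t in labeled_traces),
        key=lambda pair: pair[1],
    )
    return best_label


# Made-up training traces for two hypothetical sites.
training = [
    ("site-a.example", [500, -1500, -1500, -1500, 400, -1500]),
    ("site-b.example", [300, -600, 300, -600, 300, -600]),
]

# A made-up trace observed by the adversary; contents are never inspected.
observed = [520, -1500, -1500, -1400, 380, -1500]
guess = predict(observed, training)
```

The adversary only needs packet sizes and directions, which is why encryption alone does not prevent this kind of inference.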
FALCO: Detecting JavaScript-based Cyber Attack Using Website Fingerprints
We developed FALCO, a system to detect JavaScript injection attacks, which expose users to browser-based DDoS and unwanted ads. FALCO detects these attacks by identifying discrepancies in website behavior fingerprints, which are obtained from a website's external domain dependencies. FALCO achieves a 96.98% detection rate. We also offer an easy-to-use browser extension for users.
Fig 3. Obtaining a fingerprint from requests using a Bloom filter
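The fingerprinting idea can be sketched as follows: the external domains a page contacts are inserted into a Bloom filter, and two fingerprints are compared bit by bit. This is a minimal sketch under stated assumptions; the filter size, number of hash functions, SHA-256-based hashing, bit-difference comparison, and domain names are hypothetical choices for illustration and may differ from what FALCO actually uses.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter backed by a Python int used as a bit array."""

    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item):
        # Derive one bit position per seed from a salted SHA-256 digest.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def fingerprint(domains):
    """Build a fingerprint from the external domains a page requested."""
    bloom = BloomFilter()
    for domain in domains:
        bloom.add(domain.lower())
    return bloom


def differing_bits(a, b):
    """Count the bit positions where two fingerprints disagree."""
    return bin(a.bits ^ b.bits).count("1")


# Hypothetical external dependencies of a page on a clean visit.
baseline_domains = ["cdn.example", "fonts.example", "analytics.example"]
baseline = fingerprint(baseline_domains)

# A later visit where an injected script contacts one extra (hypothetical) domain.
later = fingerprint(baseline_domains + ["injected.example"])

# Hypothetical decision rule: flag any deviation from the baseline.
suspicious = differing_bits(baseline, later) > 0
```

A Bloom filter keeps the fingerprint compact and fixed in size regardless of how many domains a page loads, at the cost of occasional false positives on membership checks.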
Know Your Victim: Tor Browser Setting Identification via Network Traffic Analysis
In another study, we developed a method to identify users’ browser settings through network traffic analysis to address privacy concerns in the Tor network. We demonstrate that browser settings significantly affect traffic, and we build a classifier that achieves over 99% accuracy under closed-world assumptions. The project contributes insights into the relationship between browser settings and network behavior.
Fig 4. Feature Set Summary
Publications
- OmniCrawl: Comprehensive Measurement of Web Tracking With Real Desktop and Mobile Browsers. D. Cassel, S.-C. Lin, A. Buraggina, W. Wang, A. Zhang, L. Bauer, H.-C. Hsiao, L. Jia, and T. Libert. In Privacy Enhancing Technologies Symposium (PETS), July 2022. (Best Artifact Award)
- Investigating Advertisers’ Domain-changing Behaviors and Their Impacts on Ad-blocker Filter Lists. S.-C. Lin, K.-H. Chou, Y. Chen, H.-C. Hsiao, D. Cassel, L. Bauer, and L. Jia. In The Web Conference (TheWebConf, formerly known as WWW), April 2022.
- Capturing Antique Browsers in Modern Devices: A Security Analysis of Captive Portal Mini-Browsers. P.-L. Wang, K.-H. Chou, S.-C. Hsiao, A. T. Low, T. H.-J. Kim, and H.-C. Hsiao. To appear in The 21st International Conference on Applied Cryptography and Network Security (ACNS), June 2023. (Best Student Paper Award)
- FALCO: Detecting JavaScript-based Cyber Attack Using Website Fingerprints. C.-C. Liu, H.-C. Hsiao, and T. H.-J. Kim. In International Conference on Security and Cryptography (SECRYPT), July 2020.
- Know Your Victim: Tor Browser Setting Identification via Network Traffic Analysis. C.-M. Chang, H.-C. Hsiao, T. Lynar, and T. Mori. In the Poster Track of The Web Conference (TheWebConf), April 2022.