Internet Censorship

1. Overview of Internet censorship

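The Internet has long been a part of everyday life. The term "virtual environment" hardly fits anymore: the Internet is deeply woven into our lives and has become a convenient tool for obtaining information across regions, societies, nations, time, and space. However, among the countless pieces of information available, some parties object to particular content for various reasons (culture, convention, custom, religion, politics, and so on), and organizations, interest groups, companies, political parties, and even nations have begun to censor it so that ordinary people cannot easily reach it.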

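In general, censorship can be understood as the suppression of words, images, or ideas, including the methods used to suppress them. Online, censorship is often carried out technically at the network layer and above. Before the Internet, it was done by physically cutting landline telephone circuits or by interfering with signals (radio jamming or broadcast disruption). The simplest method today is to cut the Internet line itself, but unless a country fully controls its networks at the AS level, IP's packet-switching design means that many alternative routing paths remain available. Modern censorship, built on the TCP/IP standard, can be broadly divided into four approaches: blocking IP addresses or forcibly rerouting paths at the IP layer; disrupting connections at the TCP layer; abusing the behavior of the DNS protocol; and, at the application layer, blocking or redirecting URL access over HTTP, which has become the de facto standard for information exchange.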

 

2. Censorship at IP Layer

Access restriction at the IP layer takes two forms.

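The first is to configure an access control list (ACL) for the IP addresses themselves on a router, firewall, or other capable device. Blackhole routing on a router works along the same lines. The advantage is a fairly simple configuration, in which address ranges can be expressed with CIDR (Classless Inter-Domain Routing) notation, but every target IP address has to be looked up one by one. Even if that lookup is automated, blocking by IP address can unintentionally hit an entire organization or institution behind NAT rather than an individual, and it is very easy to circumvent through a proxy. Unwanted blocking also occurs when multiple hosts share a single IP address. In the UK, even the proxies for pirate sites have started to be blocked. (related article)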
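As a rough illustration of how a CIDR-based ACL decides what to drop, here is a minimal sketch using Python's standard `ipaddress` module. The blocklist, the function name `is_blocked`, and the addresses (taken from documentation ranges) are assumptions for illustration, not a real device configuration.

```python
import ipaddress

# Hypothetical blocklist; these are documentation prefixes, not real targets.
BLOCKED_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.128/25"),
]

def is_blocked(address: str) -> bool:
    """Return True if the address falls inside any blocked CIDR prefix."""
    ip = ipaddress.ip_address(address)
    return any(ip in prefix for prefix in BLOCKED_PREFIXES)

print(is_blocked("192.0.2.77"))     # inside 192.0.2.0/24
print(is_blocked("198.51.100.10"))  # outside 198.51.100.128/25
```

The decision depends only on the address, which is why every user behind one NAT address, or every site hosted on one shared IP, receives the same verdict.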

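The second is to use BGP (Border Gateway Protocol) to advertise bogus routes to neighboring ASes (autonomous systems). Although unintended, one of the most notorious cases of BGP misuse occurred when the Pakistani carrier Pakistan Telecom, acting on a blocking order from the PTA (Pakistan Telecommunication Authority), leaked a bogus route for a /24 (Class C sized) block of YouTube's address space to a neighboring AS, in effect hijacking it. At the time YouTube announced the 208.65.152.0/22 range. Pakistan Telecom's AS (17557) falsely announced that it owned the 24-bit prefix 208.65.153.0/24, and as the ASes that received this announcement passed it on to their own neighbors, traffic destined for YouTube (AS 36561) from around the world began flowing toward Pakistan Telecom within minutes. Because routers prefer the most specific matching prefix, the /24 won over YouTube's own /22. Pakistan Telecom had blackholed those addresses, so YouTube naturally became unreachable. The video below shows how the route information propagated. A detailed timeline is available here.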
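A minimal sketch of why a more specific announcement attracts the traffic, assuming a toy routing table that models only longest-prefix match (real BGP path selection considers many more attributes). The /22 is written with its network address, 208.65.152.0, because Python's `ipaddress` module rejects prefixes with host bits set. The destination address and the `lookup` helper are illustrative assumptions.

```python
import ipaddress

# Toy routing table: prefix -> origin AS. Only longest-prefix match is modeled.
routes = {
    ipaddress.ip_network("208.65.152.0/22"): "AS36561 (YouTube)",
}

def lookup(table, address):
    """Return the origin of the most specific prefix covering the address."""
    ip = ipaddress.ip_address(address)
    matches = [net for net in table if ip in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return table[best]

target = "208.65.153.238"  # arbitrary address inside the hijacked /24
before = lookup(routes, target)

# The bogus, more specific announcement arrives and is accepted.
routes[ipaddress.ip_network("208.65.153.0/24")] = "AS17557 (hijacker)"
after = lookup(routes, target)

print("before:", before)
print("after: ", after)
```

Addresses in the /22 that fall outside the hijacked /24 still resolve to the legitimate origin, which matches the idea that only the more specific block was captured.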

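South Korea is not free from Internet censorship either. Wikipedia gives a good overview of the state of censorship by country. Censorship is enforced for many reasons (political, religious, commercial, cultural, social), but it is hard to judge outright whether it is right or wrong. Finally, why not try the Deny Page Tests provided by Netsweeper to see which URLs are being filtered?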

[Paper] Your Botnet is My Botnet: Analysis of a Botnet Takeover

Title Your Botnet is My Botnet: Analysis of a Botnet Takeover
Author Brett Stone-Gross, Marco Cova, Lorenzo Cavallaro, Bob Gilbert, Martin Szydlowski, Richard Kemmerer, Christopher Kruegel, and Giovanni Vigna From UCSB
Publishing CCS ’09 Year 2009
Abstract Botnets, networks of malware-infected machines that are controlled by an adversary, are the root cause of a large number of security problems on the Internet. A particularly sophisticated and insidious type of bot is Torpig, a malware program that is designed to harvest sensitive information (such as bank account and credit card data) from its victims. In this paper, we report on our efforts to take control of the Torpig botnet and study its operations for a period of ten days. During this time, we observed more than 180 thousand infections and recorded almost 70 GB of data that the bots collected. While botnets have been “hijacked” and studied previously, the Torpig botnet exhibits certain properties that make the analysis of the data particularly interesting. First, it is possible (with reasonable accuracy) to identify unique bot infections and relate that number to the more than 1.2 million IP addresses that contacted our command and control server. Second, the Torpig botnet is large, targets a variety of applications, and gathers a rich and diverse set of data from the infected victims. This data provides a new understanding of the type and amount of personal information that is stolen by botnets.
Summary
1. Introduction (Approach)
  • Passive analysis of secondary effects caused by the activity of compromised machines
  • Active study of a botnet (Torpig) via infiltration
  • Properties of Torpig:
    a. transmits identifiers that permit us to distinguish between individual infections
    b. harvests data from various applications and information from the infected victims

2. Torpig Infrastructure and background

[Figure: Torpig network infrastructure]
(1) Background
  • Distributed to victims as part of Mebroot, which evades detection by manipulating the MBR (Master Boot Record)
  • Victims are infected through drive-by-download attacks (HTML tags included in web pages to request JavaScript)
  • The installer injects a DLL into “explorer.exe” and then loads a kernel driver (disk.sys)
  • Bots initially contact the C&C server to obtain malicious modules; all communication uses a sophisticated, custom encryption algorithm
  • Stolen data (e.g., stored passwords and accounts) is uploaded to the C&C server periodically
  • Phishing pages consisting of HTML forms are used to get victims to enter sensitive information
(2) Domain Flux
  • With domain flux, each bot uses a DGA (Domain Generation Algorithm) to compute a list of “rendezvous points” (domain names) that botmasters can use to control their bots
  • By comparison, with fast flux the bots query a single domain that maps onto a frequently changing set of IPs
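To make the domain flux idea concrete, here is a minimal sketch of a date-seeded DGA. This is not Torpig's actual algorithm: the seed string, the use of SHA-256, the label length, and the function name `generate_domains` are all assumptions chosen for illustration.

```python
import hashlib
from datetime import date

# Hypothetical parameters; this is NOT Torpig's real algorithm.
TLDS = ("com", "net")

def generate_domains(day: date, seed: str = "example-seed", count: int = 3):
    """Derive a deterministic list of candidate rendezvous domains for a date."""
    domains = []
    for i in range(count):
        material = f"{seed}:{day.isoformat()}:{i}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 10 hex digits onto the letters a-p to form the label.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:10])
        domains.append(f"{label}.{TLDS[i % len(TLDS)]}")
    return domains

print(generate_domains(date(2009, 1, 25)))
```

Because the list depends only on values every bot can compute, anyone who recovers the algorithm can compute the same list in advance, which is what makes registering the domains ahead of the botmaster possible.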
(3) Taking control of the botnet
  • By registering, ahead of the botmasters, the .com / .net domains that the bots would contact over a 3-week period
  • The traffic was sinkholed, yielding 8.7GB of web logs and 69GB of pcap data
3. Botnet analysis
(1) Stolen Data
  • AVG Antivirus Free v2.9 (or AVG)
  • Lookout Security & Antivirus v6.9 (or Lookout)
  • Norton Mobile Security Lite v2.5.0.379 (Norton)
  • TrendMicro Mobile Security Personal Edition v2.0.0.1294 (TrendMicro)
(2) Botnet Size
  • Counting bots by nid: 180,835,  by submission header fields: 182,914 machines
  • 1,247,642 unique IP addresses
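The gap between the two counts (roughly 6.9 IP addresses per nid) shows why counting IP addresses is a poor estimate of botnet size. A minimal sketch with made-up sinkhole records, assuming each record carries a bot identifier and a source address, shows how address churn inflates the IP count while NAT can hide bots behind one address:

```python
from collections import defaultdict

# Hypothetical sinkhole log records: (nid, source_ip). All values are made up.
records = [
    ("nid-A", "203.0.113.5"),
    ("nid-A", "203.0.113.9"),   # same bot, new DHCP lease
    ("nid-A", "203.0.113.23"),  # same bot again
    ("nid-B", "198.51.100.7"),
    ("nid-C", "198.51.100.7"),  # different bot behind the same NAT address
]

def summarize(log):
    """Count distinct bot identifiers and distinct source addresses."""
    ips_per_bot = defaultdict(set)
    for nid, ip in log:
        ips_per_bot[nid].add(ip)
    unique_ips = {ip for _, ip in log}
    return len(ips_per_bot), len(unique_ips)

bots, ips = summarize(records)
print(f"unique nids: {bots}, unique IPs: {ips}")
```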

(3) Some statistics from the paper

[Figure: bot statistics]

Note This is an interesting paper because it analyzes live, sensitive data harvested on the fly from machines infected by a real botnet. The active infiltration, impersonating the C&C server in order to take over the botnet, is also impressive. The paper discusses Torpig, a notorious botnet that was prevalent worldwide in 2009. The authors aim for a comprehensive understanding of Torpig, covering its infection path, static and dynamic analysis through reverse engineering, and analysis of the collected (stolen) data from diverse perspectives.

[Paper] Smashing the Gadgets: Hindering Return-Oriented Programming Using In-Place Code Randomization

Title Smashing the Gadgets: Hindering Return-Oriented Programming Using In-Place Code Randomization
Author Vasilis Pappas,  Michalis Polychronakis and Angelos D. Keromytis  From Columbia University
Publishing SP ’12 Proceedings of the 2012 IEEE Symposium on Security and Privacy Year 2012
Abstract

The wide adoption of non-executable page protections in recent versions of popular operating systems has given rise to attacks that employ return-oriented programming (ROP) to achieve arbitrary code execution without the injection of any code. Existing defenses against ROP exploits either require source code or symbolic debugging information, or impose a significant runtime overhead, which limits their applicability for the protection of third-party applications. In this paper we present in-place code randomization, a practical mitigation technique against ROP attacks that can be applied directly on third-party software. Our method uses various narrow-scope code transformations that can be applied statically, without changing the location of basic blocks, allowing the safe randomization of stripped binaries even with partial disassembly coverage. These transformations effectively eliminate about 10%, and probabilistically break about 80% of the useful instruction sequences found in a large set of PE files. Since no additional code is inserted, in-place code randomization does not incur any measurable runtime overhead, enabling it to be easily used in tandem with existing exploit mitigations such as address space layout randomization. Our evaluation using publicly available ROP exploits and two ROP code generation toolkits demonstrates that our technique prevents the exploitation of the tested vulnerable Windows 7 applications, including Adobe Reader, as well as the automated construction of alternative ROP payloads that aim to circumvent in-place code randomization using solely any remaining unaffected instruction sequences.

Summary
1. Introduction
ASLR/DEP can still be evaded because parts of the address space in Windows do not change, due to executables loaded at fixed addresses and/or shared libraries that are incompatible with ASLR. In addition, some exploits allow an attacker to calculate the base address of a DLL, either by brute force or through a leaked pointer.
 
2. Existing mitigations against ROP and their limitation
Technique: Limitation
  • ASLR/DEP
  • Compiler extensions
  • Code randomization (permutation of function order): requires precise and complete extraction of all code and data, which is only possible when debugging information is available
  • Control-flow integrity (binary instrumentation)
  • Runtime solutions: significant runtime overhead
3. Novel and practical approach: In-place code randomization
  • Case I: Atomic instruction substitution
    Idea: obfuscation, metamorphism
    The same computation can be performed by countless different instruction combinations
    Instructions of a gadget can be substituted by a functionally equivalent but different sequence of instructions
  • Case II: Intra basic block reordering
    Idea: using the dependence graph, the instructions within a basic block can be reordered
  • Case III: Reordering register preservation code
    Idea: As long as callee-saved registers (ebx, esi, edi, ebp) are restored in the right order, their actual order on the stack is irrelevant.
  • Case IV: Register Reassignment
    Idea: two registers can be swapped within parallel, self-contained regions, which are identified by drawing the CFG (Control Flow Graph)
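As an illustration of Case II, the following sketch reorders a made-up basic block by picking a random topological order of its dependence graph. It is a simplification rather than the paper's implementation: instructions are modeled only by the registers they read and write, flags and memory dependences are ignored, and all names are assumptions.

```python
import random

# Each instruction: (text, registers_read, registers_written).
# Made-up block; flags and memory dependences are deliberately ignored.
BLOCK = [
    ("mov eax, [ebp+8]", {"ebp"}, {"eax"}),
    ("mov ebx, [ebp+12]", {"ebp"}, {"ebx"}),
    ("add eax, 4", {"eax"}, {"eax"}),
    ("mov ecx, 0", set(), {"ecx"}),
    ("add eax, ebx", {"eax", "ebx"}, {"eax"}),
]

def depends(later, earlier):
    """True if `later` must stay after `earlier` (RAW, WAR, or WAW hazard)."""
    _, r1, w1 = earlier
    _, r2, w2 = later
    return bool(w1 & r2 or r1 & w2 or w1 & w2)

def random_reorder(block, rng):
    """Emit a random topological order of the block's dependence graph."""
    preds = {i: {j for j in range(i) if depends(block[i], block[j])}
             for i in range(len(block))}
    order, done = [], set()
    while len(order) < len(block):
        # An instruction is ready once everything it depends on is emitted.
        ready = [i for i in range(len(block))
                 if i not in done and preds[i] <= done]
        pick = rng.choice(ready)
        order.append(pick)
        done.add(pick)
    return [block[i] for i in order]

for text, _, _ in random_reorder(BLOCK, random.Random(7)):
    print(text)
```

Every ordering produced this way computes the same result and occupies the same number of bytes, but the bytes found at any given offset change, so a gadget that an attacker located in the original binary may no longer exist at that address.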
 
4. Results
  • Target: x86 PE executables – 5,235 PE files
  • On average, the applied transformations effectively eliminate about 10% of the gadgets
  • They also probabilistically break about 80% of the gadgets
Note This paper presents in-place code randomization, a technique against ROP attacks. The idea is to break gadgets with a limited number of binary transformations while keeping the size of the binary exactly the same. This is done with the help of disassembly using IDA Pro. The randomization is conservative, so the behavior of the original code is preserved. Although gadgets may still be found in the remaining text area, the technique is practical and effective at making it hard for attackers to exploit a gadget collection. The results show that the transformations eliminate about 10% of the gadgets and probabilistically break about 80%.