Visualizing DNS Traffic
Pin Ren, John Kristoff, Bruce Gooch

Introduction
The Domain Name System (DNS) is a general, distributed naming service, widely used in TCP/IP networks to refer to resources. Any host can run a complete resolver, but end systems generally use a lightweight "stub resolver", which sends requests to local caching name servers that perform the lookups on its behalf. On the Internet today, almost all users and applications make extensive use of DNS through lookups from a client resolver to one or more name servers.

While there have been some studies of DNS traffic over the years, we believe a number of insights into the operation of the Internet can be gained by looking at DNS traffic in non-traditional ways. In this paper we propose novel visualization methods that depict DNS data in order to better identify anomalies, misconfiguration, security events and overall trends. Throughout the long history of DNS, security concerns have been central to many proposed or implemented changes in its design and implementation. Weaknesses in the DNS protocols or in the operation of name servers have given rise to a number of malicious attacks, including two recent threats:

1. Reflection and amplification attacks: By their very nature, name servers tend to be widely accessible so that any client can look up data for which the server is authoritative. Currently, there is a large population of open recursive name servers accessible on the public Internet. DNS messages are typically delivered as a single request and response over UDP, with no widely deployed authentication mechanism between clients and servers. It can also be shown that a very small DNS request can solicit a disproportionately large response. The ability of many clients on many networks to forge their source address, coupled with the DNS attributes just mentioned, makes it possible for someone to launch a devastating amplified, reflective attack against an unwitting victim.

2. Cache poisoning attacks: Caching name servers, including many open recursive name servers, may be susceptible to cache poisoning. One notorious incident involved a DNS implementation that was too trusting with the data it received in the additional section of a response message, making it easy for a vulnerable server to associate incorrect IP addresses with names. More recently, it has been shown that cache poisoning is a threat shared by nearly all implementations if the attacker has enough patience and some knowledge about queries a caching server makes.
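
The amplification described in the first threat above can be illustrated with simple arithmetic. The following is a minimal sketch; the request and response sizes are hypothetical values chosen for illustration, not measurements from our data, and the function name is our own.

```python
def amplification_factor(request_bytes: int, response_bytes: int) -> float:
    """Ratio of the reflected response size to the forged request size."""
    if request_bytes <= 0:
        raise ValueError("request_bytes must be positive")
    return response_bytes / request_bytes


# Hypothetical sizes, chosen only for illustration.
REQUEST_BYTES = 60
RESPONSE_BYTES = 3000

factor = amplification_factor(REQUEST_BYTES, RESPONSE_BYTES)

# Traffic reaching the victim if the attacker sends 1 Mbit/s of forged requests
# and every request is answered by an open recursive name server.
attacker_mbps = 1.0
victim_mbps = attacker_mbps * factor
```

Under these assumed sizes, every unit of forged request traffic becomes fifty units of traffic directed at the victim.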

The two DNS threats mentioned above are our main motivations for this research. We propose a methodology that leverages visualization and human visual perception to identify, detect, classify, and analyze abnormal DNS querying behaviors. Our methods provide techniques for visualizing DNS queries and may allow domain experts to identify and solve the corresponding DNS security issues.


Results

Figure 1: The user interface of our DNS visualization tool. We divide the screen space into three panels: the main visualization window contains four different visualizations organized for tabbed browsing, the statistics panel displays the aggregated frequency data generated from the database, and the query detail panel displays detailed information for selected queries. In this figure, we visualize the source IPs and their aggregated querying frequency around midnight on May 30, 2005. The same data also drives the visualizations seen in the following four figures.
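
The aggregated frequency data shown in the statistics panel could be computed along the following lines. This is a minimal sketch, not the tool's actual implementation: the record layout (epoch seconds, source IP), the 60-second bin width, and the function name are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

BIN_SECONDS = 60  # assumed bin width; the text does not specify one


def aggregate_by_source(queries, bin_seconds=BIN_SECONDS):
    """Count queries per source IP, in total and per time bin.

    `queries` is an iterable of (epoch_seconds, source_ip) pairs.
    """
    totals = Counter()
    per_bin = defaultdict(Counter)
    for epoch_seconds, source_ip in queries:
        bin_index = int(epoch_seconds // bin_seconds)
        totals[source_ip] += 1
        per_bin[source_ip][bin_index] += 1
    return totals, per_bin


# Made-up sample records using documentation-range addresses.
sample_queries = [
    (0, "192.0.2.1"),
    (30, "192.0.2.1"),
    (61, "192.0.2.1"),
    (5, "192.0.2.2"),
]
totals, per_bin = aggregate_by_source(sample_queries)
ranked = totals.most_common()  # sources ordered by aggregated frequency
```

The per-bin counts are the kind of time series that each of the four visualizations would consume.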

Figure 2: The Flying Term metaphor employs motion to visualize the dynamic relationships of DNS queries, using frequency to regulate an object's vertical position and the frequency distribution over time to decide its horizontal position. Animated tail curves highlight the important (selected) queries and indicate change as well as history. In this figure, the visualized objects are source port numbers and their frequency data. Ports 1025 and 1026 have the highest aggregated frequency of occurrence (highlighted with a red circle). The direction of its curly tail shows that the frequency of port 1026 is still rising. Besides these hot ports, the movement of other ports is also easy to observe. For example, near the bottom of the screen, port 1313 is moving to the right (highlighted with a blue circle), which means that many queries were issued using port 1313 in the most recent time bin of this time window.
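
One plausible way to turn per-bin counts into a Flying Term position is sketched below. The exact mapping used by the tool is not given here, so this is an assumption: the vertical coordinate is the aggregated frequency normalized by a caller-supplied maximum, and the horizontal coordinate is the count-weighted mean bin index, so that activity in recent bins pulls an object to the right. The function name is hypothetical.

```python
def flying_term_position(bin_counts, max_total):
    """Map per-bin counts to a normalized (x, y) position in [0, 1] x [0, 1].

    y: aggregated frequency relative to `max_total` (assumed mapping).
    x: count-weighted mean bin index, 0.0 = oldest bin, 1.0 = newest bin
       (assumed mapping).
    """
    total = sum(bin_counts)
    if total == 0 or max_total <= 0:
        return 0.0, 0.0
    y = min(total / max_total, 1.0)
    if len(bin_counts) == 1:
        return 1.0, y
    center = sum(i * count for i, count in enumerate(bin_counts)) / total
    x = center / (len(bin_counts) - 1)
    return x, y


# All activity in the newest bin: the object sits at the far right.
recent = flying_term_position([0, 0, 0, 10], max_total=20)
# All activity in the oldest bin: the object sits at the far left.
stale = flying_term_position([10, 0, 0, 0], max_total=20)
```

Recomputing the position as the time window slides would produce the motion, and the sequence of past positions would give the tail curve.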

Figure 3: Stacking Graph visualization for the source IP data used in Figure 2. The color of each stream band matches the randomly assigned color used in the Flying Term view and the statistics panel in Figure 1, and the graphs are stacked from the bottom up. The global trends of the data streams are easy to observe.
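
Stacking from the bottom up amounts to using the running sum of the earlier series as the baseline of the next one. A minimal sketch, assuming each stream is a list of per-bin counts of equal length (the function name is our own):

```python
def stack_series(series):
    """Return one (lower, upper) pair of boundary lists per input series.

    The first series sits on a zero baseline; each later series sits on
    top of the upper boundary of the one before it.
    """
    length = len(series[0]) if series else 0
    baseline = [0] * length
    bands = []
    for values in series:
        if len(values) != length:
            raise ValueError("all series must have the same length")
        upper = [base + value for base, value in zip(baseline, values)]
        bands.append((baseline, upper))
        baseline = upper
    return bands


# Two made-up streams over three time bins.
bands = stack_series([[1, 2, 3], [4, 0, 1]])
```

The upper boundary of the last band is the total over all streams, which is why the global trend is visible at a glance.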

Figure 4: Two Tone Pseudo Color view for visualizing changes in data values. Note that values can be read fairly accurately even with very little display real estate, and that outliers are easy to notice. For example, the third source IP shows a high volume of queries in the first three minutes. From the visualization alone, the highest frequency in day 1 and day 3 can be read as roughly 750. (In the value-color meter above, full red corresponds to 753.) This can be confirmed in the top-right corner of the statistics panel in Figure 1, where the actual values for day 1 and day 3 are 751 and 753 respectively. Since color is used to present value, we label each object with its index number to the left of the visualization in order to establish correspondences.
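
The idea behind two-tone pseudo coloring is that a value falling between two adjacent colors of a discrete scale is drawn with both colors, and the share of the cell given to the upper color encodes where the value lies between them. The sketch below illustrates that idea only; the four-color palette and the function name are assumptions, and only the maximum of 753 comes from the figure.

```python
def two_tone(value, max_value, palette):
    """Return (lower_color, upper_color, upper_fraction) for one cell.

    `palette` is ordered from the lowest to the highest value.
    `upper_fraction` is the share of the cell drawn in `upper_color`.
    """
    if max_value <= 0 or len(palette) < 2:
        raise ValueError("need a positive max_value and at least two colors")
    clamped = min(max(value, 0), max_value)
    scaled = clamped / max_value * (len(palette) - 1)
    # Keep the index in range so the maximum maps to the last color pair.
    index = min(int(scaled), len(palette) - 2)
    return palette[index], palette[index + 1], scaled - index


# Hypothetical palette; full red corresponds to 753 as in the figure.
PALETTE = ["white", "yellow", "orange", "red"]
full = two_tone(753, 753, PALETTE)
half = two_tone(376.5, 753, PALETTE)
```

Because the reader only has to judge a proportion between two known colors, a value can be estimated fairly precisely even in a cell a few pixels tall.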

Figure 5: Chernoff Face View for monitoring. The two faces in the first column are reference faces showing the two extremes. The first, happy face is the reference for an all-zero data input, and the second, angry and unhappy face represents an imaginary input with the highest value in every time bin of the current time window. The color of each face is consistent with the previous views. As we discovered in Figure 4, the high-value outlier is the third object. It is also easy to pick out in this visualization: the third face in the second column is visibly different from the rest.


Case Studies

Figure 6: Visualizing the queries for "bad names" on a blacklist. Notice that two host name strings are moving together in the visualization on the left (highlighted with red circles). After filtering out other query names, the user can compare the two names in question pairwise in the Stacking Graph visualization on the right. Their data patterns were similar in this hour and, for some unknown reason, both showed a sudden drop in activity in the middle and recovered later. Further investigation using the query detail view showed that the two "bad" host names shared one common querying source IP address. The names may be associated with the same malicious server or with similar malware, which would result in congruent query activity patterns.
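
The visual impression that two names move together can be cross-checked numerically. The following sketch is not part of the tool: it computes a Pearson correlation between two per-bin count series and intersects the sets of querying sources. The function names and sample values are made up for illustration.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length series."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_a * var_b)


def shared_sources(sources_a, sources_b):
    """Source IPs that queried both names."""
    return set(sources_a) & set(sources_b)


# Made-up series with a drop in the middle, the second twice the first.
name_a = [8, 9, 2, 1, 7, 8]
name_b = [16, 18, 4, 2, 14, 16]
similarity = pearson(name_a, name_b)
common = shared_sources(["192.0.2.7", "192.0.2.9"], ["192.0.2.7"])
```

A coefficient near 1 together with a non-empty set of shared sources would support the kind of follow-up investigation described in the caption.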

Figure 7: Visualizing the source IP addresses and their querying frequency. While monitoring the Flying Term visualization on the left, the user noticed the sudden, rapid rise of one IP address (highlighted with a red circle). To investigate, the user clicked this IP address in the Flying Term visualization, which revealed the details of its querying behavior in this time window. The queries were all of type PTR (reverse lookups), a pattern associated with a well-known brute-force password attack on an SSH server.
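
The drill-down step in this case study, checking which query types a single source issued, can be sketched as follows. The record layout (source IP, query type) and the function name are assumptions for illustration, and the addresses are documentation-range placeholders rather than data from the case study.

```python
from collections import Counter


def query_type_share(queries, source_ip):
    """Fraction of each query type among the queries sent by `source_ip`.

    `queries` is an iterable of (source_ip, query_type) pairs.
    """
    counts = Counter(qtype for src, qtype in queries if src == source_ip)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {qtype: count / total for qtype, count in counts.items()}


# Made-up records: one source sends only PTR queries.
sample = [("192.0.2.10", "PTR")] * 4 + [("192.0.2.20", "A")]
share = query_type_share(sample, "192.0.2.10")
```

A source whose share is entirely PTR, as in the sample, matches the behavior the user found in the query detail view.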

Publications
  • Pin Ren, John Kristoff, Bruce Gooch. Visualizing DNS Traffic (pdf). To appear in Proceedings of VizSEC 2006, October 2006.

Pin Ren