Introduction
The Domain Name System (DNS) is a general, distributed
naming service, widely used in TCP/IP networks to refer to
resources. All hosts can run a complete resolving name client
application, but generally, end systems use a lightweight
"stub resolver", which sends requests to local caching name
servers to perform the lookups on their behalf. On the Internet
today, almost all users and applications make extensive
use of DNS through lookups from a client resolver
to one or more name servers.
While there have been some studies on DNS traffic over the
years, we believe a number of insights into the operation
of the Internet can be gained by looking at DNS traffic in
non-traditional ways. In this paper we aim to help fill this
gap by proposing novel visualization methods to depict DNS
data in order to better identify anomalies, misconfiguration,
security events and overall trends.
In the long history of DNS, security concerns have often
been central to proposed or implemented changes in
design and implementation, both past and present. Weaknesses
in the DNS protocols or in the operation of name servers have
given rise to a number of malicious attacks, including two
recent threats:
1. Reflection and Amplification attacks: By their
very nature, name servers tend to be widely accessible
so that any client can perform a lookup for data
that the server is authoritative for. Currently, there
is a large population of open recursive name servers
accessible on the public Internet. DNS messages
are typically delivered using a single request and response
over UDP with no widely deployed authentication
mechanism between clients and servers. It can
also be shown that a very small DNS request can solicit
a disproportionately large response. The ability
of many clients on many networks to
forge their source address, coupled with the deployed
DNS attributes mentioned above, makes it possible
for someone to launch a devastatingly amplified,
reflective attack against an unwitting victim.
2. Cache poisoning attacks: Caching name servers,
including many open recursive name servers, may be
susceptible to cache poisoning. One notorious incident
involved a DNS implementation that was too
trusting with the data it received in the additional section
of a response message, making it easy for a vulnerable
server to associate incorrect IP addresses with
names. More recently, it has been shown that cache
poisoning is a threat shared by nearly all implementations
if the attacker has enough patience and some
knowledge about queries a caching server makes.
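The amplification described in the first threat above can be expressed as a simple ratio of response size to request size. The following is a minimal sketch; the function name and the byte counts are illustrative assumptions, not measurements from this work.

```python
def amplification_factor(request_bytes: int, response_bytes: int) -> float:
    """Ratio of DNS response size to request size for one exchange."""
    if request_bytes <= 0:
        raise ValueError("request_bytes must be positive")
    return response_bytes / request_bytes


# Illustrative sizes only (assumed, not taken from the source).
factor = amplification_factor(request_bytes=60, response_bytes=3000)
print(f"amplification factor: {factor:.1f}x")
```

Because the request carries a forged source address, the victim receives the larger response, so this ratio approximates how much attack traffic each byte sent by the attacker can produce.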
The two DNS threats mentioned above are our main
motivations for this research. We propose a methodology
that leverages visualization and human visual perception
in the process of identifying, detecting, classifying,
and analyzing abnormal DNS query behavior.
Our methods provide techniques for visualizing DNS queries
and may allow domain experts to identify and solve the corresponding
DNS security issues.
Results
Figure 1: The user interface of our DNS visualization
tool. We divide the screen space into
three different panels: the main visualization window
contains four different visualizations organized
for tabbed browsing, the statistics panel displays the
aggregated frequency data generated from the database,
and the query detail panel displays detailed information
for selected queries. In this figure, we
visualize the source IPs and their aggregated query
frequency around midnight on May 30, 2005.
The same data also drives the visualizations in the following four figures.
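The aggregated frequency data shown in the statistics panel can be thought of as query counts grouped by source IP and time bin. The sketch below assumes a hypothetical input of (epoch seconds, source IP) pairs and a one-minute bin; the tool's actual database schema and bin size are not described here.

```python
from collections import Counter


def aggregate_by_source(queries, bin_seconds=60):
    """Count queries per (source IP, time bin index).

    queries: iterable of (epoch_seconds, src_ip) pairs (assumed layout).
    """
    counts = Counter()
    for epoch_seconds, src_ip in queries:
        bin_index = int(epoch_seconds) // bin_seconds
        counts[(src_ip, bin_index)] += 1
    return counts


def totals_by_source(counts):
    """Collapse per-bin counts into one aggregated frequency per source IP."""
    totals = Counter()
    for (src_ip, _bin_index), n in counts.items():
        totals[src_ip] += n
    return totals


# Documentation-range addresses used purely as sample data.
sample = [(0, "192.0.2.1"), (30, "192.0.2.1"), (61, "192.0.2.1"), (10, "192.0.2.7")]
per_bin = aggregate_by_source(sample)
print(totals_by_source(per_bin).most_common())
```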
Figure 2: The Flying Term metaphor employs motion
to visualize the dynamic relationships of DNS
queries, using frequency to regulate an object's vertical
position and frequency distribution over time
to decide the horizontal position. Animated tail
curves highlight the important (selected) queries
and indicate change as well as history. In this figure,
the visualized objects are source port
numbers and their frequency data. We can see
that ports 1025 and 1026 have the
highest aggregated frequency of occurrence (highlighted
with a red circle). The direction
of its curly tail shows that the frequency of
port 1026 is still rising. Besides these
hot ports, the movement of other ports can also
be easily observed. For example, near the bottom
of the screen, port 1313 is moving to the right
(highlighted with a blue circle), which means that
many queries were issued using port 1313 in the
most recent time bin of this time window.
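One way to read the Flying Term layout described above is sketched below. It is an assumption, not the tool's confirmed formula: the vertical position is the object's total frequency relative to the largest total, and the horizontal position is the count-weighted centroid of its time bins, so activity concentrated in the most recent bin pushes the object to the right.

```python
def flying_term_position(bin_counts, max_total):
    """Return (x, y) in [0, 1] for one object.

    bin_counts: counts per time bin in the current window, oldest first.
    max_total:  largest total frequency among all objects (for scaling y).
    The centroid rule for x is an illustrative assumption.
    """
    total = sum(bin_counts)
    if total == 0 or max_total <= 0:
        return 0.0, 0.0
    y = min(total / max_total, 1.0)
    last = len(bin_counts) - 1
    centroid = sum(i * c for i, c in enumerate(bin_counts)) / total
    x = centroid / last if last > 0 else 0.0
    return x, y


# A port queried only in the most recent bin sits at the far right.
print(flying_term_position([0, 0, 0, 4], max_total=8))
```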
Figure 3: Stacking Graph visualization for the
source IP data used in Figure 2. The color of each
stream band is consistent with the randomly
assigned color in the Flying Term view and the
statistics panel in Figure 1, and the graphs are stacked
from the bottom up. The global trends of the data
streams are easy to observe.
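Stacking from the bottom up means each band's lower edge is the upper edge of the band beneath it. A minimal sketch of that layout step follows; the function name and input layout are assumptions for illustration.

```python
def stack_bands(series):
    """Compute (lower, upper) edges per band for a stacked graph.

    series: list of per-object count lists, all the same length,
            ordered from the bottom band to the top band.
    """
    n_bins = len(series[0]) if series else 0
    baseline = [0] * n_bins
    bands = []
    for counts in series:
        upper = [b + c for b, c in zip(baseline, counts)]
        bands.append((baseline, upper))
        baseline = upper  # the next band starts where this one ends
    return bands


for lower, upper in stack_bands([[1, 2], [3, 4]]):
    print(lower, upper)
```

The top edge of the last band is the total across all objects in each time bin, which is why global trends are easy to read from this view.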
Figure 4: Two Tone Pseudo Color view for visualizing
changes in data values. Note that values can be
read fairly accurately even with very little display
space, and outliers are easy to notice.
For example, there is a high volume of query
occurrences in the first three minutes for the third
source IP. From the visualization alone, the highest
frequency on day 1 and day 3 can be read as roughly 750.
(In the value-color meter above, full red corresponds
to 753.) This can be confirmed in the
top-right corner of the statistics panel in Figure 1:
the actual values for day 1 and day 3 are 751 and
753, respectively. Since color is used to represent
value, we label each object with its index number
to the left of the visualization in order to establish
correspondences.
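Two-tone pseudo coloring, as the technique is generally described, shows a value with the two adjacent colors of a discrete scale that bracket it, split in proportion to where the value falls between them. The sketch below assumes a hypothetical four-color scale whose top color is red and a maximum of 753, matching the meter in the caption; the tool's actual color scale is not specified here.

```python
def two_tone(value, vmax, colors):
    """Return (lower_color, upper_color, upper_fraction) for one cell.

    upper_fraction is the share of the cell drawn in upper_color.
    colors: discrete scale from lowest to highest (assumed scale).
    """
    if vmax <= 0 or len(colors) < 2:
        raise ValueError("need vmax > 0 and at least two colors")
    steps = len(colors) - 1
    scaled = min(max(value / vmax, 0.0), 1.0) * steps
    index = min(int(scaled), steps - 1)  # keep the top value in the last interval
    fraction = scaled - index
    return colors[index], colors[index + 1], fraction


SCALE = ["blue", "green", "yellow", "red"]  # hypothetical scale
print(two_tone(751, 753, SCALE))
print(two_tone(753, 753, SCALE))
```

Reading a value back is the reverse: the pair of colors identifies the interval, and the split position gives the offset within it, which is why values remain legible in a very small cell.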
Figure 5: Chernoff Face View for monitoring. Note
that the two faces in the first column are reference
faces showing the two extremes. The first, happy face
is the reference for all-zero input data, and the
second, angry and unhappy face represents an imaginary
input with the highest value in every time
bin within the current time window. The color of
each face is consistent with the previous views. As we
discovered in Figure 4, the high-value outlier
is the third object. In this visualization, we can
also pick out this outlier easily: the third face in the
second column is visibly an outlier.
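A Chernoff face encodes several data values as facial features. The sketch below assumes each time bin in the window drives one feature, scaled so that 0 matches the happy reference face and 1 matches the angry one. The feature names are hypothetical; the caption does not say which features the tool uses.

```python
# Hypothetical feature names, one per time bin in the window.
FEATURES = ("mouth_frown", "eyebrow_slant", "eye_size", "face_width")


def face_parameters(bin_counts, max_count):
    """Map per-bin counts to facial feature levels in [0, 1].

    0.0 corresponds to the all-zero (happy) reference face and
    1.0 to the all-maximum (angry) reference face.
    """
    if max_count <= 0:
        raise ValueError("max_count must be positive")
    levels = [min(max(c / max_count, 0.0), 1.0) for c in bin_counts]
    return dict(zip(FEATURES, levels))


print(face_parameters([0, 0, 0, 0], max_count=10))      # happy reference
print(face_parameters([10, 10, 10, 10], max_count=10))  # angry reference
```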
Case Studies
Figure 6: Visualizing the queries for "bad names"
on a blacklist. Notice that two host name strings
are moving together in the visualization to the left
(highlighted using red circles). After filtering out
other query names, the user can conduct pairwise
comparison of the two names in question in the
Stacking Graph visualization to the right. Their
data patterns were similar in this hour, and for some
unknown reason, they both had a sudden drop of activity
in the middle and recovered later. Further
investigation using the query detail view showed that
these two "bad" host names shared a common
querying source IP address. The names may be associated
with the same malicious server or similar
malware, which results in congruent query activity
patterns.
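The check performed in the query detail view, whether two names share a querying source, amounts to a set intersection. This is a minimal sketch over a hypothetical log of (source IP, query name) pairs, using reserved example names and addresses rather than real blacklist entries.

```python
def shared_sources(query_log, name_a, name_b):
    """Return the source IPs that queried both names.

    query_log: iterable of (src_ip, query_name) pairs (assumed layout).
    """
    sources = {name_a: set(), name_b: set()}
    for src_ip, query_name in query_log:
        if query_name in sources:
            sources[query_name].add(src_ip)
    return sources[name_a] & sources[name_b]


log = [
    ("192.0.2.10", "bad-one.example"),
    ("192.0.2.10", "bad-two.example"),
    ("192.0.2.99", "bad-two.example"),
]
print(shared_sources(log, "bad-one.example", "bad-two.example"))
```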
Figure 7: Visualizing the source IP addresses and
their querying frequency. By monitoring the Flying
Term visualization to the left, the user noticed
the high-speed sudden rise of one IP address (highlighted
using red circle). In order to investigate, the
user clicked this IP address's name in the Flying Term
visualization, and the details of its querying behavior
in this time window were revealed. They were
all PTR type queries (reverse lookups), a pattern associated with a well-known
brute force password attack on an SSH server.
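The drill-down described above can be approximated by counting query types for the selected source. The sketch below assumes a hypothetical log of (source IP, query type) pairs and reports the share of PTR queries; a share of 1.0 corresponds to the all-PTR behavior seen in this case.

```python
from collections import Counter


def qtype_breakdown(query_log, src_ip):
    """Count query types issued by one source IP.

    query_log: iterable of (src_ip, qtype) pairs (assumed layout).
    """
    return Counter(qtype for ip, qtype in query_log if ip == src_ip)


def ptr_share(query_log, src_ip):
    """Fraction of the source's queries that are PTR (0.0 if it sent none)."""
    breakdown = qtype_breakdown(query_log, src_ip)
    total = sum(breakdown.values())
    return breakdown["PTR"] / total if total else 0.0


log = [("192.0.2.5", "PTR")] * 4 + [("192.0.2.8", "A"), ("192.0.2.8", "PTR")]
print(qtype_breakdown(log, "192.0.2.5"), ptr_share(log, "192.0.2.5"))
```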
Publications
- Pin Ren, John Kristoff, Bruce Gooch.
  Visualizing DNS Traffic (pdf),
  to appear in Proceedings of VizSEC 2006, October 2006.
Pin Ren