Surviving slashdot'ing with a small server
David P. Anderson
Department of Geological Sciences
Southern Methodist University
Recent slashdot events on this server:
June and August 2003
August 2005
Introduction
The science and technology news service at slashdot.org,
self-described "news for nerds," has become a widely read and extremely
popular website in the last few years, enough so that it has introduced
a new word into our vocabulary: to be "slashdotted." This
is a phenomenon in which a news item reported on slashdot.org draws
so much attention to a particular website that thousands of news-hungry
readers descend on that site en masse, creating temporary havoc for
the system and often crashing the targeted server.
Our server, www.geology.smu.edu,
has been slashdotted four times in the past few years. The server has also been
subjected to some large bursts of network activity focused around other
news media attention, particularly surrounding the destruction
of the Space Shuttle Columbia.
Our original departmental web/ftp server, a desktop machine, crashed ungraciously
in the midst of a large surge of server accesses in the spring of 2002. This was not the first time.
A postmortem identified a combination of network problems and hardware failure.
In the fall of 2002 www.geology.smu.edu was rebuilt as a two-headed Linux Virtual Server.
At the same time provisions were made for system redundancy using
the Linux High Availability utilities.
In this configuration two machines share the load of a single virtual web-server,
while performing out-of-band heartbeat signaling. This plus an automated backup system provides for a seamless
fail-over capability. Each machine has its own independent 100 Mbit
connection to the Gigabit SMU Internet service, and each has its own Uninterruptible
Power Supply (UPS).
If either machine fails, the other automatically
takes over its resources to provide continuous, uninterrupted
service. Similarly, each machine automatically releases the other's resources
when the failed machine comes back online. This is a scalable
architecture, though currently configured using only two machines.
We are working on a document that describes in more detail how the system was
built and configured.
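The take-over and release rule described above can be sketched in a few lines. This is only an illustration of the logic, not the Linux High Availability configuration itself; the node names, the addresses, and the `reconcile` function are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One machine of a two-node cluster (all names here are hypothetical)."""
    name: str
    alive: bool = True
    home: set = field(default_factory=set)  # resources this node normally owns
    held: set = field(default_factory=set)  # resources it is serving right now


def reconcile(a: Node, b: Node) -> None:
    """Apply the take-over / release rule after a heartbeat check."""
    for node, peer in ((a, b), (b, a)):
        if not node.alive:
            node.held = set()                  # a failed node serves nothing
        elif peer.alive:
            node.held = set(node.home)         # peer is up: serve only our own resources
        else:
            node.held = node.home | peer.home  # peer is down: take over its resources


# Placeholder addresses from the documentation range, not the real server's.
north = Node("north", home={"192.0.2.1"})
south = Node("south", home={"192.0.2.2"})
reconcile(north, south)

south.alive = False  # heartbeat from "south" stops
reconcile(north, south)
print("after failure: ", sorted(north.held), sorted(south.held))

south.alive = True   # "south" comes back online
reconcile(north, south)
print("after recovery:", sorted(north.held), sorted(south.held))
```

In this sketch each surviving node simply recomputes what it should be serving from the liveness of its peer, which is why the same rule covers both the fail-over and the later release.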
www.geology.smu.edu
Nominal Server Load
Our nominal access rate for the geology department web/ftp server has usually averaged around 4000 hits
per day, which corresponds to about 400 separate visits transferring 400 megabytes per day. The server
provides infrasound data streams for the Center for Monitoring Research and serves dynamic content
monitor displays of the TXAR and NVAR seismographic stations operated by SMU.
Figure 2
shows the daily statistics for a typical month, March, as generated from the Apache log
files using the Webalizer
utility.
Figure 2. Daily Usage for March 2003
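For readers who want to reproduce daily totals of this kind without Webalizer, the following is a minimal sketch that tallies hits, bytes, and distinct hosts per day. It assumes the logs are in Apache Common Log Format; the sample lines are invented, and counting distinct hosts is only a crude stand-in for Webalizer's notion of a "visit."

```python
import re
from collections import defaultdict
from datetime import datetime

# Apache Common Log Format: host ident user [time] "request" status bytes
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "[^"]*" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def daily_totals(lines):
    """Return {date: {"hits": n, "bytes": n, "hosts": set}} for parseable lines."""
    totals = defaultdict(lambda: {"hits": 0, "bytes": 0, "hosts": set()})
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # skip lines that do not look like Common Log Format
        when = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        day = totals[when.date()]
        day["hits"] += 1
        # Apache writes "-" when no body bytes were sent
        day["bytes"] += 0 if m.group("bytes") == "-" else int(m.group("bytes"))
        day["hosts"].add(m.group("host"))
    return dict(totals)


# Invented sample lines, not taken from the server's logs.
sample = [
    '192.0.2.10 - - [04/Mar/2003:09:15:02 -0600] "GET /index.html HTTP/1.0" 200 5120',
    '192.0.2.10 - - [04/Mar/2003:09:15:03 -0600] "GET /logo.gif HTTP/1.0" 200 20480',
    '192.0.2.77 - - [05/Mar/2003:23:59:59 -0600] "GET /index.html HTTP/1.0" 304 -',
]
for day, t in sorted(daily_totals(sample).items()):
    print(day, t["hits"], "hits,", t["bytes"], "bytes,", len(t["hosts"]), "hosts")
```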
Space Shuttle Columbia media attention.
With the media attention surrounding our infrasound
work
on the destruction of the Space
Shuttle Columbia in February, we saw access peaks around 42000 hits per
day, corresponding to about 2700 separate visits.
Figure 3
shows the daily statistics for February as generated from the Apache log
files.
Figure 3. Daily Usage for February 2003
The Columbia was destroyed on February 1st, and we published tentative
time, location, and magnitude estimates on the evening of February 3rd.
This was accompanied by a spike in web activity on the 4th and 5th, and
sustained above-average accesses for the next week. On February 14th,
Dr. Herrin held a press conference and presented his team's completed analysis,
and that information was repeated in the national media over the next two
weeks, producing the spike of activity surrounding the 21st.
Other months not shown here have seen similar peaks of activity, such
as that surrounding the publication of our paper on the search for Strange
Quark Matter at the Los Alamos National Laboratory archives, and the
publication of some of our Planetary
Imagery in the popular press and on display at the Smithsonian
Museum in Washington D.C.
January 2003 buzz.bazooka.se
Compare and contrast the statistics for February and March with those for
January 2003 in Figure 4, when our website was listed on
buzz.bazooka.se, a very popular technology news site in Sweden. (This graph was
originally misidentified as associated with a slashdot.org posting.)
Figure 4. Daily Usage for January 2003
The seemingly exponential decay of accesses is striking.
Notice the slight rebound around the 27th through the 29th,
like the behavior of an under-damped pendulum.
The fact that this decaying and rebounding pattern appears
to persist for almost two weeks seems remarkable. We note in passing that the
attention span of slashdot (or buzz.bazooka in this case) seems to follow
an exponential rate of decay.
May and November 2002 slashdot articles (no data)
Our strange quark matter research
paper was referenced on the front page of
slashdot 12 May 2002, about a week after the paper itself was
first published on the
Los Alamos National Laboratory archive server.
After a flurry of news stories in the summer and fall of 2002, the SQM paper was referenced again
on the front page of
slashdot 25 November 2002.
Neither of these postings provided links directly to the www.geology.smu.edu
web server. Nonetheless, the May event is the one which brought down the previous server, or
perhaps more correctly, the one during which the previous server died. The new two-headed
server began its logs in November, and they show a modest rise in activity toward the end of the month.
At that time it had not yet occurred to us what a cool data set we had, and we did not save the log files.
June and August 2003 slashdot articles
www.geology.smu.edu was listed on the front page of
slashdot 15 June 2003,
and again on the front page of
slashdot 10 August 2003.
Shown in the figures above are the two bursts of network activity on the geology department
web server that were generated by these two news items.
These two events are considerably larger in magnitude
and shorter in duration than the
other surges of
activity for which we have data: up to 415K hits per day and 36 Gbytes of file transfers per day.
Both events have a duration on the order of 40 hours: about two days.
These two figures plot the hit rate and bytes per hour, rather than per day, for the same
two events. The similarity is striking. Both events peak around 70k hits per hour and
6.0 GB per hour within a few minutes of their respective slashdot postings,
and decay exponentially thereafter, with a second peak about 15 hours later.
(Both articles were posted late in the afternoon, and the
second peak occurs between 6am and 10am the following morning.)
slashdot peak access and data rates:

                      | 15 June 2003 | 10 August 2003
peak hits per hour    | 78,000       | 62,231
peak hits per minute  | 1,850        | 1,930
peak bytes per hour   | 6.4 GB       | 7.0 GB
peak bytes per minute | 290 MB       | 520 MB
total hits serviced   | 1,042,751    | 303,891
total bytes served    | 102 GB       | 27 GB
The June event had about 3 times the hits and 3 times the megabytes of the August event. (Perhaps
two-wheel balancing robots are 3x more interesting than two-headed virtual web servers.)
The August event had greater bandwidth demands but for a narrower spike of time.
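The comparison can be checked with a little arithmetic on the figures in the table above. The sketch below also converts the peak per-minute transfer figures into megabits per second, for comparison with the 100 Mbit connection on each machine. It assumes the per-minute figures are megabytes of 10**6 bytes (the rows are labeled "bytes"); the variable and function names are illustrative.

```python
# Figures copied from the table above; "MB" is assumed to mean 10**6 bytes.
june = {"hits": 1_042_751, "gbytes": 102, "peak_mb_per_min": 290}
august = {"hits": 303_891, "gbytes": 27, "peak_mb_per_min": 520}


def mbit_per_second(mbytes_per_minute):
    """Convert megabytes per minute to megabits per second."""
    return mbytes_per_minute * 8 / 60


print(f"hits,  June/August: {june['hits'] / august['hits']:.2f}")
print(f"bytes, June/August: {june['gbytes'] / august['gbytes']:.2f}")
for label, event in (("June", june), ("August", august)):
    rate = mbit_per_second(event["peak_mb_per_min"])
    print(f"{label} peak minute: {rate:.1f} Mbit/s averaged over that minute")
```

Under these assumptions the ratios come out somewhat above 3, and the busiest minute of either event averages well under 100 Mbit/s, although a one-minute average says nothing about shorter bursts within that minute.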
Log scale graphs of the hit and data rates of the two events should produce straight-line plots if
the decay rate really is exponential, as illustrated in these next two images.
This graph
of web-server accesses for 15 June 2003 plots the data on a log scale. The linear
portion of the data corresponds to the exponentially decaying access rate over about 40 hours.
This graph for 10 August 2003 illustrates the same linear slope for a log scale plot
of web-server accesses, this time in 10 minute rather than 1 hour intervals. Here is the same data on a
linear plot.
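The straight-line test can also be done numerically: if the rate follows r(t) = r0 * exp(-k t), then ln r(t) is linear in t with slope -k, and a least-squares line through the logged data recovers k. The following is a minimal sketch on synthetic data; the 8-hour half-life and the function name are assumptions made for illustration, not values measured from the server logs.

```python
import math


def fit_exponential(times, rates):
    """Least-squares line through (t, ln(rate)); returns (initial_rate, decay_constant)."""
    logs = [math.log(r) for r in rates]  # rates must all be positive
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    slope = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs)) / sum(
        (t - mean_t) ** 2 for t in times
    )
    intercept = mean_l - slope * mean_t
    return math.exp(intercept), -slope


# Synthetic hourly hit counts, NOT taken from the server logs:
# 70,000 hits/hour decaying with an assumed 8-hour half-life.
half_life = 8.0
hours = list(range(40))
hits = [70_000 * 0.5 ** (h / half_life) for h in hours]

r0, k = fit_exponential(hours, hits)
print(f"initial rate ~ {r0:.0f} hits/hour, half-life ~ {math.log(2) / k:.2f} hours")
```

With real logs the fit would be applied only to the linear portion of the log-scale plot, since the second-morning peak departs from a pure exponential.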
slashdot rising:
This graph for
August
and the next graph for
June
show the initiation of the slashdot events plotted as hits-per-minute for the first 60 minutes.
Both show a sharp rise in
web accesses beginning within the same minute as the published time stamp of their respective slashdot
news articles. Both appear to reach a peak very near the
maximum number of hits-per-minute within the first 10 minutes. In fact they both reach about 75% of that
peak in the first two minutes after the article is posted. Amazing.
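A measurement like "75% of peak in the first two minutes" is easy to extract from a per-minute series. Here is a small sketch; the series is invented for illustration, with index 0 taken to be the minute in which the article is posted.

```python
def minutes_to_fraction_of_peak(hits_per_minute, fraction=0.75):
    """Index of the first minute whose count reaches `fraction` of the series maximum."""
    threshold = fraction * max(hits_per_minute)
    return next(i for i, hits in enumerate(hits_per_minute) if hits >= threshold)


# Invented hits-per-minute values, NOT the logged data.
series = [400, 1100, 1450, 1600, 1750, 1850, 1800, 1790]
print(minutes_to_fraction_of_peak(series))
```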
The slashdot precursor event:
Both the June and August logs have a small precursor event about 10 minutes before the
beginning of the slashdot onslaught. In both cases the nominal server rate of 5 hits per minute
ramps up to about 100 hits per minute and then back down, just before the main flood of server requests
begins. The connections are from an assortment of DSLs, cable ISPs, and unknowns.
Here
is a logscale plot of the hit and byte rates for the first 40 minutes of the August data,
which makes the precursor event more obvious.
Perhaps this precursor event consists of people (slashdot subscribers?) or services (mirrors?) with
advance notice of the articles, fetching the
linked site before the articles
are officially posted. Or perhaps this flurry of activity is actually part of the editorial process itself
by which a particular article is approved and posted to the Internet.
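One way to look for such a precursor in other access logs is to flag minutes that rise well above the nominal rate before the main flood begins. The sketch below does this with simple thresholds. The baseline of 5 hits per minute comes from the text above, but the burst factor, the flood level, and the sample series are assumptions chosen for illustration.

```python
def find_precursor(hits_per_minute, baseline=5, burst_factor=4, flood_level=1000):
    """Return (burst_minutes, onset): minutes above burst level that precede the flood."""
    # Onset is the first minute at or above the (assumed) flood level.
    onset = next(
        (i for i, h in enumerate(hits_per_minute) if h >= flood_level),
        len(hits_per_minute),
    )
    # Precursor minutes are those before onset that exceed a multiple of baseline.
    burst = [i for i in range(onset) if hits_per_minute[i] >= burst_factor * baseline]
    return burst, onset


# Invented per-minute counts, NOT the logged data: a quiet baseline, a small
# bump to about 100 hits per minute, quiet again, then the flood.
series = [5, 4, 6, 30, 100, 60, 8, 5, 6, 5, 4, 5, 6, 5, 1200, 1600, 1800]
burst, onset = find_precursor(series)
print("precursor minutes:", burst, "flood begins at minute", onset)
```

A threshold test like this would also flag any gradual ramp immediately before the flood, so it suits only events where, as here, the main rise is nearly instantaneous.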
slashdot feedback
The comments section for user feedback on the slashdot.org archive provided useful
information about the access rates
of the server during these events.
The time stamps on the comments are EDT (GMT-4) and can be used to correlate the
comments with the web server data logs.
That was especially true for the 10 August 2003 posting. That slashdot article was
about the slashdotting response, and it produced a slashdotting response.
Furthermore, members of the slashdot community then spontaneously
devised impromptu methods for testing the response and bandwidth of the server, in effect designing a custom
slashdotting response for this event.
The recursive and self-referential nature of the whole thing makes my head spin.
Here are a few postings which provided some useful data points.
Observations
So far so good!
A high-bandwidth Internet connection with load sharing
and static content seem to be the keys. The current system serves up some dynamic content, but it
is all generated on other machines, so it is effectively static content to the web server.
Transfers of large data files like
images and mpegs appear to matter less than available memory and CPU cycles.
The virtual flash crowds (flashdot?) associated with news organizations and sites like slashdot.org and
buzz.bazooka.se exhibit interesting patterns of behavior that can produce huge demands on a small server.
The extreme exponential rise and long slow decay seem to be quite similar for these two
events. Others have observed the same patterns. Posted in the slashdot comments, localroger
reported, "Yeah, I've seen that too," and
bob_jenkins reported, "I saw something similar from the BBC when they linked to one of my pages
last September." Both of these
graphs appear to exhibit the same rapid rise and exponential decay of server accesses. Finally,
titanium_angel provided a link to a
paper by Stephen Adler which looks in detail at similar events.
I've seen no other comments on what we have termed "the slashdot precursor event." However, I believe I can
see a similar precursor pattern in two of the figures from Stephen Adler's paper,
here
and perhaps
here. It would be interesting to know if the slashdot subscriber program was in effect when his data
were gathered (20 October 1998).
If this is indeed a part of the process of selecting and posting an article, it would also be interesting to
know if this is unique to slashdot.org
and their particular distributed form of editorial process, or is a more common phenomenon associated
with news service postings on the Internet in general.
These findings are difficult to interpret in some ways because of the unique nature of the data set,
which exists nowhere else on the Internet outside of the geology department web server's access logs.
Further, because of the unpredictable nature of the news media and media attention, it is difficult
if not impossible to replicate the experience. We are reduced to waiting for someone in the SMU
Department of Geological Sciences to do something else newsworthy...
In the meantime, there are university departments and government laboratories and others with "phat pipes"
who might also benefit from a low-cost high-reliability web server that can be assembled on a modest
budget and yet is able to handle the bursts of network activity associated with sudden Internet flash crowds.
25 Aug 2003
dpa
Update: 12 August 2005: The jBot webpage was
slashdotted on 01 August 2005 at
9:08pm EDT.
At the height of the excitement we sustained
about 50,000 hits and 27 GB of data
per hour. I'll post the stats after we've had
a chance to examine the logs in more detail.
The overall event lasted more than 40 hours, generating nearly a million hits and moving about 335 Gigabytes of data.