Surviving slashdot'ing with a small server

David P. Anderson
Department of Geological Sciences
Southern Methodist University


Recent slashdot events on this server:

June and August 2003
August 2005



Introduction

The science and technology news service at slashdot.org, self-described "news for nerds," has become a widely read and extremely popular website in the last few years, enough so that it has introduced a new word into our vocabulary: to be "slashdotted." This is a phenomenon in which a news item reported on slashdot.org draws so much attention to a particular website that thousands of news-hungry readers descend on that site en masse, creating temporary havoc for the system and often crashing the targeted server.

Our server, www.geology.smu.edu, has been slashdotted four times in the past few years.   The server has also been subjected to some large bursts of network activity focused around other news media attention, particularly surrounding the destruction of the Space Shuttle Columbia.

Our original departmental web/ftp server, a desktop machine, crashed ungraciously in the midst of a large surge of server accesses in the spring of 2002. This was not the first time. Postmortem identified a combination of network problems and hardware failure. In the fall of 2002 www.geology.smu.edu was rebuilt as a two-headed Linux Virtual Server. At the same time provisions were made for system redundancy using the Linux High Availability utilities. In this configuration two machines share the load of a single virtual web server while performing out-of-band heartbeat signaling. This, plus an automated backup system, provides a seamless fail-over capability. Each machine has its own independent 100 Mbit connection to the Gigabit SMU Internet service, and each has its own Uninterruptible Power Supply (UPS).

If either machine fails, the other automatically takes over its resources to provide smoothly continuous, uninterrupted service. Similarly, one will automatically release the other's resources when/if the other (failed) machine comes back online. This is a scalable architecture, though currently configured using only two machines. We are working on a document that describes in more detail how the system was built and configured.
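The take-over and release behavior described above can be sketched as a toy model. This is only an illustration of the idea, not the Linux High Availability implementation: the `Node` class, the `update` function, and the names `head-a`, `head-b`, `vip-1`, and `vip-2` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One machine in a hypothetical two-node cluster."""
    name: str
    home_resources: frozenset          # resources this node normally serves
    active: set = field(default_factory=set)

    def __post_init__(self):
        # A healthy node starts out serving only its own resources.
        self.active = set(self.home_resources)


def update(node: Node, peer: Node, peer_alive: bool) -> None:
    """Adjust what `node` serves, given whether the peer's heartbeat is heard."""
    if peer_alive:
        # Peer is up: hand back anything borrowed from it.
        node.active -= peer.home_resources
    else:
        # Peer is silent: serve its resources in addition to our own.
        node.active |= peer.home_resources


a = Node("head-a", frozenset({"vip-1"}))
b = Node("head-b", frozenset({"vip-2"}))

update(a, b, peer_alive=False)         # head-b fails
after_failure = sorted(a.active)
update(a, b, peer_alive=True)          # head-b comes back online
after_recovery = sorted(a.active)
print(after_failure, after_recovery)
```

In a real deployment the heartbeat detection, resource ownership, and address take-over are handled by the cluster software; the sketch only shows the symmetric take-over/release rule.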

 


www.geology.smu.edu






Nominal Server Load

Our nominal access rate for the geology department web/ftp server has usually averaged around 4000 hits per day, which corresponds to about 400 separate visits transferring 400 megabytes per day. The server provides infrasound data streams for the Center for Monitoring Research and serves dynamic content monitor displays of the TXAR and NVAR seismographic stations operated by S.M.U. Figure 2 shows the daily statistics for a typical month, March, as generated from the Apache log files using the Webalizer utility.
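The figures here were produced by Webalizer, but a daily hit count of this kind can also be approximated directly from an Apache access log. The following is a minimal sketch, assuming the standard Common/Combined Log Format timestamp field; the function name `hits_per_day` and the sample lines are made up for illustration and are not real log data.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the timestamp field of Apache Common/Combined Log Format,
# e.g. [01/Mar/2003:09:15:02 -0600]
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")


def hits_per_day(lines):
    """Count requests per calendar day, in the log's own time zone."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.search(line)
        if match is None:
            continue  # skip malformed lines
        stamp = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        counts[stamp.date().isoformat()] += 1
    return dict(counts)


# Synthetic sample lines, not taken from the server's logs.
sample = [
    '10.0.0.1 - - [01/Mar/2003:09:15:02 -0600] "GET / HTTP/1.0" 200 5120',
    '10.0.0.2 - - [01/Mar/2003:23:59:59 -0600] "GET /a.png HTTP/1.0" 200 900',
    '10.0.0.1 - - [02/Mar/2003:00:00:01 -0600] "GET / HTTP/1.0" 200 5120',
]
daily = hits_per_day(sample)
print(daily)
```

Passing an open log file instead of `sample` works the same way, since the function only iterates over lines. Counting "visits" as Webalizer does requires grouping requests by client and time window, which this sketch does not attempt.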
 


March Daily Statistics

Figure 2.   Daily Usage for March 2003



Space Shuttle Columbia media attention.

With the media attention surrounding our infrasound work on the destruction of the Space Shuttle Columbia in February, we saw access peaks around 42000 hits per day, corresponding to about 2700 separate visits.   Figure 3 shows the daily statistics for February as generated from the Apache log files.
 


February Daily Statistics

Figure 3.   Daily Usage for February 2003



The Columbia was destroyed on February 1st, and we published tentative time, location, and magnitude estimates on the evening of February 3rd.  This was accompanied by a spike in web activity on the 4th and 5th, and sustained above-average accesses for the next week.  On February 14th,  Dr. Herrin held a press conference and presented his team's completed analysis, and that information was repeated in the national media over the next two weeks, producing the spike of activity surrounding the 21st.

Other months not shown here have seen similar peaks of activity, such as that surrounding the publication of our paper on the search for Strange Quark Matter at the Los Alamos National Laboratory archives, and the publication of some of our Planetary Imagery in the popular press and on display at the Smithsonian Museum in Washington D.C.
 
January 2003 buzz.bazooka.se

Compare and contrast the statistics for February and March with those for January 2003 in Figure 4, when our website was listed on buzz.bazooka.se, a very popular technology news site in Sweden. (This graph was originally misidentified as being associated with a slashdot.org posting.)

January 2003 Daily Usage

Figure 4.  Daily Usage for January 2003



The seeming exponential decay of accesses is striking. Notice the slight rebound around the 27th through the 29th, like the behavior of an under-damped pendulum. The fact that this decaying and rebounding pattern appears to persist for almost 2 weeks seems remarkable. We note in passing that the attention span of slashdot (or buzz.bazooka in this case) seems to follow an exponential rate of decay.



May and November 2002 slashdot articles (no data)

Our strange quark matter research paper was referenced on the front page of slashdot 12 May 2002, about a week after the paper itself was first published on the Los Alamos National Laboratory archive server. After a flurry of news stories in the summer and fall of 2002, the SQM paper was referenced again on the front page of slashdot 25 November 2002.

Neither of these postings provided links directly to the www.geology.smu.edu web server. Nonetheless the May event is the one which brought down the previous server. Or perhaps more correctly, the one during which the previous server died. The new two-headed server began its logs in November and they show a modest rise in activity toward the end of the month. At that time it had not yet occurred to us what a cool data set we had, and we did not save the log files.



June and August 2003 slashdot articles

www.geology.smu.edu was listed on the front page of slashdot 15 June 2003,  and again on the front page of slashdot 10 August 2003.

June 2003 slashdotting August 2003 slashdotting


Shown in the figures above are the two bursts of network activity on the geology department web server that were generated by these two news items. These two events are considerably larger in magnitude and shorter in duration than the other surges of activity for which we have data: up to 415K hits per day and 36 Gbytes of file transfers per day. Both events have a duration on the order of 40 hours: about two days.
Hits and bytes per hour Jun 2003 Hits and bytes per hour Aug 2003

These two figures plot the hit rate and bytes per hour, rather than per day, for the same two events. The similarity is striking. Both events peak around 70k hits per hour and 6.0 Gb/hour within a few minutes of their respective slashdot postings, and decay exponentially thereafter, with a second peak about 15 hours later. (Both articles were posted late in the afternoon and the second peak occurs between 6am and 10am the following morning).



slashdot peak access and data rates:

                         15 June 2003    10 August 2003
peak hits per hour       78,000          62,231
peak hits per minute     1,850           1,930
peak bytes per hour      6.4 Gb          7.0 Gb
peak bytes per minute    290 Mb          520 Mb
total hits serviced      1,042,751       303,891
total bytes served       102 Gb          27 Gb


The June event had about 3 times the hits and 3 times the megabytes of the August event. (Perhaps two-wheel balancing robots are 3x more interesting than two-headed virtual web servers). The August event had greater bandwidth demands but for a narrower spike of time.
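The ratios quoted above, and a rough sense of what the peak minutes mean in terms of line rate, can be checked with a few lines of arithmetic. This is a sketch under one loud assumption: that "Gb" and "Mb" in the table mean decimal gigabytes and megabytes, which the text does not confirm. The dictionary keys and the helper `mbit_per_second` are names invented for this example.

```python
# Values copied from the table above. ASSUMPTION: the table's "Gb"/"Mb"
# are read as decimal gigabytes/megabytes.
june = {"hits": 1_042_751, "gbytes": 102, "peak_mbytes_per_min": 290}
august = {"hits": 303_891, "gbytes": 27, "peak_mbytes_per_min": 520}

hit_ratio = june["hits"] / august["hits"]
byte_ratio = june["gbytes"] / august["gbytes"]


def mbit_per_second(mbytes_per_minute):
    """Convert megabytes per minute to megabits per second."""
    return mbytes_per_minute * 8 / 60


print(f"hits ratio  {hit_ratio:.2f}")
print(f"bytes ratio {byte_ratio:.2f}")
print(f"June peak   {mbit_per_second(june['peak_mbytes_per_min']):.1f} Mbit/s")
print(f"August peak {mbit_per_second(august['peak_mbytes_per_min']):.1f} Mbit/s")
```

Under that assumption the two ratios come out somewhat above 3, and the busiest August minute corresponds to roughly 69 Mbit/s averaged over the minute.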

Log scale graphs of the hit and data rates of the two events should produce straight-line plots if the decay rate really is exponential, as illustrated in these next two images.

Log scale hits and bytes per hour Jun 2003
This graph of web-server accesses for 15 June 2003 plots the data on a log scale. The linear portion of the data corresponds to the exponentially decaying access rate over about 40 hours.

Log scale hits and bytes per 10 minutes Aug 2003 Linear scale hits and bytes per 10 minutes Aug 2003
This graph for 10 August 2003 illustrates the same linear slope for a log scale plot of web-server accesses, this time in 10 minute rather than 1 hour intervals. Here is the same data on a linear plot.
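The straight-line test can also be done numerically: if the hit rate decays exponentially, a least-squares line through the logarithm of the rate gives the decay constant directly. The sketch below uses synthetic data, not the server logs; the 70,000 hits/hour starting rate and 8-hour half-life are arbitrary illustration values, and `fit_exponential_decay` is a name chosen for this example.

```python
import math


def fit_exponential_decay(hours, hits):
    """Fit a line to (t, ln(hits)); return (initial_rate, half_life_hours)."""
    logs = [math.log(h) for h in hits]   # requires every rate to be > 0
    n = len(hours)
    mean_t = sum(hours) / n
    mean_y = sum(logs) / n
    slope = (
        sum((t - mean_t) * (y - mean_y) for t, y in zip(hours, logs))
        / sum((t - mean_t) ** 2 for t in hours)
    )
    intercept = mean_y - slope * mean_t
    # rate(t) = exp(intercept) * exp(slope * t); slope is negative for decay.
    return math.exp(intercept), math.log(2) / -slope


# Synthetic data, NOT taken from the server logs.
hours = list(range(0, 40, 4))
hits = [70_000 * 0.5 ** (t / 8) for t in hours]

initial_rate, half_life = fit_exponential_decay(hours, hits)
print(round(initial_rate), round(half_life, 2))
```

On real data the fit would be restricted to the linear portion of the log plot, since the second morning peak and the eventual return to the nominal rate both depart from a pure exponential.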




slashdot rising:

This graph for August and the next graph for June show the initiation of the slashdot events plotted as hits-per-minute for the first 60 minutes.

Both show a sharp rise in web accesses beginning within the same minute as the published time stamp of their respective slashdot news articles. Both appear to reach a peak very near the maximum number of hits-per-minute within the first 10 minutes. In fact they both reach about 75% of that peak in the first two minutes after the article is posted. Amazing.
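A rise-time figure like the "75% of peak in two minutes" observation can be computed by binning request times into minutes after the posting time. This is a minimal sketch with synthetic timestamps; the posting time of 17:00 and the per-minute counts are hypothetical, and `minutes_to_fraction_of_peak` is a name invented here.

```python
from collections import Counter
from datetime import datetime, timedelta


def minutes_to_fraction_of_peak(request_times, posted_at, fraction=0.75, window=60):
    """Bin requests by minute after `posted_at`.

    Returns (per_minute_counts, first minute whose count >= fraction * peak).
    """
    counts = Counter()
    for t in request_times:
        minute = int((t - posted_at).total_seconds() // 60)
        if 0 <= minute < window:
            counts[minute] += 1
    per_minute = [counts[m] for m in range(window)]
    threshold = fraction * max(per_minute)
    first = next(m for m, c in enumerate(per_minute) if c >= threshold)
    return per_minute, first


# Synthetic example, NOT real log data: 2, 8, 10, then 9 requests
# in the first four minutes after a hypothetical 17:00 posting.
posted_at = datetime(2003, 8, 10, 17, 0)
rates = [2, 8, 10, 9]
request_times = [
    posted_at + timedelta(minutes=m, seconds=s)
    for m, n in enumerate(rates)
    for s in range(n)
]

per_minute, first = minutes_to_fraction_of_peak(request_times, posted_at)
print(per_minute[:4], first)
```

The returned index is the zero-based minute after posting, so a value of 1 means the threshold was crossed during the second minute.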

Hits per minute for rising edge Aug 2003 Hits per minute for rising edge Jun 2003


The slashdot precursor event:

Both the June and August logs have a small precursor event about 10 minutes before the beginning of the slashdot onslaught. In both cases the nominal server rate of 5 hits per minute ramps up to about 100 hits per minute and then back down, just before the main flood of server requests begins. The connections are from an assortment of DSLs, cable ISPs, and unknowns.

Here is a log-scale plot of the hit and byte rates for the first 40 minutes of the August data, which makes the precursor event more obvious.
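One simple way to pick such a precursor out of a per-minute series is to look for separate runs of minutes that sit well above the nominal rate. The sketch below is an assumption-laden illustration, not the method used to find the event in our logs: the threshold factor of 5, the function name `find_bursts`, and the series itself are all made up.

```python
def find_bursts(per_minute, baseline, factor=5):
    """Return (start, end) index pairs of contiguous runs above factor * baseline."""
    bursts, start = [], None
    for i, count in enumerate(per_minute):
        above = count > factor * baseline
        if above and start is None:
            start = i                      # a run begins
        elif not above and start is not None:
            bursts.append((start, i - 1))  # the run ended on the previous minute
            start = None
    if start is not None:
        bursts.append((start, len(per_minute) - 1))
    return bursts


# Synthetic series, NOT real log data: a 5 hits/minute baseline, a small bump
# to about 100, a return to baseline, then the main surge.
series = [5, 5, 6, 40, 100, 60, 5, 4, 5, 6, 5, 5, 5, 900, 1500, 1850, 1800]
bursts = find_bursts(series, baseline=5)
print(bursts)
```

With this series the first run is the small bump and the second is the main flood; a precursor would show up as any run that ends before the final one begins.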

Perhaps this precursor event is composed of people (slashdot subscribers?) or services (mirrors?) with advance notice of the articles, fetching the linked site before the articles are officially posted. Or perhaps this flurry of activity is actually part of the editorial process itself by which a particular article is approved and posted to the Internet.



slashdot feedback

The comments section for user feedback on the slashdot.org archive provided useful information about the access rates of the server during these events. The time stamps on the comments are EDT (GMT-4) and can be used to correlate the comments with the web server data logs.
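Lining up comment times with log entries is easiest if both are converted to a common time zone. The following is a minimal sketch that converts an EDT (GMT-4) timestamp to UTC; the timestamp string and its format are hypothetical, and the name `comment_time_to_utc` is invented for this example. Log timestamps that carry their own UTC offset can be converted to UTC in the same way.

```python
from datetime import datetime, timedelta, timezone

EDT = timezone(timedelta(hours=-4), "EDT")   # GMT-4, as stated above


def comment_time_to_utc(text, fmt="%Y-%m-%d %H:%M"):
    """Parse a comment timestamp given in EDT and return it in UTC."""
    local = datetime.strptime(text, fmt).replace(tzinfo=EDT)
    return local.astimezone(timezone.utc)


# Hypothetical comment timestamp, for illustration only.
utc_time = comment_time_to_utc("2003-08-10 17:30")
print(utc_time.isoformat())
```

A fixed offset is used deliberately, since the comment times are stated to be EDT; a full time zone database would only matter for data spanning a daylight-saving change.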

First peak, hits and bytes, with comments Second peak, hits and bytes, with comments
Click on the images for time-aligned comments and bandwidth reports.


That was especially true for the 10 August 2003 posting. That slashdot article was about the slashdotting response, and it produced a slashdotting response. Furthermore, members of the slashdot community then spontaneously devised impromptu methods for testing the response and bandwidth of the server, in effect designing a custom slashdotting response for this event. The recursive and self-referential nature of the whole thing makes my head spin.

Here are a few postings which provided some useful data points.



Observations

So far so good! A high-bandwidth Internet connection with load sharing and static content seem to be the keys. The current system serves up some dynamic content, but it is all generated on other machines. So it is effectively static content to the web server. Transfers of large data files like images and mpegs appear to matter less than available memory and CPU cycles.

The virtual flash crowds (flashdot?) associated with news organizations and sites like slashdot.org and buzz.bazooka.se exhibit interesting patterns of behavior that can produce huge demands on a small server. The extreme exponential rise and long slow decay seem to be quite similar for these two events. Others have observed the same patterns. Posted in the slashdot comments, localroger reported, "Yeah, I've seen that too," and bob_jenkins reported, "I saw something similar from the BBC when they linked to one of my pages last September." Both of these graphs appear to exhibit the same rapid rise and exponential decay of server accesses. Finally, titanium_angel provided a link to a paper by Stephen Adler which looks in detail at similar events.

I've seen no other comments on what we have termed "the slashdot precursor event." However, I believe I can see a similar precursor pattern in two of the figures from Stephen Adler's paper, here and perhaps here. It would be interesting to know if the slashdot subscriber program was in effect when his data were gathered (20 October 1998). If this is indeed a part of the process of selecting and posting an article, it would also be interesting to know if this is unique to slashdot.org and their particular distributed form of editorial process, or is a more common phenomenon associated with news service postings on the Internet in general.

These findings are difficult to interpret in some ways because of the unique nature of the data set, which exists nowhere else on the Internet outside of the geology department web server's access logs. Further, because of the unpredictable nature of the news media and media attention, it is difficult if not impossible to replicate the experience. We are reduced to waiting for someone in the SMU Department of Geological Sciences to do something else newsworthy...

In the meantime, there are university departments and government laboratories and others with "phat pipes" who might also benefit from a low-cost high-reliability web server that can be assembled on a modest budget and yet is able to handle the bursts of network activity associated with sudden Internet flash crowds.

25 Aug 2003
dpa



Update: 12 August 2005: The jBot webpage was slashdotted on 01 August 2005 at 9:08pm EDT.
At the height of the excitement we sustained about 50,000 hits and 27 GB of data per hour. I'll post the stats after we've had a chance to examine the logs in more detail. The overall event lasted more than 40 hours, generating nearly a million hits and moving about 335 Gigabytes of data.




