User:DrewJensen/why why

From Apache OpenOffice Wiki
Revision as of 23:10, 15 September 2007 by DrewJensen (Talk | contribs)

Analysis done by TerryE in April of this year. Original Posting at OOoForum

OOoForum is sustaining about 3 transactions per second in the heavy hours, generating over 6 Gbytes of traffic per day, and has a database of around 400 Mbytes. This is quite a profile, guys: heavier than I had anticipated until I cranked the numbers.

                                          #Topics   #Replies       #Views   Mbytes
Snapshot at 17-Feb                         40,129    129,430   13,302,181
Snapshot at 30-Mar                         44,299    145,265   15,910,312
Missed records                              6,137     18,055    2,394,527
(% missed)                                    14%        12%          15%
Delta (over 41 days)                        4,170     15,835    2,608,131
Delta per day                                 102        386       63,613
Uplifted daily delta                          116        434       73,187
Size per Topic View (60 Kbytes)
MBytes Viewed                                                                4,188
Forum Views (say 50%)                                              36,593
Size per Forum View (70 Kbytes)
MBytes Viewed                                                                2,443
Total Mbytes/day                                                             6,631
Transactions per min (200% peak uplift)                               152
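
The arithmetic behind the views column and the traffic estimate can be retraced with a short script. This is only a sketch under assumptions the posting does not state: that the uplift multiplies the daily delta by (1 + missed fraction), that "200% peak uplift" means twice the daily average, and that 1 Kbyte is taken as 1000 bytes while 1 Mbyte is 1024 * 1024 bytes. These choices were picked because they appear to reproduce the figures in the table; the variable and function names are illustrative.

```python
# Figures copied from the views column of the table above.
views_17_feb = 13_302_181
views_30_mar = 15_910_312
missed_views = 2_394_527
days = 41

# Assumption: the uplift scales the daily delta by (1 + missed fraction).
missed_fraction = missed_views / views_30_mar
topic_views_per_day = (views_30_mar - views_17_feb) / days * (1 + missed_fraction)

# "Forum Views (say 50%)": forum views taken as half of the topic views.
forum_views_per_day = 0.5 * topic_views_per_day


def mbytes(views, kbytes_per_view):
    """Convert a view count to Mbytes.

    Assumption: 1 Kbyte = 1000 bytes and 1 Mbyte = 1024 * 1024 bytes.
    """
    return views * kbytes_per_view * 1000 / (1024 * 1024)


topic_mbytes = mbytes(topic_views_per_day, 60)   # 60 Kbytes per topic view
forum_mbytes = mbytes(forum_views_per_day, 70)   # 70 Kbytes per forum view
total_mbytes = topic_mbytes + forum_mbytes

# Assumption: "200% peak uplift" means the peak rate is twice the average.
peak_per_minute = 2 * (topic_views_per_day + forum_views_per_day) / (24 * 60)
peak_per_second = peak_per_minute / 60

print(round(topic_views_per_day), round(forum_views_per_day))
print(round(topic_mbytes), round(forum_mbytes), round(total_mbytes))
print(int(peak_per_minute), round(peak_per_second, 1))
```

The peak rate works out to roughly 2.5 view transactions per second, which is in line with the "about 3 transactions per second" quoted above.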

Next is an analysis of the spread of posting. Here I have analysed the posts per poster, ranked the posters by their totals, and grouped them into bands.

Post Bands    #Posters    #Posts
1-5             27,780    52,842
6-15             3,219    27,560
16-50              875    21,865
51-200             194    16,468
201-1,000           58    25,090
1,000+              30    68,583

From this you can see that the view transactions in fact dominate, with around 130 views for every post. This is partly because many of us preview and view while preparing a post, but there is also a huge body of "read-only" guests who are continually browsing the forum (as I write this there are 2 power contributors, 3 occasional users and 94 guests logged onto OOoForum).
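
To see how posting is spread across the bands, the post bands table can be turned into shares of posters and shares of posts. The sketch below only re-uses the figures from that table; the totals and percentages are derived here and are not stated in the original posting, and the names are illustrative.

```python
# (band, number of posters, number of posts), copied from the post bands table.
bands = [
    ("1-5", 27_780, 52_842),
    ("6-15", 3_219, 27_560),
    ("16-50", 875, 21_865),
    ("51-200", 194, 16_468),
    ("201-1,000", 58, 25_090),
    ("1,000+", 30, 68_583),
]

total_posters = sum(posters for _, posters, _ in bands)
total_posts = sum(posts for _, _, posts in bands)

# Share of all posters and of all posts that falls into each band.
shares = {
    band: (posters / total_posters, posts / total_posts)
    for band, posters, posts in bands
}

for band, (poster_share, post_share) in shares.items():
    print(f"{band:>10}  {poster_share:7.2%} of posters  {post_share:7.2%} of posts")
```

On these figures the 30 posters in the 1,000+ band account for roughly a third of all posts, while the 1-5 band holds most of the posters but about a quarter of the posts.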

Analysis of usage rates / time of day

[Image:PostsByToD.png Posts By Time of Day]

and growth rates in posting transactions per month

[Image:PostsByMoth.png Posts By Month]

A second analysis was performed in June of this year. Full posting at

These are in pairs (#Replies, #Views) for the three downloads that I've done: 18 Feb, 30 Mar, 29 Jun. Some magic numbers:
  • We've been running pretty steady at 190 posts per day on OOoForum. The hourly averages vary from about 3% to 6% of the daily total, with the peak window at midnight GMT, which equates to 12 posts per hour or one every 5 mins, with peak bursts maybe 2-3x that. We have on average 4 posts per topic, so that's 40-50 new topics per day.
  • OOoForum had about 1.9M views in this same period, which equates to 21K per day, or 900 on average per hour. Quite different from the 12 posts.
  • At first I got a bizarre correlation between the number of views and the number of posts per topic. I thought that this was due to bot activity, but then I looked at individual sets for, say, 3 replies and got a very different pattern. I wrote a little routine to histogram by # of replies, and this showed a common pattern which is best seen in the attached graph. This is a log-lin plot of a histogram of view counts in the last 90 days by topic.

[Image:Histogram1.png Histogram]

This has a strong negative exponential characteristic. One of the strongest causal mechanisms for this is that people's criteria for deciding to view a given post are largely independent. (Though there was a small collection of top posts, so the top 1,000 topics picked up 0.5M of the 1.9M views.)
  • The reason that the number of views is strongly correlated with the number of replies is that (i) as a message's content is approximately proportional to the number of replies, people tend to hit larger messages when searching, and (ii) users tend to regard messages with lots of replies and views as interesting.
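
A log-lin histogram of the kind described above can be sketched in a few lines. This is not the original routine, which is not shown in the posting; it is a minimal illustration that assumes the histogram counts topics per view-count bin and that a straight-line fit to the logged counts is a fair test for a negative exponential. The function names, the bin width and the sample data are all made up for the example.

```python
import math
from collections import Counter


def loglin_histogram(view_counts, bin_width):
    """Bin per-topic view counts and return (bin centre, log(topics in bin))."""
    bins = Counter(v // bin_width for v in view_counts)
    return [((b + 0.5) * bin_width, math.log(n)) for b, n in sorted(bins.items())]


def fit_line(points):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up data, NOT forum data: the number of topics halves in each
# successive bin of 100 views, i.e. an exact negative exponential.
synthetic_views = [50] * 8 + [150] * 4 + [250] * 2 + [350]

slope, intercept = fit_line(loglin_histogram(synthetic_views, bin_width=100))
print(slope)  # negative slope: the decay rate per view on the log-lin plot
```

For a negative exponential the logged counts fall on a straight line with a negative slope. A handful of heavily viewed topics, such as the top 1,000 mentioned above, would show up as a tail lying above that line.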