news news news

  • news news news

    November 29, 2006 - 19:00 UTC
    Last night we had a brief outage when our data download server lost its mount of the file server that holds the workunits. This was strictly a random failure, and it resolved itself. We saw similar behavior back when the entire system was under heavier load. We then released the enhanced client, which greatly reduced the rate of workunit/result exchange and therefore the occurrence of these load-related problems. Thanks to Moore's Law and an ever-increasing user base, we'll need to address this issue sooner rather than later.

    The other lingering, randomly occurring problem has to do with "rough periods" accessing the database (see earlier tech news items for details). Basically, what's going on is this: every 24 hours or so a process dumps all useful user/host/team stats to XML files, which other sites can download and use to generate leader boards, graphs, etc. These tables have continually grown in size, and apparently when this process runs it can knock the result table out of memory. The feeder process, which keeps a healthy queue of available work to send out to users, needs the result table in memory; otherwise a sub-second query to select more work becomes a multi-minute query that reads the whole result table back into memory from disk. We're looking into making these queries more efficient.
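    As a rough illustration of a stats dump like the one described above, here is a minimal Python sketch that reads a table in small, primary-key-ordered batches and writes the rows to XML. It is not the actual BOINC export code: the `user_stats` table, its columns, the `export_user_stats` function, and the use of SQLite are all assumptions made for the example. Batching keeps each individual query short, but on its own it does not guarantee that the dump stops competing with other tables for database memory.

```python
import sqlite3
import xml.etree.ElementTree as ET


def export_user_stats(conn, batch_size=1000):
    """Return an XML string of user stats, read in key-ordered batches.

    Hypothetical schema: user_stats(id INTEGER PRIMARY KEY, name, total_credit).
    """
    root = ET.Element("users")
    last_id = 0
    while True:
        # Keyset pagination: resume after the last id seen instead of
        # scanning the whole table in a single long-running query.
        rows = conn.execute(
            "SELECT id, name, total_credit FROM user_stats "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        for uid, name, credit in rows:
            user = ET.SubElement(root, "user")
            ET.SubElement(user, "id").text = str(uid)
            ET.SubElement(user, "name").text = name
            ET.SubElement(user, "total_credit").text = str(credit)
        last_id = rows[-1][0]
    return ET.tostring(root, encoding="unicode")
```

    A site consuming the dump would then parse the resulting XML to build its own leader boards or graphs.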
    We're also looking at setting up a new BOINC database server (remember that the BOINC database is separate from the SETI@home-specific science database, which is already on a new server and working well). Recently Intel donated several pieces of hardware to us, including a quad dual-core Xeon system (i.e. eight 3GHz cores in total). We're currently working out some system quirks, but once we trust it we'll make it our master BOINC database server, and the current one will become a replica. This will provide an immediate backup if needed and remove the need for the weekly outages. More to come on that. Another recently donated Intel system has already been set up and is being used as a backend science CPU server (and to read new data from hard drives sent up from Arecibo).

    The last of the known never-touched classic data tapes was read last week and is in the splitter queue. Next we will start reading tapes that have gone through the pipeline in some form or another but for some reason never made it into our master database. Possible reasons include bad data (hopefully not), a tape drive failure that left the tapes unread (more common than you'd think), poor initial analysis, or database corruption leading to failure during redundancy checking. So don't be upset when tapes from the late '90s appear in the queue. Data from 1998 is worth the same as data taken in 2006. The ETs we are looking for are light years away, so a few years won't make any difference when looking for signals that repeat consistently over time.
    According to the latest official figures, 43% of all statistics are totally worthless...