Brad Fitzpatrick (bradfitz) wrote in lj_nifty,

Last unclustered post

I couldn't think where to post this, but then I remembered lj_nifty!

Last unclustered post:
(last unclustered comment was talkid=33436284)

Add one to that URL and it won't work.

Another interesting fact: the main database is now only 4.3 GB. Back when everything was unclustered, it had grown to over 50 GB.

More interesting: the table that maps old URLs to new URLs (just the numbers, actually, not the URLs) accounts for 2.7 GB of that 4.3 GB. Now that it's no longer being populated, I can do some things to the table to make it smaller. Even as it stands, 4.3 GB is small enough to run entirely in memory.
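
For a rough sense of the proportions, here's a quick back-of-the-envelope sketch in Python. The variable names are made up for illustration, and since the old size was "over 50 GB", the 50 is treated as a lower bound, so the real shrinkage is at least this much.

```python
# Sizes quoted in the post, in GB.
old_main_db_gb = 50.0   # "over 50 GB", so this is a lower bound (assumption)
main_db_gb = 4.3        # main database today
mapping_table_gb = 2.7  # old-number -> new-number mapping table

# Share of the current main database taken up by the mapping table.
mapping_share_pct = round(mapping_table_gb / main_db_gb * 100, 1)

# Everything in the main database other than the mapping table.
remaining_gb = round(main_db_gb - mapping_table_gb, 1)

# Current size as a percentage of the old (lower-bound) size.
current_vs_old_pct = round(main_db_gb / old_main_db_gb * 100, 1)

print(f"mapping table: {mapping_share_pct}% of the main database")
print(f"everything else: {remaining_gb} GB")
print(f"main database is at most {current_vs_old_pct}% of its old size")
```

So the mapping table is a bit under two thirds of what's left, and everything else fits in about 1.6 GB.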

Another interesting thing: the data that the directory needs is only 400 MB, which can easily run in memory, potentially on many different machines. (current project. sssh. :P)

