Being a bit evil
17.03.2008 19:20
One thing I learned at Zemanta is never to underestimate the processing power and memory needed to do anything non-trivial (and also a lot of trivial things) with English Wikipedia dumps. After spending some time with these huge XML files, you gradually learn from your mistakes and accept this as a fact.
The resources needed to process Wikipedia have also become something of a recurring inside joke at the office, especially when they have to be explained to someone new to the field:
(this is, of course, a completely unofficial imitation of an xkcd comic)
Posted by Ideas