Currently set to No Index

Optimizing Split Sizes for Hadoop’s CombineFileInputFormat

Many of the challenges of using Hadoop with small files are well-documented. But there’s one thing I haven’t seen a lot of discussion about: optimizing the maximum split size for the CombineFileInputFormat and its derivatives (e.g., CombineTextInputFormat). But this is actually a pretty big issue – improper configuration can cause out-of-memory errors or degraded performance.

Read Full Story

The second and third best features of LyX you aren’t using.

LyX is a WYSIWYG editor for latex files. It’s a little bit clunky to use at first, and isn’t perfect (thank you, open source developers– I’m not ungrateful!) but after becoming familiar with it, it’s probably the single piece of software that has most improved my productivity. I like it so much I use it not just for papers but often as a scratchpad for math.

Read Full Story