The main purpose of data mining and analytics is to find novel, potentially useful patterns that can be utilized in real-world applications to derive beneficial knowledge. For identifying and evaluating the usefulness of different kinds of patterns, many techniques/constraints have been proposed, such as support, confidence, sequence order, and utility parameters (e.g.
The utility numfmt, part of GNU Coreutils, formats numbers. Its main uses are grouping digits and converting to and from unit suffixes like k for kilo and M for mega. This is somewhat useful for individual invocations, but as with most command-line utilities, the real value comes from using it as part of a pipeline.
The --grouping option will separate digits according to the rules of your locale.
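As a rough illustration of the uses described above, here is a minimal sketch that assumes GNU numfmt is installed and on the PATH. The sample numbers are made up for demonstration, and the grouping result depends on whichever locale is active on your system.

```shell
# Unit suffix to plain number: SI uses powers of 1000, IEC powers of 1024.
numfmt --from=si 1M      # prints 1000000
numfmt --from=iec 1M     # prints 1048576

# Plain number to unit suffix.
numfmt --to=si 1500000   # prints 1.5M
numfmt --to=iec 2048     # prints 2.0K

# In a pipeline, numfmt reads one number per line from standard input.
printf '%s\n' 2048 1048576 | numfmt --to=iec

# Digit grouping follows the active locale; in the C locale nothing changes.
numfmt --grouping 1234567
```

In practice the pipeline form is typically fed by another command's output rather than printf, with numfmt turning raw byte or item counts into something readable.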
A different form of statistical analysis could prove beneficial, but I think the main thing to keep in mind is that data mining algorithms only show you what trends exist in the data; they do not prove anything conclusively. If a trend is found in the data, that is the beginning rather than the end of the research.
Law is our main system of official blame; it is how we officially blame people for things. So it is a pretty big deal that, over the last few centuries, changes to law have induced big changes in who officially blames who for most things that go wrong. These changes may be having big bad effects.
Long ago most everyone could use law to blame most everyone else.
One of the main projects I worked on last year.
Data for Breakfast
Recently, Automattic created a Marketing Data team to support marketing efforts with dedicated data capabilities.
One of the main projects I’ve been working on over the past year.
Data for Breakfast
A generalized machine learning pipeline, pipe serves the entire company and helps Automatticians seamlessly build and deploy machine learning models to predict the likelihood that a given event may occur, e.g., installing a plugin, purchasing a plan, or churning.
The main difference between data analysis today and data analysis a decade or two ago is the way that we interact with it. Previously, the role of statistics was primarily to extend our mental models by discovering new correlations and causal rules. Today, we increasingly delegate parts of our reasoning processes to algorithmic models that live outside our mental models.