I am a regular reader of Louis Columbus on Forbes. He recently wrote a great article on the impact of machine learning on global manufacturing. There are plenty of ways that manufacturers are using machine learning to bolster their bottom line.
I first wrote about the fake data menace here (that was 2017) and again (2018). Earlier this year, I discussed an article in the Hustle about fake Amazon reviews, and made a video about it.
This holiday season, Buzzfeed finally noticed. Here’s a nice article detailing the scams that abuse online review platforms.
I recently wrote a blog post here
comparing the number of CRAN downloads an R package gets relative to its number of
stars on GitHub. What I didn’t really think about during my analysis was whether or
not scraping CRAN was a violation of its Terms and Conditions. I simply copied and
pasted some code from R-bloggers
that seemed to work and went on my merry way.
… or how I stopped worrying and wrote a blog post to remember it ad infinitum.
Magrittr’s pipe operator is one of those newish R-universe features that I
really want to have around whenever I put some lines into an R-console.
This is even TRUE when writing a package.
So the first thing I do is put magrittr into the DESCRIPTION file and add
A couple of weeks ago I wrote a guest post on churn prediction for Kissmetrics, and they just published it.
Churn prediction is one of the most popular Big Data use cases in business. It consists of detecting which customers are likely to cancel a subscription to a service, based on how they use that service.
A few months ago I attempted to understand the natural gradient, and wrote a post to help organize what I knew. Unfortunately there was too little detail and all I really understood was a “black box” version of the natural gradient: what it did, not how it worked on the inside.
h1. facebook like translations for your rails app
p(meta). 07 October 2009
Last year I wrote a plugin for “Sanbit”:http://sanbit.com, the language learning site where I was working at the time. It lets you add a Facebook-style translation system to your application. I call it “Sanbit Translations”:http://github.com/jtoy/sanbit_translations.
A few years ago, I wrote a post Don’t teach built-in plotting to beginners (teach ggplot2). I argued that ggplot2 was not an advanced approach meant for experts, but rather a suitable introduction to data visualization.
Many teachers suggest I’m overestimating their students: “No, see, my students are beginners…”.
A year ago today, I wrote up a blog post Text analysis of Trump’s tweets confirms he writes only the (angrier) Android half.
My analysis, shown below, concludes that the Android and iPhone tweets are clearly from different people, posting during different times of day and using hashtags, links, and retweets in distinct ways.
In 2012, I wrote a paper that I probably should have called “truncated bi-level optimization”. I vaguely remembered telling the reviewers I would release some code, so I’m finally getting around to it.
The idea of bilevel optimization is quite simple. Imagine that you would like to minimize some function. However, that function is itself defined through some optimization.
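To make the bilevel idea concrete, here is a minimal Python sketch. It is not the code from the paper: the inner objective, the outer loss, the step sizes, and the finite-difference outer gradient are all hypothetical choices made for illustration. The inner problem is solved only approximately, with a fixed number of gradient steps, which is one plausible reading of "truncated".

```python
def solve_inner(w, steps=50, lr=0.1):
    """Approximately minimize the (hypothetical) inner objective
    (y - w)**2 + 0.1 * y**2 over y, using a fixed number of gradient steps."""
    y = 0.0
    for _ in range(steps):
        grad = 2.0 * (y - w) + 0.2 * y  # derivative of the inner objective in y
        y -= lr * grad
    return y


def outer_loss(w, target=1.0):
    """Outer loss: defined through the result of the inner optimization."""
    y = solve_inner(w)
    return (y - target) ** 2


def minimize_outer(w0=0.0, iters=200, lr=0.2, eps=1e-5):
    """Gradient descent on the outer loss, with a central finite-difference
    gradient (an assumption chosen to keep the sketch short)."""
    w = w0
    for _ in range(iters):
        grad = (outer_loss(w + eps) - outer_loss(w - eps)) / (2.0 * eps)
        w -= lr * grad
    return w


if __name__ == "__main__":
    w_star = minimize_outer()
    print(w_star, solve_inner(w_star))
```

For this toy problem the exact inner solution is w / 1.1, so the outer minimizer should land close to w = 1.1, where the inner solution is close to the target of 1.0.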
Three years ago I wrote The Cupboard is Full: “The bottom line is the cupboard is full. The expansion should continue for some time.” This is one in a series of posts from late 2016, after the election, explaining why I thought the expansion should continue, even though I was extremely disappointed about the outcome of the election.
Eleven months ago on a long train ride home, I wrote the first lines of code for a small platforming game. Little did I know that this prototype was the start of something much more than just a game: it was a dream that would become shared within an amazing team, and it was the greatest step in a personal journey that had begun over eight years ago.
(This is a post from the Twitter Engineering Blog that I wrote with Alpa Jain.)
One of the magical things about Twitter is that it opens a window to the world in real-time. An event happens, and just seconds later, it’s shared for people across the planet to see.
Consider, for example, what happened when Flight 1549 crashed in the Hudson.
One of my students recently asked me for advice on learning ML. Here’s
what I wrote. It’s biased toward my own experience, but should generalize.
My current favorite introduction is Kevin Murphy’s book (Machine
Tuesday night I wrote a short blog post about how I used Python to find cheap tickets to a music festival. I finished up pretty late, so I decided to post it online the next morning. I woke up pretty early and posted the article on a few websites around seven. I started watching my Google Analytics page and the hits started coming in very fast, much faster than normal.
A post-hoc analysis, part 2
As I wrote in my last blog post, around 3 years ago I decided to try to build a budgeting service like mint.com for the Norwegian market. After around a year, having reached the prototype stage, I decided to take a short break from further building, to think about the business details. This quickly turned into an … extended break.
Some time ago I wrote about a structured learning project I have been working on, called pystruct. After not working on it for a while, I think it has come quite a long way in the last couple of weeks as I picked up work on structured SVMs again. So here is a quick update on what you can do with it.
This article is an extension of a previous one I wrote when I was experimenting with sentiment analysis on Twitter data. Back then, I explored a simple model: a two-layer feed-forward neural network trained with Keras. The input tweets were represented as document vectors resulting from a weighted average of the embeddings of the words composing the tweet.
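The document-vector representation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy two-dimensional embeddings and the per-word weights are made up, the function name `document_vector` is hypothetical, and the excerpt does not say which weighting scheme the original article used (TF-IDF scores would be one common choice).

```python
def document_vector(tokens, embeddings, weights):
    """Weighted average of the embeddings of the known tokens.

    Tokens without an embedding are skipped; tokens without a weight
    default to 1.0. Returns a zero vector if no token is known.
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok in tokens:
        if tok not in embeddings:
            continue
        w = weights.get(tok, 1.0)
        for i, x in enumerate(embeddings[tok]):
            total[i] += w * x
        weight_sum += w
    if weight_sum == 0.0:
        return total
    return [x / weight_sum for x in total]


# Toy data, invented for this sketch.
toy_embeddings = {"good": [1.0, 0.0], "movie": [0.0, 1.0]}
toy_weights = {"good": 3.0, "movie": 1.0}

vec = document_vector(["good", "movie", "unknown"], toy_embeddings, toy_weights)
print(vec)  # [0.75, 0.25]
```

The resulting fixed-length vector is what a feed-forward classifier would take as input, regardless of how many words the tweet contains.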
About a year ago I wrote a review of The Data Incubator (updated review is here). I always know when the Data Incubator application season is here because I always get a few people who have found my blog reaching out with questions about the process. I decided to put together a short list of some of the most common questions I get asked.
Should I do the program?
The program has pros and cons.
A while ago, I wrote a review of The Data Incubator based on my experience in the program. Since then, it’s been the most common reason people reach out to me. I’ve had people reach out to tell me how the program went for them, to ask me questions about the program, or to ask for advice.