5 Data Science Conferences to Attend in Asia

Previously, I’ve written posts about 2019 and 2020 data science conferences to attend. Researching both of those posts has given me a fair amount of knowledge about conferences happening in different regions around the world. In this article, I’m going to cover 5 data science conferences in Asia that you should consider attending in 2020.

Read Full Story

How One Instagram Influencer Built Her Brand and Attracted 40K Followers

This article was written in the first person by Rafaella Aguiar, Director of Marketing at Kicksta, following an interview with Erin Marie.
Building a big following on Instagram, and doing it fast, takes hard work and dedication — but it’s also one hundred percent possible, and I have the proof.
My name is Erin Marie.

Read Full Story

Why Artificial Intelligence Research is Still Relevant

As a college student, you will often be required to submit a written research paper in your field. For such a paper, you will have to analyze different data and come up with a hypothesis. This is where artificial intelligence comes into play. It helps you analyze data and make predictions based on your findings.

Read Full Story

Most Winning A/B Test Results are Illusory

Whitepaper about errors in A/B testing, written for Qubit.
Covered at qz.com and Hacker News
Introduction
Marketers have begun to question the value of A/B testing, asking: ‘Where is my 20% uplift? Why doesn’t it ever seem to appear in the bottom line?’ Their A/B test reports an uplift of 20% and yet this increase never seems to translate into increased profits.
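
One way to see how a reported uplift can fail to reach the bottom line is to simulate A/A tests, where both variants share the same true conversion rate and any declared "winner" is pure noise. The sketch below is an illustration only, not the whitepaper's own method; the function names, the 5% conversion rate, the sample sizes, and the seed are all assumptions chosen for the example.

```python
import math
import random


def z_test_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_b / n_b - conv_a / n_a) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_aa_tests(n_tests=1000, visitors=1000, rate=0.05, alpha=0.05, seed=0):
    """Run A/A tests (true uplift is zero) and summarise the false 'wins'.

    Returns (share of tests where B 'wins', mean reported uplift among wins).
    All parameter defaults are hypothetical values for illustration.
    """
    rng = random.Random(seed)
    reported_uplifts = []
    for _ in range(n_tests):
        # Both variants convert at the same true rate.
        conv_a = sum(rng.random() < rate for _ in range(visitors))
        conv_b = sum(rng.random() < rate for _ in range(visitors))
        p = z_test_p_value(conv_a, visitors, conv_b, visitors)
        if p < alpha and conv_b > conv_a and conv_a > 0:
            reported_uplifts.append((conv_b - conv_a) / conv_a)
    win_share = len(reported_uplifts) / n_tests
    mean_uplift = (
        sum(reported_uplifts) / len(reported_uplifts) if reported_uplifts else 0.0
    )
    return win_share, mean_uplift


if __name__ == "__main__":
    win_share, mean_uplift = simulate_aa_tests()
    print(f"Share of A/A tests declaring B the winner: {win_share:.1%}")
    print(f"Mean reported uplift among those 'winners': {mean_uplift:.1%}")
```

Under these assumed settings, a small share of tests still report a statistically significant winner, and the uplift they report is large even though the true uplift is zero by construction.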

Read Full Story

Before You Know It: The Unconscious Reasons We Do What We Do

This book is written by a psychologist, but understanding how people think is a large part of decision science [1], a particularly common application of data science. The book describes many experiments that examine how people feel or make decisions in a certain context, and that try to rationalize seemingly irrational behaviors.

Read Full Story

How to predict abstention rates with open data

This post was written in collaboration with Alexandre Vallette, who’s my co-author on an upcoming guide to hacking with open data (scroll down to find out more). Open data is a way to increase transparency into what happens in our society. When coupled with predictive modelling, it becomes a way to interpret why things happened.

Read Full Story

Discussion of “Fast Approximate Inference for Arbitrarily Large Semiparametric Regression Models via Message Passing”

This article was written with much help from David Blei. It is extracted from a discussion paper on “Fast Approximate Inference for Arbitrarily Large Semiparametric Regression Models via Message Passing”.
We commend Wand (2016) for an excellent description of message passing (mp) and for developing it to infer large semiparametric regression models.

Read Full Story

Advice to aspiring data scientists: start a blog

Last week I shared a thought on Twitter:
When you’ve written the same code 3 times, write a function
When you’ve given the same in-person advice 3 times, write a blog post
David Robinson (@drob), November 9, 2017
Ironically, this tweet hints at a piece of advice I’ve given at least 3 dozen times, but haven’t yet written a post about.

Read Full Story

Using Neo4j Spatial Procedures in legis-graph-spatial · William Lyon

Neo4j 3.0 introduced the concept of user defined procedures: code written in Java (or any JVM language) that is deployed to the database and callable from Cypher. User defined procedures are an alternative to unmanaged extensions, with the key difference that user defined procedures are callable from Cypher (instead of extending the HTTP REST endpoints).

Read Full Story

How much compute do we need to train generative models?

Update (09/01/17): The post is written to be somewhat silly and numbers are not meant to be accurate. For example, there is a simplifying assumption that training time scales linearly with the # of bits to encode the output; and 5000 is chosen arbitrarily given only that the output’s range has 65K*3 dimensions and each takes one of 256 integers.
Discriminative models can take weeks to train.
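
The bit count behind the update's simplifying assumption can be written out as a short calculation. This is a rough sketch only: it reads "65K" as 65,536 (an assumption, since the post gives only "65K*3"), and the 1000-class classifier used for comparison is a hypothetical reference point that does not come from the post.

```python
import math

# Assumption: "65K" is read as 65,536 positions, each with 3 channels.
OUTPUT_DIMENSIONS = 65_536 * 3
# Each dimension takes one of 256 integer values, as stated in the update.
LEVELS_PER_DIMENSION = 256


def bits_to_encode(dimensions, levels):
    """Bits needed to encode `dimensions` values with `levels` options each."""
    return dimensions * math.log2(levels)


generative_bits = bits_to_encode(OUTPUT_DIMENSIONS, LEVELS_PER_DIMENSION)

# Hypothetical comparison point: a single label drawn from 1000 classes.
discriminative_bits = bits_to_encode(1, 1000)

if __name__ == "__main__":
    print(f"Generative output: {generative_bits:,.0f} bits")
    print(f"1000-class label (hypothetical): {discriminative_bits:.2f} bits")
    print(f"Ratio: {generative_bits / discriminative_bits:,.0f}x")
```

If training time did scale linearly with these bit counts, the ratio printed at the end would be the implied multiplier for that hypothetical comparison. It should not be read as the post's own figure.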

Read Full Story

How to analyze smartphone sensor data with R and the BreakoutDetection package

Yesterday, Jörg wrote a blog post on Data Storytelling with Smartphone sensor data. Here’s a practical approach to analyzing smartphone sensor data with R. In this example I will be using the accelerometer smartphone data that Datarella provided in its Data Fiction competition.

Read Full Story

Do Market Research Agencies Produce Poor Quality Reports?

Editor’s Note: Some years ago, I was asked to review a draft report written by a new, more junior colleague. The report went on for fifty pages and was mostly just a question-by-question description of data. Conclusions didn’t appear until the end. After rousing myself from report-induced slumber, I asked the person why he wrote the report that way.

Read Full Story

Why Responsible AI Development Needs Cooperation on Safety

We’ve written a policy research paper identifying four strategies that can be used today to improve the likelihood of long-term industry cooperation on safety norms in AI: communicating risks and benefits, technical collaboration, increased transparency, and incentivizing standards.

Read Full Story

Introduction to AutoML with MLBox

Today’s post is very special. It’s written in collaboration with Axel de Romblay, the author of the MLBox Auto-ML package, which has gained a lot of popularity in recent years.
If you haven’t heard about this library, go and check it out on GitHub. It offers interesting features, is gaining in maturity, and is now under active development.

Read Full Story

Data Science Book: Profit Driven Business Analytics

Verbeke, Baesens and Bravo have written a data science book focusing on profit. Instead of the typical statistical or programming point of view, Profit Driven Business Analytics has a self-proclaimed value-centric perspective.
This means the book approaches each topic with a focus on profit, costs and ROI. Each data science subject is briefly explained and illustrated with business cases.

Read Full Story