The California Consumer Privacy Act (CCPA), enacted in 2018, took effect on 1 January 2020. It is hailed as one of the strongest data privacy laws in the country, and other states are likely to follow suit.
I attended a business conference earlier this month, where the topic of big data came up. I was surprised to see how many people are still sceptical about the value of big data in the world of business.
You probably know that big data is being used by business owners all over the world. The market for big data is growing faster than anybody ever predicted.
By Alastair Marsh | Bloomberg
Chris Ballinger came away from a year of crunching numbers at Toyota Motor Corp.’s Silicon Valley skunkworks convinced that his dream of automotive automation was no more fanciful than his bosses’ ambition to make a vehicle that can drive itself.
If you feel like the app TikTok came out of nowhere, you’re not wrong.
Since launching in early 2018, TikTok’s been covered by seemingly every major news publication and racked up millions of downloads globally.
Despite TikTok’s major early success, the app still feels like a bit of a mystery, especially to marketers.
What is the difference between segmentation and personalization? This question came up during one of Optimizely’s webinars on personalization. This blog post is for those who have the same question. The basic definition of segmentation is division into separate parts or sections.
22 July 2009
I just came back from the New England edition of Railscamp (http://railscamps.com). It was one of the best programming events I have been to. It was a mix of a hackfest, tech talks, binge drinking, and LAN parties. One of the things that made it great was that there was no internet.
I recently came across a great natural language dataset from Mark Riedl: 112,000 plots of stories downloaded from English-language Wikipedia. This includes books, movies, TV episodes, and video games: anything that has a Plot section on a Wikipedia page.
This offers a great opportunity to analyze story structure quantitatively.
Amazon ML (Machine Learning) made a lot of noise when it came out last month. Shortly afterwards, someone posted a link to Google Prediction API on Hacker News and it quickly became one of the most popular posts. Google’s product is quite similar to Amazon’s, but it’s actually much older since it was introduced in 2011.
So, I was browsing exp.lore.com and came across these nifty little USB sticks a couple of days ago. Huh, that’s a pretty decent just-in-time gift, I thought – might be an idea to buy a couple of them for those occasions where you don’t really have time to buy a gift for someone. So I click the link, and end up on the fine site fab.com.
One of the nicest compliments I’ve received over the years came from a company founder who read one of my reports and said I’d summarized his company’s work better than they did. It’s just one of the things I do: take a pile of information and figure out what it’s about. I summarize. So if you need to tease out the short version of something complicated, call me.
An odd request came in last week when a prospective customer asked us about a benchmark on the percentage of duplicates we can find for them using MDM.
In this blog, I want to touch on a few key reasons why this request is odd. I would also like to take this chance to explain the right questions to ask your vendor when it comes to MDM matching.
This just came out: the book Radical Candor by Kim Scott. It’s a good read on managing, with a focus on people. I’d recommend it if you are a manager or help others manage people. I’d summarize it by saying it takes a teaching and mentoring approach to management, very much of the school that managers primarily exist to help the people on their team.
The October 2019 issue of the European Journal of Clinical Investigation came out today.
Two major trends in the big data landscape came to our attention that we wanted to address. We wanted to discuss data recruitment and what it means for the GDPR professionals out there. As you may know, all organizations – within the EU – that collect personal data must comply with the GDPR. The consequences for failing to meet GDPR standards are huge.
As a graduate student, I came to love working with the roll call voting data sets that have been compiled for the United States Congress by political scientists like Keith Poole and Howard Rosenthal. These datasets can be represented in simplified form as matrices in which the rows correspond to legislators and the columns correspond to bills that the legislators vote for or against.
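To make the matrix representation concrete, here is a minimal sketch in Python. The legislator names, bill names, and votes are invented for illustration and are not taken from the Poole and Rosenthal data; the helper functions `agreement` and `yes_share` are likewise hypothetical.

```python
# Hypothetical toy data: names, bills, and votes are invented for illustration.
legislators = ["Legislator A", "Legislator B", "Legislator C"]
bills = ["Bill 1", "Bill 2", "Bill 3", "Bill 4"]

# 1 = voted for, 0 = voted against.
# One row per legislator, one column per bill.
votes = [
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]


def agreement(row_a, row_b):
    """Fraction of bills on which two legislators cast the same vote."""
    matches = sum(a == b for a, b in zip(row_a, row_b))
    return matches / len(row_a)


def yes_share(matrix, column):
    """Share of legislators who voted for the bill in the given column."""
    return sum(row[column] for row in matrix) / len(matrix)


if __name__ == "__main__":
    print(agreement(votes[0], votes[1]))  # A and B agree on 3 of 4 bills
    print(yes_share(votes, 0))            # 2 of 3 legislators backed Bill 1
```

Real roll call data also has abstentions and absences, which this simplified for/against coding leaves out.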
I recently came up with the idea for a series of screencasts:
I’ve thought about recording a screencast of an example data analysis in #rstats. I’d do it on a dataset I’m unfamiliar with so that I can show and narrate my live thought process.
I was reading a paper the other day and came across the word aleatory.
This turns out to be an excellent word. It comes from the Latin alea for “dice”, as in alea jacta est, which is what you say when you’re Julius Caesar and you cross the Rubicon.
It means random, or subject to chance.
Data Mining Research (DMR): Can you tell us who you are and how you came to the field of Data Science?
Jerome Berthier (JB): My name is Jerome Berthier, I am an engineer in Computer Science and I have an MBA in management. After 10 years working in different roles for an IT provider (developer, sales representative, managing director), I joined ELCA in 2012 to head the BI division.
Recently I came across the provocatively titled post “Machine Learning in Monitoring is BS” and decided to reply, but the response came out longer than a typical comment, so I posted it separately.
Ambiguity of the data – true point. It’s impossible to build a single universal model that eats any data and alerts when something is wrong.
It has been about a month since the General Data Protection Regulation (GDPR) came into effect across the European Union. It’s the most critical data privacy law thus far, an 88-page monster translated into 26 different languages. When we summarize those pages, GDPR on privacy requires companies to:
Clearly state how they’re collecting and storing data about EU citizens.