
The hype, benefits, and dangers of Big Data

A Readlist of all the articles referenced in this post is available here. Readlists allow you to send all the articles to your Kindle, read them on your iOS device, or download them as an e-book.

Despite the overly alarmist title, Andrew Leonard’s How Netflix is turning viewers into puppets1 is a fascinating article on how Netflix uses Big Data in their programming decisions:

“House of Cards” is one of the first major test cases of this Big Data-driven creative strategy. For almost a year, Netflix executives have told us that their detailed knowledge of Netflix subscriber viewing preferences clinched their decision to license a remake of the popular and critically well regarded 1990 BBC miniseries. Netflix’s data indicated that the same subscribers who loved the original BBC production also gobbled down movies starring Kevin Spacey or directed by David Fincher. Therefore, concluded Netflix executives, a remake of the BBC drama with Spacey and Fincher attached was a no-brainer, to the point that the company committed $100 million for two 13-episode seasons.

The article also asks what this approach means for the creative process, something I’ve written about before in The unnecessary fear of digital perfection, so I won’t rehash that argument here.

What’s interesting to me about the rise in Big Data approaches to decision-making is the high levels of inaccuracy inherent to the analysis process. Of course, this is something we don’t hear about often, but Nassim N. Taleb recently wrote a great opinion piece about it for Wired called Beware the Big Errors of ‘Big Data’, in which he states:

Big-data researchers have the option to stop doing their research once they have the right result. In options language: The researcher gets the “upside” and truth gets the “downside.” It makes him antifragile, that is, capable of benefiting from complexity and uncertainty — and at the expense of others.

But beyond that, big data means anyone can find fake statistical relationships, since the spurious rises to the surface. This is because in large data sets, large deviations are vastly more attributable to variance (or noise) than to information (or signal). It’s a property of sampling: In real life there is no cherry-picking, but on the researcher’s computer, there is. Large deviations are likely to be bogus.

He gets into more detail on the statistical problems with Big Data in the article, and his book Antifragile looks really interesting too.
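Taleb's sampling point is easy to demonstrate. The simulation below (my own toy sketch, not from his article) correlates one random series against a growing pool of candidate series that are pure noise. The "best" correlation found keeps climbing as the pool grows, even though every relationship in it is spurious:

```python
import random
import statistics

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

random.seed(1)
n = 30  # observations per variable
target = [random.gauss(0, 1) for _ in range(n)]

# 1,000 candidate variables, all pure noise, all unrelated to the target.
noise = [[random.gauss(0, 1) for _ in range(n)] for _ in range(1000)]
correlations = [abs(pearson(target, series)) for series in noise]

# The strongest correlation found rises as we search more candidates --
# the spurious rising to the surface.
for k in (10, 100, 1000):
    print(k, "candidates -> best |r| =", round(max(correlations[:k]), 2))
```

Searching more variables can only raise the maximum, never lower it, which is exactly why a researcher with a big enough data set can always "find" a relationship.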

Since I haven’t written about Big Data before, I also want to reference a few articles on the topic that I enjoyed. Sean Madden gives some interesting real world examples in How Companies Like Amazon Use Big Data To Make You Love Them2. But over on the skeptical side, Stephen Few argues in Big Data, Big Deal that “interest in big data today is a direct result of vendor marketing; it didn’t emerge naturally from the needs of users.” He also makes the point that data has always been big, and that by focusing on the “bigness” of it, we’re missing the point:

A little more and a little faster have always been on our wish list. While information technology has struggled to catch up, mostly by pumping itself up with steroids, it has lost sight of the objective: to better understand the world—at least one’s little part of it (e.g., one’s business)—so we can make it better. Our current fascination with big data has us looking for better steroids to increase our brawn rather than better skills to develop our brains. In the world of analytics, brawn will only get us so far; it is better thinking that will open the door to greater insight.

Alan Mitchell makes a similar point in Big Data, Big Dead End, a case for what he calls Small Data:

But if we look at the really big value gap faced by society nowadays, it’s not the ability to crunch together vast amounts of data, but quite the opposite. It’s the challenge of information logistics: of how to get exactly the right information to, and from, the right people in the right formats at the right time. This is about Very Small Data: discarding or leaving aside the 99.99% of information I don’t need right now so that I can use the 0.01% of information that I do need as quickly and efficiently as possible.

What I think we should take from all of this is that our ability to collect vast amounts of data comes with enormous predictive and analytical upside. But we’d be foolish to think that it makes decision-making easier. Because Big Data does not take away the biggest challenge of data analysis: figuring out how to turn data into information, and information into knowledge. In fact, Big Data makes this harder. To quote Nassim again:

I am not saying here that there is no information in big data. There is plenty of information. The problem — the central issue — is that the needle comes in an increasingly larger haystack.

In other words: proceed with caution.


  1. Link via @mobivangelist 

  2. It’s interesting that the phrasing of both this headline and the Netflix one implies that companies are using Big Data to persuade us to do things against our will. But I can’t figure out if that’s a real fear, or just clever linkbait. 

Optimization points in responsive web design

Mark Boulton argues that we need to think further than breakpoints in responsive design, and also spend time figuring out the “optimization points”. From The In-Between:

I think we’re missing a trick for using breakpoints to make lots of subtle design optimisations. […] Content-out design means defining your underpinning design structure from your content, and then focusing on what happens in between ‘layouts’. This approach of optimising your design by adding media queries (I like to call these optimisation points rather than break points, because nothing is broken without them, just better), means you are always looking at your content as you’re working. You become more aware of the micro-details of how the content behaves in a fluid context because your focus shifts from controlling the design in the form of pages, to one of guiding the design between pages.

He shares some examples and also links to more resources on how to accomplish this. One good example of this subtle optimization approach is Responsive Typography, a concept by Marko Dugonjić where the size of the typography displayed on the screen is based on the viewing distance of the reader, calculated via webcam.

My phone isn’t better than your phone

I really enjoyed Michele Catalano’s Grimes, Pop Music, and Cultural Elitism, which starts with this quote from Claire Boucher (better known as Grimes):

I don’t see why we have to hate something just because it’s successful, or assume that because it is successful it has no substance.

It’s an article about our tendency to look down on pop music (and the people who like pop music), but it points to a much broader cultural phenomenon:

The elitism one shows when they dismiss pop music as vapid and those who like it equally vapid is a detriment to any open conversation. The defenders of pop – myself included – are often put on the defensive, made to offer up excuses as to why we like what we do. No one should have to defend their musical choices. No artist who worked hard to get where they are should be roundly dismissed because their music doesn’t fit some elitist standard.

This kind of elitism is something we all have to watch out for. I will probably never switch away from my iPhone, but that doesn’t mean that Android users are undiscerning losers. The best phone is the phone you like the best. That’s all there is to it. As hard as it can be sometimes, we have to decouple the things people like and don’t like from their value as human beings.

Whose basket is it?

In Yours vs. Mine Dustin Curtis explains his preference to use “Your stuff” as opposed to “My stuff” in interfaces. It might sound trivial, but whether we use “My” or “Your” reveals something about how we view technology:

After thinking about this stuff for a very long time, I’ve settled pretty firmly in the camp of thinking that interfaces should mimic social creatures, that they should have personalities, and that I should be communicating with the interface rather than the interface being an extension of myself. Tools have almost always been physical objects that are manipulated tactually. Interfaces are much more abstract, and much more intelligent; they far more closely resemble social interactions than physical tools.

The answer for me, then, is that you’re having a conversation with the interface. It’s “Your stuff.”

This echoes Yahoo!’s recommendations:

Labeling stuff with “Your” instead reinforces the conversational dialogue. It is how another human being might address you when talking about your stuff. Even with MySpace1, people say things like “I saw what you put on your MySpace.”

So MTN at least got one out of four right on this page:

[Screenshot: MTN account page]


  1. I guess they haven’t updated this pattern in a while. 

The dirty world of Facebook EdgeRank Optimization

I’ve been seeing more and more scams like this one in my Facebook News Feed:

[Screenshot: an EdgeRank scam post in the Facebook News Feed]

You only have to think about it for 4 seconds to realize that making a comment on a photo on the web will result in you waiting and seeing absolutely diddly-squat (“P.S.: This is not Insane after all!”). And yet, in this particular case, 259,304 people thought about it for 3 seconds or less, commented, waited and saw nothing, and then moved on to the next thing.

The question is, why do Page admins do this? What’s the use of tricking people into commenting on photos, especially when they’ll realize right away that they’ve been made to look like a fool? Well, because there’s money in it, of course.

This is a pretty transparent scam to beat Facebook’s EdgeRank system — the algorithm that Facebook uses to determine what articles should be displayed in a user’s News Feed. When someone comments on a picture it makes it more likely that the picture will show up in their friends’ News Feeds, so it’s an easy way for a Page to gain more exposure very quickly.
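Facebook publicly described EdgeRank as a sum, over the edges (likes, comments, shares) connecting a user to a story, of affinity × edge weight × time decay. A toy sketch of that shape (the decay function and every number below are made up for illustration, not Facebook's actual values):

```python
def edgerank_score(edges):
    """Hypothetical EdgeRank-style score.

    edges: list of (affinity, weight, age_hours) tuples, one per interaction.
    Affinity reflects how close the viewer is to the person interacting;
    weight reflects the interaction type (a comment counts for more than
    a like); the decay term makes fresh activity count most.
    """
    return sum(
        affinity * weight * (1.0 / (1.0 + age_hours))
        for affinity, weight, age_hours in edges
    )

# Why the scam works: a fresh comment is a high-weight, low-age edge,
# so it boosts the photo far more than an old, low-weight like would.
fresh_comment = [(0.8, 4.0, 1)]   # close friend commented an hour ago
old_like = [(0.8, 1.0, 48)]       # close friend liked it two days ago
print(edgerank_score(fresh_comment), ">", edgerank_score(old_like))
```

Each comment the scam harvests adds another high-weight edge, pushing the photo into more friends' feeds and compounding its reach.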

Once these Pages have built up hundreds of thousands of “Likes” using the scam, they usually do one of two things. They either start punting things they want to sell, or they sell the Page itself to a business that changes some of the details and uses it as their instantly enormously popular brand Page.

This is obviously pretty dirty, and also nothing new — we’ve had black hat SEO and dark patterns since the dawn of the web. But what I can never understand about the people who use these tactics is why they don’t long for the satisfaction and personal growth that comes from doing real work and reaping the rewards of that. Why create a community of people who couldn’t figure out that you’re scamming them, as opposed to a real community that values your company and what you do? I’ve written about this before in my defense of doing things the hard way:

When we do things the hard way, we invest in ourselves in the best possible way. We kick off an endless cycle of learning and mastery that helps us grow and lead fulfilling lives of purpose. When we take shortcuts, we become mere pretenders. We learn how to play the part, but there is no substance or continued growth. The instant gratification makes us build the house of cards ever higher, which brings anxiety about the whole thing coming tumbling down. Why would we shortchange ourselves like that?

So what can we do about these scams? Well, for one, obviously don’t comment on them. But I also recommend clicking on the little arrow on the right and hiding the post. That will tell EdgeRank that the person who commented on the photo is not worth paying attention to, so in time you’ll see fewer and fewer of those kinds of posts. Who says we can’t all be EdgeRank Optimization specialists?

Gestural interfaces and generational transition

Francisco Inchauste did a great interview with MIT Technology Review about the user experience challenges of gestural interfaces. From Does Gestural Computing Break Fitts’ Law?:

I think there are a lot of usability/UX rules and laws that will come into question as we move forward into more of these experimental kinds of interfaces. I know many of them already have been retested/validated by other researchers.

A lot of newer interaction paradigms aren’t as naturally intuitive as we like to think. Tapping and swiping at “pictures under glass” (or in this case, content) is always going to be a learned thing, like when we were introduced to the desktop metaphor or icons.
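Fitts’ law, which the interview’s headline references, predicts how long it takes to acquire a target from its distance and size. A minimal illustration using the standard Shannon formulation (the constants here are arbitrary, chosen only to show the relationship):

```python
import math

def fitts_movement_time(distance, width, a=0.1, b=0.15):
    """Predicted target-acquisition time in seconds, Shannon formulation:

        MT = a + b * log2(distance / width + 1)

    a and b are empirically fitted, device-specific constants; the values
    used here are illustrative placeholders, not measured ones.
    """
    return a + b * math.log2(distance / width + 1)

# Bigger and closer targets are faster to hit -- one reason touch
# targets need generous sizing on direct-manipulation screens.
print(fitts_movement_time(100, 50))   # large target
print(fitts_movement_time(100, 10))   # small target at the same distance
```

Part of what makes the question interesting is that on a touchscreen the “pointer” starts off-screen and the finger itself occludes the target, which is exactly the kind of assumption-breaking that forces these laws to be retested.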

I think we’re in a period of generational transition when it comes to fully gestural interfaces1. Despite living on the Internet, I still struggle to remember some of the newer gestures that are popping up in iOS apps. On the other hand, my 3½-year-old daughter has zero problems figuring out (and remembering) gestures, because this is the world she’s growing up in. There is no major shift in mental model needed — to her, this is just how technology works. It reminds me of something Chuck Skoda said a while ago in The touchscreens are coming:

While I fully expect the future to have keyboards and mice (or some precision pointing device), touch is already precluding the ubiquity of both in the minds of children. When the upcoming generation is running the show, we will find another absurd idea, that a computer built for human interaction will have a screen that doesn’t respond to touch.

And when that generational transition is complete, what we once thought of as “newer interaction paradigms” will simply be “the way things are”.2


  1. By the way, check out Rise, a fantastic, fully gestural alarm clock app by Francisco and the team at Simplebots. 

  2. I think I deserve a special Internet high five for not making a “the future is already here…” reference here. 

An interface should get out of the way, except when it shouldn’t

Rus Yusupov talks about the design process at Vine in Design at Vine: Everyone needs an editor. I love these kinds of posts because I always learn something — either confirmation that we’re not the only ones doing things a certain way, or that we’re doing something wrong and need to change.

One of Vine’s key design principles got me thinking about the “invisible design” debate again:

Strive for simplicity. An interface should get out of the way. People should be able to focus on being creative, not on how to use the app. In many ways, interface design is like film editing: if you notice it, it wasn’t done well.

This idea has been a common refrain over the years, especially since Dieter Rams formalized his 10 principles of good design and said that “Good design is as little design as possible.” Except that somewhere along the line, we started to believe that “as little design as possible” means “getting out of the way”. It doesn’t.

Rams didn’t say that good design disappears completely. “As little design as possible” is not about making things invisible, it’s about “not burdening products with non-essentials”. It’s about making the right choices about what should be there, and what shouldn’t. There is nothing wrong with making the things that are in the product visible, sometimes very much so. Let’s not forget that one of Rams’s other principles is that “Good design is aesthetic”:

The aesthetic quality of a product is integral to its usefulness because products we use every day affect our person and our well-being.

I would add that making the right interface elements appropriately visible is essential for a visual hierarchy that effectively guides users through an interface.

Nevertheless, at some point the design community collectively arrived at this conclusion that good design is invisible — or even better, not even there. And I think that’s a dangerous line of thought. In the case of Vine, they used this principle well to ensure simplicity in the app. But there is still a very strong visual identity in the app.

We need to remember not to conflate what should be two different arguments. “How it works” should be invisible, but “How it looks” certainly doesn’t have to be. I think Dieter Rams would agree with that.

I’ve written about this topic before in So, is good design invisible, or not?

Banner blindness and you

Joaquin (no last name?) talks about ad banner blindness in The non-click generation:

See, the point is, I know this ad is always in that space, I know what it does, I know its intentions, and I know the methods. It’s invisible to me because I know so much about it.

That’s nothing new, of course, but the article did remind me of Mike Lacher’s extremely funny I Am the One Who Clicks Banner Ads:

While you check the weather, I find out why California dermatologists hate the one weird skin care secret discovered by a stay-at-home mom. While you read the New York Times, I rollover for more information about how to get my diabetes under control. While you search IMDB, I click for showtimes, tickets, and behind-the-scenes videos for Think Like a Man. Page after page, banner after banner, I click and I click.

Oh, and while you’re on McSweeney’s, you might as well check out I’m a Social Media Community Manager!:

What is a Social Media Community Manager? Oh sorry, I didn’t hear you over the sound of how hip my job is.

The Internet would be so much less weird and fun without McSweeney’s.
