Visual design clutter index for web pages

I’ve been working on a project where we’re trying to come up with a way to establish a visual design “clutter index.” The goal is to see if there is some threshold beyond which web page clutter impacts business metrics like conversion and click-through rates. The challenges are numerous, of course, and fall mainly into the following three areas:

The definition and measurement of clutter. There are a variety of ways to measure clutter on pages, ranging from the completely objective (e.g., % of white space on a page) to completely subjective (e.g., how do users rate the page on a clean vs. cluttered scale).
The definition of conversion. Since some pages on an e-commerce web site are revenue-generating, and others aren’t, an important question is how you define conversion. For revenue-generating pages (e.g., pages with a “checkout now” button) this is easy — “Did the page result in a sale?” For other pages, like product information pages, this measure won’t work, so some other measure of engagement with the page becomes necessary.
Controlling for other influencing factors. In conjunction with the first two points comes the problem of causality vs. correlation. Assuming you have your definitions of clutter and conversion nailed down, how can you be sure any changes you see in conversion are caused by clutter (a causal relationship), and not by some other factor you are not accounting for (a correlation with no causal relationship)?
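To make the "completely objective" end of that range concrete, here is a minimal sketch of a white-space measure. It assumes a screenshot has already been decoded into rows of RGB tuples; the function name, the white background, and the tolerance value are all illustrative choices of mine, not part of our project or of any published method.

```python
def whitespace_fraction(pixels, background=(255, 255, 255), tolerance=10):
    """Return the share of pixels that are within `tolerance` of the background color.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples.
    The background color and tolerance are assumptions for illustration.
    """
    total = 0
    blank = 0
    for row in pixels:
        for pixel in row:
            total += 1
            # A pixel counts as white space if every channel is close to the background.
            if all(abs(c - b) <= tolerance for c, b in zip(pixel, background)):
                blank += 1
    return blank / total if total else 0.0


# Hypothetical 2x4 "screenshot": six near-white pixels and two dark ones.
WHITE, DARK = (255, 255, 255), (20, 20, 20)
page = [
    [WHITE, WHITE, DARK, WHITE],
    [WHITE, DARK, (250, 250, 250), WHITE],
]
print(whitespace_fraction(page))  # 0.75
```

A measure like this says nothing about how the remaining content is arranged, which is exactly why it would need to sit alongside the subjective ratings rather than replace them.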

The way to go about it is to take as many measurements of clutter as you can, feed them into a statistical model with a variety of conversion metrics, and see what comes out. You also have to find a way to account for other influencing factors so that you can control for them in your model. Easy, right? Ok, so there are a lot of open issues, but they’re definitely not insurmountable. I also believe it’s a worthy pursuit, the hypothesis being that there are clear business reasons for keeping designs and interfaces simple.
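As a very rough first step toward that kind of model, one could start by checking whether a single clutter measure moves together with a single conversion metric. The sketch below computes a plain Pearson correlation; the numbers are invented purely for illustration and are not real data, and a correlation on its own does nothing to address the causality problem described above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented example values: one clutter score and one conversion rate per page.
clutter_scores = [0.20, 0.35, 0.50, 0.65, 0.80]
conversion_rates = [0.041, 0.039, 0.034, 0.028, 0.022]

r = pearson_r(clutter_scores, conversion_rates)
print(f"correlation between clutter and conversion: {r:.2f}")
```

A real analysis would need many pages, several clutter measures, and the controls for other factors mentioned above, most likely in a regression rather than a single coefficient.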

And apparently we’re not the only ones thinking about this… Ruth Rosenholtz and her colleagues at MIT recently wrote a paper (Measuring Visual Clutter) where they seem to have developed what they call a “clutter detector” for a variety of interfaces, mostly offline (desk clutter, map clutter, etc.). They describe some of their challenges in doing this as follows:

The fact that one person’s clutter is the next person’s organized workspace makes it hard to come up with a universal measure of clutter. Rosenholtz and colleagues modeled what makes items in a display harder or easier to pick out. They used this model, which incorporates data on color, contrast and orientation, to come up with a software tool to measure visual clutter.

On the issue of subjective measures of clutter:

Although there was a fair bit of disagreement among the people being tested about what constituted clutter, when the researchers compared results from their clutter measure to those of their human subjects, they found a good correlation.

I’m still digesting the paper, but it’s a fascinating read so definitely check it out. Thoughts on how to approach this for e-commerce web pages are also more than welcome!

5 August 2009

Using Twitter to value online information

I have recently noticed an interesting trend among the people I follow on Twitter. It appears that my network is dividing itself neatly into two camps: those who care deeply about the content they publish, and those who use Twitter more casually. Let me explain…

Saying “good night” to everyone you know

Twitter users who casually update their status without thinking about it too much continuously say things like “Yep,” “Good night tweeple,” and “Banging my head against the desk.” Cryptic information that can be quite difficult to figure out. I’m not saying that this is necessarily a bad thing. It’s just clear that some people view Twitter as a broadcast medium mainly meant for people they know in the real world, and that’s fine (I tend to think that’s what Facebook is for, but let’s not split hairs about this).

I’m also not suggesting that all tweets should be serious — the odd random or exasperated update can be interesting, enlightening, and often very funny, and it also shows that there’s a real person at the other end. I do follow a lot of these casual users, but I know all of them personally so their updates are meaningful to me. And of course there is always the option to stop following someone, so you only have yourself to blame for the content you receive on Twitter.

But then there are those who care a lot about what they say…

People who care see Twitter not just as an outlet for random thoughts, but also a valuable tool to learn and share and expand their knowledge about issues they care about. I follow a bunch of people who clearly care about the content they put on Twitter, and it adds enormous value to my work life and personal life (people like @jontyfisher, @adamnash, @SmithInAfrica, and @TheONECampain, just to name a small and diverse subset of folks).

Sharing interesting information on Twitter makes you a good citizen of the web for a very important reason. It allows the best content to rise to the top. What makes content sharing on Twitter powerful is that humans are involved, not just technology. The difference between going through your RSS feeds and learning about something through your Twitter network is that on Twitter, someone read the content and decided that it is good enough to share. And if you follow people with similar interests, chances are you will find it interesting too. As Justin Basini (@justinbasini) put it in a recent post: “Twitter users aggregate, edit, filter and share better than any technology.”

But what if the content isn’t interesting to anyone else? Well, then it will just die in the constant stream of tweets that go by every day. If the content is good, it will be retweeted, and spread rapidly not just through your own network but the networks of others.

In sociology, the phenomenon of information spreading through multiple networks is known as “the strength of weak ties.” Mark Granovetter developed the theory in a 1973 paper of the same name. It states that because a person with strong ties in a network more or less knows what the other people in the network know (e.g. in close friendships or within your closely-guarded Facebook network), the effective spread of information relies on the weak ties between people in separate networks.

And this is of course one of the main strengths of Twitter: not all connections have to be mutual (unlike on Facebook, when you follow someone they don’t have to follow you back). In other words, retweeting allows information to jump from one tightly-knit network to the next, allowing for the rapid spread of valuable information throughout the entire network, not just your own.

A new way to value information on the web

There are still a lot of people who feel that Twitter is a waste of time and adds no value. That might be true for them, but I think we are seeing a very interesting phenomenon here, and that is a new way to value information on the web and separate what’s worthy of reading from what’s not.

RSS feeds allow us to see content we might be interested in (but not every article will be good). Digg and similar services allow us to see what other people find interesting. But only Twitter puts those features together and lets us see content that people with interests similar to ours find valuable. And there is real power in that.

Oh, and you can follow me on Twitter if you’d like.

3 August 2009

The dangers of "test and learn"

A recent discussion on a user experience forum I participate in turned to the topic of A/B testing. I really enjoyed the conversation, so I wanted to reiterate some of the points I made and expand on them a little bit as well. It’s not my goal to define A/B testing here but to share my opinion on its use. I believe that even though A/B testing can be extremely valuable in helping identify the best iteration of a site or a particular page, it should never be used in isolation.

Since A/B testing is relatively cheap to do and the results are so compelling, companies are in danger of adopting a “test and learn” culture where pages are just A/B tested with no additional user input. That would be the wrong way to go. A/B testing shouldn’t be used on its own to make decisions; it should always be used in conjunction with other research methods, both qualitative (such as usability testing and ethnography) and quantitative (such as desirability studies).

A/B testing is an important method in the research toolkit because it can give you information that usability testing on its own cannot. The main goal of A/B testing is to see how business metrics move up and down depending on the version of the page — click through rates, checkout rates, purchasing rates, etc. You can’t see that with usability testing alone. But as Kohavi et al. point out in their paper Practical Guide to Controlled Experiments on the Web, A/B testing has some major limitations:

Quantitative Metrics, but No Explanations. It is possible to know which variant is better, and by how much, but not why. In user studies, for example, behavior is often augmented with users’ comments, and hence usability labs can be used to augment and complement controlled experiments.
Short term vs. Long Term Effects. Controlled experiments measure effects during the experimentation period, typically a few weeks. It is wise to look at delayed conversion metrics, where there is a lag from the time a user is exposed to something and take action. These are sometimes called latent conversions.
Primacy and Newness Effects. These are opposite effects that need to be recognized. If you change the navigation on a web site, experienced users may be less efficient until they get used to the new navigation, thus giving an inherent advantage to the Control. Conversely, when a new design or feature is introduced, some users will investigate it, click everywhere, and thus introduce a “newness” bias.
Features Must be Implemented. A live controlled experiment needs to expose some users to a Treatment different than the current site (Control). The feature may be a prototype that is being tested against a small portion, or may not cover all edge cases. Nonetheless, the feature must be implemented and be of sufficient quality to expose users to it.
Consistency. Users may notice they are getting a different variant than their friends and family. It is also possible that the same user will see multiple variants when using different computers (with different cookies).
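The first limitation is easy to see in what an A/B test actually computes. Below is a minimal sketch of a two-proportion z-test, one common way to compare conversion rates between a Control and a Treatment. It is not taken from the Kohavi et al. paper, and the visitor and conversion counts are made up for illustration.

```python
from math import erf, sqrt


def normal_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1 + erf(x / sqrt(2)))


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare two conversion rates; return (z statistic, two-sided p-value)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("cannot test when the pooled rate is 0 or 1")
    z = (p_b - p_a) / std_err
    p_value = 2 * (1 - normal_cdf(abs(z)))
    return z, p_value


# Made-up counts: Control converts 200 of 10,000 visitors, Treatment 250 of 10,000.
z, p = two_proportion_z_test(200, 10_000, 250, 10_000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

The output tells you whether the difference is likely to be real and roughly how large it is. Nothing in the calculation says why users preferred one version, which is the gap that usability testing and other qualitative methods have to fill.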

As with most things, it is important to use A/B testing responsibly. Since every research/testing method comes with its own limitations, a combination of methods is the only way to get the full picture and make the right decisions.
