User experience architecture

June 23, 2010

Usability findings: defects or risks?

Insight and advice

When we run a usability test or a heuristic evaluation we create two types of value: insight and advice.

Insight is packaged as a set of findings. Good insight comes from findings that are accurate, clear and at the appropriate level of abstraction. Great insight requires an eye for pervasive patterns of design error: mining the detail to extract a few simple, powerful themes.

Advice comes as recommendations. Good recommendations are clear, pragmatic and actionable. Great recommendations also reflect the actual priorities of the business.

Modelling risk

The interesting part is getting from great insight to great advice. One approach is to borrow a model from HRA (Human Reliability Analysis). Reliability analysis is interested in identifying and assessing risk.

Here’s a useful model: r = p * i, where r is the risk associated with some factor, p is the probability of an incident and i is the impact of that incident. For example, we can use this to compare the risk to society of a nuclear meltdown (low p, high i) to the risk from traffic accidents (high p, low i). The results are a decent starting point for making investment decisions on programmes to prevent, detect and recover from incidents.

Findings stated as risks

The findings of a usability evaluation are actually predictions. In a test, we investigate the behaviour and attitudes of a sample to infer the behaviour of a population of users. In an inspection, we role-play a sample for the same reason. Our predictions are actually statements of risk.

  1. [Based on the behaviour of our test participants we predict that] sophisticated language will deter a few users from using the menus to proceed beyond the home page.
  2. [Based on the behaviour of our test participants we predict that] misleading visual affordances will mask the interactivity of the product configuration controls for the majority of users.
  3. [Based on the behaviour of our test participants we predict that] due to fixed font sizes, a few users will be unable to read the privacy statement.

Each of these findings is a prediction grounded in data. It estimates a probability in terms of a number of users. It models impact in terms of what the defect prevents the user from achieving. Of course, there are other ways of expressing these factors. p could be the error rate or the frequency of the defect within the design. i could model consequences such as user attrition, lost revenue, productivity leakage, hazards or compliance issues. Choosing the right approach can make for an interesting and enlightening conversation with your client. Alternatively, High (3), Medium (2) and Low (1) may be all you need.

Using the model

So, to assess the risk of the three findings above:

1. risk = Low (few users) x High (blocked by the home page) = 1 x 3 = 3

2. risk = Medium (many users) x Medium (can’t configure a product) = 2 x 2 = 4

3. risk = Low (few users) x Low (privacy statement) = 1 x 1 = 1

On a scale of 1 (p=Low : i=Low) to 9 (p=High : i=High), fixing configurator affordances is our highest priority.  Interestingly, on our scheme, it is actually a relatively moderate risk. As with any calculation, if the result jars with your intuition, check your assumptions. Your initial assessment of p and i may be out.
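The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the r = p x i scheme with the High (3), Medium (2), Low (1) scale; the names `LEVELS`, `risk_score` and the short finding labels are invented for the example.

```python
# Ordinal scale from the post: High (3), Medium (2), Low (1).
LEVELS = {"low": 1, "medium": 2, "high": 3}


def risk_score(probability, impact):
    """Return r = p x i for two ordinal ratings ("low", "medium", "high")."""
    return LEVELS[probability] * LEVELS[impact]


# The three example findings, as (probability, impact) pairs.
# The labels are shorthand made up for this sketch.
findings = {
    "sophisticated menu language": ("low", "high"),
    "misleading configurator affordances": ("medium", "medium"),
    "fixed font size on privacy statement": ("low", "low"),
}

scores = {name: risk_score(p, i) for name, (p, i) in findings.items()}

# List the findings from highest to lowest risk.
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(score, name)
```

If a score looks wrong, the place to look is the pair of ratings in `findings`, not the multiplication.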

Now we have a risk for each finding, it’s straightforward to prioritise the recommendations. If there’s fixed development capacity for usability issues, select findings to tackle by ranking the risk scores. Otherwise, make an investment decision on whether to address each finding based on its absolute risk score.
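Both selection strategies can be sketched as follows. The sketch assumes that capacity is expressed as a number of fixes and that the investment decision is a simple cut-off score; both are simplifications chosen for the example, and the function names and the threshold of 3 are hypothetical.

```python
def select_by_capacity(scores, capacity):
    """Fixed capacity: rank findings by risk and take the top `capacity`."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:capacity]


def select_by_threshold(scores, threshold):
    """No fixed capacity: address every finding whose risk meets the cut-off."""
    return [name for name, score in scores.items() if score >= threshold]


# Risk scores for the three example findings (labels are shorthand).
scores = {
    "menu language": 3,
    "configurator affordances": 4,
    "privacy statement font": 1,
}

top_two = select_by_capacity(scores, capacity=2)
worth_fixing = select_by_threshold(scores, threshold=3)  # assumed cut-off
```

With these numbers both approaches pick the configurator and menu findings and leave the privacy statement for later.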

You might also want to factor in the cost of the fix – but that’s another calculation for another day.

July 3, 2009

User testing – a foundation recipe

Improvisation from basics

Every good cook has some treasured foundation recipes: a simple muffin mix to which she can add nuts, chocolate or spices; perhaps a tomato and onion based soup into which she can throw seasonal vegetables, pasta or chopped ham; maybe a spicy curry base that works well with prawns, chicken or vegetables.

To improvise in the kitchen, first master the basics, then understand when each variation is appropriate. For a white sauce, add parsley to accompany fish. Add mustard for boiled bacon or cheese for savory pancakes. No onions? Chop a scallion. Leftover tarragon? Chop it up; chuck it in. Last night’s salsa? Think again!

Experienced usability practitioners follow a similar approach in designing a usability test. It’s applied science; observation and analysis are fundamental. However, depending on goals and constraints, we can look for many things, observe in different ways and choose from a wide range of analytical techniques. As with cooking, there’s a foundation recipe and a wide range of variations.

User testing – the foundation recipe

Here’s a seven-step recipe that covers most types of testing. The two activities in parentheses are not strictly part of the method; they do, however, reduce risk and ensure that you learn from your experience.

1. Design study
2. Recruit participants
3. Prepare artefacts
(Pilot)
4. Observe, measure, ask
5. Analyse data
6. Report results
7. Brief client
(Project debrief)

You can expect to have some activity for each step. However, the nature and scope of that activity will vary according to the needs of the client and the culture of the project. Consider a test to assess the safety of a remotely-controlled radiography device. You might plan for hypothesis-testing (design study) using a large sample size (recruit participants) to record error rates (measure) for statistical analysis (analyse data).

The report (report results) might become a formal project deliverable while a handover meeting (brief client) would be essential for a mixed audience of technical, business and medical specialists. It’s the equivalent of high-tea, muffins with chopped dates, walnuts and cinnamon.

For a small-scale “in-flight” study, the model is the same but the activities are smaller and simpler. A formative research design (design study) uses a small sample (recruit participants) to acquire data (observe, ask) for qualitative analysis (analyse data). The results are presented in a PowerPoint deck (report results) and reviewed by the design team and project manager (brief client). This situation is more like a simple dusting of caster sugar – good rather than fancy.


Here are the nuts, raisins and chocolate chips to add to the basic recipe.

1. Design study
  • Summative, Formative, Benchmark, Competitive, Comparative
  • User-driven, “chauffeured”
  • Open-ended, Scripted?
2. Recruit participants
  • Quota sample, Stratified sample, Opportunity sample
  • Recruit directly, Use an agency
  • Volunteers, Incentives
3. Prepare artefacts
  • Paper, Static, PowerPoint, Axure etc, Wizard of Oz, Live code
4. Observe
  • Direct, Indirect (video), Remote (e.g. TechSmith)
  • From a control room, Side-by-side
  • In a lab, In an office, In the field
4. Measure
  • Count, Time, Code, Checklist
4. Question
  • Active, Passive
  • Interrupt protocol, Debrief protocol, Before-and-after protocol
5. Analyse data
  • Quantitative, Qualitative
  • Specific observations, Generalised issues
  • Descriptive, Analytical
  • Business impact oriented, Solution feature oriented
6. Report results
  • Document, PowerPoint, Annotated video, Verbal
  • Formal, Informal, Standardised
7. Brief client
  • Briefing, Review, Action-planning

You can read more about these techniques in books such as A Practical Guide to Usability Testing (Dumas and Redish) or Human-Computer Interaction (Preece et al.).


The success of a user-test is pretty much determined by the quality of the thinking you do before you book a lab or approach a recruiter. Here’s a checklist that covers the main issues. Use it as the basis of a workshop or planning session before you start on design and logistics.

1. Design study
  • What do you want to find out?
    • Summative – is it good enough?
    • Formative – how could it be improved?
    • Benchmark – how good is it now?
    • Competitive – how does it compare to competitors?
    • Comparative – which alternative works best?
  • Who is your target audience?
  • What tasks do you want to test?
  • Who will “drive” – you or your participants?
  • Is it open-ended or does it need to follow a pre-defined path through the prototype?
2. Recruit participants
  • How are you going to find the people you need?
  • What incentives will you offer them?
  • How are you going to get them in the right place at the right time?
  • How long do you need them for?
3. Prepare artefacts
  • In what form will you show the design to the participants?
  • How interactive does it need to be?
  • How much ground does it need to cover?
  • How high fidelity should it be?
4. Observe
  • What events and outcomes are you looking for?
  • How will you record them?
  • How many observers will you use?
  • How visible should you be?
  • How involved should you be?
  • What balance are you seeking between recording expected events and noticing surprises?
  • How will you ensure that observation does not distort the data?
  • What evidence will you need?
4. Measure
  • What events and outcomes do you want to measure?
  • How will you log the data you need?
  • How will you ensure that the measurement process does not distort the data?
4. Ask
  • What attitudes and insights do you need to capture?
  • When will you capture this information? During a task, after each task? At the end of the study?
  • How will you ask the question? In person, on a form, through the design itself?
  • How will you calibrate this information? Do you need to capture an opinion before each task?
  • How will you record this information?
  • How will you ensure that asking questions does not distort observations and measurements?
5. Analyse data
  • What is the right blend of qualitative, quantitative and video?
  • What’s the analytical focus: the problems; the causes; the impact; or recommendations?
  • What level of rigor is appropriate and affordable?
6. Report results
  • Who is going to read it? What do they need to know?
  • How long and formal does it need to be?
7. Brief client
  • How do we turn the study into a pragmatic, actionable plan?
  • How do we get commitment to change?

As in the kitchen, get the basics right but be prepared to improvise the detail. That way you’re still ready when you don’t have the right method in the store cupboard.

August 9, 2008

Data good; findings better

Filed under: evaluation – uxarchitecture @ 11:06 am

I get to read a lot of usability studies. Some are insightful and persuasive, clearly communicating the main issues and inviting action. Others contain indigestible inventories of raw data. Here are some examples:

  • a long list of specific errors;
  • an exhaustive set of annotated screen shots; or
  • a table of design problems grouped by page.

A heuristic evaluation can generate hundreds of expert comments. Likewise, a skilled observer can capture many subtle observations by analysing the video from a usability study. Data is good, but data is exactly what it is: the raw material from which a skilled analyst extracts findings.

Here’s what clients tell me they want to know.

  1. How well does it work?
  2. What are the major problems?
  3. What’s the impact on my users and my business?
  4. What do I need to do to fix it?
  5. How can my design team learn from this?
  6. How do I know you’ve done thorough and impartial work?

The missing step in these “briefcase buster” reports is analysis. A usability practitioner needs the ability to mine hundreds of data points to extract the one or two pages of insight that truly answer the client’s questions. There are many methods, including shuffle-the-post-it, qualitative analysis and mapping to guidelines. Here’s a route-map.

  1. Analyse data to create findings. A finding describes a pervasive issue: the graphic design is primitive; the actions do not match the user’s task model; terminology is arcane and inconsistent.
  2. Support findings with selected data. This demonstrates rigor, illustrates abstract ideas with concrete examples and adds emotional impact.
  3. Describe the specific impact on the business: higher learning costs; lower adoption; brand damage; reduced sales.
  4. Recommend design changes: follow the Windows style guide for radio button behaviour; do not use a fixed font size; describe business processes in plain English.
  5. Recommend tools and methods improvements: consider using a professional graphic designer; construct a task model before designing screens; read the Polar Bear book.

Good findings should be high level, clear, business-focused and actionable. Above all, to paraphrase the good Doctor, “Speak the client’s language”. To us it’s a research project; to them it’s an investment.
