Learning from Neuroscience: UX Redesign of Computer Games
If machines are going to support humans better, then they had better understand what we are saying when we tell them to do something. When even we humans don’t yet truly understand how we can so effortlessly extract meaning from some of our grammatical lexica, it’s no surprise machines have a hard time keeping up.
A team of computer game design experts, computational linguistics experts and coders at Queen Mary University London and Essex University created a set of computer games that crowdsource natural human language usage data through play, with the aim of better understanding the language patterns that humans find so natural and machines so problematic. The project and the resources it generated were open source and had already spanned many years of rigorous scientific research.
The team needed a lot of data from their games, which meant lots of players playing the games frequently. Yet player drop off was fast and vast.
I partnered with them to change that.
- What the hell is a GWAP?
- The problem
- High level project goals
- Initial observations & pre-research
- My role
- Solving problems
- Solution focus 1: Unified experience — the map
- Solution focus 2: Game 1
- Final round usability testing
- Success metrics
- Next steps & reflection
What the hell is a GWAP?
The games existed for two primary purposes:
- Outsourcing to humans the steps within a computational linguistic analysis process that machines are failing at
- Running large-scale academic Game Design Psychology and Computer Science experiments
A human-based computation game or ‘game with a purpose’ (GWAP) is a human-based computation technique of outsourcing steps within a computational process to humans in an entertaining way (gamification) — Wikipedia
GWAPs motivate people through entertainment rather than an abstract interest in solving computation problems. In this case the computational problems are in the Computational Linguistics domain, specifically the better understanding of human language by machines.
When I came onto the project the four games were essentially separate entities, from both user and project perspectives, each with different game developers. As word games, they all shared the same core gameplay.
In all the games, players are presented with text documents composed of written prose. Players assign grammatical categories to words, or chunks of words, as they read. If enough players agree on a given chunk’s categorisation then it is considered reliable enough fodder to be fed to the machine learning algorithms that draw out grammatical insights.
At kick-off each of the games existed on a separate website and they shared no common visual or interaction design patterns, yet their gameplay drew from the same pool of playable text documents. Each of the games was created to collect a different type of lexical data from the text documents, and the text documents moved through the games in series as a pipeline.
Unfortunately the first game in the pipeline, originally titled ‘TileAttack’, was causing a bottleneck. Without enough players validating documents in ‘TileAttack’, the other games were starved of the very source material of their play. As a result players playing the games further down the pipeline would find their play ending woefully prematurely.
High Level Project Goals
While the core goal was to drastically increase the amount and quality of data coming from the games, it was clear that this wasn’t going to happen without addressing the pipeline issue.
Initial Observations & Pre-Research
We needed a huge quantity of human validated language examples for the Machine Learning to be effective. To achieve this the games needed to be fun and rewarding enough for players to want to play them and keep coming back. Concerns raised in heuristic analyses were backed up in a useful ‘fast and dirty’ preliminary usability testing round:
- Player objectives were unclear
- The complex concepts that underpinned gameplay were obstructive
- The user interface experience was confusing and unattractive
- The gameplay was monotonous
- The games were totally disconnected experiences
Here’s how I transformed them into a gratifying experience that players wanted to keep coming back to.
Without existing Design or Product resources in the team, my job was to bring the various disparate game projects and their contributors together to define a singular Product Vision, gather insights through User Research, validate the vision and then turn it into tangible user experiences that fulfilled project goals.
Working 100% remotely as a ‘UX team of one’, I took on a broad spectrum of end-to-end responsibilities. Soon after my starting on the project the team made it clear that they wanted all the help they could get in terms of Product Management, so I found myself roadmapping and creating Gantt charts with the team as well as designing:
- Product Strategy
- Product Management
- UX Strategy
- UX Research
- UX Design
- UI Design
- UX Content Writing
From developing the Product Vision through roadmapping and design phases to Developer handover, visually co-designing with the team in Miro workshops was crucial to achieving successful results.
I was honoured to work with and learn from such a concentration of knowledge and experience as the DALI team. Notably present was the esteemed Richard Bartle. Richard might be considered the Tim Berners-Lee of Game Design: he created the very first multi-user games at the dawn of the connected age, and his theories underpin the core concepts of current Game Design.
Bailing fast: Analytics showed that by the second interaction over 90% of users were dropping off — even though over 70% of them were being paid to play! It was clear that potential players were being put off from the get go. Heuristic analysis defined serious usability flaws, but ultimately players didn’t even know what they were meant to be doing.
Through a few incisive survey questions piggybacking on the recruitment screening questionnaire, it was discovered that only players with an existing interest in language games were interested in the value proposition of any of the games. This struck the ‘pure gamers’ user group off our list.
The team realised that none of the games would compete with Candy Crush for player attention, but instead were for people who liked word games, liked exercising their brain and liked learning the nuances of language. This we needed to capitalise on, rather than waste resources on attempting to turn them into action games.
Recruitment across the most representative spectrum of relevant interview respondents was carried out to minimise bias, with interviewees taking part remotely from numerous continents.
Due to the bureaucratic barrier university structures presented in outsourcing recruitment to specialist external companies (laborious request to tender process, etc.), we had to be satisfied using predominantly academic channels for recruitment. Independent recruitment outside academic channels, while pursued, was not providing sufficient uptake in the given time frame.
On the plus side, the academic networks reached students, lecturers and alumni across all subject areas, and allowed me to reach users with the level of highly advanced language skills that would have been tricky to find using standard recruitment channels.
Interviews were of a semi-structured format to balance greatest value for gathering unpredictable insights with enough consistency to provide comparative data.
All games were usability tested using timed task analysis and compared to test subjects’ favourite comparable games to gather behavioural insights and investigate design preferences. They were concluded by gathering a system satisfaction score using the SUPR-Q measurement format (which I generally recommend over the more traditional System Usability Scale, or SUS).
Key Findings & Design Challenges
While many other relevant findings combined to generate a deeply nuanced understanding of our users’ thoughts, beliefs, mental models and experiences, these were the findings that stood out. The problems are presented here with a preview of the solutions that came out of the fun cross-functional team workshops I facilitated and some additional solo sketching.
#1: Our user groups have very different needs
PROBLEM: Some players are academic linguistics experts flashing their skills, some are using the games to learn English as a foreign language.
DESIGN CHALLENGE: A different baseline skill level was inherently required for each of the individual games. To cater for players’ broad scope of abilities, the individual games could be presented at appropriate key stages in highly personalised user journeys within a single unified experience.
#2: What the #5%* is going on?!
PROBLEM: Players had a really hard time working out the aim of the games and how they were meant to play.
DESIGN CHALLENGE: Communicate player goals clearly and develop a gameplay mechanics learning journey that is itself fun and easy to understand.
#3: These linguistic concepts are very complex — for everyone
PROBLEM: Even the world experts in this field on the team couldn’t agree on a singular definition of some key language concepts. Our players were going to need a carefully designed learning experience to continually feel motivated and provide project value in their play. Fast on-boarding is not always possible when there are so many new things to learn.
DESIGN CHALLENGE: Make learning them fun, rewarding and comfortably progressive through an engaging narrative. Learn from doing when possible and design for the psychological, neurological and contextual limitations of user memory.
#4: Samey gameplay drives churn
PROBLEM: Once players understood how to play each of the original individual games there was little sense of progression and motivation to keep playing.
DESIGN CHALLENGE: Nest the various games within a single wrapper game, presenting them at appropriate stages of personalised user journeys to keep players engaged.
#5: I want to help science but show me my impact
PROBLEM: In surveys and interviews, all player types/user groups consistently reported that playing to help a scientific cause, as is the case with all our games, would be a significant motivator (although only observing behaviour could confirm this in practice). That said, without sufficient feedback on the impact a player had in moving the project towards its goals, the motivation players felt to play to help a worthy cause was largely untapped.
DESIGN CHALLENGE: Surface players’ contribution to the cause within the games to tap that feeling of satisfaction that their play has helped science.
#6: It’s gotta look and feel sweet
PROBLEM: Players seeing the games for the first time said they felt like they’d accidentally clicked on a hooky phishing link. These were the result of experimental academic projects, not big team, big budget game houses. The experience appeared and felt fragmented.
DESIGN CHALLENGE: Provide a clean, modern UI style that is a pleasure to use, while presenting all the games within a unified visual and interaction style to drive coherence and credibility of the consummate value proposition.
As qualitative research identified that not enough pure action gamers were interested in the kind of cognitive challenge that our games presented, the five player types/user groups the team proposed during the kick-off workshop became four.
All the personas had quite different needs but, for reasons of brevity, this case study will focus on the ‘Grammar Ace’ player type, as she was found to present the widest spectrum of potential in terms of lifetime value, ability to quickly generate quality data and her role in keeping the document pipeline flowing. We’ll discover how and why as we move into the problem solving phase later in this case study.
The majority of players across all user groups shared some consistent core traits around reasons to play a game, reasons to stop playing a game, and when and where they play.
The Quirks of Game Design UX
Celia Hodent’s fantastic book ‘The Gamer’s Brain: How Neuroscience and UX can impact Design’ was hugely influential to me throughout the project. This was my first opportunity to design in the gaming domain and I discovered some aspects of practice vary when working in this field.
Key Concepts — The Psychology of Why People Play
Hodent’s book is a treasure trove of design insight from the lens of an extremely studied and experienced psychologist and neurologist. While user interviews told us that our users play to learn, to feel good about helping a worthy cause, to relax and to also stimulate their brains, digging deeper into the psychology of motivation was extremely useful, and has implications across other domains of UX design.
One concept she elucidates in the book is motivation. Why are we motivated to do anything, let alone spend time on an activity as frivolous as playing games? Motivation can be split into two categories: extrinsic and intrinsic. Extrinsic motivation is when we are motivated to perform actions to realise outcomes beyond the action itself, while intrinsic motivation occurs when one pursues an activity for the sake of the activity itself, not as a means to getting something else.
Most UX deals with extrinsic motivation. For example, engaging in the Amazon eComm experience is primarily motivated by the extrinsic need to purchase a product, not to dwell within the Amazon experience itself. More important to the gaming experience is intrinsic motivation. One perspective often used as an intrinsic motivation framework in game development is self-determination theory (SDT). Based on this perspective, a game should aim to satisfy basic psychological needs for competence, autonomy, and relatedness to be engaging (see Przybylski et al. 2010) and be meaningful to the player.
From these, Hodent defines the following ‘pillars’, somewhat analogous to project design principles but specifically around the theme of motivation:
Competence: Make players feel skillful and in control, and give them a sense of progression and mastery. It is therefore critical to clearly express to our audience what the short-term, mid-term and long-term goals of the games are so players can put forth more effort and engagement when playing them, knowing that it can be a long-term investment.
Autonomy: Satisfying players’ needs in autonomy mostly entails allowing them to make meaningful choices and to have a sense of volition, but also making the game systems clear so they feel in control and can experience a sense of purpose.
Relatedness: This means offering meaningful social interactions in the game. Games that offer various channels to communicate information or emotion among players (e.g., a chat system or emotes), to compete against each other, or to play cooperatively increase their likelihood of being engaging.
Meaning: Meaning is about “having a sense of purpose, value, and impact” (Dan Ariely, 2016a). This is why it is important to reveal the reasons behind anything the player needs to do or has to learn. In our example our player’s sense of purpose is derived from feeling that they are helping progress a valued scientific cause.
This brought about a key reframing of a core UX tenet:
‘Answering user needs’, as our primary goal for most UX projects, becomes here ‘providing players a rewarding experience’.
To assist in decision making, improve alignment, and provide clarity through the ideation, I facilitated a workshop with the team to create a set of design principles that would become a north star to be consulted for guidance repeatedly from that moment onward.
Remote Ideation Workshops
I facilitated synchronous remote cross-functional ideation workshops using the online collaboration tool Miro to leverage the combined problem solving abilities and specialist knowledge contained within the team. The team addressed our highest priority user problems in turn, generating lots of ideas.
After subsequent feature prioritisation workshops with the developers, where user value was mapped against feasibility, the team decided which solutions to develop. The most influential for the project have already been touched upon in ‘Key Findings & Design Challenges’.
How to prevent bail from samey gameplay?
Provide paradigm shifts in gameplay in a sequenced player journey: the individual games are encountered sequentially as ‘games within a game’.
How to draw players back to Game 1 to keep pipeline flowing?
Use moments of paradigm shift as an opportunity to incentivise players to return to Game 1 when pipeline flow wanes.
How to prevent bailing when level of challenge too easy or too hard?
Players set the skill level of play during on-boarding and start with the game that provides the most suitable level of challenge. Constantly assess player performance and dynamically adjust level of challenge to suit player skill.
How to create a harmonised flow for the user as they move from game to game?
The four games were originally disconnected experiences. Present all games as nodes on a geographical map that players explore in an immersive unified thematic narrative.
How to prioritise quality of data over quantity?
To prioritise the highest quality data output from the most trusted players and screen out low quality data from unskilled players, perform constant player performance measurement and weight the data that each player generates accordingly.
How to drive retention, frequency and quantity of plays per user?
Surface players’ impact on project progress. Provide involved gameplay that provides reward over an extended time frame.
How to prevent churn when unable to progress?
Reframe failure as learning opportunity. Always provide onward journey and progress game even if player unable to meet game challenge.
How to make game mechanics easier to learn?
Minimise cognitive load in UI, maximise usability, utilise familiar IX patterns wherever possible, employ phased learning cycles.
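The quality-over-quantity idea above (weighting each player's data by their measured performance) could be sketched roughly as a weighted vote. This is only an illustration under stated assumptions, not the project's actual pipeline: the function name `weighted_label`, the use of accuracy as the weight, the neutral default weight and the sample values are all hypothetical.

```python
from collections import defaultdict


def weighted_label(votes, accuracy, default_weight=0.5):
    """Pick the label with the highest accuracy-weighted support.

    votes: list of (player_id, label) pairs for one chunk of text.
    accuracy: dict mapping player_id to a measured accuracy in [0, 1].
    Both structures are hypothetical, chosen only for illustration.
    """
    totals = defaultdict(float)
    for player_id, label in votes:
        # Players with no track record yet get a neutral default weight.
        totals[label] += accuracy.get(player_id, default_weight)
    if not totals:
        return None, 0.0
    best = max(totals, key=totals.get)
    # Share of the total weight that backs the winning label.
    share = totals[best] / sum(totals.values())
    return best, share


# One trusted player can outweigh two less reliable ones.
votes = [("ana", "NP"), ("ben", "not-NP"), ("cho", "not-NP")]
accuracy = {"ana": 0.9, "ben": 0.3, "cho": 0.4}
label, share = weighted_label(votes, accuracy)
```

In this made-up example the single high-accuracy vote wins over two low-accuracy votes, which is the behaviour the weighting is meant to produce. A real system would also need a rule for how accuracy itself is measured.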
Solution Focus 1: Unified Experience — The Map
As previously mentioned, unifying all the individual games as ‘games within a game’ was a solution that provided a varied, extended player experience and a framework for personalised player journeys.
In the diagram above we focus on the journey of the Grammar Ace persona as they move across the four individual games over an extended journey. Players start in the individual game that best suits their grammar ability level and are introduced to other games at specific timely moments.
In the journey map we see further detail on the rationale behind the timing and choice of moving the flow of their play to another game. Note that Game 0, while adding value to the player experience through being faster to play and quicker to pick up than the other games, does not generate any project critical data. It exists to run subtle game design experiments in A/B tests that are separate from the data goals of the project.
In the above diagram we see how the cross game journey impacts and develops player skill, providing them the satisfaction of a dynamic yet appropriate level of challenge and persistent learning. As their skills develop the quality of the data that they generate increases and they are better able to help less skilled players as they themselves climb the skill ladder.
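The "dynamic yet appropriate level of challenge" described here could be driven by a simple rule that looks at a window of recent results. The sketch below is an assumption for illustration only: the function `adjust_level`, the thresholds, and the level range of 1 to 4 are hypothetical and are not taken from the project.

```python
def adjust_level(level, recent_results, min_level=1, max_level=4,
                 raise_at=0.8, lower_at=0.4):
    """Return the next challenge level from a window of recent results.

    recent_results: list of booleans, True where the player's answer
    matched the validated one. Thresholds and level range are assumed.
    """
    if not recent_results:
        # No evidence yet, so keep the level chosen during on-boarding.
        return level
    success_rate = sum(recent_results) / len(recent_results)
    if success_rate >= raise_at:
        level += 1  # Player is cruising: step the challenge up.
    elif success_rate <= lower_at:
        level -= 1  # Player is struggling: step the challenge down.
    # Clamp to the available range of levels.
    return max(min_level, min(max_level, level))


next_level = adjust_level(2, [True, True, True, True, True])
```

Keeping a band between the two thresholds where the level does not change avoids bouncing a player up and down after every round.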
So how do we bring players back to the crucial Game 1? Agency is important for players in their choice of individual games (some of the games already had significant numbers of loyal players), so at the end of each round players are given the option to continue playing or move to another game. They are nonetheless encouraged to play another game for variation through visual prioritisation in the UI, or incentivised to play Game 1 through score bonuses and through the metaphor of the productivity metrics, which requires them to play across the various games. More on these metrics in the next solution focus.
In the player experience, the map concept was an extension of a narrative that I developed with the team in collaborative thematic design workshops: a storyline that tied all the games together in an immersive thematic fiction. Significant time was put into this, and the UX copywriting skills I developed in conversational interface design were key to the final experience, but I’ll keep that out of scope of this already long case study.
Solution Focus 2: Game 1
When I came onto the project Game 1 looked extremely simplistic, although the grammatical concepts inherent within play were complex and gameplay was almost impossible to grasp for all usability test participants. Play was even more monotonous than the other games. As it was causing the blockage in the document pipeline, this was the game that required the most extensive redesign to turn it into a game that players wanted to play and would be drawn to come back to time and time again.
Plant Growing Metaphor
The aim of the game was to identify noun phrases within given texts. For the identified noun phrases to be deemed correct they would need to be validated either by other players in previous games or, in on-boarding, against a document fully validated to ‘gold standard’ by an expert linguist. To surface the currently obscured collaborative nature of play and provide for the ‘relatedness’ motivation pillar I developed a plant growth metaphor. As per the nature of the crowdsourced model, identified noun phrases would need to be validated by five players to be moved to ‘silver standard’ and be deemed reliable enough to move to the next stage of the pipeline flow and be used in Games 2 and 3.
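The five-validation rule lends itself to a small sketch. What follows is a minimal illustration rather than the game's real data model: the class `NounPhrase`, counting only distinct players, and the `growth` fraction used to drive the plant visual are all assumptions. Only the threshold of five comes from the text.

```python
SILVER_THRESHOLD = 5  # validations needed for 'silver standard' (from the text)


class NounPhrase:
    """A candidate noun phrase that grows as players validate it."""

    def __init__(self, text):
        self.text = text
        # A set, so a repeat validation by the same player is not
        # counted twice (an assumption, not confirmed by the source).
        self.validators = set()

    def validate(self, player_id):
        self.validators.add(player_id)

    @property
    def growth(self):
        # Fraction of the way to silver, capped at 1.0 (fully grown).
        return min(len(self.validators) / SILVER_THRESHOLD, 1.0)

    @property
    def is_silver(self):
        return len(self.validators) >= SILVER_THRESHOLD


phrase = NounPhrase("the old oak tree")
for player in ["p1", "p2", "p3", "p1"]:  # p1 validates twice
    phrase.validate(player)
```

After these four calls the phrase has three distinct validators, so it is 60% grown and not yet silver. Two more distinct players would move it on to the next stage of the pipeline.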
Improving Usability: Text Selection
Among the serious usability issues surfaced in usability testing was the original text and noun phrase interaction pattern. The interaction, which required a click on the leftmost ‘word tile’ of the words the user wanted to select and then on the rightmost to move the whole set into a new lane as a selection, was unfamiliar to users. Also, the ‘lanes’ format pushed vast amounts of text off screen to the right, requiring laborious scrolling that led to memory failures.
Anyone who uses a mobile phone often (most of the world, and certainly most of our user base) is familiar with selecting text using selection handles. Using such a familiar interaction pattern frees limited cognitive resources from working out game mechanics, allowing players to concentrate on the fun challenges of gameplay.
My first idea was to use the standard mobile IX pattern: words are selected by long press then selection is edited using handles at start and end of selection. — See sketch 1
Nested phrases needed to be selectable while keeping previously selected phrases visible. ‘Onion layers’ nested model: Previously selected phrases are highlighted with a colour corresponding to the noun phrase type, facilitating grammatical learning and game state feedback. Where multiple noun phrases are selected for a given set of words they are visually nested within each other in encapsulating layers, like an onion. — See sketch 2
I provided error correction through a selection confirmation popup. The options are clear and simple, with background noise being greatly reduced. — See low fidelity prototype screen with error correction
After further iteration and low fidelity usability testing, the low fidelity screen was moved to its final iteration during my time on the project.
Final Round Usability Testing
After numerous iterations, the final round of usability testing met with an enthusiastic reception. We all had our doubts about the narrative fiction we had created to wrap the experience together, and to be honest we were surprised by how keen users were on it.
“I want to open it every morning and check how my civilization is doing. And at lunch and in bed before I go to sleep.” — Ben, 22, Non-Linguistics Undergrad, UK
“I’m hooked to see what happens next. Sounds difficult but exciting, looking forward to a challenging task to come.” — Srinidhi, 32, Non-Linguistics Post-Grad, USA
The collaborative plant growing concept and user experience was also essentially successful:
“I do like that. I’m feeling invested. I feel a pull to the game. I want to come back to it and harvest.” — Femi, 24, Non-Linguistics Undergrad, UK
“I like this collaborative growth/seed concept and visualisation. NPs aren’t fun or engaging as a rule but they are here. Gives a sense of accomplishment. It’s fun!” — Ellise, 36, Linguistics PhD, USA
Players noted that they were wary of the format and scope of collaboration. A few were concerned that there might be inter-player communication, citing negative past experiences of trolling and bullying, and noted risks around the quality and speed of moderation and policing of forums. Relevant expectations were duly set from the outset in the on-boarding flow. Once users’ concerns over the nature of collaboration and the limits of communication designed into the game were addressed, their fears were quelled.
“I love this sense of community — I don’t need to know any other players to feel a sense of satisfaction.” — Alana, 27, Linguistics Graduate, Belgium
Concerns over data privacy and the ethics of its end use were voiced. This was addressed by making data use, privacy and overall ethics clear during on-boarding.
Interaction patterns were predictable and didn’t get in the way of the users’ focus on gameplay:
“It feels familiar and easy to navigate.” — Marcia, 41, Linguistics Professor, Brasil
Ultimately the redesign was successful. Usability and experience satisfaction scores based on benchmarking tests of the original games vs. the redesigned experience show my input’s impact.
Average task completion improved by 500%
Average satisfaction metrics (SUPR-Q) improved by 300%
(Numbers rounded up or down to the nearest 50%)
Next Steps & Reflection
With four games to redesign and the whole unifying ‘wrapper’ game, this case study — while already long — is of course scratching the surface of this GWAP project.
At the time of my contractual sign-off the team had only just started the process of finding an illustrator whose imagery, it was expected, would assist in unifying the holistic value proposition, adding significant delight and richness to the experience. A majority of users actually voiced their desire to see illustrated elements throughout the experience and I hope they will be satisfied with the illustrator’s contribution.
Looking back over the project, managing scope was a challenge and prioritising the many exciting solution concepts the team were consistently proposing against feasibility and resource allocation was key to a timely delivery.