This article was written as part of my information design degree at the University of Reading.
Scrolling is an essential part of our everyday digital lives. We scroll digital documents on our computers, e-book readers, mobile phones and tablets. We even scroll the channel list on our televisions. There are methods like roll, click, drag, swipe, point and even wave. We depend on scrolling to consume the vast volume of information we face daily. However, the idea of scrolling, and how it should be handled, has been questioned multiple times before, and still is.
Two main models
In the early days of interaction design, as display terminals grew more complex, designers debated whether the keyboard arrows should control the window or the text. Controlling the window means that the arrows determine the direction in which the window ‘pans’ or ‘tilts’ to reveal off-screen content, as opposed to the direction in which the text moves. The general agreement was that the keyboard arrows should control the window. The same logic applied when the computer mouse was introduced. At the time, the only ways to scroll were to use the keyboard arrows (or the function keys ‘page up’, ‘page down’, ‘home’ and ‘end’) or to control the scroll bar with the mouse cursor.
The ‘moving window’ model really caught on, and went on to become a well-established convention within most human-machine interaction systems.
Sharks vs. Scrollers
Around a decade later, in connection with the 1987 Hypertext Conference, the American human-computer interface expert and UI pioneer Jef Raskin gave an interview in which he criticised his former employer, Apple, for its ‘card-based’ hypertext model: ‘There are two camps: HyperCard and scrolling’, he said. ‘I’m a holy scroller’ (Jones 1987). Raskin strongly disapproved of how Apple’s HyperCard software managed documents, and he introduced the terms ‘card sharks’ and ‘holy scrollers’ to describe the two opposing camps.
The ‘sharks’ were in favour of the card model, which Apple used in HyperCard. In this model the text is given a fixed-size canvas (a card), on which designers may create beautiful layouts and organise information in a flexible way. If there is more text than fits on the canvas, more cards are added, much as in a printed book, where the reader has to flip through the pages to view the information.
The ‘holy scrollers’, on the other hand, pointed out that scrolling eases the reading experience. Usability expert Jakob Nielsen (2013:153) agrees, and says that ‘users have to jump less, but at the cost of a less-fancy layout because the designer can’t control what users are seeing at any given time’. This is because the card model breaks the text at arbitrary places and thus interrupts the reading flow. That puts an unnecessary mental strain on the reader, compared with reading the continuous text flow of the scroll model.
Another aspect is the user’s ability to predict the document’s length. The scroll bar is a clear visual indicator of both how long the document is and how far the reader has come. Without a physical stack of cards or pages, pagination is too abstract for the user to judge precisely either the total amount of information in the document or how much of it remains.
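To make the scroll bar’s role as an indicator concrete, here is a minimal Python sketch. The function name `scrollbar_indicator` and its parameters are hypothetical and not taken from any particular toolkit. It assumes a simple vertical scroll bar whose thumb is sized in proportion to the visible share of the document.

```python
def scrollbar_indicator(offset, window_height, document_height):
    """Return (thumb_fraction, progress) for a simple vertical scroll bar.

    thumb_fraction: share of the document visible at once (thumb size / track size).
    progress: how far the reader has come, from 0.0 (top) to 1.0 (bottom).
    All lengths use the same unit, for example pixels or lines.
    """
    if document_height <= window_height:
        # Everything fits in the window: the thumb fills the whole track.
        return 1.0, 0.0
    thumb_fraction = window_height / document_height
    progress = offset / (document_height - window_height)
    return thumb_fraction, progress


# A 1000-line document in a 100-line window, scrolled 450 lines down.
thumb, progress = scrollbar_indicator(450, 100, 1000)
print(f"thumb covers {thumb:.0%} of the track, reader is {progress:.0%} through")
```

A small thumb signals a long document, and its position signals how much remains. A bare ‘next page’ link carries neither piece of information.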
[…] scrolling is a simple way to move across content without advance planning: you just keep moving down.
These are the words of Nielsen (2005). Later, in an essay called ‘scrolling and attention’, he adds to this point:
Scrolling beats paging because it’s easier for users to simply keep going down the page than it is to decide whether or not to click through for the next page of a fragmented article. (Nielsen 2010b)
Which camp ‘won’ the late-1980s hypertext discussion is quite obvious, since virtually every piece of software today uses the scroll model to handle long documents.
Some websites today (usually blogs and news sites) have implemented a hybrid solution, though, in which long stories are split into multiple pages that are linked together across the article. This reflects the ideas of the card model, but in reality it is still the scroll model that is applied. Implementations like these are problematic for users on multiple levels.
If you have a long article, it’s almost never good to simply chop it up into a linear sequence of pages. If the only navigation is a link saying “continue” or “next page,” then it’s typically better to stick it all on a single page and rely on scrolling instead of page-turning. (Nielsen 2011)
Jakob Nielsen
Nielsen (1997) also notes that ‘having to download several segments slows down reading and makes printing more difficult.’
Touch changes everything
In this essay I will focus mainly on the two conflicting models, moving window and moving text. The traditional scrolling method, in which the window moves in the commanded direction, relies on an understanding that the text, or any other content, has a fixed position (see figure 1). The window metaphor is misleading in that case, since windows in our physical world cannot be moved around to get a different view. A more appropriate metaphor for this model may be a monocular. The monocular has a limited field of view, but it can be moved around in order to view the surroundings. In the moving window model, the scroll bar moves in the same direction as the command, while the content moves in the opposite direction.
The moving text model, on the other hand, relies on a mental understanding in which the window has a fixed position and it is really the text that moves (see figure 2), as if the visual image were a movie reel and the window the projection frame.
Both the window moves model and the scroll model became recognised interaction design conventions, and neither model met resistance until recent years. Only when touch screens entered the equation did changes in our scrolling behaviour emerge. The window moves model makes little sense when controlling a touch-based interface. When using a smartphone or a tablet, the only logical way of scrolling is to swipe the content in the same direction as your fingers move, which is the text moves model.
Try imagining scrolling on a smartphone using the window moves model. It is just awkward. It feels wrong. The metaphor used in these interfaces is different, because the control is different. A document displayed on a touch screen mimics physical paper. It makes sense to ‘grab’ it and ‘move’ it across the surface to scroll, because that is how we would interact with real paper within similar constraints. Still, there is something odd going on. The scroll bar always follows the window moves model, and so moves in the opposite direction to the control.
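The difference between the two models comes down to a sign convention, which the following Python sketch tries to illustrate. The names `ScrollModel` and `apply_command` are hypothetical, and the sketch assumes that the window position is stored as an offset from the top of the document and that a positive command means ‘down’.

```python
from enum import Enum


class ScrollModel(Enum):
    MOVING_WINDOW = "moving window"  # the command moves the window over fixed text
    MOVING_TEXT = "moving text"      # the command moves the text behind a fixed window


def apply_command(offset, command, model, document_height, window_height):
    """Return the new window offset, measured from the top of the document.

    command > 0 means "down" (arrow down, or a finger moving down);
    command < 0 means "up".
    """
    if model is ScrollModel.MOVING_WINDOW:
        # The window goes down, so the content appears to move up.
        new_offset = offset + command
    else:
        # The text goes down, so content further up the document is revealed.
        new_offset = offset - command
    # Keep the window inside the document.
    max_offset = max(0, document_height - window_height)
    return min(max(new_offset, 0), max_offset)


# The same "down" command leads to opposite results in the two models.
for model in ScrollModel:
    print(model.value, apply_command(100, 30, model, 1000, 200))
```

Starting from an offset of 100, the same ‘down’ command of 30 units ends at 130 in the moving window model and at 70 in the moving text model.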
The wheel controls the bar
The invention of the mouse scroll wheel truly improved computer usability. Cooper (2007:381) goes as far as naming it ‘one of the most useful innovations in pointing devices’. He points out that the physical scroll wheel is more accessible and easier to use than the virtual scroll bar. The wheel is always within reach of the user’s middle finger. It is intuitive and precise in use, while interaction with the scroll bar has some challenges. The actual clickable area is usually tiny. Also, the scroll speed, or the sensitivity of the scroll bar, depends on the length of the document. This is difficult for users to predict, which often results in an awkward back-and-forth struggle with the scroll bar.
In terms of scrolling, the scroll wheel is almost the opposite of the touch screen. The only model that really makes sense is the window moves model. The reason for this lies in the physical interaction with the wheel. While the scroll gesture on a touch screen mimics moving the ‘document’ itself, there is no strong connection between pushing or pulling paper and the rolling motion of the middle finger on the wheel. Therefore it makes sense that the scroll wheel controls the scroll bar.
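A rough Python sketch can show why the sensitivity of the scroll bar grows with the document. The function `content_per_thumb_pixel` is a hypothetical illustration, assuming a proportionally sized thumb with no minimum length; real toolkits may behave differently.

```python
def content_per_thumb_pixel(document_height, window_height, track_length):
    """How far the content scrolls when the thumb is dragged by one pixel."""
    if document_height <= window_height:
        return 0.0  # the whole document is visible, so there is nothing to scroll
    # Assumption: the thumb covers the same share of the track
    # as the window covers of the document.
    thumb_length = track_length * window_height / document_height
    scrollable_content = document_height - window_height
    scrollable_track = track_length - thumb_length
    return scrollable_content / scrollable_track


# The same 600 px window and track, with documents of increasing length.
for document_height in (1_200, 6_000, 60_000):
    ratio = content_per_thumb_pixel(document_height, window_height=600, track_length=600)
    print(f"{document_height:>6} px document: {ratio:.0f} px of content per px of drag")
```

Under these assumptions, the same one-pixel drag moves the content fifty times further in the longest document than in the shortest, which is the unpredictability described above.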
Natural scrolling
The issue, however, is not solely that today’s users are forced to use the window moves and text moves models interchangeably. On handheld touch devices we move text, but on computers there seems to be some confusion about what the ‘right’ way of scrolling is.
In 2011 Apple reversed the default scroll method in its new Mac OS X Lion system to the text moves model. The company named the feature ‘Natural scrolling’ and wanted to bridge the gap between iOS and the Mac. Suddenly all new MacBooks, Magic Trackpads and Magic Mouses shipped with ‘natural scrolling’. By letting users scroll the same way on their iPhones and their MacBooks, Apple made its stance clear: text moves is the future.
A problem is that a lot of Mac users have reversed the default touchpad or mouse scroll settings and are still scrolling by the window moves model. Mac forums on the web are flooded with questions about how to change the scroll settings. Apparently ‘natural scrolling’ is not as natural for users as Apple initially assumed.
Note that Apple is still the only major hardware manufacturer that ships its products with text moves scrolling as the default setting. A lot of computer, tablet and mobile users work cross-platform. It is probably fair to assume that a lot of users occasionally encounter hardware from manufacturers other than those of their own devices. Who has not used a friend’s, family member’s or colleague’s computer, only to get caught off guard by its scroll settings? That is certainly a frustrating and uncomfortable sensation, but until all users agree upon one scroll model, situations like these will remain a reality.
Scroll conventions
The fact that a lot of Mac users have taken the relatively drastic measure of actually changing a default setting on their computer is interesting. It is circumstantial evidence that text moves scrolling in non-touch interfaces is still problematic, even though (or because?) users use this scrolling method regularly on their smartphones and tablets.
If the lack of a widely recognised standard way of scrolling is a problem, would a convention really solve the problem? One of Nielsen’s (1995) famous usability heuristics is called ‘consistency and standards’. With this, he claims that ‘users should not have to wonder whether different words, situations, or actions mean the same thing’ and that designers should ‘follow platform conventions’.
Then, we could ask: Is a common convention really what is needed? Could we live with the fact that users just need to accept different scrolling models in different use situations, for different hardware manufacturers or different personal preferences? Surely, that is the situation today. Don Norman argues in favour of a standard. He also raises an interesting point when he asks: ‘When the touch technology moves to other vendors, what will the result be?’ (Norman and Wadia 2013)
A scroll affordance?
In The design of everyday things, Don Norman (2013) explains the concept of affordances as objects’ embedded clues about how they are intended to be used. A door handle affords pulling, and a flat plate or a horizontal bar affords pushing. Designers use affordances, along with constraints, to help users achieve their goals. The keyboard keys afford pressing, the mouse wheel affords rolling and the touchpad and touch screen afford touching.
In a paper presented at the 2013 Society for Information Display conference, Norman and Wadia (2013) state:
Affordances are more important for physical controls (such as a handle or knob) than for touch devices: with a touch device, the main affordance is that of touchability.
This is an interesting point, which may also explain why scrolling feels so different on touch screen devices than with other controllers.
However, none of the objects mentioned affords scrolling. Scrolling, whether it follows the window or text model, is the effect of the user’s actions. The direction the user expects depends on the user’s mental model. A mental model is the user’s belief about how the system at hand works. How the mental model was formed may be difficult for the user to express, but it could be based on past experiences, visual cues within the system, or some other ‘rationality’ that backs up the understanding of the system. Nielsen (2010a) explains that ‘individual users each have their own mental model. A mental model is internal to each user’s brain, and different users might construct different mental models of the same user interface.’
An interface’s use of metaphors heavily influences the users’ mental models of the system. The laptop’s touchpad is quite abstract and does not reveal much about how it functions. The user has to create the mental model himself, and it does not always reflect the designer’s intended model.
People don’t need to know all the details of how a complex mechanism actually works in order to use it, so they create a cognitive shorthand of explaining it, one that is powerful enough to cover their interactions with it, but that doesn’t necessarily reflect its actual inner mechanics. (Cooper 2005:28)
In general, you could say that the more closely the system’s actual process matches the user’s mental model, the easier the system is for him to understand, and thus to use.
Poor mapping
Why is it that touch-based interfaces have only one indisputable way of scrolling, whilst for other input methods we cannot agree upon a standard? What makes the touch screen special? The single most significant difference is that, when using a touch-based interface, users interact directly with the content. Don Norman discusses this concept and famously names it natural mapping:
Natural mappings are those where the relationship between the controls and the object to be controlled is obvious. Depending upon circumstances, natural mappings will employ spatial cues. (Norman 2013:115)
A mouse and its on-screen cursor are an example of good natural mapping. The movements of the mouse and of the cursor are closely related in space, connected through both speed and direction. The same goes for the touchpad.
Scrolling on a touchpad, however, is a more complicated story. The reason is that there is no clear link between the ‘scroll gesture’ and scrolling itself. Cooper (2005:242) argues that this connection is essential for good usability:
Mapping describes the relationship between a control, the thing it affects, and the intended result. Poor mapping is evident when a control does not relate visually or symbolically with the object it affects. Poor mapping requires users to stop and think about the relationship, breaking flow. Poor mapping of controls to functions increases the cognitive load for users and can result in potentially serious user errors.
Another factor to consider is that the touchpad is horizontal, while the monitor it controls is virtually perpendicular to it. The brain needs to translate horizontal movements into vertical ones in order to predict the result of scroll movements on the touchpad surface. The mapping is poor. This may be a decisive reason why users find it hard to change from one scroll model to another. The scroll movement is stored in the user’s muscle memory, and he has a clear mental model of how the system works.
Keep the illusion
Steve Krug (2006:11) states in Don’t make me think that all interfaces should be self-evident, obvious and self-explanatory, and that users ‘should be able to “get it” – what it is and how to use it – without expending any effort thinking about it’.
The physical distance between the ‘scroll surface’ – the control – and the content it affects is key. ‘Natural scrolling’ makes more sense on a touchpad that sits relatively close to the screen surface than on a separate mouse, because the touchpad can be imagined as an extension of the displayed image down onto the horizontal touchpad area. When using Apple’s Magic Mouse and Magic Trackpad, the relationship between the control and the affected object is loose. It is bad mapping.
Users in favour of the moving text model have few problems with this, however, because they have accepted the concept of paper being pushed and pulled through gestures. Their mental model of how the system works is in sync with the scrolling model applied.
To get such a user situation to work, the designer needs to choose an interaction design convention, and the user has to mentally accept this model. Norman (2012) argues that ‘in all cases, every view is correct. It all depends upon what you think is moving.’
If the user believes that the ‘universe’ in which the interface exists is real, then they should have no problems navigating within it. Taking the Harry Potter universe as an example, it is possible to suggest that users are willing to accept mental models that defy what they already believe.
The audience finds Harry in a world where magic is real, and they accept the concept of superpowers and witchcraft without batting an eyelid. Everything fits together in perfect harmony, because the context in which these highly unrealistic actions take place is carefully and convincingly narrated. The same goes for user interfaces. Whenever the visual metaphors and cues are persuasive, and the user is comfortable with the interface’s affordances and constraints, the interaction between the human and the system becomes seamless. However, if the user encounters something that breaks this illusion, and thus their mental model, the whole experience falls apart.
References
- Cooper, A. (2004). The inmates are running the asylum: Why high-tech products drive us crazy and how to restore the sanity. Indianapolis, IN: Sams Publishing
- Cooper, A., Robert Reimann and David Cronin (2007). About face 3. Indianapolis, IN: Wiley Publishing
- Jones, R.S. (1987). ‘Conference looks at hypermedia’. http://www.flickr.com/photos/formforce/4662660914/in/photostream/
- Krug, S. (2006). Don’t make me think: A common sense approach to web usability, 2nd ed. Berkeley, CA: New Riders Publishing
- Nielsen, J. (1995). ‘10 usability heuristics for user interface design’. Nielsen Norman Group. Published 1 January 1995. Accessed 9 December 2014. http://www.nngroup.com/articles/ten-usability-heuristics/
- Nielsen, J. (1997). ‘Be Succinct! (Writing for the Web)’. Nielsen Norman Group. Published 15 March 1997. Accessed 11 December 2014. http://www.nngroup.com/articles/be-succinct-writing-for-the-web/
- Nielsen, J. (2005). ‘Scrolling and scrollbars’. Nielsen Norman Group. Published 11 July 2005. Accessed 6 December 2014. http://www.nngroup.com/articles/scrolling-and-scrollbars/
- Nielsen, J. (2010a). ‘Mental Models’. Nielsen Norman Group. Published 18 October 2010. Accessed 6 December 2014. http://www.nngroup.com/articles/mental-models/
- Nielsen, J. (2010b). ‘Scrolling and Attention’. Nielsen Norman Group. Published 22 March 2010. Accessed 10 December 2014. http://www.nngroup.com/articles/scrolling-and-attention/
- Nielsen, J. (2011). ‘Mini-IA: Structuring the Information About a Concept’. Nielsen Norman Group. Published 21 June 2011. Accessed 11 December 2014. http://www.nngroup.com/articles/mini-ia-structuring-information/
- Nielsen, J. (2012). ‘Usability 101: Introduction to usability’. Nielsen Norman Group. Published 4 January 2012. Accessed 6 December 2014. http://www.nngroup.com/articles/usability-101-introduction-to-usability/
- Nielsen, J. and Raluca Budiu (2013). Mobile usability. Berkeley, CA: New Riders
- Norman, D. (2012). ‘What Moves? Culture & Interaction Design’. Nielsen Norman Group. Published 10 July 2012. Accessed 7 December 2014. http://www.jnd.org/dn.mss/what_moves_culture_.html
- Norman, D. (2013). The design of everyday things: Revised & expanded version. New York, NY: Basic Books
- Norman, D. and Bahar Wadia (2013). ‘The Next Touch Evolution. Advancing the Consumer Experience in Other Realms: Tasks and Tough Environments’. 2013 Society for Information Display Conference. http://www.jnd.org/dn.mss/opportunities_and_ch.html