The Politics of Hypermodality

Is there a politics of multimodality as such? I have already argued that there are no media in which meanings are made purely with one semiotic resource system. A printed text also makes non-linguistic meaning visually through choice of font, page layout, etc. You can’t write without using a visual code which can also be read non-linguistically. Handwriting analysis points to indexical signs of the age, gender, emotional state, etc. of the writer, which are not coded verbally in the words written, but overcoded through a double-signification (linguistic and ….) of the material signifiers needed to write down words in the first place. Spoken language likewise, even disembodied in a sound recording or radio broadcast, consists of material signs (phones, acoustic productions) which signify not only linguistic units (words, phrases) but simultaneously can be interpreted through non-linguistic codes as signifying individual identity, state of health, emotional state, dialect community, non-native “accent”, etc. Every image can be and usually is also interpreted in terms of the semantic categories of a verbal meaning system (a language) as well as in terms of its own visual semiotics. Sounds and noises are frequently verbally named (a bang, a thud, a scream) or assigned qualities in a verbal semantic system (loud, high-pitched, screeching), and music has at least indexical meanings (classical/rock, instrumental/vocal, Western/Asian) that are not made in terms of its internal semiotic resources (see van Leeuwen 1999 for fuller discussion). Finally, verbal text and music can also be visualized in terms of images that have internal meaning relations of a non-linguistic sort.

There are no genuinely uni-modal media or events, most fundamentally because all signifiers are material, and every material object or event can be construed in relation to more than one semiotic system; it can be selectively contextualized in relation to many kinds of signs. The various pure semiotic resource systems that we distinguish culturally (language, gesture/posture, images, sound/music, actions, etc.) are idealizations that can never exhaust the possible (and conventionally accepted) meanings of any material signifier. (See Lemke 2000 for further development of this argument.)

So what we are really concerned with in seeking a politics of multimodality are the social conventions regarding the degree of importance assigned to different media and their combinations, and the conventional ways of exploiting various kinds of signs. I think there is no doubt that “logocentrism” in modern European intellectual and academic culture represents a political ideology. To privilege linguistic meaning to the point of excluding or denigrating pictorial modes of representation (cf. Fischman 2001) must have a politics; it must favor some interests or modes of social control.

The conventional argument in these cases is that images have an inherent degree of ambiguity which makes them unsuitable to precise scholarly meanings and accurate reasoning. I do not think this premise is acceptable. The ambiguity of verbal text is very high, as anyone advocating the parallel ideological claims of mathematics or scientific notation as superior to verbal text could amply demonstrate. Linguistic registers and genres have evolved specialized rhetorical and textual strategies to reduce ambiguities of certain kinds (for reference, for implication) over long historical periods, but so have many visual genres, e.g. those employed in medical and botanical illustration, in scientific data visualizations such as those we’ve seen and mentioned above, etc. Moreover, visual representations can present meanings-by-degree (the shapes of clouds and mountains, degrees of brightness, shades of color, exact relative sizes, etc.) far more precisely than can the more gross categorial distinctions of verbal language. Mathematics was largely invented, beyond simple counting, to extend the semantics of natural language to this domain of meaning-by-degree (ratios and fractions, geometric relations, quantitative co-variation, etc.; cf. Lemke, in press-b). The total amount of information in an image, and the total number of discernible contrasts which define its visual features, are certainly comparable to those of natural language. Its two- and three-dimensional affordances for organizational relationships are also superior to what the syntagms of verbal text afford. Visual-diagrammatic logics are probably superior to both verbal reasoning and mathematical logic notations on many criteria.

The underestimation of visual semiotics as a resource for meaning I take to be of a kind with the denigration of women or non-Europeans (and one can often find gender contrasts and racist prejudices which associate logical precision with males and Europeans; cf. Walkerdine 1988). What is the danger in the image? Or in other semiotics, as compared with verbal language? Does mathematics, seen as a superior refinement of natural language in just those respects in which language is already said to be superior to depiction or music, provide a clue? In scientific genres, which have long been multi-modal, one can show that precision and power (applicability) of meaning are most often achieved by combinations of modalities of presentation (usually verbal, graphical, and mathematical) and not solely by any of these in isolation. The claims made for natural language and for mathematics as autonomous systems of meaning making are specious. Mathematics is heavily dependent on the semantics of natural language in its use in practice, and frequently also indebted to insights produced through visual representations (especially earlier in its history) in the long dialogue between algebraists and geometers. Natural language itself has no autonomous semantics. The fully contextualized meanings of verbal expressions (as opposed to the bare meaning potentials of decontextualized lexical terms or syntactic forms), where they achieve maximum precision and applicability, depend very much on experiential contexts that include visual and actional relationships, which in turn become embedded in the more generic and abstract meaning potentials of verbal signs.

So it seems much more likely to me that what is accounted as “ambiguity” in visual images is nothing more than there not being a one-to-one correspondence of images to texts. If one takes language, a priori, to be the standard of precision in meaning, then anything that does not have a unique verbal reading is judged to be “ambiguous”. A high resolution image of the earth as seen by a satellite is in no way ambiguous; it is indeed more precise and accurate than any possible verbal description could be. A comparison of predicted and measured values in some experiment, shown on a data graph, is a far better basis for reasoned judgments than any verbal presentation of the comparison, or even any numerical or algebraic presentation.

So why the denigration of visual representations? My strong suspicion is that, because text and image mutually contextualize one another, influencing our interpretations of each and both together, it is the power of the image (and other semiotics) to subvert and undermine the authority of linguistic categories and categorical imperatives which is being politically suppressed by logocentrism and mono-modal purism. Language affords a low-dimensional representation of experience and the complexity of social-natural realities. It reduces matters of degree to matters of kind, frequently to dichotomizing categories (masculine/feminine, gay/straight, capitalist/communist, heroes/terrorists) through which sentiments and allegiances can be more easily manipulated. Of course visual images can also be used in this way, but they inherently afford a much greater display of complexity and “shades of grey”, whether in unedited documentary footage from a war zone or in the daily gyrations of a stock price over months or years, or those of the earth’s average temperature in a debate on global warming. When we put images and text together, their very incommensurability, the fact that they cannot both present exactly the same message, casts doubt on the monological pretensions of either, but particularly those of language.

A more balanced multimodality is potentially more politically progressive, whether in the deliberate juxtaposition of texts and images that never quite tell the same story and force us to more critical analysis than either might do alone, or in the representation of issues of “race”, gender/sexuality, social class, culture, etc. in multidimensional ways as matters of degree and possibility rather than category and constraint.

For the last two or three years on the NASA homepage [http://www.nasa.gov] there has been a small featured item (Cool NASA Websites) at the bottom, a link to information about “Women and NASA”. In January 2000 the image on this icon was a head-and-shoulders portrait photograph of a 30-something woman, smiling, with a somewhat bland if not blank expression and a markedly unfashionable “beauty parlor” hairstyle, wearing what I read as a designer sweater. This could easily be an image from a 1950s midwestern high school or college yearbook. Most of the “Cool” sites appear meant to appeal to boys and young men, and there are almost no images of individual men to be found unless you seek them out. This link is presumably meant to appeal to contemporary young women, but hardly seems gauged to do so. By July 2001, this image had been replaced by a new one, labeled “Space in my Life” and showing a group portrait of seven women: three are in NASA astronaut uniforms, two are dark-skinned, one or two others appear to be of non-European descent, and most are standing. This could easily be a picture of a space shuttle crew, if all were women. The contrast of the two images (which are of course not both visible) highlights the contradiction between the verbal intent (make NASA seem “cool” for young women to promote their interest in “Space”) and the visual semiotics of the first image. Alternatively, that image can be read as an index of the cluelessness of the page designer, the stereotypes about women among at least some at NASA (all who saw and approved or did not yank the image), or the lack of serious commitment to appealing to women, not to mention the obliviousness to diversity. (Unfortunately, while the new image is a great improvement, much of the content to which it links remains problematic, at least in my view.)

As another instance, consider the most political of the scientific items on the GSFC page analyzed above. The Interdisciplinary link displays an image of a graph of global temperature data since 1850 and a caption which asserts that “A remarkable global warming trend has become evident since 1900.” It attributes this with somewhat lower Warrantability (“It is thought to be ….”) to release of carbon dioxide and other greenhouse gases from human activity. The image graph also shows the levels of atmospheric carbon dioxide gas. But the image can be read as not showing anything very remarkable (the increase in temperature looks small compared to the year-to-year ups and downs), and there is clearly an increase in carbon dioxide over at least 50 years with no accompanying rise in temperature like that seen thereafter. I personally happen to agree with the verbal interpretation, but it was certainly contested both scientifically and politically for quite some time, and by some people still is. Seeing this visual representation invites more critical scrutiny of the text’s assertions than might be the case even if the text alone presented a verbal summary of the data.

Is there then also a politics of hypertextuality? I leave aside here the issue of the politics of the WorldWideWeb as a technological medium; clearly it has an affordance for peer-to-peer communication and publication that is potentially far more democratic than print publishing or broadcast media. It is not clear that larger economic and political forces will permit such a democratization to become genuinely significant. What is more specific to hypertext as a semiotic medium is its multi-sequential organization; its ease of affordance of linkage and multiple linkage, and the construction of traversals by users which can be invited and constrained by authors/designers but which remain in important ways unpredictable, especially over the longer textscales of extended traversals.

I believe that hypertextuality invites and affords more complex dialogical (or pseudo-dialogical) chaining of offers and demands, choices and constraints between users and designers/sites (see Aarseth 1997 and Lemke 2002 to sort out these roles and agencies) than does text which is built with the strong expectation that readers will follow a default sequence through the text, at least on textscales of several paragraphs to several pages, if not much longer scales.

As Kolb (1997) has also argued, this circumstance alters the affordances of the medium for making the traditional sorts of extended arguments (enthymemes) that are common in, say, modern academic philosophy and throughout most of the social sciences. Authors cannot count on readers staying within the grasp of their argumentation; it is harder to lead the reader rhetorically down the garden path to agreement with the author’s views. Instead, readers explore alternative pathways through a hypertext, or create their own traversals, particularly over longer text-scales. Authors may produce a consistent voice or viewpoint in all the lexias they include in a hypertext and hope that their cumulative effect will naturalize their viewpoint for the reader. Or they may inscribe their viewpoint into the organizational structure of links and pathways (e.g. subcategorization schemes). Nonetheless, authors lose the power to make some traditional kinds of monological, coercive/cogent arguments.

What the hypertext medium affords the author or designer in a positive sense is the opportunity to escape monologism altogether, not in the trivial sense of creating a pseudo-dialogue with the user, but in the more profound Bakhtinian (1935/1981) sense of including multiple social voices, giving the reader access to a field of heteroglossia, of discourse diversity and conflict. Why would authors do this? We might, for example, want to include in the same web both officially authoritative images or texts and counter-images and discrepant discourses, and link them together in such a way that the socially dominant viewpoint is constantly confronted with its Others, its usual monological voice constantly subverted by an implied dialogic opposition. We might also want to create for the user a space which affords and demands critical analysis by leaving the last word unsaid, by opening up possibilities and foregrounding alternatives and contradictions, inviting users to think for themselves … as most texts do not.

To make one point here which does depend on the peer-to-peer affordances of the technological medium of the WorldWideWeb, it is also far easier to create hypertexts in which the users add text, images, sounds, animations, videos to the web, comment on other elements, create links and pathways of their own, and more fully collaborate in producing an artifact that is materially different because of their participation, and not just semiotically re-interpreted. This was the design of the early Landow (1997) webs and has also been used very creatively by Goldman-Segall (1998), both in educational contexts.

When we combine the affordances of multimodality with those of hypertextuality, it is doubly possible to resist the monological voices of traditional genres (such as the one I am now writing). First, through the incommensurability of different semiotics and their differential affordances, particularly the opportunities for presenting “non-essential” details, relational complexity, and meaning-by-degree provided by visual media; and second, through the cross-linking of diverse viewpoints, discourses, images, etc. in hypertext, which affords the user the opportunity to make meanings that are not the implied or explicit conclusions of the author/designer.

Hypermedia genres not surprisingly begin as transpositions of familiar unimodal and print-text genres, and their special affordances only gradually become evident and are taken up and used by authors, so that new hypermedia genres evolve. The politics of hypermedia will generally be a negative one initially: devaluation of the medium and especially of features in which it diverges from the dominant genres of modernist institutions. Hypermedia will be accused of incoherence, inconsistency, ambiguity, lack of logical rigour, being a post-modern fad, etc. just to the extent that they make use of the subversive potential of the medium. The cause of critical empowerment of users, providing them with resources and means to make independent judgments and form new points of view, should make allies of hypertext authors and visual media designers, but only to the extent that both recognize that they face a common enemy: logocentric monologism. The cause will prosper only to the degree that we share a cultural value system in which the rhetoric of persuasion is seen as morally inferior to the empowerment of users to make up their own minds and to participate as peers in the building of webs and communities.

Visual communication is at its most powerful, not when it retreats into the splendid isolation of an imaginary semiotic autonomy, but when it confronts verbal language head-on and challenges its hegemony, when it takes its place as an equal (and equally often as the leading) partner in multimodal communication. The medium in which both confrontation and partnership, both subversion and empowerment, are most fully afforded today is that of hypertext. Travelling together in hypermodality, we can make meanings that will let people see and speak in new and more thoughtfully critical ways.