On August 6, 1945, the United States dropped an atomic bomb on Hiroshima, Japan. A few days later, on August 9, the United States dropped a second atomic bomb on Japan, in Nagasaki. Shortly thereafter, Japan surrendered, ending World War II.
If you have spent any time in a high school U.S. history course, chances are that this account of the end of World War II sounds familiar. Indeed, it is an archetypal representation of a sequence of events that have become recognized as the pivotal closure of years of horrendous conflict: Japan stubbornly persists, the United States military unleashes the most powerful bomb in history, and the Allies emerge victorious.
This, at least, is the standard explanation offered in high school history textbooks in the United States. Few would argue that this is objectively inaccurate, and indeed, these events did run their course. However, as a historical narrative, it also serves an important secondary function that is easily scrubbed away when we are not looking at it — as a narrative discourse that positions those events in spaces of collective memory that are composed around assumptions about why those events occurred and how we should interpret them.
History — or more precisely in this case, historiography — is a form of narrative and composed memory. It is experienced and then aggregated for later interpretation with words, images, and artifacts. But just like any other kind of narrative, our interpretation of it is heavily dependent upon the perspective from which we engage with it. Situated in peoples and places, narratives are intersubjective representations of reality that manifest in different forms, contingent upon who or what is watching them. In their manifold expression, narratives can either challenge or strengthen the boundaries of those perspectives.
We are no strangers to narratives, though they sometimes appear in places we may not notice; narratives occupy many spaces of our private and public lives. We are perhaps most familiar with those modalities that we have been taught represent the most formalized expressions of narrative, including written, oral, and historical, to name a few. And yet there is another kind of narrative that, while no less salient, demands our attention purely by way of our intimately somatic experience of it: visual narrative, or the retelling of events and stories through images, colors, and geometries that seize the eyes. In the company of the conventional images of cinema and photography, information visualization, too, is a kind of visual narrative, one in which the act of retelling is founded upon data and the grammar of that retelling is derived from points, lines, and planes.
Information visualization offers a way of building new narratives, and indeed, it appears in countless places, behaviors, and routines of our daily lives. But in its primary capacity of directing our attention to specific composed spaces around us, it is also a powerful aesthetic for representing the manifold nature of narrative expressed in other forms. By acknowledging the constructedness of knowledge and its representation, visualization invites opportunities to recognize the ways in which the experiences and identities of both user and designer alike are fundamentally implicated in the creative and interpretative acts. For this very reason, visualization can thus also operate as a medium for reconciliation, honoring the complexities of narrative when understood as an intersubjective act between designer and user. When we are unsure of whether we are observing the same thing, visualization can guide our attention to the tensions and friction between the narratives we build and assure us that the materiality of that friction is the seed of dialogue.
In this project, then, we seek to explore this role of visualization as reconciliatory medium through the lens of two specific events in history: the dropping of the atomic bombs on Hiroshima and Nagasaki, Japan. Focusing on narratives of these events that are presented in history textbooks in the United States and Japan, we wish to explore ways in which we may use visualization to represent and communicate the many retellings of these events, their parallels, and their differences.
In our analysis, we identified 11 textbooks, including 7 high school U.S. history textbooks (in English) and 4 high school Japanese history textbooks (in Japanese). In these textbooks, we selected excerpts for comparative analysis, limiting those excerpts to those describing the atomic bombs in Hiroshima and Nagasaki as well as the events immediately preceding and following the bombings. To determine scope, we started by identifying the sentences of each text that described the actual dropping of the atomic bombs and moved outwards, following the natural structure of the chapters and sections in which those sentences appeared. This resulted in a selection of excerpts that varied slightly in temporal scope but aligned along a common narrative arc, beginning with the Battles of Iwo Jima and Okinawa and ending with Japan's surrender following the atomic bombs.
We transcribed these texts from physical books and prepared them for tokenization — a process of splitting the texts out into individual words. In the process, we filtered out non-lexical items, such as English and Japanese stop words (high-frequency words such as to, and, they/she/he), to focus our analyses on only semantically significant portions of the texts. In the representation to the left, each dot represents a single word (token, or lexical item), ordered in sequence. In general, we see that the excerpts from English textbooks are longer in word count than the excerpts from Japanese textbooks.
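As a minimal sketch of this step, the following shows tokenization and stop-word filtering for an English sentence. The stop-word list and the function names are hypothetical, chosen only for illustration; the lists and tools actually used in this project are not specified here. Japanese text has no spaces between words, so it would need a morphological analyzer rather than the simple pattern used below.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for illustration).
STOP_WORDS = {"the", "a", "an", "and", "to", "of", "on", "in", "they", "she", "he"}

def tokenize(text):
    """Lowercase the text and split it into word tokens (letters, optional apostrophe)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def content_tokens(text):
    """Keep only tokens that are not stop words, preserving their original order."""
    return [tok for tok in tokenize(text) if tok not in STOP_WORDS]

sample = "The United States dropped an atomic bomb on Hiroshima."
print(content_tokens(sample))
# ['united', 'states', 'dropped', 'atomic', 'bomb', 'hiroshima']
```

Each surviving token corresponds to one dot in a representation like the one described above.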
After tokenization, we began with a simple question to characterize the lexical content of each text: what proportion of each consists of nouns, adjectives, verbs, and adverbs, and what does comparison of those proportions possibly suggest? Using tagging software, we assigned part of speech tags (noun, adjective, verb, and adverb) to each token. These are colored to the left: nouns are purple, adjectives are orange, verbs are red, and adverbs are green. Roll over individual points to see their corresponding words.
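To make the question of proportions concrete, here is a small sketch that counts tags over tokens that have already been tagged. The (word, tag) pairs and the tag names are hypothetical stand-ins for a tagger's output; no particular tagging software is assumed.

```python
from collections import Counter

# Hypothetical tagger output for one short excerpt (an assumption for illustration).
tagged = [
    ("blinding", "ADJ"), ("flash", "NOUN"), ("light", "NOUN"),
    ("followed", "VERB"), ("fireball", "NOUN"), ("towering", "VERB"),
    ("sharply", "ADV"), ("exploded", "VERB"),
]

def pos_proportions(tagged_tokens):
    """Return the share of all tokens that carry each of the four tags."""
    counts = Counter(tag for _, tag in tagged_tokens)
    total = len(tagged_tokens)
    if total == 0:
        return {tag: 0.0 for tag in ("NOUN", "ADJ", "VERB", "ADV")}
    return {tag: counts.get(tag, 0) / total for tag in ("NOUN", "ADJ", "VERB", "ADV")}

print(pos_proportions(tagged))
# {'NOUN': 0.375, 'ADJ': 0.125, 'VERB': 0.375, 'ADV': 0.125}
```

Computing these shares per text gives numbers that can be compared across excerpts of different lengths.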
Immediately, we see that the 7 U.S. textbook excerpts are much more colorful than the Japanese textbook excerpts; that is, they contain many more adjectives and adverbs in proportion to nouns and verbs. While the English texts appear to be both literally and figuratively more vibrant, the Japanese texts are comparatively sparse in their distributions of parts of speech. With this kind of representation, however, it is difficult to compare the proportions of each part of speech directly.
If we sort these words by part of speech tag, though, we can more easily compare the distribution of parts of speech within and across these texts. By this simple measure, we see first that there is a stark contrast between the U.S. and Japanese history textbooks. While the U.S. textbooks are lexically diverse, with many descriptive adjectives and adverbs, the Japanese textbooks are relatively conservative, consisting predominantly of nouns and verbs. This is confirmed by a close reading of the texts themselves – the U.S. texts are more colorful, descriptive, and generally verbose, while the Japanese texts are noun-heavy and factually objective. For example, consider the following selection from one of the U.S. textbooks:
The deadline passed, and at 8:15 A.M. on August 6, flying at 31,600 feet, a B-29 bomber named Enola Gay released the five-ton uranium bomb nicknamed Little Boy. Forty-three seconds later, as the Enola Gay turned sharply to avoid the blast, the bomb tumbled to an altitude of 1,900 feet, where it exploded as planned with the force of 20,000 tons of TNT. A blinding flash of light was followed by a fireball towering to 40,000 feet. The tail gunner on the Enola Gay described the scene: "It's like bubbling molasses down there...the mushroom is spreading out...fires are springing up everywhere...it's like a peep into hell." The shock wave, firestorm, cyclonic winds, and radioactive rain killed some 80,000 people. Dazed survivors wandered the streets, so badly burned that their skin peeled off in large strips.
Compare this with an excerpt from a Japanese text, also explaining the dropping of the bombs but in much less detail:
日本政府がポツダム宣言を黙殺すると、アメリカは、7月に実験が成功したばかりの原子爆弾を8月6日に広島に投下し、8月9日には長崎にも投下した。
When the Japanese government ignored the Potsdam Declaration, the United States dropped an atomic bomb, which had only just been successfully tested in July, on Hiroshima on August 6, and dropped another on Nagasaki on August 9.
The differences between these texts in terms of number of adjectives and adverbs suggest something deeper to explore. By tagging these texts according to parts of speech, we can begin to ask a more sophisticated question about linguistic style: how do differences in lexical diversity relate to the relative subjective or objective nature of these texts as a whole? And what constitutes a reliable measure of subjectivity? To address this question, we can employ some simple sentiment analyses that can tell us the relative polarity of words in the texts.
Those analyses yield a new representation, shown to the left. How subjective or objective is each word? How positive or negative is each word? Each word is now colored according to its relative subjectivity: more subjective and connotative words are colored more deeply red while more objective words are lighter gray. The result is a makeshift heatmap suggesting regions of subjective language, indicated by clusters of bright red points.
If we sort the words in these texts by their subjectivity scores, we can again more easily make some direct comparisons. Much like the differences in lexical diversity we saw previously, here we see differences between the U.S. and Japanese textbook selections in their usage of subjective and objective language. The English texts appear to use more subjective language when discussing this sequence of events; the Japanese texts appear to use more objective words.
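The scoring, coloring, and sorting described here can be sketched with a word-level lexicon. The lexicon entries, the score ranges, the choice to treat unknown words as fully objective, and the two endpoint colors are all assumptions made for illustration; they are not the values or tools used in this project.

```python
# Hypothetical lexicon: word -> (polarity in [-1, 1], subjectivity in [0, 1]).
LEXICON = {
    "blinding": (-0.3, 0.8),
    "badly": (-0.7, 0.7),
    "hell": (-0.8, 0.9),
}

def score(token):
    """Look up (polarity, subjectivity); unknown words are assumed objective."""
    return LEXICON.get(token, (0.0, 0.0))

def subjectivity_color(s):
    """Blend from light gray (s = 0) toward deep red (s = 1), as an RGB triple."""
    light_gray, deep_red = (211, 211, 211), (178, 0, 0)
    return tuple(round(g + (r - g) * s) for g, r in zip(light_gray, deep_red))

tokens = ["bomb", "exploded", "blinding", "flash", "hell"]

# Sort so that the most subjective words come first (ties keep their text order).
ranked = sorted(tokens, key=lambda t: score(t)[1], reverse=True)
mean_subjectivity = sum(score(t)[1] for t in tokens) / len(tokens)

print(ranked)                    # ['hell', 'blinding', 'bomb', 'exploded', 'flash']
print(subjectivity_color(0.0))   # (211, 211, 211)
print(subjectivity_color(1.0))   # (178, 0, 0)
```

A per-text mean such as `mean_subjectivity` is one simple way to summarize what the heatmap shows visually.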
Let's return our texts to their most basic tokenized representation. We have thus far described purely linguistic analyses applied to these texts, suggestive of stylistic differences between them. Beyond this, however, we can also do some thematic analyses to compare what these texts discuss and to what length. What specific events in history do these texts discuss or mention, and how much language is devoted to those discussions within and across the texts?
In this representation, each word is uniquely colored according to thematic content, yielding sections that discuss specific events or topics in history. For example, sections discussing the Battle of Okinawa are colored light green, sections discussing the Potsdam Declaration are colored peach, and sections discussing the dropping of the atomic bombs are colored purple. The same colors represent the same common themes/topics across all texts. Some colors are repeated due to the large number of tags; roll over individual sections to view their tags.
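To quantify how much language each text devotes to a theme, one option is to compute the share of words carrying each theme tag. The texts, tag names, and counts below are invented for illustration and do not reproduce the project's data.

```python
from collections import Counter

# Hypothetical per-word theme tags for two excerpts (an assumption for illustration).
texts = {
    "us_text": ["okinawa"] * 3 + ["potsdam"] * 2 + ["atomic_bombs"] * 5,
    "jp_text": ["potsdam"] * 1 + ["atomic_bombs"] * 2 + ["surrender"] * 1,
}

def theme_shares(tags):
    """Return the fraction of a text's words assigned to each theme."""
    counts = Counter(tags)
    return {theme: n / len(tags) for theme, n in counts.items()}

def texts_covering(theme, tagged_texts):
    """List the texts in which at least one word carries the given theme tag."""
    return sorted(name for name, tags in tagged_texts.items() if theme in tags)

print(theme_shares(texts["us_text"]))
# {'okinawa': 0.3, 'potsdam': 0.2, 'atomic_bombs': 0.5}
print(texts_covering("okinawa", texts))
# ['us_text']
```

Shares rather than raw counts make long and short excerpts comparable.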
Let's take a closer look at some of these themes specifically. First, let's compare these texts' discussion of the actual dropping of the atomic bombs. Unsurprisingly, we see that all texts, both U.S. and Japanese, discuss the atomic bombs themselves. This is something we expect, as the atomic bombs represent a pivotal moment in World War II history. However, what is more surprising is the different degree and length to which these texts discuss the atomic bombs. While some texts, particularly the Japanese ones, limit their discussion to single sentences simply noting that the bombs were dropped, others discuss these events in great depth, adding significant detail about the bombings' physical characteristics and immediate toll.
By exploring other tags and themes, we begin to notice some differences in historical scope and coverage across these texts. Here, we compare discussions about U.S. strategy around using the atomic bombs, i.e., U.S. military strategy about how and whether to detonate the bombs in a display of force against Japan. We see that only U.S. texts have content about these discussions, occupying a significant chunk of space for almost all of these texts. Meanwhile, the Japanese texts do not cover this detail.
Conversely, if we restrict our theme tags to words discussing the firebombing of Japanese cities, including the Great Tokyo Air Raid (東京大空襲, March 9-10, 1945), we see that only a handful of U.S. texts discuss this, in contrast to a significant proportion of the Japanese texts.
The tags we have explored here offer a purely analytical perspective on these texts, namely a description of what content they cover. But we can use this approach to ask questions of greater importance, regarding the larger narratives of these texts. For example, how do these excerpts contextualize the dropping of the atomic bombs in intellectual and moral terms? How are these discussions composed? To consider this, let's examine one more theme in the texts: discussion and analysis on why the atomic bombs were dropped on Hiroshima and Nagasaki and whether or not they were justified.
This discussion of justification appears only in the U.S. texts considered here, and of those, only in a handful. Furthermore, while half of the texts in which this discussion is present offer significant coverage of this topic, the remainder appear to devote only a couple of sentences to it. What does this suggest about the historiographical claims being made by these different texts? In other words, how does a discussion of justification, or lack thereof, shift the narrative of these traumatic moments in human history? At a fundamental level, this presence or lack of discussion either opens or closes the door to recognizing multiplicities of perspective. When objective events in history are contextualized in a larger normative assessment, we are invited to question the authority of those voices which dictate our interpretations of significant events in collective historical memory. In the absence of that contextualization, singular perspectives are perpetuated, and orthogonal discourses which might open the way to productive new interpretations are banished.
This form of thematic analysis offers a crude but effective way of understanding parallels and differences in scope across the texts. With it, we can answer some basic questions about how these narratives are constructed, including parallels in content among the texts or instances of content that are present in some but omitted in others.
Folded among these parallels and differences, however, is a bigger question that we are ultimately chasing in performing this analysis. In our comparison of differences in how these narratives are constructed, we are really beginning to ask its inverse: what does a "canonical" narrative of the dropping of the atomic bombs look like? What forms the core of that narrative, and to what degree do the individual narratives of these texts align with the canonical one?
We can borrow one technique from an entirely different domain of study to address this question: sequence alignment, commonly employed in biological DNA analysis. Instead of DNA sequences, however, we can use sequences of theme tags to determine if particular tags are conserved across the texts. We begin by collapsing the theme tags in each text into an ordered series of topics covered, disregarding the number of words and sentences allocated to each topic. Each unique topic is assigned a unique color, as shown to the left. Roll over individual boxes to see each tag.
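The collapsing step amounts to merging consecutive words that share a theme tag into a single entry. A minimal sketch, with hypothetical tag names:

```python
from itertools import groupby

def collapse(tags):
    """Merge runs of identical consecutive tags into one entry per run."""
    return [tag for tag, _run in groupby(tags)]

# Hypothetical per-word theme tags for one excerpt (an assumption for illustration).
word_tags = ["okinawa", "okinawa", "potsdam",
             "atomic_bombs", "atomic_bombs", "surrender"]

print(collapse(word_tags))
# ['okinawa', 'potsdam', 'atomic_bombs', 'surrender']
```

Note that only adjacent repeats are merged: a topic that a text returns to later appears again in the sequence, which preserves the ordering information the alignment relies on.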
Then, we find the maximum possible alignment across all texts, inserting gaps in places where topics are discussed in some texts but not others. In texts where the same topics are discussed but in a slightly differing order, those orderings are preserved using the same color scheme that applies a unique color to each unique tag.
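As an illustration of the idea, the sketch below aligns just two topic sequences with a standard global alignment (Needleman-Wunsch). This is a simplification: the analysis described here aligns all texts at once, and the specific method used is not stated. The scoring values are assumptions, with the mismatch penalty set high enough that two different topics are never placed in the same column, so a gap is inserted instead. The topic names are hypothetical.

```python
GAP = None  # marker for a position where a text does not discuss the topic

def align(a, b, match=2, mismatch=-3, gap=-1):
    """Globally align two topic sequences; returns two equal-length lists."""
    def sub(x, y):
        return match if x == y else mismatch

    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append(GAP); i -= 1
        else:
            out_a.append(GAP); out_b.append(b[j - 1]); j -= 1
    return out_a[::-1], out_b[::-1]

us = ["okinawa", "manhattan_project", "potsdam", "atomic_bombs", "surrender"]
jp = ["okinawa", "cabinet_change", "potsdam", "atomic_bombs", "soviet_entry", "surrender"]

for left, right in zip(*align(us, jp)):
    print(f"{left or '-':20}{right or '-'}")
```

Columns where both sides show the same topic are the conserved ones; columns with a gap mark topics discussed in only one of the two texts.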
We now begin to see global patterns across all the texts in their historiographical scope, beyond the ordering of topics within individual texts. Notably, we see that two topics are discussed across all texts, whether from the U.S. or Japan: the actual dropping of the atomic bombs and Japan's subsequent surrender. From here, there are a handful of other topics that appear in nearly all texts, including the Soviet Union's entering the war against Japan immediately after the bombing of Hiroshima as well as commentary on the end of World War II.
As before, however, we also see differences between the U.S. and Japanese texts in content scope. For example, while many of the U.S. texts discuss both the Manhattan Project and justification of the atomic bombings, these discussions are notably absent in the Japanese texts. Likewise, some topics are unique to the Japanese texts, such as the changeover in the Japanese Cabinet (内閣) preceding the bombings.
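Once each text is reduced to a sequence of topics, the shared core and the topics unique to each group can be read off with set operations. The sequences below are invented to mirror the pattern described above; they are not the project's data, and the topic and text names are hypothetical.

```python
# Hypothetical topic sequences per text (an assumption for illustration).
sequences = {
    "us_a": ["okinawa", "manhattan_project", "potsdam", "atomic_bombs",
             "justification", "surrender"],
    "us_b": ["manhattan_project", "atomic_bombs", "soviet_entry", "surrender"],
    "jp_a": ["okinawa", "cabinet_change", "potsdam", "atomic_bombs",
             "soviet_entry", "surrender"],
    "jp_b": ["firebombing", "atomic_bombs", "surrender"],
}
country = {"us_a": "US", "us_b": "US", "jp_a": "JP", "jp_b": "JP"}

def topics_of(group):
    """Union of all topics mentioned by any text in the given group."""
    return set().union(*(set(seq) for name, seq in sequences.items()
                         if country[name] == group))

# Topics present in every text form the candidate core of the narrative.
core = set.intersection(*(set(seq) for seq in sequences.values()))
only_us = topics_of("US") - topics_of("JP")
only_jp = topics_of("JP") - topics_of("US")

print(sorted(core))     # ['atomic_bombs', 'surrender']
print(sorted(only_us))  # ['justification', 'manhattan_project']
print(sorted(only_jp))  # ['cabinet_change', 'firebombing']
```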
With this alignment, we can begin to clearly see that certain facets are core to a canonical narrative about the bombings of Hiroshima and Nagasaki. At a minimum, these include the specific bombing events themselves and Japan's subsequent surrender, followed by closely co-occurring topics like the Soviet Union entering the war against Japan. Beyond these, there are smaller satellite topics that are distanced further from the core. We can represent this canonical narrative structure with a network and position the content of the texts around the nodes of that network. Here, each dot represents a single word from one of the texts, colored light red if from a Japanese text and blue if from a U.S. text. Utilizing our subjectivity and objectivity scores from earlier, dots more tightly confined to the nodes of the network represent more objective language while dots further out from those centers indicate more subjective language.
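One simple way to realize this placement rule is to put each word on a circle around its topic node, with a radius that grows with the word's subjectivity score. The radii, the linear mapping, and the random angle are assumptions for illustration; the layout actually used for the network is not described here.

```python
import math
import random

def place_word(node_xy, subjectivity, rng, min_radius=2.0, max_radius=40.0):
    """Return (x, y) for a word: objective words sit near the node, subjective ones farther out."""
    radius = min_radius + subjectivity * (max_radius - min_radius)
    angle = rng.uniform(0.0, 2.0 * math.pi)  # direction around the node is arbitrary
    x0, y0 = node_xy
    return (x0 + radius * math.cos(angle), y0 + radius * math.sin(angle))

rng = random.Random(0)  # seeded so the layout is reproducible
node = (100.0, 100.0)   # hypothetical position of one topic node

objective_dot = place_word(node, 0.0, rng)
subjective_dot = place_word(node, 1.0, rng)

print(math.dist(node, objective_dot))   # approximately 2.0
print(math.dist(node, subjective_dot))  # approximately 40.0
```

Because only the distance encodes subjectivity, the angle can be chosen freely, for example to reduce overlap between dots.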
Coloring these dots by text source language illustrates an emergent spatial stratification of these texts by adherence to the "canonical narrative." While the English texts appear to be dispersed more widely around many nodes, both central and satellite, the Japanese texts appear more closely contained around the central nodes with only a couple of satellite nodes in the mix. As a whole, we could say that the U.S. texts appear to adhere less strongly to the canonical narrative (they are more entropic) while the Japanese texts are strongly canonical.
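The term "entropic" is used informally here. If we wanted a number to accompany it, one possible choice (an assumption, not a measure computed in this project) is the Shannon entropy of how a text's words are distributed across topic nodes: a text concentrated on a few nodes scores low, and a text spread evenly over many nodes scores high. The tag names below are hypothetical.

```python
import math
from collections import Counter

def topic_entropy(tags):
    """Shannon entropy, in bits, of the distribution of words across topics."""
    if not tags:
        return 0.0
    total = len(tags)
    return -sum((n / total) * math.log2(n / total) for n in Counter(tags).values())

# Hypothetical per-word topic tags for two texts of equal length.
concentrated = ["atomic_bombs"] * 6 + ["surrender"] * 2
dispersed = ["okinawa", "okinawa", "potsdam", "potsdam",
             "atomic_bombs", "atomic_bombs", "surrender", "surrender"]

print(topic_entropy(concentrated) < topic_entropy(dispersed))  # True
print(topic_entropy(dispersed))                                # 2.0
```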
We can see how specific texts are distributed around this network of canonicity by highlighting them one at a time. Here we see the distribution of one English text, mostly confined to the interior nodes of the network. We might describe this text as adhering fairly tightly to the canonical narrative.
In contrast, this distribution, from another English text, is highly dispersed, both in the nodes of the canonical network around which it is spread and in the spatial distribution of individual words around those nodes. We might describe this particular text as both thematically and semantically entropic, deviating more from a canonical narrative.
Similar observations may be made about the Japanese texts. Here, one Japanese text is distributed fairly closely to the center of the network, suggesting strong adherence to the center of a canonical narrative.
Meanwhile, this Japanese text is slightly more dispersed, spreading out to satellite nodes. Still, it appears to adhere fairly tightly to the canonical narrative.
While these kinds of analyses are suggestive of important differences between U.S. and Japanese textbook depictions of these events in history, we must be careful in drawing any sweeping conclusions. However, one thing we can safely note from these representations is that the difference in lexical diversity between these two groups of texts, English and Japanese, may be a proxy for the cultural and political environment in which they are composed. For example, while Japanese textbooks undergo a rigorous process of review and editing before they are approved for national curricula, U.S. textbooks have somewhat more artistic freedom in their composition. Thus, it may not be particularly surprising that the Japanese textbook depictions examined here are more terse and less descriptive than their U.S. counterparts, given that they must fit within more stringent page count and stylistic requirements.
Regardless of the explanation, though, we must remember that textbooks do play a significant role in how these narratives (and many others like them) are first introduced to students. They are integrated into one of the most important systems of socialization — the education system — and thus can be a formative force in how we engage with historical memory. But with media that broaden our interpretation of them, such as visualization used here, we can challenge ourselves and others to be more critical of the authority to which they lay claim.
When it comes to the events of August 6 and 9, 1945, these narratives are not isolated to textbooks. They are anchored in people, places, and memory. They are rooted in the physical bodies of hibakusha (被爆者), surviving victims of the atomic bombs' effects; they are concentrated in the physical artifacts of buildings decimated by the explosions. Textbooks offer one medium of carrying those narratives forward through history, but they are incomplete.
Reconciliation is possible when we recognize the spaces of friction and tension between the narratives we construct. Sometimes we may be looking at the same thing — such as traumatic events in human history — but unaware of it because our lenses compose and organize our experiences of it in seemingly incompatible ways.
By recognizing the ways that our narratives overlap and diverge, we validate the experiences of ourselves and others. And when our understanding is built upon an ethic of validation, we learn to reconcile.
Future Directions
In this small prototype, we have conducted analyses and explored visualization approaches on a limited selection of texts. We would like to expand our data to many additional texts from both the United States and Japan, considering several questions along the way: How much space do the accounts of Hiroshima and Nagasaki occupy in the textbooks overall? How do textbooks and their narratives change across time and editions? What cultural, social, and political forces come into play in those dynamics? How can visualization be used to reveal trends in the rhetoric of these narratives over time?
Importantly, we can involve both U.S. and Japanese citizens in this research to encourage reconciliatory dialogue. By crowdsourcing data collection, asking people from both countries to find and share excerpts from textbooks, we can build spaces for both countries to discuss the differences in narratives composed in both. Beyond textbooks, we may begin to look at additional media of narrative as well, including popular nonfiction literature, websites, and museum material.
Acknowledgments
This work was presented at the Japan Association for Digital Humanities Conference 2017 in Kyoto, Japan (September 11-12, 2017). This project is partially inspired by similar work done previously through the Divided Memories and Reconciliation project at Stanford University. Textbook excerpts were collected by Steven Geofrey Braun (Boston, MA, USA) and Kelsey Menninga (Tokyo, Japan). Data analysis and these visualizations were created by Steven Geofrey.