A quick note: This post will likely be edited and refined. I’m still attempting to find a better way to share my work in Voyant Tools on here but, despite the embedding function working when I preview my custom HTML block, it never seems to function in the same way once I preview the page as a whole. Anyway, if anyone has any tips or tricks as to how to successfully integrate interactive Voyant Tools’ embeddable tools, I’d greatly appreciate it.
Having spent last semester in the second portion of this course clumsily employing Python in a frantic attempt to introduce text analysis, machine learning, and word embedding to the philosophy of Bernard Stiegler (specifically his Nanjing Lectures, The Neganthropocene, the unpublished Technics & Time, 4, and the collective publication Bifurcate: There is No Alternative), I was thrilled to approach text analysis with a singular focus on the “frontend” hermeneutic experience of experimenting with a text rather than having to build out the backend to support my basic ability to do so. After waffling between a few of the provided examples of open-source text analysis applications, receiving unsolvable error codes from the JSTOR Labs Text Analyzer and SameDiff notifications suggesting that my corpus was too large when trying to evaluate the cosine similarities present between Walter Benjamin’s The Arcades Project and David Harvey’s Paris, Capital of Modernity (weird, right?), I eventually landed on the wildly intuitive Voyant Tools, largely because it functions as suggested and allowed me a wide range of investigative potential without trudging through troubleshooting to get there.
As a result of a recently whipped up blog post for my “Doing Things with Novels” course discussing Benjamin’s Arcades as an ideal subject for digital hypertext projects similar to that of Joyce’s Ulysses (despite each hosting their own digital graveyards of DH projects), I proceeded to explore Voyant Tools via Benjamin’s text, primarily due to its size (1073 pages) and scope of subject matter (its convolutes ranging from the iron industry and Parisian fashion to early street lighting and Marx). After converting my PDF of The Arcades Project to a .txt file, I, ever-avoidant of my Pythonic script employing NLTK that is intended to do this very thing, scrounged around for an open-source application that allows for the removal of stop words, dabbling with sites such as Tools.FromDev before I realized that such a function could be done (with ease) in Voyant Tools. Though I had to edit the stoplist to include the French stopwords that appear throughout the piece (and initially dominated the Cirrus) such as le, la, and du, I eventually cleaned the text in such a way that its word cloud (used as somewhat of an indicator of cleanliness at this point) showed ‘Paris (3856),’ ‘Baudelaire (1319),’ and ‘time (803)’ as Benjamin’s most frequently used terms rather than ‘les.’
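For anyone curious what that cleaning step amounts to outside of Voyant Tools, here is a minimal sketch in plain Python. It is not the NLTK script I keep avoiding, and it is not how Voyant does things internally. The two stoplists are tiny stand-ins I made up for illustration, and the function name top_terms is hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stoplists (assumptions for this sketch); a real run would
# use much fuller lists, such as the ones Voyant Tools or NLTK provide.
ENGLISH_STOPWORDS = {"the", "of", "and", "a", "in", "to", "is", "that", "it", "as"}
FRENCH_STOPWORDS = {"le", "la", "les", "du", "de", "des", "un", "une", "et", "en"}


def top_terms(text, stopwords, n=10):
    """Return the n most frequent lowercase word tokens that are not stopwords."""
    # Match runs of letters only (accented letters included); digits and
    # punctuation act as separators, so "l'art" becomes "l" and "art".
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    kept = [token for token in tokens if token not in stopwords]
    return Counter(kept).most_common(n)


sample = "Les passages de Paris: the arcades of Paris and the time of Baudelaire in Paris."
print(top_terms(sample, ENGLISH_STOPWORDS | FRENCH_STOPWORDS, n=3))
```

Merging the English and French sets mirrors what I did by hand when editing the stoplist: without the French entries, words like ‘les’ and ‘de’ would crowd out everything else.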
Somewhat disappointingly, a numerically indexed list of most-used terms illuminated little about the text and surprised me even less. Beyond providing a cute and colorful arrangement of a book’s most salient and central components, I fail to really understand what “analysis” could be conducted based on a word cloud. Having read The Arcades Project, the Cirrus above did little more than to confirm that a book about Paris, Baudelaire, and the experience of time amidst 19th-century capitalism was (surprise, surprise) precisely about those very things. The TermsBerry, though certainly offering more interaction and interpretation, provided similarly unremarkable results. Rather than determining the collocate frequency of obvious words such as ‘Paris,’ I chose to explore the connections that one of my favorite Benjaminian terms, phantasmagoria, has within the text, primarily due to a curiosity about the concept’s vague and only partially formulated application within the similarly incomplete Arcades Project (Benjamin died before this “theatre of all his struggles and all his ideas” could be fully achieved). Despite Benjamin’s usage of this term being varied, pervasive, and at times lucid (as a savoring of false consciousness for the bourgeoisie, as the spectacle operating as replacement for reality, in the manifestations of and our experience with products amidst commodity culture), the phantas* TermsBerry offered few connections (besides maybe ‘Time’ and ‘Marx’) that would grant any novel insight into the application of this concept that reading the text (or scholarly analyses of the text) wouldn’t work to more effectively provide.
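To make the idea of a collocate concrete, here is a rough sketch of counting the words that fall within a few tokens of any word beginning with ‘phantas’. This is only my approximation of the general technique, not Voyant’s actual TermsBerry logic; the function name collocates, the window size, and the sample sentence are all assumptions for illustration.

```python
import re
from collections import Counter


def collocates(text, prefix, window=5, stopwords=frozenset()):
    """Count words within `window` tokens of any token starting with `prefix`."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if not token.startswith(prefix):
            continue
        # Look at up to `window` tokens on each side of the keyword.
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            neighbor = tokens[j]
            if j == i or neighbor in stopwords or neighbor.startswith(prefix):
                continue
            counts[neighbor] += 1
    return counts


# Hypothetical sample sentence, not a quotation from either book.
sample = "The phantasmagoria of capitalist culture; Marx saw the phantasmagoric form of the commodity."
print(collocates(sample, "phantas", window=2, stopwords={"the", "of"}).most_common())
```

A small window keeps the collocates close to the keyword; widening it pulls in more terms but, as I found, also more words that merely happen to sit in a nearby sentence.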
Inspired by my comparative attempt made earlier in SameDiff, I thought these two tools (Cirrus & TermsBerry) might prove to be more useful when analyzing the use of terms across two distinct texts. As an ideal model for comparative analysis with The Arcades Project, David Harvey’s Paris, Capital of Modernity is a similarly Marxist analysis of Haussmann-era Paris in the 19th century, employing similar cultural and political figures, objects, and phenomena: the poet-apostle of modernity, Baudelaire, the spatial forms of the Arcades, the time-space compression mentioned in my last blog, along with the transformations in public life via consumerism and the spectacle. In Harvey’s own words, his aim is, “…quite different from Benjamin’s. It is to reconstruct, as best I can, how Second Empire Paris worked, how capital and modernity came together in a particular place and time, and how social relations and political imaginations were animated by this encounter…” (2003, p. 18). All of this considered, the differences between these two theoretical frameworks begin to reveal themselves even through something as simple as a Cirrus word cloud. As one can see, Paris still remains the dominant term but words such as ‘workers,’ ‘labor,’ ‘class,’ and ‘capital’ have risen through the ranks, illuminating Harvey’s more orthodox-Marxist analysis.
Whereas phantasmagoria had a distant assortment of useless connections under Benjamin’s TermsBerry, Harvey’s use of phantasmagoria, when the TermsBerry is expanded to include 250 terms, only reveals two collocated connections: empire & capital. Given phantasmagoria’s limited usage in Harvey’s text, we can return to the text to see exactly how these connections might have occurred:
(18-19): “Benjamin also insists that we do not merely live in a material world but that our imaginations, our dreams, our conceptions, and our representations mediate that materiality in powerful ways; hence his fascination with spectacle, representations, and phantasmagoria.”
(109): “The phantasmagoria of universal capitalist culture and its space relations incorporated in the Universal Exposition blinded even him to the significance and power of loyalties to and identifications with place.”
In each instance, ‘empire’ is located in a nearby sentence but is not directly related to the meaning of the concept. While ‘capital’ makes a certain amount of sense, it is still not enough to determine the core components of a concept purely through the development of a TermsBerry. However, it is interesting that, despite Benjamin being a primary developer of phantasmagoria as a concept, this TermsBerry exercise suggests that Harvey’s text offers a more concise and understandable instance of the concept’s application.
It was around this time that I realized that I could combine the two texts in the same Voyant Tools workspace. Hold your applause. ‘Paris’ and ‘Baudelaire’ still reigned supreme but Harvey’s addition to Benjamin’s tome brought the aforementioned labor-centric terms (‘work,’ ‘workers,’ and ‘class’) into the Cirrus. More than anything, this development provided me with the fun ability to compare the usage of concepts between two texts, as seen below.
For example, while each thinker employs ‘Paris’ to a similar degree, the distinction in their approaches can be seen in Benjamin’s far greater invocation of art in The Arcades Project via his usage of ‘Baudelaire’ and in Harvey’s historical-materialist approach via his heightened focus on ‘city.’
This can be made clearer through the trend graph above. From this, we can assume that Harvey’s application of Marxist theory far outweighs Benjamin’s usage of such theoretical frameworks and terminology. Thus far, this comparative tool presents the most scholarly potential in analyzing the ideological, political, and theoretical underpinnings of texts. Additionally, the graph below offers insight into the stylistic methodology each thinker takes in approaching similar subjects – Benjamin, ever the flirt with terminology that invokes a sense of mysticism, is seen using such language (‘dream,’ ‘awaken,’ ‘phantasmagoria,’ ‘reality,’ and ‘enchantment’) to a far greater degree, illuminating again the material approach of Harvey and the uniquely Benjaminian style of analysis that exists in The Arcades Project.
As a last little tidbit, another interesting application of the Trends tool comes through the ability to view a text’s relative frequency of a term’s usage throughout the document’s segments. For example, if I wanted to understand, address, and analyze how both Benjamin and Harvey approached their application of Baudelaire (or how they constructed an argument using Baudelaire as a central element), I could look to the line graphs below to view the rate at which Baudelaire was mentioned and where in the text might be relevant to my investigations of the French poet. I imagine that this could be employed in a multitude of ways and the very basic functionality of this feature represented here barely scratches the surface.
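The idea behind that kind of line graph can be sketched in a few lines of Python: split the text into roughly equal segments and, for each one, divide the number of times the term appears by the number of words in the segment. This is a simplified stand-in under my own assumptions (the function name segment_relative_frequencies is hypothetical, and Voyant may segment and scale its values differently).

```python
import re


def segment_relative_frequencies(text, term, segments=10):
    """Relative frequency of `term` in each of up to `segments` equal chunks."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    if not tokens or segments < 1:
        return []
    # Ceiling division, so every token lands in a chunk; short texts can
    # therefore yield fewer chunks than requested.
    size = -(-len(tokens) // segments)
    target = term.lower()
    frequencies = []
    for start in range(0, len(tokens), size):
        chunk = tokens[start:start + size]
        frequencies.append(chunk.count(target) / len(chunk))
    return frequencies


# Hypothetical toy "document" of eight words, split into two segments.
toy = "Baudelaire Paris Paris Baudelaire Baudelaire Paris Paris Paris"
print(segment_relative_frequencies(toy, "Baudelaire", segments=2))
```

Plotting the returned list for each book, one line per text, would give a crude version of the same picture: where in the document the mentions of a term cluster and where they thin out.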
Though I feel compelled to conclude this reflection for the sake of the reader, I come to this conclusion shortly after actually realizing the potential of this tool. Having initially approached this blog critical of the insipid simplicity of a word cloud (and haunted by my semi-successful text analysis project last semester), through my further exploration of Voyant Tools (literally as I wrote this) I came to recognize the potential it offers in the comparative analysis of texts. While working with one text presents many yeah-I-already-know-thats, Voyant Tools’ open-source gift of instantaneous intertextual analysis feels like something I could not only dabble with for hours but also utilize in developing critical approaches to arguments and analyses in the future. In short, I’m excited and this tool is cool.
If I’m able, I’ll attach my Voyant Tools workspace with both The Arcades Project and Paris, Capital of Modernity here.
If that doesn’t work, click below to download Benjamin’s and Harvey’s work. You can easily upload each into Voyant Tools and enthusiastically peruse the endless possibilities that come with the text analysis of two irrelevant texts elucidating how one city briefly functioned two hundred years ago. You’re welcome.
Super helpful to see the evolution of this project and where you landed. I’d say I share some of your suspicions about the approach, but am, like you, open to better understanding the details and usefulness of the tools in practice. It is interesting to me how the project took off a bit once you chose to compare two texts instead of focusing on the one. The question raised in the readings, of just how many texts an investigation needs to arrive at well-informed insights, seemed to be demonstrated in your choice. That might’ve been from a data-only perspective, but in your project I learned a lot about two authors I have not read (except that article in our shared class) and had a window into alternative views of a similar period. It gave me historical context and a rough sketch to help orient me, all from graphs and visualizations independent of the texts. I’m sure there is plenty to mine from a single text too (was going to try my hand at that), but this project really sparked to life with the added perspectives.