Text Analysis with Voyant

For this assignment, I had several novels in mind to use in Voyant.  With each novel I had certain impressions about repeating words.  For The Great Gatsby and The Bluest Eye, I had noticed the use of colors and wanted to find out how often they were used and whether the two novels were similar.  I had the impression that colors were used a lot in both novels, specifically blue and yellow. My impression of The Day of the Locust was somewhat similar, but the colors mentioned in that book were different, bordering on the metallic.

I analyzed each book individually and was surprised by the results.  I expected to see a lot of usage for colors in each novel.  However, that was not the case.  The results were quite different than what I expected. Colors barely showed up on the visualization part of the program.  
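To check an impression like this outside Voyant, the color counts can be approximated with a few lines of Python. This is only a minimal sketch: the list of color terms and the sample sentence are my own placeholders, and Voyant's counts may differ because of its tokenization and stopword settings.

```python
import re
from collections import Counter

# Hypothetical list of color terms; adjust it to the novel being studied.
COLOR_TERMS = {"blue", "yellow", "green", "white", "gold", "silver", "gray", "grey"}

def color_frequencies(text, color_terms=COLOR_TERMS):
    """Return (counts of each color term, total number of words), case-insensitive."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w in color_terms)
    return counts, len(words)

# Placeholder sentence, not a quotation from any of the novels.
sample = "The blue lawn, the yellow car, and the blue night."
counts, total = color_frequencies(sample)
for term, n in counts.most_common():
    print(f"{term}: {n} of {total} words")
```

Running the same function on the full text of each novel would show whether blue and yellow really are as frequent as they feel while reading.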

Great Gatsby

Although the novel is very visually descriptive, the result from Voyant showed a different picture than the one I had expected to see.  ‘Gatsby’ and ‘Daisy’ showed up, but ‘Tom’ showed up more than ‘Daisy’ and, surprisingly, ‘house’ showed up even more than those.  Also surprising was the number of times ‘eyes’ was used.  Almost all of this was new information.

https://voyant-tools.org/?corpus=1f6377e12f4415d8302d8b4b5e0f94d6

Bluest Eye

I had a similar experience with this novel.  Whereas I was expecting to see many colors, only the color in the title was prominent.  What was surprising was ‘Cholly’, ‘don’t’ and ‘know’ showing up as much as they did.

https://voyant-tools.org/?corpus=5341dcf69e2b4eb3fd9b99c964f55855

Day of the Locust

At this point, I lowered my expectations, since the images I had in mind and what Voyant visualized had been so different.  What was surprising here was to see ‘Earle’, whom I had thought of as a somewhat minor character, show up many times.

https://voyant-tools.org/?corpus=03ddb1d009689725c1ccbaa69ef0943e

Using this tool gave me a useful visual way to view a novel and expanded how I analyze it.  Seeing which words or characters are most prominent adds another component to how I interpret a novel. Voyant is another tool to add to the toolbox.

Zotero Workshop

I just finished a Zotero tutorial with one of the librarians and would like to share what I have learned.

Zotero is an open-source reference management application, with a browser add-on, for managing bibliographic data and sources.  It was originally developed at George Mason University and is now managed by the non-profit Corporation for Digital Scholarship.  It gives users a free cloud account with 300 MB of storage, with larger storage plans available for a small fee.  The program offers many helpful tools that make researching, organizing, citing, and creating bibliographic information easier and much less time-consuming.

Organizing

The interface is relatively simple and straightforward.  There are different ways to organize one’s resources:

  • The user can create Collections (folders) and subcollections.  Keeping the sources for each project together in the same collection is easy.
  • Tags can be added to each source or piece of material.  This makes sorting or searching for items easy.
  • Related references located in different places in the library can be linked.
  • Notes can be added to sources or materials.  These work like a sticky note or Post-it and are called child notes.

There are four ways to add sources or materials (e.g. screenshots) in Zotero.

  1. One way is to manually enter the information: author, title, publisher, etc.
  2. A user can just drag a PDF into a collection and the program will automatically fill in the information.
  3. Entering an Identifier (PMID, ISBN, etc.) is another option.
  4. Using a browser plug-in will also work.

All of these will activate Zotero to fill in the bibliographical information of the source. One very helpful feature is that if a user right-clicks on a source in the collection and the source is Open Access, Zotero will retrieve the PDF and add it to the collection.

Citing

The program will cite sources in the body of the writing as well as organize a bibliography for the work. Again, there are several ways that a user can do this, either in the program itself or as part of a writing program.

  1. A very helpful and time-saving feature is generating a source page (Bibliography, References, Works Cited, etc.) for one’s work. From the program, one can right-click on an option and a very large selection of styles will appear.  The usual styles (Chicago, MLA, APA, etc.) are there along with a large host of others.  One can fine-tune a style, add one’s own or retrieve another.  This makes it very adaptable to one’s needs.
  2. Adding an in-text citation while writing is another feature. Again, one can choose the style to use for the citation.  Zotero will automatically add the citation and/or footnote, or whatever is correct in that style.

A user can add Zotero to a word processing program such as LibreOffice Writer, Microsoft Word or another program.  When that is done, a Zotero icon is added to that program and the same citations can be done with the click of a button.

Groups

Another feature in the program is the ability to create a group library.  This is helpful for group projects where the group can input and share sources.  The individual setting up the group can set parameters for what members can do. The group can store, add, edit, or search sources.

The workshop was extremely helpful in learning the program and its time-saving features.  I also left with many other helpful tidbits of information, such as the HathiTrust database, LibreOffice Writer (an open-source alternative to MS Word) and other nuggets.  I’m sure anyone writing in academia or similar professions will benefit from this program, and it’s good to know that if an individual subscribes for extra storage, the fee benefits a non-profit rather than a big tech behemoth.

Response to readings 11/30

As K-12 and undergraduate programs continue to incorporate immersive, augmented and virtual reality technologies into lesson plans with ever greater frequency, educators seek, with ever greater rigor and exactitude, to formally assess the quality and efficacy of these tools: from defining a virtual tool’s learning outcomes, to measuring its success in achieving those outcomes, to gauging student perceptions of immersion and usefulness. Hutson and Olsen (2022) and Makransky and Mayer (2022) demonstrate the variety of questions asked, and conceptual categories employed, in assessing augmented reality for education. Drawing theoretical principles from multimedia design, instructional design, and the cognitive-affective model of immersive learning, Makransky and Mayer highlight the importance of conceptually distinguishing immersion (the concreteness and thoroughness of detail with which a virtual world is constructed, the experiential limits of its horizons defined by its creators) from perception, a student’s subjective sense of being submerged in that virtual world, with limited external distraction. Makransky and Mayer hypothesized that a group of middle school students taking a 360-degree, headset-enabled virtual trip to Greenland to learn about climate change would not only perceive a greater degree of immersion, interest and enjoyment compared to a group of their peers experiencing the same content in standard 2D video format, but would also perform better on immediate and later tests as a result of these higher levels of perception, interest and enjoyment. Their hypotheses validated, the authors reached the following conclusion based on their observations: “It appears that enjoyment and interest are involved in learning but in different ways. Enjoyment directly mediate[d] the results on the immediate posttest but not the delayed posttest. Alternatively, interest directly mediate[d] the results on the delayed posttest but [did not mediate] the immediate posttest” (p. 1787).

Workshops summary.

It is an interesting requirement to attend workshops outside class hours; I did not encounter this type of assignment at all during my undergraduate years. I understand that in this ever-shifting job market, where skill sets need constant updating, such workshops matter. For me they were a breath of fresh air and gave me the chance to experiment and explore. Experimentation and openness to new ideas are hallmarks of digital humanities. I want to preface this by saying that I am trained mostly in computer science, but history has always been my favorite subject and a passion of mine. These interests guided which workshops I chose to pursue.

I attended the Python workshop on Friday, October 30, 2022, and it was interesting because Python as a language is different from Java, another popular programming language. I was mostly trained in Java, and Python was new for me. Even though Java and Python share many similarities, they differ in how each executes programs. I like the simplicity and cleanliness of Python, and it is no wonder that it is gaining more traction within the dev community. I look forward to taking classes on the Python programming language in the future.

My wife is not yet a citizen of this country, and I have been taking classes at our local library on how to process her citizenship application. I do not want to pay thousands of dollars for a lawyer if I can do it myself, and besides, it is a good civics lesson on how a person becomes a citizen; I do not remember the process myself, since my parents took care of everything when I was a teenager. The citizenship process has become more digital and at times less accessible for older people, and especially for those who are digitally illiterate.

I will also be taking mapping classes on Dec 1, 2022, as I am completely new to map design and mapping software and I really want to explore that crucial aspect of digital humanities. I have been playing with Twine lately, and I am surprised that open-source tools like that exist. It is my first semester and pretty much everything is new to me.

¡Qué viva el Bronx! + Late Map Praxis + Intro Workshop to QGIS

StoryMaps project: Bronx Untitled (2022)

Candles and Oil at La Familia Markets, Norwood, Bronx, 2022.

Inspired by the maps and reflections the class made, and as I was finally able to go to a map workshop, I returned to and partially completed my own praxis assignment in StoryMaps about the Bronx. I attempted an affective mapping of my working-class, culturally rich and diverse neighborhood, plus a larger curated view of the borough, plus a reflection on borders and shapes. It’s still under construction. I would love to continue the project and work with the community, mostly with the owners of the shops, and ask for their permission to take pictures and include them in the map. Maybe even make maps with them, if they have time, mainly following the methodology of “Talking Maps” developed by Víctor Daniel Bonilla, a Colombian author who constructed many maps with the Nasa community to create visual representations of their struggle during the 1960s, as they continuously had to fight the Colombian government for sovereignty over their land. Their maps mainly tell the story of Quintín Lame, one of Colombia’s most important indigenous leaders and activists. The class we had on maps also made me remember how much indigenous communities have resisted attempts at mapping, or at least maps created to control them by pinpointing their territories, maps used as a tool to continue our endless history of exploitation and extractivism. But the work of Víctor Daniel Bonilla was done with and for the community. The results are different, the process is different, the intention is different: it is about memory, and also about who gets to define your narrative.

Mapas Parlantes, Víctor Daniel Bonilla, “Talking Maps” from 45 salón nacional de artistas

The photos I’ve included in the StoryMap “Bronx Untitled” were mainly taken by me (with verbal consent from the store owners), but some were also from other people’s blogs. I wanted to share a few things/questions that might resonate with those of us/you who want to do more maps. To situate the affective mapping theoretically, I started reading “Affective Mapping: Melancholia and the Politics of Modernism” (Flatley, 2008). If anyone has reading recommendations, please let me know! I found it useful to think alongside Yarimar Bonilla & Max Hantel’s (2016) piece on sovereignty. I also think an affective map could be a useful technique against the bad representation or imaginary that is out there about a borough as huge and diverse as The Bronx. Then, thanks to the workshop in QGIS, I was able to step a bit out of my framework and think with geographical concepts like coordinates (x,y) of latitude and longitude, vectors (collections of those points), features (the main information unit), attributes (information about a feature), layers and .shp files. I’ve got much, much to learn and, most importantly, to think differently. But I found the workshop truly useful for a true beginner like me; I recommend it. Weirdly, while I was using the program, at some point it asked me if I would give it permission to record my screen. Does anyone know why this happens?

Why do they want to record my screen!?
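For anyone else new to the vocabulary from the QGIS workshop (coordinates, features, attributes, layers), here is a minimal sketch in plain Python of how those concepts nest inside each other. It does not use QGIS or its API; the class names are hypothetical and the coordinates are illustrative placeholders, not surveyed data.

```python
from dataclasses import dataclass, field

@dataclass
class Feature:
    """The main information unit: a geometry plus its attributes."""
    geometry: list                                   # list of (x, y) = (longitude, latitude) points
    attributes: dict = field(default_factory=dict)   # information about the feature

@dataclass
class Layer:
    """A named collection of features, e.g. all Zip Code borders."""
    name: str
    features: list = field(default_factory=list)

    def bounding_box(self):
        """Return (min_x, min_y, max_x, max_y) over every point in the layer."""
        xs = [x for f in self.features for x, _ in f.geometry]
        ys = [y for f in self.features for _, y in f.geometry]
        return (min(xs), min(ys), max(xs), max(ys))

# Placeholder rectangle standing in for a Zip Code border (made-up coordinates).
zip_layer = Layer("zip_codes")
zip_layer.features.append(Feature(
    geometry=[(-73.88, 40.87), (-73.86, 40.87), (-73.86, 40.89), (-73.88, 40.89)],
    attributes={"zip": "10467"},
))
print(zip_layer.bounding_box())
```

In a real project the geometry and attributes would be read from a shapefile rather than typed by hand, but the nesting of points inside features inside layers is the same idea.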

Anyways, in the process of creating the affective map I started to “see” certain things about my neighborhood that were new to me, like the idea that a lake that was part of that region before 1888 still influences how we humans locate ourselves and organize our activities in space. Also, there are some “vacant” spaces here and there in the neighborhood. I wonder if this is neglected territory or a strategy for gentrification that increases prices on already overcrowded apartments. But also, when thinking about space, one should not say that something is “vacant”: in those spaces there was so much life, birds, vegetation, raccoons, etc. The fact that a space does not have humans doesn’t mean it is empty. In any case, I was also “surprised” by the lack of information on these lots.

The lack of information is so political…

Like some of you, I also wondered how I could represent something in space that escapes x,y coordinates. I made a sketch of how I sometimes interpret my neighborhood’s space: as a collision of unseen cultural borders. The neighborhood has a high proportion of Hispanic residents, but there are also many people from Bangladesh, and it was so interesting to me how my aunts and mom would not really shop at Halal stores even though they sold basically the same vegetables as the Mexican store and were literally next to each other! That always makes me wonder whether we carry invisible borders within our own existence, borders that mentally arrange space in a very specific way. It’s not so much that I can’t go to the Halal store, but there’s something informing my “hispanic” experience that keeps me from going, from choosing it over the more familiar-looking store. And yet, there were moments when the neighborhood flourished in unique ways: whenever there was a special religious event, the colors of the street changed, the beautiful dresses inundated the streets with greens, blues, purples and golds, and the tinkling sounds of jewelry came and went alongside laughter. Anyways, how can you express that in a map? And why? I remember someone in class saying that those maps (or perhaps some data visualizations) get to the point of not being useful but just being a piece of art….

Bogota’s border on top of Dhaka’s border on top of Zip code border of Norwood, The Bronx. All borders gravitating towards the center of Williamsbridge park, that once was a lake.
Screenshot of QGIS work (done thanks to the digital fellow’s workshop). Blue for the street roads, purple for the Zip Code borders, with the Zip Code numbers labeled.

Something interesting that emerges from these two maps is how the neighborhood’s central park (once a lake) might or might not be connected to the larger flora and fauna of Van Cortlandt, which gives Norwood its northern limit (according to official standards; in reality it feels more like an extension of the barrio) but is part of the 10467 Zip Code. Maybe, by analyzing this space through a less human-centered view, one might be able to connect these two rich and important bodies of life, vegetation and mental health relief. Van Cortlandt, in turn, connects to other parks that eventually lead to the Old Croton Aqueduct. Water is so important, maybe one of the most important layers for, yes, geographical thought but also for cultural thought.

Cien años de soledad en el parque – One hundred years of solitude at Van Cortlandt Park, pandemic times, María F. Buitrago, 2020.

If you want to talk about the Bronx HIT ME UP!

—Happy day & Thanks.

Connected Pedagogy: Untangling Literary Allusions

Like some of my classmates, I wondered about which references younger college students might either get or miss in “The Reverse Peephole” — I think Maria’s recreation of the experience of connecting to AOL and Estefany’s link to the Seinfeld scene that gives the introduction its title will both be useful to students who might have been born after both.

I know that many educators at the K12 level have been rethinking the idea of a literary canon and changing their reading lists to correct the gaps, omissions, and biases in the traditional high school curriculum. It’s an exciting development; that said, I’ve wondered if first-year students of the new canon/non-canon might have trouble (at least at first) recognizing references that authors educated on the traditional canon would expect most readers to understand. In just this introduction, for example, Scott references a number of books, historical figures, and characters with little or no explanation:

  • Don Quixote (1605)
  • René Descartes (1596–1650)
  • La Princesse de Clèves (1678)
  • Robinson Crusoe (1719)
  • The Odyssey
  • The Lion, the Witch, and the Wardrobe (1950)
  • The Biblical story of Lazarus
  • Poe’s “The Tell-Tale Heart”

This is in addition to the allusions he does explain. If you know what these works are generally understood to signify (e.g., what it means to “tilt at windmills”), references like these are almost invisible; if you don’t, they can be distracting or confusing, and leave readers with the choice of pausing repeatedly to look up minor details, guessing at what the comparison might mean, or just ignoring the parts they don’t understand (and hoping they’re not meaningful).

In truth, you don’t need to read the books Scott references to get his allusions; you just need to have heard them referenced enough, gotten a good summary, or (if you grew up in the 90s like I did) watched an educational TV show about a Jack Russell terrier reenacting the classics. I didn’t think providing Wishbone links for all of Scott’s references would be appropriate or very helpful, so I chose to write a brief gloss for two of the references that seem to be doing some heavy lifting.

Here are my annotations:

Original text:
At the same time, the various and non-stop opportunities for communication are notable for highlighting our isolation, and it’s perhaps this intensity of digital communicability that brings mythic proportions to mind. When the Olympian postman Hermes goes in search of Odysseus during the latter’s long confinement on Calypso’s island, he looks for him in a cave, but ‘Of Odysseus there was no sign, since he sat wretched as ever on the shore, troubling his heart with tears and sighs and grief. There he could gaze out over the rolling waves, with streaming eyes.’ I think of this weeping Odysseus sometimes, when I’m waiting with indecorous zeal for an email or a text, or when I catch myself peering into the rolling blue of Facebook, unable to remember for whom or what I’m looking. I see his yearning in miniature, in the five seconds it takes for someone to bring a phone from their pocket and put it back again.

These are ship-in-a-bottle feelings, which life can accommodate. The otherwise cheerful and productive of us have cheerful, productive lives amid digital longings and desolations. But it is certainly true that invoking the messenger god is one of the constitutive practices of our times.

Work referenced: The Odyssey

My annotation:
In Homer’s “Odyssey,” various gods interfere with and aid the warrior Odysseus in his journey home from the Trojan War. The journey takes him ten years; for seven of those, he’s held captive by the nymph Calypso. In the passage Scott refers to, Hermes has actually come to persuade Calypso to free Odysseus, who is mourning and longing for his wife and son at home.

Scott compares Odysseus’ “tears and sighs and grief” for his home to a feeling of yearning for digital contact. He juxtaposes two situations with very different stakes: a father and husband longing to return home vs. a Facebook user waiting for a notification.

Is this comparison overly dramatic, and if so, why do you think Scott chose to make it? What effect does this comparison have on the tone of the essay, and on the arguments the author is making? For example, how does this passage change your overall understanding of this introduction if Scott is being self-deprecating and ironic? Or what if his comparison of these situations is serious and earnest?

Original text:
It has long been the word on the street that, if you dabble in other realities, then you shouldn’t expect to remain unchanged. Lazarus was never his old self again.

Work referenced: The Bible

My annotation:
Like the reference above to the Odyssey, this is another brief reference to a very old text. In the Bible, Lazarus has been dead for four days when Jesus brings him back to life.

Like the earlier comparison of Odysseus’ grief to the social media user’s FOMO, this comparison is dramatic: literally being brought back to life vs. “dabbl[ing] in other realities” of the internet and the digital world. Arguably, Scott is using Lazarus as a metaphor for just how transformative our experience in the digital world can be; even if we aren’t reborn as individuals, the changes to the experience of being human might qualify as a rebirth of sorts.

Besides this reference and the one to the Odyssey, there are several other old literary references in this reading, including the one that gives “The Fourth Dimension” its title. Why might the author choose to build his argument about new ways of being using references to such old works? Does it have an effect on how accessible his argument is? If so, what?

In addition to clarifying the context of the allusions, I wanted to bring up two questions that these passages raised for me:

First, how does Scott want us to read his extravagant comparisons between extreme situations (Odysseus stranded for years on an island, far from those he loves) and the everyday experience of using the internet and existing in a digitally-mediated world? My first impulse was to think he’s being arch, maybe a little self-deprecating — of course waiting for a new email (probably spam!) is not the same as longing to sail home. But the more I reflected, the more I wondered if I was missing his argument based on my own biases. I’m curious about what other readers (especially younger ones who might have a different relationship to technology) might think.

Second, what’s the rhetorical effect of making so many allusions to such old works, anyway? Again, is this a bit of irony? An attempt to show the profundity of the digital world and our relationship to it? A response to an anticipated argument that new and old media can’t coexist in our understanding of what it means to be human? Or maybe it’s just Scott’s attempt to prop up his own intellectual bona fides? I’d like to read the entire book to get a clearer answer — but again, I’m curious about what the students will think.

Connected Pedagogy Assignment 

I found this assignment of combining theory and praxis interesting but unfortunately this post is coming a bit delayed because of sickness in my home.

I think it is a relevant discussion to have within DH: how social annotation can help facilitate the learning experience for students. My own experience with social annotation comes only from my time at the Graduate Center, and it was interesting to dig deeper into a more theoretical perspective. The reading points to several interesting aspects of annotation that are worth reflecting upon. One annotation will not change the whole learning outcome for the students, but I have chosen to focus my annotation on creating collaborative co-construction of knowledge and on at least trying to address some of the power issues raised by Brown and Croft. They describe how critical social annotation can undermine norms around knowledge authority. To put some of the theory into practice, my annotation asks the students to work out together how to approach one of the claims in the reading. I found it relevant how Roopika Risam points out that immersing students in knowledge production gives them experience with deconstructing the political formations of knowledge. It would take more than one annotation, but it is a relevant goal to work towards.

I have picked this part of the reading to annotate:

But an arguably more interesting phenomenon is the voluntary reversal, for now we endorse and facilitate all sorts of peepholes into our domestic interiors. It is perhaps during our drowsy meanders of the deep night, alone in the glowing dark, that we most often find ourselves, through social media’s chain of associations, in a kitchen full of strangers, caught in a moment of togetherness. One could rightly argue that these views are stage-managed, a show to be enjoyed, the opposite of an ambush. And yet there’s always an excess that can’t be controlled, knowledge that slips around the sides of the spotlight. This is a new vision of our homes, with windows opening onto faraway rooms, and lights shining out into remote darknesses.

Annotation: “This is a new vision of our homes, with windows opening onto faraway rooms, and lights shining out into remote darknesses.” What does this sentence mean to you? I’m personally not an expert on social media and have used it only to a limited extent in the last few years, but I understand the author to be claiming that we show more of ourselves today than we did earlier. How can we address this claim? What questions can we ask to explore it? I suggest that it can be helpful to ask what our own experiences with social media have been, in sharing, viewing and interacting. What parts of the reading did you find interesting, and what do you wish to know more about?

Reflecting on Engagements with the Four-Dimensional

In “The Four-Dimensional Human: Ways of Being in the Digital World,” my annotations focused more on guiding students towards connecting their own personal experiences to those they are reading on the page. Oftentimes, I struggle with readings that deal with more abstract or foreign terms that can cause confusion. In those instances where I could envision myself drifting away from the reading, I left annotated questions for students to reflect on how the reading is related to their own lives. My hope is that students can find themselves engaging with readings on a more personal level, in a way that builds their understanding and empowers them to share their unique perspectives with others.

Quote: “It was postulated in many ways: as ether, as the unconscious, as a duration in time, or as time itself. But most popularly it was a space into which one might travel, a world that could be reached if only the right conduit or portal could be found. The prospect of discovering this dimension was so appetising that it belonged to everyone.”
Annotation: In what ways might you interact with the fourth dimension described here? How has this dimension changed over time, and how has it remained the same? Who might you be able to talk to that can share their experiences and feelings of voyaging into this dimension?
Expected impact: This annotation is found early in the reading, and is intended to start building a foundation for students to insert themselves into their reading. The idea is that students will feel encouraged to continue to ask themselves similar questions that immerse them in exciting ways.

Quote: “So the modems gave the sense of a journey. Through certain designated portals we could move into a specific way of being that felt like entering a new territory.”
Annotation: What modern-day modems do you feel like you interact with, similar to those described in the passage thus far?
Expected impact: Personally, I didn’t know what a modem was until recently. It can be frustrating when unfamiliar terms are introduced. Thus, the idea behind this question was for students to build their own definition of what a modem is, as it relates to how it’s described in the quote, before they look up what it might be themselves. The question serves to shift focus towards the exploratory portal experience of what a modem does (which is more relevant in the reading), rather than the technical aspects of a modem.

Quote: “A truth and cliche of digital life is that our comeliest meals occur both on our table and in the pockets and on the desks of our international 4D colleagues, a meal to be both eaten and approved of.”
Annotation: Think about the phrase: “Instagram has to eat”. How are your own experiences with food (or otherwise) simultaneously experienced, curated, and consumed?
Expected impact: The question I posed is meant to encourage students to reflect on their own experiences with documenting meals on social media. In this reflection, I hope students are able to think more critically about their relationship to social media moving forward.

Quote: “With walls not being what they once were, the home itself has become four-dimensional, with new ground plans to match its digital environment.”
Annotation: How has your home become associated with the space in which you experience other digital spaces?
Expected impact: The annotated question invites students to bring their own experiences into juxtaposition with the reading. In doing so, students are guided towards a deeper understanding of the four-dimensional that centers their experiences.

Quote: “Our portals to the fourth dimension have been wedged open, and there it is, spread out across the everyday, indeed nestled inside the everyday, causing it to ripple and bend.”
Annotation: How has growing up on the internet impacted your perception of self, and of community? What are some moments where you have felt deeply entrenched and touched by the fourth dimension, both in ordinary and unsettling ways?
Expected impact: The final annotation aims to contextualize the “rippl[ing] and bend[ing]” of everyday life described in the quote. The words “ripple” and “bend” evoke strong visuals, and in posing this question alongside the imagery, I hope students are able to think about both the tangible and intangible ways the fourth dimension of the internet shapes them.

Paradoxes of living in a Four-Dimensional body

When asked what surprised him most about humanity, the Dalai Lama said:

“Man.
Because he sacrifices his health in order to make money.
Then he sacrifices money to recuperate his health.
And then he is so anxious about the future that he does not enjoy the present;
the result being that he does not live in the present or the future;
he lives as if he is never going to die, and then dies having never really lived.”

I was reminded of the Dalai Lama’s quote when reading and annotating this passage.

How does time pass in this dimension? What dreams begin to prey on a four-dimensional mind? What are the paradoxes and ironies of owning a four-dimensional body, with its marvellous new musculature?

Last week, my daughter, the pups and I took a road trip to visit my parents, who live in Indian Land, SC (yes there actually is a town named that, smh). We drove halfway, spent the night in the Shenandoah Valley area and arrived at my parents’ the following day for a 48-hour visit. The trip was on a whim, and in the car my daughter and I laughed about why we were making such a long trip for such a short visit. My daughter made the comment that she has not seen her only living grandparents in a year and a half, but it didn’t seem that long because of FaceTime. Her statement struck me as I thought of my childhood and how almost every weekend my family got in the car and drove from NJ to Long Island to visit my grandparents. FaceTime did not exist; my 3 sisters and I had to fight over the yellow wall phone, which was primarily used to make plans or gossip with friends. I wonder, if we were transported back in time, whether I would be driving my 3 daughters on a more regular basis to stay connected with family and friends.

Below is my annotation to the passage:

The 3 questions that the author asks are worth reflecting on. Choose one question and share your response. I like to think about the paradoxes of technology. An example is how I use social media and technology to stay connected with friends and family. The convenience of technologies like FaceTime and the ability to keep up with others online have left me feeling disconnected and not fully present in their lives because there is less frequent physical interaction.

The paradoxes of being a fourth-dimensional human could fill this page. I hope reflecting on the author’s questions brings awareness of our reality regarding technology to the readers. For further exploration of technology paradoxes, check out this blog:

A blog written by Stephen Petrina at UBC titled the 10 paradoxes of technology

Humanities and the Web Post-Workshop Informational!

I recently had the opportunity to attend the Internet Archive’s workshop on web archiving in Los Angeles. Firstly, before I get into a rundown of what we learned, I just wanna say it was AWESOME. I met some great people there, including fellow DHers! This was their first workshop, and there will be more forthcoming, so don’t be sad if you missed this one! Please feel free to ask me questions in the comments or in an email if you want more information or clarification on any of these points. I’d recommend reading this post chronologically, as it goes in order from basics to advanced topics. Now onto the good stuff…

What is the Internet Archive?

  • Contains primarily, though not exclusively, 20th-21st century records of human interaction across all possible mediums (newspapers to fiction to gov. info to art etc).
  • Constant change and capture.
  • Every country in the world included.
  • Fit for both macro and micro level research questions.
  • Fit to archive anything from hundreds to millions of documents.
  • Known for the Wayback Machine, which takes snapshots of websites at different points in time and shows you those snapshots, along with information about when they were taken.
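
If you'd rather ask the Wayback Machine about snapshots from a script than from the browser, here is a minimal Python sketch built around the Internet Archive's public availability endpoint (archive.org/wayback/available). The helper names and the sample response are my own illustrations, and nothing here touches the network: it only builds the query URL and reads a response of the shape that endpoint returns.

```python
import json
from urllib.parse import urlencode

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url, timestamp=None):
    """Build the query URL asking for the snapshot closest to `timestamp` (YYYYMMDD...)."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return (snapshot_url, timestamp) from a JSON response, or None if nothing is archived."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


# Illustrative response only; the values are made up for this sketch.
SAMPLE = (
    '{"archived_snapshots": {"closest": {"available": true, '
    '"url": "http://web.archive.org/web/20130919044612/http://example.com/", '
    '"timestamp": "20130919044612", "status": "200"}}}'
)

print(availability_query("example.com", "20130101"))
print(closest_snapshot(SAMPLE))
```

To use it for real, you would fetch the URL from `availability_query` with whatever HTTP client you like and pass the response text to `closest_snapshot`.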

What is a web archive?

  • A web archive is a collection of archived URLs that contains as much original web content as possible, documents change over time, and attempts to preserve the experience a user would’ve had of the site on the day it was archived.

Challenges of web archiving

  • Trying to hit the balance between access to billions of bits of information and actual usability of that information.
  • Content is relational and self-describing.
  • Difficult to subset relevant collections, and to store and compute on all of it.
  • So many methods and tools to choose from.

Glossary

  • Crawler – software that collects data from the web.
  • Seed – a unique item within the archive.
  • Seed URL – the starting point/access point for the crawler.
  • Document – here meaning any file with a unique URL.
  • Scope – how much stuff the crawler will collect.
  • WARC – file type for downloaded archived websites.
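
To see how the glossary terms fit together, here is a toy crawler in Python. It is only a sketch under heavy assumptions: the "web" is a hard-coded dictionary of made-up pages, the scope rule (stay on the seed URL's host) is just one possible choice, and a real crawler would fetch pages over the network and write them to WARC files instead of keeping a list.

```python
from collections import deque
from urllib.parse import urlparse

# A pretend web: each URL maps to the links found on that page (all hypothetical).
FAKE_WEB = {
    "https://example.org/": ["https://example.org/about", "https://other.example/"],
    "https://example.org/about": ["https://example.org/", "https://example.org/team"],
    "https://example.org/team": [],
    "https://other.example/": ["https://other.example/more"],
}


def in_scope(url, seed_url):
    """Scope rule for this sketch: stay on the same host as the seed URL."""
    return urlparse(url).netloc == urlparse(seed_url).netloc


def crawl(seed_url, web, max_documents=100):
    """Breadth-first crawl from the seed URL, returning in-scope documents in visit order."""
    seen = {seed_url}
    queue = deque([seed_url])
    documents = []
    while queue and len(documents) < max_documents:
        url = queue.popleft()
        documents.append(url)  # a real crawler would save this capture to a WARC file
        for link in web.get(url, []):
            if link not in seen and in_scope(link, seed_url):
                seen.add(link)
                queue.append(link)
    return documents


print(crawl("https://example.org/", FAKE_WEB))
```

Here the seed URL is the starting point, each visited URL counts as a document, and `in_scope` plus `max_documents` together play the role of scope.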

Examples of steps to archive from a project on COVID response in the Niagara Falls region

  • Close reading with Solrwayback – search the collection and examine individual items.
  • Distant reading with Google Colab – sentiment analysis, summary statistics, data visualization.
  • Data subsetting with ARCH – full-text dataset extraction from the Internet Archive’s collections.
  • As an outcome, helped the City of Niagara Falls formulate a better FAQ for common questions they weren’t answering.

Other methods

  • Web scraping – creating a program that takes data from websites directly.
  • Topic modeling – assess recurring concepts at scale (understanding word strings together to create a topic).
  • Network analysis – computationally assessing URL linking patterns to determine relationships between websites.
  • Image visualization – extracting thousands of images and grouping them by features.
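
To make the network analysis idea a bit more concrete, here is a small Python sketch that turns page-to-page links into weighted domain-to-domain edges. The link pairs are invented for illustration, and counting cross-domain links is just one simple way to look at linking patterns, not the method shown at the workshop.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical (source page, link target) pairs, as might be extracted from a crawl.
LINKS = [
    ("https://a.example/post1", "https://b.example/home"),
    ("https://a.example/post2", "https://b.example/news"),
    ("https://a.example/post2", "https://c.example/"),
    ("https://b.example/home", "https://a.example/post1"),
    ("https://a.example/post1", "https://a.example/post2"),
]


def domain_edges(links):
    """Count links between distinct domains, ignoring links within one site."""
    edges = Counter()
    for source, target in links:
        src, dst = urlparse(source).netloc, urlparse(target).netloc
        if src != dst:
            edges[(src, dst)] += 1
    return edges


for (src, dst), weight in domain_edges(LINKS).most_common():
    print(f"{src} -> {dst}: {weight}")
```

The resulting edge list is the kind of thing you could load into a graph tool to see which sites cluster together.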

Web archiving tools

  • Conifer (Rhizome)
  • Webrecorder
  • DocNow
  • Web Curator Tool
  • NetArchive suite
  • HTTrack
  • Wayback Machine – access tool for viewing pages, surf web as it was.
  • Archive-It
  • WARC – ISO standard for storing web archives.
  • Heritrix – web crawler that captures web pages and creates WARC files.
  • Brozzler – web crawler plus browser-based recording technology.
  • ElasticSearch & SOLR – full-text search indexing & metadata search engine software.            
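
For a feel of what sits inside a WARC file, here is a rough Python sketch that pulls apart a single uncompressed record: a version line, some named header fields, a blank line, and then a content block whose size is given by Content-Length. The sample record is made up and trimmed (real records carry more fields, such as a record ID, and files are usually gzip-compressed), so in practice a dedicated WARC library is the safer route.

```python
def parse_warc_record(record_bytes):
    """Split one uncompressed WARC record into (version, headers dict, content block)."""
    head, _, rest = record_bytes.partition(b"\r\n\r\n")
    lines = head.decode("utf-8").split("\r\n")
    version = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")  # split at the first colon only
        headers[name.strip()] = value.strip()
    length = int(headers["Content-Length"])
    return version, headers, rest[:length]


# A made-up, simplified record for illustration only.
SAMPLE_RECORD = (
    b"WARC/1.0\r\n"
    b"WARC-Type: response\r\n"
    b"WARC-Target-URI: https://example.org/\r\n"
    b"WARC-Date: 2022-10-01T12:00:00Z\r\n"
    b"Content-Length: 12\r\n"
    b"\r\n"
    b"hello, world"
    b"\r\n\r\n"
)

version, headers, block = parse_warc_record(SAMPLE_RECORD)
print(version, headers["WARC-Type"], headers["WARC-Target-URI"], block)
```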

Intro to Web Archiving

  • The average web page only lasts ~90-100 days before changing, moving, or disappearing.
  • Often used to document subject areas or events, capture and preserve web history as mandated, take one-time snapshots, and support research use.

Particular challenges

  • Social media is always changing policies, UI, and content.
  • Dynamic content, stuff that changes a lot.
  • Databases and forms that require user interaction; alternatives include sitemaps or direct links to content.
  • Password protected and paywalled content.
  • Archive-It can only crawl the public web, unless you have your own credentials.
  • Some sites, like Facebook, explicitly block crawlers. Instagram blocks them but has workarounds.
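
One common way a site tells crawlers to stay out is a robots.txt file. The Python sketch below uses the standard library's `urllib.robotparser` to check a made-up robots.txt; the crawler names and rules are hypothetical, and this is not a claim about how any particular site (or Archive-It itself) handles blocking.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt that turns one crawler away entirely and limits everyone else.
ROBOTS_TXT = """\
User-agent: examplebot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(user_agent, url):
    """True if the parsed robots.txt rules let this crawler fetch the URL."""
    return parser.can_fetch(user_agent, url)


print(allowed("examplebot", "https://example.org/page"))          # False
print(allowed("otherbot", "https://example.org/page"))            # True
print(allowed("otherbot", "https://example.org/private/notes"))   # False
```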

How to Use the Internet Archive (It’s SO EASY)

  • Browse to web.archive.org/save – enter URL of the site you want to archive, creates an instance. Boom!
  • You can also go to: archive-it.org – create a collection (of sites), add seeds (URLs).
    • Two types of seeds: with the trailing / (forward slash) and without it. Without it, the seed adds all subdomains, e.g., if I used my Commons blog noveldrawl.commons.gc.cuny.edu, it’ll give me ALL the Commons blogs, everything before the ‘.commons’. If I use noveldrawl.commons.gc.cuny.edu/, it’ll give me just the stuff on my blog AFTER the slash, like noveldrawl.commons.gc.cuny.edu/coursework.

I archived the website data… now what do I do with it?!… Some Tools to use with your WARC files:

  • Palladio – create virtual galleries
  • Voyant – explore text links
  • RawGraphs – create graphs

ARCH (Archives Research Compute Hub)

  • ARCH is not publicly available until Q1 2023; workshop participants are being given beta access and can publish experiment results using it.
  • Currently can only use existing Archive-It collections; however, after release, user-uploaded collections will be supported.
  • It uses existing collections in Archive-It, which you do need a membership to use.
  • Non-profit owned, and the Internet Archive is decentralized and not limited to being a government or corporate tool.
  • Supports computational analysis of collections, eliminating the need for the technical knowledge to analyze sites, and allows for analysis of complex collections on a large scale.
  • Integrates with the Internet Archive, and has the same interface as Archive-It.
  • Can extract domain frequency (relationship between websites), plain text, text file info, audio files, images, pdfs, ppts. It can also create graphs of these relationships in the browser. There’s even more it can do than this; if you need it, it can probably do it. All data is downloadable and can be previewed before download.

Observations

  • I noticed that the majority of those present had faced some sort of cultural erasure, threatened or realized, modern or archaic, that had brought them to their interest in archiving.
  • From experience using these tools, I’d say Wayback is great if you need to just archive one site, perhaps for personal use, whereas Archive-It is great if you have many sites in a particular research area that you’re trying to archive and keep all in one place.

Links of interest

  • https://archive-it.org/collections/11913 – Schomburg Center for Research in Black Culture, Black Collecting, and Literary Initiatives (67 GB, 23+ websites since March 2019; contains blog posts, articles, bookstore lists, etc)
  • https://archive-it.org/collections/19848 – Ivy Plus Libraries Confederation, LGBTQ+ Communities of the Former Soviet Union & Eastern Europe (30 GB, 70+ websites, since Aug ’22; contains news, events, issues, etc)

Further Resources