
It Functions, and that’s (almost) All: Another Look at “Tagging the Talmud”


Itay Marienberg-Milikowsky is currently a visiting scholar at the Interdisciplinary Center for Narratology, Universität Hamburg, where he conducts his postdoctoral research entitled “The Rise of Narrativity in Talmudic Literature: Computational Perspectives.” This is our third post in an ongoing series on Digital Humanities and Rabbinic Literature.

In Alfred Döblin’s famous novel Berlin Alexanderplatz, a certain Franz Biberkopf rejoins the modern city after a prolonged incarceration and is astonished by its relentless, alienating pace of change. In time, Biberkopf gradually becomes entrapped in a net of forces stronger than himself, and his bewilderment is reflected in the splitting of his voice – or, maybe, the narrator’s voice – into two (if not more) contradictory points of view. Thus, the telegraph is described in one sentence as “astonishing, clever, tricky,” while in a subsequent sentence, we read: “It’s hard to get enthusiastic about all this; it functions, and that’s all” (p. 76).

Revolutionary technologies are at the heart of the Talmud Blog’s recent series, which deals with “the interface of Digital Humanities and the study of Rabbinic Literature.” Despite some critical evaluations, these technologies are thankfully much less threatening than the ones that made Biberkopf so dizzy. Nevertheless, while I can identify with Prof. Satlow’s response, as these technologies are indeed “astonishing, clever, tricky,” and using them in Talmudic studies is highly promising, the other, split voice in my head tells a quite different story. Not because I am a bit more skeptical about the potential of these innovations to make real, lasting contributions to the relevancy and funding of the field, but simply because what was true of the early telegraph is true also of our state-of-the-art digital environment: “It functions, and that’s all.” A careful assessment of this voice will be fruitful, as it pushes us to quickly fill the gap between hypothetical ideas and real scholarly achievements. In what follows, I will try to explain this necessary, modest shift in attitude.

Let me begin from a seemingly distant point, and clarify some issues regarding “Distant Reading” to which Satlow refers. First, despite the assumption that Distant Reading has always been wedded to the digital age, it is worth remembering that Franco Moretti’s paradigmatic article, “Conjectures on World Literature” (2000), in which he coined this term, does not even mention computers. This is also true, if I remember correctly, for “Graphs, Maps, Trees” (2005), excluding an incidental note about “computational stylistic” in the beginning of the book (p. 4). Computers are mentioned in Moretti’s work for the first time only in “Style.inc” (2007), and, of course, ever since he co-established the Stanford Literary Lab in 2010, he mentions them consistently. Even then, I think, computers are not a goal in and of themselves, but a tool for something else.

That “thing” may be called “distant reading,” but even that name – a pun playing on the concept of ‘close reading’ (the traditional paradigm in literary scholarship since New Criticism) – is somewhat misleading. Intuitively, a distant perspective goes hand in hand with enormity, and Moretti expresses his desire to cover “the great unread”; his latest studies certainly embody this desire in many senses. But in the pre-digital stage of his work, and beyond, Moretti also occasionally expresses something of no less importance; namely, a desire for a systematic mode of reading, a narrow methodological investigation of textual appearances of well-defined, countable phenomena:

You define a unit of analysis […] until, ideally, all of literary history becomes a long chain of related experiments: a “dialogue between fact and fancy,” as Peter Medawar calls it: “between what could be true, and what is in fact the case” (‘Conjectures’, pp. 61-62).

This kind of “reading,” however, no longer produces interpretations but merely tests them: it’s not the beginning of critical enterprise, but its appendix. And then, here you don’t really read the text anymore, but rather through the text, looking for unit of analysis. The task is constrained from the start; it’s a reading without freedom (Ibid. p. 61, n.19).

Against this background, it is not surprising to later find a confession regarding the idiom used in previous drafts of the article – “serial reading” (see “Distant Reading,” p. 44): serial, not distant, although these two are interrelated. We can understand why computers eventually entered the picture; but they were not there from the beginning. However, if emphasizing consistency of a systematic analysis sounds quite banal to philological-historical ears, then, within the original context of the article – a discussion about literary theory and its applications to a specific (albeit large) corpus – it seems almost a provocative claim, for reasons that are presently beyond our limits.

Inspiring as they are, we don’t have to treat Moretti’s words as “The Law of Moses from Sinai” (‘הלכה למשה מסיני’). Nevertheless, this meditation on his work provides new insights useful for studying rabbinic literature in the digital age. By placing quantitative perspectives at the center, we can again test where the study of rabbinic literature actually stands.

Michael Satlow justifiably notes that the conditions for digital research in rabbinic texts are relatively good compared to other ancient literatures. On a daily basis we enjoy high quality and impressively smart databases, such as Bar-Ilan’s Responsa project, the Friedberg project (including “Hachi Garsinan” for those of us who are working on the Bavli), the new integrated “Cooperative Development Initiative” website, Ma’agarim, Sefaria, etc. These esteemed collaborative efforts provide us not only with incredible access to a huge variety of texts and versions, but also supply wonderful tools for analyzing them.

However, there is a world of difference between “digitization” – in other words, turning the “non-digital” into “digital” (for instance, turning a hand-written manuscript into a digital file) – and what can be termed “digital hermeneutics,” or, better still, “computational/quantitative hermeneutics”; meaning, a reflexive use of computational tools for the purposes of interpreting a text, examining its poetics, or describing its place in the longue durée of literary history – all while remaining aware of those unique aspects (not to say, values) of the quantitative perspective. While digital projects achieve public and academic appreciation – after all, who would not want to have texts digitally accessible? – it is hardly surprising that attempts to harness computational forces for the purposes of reading endeavors, which are often claimed to be inherently subjective and speculative, encounter more suspicious opposition.

Considering this, what Rabbinic studies (or Piyut; or even Modern Hebrew literature) is missing at the moment is not necessarily (or, at least, not only) new sophisticated databases, archives, editions and the like, but something quite different: An extended collaborative effort (or, to use Modern Hebrew slang: ׳קילומטראז׳׳) to manage quantitative text-interpretive experiments; a constructive critical dialogue, and, above all, a theoretical-conceptual framework that will imbue this effort with greater significance, without relinquishing an intensive dialogue with other currents in each related discipline.


* * *



“The first message is received by the Submarine Telegraph Company in London from Paris on the Foy-Breguet instrument in 1851,” taken from Wikipedia


The good news is that we don’t have to wait for this to happen. There is no reason to speak only in a future tense, a problem characteristic of the Digital Humanities (and here I take issue with Marton Ribary’s post). Many options are available to us right now, in part thanks to accessible and powerful tools, like NodeXL or Palladio (for network analysis), Voyant (for automatic text analysis), and, my personal favorite, CATMA (for the integration of automatic text analysis and freely “undogmatic” manual annotation). This is even before we consider basic software, like Microsoft Word and Excel, which still has much to contribute. There is so much to do! In what follows, I outline the conditions that make this sort of research feasible, and suggest a few initial hypothetical examples.

First, the target audience of this approach is not a group of “digital humanists” in the narrow sense of the phrase – those gifted scholars with their feet firmly planted, at the same time, on the two sides of the ocean that separates the “two cultures,” the humanist and the scientific (see C.P. Snow, “The Two Cultures and the Scientific Revolution”). The target audience members may know nothing about programming, nor about coding. In fact, standard acronyms like “TEI,” “XML,” “CTS” and “NLP” may even seem to them strange and meaningless, and they do not prepare texts for re-use with adaptation to pre-established digital standards; it is not a part of their intellectual passion. Many of them don’t have the money, or the time, for ambitious academic “start-ups” that provide them with a return on their investment only after a long preparation process, if at all. But they do feel at home in the so-called regular forms of Talmudic studies (or literary studies; it doesn’t really matter), and they are ready to challenge old assumptions with new questions. Conversely, they are prepared to challenge new assumptions with old insights, and they should be willing to bridge what appear to be opposing paradigms.

Second, we have to temper our expectation for full automation, time-saving, or insights that arise directly from raw data “by pushing a button” (and let’s put aside “objectivity,” a naïve pre-Kuhnian image of the process). Yes, digital tools do make a lot of Sisyphean work speedy and effective; and good visualization can lead to unexpected thoughts. But if we want numbers and graphs, attractive and beautiful as they are, to tell us something important, we have to help them, to arouse them, and for this, serious human effort is needed. Integrating the humanized into the computerized is not a mistake, nor a regression: it is exactly what makes this method so stimulating.

Third, quantitative investigation, constructed systematically, can shed light not only on a huge corpus, but also on a single small text. When someone tells you that your corpus is not big enough for digital manipulation, don’t believe him; it depends solely on what you are seeking to do. How many times have you re-read, from a fresh point of view, a lyric song, a simple Mishna, or even a complicated miniature Talmudic story, and suddenly found yourself facing something that – you could swear – wasn’t there before? The same is true for this new approach. Slicing a text with Voyant by carefully examining its word list and the inter-relationships of those words affords us an opportunity to see the text through its “deformations,” or, as one might say, “decompositions.” In this way we can, for example, evaluate the effects of a novel’s characters – or a gallery of sages in a Midrash – based on countable measures, such as who is speaking, when he is speaking, how often, what he is relating (or, at least, what his vocabulary is) – and so on and so forth; a literal deconstruction of a text. The quest for Russian Formalism’s artistic “device” or “technique,” a concept coined exactly a century ago by Victor Shklovsky (“Art, as Device,” 1916-1917), now becomes, perhaps more than ever before, a realizable goal.
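To make the “countable measures” concrete, here is a minimal sketch in Python of counting who speaks, in what order, how often, and with what vocabulary. It is only an illustration under stated assumptions: the “Speaker: utterance” line format, the sample passage, and the function name `speaker_profile` are all hypothetical, and a real text would first require manual decisions about where each speech begins and ends.

```python
from collections import Counter, defaultdict
import re

# Hypothetical toy passage. The "Speaker: utterance" layout and the
# content are invented for illustration; they are not a real source.
SAMPLE = """\
Rabbi A: the vow is binding
Rabbi B: the vow is not binding
Rabbi A: it is binding only if spoken aloud
Anonymous: what is the reason of Rabbi A
"""

def speaker_profile(text):
    """Return (turn counts, per-speaker vocabulary, order of speakers)."""
    turns = Counter()             # how often each speaker speaks
    vocab = defaultdict(Counter)  # which words each speaker uses
    order = []                    # when each speaker speaks
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip lines that are not attributed speech
        speaker, utterance = line.split(":", 1)
        speaker = speaker.strip()
        turns[speaker] += 1
        order.append(speaker)
        # \w+ is Unicode-aware in Python 3, so Hebrew or Aramaic
        # words would be tokenized in the same way.
        vocab[speaker].update(re.findall(r"\w+", utterance.lower()))
    return turns, vocab, order

turns, vocab, order = speaker_profile(SAMPLE)
print(turns.most_common())
print(vocab["Rabbi A"].most_common(3))
```

The counting itself is trivial; the interpretive work lies in deciding what counts as a speaker and as an utterance, which is exactly where the human effort described above enters.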


* * *


Distant reading, therefore, is most significant when it is struggling against close reading; or, to put it differently, when it is integrated as a complex outlook that does not ignore the multi-dimensionality inherent to a concrete text or corpus. For this, we have all that is needed: “traditional” scholarship, tools, and texts.[1]

Let’s take, as a hypothetical exercise, three of Michael Satlow’s aspirations. It would indeed be very impressive to get an automatic comprehensive social network analysis of all the links connecting all the sages in the entire corpus of rabbinic literature. Internal sectioning options would take us even further. However, although this task is potentially achievable with automated technology, we can already begin with semi-manual pilot tests (using NodeXL or Palladio), covering Talmudic tractates or mishnaic orders. Comparing networks of one order in the Mishna and the Tosefta, or different Palestinian/Babylonian Talmudic chapters, would be a nice start (these graphs can be created within a few days). A starting point such as this can indicate what else might be done with a wider lens, though it also has merit in and of itself. By heaping on more and more analytical layers – by checking not only who is related to whom, but also in what ways they are connected, what contents are transmitted between them, what rhetorical patterns shape their relationships, and so on – in the end, we arrive at a rich map, half-computerized, half-manual, totally systematized, and hopefully interesting.
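A semi-manual pilot of this kind can start from a very small data structure. The sketch below, in Python, assumes that someone has already listed by hand which sages are named together in each textual unit; it then derives a weighted edge list and a simple measure of how connected each sage is. The records, the names, and the reduction of “relation” to mere co-occurrence are all assumptions made for illustration, not a claim about how any existing tool or study models these links.

```python
from collections import Counter
from itertools import combinations

# Hypothetical hand-collected records: each inner list names the sages
# who appear together in one textual unit (a mishna, a sugya, and so on).
UNITS = [
    ["Rabbi A", "Rabbi B"],
    ["Rabbi A", "Rabbi C"],
    ["Rabbi A", "Rabbi B", "Rabbi C"],
    ["Rabbi D"],
]

def co_occurrence_edges(units):
    """Count how many units each unordered pair of sages shares."""
    edges = Counter()
    for names in units:
        # sorted(set(...)) removes repeats and fixes the pair order,
        # so ("Rabbi A", "Rabbi B") is never stored in reverse.
        for pair in combinations(sorted(set(names)), 2):
            edges[pair] += 1
    return edges

def weighted_degree(edges):
    """Sum the edge weights attached to each sage."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree

edges = co_occurrence_edges(UNITS)
degree = weighted_degree(edges)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

An edge table of this shape (two names and a weight per row) is the usual starting point for network visualization, and further columns, such as the type of relation or the content transmitted, can be added by hand as the additional analytical layers accumulate.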

The structure of the Talmudic sugya can be another representative example. There are many automated ways of re-thinking the sugya. Syntactic-semantic analysis by “tree-bank” (following Satlow’s suggestion), or thematic analysis by topic-modeling (a much-discussed strategy), could provide fascinating (and contradictory?) starting points – though they are, unfortunately, far beyond the average abilities of most Talmudists. Mapping Talmudic events would also not be easy, not only because “event” is a very complex category in narrative theory,[2] but also because the fundamental “acts” in Talmudic literature are, most probably, speech-acts. But if we were to tie these missions to one of the main “traditional” questions in Bavli studies during recent decades – the question of the weight, character, function and effect of the anonymous material – we would very soon find that we don’t have any choice but to annotate manually, sugya after sugya, without ignoring the non-decisive interpretive questions that are inherent to the process. It is not a bad idea; quite the opposite: it is not hard to imagine a collaborative workshop (or seminar) dedicated to creating a “marked-up edition” of talmudic chapters, pointing by tags – and not without reasonable hesitation – not only to different strata of the chapter, but also, say, to different levels of abstract thinking and halachic conceptualization in the text, different intellectual institutions represented by it, or poetic changes in the stories woven into it – to mention only three topics that are arguably connected to the question of the “stam.” Adding an “objective” automated layer of linguistic patterns to various human tags, one might play with the interrelationship between all these layers. 
If one wants to understand, within the context of Talmud scholarship, what “undogmatic reading” is, one only has to remember the problem of materials presented as tannaitic or amoraic on the surface of the text (which can be easily identified by a computer), and the well-known fact that many of these are likely to be artificial editorial compilations (which can be identified by a human interpreter, although automation of this operation can be instructive).
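The interplay between an automated surface layer and a human interpretive layer can be sketched with a generic stand-off annotation model, in which every tag points to a span of character offsets in the same text. The Python example below is only an assumption-laden illustration: the `Tag` structure, the layer names, the labels, and the offsets are hypothetical, and this is not CATMA’s actual data format.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Tag:
    start: int  # first character offset covered by the tag
    end: int    # offset just past the last covered character
    layer: str  # who produced the tag, e.g. "surface" or "human"
    label: str  # what the tag claims about the span

# Hypothetical tags over one 90-character passage: the "surface" layer
# records the explicit attribution, the "human" layer an interpreter's
# judgment about the actual stratum.
TAGS = [
    Tag(0, 40, "surface", "tannaitic"),
    Tag(40, 90, "surface", "amoraic"),
    Tag(0, 25, "human", "early"),
    Tag(25, 90, "human", "editorial"),
]

def cross_layer_overlap(tags, layer_a, layer_b):
    """Count the characters shared by each label pair across two layers."""
    shared = Counter()
    for a in (t for t in tags if t.layer == layer_a):
        for b in (t for t in tags if t.layer == layer_b):
            length = min(a.end, b.end) - max(a.start, b.start)
            if length > 0:  # the two spans actually intersect
                shared[(a.label, b.label)] += length
    return shared

overlap = cross_layer_overlap(TAGS, "surface", "human")
for pair, length in sorted(overlap.items()):
    print(pair, length)
```

In this toy case, 15 of the 40 characters presented as tannaitic are judged editorial by the human annotator; on a real chapter, a table of this kind would show where the surface attribution and the interpretive judgment diverge, without forcing either layer to overwrite the other.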

Sugya after sugya, chapter after chapter – exhausting, but exciting. This is certainly one of the main lessons that “serial reading” teaches us: generalizations can and should be reached, but they must be based on (a) a proportional corpus, and (b) a method that is suitable for dealing with a wide variety of phenomena and data. In short: a convincing way to read “the great unread.” And with no less importance to our discussion here: it is not an eschatological mission. CATMA, for example, is designed to support such a project – in any text, in any language, under any interpretational framework.

These initial suggestions give rise to a great many questions, only a few of which can be noted here: What is the role of numbers in our literary (humanistic) thinking? How does the move from representative examples to an analysis of corpus-wide trends modify our approach to the small-scale examples that are still undeniably important? How can we “translate” (or “operationalize,” to use Moretti’s words) abstract and flexible ideas into strictly-defined “units of analysis,” or tag-sets for an annotation that would be fluid and systematic at the same time? And, above all, what can be learned from the connections, occasionally surprising, between measurable elements on the text’s surface, and our theoretical-conceptual insights regarding its depth?

In fact, leading scholars of other literatures have broadly dealt with these questions.[3] This is not the place to discuss their solutions and to adapt them to Rabbinic studies. To do so, we need many more case-studies, while, as Satlow noted, we are still at the very beginning. Achieving this goal, however, does not have to wait for the World to Come. Considering that many Talmudic texts have been digitized, and given that there are numerous tools that can already help us, it seems that the time has simply come to start working. “It functions, and that’s all.” This fact, in and of itself, does not make computational research less attractive; it merely bestows on it an attractiveness of a different kind.


* I am grateful to Shai Secunda and Yitz Landes, the editors of The Talmud Blog, who spared no effort to give this text its best form. However, the responsibility for it is entirely mine.

[1] In fact, this is not entirely correct. Only recently I wanted to download some open-source texts from Sefaria into a simple MS-Word readable configuration, but without the help of a friend who is a real digital humanist, I couldn’t do it. So the “open” is still not quite open enough.

[2] See, for example, Peter Hühn, “Event and Eventfulness.”

[3] Compare, for example: Jan Christoph Meister, ‘Toward a Computational Narratology’, in: M. Agosti and F. Tomasi (eds.), Collaborative Research Practices and Shared Infrastructures for Humanities Computing, Padua: CLEUP, 2014, pp. 17-36; Franco Moretti, ‘Operationalizing: Or, the Function of Measurement in Modern Literary Theory’, Stanford Literary Lab Pamphlet no. 6 (December 2013); Andrew Goldstone and Ted Underwood, ‘The Quiet Transformations of Literary Studies: What Thirteen Thousand Scholars Could Tell Us’, New Literary History 45:3 (2014), pp. 359-384.


2 thoughts on “It Functions, and that’s (almost) All: Another Look at “Tagging the Talmud”

  1. Shaul Stampfer says:

    This will be required reading for second-year students of Talmud. The first year one learns how little we know. After that, one can begin to master the tools that will enable the student to learn and to contribute to the pool of knowledge. It is at that stage that this piece will be exceedingly important. It is original, wide-ranging, modest and stimulating. Not a bad combination.

    Shaul Stampfer / Jerusalem
