Like many Jewish text geeks, I’ve been following the goings-on at Sefaria closely. Beyond providing free versions and translations of oodles of Jewish texts, Sefaria has made them available through a stunning, extremely accessible platform that allows for an expansion of the community of learners and an enrichment of the dialogue within that community. I’ve asked Sefaria’s Ari Elias-Bachrach to share his recent report on the status of the Talmud on Sefaria and to invite our readers to share what they would most like to see on the website. – Y.L.
Sefaria is a non-profit organization that is creating a massive library of interconnected Torah texts. It is all free and all in the public domain. To do this we’ve done a number of things, including importing text from other open source projects on the web like WikiSource, and digitizing public domain sefarim and putting them on the web. One of the things we’re working on is building a Talmud that gives a better learning experience than anything that has come before it. Our goal is to have the standard Talmud text with not just Rashi and Tosafot, but also other major commentators, all in the same place. Additionally, citations from things like the Mesorat Hashas and Ein Mishpat will be linked so you can see the relevant halachot automatically.
Our Talmud text comes from WikiSource, and we’ve been correcting it to ensure it matches the text of the Vilna shas. We realized that given the number of mefarshim we plan on having, an amud was simply too large a unit to work with. When we did the Tanach it was comparatively simple: the commentators usually comment on specific verses, so any given verse just needs to link to those commentaries. However, a single amud might contain 100 comments from Rashi, Tosafot, the Rosh, and the other major commentators. Without breaking up the amud into smaller units, there’s no way to know which subset of those 100 comments to display. When you click on a pasuk in the Torah, you see all the commentaries on that pasuk. We wanted something similar here: when you click on a sentence in the Talmud, you should see the commentaries relevant to that sentence. The conclusion was clear: we needed a way to break up the dapim. Thankfully, Koren Publishers graciously allowed us to use their punctuation to break up the amud. Each line of the amud now corresponds to a grammatical phrase (not a line of the Vilna printing). We undertook a massive project to segment all of shas in this manner, and finished in the fall of 2014. In the process we also double-checked the text we had from WikiSource to make sure it matched the Vilna shas. (As a side note, we found a significant number of errors in the WikiSource Talmud, in both the Talmud text and the Rashi and Tosafot. These errors have unfortunately propagated to many sites across the web, and in many cases it is clear we’re the first people to actually check the text for accuracy.)
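For readers who think in code, here is a minimal Python sketch of why segmenting matters. It is an illustration only, not Sefaria’s actual data model: the names `SegmentRef` and `CommentaryIndex`, the 1-based segment numbering, and the placeholder comment texts are all assumptions made for the example. The point is that once every comment is keyed to a phrase-level segment rather than to a whole amud, clicking a phrase can return just the comments that belong to it.

```python
from collections import defaultdict
from typing import NamedTuple


class SegmentRef(NamedTuple):
    """Hypothetical address of one phrase-level segment."""
    tractate: str
    amud: str      # e.g. "2a"
    segment: int   # assumed 1-based position of the phrase within the amud


class CommentaryIndex:
    """Hypothetical lookup table from a segment to the comments on it."""

    def __init__(self):
        self._by_segment = defaultdict(list)

    def link(self, ref, commentator, text):
        # Attach one comment to exactly one segment.
        self._by_segment[ref].append((commentator, text))

    def for_segment(self, ref):
        # What a reader sees after clicking a single phrase.
        return list(self._by_segment.get(ref, []))

    def for_amud(self, tractate, amud):
        # What a reader would face if the amud were the smallest unit.
        return [
            comment
            for ref, comments in self._by_segment.items()
            if ref.tractate == tractate and ref.amud == amud
            for comment in comments
        ]


index = CommentaryIndex()
index.link(SegmentRef("Berakhot", "2a", 1), "Rashi", "placeholder comment A")
index.link(SegmentRef("Berakhot", "2a", 1), "Tosafot", "placeholder comment B")
index.link(SegmentRef("Berakhot", "2a", 2), "Rashi", "placeholder comment C")

clicked = index.for_segment(SegmentRef("Berakhot", "2a", 1))
whole_amud = index.for_amud("Berakhot", "2a")
```

In this toy example `clicked` holds only the two comments on the first phrase, while `whole_amud` holds all three. On a real amud with around 100 comments, that difference is what makes the display usable.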
Next up, of course, are the commentaries of Rashi and Tosafot. Now that the Talmud was segmented, we needed to make sure to associate each comment with the appropriate line. One of our wonderful volunteer developers, Noah Santacruz, made a commentary poster: a program that looks at the dibur hamatchil and tries to attach the comment to the right line. Unfortunately, it cannot place every comment based solely on that information, as sometimes the text of the dibur hamatchil will appear multiple times in a daf, or it might not match at all if there are roshei teivot in use, or if the commentator decides to abbreviate the text in some other fashion. To fix those, we’ve had people going through manually, learning the appropriate masechtot and placing the commentaries where they belong. At the same time they’ve also been checking the contents of the comments against the Vilna shas to make sure our text is accurate. So far we’ve finished Brachot, Megillah, and Taanit. Kiddushin and Ketubot are in progress. Those of you doing Daf Yomi will be happy to know we’ll be keeping the Ketubot progress ahead of the Daf Yomi cycle, so you don’t have to worry. This is still an ongoing process, and while we’re looking for ways to improve our automation, we’re also looking for volunteers. If you, your chevruta, or your school group is learning Talmud and wants to help out the cause of Talmud learning on the internet, you could help by placing the missing commentaries in the right place as you learn. If you’re interested, please let us know and we’ll help to get you started.
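The matching problem described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the actual commentary poster: the `Comment` class, the `place_comments` function, and the rule "place only when the opening words match exactly one segment" are all hypothetical, and the English strings stand in for Hebrew text.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    commentator: str
    opening_words: str  # the dibur hamatchil
    text: str


def place_comments(segments, comments):
    """Return (placed, unplaced).

    placed maps a segment index to the comments attached to it.
    unplaced lists the comments that need a person to place them,
    because the opening words matched no segment or more than one.
    """
    placed = {}
    unplaced = []
    for comment in comments:
        matches = [
            i for i, segment in enumerate(segments)
            if comment.opening_words in segment
        ]
        if len(matches) == 1:
            placed.setdefault(matches[0], []).append(comment)
        else:
            unplaced.append(comment)
    return placed, unplaced


# Toy segments standing in for the phrases of one amud.
segments = [
    "the first phrase of the page",
    "a second phrase follows",
    "the first phrase repeated",
]
comments = [
    Comment("Rashi", "second phrase", "placeholder"),     # one match
    Comment("Rashi", "the first phrase", "placeholder"),  # appears twice
    Comment("Tosafot", "abbrev.", "placeholder"),         # abbreviated, no match
]

placed, unplaced = place_comments(segments, comments)
```

Here only the first comment is placed automatically (on the segment at index 1). The other two land in `unplaced`, which corresponds to the pile that volunteers work through by hand. A more forgiving matcher could expand abbreviations or use the order of comments to break ties, but leaving uncertain cases to a human keeps wrong placements out of the text.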
Throughout this process we’ve been checking the text of the Talmud and the commentaries. We’ve found a significant number of mistakes and typos, most of which have been copied over and over again by countless websites. One of the advantages to our system is that we’re able to spot and correct these errors quickly and easily. Sefaria currently has the most accurate Talmud text freely available on the internet today (using the Vilna shas as the standard), and when we’re done we will have the most accurate copies of Rashi and Tosafot too.
After Rashi and Tosafot, of course, come the other major commentaries. We’ve recently finished digitizing the Rosh and its Nosei Keilim. We’re currently working on digitizing Maharsha, Maharal, Maharsham, the Rif, and the Nosei Keilim on the Rif. We’ve also acquired digitized versions of the Pnei Yehoshua, Yad Ramah, Ramban, Shita Mekubetzet, Rashba, and Tosafot Rid. So far we’ve done Shita Mekubetzet on Brachot. While getting these into our system is difficult for the same reasons as Rashi and Tosafot, you can expect to start seeing all these commentaries appearing on Sefaria starting in a few months.
Lastly, we’re also working on a few other features that should be helpful to people, including an integrated dictionary with data from Jastrow and the Comprehensive Aramaic Lexicon, as well as a way of integrating the Mesorat Hashas and Ein Mishpat Ner Mitzvah. We’re also going to put in links to the Mishnah whenever the Gemarah quotes a Mishnah, so that you can easily navigate to the Mishnah to see the various Mishnah commentaries we have. Currently that includes Ovadia M’Bartenura and the Tosafot Yom Tov, but we should be adding the Rambam this summer.
What other features would you find useful? One of the advantages to our system is that while extracting text is much more difficult than just putting images online, it also gives us a lot more flexibility and allows for the building of some features which may not have been possible before.