Search engine with(out) a difference

The question of searching for and accessing content (academic and otherwise) is a central node of the epistemic shift set in motion by the digitization of cultural resources. Much has been written on the "power of archives" and the manipulation of memory, from Derrida to Foucault, but little has been said about the link between the universalist monolingualism of code, software and algorithms and the oligopolies of knowledge, that is, the multinational cartels of scientific publishing on which, somewhat like governments with rating agencies, the evaluation of both individual researchers and entire universities now depends. But rankings, ratings and the software that manages archives and databases and makes them accessible (or not) are not devices independent of the geopolitical contexts (and languages) that produce them. And so what risks disappearing into the black hole of databases and search tools is not only linguistic and cultural diversity within the social sciences and humanities, but the very objects of those sciences.

I reproduce here, with some changes, the text of a message sent to the Humanist list (which received no reply from the authors of the engine), in which I reported on an experiment carried out with the EPI Search engine, a tool that makes it possible to search within a library of five thousand volumes. For reasons of space the original message had been stripped of the examples at the end, which I have restored here.

Ken Friedman recently announced on Humanist the launch of a new search engine created at the Institute for Studies in Coherence and Emergence (ISCE):

"Epi-search takes queries of 50 to 10,000 words and performs several functions: 1) a “find more like this” search identifies contextually related documents from the ISCE.edu library, 2) displays key concepts and terms from the query and presents them in word clouds, and 3) transforms extracted terms and concepts into “enhanced queries” that are sent directly into more than a dozen third-party online databases."

So EPI Search "works by placing blocks of text into the engine", but the information available on the ISCE web site does not specify in which language. So I assumed it would also work (to an extent, of course) with other European languages. However, the results I got were quite discouraging.

I pasted into the query box four passages drawn from three sources (a classic theater text in German, a scholarly essay in Spanish, and an academic essay originally written in Italian and later translated into English):

1) the incipit of Bertolt Brecht, Leben des Galilei (German original text);

2) the incipit of an essay written by a well-known Italian historian, Carlo Ginzburg (Italian text);

3) the published English translation of the same passage;

4) a passage taken from a famous essay written by the Spanish philosopher José Ortega y Gasset (Spanish text).

As for (1) and (2), I got a "Runtime error page":

Server Error in '/' Application.
 [...]
 <!-- Web.Config Configuration File -->
<configuration>
 <system.web>
 <customErrors mode="Off"/>
 </system.web>
 </configuration>
 [...]

Results for (3) and (4) can be seen at the end of this message (I have omitted the Google results that EPI Search included for the Ginzburg translation (3)). Of the non-English texts, only the Spanish excerpt from Ortega y Gasset returned any results, among them a book on extraterrestrial life. Which proves, after all, that even Anglophone algorithms sometimes possess irony.
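Why would a Spanish passage about universities surface a book on extraterrestrial life? I do not know how EPI Search is implemented, but a toy sketch of one common way to build a "find more like this" search (cosine similarity over bag-of-words counts) suggests a plausible mechanism: when the query shares no vocabulary with an English-only library, every score collapses to zero and whatever comes back first is essentially arbitrary. Everything in the sketch is hypothetical: the function names, the two-entry library and its descriptions are invented for illustration and are not taken from ISCE.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return runs of Unicode letters."""
    return re.findall(r"[^\W\d_]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count dictionaries."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def more_like_this(query, library):
    """Rank library entries by similarity to the query, best first.

    Ties (for example, all-zero scores) are broken by title only,
    so the order among tied entries carries no meaning.
    """
    q = Counter(tokenize(query))
    scored = [
        (cosine(q, Counter(tokenize(text))), title)
        for title, text in library.items()
    ]
    return sorted(scored, reverse=True)


# Hypothetical two-book library with invented descriptions.
LIBRARY = {
    "art-history": "articles on italian painting and the attribution "
                   "of old masters by art historians",
    "astrobiology": "the science of extraterrestrial life and alien "
                    "evolution in the universe",
}

ENGLISH_QUERY = ("a series of articles on Italian painting proposed a new "
                 "method for the attribution of old masters")
SPANISH_QUERY = "la Universidad es un lugar de crimen permanente e impune"

english_ranking = more_like_this(ENGLISH_QUERY, LIBRARY)
spanish_ranking = more_like_this(SPANISH_QUERY, LIBRARY)
```

In this toy setting the English query ranks the art-history entry first because the two share many words, while the Spanish query has no word in common with either description and therefore scores zero against both. A real engine would likely add stemming, stop-word handling and term weighting, but none of those help if the indexed vocabulary is monolingual.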

To be honest, what worries me more is not that EPI does not work with non-English texts, but the intrinsic (and well-known) bias of its sources: "Epi-search then runs a 'find more like this search' to recommend books from the 5000 volume ISCE.edu library shows you how and why the results shown were recommended AND provides links to 'good' related searches from 9 academic databases including: [...]".

I'm not sure what they mean here by "good searches", but we know that both Thomson Reuters Web of Knowledge and Scopus shape institutional and individual rankings in the global academic world, reinforcing the de facto dominance of the English language within the sciences, social sciences, arts and the humanities (thanks to Ernesto Priego for pointing me to this resource).

This situation is not just a source of perpetual frustration for non-English speakers who struggle, especially in the Social Sciences and the Humanities, to express their ideas on cultural objects and phenomena which are certainly not linguistically "neutral"; it also constitutes the biggest threat to cultural and scientific diversity. I'd like to recall that cultural-linguistic diversity and variation are not a luxury that we cannot afford, but the very condition for the existence of what we call "culture" on this planet.

(Well, and if you are still uncertain, you should know that bilingualism and multilingualism are good for your brain.)

While I realize that these are all huge issues that cannot be solved or answered in a blog post, I believe that it should be our duty as scholars of all disciplines to make our choices explicit and transparent at all levels, including (and perhaps especially) at the level of software: the cultural and linguistic hegemony of Western databases is supported and deployed by the related software, algorithms, and encodings. Of course I think that ISCE is doing a great job in making part of its library freely available, and the EPI search engine can be a very useful tool. The problem lies in what we are representing, and how we are doing it. It is our responsibility to preserve cultural diversity, and even relatively small players can make a difference by building more inclusive "representations".

Or is it the destiny of all human culture produced/processed in a language different from English to disappear under obscure "runtime errors"?

-----------------------------

(1) Brecht, Leben des Galilei

Galilei sich den Oberkörper waschend, prustend und fröhlich:
Stell die Milch auf den Tisch, aber klapp kein Buch zu.
Andrea: Mutter sagt, wir müssen den Milchmann bezahlen.
Sonst macht er bald einen Kreis um unser Haus, Herr
Galilei.
Galilei: Es heißt: er beschreibt einen Kreis, Andrea.
Andrea: Wie Sie wollen. Wenn wir nicht bezahlen, dann
beschreibt er einen Kreis um uns, Herr Galilei.
Galilei: Während der Gerichtsvollzieher, Herr Cambione,
schnurgerade auf uns zu kommt, indem er was für eine
Strecke zwischen zwei Punkten wählt?
Andrea grinsend: Die kürzeste.
Galilei: Gut. Ich habe was für dich. Sieh hinter den Sterntafeln
nach.

[RUNTIME ERROR]

---------------------------

(2) Carlo Ginzburg, Spie. Radici di un paradigma indiziario

Tra il 1874 e il 1876 apparvero sulla Zeitschrift für bildende Kunst una serie di articoli sulla pittura italiana. Essi erano firmati da un ignoto studioso russo, Ivan Lermolieff; a tradurli in tedesco era stato un altrettanto ignoto Johannes Schwarze. Gli articoli proponevano un nuovo metodo per l’attribuzione dei quadri antichi, che suscitò tra gli storici dell’arte reazioni contrastanti e vivaci discussioni. Solo alcuni anni dopo l’autore gettò la duplice maschera dietro a cui si era nascosto. Si trattava infatti dell’italiano Giovanni Morelli (cognome di cui Schwarze è il calco e Lermolieff l’anagramma, o quasi). E di metodo morelliano gli storici dell’arte parlano correntemente ancora oggi.

[RUNTIME ERROR -- see above]

------------------------

(3) Carlo Ginzburg, Morelli, Freud and Sherlock Holmes: Clues and Scientific Method

Between 1874 and 1876 a series of articles on Italian painting was published in the German art history journal Zeitschrift für bildende Kunst. They bore the signature of an unknown Russian scholar, Ivan Lermolieff, and the German translator was also unknown, one Johannes Schwarze. The articles proposed a new method for the correct attribution of old masters, which provoked much discussion and controversy among art historians. Several years later the author revealed himself as Giovanni Morelli, an Italian (both pseudonyms were adapted from his own name). The Morelli method is still referred to by art historians.

Recommended books from the ISCE library

Inner Vision: An Exploration Of Art And The Brain
Inside Modernism: Relativity Theory, Cubism, Narrative
Art And Complexity
The Language of Displayed Art
Clues, Myths, and the Historical Method
Born Under Saturn
Renaissance Art: A Very Short Introduction
Rediscovering History: Culture, Politics, and the Psyche (Cultural Sitings)
The Critique Handbook
Art History: The Key Concepts (Routledge Key Guides)

[9 GOOGLE RESULTS SHOWN]

------------------------

(4) José Ortega y Gasset, Meditación sobre la técnica

Como en la Universidad actual —y conste que no me refiero sólo a la española— las lecciones no suelen ser eso que he llamado peripecia, quiere decirse que la Universidad es un lugar de crimen permanente e impune. Hace pocos años todavía insinuar esto era completamente inútil. No se encontraban oídos prestos a escuchar pareja advertencia. Hoy las cosas han cambiado. La desazón, la desmoralización reinante en todo el mundo y la fulminante pérdida de prestigio por parte de las Universidades son dos hechos tan patentes y crudos que abren camino a la sospecha de si no estarán en cierta relación el uno con el otro, es decir, de si los defectos sustantivos de la institución universitaria no serán una de las causas que han producido el terrible desconcierto de la vida europea.

Recommended books from the ISCE library

Evolving The Alien: The Science Of Extraterrestrial Life
Classical Mythology: A Very Short Introduction

[NO GOOGLE RESULTS SHOWN]

 

One comment on “Search engine with(out) a difference”

  1. It is a good comparison you present. Obviously, when you ask someone to recommend some reading on a given topic, they will necessarily be influenced both by their own reading experience on that topic and by what they know of you. Of course, search engines tend to work the same way, since they are human creations: you are asking little of the engine, and much of the engine's creator. And of its master!
    The machine answers only to its master, like the Golem. But using it depends on one's own will. How do you kill the Golem? The Golem cannot be killed. We all know that, at most, it can be confined to the attic of a Czech synagogue. Let us hope that it is at least a fairly spacious attic, because there is a lot of badly oriented and worse governed software that I would send there.