By area, Austria has the largest share of the Alpine arc. Mountains are therefore central and omnipresent for the people of Austria. The project Alpenwort wanted to know how we actually write and speak about something so central. How do we describe mountains, our movement on the mountains, and life with the mountains, in Austria and elsewhere? What knowledge has been collected about mountains, and how can we access it?
The Zeitschrift des Österreichischen Alpenvereins (ZAV) (Austrian Alpine Club Journal) is a unique source for researching these questions. In its first decades, the contributions to the journal reflect the ongoing touristic and cartographic exploration of the Alps and the economic and scientific discoveries that came with it. During the 20th century, the perspective expanded to the mountains of the world. Globally relevant topics such as environmental and nature protection are discussed, as are questions of regional identity and cultural heritage.
The project Alpenwort at the Department of Languages and Literatures digitized the journal and turned it into a linguistically annotated corpus. The corpus is available to the research community and to other interested readers via several internet platforms. More than 42,000 pages were scanned and converted into machine-readable text. This process was not without problems: 60 volumes are printed in German Gothic script, which gives computers almost as much trouble as it gives human readers. For example, the word “Wasser” (water) was frequently misread as “Waffer”, and the name of the famous mountaineer Peter Habeler was turned into “Peter Habeier”. Many of these errors had to be corrected semi-manually or manually to obtain a relatively clean text.
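The kind of semi-manual correction described above can be illustrated with a small sketch. It is not the project's actual procedure, which the text does not describe. It assumes a word list of known-good forms (the `lexicon` argument) and a table of typical character confusions (`CONFUSIONS`), both of which are hypothetical names introduced here. A token that is not in the lexicon is corrected only if exactly one variant is; everything else is left for manual review.

```python
from itertools import product

# Hypothetical confusion table: character the OCR produced -> character
# that was probably printed. Only the two errors named in the text are covered.
CONFUSIONS = {"f": "s", "i": "l"}


def candidates(token):
    """Yield every variant of token obtained by swapping confusable characters.

    The number of variants doubles with each confusable character, so this
    sketch is only suitable for single words, not for whole lines.
    """
    options = [(ch, CONFUSIONS[ch]) if ch in CONFUSIONS else (ch,) for ch in token]
    for combo in product(*options):
        yield "".join(combo)


def suggest_correction(token, lexicon):
    """Return a corrected form of token, or token itself if no safe fix exists."""
    if token in lexicon:
        return token  # already a known word, nothing to do
    matches = sorted(c for c in candidates(token) if c in lexicon)
    # Correct automatically only when the fix is unambiguous.
    return matches[0] if len(matches) == 1 else token


if __name__ == "__main__":
    lexicon = {"Wasser", "Habeler", "Schiff"}
    for word in ["Waffer", "Habeier", "Schiff", "Gipfel"]:
        print(word, "->", suggest_correction(word, lexicon))
```

Tokens that come back unchanged and are not in the lexicon are the ones a human would still have to inspect.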
In a further step, the texts were segmented into linguistic units (paragraphs, sentences, words) and then annotated with additional information about the words. Personal names and place names are especially important and interesting in this step. They are particularly difficult to handle because they were often spelled in different ways in the early texts.
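As a rough illustration of the segmentation step, the following sketch splits a text into paragraphs, sentences and words with regular expressions. It is a deliberately naive stand-in, not the project's tool chain: the function name `segment` is an assumption, blank lines are assumed to separate paragraphs, and abbreviations such as "z. B." would be split incorrectly.

```python
import re


def segment(text):
    """Split text into paragraphs -> sentences -> words (nested lists)."""
    # Assumption: paragraphs are separated by at least one blank line.
    paragraphs = [p for p in re.split(r"\n\s*\n", text.strip()) if p.strip()]
    result = []
    for paragraph in paragraphs:
        # Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace.
        sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
        # \w+ matches Unicode letters, so umlauts stay inside their words.
        result.append([re.findall(r"\w+", s) for s in sentences if s])
    return result


if __name__ == "__main__":
    sample = "Der Berg ruft. Wir gehen los!\n\nDas Wasser ist kalt."
    for paragraph in segment(sample):
        print(paragraph)
```

A real pipeline would attach annotations such as part of speech or lemma to each word at this point; the nested lists only show the unit boundaries.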
The corpus annotated in this way is available via different platforms, and research on how we talk about mountains can begin. Different analyses are possible: frequencies of words and phrases, the periods in which a word was used more often, and so on. For example, the German suffixoid “-wärts” (similar to ‘-wards’) was used much more frequently and in more creative ways in the early Alpenwort volumes. There are interesting formations such as äquatorwärts, stradawärts, menschwärts and even feindwärts.
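A frequency analysis of this kind can be sketched in a few lines. The example below counts words ending in “-wärts” per decade. It assumes the corpus has already been reduced to pairs of publication year and token list; that input format and the function name `suffix_counts_by_decade` are assumptions made for illustration, not the format offered by the platforms mentioned here, and the years in the sample input are invented.

```python
from collections import Counter, defaultdict


def suffix_counts_by_decade(documents, suffix="wärts"):
    """Count words ending in suffix, grouped by decade of publication.

    documents: iterable of (year, tokens) pairs -- a hypothetical input format.
    Returns a dict mapping decade (e.g. 1870) to a Counter of lowercased words.
    """
    counts = defaultdict(Counter)
    for year, tokens in documents:
        decade = year // 10 * 10
        for token in tokens:
            word = token.lower()
            # Require a stem before the suffix, so bare "wärts" is skipped.
            if word.endswith(suffix) and len(word) > len(suffix):
                counts[decade][word] += 1
    return {decade: counts[decade] for decade in sorted(counts)}


if __name__ == "__main__":
    # Invented toy input; the words are taken from the examples in the text.
    docs = [
        (1875, ["äquatorwärts", "und", "feindwärts", "Äquatorwärts"]),
        (1990, ["aufwärts", "wärts"]),
    ]
    for decade, counter in suffix_counts_by_decade(docs).items():
        print(decade, dict(counter))
```

For a fair comparison between periods, the raw counts would still have to be divided by the number of tokens per decade, since the volumes differ in size.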
The Alpenwort corpus is already being used in teaching and will be available on the platforms Hyperbase (Université Nice Sophia Antipolis) and CQP-Web (University of Innsbruck) as well as on ZENODO and ARCHE.
The follow-up project “Semantics for Mountaineering History” uses the corpus to investigate how reliably personal and place names can be identified and how semantic information can be extracted from the corpus.
Another follow-up project, KEA (KEywords of the Alps), digitizes the New Zealand Alpine Journal and transforms it into a linguistically annotated corpus.