Das Internet als linguistisches Korpus

This article discusses whether the Internet can be used as a linguistic corpus. It is based on experiences in connection with the Variantenwörterbuch des Deutschen (Dictionary of Standard German Variants), which was compiled between 1997 and 2004. In order to identify national and regional variants of the Germa...

Bibliographic Details
Main Author: Hans Bickel
Format: Article
Language: deu
Published: Bern Open Publishing 2006-07-01
Series: Linguistik Online
Online Access: https://bop.unibe.ch/linguistik-online/article/view/612
id doaj-4ada537a3a5e455cb48a612e1ce2af3e
record_format Article
spelling doaj-4ada537a3a5e455cb48a612e1ce2af3e | 2021-09-13T12:56:55Z | deu | Bern Open Publishing | Linguistik Online | 1615-3014 | 2006-07-01 | 28 | 3 | 10.13092/lo.28.612 | Das Internet als linguistisches Korpus | Hans Bickel | https://bop.unibe.ch/linguistik-online/article/view/612
collection DOAJ
sources DOAJ
author Hans Bickel
title Das Internet als linguistisches Korpus
publisher Bern Open Publishing
series Linguistik Online
issn 1615-3014
publishDate 2006-07-01
description This article discusses whether the Internet can be used as a linguistic corpus. It is based on experiences in connection with the Variantenwörterbuch des Deutschen (Dictionary of Standard German Variants), which was compiled between 1997 and 2004. In order to identify national and regional variants of the German language in Germany, Austria and Switzerland, it was necessary to work with a large linguistic corpus that could also provide data on the frequency of rather rare words. The question was: Is the Internet suitable as a corpus for linguistic frequency analysis? Using the WWW as a corpus is suitable only if (1) reliable and reproducible results can be obtained and (2) the results are closely related to the language as it is actually used. The test showed that the Internet is an extremely useful corpus for obtaining information on word frequency. Its enormous size and the large number of different text types make it an extremely versatile corpus with a systematic connection to the reality of written language.
url https://bop.unibe.ch/linguistik-online/article/view/612