Question

Task 5: Creating a corpus
You are serving as a reviewer for a journal of corpus linguistics. The editor has sent you an article in which a
group of researchers describes the data collection procedure for the Comprehensive Klingon Corpus (CtlC),
an attempt to provide the very first linguistically annotated corpus of the Klingon language, a fictional alien
language created by Marc Okrand for the Star Trek universe. Here is the description of the data collection
procedure from the paper:
To establish an initial core of sentences, we manually extracted all the example sentences from published
sources on Klingon, such as Okrand (1992) and Okrand (1996). Then we crawled the contents of the major
web forums for Klingon language enthusiasts for additional example sentences. To create the annotation,
we relied on a crowd-sourced approach, asking forum participants for their help in glossing each word in our
collection of Klingon sentences with its meaning in English. Annotation was performed via a wiki which
we set up for this purpose. We did not require participants to create accounts on our wiki, which had the
desired effect on participation.
In the process, the Klingon language community also created many additional sentences in order to discuss
various points of grammar, which we were happy to integrate into our growing corpus. After some weeks,
the volunteers had provided us with full annotations, and the wiki discussion pages were found to already
form a comprehensive annotation standard, which we are publishing together with this resource. In a final
standardization step, the crowdsourced annotations were automatically converted to conform to the Leipzig
glossing rules, and the resulting high-quality annotations are released together with the raw
data under a CC-BY-SA license.
As a reviewer, your task is to provide feedback on this procedure in the light of established best practices
and the standard workflow of corpus linguistics.