
TERM31_A1.rs3 (51)
Left unit | Sense | Right unit | Relation type | Relation name | Tagger | rhetdb | Notes
In most projects statistical methods have been used to filter the candidate terms which follow the linguistic model. --> The methods applied vary widely from project to project, so the simplest idea is to require a minimum absolute frequency (Justeson & Katz, 95), though several probabilistic formulae are generally combined. background N-S A1
In recent years work has begun to develop instruments in several languages for automatic terminology extraction in technical texts, though human intervention is still required to make the final selection from the terms automatically chosen. As an example we can cite the following instruments: LEXTER (Bourigault, 92), Termight by AT&T (Church & Dagan, 94), TERMS by IBM (Justeson & Katz, 95) and NPtool (Arppe, 95). Their areas of application can be divided into two main groups: information indexing and the compilation of terminological glossaries. In areas where terminology is developing dynamically, such as computer science, it is almost impossible to carry out effective terminological work without an instrument of this type. --> If a similar instrument is to be developed for Basque, we shall come up against greater drawbacks, because the unifying process of the language has not been completed, the research carried out is limited and Basque is an agglutinative language. background N-S A1
Morpho-syntactic models are usually used, --> so it is advisable to have the text already analysed or at least labelled. cause N-S A1
In languages with complex inflection, poor results will ensue if only the formal aspect of words is dealt with: --> lemmatisation will be necessary. cause N-S A1
a discrimination between terms must be made, <-- because some of them may form part of longer units. cause N-S A1
The methods applied vary widely from project to project, --> so the simplest idea is to require a minimum absolute frequency (Justeson & Katz, 95), cause N-S A1
We do not yet have any results, but we believe that the model will be wider than the noun phrase. <-- In the choice of technical terms, the case of internal declension may prove decisive. cause N-S A1
we shall come up against greater drawbacks, <-- because the unifying process of the language has not been completed, the research carried out is limited and Basque is an agglutinative language. cause N-S A1
The results obtained are not yet those required for fully automatic extraction. --> A balance must be found between recall and precision. In this balance, preference is given to recall, provided there is a person who can carry out the terminology reduction. To obtain a recall of 95%, precision is usually reduced to 50%, and for a precision of 85%, coverage does not even reach 35%. cause N-S A1
While these tools are being prepared, --> we must work on the modelling of technical terms, i.e. we must reduce their characteristics. circumstance N-S A1
In recent years work has begun to develop instruments in several languages for automatic terminology extraction in technical texts, <-- though human intervention is still required to make the final selection from the terms automatically chosen. concession N-S A1
It is a hard task to obtain a formal, complete definition of a term, --> but that is precisely what a major part of this work consists of: defining the characteristics of terms. concession N-S A1
The results are heavily conditioned by the quality of the linguistic tool used. <-- In any event, in some projects neither morphological nor syntactic analysis is carried out (Su et al., 96). concession N-S A1
The methods applied vary widely from project to project, so the simplest idea is to require a minimum absolute frequency (Justeson & Katz, 95), <-- though several probabilistic formulae are generally combined. concession N-S A1
We do not yet have any results, --> but we believe that the model will be wider than the noun phrase. concession N-S A1
Morpho-syntactic models are usually used, so it is advisable to have the text already analysed or at least labelled. <-- The results are heavily conditioned by the quality of the linguistic tool used. In any event, in some projects neither morphological nor syntactic analysis is carried out (Su et al., 96). concession N-S A1
If a similar instrument is to be developed for Basque --> we shall come up against greater drawbacks, because the unifying process of the language has not been completed, the research carried out is limited and Basque is an agglutinative language. condition N-S A1
As an example we can cite the following instruments: LEXTER (Bourigault, 92), Termight by AT&T (Church & Dagan, 94), TERMS by IBM (Justeson & Katz, 95) and NPtool (Arppe, 95). <-- Their areas of application can be divided into two main groups: information indexing and the compilation of terminological glossaries. elaboration N-S A1
In recent years work has begun to develop instruments in several languages for automatic terminology extraction in technical texts, though human intervention is still required to make the final selection from the terms automatically chosen. <-- As an example we can cite the following instruments: LEXTER (Bourigault, 92), Termight by AT&T (Church & Dagan, 94), TERMS by IBM (Justeson & Katz, 95) and NPtool (Arppe, 95). Their areas of application can be divided into two main groups: information indexing and the compilation of terminological glossaries. elaboration N-S A1
Lemmatisation is linked to morphological analysis and the removal of ambiguities. <-- In languages with complex inflection, poor results will ensue if only the formal aspect of words is dealt with: lemmatisation will be necessary. elaboration N-S A1
Linguistic knowledge is also of prime importance in the standardisation of terminology: <-- a discrimination between terms must be made, because some of them may form part of longer units. elaboration N-S A1
Linguistic techniques are used basically to make the initial selection of terms. Morpho-syntactic models are usually used, so it is advisable to have the text already analysed or at least labelled. The results are heavily conditioned by the quality of the linguistic tool used. In any event, in some projects neither morphological nor syntactic analysis is carried out (Su et al., 96). Lemmatisation is linked to morphological analysis and the removal of ambiguities. In languages with complex inflection, poor results will ensue if only the formal aspect of words is dealt with: lemmatisation will be necessary. <-- Linguistic knowledge is also of prime importance in the standardisation of terminology: a discrimination between terms must be made, because some of them may form part of longer units. elaboration N-S A1
Linguistic techniques are used basically to make the initial selection of terms. Morpho-syntactic models are usually used, so it is advisable to have the text already analysed or at least labelled. The results are heavily conditioned by the quality of the linguistic tool used. In any event, in some projects neither morphological nor syntactic analysis is carried out (Su et al., 96). Linguistic knowledge is also of prime importance in the standardisation of terminology: a discrimination between terms must be made, because some of them may form part of longer units. <-- Lemmatisation is linked to morphological analysis and the removal of ambiguities. In languages with complex inflection, poor results will ensue if only the formal aspect of words is dealt with: lemmatisation will be necessary. elaboration N-S A1
Linguistic techniques are used basically to make the initial selection of terms. Lemmatisation is linked to morphological analysis and the removal of ambiguities. In languages with complex inflection, poor results will ensue if only the formal aspect of words is dealt with: lemmatisation will be necessary. Linguistic knowledge is also of prime importance in the standardisation of terminology: a discrimination between terms must be made, because some of them may form part of longer units. <-- Morpho-syntactic models are usually used, so it is advisable to have the text already analysed or at least labelled. The results are heavily conditioned by the quality of the linguistic tool used. In any event, in some projects neither morphological nor syntactic analysis is carried out (Su et al., 96). elaboration N-S A1
In this balance, preference is given to recall, provided there is a person who can carry out the terminology reduction. <-- To obtain a recall of 95%, precision is usually reduced to 50%, and for a precision of 85%, coverage does not even reach 35%. elaboration N-S A1
A balance must be found between recall and precision. <-- In this balance, preference is given to recall, provided there is a person who can carry out the terminology reduction. To obtain a recall of 95%, precision is usually reduced to 50%, and for a precision of 85%, coverage does not even reach 35%. elaboration N-S A1
2. Terminology extraction It is a hard task to obtain a formal, complete definition of a term, but that is precisely what a major part of this work consists of: defining the characteristics of terms. To obtain technical terms from the corpus, a combination of NLP techniques (based on linguistic knowledge) and statistical techniques is usually used. <-- 2.1. Linguistic Techniques Linguistic techniques are used basically to make the initial selection of terms. Morpho-syntactic models are usually used, so it is advisable to have the text already analysed or at least labelled. The results are heavily conditioned by the quality of the linguistic tool used. In any event, in some projects neither morphological nor syntactic analysis is carried out (Su et al., 96). Lemmatisation is linked to morphological analysis and the removal of ambiguities. In languages with complex inflection, poor results will ensue if only the formal aspect of words is dealt with: lemmatisation will be necessary. Linguistic knowledge is also of prime importance in the standardisation of terminology: a discrimination between terms must be made, because some of them may form part of longer units. 2.2. Statistical Techniques In most projects statistical methods have been used to filter the candidate terms which follow the linguistic model. The methods applied vary widely from project to project, so the simplest idea is to require a minimum absolute frequency (Justeson & Katz, 95), though several probabilistic formulae are generally combined. 2.3. Results The results obtained are not yet those required for fully automatic extraction. A balance must be found between recall and precision. In this balance, preference is given to recall, provided there is a person who can carry out the terminology reduction. To obtain a recall of 95%, precision is usually reduced to 50%, and for a precision of 85%, coverage does not even reach 35%. elaboration N-S A1
The IXA Group intends to develop a tool of this type for Basque. <-- The morphological analyser has already been developed (Alegria et al., 96), the lemmatiser/labeller is almost completed (Aduriz et al., 96) and work has been done on surface-level syntax. elaboration N-S A1
To that end, working from existing technical dictionaries and using statistical techniques, the principal models must be obtained. <-- We do not yet have any results, but we believe that the model will be wider than the noun phrase. In the choice of technical terms, the case of internal declension may prove decisive. elaboration N-S A1
The IXA Group intends to develop a tool of this type for Basque. The morphological analyser has already been developed (Alegria et al., 96), the lemmatiser/labeller is almost completed (Aduriz et al., 96) and work has been done on surface-level syntax. <-- While these tools are being prepared, we must work on the modelling of technical terms, i.e. we must reduce their characteristics. To that end, working from existing technical dictionaries and using statistical techniques, the principal models must be obtained. We do not yet have any results, but we believe that the model will be wider than the noun phrase. In the choice of technical terms, the case of internal declension may prove decisive. elaboration N-S A1
In recent years work has begun to develop instruments in several languages for automatic terminology extraction in technical texts, though human intervention is still required to make the final selection from the terms automatically chosen. As an example we can cite the following instruments: LEXTER (Bourigault, 92), Termight by AT&T (Church & Dagan, 94), TERMS by IBM (Justeson & Katz, 95) and NPtool (Arppe, 95). Their areas of application can be divided into two main groups: information indexing and the compilation of terminological glossaries. --> In areas where terminology is developing dynamically, such as computer science, it is almost impossible to carry out effective terminological work without an instrument of this type. evidence N-S A1
While these tools are being prepared, we must work on the modelling of technical terms, i.e. we must reduce their characteristics. <-- To that end, working from existing technical dictionaries and using statistical techniques, the principal models must be obtained. We do not yet have any results, but we believe that the model will be wider than the noun phrase. In the choice of technical terms, the case of internal declension may prove decisive. means N-S A1
Automatic terminology extraction and its application to Basque --> 1. Introduction In recent years work has begun to develop instruments in several languages for automatic terminology extraction in technical texts, though human intervention is still required to make the final selection from the terms automatically chosen. As an example we can cite the following instruments: LEXTER (Bourigault, 92), Termight by AT&T (Church & Dagan, 94), TERMS by IBM (Justeson & Katz, 95) and NPtool (Arppe, 95). Their areas of application can be divided into two main groups: information indexing and the compilation of terminological glossaries. In areas where terminology is developing dynamically, such as computer science, it is almost impossible to carry out effective terminological work without an instrument of this type. If a similar instrument is to be developed for Basque, we shall come up against greater drawbacks, because the unifying process of the language has not been completed, the research carried out is limited and Basque is an agglutinative language. 2. Terminology extraction It is a hard task to obtain a formal, complete definition of a term, but that is precisely what a major part of this work consists of: defining the characteristics of terms. To obtain technical terms from the corpus, a combination of NLP techniques (based on linguistic knowledge) and statistical techniques is usually used. 2.1. Linguistic Techniques Linguistic techniques are used basically to make the initial selection of terms. Morpho-syntactic models are usually used, so it is advisable to have the text already analysed or at least labelled. The results are heavily conditioned by the quality of the linguistic tool used. In any event, in some projects neither morphological nor syntactic analysis is carried out (Su et al., 96). Lemmatisation is linked to morphological analysis and the removal of ambiguities. In languages with complex inflection, poor results will ensue if only the formal aspect of words is dealt with: lemmatisation will be necessary. Linguistic knowledge is also of prime importance in the standardisation of terminology: a discrimination between terms must be made, because some of them may form part of longer units. 2.2. Statistical Techniques In most projects statistical methods have been used to filter the candidate terms which follow the linguistic model. The methods applied vary widely from project to project, so the simplest idea is to require a minimum absolute frequency (Justeson & Katz, 95), though several probabilistic formulae are generally combined. 2.3. Results The results obtained are not yet those required for fully automatic extraction. A balance must be found between recall and precision. In this balance, preference is given to recall, provided there is a person who can carry out the terminology reduction. To obtain a recall of 95%, precision is usually reduced to 50%, and for a precision of 85%, coverage does not even reach 35%. 3. Application to Basque The IXA Group intends to develop a tool of this type for Basque. The morphological analyser has already been developed (Alegria et al., 96), the lemmatiser/labeller is almost completed (Aduriz et al., 96) and work has been done on surface-level syntax. While these tools are being prepared, we must work on the modelling of technical terms, i.e. we must reduce their characteristics. To that end, working from existing technical dictionaries and using statistical techniques, the principal models must be obtained. We do not yet have any results, but we believe that the model will be wider than the noun phrase. In the choice of technical terms, the case of internal declension may prove decisive. preparation N-S A1
1. Introduction --> In recent years work has begun to develop instruments in several languages for automatic terminology extraction in technical texts, though human intervention is still required to make the final selection from the terms automatically chosen. As an example we can cite the following instruments: LEXTER (Bourigault, 92), Termight by AT&T (Church & Dagan, 94), TERMS by IBM (Justeson & Katz, 95) and NPtool (Arppe, 95). Their areas of application can be divided into two main groups: information indexing and the compilation of terminological glossaries. In areas where terminology is developing dynamically, such as computer science, it is almost impossible to carry out effective terminological work without an instrument of this type. If a similar instrument is to be developed for Basque, we shall come up against greater drawbacks, because the unifying process of the language has not been completed, the research carried out is limited and Basque is an agglutinative language. preparation N-S A1
2.1. Linguistic Techniques --> Linguistic techniques are used basically to make the initial selection of terms. Morpho-syntactic models are usually used, so it is advisable to have the text already analysed or at least labelled. The results are heavily conditioned by the quality of the linguistic tool used. In any event, in some projects neither morphological nor syntactic analysis is carried out (Su et al., 96). Lemmatisation is linked to morphological analysis and the removal of ambiguities. In languages with complex inflection, poor results will ensue if only the formal aspect of words is dealt with: lemmatisation will be necessary. Linguistic knowledge is also of prime importance in the standardisation of terminology: a discrimination between terms must be made, because some of them may form part of longer units. preparation N-S A1
2.2. Statistical Techniques --> In most projects statistical methods have been used to filter the candidate terms which follow the linguistic model. The methods applied vary widely from project to project, so the simplest idea is to require a minimum absolute frequency (Justeson & Katz, 95), though several probabilistic formulae are generally combined. preparation N-S A1
2.3. Results --> The results obtained are not yet those required for fully automatic extraction. A balance must be found between recall and precision. In this balance, preference is given to recall, provided there is a person who can carry out the terminology reduction. To obtain a recall of 95%, precision is usually reduced to 50%, and for a precision of 85%, coverage does not even reach 35%. preparation N-S A1
3. Application to Basque --> The IXA Group intends to develop a tool of this type for Basque. The morphological analyser has already been developed (Alegria et al., 96), the lemmatiser/labeller is almost completed (Aduriz et al., 96) and work has been done on surface-level syntax. While these tools are being prepared, we must work on the modelling of technical terms, i.e. we must reduce their characteristics. To that end, working from existing technical dictionaries and using statistical techniques, the principal models must be obtained. We do not yet have any results, but we believe that the model will be wider than the noun phrase. In the choice of technical terms, the case of internal declension may prove decisive. preparation N-S A1
2. Terminology extraction --> It is a hard task to obtain a formal, complete definition of a term, but that is precisely what a major part of this work consists of: defining the characteristics of terms. To obtain technical terms from the corpus, a combination of NLP techniques (based on linguistic knowledge) and statistical techniques is usually used. 2.1. Linguistic Techniques Linguistic techniques are used basically to make the initial selection of terms. Morpho-syntactic models are usually used, so it is advisable to have the text already analysed or at least labelled. The results are heavily conditioned by the quality of the linguistic tool used. In any event, in some projects neither morphological nor syntactic analysis is carried out (Su et al., 96). Lemmatisation is linked to morphological analysis and the removal of ambiguities. In languages with complex inflection, poor results will ensue if only the formal aspect of words is dealt with: lemmatisation will be necessary. Linguistic knowledge is also of prime importance in the standardisation of terminology: a discrimination between terms must be made, because some of them may form part of longer units. 2.2. Statistical Techniques In most projects statistical methods have been used to filter the candidate terms which follow the linguistic model. The methods applied vary widely from project to project, so the simplest idea is to require a minimum absolute frequency (Justeson & Katz, 95), though several probabilistic formulae are generally combined. 2.3. Results The results obtained are not yet those required for fully automatic extraction. A balance must be found between recall and precision. In this balance, preference is given to recall, provided there is a person who can carry out the terminology reduction. To obtain a recall of 95%, precision is usually reduced to 50%, and for a precision of 85%, coverage does not even reach 35%. preparation N-S A1
It is a hard task to obtain a formal, complete definition of a term, but that is precisely what a major part of this work consists of: defining the characteristics of terms. --> To obtain technical terms from the corpus, a combination of NLP techniques (based on linguistic knowledge) and statistical techniques is usually used. purpose N-S A1
we must work on the modelling of technical terms, <-- i.e. we must reduce their characteristics. restatement N-S A1
In this balance, preference is given to recall, <-- provided there is a person who can carry out the terminology reduction. unless N-S A1
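The 2.1 and 2.2 rows above describe a two-step pipeline: a morpho-syntactic pattern proposes candidate terms from analysed text, and a minimum absolute frequency then filters them. The Python sketch below only illustrates that idea; the (form, lemma, POS) input format, the (ADJ|NOUN)* NOUN pattern and the threshold value are illustrative assumptions, not the models or figures used in the paper.

```python
from collections import Counter

# Illustrative tag sets; a real system would use language-specific patterns.
NOUN_LIKE = {"NOUN", "PROPN"}
MODIFIER = {"ADJ", "NOUN", "PROPN"}

def candidate_terms(tagged_sentence):
    """Linguistic step: yield multi-word spans matching (ADJ|NOUN)* NOUN.

    The sentence is a list of (form, lemma, pos) triples; lemmas are used
    so that inflected variants of the same term are counted together.
    """
    span = []
    for _form, lemma, pos in tagged_sentence:
        if pos in MODIFIER:
            span.append((lemma, pos))
        else:
            yield from _close(span)
            span = []
    yield from _close(span)

def _close(span):
    # Keep spans of two or more words whose last word is noun-like.
    if len(span) >= 2 and span[-1][1] in NOUN_LIKE:
        yield " ".join(lemma for lemma, _ in span)

def filter_candidates(tagged_sentences, min_freq=3):
    """Statistical step: keep candidates seen at least min_freq times."""
    counts = Counter(c for s in tagged_sentences for c in candidate_terms(s))
    return {term: n for term, n in counts.items() if n >= min_freq}
```

In practice the frequency threshold would be combined with the probabilistic measures mentioned in the rows above, and for Basque the pattern would have to operate on lemmatised, morphologically analysed text rather than on raw word forms.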

Segments | Relation type | Relation name | Tagger | rhetdb | Notes
To obtain a recall of 95%, precision is usually reduced to 50%, and for a precision of 85%, coverage does not even reach 35%. contrast N-N A1
because the unifying process of the language has not been completed, the research carried out is limited and Basque is an agglutinative language. list N-N A1
2.1. Linguistic Techniques Linguistic techniques are used basically to make the initial selection of terms. Morpho-syntactic models are usually used, so it is advisable to have the text already analysed or at least labelled. The results are heavily conditioned by the quality of the linguistic tool used. In any event, in some projects neither morphological nor syntactic analysis is carried out (Su et al., 96). Lemmatisation is linked to morphological analysis and the removal of ambiguities. In languages with complex inflection, poor results will ensue if only the formal aspect of words is dealt with: lemmatisation will be necessary. Linguistic knowledge is also of prime importance in the standardisation of terminology: a discrimination between terms must be made, because some of them may form part of longer units. 2.2. Statistical Techniques In most projects statistical methods have been used to filter the candidate terms which follow the linguistic model. The methods applied vary widely from project to project, so the simplest idea is to require a minimum absolute frequency (Justeson & Katz, 95), though several probabilistic formulae are generally combined. 2.3. Results The results obtained are not yet those required for fully automatic extraction. A balance must be found between recall and precision. In this balance, preference is given to recall, provided there is a person who can carry out the terminology reduction. To obtain a recall of 95%, precision is usually reduced to 50%, and for a precision of 85%, coverage does not even reach 35%. list N-N A1
The morphological analyser has already been developed (Alegria et al., 96), the lemmatiser/labeller is almost completed (Aduriz et al., 96) and work has been done on surface-level syntax. list N-N A1
1. Introduction In recent years work has begun to develop instruments in several languages for automatic terminology extraction in technical texts, though human intervention is still required to make the final selection from the terms automatically chosen. As an example we can cite the following instruments: LEXTER (Bourigault, 92), Termight by AT&T (Church & Dagan, 94), TERMS by IBM (Justeson & Katz, 95) and NPtool (Arppe, 95). Their areas of application can be divided into two main groups: information indexing and the compilation of terminological glossaries. In areas where terminology is developing dynamically, such as computer science, it is almost impossible to carry out effective terminological work without an instrument of this type. If a similar instrument is to be developed for Basque, we shall come up against greater drawbacks, because the unifying process of the language has not been completed, the research carried out is limited and Basque is an agglutinative language. 2. Terminology extraction It is a hard task to obtain a formal, complete definition of a term, but that is precisely what a major part of this work consists of: defining the characteristics of terms. To obtain technical terms from the corpus, a combination of NLP techniques (based on linguistic knowledge) and statistical techniques is usually used. 2.1. Linguistic Techniques Linguistic techniques are used basically to make the initial selection of terms. Morpho-syntactic models are usually used, so it is advisable to have the text already analysed or at least labelled. The results are heavily conditioned by the quality of the linguistic tool used. In any event, in some projects neither morphological nor syntactic analysis is carried out (Su et al., 96). Lemmatisation is linked to morphological analysis and the removal of ambiguities. In languages with complex inflection, poor results will ensue if only the formal aspect of words is dealt with: lemmatisation will be necessary. Linguistic knowledge is also of prime importance in the standardisation of terminology: a discrimination between terms must be made, because some of them may form part of longer units. 2.2. Statistical Techniques In most projects statistical methods have been used to filter the candidate terms which follow the linguistic model. The methods applied vary widely from project to project, so the simplest idea is to require a minimum absolute frequency (Justeson & Katz, 95), though several probabilistic formulae are generally combined. 2.3. Results The results obtained are not yet those required for fully automatic extraction. A balance must be found between recall and precision. In this balance, preference is given to recall, provided there is a person who can carry out the terminology reduction. To obtain a recall of 95%, precision is usually reduced to 50%, and for a precision of 85%, coverage does not even reach 35%. 3. Application to Basque The IXA Group intends to develop a tool of this type for Basque. The morphological analyser has already been developed (Alegria et al., 96), the lemmatiser/labeller is almost completed (Aduriz et al., 96) and work has been done on surface-level syntax. While these tools are being prepared, we must work on the modelling of technical terms, i.e. we must reduce their characteristics. To that end, working from existing technical dictionaries and using statistical techniques, the principal models must be obtained. We do not yet have any results, but we believe that the model will be wider than the noun phrase. In the choice of technical terms, the case of internal declension may prove decisive. list N-N A1
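The 2.3 Results segments quote trade-offs such as 95% recall at roughly 50% precision. As a reminder of how those two figures relate, here is a small self-contained Python example; the term lists and counts are invented for illustration and are not the data behind the quoted percentages.

```python
def recall_precision(extracted, reference):
    """Recall = correct / |reference|; precision = correct / |extracted|."""
    extracted, reference = set(extracted), set(reference)
    correct = len(extracted & reference)
    recall = correct / len(reference) if reference else 0.0
    precision = correct / len(extracted) if extracted else 0.0
    return recall, precision

# A permissive extractor recovers 19 of 20 reference terms (recall 0.95)
# but proposes 38 candidates in total, so precision is 19/38 = 0.50;
# this is the same shape of trade-off described in the segments above.
reference = {f"term_{i}" for i in range(20)}
extracted = {f"term_{i}" for i in range(1, 20)} | {f"noise_{i}" for i in range(19)}
print(recall_precision(extracted, reference))  # (0.95, 0.5)
```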