Vocabulary Management

Contents

13.1. The Vocabulary Problem.
13.2. Research on Vocabulary Issues.
13.3. Vocabulary Solutions.
13.3.1. Syndetic Structure in Displayed Alphabetical Indexes.
13.3.2. Indexing Thesauri.
13.3.2.1. Examples of Indexing Thesauri.
13.3.3. End-User Thesauri.
13.3.3.1. Compiling an End-User Thesaurus.
13.3.3.1.1. Sources of Terms.
13.3.3.1.2. Selecting Terms.
13.3.3.1.3. Categorizing Terms.
13.3.3.1.4. Bound Terms Versus Elemental Descriptors.
13.3.3.1.5. Term Relationships.
13.3.3.1.6. Variant Forms and Equivalent Terms.
13.3.3.1.7. Homographs.
13.3.3.1.8. Thesaurus Displays.
13.3.4. Co-Occurrence Term Clustering.
13.3.5. Ontologies.
13.4. Our Examples.
13.4.1. A Book Index.
13.4.2. An Indexing and Abstracting Service.
13.4.3. A Full-Text Encyclopedia/Digital Library.

13.1. The Vocabulary Problem.

• 1

• 2 richness of human language

• 3 categories of information needs and desires

• 4 searches for known items with known vocabulary

• 5 searches for known items with unknown vocabulary

• 6

• 7

• 8 searches for unknown items with known vocabulary

• 9 searches for unknown items with unknown vocabulary

• 10 searches for unknown items with unknown vocabulary and vague concepts

• 11 searches of exploration

• 12 continua of information seeking situations

13.2. Research on Vocabulary Issues.

• 13 research on information seeking; views of Belkin (Nicholas J.) on anomalous states of knowledge

• 14 views of Furnas (George W. et al.) on variability of vocabulary

• 15 vocabulary of users compared to Library of Congress subject headings

• 16 views of Bates (Marcia) on variability of vocabulary

• 17 variability of vocabulary among searchers and indexers

• 18

• 19 variability of vocabulary in full-text sources

13.3. Vocabulary Solutions.

• 20

• 21 research on solutions for vocabulary problems

• 22 views of Bates (Marcia) on variability of vocabulary

• 23 experimental research on end-user thesauri

• 24 field research on use of thesauri

• 25

• 26 integration of thesauri with search interfaces

• 27 combining thesauri and co-occurrence lists

• 28 mapping of search terms to controlled vocabulary

• 29 interaction with multiple controlled vocabularies

• 30 display of thesauri for searching

• 31 work of Pollitt (A. Steven, et al.) on display of thesauri for searching

• 32 facets in EMTREE thesaurus

• 33 dynamic postings in faceted relational classified displayed indexes

• 34

• 35

• 36

• 37

• 38

• 39

13.3.1. Syndetic Structure in Displayed Alphabetical Indexes.

• 40 definition of syndetic structure

• 41 role of syndetic structure

• 42 subject headings versus terms in syndetic structure

• 43 types of syndetic structure; types of cross references

• 44 equivalent-term cross references

• 45 see-also references

• 46 narrower-term cross references; related-term cross references

• 47 omission of see-also references

• 48 purpose of syndetic structure

• 49 cross references in OPACs

• 50 postings data in cross references

• 51 cross references and syndetic structure in thesauri

• 52 UF as notation for un-used terms

• 53

• 54 UF as instruction for creation of equivalent-term cross references

• 55 form of equivalent-term cross references in OPACs

• 56 NT as notation for narrower terms; narrower-term cross references

• 57 narrower terms versus related terms in syndetic structure in thesauri

• 58

• 59 translation of notation for thesauri into natural human language

• 60

• 61 BT as notation for broader terms; broader-term cross references

• 62

• 63

• 64 RT as notation for related terms; related-term cross references

• 65

• 66

• 67 general see-also references

• 68 cross references in library catalogs

• 69

• 70

• 71

• 72

• 73 omission of cross references in OPACs

• 74 impact of omission of cross references

• 75 proposal for research on syndetic structure

• 76

• 77

13.3.2. Indexing Thesauri.

• 78

• 79

• 80 source of term “thesaurus”

• 81 books on construction of thesauri

• 82 thesauri for full-text IR databases

• 83 views of Soergel (Dagobert) on construction of thesauri

• 84 card format for term records for thesauri

Soergel’s Thesaurus Record Card

Notes for the record card:

• 85 computer programs for construction of thesauri

13.3.2.1. Examples of Indexing Thesauri.

• 86

• 87 Unesco thesaurus (1977)

• 88 term records in Unesco thesaurus (1977)

• 89 classification notation in Unesco thesaurus (1977)

• 90 notation in Unesco thesaurus (1977)

• 91 KWIC display in Unesco thesaurus (1977)

• 92 hierarchical displays in Unesco thesaurus (1977)

• 93

• 94

• 95 relational displays in Unesco thesaurus (1977)

• 96

• 97

• 98 Unesco thesaurus (1995)

• 99 microthesauri in Unesco thesaurus (1995)

• 100

• 101

• 102 display of multiple hierarchical levels in Unesco thesaurus (1995)

• 103 term records in Unesco thesaurus (1995)

• 104 Eurovoc thesaurus

• 105 term records in Eurovoc thesaurus

• 106

• 107

• 108 microthesauri in Eurovoc thesaurus

• 109

• 110

• 111 ASIS thesaurus

• 112 display of ASIS thesaurus

• 113 facets in ASIS thesaurus

• 114

• 115

• 116

• 117

• 118

• 119

13.3.3. End-User Thesauri.

• 120 end-user thesauri versus indexing thesauri

• 121 differences between indexing thesauri versus end-user thesauri

• 122 lead-in terms in end-user thesauri

• 123 gathering terms in end-user thesauri

• 124 examples of end-user thesauri

13.3.3.1. Compiling an End-User Thesaurus.

13.3.3.1.1. Sources of Terms.

• 125

• 126

• 127

• 128

• 129 procedures for compilation of end-user thesauri

13.3.3.1.2. Selecting Terms.

• 130

• 131 search statements as source of terms for end-user thesauri

• 132 views of Landauer (Thomas K.) on users as source of terms for end-user thesauri

• 133

• 134

• 135

• 136

• 137

• 138

• 139

• 140

• 141 selection of terms from texts for end-user thesauri

• 142

• 143 identification of phrases from full text for end-user thesauri

• 144

• 145

documentary unit. A “documentary unit” is the portion of a document that can be directly retrieved by an IR database. Documentary units may be complete documents, such as complete books or complete periodical articles, or they may be parts of complete documents: chapters in books, or paragraphs, charts, diagrams, or illustrations in periodical articles. This same variety in the size of documentary units applies to all media. An IR database for videotapes, for example, might retrieve only complete videotapes (so that the documentary unit is the complete tape), or it might be able to retrieve individual frames or short sequences of frames, in which case the individual frames or the short sequences constitute the documentary units. In all cases, the documentary unit is the unit that is analyzed for indexing (either by machine algorithm or by human inspection). Consequently, the “documentary unit” is also called the “unit of analysis.” “Bibliographic unit” has also been used for this concept, indicating the unit described and retrievable via a bibliography. Small documentary units have also been called “information units,” but one should hope that all documentary units will be informative!

• 146 phrases from full text for end-user thesauri

• 147

• 148 indexers as source of terms for end-user thesauri
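
The definition of “documentary unit” given in this subsection can be made concrete with a small data model. What follows is only a sketch under stated assumptions: the class name, field names, identifiers, and index terms are all hypothetical and do not come from the source. It shows one database holding units of two sizes (a complete book and one of its chapters), each indexed and retrieved in its own right.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DocumentaryUnit:
    """A directly retrievable portion of a document (the unit of analysis)."""
    unit_id: str                       # hypothetical identifier scheme
    kind: str                          # e.g. "book", "chapter", "frame sequence"
    parent_id: Optional[str] = None    # enclosing document, if this unit is a part
    index_terms: List[str] = field(default_factory=list)


def retrieve(units: List[DocumentaryUnit], term: str) -> List[str]:
    """Return the ids of all units indexed under the given term."""
    return [u.unit_id for u in units if term in u.index_terms]


# Hypothetical database indexed at two levels of granularity.
units = [
    DocumentaryUnit("book-1", "book", index_terms=["indexing"]),
    DocumentaryUnit("book-1/ch-13", "chapter", parent_id="book-1",
                    index_terms=["indexing", "thesauri"]),
]

print(retrieve(units, "thesauri"))
print(retrieve(units, "indexing"))
```

In this sketch a search on "thesauri" retrieves only the chapter, while a search on "indexing" retrieves both the book and the chapter, because each unit was analyzed for indexing separately.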

13.3.3.1.3. Categorizing Terms.

• 149

• 150

• 151

• 152 stop list terms in end-user thesauri

• 153

• 154 sorting of terms for end-user thesauri

• 155 facets for end-user thesauri

• 156 primary facets for end-user thesauri

• 157 term records for end-user thesauri

• 158 field tags for term records

• 159 initial categorization of terms for thesauri

• 160 size of categories in thesauri

• 161 categories of entities in end-user thesauri

• 162

• 163 categories of operations and processes in end-user thesauri

• 164 definition of categories in thesauri

• 165

• 166 categories in thesauri not mutually exclusive

• 167 merger of conceptually similar term records

• 168 sorting of terms in end-user thesauri

13.3.3.1.4. Bound Terms Versus Elemental Descriptors.

• 169

• 170 views of standards for thesauri on bound terms

• 171

• 172

• 173

• 174 impact of bound terms on size of thesauri

13.3.3.1.5. Term Relationships.

• 175

• 176 term relationships in thesauri

• 177 examples of term relationships in thesauri

• 178 equivalence relationships in thesauri

• 179 hierarchical relationships in thesauri

• 180 associative relationships in thesauri

• 181 more detailed term relationships in thesauri

• 182 views of Farradane (Jason) on term relationships

• 183 views of Diener (Richard) on term relationships

• 184 views of Wang, Vandendorpe, and Evens on term relationships

• 185 views of ALA ALCTS Subject Analysis Committee on term relationships

• 186 compilation of term relationships by Michel (Dee) and Kuhr (Pat)

• 187 research on term relationships in thesauri

• 188 attitudes of users toward term relationships in thesauri

• 189 hierarchical relationships versus associative relationships in thesauri

• 190 term relationships in hierarchical displays in thesauri

• 191 display of term relationships in thesauri

• 192 term relationships during compilation of thesauri

• 193

• 194 attitudes of users toward term relationships in thesauri

• 195 hierarchical relationships versus associative relationships in thesauri

• 196 views of Cutter (Charles Ammi) on role of principles in cataloging

• 197

• 198

• 199

• 200

13.3.3.1.6. Variant Forms and Equivalent Terms.

• 201

• 202 gathering terms in end-user thesauri

• 203 gathering terms versus preferred terms in thesauri

• 204 choice of gathering terms in end-user thesauri; choice of preferred terms in indexing thesauri

• 205 equivalent terms versus variant terms in end-user thesauri

• 206

• 207 cross references in hypertext

• 208

• 209

• 210 used for terms versus equivalent terms in end-user thesauri

13.3.3.1.7. Homographs.

• 211

13.3.3.1.8. Thesaurus Displays.

• 212 search options in end-user thesauri

• 213 browsable indexes for end-user thesauri

• 214 relational displays in end-user thesauri

• 215 searching with end-user thesauri

• 216

• 217

13.3.4. Co-Occurrence Term Clustering.

• 218

• 219 research on clustering of terms for vocabulary management

• 220 clustering terms for vocabulary management

13.3.5. Ontologies.

• 221 definitions of ontologies

• 222

• 223

• 224

“1. A systematic account of Existence.

“2. (From philosophy) An explicit formal specification of how to represent the objects, concepts and other entities that are assumed to exist in some area of interest and the relationships that hold among them.

“For AI systems, what “exists” is that which can be represented. When the knowledge about a domain is represented in a declarative language, the set of objects that can be represented is called the universe of discourse. We can describe the ontology of a program by defining a set of representational terms. Definitions associate the names of entities in the universe of discourse (e.g., classes, relations, functions or other objects) with human-readable text describing what the names mean, and formal axioms that constrain the interpretation and well-formed use of these terms. Formally, an ontology is the statement of a logical theory.

“A set of agents that share the same ontology will be able to communicate about a domain of discourse without necessarily operating on a globally shared theory. We say that an agent commits to an ontology if its observable actions are consistent with the definitions in the ontology. The idea of ontological commitment is based on the Knowledge-Level perspective.

“3. The hierarchical structuring of knowledge about things by subcategorising them according to their essential (or at least relevant and/or cognitive) qualities. See subject index. This is an extension of the previous senses of “ontology” (above) which has become common in discussions about the difficulty of maintaining subject indices” (1997-04-09).

• 225 ontologies versus thesauri

• 226 views of Hjerppe (Roland) on ontologies versus knowledge organization systems

• 227 views of Sowa (John) on categories in ontologies

• 228 views of Poli (Roberto) on categories in ontologies

• 229

• 230 categories and term relationships in thesauri versus ontologies

• 231 weak structures in ontologies

• 232 views of Vickery (Brian C.) on ontologies

• 233 ontologies for machine translation; conceptual levels in ontologies

• 234 ontologies for business

• 235 compilation of ontologies

• 236 views of Vickery (Brian C.) on ontologies
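
The definition quoted earlier in this section describes an ontology as a set of representational terms, each paired with human-readable text and with formal axioms that constrain its use, and says that an agent commits to an ontology when its observable actions are consistent with those definitions. A minimal sketch of that idea follows. The two terms, their descriptions, and the axioms are invented for illustration, and a real ontology language would be far more expressive than Python predicates.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class TermDefinition:
    """One representational term in a (hypothetical) ontology."""
    name: str
    description: str                  # human-readable meaning of the name
    axiom: Callable[[dict], bool]     # constraint on well-formed use


# Hypothetical two-term ontology for a bibliographic domain.
ontology: Dict[str, TermDefinition] = {
    "Book": TermDefinition(
        "Book", "A published monograph.",
        lambda entity: isinstance(entity.get("title"), str)),
    "Index": TermDefinition(
        "Index", "A guide to the contents of a Book.",
        lambda entity: entity.get("indexes") is not None),
}


def commits_to(onto: Dict[str, TermDefinition],
               statements: List[Tuple[str, dict]]) -> bool:
    """True if every statement uses a defined term and satisfies its axiom."""
    return all(term in onto and onto[term].axiom(entity)
               for term, entity in statements)


print(commits_to(ontology, [("Book", {"title": "A Handbook"})]))
print(commits_to(ontology, [("Book", {})]))
```

The first call is consistent with the definitions; the second is not, because the entity asserted to be a Book lacks the title that the assumed axiom requires.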

13.4. Our Examples.

13.4.1. A Book Index.

• 237

• 238 vocabulary management for book indexes in print media

• 239 integration of vocabulary management in book indexes

• 240 equivalent-term cross references for synonymous and equivalent terms in book indexes

• 241 double posting for synonymous and equivalent terms in book indexes

• 242

• 243 equivalent-term cross references for narrower terms in book indexes

• 244 terminology in equivalent-term cross references

• 245 see-also references in book indexes

• 246 application of thesauri to book indexes

• 247

• 248

• 249

• 250 vocabulary management for indexes in electronic books

• 251 presentation of see-also references in displayed indexes in electronic media

• 252 non-displayed indexes for electronic books

• 253 presentation of suggestions for vocabulary management for searches in non-displayed indexes

• 254

• 255

• 256

13.4.2. An Indexing and Abstracting Service.

• 257

• 258 vocabulary management for indexing and abstracting services in print media

• 259 see-also references for equivalent terms in automatic indexing

• 260

• 261

• 262 vocabulary management for non-displayed indexes for indexing and abstracting services in electronic media

• 263 suggestions for vocabulary management for multiple terms in search statements

• 264 optional status of suggestions for vocabulary management

• 265

13.4.3. A Full-Text Encyclopedia/Digital Library.

• 266

• 267

