Journal of the American Society for Information Science and Technology (JASIST) -- Table of Contents

Contributed by
Richard Hill
American Society for Information Science and Technology
Silver Spring, Maryland, USA
Fax: (301) 495-0810
Phone: (301) 495-0900
rhill@asis.org

VOLUME 52, NUMBER 6

[Note: below, the contents of Bert Boyce's "In This Issue" have been cut into the Table of Contents.]

CONTENTS

Editorial

In this issue
Bert R. Boyce
Page 443

Research

Assessment of the Effects of User Characteristics on Mental Models of Information Retrieval Systems
Xiangmin Zhang and Mark Chignell
Page 445, Published online 15 February 2001

In this issue we begin with Zhang and Chignell, who use the Repertory Grid Technique (RGT) to extract users' mental models of information retrieval systems in order to study the effects on these models of four characteristics: educational and professional status, first language, academic discipline, and computer experience. Each of 64 subjects rated nine retrieval system concepts as to three attributes (form/process, targeted/not targeted, and specific to IR system/applicable to all information systems), yielding 27 variables for analysis. A factor analysis yielded nine factors with eigenvalues greater than one, which accounted for 68% of the variation from the original ratings. The first factor appeared to be concerned with the purposefulness of querying; the second, applicability of data organization; the third, the function of querying; the fourth, applicability of querying; the fifth, applicability of browsing; the sixth, function of data structure; the seventh, purposefulness of browsing; the eighth, function of the document; and the ninth factor, the purposefulness of data structure. Analysis of variance and Tukey tests were applied to the subjects' factor scores. Educational and professional background, discipline, and computer experience all had significant effects on the factor scores representing the mental models; language did not. Student and information professional scores differed widely on factors 1 and 3. Graduates differ from other students on factors 2 and 6. The user's discipline shows significant differences on factors 1, 2, 3, and 7, and computer experience has differences on 1, 2, and 7. Overall, information professionals and students have strikingly different models. Science students see browsing as a targeted activity but humanities students do not. Language does not seem to affect mental models of information retrieval systems.
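
The factor-retention rule mentioned above (keep factors whose eigenvalue exceeds one) can be illustrated with a minimal sketch. The eigenvalues below are invented for a hypothetical 6-variable correlation matrix and are not the study's data; the function names are likewise assumptions made for illustration. The sketch relies only on the fact that the eigenvalues of a correlation matrix sum to the number of variables, so the retained share of variance is the retained sum divided by the total.

```python
def kaiser_retained(eigenvalues, threshold=1.0):
    """Return the eigenvalues strictly greater than the threshold."""
    return [ev for ev in eigenvalues if ev > threshold]


def variance_explained(eigenvalues, retained):
    """Share of total variance carried by the retained factors."""
    return sum(retained) / sum(eigenvalues)


# Hypothetical eigenvalues for a 6-variable correlation matrix
# (illustrative only, NOT taken from Zhang and Chignell).
example_eigenvalues = [2.4, 1.5, 0.9, 0.6, 0.4, 0.2]

kept = kaiser_retained(example_eigenvalues)
share = variance_explained(example_eigenvalues, kept)
print(len(kept), "factors retained;", round(share, 2), "of variance explained")
```

In the study itself the same rule, applied to the 27 rating variables, is reported to have retained nine factors covering 68% of the variation.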

Modeling the Retrieval Process for an Information Retrieval System Using an Ordinal Fuzzy Linguistic Approach
E. Herrera-Viedma
Page 460, Published online 15 February 2001

Herrera-Viedma believes that quantitative weights computed from term occurrence are appropriate for the characterization of documents, but not for queries or for the estimated relevance levels used to rank retrieved documents, where human understanding argues for qualitative expression. Terms for queries are ranked in seven symmetric ordinal classes by searchers, or by an importance weight, or by a weight indicating how many documents should be returned for that term. An RSV is computed for each document for each ordered representation of the query. These are then aggregated by the search system for final evaluation of documents. The aggregation is carried out by linguistic implication functions which provide varied definitions of disjunction and conjunction depending upon the relative importance of the logical sub-expressions of the query. Users will need to determine which, or how many, of the ordering schemes to use.
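
As a rough illustration of evaluating a query over seven symmetric ordinal classes, the sketch below uses the simplest ordinal operators: minimum for conjunction, maximum for disjunction, and reflection about the middle label for negation. This is an assumption made for illustration. The label names, the nested-tuple query format, and the function names are all hypothetical, and the paper's importance-sensitive linguistic implication functions are not reproduced here.

```python
# Seven ordered labels; the names are illustrative assumptions.
LABELS = ["none", "very_low", "low", "medium", "high", "very_high", "total"]
RANK = {label: i for i, label in enumerate(LABELS)}


def negate(label):
    """Reflect a label about the middle of the symmetric scale."""
    return LABELS[len(LABELS) - 1 - RANK[label]]


def conjunction(*labels):
    """Ordinal AND: the lowest label among the arguments."""
    return min(labels, key=RANK.__getitem__)


def disjunction(*labels):
    """Ordinal OR: the highest label among the arguments."""
    return max(labels, key=RANK.__getitem__)


def evaluate(query, doc_labels):
    """Evaluate a query against one document's per-term labels.

    query is either a term (str) or a tuple ("and" | "or" | "not", sub-queries...).
    Terms missing from doc_labels are treated as "none".
    """
    if isinstance(query, str):
        return doc_labels.get(query, "none")
    op, *args = query
    values = [evaluate(arg, doc_labels) for arg in args]
    if op == "and":
        return conjunction(*values)
    if op == "or":
        return disjunction(*values)
    if op == "not":
        return negate(values[0])
    raise ValueError(f"unknown operator: {op!r}")


doc = {"fuzzy": "high", "retrieval": "medium"}
query = ("and", "fuzzy", ("or", "retrieval", "ranking"))
print(evaluate(query, doc))
```

Because every intermediate result is itself one of the seven labels, the final retrieval status value stays qualitative, which is the property the abstract argues for.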

Discovering Term Occurrence Structure in Text
Abraham Bookstein and T. Raita
Page 476, Published online 15 February 2001

Bookstein and Raita observe that term occurrences tend to clump in texts. That is to say, if a term's occurrence is observed in adjacent text segments, the expected number of random clumps will be exceeded. Strongly clumped terms have retrieval value, and if text is partitioned to minimize clumping strength, such stretches of text are likely to be content homogeneous. Linear clumping strength is measured by the ratio of the expected value of clumps formed to the observed value. The standard deviation will express the degree of non-randomness or clumping. Condensation clumping views the problem as a distribution of terms (balls) into text segments (urns) and takes the ratio of the expected number of segments containing the term to the observed number as the clumping measure. The common retrieval measure, inverse document frequency, can be rewritten in these terms, with little difference between the two when the probability that a segment contains the term is small. The standard deviation of the condensation clumping measure will allow an expression of the degree of non-randomness, but is complex to compute. The use of an approximate value at least as large as the standard deviation simplifies the process. The two measures diverge as segments are merged together, with linear clumping decreasing and condensation clumping increasing.

Using the same general model, a measure is constructed using the gaps between segments with term occurrence, where the text is considered to be wrapped in a circular fashion. More generality is achieved, but it appears that performance is very similar to the previous measures.
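
The balls-into-urns view of condensation clumping can be sketched as follows. The sketch assumes each occurrence falls into one of N equal segments uniformly at random, in which case the expected number of segments containing the term is N(1 - (1 - 1/N)^n) for n occurrences. The function names and the sample counts are hypothetical, and the gap definition used for the circular variant (distance from each occupied segment to the next one, wrapping around the end of the text) is an assumption rather than the authors' exact formulation.

```python
def expected_occupied(num_segments, num_occurrences):
    """Expected number of non-empty urns when balls are thrown uniformly at random."""
    return num_segments * (1 - (1 - 1 / num_segments) ** num_occurrences)


def condensation_clumping(segment_counts):
    """Ratio of expected to observed number of segments containing the term.

    segment_counts[i] is the number of occurrences of the term in segment i.
    Values above 1 suggest the term is packed into fewer segments than chance predicts.
    """
    observed = sum(1 for count in segment_counts if count > 0)
    if observed == 0:
        raise ValueError("term does not occur in any segment")
    expected = expected_occupied(len(segment_counts), sum(segment_counts))
    return expected / observed


def circular_gaps(segment_counts):
    """Gaps between successive occupied segments, with the text wrapped into a circle."""
    n = len(segment_counts)
    positions = [i for i, count in enumerate(segment_counts) if count > 0]
    gaps = []
    for k, pos in enumerate(positions):
        nxt = positions[(k + 1) % len(positions)]
        # A single occupied segment is a full circle away from itself.
        gaps.append((nxt - pos) % n or n)
    return gaps


clumped = [3, 2, 0, 0, 0, 1, 0, 0]   # hypothetical counts, 6 occurrences
spread = [1, 1, 1, 1, 1, 1, 0, 0]    # same 6 occurrences, spread out
print(condensation_clumping(clumped) > condensation_clumping(spread))
print(circular_gaps(clumped))
```

With wrapping, the gaps always sum to the number of segments, so no occurrence is treated specially for being near the start or end of the text.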

Optimal Query Expansion (QE) Processing Methods with Semantically Encoded Structured Thesauri Terminology
Jane Greenberg
Page 487, Published online 22 February 2001

Greenberg looks at the automatic expansion of queries using thesaurus terms in varying relationships with entry terms, based on a binary relevance evaluation of the initial return by end users, as opposed to interactive expansion, where the system provides a list of possibilities based on the initial return and the user chooses expansion terms. Using ten queries collected from MBA students, the ProQuest Controlled Vocabulary, and the ABI/Inform database on DIALOG, she mapped each query to the thesaurus terms as a base, and created four expansions: synonyms, narrower terms, related terms, and broader terms. Relevance judgements were made on the basis of topical matching (aboutness) by the contributors of the queries, who reviewed the union set of the responses to the query forms, where each retrieved list was limited to a length of 15 or fewer citations. The automatic expansions separately took all synonyms, all narrower terms, all broader terms, and all related terms. For interactive expansion, users chose from an alphabetized union list of the terms in thesaurus records for query terms. These selections were then incorporated in the query expansion by the searcher. Users chose from all groups but took over half of the suggested synonyms and broader terms, and over a quarter of the narrower and related terms. Synonyms and narrower terms augmented recall without a significant loss in precision in both automated and interactive searching, which argues for their use in automated expansion since less effort is required. Broader and related terms improved recall the most but would not be useful in automatic expansion if high precision is a goal. However, they, and particularly related terms, are seen as excellent candidates for use in interactive expansion.
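
The automatic expansions described above, one per thesaurus relationship, can be sketched as below, together with the usual set-based precision and recall computed from binary relevance judgements. The thesaurus entry, the document identifiers, and the function names are invented for illustration; they are not taken from the ProQuest Controlled Vocabulary or from the study's data.

```python
# Hypothetical thesaurus record; NOT actual ProQuest Controlled Vocabulary content.
THESAURUS = {
    "advertising": {
        "synonyms": ["publicity"],
        "narrower": ["direct mail advertising"],
        "broader": ["marketing"],
        "related": ["public relations"],
    },
}


def expand(query_terms, relation, thesaurus=THESAURUS):
    """Append every term standing in the given relation to any query term."""
    expanded = list(query_terms)
    for term in query_terms:
        for candidate in thesaurus.get(term, {}).get(relation, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def precision_recall(retrieved, relevant):
    """Set-based precision and recall from binary relevance judgements."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


for relation in ("synonyms", "narrower", "related", "broader"):
    print(relation, expand(["advertising"], relation))

# Hypothetical retrieved list and relevance judgements for one query form.
print(precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"]))
```

Interactive expansion would differ only in that the user picks a subset of the candidate terms instead of the system taking all of them.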

Evaluating Internet Resources: Identity, Affiliation, and Cognitive Authority in a Networked World
John W. Fritch and Robert L. Cromwell
Page 499, Published online 8 March 2001

The filters in print media that provide authority are not available on the Internet, so authorship, and thus accountability, is uncertain. Determining true authorship and affiliation is likely to be the most significant need in establishing the cognitive authority of a site. Fritch and Cromwell suggest assessing documents, authors, institutions, and affiliations separately, followed by integration of the results, while indicating confidence in decisions on a separate scale. In their example, confirming the connection of the domain name to the assumed sponsor via the Whois search is a first step. Looking for author statements and affiliations to other sites is the second. The identification of overt and covert links may disclose bias.

Book Reviews

Electronic Expectations: Science Journals on the Web, by Tony Stankus
Michael Fosmire
Page 508, Published online 16 February 2001

Snap to Grid: A User's Guide to Digital Arts, Media, and Cultures, by Peter Lunenfeld
G. Benoit
Page 509, Published online 16 February 2001

From Web to Workplace: Designing Open Hypermedia Systems, by Kaj Gronbaek and Randall H. Trigg
Ina Fourie
Page 510, Published online 21 February 2001

Organizing Audiovisual and Electronic Resources for Access: A Cataloging Guide, by Ingrid Hsieh-Yee
Karen Spern
Page 512, Published online 21 February 2001
