Competency E
Design, query, and evaluate information retrieval systems.
Statement
As library and information science professionals, one of our responsibilities is to facilitate the retrieval of information for all types of users, and there are many ways to accomplish this. For example, we can facilitate retrieval by communicating with users face to face, over the telephone, through the Web, or by creating and maintaining effective information retrieval systems (IRS) for users to navigate on their own or with a librarian's assistance and instruction. As our world grows in complexity, as our lives become more interconnected and dependent on technology, and as institutions grow and people continue to learn new skills, the technology and interface design of the IRS people use to find information must also evolve so that the exchange of information remains as fluid as possible. The ability to design, query, and evaluate information retrieval systems allows information professionals to manage the organization and delivery of information for users throughout the world. Designing an IRS involves choosing the most efficient way to represent information so that a user's need to know how the information is organized is eliminated (or at least drastically minimized), freeing users to focus on conveying their information needs. Today, the average person deals with ever more information in all aspects of life, and part of that dealing is the creation of even more information. Our job, as knowledge workers, is to do our best to organize, manage, make sense of, and communicate information so as to provide the best quality of service to as many people as possible.
Designing an IRS can be viewed as a method of communicating with users. As designers, we have to predict the types of queries users will create, we must create queries of our own to test our designs and our own understanding of the information-seeking process, and we must continuously evaluate the feedback we receive so that retrieval systems can be improved to produce more relevant results and greater user satisfaction. The more efficiently we communicate with our users through database design, the less complicated it becomes for them to retrieve information. Users possess many different levels of understanding of the information-seeking process, which means retrieval systems will be used and interpreted differently by different people. We must design IRS so that we can communicate with all users, and IRS designers must strive to create intuitive, simple interfaces for users to interact with. The presentation or display of information to the user is one of the most important components of IRS design: it includes deciding what to display, how to arrange the components on the screen, what symbols and language to use, and how to communicate the way it all fits together so users understand.
Evidence
As evidence for this competency I will discuss two assignments I completed for LIBR 202 Information Retrieval with Professor Enid Irwin. The first piece of evidence is a report introducing, explaining, and describing a yarn database created by four classmates and me (Team 7). The primary purpose of the assignment was to give Team 7 experience with the creation of an IRS. The functional purpose of the IRS Team 7 created was to enable knitters to keep an inventory of their unused yarn. The driving idea behind the database was that querying it would be a more efficient method of yarn retrieval than rummaging through bins of random, unorganized yarn each time a new knitting project comes to mind. The project was extensive, involved, and time consuming, and because of this each team member was an equal collaborator during all phases of the assignment.
I chose this piece of evidence because it demonstrates my ability to create a practical, working database. The assignment took Team 7 through several steps of the design process: selecting an audience/user group to serve, assessing the users' needs, identifying yarn attributes to include in the database, developing the data structure (including validation lists where necessary) using Inmagic DB/TextWorks database software, creating indexing rules to ensure consistency and accuracy as the database was populated, creating records to populate the database, and beta testing the database with another team to see how well our indexing rules guided them in creating records. By working through these steps as a team we began thinking about description and surrogates conceptually: we learned how to predict user needs by selecting the most relevant yarn attributes and assigning values to each one. As we worked together to understand the intricacies of database design, we each played the roles of database designer and indexer. As designers, we learned to create the rules and standards necessary to keep the records, fields, and data consistent throughout the database in order to facilitate successful, easy information retrieval. As indexers, we learned how important it is to consider the user group being served; through trial and error we determined which attributes or fields would need to be searchable, and for what reasons. My understanding of how IRS design can serve as a method of communicating with users developed during the indexing phase of this assignment. I had to think as a user of the database would while trying to retrieve a skein of yarn: by predicting the search terms users would employ, we increase the percentage of relevant records they retrieve.
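A minimal sketch can illustrate the kind of consistency our validation lists were meant to enforce. The field names and allowed values below are invented examples, not Team 7's actual yarn attributes, and the check is written in plain Python rather than in DB/TextWorks:

```python
# Hypothetical validation lists: each field's value must come from a
# controlled list so that records stay consistent across indexers.
VALIDATION_LISTS = {
    "fiber": {"wool", "cotton", "acrylic", "alpaca", "silk"},
    "weight": {"lace", "fingering", "sport", "worsted", "bulky"},
}

def validate_record(record):
    """Return a list of indexing-rule violations for one yarn record."""
    errors = []
    for field, allowed in VALIDATION_LISTS.items():
        value = record.get(field, "").lower()
        if value not in allowed:
            errors.append(f"{field}: {record.get(field)!r} not in validation list")
    return errors

validate_record({"fiber": "wool", "weight": "worsted"})   # passes: no errors
validate_record({"fiber": "nylon", "weight": "worsted"})  # one violation flagged
```

A record whose values all come from the lists passes cleanly; any out-of-vocabulary value is flagged, which is exactly the kind of inconsistency our indexing rules were written to prevent.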
The next piece of evidence is the second part of a two-part assignment, chosen because it demonstrates my ability to query and evaluate information retrieval systems. Part A of the assignment was to design, as a team, a database of surrogates for a collection of documents about information storage and retrieval. The documents used to create the database were selected from a list of supplemental readings for LIBR 202. Part B, which serves as evidence here, was an individual assignment to critique the database of surrogates created in Part A. Each team member assumed the role of database analyst/evaluator and was tasked with writing a critical evaluation of the ability to search by subject in our Part A database. The evaluation of subject access focused primarily on the usefulness of the pre-coordinate (pre-co) and post-coordinate (post-co) vocabularies and on the effectiveness of the title and abstract fields in producing relevant results.
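The difference between the two vocabularies can be sketched roughly as follows; the headings and terms are invented examples, not the team's actual vocabulary:

```python
# Pre-coordinate: terms are combined into compound headings at indexing
# time, so the searcher must match the whole heading as stored.
pre_co = ["information retrieval -- evaluation"]

# Post-coordinate: terms are stored separately and combined by the
# searcher at query time (e.g., with Boolean AND).
post_co = {"information retrieval", "evaluation"}

def search_pre(headings, query):
    # Only an exact compound heading retrieves the record.
    return query in headings

def search_post(terms, query_terms):
    # Any subset of the stored terms can be intersected at query time.
    return set(query_terms) <= terms

search_pre(pre_co, "information retrieval -- evaluation")  # matches
search_pre(pre_co, "evaluation")                           # partial heading fails
search_post(post_co, ["evaluation"])                       # single term succeeds
```

The sketch shows why we evaluated the two vocabularies separately: a pre-co heading is precise but brittle, while post-co terms are flexible but can combine in ways the indexer never intended.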
One of the most useful learning components of this assignment was the creation and testing of individual queries developed by each team member. By having each team member create one query, we covered a relatively broad range of subjects and more fully tested the effectiveness of the team's earlier indexing efforts. Instead of each member creating and testing four different queries, the team shared queries and results, so each member had to test only one query using four different points of subject access (APA Citation, Abstract, Pre-Co, and Post-Co). Within each search field, queries were tested using three different search methods (Boolean AND, Boolean OR, and natural language). By sharing the query testing, the team drastically reduced the individual bias each member brought to the tests.
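The three search methods can be compared with a simple illustrative matcher. This is plain Python, not the DB/TextWorks search engine we actually used, and the abstract text and query terms are invented:

```python
def match(field_text, terms, method):
    """Return True if one field satisfies the query under the given method."""
    words = set(field_text.lower().split())
    terms = [t.lower() for t in terms]
    if method == "AND":   # Boolean AND: every term must appear in the field
        return all(t in words for t in terms)
    if method == "OR":    # Boolean OR: any single term is enough
        return any(t in words for t in terms)
    # "natural language" stand-in: treat a field as a hit when at least
    # half the query terms appear (a crude best-match heuristic).
    return sum(t in words for t in terms) >= len(terms) / 2

abstract = "indexing and retrieval of text documents in databases"
match(abstract, ["indexing", "retrieval"], "AND")  # both terms present: hit
match(abstract, ["indexing", "thesaurus"], "AND")  # one term missing: miss
match(abstract, ["indexing", "thesaurus"], "OR")   # one term suffices: hit
```

Running one query three ways like this makes the trade-off visible: AND narrows the result set toward precision, OR widens it toward recall, and the natural-language heuristic falls somewhere in between.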
Creating queries, testing them, and then evaluating the results for relevance (rates of precision, recall, and fallout) demonstrates that I understand the basic concepts of IRS design, querying, and evaluation. After participating in the query testing process I am able to conceptualize what it means to create a successful IRS. Database designers strive for high rates of precision and recall so that a high overall rate of relevance is obtained; however, we must remember that measuring relevance is not an exact science. Ultimately, relevance is determined by the individual with an information need, who must decide whether or not a query has been satisfied. As a professional librarian I will be able to apply my knowledge of IRS design and evaluation when creating new IRS to assist patrons with information retrieval. My ability to evaluate information retrieval systems will help the institutions I work for refine their systems toward intuitive user interfaces, which in turn will produce more accurate search results and more satisfied users.
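The three relevance measures named above can be expressed as a short sketch, assuming each query's retrieved and relevant records are known as sets of record IDs. This is an illustration of the standard formulas, not the actual calculations from our report:

```python
def evaluate_query(retrieved, relevant, total_records):
    """Compute precision, recall, and fallout for one test query.

    retrieved: set of record IDs the query returned
    relevant:  set of record IDs judged relevant to the query
    total_records: total number of records in the database
    """
    true_hits = retrieved & relevant
    # precision: share of retrieved records that are relevant
    precision = len(true_hits) / len(retrieved) if retrieved else 0.0
    # recall: share of all relevant records that were retrieved
    recall = len(true_hits) / len(relevant) if relevant else 0.0
    # fallout: share of all non-relevant records that were retrieved
    nonrelevant_total = total_records - len(relevant)
    fallout = (len(retrieved - relevant) / nonrelevant_total
               if nonrelevant_total else 0.0)
    return precision, recall, fallout

# Toy database of 10 records: the query returns 4, and 5 are relevant overall.
p, r, f = evaluate_query({1, 2, 3, 4}, {2, 4, 6, 8, 10}, 10)
# precision = 2/4 = 0.5, recall = 2/5 = 0.4, fallout = 2/5 = 0.4
```

The sketch also makes the tension concrete: retrieving more records can only raise recall, but it usually lowers precision and raises fallout at the same time.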
Conclusion
The two assignments described above provide some background on my studies in information retrieval system design, querying, and evaluation. With the successful completion of these assignments, I can now discuss, in terms of the effectiveness of an IRS design, the concepts of precision (the percentage of retrieved records that are relevant to a search), recall (the ratio of relevant records retrieved to the total number of relevant records in the database), and fallout (the ratio of non-relevant records retrieved to the total number of non-relevant records in the database). I have also developed an understanding of the aspects of information retrieval that database designers have the most difficulty anticipating during the design process. As explained by Meadow, Boyce, Kraft, and Barry (2008), we have to anticipate the best way to represent information so users understand it easily; we have to design IRS to help users express their information needs and formulate queries, a skill many people are never taught; and we have to anticipate the level of sophistication at which our systems will perform, which requires a solid understanding of what the targeted user group needs to know, or is willing to learn, in order to use the system effectively.
Constant evaluation of IRS is necessary because no system is ever complete or perfect; evaluation shows us how to make IRS better. Performing test queries and evaluating the system's performance in response to them allows us to assess system usability, one of the most important components of IRS design. Often, users will apply minimal effort when searching for information: they will take the path of least resistance and stop searching once their perceived information need has been met. Without the knowledge to properly evaluate information for relevance or authority, users often leave a search having achieved much less than they could have with proper training or know-how. To be successful, IRS designers need to find ways to marry ease of use with the ability to return a high percentage of relevant records.
References
Meadow, C. T., Boyce, B. R., Kraft, D. H., & Barry, C. (2008). Text information retrieval systems (3rd ed.). United Kingdom: Emerald Group Publishing Limited.
Evidentiary Material
databasedesign_libr202.pdf (PDF, 311 KB)
query_libr202.pdf (PDF, 343 KB)