As promised, here is the second instalment of our paper on search strategy formulation, which Andy MacFarlane presented at the 4th Spanish Conference in Information Retrieval in Granada last week. Andy has been teaching IR and search strategies for many years, and this paper represents a synthesis of his framework and my research insights. It describes a structured way to think about search strategy development and (hopefully) offers some valuable advice on how best to teach such skills. As always, comments & feedback welcome!
5. A FRAMEWORK FOR LEARNING
In this section we develop a framework for learning based on the methodology in section 4. Systematic review often requires two main activities: an initial search to identify any existing systematic reviews on the subject, and a full search if/when no prior review is found. The framework can be applied in both activities, but is more critical for the latter due to the complexity of the task. At each stage we outline good practice and identify common sources of error, then outline learning objectives, curricula and learning materials, teaching methods, and assessment and feedback methods.
5.1 Cognitive stage
A key problem at this stage is the growing volume of published medical studies, with the number increasing year on year. It is worth stressing the need for the user to reflect on the current state of their search and its ability to identify relevant studies that address the research question. Searchers should also be aware of the importance of sources, i.e. the databases that contain relevant information to fulfill the given information need. These will include peer-reviewed literature in prestigious journals, but other sources should also be included, such as non-English language articles, the ‘grey’ literature, non-refereed journals, conference proceedings, company reports etc. This ensures that the searcher understands the comprehensive nature of the requirements of a systematic review, and that the search needs to be exhaustive before any filtering of the literature can take place. However, searchers should be aware of information quality given the range and type of material available, e.g. potential bias or error in published studies. To this end it would be useful to introduce the searcher to information literacy ideas to help them think through these issues. Standard checklists are used by search professionals to validate their search strategy. This includes the identification of a gold standard of known relevant records in section B of the standard checklist, which can be used to further citation pearl growing search strategies.
5.1.1 Learning Objectives
The learning objectives could include an understanding of the following: 1) the importance of sources and potential bias in those sources, assessing information quality; 2) the exhaustive nature of systematic reviews and the process as a whole; 3) the notion of relevance and the use of gold standard records to assist strategy development.
5.1.2 Curricula and learning materials
A key resource would be the Cochrane Handbook for Systematic Reviews of Interventions, but general schemes on the body of professional knowledge from organizations such as CILIP would also be useful. There is useful work by Bates et al. which surveys LIS curricula in Europe and recommends the use of Wilson’s nested model to guide curricula design; this would embed the learning materials in the key academic work on information seeking over many years.
5.1.3 Teaching methods
Clearly the student needs to take a step back and understand the information need in detail before attempting searching, as recommended by Cohen. The first author has used this method for a number of years, teaching information needs in conjunction with the linguistic stage, and this allows the student to focus on getting things right at the start. Putting students in groups and getting them to discuss the issues in tutorials has been found to be a very successful form of learning.
5.1.4 Assessment and Feedback methods
A focus on assessment could involve encouraging students to develop their self-reflection skills through either formative or summative feedback schemes (perhaps even using peer review). Assessment would focus on information literacy tasks, assessing the information quality of sources using known positive and negative examples and their relation to relevance.
5.2 Linguistic stage
A key issue at this stage is the ability to define an appropriate research question given the information need identified at the cognitive stage. It would be useful to practise writing a document which describes this question, including its objectives, the subject area, the population concerned, the type of evidence for evaluation and the outcomes required. Once this is done, support can be provided for the extraction of facets from the healthcare question, using PICO or some other appropriate facet analysis scheme. If appropriate and available, tools for extracting PICO elements or other information, such as ExaCT, could be useful. Section A of the standard checklist can be used to collect information about the information need, including the authors’ stated objective, the focus of the research etc.
5.2.1 Learning Objectives
The learning objectives could include the following: 1) defining and documenting a clear research question; 2) using facet analysis techniques such as PICO to analyze the research question effectively.
5.2.2 Curricula and learning materials
Reference to good practice provided by Cochrane on how to conduct systematic reviews would be appropriate. This would give the student an overall idea of how to initiate work on a research question and keep it up to date. A collection of example facet analyses for healthcare topics (e.g. with the PICO scheme) would be useful learning materials.
5.2.3 Teaching methods
Since facet analysis is not an exact science, students should be encouraged to develop their own ideas and to refer to case studies and examples illustrating good practice. This can be done individually or in group tutorials, or through online tasks using E-learning materials.
5.2.4 Assessment and Feedback methods
As with the previous level, both formative and summative feedback schemes are appropriate, providing individualized feedback to address specific student issues. The use of MCQs could be considered, but only to address known issues in facet analysis such as placing the terms in the correct PICO element.
5.3 Strategic stage
This stage is concerned with translating the facet analysis to the search strategy. A key issue is understanding the relationships between facets: in the case of Boolean search strategies, OR is typically applied to terms within facets, and the AND operator is applied between facets (see section 4.3). Students can confuse the two and use inappropriate operators e.g. AND within a facet. One author has been teaching this material for 15 years, and it is a common source of error.
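The within-facet OR and between-facet AND rule can be sketched in a few lines of Python. This is only an illustration: the function name `build_boolean_query` and the facet terms are invented for the example, and the output is a generic Boolean string rather than the syntax of any particular search system.

```python
def build_boolean_query(facets):
    """Combine terms with OR inside each facet, then AND the facets together."""
    blocks = []
    for terms in facets.values():
        # Quote multi-word terms so they read as phrases.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        blocks.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(blocks)


# Hypothetical facet analysis, for illustration only.
facets = {
    "population": ["adults", "elderly"],
    "intervention": ["aspirin", "acetylsalicylic acid"],
    "outcome": ["stroke", "cerebrovascular accident"],
}

query = build_boolean_query(facets)
print(query)
```

Swapping the two operators in this function reproduces the common error described above: AND within a facet demands every synonym at once, while OR between facets retrieves anything that matches a single facet.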
A number of key problems at the strategic level are identified by Sampson et al. In some cases the wrong line number is used in a step, either omitting a set or using an incorrect set (this applies to any of the strategies described in sections 4.3.1 to 4.3.2). The searcher can avoid this by drawing the relationship between line numbers/sets, to show the relationship between or within facets depending on the focus. MeSH and free text terms used on the same line can compromise reuse. A simple solution to this is to address the granularity of the strategy, and provide examples of when MeSH and free text terms could be decoupled. Terms can be reused without rationale, leading to redundancy, which may not harm the search but may slow down run times for large searches and complicate the strategy unnecessarily. A way round this is to check for the use of a given term more than once in a strategy, and ensure that the term is required at that particular stage. In the case where searches are required over a number of databases, training on how to tailor the search strategy to each database should be provided. This should include a clear description of the strategy for the purposes of reproducibility (which is good practice in systematic reviews). Section C of the standard checklist provides examples of issues to think about when forming the search strategy, including adapting an already existing search strategy, using a database thesaurus and thinking about how the final combination of terms was selected (see sections 5.3.1 to 5.3.3 below).
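Two of these checks (line numbers that point at the wrong or a missing set, and terms repeated without rationale) are mechanical enough to sketch in code. The example below assumes a simplified, made-up representation in which each numbered line is either a single search term or a combination of earlier line numbers; real search histories are richer than this, so treat it as a teaching aid rather than a validator.

```python
OPERATORS = {"AND", "OR", "NOT"}

# Hypothetical numbered strategy, for illustration only.
strategy = {
    1: "aspirin",
    2: "acetylsalicylic acid",
    3: "1 OR 2",
    4: "stroke",
    5: "aspirin",   # same term as line 1
    6: "3 AND 7",   # line 7 does not exist
}


def tokens(text):
    return text.replace("(", " ").replace(")", " ").split()


def is_combination(text):
    """A combination line holds only line numbers, operators and brackets."""
    toks = tokens(text)
    return any(t.isdigit() for t in toks) and all(
        t.isdigit() or t in OPERATORS for t in toks
    )


def check_strategy(strategy):
    problems = []
    referenced = set()
    term_lines = {}
    for number in sorted(strategy):
        text = strategy[number]
        if is_combination(text):
            for tok in tokens(text):
                if tok.isdigit():
                    ref = int(tok)
                    referenced.add(ref)
                    # A reference must point at an earlier, existing line.
                    if ref not in strategy or ref >= number:
                        problems.append(f"line {number}: refers to invalid line {ref}")
        else:
            term_lines.setdefault(text.lower(), []).append(number)
    for term, lines in term_lines.items():
        if len(lines) > 1:
            problems.append(f"term '{term}' repeated on lines {lines}")
    last = max(strategy)
    for number in sorted(strategy):
        # Every set except the final one should feed into a later line.
        if number != last and number not in referenced:
            problems.append(f"line {number}: set is never used")
    return problems


problems = check_strategy(strategy)
for problem in problems:
    print(problem)
```

A repeated term or an unused set is not necessarily wrong, so the output is best read as a list of prompts for the searcher to justify, in line with the advice above.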
5.3.1 Building blocks
A key issue with this strategy is to get users to understand the drawbacks of the method e.g. a searcher focusing on one facet may lose focus on the whole topic (section 4.3.1). Users should be trained to understand that if they are to use the method, a clear understanding of each facet must be gained. This could include continual review of the information need and any related topics which could be useful for each facet. Links and relations between facets should be identified by the searcher and recorded in a checklist.
5.3.2 Successive fractions
In this strategy the sequential order of the facets is crucial, and the user needs to be taught to think about the starting point. Normally this would be the most specific facet first (e.g. the type of patient in PICO), with other more general facets following after (e.g. outcome in PICO). This is particularly important in more ad-hoc methods of analysis where the user has identified their own facets e.g. Object, Activity, Date. In such cases it would be better to start with the Object/Activity facets and finish with the Date facet. As with building blocks, links and relations between facets should be identified by the searcher and recorded in a checklist.
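The step-by-step narrowing can be illustrated with a toy collection. The record ids, facet labels and the function `successive_fractions` below are all invented for this sketch; the point is only to show the intermediate set sizes that the searcher inspects after each facet is applied.

```python
# Toy collection: record id -> facets the record matches (hypothetical data).
records = {
    1: {"object", "activity", "date"},
    2: {"object", "date"},
    3: {"object", "activity"},
    4: {"date"},
    5: {"activity", "date"},
}


def successive_fractions(records, facet_order):
    """Narrow the result set one facet at a time, recording each step's size."""
    current = set(records)
    trace = []
    for facet in facet_order:
        current = {rid for rid in current if facet in records[rid]}
        trace.append((facet, len(current)))
    return current, trace


result, trace = successive_fractions(records, ["object", "activity", "date"])
print(result)
print(trace)
```

In this pure Boolean toy the final set does not depend on the order of the facets. What changes is the sequence of intermediate sets, which is where the searcher decides whether to stop or continue, so the starting facet still matters in practice.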
5.3.3 Citation Pearl Growing
This requires an understanding of the use of gold standard records (the ‘pearls’) to develop an overall strategy. Section D of the checklist provides useful advice on considering issues such as sensitivity (recall), precision and specificity. The ‘pearls’ can be used to check each metric, and the strategy developed to meet a certain criterion e.g. a preference for a high level of sensitivity (recall) whilst ensuring a threshold of 50% for specificity. This is done by checking to see if the ‘pearls’ are retrieved by the search strategy, and an interactive process in search strategy development may be needed in order to ensure that all ‘pearls’ are retrieved. The balance of the two can be adapted to the given needs of the searcher, but the linkage between the different terms needs to be emphasized. Choice of further strategies such as building blocks can then be addressed.
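As a rough sketch of how the ‘pearls’ can be used to check a strategy, the following computes sensitivity, specificity and precision for one set of results. It assumes the searcher also holds a set of records known to be irrelevant, which is needed for specificity; the record ids and the function name `evaluate` are made up, and precision here is measured only over records whose relevance is known.

```python
def evaluate(retrieved, pearls, known_irrelevant):
    """Score one search result against gold standard and known irrelevant records."""
    retrieved = set(retrieved)
    true_pos = len(retrieved & pearls)
    false_pos = len(retrieved & known_irrelevant)
    true_neg = len(known_irrelevant - retrieved)
    judged = true_pos + false_pos
    return {
        "sensitivity": true_pos / len(pearls),
        "specificity": true_neg / len(known_irrelevant),
        # Precision over judged records only; unjudged records are ignored.
        "precision": true_pos / judged if judged else 0.0,
        "missed_pearls": sorted(pearls - retrieved),
    }


# Hypothetical record ids, for illustration only.
pearls = {"r1", "r2", "r3", "r4"}
known_irrelevant = {"n1", "n2", "n3", "n4", "n5"}
retrieved = {"r1", "r2", "r3", "n1", "n2", "x9"}

scores = evaluate(retrieved, pearls, known_irrelevant)
print(scores)
```

The `missed_pearls` list is the practical output: each missed record can be inspected for terms that the strategy lacks before the search is run again.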
5.3.4 Learning Objectives
The learning objectives could include the following: 1) effective translation of facet analysis into an appropriate search strategy; 2) understanding the different forms of search strategy, their similarities and differences and when to apply a given strategy for a particular problem.
5.3.5 Curricula and learning materials
The curricula would focus on the different forms of search strategies available, with a clear link made to the facet analysis. The problems identified earlier in this section should be specifically addressed and built into the learning materials. Each of the search strategies needs to be clearly explained with appropriate examples, with differences between building blocks, successive fractions and citation pearl growing demonstrated.
5.3.6 Teaching methods
There are a number of different methods for teaching search strategies, including that of Bhavnani et al., which uses taxonomies of both tasks and general IR strategies to build a methodology to learn to search by 1) learning specific search strategies for frequent tasks, 2) using strategies for given contexts, 3) learning how to execute a strategy accurately and 4) applying strategies across different applications (in conjunction with the syntactic level below). Use of graphical online tools would also be a useful addition to the learning experience, e.g. to show the relations between intermediate search sets.
5.3.7 Assessment and Feedback methods
MCQs can be used to test understanding of the form of strategy, e.g. MacFarlane specifies an example set of questions (labelled under the group C element of the MATH taxonomy), which would use questions on the different forms to allow the user to assess their own understanding. One example is giving the student a facet analysis and asking them to identify the correct building blocks strategy from a number of distractors. Key problems identified among the common errors above should be built into the distractors, e.g. using OR between facets instead of AND.
5.4 Tactical stage
The strategic and tactical stages are closely related and often need to be considered simultaneously. This requires thought on the use of terms and operators (section 3.4).
A number of common errors at the tactical level are identified by Sampson et al. Spelling errors are a significant issue. Applying appropriate thesauri or other knowledge organization schemes (e.g. taxonomies, ontologies) can require further verification of medical terms. Google may be used as a source of verification but has limited value as the terms returned may reflect similar errors made on the web and may not provide relevant terms for the domain. Missed spelling variants can be dealt with by teaching the searcher to think about variations of words and use truncation as a tactic. However, the searcher can inadvertently choose irrelevant MeSH or free text terms, or alternatively miss other useful MeSH terms. A further problem is that MeSH terms can be exploded without any effect if the term is at the bottom of the hierarchy, since no further child terms exist. Encouraging the learner to reflect on the terms used and providing training on the MeSH scheme can help address these issues.
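The point about exploding a term at the bottom of the hierarchy can be shown with a toy thesaurus. The tree below is a made-up fragment used purely for illustration and should not be read as the real MeSH structure; `explode` is a hypothetical helper, not a function of any search system.

```python
# Made-up fragment of a thesaurus tree (not real MeSH data).
children = {
    "Cardiovascular Diseases": ["Heart Diseases", "Vascular Diseases"],
    "Heart Diseases": ["Arrhythmias", "Heart Failure"],
}


def explode(term, children):
    """Return the term plus all of its descendants in the hierarchy."""
    expanded = [term]
    for child in children.get(term, []):
        expanded.extend(explode(child, children))
    return expanded


mid_level = explode("Heart Diseases", children)
leaf = explode("Heart Failure", children)
print(mid_level)
print(leaf)
```

Exploding the leaf term returns only the term itself, so the explode adds nothing to the search; it is harmless but signals that the searcher may not have looked at where the term sits in the hierarchy.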
Section C of the standard checklist provides examples of issues to think about when forming tactics, including different types of term checking, such as terms extracted from documents and gold standard records, terms suggested by experts and terms from database thesauri.
5.4.1 Learning Objectives
The learning objectives could include the following: 1) how to successfully use appropriate tactics within a given strategy, 2) good practice on choosing operators, 3) good practice on choosing terms.
5.4.2 Curricula and learning materials
The learning materials would focus on when to use particular operators in a strategy, e.g. Boolean, proximity or wildcard operators, and best practice on picking terms e.g. those extracted from gold standard records.
5.4.3 Teaching methods
Given the subjective nature of term selection, students can be put into groups and given case studies along with examples of good and bad tactics for those strategies. The use of operators is more objective, and online self-reflection materials can be used.
5.4.4 Assessment and Feedback methods
For term selection tactics, either formative or summative feedback schemes would be appropriate, providing individualized feedback to address specific student issues. MCQs can be used for operator tactics, testing appropriate use against given distractors. These can be delivered with the Group C questions used for strategies above, but as a separate question set, e.g. on the correct use of MeSH terms.
5.5 Logical stage
Closely aligned with the tactical stage is the logical stage of the framework (section 3.4). Two key problems at the logical stage are identified by Sampson et al. The first of these is confusion between the operators AND and OR, with a potentially serious impact on the overall search strategy (section 3.4). This can occur with users unfamiliar with Boolean logic who are used to thinking in terms of AND as an OR: for example a request such as ‘Find me documents about cats and dogs’ is linguistically AND, but semantically it implies OR. This contrast can be confusing for students, so the difference between the natural language use of OR and AND and their Boolean use needs to be highlighted to the user. The second issue is the inappropriate use of the NOT operator, which must be used with care as relevant documents may be eliminated from results. It should be stressed to the learner that the NOT operator should only be used where a given term or set of terms is known to be harmful to the overall search. Further training could be given on the relationship between the word operators (truncation, proximity) and Boolean operators (OR, AND), ensuring that learners understand that the former are special cases of the latter (section 4.5).
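The ‘cats and dogs’ example, and the risk attached to NOT, can be demonstrated with ordinary set operations. The four documents below are invented for this sketch, and Python sets stand in for the result sets of a search system.

```python
# Toy collection: document id -> text (hypothetical data).
docs = {
    1: "caring for cats",
    2: "training dogs",
    3: "cats and dogs living together",
    4: "keeping birds",
}


def matching(term):
    """Ids of documents whose text contains the term as a word."""
    return {doc_id for doc_id, text in docs.items() if term in text.split()}


cats = matching("cats")
dogs = matching("dogs")

both = cats & dogs          # cats AND dogs: only documents mentioning both
either = cats | dogs        # cats OR dogs: what the request usually means
dogs_only = dogs - cats     # dogs NOT cats: drops a document that is about dogs

print(both, either, dogs_only)
```

The request ‘documents about cats and dogs’ is normally satisfied by the OR set, whereas the AND set is much smaller. The NOT example removes document 3 even though it is relevant to dogs, which is why NOT should be reserved for terms known to harm the search.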
5.5.1 Learning Objectives
The learning objectives could include the following: 1) correct use of Boolean and extended Boolean operators.
5.5.2 Curricula and learning materials
The material would focus on understanding Boolean logic using methods such as Venn diagrams, together with providing some understanding of the underlying axioms of the mathematics e.g. AND, OR are symmetric, whereas NOT is not symmetric. This material can be drawn from any good textbook on discrete mathematics. The use of word operations e.g. proximity and wildcards can then be further explained from a Boolean logic perspective.
5.5.3 Teaching methods
Online delivery of the material would be appropriate for this level, with examples and self-assessment for each of the operators. The teaching scheme must not assume that the student is familiar with discrete mathematics. Tutorial group tasks have also proved to be successful for face to face students.
5.5.4 Assessment and Feedback methods
Group A questions could be used to assess the understanding of Boolean and extended Boolean logic by providing text examples and asking the student to pick which queries would retrieve that text.
5.6 Syntactic stage
Implementing the search strategy on an operational information retrieval system is the final stage of the search (section 4.6). The syntax of the different search systems can be very different but there are certain commonalities. In cases where multiple searches are required, training on translation of queries to different systems should be provided. This includes training on unary operators (applied to a single term), binary operators (applied to two terms) and clarification of which operators are symmetric (two different terms can be on either side of the AND, OR operators) and which are non-symmetric (in ProQuest Dialog the proximity operators “”/PRE impose order on words, whilst NEAR does not).
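The difference between an ordered and an unordered proximity operator can be sketched with a toy matcher. The functions `near` and `pre` below are simplified stand-ins written for this example; they do not reproduce the exact semantics of ProQuest Dialog or any other system, where the way distance is counted may differ.

```python
def positions(text, word):
    """Word offsets at which `word` occurs in `text`."""
    return [i for i, w in enumerate(text.lower().split()) if w == word]


def near(text, a, b, distance=1):
    """Unordered proximity: a and b within `distance` words, in either order."""
    return any(
        abs(i - j) <= distance
        for i in positions(text, a)
        for j in positions(text, b)
    )


def pre(text, a, b, distance=1):
    """Ordered proximity: a occurs before b, within `distance` words."""
    return any(
        0 < j - i <= distance
        for i in positions(text, a)
        for j in positions(text, b)
    )


text = "retrieval of information"

# NEAR is symmetric: swapping the terms gives the same answer.
near_ab = near(text, "information", "retrieval", distance=2)
near_ba = near(text, "retrieval", "information", distance=2)

# PRE is not: the answer depends on which term comes first.
pre_ab = pre(text, "information", "retrieval", distance=2)
pre_ba = pre(text, "retrieval", "information", distance=2)

print(near_ab, near_ba, pre_ab, pre_ba)
```

Both operators only ever match text that contains both words, which is one way to show students that proximity is a restricted form of AND.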
One particular problem at the syntactic level is identified by Sampson et al. This is the inappropriate use of truncation e.g. using ‘methods*’ instead of ‘method*’ to capture several terms on that concept. Training on truncation operators and their impact needs to be provided and examples given of both appropriate and inappropriate use.
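The effect of truncating at the wrong point is easy to see by expanding each pattern against a small vocabulary. The vocabulary and the helper `expand` are invented for this sketch, and right truncation with `*` is assumed; the truncation symbol and its behaviour vary between systems.

```python
# Hypothetical index vocabulary, for illustration only.
vocabulary = ["method", "methods", "methodology", "methodological", "metric"]


def expand(pattern, vocabulary):
    """Expand a right-truncated pattern such as 'method*' into matching terms."""
    if pattern.endswith("*"):
        stem = pattern[:-1]
        return [term for term in vocabulary if term.startswith(stem)]
    return [term for term in vocabulary if term == pattern]


good = expand("method*", vocabulary)
bad = expand("methods*", vocabulary)

# Truncation behaves like an OR over every term sharing the stem.
print(" OR ".join(good))
print(" OR ".join(bad))
```

Truncating after the plural ‘s’ matches only ‘methods’ in this vocabulary and silently loses ‘method’, ‘methodology’ and ‘methodological’, which is the kind of paired example of appropriate and inappropriate use that training can draw on.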
5.6.1 Learning Objectives
The learning objectives could include the following: 1) understanding how to translate a Boolean search strategy with relevant tactics into a form which can be executed by an operational information retrieval system.
5.6.2 Curricula and learning materials
Materials will need to be developed for specific systems e.g. ProQuest Dialog, together with a general scheme of how to approach the translation of a generic Boolean query to relevant syntax. This will require a survey of existing systems used in systematic review. The material will need to address problems identified in the literature mentioned above.
5.6.3 Teaching methods
At this stage practice on real systems will be required to ensure that the user can truly understand the final stage. This could require the use of PC labs with specific tasks (perhaps in conjunction with an overall task from all levels of the framework), with work on other levels being done prior to the lab. The teaching method needs to instill some self-reflection, both to establish the process of translating the Boolean query to the target system and to build the student's confidence in what can be a very complex activity. Online materials and self-assessments on individual elements of the system syntax would also be useful.
5.6.4 Assessment and Feedback methods
Assignments can give the student an opportunity to build their confidence and knowledge in search, e.g. by providing an example systematic review case study to search for and allowing them to build an operational query to find information for that case study. In-class tests could also be used, whereby students are provided with pre-defined search strategies and given limited time to form real searches using a given system in a lab. Multiple choice questions can be used to tackle Group B questions, focusing on specific issues or known problems with syntax on a given search service. An example would be to give a list of search forms in the given syntax and get the student to choose the number of correct forms.
6. SUMMARY AND CONCLUSION
We have introduced here a structured search methodology which is used to inform a framework for learning how to develop search strategies which can be used in systematic reviews. This framework includes a number of discrete but interlinked stages: cognitive, linguistic, strategic, tactical, logical and syntactic. The learning framework applied to each stage is as follows:
Cognitive: In this stage the importance of assessing sources will be stressed, in particular understanding the issue of information quality and potential bias in publications. Ideas and concepts in information literacy can be used to inform this part of the framework.
Linguistic: A key skill here is forming a research question given a clinical need, and using an appropriate facet analysis scheme to identify the complementary concepts of the need. Training in the use of standard facet analysis schemes such as PICO is required, together with training on software which can be used to build the facets.
Strategic: Being able to take the facet analysis and form an appropriate search strategy is the key skill that needs to be developed at this stage. This ranges from the initial translation from the facet analysis to the strategy (OR is applied within facets, AND between facets) to choosing the type of strategy to be used: building blocks, successive fractions or citation pearl growing.
Tactical: With a strategy, the choice of terms and operators needs to be considered. Choice of terms will depend on domain knowledge and interaction with a subject matter expert, whilst choice of operator requires the appropriate knowledge of Boolean operators and proximity operators that extend Boolean logic in various ways. Training on the use of field operators would also be appropriate.
Logical: An understanding of the operators identified in the tactical stage is required, in particular the differences and relationships between the operators need to be established as well as the appropriate use of operators.
Syntactic: This final stage needs to be carried out with an operational information retrieval system, and an understanding of the system’s functionality must be provided. The system’s ability to handle intermediate search sets must also be stressed to support the complex search strategies outlined above.
The next stage in this work is to develop learning materials to deliver this learning framework, to engage in outreach activities with users who undertake systematic reviews, and to provide them with a structured learning framework that they can use to improve their knowledge and skills. Guidance on how to develop learning objectives, curricula/learning materials, teaching methods and assessment/feedback for each individual level of the search framework is provided in section 5. It is our plan to develop these concepts further. The proposed outcome of this work is to give users the skills they need to be more effective searchers and to share their knowledge with others who have common interests. A broader outcome is to improve the quality of search strategies used in systematic reviews, thereby improving the quality and accuracy of those reviews.
The authors are grateful to Stephen Robertson, Lyn Robinson and David Bawden for their feedback on the original version of the structured searching framework, and to Stephen Robertson for recommending the Taylor 1968 reference.
- Elliott, J. H., Turner, T., Clavisi, O., Thomas, J., Higgins, J. P. T., Mavergames, C., and Gruen, R. L. Living systematic reviews: an emerging opportunity to narrow the evidence-practice gap. PLoS Medicine, 11, 2.
- Lefebvre, C., Manheimer, E., and Glanville, J. 2011. Searching for Studies. In Higgins, J. P. T., and Green S., Eds. Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 [updated March 2011]. The Cochrane Collaboration. Available on: http://handbook.cochrane.org/.
- Hemingway, P. and Brereton, N. 2009. What is a systematic review? 2nd Hayward Medical Communications.
- Shojania, K.G., Sampson, M., Ansari, M.T., Ji, J., Doucette, S. and Moher, D. 2007. How Quickly Do Systematic Reviews Go Out of Date? A Survival Analysis. Ann Intern Med.
- Tsafnat, G., Glasziou, P., Choong, M.K., Dunn, A, Galgani, F. and Coiera, E. 2014. Systematic review automation technologies. Syst Rev 3,74, 1-15.
- Sampson M, McGowan J. 2006. Errors in search strategies were identified by type and frequency. J. Clinical Epidemiology. 59, 10, 1057–63.
- Fernandez-Luna J.M., Huete, J.M. MacFarlane, A. and Efthimiadis, E.N. 2009. Teaching and learning in information retrieval. Inf Retrieval, 12, 201-226.
- Kuhlthau, C.C. 1997. Learning in digital libraries: an information search process approach. Lib Trends 45, 4, 708-724.
- Kuhlthau, C.C. 1988. Developing a model of the library search process: cognitive and affective aspects. RQ, 232-242.
- McGregor, J. 1994. Information seeking and use: students’ thinking and their mental models. J. Youth Services Lib, 8, 69-76.
- Nicholson, S. 2005. Understanding the foundation: the state of generalist search education in library schools as related to the needs of expert searchers in medical libraries. J. Med. Lib. Assoc. 93, 1, 61-68.
- Henrich, A. and Morgenroth, K. 2007. Information retrieval as e-learning course in German – Lessons learned after 5 years of experience. Proc. 1st international workshop on teaching and learning of information retrieval. Available on: http://tinyurl.com/z2ueueh.
- Sacchanand, C. and Jaroenpuntaruk, V. 2006. Development of a web-based self-training package for information retrieval using the distance education approach. Elec. Lib. 24, 4, 501-516.
- MacFarlane, A. 2011. Using multiple choice questions to assist learning for Information Retrieval. In Efthimiadis, E., Fernandez-Luna, J.M. Huete, J.F. and MacFarlane, A. Eds. Teaching and Learning in Information Retrieval. Springer Verlag, Berlin, 107-121.
- Zhu, L. and Tang, C. 2006. A module-based integration of information retrieval into undergraduate curricula. J. Comp. Sci. Col. 22, 2, 288-294.
- Halttunen, E. and Sormunen, E. 2000. Learning information retrieval through an educational game. Is gaming sufficient for learning. Edu. Info, 18, 289-311.
- Taylor, R.S. 1968. Question-negotiation and information seeking in libraries. College & Research Libraries, 29, 3, 178-194.
- Belkin, N. J., Oddy, R.N. and Brooks, H.M. ASK for Information retrieval: part 1. background and theory. J.Documentation. 38, 2, 61-71.
- Dahlgren Memorial Library. 2016, Evidence-Based Medicine Resource Guide: Types of Clinical Questions. Available on: http://tinyurl.com/zfsa3ob.
- Ranganathan, S.R. 2006. Colon Classification (6th Edition). Ess Ess Publications. New Delhi.
- Markey, K. and Cochrane, P. 1981. Online training and practice manual for ERIC database searchers (2nd Edition). ERIC Clearinghouse on Information Resources, Syracuse University. Available on: http://tinyurl.com/j5v65wb.
- ProQuest Dialog. Available on: http://tinyurl.com/l22vdk9.
- Inskip, C. 2014. Information literacy is for life, not just a good degree: a literature review. CILIP. Available on: http://tinyurl.com/kjaujnh.
- Glanville, J., Bayliss, S., Booth, A., Dundar, Y., Fleeman, N., Foster, L., Fraser, C., Fry-Smith, A, Golder, S., Lefebvre, C., McNally, R., Miller, C., Paisley, S., Payne, L, Price, A. Shaikh, H., Sutton, A., Welch, K. and Wilkinson, A. 2008. So many filters, so little time: The development of a Search Filter Appraisal Checklist. MLA, 96,4, 356-361.
- Wilczynski, N.L., Haynes, R.B., Lavis, JN., Ramkissoonsingh, R. and Arnold-Oatley, A. 2004. Optimal search strategies for detecting health services research studies in MEDLINE. CMAJ 171, 10, 1179-1185.
- Bates, J., Bawden, D., Corderio, I., Steinerova, J., Vakkari, P. and Vilar, P. 2005. Information Seeking and information Retrieval. In Kajberg, L. and Lorring, L. Eds. European curriculum reflections on Library and information Science, Denmark. RSLIS.
- Wilson, T.D. 1999. Models in information behavior research. J.Doc, 55, 3, 249-270.
- Cohen, L.B. 2001. 10 tips for teaching how to search the web. American Libs, Nov, 44-46.
- MacFarlane, A. 2009. Teaching mathematics for search using a tutorial style of delivery. Inf Retrieval, 12, 162-178.
- Bhavnani, S., Drabenstott, K., and Radev, D. 2001. Towards a unified framework of IR tasks and strategies. In Proc. ASIST annual meeting, 38, 340-354.
- Smith, G, Wood, L., Crawford, K., Coupland, M., Ball, G. and Stephenson, B. 1996. Constructing mathematical examinations to assess a range of knowledge and skills. Int J Math Educ Sci Tech, 30, 47-63.