QueryBuilder

QueryBuilder Help

QueryBuilder (QB) provides one-stop shopping for information in FlyBase.

  • Using QB, you can search any field in any report in FlyBase (in a QuerySegment), and then combine the resulting hit-list with searches in other fields, to allow combinatorial searches (combining QuerySegments using Boolean operators).

  • Both simple and complex queries can be built in a few steps.

  • QB supports much more sophisticated searches than QuickSearch or the other search tools on FlyBase, because it takes full advantage of how the data are stored in FlyBase.

  • A useful feature of QB is that a list of FlyBase identifiers or valid symbols can be imported from an external file to use as a query segment.

  • A set of results can be exported to QB from other searches on FlyBase, through the 'Hit list refinement' button at the top right of a hit-list, and then modified to refine the search by adding additional query segments.

There are three options on the QB start page:

- Select a pre-constructed QueryTemplate

- Import a previously saved query

- Build a new query

Select a pre-constructed QueryTemplate

The first option on the QB start page allows one to choose a query from a list of pre-constructed query templates. The available templates are organized by data type/output. To see the list of templates related to a given class of data, choose the data class of interest from the pull down menu at the left. A list of pre-constructed query templates will appear at the right and a data class-specific list of “keywords” will appear at the left. The list of templates can be further refined by selecting one or more of the keywords. Only the templates containing the chosen keywords will remain. To return to the complete set of templates for a given data class, just deselect the chosen keywords.

When you find a template that matches or is similar to your query of interest, click on the button to the left of the template. This will bring you to a QueryBuilder page with the specified query set up and ready to run. To modify the parameters to exactly match your own query specifications, use the green “Edit” tabs present in each segment of the query. Modify the search terms as desired, click “Finish Editing”, and then select “Run Query”.

Importing Saved Queries

Any QuerySchema (a collection of QuerySegments combined using Boolean operators) can be saved for running again at a later date using the 'Save Query' option on the results page. The QuerySchema is saved as a small text file.

How to Build a new Query

To query FlyBase using QB, you must build one or more segments.

To start building a query, click the yellow box titled 'Query is empty... Click here to start building'.

Note that building a segment using the Controlled Vocabulary (CV) hierarchy as your DataSet is slightly different from building a segment with any other data class.

Building a segment using a text-string

STEP 1: Select a data class to search from the DataSet menu.

There are 16 options to choose from. Choosing any of the top 13 DataSets changes the window display to show all the fields found in the report for that DataSet. The first of the three remaining options is to query FlyBase using the controlled vocabularies (CVs) we use to add structured content to some fields. See below for information on using CVs to search FlyBase.

STEP 2: Select a field to search, or use "Any field" to search full records.

STEP 3: Enter the text string to search for. The search algorithm will identify data fields that contain the text string you have entered. You may opt for case sensitivity if desired. Autocomplete will list the field entries corresponding to the text you have typed.

STEP 4: Click the "Finish editing" button.

STEP 5: (optional): To add additional segments, click the "+" button. Additional segments can be joined to existing segments using standard Boolean operators.

Building a segment using a Controlled Vocabulary term

STEP 1: Select "CV Hierarchy (GO/etc.)" from the DataSet drop-down menu.

STEP 2: Clicking this option changes the window display to show the top-level terms from various CVs used in FlyBase. You can either browse through the CVs from these top-level terms or search directly for terms matching what you are looking for, using the search box above the terms. By default, your search will be performed for CV terms from the whole subtree of the term you've chosen. If you wish to search only for the exact CV term you have chosen, select "This CV term only" from the drop-down menu. (Hint: you'll retrieve more results by searching the whole subtree.)

STEP 3: Once you've chosen your term, the window returns to the QB start page, now with the first QuerySegment composed with your chosen CV term.

STEP 4: Click the "Done" button.

STEP 5: (optional): To add additional segments, click the "+" button. Additional segments can be joined to existing segments using standard Boolean operators.
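The difference between searching the whole subtree and choosing "This CV term only" can be pictured with a small sketch. Everything in it is hypothetical: the toy hierarchy, the record IDs and the function names are made up for illustration and are not the FlyBase ontologies or the QB implementation.

```python
from typing import Dict, List, Set

# Hypothetical toy hierarchy: parent term -> direct child terms.
CHILDREN: Dict[str, List[str]] = {
    "appendage": ["leg", "wing"],
    "leg": ["prothoracic leg", "mesothoracic leg"],
    "wing": [],
    "prothoracic leg": [],
    "mesothoracic leg": [],
}

# Hypothetical annotations: record ID -> CV term attached to that record.
ANNOTATIONS: Dict[str, str] = {
    "rec1": "leg",
    "rec2": "prothoracic leg",
    "rec3": "wing",
}


def subtree(term: str) -> Set[str]:
    """Return the term together with all of its descendants."""
    terms = {term}
    for child in CHILDREN.get(term, []):
        terms |= subtree(child)
    return terms


def search_cv(term: str, this_term_only: bool = False) -> List[str]:
    """Return record IDs annotated with the term (or with any term below it)."""
    wanted = {term} if this_term_only else subtree(term)
    return sorted(rec for rec, t in ANNOTATIONS.items() if t in wanted)


print(search_cv("leg"))                       # ['rec1', 'rec2']
print(search_cv("leg", this_term_only=True))  # ['rec1']
```

In this toy data the subtree search returns the record annotated with the child term as well, which is why searching the whole subtree generally retrieves more results.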

Prepare, Check, and Run Query

STEP 1: Check Boolean operators (if the query consists of more than one segment). Default is "AND". Change to "OR" or "BUT NOT" if desired.

STEP 2: Check that QuerySegments are correct. Segments can be modified by clicking on them, or deleted by clicking the "X" in the top right-hand corner of the segment boxes.

STEP 3: Select output options. Default is to show related genes, to provide cross-references to other datasets, and to search D. melanogaster data only. Change if desired.

STEP 4: Click the "Run query" button.
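One way to think about the Boolean operators is as set operations on the hit-lists returned by each segment. The sketch below assumes that "AND" behaves like set intersection, "OR" like union and "BUT NOT" like difference, and that segments are combined left to right, in line with the precedence example given elsewhere in this help page. The `combine` function and the gene IDs are hypothetical.

```python
from typing import List, Set, Tuple


def combine(first: Set[str], rest: List[Tuple[str, Set[str]]]) -> Set[str]:
    """Fold hit-lists together left to right: ((A op B) op C) ..."""
    result = set(first)
    for operator, hits in rest:
        if operator == "AND":
            result &= hits       # keep only IDs present in both hit-lists
        elif operator == "OR":
            result |= hits       # keep IDs present in either hit-list
        elif operator == "BUT NOT":
            result -= hits       # drop IDs present in the new hit-list
        else:
            raise ValueError(f"unknown operator: {operator}")
    return result


# Made-up hit-lists for three segments.
haltere = {"g1", "g2"}
wing = {"g2", "g3"}
leg = {"g4"}

# haltere AND wing OR leg  ->  (haltere AND wing) OR leg
print(sorted(combine(haltere, [("AND", wing), ("OR", leg)])))  # ['g2', 'g4']
```

Because evaluation runs left to right, changing the order of the segments can change the result, so it is worth checking the operators before running a multi-segment query.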

Searching Expression Data

  • Step 1: Select the "Build a new query" option.

  • Step 2: Select the "Expression Patterns" dataset from the DataSet menu.

  • Step 3: Build your query using CV terms in the Stage, Tissue, and Subcellular Location text fields. The auto-complete feature will help you choose valid CV terms to build an expression statement.


  • Hints and Tips:
    The input fields in this form use a sophisticated auto-complete feature. When you begin typing in (or even just click inside) a field, a list of suggested CV terms will appear. For the first field you fill in, all appropriate CV terms for that category are available.

    Each filled search field further constrains the auto-complete function for the remaining fields. For example, if you have entered "gastrula stage" in the Developmental Stage field, the auto-complete function for the Body Part/Tissue search field will include the CV term "parasegment 10", but will exclude the CV term "leg". Likewise, if you have entered the CV term "prothoracic leg" in the Body Part/Tissue search field, the auto-complete function for the Developmental Stage search field will include "adult stage" but exclude "embryonic stage 4".

    If you select only terms suggested by the auto-complete feature, your expression statement query should always match some results.

    Below each search field is a Qualifier field, in which you can enter a qualifier, such as "early" for Developmental Stage, or "apical" for Subcellular Location. Each of the qualifier search fields also has an auto-complete function, and will only offer qualifiers that have been used in curation with the term entered in search field above it.

    Because of this hierarchical auto-completion, it is possible to select a subset of terms that exclude all possibilities in the other field or fields. In this situation, the auto-complete will tell you that there are 'no matching variants'. This is especially true if you select qualifiers for one or more terms. If you run such a query, no hits will be returned. Also, the auto-complete cannot take into account that an expression statement may only exist in, e.g., the "Insertions" dataset, when you are currently searching the "Genes" dataset. In these cases, your search will return no direct hits, but the "Crossreferences" buttons above the results in Step 4 will indicate that there are hits in one or both of the other datasets.

    To avoid running queries which produce no hits, it is highly recommended that you use terms suggested by the auto-complete feature.

    Use of partial contexts and/or wildcards will still allow the auto-complete and search features to function, but may result in over- or under-prediction (inclusion of non-relevant hits, or exclusion of relevant ones) by the complex search/retrieval algorithm.


  • Step 4: Click on the green 'Finish editing' button. You can edit your query before running it by clicking the green 'Edit' button, which will take you back to Step 3.
    You can also add new clauses to your search by clicking on the yellow plus sign button. The logic here is similar to what is used for other QueryBuilder datasets.


  • Hints and Tips:
    Recombinant constructs and transgene insertions can be searched by changing the output option in Step 4 from "Genes" to "Insertions" or "Recombinant Constructs".


  • Step 5: Run your query by clicking on the green 'Run query' button.


  • Hints and Tips:
    Please note that you can view results from any of the three datasets ("Genes", "Insertions" and "Recombinant Constructs") in Step 5, even though you have selected one dataset for your output option. Crossreferences in other datasets are indicated above the output. Clicking one of these links will switch your view to the results from the indicated data set.
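The way each filled field narrows the auto-complete suggestions for the other fields can be sketched as a filter over the curated expression statements. This is only an illustration: the statement list, the function names and the two-field layout are assumptions, and the real feature also handles Subcellular Location and qualifiers.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical curated expression statements as (stage, tissue) pairs.
STATEMENTS: Set[Tuple[str, str]] = {
    ("gastrula stage", "parasegment 10"),
    ("adult stage", "prothoracic leg"),
    ("adult stage", "leg"),
}


def suggest_tissues(stage: Optional[str]) -> List[str]:
    """Tissues that co-occur with the chosen stage (all tissues if none chosen)."""
    return sorted({t for s, t in STATEMENTS if stage is None or s == stage})


def suggest_stages(tissue: Optional[str]) -> List[str]:
    """Stages that co-occur with the chosen tissue (all stages if none chosen)."""
    return sorted({s for s, t in STATEMENTS if tissue is None or t == tissue})


print(suggest_tissues(None))              # every tissue is offered at first
print(suggest_tissues("gastrula stage"))  # ['parasegment 10']
print(suggest_stages("prothoracic leg"))  # ['adult stage']
print(suggest_tissues("pupal stage"))     # [] : nothing left to suggest
```

An empty suggestion list corresponds to the 'no matching variants' message: no curated statement combines the terms chosen so far, so a query built from them would return no hits.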
Features
  • Calculations
  • Calculations can be incorporated into searches of fields that contain numbers.
  • The options are greater than (>), less than (<), plus or minus (+/-) and range (-).
  • Any value, no value
  • Search for the presence or absence of information in a field, rather than a specific value.
  • The options are IS NULL and IS NOT NULL (this query is case sensitive).
  • Logical operators
  • Combine multiple query segments with logical operators.
  • The options are AND, OR, and BUT NOT.
  • Phrases
  • Multiple words are treated as a phrase.
  • Only records that include the search words in the order you specify will be matched.
  • Batch queries
  • Upload a list of FlyBase IDs and search for all related records.
  • Standard Batch download is also available for query results.
  • Hierarchical CV queries
  • Full support for GO and Anatomy/Development term relationships.
  • Searches of CV fields within standard data classes (e.g., Genes) find only records that contain the individual term you specify. The GO/Anatomy CV database associates each term in these CVs with all of the terms below it in the hierarchy, allowing a single search to find records that contain a term or any child of that term.
  • Field type tags
  • Five field type tags help organize and identify search options.
    • CV - Controlled Vocabulary, terms are consistent across records
    • Flag - Flags records with the presence of links of specified type (any search of flag field will be performed as "IS NOT NULL", ignoring user-supplied context)
    • Map - Genetic, cytogenetic, or genomic map data
    • Symbol - Symbols are the only, or predominant, datatype
    • Text - Data is free text, usage may not be consistent from record to record
  • Field content dictionaries
  • Preview the information in a field, or select dictionary entries to use in a search.
  • The field dictionary lists up to 100 most-commonly-used symbols, terms, numbers or words from the data in the selected field.
  • Alternative results
  • Related records in other FlyBase data classes are a click away via the green buttons.
  • QB creates a set of cross-references for the records that match your search criteria. An itemized results list (of Genes records, for example) is displayed for the data class that is selected when a search is run. A series of green buttons at the top of the results page provide links to related records in other data classes (Insertions, for example). With QB you do not need to open each report and click through layers of links to find related information. This feature can also be used to find information that may be difficult to search for directly because of unfamiliar nomenclature (such as Insertion Symbols). Only References are excluded from automatic generation of alternative results (because of the large size of this dataset).
  • Linkouts
  • Related information from other databases is a click away via the yellow buttons.
  • If the records identified by your search include links to external databases, these links are available from the yellow button or buttons in the Linkout section of the results page.
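The Calculations feature listed above can be pictured as turning a short expression into a numeric test. The sketch below is a rough illustration only: the exact syntax QB accepts, the inclusive bounds and the `numeric_filter` function are assumptions, and negative numbers are not handled.

```python
from typing import Callable


def numeric_filter(expression: str) -> Callable[[float], bool]:
    """Turn '> 2', '< 50', '100 +/- 5' or '10 - 20' into a predicate."""
    text = expression.replace(",", "").strip()  # commas are treated as optional
    if text.startswith(">"):
        bound = float(text[1:])
        return lambda x: x > bound
    if text.startswith("<"):
        bound = float(text[1:])
        return lambda x: x < bound
    if "+/-" in text:  # must be checked before the plain '-' range
        centre, margin = (float(p) for p in text.split("+/-"))
        return lambda x: centre - margin <= x <= centre + margin
    if "-" in text:
        low, high = (float(p) for p in text.split("-", 1))
        return lambda x: low <= x <= high
    raise ValueError(f"unrecognised expression: {expression!r}")


more_than_two = numeric_filter("> 2")
print(more_than_two(3), more_than_two(2))  # True False

near = numeric_filter("5,000 +/- 1,000")
print(near(4500), near(6500))              # True False
```

The "plus or minus" form simply widens a value into a range, which is how a search such as a sequence range with "+/- 5000" can pick up features lying just outside the stated coordinates.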
Further Information and Examples
  • Asterisk is wild. An asterisk (*) on either end of your search string, or embedded in the middle of the string, is interpreted as "any character".
    • Stocks|Symbol mam*
    • Alleles|CV:Phenotype Class *maternal*
    • Insertions|Symbol *ptc*
  • Wild cards are not automatically added to QB searches. If a query is unproductive, try it again with * on one or both ends.
  • Search Flag fields with * or any string of letters.
    • Genes|Flag:InteractiveFly default
    • Polypeptides|Flag:Antibody URL (DSHB Hybridoma: *
  • Case-insensitive searches are standard. There are two exceptions:
    • A case-sensitive Symbol search is available for most data classes.
    • The reserved phrases IS NULL and IS NOT NULL are case sensitive.
  • Multiple words are treated as a phrase.
    • Genes|Text:Other information tissue culture cells
  • Cytological location searches are redirected to the GBrowse dataset, which uses estimated sequence ranges of cytological locations.
  • Join query segments with AND, OR, or BUT NOT.
  • When using two or more query segments, QB gives precedence to the previous segments.
    • haltere AND wing OR leg is interpreted as (haltere AND wing) OR (leg)
  • Calculation query examples:
    • GBrowse Data|Exact Number of exons > 2   
    • Polypeptides|Protein size (kD: < 50  
    • Annotations|Map:Sequence range 3L:5,787,637..5,819,561 +/- 5000 (commas are optional)
    • Insertions|Map:Cytogenetic location 67B-D
  • References record sets are created only when the References dataset is searched.
    • References|Author Wakimoto (creates a References dataset)
    • Alleles|Text: Discoverer Wakimoto (does not create a References dataset)
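The wildcard rules at the top of this list can be sketched with a small matcher. It assumes that an asterisk stands for any run of characters (including none), that a pattern without asterisks must match the whole field value, and that matching is case-insensitive unless requested otherwise. The `wildcard_match` function and the sample values are hypothetical and do not reproduce the QB search algorithm.

```python
import re


def wildcard_match(pattern: str, value: str, case_sensitive: bool = False) -> bool:
    """Match value against a pattern in which * stands for any run of characters."""
    # Escape everything except the asterisks, then let each * match anything.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.fullmatch(regex, value, flags) is not None


# Made-up field values for illustration.
print(wildcard_match("mam", "mam[8]"))       # False: no wildcard is added for you
print(wildcard_match("mam*", "mam[8]"))      # True
print(wildcard_match("*ptc*", "P{ptc-x}"))   # True
print(wildcard_match("*maternal*", "Maternal effect", case_sensitive=True))  # False
```

The first two calls show why an unproductive query is worth retrying with * on one or both ends.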
Notes, Known Problems and Features yet to come
  • To find out more about the controlled ontology databases:
  • GO - Gene Ontology:
  • http://www.geneontology.org
  • To search for GO terms and their definitions, we recommend:
  • http://www.ebi.ac.uk/ego
  • To find out more about our Anatomy and Developmental terms, go to Termlink:
  • http://www.flybase.org/cgi-bin/fbcvq.html?start
  • Cross-references to stocks and images are generated, but cross-references from these data types are blocked. This is because these records may include tangentially related objects, such as the set of genes that are mutant in a multiply marked mapping stock.
  • People data are not included in QB.
  • All of the menus and dictionary files are produced automatically. Dictionary files remain on the server for 2 hours. If an index dictionary for a given field isn't already present on the server, it will take a bit of time to generate it.
  • If you encounter any problems with QueryBuilder, or would like help with your queries, please use the contact FlyBase form to write to us.