QueryFlow Keyword Analyzer (Single Patent)
The QueryFlow tool differs from the Keyword Analyzer in several ways:
- QueryFlow uses a powerful algorithm called TF/IDF (Term Frequency/Inverse Document Frequency) to find "important" terms in the source document.
- QueryFlow defaults to the Boolean OR operator, whereas the Keyword Tool defaults to the Boolean AND.
- Terms are weighted when using QueryFlow.
An "important" term is defined as a term that appears in the patent with a high frequency but does not appear very often in the entire patent corpus. As a result, TF/IDF doesn't need stop terms. Terms like the, and, but, with, etc. appear so often in the corpus that they are never identified as important terms. Even patentese terms such as method, apparatus, or system appear so frequently in patent data that the algorithm never deems them important.
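To make the idea of an "important" term concrete, here is a minimal sketch of a textbook TF/IDF calculation. The source does not document the exact formula QueryFlow uses, so the smoothing, the function name `tfidf_scores`, and the tiny three-document corpus are all assumptions made for illustration.

```python
import math
from collections import Counter

def tfidf_scores(doc_tokens, corpus):
    """Score each term in doc_tokens by term frequency times inverse document frequency."""
    tf = Counter(doc_tokens)
    n_docs = len(corpus)
    scores = {}
    for term, count in tf.items():
        # Number of corpus documents that contain the term at least once.
        df = sum(1 for doc in corpus if term in doc)
        # Smoothed IDF (an assumption): a term found in every document gets log(1) = 0.
        idf = math.log((1 + n_docs) / (1 + df))
        scores[term] = (count / len(doc_tokens)) * idf
    return scores

# Hypothetical corpus: each document is reduced to its set of distinct words.
corpus = [
    {"the", "method", "atomizer", "nicotine"},
    {"the", "method", "engine", "piston"},
    {"the", "method", "antenna", "signal"},
]
doc = ["the", "atomizer", "the", "method", "atomizer", "nicotine"]

scores = tfidf_scores(doc, corpus)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this toy corpus, "the" and "method" appear in every document, so they score zero without any stop-term list, while "atomizer" rises to the top.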
Because of the weighting, and the nature of the Boolean OR operator, QueryFlow does a much better job of creating inclusive lists. This does mean that the lists can be very large. However, the most relevant patents will be displayed at the top of the search results list.
Term weighting boosts a term's relevance in the search results and increases the overall precision of your query. Patents with higher weighted terms appear higher in the search results. Weighting terms does NOT impact the recall, which means the same number of documents will be returned by the search engine no matter what weighting you use. But the order in which they appear will be affected.
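The point that weighting changes the order but not the number of results can be illustrated with a simplified model. This sketch assumes a document matches an OR query if it contains at least one term, and that its score is the sum of the weights of the terms it contains. Real search engine scoring is more involved, and the `search` function and sample documents are hypothetical.

```python
def search(docs, weighted_terms):
    """Return (doc_id, score) for every document matching at least one term, best first."""
    hits = []
    for doc_id, words in docs.items():
        matched = [term for term in weighted_terms if term in words]
        if matched:
            # Matching depends only on the terms; weights affect the score alone.
            hits.append((doc_id, sum(weighted_terms[t] for t in matched)))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

docs = {
    "A": {"atomizer", "nicotine"},
    "B": {"nicotine"},
    "C": {"engine"},
    "D": {"atomizer"},
}

first = search(docs, {"atomizer": 100, "nicotine": 50})
second = search(docs, {"atomizer": 50, "nicotine": 100})

print([doc_id for doc_id, _ in first])
print([doc_id for doc_id, _ in second])
```

Both runs return the same three documents. Only the positions of B and D swap when the weights are swapped.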
Where to Find QueryFlow (Single Document)
To access QueryFlow TermExtract (for a single document), go to Analyze (1) on the Document Details window and choose QueryFlow TermExtract, then choose either Claims or Full Text.
In this example we'll choose Full Text.
The QueryFlow Window
The QueryFlow window will ultimately help you create a sophisticated query that looks something like this:
(SPEC:(atomizer^100 OR atomization^97 OR nicotine^96 OR cigarette^95 OR mm-1.3^94 OR smoking^93 OR mouthpiece^92 OR bottle^89 OR ripple^89 OR reed^89 OR smoker^89 OR colpitts^87 OR separator^86 OR piezoelectric^85 OR supplying^84 OR ejection^83 OR cavity^83 OR postponed^81 OR vapor^80 OR substance^80 OR smoker's^79 OR shell^79 OR aerosol^79 OR substitutes^77 OR sensor^77 OR microswitch^76 OR foam^75 OR exhilarant^74 OR liquid^72 OR inebriety^72 OR stream^72 OR porous^71 OR magnetic^71 OR 0.1-3.1^70)) AND APD:([NOW-20YEARS TO NOW])
This query could take an hour or more to build manually. With QueryFlow you can do it in seconds!
The QueryFlow window shown here reflects the same query, but in a form that is much easier to interpret and modify.
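The long query earlier in this section follows a regular pattern: each term is followed by ^ and its weight, the terms are joined with OR inside a field, and a date range is appended with AND. The sketch below assembles a shortened version of that string. The helper `build_or_query` is hypothetical and is not part of AcclaimIP; it only mirrors the syntax visible in the example.

```python
def build_or_query(field, weighted_terms, date_clause=None):
    """Join term^weight pairs with OR inside one field, then append an optional date clause."""
    boosted = " OR ".join(f"{term}^{weight}" for term, weight in weighted_terms)
    query = f"({field}:({boosted}))"
    if date_clause:
        query += f" AND {date_clause}"
    return query

# The first three terms and weights from the example query.
terms = [("atomizer", 100), ("atomization", 97), ("nicotine", 96)]
query = build_or_query("SPEC", terms, "APD:([NOW-20YEARS TO NOW])")
print(query)
```

Writing out 34 weighted terms this way by hand is what makes the manual approach slow.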
Elements of the QueryFlow Window
The default QueryFlow window has the following presets:
- The Boolean OR operator is used by default in the Operator column (1).
- Keywords are listed in the Term column (2) in descending order of importance.
- "Importance," which becomes the weighting or "term boost," is listed in the Weight column (3).
The query that you generate is completely modifiable:
- To delete a term, click the trash icon in the rightmost column.
- To require a term (i.e., switch to a Boolean AND), double-click the word "Favored (OR)" and switch it to the "Required (AND)" operator choice.
- To exclude a term (i.e., switch to a Boolean NOT), double-click the word "Favored (OR)" and switch it to the "Excluded (NOT)" operator choice.
- To add synonyms, double-click the term and enter the synonyms after it, joined by the Boolean OR operator. (You can join them with the Boolean AND instead, but then both terms are required.)
- To add terms not found in the patent, click Add Term in the toolbar. You can then add a term and give it an operator and a weight.
Viewing the Modified Query
The original suggested query was manually modified in the figure above. Each modification causes a small red triangle to appear in the cell to let you know it was modified.
The query generated in this example will look like this:
(SPEC:atomizer^100 AND SPEC:(nicotine^96 OR (cigarette OR "smoking device")^96 OR smoking^93 OR mm-1.3^50 OR mouthpiece^92 OR (bottle OR container)^91 OR postposed^89 OR reed^100 OR ripple^88 OR (separator OR divider)^87 OR cavity^86 OR supplying^86 OR ejection^85 OR piezoelectric^85 OR shell^85 OR colpitts^85 OR terylene^82 OR sensor^82 OR vapor^82 OR liquid^82 OR smokers^82 OR hole^81 OR steel^81 OR substitutes^81 OR aerosol^81 OR microswitch^80 OR foam^80 OR stream^80 OR porous^80 OR valve^80 OR inebriety^80 OR magnetic^80 OR *:*)) AND APD:([NOW-20YEARS TO NOW])
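To show how the rows in the window map onto the query string above, here is a rough sketch that renders Required (AND) rows, Favored (OR) rows, and synonym groups. It is inferred only from the two example queries in this article, not from AcclaimIP's implementation. The function names are hypothetical, and Excluded (NOT) rows are left out because the source does not show how they are rendered.

```python
def render_group(terms, weight):
    """Render one row: a single term, or a parenthesized OR group of synonyms."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]  # phrases need quotes
    body = quoted[0] if len(quoted) == 1 else "(" + " OR ".join(quoted) + ")"
    return f"{body}^{weight}"

def render_query(field, rows):
    """rows are (operator, [term, synonym, ...], weight) with operator 'AND' or 'OR'."""
    required = [f"{field}:{render_group(t, w)}" for op, t, w in rows if op == "AND"]
    favored = [render_group(t, w) for op, t, w in rows if op == "OR"]
    if required:
        # Assumption: the example adds *:* to the OR list once a term is required.
        favored.append("*:*")
    favored_clause = f"{field}:(" + " OR ".join(favored) + ")"
    return "(" + " AND ".join(required + [favored_clause]) + ")"

rows = [
    ("AND", ["atomizer"], 100),
    ("OR", ["nicotine"], 96),
    ("OR", ["cigarette", "smoking device"], 96),
    ("OR", ["bottle", "container"], 91),
]
query = render_query("SPEC", rows)
print(query)
```

The output has the same shape as the start of the example query: one required term, then a weighted OR list in which each synonym group shares a single weight.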
Before you run the query, you can view it first, either with or without class constraints. That way, if you love AcclaimIP's advanced syntax, you can verify that the software is giving you what you want.
You can also choose to save your query into a research folder by clicking Save Query on the toolbar.
Running the Query
To run the query, click Search (1) and choose to run it With Class Constraints (or without class constraints by using the Keywords Only option).
Class constraints will limit the results to the class of the source document (or one of its child classes in the classification hierarchy) and will greatly reduce the size of your results set. You'll likely get more relevant results than searching with Keywords Only, but you might miss some patents found in related classes that are not child classes of the source document's class.
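The trade-off described above can be sketched as a filter over a class hierarchy. The class labels, the `children` mapping, and both function names are made up for illustration. Real classification schemes and AcclaimIP's own filtering are not described in enough detail here to reproduce.

```python
def class_and_descendants(source_class, children):
    """Collect the source class plus every class below it in the hierarchy."""
    allowed = {source_class}
    stack = [source_class]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in allowed:
                allowed.add(child)
                stack.append(child)
    return allowed

def apply_class_constraint(results, source_class, children):
    """Keep only results whose class is the source class or one of its descendants."""
    allowed = class_and_descendants(source_class, children)
    return [r for r in results if r["class"] in allowed]

# Hypothetical hierarchy: X has children X1 and X2, X1 has child X1a; Y is a related class.
children = {"X": ["X1", "X2"], "X1": ["X1a"]}
results = [
    {"id": "P1", "class": "X1"},
    {"id": "P2", "class": "Y"},
    {"id": "P3", "class": "X1a"},
    {"id": "P4", "class": "X"},
]

constrained = apply_class_constraint(results, "X1", children)
print([r["id"] for r in constrained])
```

With X1 as the source class, P1 and P3 are kept. P2 (a related class) and P4 (the parent class) are dropped, which is how a relevant patent can be missed.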
Viewing the Search Results
The search results above clearly demonstrate the power of keyword-rich, weighted, and broad OR'd queries.
But notice there are millions of results. Don't be intimidated! The patents are ranked by relevance, with the best matches on top. As you page through the results, the relevance will eventually drop off, and you can stop reviewing the patents. For example, the very last patent on the list probably hit only on the word "supply," which may appear in many patents that have nothing to do with what you are looking for. But we don't want to limit you to just, say, the top 100 search results, because you could miss important patents. Instead, AcclaimIP returns all of the search results and lets you decide when they no longer apply to what you want.
Note: Because Remove Granted Apps, Remove Expired Patents, and Family Dedupe are turned off, the number of results is higher here.
We also help you investigate your search results by implementing something called "SuperFacets."
The figure above shows the Assignee (Original) SuperFacet (1) from the top 200. SuperFacets are implemented in multiple places in AcclaimIP where you might get 1 million or more search results and relevance is important. The QueryFlow and Find Similar Documents queries are good examples of this.
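One way to picture a SuperFacet is as a count of field values over only the most relevant results, rather than over the whole result set. The sketch below assumes exactly that, based on the "top 200" wording above. The `super_facet` function and the sample data are hypothetical and say nothing about how AcclaimIP computes SuperFacets internally.

```python
from collections import Counter

def super_facet(ranked_results, field, top_n=200):
    """Count the values of one field across the top_n results, most frequent first."""
    top = ranked_results[:top_n]  # results are assumed to be sorted by relevance already
    return Counter(r[field] for r in top).most_common()

# Hypothetical results, best match first.
ranked_results = [
    {"id": "P1", "assignee": "Acme"},
    {"id": "P2", "assignee": "Beta"},
    {"id": "P3", "assignee": "Acme"},
    {"id": "P4", "assignee": "Gamma"},
]

facet = super_facet(ranked_results, "assignee", top_n=3)
print(facet)
```

Limiting the count to the top of the list keeps the low-relevance tail of a very large result set from dominating the facet.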
What QueryFlow is Good For
QueryFlow does an amazing job of finding patents by casting a very broad net while still bringing the best results to the top. SuperFacets identify the top classes where highly keyword-relevant patents appear.
What QueryFlow is Not Good For
QueryFlow (or any keyword method, really) is not very good at isolating patents covering a specific technology. The result sets are generally massive, and before long, the relevance will drop. It is much better to isolate specific technologies using classification queries, class queries with keywords, or manual review (although this last option is not really recommended unless you have nothing but time on your hands).