Document clustering is a technique for organizing text documents into clearly labeled thematic folders without the need for an external knowledge base such as a categorization or classification system.
Some of the benefits of document clustering include:
- Quick overview of a document set summarizing important subjects, concepts, and themes.
- Fast navigation to relevant documents in your search results.
- Query refinement: Many of the themes discovered by clustering will become keywords in your regular searches.
AcclaimIP clusters up to 1,000 documents at a time. While you can cluster any set, you'll get more useful results if you first narrow down your search results using classification and keyword queries so that the patents broadly cover the same topic.
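To make the idea of labeled thematic folders concrete, here is a minimal Python sketch that groups a few short titles into folders named after a term the documents share, using no external classification system. It is a toy illustration only: the text does not describe AcclaimIP's clustering algorithm, and the function name `cluster_by_shared_term`, the stopword list, and the sample titles are all assumptions made for this example.

```python
import re
from collections import Counter, defaultdict

# Small illustrative stopword list (an assumption, not AcclaimIP's).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def cluster_by_shared_term(docs):
    """Group documents into folders labeled by a term they share.

    Each document goes into the folder of its most widely shared term.
    Terms found in every document are skipped because they cannot
    distinguish one folder from another.
    """
    token_sets = [set(tokenize(d)) for d in docs]
    doc_freq = Counter(term for tokens in token_sets for term in tokens)
    folders = defaultdict(list)
    for doc, tokens in zip(docs, token_sets):
        candidates = [t for t in tokens if 1 < doc_freq[t] < len(docs)]
        # Prefer the most shared term; break ties alphabetically.
        if candidates:
            label = min(candidates, key=lambda t: (-doc_freq[t], t))
        else:
            label = "other"
        folders[label].append(doc)
    return dict(folders)


# Hypothetical sample titles, invented for this sketch.
docs = [
    "Endoscope with flexible insertion tube",
    "Endoscope insertion tube coating",
    "Endoscope light source with LED",
    "LED light source for endoscope imaging",
]
folders = cluster_by_shared_term(docs)
for label, members in folders.items():
    print(label, "->", members)
```

Because every sample title contains "endoscope," that word is ignored as a label, which mirrors why a narrowed result set tends to yield more specific folder names.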
Clustering Your Search Results
First, perform a basic search. Then:
- Click the Analyze menu on the toolbar.
- Click the Cluster menu option (1).
- Choose a clustering option to cluster from 100 to 1,000 documents.
AcclaimIP clusters the documents at the top of your results list, following your current sort order. One strategy is to sort your documents with the newest on top, so that the most recent patents are the ones that get clustered.
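The effect of sort order on which documents get clustered can be sketched in a few lines of Python. This is only an illustration of the selection logic described above, not AcclaimIP code: the record fields `id` and `published`, the function `select_for_clustering`, and the sample records are hypothetical, and the only value taken from the text is the 1,000-document ceiling.

```python
from datetime import date

MAX_CLUSTER_SIZE = 1000  # upper limit stated in the text


def select_for_clustering(results, limit=MAX_CLUSTER_SIZE):
    """Return the `limit` newest results, ordered newest first."""
    if not 1 <= limit <= MAX_CLUSTER_SIZE:
        raise ValueError(f"limit must be between 1 and {MAX_CLUSTER_SIZE}")
    newest_first = sorted(results, key=lambda r: r["published"], reverse=True)
    return newest_first[:limit]


# Hypothetical result records, invented for this sketch.
results = [
    {"id": "DOC-1", "published": date(2015, 3, 10)},
    {"id": "DOC-2", "published": date(2021, 7, 1)},
    {"id": "DOC-3", "published": date(2018, 11, 23)},
]
selected = select_for_clustering(results, limit=2)
print([r["id"] for r in selected])
```

With a different sort key, the same cut-off would pass a different subset of the results to the clustering step.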
Cluster a Fairly Narrow Search Result Set
Of course, you should first narrow your search using class and keyword searches to ensure the clustered documents cover roughly the same subject matter.
In the example above, I clustered the first 1,000 patents to capture a random sample of patents covering a wide range of technologies. As you can see, if you don't narrow your search, the clusters might not be very helpful: the First Document subterm under the term Document does not give you enough detail to learn anything about the patents in that folder.
Clustering as a Concept Extracting Engine
Even without narrowing your search, though, clustering can help you uncover the big ideas in a large set of patents much faster than reading and scanning 1,000 patents.
Let's say you have to perform a search on "endoscope," and let's also assume you have never done a patent search in this area before and have little knowledge of the patent landscape.
Simply typing "endoscope" into the Search window and clustering the results gives you the result above. Notice how major concepts, themes, and ideas are presented for you to further analyze.
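As a rough illustration of concept extraction, the Python sketch below counts two-word phrases that recur across a set of titles, leaving out the query term itself so that the remaining phrases point to sub-themes worth searching on. It is a simplified stand-in, not the method AcclaimIP uses: the function `top_phrases`, the stopword list, and the sample titles are assumptions made for this example.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not AcclaimIP's).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def top_phrases(titles, query, n=3):
    """Return the n two-word phrases found in the most titles.

    Phrases containing a stopword or the query term are skipped, and each
    phrase is counted at most once per title.
    """
    excluded = STOPWORDS | {query.lower()}
    counts = Counter()
    for title in titles:
        words = re.findall(r"[a-z]+", title.lower())
        pairs = {
            f"{a} {b}"
            for a, b in zip(words, words[1:])
            if a not in excluded and b not in excluded
        }
        counts.update(pairs)
    # Most frequent first; ties are ordered alphabetically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]


# Hypothetical sample titles, invented for this sketch.
titles = [
    "Endoscope with flexible insertion tube",
    "Endoscope insertion tube coating",
    "Endoscope light source with LED",
    "LED light source for endoscope imaging",
]
themes = top_phrases(titles, "endoscope", n=2)
print(themes)
```

Phrases surfaced this way are candidates for the keyword and classification queries you would use to narrow the search before clustering again.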