As the number of patents filed with the US Patent Office has ballooned over the last two decades, the need for more powerful patent analytics tools has grown. In 2012, the US Federal Government’s America Invents Act (AIA) put into place a new post-grant review process by which any member of the public could challenge an existing patent through the Patent Trial and Appeal Board (PTAB). Our capstone team developed a tool to predict outcomes for this post-grant review process. We developed algorithms to predict two major outcomes: whether a case brought by a member of the public will be accepted by the PTAB and, once that case is accepted, whether the relevant patent will be invalidated by the Board.
In this report, I focus on the former algorithm: acceptance vs. denial prediction. To predict case acceptance or denial, we used natural language processing (NLP) techniques to convert each litigated patent document into thousands of numeric features. After combining these text-based features with patent metadata, we used two primary machine learning algorithms to classify the documents by their acceptance/denial outcome: support vector classification and random forests. I cover both the effort required to wrangle the data and the hyperparameters we tuned for these two algorithms. Our algorithms achieved classification accuracy slightly better than the base rate, that is, the accuracy obtained by always predicting the more common outcome in the skewed data, although room for improvement remains. As the post-grant review process matures, there will be further opportunity to gather more case data, refine the tools we have built over the past year, and increase the confidence associated with post-grant review analytics.
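To make the base-rate comparison concrete, here is a minimal sketch in Python. It assumes the base rate means the accuracy of always predicting the most common outcome. The labels and predictions are made-up toy values, not data or results from this report, and the function names are hypothetical.

```python
from collections import Counter


def base_rate_accuracy(labels):
    """Accuracy of a trivial classifier that always predicts the most common label."""
    if not labels:
        raise ValueError("labels must not be empty")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


def accuracy(labels, predictions):
    """Fraction of predictions that match the true labels."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("labels must not be empty")
    matches = sum(truth == guess for truth, guess in zip(labels, predictions))
    return matches / len(labels)


# Toy, illustrative outcomes only (assumption: a 70/30 skew toward acceptance).
true_labels = ["accepted"] * 7 + ["denied"] * 3
predicted = ["accepted"] * 6 + ["denied"] * 3 + ["accepted"]

baseline = base_rate_accuracy(true_labels)
model_accuracy = accuracy(true_labels, predicted)
lift = model_accuracy - baseline  # how far the classifier beats the trivial baseline
```

With skewed classes, the lift over the baseline says more than raw accuracy alone, because a classifier that ignores its inputs already scores the baseline value.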
Title
Predicting Bad Patents: Employing Machine Learning to Predict Post-Grant Review Outcomes for US Patents
Published
2017-05-11
Full Collection Name
Electrical Engineering & Computer Sciences Technical Reports
Other Identifiers
EECS-2017-60
Type
Text
Extent
46 p
Archive
The Engineering Library