# DataMASTER 2018-19

**The Effects of Patent Filing Acceleration on the Evolution of Technological Innovation**

Author: [Ryan Steed](https://rbsteed.com)

Source code: [github.com/ryansteed/datamaster](https://github.com/ryansteed/datamaster)

[DataMASTER Fellowship](https://math.columbian.gwu.edu/data-master) 2018-19

---

*Acknowledgement and thanks to:*

- [Professor Rahul Simha](https://www.seas.gwu.edu/rahul-simha) for research mentorship and guidance
- [George Washington University Colonial One High Performance Computing](https://colonialone.gwu.edu/) for compute resources and data storage
- [PatentsView](http://www.patentsview.org) for easy public API access to USPTO patent data

## Abstract

Economists, historians, and business leaders generally agree that innovation is inextricably linked to continued prosperity and national competitiveness. Accordingly, nations sponsor research and craft legislation, such as intellectual property protection, to stimulate innovation. Yet, to persuade a public skeptical of government expenditures, leaders and policymakers often seek ways to assess this investment and quantify its benefits. This study aims to address a broad practical question: is there a rigorous way to quantitatively assess innovation and its spread with currently available data?

This investigation measures the propagation of innovation and the evolution of knowledge by examining the changing structures of patent citation networks. Patents comprise the best source of public intellectual property information and are commonly used to construct a large network of patent nodes linked by their citations, which generally represent flows of knowledge. Within this analytical framework, an index of total knowledge contribution (TKC) is developed to measure the impact of the intellectual property in individual patents on subsequent inventions.
The index is applied to citation networks constructed from patents granted between 1976 and 2018 in a variety of USPC technology sectors. Comparing the distribution and rates of TKC for networks from different fields of research, this study interprets statistically significant differences between sectors and identifies quickly evolving areas of development. Subsampling by policy regime determines the impact of "first-to-file" patent legislation on innovation rates. Finally, an ARIMA model is applied to forecast TKC for each test sector, demonstrating the use of the index to identify emerging areas of appropriable research.

This research constitutes a novel method for assessing the contribution of individual patents to public knowledge and predicting the effect of observable patent features, technology sectors, and policy programs on the evolution of innovation in patent citation networks.

View the full project proposal [here](https://rbsteed.com/docs/datamaster/proposal.pdf).

## Index

| Contents | Description |
| ---------- | ----------- |
| `app/` | The application source code. |
| `data/` | A folder for loaded data (graphs, patent trees, and custom queries). |
| `docs/` | API, project, and data exploration documentation. |
| `logs/` | Storage location for server logs, named by environment. |
| `scripts/` | Single-use scripts for Slurm data collection and processing jobs and for R analysis. |
| `slurm/` | Storage location for Slurm log files. |
| `env.yml` | Dependencies for the `conda` environment. |
| `main.py` | Driver script for the application containing API [endpoints](#api). |

## Installation

After installing `git` and `conda`:

```bash
git clone https://github.com/ryansteed/datamaster  # clone this repo
cd datamaster
conda env update --file env.yml  # create or update the conda env from env.yml
source activate datamaster  # activate env
python main.py -h  # view help
```

## Making the Docs

This documentation is autogenerated from docstrings in the codebase.
Follow these instructions to refresh the documentation. From the root project folder, run:

```bash
cd docs
# Build documentation hierarchy (.rst files) in source folder from app package
sphinx-apidoc --implicit-namespaces --separate -o source ../app
# Make the html folder
make clean
make html
```

HTML documentation can be accessed from the project root `html` symlink.

---

© Ryan Steed 2019