It’s been a hectic few weeks, but it felt great to finally hand in my thesis this afternoon. I have posted the abstract of my paper and a graphical walk-through of my findings on a separate page. The paper itself is available here. I’m especially grateful to my thesis advisor, Oded Galor, for so many conversations and comments. I’m also grateful to the many friends who discussed my findings with me along the way (and particularly to Chris and Christine for comments this week!).
I’ll be presenting my thesis to the Economics department May 1 (anyone is welcome to attend). I’ll also be making a less technical presentation at Theories in Action on Sunday, April 28.
I am writing an interdisciplinary senior thesis at Brown spanning the fields of computer science and economics. The subject is submarine patents.
A submarine patent is a patent whose prosecution at the US Patent & Trademark Office is purposefully prolonged by the applicant, in the hope of “emerging” some years later with a patent on what has become a fundamental technology. The applicant can then extract licensing fees from businesses that have already built upon this technology without knowing the patent had been filed.
Two reforms made submarine patents much less worthwhile to pursue. Patent terms used to be measured from the issue date, but starting in June of 1995 all new patent filings would receive terms measured from the date of filing. Because the clock was now ticking during prosecution, stalling at this stage became much less desirable. A further reform came in November of 2000, when the USPTO announced that most patent applications would be published to the world 18 months after filing. Given the change in patent term structure and the lifting of the veil of secrecy surrounding patent applications, the ability of inventors to unexpectedly corner a market long after filing their invention has been effectively eliminated.
Despite the closure of these loopholes, submarine patents continue to issue. In examining all patents that have issued over the past several decades, one anomaly stands out: many applicants self-sorted to file just before the loophole closed. A tremendous number of patent applications were filed in those final days and weeks, and we now see that these were no ordinary applications. In fact, applications filed in these few weeks show the most pronounced spike in average pendency in modern history. Identifying submarine patents as those filed prior to this discontinuity offers a unique vantage point from which to study the motives for and outcomes of submarine patents.
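To make the pendency comparison concrete, here is a minimal sketch of how average pendency by filing week could be computed. It is not the thesis code: the function names and sample records are made up for illustration, and the exact cutoff of June 8, 1995 is an assumption (the text above says only "June of 1995").

```python
from collections import defaultdict
from datetime import date, timedelta

# Assumed cutoff for illustration; the post itself says only "June of 1995".
TERM_REFORM = date(1995, 6, 8)


def pendency_days(filed, issued):
    """Pendency: days between the filing date and the issue date."""
    return (issued - filed).days


def week_start(d):
    """Monday of the week containing d."""
    return d - timedelta(days=d.weekday())


def filed_before_reform(filed, cutoff=TERM_REFORM):
    """True if the application was filed before the assumed cutoff."""
    return filed < cutoff


def mean_pendency_by_filing_week(patents):
    """Map each filing week (its Monday) to the mean pendency in days."""
    buckets = defaultdict(list)
    for filed, issued in patents:
        buckets[week_start(filed)].append(pendency_days(filed, issued))
    return {week: sum(days) / len(days) for week, days in sorted(buckets.items())}


# Hypothetical (filed, issued) pairs, not real patents.
sample = [
    (date(1995, 6, 5), date(2003, 6, 5)),
    (date(1995, 6, 7), date(1999, 6, 7)),
    (date(1995, 6, 12), date(1997, 6, 12)),
]

weekly = mean_pendency_by_filing_week(sample)
for week, mean_days in weekly.items():
    print(week, round(mean_days / 365.25, 1), "years")
```

A spike like the one described above would show up as one or two filing weeks whose mean pendency sits far above that of the neighboring weeks.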
First, I downloaded the full text of every patent granted in the past three decades.
I transformed these documents into roughly the following relations (number of tuples in parentheses):
- Basic bibliographic info — one line for each patent grant (4.8M) (dta sample)
- Assignee: name, address, etc. (for all assignees of patent) (4.3M) (dta sample)
- Inventor: name, address, etc. (for all inventors of patent) (11M) (dta sample)
- References to other patents (in the US and abroad) (55M) (dta sample)
- References to “non-patent literature” (papers, brochures, etc.) (16M) (dta sample)
- Parents (1.8M) (dta sample)
- Fields searched by examiner for prior art (11M) (dta sample)
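As a rough illustration of how relations like these fit together, here is a small sketch using an in-memory SQLite database. The table names, column names, and the assumption that every relation is keyed by a patent number are all hypothetical; the actual data are Stata files whose layout is not described here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: three of the relations, each keyed by patent number.
conn.executescript("""
CREATE TABLE patent   (patent_no TEXT PRIMARY KEY, filed TEXT, issued TEXT);
CREATE TABLE inventor (patent_no TEXT, name TEXT);
CREATE TABLE citation (patent_no TEXT, cited_no TEXT);
""")

# Made-up rows, for illustration only.
conn.executemany(
    "INSERT INTO patent VALUES (?, ?, ?)",
    [("P001", "1995-06-05", "2003-06-05"), ("P002", "1995-06-12", "1997-06-12")],
)
conn.executemany(
    "INSERT INTO inventor VALUES (?, ?)",
    [("P001", "A. Example"), ("P001", "B. Example"), ("P002", "C. Example")],
)
conn.executemany("INSERT INTO citation VALUES (?, ?)", [("P001", "P000")])

# One row per patent, with counts drawn from the one-to-many relations.
rows = conn.execute("""
SELECT p.patent_no,
       (SELECT COUNT(*) FROM inventor i WHERE i.patent_no = p.patent_no) AS n_inventors,
       (SELECT COUNT(*) FROM citation c WHERE c.patent_no = p.patent_no) AS n_citations
FROM patent p
ORDER BY p.patent_no
""").fetchall()

for row in rows:
    print(row)
```

Keeping the one-to-many relations (inventors, citations) in their own tables avoids repeating the bibliographic fields on every row, which matters at tens of millions of tuples.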
Presentation to Brown’s Economics Honors Thesis Class Nov. 20, 2012 (PDF):
It’s been a few months since I’ve posted here; blogging was a bit taboo this summer at work (though it turns out I found plenty of other ways to raise red flags for the Cyber Security team using just Python + the Interwebs).
Working in an office with several other research assistants who were proficient with some statistical scripting languages (Stata, SAS), I began to think there’s probably a niche for a more general-purpose language in academic social science research (as well as in automating some of the tasks involved with casework around the office). I was already using Python in much of my work. What started out as a few trips to coworkers’ desks to help them write this or that script quickly turned into a few pages of notes, and that turned into some thirty pages of charts, explanations, and instructional tasks. (I must note, the final formatting was inspired by the style of my linear algebra lecture notes from last semester.)
I presented a version of Python for Economists to some coworkers at the FTC Bureau of Economics in July. I’ve taken three different college classes that taught Python from scratch, but I’ve never seen an approach to teaching Python that I thought was appropriate for students already familiar with scripting languages such as Stata. I focus on two broad applications of Python I’ve found very useful in social science research: web scraping and textual processing (including regular expressions).
- PDF of the booklet (34 pages, colored Python syntax highlighting)
- Zipped supporting materials used in the exercises
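For a small taste of the textual processing mentioned above, here is a minimal regular-expression sketch. It is not taken from the booklet: the pattern, the function name, and the sample sentence are my own illustration, and the pattern only covers a couple of common ways of writing a US patent number.

```python
import re

# Matches forms such as "U.S. Patent No. 5,000,001" or "US Pat. No. 4,999,999".
# Illustrative only: real citations come in many more variants.
PATENT_RE = re.compile(
    r"U\.?S\.?\s+Pat(?:ent|\.)\s+No\.\s*(\d{1,3}(?:,\d{3})+)",
    re.IGNORECASE,
)


def extract_patent_numbers(text):
    """Return the patent numbers found in text as integers, in order."""
    return [int(match.replace(",", "")) for match in PATENT_RE.findall(text)]


# Arbitrary numbers chosen for the example.
sample_text = "See U.S. Patent No. 5,000,001 and US Pat. No. 4,999,999; also page 12."
numbers = extract_patent_numbers(sample_text)
print(numbers)
```

The capture group keeps only the digits and commas, so the surrounding wording can vary without changing what comes back.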
I’m a bit disappointed now that I’m finally going through the data I downloaded from Google Patents throughout the semester. It doesn’t seem like it will be very useful for looking at patent trends prior to 2000. It’s unclear what sample of patent applications they’re actually providing, and I wish they were more transparent about it.