2020

Firm Historical Headquarter State from SEC 10K/Q Filings

Why the need to use SEC filings?

In the Compustat database, a firm's headquarter state (and other identifying information) is only the current record stored in comp.company. Once a firm relocates (or changes its state of incorporation, address, etc.), all historical observations are overwritten with the new values, so the historical state information is lost.

An effective way to resolve this issue is to use the firm's historical SEC filings. You can follow my previous post Textual Analysis on SEC filings to extract the header information, which includes a wide range of metadata. Alternatively, the University of Notre Dame's Software Repository for Accounting and Finance provides an augmented 10-X header dataset.
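As a rough illustration of the header-extraction idea, the sketch below pulls the business-address state and ZIP code out of a filing header. The sample header is made up for illustration and the function name `business_address` is hypothetical; real EDGAR headers carry more fields and vary across filers, so treat this as a starting point rather than a robust parser.

```python
import re

# Made-up header text that mimics the layout of an EDGAR filing header.
SAMPLE_HEADER = """\
COMPANY CONFORMED NAME: EXAMPLE CORP
STATE OF INCORPORATION: DE
BUSINESS ADDRESS:
    STREET 1: 1 EXAMPLE WAY
    CITY: AUSTIN
    STATE: TX
    ZIP: 78701
MAIL ADDRESS:
    STREET 1: PO BOX 1
    CITY: DOVER
    STATE: DE
    ZIP: 19901
"""


def business_address(header):
    """Return (state, zip) from the BUSINESS ADDRESS block, or (None, None)."""
    # Isolate the business-address block so the mailing address is ignored.
    block = re.search(
        r"BUSINESS ADDRESS:(.*?)(?=MAIL ADDRESS:|FORMER COMPANY:|\Z)",
        header,
        re.S,
    )
    if block is None:
        return None, None
    text = block.group(1)
    state = re.search(r"^\s*STATE:\s*(\S+)", text, re.M)
    zipcode = re.search(r"^\s*ZIP:\s*(\S+)", text, re.M)
    return (
        state.group(1) if state else None,
        zipcode.group(1) if zipcode else None,
    )


hq_state, hq_zip = business_address(SAMPLE_HEADER)
```

Applying the function to every filing of a firm and keeping the filing date gives a time series of headquarter states instead of a single current value.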

2023 March Update

In this update I use 1,491,368 8-K filings of U.S. firms from 2004 to Dec 2022 and extract their HQ state and zipcode. Data file: hist_state_zipcode_from_8k_2004_2022.csv.zip

Compute Jackknife Coefficient Estimates in SAS

In certain scenarios, we want to estimate a model's parameters for each observation on the sample that excludes that observation. This can be achieved by re-estimating the model on every leave-one-out sample, but doing so is very inefficient. If we estimate the model on the full sample, however, the coefficient estimates will certainly be biased. Thankfully, the jackknife method corrects for this bias and produces the jackknifed coefficient estimates for each observation.
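For OLS specifically, the leave-one-out coefficients can be recovered from a single full-sample fit using the leverage-based update formula. The sketch below uses Python with NumPy rather than SAS, and the function name `leave_one_out_coefs` and the simulated data are assumptions for illustration, not the post's own implementation.

```python
import numpy as np


def leave_one_out_coefs(X, y):
    """Row i holds the OLS coefficients estimated without observation i.

    Uses beta_(-i) = beta - (X'X)^{-1} x_i e_i / (1 - h_i), where e_i is the
    full-sample residual and h_i is the leverage of observation i.
    """
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # h_i = x_i' (X'X)^{-1} x_i for every observation
    leverage = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    adjust = (X @ XtX_inv) * (resid / (1.0 - leverage))[:, None]
    return beta - adjust


# Simulated data, for illustration only.
rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

loo = leave_one_out_coefs(X, y)  # shape (n, 3)
```

The result matches what a loop of n separate regressions would give, but it needs only one matrix inversion.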

Python Shared Memory in Multiprocessing

Python 3.8 introduced a new module multiprocessing.shared_memory that provides shared memory for direct access across processes. My test shows that it significantly reduces memory usage, which also speeds up the program by reducing the cost of copying and moving data around.
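A minimal sketch of the create-and-attach pattern follows. To keep it short, the second handle is opened in the same process; a worker process would attach by passing the same block name to `SharedMemory(name=...)`. The payload is an arbitrary illustrative byte string.

```python
from multiprocessing import shared_memory

payload = b"hq_state=CA"  # illustrative data only

# The owner creates a named block and writes into it.
owner = shared_memory.SharedMemory(create=True, size=len(payload))
owner.buf[: len(payload)] = payload

# A reader attaches to the same block by name; nothing is copied or pickled.
reader = shared_memory.SharedMemory(name=owner.name)
seen = bytes(reader.buf[: len(payload)])

# Writes through one handle are visible through the other.
reader.buf[0:2] = b"HQ"
after_write = bytes(owner.buf[: len(payload)])

# Every handle must be closed; the owner also unlinks to free the block.
reader.close()
owner.close()
owner.unlink()
```

Only the block name (a short string) needs to travel between processes, which is where the memory and copying savings come from.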

Textual Analysis on SEC Filings

Nowadays top journals favour more granular studies. Sometimes it's useful to dig into the raw SEC filings and perform textual analysis. This note documents how I download all historical SEC filings via EDGAR and conduct some textual analyses.

Call Option Value from Two Approaches

Suppose today the stock price is \(S\) and in one year's time, the stock price could be either \(S_1\) or \(S_2\). You hold a European call option on this stock with an exercise price of \(X=S\), where \(S_1<X<S_2\) for simplicity. So you'll exercise the call when the stock price turns out to be \(S_2\) and leave it unexercised if it is \(S_1\).
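The excerpt does not name the two approaches, so the sketch below assumes they are the replicating-portfolio approach and risk-neutral pricing, the two standard routes in a one-period binomial setting. The risk-free rate `r` and all numbers are assumptions added for illustration.

```python
def call_by_replication(S, S1, S2, X, r):
    """Value the call as a stock-plus-borrowing portfolio with the same payoffs."""
    delta = (S2 - X) / (S2 - S1)    # shares held; the payoff is 0 in state S1
    bond = -delta * S1 / (1 + r)    # borrowing that offsets the S1-state value
    return delta * S + bond


def call_by_risk_neutral(S, S1, S2, X, r):
    """Value the call as the discounted expected payoff under probability q."""
    q = ((1 + r) * S - S1) / (S2 - S1)  # risk-neutral probability of S2
    return q * (S2 - X) / (1 + r)


# Illustrative inputs with X = S and S1 < X < S2, as in the setup above.
S, S1, S2, r = 100.0, 80.0, 125.0, 0.05
X = S
value_replication = call_by_replication(S, S1, S2, X, r)
value_risk_neutral = call_by_risk_neutral(S, S1, S2, X, r)
```

Both functions return the same value (about 13.23 with these inputs), since the two formulas are algebraically identical.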

Merge Compustat and CRSP

Using the CRSP/Compustat Merged Database (CCM) to extract data is one of the fundamental steps in most finance studies. Here I document several SAS programs for annual, quarterly and monthly data, inspired by and adapted from several WRDS examples.

Decomposing Herfindahl–Hirschman (HHI) Index

Herfindahl–Hirschman (HHI) Index is a well-known market concentration measure determined by two factors:

  1. the size distribution (variance) of firms, and
  2. the number of firms.

Intuitively, having a hundred similar-sized gas stations in town means a far less concentrated environment than just one or two available, and when the number of firms is constant, their size distribution or variance determines the magnitude of market concentration.

Since these two properties jointly determine the HHI measure of concentration, we naturally want a decomposition of HHI that reflects each of these two dimensions. This is particularly useful when two distinct markets have the same HHI but their concentration stems from different sources. Note that these two markets do not have to be industry A versus industry B; they can be, for example, the same industry niche in two geographical areas.

Thus, we can think of HHI as the sum of the actual market state's deviation from 1) all firms having the same size, and its deviation from 2) a fully competitive environment with an infinite number of firms in the market. Some simple math can solve our problem.
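One way to make the two components concrete is the standard identity HHI = N·Var(s) + 1/N, where s are market shares and Var is their population variance. The sketch below assumes this is the decomposition intended; the function name `hhi_decomposition` and the firm sizes are illustrative.

```python
def hhi_decomposition(sizes):
    """Return (hhi, dispersion_part, count_part) with hhi = dispersion + count.

    dispersion_part = N * Var(shares) is zero when all firms are equal-sized.
    count_part = 1 / N goes to zero as the number of firms grows.
    """
    total = sum(sizes)
    shares = [s / total for s in sizes]
    n = len(shares)
    hhi = sum(s * s for s in shares)
    mean_share = 1.0 / n  # shares sum to one, so their mean is 1/N
    variance = sum((s - mean_share) ** 2 for s in shares) / n
    return hhi, n * variance, 1.0 / n


# Three firms with illustrative sizes 50, 30 and 20.
hhi, dispersion_part, count_part = hhi_decomposition([50, 30, 20])
```

With these sizes the HHI is 0.38, of which 1/3 comes from having only three firms and roughly 0.047 from their unequal sizes.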

Bloomberg BQuant (BQNT)

Bloomberg is developing a new function in the Terminal called BQuant (BQNT), available under the Bloomberg Anywhere license. I happened to be able to test it thanks to a fund manager, and I find it could be a future way of using the Bloomberg Terminal.

Identify Chinese State-Owned Enterprise using CSMAR

Many research papers on Chinese firms include a control variable that indicates if the firm is a state-owned enterprise (SOE). This is important as SOEs and non-SOEs differ in many aspects and may have structural differences. This post documents the way to construct this indicator variable from the CSMAR databases.

Working Remotely on a Windows Machine from VSCode on a Mac

Now I only need a MacBook (1.3 GHz dual-core i5) to do all my work anywhere, thanks to a powerful workstation provided by the university. Yet the workstation runs Windows 10 and sits behind the university VPN. I don't want to use Remote Desktop every time I need to do some coding, so I set things up so that I can code remotely on the workstation from the lovely VSCode on my little MacBook.