
SEC Filing Metadata

Overview

SEC filing metadata covering 1994 to 2020 has been published by Finnhub on the Kaggle website.

Prerequisite

Tool | Description | Access
Kaggle CLI | Command-line tool for interacting with Kaggle, implemented in Python | Official GitHub
fsspec | Filesystem interfaces for Python | Documentation

Method

[1] Declare environment

Declare the requirements for each phase, production and development-only:

requirements-dev.txt

requirements.txt

Then run the following to set up the environment:

Set up the virtual environment
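The original commands for this step are not shown, so the following is only a minimal sketch of one way to do it with Python's standard `venv` module. The function names `create_env`, `pip_install_command`, and `install`, and the `.venv` directory name, are assumptions made for illustration.

```python
import subprocess
import sys
import venv
from pathlib import Path


def create_env(env_dir: str, with_pip: bool = True) -> Path:
    """Create a virtual environment at env_dir and return its path."""
    venv.EnvBuilder(with_pip=with_pip).create(env_dir)
    return Path(env_dir)


def pip_install_command(env_dir: str, requirements: str) -> list[str]:
    """Build the command that installs a requirements file into the environment."""
    # The interpreter lives in Scripts/ on Windows and bin/ elsewhere.
    if sys.platform == "win32":
        python = Path(env_dir) / "Scripts" / "python.exe"
    else:
        python = Path(env_dir) / "bin" / "python"
    return [str(python), "-m", "pip", "install", "-r", requirements]


def install(env_dir: str, requirements: str) -> None:
    """Run the install command and fail loudly if pip reports an error."""
    subprocess.run(pip_install_command(env_dir, requirements), check=True)
```

For a development setup, this could be used as `create_env(".venv")` followed by `install(".venv", "requirements-dev.txt")`.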

[2] Download dataset:

Use the download command copied from the Kaggle UI, adding the path and unzip options.

Download dataset from Kaggle

The command reports its progress as the total amount downloaded (in MB) and the download rate.
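As a rough sketch of this step, the Kaggle CLI command can be assembled and run from Python. The dataset slug and target folder are placeholders, not the actual values for this dataset, and the helper names are hypothetical. The `-d`, `-p`, and `--unzip` options are those of `kaggle datasets download`; the call assumes the CLI is installed and an API token is configured.

```python
import subprocess


def kaggle_download_command(dataset: str, path: str, unzip: bool = True) -> list[str]:
    """Build the `kaggle datasets download` command for a dataset slug."""
    cmd = ["kaggle", "datasets", "download", "-d", dataset, "-p", path]
    if unzip:
        # Extract the archive into `path` once the download finishes.
        cmd.append("--unzip")
    return cmd


def download(dataset: str, path: str) -> None:
    """Run the download; raises CalledProcessError if the CLI exits non-zero."""
    subprocess.run(kaggle_download_command(dataset, path), check=True)


# Placeholder slug and folder (assumptions): replace with the values from the Kaggle UI.
print(" ".join(kaggle_download_command("<owner>/<dataset-slug>", "data")))
```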


[3] Use fsspec to interact with the folder and polars to read multiple files

*) Declare the dependencies to work with


a) Get the CSV file paths in the download folder
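A minimal sketch of this step, assuming the files were unzipped into a local folder. The function name `list_csv_paths` is hypothetical; `fsspec.filesystem("file")` returns the local filesystem and `glob` matches the pattern against it.

```python
import fsspec


def list_csv_paths(folder: str) -> list[str]:
    """Return the sorted paths of all CSV files directly inside `folder`."""
    fs = fsspec.filesystem("file")  # local filesystem implementation
    return sorted(fs.glob(f"{folder}/*.csv"))
```

Because the lookup goes through fsspec, the same function shape could be pointed at another backend (for example an object store) by changing the protocol name, provided the matching fsspec implementation is installed.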


b) Read each file with read_csv and merge the results with concat


The three main concat strategies in polars are:

flowchart LR
    polars_method[Polars Concat Method]
    polars_method --> vertical[vertical] --> vd[applies multiple vstack operations]
    polars_method --> diagonal[diagonal] --> dd[finds a union between the column schemas and fills missing column values with null]
    polars_method --> horizontal[horizontal] --> hd[stacks Series from DataFrames horizontally and fills with nulls if the lengths dont match]


In total, the 115 files yield about 21M rows across 8 columns.

c) Take a basic look at the dataframe


d) Data

Response Attributes:

acceptedDate: Accepted date, formatted as %Y-%m-%d %H:%M:%S.

accessNumber: Access number.

cik: CIK.

filedDate: Filed date, formatted as %Y-%m-%d %H:%M:%S.

filingUrl: Filing's URL.

form: Form type.

reportUrl: Report's URL.

symbol: Symbol.

e) Answer some questions

Total

Reference

https://finnhub.io/docs/api/filings-sentiment

https://seaborn.pydata.org/examples/timeseries_facets.html