Releases: JerBouma/FinanceDatabase
FinanceDatabase 2.4.0
Although the package itself changed little, there are major new additions to the database. These are the main topics:
EXPANDED METADATA
Added hundreds of thousands of new entries across Equities, ETFs, Funds, Indices, Currencies, and Cryptocurrencies. These are fields such as business summaries, sectors, industry groups, and categories for precise filtering and screening. This goes alongside new company IPOs. There is now also a delisted flag that indicates whether a company is delisted.
DATA QUALITY IMPROVEMENTS
Resolved conflicting company names, refined existing categorizations, and completely re-evaluated existing mappings. This includes adding over 15,000 new ISIN and CUSIP codes to significantly improve database precision. Furthermore, the equities, funds and ETFs datasets have now been split by exchange (before compression) so that PRs are easier to review.
IMPROVED ROBUSTNESS
Upgraded the core Python package to handle edge cases and invalid inputs gracefully, ensuring seamless integration with tools like the Finance Toolkit. Furthermore, there is now a delisted filter that automatically filters out companies whose delisted flag is set to True.
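The effect of the delisted filter can be pictured with a minimal sketch. This is not the package's implementation: the drop_delisted helper, its include_delisted argument, and the sample rows are hypothetical, and only the delisted column name comes from these release notes.

```python
# Hypothetical sample rows; only the "delisted" field name comes from the release notes.
rows = [
    {"symbol": "AAA", "name": "Alpha Corp", "delisted": False},
    {"symbol": "BBB", "name": "Beta Inc", "delisted": True},
    {"symbol": "CCC", "name": "Gamma Ltd", "delisted": False},
]


def drop_delisted(records, include_delisted=False):
    """Return the records, leaving out delisted ones unless include_delisted is True."""
    if include_delisted:
        return list(records)
    # A record without the flag is treated as still listed.
    return [record for record in records if not record.get("delisted", False)]


active = drop_delisted(rows)
print([record["symbol"] for record in active])
```

Defaulting the hypothetical include_delisted argument to False mirrors the "automatically filters out" behaviour described above, while still leaving a way to get the full set back.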
What's Changed
- Bump fonttools from 4.58.5 to 4.61.0 by @dependabot[bot] in #117
- fix: correct and enrich 229 ISIN codes in equities database by @AlfaStake in #126
- Add ISIN codes for ETFs by @AlfaStake in #124
- Bump pillow from 11.3.0 to 12.1.1 by @dependabot[bot] in #123
- Bump protobuf from 6.31.1 to 6.33.5 by @dependabot[bot] in #121
- Added SCMB by @pettijohn in #135
- Added some missing currencies and names from SEC Data by @pettijohn in #136
- Backfill ~4.5k CUSIP values in equities.csv from SEC 13F filings by @dokson in #138
- Backfill ISIN/CUSIP + fix placeholder names from yfinance by @dokson in #139
- Fix 1,084 placeholder names + 614 country + 35 exchange from yfinance by @dokson in #141
- Replace 725 boilerplate summaries with yfinance longBusinessSummary by @dokson in #142
- ASE exchange fix + ISIN/FIGI backfill from public data by @dokson in #143
- Fill 17,252 empty exchange + market cells in equities.csv by @dokson in #145
- Baseline test improvements: snapshots, local data, docstrings, infra cleanup by @dokson in #140
- Test infra follow-ups #2: library invariants, helpers.py coverage 45→86%, cov in CI by @dokson in #146
- ETFs/Funds data quality + cross-asset invariants + equities country/ISIN backfill + SPAC cleanup + README stats by @dokson in #147
- Add mic_code (ISO 10383 MIC) column by @dokson in #149
- Enrich composite_figi and fill mic on new tickers by @dokson in #150
- Enrich FIGI via OpenFIGI, add 709 tickers (#119), fix README stats workflow by @dokson in #151
- Added delisted column into the database and split up the equities, funds and etfs files by @JerBouma
New Contributors
- @AlfaStake made their first contribution in #126
- @pettijohn made their first contribution in #135
- @dokson made their first contribution in #138
Full Changelog: 2.3.1...2.4.0
FinanceDatabase 2.3.1
FinanceDatabase 2.3.0
The latest release of FinanceDatabase (v2.3.0) significantly improves the quality of the database and makes it easier to parse through its 300,000+ entries with the related financedatabase package.
I've added hundreds of thousands of new metadata entries across Equities, ETFs, Funds, Indices, Moneymarkets, Currencies, and Cryptocurrencies. These are fields such as summaries, sectors, industry groups, industries, category groups, categories, currencies and more. This allows for more precise and granular classification for better filtering, grouping, and analysis.
I've also made the associated Python package more robust, so that it handles edge cases and invalid inputs more gracefully. Upgrade now to v2.3.0 with pip install financedatabase -U.
FinanceDatabase 2.2.0
This release introduces the FinanceFrame. This is a DataFrame with extras, namely the function to_toolkit. As also visible in this example, the Finance Database can now directly convert the symbols to the Finance Toolkit 🛠️, making it possible to combine the exploration of tickers from the Finance Database with the in-depth financial analysis from the Finance Toolkit.
As an example, it is possible to use both .select() and .search() and combine this with .to_toolkit().
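A rough sketch of that combination is shown here. It assumes both the financedatabase and financetoolkit packages are installed. The build_toolkit wrapper and the filter values are illustrative, and passing api_key as a keyword to to_toolkit is an assumption about the signature rather than something stated in these notes.

```python
def build_toolkit(api_key):
    """Select equities in the Finance Database and hand them to the Finance Toolkit."""
    # Imported lazily so the sketch can be loaded without the package installed.
    import financedatabase as fd

    equities = fd.Equities()

    # Illustrative filter values, not taken from the release notes.
    selection = equities.select(country="Netherlands", sector="Information Technology")

    # to_toolkit is the FinanceFrame method named in the release notes;
    # the api_key keyword is an assumption about its signature.
    return selection.to_toolkit(api_key=api_key)
```

Calling build_toolkit with a valid key would download data, so the function is only defined here and not executed.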
For fundamental data, you need to obtain an API key from FinancialModelingPrep. This is used to gain access to 30+ years of financial statements, both annual and quarterly. Note that the Free plan is limited to 250 requests per day and 5 years of data, and only features companies listed on US exchanges.
Through the link you can subscribe to the free plan, or to premium plans at a 15% discount. This is an affiliate link and thus supports the project at the same time. I have chosen FinancialModelingPrep as a source because I find it to be the most transparent and reliable source at an affordable price. I have yet to find a platform offering such low prices for the amount of data offered.
FinanceDatabase 2.1.1
FinanceDatabase 2.1.0
This release includes over 9,000 ISIN codes, more than 25,000 FIGI codes and more than 2,000 CUSIP codes. This is a start towards allowing you to work not only with tickers (like MSFT, which is Microsoft Corporation) but also with ISIN (US5949181045), FIGI (BBG000BPHFS9) and CUSIP (594918104) codes.
Furthermore, I have updated the exchange names, which are now much clearer. Lastly, I fixed a small bug that occurred when working with the database locally.
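ISIN and CUSIP codes both end in a check digit, which makes it possible to sanity-check an identifier before looking it up. The sketch below uses the Microsoft codes quoted above; the function names are hypothetical and this is not part of the financedatabase package. It applies the standard CUSIP check-digit rule and the Luhn check used for ISINs.

```python
def cusip_check_digit(body):
    """Compute the check digit for the first eight characters of a CUSIP."""
    total = 0
    for position, char in enumerate(body, start=1):
        if char.isdigit():
            value = int(char)
        elif char.isalpha():
            value = ord(char.upper()) - ord("A") + 10  # A=10 ... Z=35
        else:
            value = {"*": 36, "@": 37, "#": 38}[char]
        if position % 2 == 0:
            value *= 2  # every second character is doubled
        total += value // 10 + value % 10  # add the digits of the value
    return (10 - total % 10) % 10


def is_valid_cusip(cusip):
    """Check the length and the trailing check digit of a nine-character CUSIP."""
    if len(cusip) != 9 or not cusip[-1].isdigit():
        return False
    return cusip_check_digit(cusip[:8]) == int(cusip[-1])


def is_valid_isin(isin):
    """Validate an ISIN: expand letters to numbers, then apply the Luhn check."""
    if len(isin) != 12 or not (isin.isascii() and isin.isalnum()):
        return False
    # int(char, 36) maps 0-9 to 0-9 and A-Z to 10-35.
    digits = "".join(str(int(char, 36)) for char in isin.upper())
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


print(is_valid_isin("US5949181045"), is_valid_cusip("594918104"))
```

For US securities the nine characters after the country prefix of the ISIN are the CUSIP, which is why US5949181045 contains 594918104.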
FinanceDatabase 2.0.0
I've improved this database not only by increasing the number of symbols (from 180k to 300k) but also:
- Approximated the Global Industry Classification Standard (GICS®), a standard used for sectors and industries everywhere. Note that this is an approximation, so no actual GICS data is collected. Furthermore, not all categories are included.
- Updated and removed tickers that either no longer exist or had outdated information.
- Made the package itself object-oriented, making data collection and searching much more efficient and logical (shoutout to Colin Delahunty for the help here too).
- The database initially featured thousands of JSON files. At the time this made sense, also given my rather novice background in programming. However, a much more efficient (and manageable) way is to work with CSV files, so there is now one CSV file per asset class.
- Because the data is stored in CSV files, it is really easy to update.
- To keep loading quick, the data is automatically compressed, so that loading is not slowed down by using a format that is easier to update.
- Updated the README, Contributing Guidelines and overall documentation.
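The CSV-plus-compression approach from the bullets above can be sketched with the standard library alone. The bz2 format, the column names, and the sample rows are assumptions for illustration; these notes do not state which compression the package uses.

```python
import bz2
import csv
import io

# Hypothetical sample rows standing in for one asset class.
rows = [
    {"symbol": "AAA", "name": "Alpha Corp", "sector": "Financials"},
    {"symbol": "BBB", "name": "Beta Inc", "sector": "Energy"},
]

# Write the human-editable CSV text first; this is the form that is easy to review.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["symbol", "name", "sector"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()

# Compress the CSV bytes; this is the form that would be shipped for loading.
compressed = bz2.compress(csv_text.encode("utf-8"))

# Loading reverses the steps: decompress, then parse the CSV.
restored_text = bz2.decompress(compressed).decode("utf-8")
restored = list(csv.DictReader(io.StringIO(restored_text)))

print(restored == rows)
```

Keeping the CSV as the source of truth and treating the compressed file as a derived artifact means contributors only ever edit plain text.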
FinanceDatabase 1.0.0
Added market cap to Equities and total assets to ETFs and Funds. Furthermore, renamed the package to financedatabase (from FinanceDatabase) to be more in line with PEP 8.
FinanceDatabase 0.1.11
In this release I have included a parameter called exclude_exchanges. By default this parameter is set to True, which means all exchanges are excluded from the selection (via, for example, select_equities). This prevents the user from receiving the same ticker several times (but on different exchanges). Setting this parameter to False returns the number of tickers listed in the Key Statistics in README.md.
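A minimal sketch of what exclude_exchanges does conceptually is given here. It assumes, purely for illustration, that a listing on another exchange is marked by a suffix after a dot; the select_symbols helper and the sample symbols are hypothetical and not the package's implementation.

```python
# Hypothetical symbols: a dot suffix is assumed to mark an exchange-specific listing.
symbols = ["AAA", "AAA.AS", "BBB", "BBB.L", "CCC.DE"]


def select_symbols(all_symbols, exclude_exchanges=True):
    """Return the symbols, dropping exchange-specific listings by default."""
    if not exclude_exchanges:
        return list(all_symbols)
    return [symbol for symbol in all_symbols if "." not in symbol]


print(select_symbols(symbols))
print(len(select_symbols(symbols, exclude_exchanges=False)))
```

Note that under this assumption a symbol that only exists with a suffix (CCC.DE in the sample) is dropped as well, which is one reason the full count requires exclude_exchanges=False.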
FinanceDatabase 0.1.10
Added the possibility to point to a different URL as well as the option to select a local location. Furthermore, fixed a small bug in the select_etfs function and added test cases to make the package more robust.
