About the Lab — Corporate Narrative Intelligence Lab

Purpose

This project demonstrates product thinking, text analytics, interpretable AI-assisted analysis, and dashboard storytelling. It is a research portfolio piece, not financial advice, legal guidance, or an academic regression model ready for peer review.

The underlying research question: do companies that experience negative cyclical financial performance, or top management team turnover, exhibit measurably different disclosure language patterns in their annual reports?

Out of Scope

The following are intentionally excluded from this demo version:

User authentication · Live SEC/EDGAR scraping · Paid data integrations (Bloomberg, Refinitiv) · Real-time financial data · Multi-user workspaces · Full academic regression engine with bootstrapped confidence intervals · Production-grade MLOps pipeline.

Tech Stack

Analysis: Python · pandas · scikit-learn (TF-IDF, cosine similarity) · NLTK (tokenization, Gunning Fog) · Loughran–McDonald financial dictionary · FinBERT-compatible sentiment schema.
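As a rough illustration of two of the measures named above, here is a minimal sketch of a dictionary-based tone score and a Gunning Fog readability index. It is not the lab's actual code. It uses only the Python standard library, with a regex tokenizer standing in for NLTK. The word lists are tiny hypothetical placeholders, not the real Loughran–McDonald dictionary, and the function names (tokenize, net_tone, count_syllables, gunning_fog) are assumptions made for this example.

```python
import re

# Hypothetical mini word lists for illustration only; the real
# Loughran-McDonald dictionary contains thousands of entries.
NEGATIVE = {"loss", "decline", "impairment", "adverse", "weak"}
POSITIVE = {"growth", "improved", "strong", "gain"}


def tokenize(text):
    """Lowercase alphabetic tokens (a simple regex stand-in for NLTK)."""
    return re.findall(r"[a-z]+", text.lower())


def net_tone(text):
    """(positive count - negative count) / total tokens; 0.0 if empty."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def count_syllables(word):
    """Rough heuristic: number of consecutive-vowel groups, minimum 1."""
    return max(1, len(re.findall(r"[aeiouy]+", word)))


def gunning_fog(text):
    """0.4 * (words per sentence + 100 * share of words with 3+ syllables)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = tokenize(text)
    if not sentences or not words:
        return 0.0
    complex_words = sum(count_syllables(w) >= 3 for w in words)
    return 0.4 * (len(words) / len(sentences) + 100 * complex_words / len(words))


if __name__ == "__main__":
    sample = "Revenue showed strong growth. Margins suffered a decline."
    print("net tone:", net_tone(sample))
    print("fog index:", gunning_fog(sample))
```

The vowel-group syllable counter is deliberately crude and will overcount some words (silent "e", for instance), so the fog values are best read as relative comparisons between filings rather than as exact grade levels.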

Demo interface: originally a Streamlit (Python) app, ported to static HTML/JS for integration into this blog, with Chart.js for interactive visualizations.

Data: EDGAR 10-K filings (Item 7 MD&A sections) · S&P Global financial data · BoardEx leadership events.

Extensions

Natural extensions of the current architecture:

· Replace the LM lexical scorer with a fine-tuned FinBERT model once labeled examples exist
· Add entity extraction to auto-classify leadership events from 8-K press releases
· Extend to 10-K Business Description (Item 1) for strategy-level language changes
· Build a regression model to test statistical significance of narrative shifts vs. financial outcomes
· Expand to additional sectors (energy, financial services, consumer)

Source

Full source code available on GitHub:

github.com/AntoineNgx/corporate-narrative-intelligence-lab

Built by Antoine Nguyen · antoinengx@gmail.com
