How The Good Index finds, scores, and organizes the signals that matter most.
RSS and Atom feeds are fetched on a configurable schedule. New articles are deduplicated by URL and stored with their full content.
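As a minimal sketch of the deduplication step described above: the `Article` type, the `normalize_url` helper, and the normalization rules (dropping the fragment and a trailing slash) are assumptions made for illustration, not The Good Index's actual implementation.

```python
from dataclasses import dataclass
from urllib.parse import urldefrag


@dataclass(frozen=True)
class Article:
    """Hypothetical article record; field names are illustrative."""
    url: str
    title: str
    content: str


def normalize_url(url: str) -> str:
    # Assumed normalization: trim whitespace, drop the #fragment,
    # and remove any trailing slash so near-identical URLs compare equal.
    without_fragment, _ = urldefrag(url.strip())
    return without_fragment.rstrip("/")


def deduplicate(new_articles, seen_urls):
    """Return only articles whose normalized URL is not already in seen_urls.

    seen_urls is updated in place so later batches see earlier ones.
    """
    fresh = []
    for article in new_articles:
        key = normalize_url(article.url)
        if key in seen_urls:
            continue
        seen_urls.add(key)
        fresh.append(article)
    return fresh
```

In a real deployment the set of seen URLs would presumably live in the article store rather than in memory.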
Each article's title and content are converted into a 1536-dimensional vector embedding using OpenAI's text-embedding model. These vectors capture semantic meaning.
Related articles are grouped using agglomerative clustering based on cosine similarity between their embeddings. This ensures coverage of the same story is unified.
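The cosine similarity that drives the grouping can be sketched as follows. The function name is illustrative, and the vectors used in practice would have 1536 dimensions rather than the two or three typical of a toy example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 if either vector has zero length, which is an
    assumed convention for this sketch.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Values near 1 mean two embeddings point in nearly the same direction, which is read here as covering the same story; values near 0 mean the articles are unrelated.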
Each article is evaluated by AI across multiple configurable dimensions (e.g., novelty, commercial viability, funding). Scores range from 1 to 10, and each assessment follows a rubric.
AI generates landscape reports connecting themes across clusters. These syntheses identify trends, patterns, and emerging developments in the domain.
Signals are monitored over time with trajectory analysis. Trends are classified as accelerating, growing, plateauing, or declining based on score velocity.
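One way the four trend labels could be derived from score velocity is sketched below. The `flat_band` tolerance, the split into early and late halves, and the function name are all assumptions chosen for illustration; the source does not state the actual thresholds.

```python
def classify_trend(scores, flat_band=0.1):
    """Label a chronological series of scores.

    velocity     = mean change between consecutive scores
    acceleration = late-half velocity minus early-half velocity
    flat_band is a hypothetical tolerance around zero.
    """
    if len(scores) < 3:
        raise ValueError("need at least three scores to estimate acceleration")

    deltas = [later - earlier for earlier, later in zip(scores, scores[1:])]
    velocity = sum(deltas) / len(deltas)

    mid = len(deltas) // 2
    early = sum(deltas[:mid]) / mid
    late = sum(deltas[mid:]) / (len(deltas) - mid)
    acceleration = late - early

    if velocity < -flat_band:
        return "declining"
    if velocity <= flat_band:
        return "plateauing"
    if acceleration > flat_band:
        return "accelerating"
    return "growing"
```

Under these assumptions, a series that rises by a steady amount is labeled growing, while one whose increases get larger over time is labeled accelerating.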
Before scoring, every article is assessed for relevance to the ethical innovation domain. Articles that don't pass the relevance threshold are filtered out to keep the feed focused and high-signal.
Relevance Scoring Rubric
Every signal is evaluated across multiple dimensions that capture real-world impact, momentum, and novelty. Explore each dimension to see what we look for.
The radar chart visualizes an article's score profile across all dimensions. Different shapes indicate different types of signals.
Early signal: Novel approach to an underserved problem. Worth watching closely.
Opportunity signal: Major problem screaming for attention but few building solutions.
Established player: Mature sector with incremental progress.
Not all sources are equal. We assess credibility to ensure high-quality signals rise to the top.
Established track record, publishing history, and recognition within the domain. Major outlets and peer-reviewed journals score highest.
Evidence of editorial oversight, correction policies, and separation of opinion from reporting. Sources with clear editorial processes are weighted more heavily.
Track record of accuracy, use of primary sources, and citation of evidence. Sources that regularly cite data and link to primary research score higher.
Clear disclosure of funding, conflicts of interest, and methodology. Organizations that openly share their methods and affiliations are rated higher.
Articles covering the same story are automatically grouped using vector embeddings and agglomerative clustering. Each article's content is converted to a 1536-dimensional embedding, then compared using cosine similarity to find related coverage.
Similarity Threshold: 0.65. Minimum cosine similarity for two articles to be considered related.
Merge Threshold: 0.70. If two cluster centroids exceed this similarity, they are merged.
Min Cluster Size: 2. Minimum number of articles required to form a cluster.
Window: 7 days. Only articles within this time window are clustered together.
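The four parameters above can be read as simple predicates. The sketch below encodes them directly; the type and function names are hypothetical, and it assumes the window is measured between the publication times of two articles, which the source does not spell out. It does not reproduce the full clustering loop.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ClusterConfig:
    similarity_threshold: float = 0.65   # minimum article-to-article similarity
    merge_threshold: float = 0.70        # centroid similarity that triggers a merge
    min_cluster_size: int = 2            # smaller groups are not clusters
    window: timedelta = timedelta(days=7)


@dataclass(frozen=True)
class Item:
    """Hypothetical article record: an embedding plus a publication time."""
    embedding: tuple
    published: datetime


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def centroid(items):
    # Component-wise mean of the member embeddings.
    dims = len(items[0].embedding)
    return tuple(sum(it.embedding[i] for it in items) / len(items) for i in range(dims))


def are_related(a, b, cfg):
    # Assumption: the window applies to the gap between publication times.
    in_window = abs(a.published - b.published) <= cfg.window
    return in_window and cosine(a.embedding, b.embedding) >= cfg.similarity_threshold


def should_merge(cluster_a, cluster_b, cfg):
    # "Exceed" is read as strictly greater than the merge threshold.
    return cosine(centroid(cluster_a), centroid(cluster_b)) > cfg.merge_threshold


def is_valid_cluster(cluster, cfg):
    return len(cluster) >= cfg.min_cluster_size
```

Keeping the values in one configuration object makes it straightforward to tune the thresholds without touching the clustering logic.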
Risk that this cluster contains misleading environmental claims. The assessment is based on the ratio of corporate PR to independent sources, the specificity of the claims, and coordinated release timing.
Risk that this cluster contains greenwashing or misleading claims
AI Assessment Guidance
“Consider ratio of corporate PR to independent sources, vague vs specific claims, coordinated release timing.”