Entities are named things extracted from signals and resolved to canonical identities. Gildea tracks 15 entity types across the AI ecosystem:
| Type | Examples |
| --- | --- |
| Organization | NVIDIA, Anthropic, OpenAI |
| Person | Sam Altman, Dario Amodei, Jensen Huang |
| Model | GPT-4o, Claude 3.5 Sonnet, Llama 3 70B |
| Product | ChatGPT, Gemini, Claude Code |
| Hardware | H100, A100, TPU v5e |
| Dataset | The Pile, CommonCrawl, MMLU |
| Benchmark | GPQA, HumanEval, MATH |
| Framework | LangChain, vLLM, PyTorch |
| Software | GitHub Copilot, Cursor, Replit |
| Distribution Channel | HuggingFace Hub, AWS Bedrock, Azure OpenAI |
| Regulation/Policy | EU AI Act, Executive Order 14110 |
| Publication | Research papers, reports |
| Location | Countries, regions, cities |
| Event | Conferences, product launches |
| Concept | Named technical concepts |

This means you can track not just which companies are trending, but which models, hardware, datasets, and regulations are gaining attention.

Entity extraction and disambiguation

Gildea uses a two-pass entity extraction system to ensure accurate identification:

Pass 1 — Google Natural Language API

Signals are processed through Google Cloud NL to extract entity mentions with Knowledge Graph linking, type classification across 15 types, and salience scoring.

Pass 2 — Domain-specific disambiguation

A curated rule set disambiguates AI-specific entities that general NLP models struggle with:
  • Model families: “Claude 3.5 Sonnet”, “GPT-4o”, “Llama 3 70B” resolve to specific model entries, not generic company mentions
  • Hardware: “H100”, “A100”, “TPU v5e” resolve to specific chips
  • Contextual disambiguation: “Claude” + context mentioning “sonnet” resolves to the model, not a person
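
As a rough illustration of the contextual rule above, the sketch below resolves an ambiguous mention only when a cue word appears in the surrounding text. The rule table, the `disambiguate` function, and the model ID are hypothetical; only the “Claude” plus “sonnet” pairing comes from the description above, and the actual rule set is not documented here.

```python
import re
from typing import Optional

# Hypothetical rule table: surface form -> list of (context cue, canonical ID).
# Only the "Claude" + "sonnet" pairing comes from the docs; the ID is illustrative.
CONTEXT_RULES = {
    "claude": [("sonnet", "mdl:/anthropic-claude-3-5-sonnet")],
}

def disambiguate(mention: str, context: str) -> Optional[str]:
    """Return a curated ID if a context cue matches, else None.

    None means "no domain rule fired", so the pass 1 result would stand.
    """
    words = set(re.findall(r"[a-z0-9]+", context.lower()))
    for cue, entity_id in CONTEXT_RULES.get(mention.lower(), []):
        if cue in words:
            return entity_id
    return None
```

Calling `disambiguate("Claude", "the new Sonnet release")` returns the model ID, while a sentence about Claude Shannon returns `None` because no cue word is present.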

Canonical identity resolution

Every entity gets a stable canonical ID through a priority chain:
  1. Google Knowledge Graph MID — highest confidence, links to Google’s knowledge base
  2. Curated domain ID — for AI-specific entities (e.g., org:/nvidia, mdl:/openai-gpt-4o)
  3. Name-based fallback — for entities without external links
This means that searches for “Meta”, “Meta Platforms”, or “Facebook” all resolve to the same entity.
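
The priority chain can be sketched as a simple fall-through. This is a minimal illustration, not Gildea's implementation: the `canonical_id` function is hypothetical, and the `name:/` prefix and slug format used for the fallback are assumptions, since the docs do not specify what a name-based ID looks like.

```python
import re
from typing import Optional

def canonical_id(name: str,
                 kg_mid: Optional[str] = None,
                 curated_id: Optional[str] = None) -> str:
    """Pick an ID in the documented order: KG MID, curated ID, name fallback."""
    if kg_mid:
        return kg_mid            # 1. Google Knowledge Graph MID
    if curated_id:
        return curated_id        # 2. Curated domain ID, e.g. "org:/nvidia"
    # 3. Name-based fallback. The "name:/" prefix and slug rules are assumed.
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return f"name:/{slug}"
```

Because the function always returns the highest-priority ID available, two mentions that link to the same Knowledge Graph entry end up with the same canonical ID regardless of their surface names.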

Noise filtering

Generic concepts (“AI”, “market”, “industry”) are excluded, and corporate suffixes (“Inc”, “LLC”, “Corp”) are normalized so that “NVIDIA Corporation” and “NVIDIA” resolve to the same entity.
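
A minimal sketch of this filtering step, assuming the stop list and suffix list contain only the examples given above (plus “Corporation”, to cover the NVIDIA example). The real lists are presumably longer, and `normalize_mention` is a hypothetical helper name.

```python
import re
from typing import Optional

# Examples taken from the text; the production lists are not documented.
GENERIC_CONCEPTS = {"ai", "market", "industry"}
# Trailing corporate suffix, optionally preceded by a comma and followed by a period.
SUFFIX_RE = re.compile(r"[,\s]+(inc|llc|corp|corporation)\.?$", re.IGNORECASE)

def normalize_mention(name: str) -> Optional[str]:
    """Strip a trailing corporate suffix; return None for generic concepts."""
    cleaned = SUFFIX_RE.sub("", name.strip())
    if cleaned.lower() in GENERIC_CONCEPTS:
        return None  # filtered out as noise
    return cleaned
```

With these lists, `normalize_mention("NVIDIA Corporation")` and `normalize_mention("NVIDIA")` both return `"NVIDIA"`, and `normalize_mention("AI")` returns `None`.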

Entity profiles

Each entity has a rich profile including:
  • Signal count — total signals mentioning this entity
  • Trend stats — Theil-Sen slope, Mann-Kendall significance, share of voice, volatility
  • Content type mix — breakdown of expert analysis vs. event signals
  • Theme distributions — which value chain segments and market forces this entity appears in
  • Related entities — co-occurrence relationships

Trend analytics

Every entity includes a trend object with analytics across four dimensions:
| Field | Dimension | Description |
| --- | --- | --- |
| share_of_voice | Scale | Entity’s share of total corpus this week |
| source_diversity | Scale | Count of distinct source domains |
| recency_score | Scale | Exponential decay from last mention (1.0 = today) |
| theil_sen_slope | Directionality | Robust trend slope (resistant to outliers) |
| streak | Directionality | Consecutive weeks of growth |
| mk_tau | Significance | Mann-Kendall tau (-1 to +1) |
| mk_p_value | Significance | Statistical significance of trend (< 0.1 = significant) |
| coefficient_of_variation | Volatility | How variable is coverage (stddev / mean) |
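
To make the directionality, significance, and volatility fields concrete, here is a sketch of the textbook definitions applied to a series of weekly mention counts. It is not necessarily how Gildea computes them: the Mann-Kendall p-value below uses the normal approximation without tie correction, and the coefficient of variation uses the population standard deviation. Both choices are assumptions.

```python
import math
from itertools import combinations
from statistics import mean, median, pstdev

def theil_sen_slope(y):
    """Median of all pairwise slopes; x is the week index 0..n-1."""
    return median((y[j] - y[i]) / (j - i)
                  for i, j in combinations(range(len(y)), 2))

def mann_kendall(y):
    """Return (tau, two-sided p-value); normal approximation, no tie correction."""
    n = len(y)
    # S counts concordant minus discordant pairs.
    s = sum((y[j] > y[i]) - (y[j] < y[i])
            for i, j in combinations(range(n), 2))
    tau = s / (n * (n - 1) / 2)
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)   # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return tau, p

def coefficient_of_variation(y):
    """Population standard deviation divided by the mean."""
    return pstdev(y) / mean(y)
```

For a steadily growing 12-week series such as `[2, 4, 6, ..., 24]`, the slope is 2 mentions per week, tau is 1.0, and the p-value falls well below the 0.1 threshold listed in the table.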

Interpretation fields

Each entity includes interpretation fields derived from the raw trend stats. These are discrete labels that agents can act on without statistical expertise.
| Field | Values | What it tells you |
| --- | --- | --- |
| scale | High, Medium, Low | How prominent is this entity relative to the corpus |
| direction | Rising, Stable, Declining, New | Which way is coverage trending |
| confidence | Significant, Insignificant | Is the trend statistically reliable |
| stability | Volatile, Steady | How consistent is coverage week to week |
| priority | High, Medium, Low, Negligible | Combined assessment across all dimensions |
| priority_reasoning | Free text | Human-readable explanation of the priority assignment |

Entities with fewer than 8 mentions in the 12-week window return scale only — direction, confidence, stability, and priority will be null. The priority_reasoning field will contain “Insufficient data for trend analysis.”
{
  "entity_id": "org:/nvidia",
  "display_name": "NVIDIA",
  "scale": "High",
  "direction": "Rising",
  "confidence": "Significant",
  "stability": "Steady",
  "priority": "High",
  "priority_reasoning": "High-scale entity with significant upward trend and steady coverage; reliable signal of growing dominance.",
  "trend": { "..." : "..." }
}
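
Because low-volume entities return null for most interpretation fields, client code should check for that case before acting on a label. The sketch below shows one way to do it; `summarize` is a hypothetical helper, and it assumes the response has already been parsed into a Python dict with the field names shown in the example above.

```python
def summarize(entity: dict) -> str:
    """Build a one-line summary, tolerating null trend fields."""
    name = entity.get("display_name", entity["entity_id"])
    # Under 8 mentions in the 12-week window, only `scale` is populated.
    if entity.get("direction") is None:
        return f"{name}: scale {entity['scale']} (insufficient data for trend)"
    return (f"{name}: scale {entity['scale']}, {entity['direction']}, "
            f"{entity['confidence']}, priority {entity['priority']}")
```

Checking `direction` alone is sufficient under the documented behavior, since direction, confidence, stability, and priority are all null together.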

Composable filtering

The list entities endpoint supports filtering by any combination of interpretation fields. This replaces the old trending modes with a more powerful, composable approach.
| Filter | Returns | Replaces |
| --- | --- | --- |
| direction=Rising + confidence=Significant | Reliably trending up | Old rising mode |
| direction=New + sort=first_seen | Recently tracked entities | Old emerging mode |
| stability=Volatile + sort=trend | Most variable coverage | Old volatile mode |

Filters can be combined freely. For example, ?direction=Rising&scale=High&sort=trend finds high-prominence entities with upward trends.
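
A small helper can build these query strings and catch typos in filter values before a request is sent. This is a sketch: the allowed values are copied from the interpretation fields table, `entities_query` is a hypothetical name, and the `sort` value is passed through unchecked because the full list of sort keys is not documented here.

```python
from urllib.parse import urlencode

# Allowed values copied from the interpretation fields table.
ALLOWED = {
    "scale": {"High", "Medium", "Low"},
    "direction": {"Rising", "Stable", "Declining", "New"},
    "confidence": {"Significant", "Insignificant"},
    "stability": {"Volatile", "Steady"},
    "priority": {"High", "Medium", "Low", "Negligible"},
}

def entities_query(sort=None, **filters) -> str:
    """Validate filters and return a query string such as '?direction=Rising'."""
    for field, value in filters.items():
        if field not in ALLOWED:
            raise ValueError(f"unknown filter: {field}")
        if value not in ALLOWED[field]:
            raise ValueError(f"invalid value for {field}: {value}")
    params = dict(filters)
    if sort is not None:
        params["sort"] = sort  # not validated; sort keys are not fully documented
    return "?" + urlencode(params)
```

For example, `entities_query(direction="Rising", scale="High", sort="trend")` produces the query string shown above, ready to append to the list entities endpoint URL.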

Co-occurrences

The entity detail endpoint includes related_entities — entities that frequently appear together in signals. This reveals industry relationships, competitive dynamics, and supply chain connections.
{
  "related_entities": [
    {
      "entity_id": "org:/tsmc",
      "name": "TSMC",
      "type": "Organization",
      "co_occurrence_count": 28
    }
  ]
}
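
Once the detail response is parsed, the related entities can be ranked or narrowed by type on the client side. The sketch below assumes the response shape shown above; `top_related` is a hypothetical helper, and whether the API already returns the list sorted is not stated, so the code sorts explicitly.

```python
from typing import Optional

def top_related(detail: dict, entity_type: Optional[str] = None, limit: int = 5) -> list:
    """Return the most frequently co-occurring entities, optionally by type."""
    related = detail.get("related_entities", [])
    if entity_type is not None:
        related = [r for r in related if r["type"] == entity_type]
    # Sort explicitly; the response ordering is not documented.
    ranked = sorted(related, key=lambda r: r["co_occurrence_count"], reverse=True)
    return ranked[:limit]
```

Passing `entity_type="Organization"` would, for instance, narrow an entity's neighbors to companies such as the TSMC entry in the example response.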