Entities - Gildea API

Entities are named things extracted from signals and resolved to canonical identities. Gildea classifies entities into 8 types across the AI ecosystem:

Type	Examples
`organization`	NVIDIA, Anthropic, OpenAI
`person`	Sam Altman, Dario Amodei, Jensen Huang
`model`	GPT-5, Claude Opus 4.7, Llama 4
`hardware`	H100, A100, TPU v5e
`location`	Countries, regions, cities
`event`	Conferences, product launches
`regulation_policy`	EU AI Act, Executive Order 14110
`other`	Everything else: products, software, frameworks, datasets, benchmarks, publications, and named technical concepts

Treat the `type` field as authoritative. The `other` bucket is a deliberate catch-all: entities that aren’t one of the seven specific types live here rather than in a long tail of sparsely populated categories. To find a specific product or framework, search by name rather than filtering on type. Because models, hardware, and regulations each have their own type, you can track not just which companies are trending, but which models, hardware, and regulations are gaining attention.

Entity extraction and disambiguation

Gildea uses a two-pass entity extraction system to ensure accurate identification:

Pass 1: Entity extraction

Signals are processed to extract entity mentions with knowledge-graph linking, type classification, and salience scoring.

Pass 2: Domain-specific disambiguation

A curated rule set disambiguates AI-specific entities that general NLP models struggle with:

Model families: “Claude Opus 4.7”, “GPT-5”, “Llama 4” resolve to specific model entries, not generic company mentions
Hardware: “H100”, “A100”, “TPU v5e” resolve to specific chips
Contextual disambiguation: “Claude” + context mentioning “sonnet” resolves to the model, not a person
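To make the contextual rule concrete, here is a toy sketch of cue-based disambiguation. It is not Gildea's rule set: the `CONTEXT_RULES` table and the `disambiguate` function are hypothetical, and the only cue included is the "sonnet" example given above. When no cue is present the sketch returns `None`, because the fallback behavior is not documented.

```python
import re
from typing import Optional

# Hypothetical rule table: surface form -> (context cue words, type when a cue is present).
# Only the documented "Claude" + "sonnet" example is encoded here.
CONTEXT_RULES = {"claude": ({"sonnet"}, "model")}


def disambiguate(mention: str, context: str) -> Optional[str]:
    """Return the resolved type if a context cue fires, else None (unresolved)."""
    rule = CONTEXT_RULES.get(mention.lower())
    if rule is None:
        return None
    cues, resolved_type = rule
    # Lowercase word tokens from the surrounding text.
    words = set(re.findall(r"[a-z0-9]+", context.lower()))
    return resolved_type if words & cues else None
```

For example, `disambiguate("Claude", "Claude 3.5 Sonnet tops the benchmark")` resolves to `"model"`, while a context with no cue word stays unresolved.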

Entity identifiers

Every entity has a stable, opaque public ID of the form gld:/<hex> (e.g., gld:/a1b2c3d4e5f6). It is the only entity identifier the API exposes: it appears in every entity_id field, and you pass it to GET /v1/entities/{name_or_id} or any entity filter. The ID stays the same over time, even if the entity’s classification is later corrected. Behind the scenes, “Meta”, “Meta Platforms”, and “Facebook” all resolve to the same entity, and therefore the same ID.
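Because the ID contains `:` and `/`, a client may want to check its shape and percent-encode it before placing it in a URL path. The sketch below makes several assumptions: the hex part has no fixed length, `BASE_URL` is a placeholder host, and the API accepts a percent-encoded ID in the path segment. None of these are confirmed by this page.

```python
import re
from urllib.parse import quote

# Assumption: "gld:/" followed by one or more hex digits; no fixed length or
# letter case is documented, so the pattern is deliberately loose.
ENTITY_ID_RE = re.compile(r"gld:/[0-9a-fA-F]+")

BASE_URL = "https://api.example.com"  # hypothetical placeholder host


def is_entity_id(value: str) -> bool:
    """True if the value looks like a public entity ID rather than a name."""
    return ENTITY_ID_RE.fullmatch(value) is not None


def entity_url(name_or_id: str) -> str:
    """Build the detail URL for GET /v1/entities/{name_or_id}."""
    # safe="" percent-encodes ":" and "/" so the ID stays a single path segment.
    return f"{BASE_URL}/v1/entities/{quote(name_or_id, safe='')}"
```

Since the ID is opaque, treat the pattern check as a quick sanity test only and avoid parsing meaning out of the hex digits.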

Noise filtering

Generic concepts (“AI”, “market”, “industry”) are excluded, and corporate suffixes (“Inc”, “LLC”, “Corp”) are normalized so that “NVIDIA Corporation” and “NVIDIA” resolve to the same entity.
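The following is a minimal illustration of this kind of filtering, not Gildea's actual pipeline. The word lists contain only the examples named above, plus "corporation" as an assumed long form, and `normalize_mention` is a hypothetical helper.

```python
from typing import Optional

# Example lists taken from the text above; the real lists are not published here.
GENERIC_CONCEPTS = {"ai", "market", "industry"}
# "corporation" is an assumed long form of "Corp".
CORPORATE_SUFFIXES = {"inc", "llc", "corp", "corporation"}


def normalize_mention(mention: str) -> Optional[str]:
    """Strip trailing corporate suffixes; return None for generic concepts."""
    tokens = mention.replace(",", " ").split()
    # Drop suffix tokens from the end, but never strip the last remaining token.
    while len(tokens) > 1 and tokens[-1].rstrip(".").lower() in CORPORATE_SUFFIXES:
        tokens.pop()
    name = " ".join(tokens)
    if name.lower() in GENERIC_CONCEPTS:
        return None  # treated as noise
    return name
```

With these lists, "NVIDIA Corporation" and "NVIDIA" both normalize to "NVIDIA", while "AI" is dropped.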

Entity profiles

Each entity has a rich profile including:

Signal count: total signals mentioning this entity
Trend stats: share of voice, weekly counts, growth streak, source diversity
Content type mix: breakdown of expert analysis vs. event signals
Theme distributions: which value chain segments and market forces this entity appears in
Related entities: co-occurrence relationships

Trend analytics

Every entity includes a trend object:

Field	Description
`share_of_voice`	Entity’s share of total corpus over the trailing 4 weeks
`streak`	Consecutive weeks of growth
`current_week`	Signal count this week
`prior_week`	Signal count last week
`source_diversity`	Count of distinct source domains
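The two weekly counts are enough to derive a simple week-over-week change on the client side. This is a sketch under the assumption that `current_week` and `prior_week` are plain integers; `week_over_week_change` is a hypothetical helper, not an API field.

```python
from typing import Optional


def week_over_week_change(trend: dict) -> Optional[float]:
    """Fractional change from prior_week to current_week, or None if undefined."""
    current = trend["current_week"]
    prior = trend["prior_week"]
    if prior == 0:
        return None  # growth from zero has no meaningful ratio
    return (current - prior) / prior
```

For a trend object with `current_week` of 30 and `prior_week` of 24 (made-up values), the function returns 0.25, a 25% increase.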

The discrete interpretation labels below (direction, confidence, stability, notability) are derived from the underlying statistics so you can act on them without doing the statistics yourself.

Interpretation fields

Each entity includes interpretation fields derived from the raw trend stats. These are discrete labels that agents can act on without statistical expertise.

Field	Values	What it tells you
`scale`	`Large`, `Medium`, `Small`	How prominent is this entity relative to the corpus
`direction`	`Rising`, `Stable`, `Declining`, `New`	Which way is coverage trending
`confidence`	`Significant`, `Insignificant`	Is the trend statistically reliable
`stability`	`Volatile`, `Steady`	How consistent is coverage week to week
`notability`	`High`, `Medium`, `Low`, `Negligible`	How much this entity warrants attention right now: foreground vs. background
`notability_reasoning`	Free text	Human-readable explanation of the notability assignment

The interpretation labels are computed over a rolling 12-week window: the current week plus 11 prior weeks. Entities with fewer than 8 mentions in that window return `scale` only; `direction`, `confidence`, `stability`, and `notability` will be null, and the `notability_reasoning` field will contain “Insufficient data for trend analysis.”

An entity whose `first_seen` date is within the last 30 days is classified as `direction: "New"` regardless of its underlying slope; this overrides the Rising/Stable/Declining classification until enough history accumulates. For new entities, `confidence` is forced to `Insignificant` and `stability` is null. As a result, filtering `?direction=Rising` will not surface genuine breakouts younger than 30 days; use `?direction=New` to find recent arrivals.

{
  "entity_id": "gld:/a1b2c3d4e5f6",
  "name": "NVIDIA",
  "scale": "Large",
  "direction": "Rising",
  "confidence": "Significant",
  "stability": "Steady",
  "notability": "High",
  "notability_reasoning": "Large-scale entity with confirmed upward trend and consistent coverage; notable upward shift reliably gaining share.",
  "trend": { "..." : "..." }
}
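Client code that reads these labels needs to handle the null and `New` cases described above. A minimal sketch, assuming the entity has already been parsed into a dict with the field names shown in the sample; `describe_trend` is a hypothetical helper:

```python
def describe_trend(entity: dict) -> str:
    """Summarize an entity's interpretation labels, tolerating null fields."""
    name = entity.get("name", entity.get("entity_id", "entity"))
    direction = entity.get("direction")
    if direction is None:
        # Fewer than 8 mentions in the 12-week window: only scale is populated.
        return f"{name}: insufficient data (scale={entity.get('scale')})"
    if direction == "New":
        # New entities have confidence forced to Insignificant and stability null.
        return f"{name}: new entity, trend not yet established"
    qualifier = "reliable" if entity.get("confidence") == "Significant" else "unconfirmed"
    return f"{name}: {direction.lower()} ({qualifier}, {entity.get('stability')})"
```

Checking `direction` first keeps later branches from touching `stability` or `confidence` when they are null.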

Entity matching is literal

Entity filters and attribution match at the unit level, not the article level: a result is a unit whose own text names the entity, not just any unit from an article that mentions it elsewhere.

GET /v1/signals/{id}: each unit in units[] carries an entities array of the public entity IDs (gld:/…) named in that unit’s text.
GET /v1/search?entity=<id>: returns only units whose text actually contains the entity, not every unit from an article that happens to mention it.

A few consequences worth knowing:

High precision. A result for ?entity=gld:/a1b2c3d4e5f6 literally names that entity.
No coreference resolution. A unit that refers to an entity only by pronoun or description (“the company shipped a 200K-context model”) won’t match an entity filter, even when the referent is obvious from context.
Article-level counts. Trend and co-occurrence statistics count at the article level: an entity mentioned several times in one article counts once.
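The unit-level behavior can be reproduced client-side from a signal response. The sketch below assumes the response has been parsed into a dict with a `units` list whose items carry an `entities` list of IDs, as described above. The `text` key and the sample sentences are illustrative inventions.

```python
def units_naming(signal: dict, entity_id: str) -> list:
    """Return only the units whose own entities array contains entity_id."""
    return [
        unit
        for unit in signal.get("units", [])
        if entity_id in unit.get("entities", [])
    ]


# Illustrative signal: the second unit refers to the entity only by description,
# so its entities array is empty and it does not match.
sample_signal = {
    "units": [
        {"text": "NVIDIA announced a new chip.", "entities": ["gld:/a1b2c3d4e5f6"]},
        {"text": "The company expects strong demand.", "entities": []},
    ]
}
matches = units_naming(sample_signal, "gld:/a1b2c3d4e5f6")
```

Here `matches` holds only the first unit, which mirrors the no-coreference consequence listed above.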

Composable filtering

The list endpoint filters by any combination of interpretation fields, so you can express precise discovery queries:

Filter	Finds
`direction=Rising` + `confidence=Significant`	Entities reliably trending up
`direction=New` + `sort=first_seen`	Recently tracked entities
`stability=Volatile` + `sort=trend`	Entities with the most variable coverage

Combine freely: ?direction=Rising&scale=Large&sort=trend finds high-prominence entities with upward trends.
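A small helper can assemble such query strings and catch misspelled label values before a request is sent. This is a sketch: `entity_query` is hypothetical, the allowed values are copied from the interpretation fields table, and it assumes every interpretation field is accepted as a filter. The `sort` value is passed through unchecked because the full list of sort keys is not given here.

```python
from typing import Optional
from urllib.parse import urlencode

# Allowed values copied from the interpretation fields table.
ALLOWED = {
    "scale": {"Large", "Medium", "Small"},
    "direction": {"Rising", "Stable", "Declining", "New"},
    "confidence": {"Significant", "Insignificant"},
    "stability": {"Volatile", "Steady"},
    "notability": {"High", "Medium", "Low", "Negligible"},
}


def entity_query(sort: Optional[str] = None, **filters: str) -> str:
    """Build a query string from interpretation-field filters."""
    for field, value in filters.items():
        if field not in ALLOWED:
            raise ValueError(f"unknown filter: {field}")
        if value not in ALLOWED[field]:
            raise ValueError(f"invalid value for {field}: {value}")
    params = dict(filters)
    if sort is not None:
        params["sort"] = sort  # sort keys are not validated here
    return "?" + urlencode(params)
```

Calling `entity_query(direction="Rising", scale="Large", sort="trend")` yields the query string from the example above.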

Co-occurrences

The entity detail endpoint includes related_entities: entities that frequently appear together in signals. This reveals industry relationships, competitive dynamics, and supply chain connections.

{
  "related_entities": [
    {
      "entity_id": "gld:/b2c3d4e5f6a7",
      "name": "TSMC",
      "type": "organization",
      "co_occurrence_count": 28
    }
  ]
}
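Once the detail response is parsed, ranking or narrowing the related entities is straightforward. A sketch, assuming the field names shown in the sample; `top_related` is a hypothetical helper:

```python
from typing import Optional


def top_related(detail: dict, entity_type: Optional[str] = None, limit: int = 5) -> list:
    """Return related entities sorted by co_occurrence_count, highest first."""
    related = detail.get("related_entities", [])
    if entity_type is not None:
        # Keep only entities of the requested type, e.g. "organization".
        related = [r for r in related if r.get("type") == entity_type]
    ranked = sorted(related, key=lambda r: r["co_occurrence_count"], reverse=True)
    return ranked[:limit]
```

Filtering by type before ranking lets you look at, say, only the organizations that co-occur with a given model.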
