Search your territory for the verified signals that bear on your question. By hand, this takes hours across newsletters, filings, and X, and the real work is not finding a source but finding the right slice: the central positions, the dissent, and the most recent moves across many independent sources. Gildea’s search is semantic and faceted, so one call returns the on-topic, already-cited units, and you narrow by entity, theme, time, role, or similarity until the slice is exactly what you need.

The recipe

Search is a retrieval primitive: a natural-language query in, a ranked list of verified units out. The power is in the facets you compose over it. You scope each query to your territory, pull a specific layer of the decomposition, find more units like one that matters, tune for recency and source diversity, then drill any signal to its full set of units.
1. Search your territory

A natural-language query, narrowed by entity, theme, and window to the scope you defined. Retrieval is semantic, so phrase the query like a question, not a keyword string.

2. Filter by role

Pull one layer of the decomposition with role: thesis / synopsis (the central position), argument (the reasoning), or claim (atomic factual assertions).

3. Find more like this

Pass similar_to a unit id to get embedding-backed “more like this”: the cluster of units making the same point, across independent sources.

4. Tune the ranking

recency_boost favors newer signals; diversity_cap limits units per source, so one prolific author can’t dominate the page.

5. Split by content type, then drill

signals.list takes content_type to separate event (what happened) from analysis (what experts think). Pull either, then fetch any signal’s full verified decomposition as a flat units[] with roles and evidence.
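To make the idea of a per-source cap concrete, here is a minimal client-side sketch over an already-ranked list. It only illustrates the concept: how Gildea applies diversity_cap on its side is not described on this page, and the helper name cap_per_source and the sample hits are hypothetical. It assumes each hit carries a citation with a domain field, as the recipe code does.

```python
from collections import Counter


def cap_per_source(hits, cap=1):
    """Keep at most `cap` hits per citation domain, preserving rank order.

    Hypothetical helper for illustration; not part of the Gildea SDK.
    """
    seen = Counter()
    kept = []
    for hit in hits:
        domain = hit["citation"]["domain"]
        if seen[domain] < cap:
            kept.append(hit)
            seen[domain] += 1
    return kept


# Made-up ranked hits, shaped like the results used in this recipe.
ranked = [
    {"unit": {"id": "u1"}, "citation": {"domain": "a.example"}},
    {"unit": {"id": "u2"}, "citation": {"domain": "a.example"}},
    {"unit": {"id": "u3"}, "citation": {"domain": "b.example"}},
    {"unit": {"id": "u4"}, "citation": {"domain": "a.example"}},
]

capped = cap_per_source(ranked, cap=1)
capped_two = cap_per_source(ranked, cap=2)
print([h["unit"]["id"] for h in capped])
print([h["unit"]["id"] for h in capped_two])
```

With a cap of 1, the lower-ranked units from a.example are dropped and the unit from b.example moves up, which is the effect the step describes: one prolific source cannot fill the page.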
Install gildea, set GILDEA_API_KEY, then (pure SDK, no model calls):
from gildea import Gildea

gildea = Gildea()

# Your scope from the previous recipe, inlined so this block runs on its own.
scope = {"themes": ["Infrastructure"], "entities": ["NVIDIA"], "window": "6m"}

# 1. Semantic + faceted retrieval: a natural-language query narrowed to your
#    territory (entity + theme + window). Retrieval is semantic, not keyword.
hits = gildea.search(
    "data center moat durability and competitive threats",
    entity=scope["entities"][0], theme=scope["themes"][0], window=scope["window"],
    limit=10,
)["results"]
print(f"{len(hits)} verified units in scope. top:")
for h in hits[:5]:
    u, c = h["unit"], h["citation"]
    print(f"  [{u['role']:9}] {u['text'][:62]}  <- {c['domain']}")

# 2. Role facet: pull one layer of the decomposition -- thesis/synopsis (the
#    central position), argument (reasoning), claim (atomic factual assertions).
theses = gildea.search("AI infrastructure spending", role="thesis", window="3m", limit=5)["results"]
print(f"\n{len(theses)} thesis-level units (positions experts are taking):")
for h in theses[:3]:
    print(f"  - {h['unit']['text'][:80]}")

# 3. similar_to: embedding-backed "more like this" off any unit id. Find the
#    cluster making the same point, across sources. Skipped when the first
#    search returned nothing, since there is no unit to seed from.
similar = []
if hits:
    seed = hits[0]["unit"]["id"]
    similar = gildea.search(similar_to=seed, limit=5)["results"]
print(f"\nunits similar to the top hit ({len(similar)} found):")
for h in similar[:3]:
    print(f"  ~ {h['unit']['text'][:66]}  <- {h['citation']['domain']}")

# 4. Tune the ranking: recency_boost favors newer signals; diversity_cap limits
#    units per source so one prolific author can't dominate the page.
fresh = gildea.search(
    "frontier model competition", recency_boost=1, diversity_cap=1, window="1m", limit=8,
)["results"]
print(f"\nrecency-boosted, source-diversified: {len(fresh)} hits across "
      f"{len({h['citation']['domain'] for h in fresh})} domains")

# 5. Split the feed by content_type -- event (what happened) vs analysis (what
#    experts think) -- then drill a signal to its full verified decomposition
#    as a flat units[] with roles + evidence.
theme = scope["themes"][0]
events = gildea.signals.list(theme=theme, content_type="event", window="3w", limit=5)["signals"]
analyses = gildea.signals.list(theme=theme, content_type="analysis", window="3w", limit=5)["signals"]
print(f"\n{len(events)} event / {len(analyses)} analysis signals in '{theme}'")
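# Prefer an event signal, fall back to an analysis; skip the drill if both are empty.
signals = events or analyses
if signals:
    detail = gildea.signals.get(signals[0]["signal_id"])
    roles = ", ".join(sorted({u["role"] for u in detail["units"]}))
    print(f"drilled '{detail['title'][:48]}' -> {len(detail['units'])} units ({roles})")
else:
    print("no signals in this window; widen it to drill into one")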
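Once a search returns, it can help to see how the hits spread across decomposition layers before filtering by role. The sketch below groups results by unit role using plain dictionaries. The helper group_by_role and the sample hits are hypothetical; only the unit.role and unit.text keys are taken from the recipe above, and the block makes no API calls.

```python
from collections import defaultdict


def group_by_role(hits):
    """Bucket hits by unit role (thesis, argument, claim, ...), keeping rank order.

    Hypothetical helper for illustration; not part of the Gildea SDK.
    """
    grouped = defaultdict(list)
    for hit in hits:
        grouped[hit["unit"]["role"]].append(hit)
    return dict(grouped)


# Made-up hits, shaped like the results used in this recipe.
sample_hits = [
    {"unit": {"id": "u1", "role": "thesis", "text": "Position A"}},
    {"unit": {"id": "u2", "role": "claim", "text": "Fact B"}},
    {"unit": {"id": "u3", "role": "claim", "text": "Fact C"}},
    {"unit": {"id": "u4", "role": "argument", "text": "Reasoning D"}},
]

by_role = group_by_role(sample_hits)
role_counts = {role: len(items) for role, items in by_role.items()}
for role, count in sorted(role_counts.items()):
    print(f"{role:9} {count}")
```

If one layer dominates, a follow-up search with the role facet set to the layer you are missing is a reasonable next call.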

What you get

The verified, cited units for your territory, sliced exactly how you need them: by entity and theme, by decomposition layer, by similarity, by recency and source spread. Every unit is already verified and carries its citation, so the set you assemble is one you can trace to source and embed beside your own data. Retrieval is semantic, so phrasing the query like a question works better than keyword matching.
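As one way to place these units beside your own data, the sketch below flattens hits into rows and writes them as CSV with the standard library. The helper hits_to_rows, the column names, and the sample hits are hypothetical; the keys it reads (unit id, role, text, and citation domain) are the ones used in the recipe code, and any other fields a real response carries are ignored here.

```python
import csv
import io

FIELDS = ["unit_id", "role", "text", "domain"]


def hits_to_rows(hits):
    """Flatten search hits into flat dicts, one row per unit.

    Hypothetical helper for illustration; not part of the Gildea SDK.
    """
    return [
        {
            "unit_id": h["unit"]["id"],
            "role": h["unit"]["role"],
            "text": h["unit"]["text"],
            "domain": h["citation"]["domain"],
        }
        for h in hits
    ]


# Made-up hits, shaped like the results used in this recipe.
sample_hits = [
    {
        "unit": {"id": "u1", "role": "thesis", "text": "Position A"},
        "citation": {"domain": "a.example"},
    },
    {
        "unit": {"id": "u2", "role": "claim", "text": "Fact B, with a comma"},
        "citation": {"domain": "b.example"},
    },
]

rows = hits_to_rows(sample_hits)

# Write to an in-memory buffer; swap in open("units.csv", "w", newline="") for a file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Keeping the unit id and source domain on every row means each line in your own table can still be traced back to its citation.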

Search

List signals

Get signal