The recipe
You pull the verified units for your territory, embed their text with your own embedder, and persist the vectors next to their provenance. Use the same model you embed your own documents with, so every vector is comparable and one nearest-neighbor search spans both.Pull the units you track
Take the verified units for your territory (from Search or your context store), keeping each unit’s
id and citation as provenance.Embed with your own model
Embed the unit text with your embedder (Cohere, OpenAI, Voyage, your choice). Use the same model everywhere, so every vector is comparable and search across your own data and Gildea’s is meaningful.
gildea, cohere, and numpy, set GILDEA_API_KEY and COHERE_API_KEY, then:
What you get
A persisted vector for every verified unit, each carrying itsid and citation, sitting in the same space as your own documents. One similarity search now spans your private context and Gildea’s verified market record. You embed once with your own model and hold the vectors, so there’s no lock-in and no second retrieval system to reconcile. Keep the set current with Update.