Filters and Recency Boosts

Metadata filters

Filters are hard constraints. A chunk must match every filter to be scored and returned. They run against the redb metadata store using precomputed bitsets, so they resolve in microseconds.

Operators

Operator	Syntax	Type	Example
Exact match	`"field": value`	string, bool, number	`"scene_type": "goal"`
Numeric range	`"field": { "gte": n, "lte": n }`	int, float	`"timerange_start": { "gte": 2040.0, "lte": 2100.0 }`
Array contains	`"field": { "contains": value }`	string list	`"tags": { "contains": "sports" }`
Set membership	`"field": { "in": [...] }`	any	`"doc_type": { "in": ["segment", "flow"] }`

All filters in the `filters` object combine as AND. There is no OR operator at the top level. Use `"in"` for multi-value matching on a single field.
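The operator semantics above can be sketched in a few lines of Python. This is only an illustration of the matching rules, not how Compass evaluates filters (the server resolves them with precomputed bitsets). The function names `matches_condition` and `matches_filters` and the sample chunk are hypothetical.

```python
from typing import Any, Mapping


def matches_condition(value: Any, condition: Any) -> bool:
    """Check one metadata value against one filter condition."""
    if isinstance(condition, Mapping):
        if "in" in condition:
            # Set membership: the value must be one of the listed options.
            return value in condition["in"]
        if "contains" in condition:
            # Array contains: the metadata value is a list holding the item.
            return isinstance(value, (list, tuple)) and condition["contains"] in value
        # Numeric range: either bound may be omitted.
        if value is None:
            return False
        if "gte" in condition and value < condition["gte"]:
            return False
        if "lte" in condition and value > condition["lte"]:
            return False
        return True
    # Bare value: exact match.
    return value == condition


def matches_filters(metadata: Mapping[str, Any], filters: Mapping[str, Any]) -> bool:
    """A chunk passes only if every filter matches (AND semantics)."""
    return all(
        matches_condition(metadata.get(field), condition)
        for field, condition in filters.items()
    )


# Hypothetical chunk metadata, using field names from the table above.
chunk = {
    "scene_type": "goal",
    "doc_type": "segment",
    "tags": ["sports", "football"],
    "timerange_start": 2055.0,
}
filters = {
    "scene_type": "goal",
    "doc_type": {"in": ["segment", "flow"]},
    "tags": {"contains": "sports"},
    "timerange_start": {"gte": 2040.0, "lte": 2100.0},
}
print(matches_filters(chunk, filters))                               # all four match
print(matches_filters(chunk, {**filters, "scene_type": "replay"}))   # one mismatch fails the chunk
```

Because the filters combine as AND, a single failing condition excludes the chunk, which is why the second call returns `False`.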

Examples

$ # Exact match on two fields (AND)
$ curl -X POST $COMPASS_BASE_URL/collections/media/search \
>   -H 'Content-Type: application/json' \
>   -d '{
>     "query": "striker celebrating",
>     "filters": {
>       "scene_type": "goal",
>       "active_speaker": "commentator"
>     }
>   }'
$ 
$ # Numeric range on time window
$ curl -X POST $COMPASS_BASE_URL/collections/media/search \
>   -H 'Content-Type: application/json' \
>   -d '{
>     "query": "goal celebration",
>     "filters": {
>       "timerange_start": { "gte": 2040.0 },
>       "timerange_end": { "lte": 2100.0 }
>     }
>   }'
$ 
$ # Set membership for doc type plus exact scene
$ curl -X POST $COMPASS_BASE_URL/collections/media/search \
>   -H 'Content-Type: application/json' \
>   -d '{
>     "query": "goal celebration",
>     "filters": {
>       "doc_type": { "in": ["segment"] },
>       "scene_type": "goal"
>     }
>   }'

Recency bias

Recency bias favors newer content without hiding older content: old chunks score lower but still appear. The decay is exponential with a floor: `score *= max(min_score, 2^(-age_days / half_life_days))`. With a 30-day half-life, a 30-day-old chunk scores 0.5x what it would have scored at age 0, provided `min_score` does not exceed 0.5.

`recency_field` tells Compass which metadata field holds the timestamp. Compass never assumes a default.
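As a worked example of the decay formula, the sketch below computes the recency multiplier for a few ages. The function name `recency_multiplier` is hypothetical, and the values 30 and 0.3 are chosen only for illustration. Ages are assumed to be non-negative; how Compass treats timestamps in the future is not covered here.

```python
def recency_multiplier(age_days: float, half_life_days: float, min_score: float) -> float:
    """Return max(min_score, 2^(-age_days / half_life_days))."""
    decay = 2.0 ** (-age_days / half_life_days)
    return max(min_score, decay)


# 30-day half-life with a floor of 0.3.
for age in (0, 30, 60, 120):
    print(age, round(recency_multiplier(age, 30, 0.3), 4))
```

At age 0 the multiplier is 1.0, at 30 days it is 0.5, and from 60 days onward the raw decay (0.25 and below) is replaced by the 0.3 floor.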

Presets

$ curl -X POST $COMPASS_BASE_URL/collections/media/search \
>   -H 'Content-Type: application/json' \
>   -d '{
>     "query": "live event highlights",
>     "recency_preset": "aggressive",
>     "recency_field": "created_at",
>     "top_k": 10
>   }'

Preset	Half-life	Score floor	Use when
`aggressive`	3 days	5%	Real-time alerts, live events, TAMS segments
`recent`	7 days	20%	News feeds, support tickets
`mild`	30 days	30%	Docs, reports, meeting notes
`archive`	90 days	50%	Legal docs, compliance, long-lived content
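One way to compare presets is to ask how old a chunk must be before the score floor takes over. The sketch below copies the half-life and floor values from the table and assumes each preset follows the decay formula given earlier; `PRESETS` and `floor_age_days` are hypothetical names, not part of the Compass API.

```python
import math

# (half_life_days, min_score) per preset, copied from the table above.
PRESETS = {
    "aggressive": (3.0, 0.05),
    "recent": (7.0, 0.20),
    "mild": (30.0, 0.30),
    "archive": (90.0, 0.50),
}


def floor_age_days(half_life_days: float, min_score: float) -> float:
    """Age at which 2^(-age / half_life) has decayed down to min_score."""
    return half_life_days * math.log2(1.0 / min_score)


for name, (half_life, floor) in PRESETS.items():
    print(f"{name}: floor reached after {floor_age_days(half_life, floor):.1f} days")
```

Under these assumptions the floor is reached after roughly 13 days for `aggressive`, 16 days for `recent`, 52 days for `mild`, and 90 days for `archive`. Beyond that age, recency no longer distinguishes one chunk from another.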

Custom recency

For full control, use the `recency` object instead of a preset. It overrides any preset:

$ curl -X POST $COMPASS_BASE_URL/collections/media/search \
>   -H 'Content-Type: application/json' \
>   -d '{
>     "query": "quarterly report",
>     "recency": {
>       "field": "created_at",
>       "half_life_days": 14,
>       "min_score": 0.15
>     }
>   }'

`half_life_days` is the age, in days, at which a chunk’s recency multiplier halves. `min_score` is the floor on that multiplier, so older chunks never fully disappear. Both are required when using the `recency` object.
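When requests are built programmatically, a small helper can guarantee that both required keys are always present. The sketch below is a hypothetical client-side helper (`recency_object` is not part of any Compass SDK), and its range checks are assumptions about sensible values rather than documented server-side validation.

```python
import json


def recency_object(field: str, half_life_days: float, min_score: float) -> dict:
    """Build the custom `recency` object with both required parameters."""
    # Assumed sanity checks; the server's actual validation rules may differ.
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    if not 0.0 <= min_score <= 1.0:
        raise ValueError("min_score must be between 0 and 1")
    return {
        "field": field,
        "half_life_days": half_life_days,
        "min_score": min_score,
    }


body = {
    "query": "quarterly report",
    "recency": recency_object("created_at", 14, 0.15),
}
print(json.dumps(body, indent=2))
```

The printed JSON matches the request body in the curl example above and can be sent with any HTTP client.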

Field boosts

Boosts are multiplicative score adjustments applied after retrieval. A chunk that matches a boost condition gets its score multiplied by `weight`. Chunks that don’t match the condition are unaffected.

$ curl -X POST $COMPASS_BASE_URL/collections/media/search \
>   -H 'Content-Type: application/json' \
>   -d '{
>     "query": "striker near corner flag",
>     "recency_preset": "mild",
>     "recency_field": "created_at",
>     "boosts": [
>       { "field": "scene_type", "value": "goal", "weight": 2.0 },
>       { "field": "shot_quality_score", "gte": 0.7, "weight": 1.5 }
>     ]
>   }'

Boost fields:

Field	Type	Description
`field`	string	Metadata field name to match on.
`value`	any	Exact match condition.
`gte`	number	Numeric lower bound for the condition.
`lte`	number	Numeric upper bound for the condition.
`weight`	float	Score multiplier (1.0 = no change, 2.0 = double).

Use `value` for exact match. Use `gte`/`lte` for numeric range conditions. Both `gte` and `lte` can appear in the same boost entry.
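The boost rules can be illustrated with a short sketch. It assumes that when several boosts match the same chunk, their weights compound by multiplication; the source does not state how multiple matches combine, so treat that as an assumption. `boost_matches` and `apply_boosts` are hypothetical names.

```python
from typing import Any, Mapping, Sequence


def boost_matches(metadata: Mapping[str, Any], boost: Mapping[str, Any]) -> bool:
    """Check whether a chunk's metadata satisfies one boost condition."""
    value = metadata.get(boost["field"])
    if "value" in boost:
        # Exact match condition.
        return value == boost["value"]
    if value is None or not ("gte" in boost or "lte" in boost):
        return False
    # Numeric range condition: either or both bounds may be present.
    if "gte" in boost and value < boost["gte"]:
        return False
    if "lte" in boost and value > boost["lte"]:
        return False
    return True


def apply_boosts(
    score: float, metadata: Mapping[str, Any], boosts: Sequence[Mapping[str, Any]]
) -> float:
    """Multiply the score by the weight of every matching boost (assumed to compound)."""
    for boost in boosts:
        if boost_matches(metadata, boost):
            score *= boost["weight"]
    return score


boosts = [
    {"field": "scene_type", "value": "goal", "weight": 2.0},
    {"field": "shot_quality_score", "gte": 0.7, "weight": 1.5},
]
print(apply_boosts(0.5, {"scene_type": "goal", "shot_quality_score": 0.82}, boosts))
print(apply_boosts(0.5, {"scene_type": "replay", "shot_quality_score": 0.4}, boosts))
```

In the first call both conditions match, so the score is multiplied by 2.0 and then by 1.5. In the second call neither matches and the score is returned unchanged.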

Hybrid score weights

In `"mode": "hybrid"`, BM25 (full-text) and HNSW (vector) results are merged via Reciprocal Rank Fusion (RRF). Three parameters control the blend:

$ curl -X POST $COMPASS_BASE_URL/collections/media/search \
>   -H 'Content-Type: application/json' \
>   -d '{
>     "query": "quarterly earnings call",
>     "mode": "hybrid",
>     "score_weights": {
>       "rrf_k": 60.0,
>       "fts_weight": 2.0,
>       "semantic_weight": 0.5
>     }
>   }'

Parameter	Default	Effect
`rrf_k`	60.0	RRF constant. Lower values amplify differences between top-ranked and lower-ranked results.
`fts_weight`	1.0	Relative weight of BM25. Set higher when keyword precision matters (legal, compliance, code).
`semantic_weight`	1.0	Relative weight of HNSW. Set higher when meaning matters more than exact terms.

For transcript search where exact speaker names and technical terms matter, try `fts_weight: 2.0`, `semantic_weight: 0.5`. For open-ended queries against visual content, try `semantic_weight: 2.0`, `fts_weight: 0.5`.
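To build intuition for how the three parameters interact, here is a minimal sketch of weighted RRF. It assumes the common formulation in which each ranked list contributes `weight / (rrf_k + rank)` per result, with ranks starting at 1; the exact formula Compass uses is not stated here, so this is an illustration rather than the implementation. The function name `weighted_rrf` and the chunk IDs are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def weighted_rrf(
    fts_ranking: Sequence[str],
    semantic_ranking: Sequence[str],
    rrf_k: float = 60.0,
    fts_weight: float = 1.0,
    semantic_weight: float = 1.0,
) -> List[Tuple[str, float]]:
    """Fuse two ranked lists; each contributes weight / (rrf_k + rank)."""
    scores: Dict[str, float] = {}
    for weight, ranking in ((fts_weight, fts_ranking), (semantic_weight, semantic_ranking)):
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + weight / (rrf_k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


fts = ["a", "b", "c"]        # BM25 order, best first
semantic = ["c", "a", "d"]   # HNSW order, best first

balanced = weighted_rrf(fts, semantic)
semantic_heavy = weighted_rrf(fts, semantic, fts_weight=0.5, semantic_weight=2.0)
print([chunk_id for chunk_id, _ in balanced])
print([chunk_id for chunk_id, _ in semantic_heavy])
```

With equal weights the fused order is a, c, b, d. Shifting weight toward the semantic list moves its top result to the front and yields c, a, d, b.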