What's been cooking — April 2026

15 merged PRs across 8 repos

April was mostly about squeezing more frames per second out of silicon that's already on people's desks. Apple's Neural Engine got a serious workout in Deep-Live-Cam, CoreML got a couple of new ops upstream in onnxruntime, GLiNER's cold starts got cut in half, and Exa quietly showed up everywhere — both as a new search backend in two different agent frameworks and as the new copyright holder on a small pile of LICENSE files.

hacksider/Deep-Live-Cam

Four PRs, all aimed at the same thing: getting the live face-swap pipeline off the CPU and onto whatever accelerator is actually available. Deep-Live-Cam#1746 kicked things off by taking an M3 Max from a slideshow-grade 1.5 FPS to a steady 10+ FPS with bit-exact output, and Deep-Live-Cam#1775 followed up with the heavier surgery — CoreML graph rewrites that eliminate CPU fallbacks entirely (Pad(reflect) becomes Slice+Concat, Shape→Gather folds to constants, Split becomes Slice pairs), all cached to disk so you only pay the rewrite cost once per machine. Windows CUDA users got pulled along for the ride with a 4–5x FPS bump and proper platform routing. Deep-Live-Cam#1776 attacks the other side of the pipeline: paste-back compositing, which had been quietly running erode+blur kernels at face-bbox scale and burning O(area · k²) per frame; rewriting it to operate on the crop area with uint8 cv2 SIMD blends fixes that. And Deep-Live-Cam#1777 closes a subtle annoyance where ONNX Runtime's CoreML EP rejects rank-0 Gather indices, which was kicking GFPGAN's 1024 variant off the ANE via 16 separate scalar-index slices — widening those indices keeps the whole subgraph on the accelerator.
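To make the Pad(reflect) rewrite a little more concrete, here is a minimal pure-Python sketch of the identity it relies on: reflect padding along one axis gives the same result as concatenating reversed slices of the input around the original. This is an illustration on 1-D lists, not the code from Deep-Live-Cam#1775. The function names are hypothetical, and the use of reversed slices is an assumption about how such a rewrite might be expressed.

```python
def reflect_pad_reference(row, pad):
    """Reflect-pad a 1-D sequence by index arithmetic (edge value not repeated)."""
    n = len(row)
    if not 0 <= pad < n:
        raise ValueError("reflect padding needs 0 <= pad < len(row)")
    left = [row[pad - i] for i in range(pad)]        # row[pad], ..., row[1]
    right = [row[n - 2 - i] for i in range(pad)]     # row[n-2], ..., row[n-1-pad]
    return left + list(row) + right


def reflect_pad_slice_concat(row, pad):
    """Same padding built only from slices and concatenation."""
    n = len(row)
    if not 0 <= pad < n:
        raise ValueError("reflect padding needs 0 <= pad < len(row)")
    if pad == 0:
        return list(row)
    # Left border: elements pad..1, read backwards.
    left = row[pad:0:-1]
    # Right border: elements n-2..n-1-pad, read backwards.
    # A negative stop would wrap around in Python, so use None to run to index 0.
    stop = n - 2 - pad
    right = row[n - 2:(stop if stop >= 0 else None):-1]
    return list(left) + list(row) + list(right)


if __name__ == "__main__":
    row = [0, 1, 2, 3, 4]
    for pad in range(len(row)):
        assert reflect_pad_slice_concat(row, pad) == reflect_pad_reference(row, pad)
    print(reflect_pad_slice_concat(row, 2))
```

Because the padded result is only slices glued together, a graph rewrite along these lines needs no dedicated reflect-pad kernel, which is the property that matters when the goal is to avoid a CPU fallback.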

urchade/GLiNER

A productive month on the model-loading ergonomics front. GLiNER#348 tightened up from_pretrained so it actually loads at the requested dtype rather than casting after the fact, and narrowed quantize= down to int8 (the pure-downcast aliases that briefly existed never made it to a release, so nobody's wheel breaks). GLiNER#354 adds a variant= kwarg so you can pull half-precision weights from the Hub directly instead of downloading 745 MB of fp32 just to cast it down in RAM, and GLiNER#355 wires up low_cpu_mem_usage=True for roughly 2x faster cold starts. The unsung hero is GLiNER#351, which fixed 35 pre-existing test failures, removed the PYTHONPATH=. shim, and added a CI workflow so this kind of bit-rot doesn't accumulate again. On Python 3.12 the suite went from 230 passed, 35 failed, and 1 error to 265 passed and 0 failed.

microsoft/onnxruntime

Two small but useful additions to the CoreML EP's op coverage. onnxruntime#28182 adds HardSigmoid, which maps cleanly onto MIL's sigmoid_hard with no decomposition needed, covering both the MLProgram and NeuralNetwork paths. onnxruntime#28184 adds com.microsoft:QuickGelu on the MLProgram side, decomposed into the obvious three-op mul/sigmoid/mul chain that matches the op's own schema function body. Both are the kind of unglamorous coverage work that keeps subgraphs from getting partitioned back to CPU.
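For readers who don't have the two activations memorized, here is a small scalar sketch of the math involved. It is not the CoreML EP builder code. It assumes the ONNX default parameters (alpha=0.2, beta=0.5 for HardSigmoid; alpha=1.702 for QuickGelu), and the function names are made up for illustration.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function for a Python float."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def hard_sigmoid(x, alpha=0.2, beta=0.5):
    """HardSigmoid: clamp(alpha * x + beta, 0, 1). One op, no decomposition."""
    return max(0.0, min(1.0, alpha * x + beta))


def quick_gelu_three_ops(x, alpha=1.702):
    """QuickGelu written as the mul / sigmoid / mul chain."""
    scaled = alpha * x        # mul: scale the input by alpha
    gate = sigmoid(scaled)    # sigmoid: squash the scaled input into (0, 1)
    return x * gate           # mul: gate the original input


if __name__ == "__main__":
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(x, hard_sigmoid(x), quick_gelu_three_ops(x))
```

The point of the decomposition is that every step is an op the backend already supports, so the node stays inside the accelerated partition instead of forcing a split.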

NVIDIA/NeMo-Agent-Toolkit

NeMo-Agent-Toolkit#1846 adds an exa_internet_search tool built on langchain_exa.ExaSearchResults, structured as a near-mirror of the existing Tavily integration. Configurable max_results, search_type (auto/neural/keyword), and livecrawl options are exposed through ExaInternetSearchToolConfig, so swapping search backends in an agent config is a one-line affair.

NVIDIA-AI-Blueprints/aiq

Same idea, different framework: aiq#181 ships an exa_web_search NAT data source that mirrors the existing tavily_web_search package — same stub-on-missing-key behavior, same retries, same content-truncation and XML-tagged output. auto/deep/fast search types are exposed as first-class options.

exa-labs/company-researcher

company-researcher#10 adds a proper MIT LICENSE file, with Exa Labs as the copyright holder, matching the rest of Exa's OSS surface area. The repo had been quietly unlicensed.

exa-labs/exa-py

exa-py#198 does the same thing for the Python SDK — adds an MIT LICENSE attributed to Exa Labs rather than an individual, bringing it in line with exa-js.

exa-labs/exa-js

And rounding out the licensing tidy-up, exa-js#161 updates the existing LICENSE's copyright holder from "Hubert Yuan" to "Exa Labs". One line changed, no code touched.

That's April: real performance work where it counts, more CoreML coverage where it doesn't yet exist, and a coordinated round of paperwork at Exa. See you in May.