AI Radar Public Daily - 2026-06-01

1. Comparative Analysis of Large Language Model Inference Serving Systems: A Performance Study of vLLM and HuggingFace TGI

AI Radar public signal from arXiv AI Infra, score 15.52.

Why it matters

Matched focus keywords: GPU, PagedAttention, TGI, memory, vLLM.

Suggested action

Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2511.17593v1

2. Blink: CPU-Free LLM Inference by Delegating the Serving Stack to GPU and SmartNIC

AI Radar public signal from arXiv AI Infra, score 15.52.

Why it matters

Matched focus keywords: GPU, NPU, RDMA, SGLang, TensorRT-LLM, memory, vLLM.

Suggested action

Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2604.07609v1

3. TokenWeave: Efficient Compute-Communication Overlap for Distributed LLM Inference

AI Radar public signal from arXiv AI Infra, score 15.52.

Why it matters

Matched focus keywords: GPU, NVLink, SGLang, TensorRT-LLM, vLLM.

Suggested action

Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2505.11329v5

4. freeforall06/awesome-claude-code

AI Radar public signal from GitHub Coding Agent Search, score 14.37.

Why it matters

Matched focus keywords: Claude Code, Codex, agentic coding.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/freeforall06/awesome-claude-code

5. Pendragonaffectation426/maestro

AI Radar public signal from GitHub Kernel Search, score 14.37.

Why it matters

Matched focus keywords: Claude Code, Codex, Cursor.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/Pendragonaffectation426/maestro

6. Ada-MK: Adaptive MegaKernel Optimization via Automated DAG-based Search for LLM Inference

AI Radar public signal from arXiv AI Infra, score 13.50.

Why it matters

Matched focus keywords: GPU, HBM, MLIR, TensorRT-LLM, memory, vLLM.

Suggested action

Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2605.11581v1

7. QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

AI Radar public signal from arXiv AI Infra, score 13.22.

Why it matters

Matched focus keywords: DiT, GPU, KV cache, TensorRT-LLM, memory, quantization.

Suggested action

Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2405.04532v3

8. FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving

AI Radar public signal from arXiv AI Infra, score 13.22.

Why it matters

Matched focus keywords: DiT, FlashInfer, GPU, SGLang, memory, vLLM.

Suggested action

Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2501.01005v2

9. SCOOT: SLO-Oriented Performance Tuning for LLM Inference Engines

AI Radar public signal from arXiv AI Infra, score 13.22.

Why it matters

Matched focus keywords: TensorRT-LLM, vLLM.

Suggested action

Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2408.04323v2

10. SGLang: Efficient Execution of Structured Language Model Programs

AI Radar public signal from arXiv AI Infra, score 13.22.

Why it matters

Matched focus keywords: KV cache, NPU, SGLang.

Suggested action

Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2312.07104v2

11. COMET: Towards Partical W4A4KV4 LLMs Serving

AI Radar public signal from arXiv AI Infra, score 13.22.

Why it matters

Matched focus keywords: DiT, GPU, KV cache, TensorRT-LLM, memory, quantization.

Suggested action

Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2410.12168v1

12. CadPosting/third-brain

AI Radar public signal from GitHub Coding Agent Search, score 12.50.

Why it matters

Matched focus keywords: Agent, Claude Code, DPO, MCP, memory.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/CadPosting/third-brain

13. proto-dredge424/modus-memory

AI Radar public signal from GitHub Kernel Search, score 12.50.

Why it matters

Matched focus keywords: Agent, MCP, memory.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/proto-dredge424/modus-memory

14. ronie-aduana/mcp-ai-memory

AI Radar public signal from GitHub Agents MCP Search, score 12.30.

Why it matters

Matched focus keywords: Agent, MCP, memory.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/ronie-aduana/mcp-ai-memory

15. UdayKumarVeera/multi-agent-mcp-system

AI Radar public signal from GitHub Agents MCP Search, score 12.30.

Why it matters

Matched focus keywords: Agent, LangGraph, MCP.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/UdayKumarVeera/multi-agent-mcp-system

16. Katta-Nitish/mcp-agentic-rag

AI Radar public signal from GitHub Agents MCP Search, score 12.30.

Why it matters

Matched focus keywords: Agent, LangGraph, MCP.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/Katta-Nitish/mcp-agentic-rag

17. deepanshumody/discovery-agents

AI Radar public signal from GitHub Eval Bench Search, score 12.30.

Why it matters

Matched focus keywords: Agent, LangGraph, MCP.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/deepanshumody/discovery-agents

18. HGVAbyte/rlhf-data-agent-full

AI Radar public signal from GitHub Synthetic Data Search, score 12.20.

Why it matters

Matched focus keywords: Agent, DPO, RLHF, synthetic data.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/HGVAbyte/rlhf-data-agent-full

19. korbinjoe/openteam

AI Radar public signal from GitHub Coding Agent Search, score 12.07.

Why it matters

Matched focus keywords: Claude Code, Codex.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/korbinjoe/openteam

20. joohw/clovapi

AI Radar public signal from GitHub Coding Agent Search, score 12.07.

Why it matters

Matched focus keywords: Agent, Claude Code, Codex.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/joohw/clovapi