AI Radar Public Daily - 2026-06-01
1. Comparative Analysis of Large Language Model Inference Serving Systems: A Performance Study of vLLM and HuggingFace TGI
- Source: arXiv AI Infra
- Link: http://arxiv.org/abs/2511.17593v1
Comparative Analysis of Large Language Model Inference Serving Systems: A Performance Study of vLLM and HuggingFace TGI is an AI Radar public signal from arXiv AI Infra with score 15.52.
Why it matters
Matched focus keywords: GPU, PagedAttention, TGI, memory, vLLM.
Suggested action
Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2511.17593v1
2. Blink: CPU-Free LLM Inference by Delegating the Serving Stack to GPU and SmartNIC
- Source: arXiv AI Infra
- Link: http://arxiv.org/abs/2604.07609v1
Blink: CPU-Free LLM Inference by Delegating the Serving Stack to GPU and SmartNIC is an AI Radar public signal from arXiv AI Infra with score 15.52.
Why it matters
Matched focus keywords: GPU, NPU, RDMA, SGLang, TensorRT-LLM, memory, vLLM.
Suggested action
Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2604.07609v1
3. TokenWeave: Efficient Compute-Communication Overlap for Distributed LLM Inference
- Source: arXiv AI Infra
- Link: http://arxiv.org/abs/2505.11329v5
TokenWeave: Efficient Compute-Communication Overlap for Distributed LLM Inference is an AI Radar public signal from arXiv AI Infra with score 15.52.
Why it matters
Matched focus keywords: GPU, NVLink, SGLang, TensorRT-LLM, vLLM.
Suggested action
Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2505.11329v5
4. freeforall06/awesome-claude-code
- Source: GitHub Coding Agent Search
- Link: https://github.com/freeforall06/awesome-claude-code
freeforall06/awesome-claude-code is an AI Radar public signal from GitHub Coding Agent Search with score 14.37.
Why it matters
Matched focus keywords: Claude Code, Codex, agentic coding.
Suggested action
Review the source and decide whether it deserves deeper follow-up: https://github.com/freeforall06/awesome-claude-code
5. Pendragonaffectation426/maestro
- Source: GitHub Kernel Search
- Link: https://github.com/Pendragonaffectation426/maestro
Pendragonaffectation426/maestro is an AI Radar public signal from GitHub Kernel Search with score 14.37.
Why it matters
Matched focus keywords: Claude Code, Codex, Cursor.
Suggested action
Review the source and decide whether it deserves deeper follow-up: https://github.com/Pendragonaffectation426/maestro
6. Ada-MK: Adaptive MegaKernel Optimization via Automated DAG-based Search for LLM Inference
- Source: arXiv AI Infra
- Link: http://arxiv.org/abs/2605.11581v1
Ada-MK: Adaptive MegaKernel Optimization via Automated DAG-based Search for LLM Inference is an AI Radar public signal from arXiv AI Infra with score 13.50.
Why it matters
Matched focus keywords: GPU, HBM, MLIR, TensorRT-LLM, memory, vLLM.
Suggested action
Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2605.11581v1
7. QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
- Source: arXiv AI Infra
- Link: http://arxiv.org/abs/2405.04532v3
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving is an AI Radar public signal from arXiv AI Infra with score 13.22.
Why it matters
Matched focus keywords: DiT, GPU, KV cache, TensorRT-LLM, memory, quantization.
Suggested action
Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2405.04532v3
8. FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving
- Source: arXiv AI Infra
- Link: http://arxiv.org/abs/2501.01005v2
FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving is an AI Radar public signal from arXiv AI Infra with score 13.22.
Why it matters
Matched focus keywords: DiT, FlashInfer, GPU, SGLang, memory, vLLM.
Suggested action
Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2501.01005v2
9. SCOOT: SLO-Oriented Performance Tuning for LLM Inference Engines
- Source: arXiv AI Infra
- Link: http://arxiv.org/abs/2408.04323v2
SCOOT: SLO-Oriented Performance Tuning for LLM Inference Engines is an AI Radar public signal from arXiv AI Infra with score 13.22.
Why it matters
Matched focus keywords: TensorRT-LLM, vLLM.
Suggested action
Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2408.04323v2
10. SGLang: Efficient Execution of Structured Language Model Programs
- Source: arXiv AI Infra
- Link: http://arxiv.org/abs/2312.07104v2
SGLang: Efficient Execution of Structured Language Model Programs is an AI Radar public signal from arXiv AI Infra with score 13.22.
Why it matters
Matched focus keywords: KV cache, NPU, SGLang.
Suggested action
Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2312.07104v2
11. COMET: Towards Partical W4A4KV4 LLMs Serving
- Source: arXiv AI Infra
- Link: http://arxiv.org/abs/2410.12168v1
COMET: Towards Partical W4A4KV4 LLMs Serving is an AI Radar public signal from arXiv AI Infra with score 13.22.
Why it matters
Matched focus keywords: DiT, GPU, KV cache, TensorRT-LLM, memory, quantization.
Suggested action
Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2410.12168v1
12. CadPosting/third-brain
- Source: GitHub Coding Agent Search
- Link: https://github.com/CadPosting/third-brain
CadPosting/third-brain is an AI Radar public signal from GitHub Coding Agent Search with score 12.50.
Why it matters
Matched focus keywords: Agent, Claude Code, DPO, MCP, memory.
Suggested action
Review the source and decide whether it deserves deeper follow-up: https://github.com/CadPosting/third-brain
13. proto-dredge424/modus-memory
- Source: GitHub Kernel Search
- Link: https://github.com/proto-dredge424/modus-memory
proto-dredge424/modus-memory is an AI Radar public signal from GitHub Kernel Search with score 12.50.
Why it matters
Matched focus keywords: Agent, MCP, memory.
Suggested action
Review the source and decide whether it deserves deeper follow-up: https://github.com/proto-dredge424/modus-memory
14. ronie-aduana/mcp-ai-memory
- Source: GitHub Agents MCP Search
- Link: https://github.com/ronie-aduana/mcp-ai-memory
ronie-aduana/mcp-ai-memory is an AI Radar public signal from GitHub Agents MCP Search with score 12.30.
Why it matters
Matched focus keywords: Agent, MCP, memory.
Suggested action
Review the source and decide whether it deserves deeper follow-up: https://github.com/ronie-aduana/mcp-ai-memory
15. UdayKumarVeera/multi-agent-mcp-system
- Source: GitHub Agents MCP Search
- Link: https://github.com/UdayKumarVeera/multi-agent-mcp-system
UdayKumarVeera/multi-agent-mcp-system is an AI Radar public signal from GitHub Agents MCP Search with score 12.30.
Why it matters
Matched focus keywords: Agent, LangGraph, MCP.
Suggested action
Review the source and decide whether it deserves deeper follow-up: https://github.com/UdayKumarVeera/multi-agent-mcp-system
16. Katta-Nitish/mcp-agentic-rag
- Source: GitHub Agents MCP Search
- Link: https://github.com/Katta-Nitish/mcp-agentic-rag
Katta-Nitish/mcp-agentic-rag is an AI Radar public signal from GitHub Agents MCP Search with score 12.30.
Why it matters
Matched focus keywords: Agent, LangGraph, MCP.
Suggested action
Review the source and decide whether it deserves deeper follow-up: https://github.com/Katta-Nitish/mcp-agentic-rag
17. deepanshumody/discovery-agents
- Source: GitHub Eval Bench Search
- Link: https://github.com/deepanshumody/discovery-agents
deepanshumody/discovery-agents is an AI Radar public signal from GitHub Eval Bench Search with score 12.30.
Why it matters
Matched focus keywords: Agent, LangGraph, MCP.
Suggested action
Review the source and decide whether it deserves deeper follow-up: https://github.com/deepanshumody/discovery-agents
18. HGVAbyte/rlhf-data-agent-full
- Source: GitHub Synthetic Data Search
- Link: https://github.com/HGVAbyte/rlhf-data-agent-full
HGVAbyte/rlhf-data-agent-full is an AI Radar public signal from GitHub Synthetic Data Search with score 12.20.
Why it matters
Matched focus keywords: Agent, DPO, RLHF, synthetic data.
Suggested action
Review the source and decide whether it deserves deeper follow-up: https://github.com/HGVAbyte/rlhf-data-agent-full
19. korbinjoe/openteam
- Source: GitHub Coding Agent Search
- Link: https://github.com/korbinjoe/openteam
korbinjoe/openteam is an AI Radar public signal from GitHub Coding Agent Search with score 12.07.
Why it matters
Matched focus keywords: Claude Code, Codex.
Suggested action
Review the source and decide whether it deserves deeper follow-up: https://github.com/korbinjoe/openteam
20. joohw/clovapi
- Source: GitHub Coding Agent Search
- Link: https://github.com/joohw/clovapi
joohw/clovapi is an AI Radar public signal from GitHub Coding Agent Search with score 12.07.
Why it matters
Matched focus keywords: Agent, Claude Code, Codex.
Suggested action
Review the source and decide whether it deserves deeper follow-up: https://github.com/joohw/clovapi