AI Radar Public Daily - 2026-06-01

1. Comparative Analysis of Large Language Model Inference Serving Systems: A Performance Study of vLLM and HuggingFace TGI

Source: arXiv AI Infra
Link: http://arxiv.org/abs/2511.17593v1

AI Radar public signal from arXiv AI Infra, score 15.52.

Why it matters

Matched focus keywords: GPU, PagedAttention, TGI, memory, vLLM.

Suggested action

Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2511.17593v1

2. Blink: CPU-Free LLM Inference by Delegating the Serving Stack to GPU and SmartNIC

Source: arXiv AI Infra
Link: http://arxiv.org/abs/2604.07609v1

AI Radar public signal from arXiv AI Infra, score 15.52.

Why it matters

Matched focus keywords: GPU, NPU, RDMA, SGLang, TensorRT-LLM, memory, vLLM.

Suggested action

Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2604.07609v1

3. TokenWeave: Efficient Compute-Communication Overlap for Distributed LLM Inference

Source: arXiv AI Infra
Link: http://arxiv.org/abs/2505.11329v5

AI Radar public signal from arXiv AI Infra, score 15.52.

Why it matters

Matched focus keywords: GPU, NVLink, SGLang, TensorRT-LLM, vLLM.

Suggested action

Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2505.11329v5

4. freeforall06/awesome-claude-code

Source: GitHub Coding Agent Search
Link: https://github.com/freeforall06/awesome-claude-code

AI Radar public signal from GitHub Coding Agent Search, score 14.37.

Why it matters

Matched focus keywords: Claude Code, Codex, agentic coding.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/freeforall06/awesome-claude-code

5. Pendragonaffectation426/maestro

Source: GitHub Kernel Search
Link: https://github.com/Pendragonaffectation426/maestro

AI Radar public signal from GitHub Kernel Search, score 14.37.

Why it matters

Matched focus keywords: Claude Code, Codex, Cursor.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/Pendragonaffectation426/maestro

6. Ada-MK: Adaptive MegaKernel Optimization via Automated DAG-based Search for LLM Inference

Source: arXiv AI Infra
Link: http://arxiv.org/abs/2605.11581v1

AI Radar public signal from arXiv AI Infra, score 13.50.

Why it matters

Matched focus keywords: GPU, HBM, MLIR, TensorRT-LLM, memory, vLLM.

Suggested action

Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2605.11581v1

7. QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

Source: arXiv AI Infra
Link: http://arxiv.org/abs/2405.04532v3

AI Radar public signal from arXiv AI Infra, score 13.22.

Why it matters

Matched focus keywords: DiT, GPU, KV cache, TensorRT-LLM, memory, quantization.

Suggested action

Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2405.04532v3

8. FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving

Source: arXiv AI Infra
Link: http://arxiv.org/abs/2501.01005v2

AI Radar public signal from arXiv AI Infra, score 13.22.

Why it matters

Matched focus keywords: DiT, FlashInfer, GPU, SGLang, memory, vLLM.

Suggested action

Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2501.01005v2

9. SCOOT: SLO-Oriented Performance Tuning for LLM Inference Engines

Source: arXiv AI Infra
Link: http://arxiv.org/abs/2408.04323v2

AI Radar public signal from arXiv AI Infra, score 13.22.

Why it matters

Matched focus keywords: TensorRT-LLM, vLLM.

Suggested action

Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2408.04323v2

10. SGLang: Efficient Execution of Structured Language Model Programs

Source: arXiv AI Infra
Link: http://arxiv.org/abs/2312.07104v2

AI Radar public signal from arXiv AI Infra, score 13.22.

Why it matters

Matched focus keywords: KV cache, NPU, SGLang.

Suggested action

Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2312.07104v2

11. COMET: Towards Partical W4A4KV4 LLMs Serving

Source: arXiv AI Infra
Link: http://arxiv.org/abs/2410.12168v1

AI Radar public signal from arXiv AI Infra, score 13.22.

Why it matters

Matched focus keywords: DiT, GPU, KV cache, TensorRT-LLM, memory, quantization.

Suggested action

Review the source and decide whether it deserves deeper follow-up: http://arxiv.org/abs/2410.12168v1

12. CadPosting/third-brain

Source: GitHub Coding Agent Search
Link: https://github.com/CadPosting/third-brain

AI Radar public signal from GitHub Coding Agent Search, score 12.50.

Why it matters

Matched focus keywords: Agent, Claude Code, DPO, MCP, memory.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/CadPosting/third-brain

13. proto-dredge424/modus-memory

Source: GitHub Kernel Search
Link: https://github.com/proto-dredge424/modus-memory

AI Radar public signal from GitHub Kernel Search, score 12.50.

Why it matters

Matched focus keywords: Agent, MCP, memory.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/proto-dredge424/modus-memory

14. ronie-aduana/mcp-ai-memory

Source: GitHub Agents MCP Search
Link: https://github.com/ronie-aduana/mcp-ai-memory

AI Radar public signal from GitHub Agents MCP Search, score 12.30.

Why it matters

Matched focus keywords: Agent, MCP, memory.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/ronie-aduana/mcp-ai-memory

15. UdayKumarVeera/multi-agent-mcp-system

Source: GitHub Agents MCP Search
Link: https://github.com/UdayKumarVeera/multi-agent-mcp-system

AI Radar public signal from GitHub Agents MCP Search, score 12.30.

Why it matters

Matched focus keywords: Agent, LangGraph, MCP.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/UdayKumarVeera/multi-agent-mcp-system

16. Katta-Nitish/mcp-agentic-rag

Source: GitHub Agents MCP Search
Link: https://github.com/Katta-Nitish/mcp-agentic-rag

AI Radar public signal from GitHub Agents MCP Search, score 12.30.

Why it matters

Matched focus keywords: Agent, LangGraph, MCP.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/Katta-Nitish/mcp-agentic-rag

17. deepanshumody/discovery-agents

Source: GitHub Eval Bench Search
Link: https://github.com/deepanshumody/discovery-agents

AI Radar public signal from GitHub Eval Bench Search, score 12.30.

Why it matters

Matched focus keywords: Agent, LangGraph, MCP.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/deepanshumody/discovery-agents

18. HGVAbyte/rlhf-data-agent-full

Source: GitHub Synthetic Data Search
Link: https://github.com/HGVAbyte/rlhf-data-agent-full

AI Radar public signal from GitHub Synthetic Data Search, score 12.20.

Why it matters

Matched focus keywords: Agent, DPO, RLHF, synthetic data.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/HGVAbyte/rlhf-data-agent-full

19. korbinjoe/openteam

Source: GitHub Coding Agent Search
Link: https://github.com/korbinjoe/openteam

AI Radar public signal from GitHub Coding Agent Search, score 12.07.

Why it matters

Matched focus keywords: Claude Code, Codex.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/korbinjoe/openteam

20. joohw/clovapi

Source: GitHub Coding Agent Search
Link: https://github.com/joohw/clovapi

AI Radar public signal from GitHub Coding Agent Search, score 12.07.

Why it matters

Matched focus keywords: Agent, Claude Code, Codex.

Suggested action

Review the source and decide whether it deserves deeper follow-up: https://github.com/joohw/clovapi