vLLM v0.20.0 enhances memory and MoE serving efficiency with TurboQuant 2-bit KV cache.
vLLM 0.20 发布,提升内存和 MoE 服务效率。
OpenAI expands cloud availability beyond Azure, allowing distribution via Google TPU and AWS.
OpenAI放宽Azure独占协议,允许其产品在所有云平台上发布。
DeepSeek released V4 Pro and V4 Flash, featuring a 1M-token context and hybrid reasoning modes.
DeepSeek发布了DeepSeek-V4 Pro和DeepSeek-V4 Flash,首次引入双层架构,支持1M-token上下文,采用MIT许可证。
OpenAI launched GPT-5.5, emphasizing improved token efficiency and new features for coding and knowledge work.
OpenAI发布GPT-5.5,作为其新旗舰模型,专注于实际工作和代理功能。
Anthropic推出Claude Design,旨在挑战Figma等设计工具。
Anthropic推出Claude Opus 4.7,提升了编码和代理性能。