OpenAI has announced a new benchmark called PaperBench, which evaluates whether AI can understand and reproduce cutting-edge research papers. PaperBench tests an AI agent's ability to reproduce 20 cutting-edge ...
At the world’s largest artificial-intelligence conference, known as NeurIPS, Weichen Huang almost blended in. Last month in Vancouver, he was one of numerous researchers explaining their ...