Alibaba's HDPO framework trains AI agents to skip unnecessary tool calls, cutting redundant invocations from 98% to 2% while ...
Multiple theses, coding marathons, joining research labs — this is life inside China's top AI training ground.