BEGIN:VCALENDAR
VERSION:2.0
CALSCALE:GREGORIAN
PRODID:adamgibbons/ics
METHOD:PUBLISH
X-PUBLISHED-TTL:PT1H
BEGIN:VEVENT
UID:5-wHOXGdM7DJQ179pCtgj
SUMMARY:Do LLMs have fluid intelligence? Lessons from competing in ARC AGI 
	2
DTSTAMP:20260513T113355Z
DTSTART:20260522T092500Z
DESCRIPTION:Description:\nThe integration of a Python REPL (Read-Eval-Print
	 Loop) into AI models\, as a means of “agent-based” code execution\, is pr
	oving to be a breakthrough for solving complex reasoning tasks. New findin
	gs show that even simple access to a runtime environment drastically impro
	ves model performance on the ARC AGI 2 benchmark – without complex prompt 
	engineering. This approach unlocks untapped capabilities for dynamic probl
	em-solving and could redefine the development of general AI. In their talk
	\, Dibya Chakravorty\, Bernhard Altaner\, and Debsankha Manik demonstrate 
	concrete performance leaps: The open-source model GPT OSS 120B High improv
	es from 6.11% to 26.38% with REPL\, while Minimax M2.1 climbs from 3.06% t
	o 10.56%. Even top models like GPT 5.2 XHigh benefit enormously (+13.55%).
	 The speakers will show how this paradigm shift not only revolutionizes be
	nchmarks but also has practical implications for AI development.\n--------
	------------------------\n\nSpeakers:\n- Dr. Bernhard Altaner\n- Dr. Debsa
	nkha Manik\n- Dibya Chakravorty\n\n--------------------------------\n\nTal
	k details:\n- Link to the Big Techday website: https://bigtechday.com/en/t
	alks#aerGJ2c5AFUwlkgjzkvze\n
LOCATION:Kohlebunker I
DURATION:PT50M
END:VEVENT
END:VCALENDAR
