BEGIN:VCALENDAR
VERSION:2.0
CALSCALE:GREGORIAN
PRODID:adamgibbons/ics
METHOD:PUBLISH
X-PUBLISHED-TTL:PT1H
BEGIN:VEVENT
UID:-ivpFOcSvBSJXU3pJSW_r
SUMMARY:Do LLMs have fluid intelligence? Lessons from competing in ARC AGI 
	2
DTSTAMP:20260430T144231Z
DTSTART:20260522T092500Z
DESCRIPTION:Description:\nThe integration of a Python REPL (Read-Eval-Print
	 Loop) into AI models\, as a means of “agent-based” code execution\, is pr
	oving to be a breakthrough for solving complex reasoning tasks. New findin
	gs show that even simple access to a runtime environment drastically impro
	ves model performance on the ARC AGI 2 benchmark – without complex prompt 
	engineering. This approach unlocks untapped capabilities for dynamic probl
	em-solving and could redefine the development of general AI.\n\nIn their
	 talk\, Dibya Chakravorty\, Bernhard Altaner\, and Debsankha Manik
	 demonstrate c
	oncrete performance leaps: The open-source model GPT OSS 120B High improve
	s from 6.11% to 26.38% with REPL\, while Minimax M2.1 climbs from 3.06% to
	 10.56%. Even top models like GPT 5.2 XHigh benefit enormously (+13.55%). 
	The speakers will show how this paradigm shift not only revolutionizes ben
	chmarks but also has practical implications for AI development.\n---------
	-----------------------\n\nSpeakers:\n- Bernhard Altaner\n- Debsankha
	 Manik\n- Dibya Chakravorty\n\n--------------------------------\n\n
	Talk details:
	\n- Link to the Big Techday website: https://bigtechday.com/en/talks#aerGJ
	2c5AFUwlkgjzkvze\n
LOCATION:Kohlebunker I
DURATION:PT50M
END:VEVENT
END:VCALENDAR
