Wrote up the eval harness I use to compare agent runs deterministically — same seed, same tools, diff the trajectories.
def replay(seed, tools):
    # Env and run are the harness's own environment and agent loop (not shown)
    env = Env(seed=seed, tools=tools)
    return list(run(env))  # the full trajectory, ready to diff against another run
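The post does not show the diff step itself, so here is a minimal sketch of what "diff the trajectories" could look like. Everything beyond `replay` is an assumption for illustration: the `Env` dataclass and `run` generator are toy stand-ins for the real harness, and `first_divergence` is a hypothetical helper name, not part of the original code.

```python
import random
from dataclasses import dataclass
from itertools import zip_longest

_MISSING = object()  # sentinel for "one trajectory ended early"


@dataclass
class Env:
    """Hypothetical stand-in for the harness environment."""
    seed: int
    tools: tuple


def run(env):
    """Toy agent loop: yields (tool, argument) steps driven only by the seed."""
    rng = random.Random(env.seed)  # private RNG, so global random state cannot leak in
    for _ in range(5):
        yield (rng.choice(env.tools), rng.randint(0, 9))


def replay(seed, tools):
    env = Env(seed=seed, tools=tuple(tools))
    return list(run(env))


def first_divergence(a, b):
    """Return the index of the first differing step, or None if the runs match."""
    for i, (x, y) in enumerate(zip_longest(a, b, fillvalue=_MISSING)):
        if x != y:
            return i
    return None


if __name__ == "__main__":
    tools = ["search", "calc"]
    baseline = replay(7, tools)
    candidate = replay(7, tools)
    # Same seed and same tools should give identical trajectories.
    print(first_divergence(baseline, candidate))
```

Reporting only the first differing index keeps the comparison cheap, and it is usually the step worth inspecting, since every later step may differ simply because the runs have already split. A length mismatch counts as a divergence at the point where the shorter run ends.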