In a new paper, Anthropic researchers tested the “faithfulness” of chain-of-thought (CoT) models’ reasoning by slipping them a cheat sheet and waiting to see whether they acknowledged the hint. The researchers wanted to see ...