1 comment

  • dashersw 2 hours ago
    This paper introduces BRAID (Bounded Reasoning for Autonomous Inference and Decisions), a structured prompting framework that replaces free-form chain-of-thought with bounded, symbolic reasoning encoded as Mermaid flowcharts.
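
    To make "reasoning encoded as a Mermaid flowchart" concrete, here is a rough Python sketch of how such a prompt could be assembled. The flowchart, the `build_prompt` helper, and the sample question are all hypothetical illustrations and are not taken from the paper; the actual BRAID prompts may look quite different.

```python
# Hypothetical illustration: a fixed Mermaid flowchart stands in for
# free-form chain-of-thought. This is NOT the paper's actual prompt format.
REASONING_GRAPH = """\
flowchart TD
    A[Read the problem] --> B[Extract the given quantities]
    B --> C{Are all units consistent?}
    C -- No --> D[Convert units]
    D --> E[Compute the result]
    C -- Yes --> E
    E --> F[State the final answer]
"""


def build_prompt(question: str, graph: str = REASONING_GRAPH) -> str:
    """Wrap a question with instructions to follow the flowchart node by node."""
    return (
        "Solve the problem by following this Mermaid flowchart node by node. "
        "Do not add steps that are not in the flowchart.\n\n"
        f"{graph}\n"
        f"Problem: {question}\n"
    )


if __name__ == "__main__":
    print(build_prompt("A train travels 90 km in 1.5 hours. "
                       "What is its average speed in m/s?"))
```

    The idea in this sketch is that the set of allowed reasoning steps is fixed up front by the graph, so the model's output is bounded by the flowchart rather than left open-ended.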

    We evaluate BRAID across GSM-Hard, SCALE MultiChallenge, and AdvancedIF.

    Key findings:

    - Structured symbolic reasoning improves accuracy on complex tasks

    - Smaller models using BRAID often match or outperform larger models that use classic prompting

    - Significant cost-efficiency gains (up to 74× higher performance-per-dollar)

    - Even SOTA models see accuracy gains when pure performance is the goal
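
    For readers wondering how a performance-per-dollar multiple can be computed, here is a minimal sketch. It assumes the metric is simply accuracy divided by total inference cost, which may not match the paper's exact definition, and the numbers in the usage example are made-up placeholders, not results from the benchmark.

```python
def performance_per_dollar(accuracy: float, cost_usd: float) -> float:
    """Accuracy per dollar spent (assumed definition, may differ from the paper)."""
    if cost_usd <= 0:
        raise ValueError("cost_usd must be positive")
    return accuracy / cost_usd


def ppd_multiple(candidate: tuple[float, float], baseline: tuple[float, float]) -> float:
    """How many times better the candidate's accuracy-per-dollar is than the baseline's.

    Each argument is an (accuracy, cost_usd) pair.
    """
    return performance_per_dollar(*candidate) / performance_per_dollar(*baseline)


if __name__ == "__main__":
    # Placeholder numbers for illustration only, NOT benchmark results:
    # a small model with structured prompting vs. a large model with classic prompting.
    small_structured = (0.90, 0.50)   # (accuracy, total cost in USD)
    large_classic = (0.90, 10.00)
    multiple = ppd_multiple(small_structured, large_classic)
    print(f"{multiple:.1f}x performance-per-dollar")
```

    With equal accuracy, the multiple reduces to the cost ratio, so a large multiple can come from lower cost, higher accuracy, or both.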

    All benchmarks and detailed logs are public: https://benchmark.openserv.ai

    Happy to discuss methodology, evaluation choices, limitations, or failure cases.