[1] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” Advances in Neural Information Processing Systems, vol. 30, 2017.
[2] OpenAI, “GPT-4 technical report,” arXiv preprint, 2023.
[3] H. Touvron, T. Lavril, G. Izacard, X. Martinet, M.-A. Lachaux, T. Lacroix, B. Rozière, N. Goyal, E. Hambro, F. Azhar et al., “LLaMA: Open and efficient foundation language models,” arXiv preprint arXiv:2302.13971, 2023.
[4] T. Kojima, S. S. Gu, M. Reid, Y. Matsuo, and Y. Iwasawa, “Large language models are zero-shot reasoners,” in Advances in Neural Information Processing Systems, S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, Eds., vol. 35. Curran Associates, Inc., 2022, pp. 22199–22213. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2022/file/8bb0d291acd4acf06ef112099c16f326-Paper-Conference.pdf
[5] X. L. Li, A. Kuncoro, J. Hoffmann, C. d. M. d’Autume, P. Blunsom, and A. Nematzadeh, “A systematic investigation of commonsense knowledge in large language models,” arXiv preprint arXiv:2111.00607, 2021.
[6] N. Dziri, X. Lu, M. Sclar, X. L. Li, L. Jiang, B. Y. Lin, S. Welleck, P. West, C. Bhagavatula, R. Le Bras et al., “Faith and fate: Limits of transformers on compositionality,” Advances in Neural Information Processing Systems, vol. 36, 2024.
[7] J. Kaddour, J. Harris, M. Mozes, H. Bradley, R. Raileanu, and R. McHardy, “Challenges and applications of large language models,” arXiv preprint arXiv:2307.10169, 2023.
[8] R. T. McCoy, S. Yao, D. Friedman, M. Hardy, and T. L. Griffiths, “Embers of autoregression: Understanding large language models through the problem they are trained to solve,” arXiv preprint arXiv:2309.13638, 2023.
[9] U. Anwar, A. Saparov, J. Rando, D. Paleka, M. Turpin, P. Hase, E. S. Lubana, E. Jenner, S. Casper, O. Sourbut et al., “Foundational challenges in assuring alignment and safety of large language models,” arXiv preprint arXiv:2404.09932, 2024.
[10] P. Ding, J. Fang, P. Li, K. Wang, X. Zhou, M. Yu, J. Li, M. R. Walter, and H. Mei, “MANGO: A benchmark for evaluating mapping and navigation abilities of large language models,” arXiv preprint arXiv:2403.19913, 2024.
[11] A. Singla, “Evaluating ChatGPT and GPT-4 for visual programming,” in Proceedings of the 2023 ACM Conference on International Computing Education Research - Volume 2, 2023, pp. 14–15.
[12] H. W. Chung, L. Hou, S. Longpre, B. Zoph, Y. Tay, W. Fedus, E. Li, X. Wang, M. Dehghani, S. Brahma et al., “Scaling instruction-finetuned language models,” arXiv preprint arXiv:2210.11416, 2022.
[13] J. Wei, Y. Tay, R. Bommasani, C. Raffel, B. Zoph, S. Borgeaud, D. Yogatama, M. Bosma, D. Zhou, D. Metzler et al., “Emergent abilities of large language models,” arXiv preprint arXiv:2206.07682, 2022.
[14] V. Nair, E. Schumacher, G. Tso, and A. Kannan, “DERA: Enhancing large language model completions with dialog-enabled resolving agents,” arXiv preprint arXiv:2303.17071, 2023.
[15] Z. Wang, S. Cai, G. Chen, A. Liu, X. Ma, and Y. Liang, “Describe, explain, plan and select: Interactive planning with large language models enables open-world multi-task agents,” arXiv preprint arXiv:2302.01560, 2023.
[16] J. Wei, X. Wang, D. Schuurmans, M. Bosma, F. Xia, E. Chi, Q. V. Le, D. Zhou et al., “Chain-of-thought prompting elicits reasoning in large language models,” Advances in Neural Information Processing Systems, vol. 35, pp. 24824–24837, 2022.
[17] S. Yao, J. Zhao, D. Yu, N. Du, I. Shafran, K. Narasimhan, and Y. Cao, “ReAct: Synergizing reasoning and acting in language models,” arXiv preprint arXiv:2210.03629, 2022.
[18] P.-L. Chen and C.-S. Chang, “InterAct: Exploring the potentials of ChatGPT as a cooperative agent,” arXiv preprint arXiv:2308.01552, 2023.
[19] A. Madaan, N. Tandon, P. Gupta, S. Hallinan, L. Gao, S. Wiegreffe, U. Alon, N. Dziri, S. Prabhumoye, Y. Yang et al., “Self-Refine: Iterative refinement with self-feedback,” Advances in Neural Information Processing Systems, vol. 36, 2024.
[20] S. Bubeck, V. Chandrasekaran, R. Eldan, J. Gehrke, E. Horvitz, E. Kamar, P. Lee, Y. T. Lee, Y. Li, S. Lundberg et al., “Sparks of artificial general intelligence: Early experiments with GPT-4,” arXiv preprint arXiv:2303.12712, 2023.
[21] W. Huang, F. Xia, T. Xiao, H. Chan, J. Liang, P. Florence, A. Zeng, J. Tompson, I. Mordatch, Y. Chebotar et al., “Inner monologue: Embodied reasoning through planning with language models,” arXiv preprint arXiv:2207.05608, 2022.
[22] Y. Mu, Q. Zhang, M. Hu, W. Wang, M. Ding, J. Jin, B. Wang, J. Dai, Y. Qiao, and P. Luo, “EmbodiedGPT: Vision-language pre-training via embodied chain of thought,” Advances in Neural Information Processing Systems, vol. 36, 2024.
[23] D. Driess, F. Xia, M. S. Sajjadi, C. Lynch, A. Chowdhery, B. Ichter, A. Wahid, J. Tompson, Q. Vuong, T. Yu et al., “PaLM-E: An embodied multimodal language model,” arXiv preprint arXiv:2303.03378, 2023.
[24] G. Wang, Y. Xie, Y. Jiang, A. Mandlekar, C. Xiao, Y. Zhu, L. Fan, and A. Anandkumar, “Voyager: An open-ended embodied agent with large language models,” arXiv preprint arXiv:2305.16291, 2023.
[25] C. Huang, O. Mees, A. Zeng, and W. Burgard, “Visual language maps for robot navigation,” in 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023, pp. 10608–10615.
[26] D. Shah, B. Osiński, S. Levine et al., “LM-Nav: Robotic navigation with large pre-trained models of language, vision, and action,” in Conference on Robot Learning. PMLR, 2023, pp. 492–504.
[27] D. Osmanković and S. Konjicija, “Implementation of Q-learning algorithm for solving maze problem,” in 2011 Proceedings of the 34th International Convention MIPRO. IEEE, 2011, pp. 1619–1622.
[28] M. Mitchell, A. B. Palmarini, and A. Moskvichev, “Comparing humans, GPT-4, and GPT-4V on abstraction and reasoning tasks,” arXiv preprint arXiv:2311.09247, 2023.
[29] X. Yue, Y. Ni, K. Zhang, T. Zheng, R. Liu, G. Zhang, S. Stevens, D. Jiang, W. Ren, Y. Sun et al., “MMMU: A massive multi-discipline multimodal understanding and reasoning benchmark for expert AGI,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 9556–9567.
[30] MAN1986, “Python-Maze-World-pyamaze,” 2021. [Online]. Available: https://github.com/man1986/pyamaze.git
[31] H. Chase, “LangChain,” Oct. 2022. [Online]. Available: https://github.com/langchain-ai/langchain
[32] D. Kahneman, Thinking, Fast and Slow, 2017.
[33] S. Kambhampati, K. Valmeekam, L. Guan, K. Stechly, M. Verma, S. Bhambri, L. Saldyt, and A. Murthy, “LLMs can’t plan, but can help planning in LLM-modulo frameworks,” arXiv preprint arXiv:2402.01817, 2024.
[34] A. Neelakantan, T. Xu, R. Puri, A. Radford, J. M. Han, J. Tworek, Q. Yuan, N. Tezak, J. W. Kim, C. Hallacy et al., “Text and code embeddings by contrastive pre-training,” arXiv preprint arXiv:2201.10005, 2022.
[35] J. Shlens, “A tutorial on principal component analysis,” arXiv preprint arXiv:1404.1100, 2014.
[36] L. van der Maaten and G. Hinton, “Visualizing data using t-SNE,” Journal of Machine Learning Research, vol. 9, no. 11, 2008.