Best practices for labelling/benchmarking RAG chatbot systems?

Irina_Malkin_Ondik · August 5, 2025, 11:37am

Hi.
Are there any best practices for developing, and especially for labelling/benchmarking, RAG chatbot systems? We have found that having humans label every new version/architecture of the chatbot is quite demanding and time-consuming in a company environment, but automation has not really worked for us.