PARROT

A Practical And Realistic BenchmaRk for CrOss-System SQL Translation

About PARROT 🦜

PARROT (Practical And Realistic BenchmaRk for CrOss-System SQL Translation) was created to support the task of Cross-System SQL Translation (i.e., SQL-to-SQL translation), which involves adapting a query written for one database system into its functionally equivalent form for another.
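As a concrete illustration of what one translation pair looks like, the sketch below rewrites a single dialect-specific construct. It is a toy, hypothetical rule (the function name `translate_limit` and the sample query are assumptions, not part of PARROT): a trailing `LIMIT n`, as written in MySQL or PostgreSQL, becomes the `FETCH FIRST n ROWS ONLY` form accepted by Oracle 12c and later. Real cross-system translation needs a proper SQL parser and many more rules; a regular expression like this only covers the simplest case.

```python
import re

# Hypothetical toy rule, for illustration only: match a trailing `LIMIT n`
# (optionally followed by a semicolon) at the very end of the query.
_LIMIT_RE = re.compile(r"\s+LIMIT\s+(\d+)\s*;?\s*$", re.IGNORECASE)


def translate_limit(query: str) -> str:
    """Rewrite a trailing `LIMIT n` as `FETCH FIRST n ROWS ONLY`.

    Queries without a trailing LIMIT clause are returned unchanged.
    """
    match = _LIMIT_RE.search(query)
    if match is None:
        return query
    head = query[: match.start()]
    return f"{head} FETCH FIRST {match.group(1)} ROWS ONLY"


# Source query in MySQL/PostgreSQL style (made-up example).
source = "SELECT name FROM employees ORDER BY salary DESC LIMIT 5;"
print(translate_limit(source))
```

The two queries form one source/target pair: they differ in syntax but are meant to return the same rows on their respective systems.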

The main dataset comprises 598 translation pairs from 38 open-source benchmarks and real-world business services, specifically prepared to challenge system-specific SQL understanding.

News

  • May 15, 2025: We have released PARROT-1.0 (28,003 translation pairs from 38 open-source benchmarks for extensive syntax testing) and published the leaderboard.

Surprise from PARROT

We have evaluated a range of LLMs that differ in (1) usage license, (2) parameter scale, and (3) task scope. These LLMs attain an average accuracy below 38.53%, underscoring the substantial challenges inherent to SQL-to-SQL translation and the pressing need for more advanced techniques.
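The 38.53% figure can be checked against the leaderboard values listed further down this page. The sketch below copies those per-model accuracies and averages them; whether the authors computed the figure exactly this way is an assumption, but the mean of the eight AccEX entries does come out to 38.53, and the mean of the nine AccRES entries is lower still.

```python
from statistics import mean

# Accuracies (%) copied from the two leaderboards on this page.
acc_ex = {
    "GPT-4o": 53.32,
    "DeepSeek-V3 671B": 50.64,
    "Claude 3.7 Sonnet": 48.09,
    "DeepSeek-R1 671B": 44.42,
    "DeepSeek-R1 32B": 41.98,
    "o3-mini": 27.94,
    "DeepSeek-Coder-V2 Lite": 24.84,
    "DeepSeek-R1 7B": 17.03,
}
acc_res = {
    "o3-mini": 54.23,
    "o1-preview": 48.69,
    "DeepSeek-R1 671B": 40.52,
    "DeepSeek-V3 671B": 32.65,
    "Doubao 1.5 Pro Thinking": 25.70,
    "Claude 3.7 Sonnet": 22.74,
    "GPT-4o": 21.87,
    "DeepSeek-R1 32B": 16.91,
    "Doubao 1.5 Pro": 14.29,
}

# Unweighted mean over the listed models (human performance excluded).
avg_ex = mean(list(acc_ex.values()))
avg_res = mean(list(acc_res.values()))

print(f"AccEX mean:  {avg_ex:.2f}%")
print(f"AccRES mean: {avg_res:.2f}%")
```

Both averages sit well below the "> 90.00" reported for the translation tool plus human DBAs.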

Citation

@inproceedings{zhou2025parrot,
  author       = {Wei Zhou and
                  Guoliang Li and
                  Haoyu Wang and
                  Yuxing Han and
                  Xufei Wu and
                  Fan Wu and
                  Xuanhe Zhou},
  title        = {PARROT: A Benchmark for Evaluating LLMs in Cross-System SQL Translation},
  booktitle    = {NeurIPS},
  year         = {2025}
}
@article{zhou2025cracksql,
  author       = {Wei Zhou and
                  Yuyang Gao and
                  Xuanhe Zhou and
                  Guoliang Li},
  title        = {{Cracking SQL Barriers:} {An} LLM-based Dialect Translation System},
  journal      = {Proc. {ACM} Manag. Data},
  volume       = {3},
  number       = {3 (SIGMOD)},
  year         = {2025}
}
@article{zhou2025cracksqldemo,
  author       = {Wei Zhou and
                  Yuyang Gao and
                  Xuanhe Zhou and
                  Guoliang Li},
  title        = {CrackSQL: A Hybrid SQL Dialect Translation System Powered by Large Language Models},
  journal      = {arXiv Preprint},
  url       = {https://arxiv.org/abs/2504.00882},
  year         = {2025}
}

PARROT

We have publicly released PARROT along with detailed usage instructions. For more details, please visit the GitHub repository. To update the leaderboard, ensure that your paper or resource is publicly accessible and submit a pull request.

Leaderboard - Dialect Compatibility (AccEX)

Model                     Organization                     Size     Accuracy (%)
Human Performance         Translation Tool + Human DBAs    -        > 90.00
GPT-4o                    OpenAI                           UNK      53.32
DeepSeek-V3 671B          DeepSeek                         671B     50.64
Claude 3.7 Sonnet         Anthropic                        UNK      48.09
DeepSeek-R1 671B          DeepSeek                         671B     44.42
DeepSeek-R1 32B           DeepSeek                         32B      41.98
o3-mini                   OpenAI                           UNK      27.94
DeepSeek-Coder-V2 Lite    DeepSeek                         15.7B    24.84
DeepSeek-R1 7B            DeepSeek                         7B       17.03

Leaderboard - Result Consistency (AccRES)

Model                      Organization                     Size    Accuracy (%)
Human Performance          Translation Tool + Human DBAs    -       > 90.00
o3-mini                    OpenAI                           UNK     54.23
o1-preview                 OpenAI                           UNK     48.69
DeepSeek-R1 671B           DeepSeek                         671B    40.52
DeepSeek-V3 671B           DeepSeek                         671B    32.65
Doubao 1.5 Pro Thinking    Doubao                           UNK     25.70
Claude 3.7 Sonnet          Anthropic                        UNK     22.74
GPT-4o                     OpenAI                           UNK     21.87
DeepSeek-R1 32B            DeepSeek                         32B     16.91
Doubao 1.5 Pro             Doubao                           UNK     14.29