background

ChatGPT for SQL: How to Generate better SQL Queries with OpenAI

ChatGPT for SQL: How to Generate better SQL Queries with OpenAI
Sheldon Niu
Sheldon Niu

Introduction

You can leverage ChatGPT for various tasks like:

  1. Gaining Business Insights.
  2. Designing and Migrating Table Schemas
  3. Conducting Data Analysis
  4. Optimizing and Correcting SQL Queries

Although ChatGPT can write SQL, but there are some important best practices that can help ChatGPT write SQL with much higher efficiency and accuracy.

In this article, we'll delve into how to unleash the full potential of ChatGPT in SQL-related tasks.

Context Preparation

ChatGPT needs to understand some context to generate valid SQL queries. Typically, for SQL generation, you need three types of data:

  1. Table schemas related to your question.
  2. Background documentation about tables and columns.
  3. Your question.

Table schemas are the most important information; they contain what columns your database has and how tables are related through foreign keys.

However, table schemas are not always enough for AI to generate accurate results, because there is usually some implicit information that is hard to infer from the schema itself, and only the developers know it.

For example, if you have a column named "status" typed as VARCHAR, and it has three values: "Active", "Inactive", "Pending", it's impossible for ChatGPT to know the exact enumerations and their meanings of this column without being given this extra information.

Table Schemas Related to Your Question

First, you need to obtain the string representation of your table structures.

Taking MySQL as an example, we use the "DESCRIBE" command to retrieve the table schema. By executing this command, you can get a detailed view of the table's structure:

Get table schema by running DESCRIBE query
Get table schema by running DESCRIBE query

It becomes cumbersome when your query involves several tables; you have to manually execute repetitive commands and concatenate them. This process becomes even more tedious if your schema changes, forcing you to redo this.

However, if you are using tools like AskYourDatabase (the ChatGPT for SQL), you don't have to go through these steps at all. It automatically fetches all related table schemas for you, including table relations, comments, data types, and then feeds this information into ChatGPT, just like this:

AskYourDatabase (ChatGPT for SQL) will automatically understands your database
AskYourDatabase (ChatGPT for SQL) will automatically understands your database

It saves you a lot of time, and will save you more when you have tons of tables.

AskYourDatabase automatically fetches related schemas, syncing with the latest version against your database, and let ChatGPT understand your database automatically.

It's worth noting that the clearer and more self-explanatory your schemas are, the more likely you are to obtain better SQL results.

So, how can you make your schema more AI-friendly? Here are some tips:

  1. Use semantic naming for tables and columns so that ChatGPT can better understand your schema. For example, user is better than u, sales_amount is better than sa.
  2. Use foreign constraints to link tables, so that ChatGPT can understand the relationships between them.
  3. Add descriptive comments to schema when it's not self-explanatory or has implicit conventions.

For more information about how to get better generated SQL, kindly refer to these two articles:

Background documentation about tables and columns

Why this matters?

Imagine ChatGPT is a new intern who just came to your office yesterday, although he/she is very smart and capable of writing complex SQL queries, it's still hard for he/she to deeply know the business meaning behind the table schema.

Usually only the backend or BI engineer who knows the business meaning behind the schema so well, so giving these background documentations to ChatGPT will help it better understand your database's business logic.

What should the background documentations include? Here's our advice:

  1. The business meaning of each table.
  2. SQL examples. These examples are more important when you wanna ask some quick questions like "What's the revenue of yesterday", if you have given the right SQL of "What's the revenue of the last month", then ChatGPT will quickly come up with the write SQL instead of thinking from scratch.

If you are using AskYourDatabase, adding these docs is really easy and straightforward, just go to the training page, and add the docs you need.

First, click the 'Train' button:

AskYourDatabase - Train to get better responses
AskYourDatabase - Train to get better responses

Then add all the docs and SQL examples:

ChatGPT for SQL - Add new database documentation in AskYourDatabase
ChatGPT for SQL - Add new database documentation in AskYourDatabase

Adding SQL Examples in AskYourDatabase
Adding SQL Examples in AskYourDatabase

After adding all of this information, you are ready to go without any extra setup, and ChatGPT will know all of the information you specified and hopefully give you better results.

Your question

The thing is, the more information you provide in the documentation, the shorter and more ambiguous your question can be.

If you have a well-defined SQL example of "What are the best five customer acquisition channels last month?"

You can directly ask another question about "What are the top 4 customer acquisition channels last week?" without having to specify any columns and tables.

But if you have no docs, you have to manually specify all the columns, like "Give me the top five customer channels (the source is stored in the source column) sorted by the signup users divided by the visitors, and the signup users are in the users table, and the visitors are in the log table....)"

So you see, if you have some often asked questions that are complex but have some fixed patterns, you'd better write a SQL example. It's like a "Quick manual" for ChatGPT.

SQL Generation vs Chat with databases

Actually, there are two main ways of interacting with databases:

  1. Generate SQL, copy the SQL, execute the SQL, view the response in a database client.
  2. Chat with databases in ChatGPT - ChatGPT will fetch the data, explain the data for you, and even do further data analysis & visualization.

It's obvious that chatting with databases is a more preferable way to interact with databases because you don't need to do a lot of trivial things like copy & paste, and ChatGPT serves as an AI data analyst who can connect to the database and perform any possible data analysis.

If you are using raw ChatGPT or other tools like text2sql.io, you are just doing SQL generation. To chat with databases, it's highly recommended to use tools like AskYourDatabase, as they let you get any data insights without leaving your chat thread:

Instead of doing SQL generation, you can chat with databases using AI.
Instead of doing SQL generation, you can chat with databases using AI.

Data analysis & visualization

SQL work always involves further data tasks, including analysis and visualization.

Instead of toggling between several tools, you can directly do all of these things in AskYourDatabase.

Below is a visualization example:

AskYourDatabase Desktop - visualization by asking a question
AskYourDatabase Desktop - visualization by asking a question

You can even let ChatGPT to do data analysis tasks like:

Question: Analyze the order details to identify potential problems.

Answer:

Get business insights from your db
Get business insights from your db

Conclusions

With enough context, ChatGPT can indeed generate relatively accurate SQL.

Using tools like AskYourDatabase, which is highly optimized for querying SQL databases, will enable you to chat with your databases, saving you a significant amount of time.

More importantly, AskYourDatabase is a desktop app that keeps your database credentials safe and even supports SSH tunneling connections.

If you are looking for effective ways to generate SQL using AI, then tools like AskYourDatabase might be a suitable solution for you.

You can also learn more about SQL AI articles here:

How does AskYourDatabase protects your data

Comparison of the Top 4 SQL AI Tools in 2024

Chat with Supabase PostgreSQL using AI

The Best GPT for Data Analysis


More Blogs

Query SQL database in natural language

Query SQL database in natural language

The best approach depends on your team and whether you can invest time in building a SQL AI Agent. For technical teams who want more customized solutions, tools like Langchain may be the best choice. For non-tech teams, choosing tools like AskYourDatabase will likely be the lowest-cost option.

Sheldon Niu
Sheldon Niu
How AskYourDatabase has saved 40 hours per week for SmartLead.ai

How AskYourDatabase has saved 40 hours per week for SmartLead.ai

AskYourDatabase has helped the famous cold email SaaS company SmartLead.ai save 40 hours each week between the CS team and the engineering team. More importantly, the tool has taught the CS team the skills crucial for their growth and success.

Sheldon Niu
Sheldon Niu
;