Documentation
Get better answer

Get better answer

Although we have tried our best to enable ChatGPT to understand your database schema and generate SQL queries, there are still cases where ChatGPT may generate incorrect SQL queries or behave poorly.

After reading this document, you will know how to maximize the chances of generating correct SQL queries.

Tip 1: Make Schema Human-Understandable

The better your schema is named and easier to understand, the better ChatGPT can comprehend your schema and generate correct SQL queries.

If you are starting a new database or creating a new table, follow these tips:

  1. Use semantic naming for tables and columns so that ChatGPT can better understand your schema. For example, user is better than u, sales_amount is better than sa.
  2. Use foreign constraints to link tables, so that ChatGPT can understand the relationships between them.

Tip 2: Add Descriptive Comments

Add descriptive comments to schema when it's not self-explanatory or has implicit conventions, like:

  1. Some columns have enum values, each with a specific meaning. For example:

You have a table storing all "orders", and each order has a status, which can be "pending", "paid", "shipped", "delivered", or "cancelled".

If your order data type is a string, ChatGPT will not know your status will have such values, and when you ask "How many orders are not paid yet"?

Although you mean "How many orders have status 'pending'", ChatGPT may generate a query like this:

SELECT COUNT(*) FROM orders WHERE status == 'not paid'

Which is obviously incorrect.

  1. Some columns have implicit meanings or abbreviations that are hard to understand, like "tz_location", "tblRep", "WS_MAPPING". It's hard to know what they mean without context.

For all these cases where your schema itself does not convey enough information, we recommend you add comments to your schema to help ChatGPT understand your schema better.

Adding a comment is easy. For example, to add a comment to the "status" column, just say:

Add comment to orders->status column: "pending", "paid", "shipped", "delivered", "cancelled"

Add comment

If "WS_MAPPING" means "Workstation Mapping", just say:

Add comment to WS_MAPPING column: "WS_MAPPING" means "Workstation Mapping"

The principle is:

If there's something implicit or hard to understand in your schema but needed for generating better SQL, just add a comment to explain it in a descriptive way with enough context.

If you do not know your schema well, you could let your DBA or someone who knows your schema well add comments to your schema.

Tip 3: Debug SQL and Give Feedback

If you keep getting wrong answers from some SQL queries, you can look into the generated SQL to see what's wrong (If you lucikly know SQL):

Add comment

If you luckily know what's wrong with the SQL, it's better to give feedback to ChatGPT, so that ChatGPT can learn from the mistake and generate better SQL next time.

Tip 4: Ask Good Questions

Don't be vague and ambiguous; be specific and clear.

If you could say "How many distinct locations are there in the customer table?", don't just say "How many locations are there".

If you know the table name and column name, it's better to mention them, as that helps a lot in generating correct SQL.

Tip 5: Prevent too long Questions

ChatGPT has a limited context window, so submitting lengthy SQL queries or extensive articles can lead to a context limit error.

Train your own AI for better SQL generation

Add comments is often not enough to make ChatGPT understand your schema better, please refer to the training guide to train your own AI model for better SQL generation.