MySQL GROUP BY Clause

Introduction

In this chapter, we will learn about the GROUP BY clause in MySQL. The GROUP BY clause is used to arrange identical data into groups with the help of aggregate functions such as COUNT, SUM, AVG, MAX, and MIN. This is particularly useful for summarizing data, generating reports, and performing data analysis. We will cover the syntax, examples, and important considerations for using the GROUP BY clause.

Syntax

The basic syntax for the GROUP BY clause is:

SELECT column1, aggregate_function(column2)
FROM table_name
WHERE condition
GROUP BY column1;
  • column1: The column by which you want to group the results.
  • aggregate_function(column2): The aggregate function to apply to the grouped data.
  • table_name: The name of the table from which to select data.
  • condition: The condition to filter the rows (optional).

Aggregate Functions

  • COUNT(column): Counts the number of rows in each group.
  • SUM(column): Calculates the sum of the values in each group.
  • AVG(column): Calculates the average of the values in each group.
  • MAX(column): Finds the maximum value in each group.
  • MIN(column): Finds the minimum value in each group.

Using GROUP BY

Example with COUNT

SELECT last_name, COUNT(*) AS count
FROM students
GROUP BY last_name;

This example counts the number of students with each last name.

Example with SUM

SELECT last_name, SUM(score) AS total_score
FROM students
GROUP BY last_name;

This example calculates the total score for students with each last name.

Example with AVG

SELECT last_name, AVG(score) AS average_score
FROM students
GROUP BY last_name;

This example calculates the average score for students with each last name.

Example with MAX

SELECT last_name, MAX(score) AS highest_score
FROM students
GROUP BY last_name;

This example finds the highest score for students with each last name.

Example with MIN

SELECT last_name, MIN(score) AS lowest_score
FROM students
GROUP BY last_name;

This example finds the lowest score for students with each last name.

Combining GROUP BY with HAVING

The HAVING clause is used to filter groups based on a condition. It is similar to the WHERE clause but applies to groups rather than individual rows.

Example

SELECT last_name, COUNT(*) AS count
FROM students
GROUP BY last_name
HAVING COUNT(*) > 1;

This example selects last names that appear more than once in the students table.

Full Example

Let's go through a full example where we create a table, insert data into it, and use the GROUP BY clause to group and summarize the data.

  1. Create a Database:
CREATE DATABASE school;
  1. Select the Database:
USE school;
  1. Create a Table:
CREATE TABLE students (
    id INT PRIMARY KEY AUTO_INCREMENT,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE,
    enrollment_date DATE,
    score INT
);
  1. Insert Data into the Table:
INSERT INTO students (first_name, last_name, email, enrollment_date, score) VALUES
('Rahul', 'Sharma', 'rahul.sharma@example.com', '2023-07-01', 85),
('Priya', 'Singh', 'priya.singh@example.com', '2023-07-02', 90),
('Amit', 'Kumar', 'amit.kumar@example.com', '2023-07-03', 75),
('Neha', 'Verma', 'neha.verma@example.com', '2023-07-04', 95),
('Sahil', 'Mehta', 'sahil.mehta@example.com', '2023-07-05', 80),
('Rahul', 'Sharma', 'rahul.sharma2@example.com', '2023-07-06', 88);
  1. Use the GROUP BY Clause with COUNT:
SELECT last_name, COUNT(*) AS count
FROM students
GROUP BY last_name;

Output

last_name count
Sharma 2
Singh 1
Kumar 1
Verma 1
Mehta 1
  1. Use the GROUP BY Clause with SUM:
SELECT last_name, SUM(score) AS total_score
FROM students
GROUP BY last_name;

Output

last_name total_score
Sharma 173
Singh 90
Kumar 75
Verma 95
Mehta 80
  1. Use the GROUP BY Clause with AVG:
SELECT last_name, AVG(score) AS average_score
FROM students
GROUP BY last_name;

Output

last_name average_score
Sharma 86.5
Singh 90
Kumar 75
Verma 95
Mehta 80
  1. Use the GROUP BY Clause with MAX:
SELECT last_name, MAX(score) AS highest_score
FROM students
GROUP BY last_name;

Output

last_name highest_score
Sharma 88
Singh 90
Kumar 75
Verma 95
Mehta 80
  1. Use the GROUP BY Clause with MIN:
SELECT last_name, MIN(score) AS lowest_score
FROM students
GROUP BY last_name;

Output

last_name lowest_score
Sharma 85
Singh 90
Kumar 75
Verma 95
Mehta 80
  1. Combine GROUP BY with HAVING:
SELECT last_name, COUNT(*) AS count
FROM students
GROUP BY last_name
HAVING COUNT(*) > 1;

Output

last_name count
Sharma 2

Important Considerations

  • Performance: Using GROUP BY can impact performance, especially on large datasets. Ensure appropriate indexing to optimize query performance.
  • NULL Values: GROUP BY treats NULL values as equivalent. If a column contains NULL values, all NULL values are grouped together.
  • Combining with Aggregate Functions: Always use aggregate functions with GROUP BY to summarize the grouped data.

Conclusion

The GROUP BY clause is used for grouping and summarizing data in MySQL queries. This chapter covered how to use the GROUP BY clause with various aggregate functions, combining it with other clauses like HAVING.

Comments