Pam Follow 블로그 주인장

N+1 문제 해결

Summary

N+1 발생
- 회원 조회
- 앨범 조회
성능 측정(개선 전)
개선 방법
- Fetch Join
  - Fetch Join과 페이지네이션
- BatchSize()
개선 결과
성능 측정(개선 후)
추가 개선점

N+1 발생

회원 조회

{
  "uuid": "d48ec5cd-95aa-49f9-80ec-c74c66dfd7f2",
  "name": "팜",
  "description": "",
  "following": 2,
  "follower": 0,
  "role": "ROLE_USER",
  "email": "yeonnu82g@gmail.com",
  "artistImage": "https://..."
}

현재 회원 조회 시 반환되는 데이터는 위와 같다. 해당 회원을 A라고 해보자.

주목해야하는 부분은 following와 follower이다. 팔로워/팔로잉 기능은 이번에 새로 개발하게 된 기능으로, 해당 필드는 A가 팔로우하는 회원과 A를 팔로잉하는 회원의 수를 반환해준다.

    public static ArtistResponseDto from(Artist artist) {
        return ArtistResponseDto.builder()
                .uuid(artist.getUuid())
                .follower(artist.getFollowers().size())
                .following(artist.getFollowings().size())
                ...
                .build();
    }

객체를 DTO로 변환하는 정적 팩터리 메서드는 위와 같다. getFollowers(), getFollowings()을 통해 A의 팔로워와 팔로잉을 조회한 후 그 콜렉션 사이즈를 반환하는 로직을 지닌다. (이때 Artist와 Follow는 일대다 관계이다.)

 start transaction 
    - [driving thread] http-nio-8080-exec-3
    - param: d48ec5cd-95aa-49f9-80ec-c74c66dfd7f2
    - OPEN-DBC jdbc:mysql://soundflyer-db:3306/soundflyer (com.zaxxer.hikari.HikariDataSource#getConnection) [2ms] -- OPEN-DBC jdbc:mysql://soundflyer-db:3306/soundflyer (com.zaxxer.hikari.HikariDataSource#getConnection)
    - setAutoCommit(false) [0ms] -- setAutoCommit(false)
    - PRE> select a1_0.id,a1_0.uuid, .....
			     from artist a1_0 
			     where (a1_0.is_deleted = @{1}) and a1_0.uuid=? [0,'d48ec5cd-95aa-49f9-8'] 2 ms
    - RESULT-SET-FETCH #1 1 ms
    - PRE> select f1_0.follower_id,f1_0.id,f1_0.created_at,f1_0.following_id,f1_0.updated_at 
				   from follow f1_0 
				   where f1_0.follower_id=? [2] 5 ms
    - RESULT-SET-FETCH #2 0 ms
    - PRE> select f1_0.following_id,f1_0.id,f1_0.created_at,f1_0.follower_id,f1_0.updated_at 
				   from follow f1_0 
				   where f1_0.following_id=? [2] 1 ms
    - COMMIT [1ms] -- COMMIT
    - setAutoCommit(true) [0ms] -- setAutoCommit(true)
    - CLOSE [0ms] -- CLOSE
end of transaction

특정 회원 A를 조회했을 때 반환되는 SQL은 위와 같다. 총 3개의 SQL을 확인할수 있다.

회원 A의 정보를 조회하는 SQL
회원 A가 가진 모든 Follow(Following) 정보를 조회하는 SQL
회원 A가 가진 모든 Follow(Follower) 정보를 조회하는 SQL

start transaction 
[driving thread] http-nio-8080-exec-3
OPEN-DBC jdbc:mysql://soundflyer-db:3306/soundflyer (com.zaxxer.hikari.HikariDataSource#getConnection) [1ms] -- OPEN-DBC jdbc:mysql://soundflyer-db:3306/soundflyer (com.zaxxer.hikari.HikariDataSource#getConnection)
setAutoCommit(false) [0ms] -- setAutoCommit(false)
PRE> select a1_0.id,a1_0.uuid,.... 
		 from artist a1_0 
		 where (a1_0.is_deleted = @{1}) order by a1_0.created_at desc limit ?,? [0,0,10] 7 ms
RESULT-SET-FETCH #5 3 ms
PRE> select f1_0.following_id,f1_0.id,f1_0.created_at,f1_0.follower_id,f1_0.updated_at 
		 from follow f1_0 
		 where f1_0.following_id=? [5] 1 ms
PRE> select f1_0.follower_id,f1_0.id,f1_0.created_at,f1_0.following_id,f1_0.updated_at 
		 from follow f1_0 
		 where f1_0.follower_id=? [5] 1 ms
PRE> select f1_0.following_id,f1_0.id,f1_0.created_at,f1_0.follower_id,f1_0.updated_at 
		 from follow f1_0 
		 where f1_0.following_id=? [4] 0 ms
RESULT-SET-FETCH #1 2 ms
PRE> select f1_0.follower_id,f1_0.id,f1_0.created_at,f1_0.following_id,f1_0.updated_at 
		 from follow f1_0 
		 where f1_0.follower_id=? [4] 4 ms
PRE> select f1_0.following_id,f1_0.id,f1_0.created_at,f1_0.follower_id,f1_0.updated_at 
		 from follow f1_0 
		 where f1_0.following_id=? [3] 3 ms
RESULT-SET-FETCH #1 0 ms
PRE> select f1_0.follower_id,f1_0.id,f1_0.created_at,f1_0.following_id,f1_0.updated_at 
		 from follow f1_0 
		 where f1_0.follower_id=? [3] 2 ms
PRE> select f1_0.following_id,f1_0.id,f1_0.created_at,f1_0.follower_id,f1_0.updated_at 
		 from follow f1_0 
		 where f1_0.following_id=? [2] 0 ms
PRE> select f1_0.follower_id,f1_0.id,f1_0.created_at,f1_0.following_id,f1_0.updated_at 
		 from follow f1_0 
		 where f1_0.follower_id=? [2] 1 ms
RESULT-SET-FETCH #2 3 ms
PRE> select f1_0.following_id,f1_0.id,f1_0.created_at,f1_0.follower_id,f1_0.updated_at 
		 from follow f1_0 
		 where f1_0.following_id=? [1] 1 ms
PRE> select f1_0.follower_id,f1_0.id,f1_0.created_at,f1_0.following_id,f1_0.updated_at 
		 from follow f1_0 
		 where f1_0.follower_id=? [1] 2 ms
COMMIT [2ms] -- COMMIT
setAutoCommit(true) [0ms] -- setAutoCommit(true)
CLOSE [0ms] -- CLOSE
end of transaction 

이번엔 특정 회원을 조회하는 것이 아닌, 전체 회원을 조회하는 API를 실행시켜보았다. 현재 저장된 회원은 5명 뿐인데 총 11개의 SQL이 실행되었다.

전체 회원 조회 SQL
1번 회원의 Follower & Following을 조회하는 SQL 2개
2번 회원의 Follower & Following을 조회하는 SQL 2개
…

위와 같은 식으로 총 N*2+1 = 11의 쿼리가 실행된 것이다. 회원의 수가 많아지면 많아질 수록 발생하는 추가 쿼리의 양 또한 심각해진다.

앨범 조회

이보다 더 심각한 경우를 찾아 이번엔 앨범 목록 조회를 실행해보았다. 아래는 앨범 목록 조회에 따른 결과이다.

{
  "albumResponseDto": {
    "uuid": "5b10264b-1edf-4113-be25-502b3f05ba4f",
    "title": "tipsy",
    "description": "sbz",
    "artImage": "https://...s3.ap-northeast-2.amazonaws.com/resources/images/597959c4-bb19-4c98-bba3-1c27afd61714_tipsy.webp",
    "releaseDate": "2025-02-16"
  },
  "trackResponseDtos": [
    {
      "uuid": "1427e153-b6f0-4c34-881d-ca93a69f18bb",
      "title": "a bar song",
      "lyric": "",
      "trackUrl": "https://...s3.ap-northeast-2.amazonaws.com/resources/musics/5dd0de41-dfdf-41d8-a35a-20ab75096e73_a%20bar%20song.mp3",
      "duration": 173,
      "artUrl": "https://...s3.ap-northeast-2.amazonaws.com/resources/images/597959c4-bb19-4c98-bba3-1c27afd61714_tipsy.webp"
    }
  ],
  "artistResponseDto": {
    "uuid": "64c18fa8-f15d-4861-af55-f6d44391a639",
    "name": "ipcgrdn",
    "description": "🦋",
    "link1": "https://www.instagram.com/ipcgrdn",
    "link2": "",
    "following": 0,
    "follower": 0,
    "role": "ROLE_USER",
    "email": "asdf@gmail.com",
    "artistImage": "https://...s3.ap-northeast-2.amazonaws.com/resources/images/3a69b482-ae2d-4857-b8a1-dd826b32ed94_profile_pic_1.webp"
  }
}

앨범 조회의 경우에는 앨범과 일대다 관계인 트랙, 앨범과 다대일 관계인 아티스트 정보가 포함된 데이터를 반환한다.

start transaction
-    [driving thread] http-nio-8080-exec-1
-    param: 5b10264b-1edf-4113-be25-502b3f05ba4f
-    param: 0
-    param: 10
-    OPEN-DBC jdbc:mysql://soundflyer-db:3306/soundflyer (com.zaxxer.hikari.HikariDataSource#getConnection) [0ms] -- OPEN-DBC jdbc:mysql://soundflyer-db:3306/soundflyer (com.zaxxer.hikari.HikariDataSource#getConnection)
-    setAutoCommit(false) [0ms] -- setAutoCommit(false)
-    PRE> select a1_0.id,a1_0.uuid,... from album a1_0 
					where (a1_0.is_deleted = @{1}) and a1_0.uuid=?[0,'5b10264b-1edf-4113-b'] 9 ms
-    RESULT-SET-FETCH #1 1 ms
-    PRE> select t1_0.id,t1_0.uuid,... from track t1_0 
					where t1_0.album_id=? order by t1_0.created_at desc limit ?[8,10] 1 ms
-    RESULT-SET-FETCH #1 0 ms
-    PRE> select a1_0.id,a1_0.uuid,... from artist a1_0 
					where a1_0.id=? and (a1_0.is_deleted = @{1})[0,1] 0 ms
-    RESULT-SET-FETCH #1 0 ms
-    PRE> select f1_0.following_id,f1_0.id,f1_0.created_at,f1_0.follower_id,f1_0.updated_at 
					from follow f1_0 
					where f1_0.following_id=?[1] 0 ms
-    PRE> select f1_0.follower_id,f1_0.id,f1_0.created_at,f1_0.following_id,f1_0.updated_at 
					from follow f1_0 
					where f1_0.follower_id=?[1] 0 ms
-    COMMIT [1ms] -- COMMIT
-    setAutoCommit(true) [0ms] -- setAutoCommit(true)
-    CLOSE [0ms] -- CLOSE
end of transaction 

따라서 앨범 조회 시 발생하는 SQL은 총 5개이다: 앨범 조회 + (트랙 조회 + 회원 조회 + (팔로잉 조회 + 팔로워 조회))

앨범 조회에서 트랙과 회원에 대한 N+1이, 회원에서 팔로잉 팔로워에 대한 N+1이 발생했다.

성능 측정(개선 전)

실무와 유사한 환경 구성을 위해 DB에 총 10만개의 데이터를 추가한 후 성능을 측정했다.
성능 측정과 부하 테스트 툴로는 각각 Scouter와 JMeter를 사용했다.

데이터를 추가하는 방법은 다음 게시글을 첨고하자: DB에 대량 데이터 삽입하기

► txid    = x2upjd8qduric
► objName = /088efa1456d7/tomcat1
► thread  = http-nio-8080-exec-91
► endtime = 20250408 18:12:29.815
► elapsed = 336 ms
► service = /albums<GET>
► ipaddr=172.18.0.4, userid=-8854012464114361035
► cpu=0 ms, kbytes=0
► sqlCount=393, sqlTime=230 ms
► userAgent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36
► referer=https://soundforest.kro.kr/v1/swagger-ui/index.html
► group= / **
► profileCount=723
► profileSize=16446
------------------------------------------------------------------------------------------
    p#      #    	  TIME         T-GAP   CPU          CONTENTS
------------------------------------------------------------------------------------------
         [******] 18:12:29.479        0      0  start transaction 
    -    [000000] 18:12:29.479        0      0  [driving thread] http-nio-8080-exec-91
    -    [000001] 18:12:29.485        6      0  param: 0
    -    [000002] 18:12:29.485        0      0  param: 100
    -    [000003] 18:12:29.485        0      0  OPEN-DBC jdbc:mysql://soundflyer-db:3306/soundflyer (com.zaxxer.hikari.HikariDataSource#getConnection) [0ms] -- OPEN-DBC jdbc:mysql://soundflyer-db:3306/soundflyer (com.zaxxer.hikari.HikariDataSource#getConnection)
    -    [000004] 18:12:29.486        1      0  setAutoCommit(false) [0ms] -- setAutoCommit(false)
    -    [000005] 18:12:29.488        2      0  PRE> select a1_0.id,a1_0.art_image,a1_0.artist_id,a1_0.created_at,a1_0.description,a1_0.is_deleted,a1_0.release_date,a1_0.title,a1_0.updated_at,a1_0.uuid from album a1_0 where (a1_0.is_deleted = @{1}) order by a1_0.created_at desc limit ?,?
                                                [0,0,100] 14 ms
    -    [000006] 18:12:29.502       14      0  RESULT-SET-FETCH #100 0 ms
    -    [000007] 18:12:29.503        1      0  PRE> select count(a1_0.id) from album a1_0 where (a1_0.is_deleted = @{1})
                                                [0] 5 ms
                                                
                                                .....

위는 단일 사용자가 100개의 앨범을 조회하는 경우 발생한 쿼리이다.
총 393개의 쿼리가 실행되고, 230ms의 sqlTime이 소요됨을 확인할 수 있다.

► txid    = x4hv54q38pqbal
► objName = /088efa1456d7/tomcat1
► thread  = http-nio-8080-exec-257
► endtime = 20250408 18:20:30.123
► elapsed = 2,071 ms
► service = /albums<GET>
► ipaddr=172.18.0.4, userid=0
► cpu=0 ms, kbytes=0
► sqlCount=393, sqlTime=1,565 ms
► userAgent=Apache-HttpClient/4.5.14 (Java/21.0.5)
► group=/ **
► profileCount=723
► profileSize=16988
------------------------------------------------------------------------------------------
    p#      #    	  TIME         T-GAP   CPU          CONTENTS
------------------------------------------------------------------------------------------
         [******] 18:20:28.052        0      0  start transaction 
    -    [000000] 18:20:28.052        0      0  [driving thread] http-nio-8080-exec-257
    -    [000001] 18:20:28.082       30      0  param: 0
    -    [000002] 18:20:28.082        0      0  param: 100
    -    [000003] 18:20:28.088        6      0  OPEN-DBC jdbc:mysql://soundflyer-db:3306/soundflyer (com.zaxxer.hikari.HikariDataSource#getConnection) [5ms] -- OPEN-DBC jdbc:mysql://soundflyer-db:3306/soundflyer (com.zaxxer.hikari.HikariDataSource#getConnection)
    -    [000004] 18:20:28.094        6      0  setAutoCommit(false) [0ms] -- setAutoCommit(false)
    -    [000005] 18:20:28.095        1      0  PRE> select a1_0.id,a1_0.art_image,a1_0.artist_id,a1_0.created_at,a1_0.description,a1_0.is_deleted,a1_0.release_date,a1_0.title,a1_0.updated_at,a1_0.uuid from album a1_0 where (a1_0.is_deleted = @{1}) order by a1_0.created_at desc limit ?,?
                                                [0,0,100] 159 ms
    -    [000006] 18:20:28.255      160      0  RESULT-SET-FETCH #100 1 ms
    -    [000007] 18:20:28.255        0      0  PRE> select count(a1_0.id) from album a1_0 where (a1_0.is_deleted = @{1})
                                                [0] 63 ms
    -    [000008] 18:20:28.318       63      0  RESULT-SET-FETCH #1 0 ms
    -    [000009] 18:20:28.318        0      0  PRE> select t1_0.id,t1_0.album_id,t1_0.created_at,t1_0.duration,t1_0.is_deleted,t1_0.lyric,t1_0.title,t1_0.track_url,t1_0.updated_at,t1_0.uuid from track t1_0 where t1_0.album_id=? order by t1_0.created_at desc limit ?
                                                [8,10] 1 ms
                                                
                                                ......

이번엔 10명의 사용자가 동시 접속한 후 100개의 앨범을 조회하는 경우를 테스트했다.
SqlTime이 1.5초 가량으로 상승했다.

개선 방법

Fetch Join

    @Query("SELECT a FROM Album a "
            + "JOIN FETCH a.tracks t ")
    Page<Album> findAll(Pageable pageable);

n+1 문제를 해결하는 것으로 가장 잘 알려진 방법으로는 Fetch Join이 있다.
Fetch Join을 사용하면 추가적인 쿼리 없이, 하나의 쿼리만으로 Join한 테이블의 데이터까지 모두 조회가능하다.

Join vs Fetch Join
Join: 연관된 엔티티를 단순히 결합할 뿐 연관된 객체를 즉시 로딩하지 않음 (따라서 이후 연괸 객체 조회시 추가 쿼리 발생)
Fetch Join: 연관된 객체를 한번의 쿼리로 즉시 로딩

Hibernate: 
    select
        a1_0.id,
        ...
        a1_0.uuid 
    from
        album a1_0 
    where
        (
            a1_0.is_deleted = 0
        ) 
    order by
        a1_0.created_at desc 
    limit
        ?, ?
Hibernate: 
    select
        t1_0.id,
        ...
        t1_0.uuid 
    from
        track t1_0 
    where
        t1_0.album_id=? 
    order by
        t1_0.created_at desc 
    limit
        ?

기존에는 Album 조회시 n+1에 의해 Track 조회 쿼리가 추가적으로 실행되었다.
(편의를 위해 artist와 follow 쿼리는 제거했다.)

2025-04-18T16:56:43.351+09:00  WARN 71337 --- [nio-8080-exec-1] org.hibernate.orm.query                  : HHH90003004: firstResult/maxResults specified with collection fetch; applying in memory
Hibernate: 
    selec
        a1_0.id,
        ...
        a1_0.uuid,
        t1_0.id,
        ...
        t1_0.uuid
    from
        album a1_0 
    join
        track t1_0 
            on a1_0.id=t1_0.album_id 
    where
        (
            a1_0.is_deleted = 0
        ) 
    order by
        a1_0.created_at desc

그러나 Album에서 Track을 Fetch Join한 경우 위와 같이 하나의 쿼리만으로 Album과 Track의 정보를 한꺼번에 조회한다.

Fetch Join과 페이지네이션

그러나 Fetch Join에는 문제가 있다. 위에서 본 두 쿼리 중 앨범 조회 쿼리의 마지막 부분을 비교해보자.

....
    order by
        a1_0.created_at desc 
    limit
        ?, ?

....   
    order by
        a1_0.created_at desc

페이징을 통해 앨범을 조회하고 있었는데, Fetch Join을 사용하니 페이징이 적용되지 않는다!

정확히는 OneToMany관계에서 Fetch Join을 사용하는 경우 페이지네이션은 적용되지 않는다.

앨범 조회시 페이지네이션을 사용하여 한 페이지당 2개의 앨범을 지닌다고 가정해보자.

앨범은 트랙과 OneToMany 관계이다. 일반 Join을 사용하여 쿼리를 호출한다면 앨범 조회 쿼리와 함께 n개의 트랙 조회 쿼리가 생성될 것이다.

그러나 Fetch Join을 사용할 경우 Album 조회 쿼리와 함께 Track 데이터가 조회된다.

JPA는 앨범 조회시 2개의 데이터를 한 페이지로 정의하기로 했다. 그러나 정해진 데이터보다 훨씬 많은 데이터가 들어왔으므로 어떤 기준으로 페이징해야할지 알 수 없다.

이번엔 트랙을 페이지네이션으로 조회시 한 페이지당 3개의 트랙을 지닌다고 가정해보자.

트랙과 앨범의 관계는 ManyToOne이다. ManyToOne의 경우에는 Fetch Join을 사용해도 페이징이 가능하다.

트랙에서 앨범을 Fetch Join한 경우, 어차피 하나의 트랙당 한 개의 앨범을 가지게 되므로 페이징의 기준이 명확하기 때문이다.

그렇다면 OneToMany 관계에서 페이지네이션을 사용하면서, n+1 문제도 해결하기 위해선 어떻게 해야할까?

BatchSize()

OneToMany 관계에서 페이지네이션을 사용하면서도 추가 쿼리를 줄이고 싶다면 BatchSize()를 사용하자.

    @BatchSize(size = 2)
    @OneToMany(mappedBy = "album", orphanRemoval = true)
    private List<Track> tracks;

Album 클래스의 Track 필드에 BatchSize를 2으로 설정해주었다. (테스트용으로 사이즈를 작게 설정했다.)

Hibernate: 
    select
        a1_0.id,
        ...
        a1_0.uuid 
    from
        album a1_0 
    order by
        a1_0.created_at desc 
    limit
        ?
Hibernate: 
    select
        t1_0.album_id,
        ...
        t1_0.uuid 
    from
        track t1_0 
    where
        t1_0.album_id in (?, ?)
Hibernate: 
    select
        t1_0.album_id,
        ...
        t1_0.uuid 
    from
        track t1_0 
    where
        t1_0.album_id in (?, ?)
Hibernate: 
    select
        t1_0.album_id,
				...
        t1_0.uuid 
    from
        track t1_0 
    where
        t1_0.album_id=?

DB에 저장된 Album의 개수가 5개라 할때, 실행된 쿼리는 총 4개이다: 앨범 조회 + 트랙조회 + 트랙조회 + 트랙조회

원래라면 n+1에 의해 n(=5)개의 추가 쿼리가 발생해야했으나, 3개밖에 발생하지 않은 것이다.

BatchSIze()를 사용하면 여러개의 쿼리를 묶어서 실행할 수 있다.

위 경우에는 BatchSize를 2로 설정하여 원래 발생했어야 했던 5개의 추가 쿼리가 2개씩 묶여 2+2+1, 즉 3개의 쿼리가 실행되었다.

여전히 추가 쿼리가 발생하긴 하나, 원래 발생했어야 할 n개의 추가 쿼리를 n/{Batch size}만큼 줄이는 것으로 성능 향상을 이루어낼수 있다.

물론 Batch Size를 굉장히 크게 잡으면 추가 쿼리의 수야 줄어들겠지만 그만큼 가져올 데이터가 많아져 메모리에 부담이 가니 Batch size는 데이터의 수에 따라 적당히 조절하도록 하자.

개선 결과

Hibernate: 
    select
        a1_0.id,
        ...
        a1_0.uuid 
    from
        album a1_0 
    where
        (
            a1_0.is_deleted = 0
        ) 
    order by
        a1_0.created_at desc 
    limit
        ?, ?
Hibernate: 
    select
        t1_0.album_id,
        ...
        t1_0.uuid 
    from
        track t1_0 
    where
        t1_0.album_id=?
Hibernate: 
    select
        a1_0.id,
        ...
        a1_0.uuid 
    from
        artist a1_0 
    where
        a1_0.id=? 
        and (
            a1_0.is_deleted = 0
        )
Hibernate: 
    select
        f1_0.following_id,
        ...
        f1_0.updated_at 
    from
        follow f1_0 
    where
        f1_0.following_id=?
Hibernate: 
    select
        f1_0.follower_id,
        ...
        f1_0.updated_at 
    from
        follow f1_0 
    where
        f1_0.follower_id=?
Hibernate: 
    select
        t1_0.album_id,
        ...
        t1_0.uuid 
    from
        track t1_0 
    where
        t1_0.album_id=?
Hibernate: 
    select
        a1_0.id,
        ...
        a1_0.uuid 
    from
        artist a1_0 
    where
        a1_0.id=? 
        and (
            a1_0.is_deleted = 0
        )
Hibernate: 
    select
        f1_0.following_id,
        ...
        f1_0.updated_at 
    from
        follow f1_0 
    where
        f1_0.following_id=?
Hibernate: 
    select
        f1_0.follower_id,
        ...
        f1_0.updated_at 
    from
        follow f1_0 
    where
        f1_0.follower_id=?

개선 전 앨범 조회시 발생한 쿼리는 위와 같다. 이 때 앨범에 두개의 트랙이 있고, 두 트랙의 아티스트가 다르다고 가정한다.

album 조회 + 2*(track 조회 + artist 조회 + follower 조회 + following 조회) = 9개 쿼리

Hibernate: 
    select
        a1_0.id,
        ...
        a1_0.uuid,
        a2_0.id,
        ...
        a2_0.uuid
    from
        album a1_0 
    join
        artist a2_0 
            on a2_0.id=a1_0.artist_id 
            and (a2_0.is_deleted = 0) 
    where
        (
            a1_0.is_deleted = 0
        ) 
    order by
        a1_0.created_at desc 
    limit
        ?
Hibernate: 
    select
        t1_0.album_id,
        ...
        t1_0.uuid 
    from
        track t1_0 
    where
        t1_0.album_id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: 
    select
        f1_0.following_id,
        ...
        f1_0.updated_at 
    from
        follow f1_0 
    where
        f1_0.following_id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: 
    select
        f1_0.follower_id,
        ...
        f1_0.updated_at 
    from
        follow f1_0 
    where
        f1_0.follower_id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)

개선 후 동일 상황에서 발생하는 쿼리는 위아 같다. 개선 내용은 아래와 같다.

artist → follow (one to many) = batch size 적용
album → artist (many to one) = fetch join 적용
album → track (one to many) = batch size 적용

위와 동일한 상황에서 album&artist 조회 + track 조회 + follower 조회 + following조회 = 4개 쿼리가 발생했다.

개선 전과 비교하여 5개의 추가쿼리가 제거되었다. 조회하는 데이터의 수가 커질수록 개선 효과는 커질 것이다.

성능 측정(개선 후)

► txid    = x1kqro991h1oi6
► objName = /825e5c2a2716/tomcat1
► thread  = http-nio-8080-exec-667
► endtime = 20250418 23:14:12.905
► elapsed = 52 ms
► service = /albums<GET>
► ipaddr=172.18.0.4, userid=0
► cpu=0 ms, kbytes=0
► sqlCount=21, sqlTime=41 ms
► userAgent=Apache-HttpClient/4.5.14 (Java/21.0.6)
► group=/ **
► profileCount=50
► profileSize=2440
------------------------------------------------------------------------------------------
    p#      #    	  TIME         T-GAP   CPU          CONTENTS
------------------------------------------------------------------------------------------
         [******] 23:14:12.853        0      0  start transaction 
    -    [000000] 23:14:12.853        0      0  [driving thread] http-nio-8080-exec-667
    -    [000001] 23:14:12.854        1      0  param: 0
    -    [000002] 23:14:12.854        0      0  param: 100
    -    [000003] 23:14:12.854        0      0  OPEN-DBC jdbc:mysql://soundflyer-db:3306/soundflyer (com.zaxxer.hikari.HikariDataSource#getConnection) [0ms] -- OPEN-DBC jdbc:mysql://soundflyer-db:3306/soundflyer (com.zaxxer.hikari.HikariDataSource#getConnection)
    -    [000004] 23:14:12.854        0      0  setAutoCommit(false) [0ms] -- setAutoCommit(false)
    -    [000005] 23:14:12.855        1      0  PRE> select a1_0.id,a1_0.art_image,a1_0.artist_id,a2_0.id,a2_0.artist_image,a2_0.created_at,a2_0.description,a2_0.email,a2_0.is_deleted,a2_0.link1,a2_0.link2,a2_0.name,a2_0.provider,a2_0.role,a2_0.updated_at,a2_0.uuid,a1_0.created_at,a1_0.description,a1_0.is_deleted,a1_0.release_date,a1_0.title,a1_0.updated_at,a1_0.uuid from album a1_0 join artist a2_0 on a2_0.id=a1_0.artist_id and (a2_0.is_deleted = @{1}) where (a1_0.is_deleted = @{2}) order by a1_0.created_at desc limit ?
                                                [0,0,101] 32 ms
    -    [000006] 23:14:12.888       33      0  RESULT-SET-FETCH #101 1 ms
    -    [000007] 23:14:12.888        0      0  PRE> select t1_0.album_id,t1_0.id,t1_0.created_at,t1_0.duration,t1_0.is_deleted,t1_0.lyric,t1_0.title,t1_0.track_url,t1_0.updated_at,t1_0.uuid from track t1_0 where t1_0.album_id in (?,?,?,?,?,?,?,?,?,?)
                                                [8,7,4,3,2,1,1001,1002,1003,1004] 1 ms
    -    [000008] 23:14:12.889        1      0  RESULT-SET-FETCH #17 0 ms
    -    [000009] 23:14:12.889        0      0  PRE> select f1_0.following_id,f1_0.id,f1_0.created_at,f1_0.follower_id,f1_0.updated_at from follow f1_0 where f1_0.following_id in (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)
                                                [1,2,3,11901,12130,13421,3505,18305,2481,2783,12655,5801,20731,8265,3400,3516,10606,10344,11703,2250] 0 ms
    -    [000010] 23:14:12.889        0      0  RESULT-SET-FETCH #23 0 ms
    -    [000011] 23:14:12.891        2      0  PRE> select f1_0.follower_id,f1_0.id,f1_0.created_at,f1_0.following_id,f1_0.updated_at from follow f1_0 where f1_0.follower_id in (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)
                                                [1,2,3,11901,12130,13421,3505,18305,2481,2783,12655,5801,20731,8265,3400,3516,10606,10344,11703,2250] 0 ms
    -    [000012] 23:14:12.891        0      0  RESULT-SET-FETCH #28 0 ms
    
    ...

단일 사용자가 100개의 앨범을 조회하는 경우를 테스트 하였다.

개선 전과 비교했을 때 발생하는 쿼리가 393개에서 21개로 줄어들고, sqlTime도 230ms에서 41ms로 줄어들었다.

► txid    = z5ukk2r8dmmakg
► objName = /825e5c2a2716/tomcat1
► thread  = http-nio-8080-exec-745
► endtime = 20250503 17:08:32.934
► elapsed = 206 ms
► service = /albums<GET>
► ipaddr=172.18.0.4, userid=0
► cpu=0 ms, kbytes=0
► sqlCount=21, sqlTime=183 ms
► userAgent=Apache-HttpClient/4.5.14 (Java/21.0.6)
► group=/ **
► profileCount=50
► profileSize=2496
------------------------------------------------------------------------------------------
    p#      #    	  TIME         T-GAP   CPU          CONTENTS
------------------------------------------------------------------------------------------
         [******] 17:08:32.728        0      0  start transaction 
    -    [000000] 17:08:32.728        0      0  [driving thread] http-nio-8080-exec-745
    -    [000001] 17:08:32.730        2      0  param: 0
    -    [000002] 17:08:32.730        0      0  param: 100
    -    [000003] 17:08:32.730        0      0  OPEN-DBC jdbc:mysql://soundflyer-db:3306/soundflyer (com.zaxxer.hikari.HikariDataSource#getConnection) [0ms] -- OPEN-DBC jdbc:mysql://soundflyer-db:3306/soundflyer (com.zaxxer.hikari.HikariDataSource#getConnection)
    -    [000004] 17:08:32.732        2      0  setAutoCommit(false) [0ms] -- setAutoCommit(false)
    -    [000005] 17:08:32.733        1      0  PRE> select a1_0.id,a1_0.art_image,a1_0.artist_id,a2_0.id,a2_0.artist_image,a2_0.created_at,a2_0.description,a2_0.email,a2_0.is_deleted,a2_0.link1,a2_0.link2,a2_0.name,a2_0.provider,a2_0.role,a2_0.updated_at,a2_0.uuid,a1_0.created_at,a1_0.description,a1_0.is_deleted,a1_0.release_date,a1_0.title,a1_0.updated_at,a1_0.uuid from album a1_0 join artist a2_0 on a2_0.id=a1_0.artist_id and (a2_0.is_deleted = @{1}) where (a1_0.is_deleted = @{2}) order by a1_0.created_at desc limit ?
                                                [0,0,101] 147 ms
    -    [000006] 17:08:32.884      151      0  RESULT-SET-FETCH #101 4 ms
    -    [000007] 17:08:32.887        3      0  PRE> select t1_0.album_id,t1_0.id,t1_0.created_at,t1_0.duration,t1_0.is_deleted,t1_0.lyric,t1_0.title,t1_0.track_url,t1_0.updated_at,t1_0.uuid from track t1_0 where t1_0.album_id in (?,?,?,?,?,?,?,?,?,?)
                                                [8,7,4,3,2,1,1001,1002,1003,1004] 0 ms
    -    [000008] 17:08:32.887        0      0  RESULT-SET-FETCH #17 0 ms
    -    [000009] 17:08:32.889        2      0  PRE> select f1_0.following_id,f1_0.id,f1_0.created_at,f1_0.follower_id,f1_0.updated_at from follow f1_0 where f1_0.following_id in (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)
                                                [1,2,3,11901,12130,13421,3505,18305,2481,2783,12655,5801,20731,8265,3400,3516,10606,10344,11703,2250] 1 ms
    -    [000010] 17:08:32.890        1      0  RESULT-SET-FETCH #23 0 ms
    -    [000011] 17:08:32.890        0      0  PRE> select f1_0.follower_id,f1_0.id,f1_0.created_at,f1_0.following_id,f1_0.updated_at from follow f1_0 where f1_0.follower_id in (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)
                                                [1,2,3,11901,12130,13421,3505,18305,2481,2783,12655,5801,20731,8265,3400,3516,10606,10344,11703,2250] 2 ms
    -    [000012] 17:08:32.892        2      0  RESULT-SET-FETCH #28 0 ms
    
    ....

10명의 사용자가 동시 접속하는 경우도 비교해보자.

개선 후 sqlTime은 제일 오래 걸린 트랜잭션이 183ms, 제일 적게 걸린 트랜잭션이 33ms로 개선전 성능이었던 1,565ms에 비해 88%의 향상을 보였다.

다만 이건 동시 접속자 수가 10명밖에 안되는 경우이다…

추가 개선점

목표였던 ‘쿼리 수 줄이기’는 달성했으나 신경쓰이는 점이 있다.

PRE> select a1_0.id,a1_0.art_image,a1_0.artist_id,a1_0.created_at,a1_0.description,a1_0.is_deleted,a1_0.release_date,a1_0.title,a1_0.updated_at,a1_0.uuid from album a1_0 where (a1_0.is_deleted = @{1}) order by a1_0.created_at desc limit ?,?
[0,0,100] 159 ms

개선 전 여러 사용자가 동시에 요청을 보냈을 때 발생한 sqlTime이 1,565ms였을 때 이 중 앨범 조회 쿼리에 소요된 시간은 159ms이다.

PRE> select a1_0.id,a1_0.art_image,a1_0.artist_id,a2_0.id,a2_0.artist_image,a2_0.created_at,a2_0.description,a2_0.email,a2_0.is_deleted,a2_0.link1,a2_0.link2,a2_0.name,a2_0.provider,a2_0.role,a2_0.updated_at,a2_0.uuid,a1_0.created_at,a1_0.description,a1_0.is_deleted,a1_0.release_date,a1_0.title,a1_0.updated_at,a1_0.uuid from album a1_0 join artist a2_0 on a2_0.id=a1_0.artist_id and (a2_0.is_deleted = @{1}) where (a1_0.is_deleted = @{2}) order by a1_0.created_at desc limit ?
[0,0,101] 147 ms

개선 후에 발생한 sqlTime은 183ms, 이 중 앨범 조회 쿼리가 사용한 시간이 147ms로 개선 전과 다르게 sqlTime의 대부분을 차지한다.

이상한 이야기는 아니다. 개선 전 쿼리는 쿼리수가 많은 대신 각 쿼리가 복잡하지 않았다.

그러나 개선 후 쿼리는 쿼리 수가 줄어든 대신, 앨범 조회 쿼리가 Join+Where+Order by 절로 인해 복잡해졌다. 따라서 sqlTime의 대부분을 차지하게 된 것이다.

그러나 문제는 여기부터다.

개선 전

개선 후

동시 접속 사용자 수를 늘리면 결과적으로 평균 시간은 늘어나고 TPS는 줄어든다.

개선 전

개선 후

확인해보면 SQL Time이 거의 2배 가까이 늘어난 것을 확인할수 있다.

내가 해석한 바로는,

트랜잭션이 적었을 때: 어차피 Connection Pool이 부족하지 않음, sqlTime이 좀 길더라도 Connection Pool을 대기하진 않음, CPU 과부화도 비교적 적음

트랜잭션이 많을 때: 많은 양의 트랜잭션 + 복잡한 쿼리 조합으로 DB 과부화 심해짐 → sqlTime 증가 → cpu 과부화로 요청지연 or Connection Pool이 부족한 와중에 sqlTime이 길어서 Connection Pool을 오래 붙잡고 있음 → Connection Pool 대기시간 늘어남 → 요청시간과 TSP가 떨어짐…

정도로 이해했다.

개선 후 늘어난 SQL time의 대부분을 잡아먹는 쿼리는 artist와 album을 조인한 후 select하는 쿼리이다.

Explain을 통해 해당 쿼리를 확인한 결과 Table을 full scan하고 있고, 정렬을 메모리로 수행하고 있다는 정보를 확인할 수 있었다.

해당 성능 개선은 인덱스 게시글에서 찾아보도록 하겠다… -> https://lcqff.github.io/index/

한 개의 무거운 쿼리가 여러 개의 가벼운 쿼리보다 훨씬 더 빠른 이유

13 May 2025

SOFO

« DB에 대량 데이터 삽입하기 인덱스 적용 »

Pam Tree