무던히 하다보면 느는

[mysql] Investigating a Drop in User Engagement 본문

SQL/코딩테스트

[mysql] Investigating a Drop in User Engagement

무던히 하다보면 느는 2025. 4. 24. 19:21

WAU 하락 원인 분석

[tutorial.yammer_events table]

(1) 주차별 활동 유저

select date_sub(date(e.occurred_at), interval weekday(e.occurred_at) day) as week
     , count(distinct user_id) as weekly_active_user
from tutorial.yammer_events e
where e.occurred_at between '2014-04-28 00:00:00' and '2014-08-25 23:59:59'
and e.event_type = 'engagement'
group by week
order by week

 

date_sub(date(e.occurred_at), interval weekday(e.occurred_at) day) as week

-> e.occurred_at 이. 속한 주의 월요일 날짜를 계산

 

(2) 가입 코호트별 WAU 분석

- 각 날짜별로 몇 명이 가입했는지, 가입한 사람들 중에서 active 상태인 사람은 몇 명인지 계산

select date(created_at) as signup_date
    , count(user_id) as signup_users
    , count(case when activated_at is not null then user_id else null end) as activated_users
from tutorial.yammer_users
where created_at between '2014-06-01 00:00:00' and '2014-08-31 23:59:59'
group by signup_date

(3) Email Actions

- 특정 주차에 이메일 이벤트가 몇 개가 들어왔는지 단순 카운트

select date_trunc('week', occurred_at) as week
     , count(case when e.action = 'sent_weekly_digest' then e.user_id else null end) as weekly_emails
     , count(case when e.action = 'sent_reengagement_email' then e.user_id else null end) as reengagement_emails
     , count(case when e.action = 'email_open' then e.user_id else null end) as email_opens
     , count(case when e.action = 'email_clickthrough' then e.user_id else null end) as email_clickthroughs
from tutorial.yammer_emails e
where occurred_at between '2014-04-28 00:00:00' and '2014-08-25 23:59:59'
group by week
order by week

 

- 같은 유저에게서 발생한 이메일 오픈 로그 중

기존의 로그 발생시점 이후로 5분 이내에 발생한 이메일 오픈 로그를 찾는 쿼리

select e1.user_id, e1.occurred_at, e1.action
     , e2.user_id, e2.occurred_at, e2.action
from tutorial.yammer_emails e1
  left join tutorial.yammer_emails e2 
        on e2.occurred_at between e1.occurred_at and e1.occurred_at + interval '5 minute'
        and e2.user_id = e1.user_id
        and e2.action = 'email_open'
where e1.occurred_at between '2014-06-01 00:00:00' and '2014-08-31 23:59:59'
and e1.action in ('sent_weekly_digest', 'sent_reengagement_email')
더보기

from tutorial.yammer_emails e1

    left join tutorial.yammer_emails e2 -- occurred_at 의 빈 부분을 null 처리 (확인 안함 =/= 메일 받지 않음)

    on e2.occurred_at between e1.occurred_at and e1.occurred_at + interval '5 minute'

    and e2.user_id = e1.user_id

    and e2.action = 'email_open'

user_id 이 2014-06-03 09:30:00 에 메일 받고 2014-06-03 09:30:25 에 메일 확인

언제 inner join을, left join 을 써야하는지 잘 판단하기

 

- 메일 발송 -> 오픈 -> 클릭까지의 전환 퍼널을 주간 기준으로 집계

주 단위로 weekly_digest 메일을 얼마나 보냈고, 그 중 얼마나 확인하고, 클릭됐는지를 집계하는 쿼리

select date_trunc('week', e1.occurred_at) as week
     , count(case when e1.action = 'sent_weekly_digest' then e1.user_id else null end) as weekly_digest_email
     , count(case when e2.action = 'sent_weekly_digest' then e2.user_id else null end) as weekly_digest_email_open
     , count(case when e3.action = 'sent_weekly_digest' then e3.user_id else null end) as weekly_digest_email_clickthrough
from tutorial.yammer_emails e1
  left join tutorial.yammer_emails e2
     on e2.occurred_at between e1.occurred_at and e1.occurred_at + interval '5 minute'
     and e1.user_id = e2.user_id
     and e2.action = 'email_open'
  left join tutorial.yammer_emails e3
     on e3.occurred_at between e1.occurred_at and e1.occurred_at + interval '5 minute'
     and e1.user_id = e3.user_id
     and e3.action = 'email_clickthrough'
where e1.occurred_at between '2014-06-01 00:00:00' and '2014-08-31 23:59:59'
  and e1.action in ('sent_weekly_digest', 'sent_reengagement_email')
group by week
더보기

주차별(datetime) count 될때, count(case when (...) then user_id else null end) 많이 사용됨 암기 !!