February 21, 2022
An article by Sergey Matrosov
Before we start, we must warn you: there is a lack of data sharing because of the GDPR and iOS 14.5, so you won’t have consistent data as a result.
There are at least two trajectories of user-journey tracking: within a product, which is the scope of product analysis, and within traffic channels, which is within the scope of marketing analysis. The latter is called a multi-channel funnel. Nowadays, there are web and app promotions, so there are three multi-channel funnels 1: for the web, for apps and for a combination of these.
Today, we will build a user journey and a multi-channel funnel using AppsFlyer data for app acquisition. As a result, there will be a funnel of app sources, so-called
media_sources, which acquareted users. Here are the final results:
The problem with this is that it is really hard to do because of several difficulties associated with app promotion tracking.
Let’s remember web tracking, which is pretty straightforward:
Using a UTM, you can track almost all online sources 2.
App tracking has two additional midpoints:
One of the problems is that you cannot get information directly from the App Store. Mobile Measurement Partners (MMPs) such as AppsFlyer can do this, and you have to use these in order to track sources.
clicks, impressions, clicks retargeting, impressions retargeting, sessions, retargeting sessions, cross-platform engagements.
installtable, but if one of the contributors was FB (for example, it was contributor-1), you will get nothing 4 in that column.
Yes, you can enrich it with data from the
click table, but there will be no SRN data like Google, FB etc. 5 6 The good news, based on our experience, is that the formula “one channel, one install” works most of the time, and if there are a lot of contributors to install, you can be sure that you have full data about all touchpoints before install 95% of the time.
restrictedwithin raw data 7 in the
installtable. For example, this is true for Facebook. There is a way to get around this to some extent for Android, called Google Install Referrer8. Furthermore, if the media_source is
restricted, all data about contributors (assists) to install are erased too. If the contributor is from the list of restricted data sources, the value of the field is null (meaning that not every null in contributor data refers to lack of contributors).
“Organic”as the media_source is not always the case when a user opens Google Play or the App Store, searches for an app and installs it without clicking on any ads. AppsFlyer marks everything as
“organic”whenever it cannot define the source or if the attribution window of the source is over.
To sum up: even accurately joined data from AppsFlyer will be very fragmented. The main question is, speaking by analogy, can you drink from a broken cup that has been mended using only half of its pieces?
installwhen we are trying to highlight the touch of retargeting campaigns. Say, for example, that the install of your app was from source
lastclickcityand you started a retargeting campaign within
“registration”. If you don’t send a
“registration”event to AppsFlyer and the user made a
registration, you won’t see any rows with
(No rows with login event = no rows with “google” source)
Because touch = event firing (= engagement 9), there is no source tracking by default. Please do not forget this 10. Yes, there is a session in AppsFlyer, but there is no such thing as the source of the session, and these are only available at DataLocker! Only events can highlight a touch: no event means no touch from the channel. However, you cannot just send every (sic!) event to AppsFlyer. We will talk about this more later.
is_retargeting_true) from google as a retargeting campaign within AppsFlyer,
“registration”will be attributed to the install source.
In some cases, it can be null.
Don’t forget to set a retargeting campaign as retargeting to get this:
There is more information about attribution of retargeting campaigns with visuals here.
“registration”event, it will be attributed to an install source.
We suggest changing this second row to
customer_user_id), which is key to joining app and web data. This can’t be done through
appsflyer_id(like GA’s client ID but for apps) because you cannot implement this dimension on the web; it’s only App-scope.
We suggest that you do the following using AppsFlyer:
“open”AppsFlyer event to highlight the channel.
“open”AppsFlyer event to highlight the channel.
If you set up an “open” event for every or even just the main screen, you will dramatically increase your raw data.
Again, availability of the raw data depends on your subscription plan!
You can get raw data using the following 14:
Now let’s see what we get from AppsFlyer using DataLocker in our database. 9 major tables 15 are enough to make a user journey:
- post-attribution installs
- post-attribution in-apps
Here is a schema showing how most of these tables representing the user journey are connected:
There are 4 additional tables:
- post-attribution installs
- post-attribution in-app events
The last two are reports from the Protect360 tool, which provides fraud protection.
post-attribution installs: Protect can change the media source of install for 7 days after install (8 days in total) on organic or from the list of contributors 16 to install. The correct media source will be in the
rejected_reason_value column 17. Most of the time, it is null or organic.
post-attribution in-app events: Protect can change the media source of events during the 30 days after install. The correct media source will be in the
rejected_reason_value column. Most of the time, it is null or organic.
We constructed nine dummy tables with the required number of columns so that they repeat the data like in real AppsFlyer raw data.
There are six users with different trajectories, but each of them has its own features.
100-200 has a fraudulent install media_source and a fraudulent retargeting campaign. We have to change both of these using tables from Protect360.
111-222 has a different install time for
inapps tables. We will use install time from
inapps because it was earlier. There is also a duplicate of the touch in clicks and installs by the
mobtop source. This will be deduplicated by DISTINCT.
333-444 has reinstall from the retargeting campaign.
777-888 has a fraudulent install media_source and doesn’t have the correct media source, so we have to delete this user.
999-100 has a fraudulent retargeting campaign and does not have the correct media source for it, so we have to delete the row with the retargeting campaign for this user.
We tried to use all the data fields as is 18.
These are clicks from ads but without SRNs, like Google, Facebook, etc. 19 Remember: if a user has already installed your app but still clicks on your ad, these clicks will be in the “click” table. Furthermore, if users click on an ad but don’t open your app, there will be clicks in this table too.
WITH clicks AS ( SELECT '2022-03-29 15:03:02 UTC' AS event_time, 'click' AS event_name, 'cpanetwork1' AS media_source, 'aaa111-bbb111' AS advertising_id UNION ALL SELECT '2022-03-30 16:15:24 UTC', 'click', 'cpanetwork2', 'aaa111-bbb111' UNION ALL SELECT '2022-04-01 17:15:00 UTC', 'click', 'cpanetwork3', 'aaa111-bbb111' UNION ALL SELECT '2022-03-31 16:03:47 UTC', 'click', 'mobtop', 'aaa111-bbb111' UNION ALL SELECT '2022-03-28 17:15:23 UTC', 'click', 'cpanetwork1', 'ccc555-ddd555' UNION ALL SELECT '2022-03-30 18:15:32 UTC', 'click', 'cpanetwork2', 'ccc555-ddd555' ), SELECT * FROM clicks
This means if the user uninstalls your app during the reattribution window and reinstalls it either from
- another user acquisition campaign – there will be no attribution (= no install event)
- retargeting campaigns (set as retargeting! e.g., retargeting_is_true), in which case it will be recorded in the
“inapps_retargeting” table (not in the
“install” table!) with
You also need to remember that “regardless of the reattribution window, when an application is backed up using iCloud and later restored (on the same device or on another device), AppsFlyer doesn’t count it as a new install or reinstall. A user who restores an app from iCloud maintains their AppsFlyer ID and attribution data 21.”
Let’s say a few words about how AppsFlyer assigns contributors to install. When your app is opened the very first time by the user, it initiates the AppsFlyer SDK. The AppsFlyer then goes to its databases with clicks and impressions from all integrated partners that are activated for your AppsFlyer account. AppsFlyer also sends API calls to SRNs. When AppsFlyer gets an answer, it uses advertising_id or several parameters like user Agent + IP (for probabilistic attribution) to make an attribution, considering the following priorities:
The winner is assigned as install media_sources, and all others are assigned as contributors.
However, if the winner from restricted list = the media_source is
restricted, all data about contributors (assists) to install are erased. If the contributor is from the list of restricted data sources, the value of the field is null. This means that not every null in contributor data means lack of a contributor.
Some words about deterministic and probabilistic attribution methods:
- The deterministic method means that AppsFlyer sends advertising_id connected to install to all active integrated networks, and if it gets information from one of these networks indicating that this advertising_id is from the exact network, it assigns install to this exact network.
- There are cases where there are no data about the install source found by this ID. AppsFlyer has a set of attributes about this install, like device_id, IP, etc., and networks can share this info too. Using data from the last 24 hours, AppsFlyer compares matches for each attribute with data from networks.
For example, we have the data below:
Network-1 matches with AppsFlyer data 1 of 3 and Network-2 - 2 of 3. AppsFlyer attributes install to Network-2. This is a very simple probabilistic method, but AppsFlyer claims that it has a 93% probability of correctly attributing the media_source of install.
WITH installs AS ( SELECT '2022-03-31 18:03:47 UTC' AS install_time, 'install' AS event_name, 'googleads' AS media_source, '2022-03-31 17:03:47 UTC' AS contributor_1_touch_time, 'twitter' AS contributor_1_media_source, '2022-03-31 16:03:47 UTC' AS contributor_2_touch_time, 'mobtop' AS contributor_2_media_source, '2022-03-31 15:03:47 UTC' AS contributor_3_touch_time, 'organic' AS contributor_3_media_source, '111-222' AS appsflyer_id, 'aaa111-bbb111' AS advertising_id UNION ALL SELECT '2022-03-30 18:05:47 UTC', 'install', 'organic', null, null, null, null, null, null, '333-444', 'bbb222-ccc222' UNION ALL SELECT '2022-03-29 17:06:50 UTC', 'install', 'facebook', '2022-03-28 15:06:50 UTC', 'mobtop', null, null, null, null, '555-666', 'ccc555-ddd555' UNION ALL --must be deleted due to fraudulent install SELECT '2022-03-29 17:07:50 UTC', 'install', 'fraudcpa', null, null, null, null, null, null, '777-888', 'eee777-fff777' UNION ALL --media_source will be changed due to fraudulent install on organic SELECT '2022-03-29 17:08:50 UTC', 'install', 'fraudcpa', null, null, null, null, null, null, '999-100', 'ggg999-iii999' UNION ALL --media_source will be changed due to fraudulent install on contributor1 SELECT '2022-03-29 18:08:50 UTC', 'install', 'fraudcpa', '2022-03-29 15:00:50 UTC', 'goodcpa', null, null, null, null, '100-200', 'jjj100-kkk100' ) SELECT * FROM installs
One of the tables from Protect360 gives the correct media_source if it considers the current one a fraud.
Remember: “Protect360 performs post-attribution detection during the calendar month of the install and up to the seventh day of the following month. This means, for example, that for installs in November, Protect360 checks for fraud up to December 7 22”.
WITH post_attribution_installs AS ( SELECT '777-888' AS appsflyer_id, NULL AS rejected_reason_value UNION ALL SELECT '999-100', 'organic' UNION ALL SELECT '100-200', 'contributor1' ) SELECT * FROM post_attribution_installs
Remember: there is a double attribution of retargeting events. This means that events from retargeting campaigns are also attributed to the install campaign for this appsflyer_id. As a result, there is duplication of events: one is stored in the in-apps (events) table and another one in inapps_retargeting (events). The difference is that the in-apps table for the event from the retargeting campaigns field
is_primary_attribution = false 23. This is why we filter events with such values.
inapps AS ( SELECT '2022-03-31 18:02:47 UTC' AS install_time, 'content_view' AS event_name, '2022-03-31 18:06:02 UTC' AS event_time, 'googleads' AS media_source, TRUE AS is_primary_attribution, '111-222' AS appsflyer_id, null AS customer_user_id UNION ALL SELECT '2022-03-31 18:02:47 UTC', 'registation', '2022-03-31 18:09:47 UTC', 'googleads', TRUE, '111-222', '111' --double_attribution for '111-222' UNION ALL SELECT '2022-03-31 18:02:47 UTC', 'login', '2022-04-01 20:07:02 UTC', 'googleads', FALSE, '111-222', '111' -- UNION ALL SELECT '2022-03-30 18:05:47 UTC', 'content_view', '2022-03-30 18:07:47 UTC', 'organic', TRUE, '333-444', '333' UNION ALL SELECT '2022-03-29 17:06:47 UTC', 'content_view', '2022-03-29 17:08:50 UTC', 'facebook', TRUE, '555-666', null), SELECT * FROM inapps
We made the install time for appsflyer_id
111-222 in this table earlier than in the install table for the same user. Why?
Well, the first possible issue is that
installstable is different from the
inappstable. We therefore need to consider it to have an
“install”event before all other events. This is absolutely fine. AppsFlyer does it because of the abnormal behaviour of the install.
Other possible issues:
appsflyer_id(for the same reason as in point 1 above).
customer_user_id) due to the user ID having installed your app twice. Each install initializes a new
appsflyer_id! Once again, we need to remind you about a small unauthorized zone as a must in order to get full data about the user!
customer_user_id) for one
appsflyer_idbecause the user has made several registrations.
These are clicks from retargeting ads but without SRNs, like Google, Facebook, etc. Remember: if a user has already re-engaged with your app but still clicks on your retargeting ad, these clicks will be in the “click_retargeting” table. Furthermore, if users click on a retargeting ad but do not open your app, there will be clicks in this table too.
WITH clicks_retargeting AS ( SELECT '2022-04-01 21:03:02 UTC' AS event_time, 'click' AS event_name, 'email' AS media_source, 'aaa111-bbb111' AS advertising_id UNION ALL SELECT '2022-03-31 08:15:23 UTC', 'click', 'google_retarg', 'bbb222-ccc222' ) SELECT * FROM clicks_retargeting
“content_view”.) The retargeting campaign is attributed to this event. Say only
“purchase”is sent to AppsFlyer. This event is unlikely to occur that often, compared with other possible engagements within your app, and if the user clicks on your retargeting app, gets to your app and doesn’t make any purchase, there will be no data in this table about this touch.
“reattribution”. You can therefore change the very first event of the retargeting campaign if
is_primary_attribution = true.
WITH inapps_retargeting AS ( SELECT 'email' AS media_source, '2022-04-01 19:06:02 UTC' AS event_time, 'content_view' AS event_name, '111-222' AS appsflyer_id, '111' AS customer_user_id, NULL AS retargeting_conversion_type UNION ALL SELECT 'email', '2022-04-01 20:07:02 UTC', 'login', '111-222', '111', NULL UNION ALL SELECT 'google_retarg', '2022-03-31 09:07:00 UTC', 'content_view', '333-444', '333', 'reattribution' UNION ALL SELECT 'google_retarg', '2022-03-31 09:07:02 UTC', 'registration', '333-444', '333', NULL UNION ALL SELECT 'cpanetwork_retarg', '2022-03-31 10:10:10 UTC', 'content_view', '100-200', NULL, NULL UNION ALL SELECT 'cpanetwork_retarg', '2022-03-31 13:15:10 UTC', 'content_view', '999-100', NULL, NULL ), SELECT * FROM inapps_retargeting
WITH post_attribution_in_app_events AS ( SELECT '2022-03-31 10:10:10 UTC' AS event_time, 'content_view' AS event_name, '100-200' AS appsflyer_id, TRUE AS is_retargeting, 'organic' AS rejected_reason_value UNION ALL SELECT '2022-03-31 13:15:10 UTC', 'content_view', '999-100', TRUE, NULL ) SELECT * FROM post_attribution_in_app_events
Note that the event time represents the time AppsFlyer determines the app was uninstalled, and not the actual uninstall itself 24.
WITH organic_uninstalls AS ( SELECT 'googleads' AS media_source, '2022-04-01 10:09:47 UTC' AS event_time, '333-444' AS appsflyer_id ) SELECT * FROM organic_uninstalls
Note that the event time represents the time AppsFlyer determines the app was uninstalled, and not the actual uninstall itself.
WITH uninstalls AS ( SELECT 'googleads' AS media_source, '2022-04-02 18:09:47 UTC' AS event_time, '111-222' AS appsflyer_id ) SELECT * FROM uninstalls
Now it’s time to combine data items simply by joining them and making some simple transformations.
raw_data AS (SELECT CASE WHEN true_install_time_by_userid IS NULL THEN true_install_time_by_afid ELSE true_install_time_by_afid END AS true_install_time, * FROM (SELECT FIRST_VALUE(install_time) OVER (PARTITION BY customer_user_id ORDER BY install_time ASC) AS true_install_time_by_userid, FIRST_VALUE(install_time) OVER (PARTITION BY appsflyer_id ORDER BY install_time ASC) AS true_install_time_by_afid, * FROM (SELECT LAST_VALUE(customer_user_id IGNORE NULLS) OVER (PARTITION BY appsflyer_id ORDER BY install_time DESC RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS customer_user_id, appsflyer_id, media_source, true_install_source, clicks_media_source, clicks_event_name, clicks_event_time, install_time, contributor_1_media_source, contributor_1_touch_time, contributor_2_media_source, contributor_2_touch_time, contributor_3_media_source, contributor_3_touch_time, clicks_retarg_source, clicks_retarg_event_name, clicks_retarg_event_time, retarg_source, retarg_event_name, retarg_event_time, inapps_media_source, inapps_event_name, inapps_event_time, uninstall_organic_source, uninstall_organic_time, uninstall_nonorg_source, uninstall_nonorg_time FROM (SELECT DISTINCT customer_user_id, --delete fraud installs CASE WHEN problem_install_afid IS NOT NULL AND true_install_source IS NULL THEN 'delete' ELSE appsflyer_id END AS appsflyer_id, CASE WHEN problem_install_afid IS NOT NULL AND true_install_source = 'organic' THEN 'organic' WHEN problem_install_afid IS NOT NULL AND true_install_source = 'contributor1' THEN contributor_1_media_source WHEN problem_install_afid IS NOT NULL AND true_install_source = 'contributor2' THEN contributor_2_media_source WHEN problem_install_afid IS NOT NULL AND true_install_source = 'contributor3' THEN contributor_3_media_source ELSE media_source END AS media_source, true_install_source, clicks_media_source, clicks_event_name, clicks_event_time, CASE WHEN install_time_inapp IS NULL THEN install_time_installs ELSE install_time_inapp END AS install_time, CASE WHEN problem_install_afid IS NOT NULL AND true_install_source = 'contributor1' THEN NULL ELSE contributor_1_media_source END AS contributor_1_media_source, CASE WHEN problem_install_afid IS NOT NULL AND true_install_source = 'contributor1' THEN NULL ELSE contributor_1_touch_time END AS contributor_1_touch_time, CASE WHEN problem_install_afid IS NOT NULL AND true_install_source = 'contributor2' THEN NULL ELSE contributor_2_media_source END AS contributor_2_media_source, CASE WHEN problem_install_afid IS NOT NULL AND true_install_source = 'contributor2' THEN NULL ELSE contributor_2_touch_time END AS contributor_2_touch_time, CASE WHEN problem_install_afid IS NOT NULL AND true_install_source = 'contributor3' THEN NULL ELSE contributor_3_media_source END AS contributor_3_media_source, CASE WHEN problem_install_afid IS NOT NULL AND true_install_source = 'contributor3' THEN NULL ELSE contributor_3_touch_time END AS contributor_3_touch_time, clicks_retarg_source, clicks_retarg_event_name, clicks_retarg_event_time, --make nulls in retargeting columns if fraud CASE WHEN true_retarg_source = 'delete' THEN NULL WHEN true_retarg_source IS NOT NULL AND true_retarg_source != 'delete' THEN true_retarg_source ELSE retarg_source END AS retarg_source, CASE WHEN true_retarg_source = 'delete' THEN NULL ELSE retarg_event_name END AS retarg_event_name, CASE WHEN true_retarg_source = 'delete' THEN NULL ELSE retarg_event_time END AS retarg_event_time, inapps_media_source, inapps_event_name, inapps_event_time, uninstall_organic_source, uninstall_organic_time, uninstall_nonorg_source, uninstall_nonorg_time FROM (SELECT inapps.customer_user_id, installs.appsflyer_id, installs.media_source, post_attribution_installs.appsflyer_id AS problem_install_afid, rejected_reason_value AS true_install_source, clicks.media_source AS clicks_media_source, clicks.event_name AS clicks_event_name, CAST(clicks.event_time AS STRING) AS clicks_event_time, CAST(installs.install_time AS STRING) AS install_time_installs, CAST(inapps.install_time AS STRING) AS install_time_inapp, installs.contributor_1_media_source, CAST(installs.contributor_1_touch_time AS STRING) AS contributor_1_touch_time, installs.contributor_2_media_source, CAST(installs.contributor_2_touch_time AS STRING) AS contributor_2_touch_time, installs.contributor_3_media_source, CAST(installs.contributor_3_touch_time AS STRING) AS contributor_3_touch_time, clicks_retarg.media_source AS clicks_retarg_source, clicks_retarg.event_name AS clicks_retarg_event_name, CAST(clicks_retarg.event_time AS STRING) AS clicks_retarg_event_time, retarg.media_source AS retarg_source, retarg.event_name AS retarg_event_name, CAST(retarg.event_time AS STRING) AS retarg_event_time, fraud_retarg.media_source AS true_retarg_source, inapps.media_source AS inapps_media_source, inapps.event_name AS inapps_event_name, CAST(inapps.event_time AS STRING) AS inapps_event_time, organic_uninstalls.media_source AS uninstall_organic_source, CAST(organic_uninstalls.event_time AS STRING) AS uninstall_organic_time, uninstalls.media_source AS uninstall_nonorg_source, CAST(uninstalls.event_time AS STRING) AS uninstall_nonorg_time FROM (SELECT appsflyer_id, advertising_id, media_source, install_time, contributor_1_media_source, contributor_1_touch_time, contributor_2_media_source, contributor_2_touch_time, contributor_3_media_source, contributor_3_touch_time FROM installs ) AS installs LEFT JOIN (SELECT advertising_id, media_source, event_name, event_time FROM clicks ) AS clicks ON installs.advertising_id = clicks.advertising_id LEFT JOIN (SELECT appsflyer_id, rejected_reason_value FROM post_attribution_installs ) AS post_attribution_installs ON installs.appsflyer_id = post_attribution_installs.appsflyer_id LEFT JOIN --add customer_user_id, because all customer_user_id are NULL at installs (SELECT FIRST_VALUE (customer_user_id IGNORE NULLS) OVER (PARTITION BY appsflyer_id ORDER BY install_time) AS customer_user_id, appsflyer_id, media_source, install_time, event_name, event_time FROM inapps WHERE is_primary_attribution = TRUE ) AS inapps ON installs.appsflyer_id = inapps.appsflyer_id LEFT JOIN (SELECT advertising_id, media_source, event_name, event_time FROM clicks_retargeting ) AS clicks_retarg ON installs.advertising_id = clicks_retarg.advertising_id LEFT JOIN --add retarteting campaigns and events name with event_time (as contidition) (SELECT customer_user_id, appsflyer_id, CONCAT('utm_retargeting=', media_source) AS media_source, event_time, CASE WHEN retargeting_conversion_type = 'reattribution' THEN 'reinstall' ELSE event_name END AS event_name FROM inapps_retargeting ) AS retarg ON installs.appsflyer_id = retarg.appsflyer_id LEFT JOIN (SELECT appsflyer_id, CASE WHEN rejected_reason_value IS NULL THEN 'delete' ELSE CONCAT('utm_retargeting=', rejected_reason_value) END AS media_source, event_time, event_name FROM post_attribution_in_app_events WHERE is_retargeting IS TRUE ) AS fraud_retarg ON retarg.appsflyer_id = fraud_retarg.appsflyer_id AND retarg.event_time = fraud_retarg.event_time AND retarg.event_name = fraud_retarg.event_name LEFT JOIN (SELECT appsflyer_id, IF(media_source IS NULL, 'none', 'none') AS media_source, event_time FROM organic_uninstalls ) AS organic_uninstalls ON installs.appsflyer_id = organic_uninstalls.appsflyer_id LEFT JOIN (SELECT appsflyer_id, IF(media_source IS NULL, 'none', 'none') AS media_source, event_time FROM uninstalls ) AS uninstalls ON installs.appsflyer_id = uninstalls.appsflyer_id ) ) ) --excluding user with adif = '777-888' AND WHERE (appsflyer_id != 'delete') ) ) SELECT * FROM raw_data
(With contributor_2 and other columns)
Now we need to store channels in one column. To do so, we need to use the UNPIVOT function.
channels AS (SELECT DISTINCT appsflyer_id, customer_user_id, CASE WHEN channel LIKE '%utm_retargeting%' THEN REGEXP_EXTRACT(channel, 'utm_retargeting=([^&]*)') ELSE channel END AS channel, CASE WHEN channel LIKE '%utm_retargeting%' THEN 1 ELSE 0 END AS is_retarg, true_install_time, CAST(touch_time AS TIMESTAMP) AS touch_time FROM raw_data UNPIVOT ((channel, touch_time) FOR channel_time_pair IN ((clicks_media_source, clicks_event_time), (media_source, install_time), (contributor_1_media_source, contributor_1_touch_time), (contributor_2_media_source, contributor_2_touch_time), (contributor_3_media_source, contributor_3_touch_time), (clicks_retarg_source, clicks_retarg_event_time), (inapps_media_source, inapps_event_time), (retarg_source, retarg_event_time), (uninstall_organic_source, uninstall_organic_time), (uninstall_nonorg_source, uninstall_nonorg_time) )) ) SELECT * FROM channels
We also have to do the same for the events. We cannot use UNPIVOT simultaneously for channels and events because you’ll get CROSS JOIN events with channels, which will create meaningless rows.
events AS (SELECT DISTINCT appsflyer_id, customer_user_id, event, CAST(time_of_event AS TIMESTAMP) AS time_of_event FROM raw_data UNPIVOT ((event, time_of_event) FOR event_time_pair IN ((clicks_event_name, clicks_event_time), (media_source, install_time), (clicks_retarg_event_name, clicks_retarg_event_time), (inapps_event_name, inapps_event_time), (retarg_event_name, retarg_event_time) )) ) SELECT * FROM events
The next step is just to join it, making an install
“event” based on the timestamp and converting some channels to
“direct/none”, as we proposed before.
user_journey AS (SELECT appsflyer_id, customer_user_id, CASE WHEN LAG(channel) OVER (PARTITION BY appsflyer_id, customer_user_id, is_retarg ORDER BY touch_time ASC ) = channel AND event_name = 'install' THEN channel WHEN LAG(channel) OVER (PARTITION BY appsflyer_id, customer_user_id, is_retarg ORDER BY touch_time ASC ) = channel AND event_name IS NOT NULL THEN 'direct/none' ELSE channel END AS channel, touch_time, --or leave it as null if you wish --event_name CASE WHEN event_name IS NULL THEN 'click' ELSE event_name END AS event_name FROM( SELECT appsflyer_id, customer_user_id, channel, is_retarg, touch_time, CASE --where can not be events before install! WHEN CAST(true_install_time AS TIMESTAMP) > CAST(touch_time AS TIMESTAMP) THEN NULL --install event WHEN CAST(true_install_time AS TIMESTAMP) = touch_time THEN 'install' WHEN channel = 'none' THEN 'uninstall' WHEN event = 'click' THEN NULL ELSE event END AS event_name FROM (SELECT channels.appsflyer_id, channels.customer_user_id, channel, is_retarg, true_install_time, touch_time, event FROM (SELECT * FROM channels) AS channels LEFT JOIN (SELECT * FROM events) AS events ON channels.appsflyer_id = events.appsflyer_id AND channels.customer_user_id = events.customer_user_id AND channels.touch_time = events.time_of_event ORDER BY channels.appsflyer_id, touch_time) ) ) SELECT * FROM user_journey ORDER BY appsflyer_id, customer_user_id, touch_time ASC
Now we make
“install” for the conversion (you can change it to registration, for example):
conversion_paths_by_user AS (SELECT appsflyer_id, customer_user_id, channel, event_name FROM (SELECT *, --MIN = first interaction with desired event_name MIN(CASE WHEN event_name = 'install' THEN touch_number END) OVER (PARTITION BY appsflyer_id, customer_user_id) AS conversion FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY appsflyer_id, customer_user_id) AS touch_number FROM user_journey) ) WHERE touch_number <= conversion ) SELECT * FROM conversion_paths_by_user
This looks nice, but as we said earlier, there can be additional channels that are not represented here!
Here is our final result:
multi_channel_funnel AS ( SELECT multi_channel_funnel, COUNT(multi_channel_funnel) AS count FROM (SELECT ARRAY_TO_STRING(multi_channel_funnel, '->') AS multi_channel_funnel FROM (SELECT appsflyer_id, customer_user_id, ARRAY_AGG(channel) AS multi_channel_funnel FROM conversion_paths_by_user GROUP BY 1, 2) ) GROUP BY 1) SELECT * FROM multi_channel_funnel
Here is the query with dummy data, which you can easily rewrite to actual data. Change the name of tables if needed and select the time range. That’s all. We hope you find this helpful. Feel free to write to us if you would like a consultation.
1 Which can be combined into one, of course.
2 Note: by using custom unique domains/coupons, you can also track some offline ads.
4 We will use NULL as a synonym.
5 This also means that you cannot compare raw clicks/impressions data with SRN ad account data: you can only do it using AppsFlyer reports like Overview, within its user interface.
6 The same goes for the impressions table. We won’t use it because impressions data are also restricted for the web.
7 https://support.appsflyer.com/hc/en-us/articles/360006868017-Raw-data-content-restriction-media-source-or-user-identifiers – Facebook, TikTok, Twitter, Roku.
9 Note: clicks and impressions are engagements too.
10 If you send a server-to-server event, remember to add the appsflyer_id parameter for correct attribution.
11 You are charged when the user opens the app after clicking on the retargeting campaign. All following openings are not billed.
12 Only for install campaigns!
13 There is no other identifier to do this.
15 If some tables are missing, you need to look at your data streams from Data Locker.
17 It is hard to say why AppsFlyer calls it this, but you should treat it like a
19 Can you remember how to check discrepancies between Google Ads clicks and web sessions? It’s easy because you have a referrer for raw data from Google Analytics. You cannot do it for apps – not just because stores don’t share info. with you; the same goes for AppsFlyer too! Sadly, having faith is the only way.